TY - JOUR
T1 - Towards semantic role labeling & IE in the medical literature.
AU - Kogan, Yacov
AU - Collier, Nigel
AU - Pakhomov, Serguei
AU - Krauthammer, Michael
PY - 2005
Y1 - 2005
AB - INTRODUCTION: In this work, we introduce the concept of semantic role labeling to the medical domain. We report first results of porting and adapting an existing resource, Propbank, to the medical field. Propbank is an adjunct to Penn Treebank that provides semantic annotation of predicates and the roles played by their arguments. The main aim of this work is to assess the applicability of the Propbank frame files to predicates typically encountered in the medical literature. METHODS: We analyzed a target corpus of 610,100 abstracts, which was selected by searching for publication type "case reports". From this target corpus, we randomly selected 10,000 sample abstracts to estimate the predicate distribution, and matched the predicates from this sample to the predicates in Propbank. RESULTS: Of the 1998 unique verbs in our sample, 76% were represented in Propbank. This included the 40 most frequent verbs, which represented 49% of all predicate instances in our sample and which matched the Propbank usage in a study of representative sentences. We propose extensions to Propbank that handle the medical predicates that Propbank does not adequately cover. CONCLUSION: We believe that semantic role labeling using Propbank is a valid approach to capture predicate relations in the medical literature.
UR - http://www.scopus.com/inward/record.url?scp=33947402171&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33947402171&partnerID=8YFLogxK
M3 - Article
C2 - 16779072
AN - SCOPUS:33947402171
SN - 1559-4076
SP - 410
EP - 414
JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
ER -