High throughput modularized NLP system for clinical text

Serguei Pakhomov, James Buntrock, Patrick Duffy

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

20 Scopus citations

Abstract

This paper presents the results of the development of a high-throughput, real-time, modularized text analysis and information retrieval system that identifies clinically relevant entities in clinical notes, maps the entities to several standardized nomenclatures, and makes them available for subsequent information retrieval and data mining. The performance of the system was validated on a small collection of 351 documents partitioned into 4 query topics and manually examined by 3 physicians and 3 nurse abstractors for relevance to the query topics. We find that simple key phrase searching results in 73% recall and 77% precision. A combination of NLP approaches to indexing improves the recall to 92%, while lowering the precision to 67%.
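The precision/recall trade-off reported in the abstract can be summarized with a single balanced score. The sketch below computes the F1 measure (the harmonic mean of precision and recall) from the four figures quoted above. F1 is not reported in the abstract; it is used here only as an illustrative assumption, and the name `f1_score` and the dictionary labels are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, converted to fractions.
# The labels are illustrative, not the authors' terminology.
reported = {
    "key phrase search": {"precision": 0.77, "recall": 0.73},
    "combined NLP indexing": {"precision": 0.67, "recall": 0.92},
}

for name, m in reported.items():
    score = f1_score(m["precision"], m["recall"])
    print(f"{name}: F1 = {score:.3f}")
```

Under this measure the combined NLP indexing scores slightly higher than key phrase search, because its gain in recall outweighs its loss in precision. Whether that balance is desirable depends on the retrieval task, which the abstract does not specify.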

Original language: English (US)
Title of host publication: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Pages: 25-28
Number of pages: 4
State: Published - 2005
Event: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05 - Ann Arbor, MI, United States
Duration: Jun 25 2005 to Jun 30 2005

Publication series

Name: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Other

Other: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05
Country/Territory: United States
City: Ann Arbor, MI
Period: 6/25/05 to 6/30/05
