Abstract
Large observational data networks that leverage routine clinical practice data in electronic health records (EHRs) are critical resources for research on coronavirus disease 2019 (COVID-19). Data normalization is a key challenge for the secondary use of EHRs for COVID-19 research across institutions. In this study, we addressed the challenge of automating the normalization of COVID-19 diagnostic tests, which are critical data elements, but for which controlled terminology terms were published after clinical implementation. We developed a simple but effective rule-based tool called COVID-19 TestNorm to automatically normalize local COVID-19 testing names to standard LOINC (Logical Observation Identifiers Names and Codes) codes. COVID-19 TestNorm was developed and evaluated using 568 test names collected from 8 healthcare systems. Our results show that it could achieve an accuracy of 97.4% on an independent test set. COVID-19 TestNorm is available as an open-source package for developers and as an online Web application for end users (https://clamp.uth.edu/covid/loinc.php). We believe that it will be a useful tool to support secondary use of EHRs for research on COVID-19.
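To make the idea of rule-based normalization concrete, here is a minimal sketch of how a local test name might be mapped to a LOINC code by extracting a few attributes with keyword rules. It is not the published COVID-19 TestNorm rule set. The keyword patterns, the attribute labels, and the names `ANALYTE_RULES`, `SPECIMEN_RULES`, `LOINC_BY_ATTRIBUTES`, `first_match`, and `normalize_test_name` are all assumptions made for illustration. The LOINC codes shown should be checked against the current LOINC release before any real use.

```python
import re

# Hypothetical keyword rules (assumption): each rule maps a regex over the
# lowercased test name to a coarse attribute label. First match wins.
ANALYTE_RULES = [
    ("IgG", re.compile(r"\bigg\b")),
    ("IgM", re.compile(r"\bigm\b")),
    ("RNA", re.compile(r"\b(rna|pcr|naa|naat)\b")),
]
SPECIMEN_RULES = [
    ("Respiratory", re.compile(r"\b(np|nasopharyngeal|nasal|respiratory|swab)\b")),
    ("Ser/Plas", re.compile(r"\b(serum|plasma)\b")),
]

# Illustrative lookup from (analyte, specimen) to a LOINC code (assumption:
# a tiny subset chosen for the example; verify against the LOINC release).
LOINC_BY_ATTRIBUTES = {
    ("RNA", "Respiratory"): "94500-6",
    ("RNA", None): "94309-2",
    ("IgG", "Ser/Plas"): "94563-4",
    ("IgM", "Ser/Plas"): "94564-2",
}


def first_match(rules, text):
    """Return the label of the first rule whose pattern occurs in text."""
    for label, pattern in rules:
        if pattern.search(text):
            return label
    return None


def normalize_test_name(name):
    """Map a local test name to a LOINC code, or None if no rule applies."""
    # Lowercase and replace punctuation with spaces so word boundaries work.
    text = re.sub(r"[^a-z0-9]+", " ", name.lower())
    analyte = first_match(ANALYTE_RULES, text)
    specimen = first_match(SPECIMEN_RULES, text)
    return LOINC_BY_ATTRIBUTES.get((analyte, specimen))


if __name__ == "__main__":
    for local_name in ["SARS-CoV-2 PCR, NP swab", "COVID-19 IgG, serum"]:
        print(local_name, "->", normalize_test_name(local_name))
```

In this sketch, a name that matches no rule returns `None` rather than a guessed code, so unmapped names can be routed to manual review.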
Original language | English (US) |
---|---|
Pages (from-to) | 1437-1442 |
Number of pages | 6 |
Journal | Journal of the American Medical Informatics Association |
Volume | 27 |
Issue number | 9 |
State | Published - Sep 1 2020 |
Bibliographical note
Publisher Copyright: © The Author(s) 2020.
Keywords
- COVID-19
- COVID-19 TestNorm
- LOINC
- Natural language processing
- Testing name normalization