Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide a better understanding of the progression of heart failure and a better assessment of CRT response and effectiveness. Though NYHA class is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA class, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods for identifying NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients, and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
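As an illustration of what a rule-based approach to this task can look like, the following is a minimal sketch in Python using a single regular expression. It is not the rule set used in the study: the pattern, the names `NYHA_PATTERN` and `extract_nyha_classes`, and the example notes are all hypothetical, and the sketch assumes that mentions take the form "NYHA" (or the spelled-out name) followed by an optional "functional" or "class" and a Roman or Arabic numeral from I to IV.

```python
import re

# Hypothetical mapping from Roman numerals to integer NYHA classes.
_ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4}

# Hypothetical pattern: "NYHA" or the spelled-out name, optionally followed by
# "functional"/"FC" and "class"/"classification", then a class value.
NYHA_PATTERN = re.compile(
    r"\b(?:NYHA|New\s+York\s+Heart\s+Association)\b"
    r"(?:\s+(?:functional|fc))?"
    r"(?:\s+class(?:ification)?)?"
    r"\s*[:\-]?\s*"
    r"(IV|III|II|I|[1-4])\b",
    re.IGNORECASE,
)


def extract_nyha_classes(note):
    """Return every NYHA class (1-4) mentioned in a note, in order of appearance."""
    classes = []
    for match in NYHA_PATTERN.finditer(note):
        token = match.group(1).upper()
        # Roman numerals are looked up; Arabic digits are converted directly.
        classes.append(_ROMAN[token] if token in _ROMAN else int(token))
    return classes


if __name__ == "__main__":
    print(extract_nyha_classes("Patient with NYHA class III heart failure."))
```

A pattern like this captures only the first value of a range such as "class I-II" and ignores negation or historical context, which is one reason a learned classifier over n-gram features may generalize better to varied note wording.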
Bibliographical note
Funding Information:
This research and the publication of this article were supported by Medtronic, Inc.
© 2018 The Author(s).
- Clinical notes
- Electronic health records
- Natural language processing
- New York Heart Association (NYHA)