Discovering and identifying New York heart association classification from electronic health records

Rui Zhang; Sisi Ma; Liesa Shanahan; Jessica Munroe; Sarah Horn; Stuart Speedie

doi:10.1186/s12911-018-0625-7

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
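
To make the rule-based idea concrete, the following is a minimal sketch of how an NYHA class mention might be pulled from a clinical note with a regular expression. The abstract does not describe the authors' actual rules, so the pattern, the function name `extract_nyha_classes`, and the sample notes are all hypothetical. The machine learning approach that the abstract reports as best (a random forest with n-gram features) is not shown here.

```python
import re
from typing import List

# Hypothetical mapping and pattern; NOT taken from the paper.
# Intended to match phrases such as "NYHA class III",
# "NYHA functional class 2", or "NYHA Class: 4".
_ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4}
_NYHA_RE = re.compile(
    r"\bNYHA\s*(?:functional\s+)?(?:class(?:ification)?\s*)?[:\-]?\s*"
    r"(IV|III|II|I|[1-4])\b",
    re.IGNORECASE,
)


def extract_nyha_classes(note: str) -> List[int]:
    """Return every NYHA class (1-4) mentioned in a note, in order."""
    classes = []
    for match in _NYHA_RE.finditer(note):
        token = match.group(1).upper()
        # Roman numerals go through the lookup table; digits are parsed directly.
        classes.append(_ROMAN[token] if token in _ROMAN else int(token))
    return classes


if __name__ == "__main__":
    sample = "Patient with HF, NYHA class III symptoms."
    print(extract_nyha_classes(sample))
```

A pattern like this only captures the first class in a range such as "II-III" and ignores negation and historical context, which is one plausible reason a learned classifier could outperform a simple rule set.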

Original language	English (US)
Article number	48
Journal	BMC medical informatics and decision making
Volume	18
DOIs	https://doi.org/10.1186/s12911-018-0625-7
State	Published - Jul 23 2018

Bibliographical note

Publisher Copyright:
© 2018 The Author(s).

Keywords

Clinical notes
Electronic health records
Natural language processing
New York heart association (NYHA)

Access

10.1186/s12911-018-0625-7

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6069768

Cite this

@article{9a7c19d42859492a93100d0d1d0715cb,
title = "Discovering and identifying New York heart association classification from electronic health records",
abstract = "Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.",
keywords = "Clinical notes, Electronic health records, Natural language processing, New York heart association (NYHA)",
author = "Rui Zhang and Sisi Ma and Liesa Shanahan and Jessica Munroe and Sarah Horn and Stuart Speedie",
note = "Publisher Copyright: {\textcopyright} 2018 The Author(s).",
year = "2018",
month = jul,
day = "23",
doi = "10.1186/s12911-018-0625-7",
language = "English (US)",
volume = "18",
journal = "BMC medical informatics and decision making",
issn = "1472-6947",
publisher = "BioMed Central",
}

TY - JOUR
T1 - Discovering and identifying New York heart association classification from electronic health records
AU - Zhang, Rui
AU - Ma, Sisi
AU - Shanahan, Liesa
AU - Munroe, Jessica
AU - Horn, Sarah
AU - Speedie, Stuart
PY - 2018/7/23
Y1 - 2018/7/23
N2 - Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
AB - Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
KW - Clinical notes
KW - Electronic health records
KW - Natural language processing
KW - New York heart association (NYHA)
UR - http://www.scopus.com/inward/record.url?scp=85050823536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050823536&partnerID=8YFLogxK
U2 - 10.1186/s12911-018-0625-7
DO - 10.1186/s12911-018-0625-7
M3 - Article
C2 - 30066653
AN - SCOPUS:85050823536
SN - 1472-6947
VL - 18
JO - BMC medical informatics and decision making
JF - BMC medical informatics and decision making
M1 - 48
ER -
