Prospective recruitment of patients with congestive heart failure using an ad-hoc binary classifier

Serguei V. Pakhomov, James Buntrock, Christopher G. Chute

Research output: Contribution to journal › Article › peer-review

43 Scopus citations

Abstract

This paper addresses the specific problem of identifying patients diagnosed with a given condition for potential recruitment into a clinical trial or an epidemiological study. We present a simple machine learning method for identifying patients diagnosed with congestive heart failure and related conditions by automatically classifying clinical notes dictated at Mayo Clinic. The method relies on an automatic classifier trained on comparable amounts of positive and negative samples of clinical notes previously categorized by human experts. Documents are represented as feature vectors whose features combine demographic information with single words and concept mappings to the MeSH and HICDA classification systems. We compare two simple and efficient classification algorithms (Naïve Bayes and Perceptron) and a baseline term-spotting method with respect to their accuracy and their recall on positive samples. Depending on the test set, Naïve Bayes yields better recall on positive samples than Perceptron (95% vs. 86%) but worse accuracy (57% vs. 65%). Both algorithms outperform the baseline, which achieves 71% recall on positive samples and 54% accuracy.
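The two evaluation measures named in the abstract, and the contrast between a term-spotting baseline and a trained classifier, can be illustrated with a minimal sketch. It is not the authors' implementation: the features here are plain lowercase word counts only (no demographic fields, no MeSH or HICDA concept mappings), the Naïve Bayes classifier is omitted for brevity, and the notes, the spotted term, and all function names are hypothetical and invented for illustration.

```python
from collections import Counter

def featurize(note):
    """Bag-of-words features: lowercase tokens mapped to counts (assumed scheme)."""
    return Counter(note.lower().split())

def score(model, feats):
    """Linear score: bias plus weighted sum of feature counts."""
    weights, bias = model
    return bias + sum(weights[f] * v for f, v in feats.items())

def train_perceptron(samples, epochs=10):
    """Train a binary perceptron on (note, label) pairs, label 1 or 0."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for note, label in samples:
            feats = featurize(note)
            pred = 1 if score((weights, bias), feats) > 0 else 0
            if pred != label:
                # Move the weights toward the correct class on a mistake.
                step = 1 if label == 1 else -1
                for f, v in feats.items():
                    weights[f] += step * v
                bias += step
    return weights, bias

def predict(model, note):
    """Classify a note as positive (1) or negative (0)."""
    return 1 if score(model, featurize(note)) > 0 else 0

def term_spot(note, terms=("heart failure",)):
    """Baseline: positive if any listed term occurs in the note (hypothetical term list)."""
    text = note.lower()
    return 1 if any(t in text for t in terms) else 0

def accuracy(gold, pred):
    """Fraction of all samples labeled correctly."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def positive_recall(gold, pred):
    """Fraction of truly positive samples that were labeled positive."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Invented toy notes, balanced between positive and negative samples.
train = [
    ("congestive heart failure with edema", 1),
    ("heart failure exacerbation", 1),
    ("knee pain after fall", 0),
    ("seasonal allergy follow up", 0),
]
model = train_perceptron(train)
gold = [label for _, label in train]
pred = [predict(model, note) for note, _ in train]
print(accuracy(gold, pred), positive_recall(gold, pred))
```

Reporting recall on positive samples alongside accuracy matters for recruitment: a missed eligible patient (a false negative) lowers recall even when overall accuracy looks acceptable.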

Original language: English (US)
Pages (from-to): 145-153
Number of pages: 9
Journal: Journal of Biomedical Informatics
Volume: 38
Issue number: 2
State: Published - Apr 2005
Externally published: Yes

Keywords

  • Automatic classification
  • Congestive heart failure
  • Machine learning
  • Medical informatics
  • Natural language processing
  • Naïve Bayes
  • Perceptron
