Interpretable predictive models for knowledge discovery from home-care electronic health records

Bonnie L Westra; Sanjoy Dey; Gang Fang; Michael S Steinbach; Vipin Kumar; Cristina Oancea; Kay Savik; Mary T Dierich

doi:10.1260/2040-2295.2.1.55

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.

Original language	English (US)
Pages (from-to)	55-74
Number of pages	20
Journal	Journal of healthcare engineering
Volume	2
Issue number	1
DOIs	https://doi.org/10.1260/2040-2295.2.1.55
State	Published - Mar 2011

Keywords

Data mining
Electronic health records
Home care
Oral medication management
Rules classification

Cite this

@article{3ca7753344994f45bc5f1b8f7a8d26c8,
  title = "Interpretable predictive models for knowledge discovery from home-care electronic health records",
  abstract = "The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.",
  keywords = "Data mining, Electronic health records, Home care, Oral medication management, Rules classification",
  author = "Westra, {Bonnie L} and Sanjoy Dey and Gang Fang and Steinbach, {Michael S} and Vipin Kumar and Cristina Oancea and Kay Savik and Dierich, {Mary T}",
  year = "2011",
  month = mar,
  doi = "10.1260/2040-2295.2.1.55",
  language = "English (US)",
  volume = "2",
  pages = "55--74",
  journal = "Journal of healthcare engineering",
  issn = "2040-2295",
  publisher = "Multi Science Publishing",
  number = "1",
}

TY  - JOUR
T1  - Interpretable predictive models for knowledge discovery from home-care electronic health records
AU  - Westra, Bonnie L
AU  - Dey, Sanjoy
AU  - Fang, Gang
AU  - Steinbach, Michael S
AU  - Kumar, Vipin
AU  - Oancea, Cristina
AU  - Savik, Kay
AU  - Dierich, Mary T
PY  - 2011/3
Y1  - 2011/3
N2  - The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.
AB  - The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.
KW  - Data mining
KW  - Electronic health records
KW  - Home care
KW  - Oral medication management
KW  - Rules classification
UR  - http://www.scopus.com/inward/record.url?scp=84864070998&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84864070998&partnerID=8YFLogxK
U2  - 10.1260/2040-2295.2.1.55
DO  - 10.1260/2040-2295.2.1.55
M3  - Article
AN  - SCOPUS:84864070998
SN  - 2040-2295
VL  - 2
SP  - 55
EP  - 74
JO  - Journal of healthcare engineering
JF  - Journal of healthcare engineering
IS  - 1
ER  -
