TY - JOUR
T1 - Interpretable predictive models for knowledge discovery from home-care electronic health records
AU - Westra, Bonnie L
AU - Dey, Sanjoy
AU - Fang, Gang
AU - Steinbach, Michael S
AU - Kumar, Vipin
AU - Oancea, Cristina
AU - Savik, Kay
AU - Dierich, Mary T
PY - 2011/3
Y1 - 2011/3
N2 - The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.
AB - The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with little variance, handling imbalanced datasets, and ease of interpretation of the resulting predictive models. Logistic regression and three classification models using Ripper, decision trees, and Support Vector Machines were applied to a case study for one outcome of improvement in oral medication management. Predictive rules for logistic regression, Ripper, and decision trees are reported and results compared using F-measures for data mining models and area under the receiver-operating characteristic curve for all models. The rules generated by the three classification models provide potentially novel insights into mining EHRs beyond those provided by standard logistic regression, and suggest steps for further study.
KW - Data mining
KW - Electronic health records
KW - Home care
KW - Oral medication management
KW - Rules classification
UR - http://www.scopus.com/inward/record.url?scp=84864070998&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84864070998&partnerID=8YFLogxK
U2 - 10.1260/2040-2295.2.1.55
DO - 10.1260/2040-2295.2.1.55
M3 - Article
AN - SCOPUS:84864070998
SN - 2040-2295
VL - 2
SP - 55
EP - 74
JO - Journal of Healthcare Engineering
JF - Journal of Healthcare Engineering
IS - 1
ER -