TY - JOUR
T1 - Prediction of cardiovascular outcomes with machine learning techniques
T2 - Application to the cardiovascular outcomes in renal atherosclerotic lesions (CORAL) study
AU - Chen, Tian
AU - Brewster, Pamela
AU - Tuttle, Katherine R.
AU - Dworkin, Lance D.
AU - Henrich, William
AU - Greco, Barbara A.
AU - Steffes, Michael
AU - Tobe, Sheldon
AU - Jamerson, Kenneth
AU - Pencina, Karol
AU - Massaro, Joseph M.
AU - D’Agostino, Ralph B.
AU - Cutlip, Donald E.
AU - Murphy, Timothy P.
AU - Cooper, Christopher J.
AU - Shapiro, Joseph I.
N1 - Publisher Copyright: © 2019 Chen et al.
PY - 2019
Y1 - 2019
N2 - Background: Data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions (CORAL) study were analyzed with machine learning methods to predict the composite endpoint described in the original study. Methods: We identified 573 CORAL subjects with complete baseline data and a recorded presence or absence of the study's composite endpoint. Several models were fitted to these data to predict the composite endpoint: a generalized linear (logistic-linear) model, a support vector machine, a decision tree, a feed-forward neural network, and a random forest. Subjects were divided into training and testing subsets in an 80%:20% ratio using various seeds. Prediction models were optimized with the CARET package in R. Results: Of the machine learning techniques, the random forest performed best, yielding an area under the receiver operating characteristic (ROC) curve of 68.1% ± 4.2% (mean ± SD) on the testing subset across the ten different seed values used to separate the training and testing subsets. The four most important variables in the random forest method were SBP, serum creatinine, glycosylated hemoglobin, and DBP. Each of these variables was also important in at least some of the other methods. The treatment assignment group was not consistently an important determinant in any of the models. Conclusion: Prediction of a composite cardiovascular outcome was difficult in the CORAL population, even when employing machine learning methods. Assignment to either the stenting or best medical therapy group did not serve as an important predictor of the composite outcome. Clinical Trial Registration: ClinicalTrials.gov, NCT00081731.
KW - Cardiovascular disease
KW - Chronic kidney disease
KW - Glomerular filtration rate
KW - Hypertension
KW - Ischemic renal disease
KW - Renal artery stenosis
UR - http://www.scopus.com/inward/record.url?scp=85067478940&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067478940&partnerID=8YFLogxK
U2 - 10.2147/IJNRD.S194727
DO - 10.2147/IJNRD.S194727
M3 - Article
C2 - 30962703
AN - SCOPUS:85067478940
SN - 1178-7058
VL - 12
SP - 49
EP - 58
JO - International Journal of Nephrology and Renovascular Disease
JF - International Journal of Nephrology and Renovascular Disease
ER -