HITON: a novel Markov Blanket algorithm for optimal variable selection.

C. F. Aliferis; I. Tsamardinos; A. Statnikov

HITON: a novel Markov Blanket algorithm for optimal variable selection.

C. F. Aliferis, I. Tsamardinos, A. Statnikov

Institute for Health Informatics

Research output: Contribution to journal › Article › peer-review

241 Scopus citations

Abstract

We introduce a novel, sound, sample-efficient, and highly-scalable algorithm for variable selection for classification, regression and prediction called HITON. The algorithm works by inducing the Markov Blanket of the variable to be classified or predicted. A wide variety of biomedical tasks with different characteristics were used for an empirical evaluation. Namely, (i) bioactivity prediction for drug discovery, (ii) clinical diagnosis of arrhythmias, (iii) bibliographic text categorization, (iv) lung cancer diagnosis from gene expression array data, and (v) proteomics-based prostate cancer detection. State-of-the-art algorithms for each domain were selected for baseline comparison. Results: (1) HITON reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy. (2) HITON outperforms the baseline algorithms by selecting more than two orders-of-magnitude smaller variable sets than the baselines, in the selected tasks and datasets.

Original language	English (US)
Pages (from-to)	21-25
Number of pages	5
Journal	AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
State	Published - 2003

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

OpenUrl availability

Full text

Cite this

@article{b6597556e904415093eaf046553e6c14,

title = "HITON: a novel Markov Blanket algorithm for optimal variable selection.",

abstract = "We introduce a novel, sound, sample-efficient, and highly-scalable algorithm for variable selection for classification, regression and prediction called HITON. The algorithm works by inducing the Markov Blanket of the variable to be classified or predicted. A wide variety of biomedical tasks with different characteristics were used for an empirical evaluation. Namely, (i) bioactivity prediction for drug discovery, (ii) clinical diagnosis of arrhythmias, (iii) bibliographic text categorization, (iv) lung cancer diagnosis from gene expression array data, and (v) proteomics-based prostate cancer detection. State-of-the-art algorithms for each domain were selected for baseline comparison. Results: (1) HITON reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy. (2) HITON outperforms the baseline algorithms by selecting more than two orders-of-magnitude smaller variable sets than the baselines, in the selected tasks and datasets.",

author = "Aliferis, {C. F.} and I. Tsamardinos and A. Statnikov",

year = "2003",

language = "English (US)",

pages = "21--25",

journal = "AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium",

issn = "1559-4076",

publisher = "American Medical Informatics Association",

}

TY - JOUR

T1 - HITON

T2 - a novel Markov Blanket algorithm for optimal variable selection.

AU - Aliferis, C. F.

AU - Tsamardinos, I.

AU - Statnikov, A.

PY - 2003

Y1 - 2003

N2 - We introduce a novel, sound, sample-efficient, and highly-scalable algorithm for variable selection for classification, regression and prediction called HITON. The algorithm works by inducing the Markov Blanket of the variable to be classified or predicted. A wide variety of biomedical tasks with different characteristics were used for an empirical evaluation. Namely, (i) bioactivity prediction for drug discovery, (ii) clinical diagnosis of arrhythmias, (iii) bibliographic text categorization, (iv) lung cancer diagnosis from gene expression array data, and (v) proteomics-based prostate cancer detection. State-of-the-art algorithms for each domain were selected for baseline comparison. Results: (1) HITON reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy. (2) HITON outperforms the baseline algorithms by selecting more than two orders-of-magnitude smaller variable sets than the baselines, in the selected tasks and datasets.

AB - We introduce a novel, sound, sample-efficient, and highly-scalable algorithm for variable selection for classification, regression and prediction called HITON. The algorithm works by inducing the Markov Blanket of the variable to be classified or predicted. A wide variety of biomedical tasks with different characteristics were used for an empirical evaluation. Namely, (i) bioactivity prediction for drug discovery, (ii) clinical diagnosis of arrhythmias, (iii) bibliographic text categorization, (iv) lung cancer diagnosis from gene expression array data, and (v) proteomics-based prostate cancer detection. State-of-the-art algorithms for each domain were selected for baseline comparison. Results: (1) HITON reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy. (2) HITON outperforms the baseline algorithms by selecting more than two orders-of-magnitude smaller variable sets than the baselines, in the selected tasks and datasets.

UR - http://www.scopus.com/inward/record.url?scp=14344265835&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=14344265835&partnerID=8YFLogxK

M3 - Article

C2 - 14728126

AN - SCOPUS:14344265835

SN - 1559-4076

SP - 21

EP - 25

JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

ER -

HITON: a novel Markov Blanket algorithm for optimal variable selection.

Abstract

UN SDGs

OpenUrl availability

Other files and links

Fingerprint

Cite this