An empirical study of applying ensembles of heterogeneous classifiers on imperfect data

Kuo Wei Hsu; Jaideep Srivastava

doi:10.1007/978-3-642-14640-4_3

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

Two factors slow down the deployment of classification, or supervised learning, in real-world situations. One is that data are imperfect in practice; the other is that every technique has its own limits. Although techniques have been developed to address the imperfections of real-world data, no single technique outperforms all others, and each focuses on only some types of imperfection. Furthermore, quite a few works apply ensembles of heterogeneous classifiers to such situations. In this paper, we report work in progress that studies the impact of heterogeneity on ensembles, focusing on two aspects: diversity and classification quality for imbalanced data. Our goal is to evaluate how introducing heterogeneity into an ensemble influences its behavior and performance.

Original language	English (US)
Title of host publication	New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers
Pages	28-39
Number of pages	12
DOIs	https://doi.org/10.1007/978-3-642-14640-4_3
State	Published - 2010
Event	13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 - Bangkok, Thailand
Duration	Apr 27 2009 → Apr 30 2009

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5669 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Other

Other	13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
Country/Territory	Thailand
City	Bangkok
Period	4/27/09 → 4/30/09

Bibliographical note

Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.

Keywords

AdaBoost
bagging
diversity
heterogeneity
imbalanced data

Cite this

Hsu, K. W., & Srivastava, J. (2010). An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. In New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers (pp. 28-39). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5669 LNAI). https://doi.org/10.1007/978-3-642-14640-4_3

An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. / Hsu, Kuo Wei; Srivastava, Jaideep.
New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers. 2010. p. 28-39 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5669 LNAI).

Hsu, KW & Srivastava, J 2010, An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. in New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5669 LNAI, pp. 28-39, 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009, Bangkok, Thailand, 4/27/09. https://doi.org/10.1007/978-3-642-14640-4_3

Hsu KW, Srivastava J. An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. In New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers. 2010. p. 28-39. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-642-14640-4_3

Hsu, Kuo Wei ; Srivastava, Jaideep. / An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers. 2010. pp. 28-39 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{d54eec11f7d54c8d9612fc7a123d4bb6,

title = "An empirical study of applying ensembles of heterogeneous classifiers on imperfect data",

keywords = "AdaBoost, bagging, diversity, heterogeneity, imbalanced data",

author = "Hsu, {Kuo Wei} and Srivastava, Jaideep",

note = "Copyright: Copyright 2010 Elsevier B.V., All rights reserved.; 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 ; Conference date: 27-04-2009 Through 30-04-2009",

year = "2010",

doi = "10.1007/978-3-642-14640-4_3",

language = "English (US)",

isbn = "3642146392",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

pages = "28--39",

booktitle = "New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers",

}

TY - GEN

T1 - An empirical study of applying ensembles of heterogeneous classifiers on imperfect data

AU - Hsu, Kuo Wei

AU - Srivastava, Jaideep

PY - 2010

Y1 - 2010

KW - AdaBoost

KW - bagging

KW - diversity

KW - heterogeneity

KW - imbalanced data

UR - http://www.scopus.com/inward/record.url?scp=77957079123&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77957079123&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-14640-4_3

DO - 10.1007/978-3-642-14640-4_3

M3 - Conference contribution

AN - SCOPUS:77957079123

SN - 3642146392

SN - 9783642146398

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 28

EP - 39

BT - New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers

T2 - 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009

Y2 - 27 April 2009 through 30 April 2009

ER -
