Two factors slow down the deployment of classification, or supervised learning, in real-world situations. One is that data are imperfect in practice; the other is that every technique has its own limits. Although techniques have been developed to address imperfections in real-world data, no single technique outperforms all others, and each focuses on only some types of imperfection. Furthermore, quite a few works apply ensembles of heterogeneous classifiers to such situations. In this paper, we report on work in progress that studies the impact of heterogeneity on ensembles, focusing on two aspects: diversity and classification quality for imbalanced data. Our goal is to evaluate how introducing heterogeneity into an ensemble influences its behavior and performance.
|Original language||English (US)|
|Title of host publication||New Frontiers in Applied Data Mining - PAKDD 2009 International Workshops, Revised Selected Papers|
|Number of pages||12|
|State||Published - 2010|
|Event||13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 - Bangkok, Thailand (Duration: Apr 27 2009 → Apr 30 2009)|
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
- imbalanced data