TY - GEN
T1 - Evaluating boosting algorithms to classify rare classes
T2 - 1st IEEE International Conference on Data Mining, ICDM'01
AU - Joshi, Mahesh V.
AU - Kumar, Vipin
AU - Agarwal, Ramesh C.
PY - 2001
Y1 - 2001
N2 - Classification of rare events has many important data mining applications. Boosting is a promising meta-technique that improves the classification performance of any weak classifier. So far, no systematic study has been conducted to evaluate how boosting performs for the task of mining rare classes. In this paper, we evaluate three existing categories of boosting algorithms from the single viewpoint of how they update the example weights in each iteration, and discuss their possible effect on recall and precision of the rare class. We propose enhanced algorithms in two of the categories, and justify their choice of weight updating parameters theoretically. Using some specially designed synthetic datasets, we compare the capability of all the algorithms from the rare class perspective. The results support our qualitative analysis, and also indicate that our enhancements bring an extra capability for achieving better balance between recall and precision in mining rare classes.
AB - Classification of rare events has many important data mining applications. Boosting is a promising meta-technique that improves the classification performance of any weak classifier. So far, no systematic study has been conducted to evaluate how boosting performs for the task of mining rare classes. In this paper, we evaluate three existing categories of boosting algorithms from the single viewpoint of how they update the example weights in each iteration, and discuss their possible effect on recall and precision of the rare class. We propose enhanced algorithms in two of the categories, and justify their choice of weight updating parameters theoretically. Using some specially designed synthetic datasets, we compare the capability of all the algorithms from the rare class perspective. The results support our qualitative analysis, and also indicate that our enhancements bring an extra capability for achieving better balance between recall and precision in mining rare classes.
UR - http://www.scopus.com/inward/record.url?scp=51649085353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51649085353&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:51649085353
SN - 0769511198
SN - 9780769511191
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 257
EP - 264
BT - Proceedings - 2001 IEEE International Conference on Data Mining, ICDM'01
Y2 - 29 November 2001 through 2 December 2001
ER -