Feature selection via probabilistic outputs

Nicholas A. Arnosti, Andrea Pohoreckyj Danyluk

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.
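As a rough illustration of the quantity the abstract refers to, the spread across a feature's values of the probability that an example is positive, the sketch below scores a discrete feature by the weighted variance of the empirical positive-class frequency for each feature value. This is neither the criterion of Shen et al. (2008) nor the complementary criterion proposed in the paper, since the abstract specifies neither. The function name `probability_spread`, the use of weighted variance as the spread measure, and the use of empirical frequencies in place of a classifier's estimated class probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def probability_spread(feature_values, labels):
    """Weighted variance of the empirical P(y = 1 | feature = v) across values v.

    Hypothetical illustration only: not the scoring criterion from the paper.
    `labels` must contain 0/1 class labels, one per example.
    """
    counts = defaultdict(int)      # number of examples with each feature value
    positives = defaultdict(int)   # number of positive examples per feature value
    for value, label in zip(feature_values, labels):
        counts[value] += 1
        positives[value] += label

    n = len(labels)
    overall = sum(labels) / n      # marginal P(y = 1)

    # Each value's squared deviation from the marginal, weighted by its frequency.
    return sum(
        (counts[v] / n) * (positives[v] / counts[v] - overall) ** 2
        for v in counts
    )


labels = [1, 1, 0, 0, 1, 0, 1, 0]
informative = ["a", "a", "b", "b", "a", "b", "a", "b"]    # "a" always positive
uninformative = ["a", "b", "a", "a", "a", "b", "b", "b"]  # both values half positive

print(probability_spread(informative, labels))
print(probability_spread(uninformative, labels))
```

Under this assumed measure, a feature whose values separate the classes receives a larger score than one whose values all share the marginal positive rate, which is the intuition behind ranking features by such a spread.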

Original language: English (US)
Title of host publication: Proceedings of the 29th International Conference on Machine Learning, ICML 2012
Pages: 1791-1798
Number of pages: 8
State: Published - 2012
Externally published: Yes
Event: 29th International Conference on Machine Learning, ICML 2012 - Edinburgh, United Kingdom
Duration: Jun 26 2012 to Jul 1 2012

Publication series

Name: Proceedings of the 29th International Conference on Machine Learning, ICML 2012
Volume: 2

Other

Other: 29th International Conference on Machine Learning, ICML 2012
Country/Territory: United Kingdom
City: Edinburgh
Period: 6/26/12 to 7/1/12
