Feature selection via probabilistic outputs

Nicholas A. Arnosti; Andrea Pohoreckyj Danyluk

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.

Original language	English (US)
Title of host publication	Proceedings of the 29th International Conference on Machine Learning, ICML 2012
Pages	1791-1798
Number of pages	8
State	Published - 2012
Externally published	Yes
Event	29th International Conference on Machine Learning, ICML 2012 - Edinburgh, United Kingdom
Duration	Jun 26 2012 → Jul 1 2012

Publication series

Name	Proceedings of the 29th International Conference on Machine Learning, ICML 2012
Volume	2

Other

Other	29th International Conference on Machine Learning, ICML 2012
Country/Territory	United Kingdom
City	Edinburgh
Period	6/26/12 → 7/1/12

Cite this

Feature selection via probabilistic outputs. / Arnosti, Nicholas A.; Danyluk, Andrea Pohoreckyj.
Proceedings of the 29th International Conference on Machine Learning, ICML 2012. 2012. p. 1791-1798 (Proceedings of the 29th International Conference on Machine Learning, ICML 2012; Vol. 2).

@inproceedings{a6da3773df0e4f0ca105bbe5751fd74a,
  title = "Feature selection via probabilistic outputs",
  abstract = "This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.",
  author = "Arnosti, {Nicholas A.} and Danyluk, {Andrea Pohoreckyj}",
  year = "2012",
  language = "English (US)",
  isbn = "9781450312851",
  series = "Proceedings of the 29th International Conference on Machine Learning, ICML 2012",
  pages = "1791--1798",
  booktitle = "Proceedings of the 29th International Conference on Machine Learning, ICML 2012",
  note = "29th International Conference on Machine Learning, ICML 2012 ; Conference date: 26-06-2012 Through 01-07-2012",
}

TY  - GEN
T1  - Feature selection via probabilistic outputs
AU  - Arnosti, Nicholas A.
AU  - Danyluk, Andrea Pohoreckyj
PY  - 2012
Y1  - 2012
N2  - This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.
AB  - This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.
UR  - http://www.scopus.com/inward/record.url?scp=84867126717&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84867126717&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84867126717
SN  - 9781450312851
T3  - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
SP  - 1791
EP  - 1798
BT  - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
T2  - 29th International Conference on Machine Learning, ICML 2012
Y2  - 26 June 2012 through 1 July 2012
ER  -
