Inference after model selection

Xiaotong Shen; Hsin-Cheng Huang; Jimmy Ye

doi:10.1198/016214504000001097

Statistics (Twin Cities)

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Typical modeling strategies involve model selection, which has a significant effect on inference of estimated parameters. Common practice is to use a selected model ignoring uncertainty introduced by the process of model selection. This could yield overoptimistic inferences, resulting in false discovery. In this article we develop a general methodology via optimal approximation for estimating the mean and variance of complex statistics that involve the process of model selection. This allows us to make approximately unbiased inferences, taking into account the selection process. We examine the operating characteristics of the proposed methodology via asymptotic analyses and simulations. These results show that the proposed methodology yields correct inferences and outperforms common alternatives.

Original language	English (US)
Pages (from-to)	751-762
Number of pages	12
Journal	Journal of the American Statistical Association
Volume	99
Issue number	467
DOIs	https://doi.org/10.1198/016214504000001097
State	Published - Sep 2004

Bibliographical note

Funding Information:
Xiaotong Shen is Professor, School of Statistics, University of Minnesota, Minneapolis, MN 55455 (E-mail: xshen@stat.umn.edu). Hsin-Cheng Huang is Associate Research Fellow, Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan (E-mail: hchuang@stat.sinica.edu.tw). Jimmy Ye is Associate Professor, Stan Ross Department of Accountancy, Baruch College, City University of New York, NY 10010 (E-mail: jimmy_ye@baruch.cuny.edu). Shen was supported in part by National Science Foundation grants IIS-0328802 and DMS-00-72635. Ye was supported by the Zicklin School of Business, Baruch College. The authors thank the editor, the associate editor, and two anonymous referees for helpful comments and suggestions.

Keywords

Bootstrap
Nonparametric
Parametric
Variable selection
Wavelet thresholding

Cite this

@article{784623bbf4524a9fa403ce65639529a3,
title = "Inference after model selection",
abstract = "Typical modeling strategies involve model selection, which has a significant effect on inference of estimated parameters. Common practice is to use a selected model ignoring uncertainty introduced by the process of model selection. This could yield overoptimistic inferences, resulting in false discovery. In this article we develop a general methodology via optimal approximation for estimating the mean and variance of complex statistics that involve the process of model selection. This allows us to make approximately unbiased inferences, taking into account the selection process. We examine the operating characteristics of the proposed methodology via asymptotic analyses and simulations. These results show that the proposed methodology yields correct inferences and outperforms common alternatives.",
keywords = "Bootstrap, Nonparametric, Parametric, Variable selection, Wavelet thresholding",
author = "Shen, Xiaotong and Huang, {Hsin Cheng} and Ye, Jimmy",
note = "Funding Information: Xiaotong Shen is Professor, School of Statistics, University of Minnesota, Minneapolis, MN 55455 (E-mail: xshen@stat.umn.edu). Hsin-Cheng Huang is Associate Research Fellow, Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan (E-mail: hchuang@stat.sinica.edu.tw). Jimmy Ye is Associate Professor, Stan Ross Department of Accountancy, Baruch College, City University of New York, NY 10010 (E-mail: jimmy_ye@baruch.cuny.edu). Shen was supported in part by National Science Foundation grants IIS-0328802 and DMS-00-72635. Ye was supported by the Zicklin School of Business, Baruch College. The authors thank the editor, the associate editor, and two anonymous referees for helpful comments and suggestions.",
year = "2004",
month = sep,
doi = "10.1198/016214504000001097",
language = "English (US)",
volume = "99",
pages = "751--762",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "467",
}

TY  - JOUR
T1  - Inference after model selection
AU  - Shen, Xiaotong
AU  - Huang, Hsin Cheng
AU  - Ye, Jimmy
N1  - Funding Information: Xiaotong Shen is Professor, School of Statistics, University of Minnesota, Minneapolis, MN 55455 (E-mail: xshen@stat.umn.edu). Hsin-Cheng Huang is Associate Research Fellow, Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan (E-mail: hchuang@stat.sinica.edu.tw). Jimmy Ye is Associate Professor, Stan Ross Department of Accountancy, Baruch College, City University of New York, NY 10010 (E-mail: jimmy_ye@baruch.cuny.edu). Shen was supported in part by National Science Foundation grants IIS-0328802 and DMS-00-72635. Ye was supported by the Zicklin School of Business, Baruch College. The authors thank the editor, the associate editor, and two anonymous referees for helpful comments and suggestions.
PY  - 2004/9
Y1  - 2004/9
N2  - Typical modeling strategies involve model selection, which has a significant effect on inference of estimated parameters. Common practice is to use a selected model ignoring uncertainty introduced by the process of model selection. This could yield overoptimistic inferences, resulting in false discovery. In this article we develop a general methodology via optimal approximation for estimating the mean and variance of complex statistics that involve the process of model selection. This allows us to make approximately unbiased inferences, taking into account the selection process. We examine the operating characteristics of the proposed methodology via asymptotic analyses and simulations. These results show that the proposed methodology yields correct inferences and outperforms common alternatives.
AB  - Typical modeling strategies involve model selection, which has a significant effect on inference of estimated parameters. Common practice is to use a selected model ignoring uncertainty introduced by the process of model selection. This could yield overoptimistic inferences, resulting in false discovery. In this article we develop a general methodology via optimal approximation for estimating the mean and variance of complex statistics that involve the process of model selection. This allows us to make approximately unbiased inferences, taking into account the selection process. We examine the operating characteristics of the proposed methodology via asymptotic analyses and simulations. These results show that the proposed methodology yields correct inferences and outperforms common alternatives.
KW  - Bootstrap
KW  - Nonparametric
KW  - Parametric
KW  - Variable selection
KW  - Wavelet thresholding
UR  - http://www.scopus.com/inward/record.url?scp=4944259966&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=4944259966&partnerID=8YFLogxK
U2  - 10.1198/016214504000001097
DO  - 10.1198/016214504000001097
M3  - Article
AN  - SCOPUS:4944259966
SN  - 0162-1459
VL  - 99
SP  - 751
EP  - 762
JO  - Journal of the American Statistical Association
JF  - Journal of the American Statistical Association
IS  - 467
ER  -
