The impact of incomplete knowledge on evaluation: An experimental benchmark for protein function prediction

Curtis Huttenhower; Matthew A. Hibbs; Chad L. Myers; Amy A. Caudy; David C. Hess; Olga G. Troyanskaya

doi:10.1093/bioinformatics/btp397

Computer Science and Engineering

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Motivation: Rapidly expanding repositories of highly informative genomic data have generated increasing interest in methods for protein function prediction and inference of biological networks. The successful application of supervised machine learning to these tasks requires a gold standard for protein function: a trusted set of correct examples, which can be used to assess performance through cross-validation or other statistical approaches. Since gene annotation is incomplete for even the best studied model organisms, the biological reliability of such evaluations may be called into question.

Results: We address this concern by constructing and analyzing an experimentally based gold standard through comprehensive validation of protein function predictions for mitochondrion biogenesis in Saccharomyces cerevisiae. Specifically, we determine that (i) current machine learning approaches are able to generalize and predict novel biology from an incomplete gold standard and (ii) incomplete functional annotations adversely affect the evaluation of machine learning performance. While computational approaches performed better than predicted in the face of incomplete data, relative comparison of competing approaches - even those employing the same training data - is problematic with a sparse gold standard. Incomplete knowledge causes individual methods' performances to be differentially underestimated, resulting in misleading performance evaluations. We provide a benchmark gold standard for yeast mitochondria to complement current databases and an analysis of our experimental results in the hopes of mitigating these effects in future comparative evaluations.

Original language	English (US)
Pages (from-to)	2404-2410
Number of pages	7
Journal	Bioinformatics
Volume	25
Issue number	18
DOIs	https://doi.org/10.1093/bioinformatics/btp397
State	Published - Sep 2009

Bibliographical note

Funding Information:
Funding: National Institutes of Health (grants R01 GM071966, T32 HG003284), NSF CAREER award (DBI-0546275); National Science Foundation (grant IIS-0513552); a Google Research Award (to O.G.T.); NIGMS Center of Excellence (grant P50 GM071508).

Cite this

@article{189ef5a7d9734788b4725c6a18114637,
  title = "The impact of incomplete knowledge on evaluation: An experimental benchmark for protein function prediction",
  abstract = "Motivation: Rapidly expanding repositories of highly informative genomic data have generated increasing interest in methods for protein function prediction and inference of biological networks. The successful application of supervised machine learning to these tasks requires a gold standard for protein function: a trusted set of correct examples, which can be used to assess performance through cross-validation or other statistical approaches. Since gene annotation is incomplete for even the best studied model organisms, the biological reliability of such evaluations may be called into question. Results: We address this concern by constructing and analyzing an experimentally based gold standard through comprehensive validation of protein function predictions for mitochondrion biogenesis in Saccharomyces cerevisiae. Specifically, we determine that (i) current machine learning approaches are able to generalize and predict novel biology from an incomplete gold standard and (ii) incomplete functional annotations adversely affect the evaluation of machine learning performance. While computational approaches performed better than predicted in the face of incomplete data, relative comparison of competing approaches - even those employing the same training data - is problematic with a sparse gold standard. Incomplete knowledge causes individual methods' performances to be differentially underestimated, resulting in misleading performance evaluations. We provide a benchmark gold standard for yeast mitochondria to complement current databases and an analysis of our experimental results in the hopes of mitigating these effects in future comparative evaluations.",
  author = "Huttenhower, Curtis and Hibbs, {Matthew A.} and Myers, {Chad L.} and Caudy, {Amy A.} and Hess, {David C.} and Troyanskaya, {Olga G.}",
  note = "Funding Information: Funding: National Institutes of Health (grants R01 GM071966, T32 HG003284), NSF CAREER award (DBI-0546275); National Science Foundation (grant IIS-0513552); a Google Research Award (to O.G.T.); NIGMS Center of Excellence (grant P50 GM071508).",
  year = "2009",
  month = sep,
  doi = "10.1093/bioinformatics/btp397",
  language = "English (US)",
  volume = "25",
  pages = "2404--2410",
  journal = "Bioinformatics",
  issn = "1367-4803",
  publisher = "Oxford University Press",
  number = "18",
}

TY - JOUR
T1 - The impact of incomplete knowledge on evaluation
T2 - An experimental benchmark for protein function prediction
AU - Huttenhower, Curtis
AU - Hibbs, Matthew A.
AU - Myers, Chad L.
AU - Caudy, Amy A.
AU - Hess, David C.
AU - Troyanskaya, Olga G.
N1 - Funding Information: Funding: National Institutes of Health (grants R01 GM071966, T32 HG003284), NSF CAREER award (DBI-0546275); National Science Foundation (grant IIS-0513552); a Google Research Award (to O.G.T.); NIGMS Center of Excellence (grant P50 GM071508).
PY - 2009/9
Y1 - 2009/9
AB - Motivation: Rapidly expanding repositories of highly informative genomic data have generated increasing interest in methods for protein function prediction and inference of biological networks. The successful application of supervised machine learning to these tasks requires a gold standard for protein function: a trusted set of correct examples, which can be used to assess performance through cross-validation or other statistical approaches. Since gene annotation is incomplete for even the best studied model organisms, the biological reliability of such evaluations may be called into question. Results: We address this concern by constructing and analyzing an experimentally based gold standard through comprehensive validation of protein function predictions for mitochondrion biogenesis in Saccharomyces cerevisiae. Specifically, we determine that (i) current machine learning approaches are able to generalize and predict novel biology from an incomplete gold standard and (ii) incomplete functional annotations adversely affect the evaluation of machine learning performance. While computational approaches performed better than predicted in the face of incomplete data, relative comparison of competing approaches - even those employing the same training data - is problematic with a sparse gold standard. Incomplete knowledge causes individual methods' performances to be differentially underestimated, resulting in misleading performance evaluations. We provide a benchmark gold standard for yeast mitochondria to complement current databases and an analysis of our experimental results in the hopes of mitigating these effects in future comparative evaluations.
UR - http://www.scopus.com/inward/record.url?scp=69849104130&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=69849104130&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp397
DO - 10.1093/bioinformatics/btp397
M3 - Article
C2 - 19561015
AN - SCOPUS:69849104130
SN - 1367-4803
VL - 25
SP - 2404
EP - 2410
JO - Bioinformatics
JF - Bioinformatics
IS - 18
ER -
