Learning on weighted hypergraphs to integrate protein interactions and gene expressions for cancer outcome prediction

Taehyun Hwang, Ze Tian, Rui Kuang, Jean Pierre Kocher

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

53 Scopus citations

Abstract

Building reliable predictive models from multiple complementary genomic data sources for cancer studies is a crucial step towards successful cancer treatment and a full understanding of the underlying biological principles. To tackle this challenging data integration problem, we propose a hypergraph-based learning algorithm called HyperGene to integrate microarray gene expressions and protein-protein interactions for cancer outcome prediction and biomarker identification. HyperGene is a robust two-step iterative method that alternately finds the optimal outcome prediction and the optimal weighting of the marker genes guided by a protein-protein interaction network. Under the hypothesis that cancer-related genes tend to interact with each other, the HyperGene algorithm uses a protein-protein interaction network as prior knowledge by imposing a consistent weighting of interacting genes. Our experimental results on two large-scale breast cancer gene expression datasets show that HyperGene utilizing a curated protein-protein interaction network achieves significantly improved cancer outcome prediction. Moreover, HyperGene can also retrieve many known cancer genes as highly weighted marker genes.
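The abstract describes the two-step alternation only in words, so the following is a minimal sketch of that general idea under stated assumptions, not the authors' formulation. It assumes each gene contributes one hyperedge grouping patient samples, swaps in the standard normalized-hypergraph score propagation for the prediction step, and uses a simple clipped gradient step with a protein-protein interaction Laplacian penalty for the weighting step. All function names, parameters (`alpha`, `rho`, `lr`), and the toy data are hypothetical.

```python
import numpy as np

def propagate_scores(H, w, y, alpha=0.5):
    """Step 1 (assumed form): with hyperedge weights w fixed, solve for
    sample scores f = (1 - alpha) * (I - alpha * Theta)^-1 y, where
    Theta = Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    d_e = H.sum(axis=0)                  # hyperedge degrees (samples per edge)
    d_v = H @ w                          # weighted vertex degrees
    S = H / np.sqrt(d_v)[:, None]        # Dv^-1/2 H
    theta = (S * (w / d_e)) @ S.T
    n = H.shape[0]
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * theta, y)

def update_weights(H, w, f, L_ppi, rho=0.1, lr=0.1, eps=1e-6):
    """Step 2 (heuristic stand-in): with scores f fixed, lower the weight of
    hyperedges whose samples disagree, while the term rho * w^T L_ppi w
    pulls the weights of interacting genes toward each other."""
    d_e = H.sum(axis=0)
    # Per-edge disagreement: (1/d_e) * sum over sample pairs of (f_u - f_v)^2
    cost = H.T @ f**2 - (H.T @ f) ** 2 / d_e
    grad = cost + 2.0 * rho * (L_ppi @ w)
    w_new = np.clip(w - lr * grad, eps, None)   # keep weights positive
    return w_new * (w.size / w_new.sum())       # keep total weight fixed

def alternate(H, y, A_ppi, n_iter=5, alpha=0.5, rho=0.1, lr=0.1):
    """Alternate the two steps for a fixed number of rounds."""
    L_ppi = np.diag(A_ppi.sum(axis=1)) - A_ppi  # graph Laplacian of the PPI network
    w = np.ones(H.shape[1])
    for _ in range(n_iter):
        f = propagate_scores(H, w, y, alpha)
        w = update_weights(H, w, f, L_ppi, rho, lr)
    return propagate_scores(H, w, y, alpha), w

# Toy data (made up): 6 samples, 4 gene-defined hyperedges.
H = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 0, 1]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)   # outcome labels
A_ppi = np.array([[0, 1, 0, 0],                    # gene 0 interacts with gene 1,
                  [1, 0, 0, 0],                    # gene 2 interacts with gene 3
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
f, w = alternate(H, y, A_ppi)
```

In this toy setup the first two hyperedges group samples that share an outcome label, while the last two mix both outcomes, so the weighting step shifts weight toward the first two. A faithful implementation would replace the heuristic weight update with the optimization defined in the paper itself.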

Original language: English (US)
Title of host publication: Proceedings - 8th IEEE International Conference on Data Mining, ICDM 2008
Pages: 293-302
Number of pages: 10
State: Published - Dec 1 2008
Event: 8th IEEE International Conference on Data Mining, ICDM 2008 - Pisa, Italy
Duration: Dec 15 2008 to Dec 19 2008

