TY - GEN
T1 - Module-based biomarker discovery in breast cancer
AU - Zhang, Yuji
AU - Xuan, Jason J.
AU - Clarke, Robert
AU - Ressom, Habtom W.
PY - 2010
Y1 - 2010
AB - The availability of genome-wide biological network data opens up new possibilities for discovering novel biomarkers and elucidating complex cancer-related mechanisms at the network level. In this paper, we propose a novel module-based feature selection framework that integrates biological network information and gene expression data to identify biomarkers not as individual genes but as functional modules. We also present a large-scale analysis of the ensemble feature selection concept. This method combines features selected across multiple runs on different data subsamples to increase the reliability and classification accuracy of the final set of selected features. The results from four breast cancer studies demonstrate that the identified module biomarkers achieve: i) higher classification accuracy in independent validation datasets; ii) better reproducibility than individual gene biomarkers; iii) improved biological interpretability; and iv) enhanced enrichment in cancer-related "disease drivers".
UR - http://www.scopus.com/inward/record.url?scp=79952421691&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952421691&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2010.5706590
DO - 10.1109/BIBM.2010.5706590
M3 - Conference contribution
AN - SCOPUS:79952421691
SN - 9781424483075
T3 - Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
SP - 352
EP - 356
BT - Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
T2 - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Y2 - 18 December 2010 through 21 December 2010
ER -