As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advance in high-throughput gene expression microarray technology has inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access of data from many experiments have caused inconsistencies which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard microarray Gene Expression Data (MGED) ontology(MO). In this paper, we developed BCM-CO, anontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to the breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to indexed breast cancer microarray samples from GEO using BCMCO and MGED ontology and developed a prototype system with web interface that allows the retrieval of microarray data based on the ontology annotations.
|Original language||English (US)|
|Number of pages||5|
|Journal||AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium|
|State||Published - 2008|