Advanced omics technologies such as deep sequencing and spectral karyotyping are revealing more of cancer heterogeneity at the genetic, genomic, gene expression, epigenetic, proteomic, and metabolomic levels. As this body of data grows, data analysis, both mining and modeling, becomes critical to understanding the underlying biological processes. However, the multiple levels of heterogeneity evident within and among populations, healthy and diseased, complicate the mining and interpretation of biological data, especially when dealing with hundreds to tens of thousands of variables. Heterogeneity occurs in many diseases, including cancers, autism, and macular degeneration. In cancer, heterogeneity has hampered the search for validated biomarkers for early detection, and it has complicated the task of identifying clonal (driver) and nonclonal (nonexpanded or passenger) aberrations. We show that subtyping of cancer (classification of specimens) should precede the identification of early events of cancers. Early events in oncogenesis can be studied in histologically normal tissues from diseased individuals (HNTDI), since these tissues have most likely been exposed to the same mutagenic insults that caused the cancer in neighboring tissues. Polarity assessment of HNTDI data variables, using healthy specimens as outgroup(s), followed by parsimony phylogenetic analysis, produces a hierarchical classification of specimens that reveals the early events of disease ontogeny within its subtypes as shared derived changes (abnormal changes), or synapomorphies in phylogenetic terminology.
- Early events
- Systems biology
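The polarity assessment step described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: a variable is scored as derived (1) when a specimen's value falls outside the range observed in the healthy outgroup, and as ancestral (0) otherwise. The function names `assess_polarity` and `shared_derived`, and the toy values, are hypothetical. The parsimony tree search itself is not implemented here; group membership is simply supplied by hand to show what a shared derived change looks like.

```python
from typing import Dict, List, Sequence


def assess_polarity(
    outgroup: Sequence[Sequence[float]],
    ingroup: Dict[str, Sequence[float]],
) -> Dict[str, List[int]]:
    """Score each ingroup value as 0 (ancestral) or 1 (derived).

    Assumed rule (not taken from the source): a value is derived when it
    lies outside the [min, max] range seen in the outgroup specimens.
    """
    n_vars = len(outgroup[0])
    lows = [min(spec[j] for spec in outgroup) for j in range(n_vars)]
    highs = [max(spec[j] for spec in outgroup) for j in range(n_vars)]
    return {
        name: [0 if lows[j] <= values[j] <= highs[j] else 1 for j in range(n_vars)]
        for name, values in ingroup.items()
    }


def shared_derived(polarized: Dict[str, List[int]], group: Sequence[str]) -> List[int]:
    """Return indices of variables that are derived in every specimen of the group."""
    n_vars = len(next(iter(polarized.values())))
    return [j for j in range(n_vars) if all(polarized[name][j] == 1 for name in group)]


# Toy data (invented for illustration): three variables per specimen.
healthy = [[1.0, 5.0, 2.0], [1.2, 5.5, 2.1], [0.9, 4.8, 1.9]]
hntdi = {
    "A": [3.0, 5.1, 2.0],
    "B": [2.8, 7.0, 2.05],
    "C": [1.1, 7.2, 0.5],
}

polarized = assess_polarity(healthy, hntdi)
print(polarized)
print(shared_derived(polarized, ["A", "B"]))
```

In this sketch the resulting 0/1 matrix is what would be passed to a parsimony program, and the variables returned by `shared_derived` for a group correspond to candidate synapomorphies of that group.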