Gene Ontology (GO) terms are often used to interpret the results of microarray experiments. The most common approach is to use Fisher's exact test to find gene sets annotated by GO terms that are over-represented among the genes declared differentially expressed in the analysis of microarray data. Another approach is Gene Set Enrichment Analysis (GSEA), which uses predefined gene sets and gene ranks to identify significant biological changes in microarray data sets. However, after correction for multiple hypothesis testing, few (or no) GO terms may meet the threshold for statistical significance, because the relevant biological differences are small relative to the noise inherent in microarray technology. In addition to individual GO terms, we propose testing gene sets constructed as intersections of GO terms, Kyoto Encyclopedia of Genes and Genomes Orthology (KO) terms, and gene sets constructed using gene-gene interaction data obtained from the ENTREZ database. Our method finds gene sets significantly over-represented among differentially expressed genes that standard enrichment testing of individual GO and KO terms cannot find, thus improving the enrichment analysis of microarray data.
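The idea of testing intersections alongside individual terms can be illustrated with a minimal sketch. It is not the authors' implementation: the gene identifiers, the term names `GO:term_A` and `KO:term_B`, the helper functions, and the choice of Bonferroni correction are all assumptions made for illustration, and gene-gene interaction data are left out. The one-sided Fisher's exact test for over-representation is computed here as the upper tail of the hypergeometric distribution.

```python
from itertools import combinations
from math import comb


def enrichment_p(gene_set, de_genes, universe):
    """One-sided Fisher's exact test p-value for over-representation,
    computed as the hypergeometric upper tail P(X >= k)."""
    N = len(universe)                        # all genes on the array
    K = len(gene_set & universe)             # genes in the tested set
    n = len(de_genes & universe)             # differentially expressed genes
    k = len(gene_set & de_genes & universe)  # overlap of the two
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def intersection_sets(term_sets, min_size=2):
    """Build candidate gene sets as pairwise intersections of annotated terms."""
    out = {}
    for (a, genes_a), (b, genes_b) in combinations(sorted(term_sets.items()), 2):
        common = genes_a & genes_b
        if len(common) >= min_size:
            out[f"{a} & {b}"] = common
    return out


def bonferroni(pvals):
    """Bonferroni adjustment (an assumed choice of correction)."""
    m = len(pvals)
    return {name: min(1.0, p * m) for name, p in pvals.items()}


# Toy data: 20 genes, two hypothetical terms, 4 differentially expressed genes.
universe = {f"g{i}" for i in range(1, 21)}
term_sets = {
    "GO:term_A": {f"g{i}" for i in range(1, 9)},   # g1..g8
    "KO:term_B": {f"g{i}" for i in range(5, 13)},  # g5..g12
}
de_genes = {"g5", "g6", "g7", "g8"}

# Test the individual terms together with their intersections.
candidates = {**term_sets, **intersection_sets(term_sets)}
raw = {name: enrichment_p(genes, de_genes, universe) for name, genes in candidates.items()}
adjusted = bonferroni(raw)

for name, p in sorted(adjusted.items(), key=lambda item: item[1]):
    print(f"{name}: adjusted p = {p:.5f}")
```

In this toy example all four differentially expressed genes fall in the intersection of the two terms, so the intersection receives a smaller adjusted p-value than either term alone, even though adding it increases the number of tests being corrected for.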
Bibliographical note
Funding Information:
This research was supported by the Slovenian Ministry of Higher Education, Science and Technology. We are grateful to Filip Železný for previous joint work on the use of GO and gene-gene interaction data in the analysis of microarray data, which largely stimulated this work. We also thank two anonymous reviewers, whose numerous useful suggestions enabled us to improve the quality of this paper.
- Gene set enrichment
- Microarray data analysis