Handling very large numbers of association rules in the analysis of microarray data

Alexander Tuzhilin, Gediminas Adomavicius

Research output: Contribution to conference › Paper › peer-review

45 Scopus citations

Abstract

The analysis of microarray data has become one of the important topics in bioinformatics over the past several years, and different data mining techniques have been proposed for the analysis of such data. In this paper, we propose to use association rule discovery methods for determining associations among expression levels of different genes. One of the main problems related to the discovery of these associations is scalability. Microarrays usually contain very large numbers of genes, sometimes measured in the tens of thousands. Therefore, analysis of such data can generate a very large number of associations, often measured in the millions. The paper addresses this problem by presenting a method that enables biologists to evaluate these very large numbers of discovered association rules during the post-analysis stage of the data mining process. This is achieved by providing several rule evaluation operators, including rule grouping, filtering, browsing, and data inspection operators, that allow biologists to validate multiple individual gene regulation patterns at a time. By iteratively applying these operators, biologists can explore a significant part of all the initially generated rules in an acceptable period of time and thus answer the biological questions that are of particular interest to them. To validate our method, we tested our system on microarray data from studies of environmental hazards and their influence on gene expression processes. As a result, we managed to answer several questions that were of interest to the biologists who had collected these data.
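As a rough illustration of what filtering and grouping operators over discovered rules might look like, here is a minimal Python sketch. It is not the system described in the paper: the `Rule` class, the `filter_rules` and `group_rules` functions, the gene names, and the choice to group rules by the set of genes they mention are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical association rule over (gene, expression level) items."""
    body: frozenset      # antecedent items, e.g. {("G1", "up")}
    head: tuple          # consequent item, e.g. ("G2", "down")
    support: float
    confidence: float


def genes_of(rule):
    """Return the set of genes a rule mentions, ignoring expression levels."""
    return frozenset(gene for gene, _ in rule.body) | {rule.head[0]}


def filter_rules(rules, min_support=0.0, min_confidence=0.0, must_mention=None):
    """Keep rules that meet the thresholds and, optionally, mention a given gene."""
    kept = []
    for rule in rules:
        if rule.support < min_support or rule.confidence < min_confidence:
            continue
        if must_mention is not None and must_mention not in genes_of(rule):
            continue
        kept.append(rule)
    return kept


def group_rules(rules):
    """Group rules by the set of genes they mention (an illustrative criterion)."""
    groups = defaultdict(list)
    for rule in rules:
        groups[genes_of(rule)].append(rule)
    return dict(groups)


# Made-up rules, for illustration only.
rules = [
    Rule(frozenset({("G1", "up")}), ("G2", "down"), 0.30, 0.90),
    Rule(frozenset({("G1", "down")}), ("G2", "up"), 0.25, 0.80),
    Rule(frozenset({("G3", "up")}), ("G4", "up"), 0.05, 0.60),
]

kept = filter_rules(rules, min_support=0.1, min_confidence=0.7)
groups = group_rules(kept)
for gene_set, members in groups.items():
    print(sorted(gene_set), len(members))
```

Grouping first and then accepting or rejecting whole groups is one way an expert could review many rules at once instead of inspecting them one by one.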

Original language: English (US)
Pages: 396-404
Number of pages: 9
State: Published - 2002
Event: KDD - 2002 Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Edmonton, Alta, Canada
Duration: Jul 23 2002 to Jul 26 2002

Other

Other: KDD - 2002 Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory: Canada
City: Edmonton, Alta
Period: 7/23/02 to 7/26/02

Keywords

  • Analysis of microarray data
  • Association rules
  • Bioinformatics
  • Expert-driven rule validation
  • Post-processing of discovered rules
  • Rule filtering
  • Rule grouping
