Subspace differential coexpression analysis: Problem definition and a general approach

Gang Fang; Rui Kuang; Gaurav Pandey; Michael Steinbach; Chad L. Myers; Vipin Kumar

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Scopus citations

Abstract

In this paper, we study methods to identify differential coexpression patterns in case-control gene expression data. A differential coexpression pattern consists of a set of genes whose expression profiles have substantially different levels of coherence across the two sample classes, i.e., highly coherent in one class but not in the other. Biologically, a differential coexpression pattern may indicate the disruption of a regulatory mechanism, possibly caused by dysregulation of pathways or mutations of transcription factors. A common feature of all existing approaches for differential coexpression analysis is that the coexpression of a set of genes is measured on all the samples in each of the two classes, i.e., over the full space of samples. Hence, these approaches may miss patterns that cover only a subset of samples in each class, i.e., subspace patterns, which arise from the heterogeneity of the subject population and disease causes. In this paper, we extend differential coexpression analysis by defining a subspace differential coexpression pattern, i.e., a set of genes that are coexpressed in a relatively large percentage of samples in one class but in a much smaller percentage of samples in the other class. We propose a general approach based on an association analysis framework that allows exhaustive yet efficient discovery of subspace differential coexpression patterns. This approach can be used to adapt a family of biclustering algorithms into corresponding differential versions that directly discover differential coexpression patterns. Using a recently developed biclustering algorithm as an illustration, we perform experiments on cancer datasets that demonstrate the existence of subspace differential coexpression patterns. Permutation tests demonstrate the statistical significance of a large number of the discovered subspace patterns, many of which cannot be discovered if they are measured over all the samples in each class. Interestingly, in our experiments, some discovered subspace patterns overlap significantly with known cancer pathways, and some are enriched with the target gene sets of cancer-related microRNAs and transcription factors. The source code and datasets used in this paper are available at http://vk.cs.umn.edu/SDC/.
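
As a rough illustration of the definition above, the following Python sketch scores a candidate gene set by how much more often its genes are jointly "on" in one class than in the other, then estimates significance with a label-permutation test. It is a simplified sketch under stated assumptions, not the measure or algorithm from the paper: expression is assumed to be already binarized (for example, up-regulated or not), coexpression in a sample is taken to mean that all genes in the set are on, and the names `support`, `differential_support`, and `permutation_p_value` are hypothetical.

```python
import numpy as np


def support(binary, genes):
    """Fraction of samples (columns) in which every gene in `genes` is on."""
    return float(np.all(binary[genes, :], axis=0).mean())


def differential_support(binary_a, binary_b, genes):
    """Absolute difference in support between the two classes (assumed score)."""
    return abs(support(binary_a, genes) - support(binary_b, genes))


def permutation_p_value(binary_a, binary_b, genes, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling class labels across the pooled samples."""
    rng = np.random.default_rng(seed)
    observed = differential_support(binary_a, binary_b, genes)
    pooled = np.hstack([binary_a, binary_b])
    n_a = binary_a.shape[1]
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(pooled.shape[1])
        perm_a = pooled[:, order[:n_a]]
        perm_b = pooled[:, order[n_a:]]
        if differential_support(perm_a, perm_b, genes) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): 3 genes x 10 samples per class.
class_a = np.zeros((3, 10), dtype=bool)
class_a[0, :6] = True   # genes 0 and 1 are jointly on in 6 of 10 samples
class_a[1, :6] = True

class_b = np.zeros((3, 10), dtype=bool)
class_b[0, :5] = True   # gene 0 is on in samples 0-4
class_b[1, 4:] = True   # gene 1 is on in samples 4-9, so they overlap only in sample 4

gene_set = [0, 1]
score = differential_support(class_a, class_b, gene_set)
p_value = permutation_p_value(class_a, class_b, gene_set)
print(f"differential support: {score:.2f}, permutation p-value: {p_value:.3f}")
```

In the toy data, each gene is individually on in at least half of the class B samples, yet the pair is jointly on in only one of them. A score based on joint support over subsets of samples picks up this difference between the classes, whereas a per-gene frequency comparison would not.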

Original language	English (US)
Title of host publication	Pacific Symposium on Biocomputing 2010, PSB 2010
Pages	145-156
Number of pages	12
State	Published - 2010
Event	15th Pacific Symposium on Biocomputing, PSB 2010 - Kamuela, HI, United States
Duration	Jan 4 2010 → Jan 8 2010

Publication series

Name	Pacific Symposium on Biocomputing 2010, PSB 2010

Other

Other	15th Pacific Symposium on Biocomputing, PSB 2010
Country/Territory	United States
City	Kamuela, HI
Period	1/4/10 → 1/8/10

Keywords

Differential coexpression
association analysis
differential biclustering
differential network analysis

Cite this

@inproceedings{a55def7afaa5437895ccd6eadd5c6588,

title = "Subspace differential coexpression analysis: Problem definition and a general approach",

keywords = "Differential coexpression, association analysis, differential biclustering, differential network analysis",

author = "Gang Fang and Rui Kuang and Gaurav Pandey and Michael Steinbach and Myers, {Chad L.} and Vipin Kumar",

year = "2010",

language = "English (US)",

isbn = "9814295299",

series = "Pacific Symposium on Biocomputing 2010, PSB 2010",

pages = "145--156",

booktitle = "Pacific Symposium on Biocomputing 2010, PSB 2010",

note = "15th Pacific Symposium on Biocomputing, PSB 2010 ; Conference date: 04-01-2010 Through 08-01-2010",

}

TY - GEN

T1 - Subspace differential coexpression analysis

T2 - 15th Pacific Symposium on Biocomputing, PSB 2010

AU - Fang, Gang

AU - Kuang, Rui

AU - Pandey, Gaurav

AU - Steinbach, Michael

AU - Myers, Chad L.

AU - Kumar, Vipin

PY - 2010

Y1 - 2010

KW - Differential coexpression

KW - association analysis

KW - differential biclustering

KW - differential network analysis

UR - http://www.scopus.com/inward/record.url?scp=77950840239&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950840239&partnerID=8YFLogxK

M3 - Conference contribution

C2 - 19908367

AN - SCOPUS:77950840239

SN - 9814295299

SN - 9789814295291

T3 - Pacific Symposium on Biocomputing 2010, PSB 2010

SP - 145

EP - 156

BT - Pacific Symposium on Biocomputing 2010, PSB 2010

Y2 - 4 January 2010 through 8 January 2010

ER -
