New methods for separating causes from effects in genomics data.

Alexander Statnikov; Mikael Henaff; Nikita I. Lytkin; Constantin F. Aliferis

doi:10.1186/1471-2164-13-s8-s22

Institute for Health Informatics

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

The discovery of molecular pathways is a challenging problem, and its solution relies on the identification of causal molecular interactions in genomics data. Causal molecular interactions can be discovered using randomized experiments; however, such experiments are often costly, infeasible, or unethical. Fortunately, algorithms that infer causal interactions from observational data have been in development for decades, predominantly in the quantitative sciences, and many of them have recently been applied to genomics data. While these algorithms can infer unoriented causal interactions between the molecular variables involved (i.e., without specifying which one is the cause and which one is the effect), causally orienting all inferred molecular interactions was assumed to be an unsolvable problem until recently. In this work, we use transcription factor-target gene regulatory interactions in three different organisms to evaluate a new family of methods that, given observational data for just two causally related variables, can determine which one is the cause and which one is the effect. We have found that a particular family of causal orientation methods (IGCI Gaussian) is often able to accurately infer the directionality of causal interactions, and that these methods usually outperform other causal orientation techniques. We also introduced a novel ensemble technique for causal orientation that combines the decisions of individual causal orientation methods. The ensemble method was found to be more accurate than the best individual causal orientation method in the tested data. This work represents a first step towards establishing context for practical use of causal orientation methods in the genomics domain. We have found that some causal orientation methodologies yield accurate predictions of causal orientation in genomics data, and we have improved on this capability with a novel ensemble method. Our results suggest that these methods have the potential to facilitate reconstruction of molecular pathways by minimizing the number of randomized experiments required to find causal directionality and by avoiding experiments that are infeasible and/or unethical.
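The abstract names the method family but does not spell out the computation, so the following is only a minimal sketch under stated assumptions. It assumes that "IGCI Gaussian" refers to information-geometric causal inference with a Gaussian reference measure (implemented here by standardizing both variables) and uses the slope-based estimator from the IGCI literature. The `majority_vote` function is a hypothetical stand-in for the ensemble step: the abstract says only that the ensemble "combines decisions of individual causal orientation methods", so a simple majority vote is an illustrative assumption, not the authors' procedure. All function names are invented for this sketch.

```python
import math


def standardize(values):
    """Rescale to zero mean and unit sample variance (assumed Gaussian reference)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


def slope_score(cause, effect):
    """Mean log absolute slope of effect against cause, over points sorted by cause."""
    pairs = sorted(zip(cause, effect))
    logs = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, where the log-slope is undefined
            logs.append(math.log(abs(dy / dx)))
    return sum(logs) / len(logs) if logs else 0.0


def igci_gaussian(x, y):
    """Orient a pair: the direction with the smaller slope score is preferred."""
    xs, ys = standardize(x), standardize(y)
    diff = slope_score(xs, ys) - slope_score(ys, xs)
    if diff < 0:
        return "x->y"
    if diff > 0:
        return "y->x"
    return "undecided"


def majority_vote(decisions):
    """Hypothetical ensemble rule: pick the direction most methods agree on."""
    forward = decisions.count("x->y")
    backward = decisions.count("y->x")
    if forward > backward:
        return "x->y"
    if backward > forward:
        return "y->x"
    return "undecided"


if __name__ == "__main__":
    # Synthetic, noise-free toy pair (not genomics data): y is a cubic function of x.
    x = [i / 200 for i in range(201)]
    y = [v ** 3 for v in x]
    print(igci_gaussian(x, y))
    print(majority_vote(["x->y", "y->x", "x->y"]))
```

On real expression data the inputs would be the measured levels of a transcription factor and a candidate target gene across samples; the toy pair above only illustrates the mechanics.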

Original language	English (US)
Journal	BMC Genomics
Volume	13 Suppl 8
DOIs	https://doi.org/10.1186/1471-2164-13-s8-s22
State	Published - 2012

Bibliographical note

Funding Information:
The authors would like to acknowledge Dominik Janzing and Joris M. Mooij, who contributed to the development of the majority of the causal orientation methods used in this study, and thank them for providing (i) software implementations of causal orientation algorithms, (ii) help with stating the assumptions of the tested methods, (iii) ideas about the statistical significance testing approach, and (iv) feedback on other aspects of the manuscript and, in particular, on the interpretation of the results. The authors are also grateful to Efstratios Efstathiadis and Eric Peskin for their help with providing access to, and running experiments on, the high performance computing facility. Finally, the authors would like to thank Ioannis Aifantis for providing the experimental data for NOTCH1 that was used for the development of the corresponding gold standard. The empirical evaluation was supported in part by grants 1R01LM011179-01A1 from the National Library of Medicine and 1UL1RR029893 from the National Center for Research Resources, National Institutes of Health. This article has been published as part of BMC Genomics Volume 13 Supplement 8, 2012: Proceedings of The International Conference on Intelligent Biology and Medicine (ICIBM): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/13/S8.

Cite this

@article{55b2b18acf0f4d04b01d0583ae6fcb30,

title = "New methods for separating causes from effects in genomics data.",

author = "Alexander Statnikov and Mikael Henaff and Lytkin, {Nikita I.} and Aliferis, {Constantin F.}",

year = "2012",

doi = "10.1186/1471-2164-13-s8-s22",

language = "English (US)",

volume = "13 Suppl 8",

journal = "BMC Genomics",

issn = "1471-2164",

publisher = "BioMed Central",

}

TY - JOUR

T1 - New methods for separating causes from effects in genomics data.

AU - Statnikov, Alexander

AU - Henaff, Mikael

AU - Lytkin, Nikita I.

AU - Aliferis, Constantin F.

PY - 2012

Y1 - 2012

UR - http://www.scopus.com/inward/record.url?scp=84878796361&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878796361&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-13-s8-s22

DO - 10.1186/1471-2164-13-s8-s22

M3 - Article

C2 - 23282373

AN - SCOPUS:84878796361

SN - 1471-2164

VL - 13 Suppl 8

JO - BMC Genomics

JF - BMC Genomics

ER -
