Algorithms for discovery of multiple Markov boundaries

Alexander Statnikov; Nikita I. Lytkin; Jan Lemeire; Constantin F. Aliferis

Research output: Contribution to journal › Article › peer-review

Abstract

Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.

Original language	English (US)
Pages (from-to)	499-566
Number of pages	68
Journal	Journal of Machine Learning Research
Volume	14
Issue number	1
State	Published - Feb 2013
Externally published	Yes

Keywords

Information equivalence
Markov boundary discovery
Variable/feature selection
Violations of faithfulness

Cite this

@article{1d15a809b48743b2b69f5150bb72b2ce,

title = "Algorithms for discovery of multiple Markov boundaries",

abstract = "Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.",

keywords = "Information equivalence, Markov boundary discovery, Variable/feature selection, Violations of faithfulness",

author = "Alexander Statnikov and Lytkin, {Nikita I.} and Jan Lemeire and Aliferis, {Constantin F.}",

year = "2013",

month = feb,

language = "English (US)",

volume = "14",

pages = "499--566",

journal = "Journal of Machine Learning Research",

issn = "1532-4435",

publisher = "Microtome Publishing",

number = "1",

}

TY - JOUR

T1 - Algorithms for discovery of multiple Markov boundaries

AU - Statnikov, Alexander

AU - Lytkin, Nikita I.

AU - Lemeire, Jan

AU - Aliferis, Constantin F.

PY - 2013/2

Y1 - 2013/2

N2 - Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.

AB - Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.

KW - Information equivalence

KW - Markov boundary discovery

KW - Variable/feature selection

KW - Violations of faithfulness

UR - http://www.scopus.com/inward/record.url?scp=84875199703&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875199703&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84875199703

SN - 1532-4435

VL - 14

SP - 499

EP - 566

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

IS - 1

ER -
