Algorithms for discovery of multiple Markov boundaries

Alexander Statnikov, Nikita I. Lytkin, Jan Lemeire, Constantin F. Aliferis

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight into local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms, TIE*, that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.
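
As a rough illustration of how a distribution can have more than one Markov boundary, the sketch below builds a toy joint distribution that is not taken from the paper and is not the TIE* algorithm. It assumes three hypothetical binary variables: a response T, a predictor A, and a predictor B that is a deterministic relabeling of A (so A and B are information equivalent with respect to T). The helper names `joint`, `marginal`, and `cond_indep` are made up for this example. Exact arithmetic shows that T is independent of B given A and of A given B, yet T depends on each of them marginally, so {A} and {B} are both Markov boundaries and the intersection property fails.

```python
from fractions import Fraction
from itertools import product

NAMES = ("T", "A", "B")  # hypothetical variable names for this toy example


def joint():
    """Toy joint P(T, A, B): A is a fair coin, B = 1 - A, T equals A w.p. 3/4."""
    p = {}
    for a, t in product((0, 1), repeat=2):
        b = 1 - a  # B carries exactly the same information as A
        p_t_given_a = Fraction(3, 4) if t == a else Fraction(1, 4)
        p[(t, a, b)] = Fraction(1, 2) * p_t_given_a
    return p


def marginal(p, keep):
    """Sum the joint over every variable not listed in `keep`."""
    idx = [NAMES.index(name) for name in keep]
    out = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + prob
    return out


def cond_indep(p, x, y, z):
    """Exact test that x is independent of y given the list z.

    Uses the identity P(x, y, z) * P(z) == P(x, z) * P(y, z) for all values.
    """
    pxyz = marginal(p, [x, y] + z)
    pxz = marginal(p, [x] + z)
    pyz = marginal(p, [y] + z)
    pz = marginal(p, z)
    for vals in product((0, 1), repeat=2 + len(z)):
        vx, vy, vz = vals[0], vals[1], vals[2:]
        lhs = pxyz.get(vals, 0) * pz.get(vz, 0)
        rhs = pxz.get((vx,) + vz, 0) * pyz.get((vy,) + vz, 0)
        if lhs != rhs:
            return False
    return True


p = joint()
# {A} shields T from B, and {B} shields T from A ...
a_is_blanket = cond_indep(p, "T", "B", ["A"])
b_is_blanket = cond_indep(p, "T", "A", ["B"])
# ... but the empty set shields T from neither, so both blankets are minimal.
empty_is_blanket = cond_indep(p, "T", "A", []) or cond_indep(p, "T", "B", [])
print(a_is_blanket, b_is_blanket, empty_is_blanket)
```

A single-boundary method run on such data could return either {A} or {B} depending on tie-breaking, which is the situation the abstract describes as motivating the discovery of all Markov boundaries.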

Original language: English (US)
Pages (from-to): 499-566
Number of pages: 68
Journal: Journal of Machine Learning Research
Volume: 14
Issue number: 1
State: Published - Feb 2013
Externally published: Yes

Keywords

  • Information equivalence
  • Markov boundary discovery
  • Variable/feature selection
  • Violations of faithfulness
