Validation sequence optimization: A theoretical approach

Gediminas Adomavicius, Alexander Tuzhilin

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

The need to validate large amounts of data with the help of the domain expert arises naturally in many data-intensive applications, including data mining, data stream, and database-related applications. This paper presents a general validation approach that generalizes different expert-driven validation methods developed for specialized validation problems. In particular, we model the validation process as a sequence of validation operators, explore various properties of such sequences, and present theoretical results that provide for better understanding of the validation process. We also address the problem of selecting the best validation sequence among the class of equivalent sequence permutations. We demonstrate that this optimization problem is NP-hard and present two heuristic algorithms for improving validation sequences.

Original language: English (US)
Pages (from-to): 185-200
Number of pages: 16
Journal: INFORMS Journal on Computing
Volume: 19
Issue number: 2
State: Published - 2007

Keywords

  • Computational complexity
  • Data mining
  • Dynamic programming
  • Heuristic algorithms
  • Sequence optimization
  • Validation
  • Validation operators
  • Validation sequences
