Gaussian copula precision estimation with missing values

Huahua Wang, Faridel Fazayeli, Soumyadeep Chatterjee, Arindam Banerjee

Research output: Contribution to journal › Conference article › peer-review

13 Scopus citations

Abstract

We consider the problem of estimating the sparse precision matrix of Gaussian copula distributions using samples with missing values in high dimensions. Existing approaches, primarily designed for Gaussian distributions, suggest using plugin estimators that disregard the missing values. In this paper, we propose double plugin Gaussian (DoPinG) copula estimators to estimate the sparse precision matrix corresponding to non-paranormal distributions. DoPinG uses two plugin procedures and consists of three steps: (1) estimate nonparametric correlations based on observed values, including Kendall's tau and Spearman's rho; (2) estimate the non-paranormal correlation matrix; (3) plug into existing sparse precision estimators. We prove that DoPinG copula estimators consistently estimate the non-paranormal correlation matrix at a rate of $O\left(\frac{1}{1-\delta}\sqrt{\frac{\log p}{n}}\right)$, where $\delta$ is the probability of missing values, $p$ is the dimension, and $n$ is the sample size. We provide experimental results to illustrate the effect of sample size and percentage of missing data on the model performance. Experimental results show that DoPinG is significantly better than estimators like mGlasso, which are primarily designed for Gaussian data.
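Steps (1) and (2) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes missing entries are encoded as `None`, that Kendall's tau for each pair of variables is computed over the samples where both variables are observed, and that the correlation is recovered with the standard Gaussian copula relation sin(π/2 · τ), which the abstract does not spell out. The function names `kendall_tau_observed`, `copula_correlation`, and `simulate` are hypothetical and introduced only for this example.

```python
import math
import random


def kendall_tau_observed(x, y):
    """Kendall's tau over the samples where both x and y are observed (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    m = len(pairs)
    if m < 2:
        raise ValueError("need at least two jointly observed samples")
    s = 0
    for i in range(m):
        for j in range(i + 1, m):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return 2.0 * s / (m * (m - 1))


def copula_correlation(columns):
    """Step (1): pairwise tau on observed values. Step (2): sin(pi/2 * tau).

    Assumption: the sine transform is the usual Gaussian copula relation
    between Kendall's tau and the latent correlation.
    """
    p = len(columns)
    corr = [[1.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j + 1, p):
            tau = kendall_tau_observed(columns[j], columns[k])
            corr[j][k] = corr[k][j] = math.sin(0.5 * math.pi * tau)
    return corr


def simulate(n, rho, delta, seed=0):
    """Hypothetical test data: two Gaussian copula variables with latent
    correlation rho, monotone marginal transforms, and each entry missing
    independently with probability delta."""
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v1, v2 = math.exp(z1), z2 ** 3  # monotone maps leave the ranks unchanged
        x1.append(None if rng.random() < delta else v1)
        x2.append(None if rng.random() < delta else v2)
    return [x1, x2]


if __name__ == "__main__":
    est = copula_correlation(simulate(n=600, rho=0.6, delta=0.2))
    print(est[0][1])  # estimate of the latent correlation (true value 0.6)
```

Step (3) is not shown. The resulting matrix would be handed to an existing sparse precision estimator such as the graphical lasso. Because each entry is estimated from a different subset of samples, the matrix is not guaranteed to be positive semidefinite, so a projection step may be needed first.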

Original language: English (US)
Pages (from-to): 978-986
Number of pages: 9
Journal: Journal of Machine Learning Research
Volume: 33
State: Published - 2014
Event: 17th International Conference on Artificial Intelligence and Statistics, AISTATS 2014 - Reykjavik, Iceland
Duration: Apr 22 2014 - Apr 25 2014
