TY - JOUR
T1 - Automatic instance selection via locality constrained sparse representation for missing value estimation
AU - Feng, Xiaodong
AU - Wu, Sen
AU - Srivastava, Jaideep
AU - Desikan, Prasanna
PY - 2015/9/1
Y1 - 2015/9/1
N2 - Missing values in real-world applications can significantly distort the results of knowledge discovery, so it is vital to estimate the unknown data accurately. This paper applies sparse representation to improve the quality of missing value estimation. First, a novel sparse representation scheme called locality constrained sparse representation (LCSR) is presented, which introduces locality l1-norm and l2-norm regularization. By taking advantage of sparsity, smoothness, and locality structure, LCSR can automatically select instances and avoid overfitting. Then LCSR-based missing value estimation (LCSR-MVE) is proposed, which estimates the unobserved values as a linear combination of atoms selected automatically from a dictionary through the sparsity of the reconstruction coefficient vector; three dictionary construction methods are also developed. The proposed LCSR-MVE is evaluated on 6 datasets from UCI and gene expression databases and compared with other instance-based missing value estimation methods. Results show that LCSR-MVE outperforms other state-of-the-art methods in terms of normalized root mean squared error (NRMSE) and is not very sensitive to the dictionary size or the regularization parameters.
AB - Missing values in real-world applications can significantly distort the results of knowledge discovery, so it is vital to estimate the unknown data accurately. This paper applies sparse representation to improve the quality of missing value estimation. First, a novel sparse representation scheme called locality constrained sparse representation (LCSR) is presented, which introduces locality l1-norm and l2-norm regularization. By taking advantage of sparsity, smoothness, and locality structure, LCSR can automatically select instances and avoid overfitting. Then LCSR-based missing value estimation (LCSR-MVE) is proposed, which estimates the unobserved values as a linear combination of atoms selected automatically from a dictionary through the sparsity of the reconstruction coefficient vector; three dictionary construction methods are also developed. The proposed LCSR-MVE is evaluated on 6 datasets from UCI and gene expression databases and compared with other instance-based missing value estimation methods. Results show that LCSR-MVE outperforms other state-of-the-art methods in terms of normalized root mean squared error (NRMSE) and is not very sensitive to the dictionary size or the regularization parameters.
KW - Instance selection
KW - Locality constrained regularization
KW - Missing value estimation
KW - Sparse representation
UR - http://www.scopus.com/inward/record.url?scp=84937516978&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937516978&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2015.05.007
DO - 10.1016/j.knosys.2015.05.007
M3 - Article
AN - SCOPUS:84937516978
VL - 85
SP - 210
EP - 223
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
SN - 0950-7051
ER -