TY - JOUR
T1 - Adaptive testing for association between two random vectors in moderate to high dimensions
AU - Xu, Zhiyuan
AU - Xu, Gongjun
AU - Pan, Wei
N1 - Publisher Copyright: © 2017 WILEY PERIODICALS, INC.
PY - 2017/11
Y1 - 2017/11
N2 - Testing for association between two random vectors is a common and important task in many fields. However, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often expected, with only a few, but not many, variables associated with each other. We generalize the RV test to moderate-to-high dimensions. The key idea is to weight each variable pair data-adaptively based on its empirical association. As a consequence, the proposed test is adaptive, alleviating the effects of noise accumulation in high-dimensional data and thus maintaining power for both dense and sparse alternative hypotheses. We show the connections between the proposed test and several existing tests, such as a generalized estimating equations-based adaptive test, multivariate kernel machine regression (KMR), and kernel distance methods. Furthermore, we modify the proposed adaptive test so that it can be powerful for nonlinear or nonmonotonic associations. We use both real data and simulated data to demonstrate the advantages and usefulness of the proposed new test. The new test is freely available in the R package aSPC on CRAN at https://cran.r-project.org/web/packages/aSPC/index.html and https://github.com/jasonzyx/aSPC.
AB - Testing for association between two random vectors is a common and important task in many fields. However, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often expected, with only a few, but not many, variables associated with each other. We generalize the RV test to moderate-to-high dimensions. The key idea is to weight each variable pair data-adaptively based on its empirical association. As a consequence, the proposed test is adaptive, alleviating the effects of noise accumulation in high-dimensional data and thus maintaining power for both dense and sparse alternative hypotheses. We show the connections between the proposed test and several existing tests, such as a generalized estimating equations-based adaptive test, multivariate kernel machine regression (KMR), and kernel distance methods. Furthermore, we modify the proposed adaptive test so that it can be powerful for nonlinear or nonmonotonic associations. We use both real data and simulated data to demonstrate the advantages and usefulness of the proposed new test. The new test is freely available in the R package aSPC on CRAN at https://cran.r-project.org/web/packages/aSPC/index.html and https://github.com/jasonzyx/aSPC.
KW - GEE-aSPU test
KW - RV test
KW - aSPC test
KW - dCov test
KW - eQTL
UR - http://www.scopus.com/inward/record.url?scp=85024867228&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85024867228&partnerID=8YFLogxK
U2 - 10.1002/gepi.22059
DO - 10.1002/gepi.22059
M3 - Article
C2 - 28714590
AN - SCOPUS:85024867228
SN - 0741-0395
VL - 41
SP - 599
EP - 609
JO - Genetic Epidemiology
JF - Genetic Epidemiology
IS - 7
ER -