Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene-environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.
Bibliographical note
Funding Information:
The project described was partially supported by NIH grants UL1TR000427, 1U01HG006389, HL065962, LM010040, HG006385, LM009012, LM010098, and AI116794. Additional funding was provided by F31 HG008588. We also acknowledge members of the former Center for Human Genetics Research (Jonathan L. Haines, Jacob H. McCauley, Dana C. Crawford, and William S. Bush) who were involved in the early planning and implementation of PLATO.