Background: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods.
Results: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception.
Conclusions: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.
Bibliographical note
Funding Information:
Genome sequencing, assembly, and annotation were funded by National Human Genome Research Institute grant U54 HG003273 to R.A.G. GWCT and MWH are funded by NSF DBI-1564611. ED was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 281125614 / GRK2220. RMW, PI, and EMZ were funded by the Swiss National Science Foundation (PP00P3_170664 to RMW, 31003A_143936 to EMZ). Contributions by DDM and AW were supported in part by NSF-DEB grant 1355169 and USDA-APHIS Cooperative Agreement 15-8130-0547-CA to DDM. BM and ON acknowledge the German Research Foundation (NI 1387/3-1, MI 649/12-1) and the Leibniz Graduate School on Genomic Biodiversity Research. CS was supported by the Blanton J. Whitmire endowment, Housing and Urban Development NCHHU-0007-13, National Science Foundation 1557864 and Alfred P. Sloan Foundation 2013-5-35 MBE. Funding from Australian Wool Innovation (to P.B. and R.B.G.) and the Australian Research Council (to R.B.G.) is gratefully acknowledged. Support to R.B.G.'s laboratory by YourGene Bioscience and Melbourne Water Corporation is gratefully acknowledged. This project was also supported by a Victorian Life Sciences Computation Initiative (VLSCI; grant number VR0007) on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government (R.B.G.). C.A.A. holds an NSERC Postdoctoral Fellowship. N.D.Y. holds an NHMRC Early Career Research Fellowship. P.K.K. is the recipient of a scholarship (STRAPA) from the University of Melbourne. No funding body participated in the design of the study; in the collection, analysis, or interpretation of data; or in writing the manuscript.
- DNA methylation
- Gene content
- Genome assembly
- Protein domains