Data integration methods for phenotype harmonization in multi-cohort genome-wide association studies with behavioral outcomes

Publication Type: Journal Article
Year of Publication: 2019
Authors: Luningham JM, McArtor DB, Hendriks AM, van Beijsterveldt CEM, Lichtenstein P, Lundström S, Larsson H, Bartels M, Boomsma DI, Lubke GH
Journal: Frontiers in Genetics
Keywords: consortia, data integration, genome-wide association studies, latent variable modeling, phenotype harmonization

Parallel meta-analysis is a popular approach for increasing the power to detect genetic effects in genome-wide association studies across multiple cohorts. Consortia studying the genetics of behavioral phenotypes often face systematic differences in phenotype measurement across cohorts, which introduce heterogeneity into the meta-analysis and reduce statistical power. This study investigated integrative data analysis (IDA) as an approach for jointly modeling the phenotype across multiple datasets. We put forth a bi-factor integration model (BFIM) that provides a single common phenotype score and accounts for sources of study-specific variability in the phenotype. To capitalize on this modeling strategy, a phenotype reference panel was used as a supplemental sample with complete data on all behavioral measures. A simulation study showed that a mega-analysis of genetic variant effects in a BFIM was more powerful than a meta-analysis of genetic effects on a cohort-specific sum score of items. Saving the factor scores from the BFIM and using them as the outcome in meta-analysis was also more powerful than the sum score in most simulation conditions, although this approach introduced a small degree of bias. The reference panel was necessary to realize these power gains. An empirical demonstration used the BFIM to harmonize aggression scores in 9-year-old children across the Netherlands Twin Register and the Child and Adolescent Twin Study in Sweden, providing a template for application of the BFIM to a range of different phenotypes. A supplemental data collection in the Netherlands Twin Register served as a reference panel for phenotype modeling across both cohorts. Our results indicate that model-based harmonization for the study of complex traits is a useful step within genetic consortia.
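
For readers unfamiliar with the baseline the abstract compares against, the sketch below illustrates a generic fixed-effect, inverse-variance-weighted meta-analysis of per-cohort effect estimates for a single genetic variant, along with Cochran's Q as a simple heterogeneity statistic. It is not the authors' code and does not implement the BFIM. The function name `inverse_variance_meta` and the example numbers are hypothetical, chosen only for illustration.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis (illustrative sketch).

    betas: per-cohort effect estimates for one variant (hypothetical inputs)
    ses:   matching per-cohort standard errors
    Returns the pooled estimate, its standard error, the z statistic,
    a two-sided normal p-value, and Cochran's Q.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Cochran's Q: weighted squared deviations of cohort estimates
    # from the pooled estimate; large values suggest heterogeneity.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))

    return pooled_beta, pooled_se, z, p_value, q


if __name__ == "__main__":
    # Made-up estimates from two cohorts, for illustration only.
    result = inverse_variance_meta([0.10, 0.04], [0.05, 0.05])
    print(result)
```

When cohorts measure the phenotype differently, the per-cohort estimates entering a calculation like this are not on a common scale, which is the kind of heterogeneity that model-based harmonization aims to reduce.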