Multi-domain rule-based phenotyping algorithms enable improved GWAS signal.
Newbury A, Elhussein A, Gürsoy G
Publication Details
Abstract
Biobanks are a rich source of data for genome-wide association studies (GWAS). They store clinical data from electronic health records, with data domains such as laboratory measurements, conditions, and self-reported diagnoses. Traditionally, biobank GWAS utilize case-control cohorts built exclusively from conditions. However, because reported conditions are primarily collected for billing purposes, they face data quality issues. Consequently, incorporating additional data domains in cohort construction can improve cohort accuracy and GWAS results. Here, we assess the impact of various rule-based phenotyping algorithms on GWAS outcomes, examining factors such as power, heritability, replicability, functional annotations, and polygenic risk score prediction accuracy across seven diseases in the UK Biobank. We find that high complexity phenotyping algorithms generally improve GWAS outcomes, including increased power, hits within coding and functional genomic regions, and co-localization with expression quantitative trait loci. Our findings suggest that biobank-scale GWAS can benefit from phenotyping algorithms that integrate multiple data domains.
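To make the contrast between a conditions-only cohort and a multi-domain cohort concrete, here is a minimal Python sketch. It is an illustration only, not the phenotyping algorithms evaluated in the paper: the `Participant` fields, the choice of type 2 diabetes as the example disease, the ICD-10 prefix `E11`, the HbA1c cutoff of 48 mmol/mol, and the "at least two domains agree" rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Participant:
    """Hypothetical per-participant record spanning three data domains."""
    pid: str
    condition_codes: Set[str] = field(default_factory=set)  # coded conditions (assumed ICD-10)
    self_reported: Set[str] = field(default_factory=set)    # self-reported diagnoses
    hba1c_mmol_mol: Optional[float] = None                  # laboratory measurement


def conditions_only(p: Participant) -> bool:
    """Low-complexity rule: case if any coded condition matches (assumed prefix E11)."""
    return any(code.startswith("E11") for code in p.condition_codes)


def multi_domain(p: Participant, min_domains: int = 2) -> bool:
    """Higher-complexity rule: case if at least `min_domains` domains give evidence."""
    evidence = [
        conditions_only(p),
        "type 2 diabetes" in p.self_reported,
        p.hba1c_mmol_mol is not None and p.hba1c_mmol_mol >= 48.0,  # assumed cutoff
    ]
    return sum(evidence) >= min_domains


def build_cases(participants: List[Participant], rule) -> List[str]:
    """Return the IDs of participants that the given rule labels as cases."""
    return [p.pid for p in participants if rule(p)]


participants = [
    Participant("a", condition_codes={"E11.9"}),
    Participant("b", self_reported={"type 2 diabetes"}, hba1c_mmol_mol=52.0),
    Participant("c", condition_codes={"E11.9"}, hba1c_mmol_mol=50.0),
]

simple_cases = build_cases(participants, conditions_only)
complex_cases = build_cases(participants, multi_domain)
```

In this toy data the two rules disagree on participants "a" and "b": a lone billing code is enough for the conditions-only rule, while the multi-domain rule requires corroboration and also picks up a participant with no coded condition. Real algorithms of this kind would also need explicit control definitions and exclusion criteria.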
Study Statistics
Sample: 405,811 individuals of European or unknown ancestry