KAUST Department: Biological and Environmental Science and Engineering (BESE) Division
Permanent link to this record: http://hdl.handle.net/10754/676336
Abstract: Genome-wide association studies (GWAS) associate common variants with phenotypes and have uncovered thousands of disease-associated variants. However, research on the contribution of rare variants remains limited. The UK Biobank (UKB) contains detailed medical records and genetic information for nearly 500,000 individuals and offers a great opportunity for genetic association studies on rare variants. Here we focused on the role of rare protein-coding variants in UKB phenotypes. We selected three diseases for analysis: breast cancer, hypothyroidism and type II diabetes. We defined criteria for qualifying variants and pruned the control group to reduce interfering signals from similar phenotypes. We identified the best-known biomarkers for these diseases, such as the BRCA1 and BRCA2 genes for breast cancer, the TG and TSHR genes for hypothyroidism, and GCK for type II diabetes. This result supports the validity of the model and clarifies the contribution of rare variants to these diseases. Moreover, we applied a gene-set-based collapsing method that aggregates information across genes to strengthen the signal from rare variants, and built a diagnosis model that relies only on genetic information. The model achieved an AUC improvement of more than 20% for type II diabetes and breast cancer, and an accuracy of more than 90% for hypothyroidism.
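The collapsing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's pipeline: the function names, the toy sample IDs and the choice of a one-sided Fisher's exact test are assumptions made for illustration only. Each individual is reduced to "carries at least one qualifying variant in the gene" or "does not", and carrier counts are compared between cases and controls.

```python
from math import comb

def collapse_by_gene(carriers_by_gene, cases, controls):
    """Build a 2x2 table per gene: (case carriers, case non-carriers,
    control carriers, control non-carriers).
    carriers_by_gene maps a gene name to the set of sample IDs that carry
    at least one qualifying variant in that gene (hypothetical input)."""
    tables = {}
    for gene, carriers in carriers_by_gene.items():
        a = len(carriers & cases)
        c = len(carriers & controls)
        tables[gene] = (a, len(cases) - a, c, len(controls) - c)
    return tables

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test: probability of seeing at least `a`
    carriers among cases under the hypergeometric null."""
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_cases)
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return tail / denom

# Toy data, invented for illustration (not UK Biobank data).
cases = {"s1", "s2", "s3", "s4"}
controls = {"s5", "s6", "s7", "s8"}
carriers_by_gene = {
    "GENE_A": {"s1", "s2", "s3"},  # carriers concentrated in cases
    "GENE_B": {"s1", "s5"},        # carriers split evenly
}

tables = collapse_by_gene(carriers_by_gene, cases, controls)
p_values = {gene: fisher_one_sided(*t) for gene, t in tables.items()}
```

A gene-set version of the same sketch would take the union of the carrier sets of all genes in the set before building the table. A real analysis would also need covariate adjustment and multiple-testing correction, which this sketch omits.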
Citation: Liu, Y. (2022). Rare variant analysis on UK Biobank. KAUST Research Repository. https://doi.org/10.25781/KAUST-T66H1