Name:
KAUST_Thesis_Dissertation_LaTeX_5Sept2016gamma_YangThesis (1) (2).pdf
Size:
7.522Mb
Format:
PDF
Description:
MS Thesis
Type
Thesis
Authors
Liu, Yang
Advisors
Hoehndorf, Robert
Committee members
Hauser, Charlotte
Gojobori, Takashi

Program
Bioengineering
KAUST Department
Biological and Environmental Science and Engineering (BESE) Division
Date
2022-04-17
Permanent link to this record
http://hdl.handle.net/10754/676336
Abstract
Genome-wide association studies (GWAS) associate common variants with phenotypes and have uncovered thousands of disease-associated variants. However, research on the contribution of rare variants remains limited. The UK Biobank (UKB) contains detailed medical records and genetic information for nearly 500,000 individuals and offers a great opportunity for genetic association studies on rare variants. Here we focused on the role of rare protein-coding variants in UKB phenotypes. We selected three diseases for analysis: breast cancer, hypothyroidism, and type II diabetes. We defined criteria for qualifying variants and pruned the control group to reduce interfering signals from similar phenotypes. We identified the best-known biomarkers for those diseases, such as the BRCA1 and BRCA2 genes for breast cancer, the TG and TSHR genes for hypothyroidism, and GCK for type II diabetes. This result supports the validity of the model and clarifies the contribution of rare variants to these diseases. Moreover, we applied a gene-set-based collapsing method to aggregate information across genes, strengthening the signal from rare variants, and built a diagnosis model that relies only on genetic information. Our model achieved an AUC improvement of more than 20% for type II diabetes and breast cancer and more than 90% accuracy for hypothyroidism.
Citation
Liu, Y. (2022). Rare variant analysis on UK Biobank. KAUST Research Repository. https://doi.org/10.25781/KAUST-T66H1
DOI
10.25781/KAUST-T66H1