Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA

Handle URI:
http://hdl.handle.net/10754/597675
Title:
Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA
Authors:
Jung, Yoonsuh; Huang, Jianhua Z.; Hu, Jianhua
Abstract:
In genome-wide association studies, the primary task is to detect biomarkers in the form of Single Nucleotide Polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and some other important clinical/environmental factors. However, the extremely large number of SNPs compared to the sample size inhibits application of classical methods such as the multiple logistic regression. Currently the most commonly used approach is still to analyze one SNP at a time. In this paper, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit transformed mean of SNP genotypes as the summation of the SNP effects, effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1-penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a Majorization-Minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a Multiple Sclerosis data set and simulated data sets and shows promise in biomarker detection.
Citation:
Jung Y, Huang JZ, Hu J (2014) Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA. Journal of the American Statistical Association 109: 1355–1367. Available: http://dx.doi.org/10.1080/01621459.2014.928217.
Publisher:
Informa UK Limited
Journal:
Journal of the American Statistical Association
KAUST Grant Number:
KUS-CI-016-04; GRP-CF-2011-19-P-Gao-Huang
Issue Date:
2-Oct-2014
DOI:
10.1080/01621459.2014.928217
PubMed ID:
25642005
PubMed Central ID:
PMC4310485
Type:
Article
ISSN:
0162-1459; 1537-274X
Sponsors:
Hu's work was partially supported by the National Institutes of Health Grants R21CA129671, R01GM080503, R01CA158113, and CGSG P30 CA016672. Huang's work was partially supported by grants from NSF (DMS-0907170, DMS-1007618, DMS-1208952), and Award Number KUS-CI-016-04 and GRP-CF-2011-19-P-Gao-Huang, made by King Abdullah University of Science and Technology (KAUST). The authors thank the editor, the associate editor, and reviewers for many constructive comments.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Jung, Yoonsuh | en
dc.contributor.author | Huang, Jianhua Z. | en
dc.contributor.author | Hu, Jianhua | en
dc.date.accessioned | 2016-02-25T12:44:11Z | en
dc.date.available | 2016-02-25T12:44:11Z | en
dc.date.issued | 2014-10-02 | en
dc.identifier.citation | Jung Y, Huang JZ, Hu J (2014) Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA. Journal of the American Statistical Association 109: 1355–1367. Available: http://dx.doi.org/10.1080/01621459.2014.928217. | en
dc.identifier.issn | 0162-1459 | en
dc.identifier.issn | 1537-274X | en
dc.identifier.pmid | 25642005 | en
dc.identifier.doi | 10.1080/01621459.2014.928217 | en
dc.identifier.uri | http://hdl.handle.net/10754/597675 | en
dc.description.abstract | In genome-wide association studies, the primary task is to detect biomarkers in the form of Single Nucleotide Polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and some other important clinical/environmental factors. However, the extremely large number of SNPs compared to the sample size inhibits application of classical methods such as the multiple logistic regression. Currently the most commonly used approach is still to analyze one SNP at a time. In this paper, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit transformed mean of SNP genotypes as the summation of the SNP effects, effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1-penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a Majorization-Minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a Multiple Sclerosis data set and simulated data sets and shows promise in biomarker detection. | en
dc.description.sponsorship | Hu's work was partially supported by the National Institutes of Health Grants R21CA129671, R01GM080503, R01CA158113, and CGSG P30 CA016672. Huang's work was partially supported by grants from NSF (DMS-0907170, DMS-1007618, DMS-1208952), and Award Number KUS-CI-016-04 and GRP-CF-2011-19-P-Gao-Huang, made by King Abdullah University of Science and Technology (KAUST). The authors thank the editor, the associate editor, and reviewers for many constructive comments. | en
dc.publisher | Informa UK Limited | en
dc.subject | GWAS | en
dc.subject | BIC | en
dc.subject | L1-penalty | en
dc.subject | MM Algorithm | en
dc.subject | Penalized Bernoulli Likelihood | en
dc.subject | Simultaneous Modeling of SNPs | en
dc.title | Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA | en
dc.type | Article | en
dc.identifier.journal | Journal of the American Statistical Association | en
dc.identifier.pmcid | PMC4310485 | en
dc.contributor.institution | Department of Statistics, University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand. | en
dc.contributor.institution | Department of Statistics, Texas A&M University, College Station, TX, USA, and Special Term Professor at ISEM, Capital University of Economics and Business, Beijing, China. | en
dc.contributor.institution | Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA. | en
kaust.grant.number | KUS-CI-016-04 | en
kaust.grant.number | GRP-CF-2011-19-P-Gao-Huang | en

All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.