Sparse Regression by Projection and Sparse Discriminant Analysis

Handle URI:
http://hdl.handle.net/10754/599682
Title:
Sparse Regression by Projection and Sparse Discriminant Analysis
Authors:
Qi, Xin; Luo, Ruiyan; Carroll, Raymond J.; Zhao, Hongyu
Abstract:
© 2015 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active development of various penalized regression methods, such as LASSO and elastic net, for analyzing high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are then determined by a cross-validation procedure to achieve the highest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimensions. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and that the ability to control relationships between the sparse components leads to more accurate classification. The details of the algorithms, the theoretical proofs, and R code for all simulation studies are provided in supplementary materials available online.
Citation:
Qi X, Luo R, Carroll RJ, Zhao H (2015) Sparse Regression by Projection and Sparse Discriminant Analysis. Journal of Computational and Graphical Statistics 24: 416–438. Available: http://dx.doi.org/10.1080/10618600.2014.907094.
Publisher:
Informa UK Limited
Journal:
Journal of Computational and Graphical Statistics
KAUST Grant Number:
KUS-CI-016-04
Issue Date:
3-Apr-2015
DOI:
10.1080/10618600.2014.907094
PubMed ID:
26345204
PubMed Central ID:
PMC4560121
Type:
Article
ISSN:
1061-8600; 1537-2715
Sponsors:
Carroll’s research was supported by a grant from the National Cancer Institute (R37-CA057030). This publication is based in part on work supported by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). Zhao’s research was supported in part by NIH R01 GM59507, P01 CA154295, and NSF DMS 1106738.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Qi, Xin | en
dc.contributor.author | Luo, Ruiyan | en
dc.contributor.author | Carroll, Raymond J. | en
dc.contributor.author | Zhao, Hongyu | en
dc.date.accessioned | 2016-02-28T06:07:25Z | en
dc.date.available | 2016-02-28T06:07:25Z | en
dc.date.issued | 2015-04-03 | en
dc.identifier.citation | Qi X, Luo R, Carroll RJ, Zhao H (2015) Sparse Regression by Projection and Sparse Discriminant Analysis. Journal of Computational and Graphical Statistics 24: 416–438. Available: http://dx.doi.org/10.1080/10618600.2014.907094. | en
dc.identifier.issn | 1061-8600 | en
dc.identifier.issn | 1537-2715 | en
dc.identifier.pmid | 26345204 | en
dc.identifier.doi | 10.1080/10618600.2014.907094 | en
dc.identifier.uri | http://hdl.handle.net/10754/599682 | en
dc.publisher | Informa UK Limited | en
dc.subject | Zero within-class and between-class correlations | en
dc.title | Sparse Regression by Projection and Sparse Discriminant Analysis | en
dc.type | Article | en
dc.identifier.journal | Journal of Computational and Graphical Statistics | en
dc.identifier.pmcid | PMC4560121 | en
dc.contributor.institution | Georgia State University, Atlanta, United States | en
dc.contributor.institution | Texas A&M University, College Station, United States | en
dc.contributor.institution | Yale University, New Haven, United States | en
kaust.grant.number | KUS-CI-016-04 | en

All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.