dc.contributor.author: Cao, Jian
dc.contributor.author: Guinness, Joseph
dc.contributor.author: Genton, Marc G.
dc.contributor.author: Katzfuss, Matthias
dc.date.accessioned: 2022-05-16T11:22:08Z
dc.date.available: 2022-05-16T11:22:08Z
dc.date.issued: 2022-03-02
dc.identifier.uri: http://hdl.handle.net/10754/677959
dc.description.abstract: Gaussian process (GP) regression is a flexible, nonparametric approach to regression that naturally quantifies uncertainty. In many applications, the number of responses and covariates are both large, and a goal is to select covariates that are related to the response. For this setting, we propose a novel, scalable algorithm, coined VGPR, which optimizes a penalized GP log-likelihood based on the Vecchia GP approximation, an ordered conditional approximation from spatial statistics that implies a sparse Cholesky factor of the precision matrix. We traverse the regularization path from strong to weak penalization, sequentially adding candidate covariates based on the gradient of the log-likelihood and deselecting irrelevant covariates via a new quadratic constrained coordinate descent algorithm. We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators. The resulting procedure is scalable to millions of responses and thousands of covariates. Theoretical analysis and numerical studies demonstrate the improved scalability and accuracy relative to existing methods.
dc.description.sponsorship: Jian Cao was partially supported by the Texas A&M Institute of Data Science (TAMIDS) Postdoctoral Project program, Jian Cao and Matthias Katzfuss by National Science Foundation (NSF) Grant DMS–1654083, Matthias Katzfuss and Joe Guinness by NSF Grant DMS–1953005, Matthias Katzfuss by NSF Grant CCF–1934904, and Jian Cao and Marc Genton were partially supported by the King Abdullah University of Science and Technology (KAUST). We would like to thank Felix Jimenez for his helpful comments and discussions.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/2202.12981.pdf
dc.rights: Archived with thanks to arXiv
dc.title: Scalable Gaussian-process regression and variable selection using Vecchia approximations
dc.type: Preprint
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
dc.contributor.department: Extreme Computing Research Center
dc.contributor.department: Spatio-Temporal Statistics and Data Analysis Group
dc.contributor.department: Statistics Program
dc.eprint.version: Pre-print
dc.contributor.institution: Department of Statistics and Institute of Data Science, Texas A&M University
dc.contributor.institution: Department of Statistics, Cornell University
dc.contributor.institution: Department of Statistics, Texas A&M University
dc.identifier.arxivid: 2202.12981
kaust.person: Genton, Marc G.
refterms.dateFOA: 2022-05-16T11:23:03Z


Files in this item

Name: 2202.12981 (1).pdf
Size: 1.397Mb
Format: PDF
Description: Preprint
