
dc.contributor.author: Gower, Robert M.
dc.contributor.author: Schmidt, Mark
dc.contributor.author: Bach, Francis
dc.contributor.author: Richtarik, Peter
dc.date.accessioned: 2020-11-15T11:23:26Z
dc.date.available: 2020-11-15T11:23:26Z
dc.date.issued: 2020-10-16
dc.date.submitted: 2020-01-08
dc.identifier.citation: Gower, R. M., Schmidt, M., Bach, F., & Richtarik, P. (2020). Variance-Reduced Methods for Machine Learning. Proceedings of the IEEE, 108(11), 1968–1983. doi:10.1109/JPROC.2020.3028013
dc.identifier.issn: 1558-2256
dc.identifier.issn: 0018-9219
dc.identifier.doi: 10.1109/JPROC.2020.3028013
dc.identifier.uri: http://hdl.handle.net/10754/665943
dc.description.abstract: Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight years have seen an exciting new development: variance reduction for stochastic optimization methods. These variance-reduced (VR) methods excel in settings where more than one pass through the training data is allowed, achieving faster convergence than SGD in theory and in practice. These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at nonexpert readers. We focus mainly on the convex setting and provide pointers for readers interested in extensions for minimizing nonconvex functions.
dc.description.sponsorship: The authors would like to thank Quanquan Gu, Julien Mairal, Tong Zhang, and Lin Xiao for valuable suggestions and comments on an earlier draft of this article. In particular, Quanquan’s recommendations for the nonconvex section improved the organization of our Section IV-G.
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.url: https://ieeexplore.ieee.org/document/9226504/
dc.relation.url: http://arxiv.org/pdf/2010.00892
dc.rights: (c) 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
dc.rights: This file is an open access version redistributed from: http://arxiv.org/pdf/2010.00892
dc.title: Variance-Reduced Methods for Machine Learning
dc.type: Article
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.identifier.journal: Proceedings of the IEEE
dc.eprint.version: Pre-print
dc.contributor.institution: LTCI, Télécom Paris, Institut Polytechnique de Paris, Paris 75634, France
dc.contributor.institution: CCAI Affiliate Chair (Amii), The University of British Columbia, Vancouver, BC V6T1Z4, Canada
dc.contributor.institution: Inria, PSL Research University, Paris 75006, France
dc.identifier.volume: 108
dc.identifier.issue: 11
dc.identifier.pages: 1968-1983
dc.identifier.arxivid: 2010.00892
kaust.person: Richtarik, Peter
dc.date.accepted: 2020-06-08
dc.identifier.eid: 2-s2.0-85095713994
refterms.dateFOA: 2020-12-08T07:46:28Z
dc.date.published-online: 2020-10-16
dc.date.published-print: 2020-11
dc.date.posted: 2020-10-02
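
The abstract above describes variance-reduced (VR) methods only at a high level. As an illustration only (not code from the article), below is a minimal Python sketch of SVRG, one representative VR method this review covers, applied to a synthetic ridge-regression problem; the data, step size, and loop lengths are all assumed placeholder values.

    # Minimal SVRG sketch for a regularized least-squares objective.
    # All names and hyperparameters here are illustrative, not from the paper.
    import numpy as np

    def svrg(A, b, lam=0.1, step=0.01, n_outer=20, n_inner=None, seed=0):
        """Minimize (1/n) * sum_i (a_i^T w - b_i)^2 / 2 + (lam/2) * ||w||^2."""
        rng = np.random.default_rng(seed)
        n, d = A.shape
        m = n_inner or n          # inner-loop length; n is a common default
        w_snap = np.zeros(d)      # snapshot point, often written \tilde{w}

        def grad_i(w, i):         # gradient of the i-th summand
            return A[i] * (A[i] @ w - b[i]) + lam * w

        def full_grad(w):         # full gradient, computed once per outer loop
            return A.T @ (A @ w - b) / n + lam * w

        for _ in range(n_outer):
            mu = full_grad(w_snap)               # anchor gradient at the snapshot
            w = w_snap.copy()
            for _ in range(m):
                i = rng.integers(n)
                # Variance-reduced gradient estimate: unbiased for the full
                # gradient, with variance that shrinks as w and the snapshot
                # approach the minimizer.
                g = grad_i(w, i) - grad_i(w_snap, i) + mu
                w -= step * g
            w_snap = w                           # refresh the snapshot
        return w_snap

    # Usage: recover a planted solution on synthetic data.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 10))
    b = A @ rng.standard_normal(10)
    w = svrg(A, b)

The key line is the corrected estimate g: unlike the raw SGD gradient, its variance vanishes near the solution, which is what allows a constant step size and the faster convergence the abstract refers to.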


Files in this item

Name: Articlefile1.pdf
Size: 7.709 MB
Format: PDF
Description: Pre-print
