dc.contributor.author: Malinovsky, Grigory
dc.contributor.author: Sailanbayev, Alibek
dc.contributor.author: Richtarik, Peter
dc.date.accessioned: 2021-04-21T06:55:59Z
dc.date.available: 2021-04-21T06:55:59Z
dc.date.issued: 2021-04-19
dc.identifier.uri: http://hdl.handle.net/10754/668880
dc.description.abstract: Virtually all state-of-the-art methods for training supervised machine learning models are variants of SGD enhanced with a number of additional tricks, such as minibatching, momentum, and adaptive stepsizes. One of the tricks that works so well in practice that it is used as default in virtually all widely used machine learning software is random reshuffling (RR). However, the practical benefits of RR have until very recently been eluding attempts at being satisfactorily explained using theory. Motivated by recent development due to Mishchenko, Khaled and Richtárik (2020), in this work we provide the first analysis of SVRG under Random Reshuffling (RR-SVRG) for general finite-sum problems. First, we show that RR-SVRG converges linearly with the rate $\mathcal{O}(\kappa^{3/2})$ in the strongly-convex case, and can be improved further to $\mathcal{O}(\kappa)$ in the big data regime (when $n > \mathcal{O}(\kappa)$), where $\kappa$ is the condition number. This improves upon the previous best rate $\mathcal{O}(\kappa^2)$ known for a variance reduced RR method in the strongly-convex case due to Ying, Yuan and Sayed (2020). Second, we obtain the first sublinear rate for general convex problems. Third, we establish similar fast rates for Cyclic-SVRG and Shuffle-Once-SVRG. Finally, we develop and analyze a more general variance reduction scheme for RR, which allows for less frequent updates of the control variate. We corroborate our theoretical results with suitably chosen experiments on synthetic and real datasets.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/2104.09342.pdf
dc.rights: Archived with thanks to arXiv
dc.title: Random Reshuffling with Variance Reduction: New Analysis and Better Rates
dc.type: Preprint
dc.contributor.department: King Abdullah University of Science and Technology (KAUST)
dc.contributor.department: Applied Mathematics and Computational Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Computer Science Program
dc.eprint.version: Pre-print
dc.identifier.arxivid: 2104.09342
kaust.person: Malinovsky, Grigory
kaust.person: Sailanbayev, Alibek
kaust.person: Richtarik, Peter
refterms.dateFOA: 2021-04-21T06:56:27Z


Files in this item

Name: Preprintfile1.pdf
Size: 848.7Kb
Format: PDF
Description: Pre-print
