
dc.contributor.author: Hanzely, Filip
dc.contributor.author: Richtarik, Peter
dc.date.accessioned: 2019-05-29T07:46:05Z
dc.date.available: 2019-05-29T07:46:05Z
dc.date.issued: 2018-09-25
dc.identifier.uri: http://hdl.handle.net/10754/653116
dc.description.abstract: Accelerated coordinate descent is a widely popular optimization algorithm due to its efficiency on large-dimensional problems. It achieves state-of-the-art complexity on an important class of empirical risk minimization problems. In this paper we design and analyze an accelerated coordinate descent (ACD) method which in each iteration updates a random subset of coordinates according to an arbitrary but fixed probability law, which is a parameter of the method. If all coordinates are updated in each iteration, our method reduces to the classical accelerated gradient descent method AGD of Nesterov. If a single coordinate is updated in each iteration, and we pick probabilities proportional to the square roots of the coordinate-wise Lipschitz constants, our method reduces to the currently fastest coordinate descent method NUACDM of Allen-Zhu, Qu, Richtárik and Yuan. While mini-batch variants of ACD are more popular and relevant in practice, there is no importance sampling for ACD that outperforms the standard uniform mini-batch sampling. Through insights enabled by our general analysis, we design a new importance sampling for mini-batch ACD which significantly outperforms the previous state-of-the-art mini-batch ACD in practice. We prove a rate that is at most ${\cal O}(\sqrt{\tau})$ times worse than the rate of mini-batch ACD with uniform sampling, but can be ${\cal O}(n/\tau)$ times better, where $\tau$ is the mini-batch size. Since in modern supervised learning training systems it is standard practice to choose $\tau \ll n$, and often $\tau = {\cal O}(1)$, our method can lead to dramatic speedups. Lastly, we obtain similar results for mini-batch non-accelerated CD as well, achieving improvements on previous best rates.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/1809.09354
dc.rights: Archived with thanks to arXiv
dc.title: Accelerated Coordinate Descent with Arbitrary Sampling and Best Rates for Minibatches
dc.type: Preprint
dc.contributor.department: Applied Mathematics and Computational Science
dc.contributor.department: Applied Mathematics and Computational Science Program
dc.contributor.department: Computer Science
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.eprint.version: Pre-print
dc.contributor.institution: School of Mathematics, University of Edinburgh, United Kingdom
dc.contributor.institution: Moscow Institute of Physics and Technology, Russia
dc.identifier.arxivid: arXiv:1809.09354
kaust.person: Hanzely, Filip
kaust.person: Richtarik, Peter
refterms.dateFOA: 2019-05-29T07:46:43Z


Files in this item

Name: 1809.09354.pdf
Size: 6.504 MB
Format: PDF
Description: Preprint

