dc.contributor.author Hanzely, Filip
dc.contributor.author Richtarik, Peter
dc.date.accessioned 2019-05-29T07:46:05Z
dc.date.available 2019-05-29T07:46:05Z
dc.date.issued 2018-09-25
dc.identifier.uri http://hdl.handle.net/10754/653116
dc.description.abstract Accelerated coordinate descent is a widely popular optimization algorithm due to its efficiency on large-dimensional problems. It achieves state-of-the-art complexity on an important class of empirical risk minimization problems. In this paper we design and analyze an accelerated coordinate descent (ACD) method which in each iteration updates a random subset of coordinates according to an arbitrary but fixed probability law, which is a parameter of the method. If all coordinates are updated in each iteration, our method reduces to the classical accelerated gradient descent method AGD of Nesterov. If a single coordinate is updated in each iteration, and we pick probabilities proportional to the square roots of the coordinate-wise Lipschitz constants, our method reduces to the currently fastest coordinate descent method NUACDM of Allen-Zhu, Qu, Richtárik and Yuan. While mini-batch variants of ACD are more popular and relevant in practice, there is no importance sampling for ACD that outperforms the standard uniform mini-batch sampling. Through insights enabled by our general analysis, we design a new importance sampling for mini-batch ACD which significantly outperforms the previous state-of-the-art mini-batch ACD in practice. We prove a rate that is at most ${\cal O}(\sqrt{\tau})$ times worse than the rate of mini-batch ACD with uniform sampling, but can be ${\cal O}(n/\tau)$ times better, where $\tau$ is the mini-batch size. Since in modern supervised learning training systems it is standard practice to choose $\tau \ll n$, and often $\tau = {\cal O}(1)$, our method can lead to dramatic speedups. Lastly, we obtain similar results for mini-batch nonaccelerated CD as well, achieving improvements on previous best rates.
dc.publisher arXiv
dc.relation.url https://arxiv.org/pdf/1809.09354
dc.rights Archived with thanks to arXiv
dc.title Accelerated Coordinate Descent with Arbitrary Sampling and Best Rates for Minibatches
dc.type Preprint
dc.contributor.department Applied Mathematics and Computational Science
dc.contributor.department Applied Mathematics and Computational Science Program
dc.contributor.department Computer Science
dc.contributor.department Computer Science Program
dc.contributor.department Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.eprint.version Pre-print
dc.contributor.institution School of Mathematics, University of Edinburgh, United Kingdom
dc.contributor.institution Moscow Institute of Physics and Technology, Russia
dc.identifier.arxivid arXiv:1809.09354
kaust.person Hanzely, Filip
kaust.person Richtarik, Peter
refterms.dateFOA 2019-05-29T07:46:43Z

Name: 1809.09354.pdf
Size: 6.504 MB
Format: PDF
Description: Preprint
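The abstract notes that sampling a single coordinate with probability proportional to the square root of its coordinate-wise Lipschitz constant recovers NUACDM. A minimal sketch of that sampling law, applied here to plain (non-accelerated) coordinate descent on a random quadratic — this is an illustration under assumed problem data, not the paper's ACD method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed test problem: f(x) = 0.5 * x^T A x - b^T x with A positive definite.
n = 20
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

# Coordinate-wise Lipschitz constants of grad f are the diagonal entries of A.
L = np.diag(A).copy()

# Importance sampling law from the abstract: pick coordinate i with
# probability proportional to sqrt(L_i).
p = np.sqrt(L) / np.sqrt(L).sum()

x = np.zeros(n)
for _ in range(20000):
    i = rng.choice(n, p=p)
    g_i = A[i] @ x - b[i]   # i-th partial derivative of f at x
    x[i] -= g_i / L[i]      # coordinate step with step size 1 / L_i

x_star = np.linalg.solve(A, b)
print(np.linalg.norm(x - x_star))  # distance to the exact minimizer
```

On this well-conditioned instance the iterates converge linearly to the minimizer; the accelerated mini-batch variants analyzed in the paper improve on this rate.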