dc.contributor.author Horvath, Samuel
dc.contributor.author Kovalev, Dmitry
dc.contributor.author Mishchenko, Konstantin
dc.contributor.author Stich, Sebastian
dc.contributor.author Richtarik, Peter
dc.date.accessioned 2019-05-28T12:18:45Z
dc.date.available 2019-05-28T12:18:45Z
dc.date.issued 2019-04-10
dc.identifier.uri http://hdl.handle.net/10754/653103
dc.description.abstract We consider distributed optimization where the objective function is spread among different devices, each sending incremental model updates to a central server. To alleviate the communication bottleneck, recent work proposed various schemes to compress (e.g.\ quantize or sparsify) the gradients, thereby introducing additional variance $\omega \geq 1$ that might slow down convergence. For strongly convex functions with condition number $\kappa$ distributed among $n$ machines, we (i) give a scheme that converges in $\mathcal{O}((\kappa + \kappa \frac{\omega}{n} + \omega) \log (1/\epsilon))$ steps to a neighborhood of the optimal solution. For objective functions with a finite-sum structure, each worker having less than $m$ components, we (ii) present novel variance reduced schemes that converge in $\mathcal{O}((\kappa + \kappa \frac{\omega}{n} + \omega + m)\log(1/\epsilon))$ steps to arbitrary accuracy $\epsilon > 0$. These are the first methods that achieve linear convergence for arbitrary quantized updates. We also (iii) give analysis for the weakly convex and non-convex cases and (iv) verify in experiments that our novel variance reduced schemes are more efficient than the baselines.
dc.description.sponsorship The authors would like to thank Xun Qian for the careful checking of the proofs and for spotting several typos in the analysis.
dc.publisher arXiv
dc.relation.url https://arxiv.org/abs/1904.05115
dc.relation.url https://arxiv.org/pdf/1904.05115
dc.rights Archived with thanks to arXiv
dc.title Stochastic Distributed Learning with Gradient Quantization and Variance Reduction
dc.type Preprint
dc.contributor.department Computer Science
dc.contributor.department Computer Science Program
dc.contributor.department Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department Statistics
dc.contributor.department Statistics Program
dc.eprint.version Pre-print
dc.contributor.institution Ecole Polytechnique Federale de Lausanne, Switzerland
dc.contributor.institution University of Edinburgh, UK
dc.contributor.institution Moscow Institute of Physics and Technology, Russian Federation
dc.identifier.arxivid 1904.05115
kaust.person Horvath, Samuel
kaust.person Kovalev, Dmitry
kaust.person Mishchenko, Konstantin
kaust.person Richtarik, Peter
refterms.dateFOA 2019-05-28T12:18:53Z
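As a concrete illustration of the compression operators the abstract refers to (this sketch is not code from the paper), one standard unbiased operator is random-$k$ sparsification: keep $k$ of the $d$ coordinates uniformly at random and rescale by $d/k$, so the compressed vector equals the input in expectation while its relative variance grows on the order of $d/k$. The function name `rand_k` is chosen here for illustration.

```python
import random

def rand_k(x, k, rng):
    """Unbiased Rand-k sparsifier.

    Keeps k of the d coordinates of x (chosen uniformly at random)
    and rescales them by d/k, so that E[rand_k(x)] = x.
    """
    d = len(x)
    kept = rng.sample(range(d), k)  # k distinct coordinate indices
    out = [0.0] * d
    for i in kept:
        out[i] = x[i] * d / k  # rescale to preserve the expectation
    return out
```

Each call transmits only $k$ nonzero values instead of $d$, which is the communication saving; the price is the extra variance $\omega$ that the paper's variance-reduced schemes are designed to remove.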

Name:
1904.05115.pdf
Size:
1.090 MB
Format:
PDF
Description:
Preprint