Show simple item record

dc.contributor.author: Horvath, Samuel
dc.contributor.author: Kovalev, Dmitry
dc.contributor.author: Mishchenko, Konstantin
dc.contributor.author: Stich, Sebastian
dc.contributor.author: Richtarik, Peter
dc.date.accessioned: 2019-05-28T12:18:45Z
dc.date.available: 2019-05-28T12:18:45Z
dc.date.issued: 2019-04-10
dc.identifier.uri: http://hdl.handle.net/10754/653103
dc.description.abstract: We consider distributed optimization where the objective function is spread among different devices, each sending incremental model updates to a central server. To alleviate the communication bottleneck, recent work proposed various schemes to compress (e.g.\ quantize or sparsify) the gradients, thereby introducing additional variance $\omega \geq 1$ that might slow down convergence. For strongly convex functions with condition number $\kappa$ distributed among $n$ machines, we (i) give a scheme that converges in $\mathcal{O}((\kappa + \kappa \frac{\omega}{n} + \omega)\log(1/\epsilon))$ steps to a neighborhood of the optimal solution. For objective functions with a finite-sum structure, each worker having less than $m$ components, we (ii) present novel variance reduced schemes that converge in $\mathcal{O}((\kappa + \kappa \frac{\omega}{n} + \omega + m)\log(1/\epsilon))$ steps to arbitrary accuracy $\epsilon > 0$. These are the first methods that achieve linear convergence for arbitrary quantized updates. We also (iii) give analysis for the weakly convex and non-convex cases and (iv) verify in experiments that our novel variance reduced schemes are more efficient than the baselines.
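To illustrate the kind of compression operator the abstract refers to, the following is a minimal sketch (not the paper's own method) of an unbiased random-k sparsifier in Python. The function name `rand_k_sparsify` and all parameter choices are hypothetical, chosen for illustration; the sketch assumes the standard unbiasedness convention $\mathbb{E}[C(x)] = x$, under which this operator has variance factor $\omega = d/k - 1$.

```python
import numpy as np

def rand_k_sparsify(x, k, rng):
    """Keep k randomly chosen coordinates of x, scaled by d/k; zero the rest.

    The scaling makes the operator unbiased: averaging many independent
    compressions of the same vector recovers the original vector.
    """
    d = x.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(x)
    out[idx] = (d / k) * x[idx]
    return out

# Empirical check of unbiasedness: the mean over many compressions
# should approach the uncompressed vector x.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
avg = np.mean([rand_k_sparsify(x, 2, rng) for _ in range(50000)], axis=0)
print(np.max(np.abs(avg - x)))  # small deviation, shrinking with more samples
```

Only k of d coordinates (plus their indices) need to be transmitted per update, which is the communication saving the abstract's schemes exploit; the price is the extra variance $\omega$.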
dc.description.sponsorship: The authors would like to thank Xun Qian for the careful checking of the proofs and for spotting several typos in the analysis.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/abs/1904.05115
dc.relation.url: https://arxiv.org/pdf/1904.05115
dc.rights: Archived with thanks to arXiv
dc.title: Stochastic Distributed Learning with Gradient Quantization and Variance Reduction
dc.type: Preprint
dc.contributor.department: Computer Science
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Statistics
dc.contributor.department: Statistics Program
dc.eprint.version: Pre-print
dc.contributor.institution: Ecole Polytechnique Federale de Lausanne, Switzerland
dc.contributor.institution: University of Edinburgh, UK
dc.contributor.institution: Moscow Institute of Physics and Technology, Russian Federation
dc.identifier.arxivid: 1904.05115
kaust.person: Horvath, Samuel
kaust.person: Kovalev, Dmitry
kaust.person: Mishchenko, Konstantin
kaust.person: Richtarik, Peter
refterms.dateFOA: 2019-05-28T12:18:53Z


Files in this item

Name: 1904.05115.pdf
Size: 1.090 MB
Format: PDF
Description: Preprint
