Show simple item record

dc.contributor.author	Mishchenko, Konstantin
dc.contributor.author	Wang, Bokun
dc.contributor.author	Kovalev, Dmitry
dc.contributor.author	Richtarik, Peter
dc.date.accessioned	2021-03-01T06:22:20Z
dc.date.available	2021-03-01T06:22:20Z
dc.date.issued	2021-02-16
dc.identifier.uri	http://hdl.handle.net/10754/667733
dc.description.abstract	We propose a family of lossy integer compressions for Stochastic Gradient Descent (SGD) that do not communicate a single float. This is achieved by multiplying floating-point vectors with a number known to every device and then rounding to an integer number. Our theory shows that the iteration complexity of SGD does not change up to constant factors when the vectors are scaled properly. Moreover, this holds for both convex and non-convex functions, with and without overparameterization. In contrast to other compression-based algorithms, ours preserves the convergence rate of SGD even on non-smooth problems. Finally, we show that when the data is significantly heterogeneous, it may become increasingly hard to keep the integers bounded, and we propose an alternative algorithm, IntDIANA, to solve this type of problem.
dc.publisher	arXiv
dc.relation.url	https://arxiv.org/pdf/2102.08374.pdf
dc.rights	Archived with thanks to arXiv
dc.title	IntSGD: Floatless Compression of Stochastic Gradients
dc.type	Preprint
dc.contributor.department	Computer Science Program
dc.contributor.department	Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department	King Abdullah University of Science and Technology
dc.contributor.department	Computer Science
dc.eprint.version	Pre-print
dc.identifier.arxivid	2102.08374
kaust.person	Mishchenko, Konstantin
kaust.person	Wang, Bokun
kaust.person	Kovalev, Dmitry
kaust.person	Richtarik, Peter
refterms.dateFOA	2021-03-01T06:22:50Z
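
The abstract above describes the compression mechanism: each device multiplies its floating-point vector by a scaling factor known to every device and rounds the result to integers, so only integers are communicated. Below is a minimal NumPy sketch of that idea; it is not the authors' implementation, and the function names, the value of alpha, and the use of unbiased randomized rounding are illustrative assumptions.

import numpy as np

def int_compress(v, alpha, rng):
    # Scale by the shared factor, then round each entry up with probability
    # equal to its fractional part (unbiased randomized rounding, assumed here).
    scaled = alpha * v
    low = np.floor(scaled)
    round_up = rng.random(v.shape) < (scaled - low)
    return (low + round_up).astype(np.int64)

def int_decompress(q, alpha):
    # Undo the scaling to obtain an approximation of the original float vector.
    return q.astype(np.float64) / alpha

rng = np.random.default_rng(0)
g = rng.standard_normal(5)          # stand-in for a stochastic gradient
alpha = 1024.0                      # shared scaling factor (illustrative value)
q = int_compress(g, alpha, rng)     # integer message sent between devices
g_hat = int_decompress(q, alpha)    # approximate gradient after decompression
print(q, np.max(np.abs(g - g_hat)))  # per-entry error is at most 1/alpha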


Files in this item

Name: Preprintfile1.pdf
Size: 4.006 MB
Format: PDF
Description: Pre-print
