dc.contributor.author: Safaryan, Mher
dc.contributor.author: Hanzely, Filip
dc.contributor.author: Richtarik, Peter
dc.date.accessioned: 2021-02-17T06:23:12Z
dc.date.available: 2021-02-17T06:23:12Z
dc.date.issued: 2021-02-14
dc.identifier.uri: http://hdl.handle.net/10754/667470
dc.description.abstract: Large scale distributed optimization has become the default tool for the training of supervised machine learning models with a large number of parameters and training data. Recent advancements in the field provide several mechanisms for speeding up the training, including compressed communication, variance reduction and acceleration. However, none of these methods is capable of exploiting the inherently rich data-dependent smoothness structure of the local losses beyond standard smoothness constants. In this paper, we argue that when training supervised models, smoothness matrices -- information-rich generalizations of the ubiquitous smoothness constants -- can and should be exploited for further dramatic gains, both in theory and practice. In order to further alleviate the communication burden inherent in distributed optimization, we propose a novel communication sparsification strategy that can take full advantage of the smoothness matrices associated with local losses. To showcase the power of this tool, we describe how our sparsification technique can be adapted to three distributed optimization algorithms -- DCGD, DIANA and ADIANA -- yielding significant savings in terms of communication complexity. The new methods always outperform the baselines, often dramatically so.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/2102.07245
dc.rights: Archived with thanks to arXiv
dc.title: Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization
dc.type: Preprint
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Visual Computing Center (VCC)
dc.contributor.department: Computer Science Program
dc.eprint.version: Pre-print
dc.contributor.institution: Toyota Technological Institute at Chicago (TTIC), Chicago, USA
dc.identifier.arxivid: 2102.07245
kaust.person: Safaryan, Mher
kaust.person: Richtarik, Peter
refterms.dateFOA: 2021-02-17T06:25:56Z


Files in this item

Name: 2102.07245.pdf
Size: 1.684Mb
Format: PDF
