• Login
    View Item 
    •   Home
    • Research
    • Preprints
    • View Item
    •   Home
    • Research
    • Preprints
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Browse

    All of KAUSTCommunitiesIssue DateSubmit DateThis CollectionIssue DateSubmit Date

    My Account

    Login

    Quick Links

    Open Access PolicyORCID LibguideTheses and Dissertations LibguideSubmit an Item

    Statistics

    Display statistics

    On Biased Compression for Distributed Learning

    • CSV
    • RefMan
    • EndNote
    • BibTex
    • RefWorks
    Thumbnail
    Name:
    Preprintfile1.pdf
    Size:
    1.581Mb
    Format:
    PDF
    Description:
    Pre-print
    Download
    Type
    Preprint
    Authors
    Beznosikov, Aleksandr
    Horvath, Samuel
    Richtarik, Peter cc
    Safaryan, Mher cc
    KAUST Department
    Computer Science Program
    Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
    Statistics
    Statistics Program
    Date
    2020-02-27
    Permanent link to this record
    http://hdl.handle.net/10754/662280
    
    Metadata
    Show full item record
    Abstract
    In the last few years, various communication compression techniques have emerged as an indispensable tool helping to alleviate the communication bottleneck in distributed learning. However, despite the fact {\em biased} compressors often show superior performance in practice when compared to the much more studied and understood {\em unbiased} compressors, very little is known about them. In this work we study three classes of biased compression operators, two of which are new, and their performance when applied to (stochastic) gradient descent and distributed (stochastic) gradient descent. We show for the first time that biased compressors can lead to linear convergence rates both in the single node and distributed settings. Our {\em distributed} SGD method enjoys the ergodic rate $\mathcal{O}\left(\frac{\delta L \exp(-K) }{\mu} + \frac{(C + D)}{K\mu}\right)$, where $\delta$ is a compression parameter which grows when more compression is applied, $L$ and $\mu$ are the smoothness and strong convexity constants, $C$ captures stochastic gradient noise ($C=0$ if full gradients are computed on each node) and $D$ captures the variance of the gradients at the optimum ($D=0$ for over-parameterized models). Further, via a theoretical study of several synthetic and empirical distributions of communicated gradients, we shed light on why and by how much biased compressors outperform their unbiased variants. Finally, we propose a new highly performing biased compressor---combination of Top-$k$ and natural dithering---which in our experiments outperforms all other compression techniques.
    Publisher
    arXiv
    arXiv
    2002.12410
    Additional Links
    https://arxiv.org/pdf/2002.12410
    Collections
    Preprints; Computer Science Program; Statistics Program; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division

    entitlement

     
    DSpace software copyright © 2002-2023  DuraSpace
    Quick Guide | Contact Us | KAUST University Library
    Open Repository is a service hosted by 
    Atmire NV
     

    Export search results

    The export option will allow you to export the current search results of the entered query to a file. Different formats are available for download. To export the items, click on the button corresponding with the preferred download format.

    By default, clicking on the export buttons will result in a download of the allowed maximum amount of items. For anonymous users the allowed maximum amount is 50 search results.

    To select a subset of the search results, click "Selective Export" button and make a selection of the items you want to export. The amount of items that can be exported at once is similarly restricted as the full export.

    After making a selection, click one of the export format buttons. The amount of items that will be exported is indicated in the bubble next to export format.