A distributed flexible delay-tolerant proximal gradient algorithm
KAUST Department: Computer Science Program
Permanent link to this record: http://hdl.handle.net/10754/662928
Abstract: We develop and analyze an asynchronous algorithm for distributed convex optimization when the objective can be written as a sum of smooth functions, local to each worker, and a nonsmooth function. Unlike many existing methods, our distributed algorithm is adjustable to various levels of communication cost, delays, machines' computational power, and functions' smoothness. A unique feature is that the step sizes depend on neither the communication delays nor the number of machines, which is highly desirable for scalability. We prove that the algorithm converges linearly in the strongly convex case, and provide guarantees of convergence for the non-strongly convex case. The obtained rates are the same as those of the vanilla proximal gradient algorithm over some introduced epoch sequence that subsumes the delays of the system. We provide numerical results on large-scale machine learning problems to demonstrate the merits of the proposed method.
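For orientation, the baseline the abstract compares against can be sketched as follows. This is a minimal illustration of the vanilla, synchronous proximal gradient method on a toy problem, not the asynchronous algorithm of the paper. The choice of an l1 penalty as the nonsmooth function, the least-squares form of each worker's smooth function, and all names (`soft_threshold`, `local_gradient`, `proximal_gradient`, `workers`) are assumptions made for this example only.

```python
# Sketch of the *vanilla* (synchronous) proximal gradient method, not the
# asynchronous algorithm of the paper. All names here are hypothetical.

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def local_gradient(x, a, b):
    """Gradient of the worker-local smooth term f_i(x) = 0.5 * (a.x - b)^2."""
    residual = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [residual * aj for aj in a]

def proximal_gradient(workers, lam, step, iterations):
    """Minimize (1/M) * sum_i f_i(x) + lam * ||x||_1 by proximal gradient."""
    dim = len(workers[0][0])
    x = [0.0] * dim
    for _ in range(iterations):
        # Average the local gradients, as a synchronous master would do.
        grad = [0.0] * dim
        for a, b in workers:
            g = local_gradient(x, a, b)
            grad = [gj + gij / len(workers) for gj, gij in zip(grad, g)]
        # Gradient step on the smooth part, then prox of the nonsmooth part.
        shifted = [xj - step * gj for xj, gj in zip(x, grad)]
        x = soft_threshold(shifted, step * lam)
    return x

# Two toy "workers", each holding one data pair (a_i, b_i).
workers = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.1)]
x_star = proximal_gradient(workers, lam=0.1, step=1.0, iterations=200)
```

On this toy instance the averaged smooth part has Lipschitz gradient constant 0.5, so a step of 1.0 is admissible, and the iterates approach the minimizer (1.8, 0): the second coordinate is set exactly to zero by the proximal step. Every worker's gradient is awaited at each iteration here, which is precisely the synchronization that the asynchronous method described in the abstract is designed to avoid.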
Citation: Mishchenko, K., Iutzeler, F., & Malick, J. (2020). A Distributed Flexible Delay-Tolerant Proximal Gradient Algorithm. SIAM Journal on Optimization, 30(1), 933–959. doi:10.1137/18m1194699
Sponsors: We thank Robert Gower for valuable comments on the first versions of this paper.
Journal: SIAM Journal on Optimization