Global Convergence of Arbitrary-Block Gradient Methods for Generalized Polyak-Łojasiewicz Functions

Handle URI:
http://hdl.handle.net/10754/626499
Title:
Global Convergence of Arbitrary-Block Gradient Methods for Generalized Polyak-Łojasiewicz Functions
Authors:
Csiba, Dominik; Richtarik, Peter
Abstract:
In this paper we introduce two novel generalizations of the theory for gradient descent type methods in the proximal setting. First, we introduce the proportion function, which we further use to analyze all known (and many new) block-selection rules for block coordinate descent methods under a single framework. This framework includes randomized methods with uniform, non-uniform or even adaptive sampling strategies, as well as deterministic methods with batch, greedy or cyclic selection rules. Second, the theory of strongly convex optimization was recently generalized to a specific class of non-convex functions satisfying the so-called Polyak-Łojasiewicz condition. To mirror this generalization in the weakly convex case, we introduce the Weak Polyak-Łojasiewicz condition, using which we give global convergence guarantees for a class of non-convex functions previously not considered in theory. Additionally, we establish (necessarily somewhat weaker) convergence guarantees for an even larger class of non-convex functions satisfying a certain smoothness assumption only. By combining the two above-mentioned generalizations we recover the state-of-the-art convergence guarantees for a large class of previously known methods and setups as special cases of our general framework. Moreover, our framework allows for the derivation of new guarantees for many new combinations of methods and setups, as well as a large class of novel non-convex objectives. The flexibility of our approach offers a lot of potential for future research, as a new block selection procedure will have a convergence guarantee for all objectives considered in our framework, while a new objective analyzed under our approach will have a whole fleet of block selection rules with convergence guarantees readily available.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program
Publisher:
arXiv
Issue Date:
9-Sep-2017
ARXIV:
arXiv:1709.03014
Type:
Preprint
Additional Links:
http://arxiv.org/abs/1709.03014v1; http://arxiv.org/pdf/1709.03014v1
Appears in Collections:
Other/General Submission; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Csiba, Dominik | en
dc.contributor.author | Richtarik, Peter | en
dc.date.accessioned | 2017-12-28T07:32:13Z | -
dc.date.available | 2017-12-28T07:32:13Z | -
dc.date.issued | 2017-09-09 | en
dc.identifier.uri | http://hdl.handle.net/10754/626499 | -
dc.publisher | arXiv | en
dc.relation.url | http://arxiv.org/abs/1709.03014v1 | en
dc.relation.url | http://arxiv.org/pdf/1709.03014v1 | en
dc.rights | Archived with thanks to arXiv | en
dc.title | Global Convergence of Arbitrary-Block Gradient Methods for Generalized Polyak-Łojasiewicz Functions | en
dc.type | Preprint | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.eprint.version | Pre-print | en
dc.contributor.institution | University of Edinburgh | en
dc.identifier.arxivid | arXiv:1709.03014 | en
kaust.author | Richtarik, Peter | en