Convergence Analysis of Gradient Descent for Eigenvector Computation
Convergence analysis of gradient descent for top-k eigenspace computation
Type
Conference Paper
Authors
Xu, Zhiqiang
Cao, Xin
Gao, Xin

KAUST Department
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Date
2018-07-05
Online Publication Date
2018-07-05
Print Publication Date
2018-07
Permanent link to this record
http://hdl.handle.net/10754/628358
Abstract
We present a novel, simple and systematic convergence analysis of gradient descent for eigenvector computation. As a popular, practical, and provable approach to numerous machine learning problems, gradient descent has found successful applications to eigenvector computation as well. However, surprisingly, it lacks a thorough theoretical analysis for the underlying geodesically non-convex problem. In this work, the convergence of the gradient descent solver for the leading eigenvector computation is shown to be at a global rate O(min{ (lambda_1/Delta_p)^2 log(1/epsilon), 1/epsilon }), where Delta_p = lambda_p - lambda_{p+1} > 0 represents the generalized positive eigengap and always exists without loss of generality, with lambda_i being the i-th largest eigenvalue of the given real symmetric matrix and p being the multiplicity of lambda_1. The rate is linear at (lambda_1/Delta_p)^2 log(1/epsilon) if (lambda_1/Delta_p)^2 = O(1), and otherwise sub-linear at O(1/epsilon). We also show that the convergence depends only logarithmically, rather than quadratically, on the initial iterate. In particular, this is the first time that linear convergence in the case where the conventionally considered eigengap Delta_1 = lambda_1 - lambda_2 = 0 but the generalized eigengap Delta_p satisfies (lambda_1/Delta_p)^2 = O(1), as well as the logarithmic dependence on the initial iterate, have been established for the gradient descent solver. We are also the first to leverage the log principal angle between the iterate and the space of globally optimal solutions for the analysis. Theoretical properties are verified in experiments.
Citation
Xu Z, Cao X, Gao X (2018) Convergence Analysis of Gradient Descent for Eigenvector Computation. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Available: http://dx.doi.org/10.24963/ijcai.2018/407.
Sponsors
This research is supported in part by funding from King Abdullah University of Science and Technology (KAUST).
Conference/Event name
The 27th International Joint Conference on Artificial Intelligence and The 23rd European Conference on Artificial Intelligence (IJCAI-ECAI-18)
Additional Links
https://www.ijcai.org/proceedings/2018/407
DOI
10.24963/ijcai.2018/407