Cyclic block coordinate descent-type (CBCD-type) methods have shown remarkable computational performance for solving strongly convex minimization problems. Typical applications include many popular statistical machine learning methods such as elastic-net regression, ridge penalized logistic regression, and sparse additive regression. Existing optimization literature has shown that CBCD-type methods attain an iteration complexity of O(p · log(1/ϵ)), where ϵ is a pre-specified accuracy of the objective value and p is the number of blocks. However, this iteration complexity depends explicitly on p, and is therefore at least p times worse than that of gradient descent (GD) methods. To bridge this theoretical gap, we propose an improved convergence analysis for CBCD-type methods. In particular, we first show that for a family of quadratic minimization problems, the iteration complexity of CBCD-type methods matches that of GD methods in terms of dependency on p (up to a log² p factor). Thus our complexity bounds are sharper than the existing bounds by at least a factor of p/log² p. We also provide a lower bound to confirm that our improved complexity bounds are tight (up to a log² p factor) if the largest and smallest eigenvalues of the Hessian matrix do not scale with p. Finally, we generalize our analysis to other strongly convex minimization problems beyond quadratic ones.
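As a rough illustration of the setting described in the abstract, the following is a minimal sketch of cyclic coordinate descent and gradient descent applied to a strongly convex quadratic, f(x) = ½ xᵀAx − bᵀx. It is not the paper's code or experimental setup. It assumes blocks of size one, exact minimization over each coordinate, and a tridiagonal test matrix whose eigenvalues stay in a fixed interval regardless of p; the names `make_problem`, `cyclic_cd`, `gradient_descent`, and `grad_norm` are hypothetical.

```python
import math


def make_problem(p):
    """Tridiagonal A (2 on the diagonal, -0.5 off it) and b = all ones.

    Assumed test problem: the eigenvalues of A lie strictly between
    1 and 3 for every p, so they do not scale with p.
    """
    A = [[0.0] * p for _ in range(p)]
    for i in range(p):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -0.5
        if i < p - 1:
            A[i][i + 1] = -0.5
    return A, [1.0] * p


def grad_norm(A, x, b):
    """Euclidean norm of the gradient A x - b."""
    p = len(b)
    g = [sum(A[i][k] * x[k] for k in range(p)) - b[i] for i in range(p)]
    return math.sqrt(sum(v * v for v in g))


def cyclic_cd(A, b, tol=1e-8, max_passes=10000):
    """Cyclic coordinate descent with exact per-coordinate minimization."""
    p = len(b)
    x = [0.0] * p
    for n_pass in range(1, max_passes + 1):
        for j in range(p):  # fixed cyclic order 0, 1, ..., p-1
            rest = sum(A[j][k] * x[k] for k in range(p) if k != j)
            x[j] = (b[j] - rest) / A[j][j]  # minimizer over coordinate j
        if grad_norm(A, x, b) <= tol:
            break
    return x, n_pass


def gradient_descent(A, b, tol=1e-8, max_iters=10000):
    """Gradient descent with step 1/L, where L bounds the top eigenvalue."""
    p = len(b)
    # Gershgorin bound: the largest absolute row sum is >= lambda_max(A).
    L = max(sum(abs(v) for v in row) for row in A)
    x = [0.0] * p
    for n_iter in range(1, max_iters + 1):
        g = [sum(A[i][k] * x[k] for k in range(p)) - b[i] for i in range(p)]
        x = [x[i] - g[i] / L for i in range(p)]
        if grad_norm(A, x, b) <= tol:
            break
    return x, n_iter


if __name__ == "__main__":
    for p in (10, 40):
        A, b = make_problem(p)
        _, n_cd = cyclic_cd(A, b)
        _, n_gd = gradient_descent(A, b)
        print(f"p={p}: CD passes={n_cd}, GD iterations={n_gd}")
```

Running the script prints, for each p, how many full cyclic passes and how many gradient steps were needed to reach the same gradient-norm tolerance. A single small experiment like this only hints at how the counts vary with p and does not substitute for the complexity bounds stated above.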
|Original language||English (US)|
|Number of pages||9|
|State||Published - 2016|
|Event||19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 - Cadiz, Spain (May 9 2016 → May 11 2016)|
Bibliographical note
Funding Information:
This research is supported by NSF DMS1454377-CAREER; NSF IIS 1546482-BIGDATA; NIH R01MH102339; NSF IIS1408910; NSF IIS1332109; NIH R01GM083084.