Anderson Acceleration of Gradient Methods with Energy for Optimization Problems

Hailiang Liu, Jia-Hao He, Xuping Tian

Communications on Applied Mathematics and Computation, 2023, Vol. 6, Issue 2: 1299-1318. DOI: 10.1007/s42967-023-00327-0

Abstract

Anderson acceleration (AA) is an extrapolation technique designed to speed up fixed-point iterations. For optimization problems, we propose a novel algorithm that combines AA with the energy adaptive gradient method (AEGD) [arXiv:2010.05109]. Although AEGD is not a fixed-point iteration, the feasibility of our algorithm is guaranteed by the convergence theory for AEGD. We also establish rigorous convergence rates for AA applied to gradient descent (GD), showing that the rate improves by an acceleration factor equal to the gain at each implementation of AA-GD. Our experimental results show that the proposed AA-AEGD algorithm requires little tuning of hyperparameters and exhibits notably fast convergence.
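To make the mechanism in the abstract concrete, the following is a minimal sketch (in Python) of type-II Anderson acceleration wrapped around the gradient-descent map g(x) = x - eta * grad(x). The function name `anderson_gd` and the defaults for the step size `eta` and memory depth `m` are illustrative assumptions; this is plain AA-GD under standard conventions, not the paper's AA-AEGD scheme.

```python
import numpy as np

def anderson_gd(grad, x0, eta=0.1, m=5, iters=100, tol=1e-10):
    """Type-II Anderson acceleration of the gradient-descent map
    g(x) = x - eta * grad(x). Illustrative sketch only."""
    x = np.asarray(x0, dtype=float)
    X, G = [], []                        # histories of iterates x_j and images g(x_j)
    for _ in range(iters):
        gx = x - eta * grad(x)           # one plain GD step (the fixed-point map)
        X.append(x.copy())
        G.append(gx.copy())
        if len(X) > m + 1:               # keep at most m + 1 past pairs
            X.pop(0)
            G.pop(0)
        F = [g_j - x_j for g_j, x_j in zip(G, X)]   # residuals f_j = g(x_j) - x_j
        if np.linalg.norm(F[-1]) < tol:
            break
        if len(F) == 1:                  # not enough history yet: fall back to GD
            x = gx
            continue
        # Least-squares fit over residual differences: min_gamma ||f_k - dF @ gamma||_2.
        dF = np.column_stack([F[j + 1] - F[j] for j in range(len(F) - 1)])
        dG = np.column_stack([G[j + 1] - G[j] for j in range(len(G) - 1)])
        gamma = np.linalg.lstsq(dF, F[-1], rcond=None)[0]
        x = gx - dG @ gamma              # extrapolated iterate x_{k+1} = g_k - dG @ gamma
    return x
```

For a strongly convex quadratic with gradient A @ x - b, a loop like this typically reaches machine precision in far fewer iterations than unaccelerated GD at the same step size, which is the kind of speedup the gain-factor analysis in the paper quantifies.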

Keywords

Anderson acceleration (AA) / Gradient descent (GD) / Energy stability

Cite this article

Hailiang Liu, Jia-Hao He, Xuping Tian. Anderson Acceleration of Gradient Methods with Energy for Optimization Problems. Communications on Applied Mathematics and Computation, 2023, 6(2): 1299-1318. https://doi.org/10.1007/s42967-023-00327-0

References

[1] Anderson DG. Iterative procedures for nonlinear integral equations. J. ACM, 1965, 12(4): 547-560
[2] Bertrand Q, Massias M. Anderson acceleration of coordinate descent. In: International Conference on Artificial Intelligence and Statistics, PMLR, 2021: 1288-1296
[3] Bian W, Chen XJ, Kelley CT. Anderson acceleration for a class of nonsmooth fixed-point problems. SIAM J. Sci. Comput., 2021, 43(5): S1-S20
[4] Bollapragada R, Scieur D, d'Aspremont A. Nonlinear acceleration of momentum and primal-dual algorithms. Math. Program., 2022, 198: 325
[5] Both JW, Kumar K, Nordbotten JM, Radu FA. Anderson accelerated fixed-stress splitting schemes for consolidation of unsaturated porous media. Comput. Math. Appl., 2019, 77(6): 1479-1502
[6] Chen XJ, Kelley CT. Convergence of the EDIIS algorithm for nonlinear equations. SIAM J. Sci. Comput., 2019, 41(1): A365-A379
[7] Eddy RP. Extrapolating to the limit of a vector sequence. In: Wang PCC, Schoenstadt AL, Russak BI, Comstock C (eds) Information Linkage Between Applied Mathematics and Industry. Academic Press, 1979: 387-396
[8] Evans C, Pollock S, Rebholz LG, Xiao MY. A proof that Anderson acceleration improves the convergence rate in linearly converging fixed-point methods (but not in those converging quadratically). SIAM J. Numer. Anal., 2020, 58(1): 788-810
[9] Eyert V. A comparative study on methods for convergence acceleration of iterative vector sequences. J. Comput. Phys., 1996, 124(2): 271-285
[10] Fang HR, Saad Y. Two classes of multisecant methods for nonlinear acceleration. Numer. Linear Algebra Appl., 2009, 16(3): 197-221
[11] Fu A, Zhang J, Boyd S. Anderson accelerated Douglas-Rachford splitting. SIAM J. Sci. Comput., 2020, 42(6): A3560-A3583
[12] Geist M, Scherrer B. Anderson acceleration for reinforcement learning. arXiv:1809.09501, 2018
[13] Geredeli PG, Rebholz LG, Vargun D, Zytoon A. Improved convergence of the Arrow-Hurwicz iteration for the Navier-Stokes equation via grad-div stabilization and Anderson acceleration. J. Comput. Appl. Math., 2022, 442
[14] Li Z, Li J. A fast Anderson-Chebyshev acceleration for nonlinear optimization. In: International Conference on Artificial Intelligence and Statistics, PMLR, 2020: 1047-1057
[15] Liu HL, Tian XP. AEGD: adaptive gradient descent with energy. arXiv:2010.05109, 2020
[16] Liu HL, Tian XP. An adaptive gradient method with energy and momentum. Ann. Appl. Math., 2022, 38(2): 183-222
[17] Liu HL, Tian XP. Dynamic behavior for a gradient algorithm with energy and momentum. arXiv:2203.12199, 2022: 1-20
[18] Liu HL, Tian XP. SGEM: stochastic gradient with energy and momentum. arXiv:2208.02208, 2022: 1-24
[19] Lott PA, Walker HF, Woodward CS, Yang UM. An accelerated Picard method for nonlinear systems related to variably saturated flow. Adv. Water Resour., 2012, 38: 92-101
[20] Luo S, Liu QH. A fixed-point iteration method for high frequency Helmholtz equations. J. Sci. Comput., 2022, 93: 74
[21] Mai V, Johansson M. Anderson acceleration of proximal gradient methods. In: Proceedings of the 37th International Conference on Machine Learning, PMLR, 2020, 119: 6620-6629
[22] Nesterov YE. A method for solving the convex programming problem with convergence rate $O(1/k^2)$. Dokl. Akad. Nauk SSSR, 1983, 269: 543-547
[23] Oosterlee CW, Washio T. Krylov subspace acceleration of nonlinear multigrid with application to recirculating flows. SIAM J. Sci. Comput., 2000, 21(5): 1670-1690
[24] Pasini ML, Yin J, Reshniak V, Stoyanov M. Stable Anderson acceleration for deep learning. arXiv:2110.14813, 2021
[25] Polyak BT. Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys., 1964, 4(5): 1-17
[26] Poon C, Liang JW. Trajectory of alternating direction method of multipliers and adaptive acceleration. Adv. Neural Inf. Process. Syst., 2019
[27] Potra FA, Engler H. A characterization of the behavior of the Anderson acceleration on linear problems. Linear Algebra Appl., 2013, 438(3): 1002-1011
[28] Pulay P. Convergence acceleration of iterative sequences: the case of SCF iteration. Chem. Phys. Lett., 1980, 73(2): 393-398
[29] Rohwedder T, Schneider R. An analysis for the DIIS acceleration method used in quantum chemistry calculations. J. Math. Chem., 2011, 49(9): 1889-1914
[30] Scieur D, Bach F, d'Aspremont A. Nonlinear acceleration of stochastic algorithms. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), 2017: 3985-3994
[31] Scieur D, d'Aspremont A, Bach F. Regularized nonlinear acceleration. In: Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16), 2016: 712-720
[32] Smith DA, Ford WF, Sidi A. Extrapolation methods for vector sequences. SIAM Rev., 1987, 29(2): 199-233
[33] de Sterck H, He YH. On the asymptotic linear convergence speed of Anderson acceleration, Nesterov acceleration, and nonlinear GMRES. SIAM J. Sci. Comput., 2021, 43(5): S21-S46
[34] Su W, Boyd S, Candes E. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. J. Mach. Learn. Res., 2016, 17(1): 5312-5354
[35] Toth A, Ellis JA, Evans T, Hamilton S, Kelley CT, Pawlowski R, Slattery S. Local improvement results for Anderson acceleration with inaccurate function evaluations. SIAM J. Sci. Comput., 2017, 39(5): S47-S65
[36] Toth A, Kelley CT. Convergence analysis for Anderson acceleration. SIAM J. Numer. Anal., 2015, 53(2): 805-819
[37] Walker HF, Ni P. Anderson acceleration for fixed-point iterations. SIAM J. Numer. Anal., 2011, 49(4): 1715-1735
[38] Wang D, He Y, de Sterck H. On the asymptotic linear convergence speed of Anderson acceleration applied to ADMM. J. Sci. Comput., 2021, 88(2): 38
[39] Yang A. Anderson acceleration for seismic inversion. Geophysics, 2021, 86(1): R99-R108
[40] Zhang J, O'Donoghue B, Boyd S. Globally convergent type-I Anderson acceleration for nonsmooth fixed-point iterations. SIAM J. Optim., 2020, 30(4): 3170-3197
Funding
Directorate for Mathematical and Physical Sciences (1812666)
