Abstract
Discretization of continuous-time diffusion processes is a widely recognized sampling method. However, the canonical Euler–Maruyama discretization of the Langevin diffusion process, referred to as the unadjusted Langevin algorithm (ULA), has been studied mostly in the context of smooth (gradient-Lipschitz) and strongly log-concave densities, a restriction that considerably hinders its deployment in many sciences, including statistics and machine learning. In this paper, we establish several theoretical contributions to the literature on such sampling methods for non-convex distributions. In particular, we introduce a new mixture weakly smooth condition, under which we prove that ULA converges provided an additional log-Sobolev inequality holds. We also show that ULA applied to a smoothed potential converges in the $L_{2}$-Wasserstein distance. Moreover, using convexification of the non-convex domain (Ma et al. in Proc Natl Acad Sci 116(42):20881–20885, 2019) in combination with regularization, we establish convergence in Kullback–Leibler divergence, with the number of iterations needed to reach an $\epsilon $-neighborhood of the target distribution depending only polynomially on the dimension. Finally, we relax the conditions of Vempala and Wibisono (Advances in Neural Information Processing Systems, 2019) and prove convergence guarantees under isoperimetry and non-strong convexity at infinity.
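The ULA recursion mentioned in the abstract is the Euler–Maruyama discretization of the Langevin diffusion. A minimal illustrative sketch follows; it is not code from the paper, and the function name `ula_sample` and the Gaussian example are our own choices for illustration:

```python
import numpy as np

def ula_sample(grad_U, x0, step, n_iters, rng=None):
    # ULA update: x_{k+1} = x_k - step * grad U(x_k) + sqrt(2*step) * xi_k,
    # the Euler-Maruyama discretization of dX_t = -grad U(X_t) dt + sqrt(2) dB_t.
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_iters,) + x.shape)
    for k in range(n_iters):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Illustration: standard Gaussian target, U(x) = x^2/2, so grad U(x) = x.
draws = ula_sample(lambda x: x, x0=[3.0], step=0.05, n_iters=5000)
```

Because no Metropolis accept/reject step is applied, the chain targets only a biased approximation of the stationary distribution, with bias controlled by the step size; the paper's analysis quantifies this discretization error under the weakly smooth conditions above.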
Keywords
Langevin Monte Carlo · Kullback–Leibler divergence · Log-Sobolev inequality · Convexification · Mixture weakly smooth

Mathematics Subject Classification
60J22 · 62F15 · 62M45 · 65C05
Cite this article
Dao Nguyen, Xin Dang, Yixin Chen.
Unadjusted Langevin Algorithm for Non-convex Weakly Smooth Potentials.
Communications in Mathematics and Statistics, 2025, 13(4): 979–1036. DOI: 10.1007/s40304-023-00350-w
[1] Arellano-Valle, R.B., Richter, W.-D.: On skewed continuous l_{n,p}-symmetric distributions. Chilean J. Stat. 3(2), 193–212 (2012)
[2] Bakry, D., Émery, M.: Diffusions hypercontractives. In: Séminaire de Probabilités XIX 1983/84, pp. 177–206. Springer, Berlin (1985)
[3] Bernton, E.: Langevin Monte Carlo and JKO splitting. arXiv preprint arXiv:1802.08671 (2018)
[4] Bobkov, S.G.: Isoperimetric and analytic inequalities for log-concave probability measures. Ann. Probab. 27(4), 1903–1921 (1999)
[5] Cattiaux, P., Guillin, A., Wu, L.-M.: A note on Talagrand's transportation inequality and logarithmic Sobolev inequality. Probab. Theory Relat. Fields 148(1–2), 285–304 (2010)
[6] Chatterji, N.S., Diakonikolas, J., Jordan, M.I., Bartlett, P.L.: Langevin Monte Carlo without smoothness. arXiv preprint arXiv:1905.13285 (2019)
[7] Chen, Z., Vempala, S.S.: Optimal convergence rate of Hamiltonian Monte Carlo for strongly logconcave distributions. arXiv preprint arXiv:1905.02313 (2019)
[8] Cheng, X., Bartlett, P.L.: Convergence of Langevin MCMC in KL-divergence. Proc. Mach. Learn. Res. 83, 186–211 (2018)
[9] Dalalyan, A.S.: Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent. arXiv preprint arXiv:1704.04752 (2017)
[10] Dalalyan, A.S., Riou-Durand, L., Karagulyan, A.: Bounding the error of discretized Langevin algorithms for non-strongly log-concave targets. arXiv preprint arXiv:1906.08530 (2019)
[11] Dalalyan, A.S., Karagulyan, A.: User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient. Stoch. Process. Appl. 129(12), 5278–5311 (2019)
[12] Durmus, A., Moulines, E., Saksman, E.: On the convergence of Hamiltonian Monte Carlo. arXiv preprint arXiv:1705.00166 (2017)
[13] Durmus, A., Moulines, E., Pereyra, M.: Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau. SIAM J. Imag. Sci. 11(1), 473–506 (2018)
[14] Durmus, A., Majewski, S., Miasojedow, B.: Analysis of Langevin Monte Carlo via convex optimization. J. Mach. Learn. Res. 20(73) (2019)
[15] Dwivedi, R., Chen, Y., Wainwright, M.J., Yu, B.: Log-concave sampling: Metropolis–Hastings algorithms are fast! In: Conference on Learning Theory, pp. 793–797 (2018)
[16] Dwivedi, R., Chen, Y., Wainwright, M.J., Yu, B.: Log-concave sampling: Metropolis–Hastings algorithms are fast. J. Mach. Learn. Res. 20(183), 1–42 (2019)
[17] Eberle, A., Guillin, A., Zimmer, R.: Couplings and quantitative contraction rates for Langevin dynamics. Ann. Probab. 47(4), 1982–2010 (2019)
[18] Erdogdu, M.A., Hosseinzadeh, R.: On the convergence of Langevin Monte Carlo: the interplay between tail growth and smoothness. arXiv preprint arXiv:2005.13097 (2020)
[19] Gorham, J., Mackey, L.: Measuring sample quality with kernels. In: International Conference on Machine Learning, pp. 1292–1301 (2017)
[20] Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97(4), 1061–1083 (1975)
[21] Holley, R., Stroock, D.W.: Logarithmic Sobolev inequalities and stochastic Ising models (1986)
[22] Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM J. Math. Anal. 29(1), 1–17 (1998)
[23] Kečkić, J.D., Vasić, P.M.: Some inequalities for the gamma function. Publ. de l'Institut Math. 11(31), 107–114 (1971)
[24] Kloeden, P.E., Platen, E.: Stochastic differential equations. In: Numerical Solution of Stochastic Differential Equations, pp. 103–160. Springer (1992)
[25] Ledoux, M.: Logarithmic Sobolev inequalities for unbounded spin systems revisited. In: Séminaire de Probabilités XXXV, pp. 167–194. Springer, Berlin (2001)
[26] Lee, Y.T., Vempala, S.S.: Convergence rate of Riemannian Hamiltonian Monte Carlo and faster polytope volume computation. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1115–1121 (2018)
[27] Li, X., Wu, Y., Mackey, L., Erdogdu, M.A.: Stochastic Runge–Kutta accelerates Langevin Monte Carlo and beyond. In: Advances in Neural Information Processing Systems, pp. 7748–7760 (2019)
[28] Lovász, L., Vempala, S.: Hit-and-run from a corner. In: Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, pp. 310–314 (2004)
[29] Lovász, L., Vempala, S.: Fast algorithms for logconcave functions: sampling, rounding, integration and optimization. In: 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), pp. 57–68. IEEE (2006)
[30] Lovász, L., Vempala, S.: The geometry of logconcave functions and sampling algorithms. Random Struct. Algorithms 30(3), 307–358 (2007)
[31] Ma, Y.-A., Chen, Y., Jin, C., Flammarion, N., Jordan, M.I.: Sampling can be faster than optimization. Proc. Natl. Acad. Sci. 116(42), 20881–20885 (2019)
[32] Mangoubi, O., Smith, A.: Rapid mixing of Hamiltonian Monte Carlo on strongly log-concave distributions. arXiv preprint arXiv:1708.07114 (2017)
[33] Mangoubi, O., Vishnoi, N.: Dimensionally tight bounds for second-order Hamiltonian Monte Carlo. In: Advances in Neural Information Processing Systems, pp. 6027–6037 (2018)
[34] Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17(2), 527–566 (2017)
[35] Raginsky, M., Rakhlin, A., Telgarsky, M.: Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis. arXiv preprint arXiv:1702.03849 (2017)
[36] Richter, W.-D.: Generalized spherical and simplicial coordinates. J. Math. Anal. Appl. 336(2), 1187–1202 (2007)
[37] Robert, C., Casella, G.: Monte Carlo Statistical Methods. Springer, Berlin (2013)
[38] Vempala, S., Wibisono, A.: Rapid convergence of the unadjusted Langevin algorithm: isoperimetry suffices. In: Advances in Neural Information Processing Systems, pp. 8094–8106 (2019)
[39] Villani, C.: Optimal Transport: Old and New, vol. 338. Springer, Berlin (2008)
[40] Villani, C.: Topics in Optimal Transportation, vol. 58. American Mathematical Society (2021)
[41] Welling, M., Teh, Y.W.: Bayesian learning via stochastic gradient Langevin dynamics. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 681–688 (2011)
[42] Xu, P., Chen, J., Zou, D., Gu, Q.: Global convergence of Langevin dynamics based algorithms for nonconvex optimization. In: Advances in Neural Information Processing Systems, pp. 3122–3133 (2018)
[43] Yan, M.: Extension of convex function. arXiv preprint arXiv:1207.0944 (2012)
RIGHTS & PERMISSIONS
School of Mathematical Sciences, University of Science and Technology of China and Springer-Verlag GmbH Germany, part of Springer Nature