Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
Wes Whiting, Bao Wang, Jack Xin
We prove, under mild conditions, the convergence of a Riemannian gradient descent method for a hyperbolic neural network regression model, in both the batch and stochastic gradient descent settings. We also discuss a Riemannian version of the Adam algorithm, and we present numerical simulations of these algorithms on various benchmarks.
Keywords: Hyperbolic neural network / Riemannian gradient descent / Riemannian Adam (RAdam) / Training convergence
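To illustrate the optimization procedure studied in the paper, the following is a minimal sketch (not the authors' code) of a single Riemannian stochastic gradient descent step on the Poincaré ball model of hyperbolic space, under standard assumptions: the Euclidean gradient is rescaled by the inverse of the Poincaré metric, and the parameter is moved along the exponential map, implemented via Möbius addition. The function names and the toy objective are illustrative only.

```python
# Minimal sketch of one Riemannian SGD step on the Poincare ball (curvature -1).
# Not the authors' implementation; it follows the standard formulas for the
# Riemannian gradient and exponential map on the ball.
import numpy as np

def mobius_add(x, y):
    """Mobius addition of two points in the open unit ball."""
    xy = np.dot(x, y)
    xx = np.dot(x, x)
    yy = np.dot(y, y)
    num = (1 + 2 * xy + yy) * x + (1 - xx) * y
    den = 1 + 2 * xy + xx * yy
    return num / den

def exp_map(x, v):
    """Exponential map at x applied to a tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    lam = 2.0 / (1.0 - np.dot(x, x))  # conformal factor lambda_x
    direction = np.tanh(lam * norm_v / 2.0) * v / norm_v
    return mobius_add(x, direction)

def rsgd_step(x, euclidean_grad, lr=0.01):
    """One Riemannian SGD step: rescale the Euclidean gradient by the
    inverse metric ((1 - ||x||^2)^2 / 4), then follow the exponential map."""
    riem_grad = ((1.0 - np.dot(x, x)) ** 2 / 4.0) * euclidean_grad
    return exp_map(x, -lr * riem_grad)

# Toy usage: minimize f(x) = ||x||^2 / 2, whose Euclidean gradient is x,
# so the iterate is pulled toward the origin of the ball.
x = np.array([0.3, 0.4])
print(rsgd_step(x, x, lr=0.1))
```

The Riemannian Adam variant discussed in the abstract replaces the rescaled gradient above with adaptively preconditioned first and second moments before applying the exponential map.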