New logarithmic step size for stochastic gradient descent
Mahsa Soheil SHAMAEE, Sajad Fathi HAFSHEJANI, Zeinab SAEIDIAN
Front. Comput. Sci., 2025, Vol. 19, Issue (1): 191301
In this paper, we propose a novel warm restart technique using a new logarithmic step size for the stochastic gradient descent (SGD) approach. For smooth non-convex functions, we establish a convergence rate for SGD under this step size. We conduct a comprehensive set of experiments demonstrating the efficiency of the proposed step size on the FashionMNIST, CIFAR10, and CIFAR100 datasets. Moreover, we compare our results with nine other existing approaches and show that the new logarithmic step size improves test accuracy by 0.9% on the CIFAR100 dataset when a convolutional neural network (CNN) model is used.
stochastic gradient descent / logarithmic step size / warm restart technique
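To illustrate the general idea of a warm-restarted, logarithmically decaying step size, the sketch below uses a hypothetical schedule of the form eta_t = eta0 * (1 - ln(t+1)/ln(T+1)), reset at the start of each cycle. The exact formula, eta0, and cycle length here are illustrative assumptions, not the schedule defined in the paper.

```python
import math

def log_step_size(t, T, eta0=0.1):
    """Illustrative logarithmic decay within one cycle of length T.

    Hypothetical form (NOT necessarily the paper's exact schedule):
        eta_t = eta0 * (1 - ln(t + 1) / ln(T + 1))
    Starts at eta0 when t = 0 and decays toward 0 as t approaches T.
    """
    return eta0 * (1.0 - math.log(t + 1) / math.log(T + 1))

def warm_restart_schedule(total_steps, cycle_len, eta0=0.1):
    """Apply the decay cyclically: the step size resets to eta0
    at the start of every cycle (the warm restart)."""
    return [log_step_size(t % cycle_len, cycle_len, eta0)
            for t in range(total_steps)]

# Two cycles of length 10: the step size decays, then restarts at eta0.
sched = warm_restart_schedule(20, 10)
```

In an SGD loop, `sched[t]` would multiply the stochastic gradient at iteration t; the periodic reset to a large step size is what lets the iterates escape poor regions after each cycle.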
Higher Education Press