Automatic parallelism strategy generation with minimal memory redundancy

Yanqi SHI, Peng LIANG, Hao ZHENG, Linbo QIAO, Dongsheng LI

Front. Inform. Technol. Electron. Eng, 2025, Vol. 26, Issue (1): 109-118. DOI: 10.1631/FITEE.2300684

Abstract

Large-scale deep learning models are trained in a distributed manner because of memory and computing resource limitations. Few existing strategy generation approaches take memory minimization as the optimization objective. To fill this gap, we propose a novel algorithm that generates optimal parallelism strategies under the constraint of minimal memory redundancy. We propose a novel redundant memory cost model to calculate the memory overhead of each operator under a given parallel strategy. To generate the optimal parallelism strategy, we formulate the parallelism strategy search problem as an integer linear programming problem and use an efficient solver to find minimal-memory intra-operator parallelism strategies. Furthermore, the proposed algorithm has been extended and implemented in a multi-dimensional parallel training framework and is characterized by high throughput and minimal memory redundancy. Experimental results demonstrate that our approach achieves memory savings of up to 67% compared with the latest Megatron-LM strategies, while the throughput gap between our approach and its counterparts remains small.
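
To make the idea of a 0-1 strategy selection concrete, the following is a minimal illustrative sketch and not the formulation used in the article. It assumes a three-operator chain in which each operator has a few candidate intra-operator strategies, each with a hypothetical redundant-memory cost and hypothetical input/output tensor layouts ("R" for replicated, "S" for sharded). All operator names, strategy names, layouts, and numbers are invented for illustration. Exhaustive enumeration stands in for the integer linear programming solver, which is only practical at this toy size.

```python
from itertools import product

# Hypothetical candidates per operator:
# (strategy name, input layout, output layout, redundant memory in MB).
# None of these values come from the article.
CANDIDATES = {
    "matmul_1": [("replicate", "R", "R", 64), ("split_col", "R", "S", 16)],
    "gelu":     [("replicate", "R", "R", 32), ("split", "S", "S", 8)],
    "matmul_2": [("replicate", "R", "R", 64), ("split_row", "S", "R", 16)],
}
ORDER = ["matmul_1", "gelu", "matmul_2"]


def solve(candidates, order):
    """Pick exactly one strategy per operator, minimizing total redundant memory.

    Each tuple in `product` corresponds to one assignment of the one-hot
    decision variables of a 0-1 program. An assignment is feasible only if
    every operator's output layout matches the next operator's input layout.
    """
    best_cost, best_choice = None, None
    for choice in product(*(range(len(candidates[op])) for op in order)):
        picked = [candidates[op][j] for op, j in zip(order, choice)]
        # Layout-compatibility constraint between consecutive operators.
        if any(a[2] != b[1] for a, b in zip(picked, picked[1:])):
            continue
        cost = sum(p[3] for p in picked)
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, [p[0] for p in picked]
    return best_cost, best_choice


if __name__ == "__main__":
    print(solve(CANDIDATES, ORDER))
```

In this toy instance only two assignments satisfy the layout constraint (fully replicated or fully sharded), and the sharded one has the lower total cost. A real search space is far larger, which is why an integer linear programming solver is used in place of enumeration.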

Keywords

Deep learning / Automatic parallelism / Minimal memory redundancy

Cite this article

Yanqi SHI, Peng LIANG, Hao ZHENG, Linbo QIAO, Dongsheng LI. Automatic parallelism strategy generation with minimal memory redundancy. Front. Inform. Technol. Electron. Eng, 2025, 26(1): 109-118. DOI: 10.1631/FITEE.2300684

RIGHTS & PERMISSIONS

Zhejiang University Press