Adaptive layer splitting for wireless large language model inference in edge computing: a model-based reinforcement learning approach

Yuxuan CHEN, Rongpeng LI, Xiaoxue YU, Zhifeng ZHAO, Honggang ZHANG

Front. Inform. Technol. Electron. Eng, 2025, Vol. 26, Issue 2: 278-292.

DOI: 10.1631/FITEE.2400468

Abstract

Optimizing the deployment of large language models (LLMs) in edge computing environments is critical for enhancing privacy and computational efficiency. As a step toward efficient wireless LLM inference in edge computing, this study analyzes the impact of different splitting points in mainstream open-source LLMs. Building on that analysis, it introduces a framework inspired by model-based reinforcement learning (MBRL) to determine the optimal splitting point between the edge and the user equipment. By incorporating a reward surrogate model, the approach significantly reduces the computational cost of frequent performance evaluations. Extensive simulations demonstrate that the method effectively balances inference performance and computational load under varying network conditions, providing a robust solution for LLM deployment in decentralized settings.

Keywords

Large language models (LLMs) / Edge computing / Model-based reinforcement learning (MBRL) / Split inference / Transformer

Cite this article

Yuxuan CHEN, Rongpeng LI, Xiaoxue YU, Zhifeng ZHAO, Honggang ZHANG. Adaptive layer splitting for wireless large language model inference in edge computing: a model-based reinforcement learning approach. Front. Inform. Technol. Electron. Eng, 2025, 26(2): 278-292. DOI: 10.1631/FITEE.2400468

RIGHTS & PERMISSIONS

Zhejiang University Press

Supplementary files

FITEE-0278-24007-YXC_suppl_1

FITEE-0278-24007-YXC_suppl_2
