Collaborative non-chain DNN inference with multi-device based on layer parallel

Qiuping Zhang, Sheng Sun, Junjie Luo, Min Liu, Zhongcheng Li, Huan Yang, Yuwei Wang

2024, Vol. 10, Issue (6): 1748-1759. DOI: 10.1016/j.dcan.2023.11.004
Research article

Abstract

Intelligent applications based on non-chain DNN models are widely used in Internet of Things (IoT) scenarios. However, resource-constrained IoT devices usually cannot bear the heavy computation burden of non-chain DNN models or meet their strict inference latency requirements. Multi-device collaboration has become a promising paradigm for accelerating inference. Existing works, however, neglect the possibility of inter-layer parallel execution, so they fail to exploit the parallelism of collaborating devices and inevitably prolong the overall completion latency. There is therefore an urgent need to study non-chain DNN inference acceleration with multi-device collaboration based on inter-layer parallelism. This problem poses three major challenges: exponential computational complexity, complicated layer dependencies, and intractable execution location selection. To this end, we propose a Topological Sorting Based Bidirectional Search (TSBS) algorithm that adaptively partitions non-chain DNN models and selects suitable execution locations at layer granularity. Specifically, the TSBS algorithm consists of a topological sorting subalgorithm, which realizes parallel execution with low computational complexity under complicated layer parallel constraints, and a bidirectional search subalgorithm, which quickly finds suitable execution locations for non-parallel layers. Extensive experiments show that the TSBS algorithm significantly outperforms state-of-the-art methods, reducing the completion latency of non-chain DNN inference by up to 22.69%.
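
To make the idea of inter-layer parallelism concrete, the following minimal sketch groups the layers of a non-chain model into levels using a standard Kahn-style topological sort. It is not the TSBS algorithm itself, whose details are not given in the abstract; the function name `parallel_levels`, the layer names, and the Inception-style example graph are assumptions chosen purely for illustration. Layers that land in the same level do not depend on each other, so they are candidates for execution on different devices at the same time.

```python
from collections import defaultdict, deque


def parallel_levels(edges, layers):
    """Group the layers of a DAG into levels with Kahn's topological sort.

    Layers in the same level have no dependency on each other, so they
    are candidates for parallel execution. (Illustrative helper only,
    not the subalgorithm described in the paper.)
    """
    successors = defaultdict(list)
    indegree = {layer: 0 for layer in layers}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start with every layer that has no unmet dependency.
    ready = deque(layer for layer in layers if indegree[layer] == 0)
    levels = []
    visited = 0
    while ready:
        level = list(ready)
        ready.clear()
        levels.append(level)
        for layer in level:
            visited += 1
            for nxt in successors[layer]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)

    if visited != len(layers):
        raise ValueError("layer graph contains a cycle")
    return levels


# Hypothetical Inception-style block: one input feeds three branches
# whose outputs are concatenated again.
layers = ["input", "conv1x1", "conv3x3", "pool", "concat"]
edges = [
    ("input", "conv1x1"),
    ("input", "conv3x3"),
    ("input", "pool"),
    ("conv1x1", "concat"),
    ("conv3x3", "concat"),
    ("pool", "concat"),
]
levels = parallel_levels(edges, layers)
print(levels)
# → [['input'], ['conv1x1', 'conv3x3', 'pool'], ['concat']]
```

In this toy graph the three middle branches form a single level, which is exactly the kind of parallelism a chain-only partitioning scheme would miss. Choosing which device runs each layer is a separate decision that this sketch does not address.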

Keywords

Collaborative DNN inference / Multi-device collaboration / Non-chain DNN model

Cite this article

Qiuping Zhang, Sheng Sun, Junjie Luo, Min Liu, Zhongcheng Li, Huan Yang, Yuwei Wang. Collaborative non-chain DNN inference with multi-device based on layer parallel. 2024, 10(6): 1748-1759. DOI: 10.1016/j.dcan.2023.11.004
