Resource scheduling techniques in cloud from a view of coordination: a holistic survey
Yuzhao WANG, Junqing YU, Zhibin YU
Resource scheduling techniques in cloud from a view of coordination: a holistic survey
Nowadays, the management of resource contention in shared cloud remains a pending problem. The evolution and deployment of new application paradigms (e.g., deep learning training and microservices) and custom hardware (e.g., graphics processing unit (GPU) and tensor processing unit (TPU)) have posed new challenges in resource management system design. Current solutions tend to trade cluster efficiency for guaranteed application performance, e.g., resource over-allocation, leaving a lot of resources underutilized. Overcoming this dilemma is not easy, because different components across the software stack are involved. Nevertheless, massive efforts have been devoted to seeking effective performance isolation and highly efficient resource scheduling. The goal of this paper is to systematically cover related aspects to deliver the techniques from the coordination perspective, and to identify the corresponding trends they indicate. Briefly, four topics are involved. First, isolation mechanisms deployed at different levels (micro-architecture, system, and virtualization levels) are reviewed, including GPU multitasking methods. Second, resource scheduling techniques within an individual machine and at the cluster level are investigated, respectively. Particularly, GPU scheduling for deep learning applications is described in detail. Third, adaptive resource management including the latest microservice-related research is thoroughly explored. Finally, future research directions are discussed in the light of advanced work. We hope that this review paper will help researchers establish a global view of the landscape of resource management techniques in shared cloud, and see technology trends more clearly.
Coordination / Co-location / Heterogeneous computing / Microservice / Resource scheduling techniques
/
〈 | 〉 |