A PTS-PGATS based approach for data-intensive scheduling in data grids
Kenli LI, Zhao TONG, Dan LIU, Teklay TESFAZGHI, Xiangke LIAO
A PTS-PGATS based approach for data-intensive scheduling in data grids
Grid computing is the combination of computer resources in a loosely coupled, heterogeneous, and geographically dispersed environment. Grid data are the data used in grid computing, which consists of large-scale data-intensive applications, producing and consuming huge amounts of data, distributed across a large number of machines. Data grid computing composes sets of independent tasks each of which require massive distributed data sets that may each be replicated on different resources. To reduce the completion time of the application and improve the performance of the grid, appropriate computing resources should be selected to execute the tasks and appropriate storage resources selected to serve the files required by the tasks. So the problem can be broken into two sub-problems: selection of storage resources and assignment of tasks to computing resources. This paper proposes a scheduler, which is broken into three parts that can run in parallel and uses both parallel tabu search and a parallel genetic algorithm. Finally, the proposed algorithm is evaluated by comparing it with other related algorithms, which target minimizing makespan. Simulation results show that the proposed approach can be a good choice for scheduling large data grid applications.
data grid / task scheduling / tabu search / genetic algorithms / parallelism
[1] |
Foster I, Kesselman C, eds. The Grid: Blueprint for a New Computing Infrastructure. San Francisco: Morgan Kaufmann Publishers, 1999
|
[2] |
Kenli L, Tianfang T, Feng W. Parallelization methods for implementation of discharge simulation along resin insulator surfaces. Computers & Electrical Engineering, 2011, 37(1): 30-40
|
[3] |
Chervenak A, Foster I, Kesselman C, Salisbury C, Tuecke S. The data grid: towards an architecture for the distributed management and analysis of large scientific datasets. Journal of Network and Computer Applications, 2000, 23(3): 187-200
|
[4] |
Kim S, Weissman J B. A genetic algorithm based approach for scheduling decomposable data grid applications. In: Proceedings of the 2004 International Conference on Parallel Processing. 2004, 406-413
|
[5] |
Maheswaran M, Ali S, Sieel H J, Hensgen D, Freund R F. Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems. In: Proceedings of 8th Heterogeneous Computing Workshop. 1999, 30-44
|
[6] |
Tang X, Li K. A novel security-driven scheduling algorithm for precedence constrained tasks in heterogeneous distributed systems. IEEE Transactions on Computers, 2011, 60(7): 1017-1029
|
[7] |
Dauzere-Peres S, Paulli J. An integrated approach for modeling and solving the general multiprocessor job-shop scheduling problem using tabu search. Annals of Operations Research, 1997, 70(0): 281-306
|
[8] |
Abdelaziz A Y, Mohamed F M, Mekhamer S F. Distribution system reconfiguration using a modified tabu search algorithm. Electric Power Systems Research, 2010, 80(8): 943-953
|
[9] |
Etminani K, Naghibzadeh M. A min-min max-min selective algorithm for grid task scheduling. In: Proceedings of 3rd IEEE/IFIP International Conference on Internet. 2007, 1-7
|
[10] |
Schwiegelshohn U, Tchernykh A, Yahyapour R. Online scheduling in grids. In Proc. of IEEE International Symposium on Parallel and Distributed Processing. Los Alamitos: IEEE Computer Society, 2008, 1-10
|
[11] |
Noriguki F, Kenichi H. A comparison among grid scheduling algorithms for independent coarse-grained tasks. In: Proceedings of the 2004 International Symposium on Applications and the Internet Workshops. 2004, 674-680
|
[12] |
Casanova H, Legrand A, Zagorodnov D, Berman F. Heuristics for scheduling parameter sweep applications in grid environments. In: Proceedings of the 9th Heterogeneous Computing Workshop. 2000, 349-363
|
[13] |
Elghirani A, Subrata R, Zomaya A Y, Mazari A A. Performance enhancement through hybrid replication and genetic algorithm co-scheduling in data grids. In: Proceedings of IEEE/ACS International Conference on Computer System and Applications. 2008, 436-443
|
[14] |
Elghirani A, Subrata R, Zomaya A Y. Intelligent scheduling and replication in data grids: a synergistic approach. In: Proceedings of 7th IEEE International Symposium on Cluster Computing and the Grid. 2007, 179-182
|
[15] |
Dang N N, Hwang S, Lim S B. Improvement of data grid’s performance by combining job scheduling with dynamic replication strategy. In: Proceedings of 6th International Conference on Grid and Cooperative Computing. 2007, 513-520
|
[16] |
Venugopal S, Buyya R. An scp-based heuristic approach for scheduling distributed data-intensive applications on global grids. Journal of Parallel and Distributed Computing, 2008, 68(4): 471-487
|
[17] |
Wang Zhixin and Ju Gang. A parallel genetic algorithm in multi-objective optimization. In: Proceedings of Control and Decision Conference. 2009, 3497-3501
|
[18] |
Guangyuan L, Jingjun Z, Ruizhen G, Yanmin S. An improved parallel adaptive genetic algorithm based on pareto front for multi-objective problems. In: Proceedings of 2nd International Symposium on Knowledge Acquisition and Modeling. 2009, 212-215
|
[19] |
Yi H, Yuhui Q, Guangyuan L, Kaiyou L. A parallel tabu search approach based on genetic crossover operation. In: Proceedings of 19th International Conference on Advanced Information Networking and Application. 2005, 467-470
|
[20] |
Czajkowski K, Fitzgerald S, Foster I, Kesselman C. Grid information services for distributed resource sharing. In: Proceedings of 10th IEEE International Symposium on High Performance Distributed Computing. 2001, 181-194
|
[21] |
Rajasekar A, Moore R, Wan M. Mysrb & srb: components of a data grid. In: Proceedings of 11th IEEE International Symposium on High Performance Distributed Computing. 2002, 301-310
|
[22] |
Wolski R, Spring N, Hayes J. The network weather services: a distributed resource performance forcasting service for metacomputing. Journal of Future Generation Computer Systems, 1999, 15(5-6): 757-768
|
[23] |
Kakarontzas G, Savvas I K. Agent-based resource discovery and selection for dynamic grids. In: Proceedings of 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises. 2006, 195-200
|
[24] |
Chapman C, Musolesi M, Emmerich W, Mascolo C. Predictive resource scheduling in computational grids. In: Proceedings of IEEE International Parallel and Distributed Processing Symposium. 2007, 1-10
|
[25] |
Ranganathan K, Foster I. Decoupling computation and data scheduling in distributed data-intensive applications. In: Proceedings of 11th IEEE Symposium on High Performance Distributed Computing. 2002, 352-358
|
[26] |
Lee W, Mcgough S, Newhouse S, Darlington J. A standard based approach to job submission through web services. In: Proceedings of the UK e-Science All Hands Meeting. 2004, 901-905
|
[27] |
Srinivas M, Patnaik L M. Genetic algorithms: A survey. Computer, 1994, 27(4): 17-26
|
[28] |
Schengjun X, Shaoyong G, Dongling B. The analysis and research of parallel genetic algorithm. In: Proceedings of 4th International Conference on Wireless Communications, Networking and Mobile Computing. 2008, 1-4
|
[29] |
Zhang J, Lee B S, Tang X, Yeo C K. Impact of parallel download on job scheduling in data grid environment. In: Proceedings of 7th International Conference on Grid and Cooperative Computing. 2008, 102-109
|
[30] |
Buyya R, Murshed M. Gridsim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurrency and Computation: Practice and Experience, 2002, 14(13-15): 1175-1220
|
[31] |
Li J, Pan Q, Liang Y. An effective hybrid tabu search algorithm for multi-objective flexible job-shop scheduling problems. Computers & Industrial Engineering, 2010, 59(4): 647-662
|
/
〈 | 〉 |