Vol. 6 No. 2 (2010)

Published: November 30, 2010

Pages: 186-193

Original Article

New Replica Selection Technique for Binding Replica Sites in Data Grids

Abstract

The objective in Data Grids is to reduce access and file (replica) transfer latencies and to avoid congestion at any single site caused by numerous requesters. To facilitate access to and transfer of the data, the files of the Data Grid are distributed across multiple sites. The effectiveness of a replica selection strategy in Data Grids depends on its ability to serve the requirements posed by the users' jobs. Most jobs must be executed within a specific execution time. To achieve the QoS perceived by the users, a replica selection strategy should take response time metrics into account. Total execution time needs to factor in latencies due to network transfer rates and latencies due to search and location. Network resources affect the speed of moving the required data, and search methods can reduce the scope for replica selection. This paper presents a replica selection strategy that adapts its criteria dynamically so as to best approximate application providers’ and clients’ requirements. We introduce a new selection technique (EST) that shows improved performance over the more common algorithms.
