Ridge: Combining reliability and performance in open grid platforms

Krishnaveni Budati, Jason Sonnek, Abhishek Chandra, Jon Weissman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Scopus citations

Abstract

Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely used work-scheduling technique in such environments is to redundantly schedule the outsourced computations to a number of nodes. We present the design and implementation of RIDGE, a reliability-aware system which uses a node's prior performance and behavior to make more effective scheduling decisions. We have implemented RIDGE on top of the BOINC distributed computing infrastructure and have evaluated its performance on a live testbed consisting of 120 PlanetLab nodes. Our experimental results show that RIDGE is able to match or surpass the throughput of the best vanilla BOINC configuration under different reliability environments, by automatically adapting to the characteristics of the underlying environment. In addition, RIDGE is able to provide much lower work unit makespans than BOINC, which indicates its desirability in service-oriented environments with time constraints.

Original language: English (US)
Title of host publication: Proceedings of the 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07
Pages: 55-64
Number of pages: 10
State: Published - 2007
Event: 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07 and Co-Located Workshops - Monterey, CA, United States
Duration: Jun 25 2007 → Jun 29 2007

Publication series

Name: Proceedings of the 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07

Other

Other: 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07 and Co-Located Workshops
Country/Territory: United States
City: Monterey, CA
Period: 6/25/07 → 6/29/07

Keywords

  • Open grids
  • Reliability
  • Reputation
  • Scheduling
