Reliability-aware deduplication storage: Assuring chunk reliability and chunk loss severity

Youngjin Nam, Guanlin Lu, David H Du

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Reliability in deduplication storage has not yet attracted much research attention. To provide the reliability demanded for an incoming data stream, most deduplication storage systems first carry out deduplication by eliminating duplicates from the data stream and then apply erasure coding to the remaining (unique) chunks. A unique chunk may be shared (i.e., duplicated) at many places within the data stream and shared by other data streams. That is why deduplication can reduce the required storage capacity. However, this sharing occasionally makes it difficult to assure the reliability levels required by different data streams. We introduce two reliability parameters for deduplication storage: chunk reliability and chunk loss severity. Chunk reliability is each chunk's tolerance level in the face of failures. Chunk loss severity represents the expected damage level in the event of a chunk loss, formally defined as the product of the actual damage and the probability of a chunk loss. We propose a reliability-aware deduplication solution that not only assures all demanded chunk reliability levels by making an already existing chunk sharable only if its reliability is high enough, but also mitigates the chunk loss severity by adaptively reducing the probability of a chunk loss. In addition, we provide future research directions following from the current study.
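The two parameters and the sharing rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `StoredChunk`, `store_chunk`, and `loss_severity` are hypothetical, reliability is modeled as an integer tolerance level, and the abstract does not say what happens when an existing chunk's reliability is too low, so the "upgraded" branch (re-protecting the chunk at the higher level) is purely an assumption.

```python
from dataclasses import dataclass


@dataclass
class StoredChunk:
    fingerprint: str
    reliability: int    # assumed model: higher value tolerates more failures
    ref_count: int = 1  # number of stream positions that share this chunk


def loss_severity(damage: float, loss_probability: float) -> float:
    """Expected damage: actual damage multiplied by the probability of loss."""
    return damage * loss_probability


def store_chunk(index: dict, fingerprint: str, demanded: int) -> str:
    """Share an existing chunk only if its reliability meets the demand."""
    existing = index.get(fingerprint)
    if existing is None:
        # First occurrence: store it at the demanded reliability level.
        index[fingerprint] = StoredChunk(fingerprint, demanded)
        return "stored"
    if existing.reliability >= demanded:
        # Reliable enough for this stream, so the chunk is sharable.
        existing.ref_count += 1
        return "shared"
    # Assumption (not stated in the abstract): raise the chunk's protection
    # to the demanded level before sharing it.
    existing.reliability = demanded
    existing.ref_count += 1
    return "upgraded"
```

For example, storing fingerprint `"a"` at level 2, then requesting it at level 1 and at level 3, returns `"stored"`, `"shared"`, and `"upgraded"` in turn. Using something like `ref_count` as the damage term in `loss_severity` would be one possible choice, but the abstract does not specify how actual damage is measured.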

Original language: English (US)
Title of host publication: 2011 International Green Computing Conference and Workshops, IGCC 2011
State: Published - 2011
Event: 2011 International Green Computing Conference, IGCC 2011 - Orlando, FL, United States
Duration: Jul 25 2011 → Jul 28 2011

Publication series

Name: 2011 International Green Computing Conference and Workshops, IGCC 2011

Other

Other: 2011 International Green Computing Conference, IGCC 2011
Country: United States
City: Orlando, FL
Period: 7/25/11 → 7/28/11

Keywords

  • deduplication
  • loss severity
  • reliability
  • storage
