Article
EAD: Elasticity Aware Deduplication Manager for Datacenters with Multi-tier Storage Systems
Cluster Computing, DOI: 10.1007/s10586-018-2141-z (2018)
  • Zhengyu Yang, Northeastern University
  • Yufeng Wang
  • Janki Bhimani
  • Chiu C. Tan
  • Ningfang Mi
Abstract
The popularity of Big Data applications places pressure on storage systems to scale efficiently to meet demand. At the same time, new developments such as solid-state drives have changed the traditional storage hierarchy. Cloud storage systems are transitioning towards a hybrid architecture consisting of large amounts of memory, solid-state disks (SSDs), and traditional magnetic hard disks (HDs). This paper presents Elasticity Aware Deduplication (EAD), a data deduplication framework designed for multi-tier cloud storage architectures consisting of SSDs and HDs. EAD dynamically adjusts the deduplication parameters at runtime to improve performance. Experimental results indicate that EAD detects more than 98% of all duplicate data while consuming less than 5% of the expected memory space. Additionally, EAD saves approximately 74% of the overall I/O access cost compared to the traditional design.
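The abstract does not spell out how EAD adjusts its parameters, so the following is only an illustrative sketch of the general idea of a sampled fingerprint index whose sampling rate can change at runtime. It is not the authors' algorithm. The class name SampledDedupIndex, the sample_bits parameter, and the memory-budget policy in adapt are all assumptions introduced here for illustration.

```python
import hashlib


class SampledDedupIndex:
    """Toy in-memory index holding only a sample of chunk fingerprints.

    Hypothetical illustration; not the EAD implementation.
    """

    def __init__(self, sample_bits: int = 4) -> None:
        # A fingerprint is kept when its low `sample_bits` bits are zero,
        # i.e. roughly 1 in 2**sample_bits fingerprints is indexed.
        self.sample_bits = sample_bits
        self.index: set[int] = set()

    @staticmethod
    def fingerprint(chunk: bytes) -> int:
        # First 8 bytes of SHA-1, interpreted as a 64-bit integer.
        return int.from_bytes(hashlib.sha1(chunk).digest()[:8], "big")

    def is_sampled(self, fp: int) -> bool:
        return fp & ((1 << self.sample_bits) - 1) == 0

    def observe(self, chunk: bytes) -> bool:
        """Return True if the chunk is detected as a duplicate."""
        fp = self.fingerprint(chunk)
        if fp in self.index:
            return True
        if self.is_sampled(fp):
            self.index.add(fp)
        return False

    def set_sample_bits(self, bits: int) -> None:
        """Change the sampling rate and drop entries that no longer qualify."""
        self.sample_bits = bits
        self.index = {fp for fp in self.index if self.is_sampled(fp)}

    def adapt(self, memory_budget: int) -> None:
        """Assumed policy: sample more sparsely until the index fits the budget."""
        while len(self.index) > memory_budget and self.sample_bits < 64:
            self.set_sample_bits(self.sample_bits + 1)
```

In this sketch, a smaller sample_bits value detects more duplicates at the cost of more memory, and a larger value does the opposite. A real system would pair the sampled index with on-disk metadata so that duplicates of unsampled chunks can still be found.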
Keywords
  • Deduplication Estimation
  • Scalability
  • Migration
  • Cloud Storage Systems
  • Fusion Disk
  • Adaptive Dynamical Sampling
  • Cluster Computing
  • Cloud Computing
Publication Date
2018
Citation Information
Zhengyu Yang, Yufeng Wang, Janki Bhimani, Chiu C. Tan, et al. "EAD: Elasticity Aware Deduplication Manager for Datacenters with Multi-tier Storage Systems" Cluster Computing, DOI: 10.1007/s10586-018-2141-z (2018)
Available at: http://works.bepress.com/zhengyuyang/62/