This paper presents a scalable and adaptive decentralized metadata lookup scheme for ultra large-scale file systems (petabytes or even exabytes of data). Our scheme logically organizes metadata servers (MDSs) into a multi-layered query hierarchy and exploits grouped Bloom filters to route each metadata request efficiently through the hierarchy to the MDS that holds the requested metadata. Lookups execute at network or memory speed and are not bounded by the performance of slow disks. We also develop an effective workload-balancing algorithm for server reconfiguration. We evaluate the scheme through extensive trace-driven simulations and a prototype implementation in Linux. Experimental results show that it can significantly improve the scalability of metadata management and the efficiency of queries in ultra large-scale storage systems.
Available at: http://works.bepress.com/yifeng_zhu/5/
Technical Report TR-UNL-CSE-2007-0025
Issued Nov. 20, 2007
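
The Bloom-filter routing idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the report's implementation: the two-level layout (one summary filter per group, one per server), the names `BloomFilter`, `MetadataServer`, `Group`, `store`, and `lookup`, and the filter parameters are all assumptions chosen for illustration. Because a Bloom filter can return false positives, the sketch confirms each candidate against the server's own in-memory table before answering.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; sizes and hash count are illustrative assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: position_i = (h1 + i * h2) mod num_bits.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class MetadataServer:
    """Hypothetical MDS: an in-memory table plus a filter summarizing it."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}            # path -> metadata record
        self.summary = BloomFilter()  # compact summary of stored paths


class Group:
    """Hypothetical upper layer: one filter summarizing every server in it."""

    def __init__(self, servers):
        self.servers = servers
        self.summary = BloomFilter()


def store(group, server, path, record):
    """Record metadata on one server and update both filter layers."""
    server.metadata[path] = record
    server.summary.add(path)
    group.summary.add(path)


def lookup(groups, path):
    """Return the name of the server holding `path`, or None if absent."""
    for group in groups:
        if path not in group.summary:
            continue  # the whole group is ruled out with one memory probe
        for server in group.servers:
            # Verify against the table to discard Bloom false positives.
            if path in server.summary and path in server.metadata:
                return server.name
    return None
```

In this sketch a query touches only in-memory bit arrays until a candidate server is found, which is the property the abstract describes as running at network or memory speed rather than disk speed.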