Zebra: An Efficient, RDMA-Enabled Distributed Persistent Memory File System

Jingyu Wang, Shengan Zheng*, Ziyi Lin, Yuting Chen, Linpeng Huang*
Published in the International Conference on Database Systems for Advanced Applications (DASFAA), 2022

Abstract: Distributed file systems (DFSs) play important roles in datacenters. Recent advances in persistent memory (PM) and remote direct memory access (RDMA) technologies provide opportunities for enhancing distributed file systems. However, state-of-the-art distributed PM file systems (DPMFSs) still suffer from a duplication problem and a fixed transmission problem, leading to high network latency and low transmission throughput. To tackle these two problems, we propose Zebra, an efficient RDMA-enabled distributed PM file system. Zebra uses a replication group design to alleviate the heavy replication overhead, and leverages a novel transmission protocol that adaptively transmits file replications among nodes, eliminating the fixed transmission problem. We implement Zebra and evaluate its performance against state-of-the-art distributed file systems on an Intel Optane DC PM platform. The evaluation results show that Zebra outperforms CephFS, GlusterFS, and NFS by 4.38×, 5.61×, and 2.71× on average in throughput, respectively.
