Exploiting Persistent CPU Cache for Scalable Persistent Hash Index

Bowen Zhang, Shengan Zheng*, Liangxu Nie, Zhenlin Qi, Linpeng Huang*, Hong Mei
Published in IEEE International Conference on Data Engineering (ICDE), 2024

Abstract: Byte-addressable persistent memory (PM) has been widely studied in the past few years. Recently, the emerging eADR technology further incorporates CPU cache into the persistence domain. The persistent CPU cache is promising to optimize the write performance of PM-based storage systems and facilitate the design of concurrent crash-consistent data structures. In this paper, we propose Spash, a highly scalable persistent hash index for PM systems with persistent CPU cache. Spash fully exploits the benefits of persistent CPU cache to implement a durable linearizable index with low PM access overhead and high concurrency. Spash employs a fine-grained extendible hash architecture and a metadata-free segment design to minimize the number of PM accesses. Moreover, Spash adopts adaptive in-place updates and compacted-flush insertions, which dramatically conserve scarce PM write bandwidth by absorbing a large amount of PM write in the persistent CPU cache. Furthermore, Spash proposes a two-phase concurrency protocol and a collaborative staged doubling mechanism, which leverage the persistent CPU cache and hardware transactional memory to achieve lock-free concurrency and durable linearizability. Spash outperforms the other state-of-the-art persistent hash indexes in YCSB workloads by up to 19.6×.

[pdf] [url]