Shiro: Efficient and Accurate In-Storage Data Lifetime Separation for NAND Flash SSDs

Penghao Sun, Shengan Zheng*, Litong You, Wanru Zhang, Ruoyan Ma, Jie Yang, Feng Zhu, Shu Li, Linpeng Huang*
Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025

Abstract: The log-structured nature of NAND flash storage necessitates garbage collection in SSDs. Garbage collection (GC) is a major source of runtime write amplification (WA), leading to faster device wear-out and interference with host I/Os. The key to mitigating this problem is separating data by lifetime so that data in the same flash block are invalidated within temporal proximity. For higher lifetime prediction accuracy and adaptability, prior works proposed using machine learning algorithms for data separation. However, existing learning-based solutions perform data lifetime prediction at the host side, leading to several drawbacks. First, host-side prediction has no knowledge of the internal data movement inside the SSD during GC, and thus fails to leverage the opportunity to further separate GC writes, resulting in suboptimal WA reduction in the long term. Second, performing prediction at the host significantly prolongs the I/O critical path and consumes host resources that could otherwise be used for serving user applications. We present Shiro, a holistic FTL design that performs in-storage data separation for both user writes and GC writes for maximal long-term WA reduction. For user writes, Shiro uses a sequence model to accurately predict data lifetime by learning the lifetime distribution from long historical access patterns. For GC writes, Shiro incorporates a reinforcement learning-assisted page migration strategy that takes direct feedback from long-term WA to further improve data separation efficacy. To address the challenges of making fine-grained, real-time machine learning decisions inside the resource-constrained SSD, we propose a suite of enabling techniques that keep computation and storage overhead low. Extensive evaluation on real-world traces shows that Shiro delivers 29%-68% lower WA than a conventional FTL and state-of-the-art in-storage data separation schemes. Furthermore, thanks to lower data migration overhead during GC, Shiro achieves significantly higher steady-state I/O performance.
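
As a toy illustration of why lifetime separation lowers WA, the sketch below compares two hypothetical block layouts under a simple greedy GC. It is not Shiro's algorithm: the function names, the 4-page blocks, and the greedy victim selection are all assumptions made for this example. WA is taken as total flash page writes divided by host page writes.

```python
# Toy model (assumed for illustration): a block is a list of booleans,
# True = page still valid, False = page already invalidated.

def gc_cost(blocks: list[list[bool]], pages_needed: int) -> int:
    """Greedy GC: reclaim blocks with the fewest valid pages first.

    Returns how many valid pages must be copied elsewhere in order to
    free at least `pages_needed` pages.
    """
    migrated = 0
    reclaimed = 0
    for block in sorted(blocks, key=sum):
        if reclaimed >= pages_needed:
            break
        valid = sum(block)
        migrated += valid                 # valid pages are rewritten by GC
        reclaimed += len(block) - valid   # only invalid pages are net gain
    return migrated


def write_amplification(host_pages: int, gc_pages: int) -> float:
    """WA = (host writes + GC writes) / host writes."""
    return (host_pages + gc_pages) / host_pages


# Eight host-written pages: four short-lived (now invalid), four long-lived.
mixed = [[True, False, True, False], [False, True, False, True]]
separated = [[False, False, False, False], [True, True, True, True]]

wa_mixed = write_amplification(8, gc_cost(mixed, pages_needed=4))
wa_separated = write_amplification(8, gc_cost(separated, pages_needed=4))
```

In this toy case the mixed layout forces GC to copy four valid pages to free four, while the separated layout frees a whole block with no copying, which is the effect that accurate lifetime prediction aims for.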
