MEMRES: A Fast Memory System Reliability Simulator
IEEE Transactions on Reliability (2016)
  • Shaodi Wang, University of California, Los Angeles
  • Henry Hu
  • Hongzhong Zheng
With technology scaling, emerging nonvolatile devices, and data-intensive applications, memory faults have become a major reliability concern for computing systems. With various hardware and software approaches proposed to address this issue, a comprehensive evaluation is required to understand the effectiveness of these solutions. Considering the complex nature of various memory faults as well as the interactions between various correction mechanisms, we propose MEMRES, a fast main memory system reliability simulator. It enables memory fault simulation with error-correcting code (ECC) algorithms and modern memory reliability management, including memory page retirement, mirroring, scrubbing, and hardware sparing. MEMRES is computationally efficient in obtaining memory failure probabilities in the presence of multiple failure mechanisms and complex correction schemes, allowing the optimization of memory system reliability, the prediction of emerging memory reliability, and the design of reliability enhancement techniques. The accuracy of MEMRES is verified against an existing analytical model and an existing memory fault simulator. We performed a case study on spin-transfer torque random access memory (STT-RAM)-based main memory, and the results indicate that in-memory ECC can significantly mitigate the write error rate of STT-RAM, demonstrating that MEMRES can handle emerging memory systems.
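As a rough illustration of the kind of check the abstract describes (a simulated failure probability compared against an analytical model), the following Python sketch estimates how often a single ECC-protected word becomes uncorrectable. It is not MEMRES and does not reproduce its fault models or algorithms. The assumptions are ours: bit faults are independent, the word is a (72,64) SEC-DED codeword that corrects one faulty bit, and the per-bit fault probability `p_bit` is an arbitrary illustrative value. The function names are hypothetical.

```python
import random
from math import comb


def analytic_word_failure(n_bits: int, p_bit: float, t_correct: int = 1) -> float:
    """Probability that more than t_correct of n_bits are faulty.

    Assumes independent bit faults, so the faulty-bit count is binomial.
    """
    p_correctable = sum(
        comb(n_bits, k) * p_bit**k * (1.0 - p_bit) ** (n_bits - k)
        for k in range(t_correct + 1)
    )
    return 1.0 - p_correctable


def simulate_word_failure(
    n_bits: int,
    p_bit: float,
    t_correct: int = 1,
    trials: int = 20_000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of the same probability.

    Each trial draws an independent fault for every bit of one word and
    counts the word as failed if the ECC cannot correct all faulty bits.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        faulty_bits = sum(rng.random() < p_bit for _ in range(n_bits))
        if faulty_bits > t_correct:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    # Illustrative values only: a 72-bit SEC-DED word and a 1% bit fault rate.
    n_bits, p_bit = 72, 0.01
    print("analytic :", analytic_word_failure(n_bits, p_bit))
    print("simulated:", simulate_word_failure(n_bits, p_bit))
```

With enough trials the two numbers should agree to within sampling noise. A full memory-system simulator additionally has to model correlated fault types and mechanisms such as scrubbing, sparing, and page retirement, which this sketch deliberately leaves out.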
  • Memory management
  • Circuit faults
  • Error correction codes
  • Analytical models
  • Reliability engineering
  • Monte Carlo methods
Publication Date
October 11, 2016
Citation Information
Shaodi Wang, Henry Hu and Hongzhong Zheng. "MEMRES: A Fast Memory System Reliability Simulator" IEEE Transactions on Reliability (2016) ISSN: 1558-1721
Available at: