Multi-bit soft error tolerable L1 data cache based on characteristic of data value

Dang-hui Wang, He-peng Liu, Yi-ran Chen

Journal of Central South University, 2015, Vol. 22, Issue 5: 1769-1775. DOI: 10.1007/s11771-015-2695-3


Abstract

Due to continuously decreasing feature size and increasing device density, on-chip caches have become susceptible to single event upsets, which can result in multi-bit soft errors. The increasing rate of multi-bit errors raises the risk of data corruption and even application crashes. Traditionally, L1 D-caches have been protected from soft errors by using simple parity to detect errors and recovering by reading the correct data from the L2 cache, which incurs a performance penalty. This work proposes to exploit redundancy based on the characteristics of data values. For a small data value, the replica is stored in the upper half of the same word. The replica of a large data value is stored in a dedicated cache line, which sacrifices some capacity of the data cache. Experimental results show that the reliability of the L1 D-cache is improved by 65% at a cost of 1% in performance.
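The small-value case described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 32-bit word width, the definition of "small" as a value that fits in the lower 16 bits, the single stored parity bit, and the names `encode_small` and `decode_small` are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Tuple

WORD_BITS = 32                      # assumed word width (not given in the abstract)
HALF_BITS = WORD_BITS // 2
HALF_MASK = (1 << HALF_BITS) - 1


def is_small(value: int) -> bool:
    """Assumed definition of a small value: it fits in the lower half of the word."""
    return 0 <= value <= HALF_MASK


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def encode_small(value: int) -> Tuple[int, int]:
    """Store the value in the lower half and its replica in the upper half.

    Returns the packed word and one parity bit computed over the value
    (the parity bit is a hypothetical helper for choosing the good half).
    """
    if not is_small(value):
        raise ValueError("value does not fit in the lower half of the word")
    word = (value << HALF_BITS) | value
    return word, parity(value)


def decode_small(word: int, half_parity: int) -> int:
    """Read the value back, using the replica if exactly one half fails parity."""
    lower = word & HALF_MASK
    upper = (word >> HALF_BITS) & HALF_MASK
    if lower == upper:
        return lower
    lower_ok = parity(lower) == half_parity
    upper_ok = parity(upper) == half_parity
    if lower_ok != upper_ok:
        # Exactly one half is consistent with the stored parity bit.
        return lower if lower_ok else upper
    # Parity cannot tell the halves apart; fall back to the next level.
    raise RuntimeError("cannot pick a correct half; reload from L2")
```

In this sketch, a single-bit flip in either half is repaired locally from the other half, while an even number of flips within one half defeats the parity check and falls back to a reload, which is the conventional recovery path the abstract mentions.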

Keywords

data cache / reliability / replica / data value / single event upset (SEU)

Cite this article

Dang-hui Wang, He-peng Liu, Yi-ran Chen. Multi-bit soft error tolerable L1 data cache based on characteristic of data value. Journal of Central South University, 2015, 22(5): 1769-1775. DOI: 10.1007/s11771-015-2695-3


