Jit fault detection: Increasing availability in 1oo2 systems just-in-time

Leo Botler, Nermin Kajtazovic, Konrad Diwold, Kay Römer

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

With silicon technology decreasing in size, memories get more susceptible to external influences, which can lead to soft errors. Although temporary, these errors constitute a challenge for safety-critical systems. Redundancy-based error detection is commonly used in industry to increase safety and mitigate these errors. When an error is detected, safety-critical systems are usually switched to a safe state. While this prevents failures, it negatively affects the system's availability. In this work, we propose Just-in-Time fault detection, a novel method which enables a system to be switched to the safe state only in case a detected error would affect the system's behavior. A software tool enabling the deployment of this method on an off-the-shelf processor is implemented, and the method is validated and compared with a state-of-the-art alternative approach using mixed-critical memories. Our results show an availability gain between 25.2% and 100% compared with the state-of-the-art approach while executing two different standard algorithms.
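The abstract's central idea, entering the safe state only when a detected error would actually affect behavior, can be illustrated with a toy model. The sketch below is not the paper's tool or algorithm. It is a minimal illustration, assuming a duplicated memory in which a soft error corrupts one copy and the program overwrites the affected cell before reading it. All names (`RedundantMemory`, `flip_bit`, `scan`, `run`) are hypothetical and chosen only for this example.

```python
class SafeStateError(Exception):
    """Raised when the two redundant copies disagree."""


class RedundantMemory:
    """Toy duplicated memory: every cell is stored in two copies (assumption)."""

    def __init__(self, size):
        self.a = [0] * size
        self.b = [0] * size

    def write(self, addr, value):
        # A write refreshes both copies, which also erases any earlier flip.
        self.a[addr] = value
        self.b[addr] = value

    def flip_bit(self, addr, bit=0):
        # Hypothetical fault injection: corrupt copy A only.
        self.a[addr] ^= 1 << bit

    def scan(self):
        # Eager check: a mismatch anywhere forces the safe state.
        if self.a != self.b:
            raise SafeStateError("mismatch found by full scan")

    def read(self, addr):
        # Deferred check: compare only the cell that is about to be used.
        if self.a[addr] != self.b[addr]:
            raise SafeStateError(f"mismatch at address {addr}")
        return self.a[addr]


def run(check_eagerly):
    mem = RedundantMemory(4)
    mem.write(0, 7)
    mem.flip_bit(0)        # a soft error hits cell 0
    try:
        if check_eagerly:
            mem.scan()     # the eager scan sees the error immediately
        mem.write(0, 9)    # the program overwrites the cell before reading it
        return mem.read(0)
    except SafeStateError:
        return "safe state"
```

In this toy model, `run(True)` ends in the safe state even though the corrupted value is never used, while `run(False)` completes and returns 9. A corrupted cell that is read without being overwritten first still raises `SafeStateError` in both modes, so the deferred check does not hide errors that matter.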

Original language: English
Title of host publication: Proceedings of the 15th International Conference on Availability, Reliability and Security, ARES 2020
Publisher: Association for Computing Machinery
Number of pages: 10
ISBN (Electronic): 9781450388337
DOIs
Publication status: Published - 25 Aug 2020
Event: 15th International Conference on Availability, Reliability and Security: ARES 2020 - Virtual, Ireland
Duration: 25 Aug 2020 to 28 Aug 2020

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 15th International Conference on Availability, Reliability and Security
Abbreviated title: ARES 2020
Country/Territory: Ireland
City: Virtual
Period: 25/08/20 to 28/08/20

Keywords

  • 1oo2
  • Availability
  • Fault tolerance
  • Overwriting
  • RAM test
  • Safety critical systems
  • Soft errors

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
