Challenges in Mitigating Errors in 1oo2D Safety Architecture with COTS Micro-controllers

Amer Kajmakovic*, Konrad Diwold*, Nermin Kajtazovic, Robert Zupanc

*Corresponding author for this work

Research output: Contribution to journal, Article

Abstract

The number of Commercial-Off-The-Shelf (COTS) micro-controllers used in safety applications has increased significantly over the last decade. In contrast to safety-certified micro-controllers, they are produced without integrated protection against memory soft errors and are limited in available memory and computation power. At the same time, constant optimization of the memory's physical size and voltage margins raises the probability that external factors, such as magnetic fields or cosmic rays, temporarily alter a memory state and thus cause a soft error. It is crucial to address such errors within safety-critical systems, and consequently a wide range of error mitigation strategies has been proposed. In the context of established brownfield automation systems, redesigning and redeploying hardware is usually not feasible. Therefore, approaches are needed that can be applied to existing fail-safe architectures to further improve their performance without partial rework or conceptual changes. This article identifies challenges associated with soft error detection and correction strategies in the 1-out-of-2 with diagnostics (1oo2D) safety architecture. Moreover, it investigates mitigation strategies and their deployment challenges across the production phases of new systems (i.e., greenfield), as well as the requirements and limitations that apply when working with existing systems (i.e., brownfield). Among other parameters, the memory usage profile and its effect on the mitigation strategies are explained. A brief overview and evaluation of available hardware-based strategies is presented, along with an evaluation of the most prominent software-based strategies. In addition, potential mitigation strategies that rely on underlying hardware features are discussed. The article demonstrates how to identify and assess the trade-offs associated with different strategies in order to decide on suitable methods for enhancing fault tolerance in existing and future automation systems.
Original language: English
Article number: 6
Pages (from-to): 250-263
Journal: International Journal on Advances in Systems and Measurements
Volume: 13
Issue number: 3-4
Publication status: Published - 30 Dec 2020

Keywords

  • soft errors
  • mixed-criticality
  • fail-safe
  • 1oo2
  • COTS
