Comments
re: Functional safety implementations in modern MCUs
Eric Verhulst_Altreonic   7/27/2013 9:22:02 AM
Of course, SIL4 and fault tolerance are not the same concept, but SIL4 requires a fault-tolerant architecture in almost all cases (as you say as well). The issue is that the standards often make it possible to minimize the safety risk, because they define SIL as a function of probability of occurrence, severity, and controllability. That is very subjective. It also relies on a Hazard and Risk Analysis, which most likely does not cover all the risks (there is not always a historical track record, especially in automotive). Not to speak of the unpredictable nature of the environment and the operator.

The other issue is that SIL thinking is rooted in the times when things were mostly linear (analog, continuous domain), where probabilities and graceful degradation (still) apply. Digital electronics and software, however, are in the digital (discrete, non-linear) domain. One small bit flip and 20 nanoseconds later the system can have failed. The state space is so large that fault tree analysis techniques can never go to this level of detail. The point is also that software is like a virtual machine sitting on top of a discrete state machine sitting on top of a semiconductor device (which is again in the continuous domain). The hidden assumption for software is not so much that it is error-free (more or less true when using formal methods), but that the hardware is always fault-free. Hence we have a hierarchy of levels. At the chip level, reliability margins apply; at the discrete level, micro-redundancy; at the software level, block-level redundancy; and at the system level, macro-level redundancy. There is an additional level that takes into account residual common mode failures, and that requires diversity as well. We have developed a criterion, called ARRL (Assured Reliability and Resilience Level), that takes this analysis into account. Draft white paper on request (I need an email to send it to).

The benefit of this approach is that it becomes possible to characterise components (or subsystem entities) in terms of how they deal with failures, and one can reuse them from one domain to another, also in the context of safety-critical systems (in essence the components carry a contract with them). One can also define rules on how to reach higher ARRL (and hence SIL) levels by composition. Note also that SIL and ARRL are complementary. They meet in the middle (just like a HARA and an FMEA do).

The point I wanted to make is that there is no reason why MCUs can't be made "fault tolerant" by default. Gates are almost free these days. And while lockstepping CPUs can help, they are not a miracle solution for safety. They basically only allow one to detect that there is a fault, not to correct it. Safety comes from masking out such internal faults so that the system continues to deliver its service. Using 2 such chips (2oo4) is a higher-level remedy (but watch out for common mode failures, e.g. power issues). The other, in my view more interesting, approach is already in use in space (and as far as I know in high-end server chips like IBM's Power7): make the logic cells fault tolerant (triplicate the gates). A very nice and recent example is Microsemi's SmartFusion2. And it is not expensive. Certainly less expensive than developing a fault detection and correction architecture around traditional chips.
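
To make the detect-versus-mask distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `lockstep_compare` and `tmr_vote` are hypothetical, the "cores" are plain integers, and real silicon votes per gate or per register in hardware, not in software.

```python
from typing import Tuple


def lockstep_compare(a: int, b: int) -> Tuple[int, bool]:
    """Dual lockstep (hypothetical model): return (output, fault_detected).

    A mismatch shows that something is wrong, but not which copy is wrong,
    so the output cannot be trusted once a fault is flagged.
    """
    return a, a != b


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote.

    Each output bit takes the value held by at least two of the three
    inputs, so a fault confined to one copy is masked.
    """
    return (a & b) | (a & c) | (b & c)


good = 0b10110010
flipped = good ^ 0b00001000  # one copy suffers a single bit flip

_, detected = lockstep_compare(good, flipped)  # fault is only detected
voted = tmr_vote(good, flipped, good)          # fault is masked
```

With two copies the best outcome is a flag and a safe stop; with three copies the voted value still equals the fault-free one, which is the "continues to deliver its service" case.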

re: Functional safety implementations in modern MCUs
Ironhorse0   7/26/2013 1:23:22 PM
"The default approach should be fault tolerant (SIL4)"

Fault tolerant and SIL4 are not equivalent terms. Fault tolerant refers to the ability of a system or function to operate correctly even though one or more of its component parts are malfunctioning. SIL4 refers to the required or achieved probability or rate of failure of a safety system or function. Fault tolerant systems vary by how many simultaneous faults they can detect and by how many of those faults they can correct. It is only implicit that higher SIL levels generally require greater degrees of fault tolerance.

re: Functional safety implementations in modern MCUs
Ironhorse0   7/26/2013 1:21:00 PM
 "In terms of functional safety this is not fault tolerant."

I am confused by this statement. Functional Safety refers to the part of the overall safety of a system or function that depends on a system or function operating correctly in response to its inputs. Thus, Functional Safety depends on the hazard and on what the correct function is. These microcontrollers >are< fault tolerant to the degree to which they are capable. ECC-SECDED means that the microcontroller can tolerate up to 2 simultaneous bit flip faults in any word at any time: 1 bit flip has no effect, and 2 bit flips result in a trigger that can be used to safe the microcontroller. That is fault tolerant, but whether that is fault tolerant enough depends on the particular Functional Safety requirements that are placed upon the microcontroller. Dual lock-step cores are fault tolerant in that one fault is detectable. That is enough for some Functional Safety cases, but not for others. Triplication can provide 1-fault correction, but end-to-end triplication is exceptionally complex, and in distributed functions it exposes the system to Byzantine faults. The trend for guaranteeing 1-fault correction is not triplication but Quadruple Modular Redundancy (2oo4); that is, the pairing of lock-step microcontrollers, or the implementation of 2 pairs of lock-step cores in a microcontroller (see FSL's QUASAR project).
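
The SECDED behaviour described above (1 bit flip corrected, 2 bit flips flagged) can be sketched in Python with an extended Hamming code over a 4-bit word. This is a toy under stated assumptions: real MCU memories protect much wider words with more check bits, and the names `encode` and `decode` and the status strings are hypothetical.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 follow the classic
    Hamming(7,4) layout with check bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of ones even
    return code


def decode(code: list):
    """Return (nibble, status): "ok", "corrected", or (None, "double_error")."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid word
    odd_parity = sum(code) % 2
    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        # Odd number of flips: treat as a single error at index `syndrome`
        # (index 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two flips, not correctable.
        return None, "double_error"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1010)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1
```

Decoding `one_flip` recovers the original data silently, while decoding `two_flips` returns no data and a flag, which corresponds to the trigger that can be used to put the microcontroller into a safe state.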

re: Functional safety implementations in modern MCUs
Aaron M   3/30/2012 8:02:07 PM
Freescale is committed to helping system manufacturers more easily achieve system compliance with functional safety standards (ISO 26262 and IEC 61508). Through our new SafeAssure functional safety program, engineers can easily identify Freescale hardware and software solutions that are optimally designed to support functional safety implementations. There’s more info about these as well as our safety processes and support at Freescale.com/safeassure -Aaron McDonald, Freescale

re: Functional safety implementations in modern MCUs
Eric Verhulst_Altreonic   3/28/2012 7:49:33 AM
Although this is a very relevant article, in terms of functional safety this is not fault tolerant. It allows the system to fail "safely" (like when driving at 200 km/h). The extra cost for triplication and voting is minimal (given today's silicon dimensions) and could seriously reduce the development cost of fault tolerance support. The default approach should be fault tolerant (SIL4), so that when there is a failure, the system drops to SIL3. It is then still fully functional, and only a second failure leads to a fail-safe stop. eric.verhulst (at=@) altreonic.com
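
The degradation path described in this comment (a first failure is masked, a second failure forces a stop) can be sketched as a small state machine in Python. Everything here is an assumption for illustration: the class `DegradingVoter`, its mode names, and the three-channel model are hypothetical, and the sketch makes no claim about which SIL each mode would actually achieve.

```python
from collections import Counter


class DegradingVoter:
    """Three redundant channels; a disagreeing channel is excluded."""

    def __init__(self):
        self.active = [0, 1, 2]
        self.mode = "fault_tolerant"  # three healthy channels, faults masked

    def step(self, outputs):
        """Vote on one set of channel outputs; None means fail-safe stop."""
        if self.mode == "safe_stop":
            return None
        values = [outputs[i] for i in self.active]
        if self.mode == "fault_tolerant":
            majority, count = Counter(values).most_common(1)[0]
            if count == 3:
                return majority
            if count == 2:
                # First failure: mask it, drop the odd channel, keep running.
                self.active = [i for i in self.active if outputs[i] == majority]
                self.mode = "degraded"
                return majority
            self.mode = "safe_stop"  # all three differ, no majority exists
            return None
        # Degraded: two channels left, so a mismatch can only be detected.
        if values[0] == values[1]:
            return values[0]
        self.mode = "safe_stop"
        return None


voter = DegradingVoter()
first = voter.step([5, 5, 5])    # all agree
second = voter.step([7, 9, 7])   # channel 1 fails and is masked
third = voter.step([3, 99, 3])   # channel 1 is now ignored
fourth = voter.step([4, 0, 8])   # remaining channels disagree
```

After the first disagreement the voter keeps delivering the majority value from the two remaining channels; only the later mismatch between those two ends in `None`, the fail-safe stop.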

re: Functional safety implementations in modern MCUs
mklassen   3/22/2012 3:13:04 PM
This is an interesting article and very relevant...if you are interested in ISO 26262 you might want to read this recent blog: http://tinyurl.com/7kjhqxh @mfklassen


