Eric Verhulst_Altreonic
re: Functional safety implementations in modern MCUs
Eric Verhulst_Altreonic   7/27/2013 9:22:02 AM
Of course, SIL4 and fault tolerance are not the same concept, but in almost all cases SIL4 requires a fault-tolerant architecture (as you say as well). The issue is that the standards often allow the safety risk to be minimized, because they define SIL as a function of probability of occurrence, severity and controllability. Very subjective. And because they rely on a Hazard and Risk Analysis, which most likely does not cover all the risks (there is not always a historical track record, especially in automotive). Not to mention the unpredictable nature of the environment and the operator.

The other issue is that SIL thinking is rooted in the times when things were mostly linear (analog, continuous domain), where probabilities and graceful degradation (still) apply. Digital electronics and software, however, are in the digital (discrete, non-linear) domain. One small bit flip and 20 nanoseconds later, the system can have failed. The state space is so large that fault tree analysis techniques can never go to this level of detail. The point is also that software is like a virtual machine sitting on top of a discrete state machine sitting on top of a semiconductor device (which is again in the continuous domain). The hidden assumption for software is not so much that it is error-free (more or less true when using formal methods), but that the hardware is always fault-free. Hence we have a hierarchy of levels: at the chip level, reliability margins apply; at the discrete level, micro-redundancy; at the software level, block-level redundancy; and at the system level, macro-level redundancy. There is an additional level that takes into account residual common mode failures, and that requires diversity as well. We have developed a criterion, called ARRL (Assured Reliability and Resilience Level), that takes this analysis into account. Draft white paper on request (I need an email to send it to).

The benefit of this approach is that it becomes possible to characterise components (or subsystem entities) in terms of how they deal with failures, and one can reuse them from one domain to another, also in the context of safety-critical systems (in essence the components carry a contract with them). One can also define rules on how to reach higher ARRL (and hence SIL) levels by composition. Note also that SIL and ARRL are complementary. They meet in the middle (just like a HARA and an FMEA do).

The point I wanted to make is that there is no reason why MCUs can't be made "fault tolerant" by default. Gates are almost free these days. And while lockstepping CPUs can help, they are not a miracle solution for safety. They basically only allow you to detect that there is a fault, not to correct it. Safety comes from masking out such internal faults so that the system continues to deliver its service. Using two such chips (2oo4) is a higher-level remedy (but watch out for common mode failures, e.g. power issues). The other, in my view more interesting, approach is already in use in space (and as far as I know in high-end server chips like IBM's Power7): make the logic cells fault tolerant (triplicate the gates). A very nice and recent example is Microsemi's SmartFusion2. And it is not expensive. Certainly less expensive than developing a fault detection and correction architecture around traditional chips.
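
To make the detect-versus-mask distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (`lockstep_compare`, `tmr_vote`, `flip_bit`) are hypothetical, the "cores" are reduced to plain integer outputs, and a single injected bit flip stands in for an internal fault. It does not model any particular MCU or FPGA.

```python
from typing import Optional, Tuple


def lockstep_compare(a: int, b: int) -> Tuple[Optional[int], bool]:
    """Dual lockstep: return (output, fault_detected).

    On a mismatch neither result can be trusted, so no output is delivered.
    """
    if a == b:
        return a, False
    return None, True


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote, as a triplicated gate would perform."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Inject a single bit flip (hypothetical fault model)."""
    return word ^ (1 << bit)


good = 0b10110010
faulty = flip_bit(good, 5)

# Lockstep pair: the fault is detected, but the service is interrupted.
print(lockstep_compare(good, faulty))      # (None, True)

# Triplicated logic: the fault is masked and the correct value is delivered.
print(bin(tmr_vote(good, faulty, good)))   # 0b10110010
```

The lockstep pair can only signal that something went wrong, while the voter keeps delivering the correct value as long as at most one copy is wrong in any given bit position.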

Ironhorse0
re: Functional safety implementations in modern MCUs
Ironhorse0   7/26/2013 1:23:22 PM
"The default approach should be fault tolerant (SIL4)"

Fault tolerant and SIL4 are not equivalent terms. Fault tolerant refers to the ability of a system or function to operate correctly even though one or more of its component parts are malfunctioning. SIL4 refers to the required or achieved probability or rate of failure of a safety system or function. Fault tolerant systems vary by how many simultaneous faults they can detect and by how many of those faults they can correct. It is only implicit that higher SIL levels generally require greater degrees of fault tolerance.

Ironhorse0
re: Functional safety implementations in modern MCUs
Ironhorse0   7/26/2013 1:21:00 PM
 "In terms of functional safety this is not fault tolerant."

I am confused by this statement. Functional Safety refers to the part of the overall safety of a system or function that depends on that system or function operating correctly in response to its inputs. Thus, Functional Safety depends on the hazard and on what the correct function is. These microcontrollers >are< fault tolerant to the degree to which they are capable. ECC-SECDED means that the microcontroller can tolerate up to 2 simultaneous bit flip faults in any word at any time: 1 bit flip results in no effect, 2 bit flips result in a trigger that can be used to safe the microcontroller. That is fault tolerant, but whether that is fault tolerant enough depends on the particular Functional Safety requirements that are placed upon the microcontroller. Dual lock-step cores are fault tolerant: one fault is detectable. That is enough for some Functional Safety cases, but not for others. Triplication can provide 1 fault correction, but end-to-end triplication is exceptionally complex, and in distributed functions it exposes the system to Byzantine faults. The trend to accomplish guarantees of 1 fault correction is not triplication, but Quadruple Modular Redundancy (2oo4); that is, the pairing of lock-step microcontrollers, or the implementation of 2 pairs of lock-step cores in a microcontroller (see FSL's QUASAR project).
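
The SECDED behaviour described above (one flip corrected, two flips flagged) can be sketched with a toy extended Hamming(8,4) code. This is a minimal illustration, not how any specific microcontroller implements its ECC: real memories protect wider words (for example 32 or 64 data bits), and the names `encode` and `decode` and the status strings are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 follow the classic
    Hamming layout with parity bits at positions 1, 2 and 4.
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of ones even
    return code


def decode(code: List[int]) -> Tuple[Optional[int], str]:
    """Return (data, status); status is 'ok', 'corrected' or 'double-error'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_odd = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd number of flips, assumed to be one: the syndrome is its position
        # (a syndrome of 0 means the overall parity bit itself flipped).
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two flips, detect only.
        return None, "double-error"
    data = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return data, status


word = encode(0b1010)

one_flip = list(word)
one_flip[6] ^= 1
print(decode(one_flip))   # (10, 'corrected')

two_flips = list(one_flip)
two_flips[2] ^= 1
print(decode(two_flips))  # (None, 'double-error')
```

The "double-error" result corresponds to the trigger mentioned above that can be used to put the microcontroller into a safe state. Three or more simultaneous flips are outside what such a code guarantees and may be miscorrected.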

Aaron M
re: Functional safety implementations in modern MCUs
Aaron M   3/30/2012 8:02:07 PM
Freescale is committed to helping system manufacturers more easily achieve system compliance with functional safety standards (ISO 26262 and IEC 61508). Through our new SafeAssure functional safety program, engineers can easily identify Freescale hardware and software solutions that are optimally designed to support functional safety implementations. There’s more info about these as well as our safety processes and support at Freescale.com/safeassure -Aaron McDonald, Freescale

Eric Verhulst_Altreonic
re: Functional safety implementations in modern MCUs
Eric Verhulst_Altreonic   3/28/2012 7:49:33 AM
Although this is a very relevant article, in terms of functional safety this is not fault tolerant. It allows the system to fail "safely" (like when driving 200 km/h). The extra cost for triplication and voting is minimal (given today's silicon dimensions) and could seriously reduce the development cost of fault tolerance support. The default approach should be fault tolerant (SIL4), so that when there is a failure the system drops to SIL3: still fully functional, and only a second failure leads to a fail-safe stop. eric.verhulst (at=@) altreonic.com
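
The degradation path described above (full service after a first failure, fail-safe stop only after a second) can be sketched as a small voter state machine. This is a hedged illustration only: the class name `DegradingVoter`, the channel labels and the state names are hypothetical, and the states are not claimed to correspond to certified SIL levels.

```python
from collections import Counter
from typing import Dict, Optional


class DegradingVoter:
    """Triplex voter that degrades to a duplex comparator, then stops."""

    def __init__(self, channels=("A", "B", "C")):
        self.active = list(channels)
        self.state = "fault-tolerant"  # all three channels in service

    def vote(self, outputs: Dict[str, int]) -> Optional[int]:
        if self.state == "fail-safe-stop":
            return None
        values = [outputs[ch] for ch in self.active]
        if len(self.active) == 3:
            winner, count = Counter(values).most_common(1)[0]
            if count == 3:
                return winner
            if count == 2:
                # First failure: mask it, isolate the dissenting channel.
                self.active = [ch for ch in self.active if outputs[ch] == winner]
                self.state = "degraded"
                return winner
            # No majority at all: nothing can be trusted.
            self.state = "fail-safe-stop"
            return None
        # Duplex mode: a disagreement can be detected but not resolved.
        if values[0] == values[1]:
            return values[0]
        self.state = "fail-safe-stop"
        return None


voter = DegradingVoter()
print(voter.vote({"A": 7, "B": 7, "C": 7}), voter.state)  # 7 fault-tolerant
print(voter.vote({"A": 7, "B": 9, "C": 7}), voter.state)  # 7 degraded
print(voter.vote({"A": 8, "B": 0, "C": 8}), voter.state)  # 8 degraded
print(voter.vote({"A": 1, "B": 1, "C": 2}), voter.state)  # None fail-safe-stop
```

After the first fault the system still delivers its service from the two remaining channels, and only a second disagreement forces the stop.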

mklassen
re: Functional safety implementations in modern MCUs
mklassen   3/22/2012 3:13:04 PM
This is an interesting article and very relevant...if you are interested in ISO 26262 you might want to read this recent blog: http://tinyurl.com/7kjhqxh @mfklassen


