"I do not understand why in such a critical control system, a redundant MPU control module is not used."
It seems that in Toyota were thinking along somewhat similar lines more than two decades ago! Clearly at that time someone was anticipating that there might be problems with electronic throttles.
See US Patent 4,995,364
Abstract: "A throttle control apparatus for engines comprises two throttle actuators for driving two corresponding main and sub throttle valves mounted in series in an intake pipe of an engine. An observer, to which the modern control theory is applied, presumes an opening degree of the main throttle valve in a normal condition, which occurs a predetermined time later, from an accelerator depression amount, which represents a throttle opening command, and an opening degree ( angular position) of the main throttle valve. A failure detector quickly finds, from a deviation from the presumed opening degree of the main throttle valve, that the main throttle valve has failed. When a failure occurs, the control of the sub throttle valve is started, making it possible to affect the throttle opening control with improved reliability."
1st rule of failsafe is an independent off switch. If the cars were only wired with a on-off switch that actually turned the car off, instead of an input to the processor, that would have also been an option in case of this failure. I'll never buy a car that doesn't have a switch that is electrically in series with the rest of the system.
Forget the FDA. I worked a contract for a company that was using computer software to control blood analysis and they hired me to work on the FDA application for an IND (investigational new drug, that's the only application for evaluation the agency has). As I recall there were about 50K lines of code written in a language that I did not at the time know well enough to evaluate nor were there ANY "tools" available to assist in the evaluation. Incredibly they actually asked ME to provide "signature authorization" that the code had been thoroughly tested and evaluated (there had only been some functional tests performed, nothing at all in the way of either unit testing or any level of structural coverage). I of course refused and was summarily let go, after which I went back to working on certification for avionics projects under FAA regulations because I already had worked in that field and knew that such incompetence and laxity just isn't tolerated in that world, and I never looked back. After reading this I don't get the impression that NHTSA is much different than FDA. As far as the latter goes, just don't get sick!
I worked in the petrochemical industry back in the 1980's and we used redundant CPU modules with an independent hardware switch which would switch from a defective CPU module when a failure was detected, then sound an alarm.
The matrix of these control systems were interconnected where any single/multiple failure would be handed over to another control system so quickly the chemical processing equipment never even knew it was changed, but the alarms did.
I do not understand why in such a critical control system, a redundant MPU control module is not used.
It would not add that much weight to the vehicle to use a redundant control module system.
Let all the engineers in the world, join together and demand a common standard in the automotive industry to equal or exceed the medical and industrial chemical standards.
After all it may be your family whose lives are in danger!
There are safety criticla OS'es that will detect an error like a bit flip in the tasking -- through redundant code -- but they are not ported to every processor -- The adding of ECC to MCU's is only something that has come about after about 2009/2010 due to advances in part desnsity for a given price target -- prior to this it had to be handled via the OS(has existed as a technology since the 1990's)
all ABS brake equiped cars owners manuals specifically state not to pump the brakes -- therein lies the problem -- The automation would not handlle a fault conditon correctly -- Interestingly TI, and Now Renesas both make MCU's that will automatically halt or reboot if there is a Bit Flip. Bit Flips can happen due to Electrical Noise, Cosmic Radiation(what creates radio active Carbon14 in the air) and then there are hard failues like Flash Memory Charge bake-out, Electromigration, and thermal and mechanical cycling which in an automotive application can limit the life of parts to less than 10,000 hours.(One of my vehicles has over 4,000 hours on it already)
"We can't just wait around for that particular bit to flip, which may take a long time."
The quoted testimony does not reveal much detail about the nature of the forced failure testing. Was this bit flip part of EDAC-protected memory, and if so, were multiple bits flipped? Specifically, was the test designed to overpower the error detection & correction capability of an error correcting code applied to particular variables in memory, or was this an example of memory locations that were unprotected?
If you are testing an error correcting code in an application, you know in advance the power of the code, as allocated to detection vs correction of bit errors, so you know in advance that the code can detect up to X errors in Y bits, and correct up to Z errors in the same Y bits, so part of your testing would be to confirm the behavior when you overwhelm the code with too many errors.
I'm not suggesting anything, just asking questions about how the testing was conducted, and how it relates to proof of what really happened on the road.
As development or verification engineers know, during the course of product development, design & verification teams can be somewhat adversarial in the sense that verification's job is to find ways to break the design, and there are always ways to break a design -- any design. The question then becomes whether the conditions that break the design are in scope or out of scope with respect to the requirements that must be met. If the only breakage is out of scope and the design meets all of its requirements, it likely moves forward to production.
It is also noteworthy that it appears that the brakes still functioned even under Task X death, if they were pumped. Whether pumping the brakes in an emergency could be deemed expected driver behavior might depend on whether one assumes the driver is a younger person who has no experience with non-ABS brakes, or an older driver who learned the pump the brakes technique in a pre-ABS era.
NASA's Orion Flight Software Production Systems Manager Darrel G. Raines joins Planet Analog Editor Steve Taranovich and Embedded.com Editor Max Maxfield to talk about embedded flight software used in Orion Spacecraft, part of NASA's Mars mission. Live radio show and live chat. Get your questions ready.
Brought to you by