MS243   11/27/2013 7:10:10 AM
MTBF is usually projected by just putting parts in an oven and letting them bake. There is no thermal cycling, vibration, ESD due to service personnel, or other factors such as HIRF susceptibility, etc. factored in. All it takes for a failure is a customer not contracting for a service guide and then attempting to service a complex system with an FPGA, which may be much more ESD sensitive than past products, and everyone is left with a sour taste in the mouth from failed devices, for example.

Adam-Taylor   11/26/2013 6:14:48 PM
The MTBF is also an interesting point, as what people are really interested in is the probability of success. At the point where the elapsed operating time equals the MTBF, a unit has only about a 37% chance of still working (assuming a constant failure rate). This means that if you want something to work for 10 years with a high probability of success, you will need a much larger MTBF, a redundant architecture, or both.
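The 37% figure falls out of the exponential reliability model, R(t) = exp(-t / MTBF), which only holds for a constant failure rate. A minimal sketch of that arithmetic follows; the function names and the 95% target are illustrative assumptions, not figures from the blog.

```python
import math

HOURS_PER_YEAR = 8760  # assumes continuous operation

def reliability(t_hours, mtbf_hours):
    """Probability of surviving t_hours, assuming a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)

def required_mtbf(t_hours, target_reliability):
    """MTBF needed to reach target_reliability at t_hours (same model)."""
    return -t_hours / math.log(target_reliability)

mission = 10 * HOURS_PER_YEAR

# Operating time equal to the MTBF gives exp(-1), roughly 0.37
print(round(reliability(mission, mission), 3))

# MTBF (in years) needed for a hypothetical 95% chance of surviving 10 years
print(round(required_mtbf(mission, 0.95) / HOURS_PER_YEAR, 1))
```

Under this model a 95% chance of lasting 10 years needs an MTBF of roughly 195 years, which is why redundancy quickly becomes attractive.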

While FPGAs have good FIT rates, the problem at times comes in creating the power architecture, as DC/DC converters and other POLs, especially hybrids, can have much worse FIT rates which swamp the FPGA contribution.
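To see how the power stage can dominate, recall that FIT is failures per 10^9 device-hours and that, for a series system where any part failing kills the board, the FIT rates simply add. The sketch below uses made-up FIT values purely for illustration; they are not vendor data.

```python
def mtbf_hours(fit_rates):
    """MTBF of a series system from per-part FIT rates (failures per 1e9 hours)."""
    total_fit = sum(fit_rates.values())
    return 1e9 / total_fit

# Hypothetical numbers, chosen only to show the effect
board = {
    "fpga": 20,
    "dcdc_hybrid": 300,
    "pol_regulator": 150,
}

total = sum(board.values())
for part, fit in board.items():
    print(f"{part}: {fit} FIT ({100 * fit / total:.0f}% of total)")
print(f"system MTBF: {mtbf_hours(board):.3g} hours")
```

With these assumed numbers the FPGA contributes only a few percent of the total failure rate, so improving it further barely moves the system MTBF.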

Lots to consider
Adam-Taylor   11/26/2013 6:05:01 PM

Great blog. There is a lot to consider when looking at FPGA reliability, not just the actual FIT rate and MTBF of the device. Remember that the FIT rate only applies in the constant failure rate period of the bathtub curve.

You also need to consider the mounting method: BGA, column or land grid array, quad flat pack. Then there is the assignment of pins, and which are best to use if you have a choice.

SEUs are a concern which can lead to lock-up, but there are also the impacts of total ionising dose, which can affect both the timing and the power dissipation.

With SEUs you need to be very careful with synthesis optimisations to ensure they do not introduce potential problems under SEU. Many companies and institutes are a little concerned about things like automatic state machine illegal state detection and instead prefer hand-coded solutions. It is possible to determine the MTBF between SEU events in user logic and configuration logic for Xilinx devices (I wrote an article on it), but manufacturers are very careful not to scare users with SEUs, as there is a lot of bad advice out there.

In real high-end applications you are also going to be trying to ensure the junction temperature is derated correctly at your maximum qualification temperature to ensure reliability (think of Arrhenius).
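The Arrhenius relationship behind that derating can be sketched as an acceleration factor between two junction temperatures. The 0.7 eV activation energy below is an assumed, commonly quoted placeholder; the right value depends on the failure mechanism and should come from the device's reliability data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

def arrhenius_af(t_cool_c, t_hot_c, ea_ev=0.7):
    """How much faster a mechanism progresses at t_hot_c than at t_cool_c.

    Temperatures are junction temperatures in degrees Celsius.
    ea_ev is the activation energy in eV (0.7 is an assumption).
    """
    t_cool_k = t_cool_c + 273.15
    t_hot_k = t_hot_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1 / t_cool_k - 1 / t_hot_k))

# Example: derating the junction from 125 C down to 85 C
print(f"{arrhenius_af(85, 125):.1f}x slower wear-out at 85 C than at 125 C")
```

With these assumptions, a 40 °C reduction in junction temperature slows the modelled mechanism by close to an order of magnitude.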

Of course, within the FPGA we can do TMR and error detection and correction, which can impact the speed of the device. You also need to consider the effects on single points of failure such as clocks, resets and inputs, which is why global TMR can be so useful.
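As a rough software model of what a TMR voter does (in a real design this is HDL inside the FPGA, not Python), three copies of a value are compared bit by bit and the majority wins. The function name is hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (b & c) | (a & c)

good = 0b1010
upset = good ^ 0b0100  # one copy suffers a single bit flip

# The corrupted copy is outvoted by the two good ones
print(bin(majority_vote(good, good, upset)))
```

Note that the voter, the clock and the reset feeding all three copies are themselves single points of failure unless they are triplicated as well, which is the case for global TMR made above.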

Also, if you are designing your FPGA to be reliable then the rest of the system needs to be too, and you need to consider a lot more, so the cost goes up quickly.

Re: SEUs in SRAM-based FPGAs
paul.dillien   11/26/2013 3:11:07 PM

Yes, SEUs can do very nasty things, which is why it is important to understand what the probability of a bit flip is. Designers of high reliability equipment will use features such as triple modular redundancy (TMR) and Error Detection and Correction (EDAC). In addition, there are techniques for "scrubbing" the configuration.

I know that Xilinx has been very active for many years on mitigating SEUs.  This includes design techniques that have resulted in the measurements on 28nm devices of SEUs/Mbit being the best ever going back as far as 250nm.  Obviously, there is much more configuration memory and user memory (Block RAM) in the latest devices, but the numbers are real (not calculated), and users can build in Soft Error Mitigation (SEM) IP cores to attack the issue from the design side too.

SEUs in SRAM-based FPGAs
DrFPGA   11/26/2013 2:28:58 PM
Single Event Upsets (SEUs) in SRAM-based FPGAs can have some unfortunate consequences. Not only can you change the logic function of a look-up table, but since many of the configuration SRAM bits control the interconnect, a 'flipped' bit could add or remove connections in an otherwise working device. Either of these errors could cause a failure that ripples through the system and might end up creating a cascade effect where additional failures impact other devices (external memories, MCUs or data transmissions, PoL converters, etc). The mind reels...
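To make the look-up table point concrete, here is a simplified model of a 4-input LUT as a 16-bit truth table held in configuration memory. The encoding (input 0 as the least significant address bit) is an assumption for illustration and does not describe any particular device.

```python
def lut_eval(init, inputs):
    """Evaluate a LUT: 'init' is the truth table, 'inputs' a tuple of 0/1 bits.

    inputs[0] is treated as the least significant address bit (assumed).
    """
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> addr) & 1

AND4 = 0x8000            # outputs 1 only when all four inputs are 1
upset = AND4 ^ (1 << 7)  # a single configuration bit flipped by an SEU

pattern = (1, 1, 1, 0)   # address 7: three inputs high, one low
print(lut_eval(AND4, pattern), lut_eval(upset, pattern))
```

After the flip the LUT no longer implements a 4-input AND: it also asserts its output for one input pattern it should reject, and it keeps doing so until the configuration is rewritten.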

I know it's important, but...
Max The Magnificent   11/26/2013 2:20:04 PM
I know reliability is important. It's just that I find wading through things like MTBF and FIT and stuff boring ... I just want to solder things together and have them work (LOL)
