A single-event upset (SEU) occurs when a latch or logic element on the device is flipped into the wrong state by an unexpected occurrence.
In an earlier blog about reliability (What Do We Mean by FPGA Reliability?), I mentioned single-event upsets (SEUs). The topic struck a chord with some readers, so now -- at the risk of being corrected at every turn -- I will attempt to delve a little deeper into this complex subject.
Before we start, let me state for the record that I am a marketing person and not an expert on SEUs. So this will not read like a deep dive into nuclear physics. The first thing to establish is the meaning of a "single-event upset." This is where a latch or logic element on the device is flipped into the wrong state by an unexpected event.
The unintended switch-over typically happens when the charge holding the gate of a latch is destabilized by an energetic particle hitting a vulnerable part of the silicon. The particle might be a proton, an alpha particle, or one of a shower of sub-atomic particles, such as neutrons, produced by a cosmic ray. Such a particle has sufficient energy to strip electrons from atoms in the device, leaving a track of ionized atoms in its path. An SEU can happen when the ionizing track passes through the gate of a cross-coupled flip-flop and changes the gate voltage. If the disturbance is strong enough to cause a momentary change of state, then the regenerative action of the latch will retain the new logic output.
Simplified latch showing areas vulnerable to ionizing particles.
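To make the regenerative action a little more concrete, here is a minimal toy sketch in Python of two cross-coupled inverters. It is not a circuit-accurate model: the supply voltage, inverter gain, time constant, and the way the particle strike is represented (a constant pull-down on one node for a short time) are all assumptions chosen for illustration, and every name in it is hypothetical.

```python
import math

VDD = 1.0    # supply voltage in volts (assumed value)
GAIN = 10.0  # steepness of the inverter curve, per volt (assumed value)
TAU = 1.0    # node time constant, arbitrary time units (assumed value)
DT = 0.01    # Euler integration step, same units as TAU


def inverter(v_in):
    """Smooth inverter transfer curve: a high input gives a low output."""
    return 0.5 * VDD * (1.0 - math.tanh(GAIN * (v_in - 0.5 * VDD)))


def simulate_strike(pull_volts_per_tau, pulse_duration, total_time=30.0):
    """Return the final (q, q_bar) node voltages after a strike on node q.

    The strike is modeled (as an assumption) as a constant pull-down on q,
    in volts per TAU, that lasts for pulse_duration time units.
    """
    q, q_bar = VDD, 0.0  # the latch starts out storing a 1
    for step in range(int(total_time / DT)):
        t = step * DT
        pull = pull_volts_per_tau if t < pulse_duration else 0.0
        # Each node relaxes toward the output of the inverter driving it.
        dq = (inverter(q_bar) - q - pull) / TAU
        dq_bar = (inverter(q) - q_bar) / TAU
        q = max(q + dq * DT, 0.0)  # crude clamp so q cannot go below ground
        q_bar += dq_bar * DT
    return q, q_bar


def stored_bit(q, q_bar):
    """Read the latch: 1 if q is the higher node, otherwise 0."""
    return 1 if q > q_bar else 0


weak = simulate_strike(0.2, 5.0)    # small disturbance: latch recovers
strong = simulate_strike(2.0, 5.0)  # large disturbance: latch flips
print(stored_bit(*weak), stored_bit(*strong))
```

In this sketch the weak pulse only dents the voltage on `q`, so the feedback restores the original value once the pulse ends. The strong pulse drags `q` low for long enough that `q_bar` rises, and from then on the feedback holds the wrong value, which is the upset.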
Earth's atmosphere acts like a blanket, absorbing some of the particles that might cause an SEU, so it is not surprising that the probability of an upset increases with altitude. Satellites are exposed to much higher levels of radiation and hence, potentially, to greater disruption than are terrestrial products. Aerospace applications normally use FPGA devices fabricated using a modified semiconductor process and/or design. FPGA vendors such as Microsemi and Xilinx offer radiation-hardened (rad-hard) or radiation-tolerant (rad-tolerant) devices that are less susceptible to SEUs. These devices can also withstand a larger total dose of radiation. Total dose is a separate factor: it is the build-up of exposure over time, which may not result in SEUs but is analogous to an aging effect.
To complicate matters further, the probability of an SEU is influenced by location in the world. Particles are affected by the Earth's magnetic field, so New York gets hit with twice as many particles as any location on the equator.
Aerospace designers have always worried about SEUs, but as leading CMOS processes hit 28 nm and below, there has been an increasing level of awareness. Of course, FPGAs are not the only semiconductor devices to suffer from SEU errors. Some SRAM devices are also very susceptible, and because most FPGAs store their configurations in on-chip SRAM, FPGAs have helped to raise awareness of the phenomenon. Flash-based FPGA devices from vendors like Microsemi and Lattice Semiconductor do not use SRAM to hold the configuration, but SEUs can still disrupt the on-chip SRAM memory blocks that designers can use in their applications.
Semiconductor vendors have a significant influence over SEUs in several ways. For example, I mentioned that alpha particles can toggle the circuitry, but -- as we learn at school -- alpha particles have very poor penetration of solids. So you might question how they can possibly have any relevance to causing SEUs. The answer is that the packaging may contain impurities that generate alpha particles adjacent to the chip. The solution is for the vendors to use packaging that has very low alpha particle content.
The detailed design of the latch is very important to reducing SEU effects. One improvement might be to increase the capacitance on the gate, so that an impinging particle would have to disturb more coulombs of charge to change the gate voltage. The SRAM elements that store the FPGA configuration do not need to be fast, so increasing the capacitance (which is tiny by most measures and is in the range of femtofarads) is viable. As transistors shrink, so does the "target area" for the particle. The angle of the trajectory relative to the die is also important, which leads me to speculate that (perhaps) blades mounted vertically might be less susceptible than horizontally mounted boards.
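The capacitance argument can be put into rough numbers with the textbook relation Q = C x V: the charge needed to move a node through a given voltage swing scales with its capacitance. The Python sketch below is only an illustration. The 1 fF capacitance and 0.5 V swing are assumed values, not figures from any vendor, and the function names are hypothetical.

```python
ELECTRON_CHARGE = 1.602176634e-19  # coulombs per electron


def charge_to_disturb(capacitance_farads, swing_volts):
    """Charge (in coulombs) needed to move a node by swing_volts: Q = C * V."""
    return capacitance_farads * swing_volts


def electrons_needed(charge_coulombs):
    """Express a charge as an equivalent number of electrons."""
    return charge_coulombs / ELECTRON_CHARGE


# Assumed, illustrative values: a 1 fF node that must move by 0.5 V to flip.
baseline = charge_to_disturb(1e-15, 0.5)
# Doubling the capacitance doubles the charge a particle must deposit.
hardened = charge_to_disturb(2e-15, 0.5)

print(f"baseline: {baseline:.2e} C, about {electrons_needed(baseline):.0f} electrons")
print(f"hardened: {hardened:.2e} C, about {electrons_needed(hardened):.0f} electrons")
```

With these assumed numbers the baseline node needs about half a femtocoulomb, or roughly three thousand electrons, to be disturbed. The point is the proportionality rather than the specific values: more capacitance means an ionizing track has to deposit more charge before the latch can flip.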