Offloading of compute-bound tasks, System-on-Chip integration, and custom processes are driving software applications into FPGAs.
FPGA deployments fall into at least three general categories, depending largely on skill level and task type:
- FPGAs as a true "field of gates"
- FPGAs + special function blocks such as DSPs
- FPGAs + special function blocks such as DSPs + an on-board MCU
Each has its role in accelerating image processing. The field of gates works well for a consistent design type that can be highly parallelized; starfield analysis is one example. DSP blocks can add simple processing, but the most typical use is to offload a CPU using the Communicating Sequential Processes (CSP) model.
CPU offload requires considerable iteration to a) refactor the code efficiently and b) determine the optimal partitioning between hardware and software. Refactoring from a CPU to an FPGA often involves maintaining a "golden model" in C that helps measure performance gains and provides a true A/B comparison. The first pass often aims only to establish functional equivalence: a set of C files equivalent to the golden model is refactored into individual, coarse-grained streaming processes, a form that compiles most efficiently into multiple streaming hardware processes. This initial pass will be less efficient but quick to turn around. Gates are "free" in this pass, but memory is far from "free," and software developers have to accommodate that. Minimizing the amount of data leaving the chip is critical.
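The golden-model workflow can be illustrated with a minimal sketch in plain C. It assumes a hypothetical 3-tap moving-sum filter as the algorithm, and the `stream_t` FIFO with its `stream_read`/`stream_write` helpers is an invented stand-in for a hardware stream, not the API of any particular HLL-to-HDL tool. The golden model uses random access into an array; the refactored version is split into three coarse-grained processes that communicate only through streams and keep just two pixels of local state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_PIXELS   16   /* hypothetical row length for this sketch */
#define FIFO_DEPTH 32   /* must hold a full row, because the processes run one after another here */

/* Golden model: 3-tap moving sum with random access to the input array. */
void golden_filter(const uint8_t *in, uint16_t *out, size_t n)
{
    assert(n >= 3);
    for (size_t i = 0; i + 2 < n; i++)
        out[i] = (uint16_t)(in[i] + in[i + 1] + in[i + 2]);
}

/* Software stand-in for a hardware stream (a simple ring-buffer FIFO). */
typedef struct {
    uint16_t data[FIFO_DEPTH];
    size_t head, tail, count;
} stream_t;

static int stream_write(stream_t *s, uint16_t v)
{
    if (s->count == FIFO_DEPTH) return 0;      /* stream full */
    s->data[s->tail] = v;
    s->tail = (s->tail + 1) % FIFO_DEPTH;
    s->count++;
    return 1;
}

static int stream_read(stream_t *s, uint16_t *v)
{
    if (s->count == 0) return 0;               /* stream empty */
    *v = s->data[s->head];
    s->head = (s->head + 1) % FIFO_DEPTH;
    s->count--;
    return 1;
}

/* Process 1: push the input pixels into the stream, one at a time. */
void producer_process(const uint8_t *in, size_t n, stream_t *to_filter)
{
    for (size_t i = 0; i < n; i++)
        stream_write(to_filter, in[i]);
}

/* Process 2: the filter sees one pixel per read and keeps only two
   previous pixels locally, so it needs no array of its own. */
void filter_process(stream_t *from_producer, stream_t *to_consumer)
{
    uint16_t p0 = 0, p1 = 0, px;
    size_t seen = 0;
    while (stream_read(from_producer, &px)) {
        if (seen >= 2)
            stream_write(to_consumer, (uint16_t)(p0 + p1 + px));
        p0 = p1;
        p1 = px;
        seen++;
    }
}

/* Process 3: drain the results; returns how many values arrived. */
size_t consumer_process(stream_t *from_filter, uint16_t *out)
{
    size_t n = 0;
    while (stream_read(from_filter, &out[n]))
        n++;
    return n;
}

/* A/B check: run both versions on the same row and count differences. */
size_t count_mismatches(const uint8_t *in, size_t n)
{
    uint16_t golden[N_PIXELS] = {0}, streamed[N_PIXELS] = {0};
    stream_t to_filter = {0}, to_consumer = {0};
    size_t produced, mismatches = 0;

    assert(n >= 3 && n <= N_PIXELS);
    golden_filter(in, golden, n);

    producer_process(in, n, &to_filter);
    filter_process(&to_filter, &to_consumer);
    produced = consumer_process(&to_consumer, streamed);

    if (produced != n - 2)
        mismatches++;
    for (size_t i = 0; i + 2 < n; i++)
        if (golden[i] != streamed[i])
            mismatches++;
    return mismatches;
}
```

Calling `count_mismatches` on a row of up to `N_PIXELS` pixels should return 0 when the streaming refactor is functionally equivalent to the golden model. In hardware the three processes would run concurrently, so the FIFOs could be much shallower than in this sequential simulation.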
VHDL and Verilog are the most common coding languages -- hardware description languages (HDLs) -- used to represent FPGA designs. Most software developers do not use either, nor do they typically want to learn, hence the emergence of HLL-to-HDL compilers. Using these tools, users can create portable design prototypes -- portable between FPGAs within a brand, and even between FPGAs from different manufacturers.
On screen, the software developer sees an interface compatible with common tools such as Visual Studio. Impulse adds elements that help visualize the task of parallelizing in a manner that maximizes use of the parallel channels in the FPGA.
Coincidentally, this methodology maximizes the potential portability to an ASIC/SoC by maintaining an appropriate intermediate file format that can be refined or redirected later. HLL use in this flow is typically heterogeneous; that is, the design ideally retains the ability to mix HLL-generated and hand-coded modules.
To shorten development time, designers should stay in the HLL (e.g., C or OpenCL) as much as possible. This further maximizes portability, as the HLL approach often isolates the FPGA-specific code in a board support library, a.k.a. a platform support package. Minor elements of modules or parts of the overall system are often candidates to be "pulled out" of the flow and hand-coded for size or performance when targeting a specific system or device architecture.
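One way to picture this isolation is a small interface that the portable algorithm calls, with everything board-specific hidden behind it. The sketch below is only an illustration under stated assumptions: `platform_ops`, `threshold_process`, and the host-side `sim_platform` are hypothetical names invented here, not part of any vendor's platform support package.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform interface: the only part a new board would replace. */
typedef struct {
    int  (*read_pixel)(void *ctx, uint8_t *px);   /* returns 0 at end of stream */
    void (*write_pixel)(void *ctx, uint8_t px);
} platform_ops;

/* Portable algorithm: binary threshold written only against the interface.
   Returns the number of pixels processed. */
size_t threshold_process(const platform_ops *ops, void *ctx, uint8_t level)
{
    uint8_t px;
    size_t n = 0;
    assert(ops && ops->read_pixel && ops->write_pixel);
    while (ops->read_pixel(ctx, &px)) {
        ops->write_pixel(ctx, px >= level ? 255 : 0);
        n++;
    }
    return n;
}

/* Host-side simulation "platform" backed by plain arrays. */
typedef struct {
    const uint8_t *in;
    uint8_t *out;
    size_t len, rd, wr;
} sim_ctx;

static int sim_read(void *ctx, uint8_t *px)
{
    sim_ctx *s = (sim_ctx *)ctx;
    if (s->rd == s->len) return 0;
    *px = s->in[s->rd++];
    return 1;
}

static void sim_write(void *ctx, uint8_t px)
{
    sim_ctx *s = (sim_ctx *)ctx;
    if (s->wr < s->len) s->out[s->wr++] = px;
}

const platform_ops sim_platform = { sim_read, sim_write };
```

Retargeting would then mean supplying a different `platform_ops` table while `threshold_process` stays untouched, which is the portability benefit described above.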
Design types common to this approach include:
- UAVs: These designs are often driven by power requirements to use slower clock speeds and lower-power FPGAs. System maintenance and continuance may be driven by requirements to increase resolution as soon as new sensors/cameras become viable and available. Use of appropriate HLL-to-FPGA technology results in shorter time to deployment and lower-power HLL-designed FPGAs.
- HD: These designs are driven by ever-increasing resolutions and frame rates. If the target is a high-volume consumer system, FPGAs are often used as an intermediate step toward full ASIC production. Use of appropriate HLL-to-FPGA technology shortens development time and lets the design team experiment with power-resolution-cost trade-offs.
- Machine vision used in manufacturing: Similar to what drives HD, but with a longer field life. In the case of these designs, power consumption is not as much of an issue, but resolution and processing speed directly relate to the throughput and value of the system.
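The interplay of resolution, frame rate, and clock speed in these design types comes down to simple arithmetic, sketched below. The figures used in the note that follows (1920x1080 at 60 frames per second, 100 MHz and 150 MHz clocks) are illustrative assumptions, and the calculation ignores blanking intervals and memory stalls.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels per second that the pipeline must sustain. */
uint64_t pixel_rate(uint32_t width, uint32_t height, uint32_t fps)
{
    return (uint64_t)width * height * fps;
}

/* Minimum number of pixels that must be processed per clock cycle
   (rounded up) to keep pace at the given clock frequency. */
uint64_t pixels_per_clock(uint64_t rate, uint64_t clock_hz)
{
    assert(clock_hz > 0);
    return (rate + clock_hz - 1) / clock_hz;
}
```

Under these assumptions, 1920x1080 at 60 frames per second is 124,416,000 pixels per second, so a 100 MHz design needs to handle 2 pixels per clock while a 150 MHz design needs only 1. A slower, lower-power clock therefore has to be paid for with wider parallel datapaths, which is the kind of trade-off the design team ends up exploring.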