If you are doing a FIR with constant coëfficients in an FPGA, check out "distributed arithmetic". It replaces multipliers with look-up tables and adders. Amazing technique, "indistinguishable from magic".
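For anyone who hasn't seen distributed arithmetic before, here is a rough software model of the idea, not a hardware description. It assumes signed two's-complement samples of a known width and a small number of constant integer coefficients; the names `build_lut` and `da_dot` are made up for this sketch. The table holds every possible sum of coefficients, and the dot product is then built one bit position at a time from look-ups, shifts and adds.

```python
def build_lut(coeffs):
    """Precompute every possible sum of coefficients.

    Bit i of the table address selects whether coeffs[i] is included.
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_dot(lut, samples, bits):
    """Dot product of the constant coefficients with `samples`.

    `samples` are signed integers that fit in `bits`-wide two's complement.
    Only table look-ups, shifts and adds are used; there is no multiply.
    """
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one table address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i
        term = lut[addr] << b
        # The top bit of a two's-complement number has negative weight.
        acc += -term if b == bits - 1 else term
    return acc


coeffs = [3, -1, 4, 2]          # illustrative values only
samples = [5, -7, 100, -128]    # 8-bit signed samples
lut = build_lut(coeffs)
result = da_dot(lut, samples, bits=8)
direct = sum(c * x for c, x in zip(coeffs, samples))
```

In an FPGA the table would sit in LUTs or block RAM and the loop over bit positions would run serially or be unrolled, but the arithmetic is the same. The table grows as 2^K for K taps, so longer filters are usually split into several smaller tables.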
@sa_penguin: Sounds like you could use embedded RAM blocks and do the whole thing in real time on the FPGA. This would not require external memory, controller, DMA, etc. And the multipliers on the FPGA are very fast.
There's some hefty intellectual argument going on here. As a low-level starter, I can tell you what started me out: Software Defined Radio [SDR] high speed ADCs and the USB interface.
High speed USB is rated at 480Mbps, but you're lucky to get 300Mbps. A 12-bit ADC running at 120MHz produces 1.44Gbps so a little massaging is in order, along with learning about DSP...
Downsampling? Take 12 samples, multiply each by a twiddle factor, and produce a 16-bit output sample? That should be 160Mbps and slow enough to transfer.
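The numbers above can be checked with a few lines of arithmetic. The figures are the ones quoted in the post (12-bit ADC, 120 MHz, decimate by 12, 16-bit outputs, roughly 300 Mbps of usable USB throughput); the variable names are just for this sketch.

```python
ADC_BITS = 12
SAMPLE_RATE_HZ = 120_000_000
DECIMATION = 12
OUT_BITS = 16
USB_USABLE_BPS = 300_000_000  # the "lucky to get" figure, not the 480 Mbps rating

# Raw ADC stream: bits per sample times samples per second.
raw_bps = ADC_BITS * SAMPLE_RATE_HZ

# After decimation: one 16-bit word for every 12 input samples.
out_bps = OUT_BITS * SAMPLE_RATE_HZ // DECIMATION

fits_usb = out_bps <= USB_USABLE_BPS

print(f"raw: {raw_bps / 1e6:.0f} Mbps, decimated: {out_bps / 1e6:.0f} Mbps, fits: {fits_usb}")
```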
But - multiplying is hard. Taking those input samples and multiplying them takes time, and all the while new samples are coming in. A simple CPU is flat out, just multiplying and moving data around. Wouldn't it be nice if I could just write the factors to 12 memory locations, have a DMA point the incoming data at another 12 locations - and a single value pops out after every 12 samples? I could call it a "decimating FIR filter"... and if I want one, I need a DSP chip or an FPGA.
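As a software model of that wish, here is a minimal sketch of a decimate-by-12 filter in which the number of taps equals the decimation factor, as described above. A real decimating FIR often has more taps than the decimation factor, so its blocks overlap; that case is not covered here. The function name and the coefficient values are placeholders, not a designed filter.

```python
def decimating_fir(samples, coeffs):
    """Return one output per len(coeffs) input samples.

    Each non-overlapping block of inputs is multiplied element by element
    with the fixed coefficients and summed. A trailing partial block is
    dropped.
    """
    n = len(coeffs)
    outputs = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        outputs.append(sum(c * x for c, x in zip(coeffs, block)))
    return outputs


coeffs = [1] * 12                # placeholder: a plain 12-sample sum
samples = list(range(24))        # two blocks of 12 made-up samples
decimated = decimating_fir(samples, coeffs)
```

In hardware, the 12 multiplies in each block can run in parallel or be pipelined against the incoming sample stream, which is exactly what keeps a simple CPU "flat out" when it has to do them one after another.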
The worst [or best] part came when I read about using Fourier transforms [transform input, select a band of interest, translate to a lower frequency band, then do an inverse transform]. The first step in transforming is re-ordering samples. I looked at the code for this, and thought: "DMA transfer, use an incrementing counter, but flip the counter bits before they go into the DMA".
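The "flip the counter bits" idea is the usual bit-reversed addressing used ahead of a radix-2 FFT. A small model of it, assuming the block length is a power of two (the function names are made up for this sketch):

```python
def bit_reverse(index, width):
    """Reverse the low `width` bits of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(samples):
    """Reorder `samples` so position i holds the sample at the bit-reversed index."""
    n = len(samples)
    width = n.bit_length() - 1
    if n == 0 or (1 << width) != n:
        raise ValueError("length must be a power of two")
    return [samples[bit_reverse(i, width)] for i in range(n)]


reordered = bit_reversed_order(list(range(8)))
```

In an FPGA the reversal costs no logic at all: the counter outputs are simply wired to the address bus in the opposite order.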
Even if it doesn't work [time will tell] the fact I'm thinking in terms of counters and bit order means the journey has begun. There's no turning back now.
JeffL_2 wrote: I'm looking at data sheets here for Atmel and Lattice devices, and I see no evidence that either incorporates support for JTAG.
I don't know what devices you're looking at. If they're "jellybean" PLDs like 22V10s, then yes, those don't have JTAG: it's C-as-in-complex-PLDs and FPGAs that have JTAG. FPGAs and CPLDs use JTAG for its traditional purpose -- boundary scan for testing boards and chips -- and for programming the FPGA or CPLD. Xilinx also lets you read and write register bits, and create your own internal scan paths using logic cells. As I said upstream, Xilinx provides you and third parties the information to do this, but I don't know if they provide any software.
Most people debug FPGA designs using simulation (if they don't need to run real-time) or by routing internal signals to spare pins connected to test points so they can be 'scoped in real time. I mostly use the latter, using actual prototype hardware. You can also get FPGA development boards in various price ranges which provide more convenient access to signals. You can also get modules with DIP and PLCC pinouts for prototyping or small-scale production, but I wouldn't call them cheap.
Xilinx has something called ChipScope which (if I understand it correctly) makes it easy to view internal signals. I don't know how it uses JTAG. I've never used ChipScope, mostly because it isn't included with the free WebPACK version of the software. So I just include test points in my Verilog source code and modify the test point equations to look at different signals. The important thing is to have large enough test point holes in the PC board so that your probes stay put while you're running the test.
Profiles here do not seem to provide email addresses. My gmail.com address is 'paaronclayton', so you could either email me there or post your mildly obfuscated (to avoid spam) email address in a reply here.
I did not realize that Berkeley "pulled back the VLE".
I am surprised that compiler people would have much input on encoding, though the commonness of JIT compilation could justify more software-friendly encoding. (For ahead-of-time compilation, binary generation overhead would not be important in complexity or speed for any somewhat reasonable encodings. Compiler people and software people in general might have useful input in what instructions are offered, which instructions are short, etc.; but where semantic fields are placed within an instruction would not seem to be a big deal for compilers. However, that statement may just express my ignorance.)
It does sound like an interesting project. It is a bit sad that government involvement is necessary, but "If men were angels, no government would be necessary."
I look forward to emailing you some of my thoughts (more than requested or desired most likely! [I have an affection for computer architecture, as evidenced by Internet activity--comp.arch newsgroup, Real World Technologies forum, Stack Exchange, etc.]) when I get your email address.
If you need high algorithm performance or low latency, go with the FPGA.
If it's a complex function, go with the CPU.
A CPU is a cheap way to implement complex functionality, because the cost of functions is very low. So if your CPU is fast enough and meets your real-time latency requirements, go for the CPU. OTOH, the CPU can only do one operation at a time -- or a few if multiple cores and/or multiple issue, assuming your software can take advantage of them -- so performance is limited.
An FPGA can do many operations at the same time, so if you have a high-performance DSP algorithm (e.g., image processing or beam forming) or something similar, you can blast through a lot more data than a CPU, plus you can transfer data directly from block to block instead of having to transfer through memory. OTOH, you have to pay for those blocks so you don't want to do it that way unless it's really necessary.
The other great thing you get from an FPGA is extremely low latency, so if you have real-time data -- e.g., a specialized communications protocol -- you can respond immediately to your inputs. Most CPUs cannot respond quickly to inputs, especially if you're running an OS, so if you need low latency an FPGA can be a great way to do it. Often these applications don't need a lot of logic -- they just need the logic to be dedicated to the inputs. So you can get by with a cheap FPGA.
"1) the "core" voltage is some low nonstandard value which you may have to provide at a fairly high current depending on the size of the device"
The voltage and current requirements are what they need to be to ensure a balance between speed and power consumption.
Besides, have you looked at the power supply requirements for an Intel Core processor? Pro tip: it's not a 5V-only part.
Another Pro tip: there are these things called voltage regulators, which use simple resistive dividers to set the output voltage to pretty much whatever you want. The actual voltage required by the FPGA doesn't matter. Plop down the right regulator and move on.
"2) sockets for these devices either do not exist or are two orders of magnitude more expensive than the device itself! (Not that current MCUs and MPUs are devoid of this issue either, for reasons that I still find inexplicable.)"
SOCKET? Nobody ships a product with a socket. The last time I put a socket in something was when we still used UV-erasable EPROMs. There is no reason to use a socket in this age of BGA packages and in-system-programmable flash memory. (OK, if you develop with OTP FPGAs for space applications, you might put a socket on the prototype boards.)
"you could potentially add the VHDL to internally replicate an existing MCU for which compilers and assemblers already exist, but then you'd probably be in violation of someone's copyright"
If you embedded an 8051 in your FPGA, and didn't tell anyone, would anyone know?
"something like the JTAG standard that caught hold for some MCUs isn't all that commonly used for FPGAs,"
Are you kidding? I'm hard-pressed to think of a mainstream FPGA or CPLD family which doesn't have JTAG!
"If the hardware aspects of the tools were sufficiently standardized that you KNEW you could use the same interface over the next 3-5 projects, and could amortize the tool costs against that, it would help a lot."
All of the FPGA vendors sell a USB-to-JTAG programming dongle for at most a couple hundred bucks. I still use the same Xilinx dongle the company bought for me almost nine years ago when I started the job. (OK, I have two more of them.) If you're doing FPGA design for a living and you can't afford the dongle (and the rest of the tool set is $FREE), then I don't know what to say.
"I would NOT say that learning VHDL SHOULD be a problem since the benefits appear to outweigh the learning curve and it's very widely accepted (although I'm no expert in it yet either)."
Well, seeing as how VHDL (and Verilog) are the two primary FPGA design-entry methods (for both synthesis AND simulation -- you DO simulate and verify your designs, right?), then I'd say you MUST learn an HDL. It is simply not an option. Schematics? Don't go there.
"I hope Xilinx learned lessons from the Virtex-II Pro, which had built-in PowerPC cores. We considered that chip at one time since we were using PowerPC SoCs, but we gagged on the price. We later went with an IBM/AMCC 405EP and a Spartan-IIE, with a 33 MHz 32-bit PCI bus between them."
We did a bunch of designs with the Virtex-4 FX with built-in PPC, and we had the same experience. The chips were more expensive than a standalone PPC and a smaller FPGA next to it, and the savings in board space going with the V4FX wasn't all that much. We didn't do PCI between the PPC and FPGA, we just did the processor local bus.
And all of that cost was part of the equation. The other part was the continually-"evolving" toolset. Let's go from OPB to PLB to AHB, making the customer rewrite the same custom cores THREE times. Let's provide REALLY AWFUL cores with the tools. And so on.
"To be honest, until recently I still had questions like 'Why should one consider FPGA technology when we have well-known microcontrollers available to us'?"
The answer is quite obvious, actually: it's because generally, FPGAs and microcontrollers/microprocessors solve different problems.
The question our blogger asked is generally asked by folks coming in from a host software background, or perhaps from an embedded micro background.
A processor presents a sequential instruction execution model, limited I/O, and a straightforward programming model which allows the engineer to write programs in C or other language. A microcontroller's peripherals and pin-outs are limited to what the chip designers thought would be a good balance between cost and flexibility.
An FPGA is a platform for general digital logic design. The engineer can implement whatever is necessary to meet the design goals. The designer isn't limited to what peripherals were chosen by the chip vendor, so if you need nine SPI ports, each with different word lengths, you design them and add them to the larger system. If you need a wacky ping-pong asymmetric memory interface to handle sensor data, you design exactly what you need. Pinouts are limited by the device choice; if you need more pins or more logic, you choose a larger device.
Of course, the flexibility of the FPGA comes at a cost, or at least with a requirement: you have to be a digital design engineer. It is not like writing firmware for a microcontroller. It is a different skill set. This needs to be understood.
Now of course the FPGA vendors are embracing High-Level Synthesis and "C to gates" methodologies as a way to "ease" the transition from high-level software and algorithm acceleration into hardware. But for mainstream logic design, where one really does need to be concerned about the cost of the FPGA device (as part of the overall BOM), HLS is still a non-starter.
And then there are the Systems-on-a-Chip, which embed a processor hard core with some amount of FPGA fabric. The intent here is a good one, since some applications require both a processor to handle stuff like an Ethernet or USB interface to a host, combined with the custom logic that is necessary to implement the design. Does every design require an SoC? Certainly not, but it's a useful tool for a designer to have (assuming the vendor design tools don't totally suck eggs).