I received this EE Times article through ACM while here at Embedded Systems Week in Salzburg. The topic of the article is echoed here as well, and let me tell you, people seem to have short memories, because the answer was already given in the late '70s.
There will never be a real 'parallel' language. What people actually mean is a compiler that turns a sequential program into a parallel one. In the best case we will have something like the parallelizing Fortran compilers. These compilers look for loops and split them over multiple processors. The issue is that one can never extract more parallelism than was originally put into the program. For a lot of scientific programs, or even for some graphics applications, there is some potential, but for most applications the potential is very limited. Even then, a lot more parallelism could be found if a parallelizing compiler were not an exercise in reverse engineering. The original problems often have a lot of real-world parallelism. E.g. fluid dynamics code starts from a model where millions of small "voxels" and their interactions are integrated to obtain a sequential code. In the process, the "parallel" information gets lost.
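To make the point concrete, here is a minimal sketch (my own illustration, not from the article) of a toy voxel-relaxation kernel where the split over workers is written explicitly. This is exactly the structure a parallelizing compiler has to reverse-engineer out of the equivalent sequential loop:

```go
package main

import (
	"fmt"
	"sync"
)

// relax averages each voxel with its neighbours. Within one step,
// every output element is independent, so the loop can be split.
func relax(grid, out []float64, lo, hi int) {
	for i := lo; i < hi; i++ {
		left, right := i, i
		if i > 0 {
			left = i - 1
		}
		if i < len(grid)-1 {
			right = i + 1
		}
		out[i] = (grid[left] + grid[i] + grid[right]) / 3
	}
}

func main() {
	grid := []float64{0, 0, 9, 0, 0, 0}
	out := make([]float64, len(grid))

	// Explicit split over two workers: the parallelism that gets
	// lost when the same kernel is flattened into a sequential loop.
	var wg sync.WaitGroup
	mid := len(grid) / 2
	for _, r := range [][2]int{{0, mid}, {mid, len(grid)}} {
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			relax(grid, out, lo, hi)
		}(r[0], r[1])
	}
	wg.Wait()
	fmt.Println(out) // [0 3 3 3 0 0]
}
```

The workers write disjoint halves of `out` and only read `grid`, so no locking is needed; that independence is visible here but invisible to a compiler staring at the sequential version.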
In the embedded world we fortunately already have RTOS code using multi-tasking.
This is often quite natural, as the real world is composed of concurrent (a better word than parallel) entities that interact. Hence real code is composed of concurrent entities that interact. This is the logical model. In real-time applications one also has to add the time dimension, and that is what makes it all seem harder than programming in C or Java on a PC. Real code is concurrent, interacting, and has time properties. There are other properties as well, like resource usage, but we can ignore those for the moment. No single programming language can capture all of that automatically. It has to be part of the programming paradigm. It also has to be supported adequately in the hardware.
To start, multi-core architectures are already common in the embedded world.
It's almost the default architecture. Your mobile phone likely has a RISC, a DSP and maybe a couple of vectorising accelerators. I work with a company that makes smart sensors (Melexis). They put a 16-bit microcontroller with just 32 KB of program code together with a 4-bit controller handling I/O.
So what else is needed? Fast context switching, and low-latency I/O or data moving going on in parallel with the CPU. As Lothar Thiele put it again at the conference: software = computing + communication + resource management. Hence, yet another computing paradigm is not going to solve it if it leaves out two of these three aspects. Note that time is a resource here as well, but all three aspects are orthogonal and should remain so.
Concurrency then becomes natural if one programs explicitly concurrently from the beginning. That shouldn't be an issue, as good software engineering (not the same as writing code) is modeling in the first place: verify that the design is correct, and then writing the concurrent code is trivial. It can even be done by another program. No human is needed.
At the beginning of this letter I alluded to the '70s, and of course I was referring to the INMOS transputer, which was based on Hoare's CSP process algebra.
It worked extremely well. The transputer did context switches in a single microsecond at 20 MHz. Even a single instruction could be scheduled as a process. Unfortunately, serious marketing mistakes killed the transputer and its programming model, although a small community of converts is still alive and kicking.
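The CSP model did not die with the transputer: Go's channels descend directly from it. A small sketch of occam-style constructs in that lineage - PAR becomes goroutines, synchronous channel communication stands in for the rendezvous on transputer links, and ALT becomes select:

```go
package main

import "fmt"

// producer is a CSP-style process: it communicates only over its
// channel, never through shared state.
func producer(id, n int, out chan<- string) {
	for i := 0; i < n; i++ {
		out <- fmt.Sprintf("p%d:%d", id, i)
	}
	close(out)
}

func main() {
	a := make(chan string) // unbuffered: a rendezvous, like a transputer link
	b := make(chan string)
	go producer(1, 2, a) // PAR: both processes run concurrently
	go producer(2, 2, b)

	// ALT: take input from whichever process is ready, until both finish.
	for a != nil || b != nil {
		select {
		case m, ok := <-a:
			if !ok {
				a = nil
				continue
			}
			fmt.Println(m)
		case m, ok := <-b:
			if !ok {
				b = nil
				continue
			}
			fmt.Println(m)
		}
	}
}
```

The interleaving of the two producers is nondeterministic, exactly as an occam ALT would be; what is deterministic is that all four messages arrive and the program terminates.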
How do modern processors compare? I should really say: modern processors and the software running on top of them. Mostly very badly. On top of that, people use them as references. The example we all use subconsciously is the PC. We think Intel + Windows, but any other processor with Linux doesn't do much better.
E.g. Windows uses a 15 ms time-slicing scheduler, whether it runs on a 100 MHz Pentium or the latest 3 GHz machine. A lot of applications communicate by polling. Even on a single core this looks like a lot of waste. It is only justified by the fact that our interaction with the PC happens at about 25 Hz, and that it is a good-enough solution for heavy gaming graphics. A true priority-driven preemptive scheduler would have gotten a lot more out of the machine.
This was recently made very clear to us when developing a virtual prototype for a SoC, where LabVIEW was combined with a CPU register-level simulator, communicating over shared memory. When running it on a dual-core PC (with no source code changes), the simulation speed went up by a factor of about 200. So even on a single-core PC we are not getting the bang for the buck we deserve.
So, what's the final message here? If we want to exploit the power of multi-core architectures - and they are the natural architectures - we have to abandon the pure von Neumann architecture that is still reflected in the programming languages we use. We have to raise the level of abstraction and use sound system and software modeling methodologies rather than programming straight away in C, C++ or Java. Which construction engineer would build a bridge by laying the stones himself, without even making a plan? Of course, this means we need to change the way (software) engineers are educated and trained. There is not much new to be invented, as we just have to pick up the thread of the transputer and its CSP model. That shouldn't be too hard, as RTOSes have been using the concepts in an ad-hoc way for decades (though not always very efficiently). The major effort should now focus on much more rigor and formalism to achieve a "correct by design" methodology. Even in this domain, a lot of the building blocks are present, but they need to be "productised" and people need to be trained. The major obstacle to using the potential of multi-core architectures is actually in the mind. People - and this applies to engineers as well - are mostly good at repeating the things they already know. Now they need to learn a new "language" (read: a way of thinking in concurrency), and that is hard, although learning a language comes naturally when we are young.
Eric.Verhulst @ OpenLicenseSociety.org