Hi, you are right, it's an ARM3.
But the cloudx.cc project also supports AVR, MSP430 and soon a Thumb2 clone.
I'm not sure if instruction sets (or their complexity) really matter these days or in the future. There are other (multicore) challenges …
With little or no memory on the chip this just makes the "memory wall" more of a problem.
To execute an instruction the opcode and 2 operands are required and something has to be done with the result.
RISC trades CPU complexity for many, many more instructions.
Better to start with structured program statements and implement using memory blocks.
Horizontal microcode with local storage for variables is fast and reduces memory bus utilization.
ISAs have been massaged over and over for years, and all this does is take a subset of a popular ISA.
I do not get the point.
I am sure they will matter because instruction set complexity is related to compiler and computational efficiency, which is ultimately related to energy efficiency.
However, as you indicate, the challenges of multicore operation, and the benefits to be derived from doing it well, may be greater than those from swapping out one ISA and replacing it with another.
But better to do both in an optimum manner.
There are uses for simple processors, or large sets of them, but they are not general computing, i.e. you won't be running Word on them any time soon. A large number of real-world applications need lots of data, and that means that making use of 1000 cores is pretty tricky if they all need to fetch data at the same time from 'random' locations.
Some applications can be rewritten to make use of cellular automata, or streaming, or systolic arrays or whatever in which the data moves through the processors in an ordered way so only the processors on the edge need to access memory and the rest just shift it through, transforming it as they go.
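To make the shift-through idea concrete, here is a minimal sketch of a linear array simulated clock by clock. It is only an illustration: the function name `systolic_run`, the two example stages and the `memory_reads` counter are hypothetical, and real hardware would run all cells in the same clock rather than in a loop. It assumes no input value is `None`, since `None` marks an empty cell.

```python
def systolic_run(inputs, stages):
    """Simulate a linear array: one cell per stage, one shift per clock tick.

    Only cell 0 (the edge cell) reads from memory; every other cell takes
    the value its left-hand neighbour held on the previous tick.
    """
    cells = [None] * len(stages)      # value latched in each cell
    outputs = []
    memory_reads = 0
    # Extra empty ticks at the end flush the remaining values out of the array.
    feed = list(inputs) + [None] * len(stages)
    for item in feed:
        # The value in the last cell leaves the array.
        if cells[-1] is not None:
            outputs.append(cells[-1])
        # Update right to left so each cell sees its neighbour's old value.
        for i in range(len(stages) - 1, 0, -1):
            left = cells[i - 1]
            cells[i] = None if left is None else stages[i](left)
        # The edge cell is the only one that touches memory.
        if item is not None:
            memory_reads += 1
            cells[0] = stages[0](item)
        else:
            cells[0] = None
    return outputs, memory_reads


# Hypothetical two-stage transform: double, then add one.
results, reads = systolic_run([1, 2, 3], [lambda x: x * 2, lambda x: x + 1])
```

However many cells are added to the chain, the number of memory reads stays equal to the number of input items, which is the property that makes this layout attractive when memory bandwidth is the bottleneck.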
Since the early Inmos days these applications have become more widespread, e.g. video and audio compression/decompression can make use of this, but then there are specialised hardware implementations that do it better than a CPU anyway.
There are super-computing applications that might make use of lots of processors, but again they tend to need quite a lot of RAM per node (that is one reason the big machines use x86s and similar and not 1000 times as many tiny CPUs).
The problem remains not how to put lots of little brains on a chip, but how to make use of them in real applications.
(The original T400 transputer had 15 core instructions, and a mechanism to use the 16th to add extra ones which gradually happened over time. The instructions were all 1 byte long, with 4 bits for instruction and 4 bits for data, one instruction was used to extend the data into further nibbles. It was extremely RISC and had around 50k transistors I think, of which more than half were in the on chip RAM. I have the manual somewhere ...)
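As a rough sketch of the encoding described above (one byte per instruction, a 4-bit opcode plus a 4-bit data nibble, and one opcode reserved for extending the data), a decoder might look like this. The opcode number chosen for `PREFIX` and the name `decode` are assumptions for illustration, and the sketch ignores negative operands and the escape to extra instructions.

```python
PREFIX = 0x2  # assumed opcode number for the "extend the data" instruction


def decode(code):
    """Yield (opcode, operand) pairs from a stream of 1-byte instructions."""
    operand = 0
    for byte in code:
        opcode = byte >> 4          # high nibble: which instruction
        operand |= byte & 0xF       # low nibble: 4 more bits of data
        if opcode == PREFIX:
            # Not a real operation: keep the data and make room for the
            # next nibble supplied by the following byte.
            operand <<= 4
        else:
            yield opcode, operand
            operand = 0             # the operand is consumed by the instruction


# Two prefix bytes followed by opcode 4 build the 12-bit operand 0x123.
decoded = list(decode(bytes([0x21, 0x22, 0x43])))
```

Small constants cost a single byte, and larger ones cost one extra byte per additional nibble, which is why the code stays compact for typical programs.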
"you won't be running Word on them any time soon"
I beg to differ; a divide-and-conquer approach with a many-core CPU could work very well indeed, by passing both parallel and "systolic" tasks:
- keyboard interpreter
- mouse controls
- buffer editor
- mass storage manager
- window formatting
- font renderer
I see no reason why this couldn't be viable.
But in the interests of energy efficiency it is generally better to complete the task in a reasonable time and then put the system to sleep.
So not pointless with regard to energy consumption... or you could be doing other things.
Actually, the real value in RISC is that by breaking up the instructions into single-cycle steps, you enable the CPU to make the maximum use of the data bus. You could also argue that you are eliminating the microcode engine. The cost is that you have more instructions in a program. The benefit is that you can more closely hone the performance of that program on simpler hardware.
Multi-core in smartphones means coherent-memory multiprocessors running general-purpose operating systems. Transputer was nothing like that. Cool as Transputer was, you can't claim it as the predecessor of what is being implemented now.
Coherent multiprocessors are very old technology. What is amazing is that you can now manufacture them cheaply enough to carry them around in your pocket to read email, browse the web, and play Angry Birds. It is primarily a manufacturing/economic achievement.
I would not disagree with that.
But keeping multiprocessors coherent probably only works for a relatively small number of processors or applications where maintaining coherence is not overly burdensome.
More scalable multiprocessing may require the adoption of incoherence, like the Internet.
If anyone wants a simple computer, think about this: structured programs evaluate a relational expression (condition) and either continue with the next sequential statement or jump/branch to a target.
The next statement is either an assignment or another relational expression.
The location of the expression, the first 2 operands and the operator are all that are required.
Dual port embedded RAM can deliver the two operands simultaneously while a second delivers the operator and next address.
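A minimal software sketch of that machine, under some loudly stated assumptions: the control-word layout `(operator, addr_a, addr_b, dest_or_target)`, the name `run`, and the rule that a true condition falls through while a false one jumps are all my own choices for illustration, not something specified above. The two reads from `data` in one step stand in for the dual-port RAM, and the `control` list stands in for the second memory that delivers the operator and next address.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub}
RELATIONAL = {"<": operator.lt, "==": operator.eq}


def run(control, data, max_steps=1000):
    """Execute control words until the program counter runs off the end."""
    pc = 0
    steps = 0
    while pc < len(control):
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        steps += 1
        op, addr_a, addr_b, dest_or_target = control[pc]
        # Dual-port read: both operands are fetched in the same step.
        x, y = data[addr_a], data[addr_b]
        if op in RELATIONAL:
            # Assumed convention: true continues, false jumps to the target.
            pc = pc + 1 if RELATIONAL[op](x, y) else dest_or_target
        else:
            # Assignment: write the result back and continue sequentially.
            data[dest_or_target] = ARITH[op](x, y)
            pc += 1
    return data


# Data memory layout (hypothetical): 0=i, 1=n, 2=one, 3=sum, 4=zero
data = [0, 5, 1, 0, 0]
control = [
    ("<", 0, 1, 4),   # while i < n (else jump past the end)
    ("+", 0, 2, 0),   #   i = i + 1
    ("+", 3, 0, 3),   #   sum = sum + i
    ("==", 4, 2, 0),  #   zero == one is always false, so jump back to 0
]
result = run(control, data)
```

Note that there is no separate instruction fetch, register file or stack here: each step is one control-word read plus one two-operand data read, which is the bus-utilization point being made.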
Why bother creating a new processor, unless the point is to build it out of 74xx NAND chips just to show that you can? Restricting oneself to a subset of the available instruction set, or being forbidden to use a particularly helpful built-in operation, was one of the tricks used by my professor in CS205 class back in the 1970s.
I think Professor May does want to build the processor on the lab bench at the register level.
He makes the point that no schoolchildren today have ever seen any form of mechanical calculator; neither slide-rule nor hand-cranked calculator nor even the three-position dialing machine for calculating the remainder to score in a darts game.
As a result, first-year students have no idea about what sits between a high-level language and the 1s and 0s toggling on a digital chip's pins... to them it is simply "magic."
Modern processor architectures are so complex and full of exceptions and special cases that it would be highly wasteful of time to teach processor architecture in that context.
@DutchUncle: personally, I design CPUs just for fun, but I do see a lot of new processors out there, with different ISAs and different architectures.
Let me be honest: I don't think that multi-core/SMP is the way to go. These two techniques exploit parallelism, and have a lot of drawbacks (synchronization, lack of predictability).
There's another technique that does just about the same - superscalar.
The hardware is always inherently parallel: we have combinatorial circuits feeding synchronous elements, all at the same time. What we have failed to do so far is figure out how to exploit this massively parallel infrastructure to our benefit without turning the general-purpose computer into a special-purpose one, meaning we can indeed exploit this parallelism, but only for certain tasks.
Perhaps we are doing it all wrong. Perhaps we don't need "general purpose" registers. Perhaps we don't need stacks. We still do programming/CPU architecture (general purpose) as it was done in the early days (in the late fifties).
Restricting the set of instructions has its benefits: it simplifies the design and reduces size and power consumption, at some cost in performance. This might pay off for simpler execution streams.
I do think we are specializing CPUs too much: I don't think it makes much sense to have a CPU doing "complex" tasks like vector arithmetic, SIMD, and others. There might be a solution for this: not multi-core, but multiple specialized execution units that can be somewhat independent of each other. This would require a new programming model though.