So it is literally the 24 year old 486 with a 16KB shared I&D cache. In terms of clock cycles per instruction, yes the 486 is pretty slow compared to a modern RISC. Besides that, it is also single-issue while pretty much all Cortex-A cores can execute 2 instructions per cycle. And the 16KB shared I&D cache is not going to perform well compared with the 32KB I + 32KB D-cache of the Sitara (which also happens to have a 256KB L2).
For cycle timings see section 12.3 in the first link. Look at the multiply timings for example - up to 42 cycles for 32x32 multiply, and compare with the single-cycle multiplier in ARM cores. Shifts take 2-3 cycles while they are effectively free on most ARM cores. Floating point is not any better, with 10-16 cycles for fadd/fmul, while most ARMs do either in a single cycle. And back are all the AGU stalls, the penalties for complex addressing modes and LEA's, the various stalls for unaligned accesses, and our favorite: the 4-cycle fxch.
So yes while we don't have absolute figures, it is easy to understand that a 24 year old CISC CPU is going to be very slow compared to a modern RISC.
I am not saying that you are incorrect, but I am wondering where you are getting your comparisons from. The Quark X1000 has been listed as a using the pentium instruction set. This takes it out of the 486 range, and I would imagine that it would quite handily beat a 486 architecture if there had even been one that was clocked up to 400MHz. It was not until the PII that there was a 400MHz processor in the Intel line.
I am not saying that this chip is really going to perform well, but I have yet to see any information that says it will perform poorly. Even though quite a bit of information has been given, there is still a lot more information that is not yet out on the performance of this little device. Things like number of processing cycles for math operations or DMIPS/MIPS per MHz. Once that data starts flowing out, then we can make actual comparisons. Right now very few people even have a board in hand, let alone have the knowledge to do performance tests. I much prefer to make judgements on data rather than subjective information.
The reason that I included the M4F is that it is slightly better in clock speed than that uC and slightly less than the A8 series. It does have many more features of the A8 series than the M4F. The question that I have, though, is if Intel plans on going further down the line into M0 territory, or if they plan to go slightly higher.
The board can indeed run more than Arduino code. There was mention in the many supporting documents that it could be programmed in GCC. I have not gone digging to see if I could find the version that supports the Quark. I would like to see some of the initial setup in C code. It seems that just the initialization could be a bit of a pain.
I believe that this board runs Linux. This implies that the Quark X1000 soc has an MMU....So its not a 'true' microcontroller but a microprocessor. Realistically it shouldn't be compared to the Cortex-M4 MCU parts. A more relevant comparison could be made with the Cortex-A5/8/9 parts.
I hope that this board can do more than run Arduino code. The Arduino API while very simple ,effective and easy to use for beginners in the embedded space, does not provide enough flexibility for the more experienced embedded programmers/hobbyists. It would be a real shame and a waste of sophisticated hardware if the only thing that this board can do is run Arduino sketches
Price is quite interesting, and quite good.
Similar chips on market with similar prices :
Ti sitara, $5, 400mhz, cortex-a8,internal mcu for real time stuff, few hundred of kb sram, plenty of peripherals including good analog.
But you at right, this is a long game. My guess is that in a few years, we will see similar hardware capabilities(for many embedded systems, the requirements are not that hard to achieve using good processes) and the fight will be on software and support.
And in a sense that's the more interesting part of the quark.
One of the interesting things is that most of those devices that you listed off have become rather successful. I was just reading that it was in 2007 that Intel stopped with the 8096 and its variants. Around the same time they sold off their ARM products line.
In this case, though there are some different market conditions that may cause Intel to attempt to compete in this market. With ARM trying to challenge Intel in the mid level processors, and thereby challenging them in the upper level processors due to competition for different device markets, Intel has a reason to try and notch a few wins in ARMs core market (sorry for the bad pun). This may drive Intel to be more competitive in this market as they can then amortize their fabs over a longer period and still have a lead in this space.
And if the move isn't quickly profitable for Intel, what will happen? Intel has been in the embedded business more times than I can remember, starting with the 8048 and 8051, with stops along the way with the 8096, the 80960, StrongARM, and embedded x86.
With companies like Microchip, Freescale, TI, Atmel, and Renesas, you know they are committed to the MCU market, and will have good long term availability. Intel and AMD have the opposite track record.
I very much would compare it to Cortext-A processors; it's probably more expensive than many, and has similar requirements (external DRAM and flash, etc), and I doubt it has much speed advantage, especially if you can use OpenCL on an ARM SoC's GPU.
Most Cortex-M* chips are MCUs with embedded flash and SRAM, and are available in QFP, not just BGA packages.
You bring up some interesting points. I think that Intel has yet to really clarify what their intent is with this device. I think that it will yet be a few years before it can really evolve into what they want it to become. The other issue that may come about is some of the peripherals commonly found in embedded controllers perform better at the larger process nodes. I am not too familiar with the techniques that are required for designing the chips themselves, so we will have to wait and see what ends up evolving from this.
As to the target market, this too is also yet to evolve. In looking over the datasheet, I can tell you that the power configurations were enough to make my head spin. It is laid out very differently from any imbedded device that I have seen. The highest performance device that I have used is a STM32F407. I referenced these devices because these sell at the $5-10 range. From what I have read, this is the target price range for the Quarks. All that information is unsubstantiated, so do not quote me on it.
I don't agree that this device comeptes with the 65nm node. I think it competes with devices like the QCA4004(cpu+wifi on a chip,no flash, 40nm, quallcom) that are about to come out.
Other competitors are renesas with it's 40nm embedded flash process(which it want to license to other firms), and i think that there are other firms working on 40nm embedded flash, spansion among them(working togheter with UMC).
I believe they use 40nm because they are currently cheaper than 28nm. But maybe costs at intel are better? maybe they already paid for their 32nm fabs, and have got nothing better to do with them, so they can sell capacity cheaply?
Aero engineer, you also mention the quark made in 22nm: that is really interesting. One of the reasons many mcu's are done using 130nm-90nm is much higher(1000x) sleep current of newer processes. Intel's 22nm tri-gate process reduces sleep current by orders of magnitude, according to intel's claims, and should really help here. I wonder thought, isn't demonstrating this at 22nm would be much better sell ?