Comments
TarraTarra!
User Rank
CEO
ASSP not Server SoC?
TarraTarra!   6/3/2014 11:45:47 AM
NO RATINGS
The interesting thing here is the sheer variety of Thunder options optimized for different workloads. Cavium seems to have pulled this out of their experience with Embedded parts. What I am confused about after reading this is how this will apply to servers.

Datacenter operators buy bulk servers for their fleet. They typically do not know a priori what workloads will run on them. Now they would have to choose the type of server for each workload? One thing that Intel got right was to simplify the product offerings. Their problem was cost and power.

Cavium seems to be overthinking the problem here? 

rick merritt
User Rank
Author
Re: ASSP not Server SoC?
rick merritt   6/3/2014 12:53:10 PM
NO RATINGS
@Tarra! Tarra!  I may not have been clear about this. There are four families of products under the Thunder brand. One is specifically targeted at servers. Others target storage and security appliances and networking.

TarraTarra!
User Rank
CEO
Re: ASSP not Server SoC?
TarraTarra!   6/3/2014 5:45:41 PM
NO RATINGS
Rick,

"There are four families of products under the Thunder brand. One is specifically targeted at servers. Others target storage and security appliances and networking."

This is what does not make sense. Volume servers in the datacenter run all those applications today. E.g., the same server could run MapReduce (Hadoop) and also run generic web-tier applications. It appears that Cavium is proposing to fragment the datacenter, have the operators choose between its different offerings, and convert volume servers into appliances that can only run specific applications? That is a tall order.

servernut
User Rank
Rookie
48 core power would be very high
servernut   6/3/2014 12:13:36 PM
NO RATINGS
Rick,

Any indication from Cavium on what the power of the 48-core device would be? The article describes the cores as out-of-order, but Cavium has so far stayed with simple in-order designs for its CPUs. Is that a typo? If the core is out-of-order, then a 48-core Thunder would be over 150W! How will it then compete with Intel?

rick merritt
User Rank
Author
Re: 48 core power would be very high
rick merritt   6/3/2014 12:51:24 PM
NO RATINGS
@Servernut: Going back to my notes I see Cavium left itself some quibble room, saying its core "supports optimized OOO."

Re power, as reported they said the products range from 20-95W including the Ethernet ports, so well below Xeon.

servernut
User Rank
Rookie
Re: 48 core power would be very high
servernut   6/3/2014 2:37:58 PM
NO RATINGS
"Re power, as reported they said the products range from 20-95W including the Ethernet ports, so well below Xeon."

Really? Xeon E3 power is in the 40W range, and E5 is in the 60W range. Even if you factor in the additional components, how is it lower than Xeon?

On the OOO, yes, that is a very wide quibble room :) Octeon has in-order multi-issue, and going to OOO is not that easy. I would imagine that is a convenient error on their part.

rick merritt
User Rank
Author
Re: 48 core power would be very high
rick merritt   6/4/2014 12:40:16 AM
NO RATINGS
@Servernut and/or Hank: Can one of you briefly explain what the Octeon OOO-like scheme is and how it differs in performance from full OOO?

Perhaps B'com's promise of a full OOO quad-issue FinFET-based processor will be significantly more powerful...AMD's planned K-12, too. But those are likely 2016 chips.

servernut
User Rank
Rookie
Re: 48 core power would be very high
servernut   6/4/2014 2:23:07 AM
NO RATINGS
@Rick. Cavium's Octeon designs are not out-of-order but in-order. Thunder is likely to be the same. All other server CPUs - Xeons, Opterons, even X-Gene are fully out-of-order machines.

Actually X-Gene from AppliedMicro is completely missing from your post. They showed a mini-datacenter running at Computex this week. Any reason you are not covering them? They seem to be shipping already. From the specs they seem to have everything that Thunder is claiming and to be a few years ahead.

What are your thoughts on X-Gene?

rick merritt
User Rank
Author
Re: 48 core power would be very high
rick merritt   6/4/2014 6:14:34 AM
NO RATINGS
@Servernut: Applied definitely got out there early. I have written 3-4 stories about them so far. I am not at Computex, so would love to hear the latest. For a while they have been in Cavium's spot: we have been waiting for them to ship and report performance specs. Anyone have an update on that?

GSMD
User Rank
Manager
Re: 48 core power would be very high
GSMD   6/4/2014 3:16:31 AM
NO RATINGS
I think this has come up a few times in this forum but I guess it is worth reiterating some of the conclusions we reached over the past 2 years.

1. This is a key point. Intel's main challenge is its high margin model, especially in the server segment. It is unrealistic to believe that any competing vendor can beat it on the semicon process or in systems architecture. Its x86 inst set does cause an overhead but that is primarily in the decode stage.

2. Wrt OO architecture, Intel, Power8 and the Netlogic 4-issue pipelines are some of the best when it comes to ILP. The FS QorIQ is equally good but it is dual issue. Cavium typically went for in-order cores and it is not clear how aggressively out of order these cores are. There is nothing inherent in the ARM ISA that will prevent Cavium from matching Intel in OO performance; it is just a question of what they wanted out of this design.

3. In server grade parts, the interconnect and the cache architecture are key and I have a tough time believing Cavium can match Intel, AMD or IBM in this regard within one generation. Nothing magical in getting to where Intel is today but it needs work over 3-4 generations of parts and lots and lots of real life usage data. Intel's ring bus and QPI are pretty optimized.

4. ARM does NOT have a power advantage over Intel. If the SoC config is the same (process, cache size, OO width etc) you will find parts built using either ISA will have TDP numbers in the same ballpark.

5. We do have perf numbers from one Freescale part. The 12-core T4240 matches an Ivy Bridge 6-core Core i7 part while running at half the TDP and half the freq. This is just the result from one benchmark, CoreMark, so take it for what it is worth. But this is not an apples-to-apples comparison since the cache sizes are vastly different, pipeline depth is half and I/O mix is different. FS chose a short pipeline since, for networking/comm. applications, frequent pipeline flushes are less expensive with shorter pipelines.

But we are doing a detailed study using enterprise benchmarks in our lab. So stay tuned! We may find that FS's design choices make sense in an enterprise env. too.

rick merritt
User Rank
Author
Re: 48 core power would be very high
rick merritt   6/4/2014 6:19:42 AM
NO RATINGS
@GSMD: Good points! I'd love to hear results from your lab tests when they are done.

Wilco1
User Rank
CEO
Re: 48 core power would be very high
Wilco1   6/4/2014 3:51:15 PM
NO RATINGS
GSMD, a few comments on your points:

1. It's a myth that ISA overhead is just in decode. There are many aspects of an ISA that affect the overall microarchitecture. Just to mention one example, x86 requires more load/store units due to having fewer registers and load+op instructions. x86 also uses a more complex memory ordering model.

2. Given they designed their own CPU it seems likely Cavium are aiming for better than Cortex-A57 performance, as otherwise they could have just licensed that (the same argument applies to X-Gene). A 3-way in-order is not completely implausible, but to get decent throughput it would need to be at least 2-way and ideally 4-way multithreaded.

4. If all else is equal, an identically performing x86 would use more power than ARM due to its more complex ISA. So the x86 ISA really is LESS efficient. Of course different processes, microarchitectures etc can mitigate this difference.

In any case there is no doubt a dedicated CPU can outperform a generic Xeon despite having a process disadvantage (as you say in point 5). Beating Xeon on single-threaded performance is much harder of course, but that is not something Cavium or X-Gene are attempting (at least with their current line-up). For many tasks, using more, slower cores is actually far more energy efficient.

 

GSMD
User Rank
Manager
Re: 48 core power would be very high
GSMD   6/4/2014 10:12:50 PM
NO RATINGS
1. The power overhead is only in the decoder and related support functions. The others are a wash: denser encoding helps, and the smaller number of registers does make some muxes simpler. I am not theorizing here. I run a large processor design group designing server and HPC grade processors, so these are issues we analyse in great detail. My colleague, a professor in fact, has an x86-compatible design under his belt. But don't take my word for it. The HPCA 2013 paper goes into it in more detail.

http://research.cs.wisc.edu/vertical/papers/2013/hpca13-isa-power-struggles.pdf

Bottom line, differences are negligible and the micro-arch is what matters. There is really NOT a great x86 ISA penalty.

But having said that, I am not advocating using an x86-like ISA. That path is a one-way ticket to an asylum for any CPU designer! Intel, using an incredible amount of resources, has managed to more or less eliminate the burden of the x86 ISA. CISC vs RISC is an entirely different issue. PowerPC is a better exemplar of CISC done right and a PPC vs ARM comparison (or RISC-V preferably) is a better technical debate.

2. If Cavium wants to address the server market they had better get single-threaded perf. right. Oracle had to create the M class CPUs to compensate for the T class's bad single-threaded performance. Again I am commenting on what the market wants. Single-threaded perf. is needed because most of the world's programmers cannot write multi-threaded code even if their lives depended on it.

On a purely technical basis the SPARC T-series approach is the best way to go (Intel HPC parts are also taking this approach). Back in my Sybase days, there would be 1k+ threads per core and we could not get enough cores or HW thread support. I presume it is still the same. But other RDBMSs do not leverage threading so well.

FS's approach in the T4240 shows that a balance can be reached with wide OO and lower power large core counts. Our testing will reveal how far this is true. They have also taken an AMD-like approach where threads are almost independent cores with dedicated execution and related resources.

GSMD
User Rank
Manager
Re: 48 core power would be very high
GSMD   6/5/2014 1:53:34 AM
NO RATINGS
Actually reality also intrudes when it comes to these power figures. When we did some power analysis testing on a 3.4 GHz Core i7 (Ivy Bridge) and hit max load on all 4 cores, thermal management got into the act and reduced the freq. to below 2 GHz. The key is that it seems to stay there for a long time!

I think the team is publishing these results.

Granted this is not an ISA issue but real-world performance is always different from the claims. So in this case, you may as well go with a 2-2.2 GHz processor if you want to have all cores running at max freq. IB EP is now at 15 cores, and presumably a Broadwell-based design will hit 24+ cores. At these core counts, we are probably looking at 2-2.5 GHz speeds. Which implies ARM and x86 cores in the server market will run at similar clock speeds. Makes comparisons easier I guess.

Our server designs are also aimed at 2-2.5 GHz with large core counts. It does not seem to make sense going above this, and besides, our arch. capability is excellent but our physical design capability is not that great. But then we are an academic/research entity!

Wilco1
User Rank
CEO
Re: 48 core power would be very high
Wilco1   6/5/2014 8:25:39 AM
NO RATINGS
Well it's obvious you've never looked in detail at the complexity of the x86 ISA. The overheads of x86 affect the whole microarchitecture. With an identical microarchitecture x86 would end up slower (and thus less power efficient). For x86 to achieve the same performance as a RISC, it needs a far more complex microarchitecture, increasing die size and power. You can compare die sizes for various ARM and x86 CPUs here: http://chip-architect.com/news/2013_core_sizes_768.jpg

The claim that x86 has a dense encoding is yet another myth. In fact the complex encoding means that x86 binaries are typically a little larger than ARM binaries, and significantly larger than Thumb-2. x64 is usually 15% larger than x86.

Yes I've read that paper and discussed it in detail on RWT. It is a badly written paper with most of the conclusions not supported by evidence. If you choose to compare wildly different and relatively ancient CPUs, an old compiler and completely ignore the memory system then of course the only possible conclusion is that microarchitecture matters the most! But that's only true if you make wild extrapolations and ignore or handwave at all other aspects. Let's hope this paper was a one-off mistake and doesn't reflect on the quality of papers coming from this university.

Note PPC is certainly not CISC. Neither is ARM or Thumb. PPC vs ARM is less interesting as their ISA features are nearly identical (not that there aren't differences but the differences tend to be insignificant details).

TarraTarra!
User Rank
CEO
Re: 48 core power would be very high
TarraTarra!   6/5/2014 11:17:14 AM
NO RATINGS
@Wilco1, @GSMD,

The debate on ISAs is interesting. I have designed x86 CPUs and other ISAs as well. It is a fact that x86 is inherently more complex than MIPS or ARM or PowerPC to varying degrees. There is certainly the CISC instruction decode penalty but there are other complex mechanisms that have been built into x86 over generations which still need to be supported by the latest x86 processors. All of these mechanisms cost die size and/or complexity. Almost every x86 CPU implementation has a built-in microcode engine. This is like a programmable engine within the CPU to handle these complex tasks. Intel has continued to stress floating point performance, and each generation adds instructions, adding transistors to the design.

So why is this relevant? This "overhead" becomes smaller in very high performance implementations - out-of-order, multi-threaded, large cache designs. Here the overhead can be amortized over the performance gains of a complex CPU. This is why Intel has competed well at the very high end compute but failed in low power efficient designs that are required for mobile.

In these less complex implementations where the CPU has fewer transistors, this overhead starts to make a difference. This is why the mobile processors from Intel and even the Atom cores have not competed so well.

 

servernut
User Rank
Rookie
Re: 48 core power would be very high
servernut   6/3/2014 2:55:17 PM
NO RATINGS
"@Servernut: Going back to my notes I see Cavium left itself some quibble room, saying its core 'supports optimized OOO.'"

Btw, Rick, what other points in your article are inaccurate "quibble room" from Cavium that you are merely repeating? I have been reading your articles for some time and you are usually good about sniffing out marketing FUD.

rick merritt
User Rank
Author
Reviving hopes for ARM servers
rick merritt   6/3/2014 12:54:22 PM
NO RATINGS
After the implosion of Calxeda and the rumors about Samsung going off an ARM server project, this initiative just got a badly needed shot of espresso.

But like Linley said, I suspect the real market here starts in 2016, about the time B'com rolls out its part.

HankWalker
User Rank
Manager
We Need Data
HankWalker   6/3/2014 1:11:57 PM
NO RATINGS
Until we have throughput/$, throughput/W and similar metrics, we cannot make any useful comparison between product offerings. All we have right now is marketing and PowerPoint.
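
For anyone who wants to tabulate these metrics once real numbers exist, here is a minimal sketch. Every figure and part name in it is a hypothetical placeholder, not a measurement of any Cavium, Intel, or other part; the point is only that throughput/W and throughput/$ can rank the same parts differently.

```python
from dataclasses import dataclass


@dataclass
class ServerPart:
    """One candidate part with benchmark figures (made up here)."""
    name: str
    throughput: float  # benchmark score, e.g. requests/s (hypothetical)
    watts: float       # power drawn under that load (hypothetical)
    price_usd: float   # unit price (hypothetical)

    def throughput_per_watt(self) -> float:
        return self.throughput / self.watts

    def throughput_per_dollar(self) -> float:
        return self.throughput / self.price_usd


def rank_by(parts, metric):
    """Return part names sorted best-first by the given metric function."""
    return [p.name for p in sorted(parts, key=metric, reverse=True)]


# Placeholder numbers only; substitute real benchmark data when it exists.
parts = [
    ServerPart("part_a", throughput=10_000.0, watts=100.0, price_usd=2_000.0),
    ServerPart("part_b", throughput=6_000.0, watts=40.0, price_usd=1_500.0),
]

by_watt = rank_by(parts, ServerPart.throughput_per_watt)
by_dollar = rank_by(parts, ServerPart.throughput_per_dollar)
```

With these placeholder inputs the two rankings disagree, which is why a single headline number is not enough to compare offerings.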

servernut
User Rank
Rookie
Re: We Need Data
servernut   6/3/2014 2:51:58 PM
NO RATINGS
@Hank, Agree. This is marketing FUD at its best. The silicon is nowhere in sight and will be in production in 2016. It is very easy to make claims with PowerPoint 2 years out.

krisi
User Rank
CEO
Re: We Need Data
krisi   6/3/2014 11:02:18 PM
NO RATINGS
Well, if you have no hardware to show, you send your marketing guys with PowerPoint to fight ;-)

rick merritt
User Rank
Author
Re: We Need Data
rick merritt   6/4/2014 12:27:33 AM
NO RATINGS
@Hank and others: Good points. We need hard performance numbers. And with no first silicon back yet there could still be significant issues getting a product out.

tb100
User Rank
CEO
MIPS and Arm
tb100   6/3/2014 2:08:50 PM
NO RATINGS
It may be that MIPS and Arm are geared for different markets, but given that Cavium is making basically the same computer chip with MIPS cores and with Arm cores, it would be interesting to see benchmark results from the two processors.

It would be a real apples-to-apples comparison. Which core is faster?

rick merritt
User Rank
Author
Software for ARM servers
rick merritt   6/4/2014 12:29:23 AM
NO RATINGS
Thx all for your good points on the chips.

Now I'd like to hear some reality about where we are at and need to be at in server software for ARM if this broader initiative, of which Cavium is just one part, is going to get traction. Details, please!

servernut
User Rank
Rookie
Re: Software for ARM servers
servernut   6/4/2014 2:33:31 AM
NO RATINGS
"Now I'd like to hear some reality about where we are at and need to be at in server software for ARM"

 

Rick,

Here is some information on X-Gene's recent demo with the software stack.

http://www.informationweek.com/cloud/infrastructure-as-a-service/energy-sipping-arm-chips-made-for-cloud/d/d-id/1269308

servernut
User Rank
Rookie
CPU freq does not mean performance
servernut   6/6/2014 2:30:27 PM
NO RATINGS
I was amused to see the Cavium CEO quote GHz/socket and use that to claim performance superiority over Intel. Does he really think the analysts and reporters would be fooled by that?

Oh wait, I just remembered this article and Linley's quotes. I guess he succeeded. What does that say about the competence of these so-called analysts?
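
A rough sketch of why GHz/socket is not a performance figure: peak instruction rate is roughly cores x clock x instructions per cycle (IPC), and GHz/socket drops the IPC term. The core counts, clocks, and IPC values below are invented for illustration and describe no real part; the model also ignores memory stalls and scaling losses.

```python
def aggregate_ghz(cores: int, ghz: float) -> float:
    """The 'GHz per socket' figure: core count times clock speed."""
    return cores * ghz


def estimated_throughput(cores: int, ghz: float, ipc: float) -> float:
    """Rough peak instruction rate in billions/s: cores * clock * IPC.

    Ignores memory stalls, scaling losses, and workload effects.
    """
    return cores * ghz * ipc


# Hypothetical parts; the IPC values are assumptions, not measurements.
simple_ghz = aggregate_ghz(cores=48, ghz=2.5)
wide_ghz = aggregate_ghz(cores=12, ghz=3.0)

simple_rate = estimated_throughput(cores=48, ghz=2.5, ipc=0.8)
wide_rate = estimated_throughput(cores=12, ghz=3.0, ipc=3.0)
```

Under these assumed inputs the 48-core part has over three times the GHz/socket yet a lower estimated instruction rate, so the ranking flips once IPC is included.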

 


