SAN FRANCISCO – Facebook could start running--at least in part--on so-called wimpy server CPU cores by the second half of 2013. Long term, the company wants to move to a systems architecture that lets it upgrade CPUs independent of memory and networking components, buying processors on a subscription model.
The social networking giant will not reveal whether it will use ARM, MIPS-like or Atom-based CPUs first. But it does plan to adopt so-called wimpy cores over time to take over some of the workloads currently run on more traditional brawny cores such as Intel Xeon server processors.
“It will be a journey [for the wimpy cores] starting with less intensive CPU jobs [such as storage and I/O processing] and migrating to more CPU-intensive jobs,” said Frank Frankovsky, the director of hardware design and supply chain at Facebook, in an interview with EE Times at the GigaOm Structure conference here. “I’m bullish on the whole category even though we will need multiple wimpy cores to replace one brawny core--the net performance per watt per dollar is good,” he said.
“We’re testing everything, and we don’t have any religion on processor architectures,” Frankovsky said.
Facebook published a white paper last year reporting on tests that showed the MIPS-like Tilera multicore processors provided a 24 percent benefit in performance per watt per dollar when running memcached jobs. Tilera is “the furthest along” of all the offerings, and they are “production ready today” with 64-bit support, he said.
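To make the performance-per-watt-per-dollar metric concrete, here is a minimal sketch in Python. The function names and every input figure are hypothetical, chosen only so that the result lands at the 24 percent level reported above; none of the numbers come from the white paper.

```python
def perf_per_watt_per_dollar(throughput, watts, dollars):
    """Return throughput normalized by both power draw and cost."""
    return throughput / (watts * dollars)


def relative_benefit(candidate, baseline):
    """Fractional improvement of candidate over baseline (0.24 means 24 percent)."""
    return candidate / baseline - 1.0


# Hypothetical figures for illustration only; not taken from the white paper.
# Throughput is in requests per second, power in watts, cost in dollars.
brawny = perf_per_watt_per_dollar(throughput=100_000, watts=200, dollars=4_000)
wimpy = perf_per_watt_per_dollar(throughput=62_000, watts=125, dollars=3_200)

benefit = relative_benefit(wimpy, brawny)
print(f"benefit: {benefit:.0%}")
```

In this made-up case the wimpy system delivers less raw throughput than the brawny one, yet still comes out ahead once power and cost are factored in, which is the trade-off Frankovsky describes.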
Frankovsky noted several ARM SoCs and alternatives from both Intel and AMD are also “in the hunt.” This week Hewlett-Packard said it will use Intel dual-core Atom chips named Centerton in the first incarnation of its low power server called Gemini.
Facebook launched its OpenCompute project in April 2011 to get server makers to pay more attention to the unique needs of large scale data centers. “Our computing needs are different from where the vast majority of product development effort is today--that’s why we need to pull this community together to get people to design for the future,” he said.
That is the whole point. They are redesigning the entire processor to be on a generic carrier, so that it can be easily swapped. On low power, high efficiency cores, the heat sink requirements are much lower, so the physical design is easier.
What's people's opinion on this new RAM cube option, given the new membership and the fact that they say it has even greater potential speed than Wide IO?
That seems simple enough when you make these racks of ARM-on-SODIMM carriers with 4 SoCs per SODIMM, and move the RAM from these to the main daughter server board.
You make sure to use Wide IO as per Samsung's ARM spec'd block.
The key is to keep as much generic kit as you can from the mobile space, upgraded to today's server space requirements, then slowly or even faster move all of today's dual/quad mobile and static SoC home devices into that same ARM on SODIMM + daughter board configuration...
Presto: Facebook or their future agents can sell these older ARM on SODIMM SOCK + daughter boards to their end users :) Everyone wins and makes a profit, as one example.
Surely the problem here is that if you upgrade a 1 core CPU card with a 2 or 4 core one, you will be expecting up to 4 times the memory bandwidth. It is unlikely you would have designed your system to have 4 times the necessary memory bandwidth to start with, so just updating the cores will be less than effective, as they will contend for memory resources. I presume that most of the work FB's servers are doing is very short transactions (e.g. when someone puts a short message on a page), so processors will constantly be having to get new data as they swap from serving one user to serving another. This implies huge disk and/or internal network bandwidth to fetch that user's page.
Unless of course they are planning to keep each of the 1 billion accounts alive in DRAM in their farm 24/7 - in which case I expect the memory manufacturers will be rubbing their hands with glee.
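The bandwidth concern in the comment above can be shown with a few lines of arithmetic. This is only a sketch: the 8 GB/s board figure is a hypothetical placeholder, and it assumes the cores share a fixed memory bandwidth equally.

```python
def bandwidth_per_core(total_gb_per_s, cores):
    """Memory bandwidth each core gets if a fixed total is shared equally."""
    return total_gb_per_s / cores


# Hypothetical board-level memory bandwidth; not a figure from the article.
BOARD_BANDWIDTH_GB_PER_S = 8.0

# Bandwidth share per core for a 1, 2 and 4 core CPU card on the same board.
shares = {
    cores: bandwidth_per_core(BOARD_BANDWIDTH_GB_PER_S, cores)
    for cores in (1, 2, 4)
}

for cores, share in shares.items():
    print(f"{cores} core(s): {share:.1f} GB/s per core")
```

Under these assumptions, a 4 core card leaves each core with a quarter of the bandwidth the original single core had, so the board would need 4 times the memory bandwidth to keep every core as well fed as before.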
A minor correction: Tilera's processor core has a "MIPS-like" instruction set, but it is not a MIPS licensed processor, so calling it MIPS-based is misleading.
Ironically, one of the first "wimpy core" server processors was the "Niagara" UltraSPARC-T1, but that line of servers has not gotten much attention of late.
Probably more about supporting upstream processors that can be pin compatible in a family.
Quad ARMs are not ready except from Nvidia.
Duals are abundant and will be priced accordingly.
If you can drop in a quad A9 in 18 months it will be cheap because everyone will have figured it out. It would be nice if the dual 15 with dual A9 came in a version that can drop into the same slot, even if it gives up some features of its bigger pin count brothers...and on and on. The CPU churn rate is much faster than the rack-server infrastructure evolves, so it makes sense to want to put a few releases of CPUs in the same infrastructure. Or rip it out each time you want to go to the next 10% improvement?
Why replace? Why not expand with the new agnostic topology (not just the CPU per se but the ecosystem that surrounds it as well)?
Is the wimpy CPU just about serving up YouTube and Facebook type web page feeds?
What's the difference in requirements between data streaming for a financial application, for example, and a YouTube video?