Comments
Internal disagreement?
resistion   2/5/2014 11:18:29 AM
"Micron's process technology experts have expressed 'wild disagreement' about when a DRAM replacement will be needed. 'The earliest points to 2015, and the latest points to far enough out you could call it never.'"

Seems inside Micron there are those who want DRAM forever, those who want MRAM, those who want PCM, those who want RRAM, those who want Flash...

Good for R&D to thrive, but bad for immediate product development.

Re: HMC's DRAM Controller
DougInRB   2/4/2014 1:08:05 PM
"In real life systems arch, I think every system deserves its own dedicated architecture."

As an engineer, I'd love to do it right from the bottom-up.  The reality is that drastic changes aren't possible.  Look at how long it took us to get multi-threaded CPUs fully supported.  First, the CPU guys had to implement it.  It took a long time after that before the compiler, OS, and application folks figured out how to take advantage of it.  This is one reason that the transputer never really got out of academia - nobody knew how to program it.  Maybe now with GPGPU architectures being embraced by the HPC folks, the time of the transputer has come - provided that somebody takes the time to generate a robust library of commonly used functions.

"But I would prefer to junk PCIe which I frankly think is an abomination as an interconnect!"

Junking PCIe has the same problem as I cited above - it is everywhere, and people know how to use it.  Having said that, I would love it if I didn't have to pay certain IP vendors a small fortune to use their PCIe cores.

Re: HMC's DRAM Controller
DougInRB   2/4/2014 11:14:07 AM
Hmmm.  Now you have me thinking about this with a new perspective.  First of all, FPGA-based systems can definitely take advantage of this.  I've designed a DDR interface for an FPGA, and it is not only a pain in the butt, it also wastes the bandwidth capability of the DRAM.  With the HMC, very few pins are needed and the latency is not a problem.  Fan-out to logic that can inhale the data at full bandwidth could be a problem, but it is easily solved with wide internal buses.  Then the memory can be shared amongst all of the hardware accelerators and embedded processors...
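To put a rough number on "wide internal buses", here is a minimal sketch of the arithmetic. The 120 GB/s and 160 GB/s figures come from the thread title further down; the 250 MHz fabric clock and the function name `min_bus_width_bits` are assumptions made up for illustration, not anything from a vendor datasheet.

```python
import math


def min_bus_width_bits(bandwidth_gbytes_per_s: float, fabric_clock_mhz: float) -> int:
    """Smallest bus width (in bits) that sustains the bandwidth at one transfer per clock."""
    bits_per_second = bandwidth_gbytes_per_s * 1e9 * 8
    transfers_per_second = fabric_clock_mhz * 1e6
    return math.ceil(bits_per_second / transfers_per_second)


# Assumed fabric clock of 250 MHz; bandwidth figures taken from the thread title.
for gbytes_per_s in (120, 160):
    width = min_bus_width_bits(gbytes_per_s, 250)
    print(f"{gbytes_per_s} GB/s at 250 MHz needs a bus at least {width} bits wide")
```

Under those assumed numbers the bus ends up thousands of bits wide, which is routine inside an FPGA fabric but would be hopeless as package pins.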

Hello Xilinx and Altera - can you please build me a big FPGA in a smaller package?  With PCIe and HMC, I don't need all of those pins!

My other thought about an application of the HMC is for an array of small, low-power, lower-frequency processors (remember the transputer?).  When scaled out, this could provide a lot more compute power per square inch than the monster heater CPUs we use today.

OK - maybe I'm not as skeptical now.  Even though it is still a bad fit for conventional CPUs, it might be a good fit for compute intensive workloads that can be parallelized.

I still think that a comm application with built-in packet inspection/routing/etc. would be a great place to start.  The array of lightweight processors or FPGAs might even be the right infrastructure for this.

Re: HMC-CPU connection
resistion   2/4/2014 3:26:17 AM
So no takers for HMC on CPU? The DRAM-CPU communication was supposed to be the main beneficiary of going to TSV technology.

Re: HMC-CPU connection
DougInRB   2/3/2014 7:18:46 PM
Even if the memory cube is directly attached to the CPU (which is a very bad idea from a manufacturing yield perspective), the latency will be higher.  To access a DRAM, you need to provide the row and column addresses and a few nanoseconds later a cache line is available.  To use a serial interface, you need to create a command packet that says "read starting at this address and give me so many bytes".  That command packet then needs to be serialized and sent to the memory cube controller, where it has to be de-serialized and interpreted.  If the command is not for that memory cube, it has to be passed along the chain to another cube.  If it IS for that memory cube, the DRAM has to be read (same row/column read cycle, but at a higher frequency).  The data needs to be read into a buffer, then a response packet needs to be generated, serialized, and finally sent to the CPU.  Whichever CPU thread was trying to do the read has had to twiddle its proverbial thumbs this whole time while waiting for a cache fill to complete.  This takes a few nanoseconds with DDR and will take tens or hundreds of nanoseconds with a memory cube.
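The chain of steps above can be sketched as a simple additive latency model. This is only an illustration: every timing value below is a made-up placeholder, the names `LinkTimings` and `packetized_read_latency_ns` are hypothetical, and the assumption that the response travels back through the same chain of cubes is mine rather than something stated here or taken from the HMC specification.

```python
from dataclasses import dataclass


@dataclass
class LinkTimings:
    """Per-stage costs in nanoseconds. All defaults are illustrative placeholders."""
    build_packet: float = 2.0   # form a command or response packet
    serialize: float = 5.0      # push the packet onto the serial link
    deserialize: float = 5.0    # recover the packet at the far end
    decode: float = 2.0         # interpret the command in the cube controller
    dram_access: float = 10.0   # row/column read inside the cube
    forward_hop: float = 10.0   # pass a packet along to the next cube in the chain


def packetized_read_latency_ns(t: LinkTimings, hops: int = 0) -> float:
    """Total read latency when the target cube is `hops` cubes down the chain."""
    request = t.build_packet + t.serialize + t.deserialize + t.decode
    response = t.build_packet + t.serialize + t.deserialize
    forwarding = 2 * hops * t.forward_hop  # assumed: request out, response back
    return request + forwarding + t.dram_access + response


timings = LinkTimings()
for hops in (0, 1, 3):
    print(f"{hops} hop(s): {packetized_read_latency_ns(timings, hops)} ns")
```

Whatever the real per-stage numbers turn out to be, the shape of the model is the point: the DRAM access is one term among many, and each extra cube in the chain adds to the total twice.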

That should drag just about any high performance CPU to its knees.  If the idea is good enough, the CPU makers might be willing to reinvent the whole multi-thread, cache, and memory management infrastructure, but I kind of doubt it :-).

Like I hinted in my earlier post, this may make a great main memory as long as there is a very large low latency RAM between it and the CPU (4th level cache) - and the cache hit rate of the 4th level cache is VERY high... 
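How high "VERY high" needs to be can be estimated with the standard average-access-time formula: cache latency plus miss rate times the latency of the memory behind it. The sketch below assumes a 10 ns 4th level cache and a 100 ns memory cube purely for illustration; neither number is a measurement, and the function name is hypothetical.

```python
def average_access_time_ns(hit_rate: float,
                           cache_latency_ns: float,
                           backing_latency_ns: float) -> float:
    """Average latency seen by the CPU: every access pays the cache lookup,
    and misses additionally pay the backing-memory latency."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return cache_latency_ns + (1.0 - hit_rate) * backing_latency_ns


# Assumed latencies: 10 ns for the 4th level cache, 100 ns for the memory cube.
for hit_rate in (0.80, 0.95, 0.99):
    avg = average_access_time_ns(hit_rate, 10.0, 100.0)
    print(f"hit rate {hit_rate:.0%}: about {avg:.1f} ns average")
```

With these placeholder values, the average only approaches the cache's own latency once the hit rate is well into the high nineties.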

Re: HMC-CPU connection
resistion   2/3/2014 6:26:03 PM
Good point. I guess it's supposed to sit on top of the CPU with a TSV connection. This would also require CPU maker buy-in.

spin-transfer torque RAM and phase-change
krisi   2/3/2014 5:55:21 PM
Any estimates of when spin-transfer torque RAM and phase-change memory might happen?

rick merritt
User Rank
Author
Re: 120GB/s ~ 160GB/s? I'm all for it...
rick merritt   2/3/2014 3:15:59 PM
@GSMD: Thanks for the good perspective! How about the question of latency being higher than DDR?

What is the latency vs DDR? How do you handle that?

Re: HMC's DRAM Controller
DougInRB   2/3/2014 1:36:11 PM
It seems that everyone is ignoring the fact that the memory cube will have significantly higher latency than DDR-4.  A read-modify-write (RMW) will stall the CPU for eons.  This means that it cannot be used by a CPU as the main memory attached to the cache.  It essentially brings in a new tier to the memory hierarchy.  It seems like a great idea that will bring much higher overall memory bandwidth, but the critical latency to the CPU is not solved.

Maybe the local DRAM will become a 4th level cache.  Maybe someday the DRAM will be displaced by MRAM.  In any case, I cannot see the DDR interface being simply replaced with a bunch of serial links.

It seems like the first niche for the memory cube would be in comm, where latency is not as big a deal and throughput is king...  You could make an amazing switch with such a device.

 

Re: 120GB/s ~ 160GB/s? I'm all for it...
rcsiv   2/3/2014 12:33:30 PM
GSMD, I would like to follow up. What is your email address, or is there another way to get hold of you?
