Why would you want a DRAM controller in a processor? The DRAM controller handles things like timings for the different row selects, bank access, refresh, etc. You want these as close to the DRAM chip as possible.
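To make "timings for row selects, bank access and refresh" a bit more concrete, here is a toy sketch of the bookkeeping a controller does per bank. Everything in it is an assumption for illustration: the names `BankTimings` and `ToyController` are made up, the cycle counts are placeholders rather than values from any datasheet, and real controllers track many more constraints.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BankTimings:
    # Placeholder cycle counts, chosen only for illustration.
    t_rcd: int = 14  # row select (activate) to column command
    t_cl: int = 14   # column read command to first data
    t_rp: int = 14   # precharge (close row) before the next activate


class ToyController:
    """Tracks which row is open in each bank and charges latency accordingly."""

    def __init__(self, num_banks: int = 8, timings: Optional[BankTimings] = None):
        self.t = timings or BankTimings()
        # None means the bank is precharged (no row open).
        self.open_rows: List[Optional[int]] = [None] * num_banks

    def read(self, bank: int, row: int) -> int:
        """Return the read latency in cycles for an access to (bank, row)."""
        open_row = self.open_rows[bank]
        if open_row == row:
            # Row hit: the row is already open, only the column access is paid.
            return self.t.t_cl
        latency = self.t.t_rcd + self.t.t_cl
        if open_row is not None:
            # Row conflict: the other row must be closed first.
            latency += self.t.t_rp
        self.open_rows[bank] = row
        return latency

    def refresh(self) -> None:
        """In this toy model, a refresh simply leaves every bank precharged."""
        self.open_rows = [None] * len(self.open_rows)


ctrl = ToyController()
first = ctrl.read(0, 5)      # bank idle: activate + column read
hit = ctrl.read(0, 5)        # same row again: column read only
conflict = ctrl.read(0, 7)   # different row: precharge + activate + column read
```

The processor side only needs to ask for an address; all of the state above belongs next to the banks, which is the argument for keeping the controller close to the DRAM.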
Memory controllers were integrated into processors not because that is the best place for them, but to avoid the extra chip, buses and latency of having the controller on a separate device between the processor and the memory (where it also shared bus bandwidth with PCI and other buses).
The controller is not in the DRAM chips today because the chip process needed for controller logic is very different from the one needed for DRAM cells, so it would be hugely expensive to put both on the same die.
If you put the memory controller inside the DRAM cube, then the bus between the processor and the memory can be simpler and faster, and the memory controller can be better tuned to the DRAM banks it is controlling (including wider bus access and local cache inside the cube).
The potential here is not just to increase bandwidth, but also to lower latency - especially in servers with large memories and ECC.
Agreed, stacking DRAMs has advantages, but I can't understand the rest. The stack is tightly coupled with the memory controller, so there are power and latency benefits. But it seems to force the processor away from the memory controller. Many processors have integrated DRAM controllers, so the stacking scheme seems to negate their advantage.