Cache sizes are largely driven by the working set of the apps that are running, and those are a mix of the persistent apps (including the OS) and the temporary needs of the transients, including all the "wake me up to check if there's a new version to download" code.
A faster processor is done with the transients sooner, so the contribution to the average working set from the transients is smaller on a Mill. The contribution from the persistent code is relatively fixed, so the total working set demand is not reduced that much.
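
To make the averaging argument concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption made up for illustration: the function name, the sizes in KB, the wake-up period, and the idea that a transient's footprint counts only while it is running. It is a toy model, not a measurement of any real system.

```python
def average_working_set(persistent_kb, transient_kb, transient_work, period, speed):
    """Time-averaged working set over one wake-up period (toy model).

    persistent_kb  : footprint that is always resident (OS, long-running apps)
    transient_kb   : footprint of a transient while it runs
    transient_work : work the transient does per period, in arbitrary units
    period         : length of one wake-up period, in the same time units
    speed          : relative processor speed (1.0 = baseline)
    """
    # A faster processor finishes the transient sooner, so its footprint
    # is live for a smaller fraction of the period.
    busy_time = min(transient_work / speed, period)
    duty_cycle = busy_time / period
    return persistent_kb + transient_kb * duty_cycle


# Hypothetical numbers: 4 MB persistent, 2 MB transient, busy 10% of the time.
baseline = average_working_set(4096, 2048, 10.0, 100.0, speed=1.0)
faster = average_working_set(4096, 2048, 10.0, 100.0, speed=2.0)
print(baseline, faster)
```

With these made-up numbers, doubling the speed halves the transient share of the average, but the total drops by only a little over 2% because the persistent part dominates.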
The Mill does reduce the demand for DRAM bandwidth. The reduction varies by application, but 25% is a reasonable rule of thumb - see ootbcomp.com/docs/memory for an explanation. You could reduce the cache sizes, leading to more churn in the cache (and more bandwidth demand), until you were back at the bandwidth of a conventional architecture but with a smaller cache. Whether that design point would be worthwhile is very market-dependent.
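
For a feel of how far the cache could shrink before bandwidth is back to conventional levels, here is a hedged sketch. The 0.75 factor comes from the 25% rule of thumb above. The rest is assumption: it borrows the common "miss rate falls as a power of cache size" rule of thumb with an exponent of 0.5, and it assumes the 25% saving applies equally at every cache size. Neither comes from the Mill documentation, and real workloads can deviate a lot.

```python
def relative_bandwidth(cache_ratio, mill_factor=0.75, alpha=0.5):
    """DRAM bandwidth relative to a conventional design with a full-size cache.

    cache_ratio : cache size as a fraction of the conventional cache size
    mill_factor : bandwidth at equal cache size (0.75 = the 25% rule of thumb)
    alpha       : ASSUMED power-law exponent; miss traffic ~ size ** -alpha
    """
    return mill_factor * cache_ratio ** (-alpha)


def break_even_cache_ratio(mill_factor=0.75, alpha=0.5):
    """Cache size fraction at which bandwidth matches the conventional design.

    Solves mill_factor * ratio ** -alpha == 1 for ratio.
    """
    return mill_factor ** (1.0 / alpha)


ratio = break_even_cache_ratio()
print(ratio, relative_bandwidth(ratio))
```

Under these assumptions the break-even point is a cache a bit over half the conventional size. A different exponent moves that point considerably, which is part of why the design point depends so much on the target market and workload.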