Personally, I think 32 bits is enough for any application I'd care to run. If you need more than a few GB of memory, IMO the application is being ridiculously wasteful. But YMMV -- if some people think it's fun to debug 16-digit hexadecimal numbers, I'll quote Mr. Knightly: "I have nothing to say against it, but that they shall not choose pleasures for me."
IMO 64 bits is for mainframe operating systems. For my personal use, I prefer something... well, more personal.
While I sympathize with the dislike of bloat, for traditional OSes like Linux it is convenient to have a virtual address space substantially larger than the physical address space. (I think Linus Torvalds said something like: beyond 1 or 2 GiB of RAM, a 32-bit virtual address space becomes inconvenient.)
One can also consider less traditional organizations like single-level storage or single address space OS. The concept of single-level storage might even be extendable to include resources on a network.
Address space layout randomization (a security technique) also works better with a sparse address space.
Furthermore, fat pointers can carry metadata. (AArch64 supports ignoring the top 8 bits of a pointer so that they can be used for metadata. I think Azul Systems Vega processors ignored 16 bits; if I recall correctly, this was used for a class ID to better support Java.) If a pointer is going to be 40 bits or larger anyway, extending it to 64 bits is not as big an issue.
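To make the metadata idea concrete, here is a minimal sketch of packing an 8-bit tag into the top byte of a 64-bit pointer value. It only models the bit arithmetic; the function names are made up for illustration, and the explicit strip_tag step is what software would do on hardware that does not ignore the top byte itself.

```python
# Sketch of top-byte pointer tagging on a 64-bit value.
# Python ints are unbounded, so the 64-bit layout is enforced by hand.
# All names here are hypothetical, chosen for illustration only.

TAG_SHIFT = 56                      # tag occupies bits 63..56
ADDR_MASK = (1 << TAG_SHIFT) - 1    # low 56 bits hold the address
TAG_MASK = 0xFF                     # 8-bit tag


def tag_pointer(addr: int, tag: int) -> int:
    """Pack an 8-bit tag into the top byte of a 64-bit pointer value."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in 56 bits")
    if tag & ~TAG_MASK:
        raise ValueError("tag does not fit in 8 bits")
    return (tag << TAG_SHIFT) | addr


def strip_tag(ptr: int) -> int:
    """Recover the plain address by clearing the top byte."""
    return ptr & ADDR_MASK


def get_tag(ptr: int) -> int:
    """Read the 8-bit metadata back out of the top byte."""
    return (ptr >> TAG_SHIFT) & TAG_MASK
```

With a 32-bit pointer there is no room for this kind of tag without giving up address bits, which is the point being made above.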
In addition, the added manufacturing and energy cost of a 64-bit processor is relatively small (when used with 32-bit pointers; 64-bit pointers can noticeably inflate memory use), especially in a phone or tablet. (Using the same ISA and even the same core type for tablets and phones has some advantages, and tablets are pushing the 32-bit boundary.)
ARM may well migrate to 64-bit architectures in smartphones soon, but to me 64 bits matter more for future applications of ARM, not so much for use in a smartphone.
You need more than 32 bits to address more than 4 Gbyte of RAM. This matters if a PC is running multiple simultaneous apps, and/or if the PC wants to keep all of its commonly used apps in RAM, for lightning-fast access. I'm not sure how important this is when using ARM in a smartphone role, unless that smartphone will eventually become the CPU of something more than just a smartphone device.
Anyway, it was practically a given that eventually all multipurpose processors were going to migrate to 64 bits. IMO.
Bert wrote: You need more than 32 bits to address more than 4 Gbyte of RAM. This matters if a PC is running multiple simultaneous apps, and/or if the PC wants to keep all of its commonly used apps in RAM, for lightning-fast access.
The newer ARM Cortex-A processors (A7, A12, A15, A17) all have a 40-bit physical address space, so the processor is able to address 1 TiB. So while you can "only" have a 32-bit address space for a single process, you can have many processes in physical memory simultaneously.
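For anyone who wants to check the arithmetic, a quick sketch (the constant and function names are just mine for illustration):

```python
# Back-of-the-envelope check of the 40-bit physical / 32-bit virtual numbers.
# Names are hypothetical; only the bit widths come from the post above.

PHYS_BITS = 40   # physical address width
VIRT_BITS = 32   # per-process virtual address width

GIB = 1 << 30
TIB = 1 << 40

phys_bytes = 1 << PHYS_BITS   # total addressable physical memory
virt_bytes = 1 << VIRT_BITS   # address space visible to one process


def max_full_address_spaces(phys_bits: int, virt_bits: int) -> int:
    """How many completely populated virtual address spaces fit in
    physical memory, ignoring kernel, I/O mappings, and sharing."""
    return 1 << (phys_bits - virt_bits)


print(phys_bytes // TIB, "TiB physical")
print(virt_bytes // GIB, "GiB per process")
print(max_full_address_spaces(PHYS_BITS, VIRT_BITS), "full 4 GiB processes")
```

That is an upper bound only, since part of the physical map goes to memory-mapped I/O and real processes rarely fill their whole address space.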
Personally, I like this better than having a 64-bit architecture. OTOH, I haven't looked at the ARMv8 instruction coding. It may be that the 64-bit instruction set does a much cleaner job of encoding instructions than the 32-bit ARMv7 architecture, which evolved from a RISC machine into rococo.
There isn't a huge difference between the ARM64 and ARM or Thumb-2 ISAs. They all have similar RISC instructions with encodings that are easy to decode (either a fixed 32-bit size or a 16/32-bit mix). ARM64 removed a few of the more complex instructions but added new ones as well. In terms of "viability" they are all viable, and it is not a burden for ARMv8 CPUs to support ARM64, ARM, and Thumb-2 efficiently.
Of course this support is necessary as existing apps will remain 32-bit for the foreseeable future even when they run on a 64-bit OS.
It's a 40-bit physical address, which addresses DRAM and memory-mapped I/O. Virtual addresses within processes are still 32-bit. Operating systems typically allocate memory to processes in 1-4 KB pages, and the memory mapping hardware usually has a minimum page size of 1 KB or larger. This means the 40-bit physical byte address is really a 28-30 bit page address plus 12-10 bits to address a byte within a page. Those low 12-10 bits are not remapped. The OS only needs to deal with the page addresses, which fit into 32-bit variables.
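The page/offset split described above can be sketched in a few lines. This is only the bit arithmetic, not how any particular kernel stores its page tables, and the function names are hypothetical.

```python
# Split a 40-bit physical byte address into a page frame number and an
# offset within the page. page_shift=12 means 4 KB pages, 10 means 1 KB.
# Function names are made up for illustration.

PHYS_BITS = 40


def split_phys_addr(addr: int, page_shift: int = 12) -> tuple[int, int]:
    """Return (frame_number, offset) for a physical byte address."""
    if not 0 <= addr < (1 << PHYS_BITS):
        raise ValueError("address outside the 40-bit physical range")
    frame = addr >> page_shift                  # the part the MMU remaps
    offset = addr & ((1 << page_shift) - 1)     # passed through unchanged
    return frame, offset


def join_phys_addr(frame: int, offset: int, page_shift: int = 12) -> int:
    """Rebuild the byte address from a frame number and an offset."""
    return (frame << page_shift) | offset


# The highest 40-bit address still yields a frame number below 2**32.
frame, offset = split_phys_addr((1 << PHYS_BITS) - 1)
print(hex(frame), hex(offset), frame.bit_length())
```

With 4 KB pages the frame number needs 28 bits, and with 1 KB pages it needs 30, so either way it fits in a 32-bit variable.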
The code to deal with 40-bit physical addresses is the same on ARM64 and on ARM/Thumb-2 with LPAE. That is, it likely uses 64-bit types throughout. Contrary to popular belief, using 64-bit types on a 32-bit CPU is not really inefficient.
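As a rough illustration of why 64-bit types are cheap on a 32-bit CPU: a 64-bit add is just two 32-bit adds with a carry between them (on ARM, an ADDS followed by an ADC). The sketch below models that in Python with explicit 32-bit masking; the function name is made up.

```python
# Model a 64-bit addition as two 32-bit additions linked by a carry,
# the way a 32-bit CPU does it. Values are (low word, high word) pairs.
# The function name is hypothetical.

MASK32 = 0xFFFFFFFF


def add64_via_32(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> tuple[int, int]:
    """Add two 64-bit values held as 32-bit halves; wraps at 2**64."""
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32                  # 0 or 1, carry out of the low word
    hi = (a_hi + b_hi + carry) & MASK32   # add with carry in
    return lo, hi


# 0x00000001_FFFFFFFF + 1 carries into the high word.
lo, hi = add64_via_32(0xFFFFFFFF, 0x1, 0x1, 0x0)
print(hex(hi), hex(lo))
```

Multiplies and shifts cost a bit more than that, but simple address arithmetic is mostly adds, compares, and masks.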
I think what Rick meant was that a 32-bit OS has to start doing more work once it has more than 2-3 GB of RAM. That is true even without using > 32-bit addressing. The effect is not that large; I've been told about 5% in the worst case. Paging is extremely slow, so adding more RAM is better overall despite the cost.
Don't overlook the move to expand the standard peripherals that come along in the ARM suite. There are so many issues with manufacturers that each have different standard peripherals. I'd like to see the simple peripherals all pulled into the ARM clearing house, so the manufacturers can focus on creating hardware accelerators for the difficult problems. I don't want a differentiated SPI port; I want a differentiated graphics engine...