Personally, I think 32 bits is enough for any application I'd care to run. If you need more than a few GB of memory, IMO the application is being ridiculously wasteful. But YMMV -- if some people think it's fun to debug 16-digit hexadecimal numbers, I'll quote Mr. Knightley: "I have nothing to say against it, but that they shall not choose pleasures for me."
IMO 64 bits is for mainframe operating systems. For my personal use, I prefer something... well, more personal.
ARM may well migrate to 64-bit architectures in smartphones soon, but to me 64 bits matter more for future applications of ARM, not so much for use in a smartphone.
You need more than 32 bits to address more than 4 GB of RAM. This matters if a PC is running multiple apps simultaneously, and/or if the PC wants to keep all of its commonly used apps in RAM for lightning-fast access. I'm not sure how important this is when using ARM in a smartphone role, unless that smartphone eventually becomes the CPU of something more than just a smartphone.
Anyway, it was practically a given that eventually all multipurpose processors were going to migrate to 64 bits. IMO.
While I sympathize with the dislike of bloat, for traditional OSes like Linux it is convenient to have a virtual address space substantially larger than the physical address space. (I think Linus Torvalds said something like: beyond 1 or 2 GiB of physical memory, a 32-bit virtual address space becomes inconvenient.)
One can also consider less traditional organizations like single-level storage or single address space OS. The concept of single-level storage might even be extendable to include resources on a network.
Address space layout randomization (a security technique) also works better with a sparse address space.
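To put a rough number on the ASLR point, here is a back-of-the-envelope sketch. It counts how many page-aligned positions a region could occupy in a virtual address space of a given width. The function name, the 4 KiB page size, the 1 GiB region and the 48-bit figure for a 64-bit system are all assumptions picked for illustration, and real kernels randomize far fewer bits than this upper bound.

```python
import math

def aslr_slots(va_bits, region_bytes, align=4096):
    """Upper bound on the page-aligned start addresses available to a region.

    Simplified model (an assumption): the whole address space is free and
    the region may start at any multiple of `align` that still fits.
    """
    space = 1 << va_bits
    if region_bytes > space:
        return 0
    return (space - region_bytes) // align + 1

GIB = 1 << 30

for bits in (32, 48):
    slots = aslr_slots(bits, GIB)
    print(f"{bits}-bit VA: {slots} slots, about {math.log2(slots):.1f} bits of entropy")
```

Under these assumptions a 1 GiB region gets roughly 19.6 bits of placement entropy in a 32-bit space and roughly 36 bits in a 48-bit one, which is the sense in which a sparse address space helps.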
Furthermore, fat pointers can carry metadata. (AArch64 supports ignoring the top 8 bits of a pointer so that they can be used for metadata. I think Azul Systems Vega processors ignored 16 bits; if I recall correctly, this was used for a class ID to better support Java.) If a pointer is going to be 40 bits or larger anyway, extending it to 64 bits is not as big an issue.
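For anyone who hasn't seen pointer tagging, a minimal sketch of the idea using plain Python integers as stand-ins for 64-bit pointers. The helper names and the example address are made up for illustration. On AArch64 with top-byte-ignore enabled the hardware skips the tag byte during translation; here the masking is done by hand to show the same effect.

```python
TAG_SHIFT = 56                      # tag lives in bits 63..56 (the top byte)
ADDR_MASK = (1 << TAG_SHIFT) - 1    # low 56 bits hold the address

def tag_pointer(addr, tag):
    """Pack an 8-bit tag into the top byte of a 64-bit pointer value."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must fit in 8 bits")
    if addr & ~ADDR_MASK:
        raise ValueError("address must fit in the low 56 bits")
    return (tag << TAG_SHIFT) | addr

def get_tag(ptr):
    """Recover the metadata byte."""
    return (ptr >> TAG_SHIFT) & 0xFF

def strip_tag(ptr):
    """Recover the address, as the hardware would when ignoring the top byte."""
    return ptr & ADDR_MASK

p = tag_pointer(0x00007FFF12345678, 0xAB)   # hypothetical user-space address
print(hex(p), hex(get_tag(p)), hex(strip_tag(p)))
```

A 32-bit pointer has no spare bits for this kind of trick without shrinking the usable address space, which is part of why wide pointers are attractive to language runtimes.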
In addition, the added manufacturing and energy use cost of a 64-bit processor is relatively small (when used with 32-bit pointers; 64-bit pointers can noticeably inflate memory use), especially in a phone or tablet system context. (Using the same ISA and even core-type for tablets and phones has some advantages and tablets are pushing the 32-bit boundary.)
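As a rough illustration of how 64-bit pointers inflate memory use, here is a sketch that computes the size of a doubly linked list node (one 32-bit payload plus two pointers) under natural alignment. The node layout and the `struct_size` helper are hypothetical, and actual compilers and ABIs may lay things out differently.

```python
def struct_size(field_sizes):
    """Size of a C-like struct where each field is aligned to its own size."""
    offset = 0
    max_align = 1
    for size in field_sizes:
        offset = -(-offset // size) * size   # round offset up to field alignment
        offset += size
        max_align = max(max_align, size)
    return -(-offset // max_align) * max_align  # pad the tail to the struct alignment

INT32 = 4
node_32 = struct_size([INT32, 4, 4])   # payload, next, prev with 32-bit pointers
node_64 = struct_size([INT32, 8, 8])   # same node with 64-bit pointers

print(node_32, node_64, node_64 / node_32)   # prints: 12 24 2.0
```

In this pointer-heavy case the node doubles in size, partly from the wider pointers and partly from the padding they force after the 32-bit payload. Data that is mostly arrays of numbers would see far less growth.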
Bert wrote: You need more than 32 bits to address more than 4 Gbyte of RAM. This matters if a PC is running multiple simultaneous apps, and/or if the PC wants to keep all of its commonly used apps in RAM, for lightning fast access.
The newer ARM Cortex-A processors (A7, A12, A15, A17) all have a 40-bit physical address space, so the processor is able to address 1 TiB. So while a single process can "only" have a 32-bit virtual address space, you can have many processes in physical memory simultaneously.
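The arithmetic behind those numbers, as a quick sketch (the function and constant names are just for illustration):

```python
GIB = 1 << 30
TIB = 1 << 40

def addressable_bytes(address_bits):
    """Bytes reachable with byte addressing and the given address width."""
    return 1 << address_bits

per_process = addressable_bytes(32)   # 32-bit virtual address space
physical = addressable_bytes(40)      # 40-bit physical address space

print(per_process // GIB)             # prints: 4   (GiB per process)
print(physical // TIB)                # prints: 1   (TiB of physical memory)
print(physical // per_process)        # prints: 256 (full 4 GiB spaces that fit)
```

So a fully populated 40-bit machine could, in principle, hold 256 completely filled 32-bit process address spaces at once, ignoring whatever the OS itself needs.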
Personally, I like this better than having a 64-bit architecture. OTOH, I haven't looked at the ARMv8 instruction encoding. It may be that the 64-bit instruction set does a much cleaner job of encoding instructions than the 32-bit ARMv7 architecture, which evolved from a RISC machine into rococo.
Don't overlook the move to expand the standard peripherals that come along in the ARM suite. There are so many issues with manufacturers that each have different standard peripherals. I'd like to see the simple peripherals all pulled into the ARM clearing house, so the manufacturers can focus on creating hardware accelerators for the difficult problems. I don't want a differentiated SPI port; I want a differentiated graphics engine...