Certainly an IR-style sensor would be good for low-light recognition in a way that a CMOS image sensor would not.
But one wonders how necessary gesture recognition would be for handheld devices. The use case for televisions and other more remote equipment is more obvious.
I wonder if the real reason for using IR is that it can provide full illumination in light or dark without annoying the human user. A bright light coming from the sensor would be annoying, especially if people intend to view a screen in the dark.
Sounds like a neat idea, but I am left wondering about the details. What are the range and resolution of the sensor arrays, and how much processing power is needed to support the gesture recognition? I would like to know more (please).