PORTLAND, Ore. - Segmentation and object-recognition software used in video applications such as facial recognition in security cameras is being applied to sports programming.
Researchers at the University of Calgary (Alberta) are investigating whether automatic recognition and tracking software can "watch" a sporting event for a viewer, keep track of who did what and when, then diagrammatically represent the highlights or even live action using icons instead of raw video.
"Once we extract the moving objects in a scene, we can transform that data and present it in all sorts of formats. We can even make a little schematic of a sports game which is good for viewing on a small handheld device like a cellphone," said researcher Jeffrey Boyd.
Boyd's work builds on previous research he conducted with colleagues at the University of British Columbia, Dalhousie University and the University of Waterloo.
"On devices that don't have enough bandwidth for live video, like cellphones, we can show a moving schematic or diagram of the action with, say, Sharks or flaming C's representing the players," said Boyd.
Work began well before the MPEG-7 specification was formalized two years ago, but Boyd's Camera Markup Language shares a common base with MPEG-7: both append an XML document to video describing its content.
Boyd's software "watches" a video stream while generating a running commentary in the form of a continuous XML document. For instance, if a "white ball" enters the scene, Boyd's software finds it when it segments the scene into objects, names it, then begins tracking it in a stream of XML statements about its color, size, location and trajectory.
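The exact schema of the Camera Markup Language is not spelled out in the article, but a description stream along these lines illustrates the idea; every element and attribute name below is a hypothetical stand-in, not Boyd's actual markup:

```xml
<!-- Hypothetical CML-style description of a tracked object.
     Element and attribute names are illustrative only. -->
<object id="ball-1" label="white ball">
  <color>white</color>
  <size units="px">14</size>
  <location frame="1024" x="212" y="318"/>
  <trajectory dx="3.1" dy="-0.4" units="px/frame"/>
</object>
```

A new fragment like this would be appended to the running XML document each time the tracker updates its estimate of the object.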
While this operation could be performed with the standard MPEG-7 format, Boyd's Camera Markup Language goes beyond MPEG-7 by allowing bidirectional communication between the video server running his software and the camera. An MPEG-7 camera produces a stream of XML documents describing the video being taped, but the Camera Markup Language can send XML documents back to the camera to control it.
"Whereas MPEG-7 is primarily a one-way description - you have some video, you describe it and that's it - we were looking more at interacting with the camera. As the video is being produced, we can tell the camera to do different things," said Boyd.
By employing a client/server architecture, a server can process video and its XML documents, describe the objects and generate both a video representation for TV and a graphical representation for cellphones. Meanwhile, it can also send XML commands back to the camera for operations such as "follow-the-ball." Even if the ball stops moving, the server can instruct the camera to switch from "optical flow" tracking to a "background subtraction" algorithm.
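The control direction can be sketched the same way; again, the tags below are illustrative assumptions rather than the published Camera Markup Language:

```xml
<!-- Hypothetical command sent from the server back to a camera.
     Tag and attribute names are illustrative only. -->
<command target="camera-3">
  <track object="ball-1" mode="follow"/>
  <!-- fall back to background subtraction if the object stops moving -->
  <algorithm>background-subtraction</algorithm>
</command>
```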
Currently, Boyd's software is restricted to tracking discrete objects. In the future, he hopes to enable more complex descriptions that not only recognize objects but can discern what activities the object is participating in. For instance, security applications today depend upon a human operator to identify "suspicious" persons and to switch between cameras to track their activities. The new software could perform this function on the server, send directives back to cameras to handle tracking and send a graphical icon to the operator to identify the suspicious activity.
The sports demonstration uses a 1:32-scale model of a hockey rink with moving plastic players. Next, Boyd wants to wire an actual hockey rink with his client/server architecture. The testbed should be operating by September, Boyd said.
The software is currently being used to monitor traffic in Calgary.
Boyd plans to begin developing techniques for archiving video with its associated XML annotations, so that clients can later scan, search and variously transform the data for specific applications.