The September 2001 debut of the MPEG-7 standard heralds a new wave of applications for managing the exponential growth and distribution of multimedia content over the Internet, digital broadcast networks and home databases.
MPEG-7 aims to make searching the Web for multimedia content as easy as searching for text-only files. The standard will prove particularly useful in large content archives, which the public can now access, and in multimedia catalogs in which consumers identify content for purchase. Content-retrieval information may also be used by agents, for the selection and filtering of broadcast "push" material or for personalized advertising. Further, MPEG-7 descriptions will allow fast and cost-effective use of the underlying data by enabling semiautomatic multimedia presentation and editing.
In essence, MPEG-7 is the metadata standard, based on XML Schema, for describing features of multimedia content and providing the most comprehensive set of audio-visual description tools available. These description tools are based on catalog (title, creator, rights); semantic (the who, what, when and where information about objects and events); and structural (the measurement of the amount of color associated with an image or the timbre of a recorded instrument) features of the audio-visual content. They build on the audio-visual data representation defined by MPEG-1, -2 and -4.
MPEG-7 is designed to interoperate with other leading standards, and the MPEG-7 working groups have been working closely with, for example, the TV Anytime consortium. Earlier MPEG standards include:
- MPEG-1. For the storage and retrieval of moving pictures and audio on storage media.
- MPEG-2. For digital television. It was the timely response to the needs of the satellite broadcasting and cable television industries in their transition from analog to digital formats.
- MPEG-4. Codes content as objects and enables those objects to be manipulated individually or collectively in an audio-visual scene.
MPEG-1, -2 and -4 make content available. MPEG-7 lets you find the content you need. For that reason, MPEG-7 has to address many different applications in many different environments. It provides a flexible and extensible framework for describing audio-visual data by defining a multimedia library of methods and tools. It standardizes:
- A set of descriptors. A descriptor represents a feature and defines the syntax and semantics of that representation.
- A set of description schemes. A description scheme specifies the structure and semantics of the relationships between its components, which may be both descriptors and description schemes.
- A language that specifies description schemes, the Description Definition Language (DDL). It also allows for the extension and modification of existing description schemes. MPEG-7 adopted XML Schema Language as the MPEG-7 DDL. However, the DDL requires some specific extensions to XML Schema Language to satisfy all the requirements of MPEG-7. These extensions are being discussed through liaison activities between MPEG and W3C, the group standardizing XML.
- A binary representation to encode descriptions. A coded description is one that's been encoded to fulfill relevant requirements such as compression efficiency, error resilience and random access.
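The relationship between descriptors and description schemes can be pictured with a small sketch. The class names, element names (`VideoDescription`, `Catalog`, `Title` and so on) and values below are illustrative assumptions, not the normative MPEG-7 schema; the sketch only shows how a description scheme can nest descriptors and other description schemes and then be serialized as XML.

```python
from dataclasses import dataclass, field
from typing import List, Union
import xml.etree.ElementTree as ET


@dataclass
class Descriptor:
    """A single feature representation, e.g. a title or a dominant color."""
    name: str
    value: str


@dataclass
class DescriptionScheme:
    """A named structure whose components are descriptors or nested schemes."""
    name: str
    components: List[Union["Descriptor", "DescriptionScheme"]] = field(default_factory=list)


def to_xml(node):
    """Serialize a descriptor or description scheme to an XML element."""
    element = ET.Element(node.name)
    if isinstance(node, Descriptor):
        element.text = node.value
    else:
        for component in node.components:
            element.append(to_xml(component))
    return element


# Hypothetical description: element names are made up for illustration.
description = DescriptionScheme("VideoDescription", [
    DescriptionScheme("Catalog", [
        Descriptor("Title", "Evening News"),
        Descriptor("Creator", "Example Broadcaster"),
    ]),
    DescriptionScheme("Structure", [
        Descriptor("DominantColor", "blue"),
    ]),
])

xml_text = ET.tostring(to_xml(description), encoding="unicode")
print(xml_text)
```

A real MPEG-7 description would be validated against the schema defined with the DDL; this sketch skips validation and the binary encoding entirely.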
The MPEG-7 working groups have made available several visual tools. A user can, for example, search for images by color or the position that a certain color occupies within an image. Other possible queries include searching for images with certain textural patterns. Searching for specific textural patterns from a satellite image database, for example, can reveal valuable information about geological and structural infrastructures.
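A query by color might work roughly as follows. This is a minimal sketch that uses a coarse RGB histogram and histogram intersection as a simplified stand-in for the standard's color descriptors; the function names, the bin count and the tiny in-memory "archive" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0-255


def color_histogram(pixels: List[Pixel], bins: int = 4) -> List[float]:
    """Return a normalized histogram over bins**3 coarse RGB cells."""
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in range(bins).
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        counts[(rb * bins + gb) * bins + bb] += 1
    total = len(pixels) or 1
    return [count / total for count in counts]


def histogram_intersection(a: List[float], b: List[float]) -> float:
    """Similarity in [0, 1]: 1.0 means identical color distributions."""
    return sum(min(x, y) for x, y in zip(a, b))


def search_by_color(query: List[Pixel], archive: Dict[str, List[Pixel]]):
    """Rank archive images by color similarity to the query pixels."""
    query_hist = color_histogram(query)
    scored = [
        (name, histogram_intersection(query_hist, color_histogram(pixels)))
        for name, pixels in archive.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical archive: each image is reduced to a short list of pixels.
archive = {
    "sunset": [(250, 120, 30)] * 8 + [(20, 20, 60)] * 2,
    "forest": [(30, 140, 40)] * 9 + [(90, 60, 20)] * 1,
    "ocean": [(10, 60, 200)] * 10,
}
query = [(240, 110, 20)] * 10  # "find images that are mostly orange"
ranking = search_by_color(query, archive)
print(ranking[0][0])  # best match
```

In practice the histograms would be computed once at indexing time and stored in the description, so that a search compares compact descriptors rather than raw pixels.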
Users might also search for images that include certain objects and shapes, such as all images that contain a specific logo design owned by a commercial company, information that marketing agencies could find useful.
Another scenario is to search by motion trajectory, a type of search that is particularly useful in remote surveillance systems where unusual object motions at public places can be tracked and flagged automatically for review by security personnel.
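To make the surveillance scenario concrete, here is a minimal sketch that flags a trajectory as unusual when the object's speed between two samples exceeds a threshold. The trajectory format (time, x, y), the threshold and the track names are assumptions for illustration; the standard's own motion descriptors are richer than this.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # (time in seconds, x, y)


def speeds(trajectory: List[Sample]) -> List[float]:
    """Speed between each pair of consecutive trajectory samples."""
    result = []
    for (t0, x0, y0), (t1, x1, y1) in zip(trajectory, trajectory[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        result.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return result


def is_unusual(trajectory: List[Sample], max_speed: float) -> bool:
    """True if the object ever moves faster than max_speed."""
    return any(speed > max_speed for speed in speeds(trajectory))


# Hypothetical tracked objects in a public place.
tracks: Dict[str, List[Sample]] = {
    "pedestrian": [(0, 0, 0), (1, 1, 0), (2, 2, 0)],
    "runner": [(0, 0, 0), (1, 1, 0), (2, 7, 0)],
}
flagged = [name for name, track in tracks.items() if is_unusual(track, max_speed=3.0)]
print(flagged)
```

A speed threshold is only one possible definition of "unusual"; a deployed system might instead compare trajectories against typical paths through the scene.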
MPEG-7 tools are able to detail deep levels of audio-visual content. Most content-management tools describe audio-visual content at a semantic level only, outlining features of audio-visual segments and subsegments such as news programs, sports and politics, as well as titles, dates, actors, basic copyrights and basic timing/frame information.
MPEG-7 tools, however, enable descriptions of deep-level features of audio-visual content: they break down a segment of audio-visual content into its constituent elements and into relationships between those elements. A moving region within a video segment can be indexed into a subset of moving regions by using MPEG-7. Then objects within a moving region can be further indexed. An object can also be indexed by its color, texture and shape, and by its trajectory within the moving region. The spatial-temporal relationships of objects within frames of a stream can also be indexed using MPEG-7.
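This kind of decomposition is essentially a tree. The sketch below models a video segment that contains moving regions, which in turn contain objects with features such as color, shape and trajectory, and then searches the tree by feature. The `Segment` class, the `find` helper and all labels and feature values are hypothetical names chosen for illustration, not part of the standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    """A node in the decomposition: a clip, a moving region, or an object."""
    label: str
    features: Dict[str, str] = field(default_factory=dict)
    children: List["Segment"] = field(default_factory=list)

    def walk(self, path=()):
        """Yield (path, segment) for this node and all of its descendants."""
        here = path + (self.label,)
        yield here, self
        for child in self.children:
            yield from child.walk(here)


def find(root: Segment, **wanted: str) -> List[str]:
    """Return slash-separated paths of segments matching every wanted feature."""
    return [
        "/".join(path)
        for path, segment in root.walk()
        if all(segment.features.get(key) == value for key, value in wanted.items())
    ]


# Hypothetical indexed clip with two moving regions.
video = Segment("news_clip", children=[
    Segment("moving_region_1", children=[
        Segment("car", {"color": "red", "shape": "box", "trajectory": "left_to_right"}),
        Segment("cyclist", {"color": "yellow", "trajectory": "left_to_right"}),
    ]),
    Segment("moving_region_2", children=[
        Segment("ball", {"color": "red", "shape": "circle"}),
    ]),
])

red_objects = find(video, color="red")
print(red_objects)
```

Because every node carries its own features, a query can combine them, for example `find(video, color="red", shape="circle")`, without knowing at which level of the hierarchy the object was indexed.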
Some in the industry suggest that MPEG-7 brings nothing technically new to the market; the critics point out that some vendors, like Virage Inc., already provide audio-visual content-management solutions. The criticism is unfounded. MPEG-7's main strength is that it is an open-standards technology, which gives it strong appeal because it aims for interoperability with the work of other leading audio-visual standards groups and consortia. With a description language grounded in XML Schema, content creators, distributors, content-management vendors and startup enterprises alike will find MPEG-7 an encouraging environment.
With more than 100 tools already developed and a firm commitment to allow new tools to extend the standard, MPEG-7 will provide the most comprehensive tool set for managing audio-visual content. This promise may not be realized immediately; initial applications most likely will be designed using the high-level catalog and semantic description tools of MPEG-7. The structural tools for describing deeper levels of audio-visual content will at first be taken up by professionals in the broadcast and digital studio industries. Search engine companies will follow once their customers start demanding better search and retrieval performance for audio-visual content.
As the need for multimedia content-management services grows, more enterprises will be created to index and archive content so that people can access it.
A new breed of digital librarian, the multimedia archivist, will appear and, with indexing tools such as those provided by MPEG-7, will help store content for retrieval. At first, this process will move slowly, because converting legacy content will create a bottleneck; however, the costs of this conversion process will eventually be covered by customers willing to pay for access to rich multimedia.