Digital video has now reached a stage of maturity characterized by the emergence of comprehensive platforms and solutions for broadband delivery, making video as easy to locate, manage and use as text.
The Internet has significantly raised the bar on what people expect from all types of digital content. Users now expect to easily browse, search, download and share information any time, anywhere. Until recently, these types of interactions applied primarily to text-based information because traditional video, even in digital form, did not lend itself to these new levels of interactivity.
To fully enable video for the Internet, it must be fully indexed at a fine level of granularity, intelligently prepared in a device-independent way, and delivered by an application platform that can exploit the index information. For the purposes of this article, we will refer to application-server-driven indexed video as "applied video." But before delving into the meaning of applied video, it would be worthwhile to understand where digital video has been and where it is going.
Clearly, analog video in the form of VHS tape and broadcast/cable has been with us for a long time. The migration to digital media assembly processes and digital TV delivery has been a lengthy and painful process, one that is not yet complete. Web-based video, on the other hand, has evolved much more rapidly. Beginning with QuickTime, Microsoft Video for Windows and MPEG-1, digital video files became available for download in the early '90s. The growth of the Internet then brought forth streaming video formats such as RealVideo and Microsoft Windows Media, as well as MPEG-4, the standard for real-time interactive networked multimedia.
The driving force behind the recent evolution of digital video is the opportunity to commercialize content in new ways that go beyond traditional delivery and commerce mechanisms. Applied video introduces opportunities for content owners to develop new revenue streams.
Many of the steps in the value chain are shared with traditional video production and delivery processes. However, several new mechanisms are required to fully enable video for intelligent, interactive delivery in the broadband paradigm of "what I want, when I want it, and where I want it." For example, a clip of interest found on a wireless PDA during a train commute can be sent to the user's desktop at work for viewing on arrival. If there is one central ingredient in the recipe, it is the capture and intelligent use of video metadata: index information about the contents of the video.
The preparation of video for film, broadcast, training or strategic enterprise purposes follows essentially the same production and media assembly process as the preparation of applied video. The key departure from the traditional workflow occurs when the finished content is transformed into intelligent, indexed and purposefully annotated content. The transformation is accomplished using a new breed of video indexing technology and tools, typified by VideoLogger and MediaSite Publisher, which perform state-of-the-art signal analysis on the video and audio to extract useful metadata.

Metadata elements
Metadata consists of time-stamped data elements such as keyframes, spoken text, speaker and face identification, on-screen text reading, logo detection, and so on. Each of these metadata elements acts as a reference back into the video content in much the same way that a card catalog unlocks the wealth of information in a library. The video index enables searching, fine-grained navigation, preview and association with ancillary activities, such as personalization and content-rights management.
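To make the card-catalog analogy concrete, the following minimal sketch shows one way a time-stamped metadata index might be represented and searched. The element fields and sample transcript entries are illustrative assumptions, not the schema of any particular logging tool.

```python
from dataclasses import dataclass

@dataclass
class MetadataElement:
    start: float   # offset into the video, in seconds
    end: float
    track: str     # e.g. "transcript", "keyframe", "on-screen text"
    value: str     # the extracted datum (spoken words, detected logo, ...)

def search(index, query, track="transcript"):
    """Return (start, end) offsets of elements whose text matches the query."""
    q = query.lower()
    return [(e.start, e.end) for e in index
            if e.track == track and q in e.value.lower()]

# Hypothetical index entries produced by signal analysis of a debate video.
index = [
    MetadataElement(0.0, 4.2, "transcript", "Welcome to tonight's debate"),
    MetadataElement(95.0, 112.5, "transcript", "on Social Security, my plan ..."),
    MetadataElement(300.1, 318.9, "transcript", "education funding will double"),
]

print(search(index, "social security"))   # [(95.0, 112.5)]
```

Each hit is a time range, so a player can seek directly to the relevant passage rather than forcing the user to scrub through the full program.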
Metadata also enables video content to be effectively managed and delivered with targeted advertising and content-relevant e-commerce opportunities. If a user is watching a skiing video, show an ad for snowboards, offer entry into a drawing for a ski-weekend getaway, and provide one-click purchase of lift tickets at a resort near the user. For premium content owners, indexed video is the basic starting point for broadband delivery mechanisms such as syndication, pay-per-view and subscription business models.
But video indexing using a video logging tool is only part of the story. Metadata is captured and offered to users through an application-server mechanism, while the video content itself is distributed through content delivery networks (CDNs) and edge caching infrastructure. The metadata must refer, in a time-accurate manner, back to the video content itself. In a multiplatform delivery model, this actually means referring back to many different physical renditions of a given piece of content.
Modem users need 56-kbit/s streams, while broadband users need 300-kbit/s streams and above. Video content for set-top box delivery must be broadcast-quality, while wireless devices are currently best served with text and thumbnail images rather than actual video.
Therefore, the transformation process must not only produce a rich metadata index of the content, but must also prepare a wide variety of renditions of the content, all of which are time-synchronized with the metadata.
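One way to picture the rendition problem is a catalog in which a single logical asset maps to several physical encodings, all sharing the master's timecode so that metadata offsets apply to every rendition unchanged. This is a sketch under assumed names and URLs, not the interface of any actual delivery system.

```python
# Hypothetical catalog: one asset, several renditions produced by
# parallel encoders during the transformation process.
RENDITIONS = {
    "debate-2000": [
        {"format": "real", "bitrate_kbps": 56,  "url": "rtsp://cdn.example.com/debate_56.rm"},
        {"format": "real", "bitrate_kbps": 300, "url": "rtsp://cdn.example.com/debate_300.rm"},
        {"format": "wmv",  "bitrate_kbps": 300, "url": "mms://cdn.example.com/debate_300.wmv"},
    ],
}

def pick_rendition(asset_id, fmt, max_kbps):
    """Choose the highest-bitrate rendition that fits the client's link."""
    candidates = [r for r in RENDITIONS[asset_id]
                  if r["format"] == fmt and r["bitrate_kbps"] <= max_kbps]
    return max(candidates, key=lambda r: r["bitrate_kbps"]) if candidates else None
```

A modem user would resolve to the 56-kbit/s stream and a broadband user to the 300-kbit/s stream, while a search hit's time offset remains valid against either.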
One popular solution to this problem is the SmartEncode process, which orchestrates video indexing with any number of simultaneous encoding processes in various bit rates and formats. Such indexed video is the first step to searchability and interactivity, allowing users to pull snippets of video of interest to them from repositories of long-form video, such as, "I don't want to watch the entire debate, I just want to see what the candidate said about Social Security and education."
To address the new requirements of increased interactivity and multiplatform delivery, the popular digital video and streaming formats are evolving rapidly and new standards are emerging to handle metadata. De-facto standards like RealVideo, QuickTime, Microsoft Windows Media and the Virage VDF metadata format have a solid foothold today, but new formats will become a factor in the future. MPEG-7, formally called the Multimedia Content Description Interface, is an emerging standard that provides a standardized description of various types of multimedia information. The standard specifies neither the (automatic) extraction of video metadata nor the search mechanisms that make use of the descriptions. Several other metadata standards are also gaining acceptance, such as the SMPTE metadata working group efforts and the Advanced Authoring Format, along with the metadata capabilities of the highly flexible MPEG-4 format.
An important sign that digital video has matured to the point where one can rightfully call it applied video is the emergence of complete video solutions that support the intelligent applications discussed above, in a platform-independent manner. Today, video application servers have matured to a level on par with traditional Web authoring and content management solutions for text and graphics media. Turnkey video publishing platforms can be licensed or outsourced as hosted applications that support all the advanced capabilities and delivery platforms of interest. Premier vendors have established relationships and tight integration with CDNs, and can provide not only the basic video indexing and application hosting features, but also value-added editorial assistance to fully exploit e-commerce and ad targeting opportunities. Some vendors even provide layered application frameworks that allow the addition of syndication engines, personalization support, and community building features.
"To date, home-grown, piecemeal streaming solutions have frustrated content providers and held back broadband video both within the enterprise and on the Internet," said Jeremy Schwartz, senior analyst at Forrester Research Inc. (Cambridge, Mass.). "But with the advent of integrated video application platforms, content providers can now stop experimenting and start efficiently exploiting their rich media assets, either as strategic information or monetized content."
Video metadata and streaming media, effectively managed by a video application server, are key ingredients in achieving device independence in the delivery chain. For example, most wireless devices today cannot receive and display streaming video at any bit rate. They can, however, display thumbnail images and transcript text, and provide links that direct applied video back to the user's desktop. Likewise, a set-top box can deliver broadcast-quality video time-synchronized with auxiliary Web-based content.
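The device-dependent fallback described above can be sketched as a simple capability lookup. The device classes and capability flags here are assumptions invented for illustration; a real application server would draw them from client detection.

```python
# Hypothetical device-capability table consulted at delivery time.
CAPABILITIES = {
    "desktop-broadband": {"video": True,  "max_kbps": 300},
    "desktop-dialup":    {"video": True,  "max_kbps": 56},
    "wireless-pda":      {"video": False, "max_kbps": 14},
}

def presentation_for(device):
    """Decide what to send a given device class."""
    caps = CAPABILITIES[device]
    if not caps["video"]:
        # No streaming support: fall back to thumbnails and transcript text,
        # plus a link that redirects the full video to the user's desktop.
        return {"thumbnails": True, "transcript": True, "send_to_desktop": True}
    return {"stream": True, "bitrate_kbps": caps["max_kbps"]}
```

The same indexed content thus serves every device class, with the application server, not the content, absorbing the differences.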
Applied video pervades the convergence landscape, from PCs with broadband access to interactive television and set-top boxes to wireless devices. Underneath it all, video metadata and application servers are the central nervous system that places content in the right location, on the right device, at the right time.
See related chart