In recent years, the technologies used to link individuals at remote sites in real time have evolved at a rapid pace. A number of solutions, designed to enable collaborative communication, have emerged during this time, ranging from simple software tools to elaborate videoconferencing rooms. As a result, many companies have found it difficult to determine the optimal strategy for investing in collaborative computing technologies and the best networking strategy for linking these capabilities together. At the same time, the explosion in Internet usage has created even greater opportunities for enhancing the scope and cost-effectiveness of conferencing sessions. The advent of Internet-enabled conferencing technologies has created an entirely new set of logistical and technical challenges that networking professionals must evaluate, implement, and manage.
In October 1996, H.323 was ratified by the International Telecommunication Union (ITU), the body responsible for telecommunications standards. H.323 is designed to enable conferencing over IP networks, such as local area networks (LANs) and the Internet.
You can now use your existing Internet Protocol (IP) desktop connections to take advantage of one of the most valuable collaborative concepts, networked conferencing: the ability to share video, audio, and data exchanges in real time regardless of network architecture.
Upcoming work in the ITU and the Internet Engineering Task Force (IETF) will most certainly provide enhancements for confidentiality and security in personal and business global communication. The challenge for networked conferencing companies is to create H.323 products that blend into your established networks and use resources that are already present, for example, interfacing to a wide variety of products on the IP network side, such as TCP/IP hubs, switches, and routers. In addition, networked conferencing companies must present a seamless integration with the existing H.320 standard of ISDN (WAN) conferencing products.
In the formative years of videoconferencing during the late 1980s to early 1990s, videoconferences were conducted in designated meeting rooms with expensive conferencing equipment that was difficult to use and functionally limited. Because these first videoconferencing systems were based on proprietary technologies, interoperability among devices and applications from different manufacturers was impossible. Since the mid-1990s with the advent of standards for multimedia communication over ISDN, conferencing application and equipment vendors have been offering standards-based products. As a result of the ensuing intense competition and technology advances, the cost of systems has decreased dramatically, while capabilities have expanded and deployment has become much easier.
Benefits of Using Standards
Standards make it possible for people all over the world to communicate using hardware and software, developed and distributed by different vendors. When products interoperate, consumers can choose from a vast selection of multimedia conferencing products without fear of being tied to the goods and services of one vendor. These products can then work together as seamlessly as telephones and fax machines.
The major organizations responsible for defining and promoting conferencing standards are the ITU, the IMTC, and the IETF. The International Telecommunication Union (ITU) is a body within the United Nations where international governments and the private sector coordinate standards for global telecommunication networks and services. The Telecommunication Standardization Sector, the T in ITU-T, develops telephone standards. While companies are welcome to join and participate in ITU discussions and proposals, the voting members are countries, not companies.
The International Multimedia Teleconferencing Consortium (IMTC) is a non-profit organization whose purpose is to promote and facilitate the development and implementation of standards-based, interoperable products and services for multimedia conferencing. The IMTC consists of individuals from organizations that develop and supply multimedia conferencing products and services around the world. While IMTC members are encouraged to participate in the ITU, their primary concerns are the validation and promotion of standards and interoperability. The IMTC concentrates its attention on the adoption of ITU multimedia conferencing standards and market education. The Internet Engineering Task Force (IETF) is a parallel standards committee that defines networking standards for the Internet. It also addresses conferencing on the Internet.
Four ITU-T umbrella recommendations contain the standards required for interoperability:
H.320 for ISDN videoconferencing describes standards for both multipoint and point-to-point videoconferences over circuit switched networks (CSN), such as ISDN and Switched-56. H.320 governs the basic concepts of audio and video communication by:
Specifying requirements for processing audio and video information
Providing common formats for compatible audio and video inputs and outputs
Defining protocols that let multimedia endpoints use the communication links and synchronize audio and video signals.
T.120 for dataconferencing specifies how to distribute application data efficiently and reliably in real time during a multimedia multipoint conference, including conferences using network bridging products and services. The T.120 objective is to:
Ensure transparent interoperability among unlike endpoints
Permit data-sharing among participants connected through any network transport such as ISDN, PSTN, CSN, or IP
Specify infrastructure protocols for dataconferencing applications.
H.324 for POTS multimedia conferencing describes high-quality video and audio compression over POTS modem connections. This recommendation specifies a common method for sharing audio, video, and data simultaneously using high-speed modem connections over a single analog (POTS) telephone line.
Finalized by the ITU-T in October 1996, Recommendation H.323 is the dominant standard for the next generation of multimedia conferencing technology and equipment. H.323 specifies the modes of operation that are required for endpoints from different vendors to intercommunicate with any combination of audio, video, and T.120 graphics over the LAN, intranet, or Internet. It provides the call-model descriptions, the call-signaling procedures, and the system and component descriptions for packet-based conferencing. However, H.323 does not directly include standards for guaranteeing Quality of Service (QoS).
As a derivative of previous recommendations, borrowing H.320 structure, modularity, and audio-video standards, H.323 presents standards that permit customers who have made large investments in H.320-based systems to take advantage of low-cost desktop conferencing. In contrast to H.320 conferencing, which requires a separate circuit-based network of ISDN BRI or PRI lines, H.323 leverages the packet-switched IP network already in place for T.120 data communication. Because H.323 communication is independent of network topology, the LAN over which H.323 endpoints communicate may be a single segment, a ring, or multiple segments with complex topologies. These endpoints can communicate through hubs, routers, bridges, and dial-up connections.
Quality of Service Tools for H.323
The title of Recommendation H.323 clearly states that this recommendation specifies standards for "Visual Telephone Systems and Equipment for Local Area Networks which Provide a Non-Guaranteed Quality of Service." H.323 does not, however, neglect QoS. For example, to provide reliable delivery of packets, H.323 specifies that control and data channels use end-to-end transport services, such as TCP, that signal acknowledgment and support retransmission of packets. These reliable services ensure that signals are delivered through flow-controlled transmission, error free and in the order in which they were sent, so that no signals are lost. In contrast, best-effort services such as the User Datagram Protocol (UDP), which transport audio and video signals, are more efficient but do not send acknowledgment and retransmission signals and thus can introduce undesirable conditions, such as packet loss, timing fluctuations (jitter), and network congestion. H.323, therefore, specifies UDP only for audio and video and for the registration, admission, and status (RAS) channel. H.323 uses the reliable TCP for the H.245 control channel, the T.120 data channels, and the call-signaling channel. To handle streaming audio and video over the Internet, H.323 provides two protocols and an additional technique, multicast, which improves network bandwidth efficiency.
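The channel-to-transport assignment described above can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary and function names are hypothetical and are not defined by H.323.

```python
# Transport used by each H.323 channel, as stated in the paragraph above.
# CHANNEL_TRANSPORT and is_reliable are hypothetical names for illustration.
CHANNEL_TRANSPORT = {
    "audio": "UDP",
    "video": "UDP",
    "RAS": "UDP",
    "H.245 control": "TCP",
    "T.120 data": "TCP",
    "call signaling": "TCP",
}


def is_reliable(channel: str) -> bool:
    """Return True when the channel uses the reliable transport (TCP)."""
    return CHANNEL_TRANSPORT[channel] == "TCP"


if __name__ == "__main__":
    for channel, transport in CHANNEL_TRANSPORT.items():
        kind = "reliable" if is_reliable(channel) else "best effort"
        print(f"{channel}: {transport} ({kind})")
```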
The Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP) work in concert, with RTCP monitoring RTP. These protocols (with H.245) also work with IP Multicast to manage timing, not data integrity, for UDP traffic. RTP handles timing issues by time stamping and sequencing every UDP packet transmitted and including information on the synchronization of audio and video streams, expected data rate, expected packet rate, and distance in time to sender. The receiver, with appropriate buffering, can eliminate duplicate packets, reorder out-of-sequence packets, and synchronize sound, video, and data. Thus, when delays occur, the receiver can play back information that is consistently spaced in time and recover from jitter or other timing skews introduced by the network. Manufacturers use the RTCP sender and receiver QoS reports (listing statistics about lost packets, sequencing, and jitter) to detect network congestion and take corrective action, such as reducing media stream data rates.
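The receiver behavior described above (dropping duplicates, reordering by sequence number, and noticing lost packets) can be sketched in a few lines of Python. This is a simplified illustration, not an RTP implementation: the Packet class and both function names are hypothetical, and the sketch assumes sequence numbers do not wrap around.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified stand-in for a received media packet."""
    seq: int        # sequence number assigned by the sender
    timestamp: int  # media clock value assigned by the sender
    payload: bytes


def playout_order(received):
    """Drop duplicate packets and return the rest sorted by sequence number."""
    first_copy = {}
    for pkt in received:
        first_copy.setdefault(pkt.seq, pkt)  # keep only the first copy seen
    return [first_copy[seq] for seq in sorted(first_copy)]


def missing_sequences(ordered):
    """List the sequence numbers absent between the first and last packet."""
    if not ordered:
        return []
    present = {pkt.seq for pkt in ordered}
    return [s for s in range(ordered[0].seq, ordered[-1].seq + 1)
            if s not in present]


if __name__ == "__main__":
    arrivals = [Packet(3, 480, b"c"), Packet(1, 160, b"a"),
                Packet(2, 320, b"b"), Packet(2, 320, b"b"),
                Packet(5, 800, b"e")]
    ordered = playout_order(arrivals)
    print([p.seq for p in ordered])    # [1, 2, 3, 5]
    print(missing_sequences(ordered))  # [4]
```

A count of missing sequence numbers like this is the kind of statistic a receiver could report back to the sender so that the sender can react, for example by lowering its media data rate.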
The Resource Reservation Protocol (RSVP) is designed to prevent packet loss in router-based networks and to help reduce delay and jitter. RSVP helps to avoid problems with network congestion by guaranteeing that requested bandwidth in the router be dedicated to specific applications, such as conferencing. Endpoints can reserve network resources along the routing path between sender and receiver either just before or during call setup. RSVP determines and notifies the endpoint whether a router has sufficient available resources to supply the requested QoS and whether that endpoint has administrative permission to make the reservation. When resources are available and the permission is accepted, RSVP sets parameters in the router packet classifier and scheduler for the QoS. RSVP does not control the QoS of the network segments themselves.
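The reservation decision described above can be modeled as a check at every router on the path: the request succeeds only if each router has enough spare bandwidth and recognizes the endpoint's permission. The Python sketch below is a toy model under those assumptions. It does not reproduce RSVP messages or router internals, and the Router class and reserve_path function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical router model: spare bandwidth plus a permission list."""
    name: str
    available_kbps: float
    allowed: set = field(default_factory=set)

    def can_reserve(self, endpoint: str, kbps: float) -> bool:
        """Check both administrative permission and available resources."""
        return endpoint in self.allowed and kbps <= self.available_kbps


def reserve_path(path, endpoint: str, kbps: float) -> bool:
    """Reserve bandwidth on every router of the path, or on none of them."""
    if not all(router.can_reserve(endpoint, kbps) for router in path):
        return False
    for router in path:
        router.available_kbps -= kbps
    return True


if __name__ == "__main__":
    path = [Router("r1", 512, {"ep1"}), Router("r2", 128, {"ep1"})]
    print(reserve_path(path, "ep1", 100))  # True
    print(reserve_path(path, "ep1", 100))  # False: r2 has only 28 kbps left
```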
In multipoint conferences, multicast (in contrast to unicast or broadcast) handles streaming audio and video over the Internet with RTP. Multicast processes the transmission of a single packet from one source to many destinations on the network without replication. Conversely, unicast sends multiple point-to-point transmissions; broadcast sends to all destinations. The unicast and broadcast transmission methods use the network less efficiently as packets are replicated throughout the network.
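A simple count makes the difference concrete. The Python sketch below is a deliberately simplified model with hypothetical function names: it counts how many copies the source sends for one media packet and how many hosts end up receiving it, and it ignores routers, subnets, and group management.

```python
def sender_transmissions(mode: str, receivers: int) -> int:
    """Copies of one media packet that the source itself must send."""
    if mode == "unicast":
        return receivers          # one point-to-point copy per receiver
    if mode in ("multicast", "broadcast"):
        return 1                  # a single transmission from the source
    raise ValueError(f"unknown mode: {mode}")


def hosts_reached(mode: str, receivers: int, hosts_on_network: int) -> int:
    """Hosts that receive the packet, whether or not they are in the conference."""
    if mode == "broadcast":
        return hosts_on_network   # every host gets a copy
    if mode in ("unicast", "multicast"):
        return receivers          # only conference participants
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("unicast", "multicast", "broadcast"):
        sent = sender_transmissions(mode, receivers=5)
        reached = hosts_reached(mode, receivers=5, hosts_on_network=200)
        print(f"{mode}: {sent} sent by source, {reached} hosts reached")
```

With five participants on a 200-host network, this model shows unicast sending five copies, broadcast reaching all 200 hosts, and multicast sending one copy that reaches only the five participants.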
The H.323 umbrella standard incorporates already established recommendations and protocols to carry out its intention to specify conferencing over packet-switched networks. To satisfy this intent, H.323 requires endpoints to support these capabilities:
H.245 conference control
Q.931 call signaling and call setup
RAS messaging to communicate with a Gatekeeper
RTP/RTCP support to sequence audio and video packets
To help endpoints meet these requirements, H.323 includes the following standards:
Call, Conference, and Media Control (H.225, H.245)
H.323 uses the Q.931, RAS, H.225, and H.245 protocols to manage audio, video and control signals passing through a control layer. (See Description of Standards.) The control layer is responsible for initiating calls between endpoints and setting up the media stream.
Video (H.261, H.263)
Although H.323 does not require an endpoint to have video capability, all endpoints with video capability must support H.261, so that they share the same video frame encoding rules. In addition, endpoints can use the optional H.263 algorithm, whose improved compression yields better picture quality, sacrificing less network bandwidth and maintaining more video data.
Audio (G.711, G.722, G.723, G.728, G.729)
While support for other audio standards is optional, an H.323 endpoint must support the G.711 standard for speech compression. Designed originally for continuous bit-rate networks, G.711 typically transmits voice at 48 kbps, 56 kbps, or 64 kbps, well within LAN bandwidth limits. The G.723 standard operates at significantly lower bit rates and is the most likely choice for H.323 applications. However, G.723 is a more expensive algorithm to use because it requires more processing power to encode than does G.711. All other endpoint capabilities, including videoconferencing and data conferencing, are optional. The very absence of a requirement for video capability leaves open the possibility of connections in an H.323 conference from audio-only endpoints. To support data conferencing, H.323 endpoints must incorporate T.120 capabilities. When enabled, T.120-capable H.323 endpoint users can work collaboratively using shared applications, such as spreadsheets and presentation packages.
Description of Standards
H.225: Describes the media (audio and video) stream packetization, media stream synchronization, control stream packetization, and control message formats.
H.245: Describes the messages and procedures used to negotiate channel usage for opening and closing logical channels for audio, video, and data; for capabilities exchange; for mode requests; for control; and for indicators.
H.261: Describes the video-coding and decoding methods for the moving picture component of audiovisual services at rates of N x 64 kbps.
H.263: Describes video coding that gives a better picture at 128 kbps than the coding specified by H.261.
G.711: Specifies pulse code modulation (PCM) of voice frequencies for a 3-kHz bandwidth at 48 kbps, 56 kbps, and 64 kbps (normal telephony).
G.722: Specifies audio for a 7-kHz bandwidth at 48 kbps, 56 kbps, and 64 kbps, using adaptive differential pulse code modulation (ADPCM) coding.
G.723: Specifies audio transmitted at 5.3 kbps to 6.3 kbps, close to the quality of speech in a conventional phone call.
G.728: Specifies audio for a 3-kHz bandwidth at 16 kbps, using low-delay code excited linear prediction (LD-CELP).
G.729: Specifies audio for a toll-quality, 8 kbps speech coder, using a linear prediction analysis-by-synthesis coder.
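The audio bit rates listed above lend themselves to a small table that an application could consult when choosing a codec for a given bandwidth budget. The Python sketch below is only an illustration: the table and function names are hypothetical, the rates are the codec bit rates quoted in this section, and packet header overhead is not counted.

```python
# Codec bit rates in kbps, taken from the descriptions in this section.
AUDIO_CODECS_KBPS = {
    "G.711": (48.0, 56.0, 64.0),
    "G.722": (48.0, 56.0, 64.0),
    "G.723": (5.3, 6.3),
    "G.728": (16.0,),
    "G.729": (8.0,),
}


def codecs_within(budget_kbps: float):
    """Return codecs whose lowest listed rate fits the budget, lowest rate first."""
    fits = [(min(rates), name)
            for name, rates in AUDIO_CODECS_KBPS.items()
            if min(rates) <= budget_kbps]
    return [name for _, name in sorted(fits)]


if __name__ == "__main__":
    print(codecs_within(20))  # ['G.723', 'G.729', 'G.728']
    print(codecs_within(5))   # []
```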
Recommendation H.323 describes the network components that connect to a LAN employed for interaction with a CSN. It does not describe the LAN itself or the transport layer used to connect various LANs. To implement an IP network-based communication system, H.323 defines these four major components:
Endpoints on the LAN, whether they are integrated into personal computers or implemented in stand-alone devices, support real-time, bidirectional communication. As described under the Standards Component Section, H.323 endpoints must support H.245, Q.931, RAS, RTP/RTCP, and G.711, with video and T.120 data being optional. By supporting these H.323 standard protocols and through an appropriate gateway, H.323 endpoints can interoperate with these endpoints:
H.320 on narrowband ISDN (N-ISDN)
H.321/H.310 on broadband ISDN (B-ISDN) using asynchronous transfer mode (ATM)
H.322 on guaranteed QoS LANs (IsoEthernet)
H.324 on general switched telephone network (GSTN)
In contrast to endpoints defined by other ITU recommendations, H.323 endpoints can optionally provide multipoint controller (MC) capabilities.
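The mandatory endpoint requirements named above can be expressed as a simple set check. The Python sketch below is illustrative: the constant and function names are hypothetical, and a real conformance test would involve far more than comparing labels.

```python
# Mandatory and optional endpoint capabilities as stated in this section.
REQUIRED = frozenset({"H.245", "Q.931", "RAS", "RTP/RTCP", "G.711"})
OPTIONAL = frozenset({"H.261", "H.263", "T.120"})


def missing_requirements(supported):
    """Return the mandatory capabilities the endpoint lacks, sorted by name."""
    return sorted(REQUIRED - set(supported))


if __name__ == "__main__":
    audio_only = {"H.245", "Q.931", "RAS", "RTP/RTCP", "G.711"}
    print(missing_requirements(audio_only))               # []
    print(missing_requirements({"H.245", "G.711", "H.261"}))
    # ['Q.931', 'RAS', 'RTP/RTCP']
```

The first example shows that an audio-only endpoint with no video or T.120 support still meets the mandatory list.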
Already common to telephone networks, gateways perform worldwide signal translation services. H.323, with its infrastructure of routers and switches, expands on this implementation and embeds gateway technology into the world of standards-based conferencing over IP networks. For H.323, gateways manage interoperation between ITU-T endpoints by translating the call signaling, control channel messages, audio compression algorithms, and multiplexing techniques between an IP-based endpoint and an endpoint connecting through ISDN. As a result of gateway services, H.320 systems can communicate with packet-based H.323 systems. Because endpoints on the same IP network communicate directly with one another, they do not require gateway services for translation. H.323 mandates endpoint requirements (see the Standards Components Section) to minimize the transcoding that the gateway must perform to achieve interoperability. The gateway may not need to transcode audio when conference endpoints share a common mode. In some cases, however, the gateway may perform audio transcoding so that each endpoint can operate with its optimum bandwidth efficiency. H.323 does not mandate the number and types of interfaces that a gateway can manage. In practice, a gateway can support several concurrent LAN-CSN sessions, and the CSN participation in each session may include several different types of networks.
Gatekeepers perform management services for H.323 conferencing zones. A gatekeeper is an optional element in H.323. When a gatekeeper is enabled in an IP network, all H.323 endpoints contacting that network must make use of it. The gatekeeper helps to preserve the operational quality of the LAN by performing these functions:
By authorizing access to the LAN for H.323 endpoints including gateways and MCUs, the gatekeeper not only limits the amount of bandwidth these entities use on the network, but guarantees access only to recognized entities. The gatekeeper grants permission for both placing and accepting calls from H.323 endpoints. For a connection to be successful, an H.323 endpoint must be recognized by the gatekeeper and must also be registered in the gatekeeper's zone, that is, the collection of endpoints, gateways, and MCUs, independent of IP subnet boundaries, that the gatekeeper manages. A zone can have only one gatekeeper. When multiple endpoints on a LAN contain a gatekeeper, all but one should be disabled. However, the H.323 Recommendation also states that admissions control may be set to admit all requests from recognized entities.
As designated in the RAS specification, the gatekeeper, through admissions control, ensures that bandwidth is available within its H.323 zone for email, file transfers, and other designated applications. While the gatekeeper can modify the bandwidth usage during a call, the criteria for doing so are not specified in H.323.
As defined in the RAS specification, the gatekeeper accepts both external E.164 telephone number addresses received from endpoints outside the LAN and alias (name) addresses from LAN endpoints. It then translates the numbers and names to network-recognizable addresses, for example, IP addresses. The initiating endpoint can then complete the connection. Gatekeepers may also pass the H.245 signaling (used to negotiate channel usage and capabilities) between two endpoints or between each of several endpoints and the MCU.
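The gatekeeper functions described above (admitting only registered entities, limiting the bandwidth used by conferencing, and translating aliases or E.164 numbers to IP addresses) can be sketched as a small class. This Python example is a toy model under stated assumptions: the class, its methods, and the sample addresses are hypothetical, and it does not model actual RAS messages.

```python
class Gatekeeper:
    """Hypothetical model of one gatekeeper managing a single zone."""

    def __init__(self, zone_bandwidth_kbps: float):
        self.capacity_kbps = zone_bandwidth_kbps  # cap on conferencing traffic
        self.in_use_kbps = 0.0
        self.registry = {}  # alias or E.164 number -> IP address

    def register(self, address: str, ip: str) -> None:
        """Record an endpoint in the zone under an alias or E.164 number."""
        self.registry[address] = ip

    def translate(self, address: str):
        """Return the IP address for an alias or number, or None if unknown."""
        return self.registry.get(address)

    def admit(self, caller: str, callee: str, kbps: float) -> bool:
        """Admit a call only for registered parties and within the bandwidth cap."""
        if caller not in self.registry or callee not in self.registry:
            return False
        if self.in_use_kbps + kbps > self.capacity_kbps:
            return False
        self.in_use_kbps += kbps
        return True

    def release(self, kbps: float) -> None:
        """Return bandwidth to the zone when a call ends."""
        self.in_use_kbps = max(0.0, self.in_use_kbps - kbps)


if __name__ == "__main__":
    gk = Gatekeeper(zone_bandwidth_kbps=256)
    gk.register("alice", "10.0.0.5")
    gk.register("+14085550100", "10.0.0.9")
    print(gk.translate("+14085550100"))             # 10.0.0.9
    print(gk.admit("alice", "+14085550100", 128))   # True
    print(gk.admit("mallory", "alice", 64))         # False: caller not registered
```

Capping conferencing bandwidth in this way is what leaves room in the zone for other traffic such as email and file transfers.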
Multipoint Control Units (MCUs)
The MCU is a server that bridges signaling and media among three or more sites (a multipoint connection). The H.323 Recommendation specifies the two parts of an MCU:
Multipoint Controller (MC)
Through H.245, the MC negotiates among conference endpoints, determines common audio and video capabilities, and establishes media channels. An MC is required for all multipoint conference types but may be located in an endpoint, a gateway, or a gatekeeper.
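Determining common capabilities amounts to intersecting the capability sets that the endpoints advertise. The Python sketch below illustrates the idea only: the function names are hypothetical, the preference order for audio codecs is an assumption made for this example, and the H.245 message exchange itself is not modeled.

```python
def common_capabilities(endpoint_caps):
    """Intersect the capability sets advertised by all conference endpoints."""
    sets = [set(caps) for caps in endpoint_caps.values()]
    return set.intersection(*sets) if sets else set()


def pick_audio(common,
               preference=("G.723", "G.729", "G.728", "G.722", "G.711")):
    """Pick the first shared codec in an assumed lowest-bit-rate-first order."""
    for codec in preference:
        if codec in common:
            return codec
    return None


if __name__ == "__main__":
    conference = {
        "A": {"G.711", "G.723", "H.261"},
        "B": {"G.711", "H.261", "H.263"},
        "C": {"G.711", "G.723", "H.261"},
    }
    shared = common_capabilities(conference)
    print(sorted(shared))      # ['G.711', 'H.261']
    print(pick_audio(shared))  # G.711
```

Because every H.323 endpoint must support G.711, the intersection always contains at least one audio mode.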
Multipoint Processor (MP)
The MP mixes and switches audio, video, and data streams. An MP is needed only for centralized conferences. An MCU may consist of an MC or an MC and one or more MPs. H.323 does not standardize communications between the MC and the MP. An MCU can also be known as a multimedia conference server (MCS).
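As one illustration of what audio mixing in an MP might involve, the Python sketch below sums time-aligned samples from several participants and clamps the result. It assumes the audio has already been decoded to 16-bit linear PCM samples; the function name is hypothetical, and H.323 does not prescribe a mixing method.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(streams):
    """Sum aligned samples across streams and clamp to the 16-bit range."""
    if not streams:
        return []
    length = min(len(stream) for stream in streams)  # mix the overlapping part
    mixed = []
    for i in range(length):
        total = sum(stream[i] for stream in streams)
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


if __name__ == "__main__":
    speaker_a = [1000, -2000, 30000]
    speaker_b = [500, -500, 10000]
    print(mix([speaker_a, speaker_b]))  # [1500, -2500, 32767]
```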
Growth and Future Directions
Within the business environment, networked conferencing is extremely valuable in facilitating global business meetings, project design reviews, and contract negotiations. Conserving the time and productivity of key personnel and accelerating the speed of business processes are probably more important to most organizations in today's business climate than reducing travel expense. Networked conferencing enables more to be done with less time and less money.
The good news for today's corporate infrastructures is that organizations are on the brink of realizing the benefits of multimedia conferencing using IP-based connections. Global communication networks already exist. POTS, enterprise-wide intranets, and the Internet carry conferencing information to conference participants. Network engineering enhancements, provided by value-added service providers, can only optimize performance for H.323 conferencing over these networks.
Affordable multimedia conferencing equipment is available today. Conferencing is accessible not only to senior managers within corporations but also to individual contributors. Thanks to leading computer technology developers, such as Intel and Microsoft, conferencing technology is being embedded in browsers and operating systems. PC vendors are shipping conferencing-ready PCs in OEM bundles with Intel and others. As a result, millions of desktop workstation users are conferencing-capable and ready to participate in H.323-enabled networked conferencing.