VoIP quality issues, part 2: Jitter, delay, and echo

Michael F. Finneran

4/3/2008 3:00 AM EDT

Order this book today at www.elsevierdirect.com or by calling 1-800-545-2522 and receive an additional 20% discount. Use promotion code 92398 when ordering. Offer expires 06/01/08. Valid only in North America.

Part 1 addressed VoIP issues including voice quality, transit delay, echo, comfort noise, and foreign language speakers. It explained the G.711, G.726, G.729A, G.723.1, and G.722 voice compression standards.


8.5 Delay Tolerance
There are two time-related issues that can adversely impact the quality of IP voice services: transit delay and jitter. To the user, transit delay is the most obvious and annoying. However, it is important to recognize that the two issues are inextricably intertwined.

8.5.1 Transit Delay Tolerance: Maximum 150 msec One-Way Delay
The impact of transit delay in a voice transmission is essentially psychological. Human beings have a surprisingly accurate internal clock that governs the flow of a conversation. When a person asks a question, their internal clock starts. He or she is waiting a given span of time to hear a reply. If the answer is not heard within that time span, invariably he or she will repeat the question.

When the two parties can see each other (i.e., an in-person meeting), visual cues play a significant role. If you can see that someone is thinking, you will let them think. However, in a telephone call, there are no visual cues. In that case, the questioner must rely solely on his or her internal clock. If the questioner does not get a reply within the anticipated time frame, he or she will either repeat the question or ask, "Did you hear me?"

This human reaction is quite inconvenient in packet telephony. If the transmission path introduces an inordinately lengthy delay, that delay will trigger this follow-up reaction (e.g., "Did you hear me?"). That follow-up will typically collide with the response coming from the other end. Research has shown that if the network service introduces a one-way delay in excess of 150 msec, human speakers will begin encountering these conversational difficulties.

With the advent of IP telephony, a number of testing organizations set out to determine the range of delays that users would find acceptable. What they found was that one-way delays up to 70 or 100 msec were essentially unnoticeable. Once that one-way delay went past 100 msec, some people began to complain. When the delays reached 150 msec, virtually everyone was complaining. To get an idea of what that delay is like, a cell phone-to-cell phone call has a one-way delay of around 140 msec. We have found that users are willing to trade off voice quality for the convenience of mobility, so we should not try to equate that directly with user expectations on wired telephone systems.
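
The thresholds reported above can be captured in a few lines. The following is a minimal sketch, not a standard: the function name and the category labels are hypothetical, and the boundaries are simply the 100 msec and 150 msec figures cited in the text.

```python
def rate_one_way_delay(delay_ms):
    """Map a one-way delay (msec) to the user reaction described in the text.

    The labels are illustrative; the 100 and 150 msec boundaries are the
    figures quoted in the article, not values taken from a specification.
    """
    if delay_ms < 0:
        raise ValueError("delay cannot be negative")
    if delay_ms <= 100:
        return "essentially unnoticeable"
    if delay_ms < 150:
        return "some users complain"
    return "virtually all users complain"


# The cell phone-to-cell phone example (about 140 msec one way)
# lands in the middle band.
print(rate_one_way_delay(140))
```

By this scale, a wired IP telephony design should aim to stay in the first band and treat 150 msec as a hard ceiling.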

8.6 Jitter/Delay Sources
As mentioned before, transit delay and jitter are the two sources of delay in packet telephony.

  1. Transit Delay: The amount of time it takes for the signal to travel from the speaker, through all of the network elements, to the recipient.
  2. Jitter: The variation in delay that is seen from packet to packet.
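
To make the definition of jitter concrete, the sketch below computes a running jitter estimate from per-packet send and receive times. It follows the smoothed interarrival jitter calculation described in RFC 3550 (the RTP specification), with the simplifying assumption that times are given in milliseconds rather than RTP timestamp units. The function name and the sample values are illustrative only.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    For each packet the transit time is (receive time - send time). The
    jitter estimate moves 1/16 of the way toward the absolute change in
    transit time between consecutive packets. A constant clock offset
    between sender and receiver cancels out, because only differences
    in transit time are used.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times_ms, recv_times_ms):
        transit = received - sent
        if prev_transit is not None:
            change = abs(transit - prev_transit)
            jitter += (change - jitter) / 16.0
        prev_transit = transit
    return jitter


# Packets sent every 20 msec (assumed values).
sent = [0, 20, 40, 60]

# Constant 50 msec transit delay: delay is present, jitter is zero.
print(interarrival_jitter(sent, [50, 70, 90, 110]))

# Transit delay alternating between 50 and 55 msec: jitter is nonzero.
print(interarrival_jitter(sent, [50, 75, 90, 115]))
```

The two runs illustrate the distinction drawn in the list above: the first stream has transit delay but no jitter, while the second has the same base delay plus packet-to-packet variation.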

Transit delay is the total delay a voice signal will encounter when traveling through the network; this is called mouth-to-ear delay. Circuit switched voice systems introduce negligible delays, typically under 30 msec. IP PBXs can introduce delays on the order of 50 to 70 msec between wired stations. The addition of a WLAN connection increases the delay. The transit delays introduced in a wide area IP telephony application can be far greater. Given the distance and the number of routers involved, wide area systems can introduce delays in excess of 100 msec.

Transit delay is a combination of several factors:

  • Voice Encoding: As we noted above, each voice coding system introduces some amount of delay. Whereas the encoding delay for 64 Kbps PCM is under 1 msec, voice compression systems can introduce delays ranging up to 70 msec.
  • Packet Generation: The packet voice system collects some amount of voice information to insert in each packet; that delay is a function of the duration of the speech sample carried in each packet. To minimize the delay, we try to keep the voice packets short. The speech sample carried in a packet will typically range from 20 to 40 msec, though we recommend a 20 msec sample size to minimize the packetization delay. The downside of a short packet is that the amount of header information remains the same regardless of the size of the speech sample, so shorter packets increase the ratio of overhead to content.
  • WLAN Contention Delay: If the voice is forwarded over a wireless LAN, the frame must be sent according to the WLAN access protocol (i.e., CSMA/CA). That means there is a short waiting interval before the frame can be sent, and if there is a collision or other transmission failure, the frame must be resent. Further, the entire frame must be received at the access point and tested for errors before it is forwarded through the wired network. The goal of WLAN QoS (i.e., IEEE 802.11e) is to minimize the delay for voice packets. Even with the QoS feature, transmitting voice packets over a WLAN typically adds about 20 to 30 msec to the overall delay.
  • Serialization Delay: The amount of time it takes to forward the packet onto the serial (i.e., bit-at-a-time) transmission link. The serialization delay is a function of the packet size and the transmission rate of the link: the number of bits in the packet divided by the link rate.
  • Propagation Delay: The amount of time it takes the signal to travel over the physical transmission facility. In local networks, links are short and propagation delays are minimal; worst-case propagation delay for 100 m of EIA Category 5e LAN cable is 548 nsec. However, propagation delays do become significant when we introduce wide area facilities. Assuming the packet routing does not change (i.e., a virtual circuit service), this value should stay constant during the connection.
  • Switch/Router Delay: Typically the biggest delay variable in IP voice systems will be the delay introduced by switches and routers in the path. Switch/router delays are composed of two elements.
    • Switching Delay: Each switch or router the packet passes through will take some amount of time to process the packet. That processing involves reading the address, doing a table lookup, and determining the best path if there are multiple entries in the table. The processing delay is typically constant and a factor of the processor speed and efficiency of the software.
    • Buffering Delay: Buffering is the major variable in the transit delay. Routers process one packet at a time, and the packet must wait in a buffer until its turn comes up. The waiting interval is determined by the volume of traffic in each router's buffer when the voice packet arrives. If QoS is used, the router will maintain separate buffers for each priority level, so the delay is impacted primarily by traffic in the same priority class. However, the highest priority queue is typically not given exclusive access to the facility until it is emptied. As a result, traffic in lower priority queues will also have an impact on the buffering delay.

  • Jitter Removal: Once the packet arrives at its destination, the RTP receiver process must reestablish the timing consistency (i.e., remove the jitter). To do that, the packet is placed in a buffer and then played out according to the RTP timestamp. That receive buffer is the last element in the delay chain.
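
The factors listed above can be added up into a rough one-way delay budget. The sketch below is illustrative rather than a measurement tool. The encoding, packetization, and WLAN figures come from the text; the 40-byte IPv4/UDP/RTP header, the T1 link rate, the 1000-mile distance, the 10 msec per 1000 miles propagation rule of thumb, the switch/router delay, and the jitter buffer depth are all assumptions chosen for the example, and every function name is hypothetical.

```python
# Assumed header size: IPv4 (20) + UDP (8) + RTP (12) bytes.
# Layer-2 framing is ignored to keep the example short.
IP_UDP_RTP_HEADER_BYTES = 40


def payload_bytes(codec_kbps, sample_ms):
    """Bytes of speech in one packet: kbit/s times msec gives bits."""
    return codec_kbps * sample_ms / 8


def header_overhead_ratio(codec_kbps, sample_ms):
    """Fraction of each packet taken up by the IP/UDP/RTP headers."""
    payload = payload_bytes(codec_kbps, sample_ms)
    return IP_UDP_RTP_HEADER_BYTES / (payload + IP_UDP_RTP_HEADER_BYTES)


def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock a packet onto a serial link, in msec."""
    return packet_bytes * 8 / link_bps * 1000


def propagation_delay_ms(distance_miles, ms_per_1000_miles=10.0):
    """Wide area propagation delay from an assumed rule of thumb."""
    return distance_miles / 1000 * ms_per_1000_miles


def one_way_budget_ms(sample_ms=20, codec_kbps=64, link_bps=1_544_000,
                      distance_miles=1000, wlan_ms=25.0,
                      switch_router_ms=10.0, jitter_buffer_ms=40.0):
    """Return each delay component (msec) for one direction of a call."""
    packet = payload_bytes(codec_kbps, sample_ms) + IP_UDP_RTP_HEADER_BYTES
    return {
        "encoding": 1.0,                    # 64 Kbps PCM: under 1 msec (text)
        "packetization": float(sample_ms),  # speech collected per packet
        "wlan": wlan_ms,                    # 20 to 30 msec with QoS (text)
        "serialization": serialization_delay_ms(packet, link_bps),
        "propagation": propagation_delay_ms(distance_miles),
        "switch_router": switch_router_ms,  # assumed; varies with load
        "jitter_buffer": jitter_buffer_ms,  # assumed receive buffer depth
    }


budget = one_way_budget_ms()
for name, value in budget.items():
    print(f"{name:>14}: {value:6.2f} msec")
total = sum(budget.values())
print(f"{'total':>14}: {total:6.2f} msec "
      f"({'within' if total <= 150 else 'over'} the 150 msec limit)")

# Header overhead for 20 msec versus 40 msec samples of 64 Kbps PCM.
print(header_overhead_ratio(64, 20), header_overhead_ratio(64, 40))
```

Two points stand out when the numbers are varied. Moving from 20 msec to 40 msec samples lowers the header overhead ratio but adds 20 msec directly to the budget, which is the trade-off described under Packet Generation. The switch/router and jitter buffer entries are the ones that change with network load, so they deserve the most margin.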


Comments


Jeff_Mo

4/11/2008 3:24 PM EDT

"Circuit switched voice systems introduce negligible delays, typically under 30 msec."

Do you have a citation for this 30 msec number? Most circuit switching delays I've seen are sub-millisecond.



MFJumbo

4/25/2008 11:44 AM EDT

Hi Jeff, I wrote that Voice over Wireless LAN book.

You are correct in scaling circuit switch delays in the sub-millisecond range given the way a TDM switching network operates, but that's just the delay getting through one switch. The 30 msec reference I used was for a long distance connection. Frankly, it's a swag. In a circuit switched connection the actual delay will be based on the minimal delays introduced by switches and multiplexers and propagation delay, which is a factor of distance and the velocity of the signal (i.e., miles per hour). For example, light in a fiber travels at about 2/3 of the speed of light in a vacuum. The standard estimate we use for the total of that is 10 msec per 1000 miles, so 30 msec would get you from New York to LA.

Hope that helps.

Regards



Ten

5/6/2008 12:32 PM EDT

Good article. One other sometimes significant factor in transit delay is the algorithmic audio delay in echo cancellers (line and acoustic), noise reduction, and a variety of other voice quality enhancements.
Hardware solutions tend to add less delay than software solutions for some algorithms, but they can add anywhere from <1 msec to 20+ msec of delay. Thought this would be a good factor to consider.
