Deploying VoWiFi: Handoff delays and QoS of voice over WiFi
To benefit from VoIP, the network infrastructure on which it is deployed must meet minimum performance requirements. If VoIP is deployed on an infrastructure that is not ready for it, call quality will suffer. When VoIP is deployed on wireless mobile devices (Voice over WiFi, or VoWiFi), new constraints must be taken into account. In particular, the latency introduced by handoffs (switching from one access point to another) must be kept deterministic and as low as possible. How much handoff latency is acceptable? This article tries to answer that question.
What are the requirements for good quality VoIP?
According to (ITU Y.1541), highly interactive voice over IP applications require a Class-0 quality of service characterized by a jitter that does not exceed 50ms and a network latency of about 100ms.
This means that the end-to-end latency must remain deterministic at all times, with jitter (delay variation) below 50ms. Network latency and jitter can be improved by using dedicated links and traffic shaping techniques that give VoIP packets priority.
Besides the network, several other factors contribute to the overall delay; these were discussed in previous articles.
Taking all VoIP processing and network communication into account, the total mouth-to-ear delay should be kept under 200ms according to the ITU recommendation (ITU G.114), which analyzes the effect of mouth-to-ear latency on human conversation.
To summarize, in order to ensure a good quality of service for VoIP communications, the network must provide Class 0 QoS (meaning less than 100ms latency and less than 50ms jitter), and the overall mouth-to-ear latency (including codecs, buffers, etc.) must be kept under 180ms, which stays within the 200ms ceiling of G.114.
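The summary above can be expressed as a simple check. The following is a minimal sketch, not a measurement tool: the function and constant names are made up for illustration, the thresholds are the ones quoted in the text, and the limits are treated as inclusive, which is an assumption.

```python
# Thresholds quoted in the text (Class 0 QoS and the 180ms mouth-to-ear target).
MAX_NETWORK_LATENCY_MS = 100.0
MAX_JITTER_MS = 50.0
MAX_MOUTH_TO_EAR_MS = 180.0


def meets_voip_requirements(network_latency_ms, jitter_ms, endpoint_delay_ms):
    """Return True if a call path stays within all three limits.

    endpoint_delay_ms covers everything outside the network
    (codec, de-jitter buffer, and so on). Limits are inclusive here,
    which is an assumption of this sketch.
    """
    mouth_to_ear_ms = network_latency_ms + endpoint_delay_ms
    return (
        network_latency_ms <= MAX_NETWORK_LATENCY_MS
        and jitter_ms <= MAX_JITTER_MS
        and mouth_to_ear_ms <= MAX_MOUTH_TO_EAR_MS
    )


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(meets_voip_requirements(90, 30, 80))   # within all limits
    print(meets_voip_requirements(90, 70, 80))   # jitter too high
```

Keeping the three conditions separate makes it easy to report which requirement a given network fails.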
Handoff performance requirements for VoWiFi
When using VoIP with mobile devices such as smartphones and PDAs, a new challenge arises with regard to handoff delay. Indeed, if the handoff delay is too large, the total latency can exceed 180ms.
It is therefore important to understand how much handoff latency can be afforded. For this purpose, we need to calculate how much 'extra latency' a VoIP call can tolerate.
When the wireless device is connected to an access point and the network satisfies the Class 0 QoS requirement, the jitter is more or less stable (under 50ms). However, while the station is performing a handoff, no data packets are transmitted for the duration of the handoff. The VoIP communication will suffer a degradation in QoS if the receiving end experiences a buffer underrun (no more voice packets to play back).
For this reason, the handoff latency should always be less than the total playback buffering available at the receiving end.
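As a rough illustration of this rule, the sketch below compares a handoff duration with the amount of audio buffered at the receiver. The function names and the example values are hypothetical, and the model assumes that no packets at all arrive during the handoff.

```python
def audio_left_after_handoff_ms(buffered_ms, handoff_ms):
    """Audio still queued when packets resume; a negative value is the length
    of the silent gap heard by the listener."""
    return buffered_ms - handoff_ms


def handoff_causes_underrun(buffered_ms, handoff_ms):
    """True if the playback buffer runs dry before the handoff completes.

    The handoff must be strictly shorter than the buffered audio, so a
    handoff equal to the buffer size counts as an underrun.
    """
    return handoff_ms >= buffered_ms


if __name__ == "__main__":
    # Illustrative values only: 60ms of buffered audio at the receiver.
    for handoff_ms in (40, 60, 150):
        print(handoff_ms,
              handoff_causes_underrun(60, handoff_ms),
              audio_left_after_handoff_ms(60, handoff_ms))
```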
Considering a Hypothetical Reference Endpoint (HRE) for speech media as shown in the figure below and using formulas from a previous article, the Endpoint delay analysis is shown in Table 1.
Figure: Hypothetical Reference Endpoint (HRE)
Table 1: Endpoint delay analysis
The average latency introduced by a 60ms de-jitter buffer is 30ms. The HRE sends 20ms of audio data per RTP (RFC3550) packet. Since a PCM frame lasts 0.125ms, each packet holds 20 / 0.125 = 160 frames. Applying Eq.2, the maximum latency that can be attributed to the codec operations is ((2 * 160) + 1) * 0.125 ≈ 40ms. The PCM codec (ITU G.711) has a packet loss concealment mechanism that, when used, adds 10ms of latency to the HRE. The total latency in an average endpoint is thus about 30 + 40 + 10 = 80ms.
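The arithmetic in this paragraph can be reproduced with a few lines of Python. This is only a sketch of the calculation as quoted in the text: the function names are invented for illustration, and the codec expression is the one the paragraph attributes to Eq.2 of the earlier article.

```python
FRAME_MS = 0.125  # duration of one PCM frame, as stated in the text


def frames_per_packet(packet_ms):
    """Number of PCM frames carried in one RTP packet."""
    return round(packet_ms / FRAME_MS)


def codec_delay_ms(n_frames):
    """Maximum codec latency, using the expression quoted in the text."""
    return (2 * n_frames + 1) * FRAME_MS


def average_dejitter_delay_ms(buffer_ms):
    """Average wait in the de-jitter buffer, taken as half its size."""
    return buffer_ms / 2


def endpoint_delay_ms(packet_ms=20, dejitter_buffer_ms=60, plc_ms=10):
    """Average endpoint delay: de-jitter + codec + packet loss concealment."""
    return (
        average_dejitter_delay_ms(dejitter_buffer_ms)
        + codec_delay_ms(frames_per_packet(packet_ms))
        + plc_ms
    )


if __name__ == "__main__":
    print(frames_per_packet(20))   # 160
    print(codec_delay_ms(160))     # 40.125
    print(endpoint_delay_ms())     # 80.125
```

The exact result is 80.125ms, which the text rounds to 80ms.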
When codec and de-jittering latency are combined with the 100ms network latency requirement of Class-0 QoS, the total average mouth-to-ear delay is 100 + 80 = 180ms. This is consistent with ITU recommendation G.114, which allocates a latency budget of 200ms for optimal quality of service in interactive human speech.
Assuming a system like the HRE, when occasional positive jitter occurs, a packet may be lost if the receiving side decides that it arrived too late. To understand how the receiving side makes this decision, let us first define the packet play-out time and the discard window (ITU G.1020).
Packet play-out time:
The play-out time for a packet transported over RTP (RFC3550) is calculated by adding a fixed amount of time to the RTP time-stamp. The fixed amount of time captures latency due to network and latency due to the de-jittering buffer. The playout time corresponds to the absolute instant in time when the packet must be available for the decoder to process.
Discard window:
The discard window is an extra ’grace period’ used to decide whether the packet should be judged late and discarded. A packet is discarded by the receiving side if the current time exceeds the packet’s play-out time by more than the discard window.
Assume that the endpoint devices correspond to the HRE and take the Class-0 QoS requirements as reference. The packet’s play-out time is calculated by adding 100ms of network latency and 60ms of de-jitter buffer latency to the RTP time-stamp. For real-time applications with high interactivity, the discard window in the HRE can be estimated as the time it takes to decode one RTP packet, which is 20ms. Based on this, if a packet misses its play-out time by more than 20ms (the discard window size), it is discarded. In other words, the total jitter tolerated once the discard window is taken into account is 60ms + 20ms = 80ms (de-jittering buffer + discard window).
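The discard decision can be sketched in Python as follows. This is a simplified model under stated assumptions: RTP time-stamps are taken to be already converted to milliseconds on a clock shared by sender and receiver, and the function names are hypothetical. The constants are the HRE values given in the text.

```python
NETWORK_LATENCY_MS = 100   # Class-0 network latency
DEJITTER_BUFFER_MS = 60    # de-jitter buffer of the HRE
DISCARD_WINDOW_MS = 20     # time to decode one RTP packet


def playout_time_ms(rtp_timestamp_ms):
    """Instant at which the packet must be available to the decoder."""
    return rtp_timestamp_ms + NETWORK_LATENCY_MS + DEJITTER_BUFFER_MS


def is_discarded(rtp_timestamp_ms, arrival_time_ms):
    """True if the packet misses its play-out time by more than the window."""
    lateness_ms = arrival_time_ms - playout_time_ms(rtp_timestamp_ms)
    return lateness_ms > DISCARD_WINDOW_MS


def tolerated_jitter_ms():
    """Extra delay beyond the nominal network latency that is still accepted."""
    return DEJITTER_BUFFER_MS + DISCARD_WINDOW_MS


if __name__ == "__main__":
    # A packet stamped at t=0 nominally arrives at t=100ms.
    for extra_delay_ms in (0, 80, 81):
        arrival_ms = NETWORK_LATENCY_MS + extra_delay_ms
        print(extra_delay_ms, is_discarded(0, arrival_ms))
```

With these values a packet delayed by an extra 80ms is still played, while one delayed by 81ms is dropped, which matches the 80ms tolerance derived above.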
Conclusion
In typical VoIP deployments, events that introduce more than 80ms of extra latency cause packet loss. This budget must be taken into consideration when deploying VoIP on wireless devices: the budget for wireless handoff delay is about 80ms.
References
- (ITU G.1020) International Telecommunication Union, “Recommendation G.1020 - Performance parameter definitions for quality of speech and other voiceband applications utilizing IP networks,” July 2006.
- (ITU Y.1541) International Telecommunication Union, “Recommendation Y.1541 - Internet protocol aspects, quality of service and network performance,” Feb. 2006.


