Improved jitter buffer and playout management approach

Cinar, Yusuf
The popularity and adoption of Voice over Internet Protocol (VoIP) communications have risen at an unprecedented rate over the past decade as an alternative to conventional telephony technologies such as the Public Switched Telephone Network (PSTN). At the core of VoIP's appeal are its use of the ubiquitous Internet Protocol and IP infrastructure and its ability to integrate with other applications and provide advanced voice, video, and data services on commodity devices for wide-ranging use cases. The recent COVID-19 pandemic has further expanded the use of such voice and video communication over IP technologies in many domains, and revealed its importance. However, VoIP differs significantly from its predecessor technologies in many dimensions. From a transmission perspective, for instance, the packet-switched transport medium of IP networks was not initially engineered for near real-time scenarios such as voice communications. The real-time constraints of VoIP applications make them highly sensitive to the impairments inherent in IP networks. Additionally, IP networks and the Internet have seen dramatic changes and grown in complexity and heterogeneity with new types of technologies, such as 3G, LTE, 5G, and various Wi-Fi technologies, to name just a few. The volume of data transmitted over IP networks has also been breaking new records every year. All of these contribute to the issues that IP networks present and that VoIP solutions need to mitigate, such as network congestion, delay spikes, jitter, and packet loss. The emergence of WebRTC has made VoIP technologies more pervasive than ever before. Billions of WebRTC applications are serving billions of people, whether for a social conversation or a business-critical virtual meeting. Consequently, the quality issues that WebRTC and VoIP solutions exhibit remain an important challenge today.
In this thesis, we study the Quality of Service (QoS) and Quality of Experience (QoE) aspects of VoIP communications. In particular, we analyse and document the principles of the jitter buffer and playout management approach in WebRTC. We experimentally evaluate its performance under an extensive collection of network conditions with high delay and jitter. In doing so, we highlight the QoS and QoE issues and quantify the impact of the network impairments using QoS metrics and speech quality assessment models. We take a dual approach of real-world packet traces and network simulations. The former is used to quantify performance in real-world Wi-Fi and LTE settings, while the latter is used to produce a broader range of artificial network distortions and assess their impact. We propose, implement, and validate an improved jitter buffer and playout management algorithm that mitigates the speech quality distortions caused by temporal network imperfections with high delay and jitter, which result in packets arriving in bursts. We identify that WebRTC discards all the packets stored in the jitter buffer when a packet arrives at a full jitter buffer. This causes a very high packet drop rate and, in turn, degraded voice quality. The results show that the proposed approach can prevent the excessive packet drops that would otherwise occur. We perform a comparative analysis to benchmark the proposed approach against the prior-art approach. The ITU-T Rec. P.863 model was used as the objective listening quality assessment method, and the MOS-LQOf results confirm the quality improvement provided by the proposed approach.
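The flush-on-full behaviour described above can be illustrated with a minimal sketch. It is a toy model, not WebRTC's code and not the algorithm proposed in the thesis: the function name `simulate`, the capacity of 4, and the drop-oldest alternative are all assumptions chosen for illustration, and the burst is assumed to arrive with no playout in between.

```python
from collections import deque


def simulate(arrivals, capacity, flush_on_full):
    """Feed packet sequence numbers into a bounded buffer.

    Hypothetical toy model: no packets are played out between arrivals,
    which mimics a burst. Returns (remaining buffer contents, packets dropped).
    """
    buffer = deque()
    dropped = 0
    for seq in arrivals:
        if len(buffer) >= capacity:
            if flush_on_full:
                # Behaviour described in the abstract: discard everything buffered.
                dropped += len(buffer)
                buffer.clear()
            else:
                # Assumed alternative for comparison only: discard the oldest packet.
                buffer.popleft()
                dropped += 1
        buffer.append(seq)
    return list(buffer), dropped


burst = list(range(5))  # five packets arriving back to back
flush_buf, flush_drops = simulate(burst, capacity=4, flush_on_full=True)
oldest_buf, oldest_drops = simulate(burst, capacity=4, flush_on_full=False)
print(flush_buf, flush_drops)
print(oldest_buf, oldest_drops)
```

In this toy burst, a single packet beyond the capacity costs the flushing policy every buffered packet, whereas the assumed drop-oldest policy loses only one. This is the kind of excess loss the abstract refers to.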
NUI Galway
Attribution-NonCommercial-NoDerivs 3.0 Ireland