Speech Quality Measurement Algorithms and Testing Technology

PRODUCT
Toàn Nguyễn
April 07, 2023

Estimating the quality of speech is a central part of quality assurance in any system for generating or transforming speech. 

Such systems include telecommunication networks and software that processes or generates speech. The speech signal at their output suffers from degradations inherent to the particular system.

A background noise cancellation algorithm, for example, can leave remnants of noise in an audio snippet or partially suppress the speech itself (see Fig. 1). A telecommunication system can suffer from packet loss and delays. Audio codecs can introduce unwanted coloration, and text-to-speech systems can produce unnatural-sounding speech.

The most straightforward approach to speech quality testing is to conduct listening sessions, which yield subjective quality results.

A group of people listens to recordings under controlled conditions. They’re then asked to rate each recording on a fixed scale (e.g. from 1 to 5). The ratings are then averaged into a single quality value, called the Mean Opinion Score (MOS).

Aggregation is necessary to reduce the influence of individual listener bias. There are standard guidelines for conducting such listening tests and for the statistical analysis of the results that yields the MOS values (see ITU-T P.830, ITU-R BS.1116, ITU-T P.835).
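
To make the aggregation step concrete, here is a minimal Python sketch that averages listener ratings into a MOS and attaches an approximate 95% confidence interval. The function name `mean_opinion_score`, the sample ratings, and the normal-approximation interval (z = 1.96) are illustrative assumptions, not the procedure prescribed by the ITU recommendations, which should be consulted for the exact statistical analysis.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (mos, half_width) for ratings on a 1-5 scale.

    half_width is the half-width of an approximate 95% confidence
    interval based on a normal approximation (an assumption made for
    this sketch; small panels may call for a t-distribution instead).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")

    mos = statistics.fmean(ratings)        # arithmetic mean of all ratings
    std = statistics.stdev(ratings)        # sample standard deviation
    half_width = z * std / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for a single recording.
ratings = [4, 5, 3, 4, 4, 2, 5, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

A wide interval signals that listeners disagree or that the panel is too small, which is one reason a single listener's rating is not treated as a quality value on its own.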

Fig. 1 An example of a degraded speech signal and its reference
