
Artificial bandwidth extension of narrowband speech - enhanced speech quality and intelligibility in mobile devices

by Laura Laaksonen




Institution: Aalto University
Department:
Year: 2013
Keywords: Telecommunications engineering; speech processing; speech enhancement; artificial bandwidth extension; speech quality; mobile devices; mobile phones
Record ID: 1130619
Full text PDF: https://aaltodoc.aalto.fi/handle/123456789/9015


Abstract

Even today, most telephone users are offered only narrowband speech transmission. The limited frequency band from 300 Hz to 3400 Hz reduces both the quality and the intelligibility of speech, because the missing high-frequency components carry important cues, especially for consonant sounds. Particularly in mobile communications, which often take place in noisy environments, degraded speech intelligibility results in listener fatigue and difficulty in recognizing the speaker. The deployment of wideband (50–7000 Hz) and superwideband (50–14000 Hz) speech transmission is ongoing, but current narrowband speech coding will coexist with the new technologies for years to come. In this thesis, a speech enhancement method called artificial bandwidth extension (ABE) for narrowband speech is studied. ABE methods aim to improve the quality and intelligibility of narrowband speech by regenerating the missing high-frequency content of the speech signal, typically in the frequency range 4 kHz–8 kHz. Since the enhancement is achieved without any additional transmitted information, the algorithm can be implemented at the receiving end of a communication link, for example in a mobile device after the speech signal has been decoded. This thesis presents algorithms for artificially extending the speech bandwidth. The methods are primarily designed for monaural speech signals, but the extension of binaural speech signals is also addressed. The algorithms are designed to incur reasonable computational cost, memory consumption, and algorithmic delay for mobile communications. These and other implementation issues related to mobile devices are addressed. The performance of the methods has been evaluated in several subjective tests, including listening-opinion tests in several languages, intelligibility tests, and conversational tests. Most evaluations were carried out with coded speech to provide realistic results.
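To make the idea of regenerating the 4 kHz–8 kHz band at the receiving end concrete, the following is a minimal sketch of spectral folding, a classic textbook baseline for bandwidth extension. It is not the algorithm developed in the thesis, which the abstract does not describe. The function name `spectral_folding_abe`, the parameter `hb_gain`, and the assumption of 8 kHz input extended to 16 kHz are all illustrative choices.

```python
import numpy as np

def spectral_folding_abe(nb, hb_gain=0.3):
    """Extend 8 kHz narrowband speech to 16 kHz by spectral folding.

    Illustrative baseline only; names and the fixed gain are assumptions.
    """
    nb = np.asarray(nb, dtype=float)
    n = len(nb)
    # Zero insertion doubles the sample rate and mirrors the
    # 0-4 kHz spectrum into the 4-8 kHz band.
    up = np.zeros(2 * n)
    up[::2] = nb
    spec = np.fft.rfft(up)
    freqs = np.fft.rfftfreq(2 * n, d=1.0 / 16000)
    low = np.where(freqs < 4000, spec, 0)    # original narrowband content
    high = np.where(freqs >= 4000, spec, 0)  # mirrored (artificial) highband
    # The factor 2 restores the amplitude halved by zero insertion;
    # hb_gain attenuates the artificial highband relative to the lowband.
    out_spec = 2.0 * low + 2.0 * hb_gain * high
    return np.fft.irfft(out_spec, n=2 * n)
```

For example, a 1 kHz tone sampled at 8 kHz comes out with its original component plus an attenuated mirror image at 7 kHz. Practical ABE methods typically shape the highband adaptively from features of the narrowband signal rather than applying a fixed gain.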
The results of the subjective evaluations show that artificial bandwidth extension can improve the quality and intelligibility of narrowband speech signals in mobile communications. Successful product implementations provide further evidence of the reliability of the methods.

Finnish abstract (translated): Most telephone traffic is still narrowband today, meaning that only the 300–3400 Hz frequency band of the speech signal is transmitted. The limited band degrades both the quality and the intelligibility of speech, because high-frequency acoustic cues, which are especially important for consonant sounds, are missing from the signal. Particularly in noisy environments, the poor intelligibility of mobile phone speech signals tires users and makes it harder to recognize the speaker. Although the deployment of wideband (50–7000 Hz) speech transmission has begun, narrowband transmission methods will remain in use alongside the new methods for years. This dissertation studies artificial bandwidth extension of narrowband speech signals. This speech enhancement method aims to improve the quality of the speech sound…