Importance of tonal envelope cues in Chinese speech recognition

 

Authors: Qian-Jie Fu, Fan-Gang Zeng, Robert V. Shannon, Sigfrid D. Soli

 

Journal: The Journal of the Acoustical Society of America (AIP, available online 1998)
Volume/Issue: Volume 104, Issue 1

Pages: 505-510

 

ISSN:0001-4966

 

Year: 1998

 

DOI:10.1121/1.423251

 

Publisher: Acoustical Society of America

 

Data source: AIP

 

Abstract:

Recent studies have shown that temporal waveform envelope cues can provide significant information for English speech recognition. This study investigated the use of temporal envelope cues in a tonal language: Mandarin Chinese. In this study, the speech was divided into several frequency analysis bands; the amplitude envelope was extracted from each band by half-wave rectification and low-pass filtering and was used to modulate a noise of the same bandwidth as the analysis band. These manipulations preserved temporal and amplitude cues in each frequency band, but removed the spectral detail within each band. Chinese vowels, consonants, tones and sentences were identified by 12 native Chinese-speaking listeners with 1, 2, 3, and 4 noise bands. The results showed that the recognition score of vowels, consonants, and sentences increased monotonically with the number of bands, a pattern similar to that observed in English speech recognition. In contrast, tones were consistently recognized at about 80% correct level, independent of the number of bands. This high level of tone recognition produced a significant difference in the open-set sentence recognition between Chinese (11.0%) and English (2.9%) for the one-band condition where no spectral information was available. The data also revealed that, with primarily temporal cues, the falling–rising tone (tone 3) and the falling tone (tone 4) were more easily recognized than the flat tone (tone 1) and the rising tone (tone 2). This differential pattern in tone recognition resulted in a similar pattern in word recognition: words having either tone 3 or 4 were more likely to be recognized while words having tone 1 and 2 were not. The quantitative role of tones in Chinese speech recognition was further explored using a power-function model and found to play a significant role in relating phoneme recognition to sentence recognition.
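The signal processing described in the abstract (band splitting, envelope extraction by half-wave rectification and low-pass filtering, then modulation of band-limited noise) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the filter types, band edges, or envelope cutoff, so the brick-wall FFT filters, the 50 Hz cutoff, the band edges, and the function names `fft_bandlimit` and `noise_band_vocoder` below are all assumptions chosen for illustration.

```python
import numpy as np


def fft_bandlimit(x, fs, lo, hi):
    """Keep only FFT bins in [lo, hi) Hz (brick-wall filter; an assumed filter type)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def noise_band_vocoder(speech, fs, band_edges, env_cutoff=50.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise.

    band_edges: ascending frequencies in Hz; N+1 edges define N bands (hypothetical values).
    env_cutoff: low-pass cutoff for the envelope in Hz (assumed, not from the abstract).
    """
    speech = np.asarray(speech, dtype=float)
    rng = np.random.default_rng(seed)
    out = np.zeros(len(speech))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = fft_bandlimit(speech, fs, lo, hi)            # analysis band
        rectified = np.maximum(band, 0.0)                   # half-wave rectification
        envelope = fft_bandlimit(rectified, fs, 0.0, env_cutoff)  # low-pass filtering
        envelope = np.maximum(envelope, 0.0)                # clip small negative ripple
        noise = rng.standard_normal(len(speech))
        carrier = fft_bandlimit(noise, fs, lo, hi)          # noise with the band's bandwidth
        out += envelope * carrier                           # envelope modulates the noise
    return out


# Usage on a synthetic stand-in for speech: a 500 Hz tone with a 4 Hz amplitude modulation.
fs = 8000
t = np.arange(fs) / fs
modulator = 1.0 + np.sin(2 * np.pi * 4 * t)
signal = modulator * np.sin(2 * np.pi * 500 * t)
one_band = noise_band_vocoder(signal, fs, [100.0, 4000.0])
four_band = noise_band_vocoder(signal, fs, [100.0, 800.0, 1500.0, 2500.0, 4000.0])
```

With a single band, the output keeps the slow amplitude fluctuations of the input but carries no spectral detail, which corresponds to the one-band condition described in the abstract. Adding band edges restores coarse spectral information one band at a time.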

 
