Lexical stress and its application in large vocabulary speech recognition

 

Authors: Ann Marie Aull, Victor W. Zue

 

Journal: The Journal of the Acoustical Society of America (AIP, available online 1984)
Volume/Issue: Volume 76, Issue S1

Pages: 47-47

 

ISSN: 0001-4966

 

Year: 1984

 

DOI: 10.1121/1.2021878

 

Publisher: Acoustical Society of America

 

Data source: AIP

 

Abstract:

Recent research (Huttenlocher and Zue, ICASSP 84) indicates that segmental information for isolated words provides strong constraints for lexical access. Furthermore, the results suggest that the lexical constraints provided by segments around stressed syllables are stronger than those around unstressed syllables. The present study focuses on two related issues. First, we investigate the amount of lexical constraint provided by stress information alone. Second, we implement a system that derives the stress information from the acoustic signal. In order to determine the lexical constraints provided by stress information, the polysyllabic words in the Merriam pocket dictionary are mapped into their corresponding stress patterns. The results indicate that, from stress information alone, the largest class size constitutes 28% of the lexicon. An overall expected class size of 15% illustrates the constraining power of the stress information. In order to exploit these findings, we develop a system that determines the stress pattern of isolated words and performs subsequent lexical access. The system initially segments the speech signal into broad phonetic classes. From these segments, syllable nuclei are determined. Next, known acoustic correlates of stress, such as duration, energy, and fundamental frequency, are extracted for each syllable. The stress pattern is established through a relative comparison of the syllable feature vectors. Finally, lexical access based on the derived stress pattern provides a list of word candidates. Phonological rules are incorporated to account for variations in the number of syllables from the lexical base forms. The system is evaluated on a database of 1500 isolated words, spoken by eight speakers, with varying degrees of difficulty for syllabification. Preliminary evaluation suggests that the majority of errors can be attributed to initial segmentation and syllabification. Fewer than 5% are due to the stress algorithm.
[Work supported by the Office of Naval Research under contract N00014‐82‐K‐0727 and by the System Development Foundation.]
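
The pipeline described in the abstract (stress pattern from a relative comparison of syllable features, then lexical access by pattern, plus the class-size statistics) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not give the comparison rule or the exact definition of expected class size, so the following are assumptions: each feature is normalized by its maximum within the word and the syllable with the highest summed score is marked stressed; expected class size is taken as the sum of squared class sizes divided by the squared lexicon size. The function names, the '1'/'0' pattern encoding, and the toy lexicon are hypothetical.

```python
from collections import Counter


def stress_pattern(syllables):
    """Return a pattern string with '1' for the most prominent syllable.

    syllables: list of (duration, energy, f0) tuples, one per syllable
    nucleus, all values positive. ASSUMPTION: features are normalized by
    their within-word maximum and summed with equal weight; the abstract
    only says a "relative comparison" is used.
    """
    n_features = len(syllables[0])
    maxima = [max(s[i] for s in syllables) for i in range(n_features)]
    scores = [sum(s[i] / maxima[i] for i in range(n_features)) for s in syllables]
    stressed = scores.index(max(scores))
    return "".join("1" if i == stressed else "0" for i in range(len(syllables)))


def lexical_access(pattern, lexicon):
    """Return the words whose stored stress pattern matches `pattern`."""
    return sorted(word for word, p in lexicon.items() if p == pattern)


def class_size_stats(lexicon):
    """Return (largest class fraction, expected class fraction).

    ASSUMPTION: expected class size is the class size seen by a word drawn
    uniformly from the lexicon, i.e. sum(size**2) / N**2 as a fraction.
    """
    counts = Counter(lexicon.values())
    n = len(lexicon)
    largest = max(counts.values()) / n
    expected = sum(c * c for c in counts.values()) / (n * n)
    return largest, expected


# Hypothetical toy lexicon: word -> stress pattern (1 = stressed syllable).
toy_lexicon = {"table": "10", "water": "10", "about": "01", "banana": "010"}

# Hypothetical (duration in s, energy, f0 in Hz) measurements for three syllables.
features = [(0.08, 0.4, 110.0), (0.20, 1.0, 140.0), (0.10, 0.5, 115.0)]

pattern = stress_pattern(features)
candidates = lexical_access(pattern, toy_lexicon)
largest, expected = class_size_stats(toy_lexicon)
print(pattern, candidates, largest, expected)
```

On the toy lexicon the largest class holds half the words and the expected class fraction is 0.375; the 28% and 15% figures in the abstract come from the authors' dictionary and are not reproduced by this sketch.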

 
