Decoding the speech code—Applications of temporal decomposition
Authors:
Stephen M. Marcus,
Bishnu S. Atal
Journal:
The Journal of the Acoustical Society of America
(AIP Available online 1986)
Volume/Issue:
Volume 80,
issue S1
Pages: 17-17
ISSN:0001-4966
Year: 1986
DOI:10.1121/1.2023680
Publisher: Acoustical Society of America
Data source: AIP
Abstract:
Articulatory phonetics describes speech as a sequence of overlapping articulatory gestures, each of which may be associated with a characteristic ideal target spectrum. In normal speech, the idealized target gestures for each speech sound are often not attained, and the speech signal exhibits only transitions between such (implicit) targets. It has been suggested that the underlying speech sounds can only be recovered by reference to detailed knowledge of the gestures by which individual speech sounds are produced. It will be shown that it is possible to decompose the speech signal into overlapping "temporal transition functions" using techniques which make no assumptions about the phonetic structure of the signal or the articulatory constraints used in speech production. Previous work has shown that these techniques can produce a large reduction in the information rate needed to represent the spectral information in speech signals [B. S. Atal, Proc. ICASSP 83, 2.6, 81–84 (1983)]. It will be shown that these methods are able to derive speech components of low bandwidth that vary on a time scale closely related to traditional phonetic events. Implications for perception and the application of such techniques, both for speech coding and as a possible front end for speech recognition, will be discussed.
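The idea of overlapping transition functions in the abstract can be pictured with a small sketch. It assumes the linear model usually associated with the cited Atal (1983) paper, in which each spectral parameter trajectory is a weighted sum of overlapping functions, y_i(n) = sum over k of a_ik * phi_k(n). The raised-cosine shape, the event centers, and the two-dimensional target vectors below are all hypothetical and chosen only for illustration. The sketch synthesizes trajectories from given targets; it does not implement the authors' analysis procedure, which has to estimate the functions from the signal itself.

```python
import math

def transition_function(center, half_width, n_frames):
    """Raised-cosine bump centered at `center`, zero at and beyond +/- half_width.

    The shape is an illustrative assumption, not taken from the source.
    """
    phi = []
    for n in range(n_frames):
        d = abs(n - center)
        if d < half_width:
            phi.append(0.5 * (1.0 + math.cos(math.pi * d / half_width)))
        else:
            phi.append(0.0)
    return phi

def synthesize(targets, phis):
    """Return y with y[i][n] = sum_k targets[k][i] * phis[k][n]."""
    n_params = len(targets[0])
    n_frames = len(phis[0])
    n_events = len(targets)
    return [
        [sum(targets[k][i] * phis[k][n] for k in range(n_events))
         for n in range(n_frames)]
        for i in range(n_params)
    ]

# Three hypothetical events spaced 10 frames apart. Each function is
# nonzero over roughly 20 frames, so neighboring functions overlap.
centers = [10, 20, 30]
phis = [transition_function(c, 10, 41) for c in centers]

# Made-up two-dimensional target vectors, one per event.
targets = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

trajectory = synthesize(targets, phis)
```

In this toy setup the frames between two event centers contain a blend of the neighboring targets, which is the sense in which the signal shows transitions rather than the targets themselves. Unlike real speech, the toy trajectory does reach each target exactly at its event center.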