Feature extraction, segmentation, and labeling in the Harpy and Hearsay‐II systems
Authors:
H. G. Goldberg,
R. Reddy,
Journal:
The Journal of the Acoustical Society of America
(AIP Available online 1976)
Volume/Issue:
Volume 60,
issue S1
Pages: 11-11
ISSN:0001-4966
Year: 1976
DOI:10.1121/1.2003140
Publisher: Acoustical Society of America
Data source: AIP
Abstract:
Goldberg [J. Acoust. Soc. Am. 59, S97(A) (1976)] has shown that uniform techniques for segmentation and labeling can provide the initial signal‐to‐symbol transformation for speech recognition systems with reasonable accuracy and efficiency. Furthermore, the choice of parametric representation was not found to be critical among the most commonly accepted representations. However, for efficiency, the computationally simplest techniques should be used to segment the utterance before more accurate (and expensive) spectral representations are used for labeling [R. Reddy, J. Acoust. Soc. Am. 42, 329–47 (1967)]. To provide an initial symbolic input for both the Harpy and Hearsay‐II systems, a hierarchical, feature‐extraction‐based segmenter using the ZAPDASH parameters has been developed. After segmentation, labeling is done by a modified LPC minimum distance [F. Itakura, IEEE Trans. ASSP‐23, 67–72 (1975)]. Labeling proceeds by comparing the midpoint of each segment with stored templates (acquired by an iterative learning process from a speaker‐specific training corpus), with the comparison adjusted by weights according to features obtained from the segmenter. The use of the highly efficient segmentation procedures and parameters provides approximately a factor of 5 speedup over the uniform techniques that were previously used with both Harpy and Hearsay‐II. [Research supported by the Defense Advanced Projects Agency.]
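
The labeling step described in the abstract (take the midpoint frame of each segment, compare it with stored templates, and adjust the comparison with segmenter-derived weights) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: a plain squared Euclidean distance stands in for the modified LPC minimum-distance measure, and the names `label_segments`, `squared_distance`, the multiplicative form of the weights, and the toy two-dimensional frames and labels are all assumptions introduced for the example.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]


def squared_distance(a: Frame, b: Frame) -> float:
    """Stand-in metric (assumption); the abstract uses a modified LPC distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def label_segments(
    frames: List[Frame],
    segments: List[Tuple[int, int]],
    templates: Dict[str, Frame],
    weights: List[Dict[str, float]],
) -> List[str]:
    """Label each segment by its midpoint frame's closest weighted template.

    segments: (start, end) frame indices, end exclusive.
    weights:  one dict per segment mapping a label to a multiplicative
              factor on its distance (hypothetical form; missing labels
              default to 1.0, i.e. no adjustment).
    """
    labels = []
    for (start, end), seg_weights in zip(segments, weights):
        mid = frames[(start + end) // 2]  # only the midpoint frame is compared
        best = min(
            templates,
            key=lambda name: seg_weights.get(name, 1.0)
            * squared_distance(mid, templates[name]),
        )
        labels.append(best)
    return labels


# Toy data: six 2-D frames split into two segments by a (hypothetical) segmenter.
frames = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [1.0, 0.9], [1.0, 1.0], [0.9, 1.0]]
segments = [(0, 3), (3, 6)]
templates = {"SIL": [0.0, 0.0], "AA": [1.0, 1.0]}
print(label_segments(frames, segments, templates, [{}, {}]))  # ['SIL', 'AA']
```

Because only one frame per segment is compared against the templates, the cost of the expensive distance grows with the number of segments rather than the number of frames, which is consistent with the efficiency argument made in the abstract.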