Segmentation Scheme for Use of a Speech Recognition Computer Program
Authors:
James W. Forgie,
Carma D. Forgie
Journal:
The Journal of the Acoustical Society of America
(AIP Available online 1960)
Volume and issue:
Volume 32,
Issue 11
Page: 1516
ISSN:0001-4966
Year: 1960
DOI:10.1121/1.1936353
Publisher: Acoustical Society of America
Data source: AIP
Abstract:
A segmentation scheme is being developed to operate as a first step in a general speech recognition program. Input to the segmenter is sampled, quantized spectral data (35 frequency channels scanned every 5.5 msec). The output is a sequence of boundary markers together with a rough classification of the enclosed segments. The segments are generally phonemic, or smaller, in size, and the classification corresponds either to type of phoneme (e.g., fricative, nasal, vowel) or to sound characteristic (e.g., silence, aspiration). The segmenter operates by first classifying each 5.5-msec scan of the spectral data. This classification is obtained by computing a number of measurements of different attributes of the spectral pattern and combining the results with suitable weighting functions. The number of such measurements is made larger than the logical minimum to avoid placing too much dependence on any one measurement. Segments are then built by grouping together scans that have the same individual classifications. Finally, these primitive segments are subjected to an editing process that modifies and merges certain of the primitive segments according to rules based on length, amplitude, the classification of neighboring segments, etc. The effect of the editing process is to ensure that the output segment sequence does not violate any simple articulatory rules for the language. (Lincoln Laboratory is operated with support from the U.S. Army, Navy, and Air Force.)
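The three stages named in the abstract (per-scan classification, grouping of like scans, and rule-based editing) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the original work: the three measurements, the weights in `WEIGHTS`, the `SILENCE_ENERGY` threshold, the channel band boundaries, and the single length-based editing rule. Only the 35-channel scan width and the 5.5 msec scan period come from the abstract.

```python
from itertools import groupby

SCAN_MS = 5.5          # scan period stated in the abstract
N_CHANNELS = 35        # channel count stated in the abstract
SILENCE_ENERGY = 5.0   # hypothetical threshold, not from the source

# Hypothetical weights combining measurements into a score per class.
WEIGHTS = {
    "fricative": {"high": 1.0, "low": -1.0},
    "vowel": {"high": -1.0, "low": 1.0},
}


def measure(scan):
    """Compute illustrative measurements on one 35-channel scan."""
    total = sum(scan)
    if total == 0:
        return {"energy": 0.0, "high": 0.0, "low": 0.0}
    return {
        "energy": float(total),
        "high": sum(scan[20:]) / total,  # share of energy in upper channels
        "low": sum(scan[:8]) / total,    # share of energy in lower channels
    }


def classify_scan(scan):
    """Label one scan by a weighted combination of its measurements."""
    m = measure(scan)
    if m["energy"] < SILENCE_ENERGY:
        return "silence"
    scores = {
        label: sum(weight * m[name] for name, weight in w.items())
        for label, w in WEIGHTS.items()
    }
    return max(scores, key=scores.get)


def build_segments(labels):
    """Group consecutive scans with the same label into primitive segments.

    Each segment is a tuple (label, start_scan, length_in_scans).
    """
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, length))
        start += length
    return segments


def edit_segments(segments, min_scans=3):
    """Apply one hypothetical editing rule based on length.

    A segment shorter than min_scans is absorbed into the preceding
    segment, and adjacent segments that end up with the same label are
    merged into one.
    """
    edited = []
    for label, start, length in segments:
        if edited and (length < min_scans or edited[-1][0] == label):
            prev_label, prev_start, prev_len = edited[-1]
            edited[-1] = (prev_label, prev_start, prev_len + length)
        else:
            edited.append((label, start, length))
    return edited


def segment(scans, min_scans=3):
    """Run the full sketch: classify, group, then edit."""
    labels = [classify_scan(scan) for scan in scans]
    return edit_segments(build_segments(labels), min_scans)
```

Calling `segment(scans)` on a list of 35-element scans returns `(label, start_scan, length_in_scans)` tuples; multiplying a start index by `SCAN_MS` gives the boundary time in milliseconds. The editing rules described in the abstract also consider amplitude and neighboring classifications, which this sketch leaves out.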