Comparisons of Some Statistical Distance Measures for Talker Identification
Authors:
M. H. Becker,
R. Gnanadesikan,
M. V. Mathews,
R. S. Pinkham,
S. Pruzansky,
M. B. Wilk,
Journal:
The Journal of the Acoustical Society of America
(AIP Available online 1964)
Volume/Issue:
Volume 36,
issue 10
Pages: 1988-1988
ISSN:0001-4966
Year: 1964
DOI:10.1121/1.1939195
Publisher: Acoustical Society of America
Data source: AIP
Abstract:
Preliminary results are given on a comparative study of various objective talker‐recognition procedures, based on spectrographic analysis of 7 replicate utterances of each of 10 words by each of 10 different speakers. The spectrograms are quantized into 17 frequency channels and approximately 50 time channels. Different summarizations are applied to the spectrograms, including marginal energies, totalled across time, in each frequency channel; marginal energies for each time channel; and momentlike descriptions of energy distribution of the time margin. Various combinations of these summarizations were used as inputs to different multivariate distance measures, including (a) distance from unknown to a speaker centroid, using a metric based on a covariance matrix pooled over all speakers; (b) distances based on eigenvectors, using a classical discriminant‐analysis approach; (c) distances based on metrics, employing individual speaker covariance matrices. Percent correct identification varied from 22% (discriminant analysis, using one eigenvector of energy margin on time) to 97% [distance (a) applied to the frequency margins]. Frequency classification of energy is better than time classification; distance (a) is better than the others; certain words are much better than others.
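Illustration (not part of the original abstract): distance (a) can be read as a Mahalanobis-type distance from an unknown utterance to each speaker's centroid, with a single covariance matrix shared by all speakers. The abstract gives no formulas, so the sketch below is one plausible reading rather than the authors' procedure. It assumes that "pooled" means the within-speaker covariance averaged over speakers, that the feature vector is the 17-channel frequency margin, and that the data are synthetic; the function names `frequency_margin`, `pooled_covariance`, and `identify` are hypothetical.

```python
import numpy as np


def frequency_margin(spectrogram):
    """Sum energy across time, leaving one value per frequency channel.

    Assumes the spectrogram is laid out as (frequency channels, time channels).
    """
    return np.asarray(spectrogram, dtype=float).sum(axis=1)


def pooled_covariance(features, labels):
    """Return per-speaker centroids and the pooled within-speaker covariance.

    Assumption: "pooled over all speakers" is taken to mean the summed
    within-speaker scatter divided by (n_samples - n_speakers).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    speakers = np.unique(labels).tolist()
    dim = features.shape[1]
    scatter = np.zeros((dim, dim))
    centroids = {}
    for speaker in speakers:
        rows = features[labels == speaker]
        centroid = rows.mean(axis=0)
        centroids[speaker] = centroid
        deviations = rows - centroid
        scatter += deviations.T @ deviations
    dof = features.shape[0] - len(speakers)
    return centroids, scatter / dof


def identify(unknown, centroids, cov):
    """Pick the speaker whose centroid is nearest under the pooled metric."""
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse guards against singular cov
    best_speaker, best_distance = None, np.inf
    for speaker, centroid in centroids.items():
        diff = np.asarray(unknown, dtype=float) - centroid
        distance = float(diff @ inv_cov @ diff)
        if distance < best_distance:
            best_speaker, best_distance = speaker, distance
    return best_speaker


if __name__ == "__main__":
    # Synthetic stand-in for the study design: 10 speakers, 7 replicate
    # utterances of one word, spectrograms of 17 frequency x 50 time channels.
    rng = np.random.default_rng(0)
    n_speakers, n_reps, n_freq, n_time = 10, 7, 17, 50
    profiles = rng.normal(size=(n_speakers, n_freq))  # made-up speaker spectra

    features, labels = [], []
    for speaker in range(n_speakers):
        for _ in range(n_reps):
            spec = profiles[speaker][:, None] + rng.normal(size=(n_freq, n_time))
            features.append(frequency_margin(spec))
            labels.append(speaker)

    centroids, cov = pooled_covariance(features, labels)

    # A fresh utterance from speaker 3 as the "unknown".
    test_spec = profiles[3][:, None] + rng.normal(size=(n_freq, n_time))
    print("predicted speaker:", identify(frequency_margin(test_spec), centroids, cov))
```

Distance (c) in the abstract would correspond to swapping the shared `cov` for a separate covariance matrix per speaker. With only 7 replicates per speaker and 17 channels, such per-speaker matrices would be singular, which may help explain why the pooled metric performed better, though the abstract does not say so.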