Speaker normalization of static and dynamic vowel spectral features

Authors: Stephen A. Zahorian, Amir J. Jagharghi

Journal: The Journal of the Acoustical Society of America (AIP, available online 1991)
Volume/Issue: Volume 90, Issue 1

Pages: 67-75

ISSN: 0001-4966

Year: 1991

DOI: 10.1121/1.402350

Publisher: Acoustical Society of America

Data source: AIP

Abstract:

Two methods are described for speaker normalizing vowel spectral features: one is a multivariable linear transformation of the features and the other is a polynomial warping of the frequency scale. Both normalization algorithms minimize the mean-square error between the transformed data of each speaker and vowel target values obtained from a "typical speaker." These normalization techniques were evaluated both for formants and a form of cepstral coefficients (DCTCs) as spectral parameters, for both static and dynamic features, and with and without fundamental frequency (F0) as an additional feature. The normalizations were tested with a series of automatic classification experiments for vowels. For all conditions, automatic vowel classification rates increased for speaker-normalized data compared to rates obtained for nonnormalized parameters. Typical classification rates for vowel test data for nonnormalized and normalized features, respectively, are as follows: static formants, 69%/79%; formant trajectories, 76%/84%; static DCTCs, 75%/84%; DCTC trajectories, 84%/91%. The linear transformation methods increased the classification rates slightly more than the polynomial frequency warping. The addition of F0 improved the automatic recognition results for nonnormalized vowel spectral features by as much as 5.8%. However, the addition of F0 to speaker-normalized spectral features resulted in much smaller increases in automatic recognition rates.
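
The two normalizations named in the abstract can be illustrated with a minimal least-squares sketch. This is not the authors' implementation: the function names, the affine (bias) term, the polynomial degree, and the formant-like numbers are all assumptions made for illustration, and the data are synthetic. The warp is applied directly to formant frequencies here; for DCTCs the warp would presumably have to act on the frequency axis before the coefficients are computed, which this sketch does not attempt.

```python
import numpy as np


def fit_linear_normalizer(features, targets):
    """Least-squares affine map from one speaker's features to target values.

    features: (n_tokens, n_features) array for a single speaker.
    targets:  (n_tokens, n_features) array holding, for each token, the
              "typical speaker" target of that token's vowel.
    Returns W with shape (n_features + 1, n_features); the last row is the offset.
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W


def apply_linear_normalizer(features, W):
    """Apply the affine map fitted by fit_linear_normalizer."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ W


def fit_frequency_warp(freqs_hz, target_freqs_hz, degree=2):
    """Least-squares polynomial warp of the frequency scale, f -> p(f).

    All frequencies are pooled, so one warp is shared by every formant.
    """
    return np.polyfit(np.ravel(freqs_hz), np.ravel(target_freqs_hz), degree)


def apply_frequency_warp(freqs_hz, coeffs):
    """Evaluate the fitted warp polynomial at the given frequencies."""
    return np.polyval(coeffs, freqs_hz)


def mse(a, b):
    """Mean-square error between two arrays of equal shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))


# Synthetic (F1, F2) targets in Hz for four vowel tokens (illustrative only).
targets = np.array([[300.0, 2300.0],
                    [700.0, 1200.0],
                    [350.0, 900.0],
                    [500.0, 1500.0]])

# A hypothetical speaker whose formants are scaled and shifted from the targets.
speaker = targets * 1.15 + np.array([20.0, -30.0])

W = fit_linear_normalizer(speaker, targets)
linear_normalized = apply_linear_normalizer(speaker, W)

coeffs = fit_frequency_warp(speaker, targets, degree=2)
warp_normalized = apply_frequency_warp(speaker, coeffs)

print("MSE, nonnormalized:     ", mse(speaker, targets))
print("MSE, linear transform:  ", mse(linear_normalized, targets))
print("MSE, frequency warping: ", mse(warp_normalized, targets))
```

In this toy setup the linear transformation has more free parameters than the single shared warp, so it can fit the targets at least as closely on the data it was fitted to. That says nothing about held-out classification rates, which are what the abstract reports.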
