Speaker normalization of static and dynamic vowel spectral features

 

Authors: Stephen A. Zahorian, Amir J. Jagharghi

 

Journal: The Journal of the Acoustical Society of America (AIP, available online 1991)
Volume/Issue: Volume 90, Issue 1

Pages: 67-75

 

ISSN: 0001-4966

 

Year: 1991

 

DOI: 10.1121/1.402350

 

Publisher: Acoustical Society of America

 

Data source: AIP

 

Abstract:

Two methods are described for speaker normalizing vowel spectral features: one is a multivariable linear transformation of the features and the other is a polynomial warping of the frequency scale. Both normalization algorithms minimize the mean-square error between the transformed data of each speaker and vowel target values obtained from a "typical speaker." These normalization techniques were evaluated both for formants and a form of cepstral coefficients (DCTCs) as spectral parameters, for both static and dynamic features, and with and without fundamental frequency (F0) as an additional feature. The normalizations were tested with a series of automatic classification experiments for vowels. For all conditions, automatic vowel classification rates increased for speaker-normalized data compared to rates obtained for nonnormalized parameters. Typical classification rates for vowel test data for nonnormalized and normalized features respectively are as follows: static formants: 69%/79%; formant trajectories: 76%/84%; static DCTCs: 75%/84%; DCTC trajectories: 84%/91%. The linear transformation methods increased the classification rates slightly more than the polynomial frequency warping. The addition of F0 improved the automatic recognition results for nonnormalized vowel spectral features as much as 5.8%. However, the addition of F0 to speaker-normalized spectral features resulted in much smaller increases in automatic recognition rates.
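
The first method in the abstract, a linear transformation fitted per speaker by minimizing mean-square error against vowel targets, can be illustrated with an ordinary least-squares fit. The sketch below is not the authors' implementation: the affine (bias) term, the function names `fit_speaker_transform` and `apply_speaker_transform`, and the synthetic two-dimensional "formant" values are all assumptions made for illustration, since the abstract does not give the exact formulation.

```python
import numpy as np


def fit_speaker_transform(features, targets):
    """Fit an affine map (assumed form) from one speaker's features to targets.

    features: (n_tokens, n_features) array for a single speaker.
    targets:  (n_tokens, n_features) array; row i is the "typical speaker"
              target for the vowel category of token i.
    Returns W of shape (n_features + 1, n_features); the last row is the bias.
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    # Ordinary least squares minimizes the mean-square error ||X W - targets||^2.
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W


def apply_speaker_transform(features, W):
    """Apply a fitted affine map to a speaker's features."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ W


# Synthetic illustration only: made-up F1/F2 targets (Hz) for three vowels.
rng = np.random.default_rng(0)
vowel_targets = np.array([[300.0, 2300.0], [700.0, 1200.0], [350.0, 900.0]])
labels = np.repeat(np.arange(3), 20)
targets = vowel_targets[labels]

# Simulate a speaker whose features are a scaled, shifted, noisy copy of the targets.
A = np.array([[1.15, 0.02], [0.0, 1.10]])
b = np.array([20.0, -50.0])
features = targets @ A + b + rng.normal(scale=20.0, size=targets.shape)

W = fit_speaker_transform(features, targets)
normalized = apply_speaker_transform(features, W)

mse_before = np.mean((features - targets) ** 2)
mse_after = np.mean((normalized - targets) ** 2)
print(f"MSE before: {mse_before:.1f}, after: {mse_after:.1f}")
```

Because the identity map is one of the candidate solutions, the fitted transform cannot increase the mean-square error on the data it was fitted to; whether it helps classification of held-out tokens is what the paper's experiments assess.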

 
