Visual representations of speech—A computer model based on correlation

 

Authors: Malcolm Slaney, Richard F. Lyon

 

Journal: The Journal of the Acoustical Society of America (AIP, available online 1990)
Volume/Issue: Volume 88, Issue S1

Pages: 23-23

 

ISSN: 0001-4966

 

Year: 1990

 

DOI: 10.1121/1.2028916

 

Publisher: Acoustical Society of America

 

Data source: AIP

 

Abstract:

The use of the cochleagram and the correlogram in speech and sound recognition is discussed. The cochleagram represents a sound as a pattern of neural firing probabilities at places along the basilar membrane versus time. It is roughly analogous to the spectrogram, and its benefits have been described in several papers. But using the cochleagram as a basis for speech recognition is only a weak way to use knowledge of human auditory processing. A richer representation of speech and sound is called the correlogram. Sound is represented as a two‐dimensional picture versus time, with the extra dimension allowing several interesting perceptual experiences to be modeled. Assembling correlograms into a movie and synchronizing them to sound allow the auditory and visual percepts to be compared. The correlogram is more useful than a cochleagram (or spectrogram) because it shows an orthogonal dimension that represents the fine‐time structure and pitch in the auditory signal. The extra dimension provides the information necessary for auditory grouping and speaker separation. This talk will emphasize the advantages of a two‐dimensional representation of sound and describe several auditory maps that might be used by the brain to do auditory scene analysis. Videotapes will be shown to demonstrate the advantages of the correlogram representation.
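
The two-dimensional picture described in the abstract can be illustrated with a minimal sketch. It assumes that one correlogram frame is the short-time autocorrelation of each cochlear channel (place along one axis, lag along the other), which the abstract itself does not spell out. The function names, the Hann window, the frame width, and the toy half-wave rectified sinusoids standing in for cochleagram channels are all illustrative assumptions, not the authors' model.

```python
import numpy as np

def correlogram_frame(channels, start, width, max_lag):
    """Short-time autocorrelation of every channel for one frame.

    channels: array of shape (n_channels, n_samples), e.g. cochleagram-like
              firing probabilities (hypothetical input, not the authors' model).
    Returns an array of shape (n_channels, max_lag): place versus lag.
    """
    frame = channels[:, start:start + width] * np.hanning(width)
    n_fft = 2 * width  # zero-pad so the circular correlation equals the linear one
    spectrum = np.fft.rfft(frame, n=n_fft, axis=1)
    acf = np.fft.irfft(np.abs(spectrum) ** 2, n=n_fft, axis=1)
    return acf[:, :max_lag]

def summary_pitch_lag(frame, min_lag):
    """Sum the frame over channels and return the lag of the largest peak."""
    summary = frame.sum(axis=0)
    return min_lag + int(np.argmax(summary[min_lag:]))

# Toy stand-in for a cochleagram: half-wave rectified harmonics of 100 Hz,
# with the 100 Hz fundamental itself absent from every channel.
fs = 16000
t = np.arange(4000) / fs
harmonics = [200.0, 300.0, 400.0, 500.0]
channels = np.array([np.maximum(np.sin(2 * np.pi * f * t), 0.0) for f in harmonics])

frame = correlogram_frame(channels, start=0, width=1024, max_lag=320)
lag = summary_pitch_lag(frame, min_lag=32)
print(frame.shape, "estimated pitch (Hz):", fs / lag)
```

In this toy case every channel has a peak at the common period of the harmonics, so the lag axis exposes a shared pitch even though no channel carries energy at the fundamental. That is the kind of fine-time information a cochleagram or spectrogram alone does not display. Computing one such frame per time step would give the sequence of pictures that the abstract describes assembling into a movie.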

 
