Complementarity and synergy in bimodal speech: Auditory, visual, and audio-visual identification of French oral vowels in noise

Authors: Jordi Robert-Ribes, Jean-Luc Schwartz, Tahar Lallouache, Pierre Escudier

Journal: The Journal of the Acoustical Society of America (AIP, available online 1998)
Volume/Issue: Volume 103, Issue 6

Pages: 3677-3689

ISSN: 0001-4966

Year: 1998

DOI: 10.1121/1.423069

Publisher: Acoustical Society of America

Data source: AIP

Abstract:

The efficacy of audio-visual interactions in speech perception comes from two kinds of factors. First, at the information level, there is some “complementarity” of audition and vision: It seems that some speech features, mainly concerned with manner of articulation, are best transmitted by the audio channel, while some other features, mostly describing place of articulation, are best transmitted by the video channel. Second, at the information processing level, there is some “synergy” between audition and vision: The audio-visual global identification scores in a number of different tasks involving acoustic noise are generally greater than both the auditory-alone and the visual-alone scores. However, these two properties have been generally demonstrated until now in rather global terms. In the present work, audio-visual interactions at the feature level are studied for French oral vowels which contrast three series, namely front unrounded, front rounded, and back rounded vowels. A set of experiments on the auditory, visual, and audio-visual identification of vowels embedded in various amounts of noise demonstrate that complementarity and synergy in bimodal speech appear to hold for a bundle of individual phonetic features describing place contrasts in oral vowels. At the information level (complementarity), in the audio channel the height feature is the most robust, backness the second most robust one, and rounding the least, while in the video channel rounding is better than height, and backness is almost invisible. At the information processing (synergy) level, transmitted information scores show that all individual features are better transmitted with the ear and the eye together than with each sensor individually.
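
The abstract does not say how its transmitted information scores are computed. As a rough illustration only, the sketch below assumes the common approach of estimating the mutual information between stimulus and response from a confusion matrix of identification counts, then normalizing by the stimulus entropy. The function names and all counts are hypothetical and are not taken from the paper.

```python
import math


def transmitted_information(confusion):
    """Estimate mutual information (in bits) between stimulus and response.

    `confusion` is a list of rows of counts: rows are stimulus categories,
    columns are response categories. Assumed layout, not the paper's.
    """
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    info = 0.0
    for i, row in enumerate(confusion):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                info += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return info


def relative_transmitted_information(confusion):
    """Transmitted information divided by the stimulus entropy (0 to 1)."""
    total = sum(sum(row) for row in confusion)
    entropy = -sum(
        (r / total) * math.log2(r / total)
        for r in (sum(row) for row in confusion)
        if r > 0
    )
    return transmitted_information(confusion) / entropy if entropy > 0 else 0.0


# Hypothetical counts for one binary feature (e.g. rounded vs. unrounded)
# in three conditions; the numbers are invented for illustration.
audio_only = [[30, 20], [20, 30]]
visual_only = [[45, 5], [5, 45]]
audio_visual = [[48, 2], [2, 48]]

for name, matrix in [("A", audio_only), ("V", visual_only), ("AV", audio_visual)]:
    print(name, round(relative_transmitted_information(matrix), 3))
```

Under this assumed measure, "synergy" for a feature would correspond to the audio-visual score exceeding both single-modality scores, which is how the invented counts above were arranged.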
