|
11. |
A composite model of the auditory periphery for the processing of speech based on the filter response functions of single auditory‐nerve fibers |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 773-786
Rick L. Jenison,
Steven Greenberg,
Keith R. Kluender,
William S. Rhode,
Abstract:
A composite model of the auditory periphery, based upon a unique analysis technique for deriving filter response characteristics from cat auditory‐nerve fibers, is presented. The model is distinctive in its ability to capture a significant broadening of auditory‐nerve fiber frequency selectivity as a function of increasing sound‐pressure level within a computationally tractable time‐invariant structure. The output of the model shows the tonotopic distribution of synchrony activity of single fibers in response to the steady‐state vowel [ε] presented over a 40‐dB range of sound‐pressure levels and is compared with the population‐response data of Young and Sachs (1979). The model, while limited by its time invariance, accurately captures most of the place‐synchrony response patterns reported by the Johns Hopkins group. In both the physiology and in the model, auditory‐nerve fibers spanning a broad tonotopic range synchronize to the first formant (F1), with the proportion of units phase‐locked to F1 increasing appreciably at moderate to high sound‐pressure levels. A smaller proportion of fibers maintain phase locking to the second and third formants across the same intensity range. At sound‐pressure levels of 60 dB and above, the vast majority of fibers with characteristic frequencies greater than 3 kHz synchronize to F1 (512 Hz), rather than to frequencies in the most sensitive portion of their response range. On the basis of these response patterns it is suggested that neural synchrony is the dominant auditory‐nerve representation of formant information under “normal” listening conditions in which speech signals occur across a wide range of intensities and against a background of unpredictable and frequently intense acoustic interference.
ISSN:0001-4966
DOI:10.1121/1.401947
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
|
12. |
Spectral cues to perception of /d, n, l/ by normal‐ and impaired‐hearing listeners |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 787-798
Sally G. Revoile,
James M. Pickett,
Linda Kozma‐Spytek,
Abstract:
The alveolar consonants /d, n, l/ occur frequently in intervocalic position in conversational speech but have received little study for differences in their acoustic cues. Impaired‐ and normal‐hearing listeners were investigated for use of consonant‐segment versus transition‐segment cues to recognition of /d, n, l/ in /1C1/ tokens extracted from sentences. To examine the cues’ contribution to /d, n, l/ recognition, the segments were degraded singly or in combinations in the tokens as follows: [1C] or [C1] transitions were replaced by adjacent pitch periods from the respective vowels; the consonant segments were replaced by silence or by a synthetic consonant approximating the summed low‐frequency spectra of the /d, n, l/ murmurs. The results with normal‐hearing listeners showed that the presence of any one of the three segments, [1C] transition, [C1] transition, or natural consonant segment, supported a moderate to high level of /d, n, l/ recognition, depending on the phoneme. In contrast, the severely hearing‐impaired listeners’ consonant recognition was poor on the basis of transition information, but better in the presence of the natural consonants. The /1C1/’s with the synthetic consonant yielded chance level performance for the hearing‐impaired listeners but good consonant recognition for the normal‐hearing listeners, a further indication that cues in the transitions were quite useful for the normal‐hearing group but not for the hearing‐impaired group.
ISSN:0001-4966
DOI:10.1121/1.401948
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
|
13. |
Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 799-828
T. Baer,
J. C. Gore,
L. C. Gracco,
P. W. Nye,
Abstract:
Magnetic resonance imaging (MRI) techniques were used to gather basic data to apply in computational models of speech articulation. Two experiments were performed. In experiment 1, voice recordings from two male subjects were obtained simultaneously with axial, coronal, or midsagittal MR images of their vocal tracts while they produced the four point vowels. Area functions describing the individual tract shapes were obtained by measurements performed on the MR images. Digital filters derived from these functions were then used to resynthesize the vowel sounds which were compared, both perceptually and acoustically, with the subjects’ original recordings. In experiment 2, axial images of the pharyngeal cavity were collected during the production of an ensemble of nine vowels. Plots of cross‐sectional area versus the midsagittal width of the tract at different locations within the pharynx and for different vowel productions were used to derive a functional relationship between the two variables. Data from experiment 1 relating midsagittal width to cross‐sectional area within the oral cavity were also examined.
ISSN:0001-4966
DOI:10.1121/1.401949
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
|
14. |
Comodulation masking release as a function of level |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 829-835
Brian C. J. Moore,
Michael J. Shailer,
Abstract:
These experiments examine the effects of masker level on the magnitude of comodulation masking release (CMR). In experiment 1, threshold was measured for detecting a 2000‐Hz signal in noise bands 100‐ or 3200 Hz wide, centered at the signal frequency. The noise was either amplitude modulated by a low‐pass‐filtered noise, or was unmodulated. At noise spectrum levels of 30 and 50 dB, thresholds were lower in the 3200‐Hz‐wide modulated noise than in the 100‐Hz‐wide modulated noise or the 3200‐Hz‐wide unmodulated noise, indicating a CMR. The magnitude of this CMR decreased at a noise spectrum level of 10 dB, and was very small at a spectrum level of −10 dB. In experiment 2, threshold was measured for a 700‐Hz signal centered in a 20‐Hz wide band of noise (the on‐frequency band, OFB), both in the presence and absence of eight flanking bands (FBs) whose envelopes were either identical with that of the OFB (correlated condition) or were uncorrelated. Thresholds were lower in the correlated than in the uncorrelated condition, indicating a CMR. When the OFB and the FBs were presented to the same ear, the CMR decreased when the spectrum level of all bands was below 30 dB, or when the spectrum level of the FBs was decreased below 40 dB keeping the level of the OFB constant at 40 dB. When the OFB and the FBs were presented to opposite ears, the CMR decreased when the spectrum level of all bands was decreased below 30 dB or when the spectrum level of the FBs was decreased below 40 dB, keeping the level of the OFB fixed at 40 or 60 dB. However, the CMR was almost independent of the spectrum level of the OFB (over the range 10–70 dB) when the spectrum level of the FBs was held constant at 60 dB. The results are interpreted in terms of perceptual grouping mechanisms. Implications for the measurement of CMR in hearing‐impaired subjects are also discussed.
ISSN:0001-4966
DOI:10.1121/1.401950
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
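The abstract above quotes noise levels as spectrum levels (level per 1-Hz band), so the overall level of each band also depends on its bandwidth. The sketch below applies the standard conversion for a flat-spectrum band, overall level = spectrum level + 10·log10(bandwidth). The function name `overall_level_db` is illustrative, and the flat-spectrum assumption is ours; the abstract does not give overall levels.

```python
import math


def overall_level_db(spectrum_level_db: float, bandwidth_hz: float) -> float:
    """Overall level (dB) of a noise band with a flat spectrum.

    Assumption: the spectrum level is constant across the band, so the
    total power is the per-Hz power multiplied by the bandwidth.
    """
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth_hz must be positive")
    return spectrum_level_db + 10.0 * math.log10(bandwidth_hz)


if __name__ == "__main__":
    # Bandwidths and spectrum levels mentioned for experiment 1.
    for bandwidth in (100, 3200):
        for spectrum_level in (-10, 10, 30, 50):
            total = overall_level_db(spectrum_level, bandwidth)
            print(f"{bandwidth:>5} Hz band at {spectrum_level:>3} dB/Hz: {total:.1f} dB overall")
```

At equal spectrum level the 3200-Hz band carries about 15 dB more overall power than the 100-Hz band, which is worth keeping in mind when comparing thresholds across bandwidths.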
|
15. |
Effect of amplitude modulation on profile detection |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 836-845
Huanping Dai,
David M. Green,
Abstract:
The effect of amplitude modulation on profile detection was examined in three experiments. The observer’s task was to determine in which observation interval an increment was added to the 1000‐Hz target component of a multitone complex in which the components were equally spaced on a logarithmic frequency scale from 200 to 5000 Hz. The target was unmodulated throughout the study. In some conditions, all nontarget components of the standard were modulated in phase; in other conditions, they were modulated with random phase. In experiment 1, the threshold was measured as a function of the modulation rate. The results show that, at low modulation rates, 5 Hz for example, modulation elevates threshold by about 13 dB. The threshold decreases as the modulation rate increases, with the threshold elevation being only 3 dB at 80 Hz. In experiment 2, threshold was measured as a function of modulation depth for both 21‐ and 5‐component complexes. The results show that for a 5‐Hz modulation rate the threshold decreases as the modulation depth decreases, and that the rate of decrease is greater for the 21‐component complex than for the 5‐component complex. In experiment 3, the effects of random‐phase modulation were explored; the phase of the modulation waveform was randomly chosen for each component. The results show that there is no difference between in‐phase and random‐phase modulation when each component occupies a different critical band. If, however, two or more components occupy the same critical band, then randomizing the phase of modulation reduces the effective depth of modulation within that critical band, and the effect of modulation is thereby lessened.
ISSN:0001-4966
DOI:10.1121/1.401951
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
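The multitone complex in the abstract above has components equally spaced on a logarithmic frequency scale from 200 to 5000 Hz. As a rough sketch, assuming both endpoints are included as components (the abstract does not list the individual frequencies), equal log spacing means a constant frequency ratio between neighbours. The helper name `log_spaced_components` is hypothetical.

```python
def log_spaced_components(f_low: float, f_high: float, n: int) -> list[float]:
    """Return n frequencies equally spaced on a log scale, endpoints included.

    Assumption: the lowest and highest components sit exactly at f_low
    and f_high, so adjacent components differ by a constant ratio.
    """
    if n < 2:
        raise ValueError("need at least two components")
    if f_low <= 0 or f_high <= f_low:
        raise ValueError("require 0 < f_low < f_high")
    ratio = (f_high / f_low) ** (1.0 / (n - 1))
    return [f_low * ratio ** k for k in range(n)]


if __name__ == "__main__":
    for n in (5, 21):
        freqs = log_spaced_components(200.0, 5000.0, n)
        print(n, [round(f, 1) for f in freqs])
```

Under this assumption the middle component of both the 5- and 21-component complexes falls at 1000 Hz, the geometric mean of 200 and 5000 Hz, which matches the target frequency named in the abstract.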
|
16. |
Effect of time compression and expansion on the discrimination of tonal patterns |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 846-857
Robert D. Sorkin,
DeMaris A. Montgomery,
Abstract:
This experiment tested how well human listeners can discriminate between temporal patterns that are compressed or expanded in time. The listener’s task was to determine whether two arrhythmic, tonal sequences had the same or different temporal patterns. According to the pattern correlation model [R. D. Sorkin, J. Acoust. Soc. Am. 87, 1695–1701 (1990)], listeners perform this task by computing the correlation between the pattern of time intervals marked by the tones in each sequence. Listener performance dropped when one of the sequences was compressed or expanded in time. In order for the model to describe the observed performance, it was necessary to postulate an internal noise component that was proportional to the magnitude of the difference between the sequence transformations.
ISSN:0001-4966
DOI:10.1121/1.401952
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
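The abstract above describes listeners as correlating the time intervals marked by the tones in each sequence. The sketch below is only an illustration of that idea using a plain Pearson correlation between inter-onset intervals; it is not the published model, and the onset times are made-up values. It shows one property of a plain Pearson correlation: a uniformly compressed copy of a pattern still correlates perfectly with the original.

```python
import math


def intervals(onsets_ms: list[float]) -> list[float]:
    """Inter-onset intervals between successive tones."""
    return [b - a for a, b in zip(onsets_ms, onsets_ms[1:])]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant sequences")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical arrhythmic onset times in milliseconds.
    standard = [0.0, 120.0, 200.0, 390.0, 450.0, 700.0]
    compressed = [0.8 * t for t in standard]  # same pattern, 20% faster
    different = [0.0, 60.0, 310.0, 430.0, 510.0, 700.0]  # reordered intervals
    base = intervals(standard)
    print("same pattern, compressed:", pearson(base, intervals(compressed)))
    print("different pattern:", pearson(base, intervals(different)))
```

Because uniform scaling leaves this correlation unchanged, a noise-free correlator of this kind would not by itself predict worse performance for compressed or expanded sequences; the abstract reports that an internal noise term was needed for the model to describe the data.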
|
17. |
Temporal integration and multiple looks |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 858-865
Neal F. Viemeister,
Gregory H. Wakefield,
Abstract:
The decrease in detection and discrimination thresholds with increases in signal duration has often been taken to indicate that a process of relatively long‐term temporal integration occurs in hearing. Two experiments are reported that suggest that no such process occurs. The first experiment is similar to the two‐pulse experiment reported by Zwislocki [J. Zwislocki, J. Acoust. Soc. Am. 32, 1046–1059 (1960)] in which the threshold in quiet for a pair of brief pulses is measured as a function of the temporal separation between them. Our data indicate that power integration occurs only for separations less than approximately 5 ms. For separations larger than 5–10 ms, thresholds do not change with separation and the pulses appear to be processed independently. In the second experiment, brief 1‐kHz tone pulses separated by 100 ms are presented during gaps in a wideband noise. The threshold for a pair of pulses is lower than that for either pulse presented alone, indicating that some type of “integration” occurs. However, the threshold for the pulse pair is not affected by changes in the level of the noise during the interval between the pulses. These data are inconsistent with the classical view of temporal integration that involves long‐term integration. They are consistent with the notion that the input is sampled at a fairly high rate and that these samples or “looks” are stored in memory and can be accessed and processed selectively. This multiple‐look model can account for the data from the present experiment and also can account for the data on temporal integration for tones and noise. The model provides a framework for describing resolution‐related phenomena, such as gap detection, without resorting to multiple time constants.
ISSN:0001-4966
DOI:10.1121/1.401953
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
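The multiple-look idea in the abstract above can be illustrated with textbook signal detection theory. The sketch assumes that the looks are statistically independent, equal-variance Gaussian observations that are combined optimally, and that d' for a single look is proportional to signal intensity. These assumptions and the function names are ours, not details taken from the paper.

```python
import math


def combined_dprime(dprimes: list[float]) -> float:
    """d' for optimally combined independent looks: root sum of squares."""
    if not dprimes:
        raise ValueError("need at least one look")
    return math.sqrt(sum(d * d for d in dprimes))


def threshold_gain_db(n_looks: int) -> float:
    """Predicted threshold improvement (dB) for n equal looks.

    Assumption: single-look d' is proportional to signal intensity, so
    keeping the combined d' fixed allows intensity to drop by sqrt(n),
    i.e. 10*log10(sqrt(n)) = 5*log10(n) dB.
    """
    if n_looks < 1:
        raise ValueError("n_looks must be >= 1")
    return 5.0 * math.log10(n_looks)


if __name__ == "__main__":
    print("two equal looks, d' =", combined_dprime([1.0, 1.0]))
    for n in (1, 2, 4, 8):
        print(f"{n} looks: {threshold_gain_db(n):.2f} dB lower threshold")
```

Under these assumptions a pulse pair would be detectable at a level roughly 1.5 dB below that of a single pulse, regardless of what happens between the pulses, which is the qualitative pattern the abstract attributes to independent looks.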
|
18. |
Turning on a tone |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 866-873
William Morris Hartmann,
Dan C. Sartor,
Abstract:
It is possible to choose the starting phase of a pure tone in a way that minimizes the onset noise when the tone is turned on abruptly. A spectral model shows that when the tone has a low frequency, minimum onset noise is expected for a starting phase of zero (turning on a sine tone) but when the tone has a high frequency, minimum onset noise is expected for a starting phase of ±90 deg (turning on a cosine tone). Listening experiments confirm the above expectations and show that the transition between low‐ and high‐frequency domains is sharp and depends upon both the electroacoustical transducer and the individual listener.
ISSN:0001-4966
DOI:10.1121/1.401954
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
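The dependence on starting phase described in the abstract above can be illustrated with the Fourier transforms of an abruptly gated tone. Away from the tone frequency f0, the spectrum magnitude of a tone switched on at t = 0 is proportional to f0/|f0² − f²| for a sine start and to f/|f0² − f²| for a cosine start. The sketch below compares the two; whether this matches the authors' spectral model in detail is an assumption, and the function names are illustrative.

```python
import math


def onset_splatter(f0_hz: float, f_hz: float) -> tuple[float, float]:
    """Relative spectrum magnitudes (sine start, cosine start) at f_hz.

    Uses the transforms of sin(2*pi*f0*t)u(t) and cos(2*pi*f0*t)u(t),
    ignoring the spectral line at f0 itself and a common scale factor.
    """
    if f0_hz <= 0 or f_hz <= 0 or f_hz == f0_hz:
        raise ValueError("need positive frequencies with f_hz != f0_hz")
    denom = abs(f0_hz ** 2 - f_hz ** 2)
    return f0_hz / denom, f_hz / denom


def sine_re_cosine_db(f0_hz: float, f_hz: float) -> float:
    """Onset splatter of a sine start relative to a cosine start, in dB."""
    sine_mag, cosine_mag = onset_splatter(f0_hz, f_hz)
    return 20.0 * math.log10(sine_mag / cosine_mag)


if __name__ == "__main__":
    # Splatter evaluated at 1000 Hz, an arbitrary probe frequency.
    for f0 in (100.0, 4000.0):
        print(f"tone at {f0:.0f} Hz: sine start is "
              f"{sine_re_cosine_db(f0, 1000.0):+.1f} dB re cosine start at 1000 Hz")
```

The ratio reduces to f0/f, so a sine start produces less splatter at frequencies above the tone and a cosine start produces less at frequencies below it. That is consistent with the abstract's result if the audible onset noise lies mainly above a low-frequency tone and below a high-frequency one.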
|
19. |
Dynamic processes in the precedence effect |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 874-884
Richard L. Freyman,
Rachel K. Clifton,
Ruth Y. Litovsky,
Abstract:
Three experiments were conducted to investigate the dependence of echo suppression on the auditory stimulation just prior to a test stimulus. Subjects sat in an anechoic chamber between two loudspeakers, one of which presented the “lead” sound, and the other the delayed “lag” sound. In the first experiment, subjects reported whether or not they heard an echo coming from the vicinity of the lag loudspeaker during a test click pair. In seven of nine listeners, perception of the lagging sound was strongly diminished by the presence of a train of “conditioning” clicks presented just before the test click. Echo threshold increased (subjects were less sensitive to echoes) as the number of clicks in the train increased from 3 to 17. For a fixed number of clicks, the effect was essentially independent of click rate (from 1/s through 50/s) and duration of the train (from 0.5 through 8 s). A second experiment demonstrated a similar buildup of echo suppression with white noise bursts, regardless of whether the bursts in the conditioning train were repeated samples of frozen noise, or were independent samples of noise. Using an objective procedure for measuring echo threshold, the third experiment demonstrated that both lead and lag stimuli must be presented during the conditioning train in order to produce the buildup of suppression. When only the lead sound was presented during the conditioning train, the perceptibility of the lag sound during the test burst appeared to be enhanced.
ISSN:0001-4966
DOI:10.1121/1.401955
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
|
20. |
The effect of frequency‐selective attenuation on the speech‐reception threshold of sentences in conditions of low‐frequency noise |
|
The Journal of the Acoustical Society of America,
Volume 90,
Issue 2,
1991,
Page 885-894
Janette N. van Dijkhuizen,
Joost M. Festen,
Reinier Plomp,
Abstract:
Within a study on the merits of a multichannel automatic gain control in hearing aids, the effect of frequency‐selective amplification on the masked speech‐reception threshold (SRT) for sentences is measured in conditions of seriously disturbing low‐frequency noise, with the effect of wideband amplification as a reference. Speech and noise are both spectrally shaped according to the bisector line of the listener’s dynamic‐range of hearing, but with the noise in a single octave band (0.25–0.5 or 0.5–1 kHz) increased by 20 dB relative to this line. The increase of noise level is steady state in the first experiment, and time varying in the second experiment. Results for 12 normal‐hearing and 12 hearing‐impaired listeners indicate that, in both experiments, frequency‐selective compression of the signal in the octave band with the 20‐dB increase of noise is more beneficial than wideband compression. For the hearing‐impaired group, wideband compression does not give any systematic change in intelligibility. Frequency‐selective compression in steady‐state conditions may, for both groups of listeners, give a decrease of masked SRT (relative to a condition without compression) of up to 4 dB for a compression factor of 100%. Roughly comparable effects are seen for frequency‐selective compression in time‐varying conditions. The superiority of frequency‐selective over wideband compression is attributed to a more effective reduction of upward spread of masking.
ISSN:0001-4966
DOI:10.1121/1.402385
Publisher: Acoustical Society of America
Year: 1991
Source: AIP
|
|