|
1. |
Review of text‐to‐speech conversion for English |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 737-793
Dennis H. Klatt,
Abstract:
The automatic conversion of English text to synthetic speech is presently being performed, remarkably well, by a number of laboratory systems and commercial devices. Progress in this area has been made possible by advances in linguistic theory, acoustic–phonetic characterization of English sound patterns, perceptual psychology, mathematical modeling of speech production, structured programming, and computer hardware design. This review traces the early work on the development of speech synthesizers, discovery of minimal acoustic cues for phonetic contrasts, evolution of phonemic rule programs, incorporation of prosodic rules, and formulation of techniques for text analysis. Examples of rules are used liberally to illustrate the state of the art. Many of the examples are taken from Klattalk, a text‐to‐speech system developed by the author. A number of scientific problems are identified that prevent current systems from achieving the goal of completely human‐sounding speech. While the emphasis is on rule programs that drive a formant synthesizer, alternatives such as articulatory synthesis and waveform concatenation are also reviewed. An extensive bibliography has been assembled to show both the breadth of synthesis activity and the wealth of phenomena covered by rules in the best of these programs. A recording of selected examples of the historical development of synthetic speech, enclosed as a 33 1/3‐rpm record, is described in the Appendix.
ISSN:0001-4966
DOI:10.1121/1.395275
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
2. |
Inversion of ultrasonic scattering data for red blood cell suspensions under different flow conditions |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 794-799
R. J. Lucas,
V. Twersky,
Abstract:
Recent results for low‐frequency scattering by correlated random distributions of nonspherical particles averaged over orientation are applied to invert ultrasonic data for red blood cell suspensions under different flow conditions. The inversion procedure isolates a correlation parameter (c) representing a process in which the volume fraction (w) of particles increases linearly, and also a cell population parameter P. Reduced data records of scattering versus hematocrit are compared with S(c;w)P, where the generalized fluctuation function S is proportional to the variance in particle number, and P is proportional to the backscattering cross section of an isolated particle. The peak scattering for the different flow processes occurs at values of w ranging from about 0.15 for the most uniform to 0.25 for the least, corresponding to c values of about 2.1 to 0.4, as compared with w ≊ 0.13 and c = 3 for hard (repulsive at contact) spheres or aligned ellipsoids. The lower values of c suggest weaker repulsion between the deformable cells and effective interparticle attraction (aggregative trends), and c ≊ 2 may also involve flow alignment of the discoids.
ISSN:0001-4966
DOI:10.1121/1.395276
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
3. |
Volume estimation of symmetrical branching structures by resonance mode analysis |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 800-806
David T. Raphael,
M. A. Farrell Epstein,
Abstract:
An exact resonance condition is derived for rigid symmetric second‐order bifurcating structures. In the low‐frequency range, the resonance condition can be reduced into forms that facilitate volume estimation of bifurcating structures. Two such volume approximation techniques are presented: (1) a fundamental frequency method, in which the lowest resonant frequency is inversely proportional to the structure volume, and (2) an equivalent‐length method, in which an equivalent length of two daughter branches is calculated for all branches distal to the first bifurcation. An experimental study to determine the resonance modes of seven bifurcating glass structures was performed. The volume estimates obtained by either method were in very close agreement with the true volumes.
ISSN:0001-4966
DOI:10.1121/1.395277
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
4. |
Propagation of beluga echolocation signals |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 807-813
Whitlow W. L. Au,
Ralph H. Penner,
Charles W. Turl,
Abstract:
The propagation characteristics of high‐frequency echolocation signals (peak energies above 100 kHz) of the beluga (Delphinapterus leucas) were measured while the animal performed a target detection task. The whale was trained to station on a bite plate so that its transmission beam could be measured in the vertical and horizontal planes using hydrophone arrays. The transitional region between the acoustic near‐ and farfields was also located using an array of hydrophones that extended directly in front of the animal in the horizontal plane. Three distinct modes of signals were observed. Mode 1 signals had click intervals greater than the time required for the signals to travel to the target and back (two‐way transit time). Mode 2 signals had click intervals shorter than the two‐way transit time, and mode 3 signals had high repetition rates with an average click interval of 1.7 ms, approximately 2% of the two‐way transit time. The average click intervals for the mode 1 and mode 2 signals were 193 and 44 ms, respectively. The vertical and horizontal beam patterns of the mode 1 signals had similar 3‐dB beamwidths of approximately 6.5°. The major axis of the vertical beam was directed approximately 5° above the plane defined by the animal’s teeth. The near‐ to farfield transition region was approximately 0.64–0.75 m from the tip of the animal’s mouth.
ISSN:0001-4966
DOI:10.1121/1.395278
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
5. |
Sounds and source levels from bowhead whales off Pt. Barrow, Alaska |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 814-821
W. C. Cummings,
D. V. Holliday,
Abstract:
Sounds were recorded from bowhead whales migrating past Pt. Barrow, AK, to the Canadian Beaufort Sea. They mainly consisted of various low‐frequency (25 to 900‐Hz) moans and well‐defined sound sequences organized into “song” (20–5000 Hz) recorded with our 2.46‐km hydrophone array suspended from the ice. Songs were composed of up to 20 repeated phrases (mean, 10) which lasted up to 146 s (mean, 66.3). Several bowhead whales often were within acoustic range of the array at once, but usually only one sang at a time. Vocalizations exhibited diurnal peaks of occurrence (0600–0800, 1600–1800 h). Sounds which were located in the horizontal plane had peak source spectrum levels as follows: 44 moans, 129–178 dB re: 1 μPa, 1 m (median, 159); 3 garglelike utterances, 152, 155, and 169 dB; 33 songs, 158–189 dB (median, 177), all presumably from different whales. Based on ambient noise levels, measured total propagation loss, and whale sound source levels, our detection of whale sounds was theoretically noise‐limited beyond 2.5 km (moans) and beyond 10.7 km (songs), a model supported by actual localizations. This study showed that over much of the shallow Arctic and sub‐Arctic waters, underwater communications of the bowhead whale would be limited to much shorter ranges than for other large whales in lower latitude, deep‐water regions.
ISSN:0001-4966
DOI:10.1121/1.395279
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
6. |
A model for directional and distance hearing in swimbladder‐bearing fish based on the displacement orbits of the hair cells |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 822-829
Nico A. M. Schellart,
Jan C. de Munck,
Abstract:
It is known that teleosts, without Weberian ossicles but with a swimbladder, can detect the direction of and, under appropriate conditions, the distance to a sound source [e.g., Schuijf and Hawkins, Nature 302, 143–144 (1983)]. It is hypothesized here that the underlying mechanism is the analysis of the parameters of the elliptical movement of the hair cells with respect to the otoliths. This movement results from the displacement wave impinging directly upon the labyrinth and the response displacement wave reradiated by the swimbladder. For a given swimbladder geometry, given the positions of the maculae of both labyrinths with respect to the swimbladder and the damping of the swimbladder, the displacement orbits of the maculae can be calculated [de Munck and Schellart, J. Acoust. Soc. Am. 81, 556–560 (1987)]. These calculations were made for the cod and the trout with the frequency, direction, and distance between the fish and the sound source as parameters with the source within the same horizontal plane as the fish. The orbit model predicts that the utriculus has the most strategic position to detect direction and distance of such a sound source. Moreover, the model predicts that this could basically be done monaurally. A hypothesis is proposed to describe how the utricular system analyzes the orbit parameters. The model is evaluated in relation to the results of behavioral experiments described in the literature.
ISSN:0001-4966
DOI:10.1121/1.395280
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
7. |
Acoustic comparison of soprano solo and choir singing |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 830-836
Thomas D. Rossing,
Johan Sundberg,
Sten Ternström,
Abstract:
Five soprano singers were recorded while singing similar texts in both choir and solo modes of performance. A comparison of long‐term‐average spectra of similar passages in both modes indicates that subjects used different tactics to achieve somewhat higher concentrations of energy in the 2‐ to 4‐kHz range when singing in the solo mode. It is likely that this effect resulted, at least in part, from a slight change of the voice source from choir to solo singing. The subjects used slightly more vibrato when singing in the solo mode.
ISSN:0001-4966
DOI:10.1121/1.395281
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
8. |
An investigation of motor equivalence in the speech of children and adults |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 837-842
Bruce L. Smith,
Ann McLean‐Muse,
Abstract:
Although the concept of motor equivalence (i.e., articulatory intercoordination) is generally accepted as functioning in normal speech production, few studies have experimentally demonstrated its existence. One purpose of the present study was thus to obtain additional data concerning this phenomenon. Because motor equivalence is often assumed to represent a rather sophisticated ability in speakers, another purpose of the study was to determine whether trends could be observed that might demonstrate a developmental progression toward more frequent occurrence of articulatory intercoordination with increasing age. A strain gauge transduction system was used to monitor inferior–superior upper lip, lower lip, and jaw movements produced by a group of adults and three groups of children ranging from 4 to 11 years of age, as they spoke in a normal condition and in two “perturbed” conditions (bite block and fast rate). Based on the assumption that the presence of a significant negative correlation between two articulators constitutes evidence of articulatory intercoordination, there was little indication of motor equivalence in the speech of the adults or the children.
ISSN:0001-4966
DOI:10.1121/1.395282
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
9. |
Speaking rate of adventitiously deaf male cochlear implant candidates |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 843-846
Steven B. Leder,
Jaclyn B. Spitzer,
J. Cameron Kirchner,
Carole Flevaris‐Phillips,
Paul Milner,
Frederick Richardson,
Abstract:
No objective group data on speaking rate or speaking duration have been reported on the speech of adventitiously profoundly hearing‐impaired adults. Results of the present study showed that speaking rate, i.e., number of syllables per second, was significantly slower and speaking duration was significantly longer for 25 adventitiously profoundly hearing‐impaired adult male cochlear implant candidates than for 10 normal‐hearing control subjects. The factors of length of time since onset of profound hearing loss and hearing aid use did not significantly affect speaking rate. Based on these objective data, a rationale and method are presented for aural rehabilitation of the profoundly hearing‐impaired who exhibit speaking rate abnormalities.
ISSN:0001-4966
DOI:10.1121/1.395283
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
10. |
Effects of stress and final‐consonant voicing on vowel production: Articulatory and acoustic analyses |
|
The Journal of the Acoustical Society of America,
Volume 82,
Issue 3,
1987,
Page 847-863
W. Van Summers,
Abstract:
Durations of the vocalic portions of speech are influenced by a large number of linguistic and nonlinguistic factors (e.g., stress and speaking rate). However, each factor affecting vowel duration may influence articulation in a unique manner. The present study examined the effects of stress and final‐consonant voicing on the detailed structure of articulatory and acoustic patterns in consonant–vowel–consonant (CVC) utterances. Jaw movement trajectories and F1 trajectories were examined for a corpus of utterances differing in stress and final‐consonant voicing. Jaw lowering and raising gestures were more rapid, longer in duration, and spatially more extensive for stressed versus unstressed utterances. At the acoustic level, stressed utterances showed more rapid initial F1 transitions and more extreme F1 steady‐state frequencies than unstressed utterances. In contrast to the results obtained in the analysis of stress, decreases in vowel duration due to devoicing did not result in a reduction in the velocity or spatial extent of the articulatory gestures. Similarly, at the acoustic level, the reductions in formant transition slopes and steady‐state frequencies demonstrated by the shorter, unstressed utterances did not occur for the shorter, voiceless utterances. The results demonstrate that stress‐related and voicing‐related changes in vowel duration are accomplished by separate and distinct changes in speech production with observable consequences at both the articulatory and acoustic levels.
ISSN:0001-4966
DOI:10.1121/1.395284
Publisher: Acoustical Society of America
Year: 1987
Data source: AIP
|
|