Speech Compression

 

Author: Friedrich Vilbig

 

Journal: The Journal of the Acoustical Society of America (AIP, available online 1956)
Volume/Issue: Volume 28, issue 1

Page: 160

 

ISSN:0001-4966

 

Year: 1956

 

DOI:10.1121/1.1918113

 

Publisher: Acoustical Society of America

 

Data source: AIP

 

Abstract:

One can distinguish between physical compression, in which only the physical characteristics of the speech oscillations are used, and linguistic compression, in which rules found by linguistic investigation are used as well. We shall consider here only the physical methods. They can be divided into two groups. In the first group, which may include several variations, every frequency of the original band is divided by a factor n by means of frequency division or time element skipping. On the receiver side, the original frequencies are then restored by means of frequency multiplication or time element repetition. In the second group, an analyzer on the transmitter side builds certain code signals from specific characteristics of the speech oscillations. These are transmitted in a narrow frequency band channel and control a speech synthesizer on the receiver side. The code signals contain statements about the excitation function and the system function. The excitation function indicates whether the air stream propelled by the lungs excites the vocal cords, producing a pitch frequency modulation as when vowels are uttered, or remains unmodulated and produces a kind of noise, as when consonants are spoken. The system function indicates the varying positions of the mouth organs in the form of the “envelope curve” of the speech spectrum. Four variations of the code-signal system differ in the method by which the system function “envelope curve” is transmitted. (1) In the earliest system, the Bell Vocoder, the mean values of 10 or 16 frequency band sections of the envelope curve are measured and transmitted through 10 or 16 frequency channels. (2) In the Scan Vocoder, the envelope of the short-time frequency spectrum is scanned, for example, 30 times per second. The scan voltage is then transmitted as an envelope curve through only one narrow-band frequency channel. (3) In the Pulse Vocoder, the envelope curve is transmitted after conversion into a succession of pulse groups. (4) In the Formant Vocoder, instead of the whole envelope curve, only three signals corresponding to the three envelope resonance frequencies (formant frequencies) are transmitted. The system developed by the Air Force Cambridge Research Center is a Scan Vocoder which can be converted into a Pulse or Formant Vocoder by the exchange of a controlling unit. We thereby also obtain a unit system that can be combined with other systems, for example, with the synthesizer of Lawrence, the formant extractor of Flanagan, or the formant extractor of Northeastern University. The equipment on hand will be discussed by means of block diagrams, and its operation will be demonstrated by the playing of a record.
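The band-averaging step described for the Bell Vocoder can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `band_means` and `stepwise_envelope` are hypothetical, the envelope is assumed to be already available as a list of sampled magnitudes, and the band sections are assumed to be of equal width, which the abstract does not state.

```python
def band_means(envelope, n_bands=10):
    """Analyzer side: mean value of each of n_bands contiguous sections.

    `envelope` is a list of sampled spectral-envelope magnitudes.
    Equal-width sections are an assumption made for illustration only.
    """
    size = len(envelope)
    if n_bands <= 0 or n_bands > size:
        raise ValueError("n_bands must be between 1 and len(envelope)")
    means = []
    for b in range(n_bands):
        # Integer boundaries that split the envelope into n_bands sections.
        start = b * size // n_bands
        stop = (b + 1) * size // n_bands
        section = envelope[start:stop]
        means.append(sum(section) / len(section))
    return means


def stepwise_envelope(means, size):
    """Synthesizer side: rebuild a piecewise-constant envelope of `size`
    samples from the transmitted band means (one value per channel)."""
    n_bands = len(means)
    rebuilt = []
    for b, mean in enumerate(means):
        start = b * size // n_bands
        stop = (b + 1) * size // n_bands
        rebuilt.extend([mean] * (stop - start))
    return rebuilt


if __name__ == "__main__":
    envelope = [float(k) for k in range(20)]   # toy envelope, 20 samples
    channels = band_means(envelope, n_bands=10)
    print(channels)                            # one mean per channel
    print(stepwise_envelope(channels, len(envelope)))
```

Only the 10 (or 16) channel values are sent in this sketch, so the receiver recovers a coarse, stepwise approximation of the envelope rather than the original curve.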

 
