Creating your own vowel resonators using tubing


When modeling speech production for applications such as speech processing, we have learned that the vocal tract can be approximated as an excitation followed by a filtering component. The excitation corresponds to the vocal cords vibrating as air is pushed through them by the diaphragm. This is analogous to plucking a guitar string or vibrating a reed. The excitation is then filtered by the vocal cavity, whose shape determines the phoneme produced. Different shapes of the vocal tract amplify certain harmonics of the original excitation and attenuate others, producing different voiced vowel sounds such as "ah" and "ih". The resonant frequencies amplified by the vocal tract are called formants, and specific formants correspond to the different phonemes. We can therefore model the vocal tract as a series of uniform tubes that filter the signal.
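To make the excitation-plus-filter idea concrete, here is a minimal sketch in Python (standard library only). It is not part of the original project: the sampling rate, the pitch, the pole radius, and the single resonance at 700 Hz are all assumed values chosen for illustration, and a simple two-pole resonator stands in for the vocal tract.

```python
import cmath
import math

FS = 8000        # sampling rate in Hz (assumed)
F0 = 100         # pitch of the excitation in Hz (assumed)
FORMANT = 700    # resonance frequency in Hz (assumed, illustrative only)
RADIUS = 0.97    # pole radius; closer to 1 gives a sharper resonance (assumed)


def impulse_train(n_samples, period):
    """Excitation: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(x, freq, radius, fs):
    """Two-pole filter y[n] = x[n] + a1*y[n-1] + a2*y[n-2].

    The poles sit at radius * exp(+/- j*2*pi*freq/fs).
    """
    a1 = 2 * radius * math.cos(2 * math.pi * freq / fs)
    a2 = -radius ** 2
    y = []
    for n, xn in enumerate(x):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(xn + a1 * y1 + a2 * y2)
    return y


def dft_magnitude(x, freq, fs):
    """Magnitude of the Fourier sum of x evaluated at one frequency."""
    w = 2 * math.pi * freq / fs
    return abs(sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x)))


excitation = impulse_train(FS, FS // F0)            # one second of pulses
voiced = resonator(excitation, FORMANT, RADIUS, FS)  # filtered excitation

# Gain applied to the harmonic at the resonance and to one far from it.
gain_near = dft_magnitude(voiced, 700, FS) / dft_magnitude(excitation, 700, FS)
gain_far = dft_magnitude(voiced, 2000, FS) / dft_magnitude(excitation, 2000, FS)
print(gain_near, gain_far)
```

Under these assumptions the harmonic at the resonance is boosted far more than the harmonic away from it, which is the behaviour the tube model is meant to capture.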

To find the transfer function of these tubes, we can use the z-transform and some matrix algebra. Each tube delays the airflow signal that passes through it. Taking the z-transform of this time delay and writing the resulting equations in matrix form gives the transfer function of the delay:

$ \begin{pmatrix}R_d(z) \\L_d(z)\end{pmatrix} = z\begin{pmatrix}1 & 0 \\0 & z^{-2}\end{pmatrix}\begin{pmatrix}\overline{R_d(z)}\\\overline{L_d(z)}\end{pmatrix} $

where $ R_d(z) $ and $ L_d(z) $ are the z-transforms of the air moving right and left at the entrance to the tube, and $ \overline{R_d(z)} $ and $ \overline{L_d(z)} $ are the z-transforms of the air moving right and left at the exit of the tube.
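The delay relation can be checked numerically. The sketch below builds the delay matrix for a point on the unit circle and applies it to a pair of exit waves. The function names, the normalized frequency, and the wave amplitudes are hypothetical choices made for illustration.

```python
import cmath


def delay_matrix(z):
    """Return z * [[1, 0], [0, z**-2]], the matrix in the delay equation."""
    return [[z * 1, 0], [0, z * z ** -2]]


def apply_matrix(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


omega = 0.3                        # normalized frequency in rad/sample (assumed)
z = cmath.exp(1j * omega)          # point on the unit circle

exit_waves = [1.0 + 0j, 0.5 + 0j]  # right- and left-moving waves at the exit (assumed)
entrance_waves = apply_matrix(delay_matrix(z), exit_waves)

print(entrance_waves)
```

Multiplying out the matrix shows that the right-moving wave at the exit is the entrance wave delayed by one sample, while the left-moving wave at the entrance is the exit wave delayed by one sample. On the unit circle the delay changes only the phase of each wave, not its magnitude.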

A transfer function can also be obtained for the junction between two tubes. Applying the principles of flow continuity and pressure continuity at the junction gives:

$ \begin{pmatrix}R_d(z) \\L_d(z)\end{pmatrix} = \dfrac{1}{1+r}\begin{pmatrix}1 & -r \\-r & 1\end{pmatrix}\begin{pmatrix}\overline{R_d(z)} \\\overline{L_d(z)}\end{pmatrix} $

where $ r $ is the reflection coefficient, determined by the cross-sectional areas $ A $ and $ B $ of the two tubes meeting at the junction: $ r=\dfrac{B-A}{B+A} $. Multiplying the time-delay and tube-junction matrices together gives the transfer function of an approximation of the vocal tract.
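The multiplication can be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's own calculation: every tube section is assumed to contribute the same delay matrix, the matrices are multiplied in order from the entrance of the first tube to the exit of the last, boundary conditions at the source and at the open end are left out, and the area values are hypothetical.

```python
import cmath


def reflection_coefficient(area_a, area_b):
    """r = (B - A) / (B + A) for adjoining tubes with areas A and B."""
    return (area_b - area_a) / (area_b + area_a)


def junction_matrix(r):
    """Return (1 / (1 + r)) * [[1, -r], [-r, 1]]."""
    s = 1 / (1 + r)
    return [[s, -r * s], [-r * s, s]]


def delay_matrix(z):
    """Return z * [[1, 0], [0, z**-2]]."""
    return [[z, 0], [0, z ** -1]]


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def chain_matrix(areas, z):
    """Alternate delay and junction matrices along a chain of tubes."""
    total = [[1, 0], [0, 1]]
    for i, area in enumerate(areas):
        total = matmul(total, delay_matrix(z))
        if i + 1 < len(areas):  # a junction exists only between two tubes
            r = reflection_coefficient(area, areas[i + 1])
            total = matmul(total, junction_matrix(r))
    return total


z = cmath.exp(1j * 0.3)   # evaluation point on the unit circle (assumed)
areas = [1.0, 3.0]        # hypothetical cross-sectional areas, arbitrary units
print(reflection_coefficient(*areas))
print(chain_matrix(areas, z))
```

When two adjoining tubes have the same area, $ r = 0 $ and the junction matrix reduces to the identity, so the chain behaves like one longer tube.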

Background Research:

Since the vocal tract can be modeled as a source of excitation followed by filtering through tubes of various cross-sectional areas, a simple vowel resonator can be built. Tubes of different lengths and cross-sectional areas are combined to give the proper formants for the desired vowel phoneme and are then excited by some sort of source. In the example on this website a duck caller was used as the source for three different resonators. Different lengths of PVC tubing and pipe insulation were used to approximate the area of the vocal tract for the various phonemes. That example gives only a crude approximation of the vocal tract because it uses just two sizes of tubing; it can be improved by combining several sizes of tubes to better match the vocal tract. Area functions of the vocal tract for different vowel sounds are shown in these sources:

Vowel Resonator Tubes:

The tubes created for this project are for the vowel phonemes "ah" and "ih". They were constructed from several sizes of flexible PVC tubing and insert fittings that couple the different tube sizes together. The source was a duck caller with a plastic reed that oscillates when air is blown through it, which is a rather crude approximation of the vocal cords. The sizes and lengths of the tubing sections were based on approximations of the vocal tract using the area functions provided on the websites listed above. Pictures of the tubes for "ah" and "ih" are provided below:


"AH" Vowel Resonator


"IH" Vowel Resonator


The vowel resonators built for this project sound relatively close to the human phonemes "ah" and "ih". The source excitation may need to be modified to improve the quality of the output. Greater care could also be taken to ensure that the tube lengths and areas match those of the vocal tract. Other phonemes can be produced in the same way. These resonators could be used in class demonstrations to show how the vocal tract filters input signals.
