The field of articulatory phonetics is a subfield of phonetics that studies articulation and ways that humans produce speech. Articulatory phoneticians explain how humans produce speech sounds via the interaction of different physiological structures. Generally, articulatory phonetics is concerned with the transformation of aerodynamic energy into acoustic energy. Aerodynamic energy refers to the airflow through the vocal tract. Its potential form is air pressure; its kinetic form is the actual dynamic airflow. Acoustic energy is variation in the air pressure that can be represented as sound waves, which are then perceived by the human auditory system as sound.
Sound is produced simply by expelling air from the lungs. However, to vary the sound quality in a way useful for speaking, two speech organs normally move towards each other and make contact, creating an obstruction that shapes the airflow in a particular fashion. The point of maximum obstruction is called the place of articulation, and the way the obstruction forms and releases is the manner of articulation. For example, when making a p sound, the lips come together tightly, blocking the air momentarily and causing a buildup of air pressure. The lips then release suddenly, causing a burst of sound. The place of articulation of this sound is therefore called bilabial, and the manner is called stop (also known as a plosive).
The vocal tract can be viewed through an aerodynamic-biomechanic model that includes three main components:
- air cavities
- pistons
- air valves
Air cavities are containers of air molecules of specific volumes and masses. The main air cavities present in the articulatory system are the supraglottal cavity and the subglottal cavity. They are so named because the glottis, the openable space between the vocal folds internal to the larynx, separates the two cavities. The supraglottal cavity, or orinasal cavity, is divided into an oral subcavity (the cavity from the glottis to the lips, excluding the nasal cavity) and a nasal subcavity (the cavity from the velopharyngeal port, which can be closed by raising the velum, to the nostrils). The subglottal cavity consists of the trachea and the lungs. The atmosphere external to the articulatory system may also be considered an air cavity whose potential connecting points with respect to the body are the nostrils and the lips.
Pistons are initiators. The term initiator refers to the fact that they are used to initiate a change in the volumes of air cavities and, by Boyle's Law, in the corresponding air pressure of the cavity. The term initiation refers to the change. Since changes in air pressures between connected cavities lead to airflow between the cavities, initiation is also referred to as an airstream mechanism. The three pistons present in the articulatory system are the larynx, the tongue body, and the physiological structures used to manipulate lung volume (in particular, the floor and the walls of the chest). The lung pistons are used to initiate a pulmonic airstream (found in all human languages). The larynx is used to initiate the glottalic airstream mechanism by changing the volume of the supraglottal and subglottal cavities via vertical movement of the larynx (with a closed glottis). Ejectives and implosives are made with this airstream mechanism. The tongue body creates a velaric airstream by changing the pressure within the oral cavity: the tongue body changes the volume of the mouth subcavity. Click consonants use the velaric airstream mechanism. Pistons are controlled by various muscles.
Valves regulate airflow between cavities. Airflow occurs when an air valve is open and there is a pressure difference between the connecting cavities. When an air valve is closed, there is no airflow. The air valves are the vocal folds (the glottis), which regulate between the supraglottal and subglottal cavities, the velopharyngeal port, which regulates between the oral and nasal cavities, the tongue, which regulates between the oral cavity and the atmosphere, and the lips, which also regulate between the oral cavity and the atmosphere. Like the pistons, the air valves are also controlled by various muscles.
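The valve behaviour described above can be sketched in a few lines of Python. This is only an illustrative toy model: the class names `Cavity` and `Valve`, the method `airflow`, and the pressure values are assumptions made for the example, not part of any established phonetic software.

```python
from dataclasses import dataclass


@dataclass
class Cavity:
    """An air cavity with a name and a pressure (arbitrary units)."""
    name: str
    pressure: float


@dataclass
class Valve:
    """A valve connecting two cavities; air can only flow when it is open."""
    name: str
    a: Cavity
    b: Cavity
    is_open: bool = False

    def airflow(self) -> str:
        """Return the direction of airflow, or 'none' if there is no flow."""
        # No flow through a closed valve, or when the pressures are equal.
        if not self.is_open or self.a.pressure == self.b.pressure:
            return "none"
        # Air moves from the higher-pressure cavity to the lower-pressure one.
        if self.a.pressure > self.b.pressure:
            return f"{self.a.name} -> {self.b.name}"
        return f"{self.b.name} -> {self.a.name}"


# Hypothetical pressures: the subglottal cavity is slightly pressurised.
subglottal = Cavity("subglottal", 1.2)
supraglottal = Cavity("supraglottal", 1.0)
glottis = Valve("glottis", subglottal, supraglottal, is_open=False)

closed_flow = glottis.airflow()   # valve closed
glottis.is_open = True
open_flow = glottis.airflow()     # valve open, pressure difference present
```

With the glottis closed the model reports no airflow; once it is opened, air moves from the higher-pressure subglottal cavity to the supraglottal cavity.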
To produce any kind of sound, there must be movement of air. To produce sounds that people can interpret as spoken words, the moving air must pass through the vocal cords, up through the throat, and into the mouth or nose before leaving the body. Different sounds are formed by different positions of the mouth, or, as linguists call it, "the oral cavity" (to distinguish it from the nasal cavity).
The two classes of sounds
Sounds of all languages fall under two categories: consonants and vowels.
Consonants are produced with some form of restriction or closing in the vocal tract that hinders the airflow from the lungs. Consonants are classified according to where in the vocal tract the airflow has been restricted. This is also known as the place of articulation.
Places of articulation
An obstruction is necessarily formed when two articulators come close together. Generally, one is moving (the active articulator), and the other is stationary (the passive articulator). As a result, what is normally termed the "place of articulation" is actually a combination of a place of active articulation and a place of passive articulation. For example, the English f sound is labiodental—a shorthand way of saying that the active articulator is the lower lip, which moves up (along with the jaw in general) to contact the upper teeth. The lower lip can also be the active articulator for other places of articulation (e.g. bilabial, where it contacts the upper lip, as in English p). Likewise, the upper teeth can be the passive articulator for other places of articulation (e.g. dental, where the tongue contacts the upper teeth, as in the English th sound).
The places of articulation used in English are:
- Bilabial: Both lips come together, as in p, b or m
- Labiodental: Lower lip contacts upper teeth, as in f or v
- Dental: Tongue tip or tongue blade (part just behind the tip) contacts upper teeth, as in the two th sounds (e.g. thin vs. this)
- Alveolar: Tongue tip contacts the alveolar ridge (the gums just behind the teeth), as in t, d, n, or l; or tongue blade contacts the alveolar ridge, as in s or z
- Postalveolar: Tongue blade contacts the postalveolar region behind the alveolar ridge, as in sh, ch, zh, or j; or tongue tip contacts the postalveolar region, as in r
- Palatal: Middle of tongue approaches or contacts the hard palate, as in y
- Velar: Back of tongue contacts the soft palate (or "velum"), as in k, g or ng
- Labiovelar: Back of tongue approaches the soft palate and lips also come close to each other, as in w
- Laryngeal: No obstruction anywhere but in the vocal cords down in the throat, as in h
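The list above can be treated as a simple lookup table. The following Python sketch encodes it as a dictionary; the names `PLACE_OF_ARTICULATION` and `place_of` are hypothetical, and the sounds are written in the same informal spelling-based notation as the list (both th sounds are collapsed into "th").

```python
# English consonants grouped by place of articulation, as listed above.
PLACE_OF_ARTICULATION = {
    "bilabial": ["p", "b", "m"],
    "labiodental": ["f", "v"],
    "dental": ["th"],
    "alveolar": ["t", "d", "n", "l", "s", "z"],
    "postalveolar": ["sh", "ch", "zh", "j", "r"],
    "palatal": ["y"],
    "velar": ["k", "g", "ng"],
    "labiovelar": ["w"],
    "laryngeal": ["h"],
}


def place_of(sound: str) -> str:
    """Return the place of articulation for a sound in the table."""
    for place, sounds in PLACE_OF_ARTICULATION.items():
        if sound in sounds:
            return place
    raise KeyError(f"unknown sound: {sound!r}")


example_f = place_of("f")
example_ng = place_of("ng")
```

Here `place_of("f")` yields "labiodental" and `place_of("ng")` yields "velar"; a sound that is not in the table raises a `KeyError`.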
The place of articulation is clearest for consonants, where there is generally a significant amount of obstruction. For vowels, part of the tongue moves closer to the roof of the mouth, but there is still enough of a gap that it is difficult to precisely specify the location of maximum obstruction. As a result, vowels are normally described by height and frontness of the tongue (as well as amount of rounding of the lips) rather than by a specific place of articulation. For example, the vowel in the first syllable of father is a low back unrounded vowel; the vowel in tooth is a high back rounded vowel, and the vowel in men is a low-mid front unrounded vowel.
Sometimes there can be more than one obstruction (although rarely more than two). There are two kinds of double obstruction: Either both obstructions block the air flow in equal amounts, or one obstruction blocks the air flow more than the other. The former type, a doubly articulated consonant, does not occur in English. The latter type, however, is more common and does occur in English; w is one example. With w, the place of greatest obstruction, called the primary articulation, occurs at the soft palate; the rounding of the lips causes less blockage, and is called the secondary articulation. Another example in English is the qu of words such as quit, with the same primary and secondary articulations, but a complete blockage of the air at the soft palate rather than only a restriction of the flow (a difference in manner of articulation; see below). (Note that the sound of qu is normally analyzed as a sequence of k plus w, but both parts are actually pronounced at the same time.)
Bilabial sounds are produced with both lips, such as [b], [m], and [p].
[f] and [v] are articulated by placing the upper teeth against the lower lip.
Interdental or Dental
[θ] and [ð] are both spelled as "th" (θ as in think) (ð as in the). They are pronounced by inserting the tip of the tongue between the teeth.
[t] [d] [n] [s] [z] [l] [r] are all produced by raising the tongue towards the alveolar ridge, though in different ways.
[t, d, n] the tip of the tongue is raised and touches the ridge.
[s, z] the sides of the front of the tongue are raised, but the tip is lowered so that air escapes over it.
[l] the tip of the tongue is raised while the rest of the tongue remains down, permitting air to escape over its sides. Hence, [l] is called a lateral sound.
[r] [IPA ɹ] the tip of the tongue is curled back behind the alveolar ridge, or the top of the tongue is bunched up behind the ridge; the air escapes through the central part of the mouth.
[ʃ] [ʒ] [tʃ] [dʒ] [j] are produced by raising the front part of the tongue to the palate.
[k] [ɡ] [ŋ] are produced by raising the back part of the tongue to the soft palate or the velum.
[ʀ] [q] [ɢ] these sounds are produced by raising the back of the tongue to the uvula. The 'r' in French and German may be a uvular trill (symbolized by [ʀ]). The uvular sounds [q] and [ɢ] occur in Arabic. These do not normally occur in English.
[h] [ʔ] the sound [h] is from the flow of air coming from an open glottis, past the tongue and lips as they prepare to pronounce a vowel sound, which always follows [h]. If the air is stopped completely at the glottis by tightly closed vocal cords, the sound upon release of the cords is called a glottal stop [ʔ].
Manners of articulation
"Manner of articulation" refers in general to characteristics of the speech organs other than the location of the obstruction(s). There are multiple parameters involved here, and different types of each. The manners of articulation used in English are:
1. Degree of stricture: How much blockage occurs at the primary articulation (the place of greatest obstruction). The types in English are:
- Stop: Complete blockage followed by sudden release, as in t, d, p, b, k, g. The blockage of air causes air pressure to build up; when released, the air bursts out, giving these sounds their characteristic sharp quality.
- Fricative: Incomplete blockage but still close enough to cause significant airflow turbulence, as in f, v, s, z, sh, zh and both th sounds. The turbulence causes the characteristic noisiness of fricatives.
- Affricate: Complete blockage followed by a gradual release, resulting in a combination of stop + fricative, as in ch and j.
- Approximant: Incomplete blockage and far enough apart that airflow is smooth, as in r, y, w, and h.
2. Alternative air flow: The air travels a path other than down the center of the mouth:
- Nasal: Complete blockage of air out the mouth but air can freely flow out the nose, as in m, n, ng.
- Lateral: Complete blockage of air by the center of the tongue but air can flow out the sides of the tongue, as in l.
3. Dynamic movement of the tongue:
- Flap: Very brief complete blockage of air, in a way that doesn't cause any pressure buildup or release burst, as in the American English pronunciation of t and d between vowels.
- Trill: Multiple brief complete blockages in a row, caused by the active articulator (e.g. the tongue) vibrating. A trilled r is well known in Spanish and also occurs as the normal pronunciation of r by some Scottish English speakers.
Approximants, nasals, laterals, flaps, and trills are often grouped together as sonorants or resonants (which also includes vowels); all of them have in common the fact that there is smooth airflow throughout the consonant, and they are nearly always voiced (see below).
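The manner categories above, together with the sonorant grouping, can be sketched as a small Python table. The names `MANNER`, `SONORANT_MANNERS`, `manner_of` and `is_sonorant` are assumptions made for this illustration, and the sounds use the same informal notation as the text. Flaps and trills are included in the sonorant set but have no entries in the table, since the text gives no single English letter for them.

```python
# English consonants grouped by manner of articulation, as listed above.
MANNER = {
    "stop": {"t", "d", "p", "b", "k", "g"},
    "fricative": {"f", "v", "s", "z", "sh", "zh", "th"},
    "affricate": {"ch", "j"},
    "approximant": {"r", "y", "w", "h"},
    "nasal": {"m", "n", "ng"},
    "lateral": {"l"},
}

# Manners that the text groups together as sonorants (resonants).
SONORANT_MANNERS = {"approximant", "nasal", "lateral", "flap", "trill"}


def manner_of(sound: str) -> str:
    """Return the manner of articulation for a sound in the table."""
    for manner, sounds in MANNER.items():
        if sound in sounds:
            return manner
    raise KeyError(f"unknown sound: {sound!r}")


def is_sonorant(sound: str) -> bool:
    """True if the sound's manner belongs to the sonorant group."""
    return manner_of(sound) in SONORANT_MANNERS


manner_ch = manner_of("ch")
sonorant_m = is_sonorant("m")
sonorant_s = is_sonorant("s")
```

Under this table, ch is classified as an affricate, m counts as a sonorant, and s does not.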
Vowels are produced by the passage of air through the larynx and the vocal tract. Most vowels are voiced (i.e. the vocal folds are vibrating). Except in some marginal cases, the vocal tract is open, so that the airstream is able to escape without generating fricative noise.
Variation in vowel quality is produced by means of the following articulatory structures:
The larynx is used to differentiate voiced and voiceless vowels. In addition, the pitch of the vowel is changed by altering the frequency of vibration of the vocal folds. In some languages there are contrasts among vowels with different phonation types.
Vowels may be made pharyngealized (also epiglottalized, sphincteric or strident) by means of a retraction of the tongue root.:306-310 Vowels may also be articulated with advanced tongue root.:298 There is discussion of whether this vowel feature (ATR) is different from the Tense/Lax distinction in vowels. :302-6
Soft palate (velum)
Vowels are normally produced with the soft palate raised so that no air escapes through the nose. However, vowels may be nasalized as a result of lowering the soft palate. Many languages use nasalization contrastively. :298-300
The tongue is a highly flexible organ that is capable of being moved in many different ways. For vowel articulation the principal variations are vowel height and the dimension of backness and frontness. A less common variation in vowel quality can be produced by a change in the shape of the front of the tongue, resulting in a rhotic or rhotacized vowel.
Boyle's Law relates the pressure and volume of an air cavity:

$$P_1 V_1 = P_2 V_2$$

$$\frac{V_1}{V_1 + \Delta V} = \frac{P_1 + \Delta P}{P_1}$$

What the above equations express is that, given an initial pressure P1 and volume V1 at time 1, the product of these two values will be equal to the product of the pressure P2 and volume V2 at a later time 2. This means that if there is an increase in the volume of a cavity, there will be a corresponding decrease in the pressure of that same cavity, and vice versa. In other words, volume and pressure are inversely proportional (or negatively correlated) to each other. As applied to a description of the subglottal cavity, when the lung pistons contract the lungs, the volume of the subglottal cavity decreases while the subglottal air pressure increases. Conversely, if the lungs are expanded, the pressure decreases.
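The inverse relationship between pressure and volume can be checked numerically. The sketch below assumes Boyle's Law in the form pressure1 × volume1 = pressure2 × volume2 (constant temperature and a sealed cavity); the function name `pressure_after_volume_change` and the numbers are purely illustrative and are not physiological measurements.

```python
def pressure_after_volume_change(p1: float, v1: float, v2: float) -> float:
    """Return the new pressure p2 from Boyle's Law: p1 * v1 = p2 * v2."""
    if v1 <= 0 or v2 <= 0:
        raise ValueError("volumes must be positive")
    return p1 * v1 / v2


# Illustrative values only: a sealed cavity shrinks from 5.0 to 4.0 units
# of volume, as when the lung pistons contract the lungs.
p_contracted = pressure_after_volume_change(p1=100.0, v1=5.0, v2=4.0)

# The reverse case: the cavity expands from 4.0 to 5.0 units of volume.
p_expanded = pressure_after_volume_change(p1=100.0, v1=4.0, v2=5.0)
```

In the first call the volume decreases and the pressure rises above its starting value; in the second the volume increases and the pressure falls, matching the description above.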
A situation can be considered where (1) the vocal fold valve is closed separating the supraglottal cavity from the subglottal cavity, (2) the mouth is open and, therefore, supraglottal air pressure is equal to atmospheric pressure, and (3) the lungs are contracted resulting in a subglottal pressure that has increased to a pressure that is greater than atmospheric pressure. If the vocal fold valve is subsequently opened, the previously two separate cavities become one unified cavity although the cavities will still be aerodynamically isolated because the glottic valve between them is relatively small and constrictive. Pascal's Law states that the pressure within a system must be equal throughout the system. When the subglottal pressure is greater than supraglottal pressure, there is a pressure inequality in the unified cavity. Since pressure is a force applied to a surface area by definition and a force is the product of mass and acceleration according to Newton's Second Law of Motion, the pressure inequality will be resolved by having part of the mass in air molecules found in the subglottal cavity move to the supraglottal cavity. This movement of mass is airflow. The airflow will continue until a pressure equilibrium is reached. Similarly, in an ejective consonant with a glottalic airstream mechanism, the lips or the tongue (i.e., the buccal or lingual valve) are initially closed and the closed glottis (the laryngeal piston) is raised decreasing the oral cavity volume behind the valve closure and increasing the pressure compared to the volume and pressure at a resting state. When the closed valve is opened, airflow will result from the cavity behind the initial closure outward until intraoral pressure is equal to atmospheric pressure. That is, air will flow from a cavity of higher pressure to a cavity of lower pressure until the equilibrium point; the pressure as potential energy is, thus, converted into airflow as kinetic energy.
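The equalization of pressure between two connected cavities can be illustrated with a deliberately simplified calculation. The sketch assumes two cavities that are sealed from the atmosphere, hold air at the same constant temperature, and behave as an ideal gas; under those assumptions the common final pressure is the volume-weighted average of the starting pressures. This is a simplification of the situation described above, where the mouth is open and the equilibrium is atmospheric pressure. The function name `equalized_pressure` and the values are hypothetical.

```python
def equalized_pressure(p_a: float, v_a: float, p_b: float, v_b: float) -> float:
    """Common pressure after a valve between two sealed cavities opens.

    Assumes an ideal gas at constant temperature, so the quantity
    pressure * volume summed over both cavities is conserved.
    """
    if v_a <= 0 or v_b <= 0:
        raise ValueError("volumes must be positive")
    return (p_a * v_a + p_b * v_b) / (v_a + v_b)


# Illustrative values: cavity A starts at a higher pressure than cavity B.
p_a, v_a = 120.0, 3.0
p_b, v_b = 100.0, 1.0

p_final = equalized_pressure(p_a, v_a, p_b, v_b)

# Air flows from the cavity of higher pressure to the one of lower pressure.
flow_direction = "A -> B" if p_a > p_b else ("B -> A" if p_b > p_a else "none")
```

The final pressure lies between the two starting pressures, and the airflow runs from the higher-pressure cavity to the lower-pressure one until that common value is reached.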
Sound sources refer to the conversion of aerodynamic energy into acoustic energy. There are two main types of sound sources in the articulatory system: periodic (or more precisely semi-periodic) and aperiodic. A periodic sound source is vocal fold vibration produced at the glottis found in vowels and voiced consonants. A less common periodic sound source is the vibration of an oral articulator like the tongue found in alveolar trills. Aperiodic sound sources are the turbulent noise of fricative consonants and the short-noise burst of plosive releases produced in the oral cavity.
Voicing is a common periodic sound source in spoken language and is related to how closely the vocal cords are placed together. In English there are only two possibilities, voiced and unvoiced. Voicing is caused by the vocal cords being held close to each other, so that air passing through them makes them vibrate. All normally spoken vowels are voiced, as are all other sonorants except h, as well as some of the remaining sounds (b, d, g, v, z, zh, j, and the th sound in this). All the rest are voiceless sounds, with the vocal cords held far enough apart that there is no vibration; however, there is still a certain amount of audible friction, as in the sound h. Voiceless sounds are not very prominent unless there is some turbulence, as in the stops, fricatives, and affricates; this is why sonorants in general only occur voiced. The exception is during whispering, when all sounds pronounced are voiceless.
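The voiced obstruents named above each have a voiceless counterpart made at the same place and in the same manner. The Python sketch below pairs them up; the names `VOICING_PAIRS`, `VOICED`, `is_voiced` and `voiced_counterpart` are hypothetical, the notation follows the informal spelling used in the text, and the table covers only stops, fricatives and affricates.

```python
# (voiceless, voiced) pairs of English stops, fricatives and affricates.
VOICING_PAIRS = [
    ("p", "b"),
    ("t", "d"),
    ("k", "g"),
    ("f", "v"),
    ("s", "z"),
    ("sh", "zh"),
    ("ch", "j"),
    ("th (thin)", "th (this)"),
]

VOICED = {voiced for _, voiced in VOICING_PAIRS}
VOICELESS_TO_VOICED = dict(VOICING_PAIRS)


def is_voiced(sound: str) -> bool:
    """True if the sound is one of the voiced members of the pairs."""
    return sound in VOICED


def voiced_counterpart(sound: str) -> str:
    """Return the voiced partner of a voiceless sound in the table."""
    return VOICELESS_TO_VOICED[sound]


z_is_voiced = is_voiced("z")
s_is_voiced = is_voiced("s")
partner_of_k = voiced_counterpart("k")
```

In this table z is voiced, s is not, and the voiced counterpart of k is g.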
- Non-vocal fold vibration: 20–40 hertz (cycles per second)
- Vocal fold vibration
- Lower limit: 70–80 Hz modal (bass), 30–40 Hz creaky
- Upper limit: 1170 Hz (soprano)
Vocal fold vibration
- cricoid cartilage
- thyroid cartilage
- arytenoid cartilage
- interarytenoid muscles (fold adduction)
- posterior cricoarytenoid muscle (fold abduction)
- lateral cricoarytenoid muscle (fold shortening/stiffening)
- thyroarytenoid muscle (medial compression/fold stiffening, internal to folds)
- cricothyroid muscle (fold lengthening)
- hyoid bone
- sternothyroid muscle (lowers thyroid)
- sternohyoid muscle (lowers hyoid)
- stylohyoid muscle (raises hyoid)
- digastric muscle (raises hyoid)
- Magnetic resonance imaging (MRI) / Real-time MRI
- Medical ultrasonography
- Electromagnetic articulography
In order to understand how sounds are made, experimental procedures are often adopted. Palatography is one of the oldest instrumental phonetic techniques used to record data regarding articulators. In traditional, static palatography, a speaker's palate is coated with a dark powder. The speaker then produces a word, usually with a single consonant. The tongue wipes away some of the powder at the place of articulation. The experimenter can then use a mirror to photograph the entire upper surface of the speaker's mouth. This photograph, in which the place of articulation can be seen as the area where the powder has been removed, is called a palatogram.
Technology has since made possible electropalatography (or EPG). In order to collect EPG data, the speaker is fitted with a special prosthetic palate, which contains a number of electrodes. The way in which the electrodes are "contacted" by the tongue during speech provides phoneticians with important information, such as how much of the palate is contacted in different speech sounds, or which regions of the palate are contacted, or what the duration of the contact is.
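The kinds of measurement mentioned above (how much of the palate is contacted, and for how long) can be sketched with a toy data structure. Everything here is an assumption made for illustration: the grid size, the frame interval, and the names `contact_fraction` and `contact_duration_ms` do not correspond to any particular EPG system or file format.

```python
from typing import List

# One EPG frame: rows of electrodes, 1 = contacted by the tongue, 0 = not.
Frame = List[List[int]]


def contact_fraction(frame: Frame) -> float:
    """Fraction of all electrodes contacted in a single frame."""
    total = sum(len(row) for row in frame)
    contacted = sum(sum(row) for row in frame)
    return contacted / total


def contact_duration_ms(frames: List[Frame], row: int, col: int,
                        frame_interval_ms: float) -> float:
    """Total time one electrode is contacted across a sequence of frames."""
    contacted_frames = sum(frame[row][col] for frame in frames)
    return contacted_frames * frame_interval_ms


# A hypothetical 3 x 4 grid sampled every 10 ms (both values are made up).
frames = [
    [[1, 1, 1, 1], [1, 0, 0, 1], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
]

first_frame_fraction = contact_fraction(frames[0])
front_left_duration = contact_duration_ms(frames, row=0, col=0,
                                          frame_interval_ms=10.0)
```

In the first frame half of the electrodes are contacted, and the electrode at row 0, column 0 is contacted in two of the three frames.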
- Note that although sound is just air pressure variations, the variations must be at a high enough rate to be perceived as sound. If the variation is too slow, it will be inaudible.
- Laver, John (1994). Principles of Phonetics. Cambridge University Press.
- Ladefoged, Peter; Maddieson, Ian (1996). The Sounds of the World's Languages. Blackwell. ISBN 0-631-19815-6.
- Stated in a less abbreviatory fashion: pressure1 × volume1 = pressure2 × volume2
- volume1 divided by sum of volume1 and change in volume = sum of pressure1 and the change in pressure divided by pressure1
- Niebergall, A; Zhang, S; Kunay, E; Keydana, G; Job, M; et al. (2010). "Real-time MRI of Speaking at a Resolution of 33 ms: Undersampled Radial FLASH with Nonlinear Inverse Reconstruction". Magn. Reson. Med. doi:10.1002/mrm.24276.
- Ladefoged, Peter (1993). A Course In Phonetics (3rd ed.). Harcourt Brace College Publishers. p. 60.