Do Isolated Vowels Represent Vowel Targets in French? An Acoustic Study On Coarticulation

Coarticulatory effects of labial, dental, palato-velar and uvular places of articulation on the vowel targets of ten French oral vowels /i, e, ɛ, a, u, o, ɔ, y, ø, œ/ were examined. The average vowel formant frequencies F1, F2, F3 and F4 in symmetrical CVCVCVC sequences were compared with the formant values of the same vowels in isolation. The results show that the direction and magnitude of coarticulation of most vowels follow, as expected, contextual assimilation or acoustic centralization. Nevertheless, the vowels /a/ and /ɔ/ present unpredictable coarticulatory patterns. This can be explained by the fact that 1) /a/ has two phonetic variants depending on the environment: a back vowel [ɑ] in isolation and a central/front vowel [a] in continuous speech, and 2) uttering [ɔ] in isolation violates the phonotactic rules of Standard French. These results suggest that coarticulatory effects on the vowels /a/ and /ɔ/ should probably not be studied with their isolated realizations as a reference.


Coarticulatory Effects and Vowel Targets
Phonemes of a language are rarely pronounced in isolation in continuous speech. Coarticulation, inherent to speech, refers to the fact that the phonetic nature of phonemes, i.e. the articulation and acoustic characteristics of vowels and consonants, is determined inter alia by the surrounding sounds. In French, coarticulatory effects are mostly anticipatory (Gottfried 1984). The dental stop /t/ in the pronoun <tu> /ty/ is produced with rounded lips in anticipation of the front rounded vowel, i.e. [tʷ]. Hirsch et al. (2004) found that the fricative noise of /s/ in the sequence /sy/ lowers so as to match the F3 of the following rounded vowel. The perceptual impact is that when /s/ is extracted from the sequence /sy/ and played alone, listeners can correctly guess what kind of vowel will follow. Although coarticulation is primarily anticipatory in French, progressive coarticulation, i.e. the influence of phonemes on following segments such as C on V in a CV sequence, is, of course, also active, which is explainable by acoustic principles (Fant 1960, Stevens 1998).
To study coarticulatory effects of consonants on surrounding vowels, an idealized vowel target that can serve as a reference must first be defined. In Miller's (1981) definition of vowel reduction, idealized vowel targets are represented by isolated vowels, i.e. vowels delimited by a pause. Once the vowel target is defined, researchers can investigate the direction of formant shifts. The systematic effect of the place of articulation of the surrounding consonants on the spectral characteristics of vowels was first analysed by Stevens & House (1963), hereafter SH, and later by Hillenbrand et al. (2001) for English. The authors examined how the average formant frequencies of the English vowels /i, ɪ, ɛ, æ, ɑ, ʊ, u, ʌ/ changed according to the immediate symmetrical consonant context, which was labial, post-dental, or velar. Vowels uttered in isolation, i.e. between two short pauses, and in a glottal context served as a reference. In French, coarticulatory effects of labial, dental and velar consonants on fourteen vowels /i, y, u, e, ø, o, ɛ, œ, ɔ, a, ɑ, ɑ̃, ɛ̃, ɔ̃/ were examined by Durand (1985) in an acoustic study based on utterances of two speakers. Unlike SH and Hillenbrand et al., Durand quantified the coarticulatory effects by using as a reference the general mean formant values, averaged over all contexts, including a uvular one. To the best of the author's knowledge, no other systematic study on coarticulatory effects of consonants with different places of articulation has been carried out for French, except for the vowel /a/ (Vaissière 1985). Nevertheless, other studies in the field of vowel reduction have focused on other factors, such as tempo or stress. Noteworthy studies examining vowel reduction (or 'target undershoot') as a function of vowel duration were conducted for English by Lindblom (1963) and, more recently, for French by Gendrot & Adda-Decker (2007), hereafter GAD.
Two types of phonetic reduction can explain the non-achievement of the vowel acoustic target: acoustic centralization and contextual assimilation. Acoustic centralization means that formant values tend toward those of a neutral vowel, with formant frequencies F1, F2 and F3 at 500 Hz, 1500 Hz, and 2500 Hz, respectively. These frequencies correspond to the resonances of a uniform tube 17.7 cm long (Stevens 1998). Following this tendency, the first formant frequency F1 of closed vowels uttered rapidly should rise and that of open vowels should fall. F1 is traditionally correlated with vowel aperture, controlled by tongue height and the mandible, and F2 with anteriority-posteriority, but lip rounding can also affect the first and second formant frequencies (Vaissière 2007). The second type of phonetic reduction, contextual assimilation, means that formants shift from the acoustic vowel target toward the acoustic loci of adjacent consonants. Labial consonants are known to lower all vowel formants. French dental consonants, which have a locus at around 1800 Hz (Delattre 1961), induce systematic shifts toward 1800 Hz in the formant mainly associated with the front cavity (in most cases F2, but F3 for /i/). All front consonants tend to lower the F1 of the surrounding vowels, and back ones such as /R/ raise the F1 of neighbouring vowels. GAD, having analysed 25 hours of radio speech recordings, found that vowel reduction corresponds to acoustic centralization in only 40% of consonant contexts. Whereas the direction of formant shifts is stable across languages, which is explainable by universal acoustic principles, the magnitude of coarticulation is language-dependent (Manuel & Krakow 1984): the more crowded the vocalic space of a given language, the less coarticulation is expected.
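As a rough check on the uniform-tube figure, the resonances of a tube closed at one end (glottis) and open at the other (lips) can be sketched with the quarter-wavelength model, F_n = (2n - 1)c / 4L. The function name and the speed of sound (354 m/s, a value often used for warm, humid air) are assumptions made for this illustration and are not taken from the study.

```python
def tube_resonances(length_m, n_formants=3, speed_of_sound=354.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L).
    speed_of_sound (m/s) is an assumed value, not one reported in the study.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]


if __name__ == "__main__":
    # 17.7 cm tube, as in the text
    for n, freq in enumerate(tube_resonances(0.177), start=1):
        print(f"F{n} = {freq:.0f} Hz")
```

Under these assumptions the resonances fall at odd multiples of the lowest one, which is why the neutral vowel is usually described with evenly spaced formants.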

Vowel Inventory of Standard French
In phonetic studies, Standard French is described as "Northern French" (De Mareüil et al. 2010), "Nonmeridional French" (Pustka 2011), or "Parisian French" (Miller et al. 2011). Although the designations differ from author to author, all seem to agree on excluding the south of France; the reasons for this are given in the following section.

Mid Vowel Distribution: Conflict Between General Tendency and Particular Rules
French is characterized by a predominance of open syllables, reinforced by the phenomena of liaison and enchaînement (Wioland 1991). The phonotactic rules of Standard French impose the production of /ø/ and /o/ in a stressed open syllable (/œ/ and /ɔ/ are not allowed in that position), but the contrast between /e/ and /ɛ/ is sometimes maintained (Wioland 2005). According to the bon usage, speakers of French should pronounce the two verb forms 'chanterai' (future indicative, pronounced with a final /e/) and 'chanterais' (conditional, pronounced with a final /ɛ/) differently, but this is rarely the case in spontaneous speech or even on the radio or in TV news (Lefebvre 1988).
The mid vowels are neutralized in meridional French and have a low functional load in Standard French because of the restricted number of minimal pairs actively produced by young educated speakers. Even when speakers commute the qualities within the vowel pairs, listeners generally understand the message without any difficulty.

Aim of the Study
The first aim is to complete the acoustic data on the coarticulatory effects of symmetrical consonant environments on vowels in French. The second aim is to examine whether it is appropriate to consider isolated vowels as idealized vowel targets in Standard French. A previous study of 40 French women has shown that native French speakers contrast the mid vowels e/ɛ, o/ɔ, ø/œ in isolation, and that isolated vowels can therefore be considered a reference (Georgeton et al. 2012). However, phonotactic rules prohibit placing /ɔ/ and /œ/ in an open syllable, and their utterance in isolation is thus a laboratory artefact.

Data Acquisition
Ten female non-southern native French speakers without any known hearing problem took part in this experiment. At the time of the recordings, the speakers were aged between 21 and 48 years (M = 28.5, SD = 7.5) and had lived in Paris for 16.2 years on average. They were recorded in an anechoic chamber with an AKG C 520 L headband microphone and an Edirol UA-25 sound card connected to a Mac computer. The speech material, derived from a larger corpus (Landron et al. 2011), consists of ten French oral vowels /i, e, ɛ, a, u, o, ɔ, y, ø, œ/ produced in:
1. isolation, i.e. between two short pauses, embedded in carrier sentences such as 'Tôt, il a dit <ô> comme dans tôt' ('Soon, he said <ô> as in soon'). This first part of the speech material serves to define the acoustic target for each vowel.
2. symmetrical CViCVmCVfC sequences, where C corresponds to consonants of four different places of articulation: labial /p/, dental /d/, palato-velar /k/ and uvular /R/, and Vi, Vm and Vf correspond respectively to an initial, a median and a final vowel. The nonce words were embedded in carrier sentences such as 'Le mot kaukaukauke peut bien coller.' ('The word kaukaukauke matches well.'). To make sure the speakers produced the target vowel, existing words containing the target vowel were given on the same slide and were also indicated with phonetic symbols. As the spectral characteristics of vowels vary minimally according to their position within nonce words in French (Paillereau 2015b), the formant values of each vowel were averaged over the three word positions. This second part of the speech material serves to test coarticulatory effects.
The sentences, each repeated four times, were mixed so as to avoid uttering the same target vowel consecutively. All instructions were given in writing, and the recordings were preceded by a training phase.
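The construction of the symmetrical nonce words and the mixing constraint (no target vowel twice in a row) can be sketched as follows. The text does not say how the mixing was actually carried out, so the greedy ordering below is only one possible approach; all function names are hypothetical, and the nonce words are written as phoneme strings rather than in the orthography shown to the speakers.

```python
import random
from collections import defaultdict

CONSONANTS = ["p", "d", "k", "R"]
VOWELS = ["i", "e", "ɛ", "a", "u", "o", "ɔ", "y", "ø", "œ"]


def build_nonce_words(consonants, vowels):
    """Return (consonant, vowel, CVCVCVC string) for every C/V pairing."""
    return [(c, v, (c + v) * 3 + c) for c in consonants for v in vowels]


def order_without_repeats(items, key, seed=0):
    """Order items so that no two neighbours share the same key.

    Greedy strategy: at each step pick, among keys different from the
    previous one, a key with the most items still to place.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for item in items:
        pools[key(item)].append(item)
    for pool in pools.values():
        rng.shuffle(pool)
    ordered, previous = [], None
    while pools:
        candidates = [k for k in pools if k != previous]
        if not candidates:
            raise ValueError("cannot avoid consecutive repeats")
        most = max(len(pools[k]) for k in candidates)
        choice = rng.choice([k for k in candidates if len(pools[k]) == most])
        ordered.append(pools[choice].pop())
        if not pools[choice]:
            del pools[choice]
        previous = choice
    return ordered


if __name__ == "__main__":
    words = build_nonce_words(CONSONANTS, VOWELS)
    session = order_without_repeats(words * 4, key=lambda w: w[1])
    print(len(session), "items; first five:", [w[2] for w in session[:5]])
```

A plain random shuffle followed by rejection would rarely satisfy the constraint with this many repetitions per vowel, which is why a constructive ordering is used in the sketch.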

Measurements
The ten target vowels were recorded with a sampling frequency of 44100 Hz and a bit depth of 16 bits. Isolated vowels and vowels in trisyllabic nonce words were extracted from their carrier sentences and analysed in Praat (Boersma & Weenink 1992). Formant frequencies were measured semi-automatically at approximately one third, one half, and two thirds of the vowel duration. The command applied, To Formant (burg), is based on Linear Predictive Coding (LPC) analysis, with the analysis window fixed at 25 ms. As the speakers are all women, who have a higher fundamental frequency and higher resonance frequencies than men, the first five formants were searched for in the range 0-5500 Hz. When the formant detection produced aberrant values, the upper limit of the frequency range was lowered to 5000 Hz; if detection errors persisted, the values were checked and corrected manually. When errors could not be corrected because the formants were not clearly visible, as in a breathy voice resulting in a relatively high-amplitude first harmonic and relatively weak upper harmonics (Klatt & Klatt 1990), the vowels were removed from the analysis. The final database thus contains 399 vowels uttered in isolation and 4,793 vowels uttered in phonetic contexts.
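The measurement procedure can be summarized in a short sketch: three measurement points per vowel, and a fallback from a 5500 Hz to a 5000 Hz formant ceiling before manual checking. The functions below are hypothetical helpers written for illustration; `track_formants` and `is_plausible` stand in for the actual Praat analysis and for the researcher's judgement of aberrant values, neither of which is specified in code form in the text.

```python
def measurement_times(start_s, end_s, fractions=(1 / 3, 1 / 2, 2 / 3)):
    """Time points (s) at the given fractions of the vowel duration."""
    if end_s <= start_s:
        raise ValueError("vowel end must be later than vowel start")
    duration = end_s - start_s
    return [start_s + fraction * duration for fraction in fractions]


def measure_with_fallback(track_formants, is_plausible,
                          ceilings_hz=(5500.0, 5000.0)):
    """Try each formant ceiling in turn.

    track_formants(ceiling_hz) -> list of formant values (hypothetical callable)
    is_plausible(values)       -> bool (hypothetical plausibility check)
    Returns (ceiling_hz, values) for the first plausible result, or None
    when the vowel has to be checked manually.
    """
    for ceiling in ceilings_hz:
        values = track_formants(ceiling)
        if is_plausible(values):
            return ceiling, values
    return None


if __name__ == "__main__":
    # A vowel from 1.00 s to 1.30 s is measured near 1.10, 1.15 and 1.20 s.
    print([round(t, 3) for t in measurement_times(1.0, 1.3)])
```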

Results Visualization
All the raw data were saved in an xls file that was further processed in a program called VisuVo, standing for Visualisation of Vowels (Paillereau 2015a). Compared to Praat, VisuVo has the advantage of quickly illustrating coarticulatory effects in an interactive way. In this study, French posterior vowels, characterised by the convergence of F1 and F2, are defined only in terms of these two formants, as only they turn out to be perceptually salient (Delattre 1951). Anterior vowels, on the other hand, are defined by the first three formants. F3 is an acoustic correlate of labiality, opposing front rounded and unrounded vowels. Lastly, the closed front unrounded vowel /i/ is described by means of the first four formants, because the region where F4 merges with F3 is perceptually the most important frequency zone for the identification of this vowel (Vaissière 2008).
The mean formant values were calculated as follows: a) in isolated vowels, the formants were averaged over three points of measurement and four repetitions of each vowel by each of the ten speakers; b) as examining the role of prosodic position in vowel reduction is not the goal of this study, and as phonetic reduction due to stress turns out to be minor in French (Delattre 1969, Paillereau 2015b), the formant values of coarticulated vowels were averaged over the three positions within a nonce word CViCVmCVfC and four repetitions of each vowel by each of the ten speakers. In order to better approximate the non-linear perception of vowels, the linear formant frequency scale was transformed into Bark following Zwicker and Fastl's (1990) formula.
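A minimal sketch of the Hertz-to-Bark conversion, assuming the arctangent approximation of critical-band rate commonly associated with Zwicker's work, z = 13 arctan(0.00076 f) + 3.5 arctan((f / 7500)^2). Whether this is exactly the variant applied to the data here is an assumption, and the function name is hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Convert a frequency in Hz to Bark (assumed arctangent approximation)."""
    if freq_hz < 0:
        raise ValueError("frequency must be non-negative")
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


if __name__ == "__main__":
    for freq in (500.0, 1000.0, 1800.0, 2500.0):
        print(f"{freq:6.0f} Hz -> {hz_to_bark(freq):5.2f} Bark")
```

Because the scale compresses high frequencies, a given shift in Hz corresponds to a larger Bark distance in the F1 region than in the F3 region, which is the motivation for comparing formant shifts in Bark.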

Acoustic Target of Isolated Vowels
The average vowel formant frequencies F1 and F2 of ten French oral vowels uttered in isolation, as well as F3 of front vowels and F4 of /i/, are summarized in Table 1. The standard deviation expressed in Hertz and in percentage is in brackets. Extreme values are marked in bold.
The formant frequency F2 is traditionally, but not exclusively, correlated with backward and forward movements of the tongue. Table 1 indicates that the front unrounded vowels /i, e, ɛ/ are realized with an average F2 of over 2200 Hz, whereas F2 of the front rounded vowels /y, ø, œ/ is lower, as expected, but still over 1500 Hz. An F2 below 1500 Hz is characteristic of the back vowels /u, o, ɔ/ and of /a/. Physically, /a/ is thus realized in isolation as a pharyngeal [a̠] with a high F1 and a low F2, and lies among the back vowels in the acoustic triangle.
Finally, the formant frequency F3 is primarily an acoustic cue of labiality: the front rounded vowels /y, ø, œ/ are realized with a lower F3 (at 2460 Hz, 2645 Hz and 2751 Hz, respectively) than their unrounded counterparts /i, e, ɛ/, whose F3 is at 3787 Hz, 3364 Hz and 3066 Hz, respectively. Again, there is no exclusive correspondence between labiality and the third formant frequency F3, because 1) lip rounding also affects other vowel formants, as outlined in Figure 1; 2) F3 can be affected by other articulatory gestures, as demonstrated, for example, by Riordan (1977).
Figure 1: Formants (in Bark) F1-F3 for isolated front vowels and F1-F4 for /i/, averaged over 3 points of measurement (1/3, 1/2 and 2/3 of the vowel duration) and 4 repetitions by 10 French women. Standard deviation is 1.
Figure 1 shows that the opposition between isolated i/y is mainly based on F3 frequency (the highest for /i/, low for /y/), and to a lesser degree on F2. In the /i/ configuration, the front cavity is much narrower than the back cavity, so the resonances of the two cavities are independent; F3 is a resonance of the front cavity and F2 is due to the back cavity. In the /e, ɛ, œ, ø/ configurations, the front cavity is less narrow, so the resonances F2 and F3 depend on both cavities. According to our data, the contrasts between e/ø and ɛ/œ are mainly based on F2, and to a lesser degree on F3.

Statistical Treatment
In order to study the effects of an immediate context which can be null, labial, dental, palato-velar or uvular on formant frequencies F1, F2 of ten French vowels, as well as on F3 of front vowels, a series of ANOVAs were undertaken. The main effect of the context is significant, as expressed by the ANOVA value: F(4, 5145) = 283, p<0.05 for F1, F(4, 5145) = 731, p<0.05 for F2, and F(4, 5145) = 91, p<0.05 for F3. The context effect on F2 proves the most significant.
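To illustrate how an F value with such degrees of freedom arises, the sketch below computes a one-way ANOVA in plain Python. It assumes a simple one-way design with context as the only factor, which may not match the design actually used; the function name is hypothetical, and the toy values in the usage example are invented for illustration and are not data from the study.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (one per group)."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Invented F2 values (Bark) for five contexts: null, labial, dental,
    # palato-velar, uvular. Five groups give 4 between-group degrees of freedom.
    toy = [[10.1, 10.3, 10.2], [11.4, 11.6, 11.5], [12.3, 12.5, 12.4],
           [12.9, 13.1, 13.0], [10.8, 10.9, 10.7]]
    f_stat, df_b, df_w = one_way_anova(toy)
    print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

With five context levels the between-group degrees of freedom equal 4, which matches the first value in the F(4, 5145) notation.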

Average Formant Values and Shifts
Coarticulatory effects are quantified in terms of the direction and amount of formant shifts. The differences (in Bark) between average formant frequencies F1, F2 and F3 of vowels uttered in labial, dental, palato-velar and uvular environments, and those of the same vowels uttered in isolation are summarized in Table 2. Negative values signify downward coarticulatory shifts, whereas positive values correspond to upward shifts. Unexpected shifts are greyed and formant differences corresponding to 1 Bark or more are in bold.

It is apparent from Table 2 that shifts in F1 are, on average, smaller than those in F2. All contexts, except the uvular one, cause minor coarticulatory shifts in F1 of closed or mid-closed vowels, not exceeding 30 Hz. As for the open and mid-open vowels /a/, /ɛ/ and /ɔ/, the F1 shifts are more appreciable. In a palato-velar context, F1 of /a/ and /ɛ/ is lower by 78 Hz and 59 Hz, respectively; this movement can be explained both by contextual assimilation and by acoustic centralization. By contrast, neither of the tendencies can explain the upward shift of /ɔ/ in a labial context compared to an isolated /ɔ/ (+65 Hz). Formant shifts in F1 and F2 of /ɔ/ are illustrated in Figure 2. Figure 2, generated with the VisuVo software, traces the evolution of formants according to the phonetic context: labial (1st column), dental (2nd column), palato-velar (3rd column), uvular (4th column) and null context with isolated vowels, whose formants are represented by horizontal strokes crossing the Figure. Also visualized is the evolution of mean formant values according to the position within the nonce word, which is initial, median or final, for each context separately. Finally, the Figure shows the evolution of formants according to the point of measurement, at the beginning, in the middle and at the end of the vowel. The frequency scale is in Bark and the standard deviation is one.
The second formant frequency turns out to be more sensitive to coarticulation than F1, which is consistent with SH data. The magnitude and the direction of F2 shifts depend on the place of articulation of the surrounding consonants. Average formant values F1 and F2 of vowels in labial, dental and palato-velar contexts (dotted line) compared to the same vowels in isolation (plain line) are illustrated in the vowel triangle of Figure 3. It is apparent from the Figure that a labial context causes a) F2 downward shifts in the front unrounded vowels /i, e, ɛ/ of 153 Hz, 202 Hz and 152 Hz, respectively, b) minimal F2 shifts in front rounded vowels, and c) F2 upward shifts in the back vowels /u/, /o/, /ɔ/ (of 82 Hz, 100 Hz and 170 Hz, respectively) and also in /a/ (of 278 Hz). This result is consistent with acoustic centralization. A dental context mainly affects F2 of back vowels: compared to the vowel targets, F2 of /u/, /o/, /ɔ/ in tVt syllables rises by 440 Hz, 393 Hz and 389 Hz, respectively. F2 of front rounded vowels also goes upwards, but to a lesser degree, whereas F2 of front unrounded vowels goes downwards. F2 of /i/ is not much affected. These data are consistent with contextual assimilation, because F2 of the vowel tends toward the dental locus of the consonant at about 1800 Hz. A palato-velar context mostly causes F2 upward shifts, by 19 Hz, 152 Hz and 240 Hz in the front rounded vowels /y/, /ø/ and /œ/, respectively, and by 76 Hz, 153 Hz and 207 Hz in the back vowels /u/, /o/ and /ɔ/, respectively. Nevertheless, the greatest F2 upward shift is noted for /a/: it rises by 683 Hz. A uvular context, as indicated in Figure 4, induces F2 downward shifts in front vowels, except for /i/, as expected, and for /a/. F2 of back vowels is not much affected by /R/, which is explainable by the place of articulation: no large tongue movement is required in order to pass from one phoneme to another.
Our data further indicate that the amount of coarticulation is also vowel-dependent. Figure 3 and Table 2 show that F2 shifts are, in most contexts, more prominent in back vowels and /a/ than in front vowels. This result was expected: as already shown in Figure 1, a large F2 variation in front vowels could neutralize the contrasts between rounded and unrounded vowels. The difference in the magnitude of coarticulation between /i/ and /a/ is illustrated in Figure 5 and Figure 6, which trace the evolution of F-patterns according to the 1) point of measurement (at 1/3, 1/2 and 2/3 of the vowel duration), 2) position in the word (initial, median and final vowels within a trisyllabic nonce word) and 3) phonetic context (/p/, /t/, /k/ and /R/). The formants of the same vowels uttered in isolation are represented by horizontal strokes. Figure 5 shows that F2 of /i/, which is realised with a prepalatal constriction and thus with a very short anterior cavity (Vaissière 2007), is primarily affiliated with the back cavity, which is not much affected by coarticulatory effects. Figure 6 points out surprisingly prominent F2 shifts for /a/: while F2 of an isolated /a/ is at about 1300 Hz, F2 of /a/ uttered in labial, dental and palato-velar environments reaches 1571 Hz, 1808 Hz and 1976 Hz, respectively. Against expectations, F2 rises even in a uvular context, to 1426 Hz on average. The direction and amount of formant shifts in /a/ thus cannot be systematically explained by acoustic centralization or contextual assimilation.
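The size of these F2 shifts for /a/ can be re-expressed in Bark with a short sketch. It uses the mean F2 values quoted in the paragraph above; the isolated value is the rounded 1300 Hz given in the text, so the shifts in Hz differ by a few Hz from those reported elsewhere in this section. The Hertz-to-Bark formula is the assumed arctangent approximation, and all names are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Assumed arctangent approximation of the Bark scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


# Mean F2 of /a/ (Hz) as quoted in the text; the isolated value is rounded.
ISOLATED_F2_HZ = 1300.0
CONTEXT_F2_HZ = {
    "labial": 1571.0,
    "dental": 1808.0,
    "palato-velar": 1976.0,
    "uvular": 1426.0,
}


def f2_shifts(isolated_hz, context_hz):
    """Map each context to (shift in Hz, shift in Bark) relative to isolation."""
    reference_bark = hz_to_bark(isolated_hz)
    return {context: (freq - isolated_hz, hz_to_bark(freq) - reference_bark)
            for context, freq in context_hz.items()}


if __name__ == "__main__":
    for context, (d_hz, d_bark) in f2_shifts(ISOLATED_F2_HZ, CONTEXT_F2_HZ).items():
        flag = " (1 Bark or more)" if abs(d_bark) >= 1.0 else ""
        print(f"{context:13s} {d_hz:+6.0f} Hz  {d_bark:+5.2f} Bark{flag}")
```

Under these assumptions the labial, dental and palato-velar shifts all exceed 1 Bark, while the uvular shift stays below that threshold.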
Lastly, coarticulatory shifts in F3 are appreciable in most consonantal contexts only for front unrounded vowels /i/ and /e/, both produced with a narrow front constriction. Table 2 makes it clear that the uvular consonant /R/ has the largest impact on the vowels' third formant frequency.

Conclusion
This study has confirmed that native French non-southern speakers can produce, for laboratory purposes, isolated oral vowels with ten different qualities /i, e, ɛ, a, u, o, ɔ, y, ø, œ/, which are identified by French non-southern listeners with ease. The ten French isolated vowels thus have at least a psychological identity and are part of the phonological competence of native French speakers. This result is consistent with two previous studies: 1) Gottfried (1984), showing a better identification of French isolated vowels than of vowels in context by native French listeners, and 2) Georgeton et al. (2012), giving F-patterns of isolated French vowels that could serve as a reference for learners of French as a Foreign Language (FFL).
This study contributes to completing data on vowel coarticulation caused by only one factor, which is the place of articulation of surrounding consonants. In general, the first vowel formant frequency, traditionally, but not exclusively, correlated with vowel aperture, proves to be less affected by coarticulatory phenomena than the second vowel formant frequency, traditionally, but not exclusively, correlated with anteriority/posteriority of the place of articulation. F1 shifts that are the most prominent in open and mid-open vowels lead to a general elevation of the vowel triangle floor, which is consistent with previous findings (Flemming 2005). In all vowels, except for /ɔ/, F1 shifts can be explained either by contextual assimilation or acoustic centralization. The vowel /ɔ/ manifests unpredictable coarticulatory patterns: its F1 is considerably higher in a labial context than in a null context, which cannot be explained by either theory. The explanation more likely points to a violation of orthoepic rules of Standard French, which prohibit uttering /ɔ/ in isolation. The fact that isolated /ɔ/ is a laboratory artefact is also reflected in the large dispersion of its F1, which means that its acoustic realization varies a great deal from one speaker to another, and even within one speaker.
Shifts in F2 are more prominent in back vowels and /a/ than in front vowels. In accordance with GAD results for French, a dental context causes appreciable formant shifts, whereas a labial context affects adjacent vowels to a lesser extent. This result is also consistent with Nguyen & Fagyal (2008), who showed that the acoustic fronting of the back vowels /o/ and /ɔ/ as a result of vowel harmony characterizes the current tendency of Standard French. Prominent F2 shifts are in accordance with the results of current studies on acoustic variation in different languages (Gendrot & Adda-Decker 2007, Mok 2011). However, the present study shows that the acoustic distance between F2 of /a/ in isolation and in different consonant environments is so large that the French vowel /a/ actually has two different phonetic variants: a back [ɑ] in isolation, with F1 and F2 close together, and a central/front [a] in most consonant contexts, with an F2 at 1500 Hz or more. A very low F2 value for a French isolated /a/, which is longer than /a/ extracted from continuous speech, was reported by Vaissière (1985). The acoustic centralization of /a/ in most dialects of French has also been noted (Nguyen & Fagyal 2008).

Applications
The results of the present study can be applied in the domain of didactic phonetics of FFL. Specialized manuals dedicated to phonetic training typically show the articulatory characteristics of isolated vowels through sagittal profiles, and learners are invited to repeat isolated sounds, as if these were an appropriate starting point for pronouncing spoken French correctly. However, as the French isolated vowels /a/ and /ɔ/ do not seem to represent a reference in studies on coarticulation, FFL teachers and textbooks should not treat these vowels as a base for acquiring the different allophonic variations typical of continuous speech. The fact that learners do not automatically master vowels in context just because they master them in isolation, which is the most common approach in articulatory exercises, is documented by a recent study (Paillereau 2015a). Paillereau showed that very advanced Czech learners of French succeed in realizing the contrast between isolated ø/œ, whereas they generally produce just one vowel quality corresponding to the two phonemes ø/œ in labial, dental, palato-velar and uvular contexts. Further investigation of whether and to what extent phonetic exercises based on isolated sounds are useful and adequate thus needs to be conducted with FFL learners. If a phonetic syllabus is carefully designed from the very beginning, learners could be prevented from acquiring some fossilized pronunciation errors that are very difficult to eliminate (Galazzi-Matasci & Pedoya 1983).