Assessing implicit phonological knowledge through accent imitation

This study investigates learners’ implicit knowledge of Voice Onset Time (VOT), a non-distinctive phonetic difference between German and French. Previous studies on VOT in speech produced by English native speakers learning Spanish (Flege & Hammond 1982) and by L1 Spanish-speaking learners of English (Mora et al. 2014) suggest that learners modify their native VOT patterns when attempting to imitate a foreign accent. This was taken as evidence for the development of tacit awareness of cross-language VOT differences between L1 and L2 voiceless stops. In order to determine whether similar modifications occur in the productions of German-speaking learners of French as a foreign language, we assessed learners’ speech through VOT duration measures in word-initial /p,t,k/. Data were collected through a reading-aloud elicitation task. Results provide support for the hypothesis that German native speakers are able to modify VOT duration when mimicking a French accent. Given these findings, we believe that accent imitation tasks could be used in L2 phonology instruction to raise learners’ awareness of non-distinctive phonetic differences.

Résumé. Studying implicit phonological knowledge in a second language through accent imitation. In this study, we examine the implicit knowledge that learners of French as a foreign language (FLE, français langue étrangère) have of Voice Onset Time (VOT), a phonetic difference without distinctive value between German and French. Earlier research on VOT in the oral productions of English-speaking (Flege & Hammond 1982) and Spanish-speaking (Mora et al. 2014) subjects showed that they modified VOT in tasks requiring the imitation of a foreign accent. These results indicate that these speakers, learners of Spanish and of English respectively, have probably developed implicit knowledge of the differences in the VOT of voiceless stops between their native language and the target language. In order to determine whether this is also the case for German speakers learning French, we measured and compared the VOT of the voiceless stops /p,t,k/. The data were collected through a reading test that included the imitation of a French accent in German. The analysis supports the hypothesis that the subjects are able to modify the VOT of /p,t,k/ in initial position. In view of this result, it seems desirable for accent imitation tasks to find a place in phonology teaching, in order to make learners aware of phonetic differences without distinctive value between the sounds of the L1 and the L2.

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/). SHS Web of Conferences 38, 00010 (2017), COULS 2016. DOI: 10.1051/shsconf/20173800010


Introduction
The field of second language acquisition (SLA) has a long-standing interest in the topic of implicit and explicit learning. As Andringa and Rebuschat (2015: 186) state, "the distinction between implicit and explicit learning and knowledge and how they interface is crucial for a proper understanding of how L2 proficiency develops". In recent discussion (Hulstijn 2005, De Graaff & Housen 2009, Han & Finneran 2014), the question of the interface between these two learning modes and language use has emerged as a key issue with regard to second language teaching.
While considerably more work is needed to fully understand how information processing relates to second language speech production, a growing number of studies support the view that clear separations exist between brain areas involved in explicit and implicit learning and memory, and that the two types of knowledge are related to different educational experiences (Ellis and Young 1988, Ellis 2008). Explicit (or declarative) knowledge generally refers to "things we are aware we know and can usually describe to others" (Anderson & Lebiere 1998: 5). In contrast, according to Hulstijn, implicit learning can be defined as an "autonomous process, taking place whenever information is processed receptively (through hearing and seeing)" (Hulstijn 2002: 206). The author also stresses that "since implicit learning takes place as an unstoppable information processing mechanism, it will automatically accompany explicit learning activities whenever L2 learners engage in practising the pronunciation of a particular sound, or producing a grammatical structure" (Hulstijn 2002: 207).
Despite recent advances in methodology and new measurement techniques (for example functional magnetic resonance imaging and eye-tracking) used for studying the input processing mechanisms underlying implicit (statistical) and explicit learning, we do not yet have adequate knowledge either about the nature of the acquired knowledge or about how information processing in SLA relates to usage.
It is important to note that the study of the representation of knowledge has acquired significance over the last twenty years. In his 2002 article, Hulstijn argues that automatisation is a concomitant, incidental feature of implicit learning. More precisely, what may appear to be automatisation of declarative knowledge is in fact automatisation through statistical learning and neural network construction. Automatisation is especially important in phonology acquisition, since performance fluency can (at least partially) be considered a result of implicit learning. According to Hulstijn (2002), acoustic signals are part of what may be called 'low linguistic knowledge'. This view is based on the idea that language phenomena can be mapped on a scale ranging from high to low, according to the degree of consciousness and control. As the author points out, "our brains cannot instruct our ears to disregard tones of a certain frequency" (Hulstijn 2002: 202). Second language phonology acquisition (including the ability to process fine phonetic detail) thus seems prone to statistical learning.
It is widely acknowledged that at relatively advanced levels of proficiency, L2 learners have developed explicit knowledge of the phonological system of the target language which is used in L2 speech perception and production. Furthermore, they generally have developed implicit knowledge of differences between the sound structures of their L1 and those of their L2. To take but one example, it has been shown that learners can acquire the phonotactic constraints of an L2 without explicit instruction and are able to use them efficiently in segmenting continuous speech (Weber & Cutler 2006).
These findings suggest that proficient learners have accumulated rich implicit statistical knowledge of patterns based on the distributional properties of the input (Ellis 2002, 2005, 2008). It is noticeable that previous studies on the acquisition of implicit knowledge (e.g. Reber 1967, DeKeyser 1997, Grey, Williams & Rebuschat 2014) have tended to focus on grammar rather than phonology. If phonotactic probability can be learned through exposure to the target language, we might expect that other aspects of phonology illustrate frequency effects in L2 phonology acquisition.
In this paper, we will thus examine another phenomenon that may reflect the importance of implicit learning in second language acquisition. More precisely, we will focus on Voice Onset Time (VOT). VOT is an acoustic parameter that measures the time lapse between the stop consonant release and the voice onset of the following vowel (Lisker & Abramson 1964). According to Liberman et al. (1958), VOT represents the most reliable acoustic cue for the distinction between voiced and voiceless stops.
VOT values vary for each consonant and may be best represented as a range of durations. Lisker and Abramson (1964, 1967) found that three categories of stops can be established along the VOT continuum: stops with a 'voicing lead', i.e. negative VOT values, ranging from about -125 to -75 milliseconds; stops with a 'short voicing lag', exhibiting positive VOT values, ranging from 0 to +25 milliseconds; and finally stops with a 'long voicing lag', i.e. large positive VOT values, ranging from +60 to +100 milliseconds. The authors also highlight language-specific differences with respect to VOT.
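The three ranges just cited can be written down as a small lookup. The following Python sketch is only an illustration: the function name, the inclusive boundaries and the decision to return `None` for values that fall between the reported ranges are assumptions, not part of Lisker and Abramson's description.

```python
from typing import Optional

# Approximate category ranges in milliseconds, as cited in the text.
# Treating both boundaries as inclusive is an assumption of this sketch.
VOT_CATEGORIES = {
    "voicing lead": (-125.0, -75.0),
    "short voicing lag": (0.0, 25.0),
    "long voicing lag": (60.0, 100.0),
}

def classify_vot(vot_ms: float) -> Optional[str]:
    """Return the category whose range contains vot_ms, or None otherwise."""
    for name, (low, high) in VOT_CATEGORIES.items():
        if low <= vot_ms <= high:
            return name
    return None  # the value lies outside or between the reported ranges

# Arbitrary example values, not measurements from any study.
for value in (-90, 15, 40, 80):
    print(value, classify_vot(value))
```

Values between the short-lag and long-lag ranges are deliberately left unclassified, since the cited ranges do not cover them.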
It has been suggested that several factors such as stress, point of articulation, age of the speaker and vowel quality influence VOT of voiceless stops (Lisker & Abramson 1964, Cho & Ladefoged 1999, Nearey & Rochet 1994). The research of Cho & Ladefoged (1999) provides evidence that the effect of place of articulation is robust across languages. According to Auzou et al. (2000: 138), "the lag time values tend to increase as the position of occlusion moves posteriorly within the oral cavity: the mean VOT values for [p] are shorter than for [t] which, in turn, are shorter than for [k]". As the cavity behind the closure is smaller for [k] than for [t] or [p], higher intra-oral air pressure can be produced. This observation might explain why VOT is longer in velar (and dorsal) than in (bi)labial stops.
As for the effect of vowel height, there have been conflicting reports for English. While Lisker and Abramson (1967) concluded that vowel quality has no noticeable effect on VOT, Klatt (1975) found VOT to be longer for all voiceless stops when followed by high vowels. Similar findings have been reported by Ohala (1981). For French, Fischer-Jørgensen (1972) showed that the VOT of voiceless stops is longer before high vowels than before low vowels.
With respect to perception of VOT, we know that fine phonetic details are retained in listeners' memories and thus can be imitated. The results of a shadowing experiment (Nielsen 2011) revealed that participants produced significantly longer VOTs after being exposed to target speech with extended VOTs. Cross-subject comparison indicates that VOT seems to be a speaker-specific property (Künzel 1987). It could therefore play a crucial role in speaker identification and is thus of major interest in the forensic context. While the first studies in this field investigated VOT in native speech, the processing of fine phonetic details such as VOT has emerged as a major issue in L2 phonology acquisition research.

Literature review
Previous research on stop consonants in the productions of L2 learners has highlighted the effects of VOT on foreign accent ratings. Flege (1988) and Riney & Takagi (1999) found that for English, native listeners rate non-native speech with longer (more native-like) VOTs as less foreign-accented. As mentioned in section 1, several studies pointed to language-specific differences in the production of voiceless stop consonants. English and Spanish stops, for example, have different VOT patterns. While English /p, t, k/ fall in the long lag range, Spanish /p, t, k/ exhibit short lag VOT values. Flege & Hammond (1982) studied 50 English native speakers (enrolled in a beginner-level Spanish language class), asking them to read a typed list of 21 English sentences with what they considered to be a Spanish accent. Acoustic measurements revealed that these speakers produced /t/ with VOT values that were considerably shorter than is typical of English. More recently, Mora et al. (2014) examined oral productions of Spanish learners of English. They showed that language learners were able to modify VOT when mimicking a Spanish accent in English. Both studies provide support for the hypothesis that speakers are aware of phonetic differences between their L1 and the target language.
The aim of the present study is to examine whether this is equally true for German and French. It is widely acknowledged that German and French both have distinct stop consonant sets that can be measured and compared using VOT: the voiced /b/, /d/, and /g/, and the voiceless /p/, /t/, and /k/. In this study, we chose to concentrate on voiceless stops. In German, initial /p,t,k/ in stressed syllables are generally articulated with strong aspiration (Jessen 1998), whereas French /p,t,k/ are described as normally unaspirated, i.e. with vocal cord vibration starting immediately after the stop release (Abdelli-Beruh 2004). In other words, French stops are in the short-lag VOT range, whereas German dialects show long-lag VOT in orthographic <p, t, k> (Braun 1996, Jessen & Ringen 2002). This means that these sound units are equivalent at the phonological level, but that they are realised differently phonetically in German and French. With respect to L2 learning, we might expect that non-distinctive phonetic differences like VOT tend to go unnoticed by learners due to perceptual assimilation of L2 sounds to L1 categories (Best & Tyler 2005).
In her work on accent imitation as voice disguise, Neuhauser (2008, 2012) found that among native German speakers, the average VOT whilst imitating a French accent was considerably lower than in undisguised target words. It is important to note that, the scope of her study being much larger, VOT was analysed as one feature among others. In fact, her research illustrates several aspects of voice disguise through accent mimicking in the forensic context. In this study, we will focus exclusively on VOT patterns with regard to L2 phonology acquisition.

Task design
The aim of the study is to assess German-speaking learners' implicit knowledge of non-distinctive phonetic differences between L1 German and L2 French speech sounds. More precisely, we will examine whether German speakers reduce VOT in word-initial voiceless stops when imitating a French accent in German.
It is worth noting that research design has become a major concern in studies dealing with implicit and explicit learning. As Andringa and Rebuschat (2015: 190) state, "one of the thorny issues in particular is the search for measures that can assess whether knowledge of a particular structure is either implicit or explicit". In his 2013 article, Rebuschat reviews three types of measures which have been widely used in psychological research to assess the conscious or unconscious status of knowledge: retrospective verbal reports, direct and indirect tests, and subjective measures. In their 1982 study, Flege & Hammond introduced a delayed mimicry task (i.e. imitation from memory) to assess learners' ability to detect non-distinctive phonetic differences. Although we are aware that other tests may be used efficiently to investigate VOT, we chose to rely on this method for our study. With regard to task design, one of our concerns was to avoid artificial, decontextualised language production. We therefore excluded reading tasks based on word lists. Previous studies (Flege & Hammond 1982, Mora et al. 2014) chose to embed target words in a sentence frame of the form "The … is on the …". However, when the same carrier phrase is used for all target words, participants can easily guess which tokens will be analysed and are therefore likely to pay more attention to the pronunciation of these lexical items. Nevertheless, target words embedded in carrier sentences clearly have a major advantage over isolated words because "the longer utterance causes the subject to time the utterance phonemes and syllables in a manner similar to conversational speech rather than the prolongation common to citation speech" (Auzou et al. 2000: 134).

For our study, we chose target words embedded in sentences that relate to speakers' interests (studies abroad, holiday, family, sports). The participants performed three read-aloud elicitation tasks (L1 German, L2 French and French-accented German), each consisting of three lists of sentences. The three lists contained the same sentences, but in differently randomised orders. Each list contained 14 sentences: 12 target words embedded in 12 sentences and two sentences used as distractors.
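The list construction described above (the same 14 sentences presented in three different random orders) could be sketched as follows. This is a minimal Python illustration under stated assumptions: the sentence labels are placeholders, the function name is hypothetical, and the fixed seed only serves to make the sketch reproducible; the text does not say how the original lists were randomised.

```python
import math
import random

def make_randomised_lists(sentences, n_lists=3, seed=0):
    """Return n_lists orderings of the same sentences, no two of them identical."""
    if n_lists > math.factorial(len(sentences)):
        raise ValueError("not enough distinct orderings for this many lists")
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility only
    lists = []
    while len(lists) < n_lists:
        order = list(sentences)
        rng.shuffle(order)
        if order not in lists:  # reject an ordering that was already drawn
            lists.append(order)
    return lists

# Placeholder labels: 12 sentences with target words plus 2 distractors.
sentences = [f"target_{i}" for i in range(1, 13)] + ["distractor_1", "distractor_2"]
lists = make_randomised_lists(sentences)
for number, order in enumerate(lists, start=1):
    print(number, order[:3], "...")
```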
As mentioned earlier, VOT varies with point of articulation and vowel height. For this reason, we used two-syllable target words with /p,t,k/ in word-initial position before /i/ or /a/. It should be noted that German and French have different stress patterns. While German has word stress, French is characterised by phrasal stress, which is generally assigned to the final full syllable of the last lexical item of the stress group (Di Cristo 1998: 196). Since VOT is longer in stressed syllables, we chose to exclude L2 French target words from our analysis and to compare only what can be compared, i.e. VOT values in L1 German tokens and in French-accented German target words. One might ask why participants were asked to read aloud the L2 French sentence lists. Although the data obtained through this task do not contribute to our analysis, we thought that it would be easier for the participants to switch to the accent imitation mode after having read aloud sentences in both languages.

Participants
A total of seven German-speaking learners of French as a foreign language (mean age: 22 years) took part in the study. All participants were native speakers of German, enrolled in a BA program in French at the time of the study. Their proficiency level in French can be described as upper-intermediate (B2). It is important to note that the participants come from different parts of Germany and Austria. The subjects indicated no history of speech disorders or hearing impairment.

Procedure
The seven participants read three lists of sentences containing 12 /p,t,k/ words in German, French and "French-accented" German. For every target word produced in German and in French-accented German, a mean VOT duration was calculated by averaging the VOT durations of the three repetitions of each target word. An overall mean VOT duration score was calculated by averaging mean VOT durations for all voiceless stops in the target words.
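The two averaging steps described above (a mean per target word over its three repetitions, then an overall mean over the word means) can be illustrated with a short Python sketch. The word labels and VOT values below are invented for illustration and are not data from the study; the function names are likewise hypothetical.

```python
from statistics import mean

# Invented VOT measurements in ms: three repetitions per target word.
vot_by_word = {
    "word_p_a": [58.0, 55.0, 52.0],
    "word_t_i": [66.0, 63.0, 60.0],
    "word_k_a": [70.0, 68.0, 66.0],
}

def word_means(measurements):
    """Step 1: average the repetitions of each target word."""
    return {word: mean(values) for word, values in measurements.items()}

def overall_mean(measurements):
    """Step 2: average the per-word means into one overall score."""
    return mean(list(word_means(measurements).values()))

print(word_means(vot_by_word))
print(overall_mean(vot_by_word))
```

Averaging word means rather than pooling all tokens gives each target word equal weight, which matters only if some repetitions are missing.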
All subjects were asked to read aloud the German sentence lists, then the French sentence lists and finally the German sentence lists for a second time, but this time "like someone whose native language is French". No demonstration of a French accent in German and no explicit instructions concerning how one might produce the effect of accentedness were given. All participants were tested individually. Learners' oral productions were recorded as WAV files using an AKG C1000 S microphone and a Marantz PMD 660 digital recorder. The acoustic analysis was carried out with PRAAT (Boersma & Weenink 2016).
For each target word, VOT was measured as suggested by Cho and Ladefoged (1999), i.e. starting at the onset of the release burst of the stop closure and ending at the onset of the first complete vocal fold vibration. In order to guarantee greater accuracy, all measurements were made by simultaneous examination of oscillograms and wideband spectrograms. After the read-aloud elicitation tasks, we asked participants what they did to produce a French accent in German. The purpose of this question was to determine their (explicit) knowledge of phonetic differences between German and French.
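Once the two landmarks named above have been located by hand, the VOT of a token is simply the difference between them. The Python sketch below assumes that the landmark times have been exported in seconds; the class name, field names and example times are hypothetical and do not reproduce any PRAAT interface or any measurement from the study.

```python
from dataclasses import dataclass

@dataclass
class StopToken:
    word: str
    burst_onset_s: float    # onset of the release burst of the stop closure
    voicing_onset_s: float  # onset of the first complete vocal fold vibration

    def vot_ms(self) -> float:
        """VOT in milliseconds; negative if voicing starts before the release."""
        return (self.voicing_onset_s - self.burst_onset_s) * 1000.0

# Invented landmark times for one token.
token = StopToken(word="example", burst_onset_s=1.200, voicing_onset_s=1.262)
print(round(token.vot_ms(), 1))
```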

Results
The data suggest that the native German subjects of this study reduce VOT in word-initial voiceless stops during imitation of a French accent. The voiceless stops /p/, /t/, and /k/ exhibit large VOT differences between L1 German and French-accented target words, with VOT values being shorter in French-accented target words. Figure 1 displays the observed mean VOT of initial voiceless stops for each speaker (Sp). Overall, the mean VOT is 56 milliseconds (sd = 9) for L1 German tokens and 38 milliseconds (sd = 9) for French-accented target words. These results are consistent with earlier findings by Neuhauser (2008), who found a mean VOT of 54 milliseconds for L1 German and a mean VOT of 37 milliseconds for the French-accented tokens. Closer inspection of our data shows high interspeaker variability, both for L1 German and for French-accented target words. As mentioned earlier, the subjects come from different regions of Germany and Austria and speak different dialects, which may have an impact on VOT values. The mean VOT reduction among the 7 learners is about 18 milliseconds. In order to determine whether the data sets differ between the two conditions, we ran a Wilcoxon signed-rank test for matched pairs. The test revealed a statistically significant difference in VOT duration between L1 German and French-accented target words (p < 0.01).
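For readers unfamiliar with the Wilcoxon signed-rank test for matched pairs, the following Python sketch shows how its statistic and an exact p-value can be obtained for a small sample by enumerating all sign assignments. It rests on several assumptions that the text does not settle: the pairing unit (speaker means or individual tokens), a two-sided alternative, and the dropping of zero differences. The paired values are invented and are not the measurements of this study.

```python
from itertools import product

def wilcoxon_signed_rank(x, y):
    """Return (T, p): T = min(W+, W-) and an exact two-sided p-value."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t_obs = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    count = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= t_obs:
            count += 1
    return t_obs, count / 2 ** n

# Invented paired mean VOT values in ms (one pair per hypothetical speaker).
l1_german = [60, 52, 58, 49, 63, 55, 57]
accented = [41, 40, 35, 44, 39, 58, 36]
statistic, p_value = wilcoxon_signed_rank(l1_german, accented)
print(statistic, p_value)
```

The enumeration visits 2^n sign patterns, so it is only practical for small samples; larger data sets would normally be handled with a statistics package.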
Figure 2 displays the mean VOT values according to the following vowel. Unsurprisingly, VOT is higher for stops followed by /i/ than for stops before /a/. Interestingly, mean VOT reduction when switching from L1 German to French-accented German is virtually the same for /i/ (17.9 milliseconds) and for /a/ (18.6 milliseconds). This means that speakers have developed rather consistent VOT patterns for each type of vowel. Moreover, we examined VOT with respect to the point of articulation. As expected, VOT varies according to the stop consonant. Strikingly, mean VOT reduction is higher for /p/ and /k/ than for /t/, as can be seen from figure 3. Put another way, mean VOT values are the same for /t/ and /k/ in the French-accented target words. The data for /t/ lead us to hypothesise that there may be a frequency effect that makes /t/ more resistant to changes in the VOT pattern. Another possible explanation could be the impact of vowel length. That said, further analysis is needed to reassess this result, which should be interpreted with caution.

In order to gain a better understanding of the significance of these mean values, we decided to have a closer look at VOT in the French-accented target words. More precisely, we compared learners' performances in the 3 repetitions of the accent imitation task. Neither the studies on English and Spanish (Flege & Hammond 1982, Mora et al. 2014) nor Neuhauser's work (2008, 2012) on German and French examined this aspect. Inspection of table 1 reveals that the highest VOT values can be found during the first reading of the sentence list, while the lowest values appear in the third repetition. In other words, with every repetition, participants' (long lag) VOT patterns get closer to (short lag) L1 French patterns.

When asked about the changes they made to their pronunciation in order to produce a French accent in German, participants declared that they modified the "R-sound", "stress", "intonation" and "non pronunciation of <h>". These answers indicate that the VOT difference between German and French is not part of the declarative knowledge of the subjects.
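The comparison across the three readings amounts to computing one mean per repetition and checking whether the means fall from the first to the third reading. The Python sketch below illustrates this with invented values; it does not reproduce table 1, and the function names are hypothetical.

```python
from statistics import mean

# Invented VOT values in ms for French-accented tokens, keyed by reading (1 to 3).
accented_vot = {
    1: [45.0, 41.0, 43.0],
    2: [40.0, 37.0, 40.0],
    3: [35.0, 34.0, 36.0],
}

def repetition_means(data):
    """Mean VOT for each reading, ordered from the first to the last reading."""
    return [mean(data[rep]) for rep in sorted(data)]

def strictly_decreasing(values):
    """True if each value is lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

means = repetition_means(accented_vot)
print(means, strictly_decreasing(means))
```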

Discussion
As we have seen, the subjects reduced VOT in word-initial voiceless stops during imitation of a French accent. With respect to the overall results, one might argue that VOT reduction in the French-accented target words is due to a modified stress pattern. As mentioned in section 4, several subjects claimed that they made changes in stress in order to produce the effect of accentedness. Nevertheless, closer inspection of the target words does not show major changes in stress pattern. We therefore argue that the lower VOT values cannot be attributed to a change from word to phrasal stress. Moreover, the data suggest that vowel height does not have an impact on VOT reduction.
We also found that mean VOT reduction was higher for /p/ and /k/ than for /t/. This result may reflect a frequency effect related to the point of articulation. However, more data are needed to examine why speakers modify VOT more easily for /p/ and /k/ than for /t/.
The data for the French-accented target words suggest that the lowest VOT values can be found in the third reading. This result may indicate that a "training effect" is achieved through repetition of the task. One participant reported that she felt "like an actress" during the first reading, thinking about what kind of phonetic changes should be made to produce the effect of accentedness. Indeed, accent imitation is a difficult task, especially when no demonstration or explicit instructions are provided. Subjects acknowledged that the first reading seemed the most difficult to them, as they reflected on differences between the two phonetic systems while performing the task.
On this basis, it could be interesting to test the efficiency of accent mimicking tasks in second language phonology teaching. As we have seen, with every repetition, VOT values converge to the short lag VOT pattern which characterises L1 French. Furthermore, the fact that the participants did not explicitly mention stop consonant production as a relevant feature for accentedness supports the view that knowledge of this non-distinctive phonetic difference may be acquired through implicit learning which "takes place autonomously, beyond conscious control, whenever they engage in listening, reading, speaking or writing activities" (Hulstijn 2002: 208). Yet, on the evidence presented, we cannot be certain at what stage of the learning process this information can be used by learners in speech production and how much exposure to the target language is needed in order to acquire this kind of knowledge.

Conclusion
Our analysis revealed that the L1 German speakers of this study were able to modify VOT in word-initial voiceless stops during imitation of a French accent. Having said that, considerably more work is needed to clarify the role of VOT in the different phases of the learning process. Additional research should be conducted, comparing beginners and advanced students, in order to determine at which stage of second language acquisition learners develop implicit phonological knowledge of VOT variation. Indeed, a more systematic approach is required to assess the evolution of VOT values in learners' productions over time. Moreover, a more detailed analysis might shed light on the impact of vowel length on VOT.
With respect to second language teaching, we believe that accent-mimicking tasks could be efficiently used to raise learners' awareness of distinctive and non-distinctive phonetic differences between L1 and L2 sounds. Nonetheless, our study also pointed to the need for additional research that taps into individual differences concerning use of implicit knowledge. According to Andringa and Rebuschat (2015: 192), "there appears to be an increasing recognition of the idea that individuals may differ in their ability to learn either explicitly or implicitly". Consequently, we can suppose that learning strategies might directly impact on the development of statistical knowledge. It would also be interesting to explore the amount of exposure to the target language that is needed to process non-distinctive phonetic differences. Clearly, more empirical research is necessary before the role of implicit learning in L2 acquisition can be fully determined. Hopefully, future studies will provide a more fine-grained picture of the link between linguistic knowledge and language use.

Fig. 2: Mean VOT (in msec) in L1 German and French-accented German target words with respect to the following vowel

Fig. 3: Mean VOT (in msec) according to the point of articulation