Issue | SHS Web of Conferences, Volume 8, 2014: 4e Congrès Mondial de Linguistique Française | |
---|---|---|
Page(s) | 2675 - 2689 | |
Section | Ressources et Outils pour l'analyse linguistique | |
DOI | https://doi.org/10.1051/shsconf/20140801305 | |
Published online | 24 July 2014 |
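Rhapsodie: a treebank annotated for the study of the syntax-prosody interface in spoken French (original title: Rhapsodie : un Treebank annoté pour l’étude de l’interface syntaxe-prosodie en français parlé)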
1 MoDyCo (Modèles, Dynamique, Corpus) - UMR 7114 - Université Paris Ouest Nanterre, 200 avenue de la République, 92001 Nanterre Cedex, France
2 Université Saint-Louis, Bruxelles, Préfecture Bruxelles, Belgique
3 LPP, UMR7018, Université Paris 3, 75006 Paris, France, Metropolitan
4 Université de Genève, 2, rue De-Candolle, CH-1211 Genève, Suisse
5 IRCAM, Paris, 75001 Paris, France
6 Université François Rabelais, 3, Rue des Tanneurs, 37041 Tours, France, Metropolitan
7 MoDyCo-UMR7114, 200 avenue de la République, 92000 Nanterre, France, Metropolitan
Contact : sylvain@kahane.fr
We describe the Rhapsodie resource, a syntactic and prosodic treebank of spoken French. It comprises 57 short samples of spoken French (5 minutes long on average, amounting to 3 hours of speech and 33,000 words) together with an orthographic transcription. The transcription and the annotations are all aligned with the speech signal at the level of phonemes, syllables, words, speakers, and overlaps. The main objective of the Rhapsodie project is to define rich, explicit, and reproducible schemes for the annotation of prosody and syntax in different genres (± spontaneous, ± planned, face-to-face interviews vs. broadcast, etc.), in order to study the prosody/syntax/discourse interface in spoken French and their roles in the segmentation of speech into discourse units. The resource is freely available at www.projet-rhapsodie.fr. The sound samples (wav/mp3), the acoustic analysis (original F0 curve manually corrected and automatic stylized F0, pitch format), the orthographic transcriptions (txt), the macrosyntactic annotations (txt), the prosodic annotations (xml, textgrid), and the metadata (xml and html) can be downloaded under the terms of the Creative Commons licence Attribution - Noncommercial - Share Alike 3.0 France. The metadata are encoded in the IMDI-CMFI format and can be parsed online.
© The authors, published by EDP Sciences, 2014
Open-access article distributed under the Creative Commons Attribution 4.0 licence