Developing Resources for Automated Speech Processing of Quebec French

Details

Resource 1 Download: COTE2020&Lancien_Bigi.pdf (1798.27 KB)
Status: Public
Version: Final published version
Licence: CC BY-NC 4.0
Serval ID
serval:BIB_07821F4536D4
Type
Conference proceedings (part): original contribution to the scientific literature, published on the occasion of scientific conferences, in a proceedings volume, or in a special issue of a recognized journal (conference proceedings).
Collection
Publications
Institution
Title
Developing Resources for Automated Speech Processing of Quebec French
Conference title
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020)
Authors
Lancien Mélanie, Côté Marie-Hélène, Bigi Brigitte
Publication status
Published
Publication date
2020
Pages
5323-5328
Language
English
Abstract
The analysis of the structure of speech nearly always rests on the alignment of the speech recording with a phonetic transcription. Nowadays, several tools can perform this speech segmentation automatically. However, none of them carries out the automatic segmentation of Quebec French (QF hereafter) in a proper way. Contrary to what could be assumed, the acoustics and phonotactics of QF differ widely from those of France French (FF hereafter). To adequately segment QF, features like diphthongization of long vowels and affrication of coronal stops have to be taken into account. Thus, acoustic models for automatic segmentation must be trained on speech samples exhibiting those phenomena. Dictionaries and lexicons must also be adapted to integrate differences in lexical units (such as very frequent words in QF that are not used in FF) and in the phonology of QF (such as the existence of tense and lax high vowels in QF but not in FF). This paper presents the development of linguistic resources to be included in the SPPAS software tool in order to perform Text normalization, Phonetization, Alignment and Syllabification. We adapted the existing French lexicon and developed a QF-specific pronunciation dictionary. We then created an acoustic model from the existing ones and adapted it with 5 minutes of manually time-aligned data. These new resources are all freely distributed with SPPAS version 2.7; they perform the full process of speech segmentation in Quebec French.
Record created
31/10/2020 12:52
Record last modified
27/10/2023 7:09