Developing Resources for Automated Speech Processing of Quebec French

Details

Resource 1 Download: COTE2020&Lancien_Bigi.pdf (1798.27 KB)
State: Public
Version: Final published version
License: CC BY-NC 4.0
Serval ID
serval:BIB_07821F4536D4
Type
Inproceedings: an article in a conference proceedings.
Collection
Publications
Title
Developing Resources for Automated Speech Processing of Quebec French
Title of the conference
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020)
Author(s)
Lancien Mélanie, Côté Marie-Hélène, Bigi Brigitte
Publication state
Published
Issued date
2020
Pages
5323-5328
Language
English
Abstract
The analysis of the structure of speech nearly always rests on the alignment of the speech recording with a phonetic transcription. Nowadays several tools can perform this speech segmentation automatically. However, none of them segments Quebec French (QF hereafter) properly. Contrary to what could be assumed, the acoustics and phonotactics of QF differ widely from those of France French (FF hereafter). To adequately segment QF, features like diphthongization of long vowels and affrication of coronal stops have to be taken into account. Thus acoustic models for automatic segmentation must be trained on speech samples exhibiting those phenomena. Dictionaries and lexicons must also be adapted to integrate differences in lexical units (such as very frequent words in QF that are not used in FF) and in the phonology of QF (such as the existence of tense and lax high vowels in QF but not in FF). This paper presents the development of linguistic resources to be included in the SPPAS software tool in order to perform Text normalization, Phonetization, Alignment and Syllabification. We adapted the existing French lexicon and developed a QF-specific pronunciation dictionary. We then created an acoustic model from the existing ones and adapted it with 5 minutes of manually time-aligned data. These new resources are all freely distributed with SPPAS version 2.7; they perform the full process of speech segmentation in Quebec French.
Create date
31/10/2020 12:52
Last modification date
27/10/2023 7:09