Tahon_LREC-Coling 2024

ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection

Marie Tahon, Anthony Larcher, Martin Lebourdais, Fethi Bougares, Ana Silnova, Pablo Gimeno

LIUM - Laboratoire d'Informatique de l'Université du Mans, LST - Equipe Language and Speech Technology

Elyadata

Brno University of Technology, Speech@FIT, Brno, Czechia

VivoLab, Universidad de Zaragoza, Spain

Contact: {marie.tahon, anthony.larcher}@univ-lemans.fr

This paper presents ALLIES, a meta-corpus which gathers and extends existing French corpora collected from radio and TV shows. The corpus contains 1048 audio files totalling about 500 hours of speech. Aggregating data is always difficult, as the guidelines used to collect, annotate and transcribe speech generally differ from one corpus to another. ALLIES intends to homogenize and correct speaker labels among the different files by integrating human feedback within a speaker verification system.

The main contribution of this article is the design of a protocol to properly evaluate speech segmentation (including music and overlap detection), speaker diarization, speech transcription and speaker change detection. As part of it, a test partition has been carefully and manually segmented, annotated for speech, music, noise and speaker labels (with specific guidelines for overlapping speech), and orthographically transcribed. As a second contribution, this article also provides baseline results for several speech processing tasks.

Published on June 4, 2024
