ESPERANTO

Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop

Yevhenii Prokopalo, Meysam Shamsi, Loïc Barrault, Sylvain Meignier, Anthony Larcher

  LIUM - Laboratoire d'Informatique de l'Université du Mans
  Institut Informatique Claude Chappe

Contact: (meysam.shamsi, anthony.larcher, sylvain.meignier)@univ-lemans.fr

DOI: https://doi.org/10.3390/app12041782

State-of-the-art diarization systems now achieve decent performance, but it is often not good enough to deploy them without human supervision. Additionally, most approaches focus on single audio files, whereas many use cases involve multiple recordings with recurring speakers and require incremental processing of a collection. In this paper, we propose a framework that solicits a human in the loop to correct the clustering by answering simple questions.

After defining the nature of the questions for both a single file and a collection of files, we propose two algorithms to list those questions, together with the stopping criteria needed to limit the workload of the human in the loop. Experiments performed on the ALLIES dataset show that limited interaction with a human expert can yield a considerable relative reduction in diarization error rate (DER): up to 36.5% for single files and 33.29% for a collection.
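
The gains above are relative, not absolute, reductions in DER. As a minimal sketch of that arithmetic, assuming the usual definition of relative reduction (baseline minus corrected, divided by baseline), the calculation could look like the following. The function name and the DER values are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def relative_der_reduction(baseline_der: float, corrected_der: float) -> float:
    """Return the relative DER reduction, in percent, of a corrected
    system over its baseline (assumed definition, see lead-in)."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - corrected_der) / baseline_der


# Hypothetical DER values (in percent) used only to illustrate the
# arithmetic; they are NOT results reported in the paper.
example = relative_der_reduction(baseline_der=20.0, corrected_der=12.7)
print(f"{example:.1f}% relative reduction")
```

With these illustrative inputs, an absolute drop of 7.3 DER points corresponds to a much larger relative figure, which is why the two should not be confused when reading the results.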
