ESPERANTO

Publications

    • The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

      Viñals, I.; Ortega, A.; Miguel, A.; Lleida, E. "The Domain Mismatch Problem in the Broadcast Speaker Attribution Task". Applied Sciences, vol. 11, no. 18, p. 8521, Sept. 2021.

    • Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data". IEEE Signal Processing Letters, vol. 28, pp. 1135-1139, 2021.

    • Automatic Voice Disorder Detection Using Self-Supervised Representations

      D. Ribas, M. A. Pastor, A. Miguel, D. Martínez, A. Ortega and E. Lleida, "Automatic Voice Disorder Detection Using Self-Supervised Representations," in IEEE Access, vol. 11, pp. 14915-14927, 2023, doi: 10.1109/ACCESS.2023.3243986.

    • Class token and knowledge distillation for multi-head self-attention speaker verification systems

      Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digital Signal Processing, 2023, vol. 133, p. 103859.

    • Multimodal Diarization Systems by Training Enrollment Models as Identity Representations

      Mingote, V.; Viñals, I.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E.
      Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. Applied Sciences. 2022; 12(3):1141
      DOI:10.3390/app12031141, Corpus ID: 246348513

    • aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems

      Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 772-784, 2022, doi: 10.1109/TASLP.2022.3145307

    • Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains

      Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains. Applied Sciences 2022, 12, 1832. https://doi.org/10.3390/app12041832

    • Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement

      Ribas, D.; Miguel, A.; Ortega, A.; Lleida, E. Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement. Applied Sciences 2022, 12, 9000.
      DOI:10.3390/app12189000, Corpus ID: 252174387

    • Shouted and whispered speech compensation for speaker verification systems

      Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted and whispered speech compensation for speaker verification systems. Digital Signal Processing, 2022, vol. 127, p. 103536.
      https://doi.org/10.1016/j.dsp.2022.103536

    • An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

      Lleida, E.; Rodriguez-Fuentes, L.J.; Tejedor; Ortega, A.; Miguel, A.; Bazán, V.; Pérez, C.; de Prada, A.; Penagarikano, M.; Varona, A.; et al. An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies. Appl. Sci. 2023, 13, 8577. https://doi.org/10.3390/app13158577

    • Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations

      Pastor, M.A.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations. Appl. Sci. 2023, 13, 9062.
      https://doi.org/10.3390/app13169062

    • Direct Text to Speech Translation System Using Acoustic Units

      V. Mingote, P. Gimeno, L. Vicente, S. Khurana, A. Laurent and J. Duret, "Direct Text to Speech Translation System Using Acoustic Units," in IEEE Signal Processing Letters, vol. 30, pp. 1262-1266, 2023, doi: 10.1109/LSP.2023.3313513.

    • Towards Lifelong Human Assisted Speaker Diarization

      Meysam Shamsi, Anthony Larcher, Loïc Barrault, Sylvain Meignier, Yevhenii Prokopalo, Marie Tahon, Ambuj Mehrish, Simon Petitrenaud, Olivier Galibert, Samuel Gaist, André Anjos, Sébastien Marcel, Marta Costa-Jussà. Towards Lifelong Human Assisted Speaker Diarization. Computer Speech and Language, 2023, 77, p. 101437. ⟨10.1016/j.csl.2022.101437⟩.

    • Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop

      Yevhenii Prokopalo, Meysam Shamsi, Loïc Barrault, Sylvain Meignier, Anthony Larcher. Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop. Applied Sciences, 2022, ⟨10.3390/app12041782⟩. ⟨hal-03563148⟩
