ESPERANTO

Dissemination

    • Speaker embeddings by modeling channel-wise correlations

      Stafylakis T., Rohdin J., Burget L., Speaker embeddings by modeling channel-wise correlations, Interspeech 2021

    • Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems

      Mingote, V., Miguel, A., Ortega, A., Lleida, E. "Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

    • Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

    • The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

      Viñals, I.; Ortega, A.; Miguel, A.; Lleida, E. "The Domain Mismatch Problem in the Broadcast Speaker Attribution Task". Applied Sciences, vol. 11, no. 18, p. 8521, Sept. 2021.

    • Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data". IEEE Signal Processing Letters, 28, pp. 1135-1139, 2021.

    • The LIUM Human Active Correction Platform for Speaker Diarization

      Flucha, A., Larcher, A., Mehrish, A., Meignier, S., Plaut, F., Poupon, N., Prokopalo, Y., Puertolas, A., Shamsi, M., Tahon, M. (2021) The LIUM Human Active Correction Platform for Speaker Diarization. Proc. Interspeech 2021, 965-966

    • MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

      Mošner L., Plchot O., Burget L., Černocký J. (2022) MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification. ICASSP 2022.

    • Multi-channel Speaker Verification with Conv-TasNet Based Beamformer

      Mošner L., Plchot O., Burget L., Černocký J. (2022) Multi-channel Speaker Verification with Conv-TasNet Based Beamformer. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

    • Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch

      Silnova A., Stafylakis T., Mosner L., Plchot O., Rohdin J., Matejka P., Burget L., Glembek O., Brummer N. (2022) Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. Odyssey: The Speaker and Language Recognition Workshop 2022

    • Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation

      Alam J., Beneš R., Beszédeš M., Burget L., Dahmane M., Fathan A., Ghodrati H., Glembek O., Kang W.H., Matějka P., Mošner L., Plchot O., Rohdin J., Silnova A., Stafylakis T. (2022) Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation. Odyssey: The Speaker and Language Recognition Workshop 2022

    • Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

      Brummer N., Swart A., Mosner L., Silnova A., Plchot O., Stafylakis T., Burget L. (2022) Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. Interspeech 2022

    • Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries

      Stafylakis T., Mosner L., Plchot O., Rohdin J., Silnova A., Burget L., Cernocky J. (2022) Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. Interspeech 2022

    • Microphone Array Channel Combination Algorithms for Overlapped Speech Detection

      Mariotte T., Larcher A., Montresor S., Thomas J.-H. (2022) Microphone Array Channel Combination Algorithms for Overlapped Speech Detection. Interspeech 2022

    • Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System

      Vidal J., Bonomi C., Sancinetti M., Ferrer L. (2022) Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System. Interspeech 2022

    • ViVoLAB System Description for the S2TC IberSPEECH-RTVE 2022 challenge

      Miguel, A., Ortega, A., & Lleida, E. (2022). ViVoLAB System Description for the S2TC IberSPEECH-RTVE 2022 challenge. Proc. IberSPEECH 2022, 284.

    • End-to-End Speech Translation of Arabic to English Broadcast News

      Bougares F. and Jouili S., End-to-End Speech Translation of Arabic to English Broadcast News, The Seventh Arabic Natural Language Processing Workshop (WANLP 2022)

    • Automatic Voice Disorder Detection Using Self-Supervised Representations

      D. Ribas, M. A. Pastor, A. Miguel, D. Martínez, A. Ortega and E. Lleida, "Automatic Voice Disorder Detection Using Self-Supervised Representations," in IEEE Access, vol. 11, pp. 14915-14927, 2023, doi: 10.1109/ACCESS.2023.3243986.

    • Class token and knowledge distillation for multi-head self-attention speaker verification systems

      Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digital Signal Processing, 2023, vol. 133, p. 103859.

    • Multimodal Diarization Systems by Training Enrollment Models as Identity Representations

      Mingote, V.; Viñals, I.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E.
      Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. Applied Sciences. 2022; 12(3):1141
      DOI:10.3390/app12031141, Corpus ID: 246348513

    • aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems

      Mingote, V.; Miguel, A.; Ribas D.; Ortega, A.; Lleida, E. aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 772-784, 2022, doi: 10.1109/TASLP.2022.3145307

    • Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains

      Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains. Applied Sciences 2022, 12, 1832. https://doi.org/10.3390/app12041832

    • Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement

      Ribas, D.; Miguel, A.; Ortega, A.; Lleida, E. Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement. Applied Sciences 2022, 12, 9000.
      DOI:10.3390/app12189000, Corpus ID: 252174387

    • Shouted and whispered speech compensation for speaker verification systems

      Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted and whispered speech compensation for speaker verification systems. Digital Signal Processing, 2022, vol. 127, p. 103536.
      https://doi.org/10.1016/j.dsp.2022.103536

    • On the Problem of Data Availability in Automatic Voice Disorder Detection

      Ribas, Dayana; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo
      On the Problem of Data Availability in Automatic Voice Disorder Detection. Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) – Volume 5: HEALTHINF, 2023, ISBN: 978-989-758-631-6.

    • S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit

      Ribas, D., Yoldi, M.A.P., Miguel, A., Martínez, D., Ortega, A., Lleida, E. (2022) S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit. Proc. IberSPEECH 2022, 136-140,
      doi: 10.21437/IberSPEECH.2022-28

    • A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation

      Gimeno, P., Ortega, A., Miguel, A., Lleida, E. (2022) A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation. Proc. IberSPEECH 2022, 56-60, doi: 10.21437/IberSPEECH.2022-12

    • A Transfer Learning Approach for Pronunciation Scoring

      M. Sancinetti, J. Vidal, C. Bonomi and L. Ferrer, "A Transfer Learning Approach for Pronunciation Scoring," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 6812-6816, doi: 10.1109/ICASSP43922.2022.9747727.

    • Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing

      Kakouros S., T. Stafylakis, L. Mošner and L. Burget, "Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094673.

    • Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models

      Kesiraju, S., Sarvaš, M., Pavlíček, T., Macaire, C., Ciuba, A. (2023) Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models. Proc. INTERSPEECH 2023, 2148-2152, doi: 10.21437/Interspeech.2023-2506

    • Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

      Landini F., M. Diez, A. Lozano-Diez and L. Burget, "Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. _, doi: 10.1109/ICASSP49357.2023.10097049.

    • Semantic Enrichment Towards Efficient Speech Representations

      Laperrière, G., Nguyen, H., Ghannay, S., Jabaian, B., Estève, Y. (2023) Semantic Enrichment Towards Efficient Speech Representations. Proc. INTERSPEECH 2023, 705-709, doi: 10.21437/Interspeech.2023-2234

    • An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

      Lleida, E.; Rodriguez-Fuentes, L.J.; Tejedor; Ortega, A.; Miguel, A.; Bazán, V.; Pérez, C.; de Prada, A.; Penagarikano, M.; Varona, A.; et al. An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies. Appl. Sci. 2023, 13, 8577. https://doi.org/10.3390/app13158577

    • An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification

      Peng J., O. Plchot, T. Stafylakis, L. Mošner, L. Burget and J. Černocký, "An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 555-562, doi: 10.1109/SLT54892.2023.10022775.

    • Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations

      Pastor, M.A.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations. Appl. Sci. 2023, 13, 9062.
      https://doi.org/10.3390/app13169062

    • Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters

      Peng J., Themos Stafylakis, Rongzhi Gu, Oldřich Plchot, Ladislav Mošner, Lukáš Burget, Jan Černocký, "Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094795.

    • Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations

      Stafylakis T., L. Mošner, S. Kakouros, O. Plchot, L. Burget and J. Černocký, "Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 1136-1143, doi: 10.1109/SLT54892.2023.10023345.

    • Direct Text to Speech Translation System Using Acoustic Units

      Mingote V., P. Gimeno, L. Vicente, S. Khurana, A. Laurent and J. Duret, "Direct Text to Speech Translation System Using Acoustic Units," in IEEE Signal Processing Letters, vol. 30, pp. 1262-1266, 2023, doi: 10.1109/LSP.2023.3313513.

    • ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks

      Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, and Yannick Estève. 2023. ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 219–226, Toronto, Canada (in-person and online). Association for Computational Linguistics.

    • Toroidal Probabilistic Spherical Discriminant Analysis

      A. Silnova, N. Brümmer, A. Swart and L. Burget, "Toroidal Probabilistic Spherical Discriminant Analysis," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10095580.

    • Description and Analysis of ABC Submission to NIST LRE 2022

      Matejka, P., Silnova, A., Slavíček, J., Mosner, L., Plchot, O., Klčo, M., Peng, J., Stafylakis, T., Burget, L. (2023) Description and Analysis of ABC Submission to NIST LRE 2022. Proc. INTERSPEECH 2023, 511-515, doi: 10.21437/Interspeech.2023-1529

    • Improving Speaker Verification with Self-Pretrained Transformer Models

      Peng, J., Plchot, O., Stafylakis, T., Mosner, L., Burget, L., Černocký, J. (2023) Improving Speaker Verification with Self-Pretrained Transformer Models. Proc. INTERSPEECH 2023, 5361-5365, doi: 10.21437/Interspeech.2023-453

    • Multi-Channel Speech Separation with Cross-Attention and Beamforming

      Mosner, L., Plchot, O., Peng, J., Burget, L., Černocký, J. (2023) Multi-Channel Speech Separation with Cross-Attention and Beamforming. Proc. INTERSPEECH 2023, 1693-1697, doi: 10.21437/Interspeech.2023-2537

    • BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task

      Kesiraju, S., Beneš, K., Tikhonov, M., & Černocký, J.H. (2023). BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task. International Workshop on Spoken Language Translation.

    • Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus

      Rahim M. Z., S. S. Juan and F. S. Mohamad, "Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus," 2023 International Conference on Asian Language Processing (IALP), Singapore, Singapore, 2023, pp. 228-233, doi: 10.1109/IALP61005.2023.10337314.

    • Improved Vocal Effort Transfer Vector Estimation For Vocal Effort-Robust Speaker Verification

      I. López-Espejo, S. Prieto, A. Ortega and E. Lleida, "Improved Vocal Effort Transfer Vector Estimation For Vocal Effort-Robust Speaker Verification," 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), Rome, Italy, 2023, pp. 1-6, doi: 10.1109/MLSP55844.2023.10285923.

    • Towards Lifelong Human Assisted Speaker Diarization

      Meysam Shamsi, Anthony Larcher, Loïc Barrault, Sylvain Meignier, Yevhenii Prokopalo, Marie Tahon, Ambuj Mehrish, Simon Petitrenaud, Olivier Galibert, Samuel Gaist, André Anjos, Sébastien Marcel, Marta Costa-Jussà. Towards Lifelong Human Assisted Speaker Diarization. Computer Speech and Language, 2023, 77, pp.101437. ⟨10.1016/j.csl.2022.101437⟩.

    • Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop

      Yevhenii Prokopalo, Meysam Shamsi, Loïc Barrault, Sylvain Meignier, Anthony Larcher. Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop. Applied Sciences, 2022, ⟨10.3390/app12041782⟩. ⟨hal-03563148⟩

    • An explainable proxy model for multilabel audio segmentation

      Mariotte Théo, Antonio Almudévar, Marie Tahon, Alfonso Ortega. An explainable proxy model for multilabel audio segmentation. International Conference on Acoustics Speech and Signal Processing, IEEE, Apr 2024, Seoul (Korea).

    • Predefined Prototypes for Intra-Class Separation and Disentanglement

      Almudévar Antonio, Théo Mariotte, Alfonso Ortega, Marie Tahon, Luis Vicente, Antonio Miguel, Eduardo Lleida. Predefined Prototypes for Intra-Class Separation and Disentanglement. Interspeech 2024.

    • A Phonetic Analysis of Speaker Verification Systems through Phoneme selection and Integrated Gradients

      Thebaud Thomas, Gabriel Hernandez Sierra, Sarah Flora Samson Juan, Marie Tahon. A Phonetic Analysis of Speaker Verification Systems through Phoneme selection and Integrated Gradients. Speaker and Language Recognition Workshop - Odyssey, Jun 2024, Quebec, Canada.

    • 3MAS: a multitask, multilabel, multidataset semi-supervised audio segmentation model

      Lebourdais Martin, Pablo Gimeno, Théo Mariotte, Marie Tahon, Alfonso Ortega, Anthony Larcher. 3MAS: a multitask, multilabel, multidataset semi-supervised audio segmentation model. Speaker and Language Recognition Workshop Odyssey, Jun 2024, Quebec, Canada.

    • Automatic Speech Interruption Detection: Analysis, Corpus, and System

      Lebourdais Martin, Marie Tahon, Antoine Laurent, Sylvain Meignier. Automatic Speech Interruption Detection: Analysis, Corpus, and System. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy.

    • ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection

      Tahon Marie, Anthony Larcher, Martin Lebourdais, Fethi Bougares, Ana Silnova, Pablo Gimeno. ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy. <hal-04578441>

    • Interprétabilité pour l’identification de locuteurs. Retour sur le projet JSALT 2023

      Tahon Marie, Imen Ben Amor, Nicolas Dugué, Jean-François Bonastre. Interprétabilité pour l’identification de locuteurs. Retour sur le projet JSALT 2023. Actes de la Journée Extraction de connaissances interprétables pour l’étude de la communication parlée. Journée commune AFIA-TLH-AFCP, 2023. <hal-04489273>

    • The CONILIUM proposition for Odyssey Emotion Challenge: Leveraging major class with complex annotations

      Shamsi Meysam, Lara Gauder, Marie Tahon. The CONILIUM proposition for Odyssey Emotion Challenge: Leveraging major class with complex annotations. The Speaker and Language Recognition Workshop (Odyssey), Jun 2024, Quebec, Canada. hal-04600047
