ESPERANTO

Conferences

    • Speaker embeddings by modeling channel-wise correlations

      Stafylakis T., Rohdin J., Burget L., Speaker embeddings by modeling channel-wise correlations, Interspeech 2021

      Read more

    • Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems

      Mingote, V., Miguel, A., Ortega, A., Lleida, E. "Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

      Read more

    • Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

      Read more

    • The LIUM Human Active Correction Platform for Speaker Diarization

      Flucha, A., Larcher, A., Mehrish, A., Meignier, S., Plaut, F., Poupon, N., Prokopalo, Y., Puertolas, A., Shamsi, M., Tahon, M. (2021) The LIUM Human Active Correction Platform for Speaker Diarization. Proc. Interspeech 2021, 965-966

      Read more

    • MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

      Mošner L., Plchot O., Burget L., Černocký J. (2022) MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification. ICASSP 2022.

      Read more

    • Multi-channel Speaker Verification with Conv-TasNet Based Beamformer

      Mošner L., Plchot O., Burget L., Černocký J. (2022) Multi-channel Speaker Verification with Conv-TasNet Based Beamformer. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

      Read more

    • Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch

      Silnova A., Stafylakis T., Mošner L., Plchot O., Rohdin J., Matějka P., Burget L., Glembek O., Brümmer N. (2022) Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. Odyssey: The Speaker and Language Recognition Workshop 2022

      Read more

    • Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation

      Alam J., Beneš R., Beszédeš M., Burget L., Dahmane M., Fathan A., Ghodrati H., Glembek O., Kang W.H., Matějka P., Mošner L., Plchot O., Rohdin J., Silnova A., Stafylakis T. (2022) Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation. Odyssey: The Speaker and Language Recognition Workshop 2022

      Read more

    • Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

      Brümmer N., Swart A., Mošner L., Silnova A., Plchot O., Stafylakis T., Burget L. (2022) Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. Interspeech 2022

      Read more

    • Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries

      Stafylakis T., Mošner L., Plchot O., Rohdin J., Silnova A., Burget L., Černocký J. (2022) Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. Interspeech 2022

      Read more

    • Microphone Array Channel Combination Algorithms for Overlapped Speech Detection

      Mariotte T., Larcher A., Montresor S., Thomas J.-H. (2022) Microphone Array Channel Combination Algorithms for Overlapped Speech Detection. Interspeech 2022

      Read more

    • Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System

      Vidal J., Bonomi C., Sancinetti M., Ferrer L. (2022) Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System. Interspeech 2022

      Read more

    • ViVoLAB System Description for the S2TC IberSPEECH-RTVE 2022 challenge

      Miguel, A., Ortega, A., & Lleida, E. (2022). ViVoLAB System Description for the S2TC IberSPEECH-RTVE 2022 challenge. Proc. IberSPEECH 2022, 284.

      Read more

    • End-to-End Speech Translation of Arabic to English Broadcast News

      Bougares F. and Jouili S., End-to-End Speech Translation of Arabic to English Broadcast News, The Seventh Arabic Natural Language Processing Workshop (WANLP 2022)

      Read more

    • On the Problem of Data Availability in Automatic Voice Disorder Detection

      Ribas, Dayana; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo
      On the Problem of Data Availability in Automatic Voice Disorder Detection. Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) – Volume 5: HEALTHINF, 2023, ISBN: 978-989-758-631-6.

      Read more

    • S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit

      Ribas, D., Yoldi, M.A.P., Miguel, A., Martínez, D., Ortega, A., Lleida, E. (2022) S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit. Proc. IberSPEECH 2022, 136-140, doi: 10.21437/IberSPEECH.2022-28

      Read more

    • A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation

      Gimeno, P., Ortega, A., Miguel, A., Lleida, E. (2022) A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation. Proc. IberSPEECH 2022, 56-60, doi: 10.21437/IberSPEECH.2022-12

      Read more

    • A Transfer Learning Approach for Pronunciation Scoring

      M. Sancinetti, J. Vidal, C. Bonomi and L. Ferrer, "A Transfer Learning Approach for Pronunciation Scoring," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 6812-6816, doi: 10.1109/ICASSP43922.2022.9747727.

      Read more

    • Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing

      Kakouros S., T. Stafylakis, L. Mošner and L. Burget, "Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094673.

      Read more

    • Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models

      Kesiraju, S., Sarvaš, M., Pavlíček, T., Macaire, C., Ciuba, A. (2023) Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models. Proc. INTERSPEECH 2023, 2148-2152, doi: 10.21437/Interspeech.2023-2506

      Read more

    • Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

      Landini F., M. Diez, A. Lozano-Diez and L. Burget, "Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. _, doi: 10.1109/ICASSP49357.2023.10097049.

      Read more

    • Semantic Enrichment Towards Efficient Speech Representations

      Laperrière, G., Nguyen, H., Ghannay, S., Jabaian, B., Estève, Y. (2023) Semantic Enrichment Towards Efficient Speech Representations. Proc. INTERSPEECH 2023, 705-709, doi: 10.21437/Interspeech.2023-2234

      Read more

    • An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification

      Peng J., O. Plchot, T. Stafylakis, L. Mošner, L. Burget and J. Černocký, "An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 555-562, doi: 10.1109/SLT54892.2023.10022775.

      Read more

    • Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters

      Peng J., Themos Stafylakis, Rongzhi Gu, Oldřich Plchot, Ladislav Mošner, Lukáš Burget, Jan Černocký, "Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094795.

      Read more

    • Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations

      Stafylakis T., L. Mošner, S. Kakouros, O. Plchot, L. Burget and J. Černocký, "Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 1136-1143, doi: 10.1109/SLT54892.2023.10023345.

      Read more

    • ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks

      Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, and Yannick Estève. 2023. ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 219–226, Toronto, Canada (in-person and online). Association for Computational Linguistics.

      Read more

    • Toroidal Probabilistic Spherical Discriminant Analysis

      A. Silnova, N. Brümmer, A. Swart and L. Burget, "Toroidal Probabilistic Spherical Discriminant Analysis," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10095580.

      Read more

    • Description and Analysis of ABC Submission to NIST LRE 2022

      Matějka, P., Silnova, A., Slavíček, J., Mošner, L., Plchot, O., Klčo, M., Peng, J., Stafylakis, T., Burget, L. (2023) Description and Analysis of ABC Submission to NIST LRE 2022. Proc. INTERSPEECH 2023, 511-515, doi: 10.21437/Interspeech.2023-1529

      Read more

    • Improving Speaker Verification with Self-Pretrained Transformer Models

      Peng, J., Plchot, O., Stafylakis, T., Mošner, L., Burget, L., Černocký, J. (2023) Improving Speaker Verification with Self-Pretrained Transformer Models. Proc. INTERSPEECH 2023, 5361-5365, doi: 10.21437/Interspeech.2023-453

      Read more

    • Multi-Channel Speech Separation with Cross-Attention and Beamforming

      Mošner, L., Plchot, O., Peng, J., Burget, L., Černocký, J. (2023) Multi-Channel Speech Separation with Cross-Attention and Beamforming. Proc. INTERSPEECH 2023, 1693-1697, doi: 10.21437/Interspeech.2023-2537

      Read more

    • BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task

      Kesiraju, S., Beneš, K., Tikhonov, M., & Černocký, J.H. (2023). BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task. International Workshop on Spoken Language Translation.

      Read more

    • Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus

      Rahim M. Z., S. S. Juan and F. S. Mohamad, "Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus," 2023 International Conference on Asian Language Processing (IALP), Singapore, Singapore, 2023, pp. 228-233, doi: 10.1109/IALP61005.2023.10337314.

      Read more

    • Improved Vocal Effort Transfer Vector Estimation For Vocal Effort-Robust Speaker Verification

      I. López-Espejo, S. Prieto, A. Ortega and E. Lleida, "Improved Vocal Effort Transfer Vector Estimation For Vocal Effort-Robust Speaker Verification," 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), Rome, Italy, 2023, pp. 1-6, doi: 10.1109/MLSP55844.2023.10285923.

      Read more

    • An explainable proxy model for multilabel audio segmentation

      Mariotte Théo, Antonio Almudévar, Marie Tahon, Alfonso Ortega. An explainable proxy model for multilabel audio segmentation. International Conference on Acoustics Speech and Signal Processing, IEEE, Apr 2024, Seoul (Korea).

      Read more

    • Predefined Prototypes for Intra-Class Separation and Disentanglement

      Almudévar Antonio, Théo Mariotte, Alfonso Ortega, Marie Tahon, Luis Vicente, Antonio Miguel, Eduardo Lleida. Predefined Prototypes for Intra-Class Separation and Disentanglement. Interspeech 2024.

      Read more

    • A Phonetic Analysis of Speaker Verification Systems through Phoneme selection and Integrated Gradients

      Thebaud Thomas, Gabriel Hernandez Sierra, Sarah Flora Samson Juan, Marie Tahon. A Phonetic Analysis of Speaker Verification Systems through Phoneme selection and Integrated Gradients. Speaker and Language Recognition Workshop - Odyssey, Jun 2024, Quebec, Canada.

      Read more

    • 3MAS: a multitask, multilabel, multidataset semi-supervised audio segmentation model

      Lebourdais Martin, Pablo Gimeno, Théo Mariotte, Marie Tahon, Alfonso Ortega, Anthony Larcher. 3MAS: a multitask, multilabel, multidataset semi-supervised audio segmentation model. Speaker and Language Recognition Workshop - Odyssey, Jun 2024, Quebec, Canada.

      Read more

    • Automatic Speech Interruption Detection: Analysis, Corpus, and System

      Lebourdais Martin, Marie Tahon, Antoine Laurent, Sylvain Meignier. Automatic Speech Interruption Detection: Analysis, Corpus, and System. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy.

      Read more

    • ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection

      Tahon Marie, Anthony Larcher, Martin Lebourdais, Fethi Bougares, Ana Silnova, Pablo Gimeno. ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy. <hal-04578441>

      Read more

    • Interprétabilité pour l’identification de locuteurs. Retour sur le projet JSALT 2023

      Tahon Marie, Imen Ben Amor, Nicolas Dugué, Jean-François Bonastre. Interprétabilité pour l’identification de locuteurs. Retour sur le projet JSALT 2023. Actes de la Journée Extraction de connaissances interprétables pour l’étude de la communication parlée. Journée commune AFIA-TLH-AFCP, 2023. <hal-04489273>

      Read more

    • The CONILIUM proposition for Odyssey Emotion Challenge: Leveraging major class with complex annotations

      Shamsi Meysam, Lara Gauder, Marie Tahon. The CONILIUM proposition for Odyssey Emotion Challenge: Leveraging major class with complex annotations. The Speaker and Language Recognition Workshop (Odyssey), Jun 2024, Quebec, Canada. <hal-04600047>

      Read more
