Stafylakis T., Rohdin J., Burget L., Speaker embeddings by modeling channel-wise correlations, Interspeech 2021
Mingote, V., Miguel, A., Ortega, A., Lleida, E. "Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.
Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.
Viñals, I.; Ortega, A.; Miguel, A.; Lleida, E. "The Domain Mismatch Problem in the Broadcast Speaker Attribution Task". Applied Sciences, vol. 11, no. 18, p. 8521, Sept. 2021.
Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data". IEEE Signal Processing Letters, 28, pp. 1135-1139, 2021.
Flucha, A., Larcher, A., Mehrish, A., Meignier, S., Plaut, F., Poupon, N., Prokopalo, Y., Puertolas, A., Shamsi, M., Tahon, M. (2021) The LIUM Human Active Correction Platform for Speaker Diarization. Proc. Interspeech 2021, 965-966
Mošner L., Plchot O., Burget L., Černocký J. (2022) MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification. ICASSP 2022
Mošner L., Plchot O., Burget L., Černocký J. (2022) Multi-channel Speaker Verification with Conv-TasNet Based Beamformer. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Silnova A., Stafylakis T., Mosner L., Plchot O., Rohdin J., Matejka P., Burget L., Glembek O., Brummer N. (2022) Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. Odyssey: The Speaker and Language Recognition Workshop 2022
Alam J., Beneš R., Beszédeš M., Burget L., Dahmane M., Fathan A., Ghodrati H., Glembek O., Kang W.H., Matějka P., Mošner L., Plchot O., Rohdin J., Silnova A., Stafylakis T. (2022) Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation. Odyssey: The Speaker and Language Recognition Workshop 2022
Brummer N., Swart A., Mosner L., Silnova A., Plchot O., Stafylakis T., Burget L. (2022) Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. Interspeech 2022
Stafylakis T., Mosner L., Plchot O., Rohdin J., Silnova A., Burget L., Cernocky J. (2022) Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. Interspeech 2022
Mariotte T., Larcher A., Montresor S., Thomas J.-H. (2022) Microphone Array Channel Combination Algorithms for Overlapped Speech Detection. Interspeech 2022
Vidal J., Bonomi C., Sancinetti M., Ferrer L. (2022) Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System. Interspeech 2022
Miguel, A., Ortega, A., & Lleida, E. (2022). ViVoLAB System Description for the S2TC IberSPEECH-RTVE 2022 challenge. Proc. IberSPEECH 2022, 284.
Bougares F. and Jouili S., End-to-End Speech Translation of Arabic to English Broadcast News, The Seventh Arabic Natural Language Processing Workshop (WANLP 2022)
D. Ribas, M. A. Pastor, A. Miguel, D. Martínez, A. Ortega and E. Lleida, "Automatic Voice Disorder Detection Using Self-Supervised Representations," in IEEE Access, vol. 11, pp. 14915-14927, 2023, doi: 10.1109/ACCESS.2023.3243986.
Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digital Signal Processing, 2023, vol. 133, p. 103859.
Mingote, V.; Viñals, I.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E. Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. Applied Sciences. 2022; 12(3):1141. doi: 10.3390/app12031141, Corpus ID: 246348513
Mingote, V.; Miguel, A.; Ribas D.; Ortega, A.; Lleida, E. aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 772-784, 2022, doi: 10.1109/TASLP.2022.3145307
Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains. Applied Sciences 2022, 12, 1832. https://doi.org/10.3390/app12041832
Ribas, D.; Miguel, A.; Ortega, A.; Lleida, E. Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement. Applied Sciences 2022, 12, 9000. doi: 10.3390/app12189000, Corpus ID: 252174387
Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted and whispered speech compensation for speaker verification systems. Digital Signal Processing, 2022, vol. 127, p. 103536. https://doi.org/10.1016/j.dsp.2022.103536
Ribas, Dayana; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. On the Problem of Data Availability in Automatic Voice Disorder Detection. Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) – Volume 5: HEALTHINF, 2023, ISBN: 978-989-758-631-6.
Ribas, D., Yoldi, M.A.P., Miguel, A., Martínez, D., Ortega, A., Lleida, E. (2022) S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit. Proc. IberSPEECH 2022, 136-140, doi: 10.21437/IberSPEECH.2022-28
Gimeno, P., Ortega, A., Miguel, A., Lleida, E. (2022) A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation. Proc. IberSPEECH 2022, 56-60, doi: 10.21437/IberSPEECH.2022-12
M. Sancinetti, J. Vidal, C. Bonomi and L. Ferrer, "A Transfer Learning Approach for Pronunciation Scoring," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 6812-6816, doi: 10.1109/ICASSP43922.2022.9747727.
Kakouros S., T. Stafylakis, L. Mošner and L. Burget, "Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094673.
Kesiraju, S., Sarvaš, M., Pavlíček, T., Macaire, C., Ciuba, A. (2023) Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models. Proc. INTERSPEECH 2023, 2148-2152, doi: 10.21437/Interspeech.2023-2506
Landini F., M. Diez, A. Lozano-Diez and L. Burget, "Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. _, doi: 10.1109/ICASSP49357.2023.10097049.
Laperrière, G., Nguyen, H., Ghannay, S., Jabaian, B., Estève, Y. (2023) Semantic Enrichment Towards Efficient Speech Representations. Proc. INTERSPEECH 2023, 705-709, doi: 10.21437/Interspeech.2023-2234
Lleida, E.; Rodriguez-Fuentes, L.J.; Tejedor; Ortega, A.; Miguel, A.; Bazán, V.; Pérez, C.; de Prada, A.; Penagarikano, M.; Varona, A.; et al. An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies. Appl. Sci. 2023, 13, 8577. https://doi.org/10.3390/app13158577
Peng J., O. Plchot, T. Stafylakis, L. Mošner, L. Burget and J. Černocký, "An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 555-562, doi: 10.1109/SLT54892.2023.10022775.
Pastor, M.A.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations. Appl. Sci. 2023, 13, 9062. https://doi.org/10.3390/app13169062
Peng J., Themos Stafylakis, Rongzhi Gu, Oldřich Plchot, Ladislav Mošner, Lukáš Burget, Jan Černocký, "Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094795.
Stafylakis T., L. Mošner, S. Kakouros, O. Plchot, L. Burget and J. Černocký, "Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations," 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 1136-1143, doi: 10.1109/SLT54892.2023.10023345.
Mingote V., P. Gimeno, L. Vicente, S. Khurana, A. Laurent and J. Duret, "Direct Text to Speech Translation System Using Acoustic Units," in IEEE Signal Processing Letters, vol. 30, pp. 1262-1266, 2023, doi: 10.1109/LSP.2023.3313513.
Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, and Yannick Estève. 2023. ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 219–226, Toronto, Canada (in-person and online). Association for Computational Linguistics.
A. Silnova, N. Brümmer, A. Swart and L. Burget, "Toroidal Probabilistic Spherical Discriminant Analysis," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10095580.
Matejka, P., Silnova, A., Slavíček, J., Mosner, L., Plchot, O., Klčo, M., Peng, J., Stafylakis, T., Burget, L. (2023) Description and Analysis of ABC Submission to NIST LRE 2022. Proc. INTERSPEECH 2023, 511-515, doi: 10.21437/Interspeech.2023-1529
Peng, J., Plchot, O., Stafylakis, T., Mosner, L., Burget, L., Černocký, J. (2023) Improving Speaker Verification with Self-Pretrained Transformer Models. Proc. INTERSPEECH 2023, 5361-5365, doi: 10.21437/Interspeech.2023-453
Mosner, L., Plchot, O., Peng, J., Burget, L., Černocký, J. (2023) Multi-Channel Speech Separation with Cross-Attention and Beamforming. Proc. INTERSPEECH 2023, 1693-1697, doi: 10.21437/Interspeech.2023-2537
Kesiraju, S., Beneš, K., Tikhonov, M., & Černocký, J.H. (2023). BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task. International Workshop on Spoken Language Translation.
Rahim M. Z., S. S. Juan and F. S. Mohamad, "Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus," 2023 International Conference on Asian Language Processing (IALP), Singapore, Singapore, 2023, pp. 228-233, doi: 10.1109/IALP61005.2023.10337314.
I. López-Espejo, S. Prieto, A. Ortega and E. Lleida, "Improved Vocal Effort Transfer Vector Estimation For Vocal Effort-Robust Speaker Verification," 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), Rome, Italy, 2023, pp. 1-6, doi: 10.1109/MLSP55844.2023.10285923.
Meysam Shamsi, Anthony Larcher, Loïc Barrault, Sylvain Meignier, Yevhenii Prokopalo, Marie Tahon, Ambuj Mehrish, Simon Petitrenaud, Olivier Galibert, Samuel Gaist, André Anjos, Sébastien Marcel, Marta Costa-Jussà. Towards Lifelong Human Assisted Speaker Diarization. Computer Speech and Language, 2023, 77, pp.101437. ⟨10.1016/j.csl.2022.101437⟩.
Yevhenii Prokopalo, Meysam Shamsi, Loïc Barrault, Sylvain Meignier, Anthony Larcher. Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop. Applied Sciences, 2022, ⟨10.3390/app12041782⟩. ⟨hal-03563148⟩
Mariotte Théo, Antonio Almudévar, Marie Tahon, Alfonso Ortega. An explainable proxy model for multilabel audio segmentation. International Conference on Acoustics Speech and Signal Processing, IEEE, Apr 2024, Seoul (Korea).
Almudévar Antonio, Théo Mariotte, Alfonso Ortega, Marie Tahon, Luis Vicente, Antonio Miguel, Eduardo Lleida. Predefined Prototypes for Intra-Class Separation and Disentanglement. Interspeech 2024.
Thebaud Thomas, Gabriel Hernandez Sierra, Sarah Flora Samson Juan, Marie Tahon. A Phonetic Analysis of Speaker Verification Systems through Phoneme selection and Integrated Gradients. Speaker and Language Recognition Workshop - Odyssey, Jun 2024, Quebec, Canada.
Lebourdais Martin, Pablo Gimeno, Théo Mariotte, Marie Tahon, Alfonso Ortega, Anthony Larcher. 3MAS: a multitask, multilabel, multidataset semi-supervised audio segmentation model. Speaker and Language Recognition Workshop Odyssey, Jun 2024, Quebec, Canada.
Lebourdais Martin, Marie Tahon, Antoine Laurent, Sylvain Meignier. Automatic Speech Interruption Detection: Analysis, Corpus, and System. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy.
Tahon Marie, Anthony Larcher, Martin Lebourdais, Fethi Bougares, Ana Silnova, Pablo Gimeno. ALLIES: A Speech Corpus for Segmentation, Speaker Diarization, Speech Recognition and Speaker Change Detection. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy. <hal-04578441>
Tahon Marie, Imen Ben Amor, Nicolas Dugué, Jean-François Bonastre. Interprétabilité pour l’identification de locuteurs. Retour sur le projet JSALT 2023. Actes de la Journée Extraction de connaissances interprétables pour l’étude de la communication parlée. Journée commune AFIA-TLH -AFCP, 2023. <hal-04489273>
Shamsi Meysam, Lara Gauder, Marie Tahon. The CONILIUM proposition for Odyssey Emotion Challenge: Leveraging major class with complex annotations. The Speaker and Language Recognition Workshop (Odyssey), Jun 2024, Quebec, Canada. <hal-04600047>