ESPERANTO

Dissemination

    • Speaker embeddings by modeling channel-wise correlations

      Stafylakis T., Rohdin J., Burget L., Speaker embeddings by modeling channel-wise correlations, Interspeech 2021

    • Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems

      Mingote, V., Miguel, A., Ortega, A., Lleida, E. "Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

    • Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021" 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Brno, Czech Republic.

    • The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

      Viñals, I.; Ortega, A.; Miguel, A.; Lleida, E. "The Domain Mismatch Problem in the Broadcast Speaker Attribution Task". Applied Sciences, vol. 11, no. 18, p. 8521, Sept. 2021.

    • Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "Generalising AUC Optimisation to Multiclass Classification for Audio Segmentation with Limited Training Data". IEEE Signal Processing Letters, 28, pp. 1135-1139, 2021.

    • The LIUM Human Active Correction Platform for Speaker Diarization

      Flucha, A., Larcher, A., Mehrish, A., Meignier, S., Plaut, F., Poupon, N., Prokopalo, Y., Puertolas, A., Shamsi, M., Tahon, M. (2021) The LIUM Human Active Correction Platform for Speaker Diarization. Proc. Interspeech 2021, 965-966

    • MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

      Mošner L., Plchot O., Burget L., Černocký J. (2022) MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification. ICASSP 2022.

    • Multi-channel Speaker Verification with Conv-TasNet Based Beamformer

      Mošner L., Plchot O., Burget L., Černocký J. (2022) Multi-channel Speaker Verification with Conv-TasNet Based Beamformer. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

    • Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch

      Silnova A., Stafylakis T., Mosner L., Plchot O., Rohdin J., Matejka P., Burget L., Glembek O., Brummer N. (2022) Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. Odyssey: The Speaker and Language Recognition Workshop 2022

    • Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation

      Alam J., Beneš R., Beszédeš M., Burget L., Dahmane M., Fathan A., Ghodrati H., Glembek O., Kang W.H., Matějka P., Mošner L., Plchot O., Rohdin J., Silnova A., Stafylakis T. (2022) Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation. Odyssey: The Speaker and Language Recognition Workshop 2022

    • Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

      Brummer N., Swart A., Mosner L., Silnova A., Plchot O., Stafylakis T., Burget L. (2022) Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. Interspeech 2022

    • Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries

      Stafylakis T., Mosner L., Plchot O., Rohdin J., Silnova A., Burget L., Cernocky J. (2022) Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. Interspeech 2022

    • Microphone Array Channel Combination Algorithms for Overlapped Speech Detection

      Mariotte T., Larcher A., Montresor S., Thomas J.-H. (2022) Microphone Array Channel Combination Algorithms for Overlapped Speech Detection. Interspeech 2022

    • Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System

      Vidal J., Bonomi C., Sancinetti M., Ferrer L. (2022) Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System. Interspeech 2022

    • Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data

      P. Gimeno, V. Mingote, A. Ortega, A. Miguel and E. Lleida, "Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data," in IEEE Signal Processing Letters, vol. 28, pp. 1135-1139, 2021, doi: 10.1109/LSP.2021.3084501.

    • The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

      Viñals I., Ortega A., Miguel A., Lleida E. (2021) The Domain Mismatch Problem in the Broadcast Speaker Attribution Task. In IberSPEECH 2020: Speech and Language Technologies for Iberian Languages. Appl. Sci. 2021, 11(18), 8521.

    • End-to-End Speech Translation of Arabic to English Broadcast News

      Bougares F. and Jouili S., End-to-End Speech Translation of Arabic to English Broadcast News, The Seventh Arabic Natural Language Processing Workshop (WANLP 2022)

    • Automatic Voice Disorder Detection Using Self-Supervised Representations

      D. Ribas, M. A. Pastor, A. Miguel, D. Martínez, A. Ortega and E. Lleida, "Automatic Voice Disorder Detection Using Self-Supervised Representations," in IEEE Access, vol. 11, pp. 14915-14927, 2023, doi: 10.1109/ACCESS.2023.3243986.

    • Class token and knowledge distillation for multi-head self-attention speaker verification systems

      Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digital Signal Processing, 2023, vol. 133, p. 103859.

    • Multimodal Diarization Systems by Training Enrollment Models as Identity Representations

      Mingote, V.; Viñals, I.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E.
      Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. Applied Sciences. 2022; 12(3):1141
      DOI:10.3390/app12031141, Corpus ID: 246348513

    • aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems

      Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022.
      DOI:10.1109/taslp.2022.3145307 Corpus ID: 246346628

    • Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains

      Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains. Applied Sciences 2022, 12, 1832. https://doi.org/10.3390/app12041832

    • Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement

      Ribas, D.; Miguel, A.; Ortega, A.; Lleida, E. Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement. Applied Sciences 2022, 12, 9000.
      DOI:10.3390/app12189000, Corpus ID: 252174387

    • Shouted and whispered speech compensation for speaker verification systems

      Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted and whispered speech compensation for speaker verification systems. Digital Signal Processing, 2022, vol. 127, p. 103536.
      https://doi.org/10.1016/j.dsp.2022.103536

    • S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit

      Ribas, D., Yoldi, M.A.P., Miguel, A., Martínez, D., Ortega, A., Lleida, E. (2022) S3prl-Disorder: Open-Source Voice Disorder Detection System based in the Framework of S3PRL-toolkit. Proc. IberSPEECH 2022, 136-140,
      doi: 10.21437/IberSPEECH.2022-28

    • A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation

      Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. "A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation" IberSPEECH 2022. Granada, Spain. November 2022
      DOI:10.21437/iberspeech.2022-12, Corpus ID: 253434066

    • A Transfer Learning Approach for Pronunciation Scoring

      M. Sancinetti, J. Vidal, C. Bonomi and L. Ferrer, "A Transfer Learning Approach for Pronunciation Scoring," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 6812-6816, doi: 10.1109/ICASSP43922.2022.9747727.
