Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems

Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

ViVoLab, Aragón Institute for Engineering Research (I3A), University of Zaragoza, Spain


Contact: {vmingote,amiguel,ortega,lleida}@unizar.es


DOI: https://doi.org/10.21437/Interspeech.2021-1085


Many recent studies in Speaker Verification (SV) have focused on the design of the most appropriate training loss function, which plays an important role in improving the recognition ability of the systems. However, the verification loss functions proposed often do not take into account the performance measures used for the final system evaluation.

For this reason, this paper presents an alternative approach to optimizing the parameters of a neural network using a loss function based on the log-likelihood-ratio cost function (CLLR). This function is an application-independent metric that measures the cost of soft detection decisions over all operating points. Thus, no assumptions about prior or relevance cost parameters are needed to compute it. Moreover, this metric has a differentiable expression, so no approximation is required to use it as the objective loss for training a neural network. The CLLR function as optimization loss was tested on the RSR2015-Part II database for text-dependent speaker verification, providing competitive results without using score normalization and outperforming similar loss functions such as Cross-Entropy combined with Ring Loss, as well as our previous loss function based on an approximation of the Detection Cost Function (DCF).
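As context for the metric the abstract refers to, the standard Cllr formula (from the forensic speaker-recognition literature, not code from this paper) averages a logistic cost over target and non-target trials and is differentiable in the scores. A minimal NumPy sketch, assuming scores are log-likelihood ratios in natural-log units and the cost is expressed in bits:

```python
import numpy as np

def cllr(target_scores, nontarget_scores):
    """Log-likelihood-ratio cost (Cllr).

    Cllr = 0.5 * [ mean_tar log2(1 + e^{-s}) + mean_non log2(1 + e^{s}) ]

    A perfectly uninformative system (all scores 0) gives Cllr = 1 bit;
    well-calibrated, well-separated scores drive Cllr toward 0.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    # Cost of target trials: penalizes low (negative) scores.
    c_tar = np.mean(np.log2(1.0 + np.exp(-target_scores)))
    # Cost of non-target trials: penalizes high (positive) scores.
    c_non = np.mean(np.log2(1.0 + np.exp(nontarget_scores)))
    return 0.5 * (c_tar + c_non)
```

Because every term is smooth in the scores, the same expression can serve directly as a training loss (e.g. in an automatic-differentiation framework), which is the property the paper exploits; the snippet above only evaluates the metric.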


