Hybrid Speaker Verification Architecture for Sensitive Authentication Services
Abstract—Like any biometric recognizer, voice biometrics is vulnerable to spoofing attacks: replay, voice conversion and synthesis attacks all provoke false acceptances unless countermeasures are used. This paper focuses on replay attacks, which are considered among the most challenging for recognition systems because they are easily accessible to attackers. We propose a methodology for fusing different modes of speaker verification (SV) operation (fixed-passphrase, text-dependent and text-independent) using regression fusion models. We suggest that although fusion on its own may have vulnerabilities, fusion combined with an utterance verification engine is effective. Experiments with and without spoofing attacks, using different single-mode speaker verification engines (GMM-UBM, HMM-UBM and i-vector approaches), showed improvement in every case. The best speaker verification performance, an EER of 6.75%, is achieved by fusing scores from three modes of operation of HMM-UBM based speaker verification systems. This is a relative improvement of 22.32% over the best-performing single-mode engine.
Index Terms—Automatic speaker verification, spoofing attack, anti-spoofing, regression fusion.
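
To make the idea of regression-based score fusion and the EER metric concrete, the following is a minimal sketch rather than the method used in this paper. It assumes a plain least-squares linear regression over three per-mode scores and synthetic Gaussian scores; the function names (fit_linear_fusion, fuse, equal_error_rate, relative_improvement) and all data are hypothetical. The 8.69% baseline in the last line is back-calculated from the reported 6.75% EER and 22.32% relative improvement, not quoted from the results.

```python
import numpy as np


def fit_linear_fusion(scores, labels):
    """Least-squares regression of labels (1 = target, 0 = impostor) on per-mode scores."""
    X = np.hstack([scores, np.ones((scores.shape[0], 1))])  # append a bias column
    weights, *_ = np.linalg.lstsq(X, labels.astype(float), rcond=None)
    return weights


def fuse(scores, weights):
    """Apply the fitted weights (last entry is the bias) to get one fused score per trial."""
    return scores @ weights[:-1] + weights[-1]


def equal_error_rate(scores, labels):
    """Approximate EER: sweep thresholds over the observed scores and return
    the average of FAR and FRR at the threshold where they are closest."""
    target = scores[labels == 1]
    impostor = scores[labels == 0]
    thresholds = np.unique(scores)
    far = np.array([(impostor >= t).mean() for t in thresholds])  # false acceptance rate
    frr = np.array([(target < t).mean() for t in thresholds])     # false rejection rate
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction, in percent."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Synthetic trials (assumption): three modes with independent, equally noisy scores.
rng = np.random.default_rng(0)
labels = np.concatenate([np.ones(500, dtype=int), np.zeros(500, dtype=int)])
scores = labels[:, None] + rng.normal(0.0, 1.0, size=(labels.size, 3))

weights = fit_linear_fusion(scores, labels)
fused_eer = equal_error_rate(fuse(scores, weights), labels)
single_eers = [equal_error_rate(scores[:, m], labels) for m in range(3)]

print("single-mode EERs:", single_eers)
print("fused EER:", fused_eer)
print("relative improvement from 8.69 to 6.75:", relative_improvement(8.69, 6.75))
```

In this toy setting the fused score is scored on the same trials used to fit the weights; a real evaluation would fit the fusion model on a development set and report EER on held-out trials.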