Abstract— Any biometric recognizer is vulnerable to spoofing attacks, and
voice biometrics are no exception; replay, voice conversion and synthesis attacks
all provoke false acceptances unless countermeasures are used. This paper
focuses on replay attacks, which are considered among the most challenging for
recognition systems because they are readily accessible to attackers. We propose a methodology for the
fusion of different modes of speaker verification (SV)
\cite{Campbell_1997,Bimbot_2004,Reynolds_2000,Safavi_2012,Ganchev_2013,H_bert_2008,Larcher_2014,Ntalianis_2011,Beigi_2011,Sukkar_1996,DAVIS_1990,Furui_1981,Reynolds_1995a,Campbell_2006,Dehak_2011,Mitchell,Ross,Nandakumar_2008,Monte_Moreno_2009,Soong_1987,Kittler_1998,Kuncheva_2007,Damper_2003,Bishop,Raudys_2006,Chih_Wei_Hsu_2002,Ramachandran_2002,Bouchard_2007,Campbell_1999,Hermansky_1994,Viikki_1998,Kung,Pal_1996,Zhang_2013,Altman_1992,Witten_2011,BURGES_1993}
operation
(fixed-passphrase, text-dependent and text-independent modes), using regression
fusion models. We suggest that although fusion on its own may have
vulnerabilities, fusion combined with an utterance verification engine is
effective. Experimental results with and without spoofing attack conditions,
using different single-mode speaker verification engines (GMM-UBM, HMM-UBM and
i-vector approaches), show improvement in all experiments. The best speaker
verification performance, a 6.75\% EER, is achieved by fusing the scores from
the three modes of operation of HMM-UBM based speaker verification systems.
This is a relative improvement of 22.32\% over the best-performing single-mode
engine.
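
As a quick check on the two figures above, relative improvement is taken here in its usual sense: the reduction in EER divided by the single-mode EER. The symbols $\mathrm{EER}_{\mathrm{single}}$ and $\mathrm{EER}_{\mathrm{fused}}$ are introduced only for this illustration. The single-mode value is not quoted above; it is merely the value implied by the two reported numbers under that assumed definition:
\[
\frac{\mathrm{EER}_{\mathrm{single}} - \mathrm{EER}_{\mathrm{fused}}}{\mathrm{EER}_{\mathrm{single}}} = 0.2232
\quad\Longrightarrow\quad
\mathrm{EER}_{\mathrm{single}} = \frac{6.75\,\%}{1 - 0.2232} \approx 8.69\,\%.
\]
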
Index Terms—Automatic speaker verification, spoofing attack,
anti-spoofing, regression fusion.