Conclusion: This paper applies OHEM to speech anti-spoofing systems and
evaluates it experimentally on four different sets of models. The
experimental results indicate that the presented system detects some
unseen attacks well. Our best system achieved an EER of 0.77% with
score-level fusion. In future work, we hope to apply the methods used
in our experiments to more anti-spoofing models.
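As a rough illustration of the two quantities reported above, the sketch below fuses the scores of several systems and computes an approximate EER. It is a minimal sketch under stated assumptions, not the configuration used in our experiments: the equal fusion weights, the convention that a higher score means "more likely bona fide", and the function names `fuse_scores` and `compute_eer` are all hypothetical.

```python
def fuse_scores(score_lists, weights=None):
    """Score-level fusion as a weighted average of per-system scores.

    score_lists[k][i] is the score of system k for trial i.
    Equal weights are an assumption made for this sketch.
    """
    n_systems = len(score_lists)
    if weights is None:
        weights = [1.0 / n_systems] * n_systems
    n_trials = len(score_lists[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists))
        for i in range(n_trials)
    ]


def compute_eer(bonafide_scores, spoof_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    Returns the mean of the false acceptance and false rejection rates
    at the threshold where the two rates are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Assumed convention: a trial is accepted as bona fide if score >= t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores from two hypothetical systems on the same four trials.
    system_a = [0.9, 0.7, 0.2, 0.4]
    system_b = [0.8, 0.5, 0.1, 0.6]
    fused = fuse_scores([system_a, system_b])
    # In this toy example the first two trials are bona fide, the rest spoofed.
    print(compute_eer(fused[:2], fused[2:]))
```

In practice the fusion weights would typically be tuned on a development set, and the EER would be computed over the full evaluation trial list rather than a handful of scores.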