Investigating the Role of Speaker Counter in Handling Overlapping
Speeches in Speaker Diarization Systems
- Thanh Thi-Hien Duong,
- Phi-Le Nguyen,
- Hong-Son Nguyen,
- Ngoc Q. K. Duong
Phi-Le Nguyen
University of Science and Technology of Hanoi
Abstract
In real-life conversations, meetings, and debates, several people often
speak at the same time, producing overlapping speech segments. Such
overlapping speech is an extremely challenging problem for the speaker
diarization task. The widely used
clustering-based diarization approaches perform poorly in such
situations because they have limited ability to handle overlapping
speech. This paper investigates a speaker diarization framework in
which a new building block, called a speaker counter, is integrated. The
speaker counter predicts the number of active speakers in each analysis
window, and its output is then used in the conventional
re-segmentation step of the diarization pipeline to better
label the active speakers in each considered segment. We also
investigate the effect of the analysis window size on diarization
performance through theoretical analysis. We argue that the speaker
counter ensures a lower diarization error rate when the analysis window
is small enough. Experimental results obtained from two
state-of-the-art diarization systems with different settings on two
benchmark datasets, AMI Headset mix and DIHARD III, confirmed the
effectiveness of the proposed approach.
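
To make the role of the speaker counter concrete, the following is a minimal sketch of how a per-window speaker count could drive label assignment during re-segmentation. The abstract does not specify the assignment rule, so the top-k selection used here is an assumption, and the function name `assign_active_speakers` and its inputs (`scores`, `counts`) are hypothetical names introduced only for illustration.

```python
import numpy as np


def assign_active_speakers(scores, counts):
    """Label the k highest-scoring speakers as active in each window.

    scores: (n_windows, n_speakers) per-window speaker scores, e.g.
            posteriors from a re-segmentation model (assumed input).
    counts: (n_windows,) number of active speakers predicted by the
            speaker counter for each window (assumed input).
    Returns a boolean (n_windows, n_speakers) activity matrix.
    """
    scores = np.asarray(scores, dtype=float)
    counts = np.asarray(counts, dtype=int)
    n_windows, n_speakers = scores.shape

    # A predicted count cannot exceed the number of known speakers.
    counts = np.clip(counts, 0, n_speakers)

    # order[i] lists the speakers of window i from highest to lowest score.
    order = np.argsort(-scores, axis=1, kind="stable")

    # ranks[i, s] is the position of speaker s in that ordering (0 = best).
    ranks = np.empty_like(order)
    rows = np.arange(n_windows)[:, None]
    ranks[rows, order] = np.arange(n_speakers)[None, :]

    # A speaker is active when it ranks among the top counts[i] speakers.
    return ranks < counts[:, None]


if __name__ == "__main__":
    # Three windows, three speakers: single speaker, overlap, silence.
    scores = [[0.7, 0.2, 0.1],
              [0.4, 0.5, 0.1],
              [0.3, 0.3, 0.4]]
    counts = [1, 2, 0]
    print(assign_active_speakers(scores, counts))
```

Under this assumed rule, a window with a predicted count of two receives two speaker labels, which a clustering-only pipeline that assigns one speaker per segment cannot produce. A count of zero marks the window as non-speech.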