Digital Object Identifier

	10.14457/TU.the.2020.1038 Speaker diarization in broadcast news
DOI	10.14457/TU.the.2020.1038
Title	Speaker diarization in broadcast news
Creator	Pantid Chantangphol
Contributor	Sasiporn Usanavasin, Advisor
Publisher	Thammasat University
Publication Year	2020 (B.E. 2563)
Keyword	Speech activity detection, Speaker diarization, Convolutional neural network, DenseNet, Deep learning, Feature combination
Abstract	Speaker diarization is a multimedia indexing technology that uses audio information to answer the question “Who spoke when?” This thesis presents a step-by-step speaker diarization system, implemented in Python and evaluated using the Diarization Error Rate (DER) metric. The proposed system, designed for segmenting audio recordings of broadcast news, implements combined feature extraction based on the Dense Convolutional Network (DenseNet) to segment speech by speaker identity in the presence of various background noise. The clustering algorithm offers a lower DER as well as a computational advantage compared to the other classifier. The proposed system achieves favorable performance on the Hollywood movie dataset, the AVA Speech dataset, CALLHOME American English, the 2003 NIST Rich Transcription, and the 2000 NIST Speaker Recognition Evaluation, compared to supervised speaker diarization with LSTM.

Thammasat University

Pantid Chantangphol et al. (2020) Speaker diarization in broadcast news. Thammasat University: n.p. 10.14457/TU.the.2020.1038

Pantid Chantangphol et al. 2020. Speaker diarization in broadcast news. n.p.: Thammasat University; 10.14457/TU.the.2020.1038

Pantid Chantangphol et al. Speaker diarization in broadcast news. n.p.: Thammasat University, 2020. Print. 10.14457/TU.the.2020.1038
