Speaker diarization in broadcast news
DOI
Title Speaker diarization in broadcast news
Creator Pantid Chantangphol
Contributor Sasiporn Usanavasin, Advisor
Publisher Thammasat University
Publication Year 2563 BE (2020 CE)
Keyword Speech activity detection, Speaker diarization, Convolutional neural network, DenseNet, Deep learning, Feature combination
Abstract Speaker diarization is a multimedia indexing technology that uses audio information to answer the question “Who spoke when?” This thesis presents a step-by-step speaker diarization system, implemented in Python, that is evaluated using the Diarization Error Rate (DER) metric. The proposed system, designed for segmenting audio recordings of broadcast news, implements combined feature extraction based on the Dense Convolutional Network (DenseNet) to segment speech by speaker ID in the presence of various background noise. The clustering algorithm offers a lower DER as well as a computational advantage compared with the other classifier. The proposed system achieves favorable performance on the Hollywood movie dataset, the AVA Speech dataset, CALLHOME American English, 2003 NIST Rich Transcription, and the 2000 NIST Speaker Recognition Evaluation compared with supervised LSTM-based speaker diarization.
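The abstract names DER as the evaluation metric without defining it. As a rough illustration only, DER is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time. The sketch below computes a simplified frame-level version. The function name `frame_der`, the per-frame label representation, and the assumption that hypothesis labels are already mapped to reference labels are all illustrative choices, not taken from the thesis; standard scoring tools work on time durations, search for the optimal speaker mapping, handle overlapping speech, and commonly apply a forgiveness collar, none of which is modeled here.

```python
from typing import Optional, Sequence


def frame_der(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> float:
    """Simplified frame-level Diarization Error Rate (illustrative sketch).

    Each sequence holds one speaker label per frame; None marks non-speech.
    Assumes at most one speaker per frame and that hypothesis labels have
    already been mapped onto the reference label set.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1
            if hyp is None:
                missed += 1        # reference speech labeled as non-speech
            elif hyp != ref:
                confusion += 1     # speech attributed to the wrong speaker
        elif hyp is not None:
            false_alarm += 1       # non-speech labeled as speech

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Hypothetical six-frame example: one confusion, one miss, one false alarm
# over four reference speech frames.
ref = ["A", "A", "B", "B", None, None]
hyp = ["A", "B", "B", None, "A", None]
print(frame_der(ref, hyp))
```

Because false alarms are counted in the numerator but not in the denominator, DER can exceed 1.0 for a hypothesis that labels a lot of non-speech as speech.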
