| Speaker diarization in broadcast news | |
|---|---|
| DOI | |
| Title | Speaker diarization in broadcast news |
| Creator | Pantid Chantangphol |
| Contributor | Sasiporn Usanavasin, Advisor |
| Publisher | Thammasat University |
| Publication Year | 2020 (2563 BE) |
| Keyword | Speech activity detection, Speaker diarization, Convolutional neural network, DenseNet, Deep learning, Feature combination |
| Abstract | Speaker diarization is a multimedia indexing technology that uses audio information to answer the question “Who spoke when?” This thesis presents a step-by-step speaker diarization system, implemented in Python, that is evaluated using the Diarization Error Rate (DER) metric. The proposed system, designed for segmenting audio recordings of broadcast news, implements combined feature extraction based on the Dense Convolutional Network (DenseNet) to segment speech by speaker identity in the presence of various background noise. The clustering algorithm offers a lower DER as well as a computational advantage compared with the other classifiers. The proposed speaker diarization system achieves favorable performance on the Hollywood movie dataset, the AVA speech dataset, CALLHOME American English, the 2003 NIST Rich Transcription and the 2000 NIST Speaker Recognition Evaluation compared with supervised LSTM-based speaker diarization. |
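
The abstract names the Diarization Error Rate (DER) as the evaluation metric. As a minimal sketch of how DER is commonly defined (missed speech plus false alarm plus speaker confusion, divided by total reference speech time), the following frame-level example may help. It is not the thesis's scoring code: the function name `diarization_error_rate` and the toy labels are hypothetical, and it assumes no overlapping speech, no forgiveness collar, and hypothesis labels that have already been mapped to the reference speakers.

```python
def diarization_error_rate(reference, hypothesis):
    """Frame-level DER sketch; each label is a speaker name or None for non-speech.

    Assumes hypothesis labels are already mapped to reference speakers
    and that at most one speaker is active per frame.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1              # frame counts toward total reference speech
            if hyp is None:
                missed += 1          # speech present but not detected
            elif hyp != ref:
                confusion += 1       # speech detected, wrong speaker assigned
        elif hyp is not None:
            false_alarm += 1         # speech reported where the reference has none

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Toy example: 10 frames, two speakers "A" and "B" (illustrative data only).
reference = ["A", "A", "A", None, "B", "B", "B", "B", None, None]
hypothesis = ["A", "A", None, None, "B", "A", "B", "B", "B", None]
der = diarization_error_rate(reference, hypothesis)
print(f"DER = {der:.3f}")
```

In this toy case one frame is missed, one is confused and one is a false alarm over seven reference speech frames, so the score is 3/7. Standard scoring tools additionally search for the optimal mapping between hypothesis and reference speakers before counting confusion, which this sketch deliberately leaves out.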