Anti-spoofing using ResNet50 with linear discriminant analysis for automatic speaker verification
Title Anti-spoofing using ResNet50 with linear discriminant analysis for automatic speaker verification
Creator Peemapot Uparakool
Contributor Waree Kongprawechnon, Advisor
Publisher Thammasat University
Publication Year 2568 BE (2025 CE)
Keyword Anti-spoofing, Automatic speaker verification, Linear discriminant analysis, Principal component analysis, ResNet50
Abstract Deep-learning-based models have shown significant potential in speech spoof detection, which is crucial for ensuring the authenticity of speech signals. This work extends deep-learning-based spoof detection by integrating ResNet50 with linear discriminant analysis (LDA) for dimensionality reduction. Using the logical access (LA) subset of the ASVspoof 2019 dataset, we generated mel-spectrogram and gammatone spectrogram representations of the speech signals. ResNet50 was used to extract deep features from these spectrograms, and LDA was then applied to reduce feature dimensionality and improve classification accuracy. Compared with the baseline ResNet50 model, our method reduced the equal error rate (EER) and increased balanced accuracy by 43.55% and 48.59% for the duplicated mel-spectrogram tensor, by 8.95% and 15.52% for the differentiated mel-spectrogram tensor, and by 44.14% and 44.77% for the differentiated gammatone spectrogram tensor, respectively. These results demonstrate the effectiveness of combining ResNet50 with gammatone spectrograms and LDA, providing a more robust solution for audio spoof detection. To investigate the approach further, we extended the evaluation by applying traditional classifiers, namely Random Forest (RF), k-Nearest Neighbors (KNN), and Naïve Bayes (NB), to the deep features extracted by ResNet50 and reduced by LDA or principal component analysis (PCA). Among all combinations, the LDA-reduced features paired with the Naïve Bayes classifier achieved the best result, reaching 88.18% balanced accuracy and 2.80% EER. These findings confirm that the proposed framework not only improves spoof detection performance under a threshold-based scheme but is also compatible with various machine learning classifiers, making it a flexible and effective solution for audio spoof detection tasks.
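The abstract reports results as equal error rate (EER) and balanced accuracy, but this record does not say how either was computed. The sketch below is one common way to obtain both from per-utterance scores, not the author's implementation. It assumes that a higher score means "more likely bona fide" and that labels are 1 for bona fide and 0 for spoof. The function names and the toy data are hypothetical, and the EER is approximated by sweeping the observed scores as thresholds.

```python
def error_rates(labels, scores, threshold):
    """Return (FAR, FRR) when utterances with score >= threshold are accepted."""
    spoof = [s for y, s in zip(labels, scores) if y == 0]
    bona = [s for y, s in zip(labels, scores) if y == 1]
    far = sum(s >= threshold for s in spoof) / len(spoof)  # spoof accepted
    frr = sum(s < threshold for s in bona) / len(bona)     # bona fide rejected
    return far, frr


def equal_error_rate(labels, scores):
    """Approximate the EER: pick the observed score where FAR and FRR are closest."""
    best = None
    for t in sorted(set(scores)):
        far, frr = error_rates(labels, scores, t)
        candidate = (abs(far - frr), (far + frr) / 2, t)
        if best is None or candidate[0] < best[0]:
            best = candidate
    _, eer, threshold = best
    return eer, threshold


def balanced_accuracy(labels, preds):
    """Mean of the true-positive rate and the true-negative rate."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Toy data (hypothetical): four bona fide and four spoofed utterances.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

eer, threshold = equal_error_rate(labels, scores)
preds = [1 if s >= threshold else 0 for s in scores]
bal_acc = balanced_accuracy(labels, preds)
print(f"EER={eer:.2f} at threshold {threshold}, balanced accuracy={bal_acc:.2f}")
```

Balanced accuracy averages the per-class recalls, so it is not dominated by the majority class when spoofed and bona fide utterances are unevenly represented.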