Toxicity Posts and Hateful Comments Detection in Thai Language Using Supervised Ensemble Classification |
---|---|
DOI | |
Creator | Sutthisak Sukhamsri |
Title | Toxicity Posts and Hateful Comments Detection in Thai Language Using Supervised Ensemble Classification |
Publisher | Rajamangala University of Technology Lanna |
Publication Year | 2568 BE (2025 CE) |
Journal Title | RMUTL Engineering Journal |
Journal Vol. | 10 |
Journal No. | 1 |
Page no. | 1-8 |
Keyword | Natural Language Processing, Toxicity Posts, Word Vectorization, Thai Language Corpus, Ensemble Model |
URL Website | https://engsystem.rmutl.ac.th/journal/ |
ISSN | 3027-7426 |
Abstract | Social media platforms are communities where people gather and freely express their opinions on any topic that interests them. However, hostile arguments and an unpleasant atmosphere in these communities are often triggered by negative, toxic, and hateful posts or comments. For that reason, systems that monitor social media posts are an important topic in natural language processing, especially in multilingual research. In this study, we propose a method that improves toxic and hateful post classification for the Thai language, trained and validated on a dataset of 2,160 posts from the Thai toxicity Twitter corpus. We adopt an ensemble approach that combines XGBoost, multinomial naive Bayes, logistic regression, support vector machine, and random forest classifiers. In summary, the ensemble classifier improved on the previous study on the same dataset, achieving 0.7808 precision, 0.7778 recall, and 0.7721 F1 under weighted averaging, and 0.8235 under binary F1 scoring. |
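
The abstract reports both weighted and binary F1 for a five-classifier ensemble but does not say how the base classifiers are combined. The sketch below is only an illustration under stated assumptions: it assumes hard (majority) voting, uses hypothetical toy labels in place of real classifier outputs, and the helper names (`majority_vote`, `weighted_f1`, `binary_f1`) are invented for this example. It shows how the two F1 variants differ, and the numbers it produces are not results from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by hard (majority) voting.

    Assumption: the paper may use a different combination rule.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def per_class_prf(y_true, y_pred, label):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(per_class_prf(y_true, y_pred, label)[2] * n / total
               for label, n in support.items())


def binary_f1(y_true, y_pred, positive=1):
    """F1 of the positive (here: toxic) class only."""
    return per_class_prf(y_true, y_pred, positive)[2]


# Hypothetical toy data: 1 = toxic, 0 = non-toxic. Each row stands in for
# the predictions of one of the five base classifiers named in the abstract.
y_true = [1, 1, 1, 0, 0, 0]
model_preds = [
    [1, 1, 0, 0, 0, 1],  # stand-in for XGBoost
    [1, 1, 1, 0, 1, 0],  # stand-in for multinomial naive Bayes
    [1, 0, 1, 0, 0, 1],  # stand-in for logistic regression
    [1, 1, 0, 1, 0, 0],  # stand-in for support vector machine
    [0, 1, 1, 0, 0, 1],  # stand-in for random forest
]

ensemble_pred = majority_vote(model_preds)
print("ensemble:", ensemble_pred)
print("weighted F1:", round(weighted_f1(y_true, ensemble_pred), 4))
print("binary F1:", round(binary_f1(y_true, ensemble_pred), 4))
```

Weighted F1 averages the per-class scores by class frequency, while binary F1 looks at the positive class alone, so the two can differ on the same predictions, as they do in the abstract.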