mySentence: Sentence Segmentation for Myanmar Language using Neural Machine Translation Approach
DOI
Creator Thura Aung
Title mySentence: Sentence Segmentation for Myanmar Language using Neural Machine Translation Approach
Contributor Ye Kyaw Thu, Zar Zar Hlaing
Publisher Sirindhorn International Institute of Technology, Bangkadi Campus (SIIT-BKD)
Publication Year 2566 BE (2023 CE)
Journal Title Journal of Intelligent Informatics and Smart Technology 
Journal Vol. 9
Page no. 1-9
Keyword Sentence segmentation, Neural machine translation, Sequence Tagging
URL Website https://ph05.tci-thaijo.org/index.php/JIIST
Website title Journal of Intelligent Informatics and Smart Technology 
ISSN 2586-9167
Abstract A sentence is an independent unit: a string of complete words that carries the valuable information in a text. In informal Myanmar language, the register in which most NLP applications such as Automatic Speech Recognition (ASR) are used, there is no predefined rule for marking the end of a sentence. In this paper, we contribute the first corpus for Myanmar sentence segmentation and present the first systematic study of the task, using machine-learning-based sequence tagging as the baseline and comparing it with a Neural Machine Translation (NMT) approach. Before conducting the experiments, we prepared two types of data: one containing only sentences and the other containing both sentences and paragraphs. We trained each model on both types of data and evaluated it on both types of test data. Performance was measured with Bilingual Evaluation Understudy (BLEU) and character n-gram F-score (chrF++) scores, and Word Error Rate (WER) was used for detailed error analysis. The experimental results show that the sequence-to-sequence NMT approach trained on both sentence-level and paragraph-level data achieved the best BLEU score (99.78) and improved chrF++ scores by +18.4 and +16.7 over the best results of the machine learning baselines on the two test sets.
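The abstract names Word Error Rate (WER) as the metric used for error analysis. As a minimal sketch of how that metric is conventionally defined (word-level edit distance divided by the number of reference tokens), the Python below may help; it is not the authors' evaluation script, the function name word_error_rate is hypothetical, and the B/N/E tag strings in the usage note are illustrative placeholders rather than data or labels taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only; tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # aligning i reference tokens with an empty hypothesis
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing a made-up reference tag string "B N N E B N E" with the output "B N N N B N E" gives one substitution over seven reference tokens, so word_error_rate returns 1/7. A missed sentence boundary shows up as exactly this kind of single-token error, which is why a token-level error rate suits fine-grained analysis of segmentation output.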
