Deep Learning based UPoS Tagger for Assamese Religious Text

Kuwali Talukdar; Shikhar Kumar Sarma; Farha Naznin; Ratul Deka

doi:10.61707/nn1dfz44

Authors

Kuwali Talukdar Department of Information Technology, Gauhati University, Guwahati, Assam, India
Shikhar Kumar Sarma Department of Information Technology, Gauhati University, Guwahati, Assam, India
Farha Naznin Department of Information Technology, Gauhati University, Guwahati, Assam, India
Ratul Deka Department of Information Technology, Gauhati University, Guwahati, Assam, India

Keywords:

Universal Parts of Speech (UPoS), GRU, RNN, BiLSTM

Abstract

Religious texts follow distinctive writing patterns and styles and draw on specialized vocabulary. As a result, they carry characteristic semantic and syntactic features that demand special attention in Natural Language Processing (NLP) tasks. This paper applies Deep Learning (DL) techniques to Parts of Speech (PoS) tagging of Assamese religious texts. We created a specialized dataset of approximately 11,000 sentences drawn from several sources, including web crawling and filtering of religious texts from existing corpora. The dataset was manually validated by linguists to check accuracy, identify errors, and make the required corrections. A performance matrix was constructed to analyze the initial tagging produced by a pre-existing DL-based Assamese Universal Parts of Speech (UPoS) tagger. We then used a subset of the dataset for manual evaluation, and the validated dataset served as the gold-standard training set for further DL models based on GRU, RNN, and Bidirectional LSTM (BiLSTM) architectures. Training accuracies were recorded and are presented, demonstrating the effectiveness of the proposed approach. Accuracy, precision, and recall were recorded for all three models, and F1 scores were also calculated. Training and testing accuracies are compared in performance graphs.
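The abstract reports accuracy, precision, recall, and F1 for each tagger. As a rough illustration of how such per-tag scores can be computed from a gold-standard tag sequence and a predicted one, here is a minimal Python sketch. The function names (`per_tag_scores`, `accuracy`) and the toy tag sequences are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_tag_scores(gold, pred):
    """Return {tag: (precision, recall, f1)} for two equal-length tag lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for tag g
        else:
            fp[p] += 1          # tag p was predicted but is wrong here
            fn[g] += 1          # gold tag g was missed here
    scores = {}
    for tag in set(gold) | set(pred):
        p_den = tp[tag] + fp[tag]
        r_den = tp[tag] + fn[tag]
        precision = tp[tag] / p_den if p_den else 0.0
        recall = tp[tag] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[tag] = (precision, recall, f1)
    return scores


# Toy UPoS tag sequences, made up for illustration only.
gold = ["NOUN", "VERB", "NOUN", "ADJ", "PUNCT"]
pred = ["NOUN", "VERB", "ADJ", "ADJ", "PUNCT"]

scores = per_tag_scores(gold, pred)
for tag in sorted(scores):
    p, r, f = scores[tag]
    print(f"{tag:6s} P={p:.2f} R={r:.2f} F1={f:.2f}")
print(f"accuracy={accuracy(gold, pred):.2f}")
```

In this toy case one NOUN token is mistagged as ADJ, which lowers recall for NOUN and precision for ADJ while leaving the other tags unaffected. This is one reason per-tag scores can be more informative than overall accuracy alone.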
