Deep Learning based UPoS Tagger for Assamese Religious Text

Authors

  • Kuwali Talukdar, Department of Information Technology, Gauhati University, Guwahati, Assam, India
  • Shikhar Kumar Sarma, Department of Information Technology, Gauhati University, Guwahati, Assam, India
  • Farha Naznin, Department of Information Technology, Gauhati University, Guwahati, Assam, India
  • Ratul Deka, Department of Information Technology, Gauhati University, Guwahati, Assam, India

DOI:

https://doi.org/10.61707/nn1dfz44

Keywords:

Universal Parts of Speech (UPoS), GRU, RNN, BiLSTM

Abstract

Religious texts follow distinctive patterns and styles of writing and draw on specialized vocabularies. They therefore carry characteristic semantic and syntactic features that demand special attention in Natural Language Processing (NLP) tasks. This paper applies Deep Learning (DL) techniques to Parts of Speech (PoS) tagging of religious texts in the Assamese language. We created a specialized dataset of approximately 11,000 sentences drawn from several sources, including web crawling and filtering of religious texts from existing corpora. Linguists manually validated the dataset to check its accuracy and to identify errors and the corrections required. A performance matrix was constructed to analyze the initial tagging produced by a pre-existing DL-based Assamese Universal Parts of Speech (UPoS) tagger. We then used a subset of the dataset for manual evaluation and treated the validated dataset as a gold-standard training set for other DL models built on GRU, RNN, and Bidirectional LSTM (BiLSTM) architectures. Training accuracies were recorded and are presented, demonstrating the effectiveness of the proposed approach. Accuracy, Precision, and Recall were recorded for all three models, and F1 scores were calculated. Training and testing accuracies are compared in performance graphs.
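The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes token-level comparison of predicted tags against gold-standard tags and macro-averaging over tags; the abstract does not state which averaging scheme the authors used, and the function name `tagging_metrics` and the sample tags are hypothetical.

```python
from collections import Counter


def tagging_metrics(gold, predicted):
    """Return token-level accuracy, macro (precision, recall, F1), and per-tag scores.

    Assumption: macro-averaging over every tag seen in gold or predicted.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct tag
        else:
            fp[p] += 1      # predicted tag was wrong
            fn[g] += 1      # gold tag was missed

    per_tag = {}
    for tag in sorted(set(gold) | set(predicted)):
        pred_total = tp[tag] + fp[tag]
        gold_total = tp[tag] + fn[tag]
        prec = tp[tag] / pred_total if pred_total else 0.0
        rec = tp[tag] / gold_total if gold_total else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_tag[tag] = (prec, rec, f1)

    n_tags = len(per_tag)
    macro = tuple(sum(scores[i] for scores in per_tag.values()) / n_tags
                  for i in range(3))
    accuracy = sum(tp.values()) / len(gold)
    return accuracy, macro, per_tag


# Hypothetical four-token example with UPoS-style tags.
gold_tags = ["NOUN", "VERB", "NOUN", "ADJ"]
predicted_tags = ["NOUN", "NOUN", "NOUN", "ADJ"]
accuracy, macro, per_tag = tagging_metrics(gold_tags, predicted_tags)
```

In this toy example three of the four tokens are tagged correctly, so the accuracy is 0.75, while the missed VERB token lowers the macro-averaged scores more sharply than the accuracy.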

Published

2024-03-27

Section

Articles

How to Cite

Deep Learning based UPoS Tagger for Assamese Religious Text. (2024). International Journal of Religion, 5(4), 163-170. https://doi.org/10.61707/nn1dfz44
