A COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS FOR SINDHI NEWS CLASSIFICATION

https://doi.org/10.5281/zenodo.19984114

Authors

  • Muhammad Ameen Chhajro*
  • Abdul Qayoom
  • Najma Imtiaz Ali
  • Seema Sultana Bhurgri
  • Zubair Uddin
  • Aadil Jamali

Abstract

Sindhi is one of the oldest languages of Asia, yet it still lacks computational resources and structured datasets for natural language processing tasks. This study presents a comparative analysis of machine learning models for Sindhi news headline classification. A dataset of 87,553 Sindhi news headlines was collected from online newspaper websites and blogs and covers two categories: technology news and scientific-discovery news. Text pre-processing is applied to remove less meaningful content from the headlines, and feature extraction then converts the cleaned text into a numerical representation. Six classifiers were trained on the dataset: Logistic Regression, Linear SVM, Multinomial Naïve Bayes, Radial Basis Function Support Vector Machine (RBF SVM), Random Forest, and Bernoulli Naïve Bayes. Each model was evaluated on accuracy, precision, recall, and F1-score. The experimental results show that the RBF SVM achieved an accuracy of 98.91%, with a precision of 0.99 and an F1-score of 0.98, on the Sindhi news classification task.

Keywords: NLP, Text Classification, Machine Learning Algorithms, Low-resource language, Random Forest, Linear SVM, RBF SVM, Logistic Regression, Bernoulli Naïve Bayes, Sindhi language.
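
The workflow summarized in the abstract (tokenize, extract count features, train a classifier, score it) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, whitespace tokenization, and a hand-written Multinomial Naïve Bayes with Laplace smoothing. The English toy headlines, the label names, and all function names are hypothetical placeholders rather than data or code from the study.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in data (assumption): NOT taken from the Sindhi dataset.
TRAIN_TEXTS = [
    "new phone chip released",
    "software update improves app",
    "scientists discover new planet",
    "researchers discover ancient fossil",
]
TRAIN_LABELS = ["technology", "technology", "science", "science"]


def tokenize(text):
    # Whitespace split only; real Sindhi pre-processing would also handle
    # punctuation, diacritics and stop words (not shown here).
    return text.split()


def train_multinomial_nb(texts, labels, alpha=1.0):
    """Fit Multinomial Naive Bayes on bag-of-words counts."""
    word_counts = defaultdict(Counter)  # label -> token -> count
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {
        label: math.log(n / len(labels)) for label, n in Counter(labels).items()
    }
    totals = {label: sum(c.values()) for label, c in word_counts.items()}
    return {"alpha": alpha, "vocab": vocab, "log_prior": log_prior,
            "counts": word_counts, "totals": totals}


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    alpha, vocab = model["alpha"], model["vocab"]
    best_label, best_score = None, -math.inf
    for label, score in model["log_prior"].items():
        denom = model["totals"][label] + alpha * len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((model["counts"][label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


model = train_multinomial_nb(TRAIN_TEXTS, TRAIN_LABELS)
test_texts = ["new app released", "researchers discover fossil"]
test_labels = ["technology", "science"]
predictions = [predict(model, t) for t in test_texts]
print(binary_metrics(test_labels, predictions, positive="technology"))
```

A study of this kind would more likely rely on a library such as scikit-learn for feature extraction and for the SVM, Random Forest, and Logistic Regression models; the sketch only makes the count-based features and the four reported metrics concrete.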

 


Published

2026-05-02

How to Cite

Muhammad Ameen Chhajro, Abdul Qayoom, Najma Imtiaz Ali, Seema Sultana Bhurgri, Zubair Uddin, & Aadil Jamali. (2026). A COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS FOR SINDHI NEWS CLASSIFICATION. Policy Journal of Social Science Review, 4(5), 27–38. https://doi.org/10.5281/zenodo.19984114. Retrieved from https://policyjssr.com/index.php/PJSSR/article/view/925