Comprehensive Analysis of Sentiment Classification Model Performance: A Cross-Metric Evaluation on an Indonesian-Language Film Tweet Dataset

Riadi Marta Dinata; Marhaeni Marhaeni; Kurniawan Atmadja; Elda Rayhana; Veriah Hadi; Ujang Al Kaf

Riadi Marta Dinata (ISTN)

Abstract

Evaluating the performance of text classification models cannot rely solely on accuracy, particularly when the dataset is imbalanced or when the evaluation goal is sensitive to specific types of errors. This study investigates the performance of five classification algorithms (K-Nearest Neighbor, Support Vector Machine, Random Forest, Logistic Regression, and Naive Bayes) on an Indonesian-language film review sentiment dataset. Each model is assessed on four key metrics (accuracy, precision, recall, and F1-score) using a 10-fold holdout strategy to observe consistency and generalization. The results show that SVM outperforms the other algorithms with the highest average accuracy of 85.5%, followed by Naive Bayes (83.0%) and Logistic Regression (82.3%). Although Random Forest shows strong precision (85.6%), its lower recall (65.3%) reflects an imbalance in performance across metrics. Evaluation based on target-specific criteria, including false negative sensitivity and probabilistic distribution analysis, confirms the importance of multi-metric validation. In conclusion, SVM emerges as the most reliable choice for sentiment classification in Bahasa Indonesia, offering the best balance across performance indicators, while Random Forest demonstrates vulnerability under complex data splits.

Keywords: Sentiment Classification, Model Evaluation, Text Mining, Supervised Learning Algorithms, Indonesian Language
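To illustrate why the abstract treats accuracy alone as insufficient, the following minimal sketch computes the four metrics from a small set of hypothetical binary labels. The function names `binary_metrics` and `f1_from` and the toy labels are assumptions made for illustration; they are not the authors' code or data. The last line combines the precision and recall reported for Random Forest into a harmonic mean; this is only an approximation, since an F1-score averaged over splits need not equal the harmonic mean of the averaged precision and recall.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (1 = positive sentiment, 0 = negative); not from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# Approximate F1 implied by the Random Forest precision and recall in the abstract.
rf_f1_estimate = f1_from(0.856, 0.653)
```

In this toy example the classifier reaches 70% accuracy while recovering only half of the positive items, which is the kind of gap between metrics that a multi-metric evaluation is meant to expose.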
