Sentiment Analysis of Affiliate Video Comments on the TikTok App

Authors

  • Fathimah Nurul Azizah, Universitas Sebelas Maret, Surakarta, Indonesia
  • Dewi Retno Sari Saputro, Universitas Sebelas Maret, Surakarta, Indonesia
  • Sutanto, Universitas Sebelas Maret, Surakarta, Indonesia

Keywords:

Sentiment Analysis, TikTok Affiliate, IndoBERTweet, SMOTE, Random Forest

Abstract

Indonesia has one of the largest TikTok user bases in the world and the highest number of TikTok Shop stores worldwide. However, participation in the affiliate program remains low in Indonesia at only 3% of total users, far below the 17.6% recorded in the United States. User comments on TikTok affiliate videos are a valuable data source for assessing consumer responses to promotional content, yet their large volume and unstructured nature make manual analysis inefficient. This study aimed to develop a sentiment classification model for TikTok affiliate video comments using the Random Forest algorithm, with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. A dataset of 5,726 comments was manually collected from 50 TikTok affiliate videos in the beauty and personal care category. After preprocessing, sentiment labels were generated automatically using IndoBERTweet as a pseudo-labeling approach, resulting in 1,534 negative and 962 positive comments used for binary classification. Term Frequency-Inverse Document Frequency (TF-IDF) was applied to transform the text into numerical features, and SMOTE was used to balance the class distribution. Model optimization using GridSearchCV with 10-fold cross-validation yielded the best Random Forest configuration, with a cross-validation score of 0.8711. The results show that Random Forest combined with SMOTE achieved an accuracy of 82% and a macro-average F1-score of 0.81 in classifying Indonesian-language TikTok affiliate comments.
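The abstract does not give implementation details, so the following is only a minimal, standard-library sketch of the two feature-side steps it names: TF-IDF weighting and SMOTE oversampling of the minority class. The toy comments, the function names (`tfidf`, `to_dense`, `smote`), whitespace tokenization, the smoothed IDF formula, and `k=1` are all illustrative assumptions, not the authors' code. The `smote` function also picks base samples at random, which simplifies the original algorithm.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document: raw term count times smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def to_dense(weights, vocab):
    """Turn a {term: weight} dict into a fixed-length vector over vocab."""
    return [weights.get(term, 0.0) for term in vocab]


def smote(minority, n_new, k=1, seed=0):
    """Create n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [x for j, x in enumerate(minority) if j != i]
        # Nearest neighbours by Euclidean distance in TF-IDF space.
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy comments (not from the study's dataset); negative is the
# majority class here, mirroring the class direction reported in the abstract.
comments = [
    "produk jelek kecewa",
    "kecewa banget rusak",
    "jelek rusak parah",
    "pengiriman lama kecewa",
    "bagus banget suka",
    "suka produk bagus",
]
labels = ["negative"] * 4 + ["positive"] * 2

weights = tfidf(comments)
vocab = sorted({term for w in weights for term in w})
X = [to_dense(w, vocab) for w in weights]

# Oversample the minority class until both classes have the same size.
minority = [x for x, y in zip(X, labels) if y == "positive"]
n_missing = labels.count("negative") - labels.count("positive")
synthetic = smote(minority, n_new=n_missing, k=1)

X_balanced = X + synthetic
y_balanced = labels + ["positive"] * len(synthetic)
print(Counter(y_balanced))
```

In practice these steps would more likely rely on library implementations such as scikit-learn's `TfidfVectorizer`, `RandomForestClassifier`, and `GridSearchCV`, together with `SMOTE` from imbalanced-learn. Oversampling is typically applied to the training portion only, so that synthetic samples do not leak into the evaluation data.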

Published

2026-05-07

How to Cite

Sentiment Analysis of Affiliate Video Comments on the TikTok App. (2026). Proceeding International Conference on Multidisciplinary Engagement, 1(1), 310-318. https://prosiding.gerakanedukasi.com/index.php/income/article/view/99
