Deteksi SMS Spam Berbahasa Indonesia menggunakan TF-IDF dan Stochastic Gradient Descent Classifier

Indonesian SMS Spam Detection using TF-IDF and Stochastic Gradient Descent Classifier

  • Ramaditia Dwiyansaputra
  • Gibran Satya Nugraha
  • Fitri Bimantoro
  • Arik Aranta
Keywords: Text Classification, SMS Spam, TF-IDF, Stochastic Gradient Descent

Abstract

Short Message Service (SMS) has evolved over the last few decades. Its simplicity makes it attractive as a direct communication service on mobile devices. As its popularity has grown, so have attacks on mobile devices that exploit it, such as SMS spam. Spam SMS are short messages that the recipient does not want, such as advertisements and scams. They can flood the inbox and degrade the experience of using a mobile device. One way to address this problem is to use a machine learning model that automatically recognizes and filters spam SMS. This research aims to build a machine learning model that detects Indonesian-language SMS spam with high accuracy using the TF-IDF method and the Stochastic Gradient Descent Classifier. Based on the test results, the resulting model distinguishes spam from non-spam SMS with an accuracy of 97%.
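The two stages named in the abstract, TF-IDF weighting followed by a linear classifier trained with stochastic gradient descent, can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the six toy messages, whitespace tokenization, the smoothed IDF variant, the logistic loss, the omission of an intercept term, and the learning rate and epoch count. The abstract does not state the authors' preprocessing, loss function, or hyperparameters.

```python
import math
import random
from collections import Counter

# Hypothetical toy messages (1 = spam, 0 = not spam); not the paper's dataset.
TRAIN = [
    ("selamat anda menang hadiah uang tunai", 1),
    ("promo pulsa gratis klik link sekarang", 1),
    ("pinjaman cepat tanpa jaminan hubungi kami", 1),
    ("nanti sore jadi rapat di kantor", 0),
    ("aku sudah sampai di rumah", 0),
    ("tolong belikan makan siang ya", 0),
]


def tokenize(text):
    # Assumption: lowercase + whitespace split; real preprocessing may differ.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(text, idf):
    """Sparse, L2-normalised TF-IDF vector; unknown terms are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def score(weights, vec):
    """Linear decision score: dot product of weights and features."""
    return sum(weights.get(t, 0.0) * v for t, v in vec.items())


def train_sgd(samples, epochs=50, lr=0.5, seed=0):
    """Logistic-loss SGD without an intercept: one update per message."""
    rng = random.Random(seed)
    weights = {}
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for vec, label in samples:
            p = 1.0 / (1.0 + math.exp(-score(weights, vec)))
            for t, v in vec.items():
                # Gradient step on the log loss for this single example.
                weights[t] = weights.get(t, 0.0) + lr * (label - p) * v
    return weights


def predict(text, idf, weights):
    """Return 1 (spam) if the decision score is positive, else 0."""
    return 1 if score(weights, tfidf(text, idf)) > 0 else 0


idf = fit_idf([text for text, _ in TRAIN])
weights = train_sgd([(tfidf(text, idf), label) for text, label in TRAIN])
```

Calling `predict("menang hadiah pulsa gratis", idf, weights)` on this toy model classifies the message as spam, because every known term in it was seen only in spam examples. A result on six hand-written messages says nothing about the 97% accuracy reported above, which depends on the authors' dataset and evaluation protocol. In practice the same two stages are available in libraries such as scikit-learn (`TfidfVectorizer` and `SGDClassifier`); whether the authors used them is not stated in the abstract.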

Published
2021-10-25