Multinomial Naïve Bayes for Classifying Online Articles about Earthquakes in Indonesia
Abstract
Earthquakes occur frequently in Indonesia, and as a result a large volume of earthquake-related news is published. Online articles are a popular channel for this news. Online articles about earthquakes are often classified into the Economic, Health, and Tourism categories, and text classification can help automate this categorization. This paper examines how multinomial naïve Bayes performs in categorizing online articles about earthquakes in Indonesia. TF-IDF was used to weight each feature. Tests were run with unigram features, bigram features, and the combination of both. Tests were also run with stemming and stopword removal omitted from preprocessing. The highest F-measure obtained under 5-fold cross-validation was 95.20%, from the scenario that combined unigram and bigram features and included stemming and stopword removal in preprocessing.
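The pipeline summarized in the abstract (unigram and bigram features, TF-IDF weighting, multinomial naïve Bayes) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The example sentences are invented for illustration, the smoothed IDF formula and Laplace smoothing (alpha = 1) are assumptions, and stemming, stopword removal, and 5-fold cross-validation are omitted for brevity. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def extract_features(text, use_unigram=True, use_bigram=True):
    """Lowercase, split on whitespace, and return unigram and/or bigram features."""
    tokens = text.lower().split()
    feats = []
    if use_unigram:
        feats.extend(tokens)
    if use_bigram:
        feats.extend(f"{a} {b}" for a, b in zip(tokens, tokens[1:]))
    return feats

def fit_idf(docs):
    """Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(f for feats in docs for f in set(feats))
    return {f: math.log((1 + n) / (1 + d)) + 1.0 for f, d in df.items()}

def tfidf(feats, idf):
    """Raw term frequency times IDF; features unseen in training are dropped."""
    return {f: c * idf[f] for f, c in Counter(feats).items() if f in idf}

def train_nb(vectors, labels, vocab_size, alpha=1.0):
    """Accumulate TF-IDF weight per class and store log priors and denominators."""
    totals = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for f, w in vec.items():
            totals[y][f] += w
    model = {}
    for y, n_docs in Counter(labels).items():
        denom = sum(totals[y].values()) + alpha * vocab_size
        model[y] = (math.log(n_docs / len(labels)), dict(totals[y]), denom)
    return model

def predict(vec, model, alpha=1.0):
    """Return the class with the highest log prior plus weighted log likelihood."""
    def score(y):
        log_prior, weights, denom = model[y]
        return log_prior + sum(
            w * math.log((weights.get(f, 0.0) + alpha) / denom)
            for f, w in vec.items())
    return max(model, key=score)

# Invented toy data, for illustration only (not from the paper's dataset).
train_texts = [
    ("earthquake damages market and trade losses rise", "Economic"),
    ("quake hits economy as trade and market stall", "Economic"),
    ("hospital treats injured earthquake victims", "Health"),
    ("doctors warn of disease among quake victims in hospital", "Health"),
    ("tourist arrivals drop after earthquake closes beach resort", "Tourism"),
    ("hotel bookings and tourist visits fall after quake", "Tourism"),
]
train_feats = [extract_features(t) for t, _ in train_texts]
labels = [y for _, y in train_texts]
idf = fit_idf(train_feats)
model = train_nb([tfidf(f, idf) for f in train_feats], labels, len(idf))

def classify(text):
    """Classify a new text using combined unigram and bigram features."""
    return predict(tfidf(extract_features(text), idf), model)

print(classify("hospital doctors treat victims"))
```

Passing `use_bigram=False` or `use_unigram=False` to `extract_features` would reproduce the unigram-only and bigram-only scenarios on the same toy data.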