INTEGRATION OF NAIVE BAYES WITH THE SMOTE SAMPLING TECHNIQUE FOR HANDLING IMBALANCED DATA

Nina Sulistiyowati, Mohamad Jajuli

Abstract


Classifying data with imbalanced classes is a major problem in machine learning and data mining. On imbalanced data, almost all classification algorithms achieve much higher accuracy on the majority class than on the minority class. This study applies the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance in credit customer data from the Rawamerta Teachers Cooperative. The research methodology follows SEMMA, whose stages are Sample, Explore, Modify, Model, and Assess. In the Sample stage, credit customer data of the Rawamerta Teachers Cooperative for 2015-2017 were selected, 878 records in total, with the attributes income, total deposits, loan amount, installment duration, services, installments, and credit status. The Explore stage found that the current (performing) credit class is the majority class with 813 records, while the bad (non-performing) credit class is the minority class with 65 records, a clear imbalance between the two classes. The Modify stage applies SMOTE at 500%. The Model stage classifies the data using Naïve Bayes. Naïve Bayes with SMOTE classified 1131 records correctly and 72 incorrectly, while Naïve Bayes without SMOTE classified 818 records correctly and 60 incorrectly.
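The Modify stage and the record counts above can be illustrated with a minimal pure-Python sketch of SMOTE's interpolation step. This is not the authors' implementation: the function name `smote`, the choice of k = 5 nearest neighbours, the random seed, and the six-dimensional placeholder points standing in for the cooperative's records are all assumptions made for illustration. Only the class sizes (813 and 65), the 500% rate, and the classification counts come from the abstract.

```python
import math
import random


def smote(minority, percent=500, k=5, seed=0):
    """Create synthetic minority points (hypothetical helper, not the paper's code).

    For each minority point, percent/100 new points are placed at a random
    position on the segment between it and one of its k nearest minority
    neighbours (Euclidean distance).
    """
    rng = random.Random(seed)
    per_sample = percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # Neighbours are searched among the other minority points only.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment, in [0, 1)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Placeholder data: 65 random points with 6 numeric features (assumed, not real data).
data_rng = random.Random(42)
minority = [tuple(data_rng.random() for _ in range(6)) for _ in range(65)]

synthetic = smote(minority, percent=500)

majority_size = 813                                 # from the abstract
minority_after = len(minority) + len(synthetic)     # 65 + 325 = 390
total_after = majority_size + minority_after        # 813 + 390 = 1203

# Overall accuracy implied by the counts reported in the abstract.
acc_with_smote = 1131 / (1131 + 72)      # about 0.940
acc_without_smote = 818 / (818 + 60)     # about 0.932

print(len(synthetic), minority_after, total_after)
print(round(acc_with_smote, 3), round(acc_without_smote, 3))
```

Under these assumptions, 500% oversampling adds 325 synthetic records to the 65 minority records, giving 390 minority and 1203 total records, which matches the 1131 + 72 records reported for the SMOTE run.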

Keywords: Naïve Bayes, SMOTE, unbalanced data


DOI: https://doi.org/10.25134/nuansa.v14i1.2411

NUANSA INFORMATIKA : JURNAL TEKNOLOGY DAN INFORMASI
p-ISSN :1858-3911 , e-ISSN : 2614-5405
DOI : https://doi.org/10.25134/nuansa
Accreditation : SINTA 5

Organized by Faculty of Computer Science, Universitas Kuningan, Indonesia.
Website : https://journal.uniku.ac.id/index.php/ilkom
Email : [email protected]
Address : Jalan Cut Nyak Dhien No.36A Kuningan, Jawa Barat, Indonesia.

NUANSA INFORMATIKA is licensed under a Creative Commons Attribution 4.0 International License.