Comparison of the SVM, Random Forest, and XGBoost Algorithms for Determining Credit Application Approval

Mohammad Rizal Givari; Mochammad Riszky Sulaeman; Yuyun Umaidah

doi:10.25134/nuansa.v16i1.5406

Mohammad Rizal Givari Universitas Singaperbangsa Karawang
Mochammad Riszky Sulaeman
Yuyun Umaidah

Abstract

Credit is a funding option for most economic activities. Demand for credit is growing rapidly in line with the community's increasing financial needs, especially in developing countries such as Indonesia. Credit analysis is needed to ensure that lending is appropriate and safe: it assesses the feasibility of a credit application and thereby establishes the creditworthiness of the applicant. This study follows the CRISP-DM methodology, which consists of six stages (Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment), and applies classification by comparing the SVM, Random Forest, and XGBoost algorithms. The research uses an open-source dataset obtained from Kaggle. Of the three models, XGBoost achieved the highest accuracy, recall, and precision, with 82% accuracy, 70% recall, and 92% precision.
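The three reported metrics can be illustrated with a minimal sketch that derives them from confusion-matrix counts. The function name `classification_metrics`, the encoding of the positive class as `1`, and the labels below are assumptions made for illustration; they are not the study's code or data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, recall, and precision for binary labels.

    Assumption: 1 marks the positive class and 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of actual positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of predicted positives that are actually positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical labels for illustration only; not taken from the Kaggle dataset.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

A model whose precision is well above its recall, as reported for XGBoost here, makes few false-positive predictions but misses a larger share of the actual positive cases.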
