COMPARATIVE ANALYSIS OF MACHINE LEARNING ALGORITHMS FOR CUSTOMER CHURN PREDICTION IN BANKING
Abstract
Acquiring new customers is significantly more expensive than retaining existing ones, which makes customer churn prediction a critical challenge for organizations. This paper reviews machine learning techniques applied to churn prediction in studies from the past five years, ranging from traditional logistic regression to advanced gradient boosting ensembles and sequence-aware deep learning approaches. It evaluates how effectively these algorithms predict customer churn and support customer retention strategies.
The analysis compares model performance on widely used benchmark datasets, including IBM Telco, SyriaTel, and the UCI KDD Orange corpus, using evaluation metrics such as Accuracy, AUC-ROC, and F1-score. The paper also investigates techniques for handling class imbalance, particularly the Synthetic Minority Over-sampling Technique (SMOTE), which plays an important role in improving predictive performance on skewed datasets.
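As a rough illustration of the oversampling idea behind SMOTE, the sketch below creates synthetic minority-class (churner) samples by interpolating between an existing minority sample and one of its nearest minority neighbours. It is a minimal standard-library sketch, not the implementation used in any of the reviewed studies; the function name `smote`, its parameters, and the toy `churners` data are assumptions made for illustration only.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration: each synthetic point lies on the
    line segment between a randomly chosen minority sample and one of its
    k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: four churners described by two numeric features.
churners = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote(churners, n_new=6)
```

Because every synthetic point is a convex combination of two real churners, it stays inside the region already occupied by the minority class. In practice, oversampling of this kind is usually applied to the training split only, so that evaluation data contains no synthetic samples.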
A review of more than 30 research studies reveals that gradient boosting algorithms, including XGBoost, LightGBM, and CatBoost, consistently outperform conventional machine learning methods for churn prediction. Among these, CatBoost attains the highest AUC reported on the benchmark datasets, 0.982. The paper further discusses the role of Explainable Artificial Intelligence (XAI) techniques such as SHAP and LIME in improving model interpretability, enhancing stakeholder trust, and supporting regulatory compliance in AI-driven decision-making.
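To make the evaluation metrics named above concrete, the following sketch computes AUC-ROC and F1-score from scratch on a toy set of predictions. AUC-ROC is calculated here as the fraction of (churner, non-churner) pairs in which the churner receives the higher score, with ties counted as half. The function names and the toy labels and scores are assumptions for illustration and are not taken from the reviewed studies.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels, preds):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy example: 1 = churned, 0 = retained; scores are predicted churn probabilities.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

auc = auc_roc(labels, scores)  # 8 of 9 pairs ranked correctly
f1 = f1_score(labels, preds)   # TP=3, FP=1, FN=0
```

Read this way, an AUC of 0.982 means that a randomly chosen churner is scored above a randomly chosen non-churner in roughly 98% of such pairs. Unlike F1-score, AUC-ROC does not depend on a classification threshold.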
Authors
Raj Shekhar Singh, Rahul Raj, Karan Raj, Garima, Ankesh Jha
Institution
Noida Institute of Engineering & Technology (MCA Institute), Greater Noida, India
