A COMPARATIVE STUDY OF MACHINE LEARNING AND CLASSICAL STATISTICAL MODELS IN CREDIT RISK PREDICTION
Keywords:
Credit Risk Prediction, Linear Discriminant Analysis, Machine Learning, Model Governance, Default Classification
Abstract
Growing interest in machine learning (ML) has led to the assumption that complex algorithms inherently outperform classical statistical techniques in credit risk prediction. To evaluate this belief, five commonly deployed models (Logistic Regression, Linear Discriminant Analysis (LDA), Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbors) were benchmarked on a structured dataset designed to reflect retail lending conditions. The results show that classical models are not only competitive but often operationally preferable: LDA achieves the strongest balance of recall and AUC, while Random Forest and SVM fail to identify a substantial proportion of defaulting borrowers, producing false-negative rates inconsistent with risk tolerance and regulatory expectations. The findings indicate that algorithmic performance is governed by data geometry rather than by model complexity, and that monotonic feature spaces do not provide the structural advantage ML methods require. These insights support a criterion-based approach to model selection that prioritizes interpretability, governance alignment, and a loss-minimizing error structure.
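The evaluation criteria named in the abstract (recall, false-negative rate, and AUC) can be illustrated with a minimal Python sketch. The labels and scores below are invented for illustration and are not taken from the study's dataset; the function names `recall_and_fnr` and `auc` and the 0.5 decision threshold are likewise assumptions, not the authors' implementation.

```python
def recall_and_fnr(y_true, y_pred):
    """Recall and false-negative rate for the positive (default) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no defaulters")
    recall = tp / (tp + fn)
    # The false-negative rate is the share of defaulters the model misses.
    return recall, 1.0 - recall


def auc(y_true, scores):
    """AUC as the probability that a random defaulter outscores a random
    non-defaulter, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = default, 0 = no default.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

recall, fnr = recall_and_fnr(y_true, y_pred)
model_auc = auc(y_true, scores)
print(f"recall={recall:.3f}  fnr={fnr:.3f}  auc={model_auc:.3f}")
```

In this toy case the ranking quality looks reasonable (AUC of 13/15), yet at the assumed threshold two of the three defaulters are missed. This is the kind of gap between AUC and false-negative rate that motivates judging models on error structure rather than on a single summary score.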
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
















