A cost-sensitive logistic regression credit scoring model based on multi-objective optimization approach

Feng Shen; Run Wang; Yu Shen


Credit scoring is an important process for peer-to-peer (P2P) lending companies because it determines whether loan applicants are likely to default. Most credit scoring models aim to minimize the classification error rate, which implicitly assumes that all classification errors bear the same cost; in reality, misclassification costs in credit scoring differ markedly by error type. This paper therefore proposes a new cost-sensitive logistic regression credit scoring model based on a multi-objective optimization approach, in which the cost-sensitive logistic regression process has two objectives. The cost-sensitive logistic regression parameters are solved using a multi-objective particle swarm optimization (MOPSO) algorithm. In the empirical analysis, the proposed model was applied to the credit scoring of a well-known Chinese P2P company. Compared with other common credit scoring models, the proposed model effectively reduced type II error rates and total classification error costs, and improved the AUC, the F1 value (the harmonic mean of Recall and Precision), and the G-mean. MOPSO was also compared with other multi-objective optimization algorithms, and the comparison further indicated that MOPSO is the best of these approaches for cost-sensitive logistic regression credit scoring models.
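The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the positive class (label 1) denotes default, that a type II error is a defaulter classified as non-default (a false negative), and that the total cost is a weighted sum of the two error counts. The function names and the cost weights `cost_fn` and `cost_fp` are hypothetical and chosen for illustration only.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), assuming 1 = default (positive), 0 = non-default."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def credit_metrics(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Compute type II error rate, F1, G-mean and total misclassification cost.

    cost_fn and cost_fp are illustrative weights, not values from the paper.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        # Share of actual defaulters that were predicted as non-default.
        "type_ii_error_rate": fn / (tp + fn) if (tp + fn) else 0.0,
        "f1": f1,
        # G-mean is the geometric mean of recall and specificity.
        "g_mean": math.sqrt(recall * specificity),
        # Weighted count of the two error types.
        "total_cost": cost_fn * fn + cost_fp * fp,
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(credit_metrics(y_true, y_pred))
```

Raising `cost_fn` relative to `cost_fp` penalizes missed defaulters more heavily, which is the asymmetry a cost-sensitive model is meant to capture. The minimized error rate mentioned above corresponds to setting both weights equal.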

First published online 27 November 2019

Keywords: credit scoring, cost-sensitive, logistic regression, multi-objective optimization, P2P

How to Cite
Shen, F., Wang, R., & Shen, Y. (2020). A cost-sensitive logistic regression credit scoring model based on multi-objective optimization approach. Technological and Economic Development of Economy, 26(2), 405-429.
Published in Issue
Feb 3, 2020
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

