Financial distress prediction: a novel data segmentation research on Chinese listed companies

Fang-Jun Zhu; Lu-Juan Zhou; Mi Zhou; Feng Pei


In the Chinese stock market, the unique special treatment (ST) warning mechanism signals financial distress in listed companies. Existing studies have developed classification models that differentiate the two general listing states. However, such models cannot explain the internal changes within each listing state. Considering that the requirements for withdrawing the ST label under this mechanism are relatively loose, we propose a new segmentation approach that divides Chinese listed companies into negative companies and positive companies according to the number of times they have been labeled ST. Under a data mining framework, we use financial indicators, non-financial indicators, and time series to build a financial distress prediction model that distinguishes the long-term development of different Chinese listed companies. Through data segmentation, we find that the negative samples strongly interfere with, and degrade, prediction performance on the total sample. By contrast, the positive companies yield improved prediction accuracy in all aspects, and their optimal feature set also differs from that of all companies. The main contributions of the paper are an analysis of the internal causes of the deterioration of financial distress prediction in time series and the construction of an optimized model for positive companies.
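As a rough illustration of the segmentation idea, the split by ST count might be sketched as follows. The abstract does not state the cutoff or which side of it counts as "positive", so the threshold `MAX_ST_LABELS_FOR_POSITIVE`, the function name `segment_companies`, the direction of the split, and the toy data are all assumptions made for this sketch, not the authors' procedure.

```python
from collections import Counter

# Hypothetical cutoff: the abstract does not give the actual threshold.
MAX_ST_LABELS_FOR_POSITIVE = 1


def segment_companies(st_events, all_companies, max_st=MAX_ST_LABELS_FOR_POSITIVE):
    """Split companies into 'positive' and 'negative' groups by ST count.

    st_events: iterable of company ids, one entry per ST labeling event.
    all_companies: iterable of every company id in the sample.
    Assumption: companies labeled ST at most `max_st` times are 'positive'.
    """
    st_counts = Counter(st_events)  # missing companies count as 0
    segments = {"positive": [], "negative": []}
    for company in all_companies:
        group = "positive" if st_counts[company] <= max_st else "negative"
        segments[group].append(company)
    return segments


# Toy data, made up for illustration only.
companies = ["A", "B", "C", "D"]
st_log = ["B", "C", "C", "C", "D", "D"]
segments = segment_companies(st_log, companies)
print(segments)
```

With a segmentation like this, a prediction model could be trained and evaluated separately on the full sample and on the positive subset, which is the kind of comparison the abstract describes.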

First published online 04 November 2021

Keywords: financial distress prediction, Chinese listed companies, ensemble learning, data mining, data segmentation, special treatment

How to Cite
Zhu, F.-J., Zhou, L.-J., Zhou, M., & Pei, F. (2021). Financial distress prediction: a novel data segmentation research on Chinese listed companies. Technological and Economic Development of Economy, 27(6), 1413-1446.
Published in Issue
Nov 18, 2021

This work is licensed under a Creative Commons Attribution 4.0 International License.

