Safety risk evaluations of deep foundation construction schemes based on imbalanced data sets

Peisong Gong; Haixiang Guo; Yuanyue Huang; Shengyu Guo


Safety risk evaluation of deep foundation construction schemes is important for ensuring construction safety. However, these evaluations draw on a large body of knowledge, and the historical data from deep foundation engineering are imbalanced. Both factors reduce the quality and efficiency of evaluations carried out with traditional manual tools. Machine learning can help maintain classification quality on imbalanced data. In this study, three strategies are proposed to improve the classification accuracy of imbalanced data sets. First, information redundancy in the data set is reduced using a binary particle swarm optimization algorithm. Then, the classification algorithm is modified by using an AdaBoost-enhanced support vector machine classifier. Finally, a new classification evaluation standard, namely, the area under the ROC curve, is adopted so that the classifier is not biased against the minority class. A transverse comparison experiment using multiple classification algorithms shows that the proposed integrated classification algorithm can overcome the difficulties associated with correctly classifying minority samples in imbalanced data sets. The algorithm can also improve construction safety management evaluations, relieve the pressure caused by the shortage of experienced experts during rapid infrastructure construction, and facilitate knowledge reuse in the field of architecture, engineering, and construction.
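The following minimal sketch illustrates why the area under the ROC curve (AUC) is a fairer evaluation standard than plain accuracy on imbalanced data. It is not the authors' implementation: the toy labels and scores, the 0.5 decision threshold, and the function names `auc` and `accuracy` are assumptions made for illustration, and the example is restricted to two classes. The AUC is computed through its rank-based definition, the probability that a randomly chosen minority sample receives a higher score than a randomly chosen majority sample.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise ranking identity.

    labels: 1 for the minority (positive) class, 0 for the majority class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, predictions):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)


# Hypothetical toy data: 18 majority samples (0) and 2 minority samples (1).
labels = [0] * 18 + [1] * 2

# A degenerate classifier that scores every sample identically and
# therefore assigns everything to the majority class.
constant_scores = [0.1] * 20
constant_preds = [0] * 20

# A classifier that ranks both minority samples above most majority ones.
ranking_scores = [0.1] * 16 + [0.6, 0.7] + [0.65, 0.9]
ranking_preds = [int(s >= 0.5) for s in ranking_scores]  # assumed threshold

print("constant:", accuracy(labels, constant_preds), auc(labels, constant_scores))
print("ranking: ", accuracy(labels, ranking_preds), auc(labels, ranking_scores))
```

On this toy data both classifiers reach the same accuracy, yet only the second one separates the minority samples from the majority, and only the AUC reflects that difference.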

Keywords: safety risk evaluation, construction scheme, deep foundation, imbalanced data set, ensemble learning algorithm, machine learning

How to Cite
Gong, P., Guo, H., Huang, Y., & Guo, S. (2020). Safety risk evaluations of deep foundation construction schemes based on imbalanced data sets. Journal of Civil Engineering and Management, 26(4), 380-395.
Published in Issue
Apr 20, 2020

This work is licensed under a Creative Commons Attribution 4.0 International License.

