Forecasting aircraft miles flown time series using a deep learning-based hybrid approach
Neural network-based methods, such as deep neural networks, have proven effective across a wide range of applications. This paper proposes a deep learning-based hybrid approach to forecasting the yearly revenue passenger kilometers time series of Australia’s major domestic airlines. The essence of the approach is to use a resilient error backpropagation algorithm with dropout to “tune” the polynomial neural network produced by a multi-layered GMDH (group method of data handling) algorithm. The performance of the proposed algorithm on this time series is compared with that of other popular forecasting methods: a deep belief network, the multi-layered GMDH algorithm, the Box-Jenkins method, and the ANFIS model. The lowest mean absolute error (MAE) achieved by the proposed algorithm was approximately 25% lower than the lowest MAE of the next best method, GMDH, which suggests that the algorithm can give good results in practice compared with other well-known methods.
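The tuning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single GMDH-style neuron with a quadratic polynomial of two lagged inputs, a sign-based resilient backpropagation update (the iRprop- variant), and inverted dropout applied to the non-bias polynomial terms. The function names, hyperparameters, all-zero starting weights, and the synthetic series are hypothetical and chosen only for illustration; in the hybrid approach the starting weights would come from the GMDH fit.

```python
import math
import random


def poly_terms(x1, x2):
    """Terms of the quadratic polynomial computed by one GMDH-style neuron."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def predict(w, x1, x2):
    return sum(wi * ti for wi, ti in zip(w, poly_terms(x1, x2)))


def mae(w, samples):
    """Mean absolute error of the neuron over (x1, x2, y) samples, no dropout."""
    return sum(abs(predict(w, x1, x2) - y) for x1, x2, y in samples) / len(samples)


def rprop_tune(w, samples, epochs=300, drop_p=0.2, seed=0,
               eta_plus=1.2, eta_minus=0.5, step_init=0.1,
               step_min=1e-6, step_max=0.5):
    """Tune weights with full-batch iRprop- on the MSE; hyperparameters are assumptions."""
    rng = random.Random(seed)
    w = list(w)
    keep = 1.0 - drop_p
    steps = [step_init] * len(w)
    prev_grad = [0.0] * len(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x1, x2, y in samples:
            terms = poly_terms(x1, x2)
            # Inverted dropout on the non-bias terms (assumption: bias is always kept).
            for j in range(1, len(terms)):
                terms[j] = terms[j] / keep if rng.random() < keep else 0.0
            err = sum(wi * ti for wi, ti in zip(w, terms)) - y
            for j, t in enumerate(terms):
                grad[j] += 2.0 * err * t / len(samples)
        for j in range(len(w)):
            agreement = grad[j] * prev_grad[j]
            if agreement > 0:
                # Same sign as last epoch: grow the per-weight step.
                steps[j] = min(steps[j] * eta_plus, step_max)
            elif agreement < 0:
                # Sign flipped: shrink the step and skip this weight's update.
                steps[j] = max(steps[j] * eta_minus, step_min)
                grad[j] = 0.0
            # Only the sign of the gradient is used, never its magnitude.
            if grad[j] > 0:
                w[j] -= steps[j]
            elif grad[j] < 0:
                w[j] += steps[j]
            prev_grad[j] = grad[j]
    return w


# Synthetic, made-up series scaled to roughly [0, 1]; this is NOT the airline data.
series = [t / 30.0 + 0.05 * math.sin(t) for t in range(30)]
# Two lagged values as inputs, the next value as the target.
samples = [(series[t - 2], series[t - 1], series[t]) for t in range(2, len(series))]

initial = [0.0] * 6  # deliberately poor placeholder start, standing in for a GMDH fit
tuned = rprop_tune(initial, samples)
print("MAE before tuning:", mae(initial, samples))
print("MAE after tuning: ", mae(tuned, samples))
```

Because the update uses only the sign of each partial derivative, the step sizes adapt per weight and do not depend on the scale of the error surface. Setting `drop_p=0.0` turns dropout off, which makes it easy to compare the regularized and unregularized fits on the same data.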
This work is licensed under a Creative Commons Attribution 4.0 International License.