ARTIFICIAL NEURAL NETWORK APPLIED IN FORECASTING THE COMPOSITION OF MUNICIPAL SOLID WASTE IN IASI, ROMANIA

A neural network time series (NNTS) tool was used to predict municipal solid waste composition in Iasi, Romania. Two models included in this tool were selected: the nonlinear input-output (NIO) time series model and the nonlinear autoregressive model with external (exogenous) input (NARX). The coefficient of determination (R²) and the root mean square error (RMSE) were chosen for evaluation. With NIO, the optimum model was a 4-11-6 artificial neural network (ANN), with R² = 0.929 for testing, and 0.849 and 0.885 for validation and all data, respectively. With NARX, the suitable model was a 4-13-6 ANN, with R² = 0.999 for training, 0.879 for testing, 0.931 for validation and 0.944 for all data. For this model, which had 4 inputs, 13 hidden neurons and 6 outputs, the resulting RMSE was zero for training and 0.0109 for validation. The four input variables were: number of residents, population aged 15–59 years, urban life expectancy, and total municipal solid waste (ton/year). The suitable ANN model was the one with the lowest root mean square error and the highest coefficient of determination. The results indicate that the NNTS tool is a complex instrument that can nevertheless be applied easily, and that NARX is more accurate than the NIO model.


Introduction
There are many significant factors which influence the planning and development of sustainable solid waste management systems, and one of them is solid waste generation (Henriques & Coelho, 2019; Li et al., 2016; Ortiz-Rodriguez et al., 2018; Simion et al., 2012). Prediction of the solid waste (SW) quantities that will be generated is a challenge for all decision makers involved in waste management (Ghinea & Gavrilescu, 2010; Younes et al., 2015). Different tools were [...] et al., 2018; Noori et al., 2009, 2010a; Suliman & Darus, 2019; Wu et al., 2019). An ANN model was proposed by Antanasijević et al. (2013) for the prediction of waste generated in 44 countries over the period 2000 to 2012, considering factors such as country size, number of inhabitants, and economic and social aspects. To determine the solid waste amounts that will be generated, Younes et al. (2015) considered input variables such as (Table 1) the number of inhabitants, the number of employed and unemployed persons, gross domestic product (GDP) and energy demand, and used the nonlinear autoregressive model with external (exogenous) input (NARX) for prediction. In another study, the authors used data from 2004 to 2009 and applied ANN and a response surface model for solid waste generation forecasting (Shamshiry et al., 2014). The input variables considered by Shamshiry et al. (2014) were: fuel consumption, 4-ton truck, 10-ton truck, number of trips made by the trucks to the landfill, number of times the personnel entered the landfill, number of tourists, and salary per worker per day.
ANN was also applied for waste prediction in Serbia by Batinic et al. (2011). The input variables were municipality income, employment/unemployment, age, and educational and social aspects, while the output variables were six different waste fractions (Table 1). The suitable ANN model in this case was 4-10-6: 4 inputs, 10 hidden neurons and 6 outputs (Batinic et al., 2011). ANN modelling was also chosen by Kumar et al. (2011) to predict waste in one city in India for the period 2010-2026, considering inputs such as population number, percentage of urban population, municipal solid waste generated and GDP. Prediction of solid waste from 26 European countries was investigated by Antanasijević et al. (2013) by developing ANN models. Zade and Noori (2008) also used ANN to predict waste generation weight during different seasons over three years. After investigating 12 model structures, they found that structures with 10 and 16 neurons in the hidden layer (13-10-1 and 13-16-1) were suitable. From these, they selected the 13-16-1 structure as the most appropriate for waste prediction, based on mean absolute error, mean absolute relative error, root mean square error, correlation coefficient and threshold statistics.
ANN models use previous data for prediction, which makes them suitable for forecasting. They have the ability to map inputs to outputs (Ali & Ahmad, 2019). Nonlinear relationships can be described and time series simulated with the NARX model, an ANN prediction model used in many different studies (for prediction of waste generation, solar radiation, gaseous emissions from wastewater plants, electricity prices, etc.) (Boussaada et al., 2018; Marcjasz et al., 2019; Vu et al., 2019; Zounemat-Kermani et al., 2019). According to Boussaada et al. (2018) and Yu et al. (2019), the NARX neural network is a good predictor of time series, widely used and well-fitting. The novelty of our work consists in the way the ANN methodology is applied to forecast solid waste composition, and especially in the comparison of the results obtained using two ANN models (NARX and NIO). Our study provides information on the application of ANN for the selection of social and demographic indicators which influence the estimation of solid waste, and for establishing a suitable ANN structure to predict the amounts of the various waste fractions that will be generated.
Therefore, there is a perpetual need for waste management planning. Forecasting is the first step which has to be performed before the design and implementation of municipal solid waste management systems. Selecting the most suitable forecasting model is important for obtaining the most accurate results. R² values normally range from 0 to 1. Predictions are good when R² is very close to 1 and poor when R² is 0 or very close to it. The R² values obtained in different studies were: R² = 0.97 (Younes et al., 2015), R² = 0.82 (Singh & Satija, 2016), R² = 0.81 (Zade & Noori, 2008) and R² = 0.64 (Shahabi et al., 2012). The suitable ANN model is considered to be the one with the lowest root mean square error and the highest coefficient of determination (R²). In this context, the aim of this paper is to predict the amount of municipal solid waste fractions by using ANN models, to investigate the influence of social variables on waste generation, and to compare the results obtained by applying two ANN models (NIO and NARX).

Case study
Iasi is a city belonging to the Iasi County, which is a part of the Region 1 North East (NE) established by Law no. 315/2004 on regional development in Romania. The Romanian National Strategy for Waste Management and the National Plan for Waste Management are the basis for waste management in Romania (Romanian National Plan Waste [RNPWM], 2018; Romanian National Strategy Waste, 2013). According to the National Waste Management Plan and the Regional Management Plan for Region NE the responsibility for the municipal waste belongs to the local public authorities and/or to an authorized economic operator.
The prediction is performed in order to provide and apply a modelling tool to help the decision and policy makers and stakeholders concerned about waste management.
For this purpose, data on the number of inhabitants, the population aged 15 to 59 years, urban life expectancy and the amounts of municipal solid waste generated in the years 2000-2018 were collected from Iasi County Statistics, Iasi County Council and previous studies, and were used for training and validation. The forecasted data for 2019-2025 were used for testing. The data are at municipality level. These variables were selected based on other studies, which demonstrated that they are linked with waste generation and have a more significant influence on waste prediction than others (Rimaityte et al., 2012).
The largest fraction of the solid waste generated by the population is biodegradable waste (approx. 50%) (Figure 2), followed by paper and cardboard, plastic, glass, metals and other waste (Ghinea, 2012). The municipal solid waste generated is collected by a public company, and efforts are being made towards separate waste collection. In recent years, several Romanian companies have conducted waste recycling campaigns to involve the inhabitants in the selective collection process. In Romania, separate waste collection is usually performed for materials with high market value, mainly in pilot projects (Ghinea & Gavrilescu, 2019). The municipal solid waste management system in Iasi is presented in our previous works, in which the Waste Prognostic Tool, regression analysis and time series analysis were applied to predict municipal solid waste generation in Iasi city. The same input data have been used in this paper to predict the generation of SW fractions by applying ANN modeling. The neural network time series (NNTS) tool was chosen for the modeling and evaluation.

ANN methodology
ANN is a model that belongs to the field of artificial intelligence. It consists of artificial neurons connected by coefficients (weights), which together form the neural structure. Although far behind the human brain, it can be used to process large amounts of data and to make predictions. In the ANN model, the artificial neuron simulates the function of the biological neuron. The inputs are the incoming signals, which are weighted and combined, and the outputs are then produced by means of a transfer function (mainly the sigmoid function) (Agatonovic-Kustrin & Beresford, 2000). These outputs are generated when the inputs exceed a certain threshold (Han et al., 2018). The ANN is therefore a mathematical tool inspired by biological neural networks, which includes input, hidden and output layers (Adamović et al., 2016). The inputs and outputs are the variables introduced into the model, while the number of hidden neurons is determined by trial and error (Younes et al., 2015). Input data are received by the input layer neurons, while the response to the input data is provided by the output neurons (Agatonovic-Kustrin & Beresford, 2000; Han et al., 2018). Neurons in the hidden layers are connected only to other neurons, and a single hidden layer is sufficient according to Agatonovic-Kustrin and Beresford (2000). In the learning process, the ANN uses a weight modification method to adjust the weights and so improve the outputs. The prediction error is fed backwards through the network to adjust the weights and minimise the error, which prevents the same error from occurring again. In the first phase (training), inputs are transformed by the connection weights. The number of neurons is particularly important in this phase, because too few neurons will hinder the learning process and too many will cause an overload. The weights are used to predict data only after the desired outputs are produced.
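The forward propagation described above can be illustrated with a minimal sketch. It assumes a single sigmoid hidden layer followed by a linear output layer; the layer sizes mirror a 4-11-6 structure, but the weights are random placeholders and the input values are made up, so this is not the network trained in the study.

```python
import math
import random

def sigmoid(z):
    """Logistic transfer function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate one input vector through a sigmoid hidden layer and a linear output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    output = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return hidden, output

# Hypothetical sizes and random weights, for illustration only
random.seed(0)
n_in, n_hidden, n_out = 4, 11, 6
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [random.uniform(-1, 1) for _ in range(n_out)]

x = [0.2, 0.5, 0.7, 0.4]  # four normalised inputs (made-up values)
hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
print(len(hidden), len(output))
```

Training would then adjust `w_hidden`, `b_hidden`, `w_out` and `b_out` so that `output` approaches the targets; that step is omitted here.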
In the prediction phase, the network works only by forward propagation of data, and the output of this propagation represents the prediction model for data validation (Agatonovic-Kustrin & Beresford, 2000). ANN modeling can be performed with the NNTS tool, an instrument that can be applied to solve three kinds of nonlinear time series problems: the nonlinear autoregressive model with external (exogenous) input (NARX), the nonlinear autoregressive model (NAR) and the nonlinear input-output model (NIO) (MathWorks, 2016). The first technique used was NIO, in which the series y(t) is predicted from d past values of the input series x(t) (Eq. (1)) (MathWorks, 2016):

$$y(t) = f\big(x(t-1), \ldots, x(t-d)\big) \qquad (1)$$

The second technique applied was NARX, which is a dynamic feedforward network with backpropagation (lag) and is a multiple-layer network (Younes et al., 2015). The series y(t) is predicted based on the following equation (Eq. (2)) (MathWorks, 2016):

$$y(t) = f\big(y(t-1), \ldots, y(t-d),\; x(t-1), \ldots, x(t-d)\big) \qquad (2)$$

where y(t) is the series predicted given d past values of y(t) and of another series x(t), and d is the length of the tapped delay lines that store the previous values of the y(t) and x(t) sequences. Two statistical indices were used to evaluate the performance of the ANN models:
- the root mean squared error (RMSE) (Eq. (3)):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{t,i} - x_{0,i}\right)^{2}} \qquad (3)$$

RMSE is a performance function that measures the network's performance according to the mean of squared errors; it can be defined as the square root of the average squared difference between outputs and targets.
- the coefficient of determination (R²) (Eq. (4)):

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(x_{t,i}-\bar{x}_{t}\right)\left(x_{0,i}-\bar{x}_{0}\right)\right]^{2}}{\sum_{i=1}^{n}\left(x_{t,i}-\bar{x}_{t}\right)^{2}\,\sum_{i=1}^{n}\left(x_{0,i}-\bar{x}_{0}\right)^{2}} \qquad (4)$$

where $x_t$ is the actual output, $x_0$ is the predicted output, n is the number of outputs, $\bar{x}_t$ is the mean of the actual output and $\bar{x}_0$ is the mean of the predicted output. The R² values represent the correlation between outputs and targets, and the relationship between the two is close when R² approaches 1.
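To make the role of the tapped delay lines in Eqs. (1) and (2) concrete, the sketch below builds the lagged regressor rows that an NIO or NARX model would receive. It assumes the standard forms in which NIO uses only past values of x(t) and NARX uses past values of both y(t) and x(t); the function names and the toy series are hypothetical and are not part of the NNTS tool.

```python
def nio_regressors(x, d):
    """Rows [x(t-1), ..., x(t-d)] for t = d, ..., len(x) - 1 (inputs of Eq. (1))."""
    return [[v for k in range(1, d + 1) for v in x[t - k]]
            for t in range(d, len(x))]

def narx_regressors(x, y, d):
    """Rows [y(t-1), ..., y(t-d), x(t-1), ..., x(t-d)] (inputs of Eq. (2))."""
    rows = []
    for t in range(d, len(x)):
        past_y = [v for k in range(1, d + 1) for v in y[t - k]]
        past_x = [v for k in range(1, d + 1) for v in x[t - k]]
        rows.append(past_y + past_x)
    return rows

# Toy series: 26 time steps, 4 exogenous inputs, 6 targets (placeholder values)
T, d = 26, 2
x = [[float(t + i) for i in range(4)] for t in range(T)]
y = [[float(10 * t + j) for j in range(6)] for t in range(T)]

nio = nio_regressors(x, d)
narx = narx_regressors(x, y, d)
print(len(nio), len(nio[0]))    # 24 rows, 2 * 4 = 8 features
print(len(narx), len(narx[0]))  # 24 rows, 2 * 6 + 2 * 4 = 20 features
```

With d = 2 delays, the first two time steps are consumed to fill the delay lines, which is why 26 time steps yield 24 usable rows.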
R² and RMSE are the indices most commonly used in waste prediction (Abbasi & Hanandeh, 2016; Adamović et al., 2016; Kannangara et al., 2018; Shahabi et al., 2012; Younes et al., 2015; Zade & Noori, 2008 and others). RMSE is still the default metric for many models, because a loss function defined in terms of RMSE is easy to differentiate and the mathematical operations are much easier to perform. R² indicates how well the variation in the targets is explained by the model outputs.
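As a worked illustration of the two indices, the following sketch computes RMSE and R² for a short pair of series. It assumes that R² is taken as the squared correlation between actual and predicted values, which matches the use of both means in the definitions above; the numbers are made up and are not results from this study.

```python
import math

def rmse(actual, predicted):
    """Square root of the mean squared difference between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values (assumed form of R^2)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)

# Made-up normalised values, for illustration only
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

print(round(rmse(actual, predicted), 4))  # 0.0255
print(r_squared(actual, predicted))
```

A model selection step of the kind described in the text would compute both indices for each candidate structure and prefer the one with the lowest RMSE and the highest R².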
This tool also includes different types of training algorithms: (1) Levenberg-Marquardt (LM) backpropagation; (2) the Bayesian Regularization (BR) algorithm; and (3) Scaled Conjugate Gradient (SCG) (Noori et al., 2010b, 2011). The most used is the Levenberg-Marquardt (LM) algorithm (Ranganathan, 2004), which is also considered the fastest training function (MathWorks, 2016). A description of the LM method can be found in the literature (Ranganathan, 2004). Different authors have used LM to predict municipal solid waste generation in Malaysia and India, respectively (Singh & Satija, 2016; Younes et al., 2015). The LM algorithm was also selected for this study.
The NNTS tool from Matlab version 9.1 (MathWorks, 2016) was chosen in this study for predicting the generation of SW fractions. Figure 3 illustrates the network modeling steps followed in this study.
NIO and NARX were applied in order to establish the suitable model and to compare the two with each other. For this study the input variables are: x1, number of residents; x2, population aged 15 to 59 years; x3, urban life expectancy; and x4, total municipal solid waste (ton/year). The inputs correspond to those presented in our previous work for the period 2000-2015, while those for 2016-2018 were taken from Iasi County Statistics and Iasi County Council, and the forecasted data (2019-2025) from RNPWM (2018) and Iasi County Statistics. The input data form a 26×4 matrix representing dynamic data: 26 time steps of 4 elements (x1, x2, x3, x4). The targets form a 26×6 matrix representing dynamic data: 26 time steps of 6 elements (paper waste, plastic waste, glass waste, metal waste, biodegradable waste and other waste). Figures 1 and 2 illustrate the values of the input and output variables considered in this research. All data were normalised by using Eq. (5) (Zade & Noori, 2008). The model architecture used is presented in Figure 4. The data were divided into 55% for training, 20% for validation and 25% for testing. A similar procedure was also applied by Singh and Satija (2016). The network generalization is measured on the validation time steps, while the network performance is measured on the testing time steps. Training is considered complete when the generalization stops improving. The data were partitioned as follows: data from 2000-2013 were used for training, data from 2014-2018 for validation, and the data obtained for 2019-2025 (predicted data) for the testing phase.
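The chronological partition described above can be checked with a few lines of code. This is only a sketch of the year ranges given in the text; it shows that the nominal 55/20/25 split corresponds to 14, 5 and 7 annual time steps out of 26.

```python
years = list(range(2000, 2026))  # 26 annual time steps, 2000-2025

train = [y for y in years if y <= 2013]
val = [y for y in years if 2014 <= y <= 2018]
test = [y for y in years if y >= 2019]

for name, part in (("training", train), ("validation", val), ("testing", test)):
    share = 100 * len(part) / len(years)
    print(f"{name}: {len(part)} steps ({share:.0f}%)")
```

Keeping the blocks contiguous in time, rather than sampling them at random, means the validation and testing steps always lie after the training period.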
The data were also checked for outliers. According to NIST/SEMATECH (2015), outliers carry information, and it is recommended to check for them graphically using a box plot, or quantitatively using the Grubbs test for a single outlier and the Tietjen-Moore test for more than one outlier. How far an outlier lies from the other points can be established by calculation. The Z score is used for the detection of outliers (Guthrie, 2008) and is determined with Eq. (6):

$$Z = \frac{Y_i - \bar{Y}}{SD} \qquad (6)$$

where $Y_i$ is the sample value, $\bar{Y}$ is the sample mean and SD is the sample standard deviation.
Determining the Z score is important, since it allows results to be compared with a normal (Gaussian) distribution and also enables the comparison of two scores from different samples. In this paper, the Z score was calculated in Excel.
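The Z score check can be sketched as follows. The sample values are made up, and the critical value is passed in as a parameter because it depends on the sample size and significance level; the value 2.680 is simply the one quoted in this paper and is not recomputed here.

```python
import statistics

def z_scores(sample):
    """Z score of every value: (Y_i - mean) / sample standard deviation."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return [(y - mean) / sd for y in sample]

def most_extreme(sample):
    """Index and absolute Z score of the point furthest from the mean."""
    z = z_scores(sample)
    idx = max(range(len(sample)), key=lambda i: abs(z[i]))
    return idx, abs(z[idx])

# Made-up sample of 19 values with one suspicious point at the end
sample = [0.010 + 0.001 * (i % 5) for i in range(18)] + [0.020]
critical_z = 2.680  # taken from the text; valid only for its own sample size and alpha

idx, z_max = most_extreme(sample)
is_outlier = z_max > critical_z
print(idx, round(z_max, 3), is_outlier)
```

If `z_max` stays below the critical value, the point is the furthest from the rest but is not flagged as a significant outlier, which is the situation reported for both models in this study.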

Results and discussion
In order to establish the most suitable network structure for predicting the generation of SW fractions, several neural network structures with input, hidden and output layers were investigated, using different numbers of neurons in the hidden layer. The NIO model was evaluated first, and the NARX model second. Table 2 presents the ANN model structures and the root mean squared error (RMSE) and coefficient of determination (R²) values for training (tr), testing (ts), validation (v) and all data (a). For the 4-11-6 ANN model, R² = 0.883 for tr and R² = 0.929 for ts, while R² is 0.849 and 0.885 for v and a, respectively. It can be observed that the 4-11-6 ANN model is the suitable one, with the maximum R² compared with the other ANN models.

Prediction of SW fractions by using NIO model
For RMSE, lower values are better, and a value of 0 indicates no error. From Table 2 it can be observed that the RMSE values are close to 0 for all ANN model structures, and that the best result is obtained using the 4-11-6 ANN structure. The lowest RMSE values for tr, ts and v were recorded for different models (4-8-6, 4-11-6 and 4-6-6). The values presented in Table 2 were obtained by applying the Levenberg-Marquardt training algorithm. The histogram illustrated in Figure 5b is skewed right, with the tail of the distribution considerably longer on the right side. One possible outlier can be observed among the training points. The Grubbs test was performed to check whether this point is an outlier. At a significance level of 0.05 (two-sided), with a critical value of Z = 2.680, the point is the furthest from the rest but is not a significant outlier (Z = 1.628). Figure 6 shows the linear regression between the model outputs and the corresponding targets. The R² values express the relationship between the observed and predicted values of solid waste during the tr, ts and v phases of the 4-11-6 ANN model, and were calculated from the predicted values and the actual amounts of SW fractions generated.
Since the 4-11-6 ANN model is the suitable one, the results obtained with 11 neurons in the hidden layer and 2 time delays are discussed further.
It was observed that training stops when the validation error has increased for six iterations, which occurred at iteration 8 (Figure 5a). The best validation performance was obtained at epoch 2, and the subsequent epochs did not reduce the errors. In Figure 5b the training data are represented by the blue bars, the validation data by the green bars and the testing data by the red bars. The agreement between outputs and targets is well illustrated by Figure 6, in which the solid line is the best-fit linear regression line between outputs and targets (MathWorks, 2016). The regression equations are also shown in Figure 6. Table 3 presents the ANN model structures and the RMSE and R² values for tr, ts, v and a obtained by applying NARX. The Levenberg-Marquardt training algorithm was again used. From Table 3 it can be noticed that the 4-13-6 ANN model is the adequate one, since its RMSE values are lower and its R² is equal to 0.999 for tr, 0.879 for ts, 0.931 for v and 0.944 for a. NARX models the dependence between SW generation by fraction and the input variables. In this study, we chose 13 neurons in the hidden layer and 2 time delays.

Prediction of SW fractions by using NARX model
From Figure 7a it can be observed that the training stopped when the validation error had increased for 2 iterations, which occurred at iteration 7. The best validation performance appeared at epoch 5. The results showed that the magnitude of the performance gradient is 1.3455 × 10⁻⁹, while the number of validation checks is 2 at epoch 7 (Figure 7b). In Figure 7c one possible outlier can be observed among the testing points.
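The validation-based stopping rule referred to for Figures 5a and 7a can be sketched as below. The function and the error sequence are hypothetical: the sequence is constructed so that the best validation error occurs at epoch 2 and training stops at epoch 8 after six consecutive failures, which mirrors the pattern reported for the NIO model but is not the study's actual error curve.

```python
def early_stop(val_errors, max_fail):
    """Return (stop_epoch, best_epoch) for validation-based early stopping.

    Training stops once the validation error has failed to improve on its
    best value for `max_fail` consecutive epochs.
    """
    best_epoch, best_err, fails = 0, val_errors[0], 0
    for epoch, err in enumerate(val_errors[1:], start=1):
        if err < best_err:
            best_epoch, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch

# Made-up validation errors per epoch (epoch 0 first)
val_errors = [0.050, 0.030, 0.020, 0.021, 0.023, 0.024, 0.026, 0.029, 0.031, 0.040]
stop_epoch, best_epoch = early_stop(val_errors, max_fail=6)
print(stop_epoch, best_epoch)  # 8 2
```

The weights kept after stopping are those of `best_epoch`, not of the last epoch, so the extra iterations cost time but do not degrade the final model.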
The Grubbs test was performed and confirmed that, at a significance level of 0.05 (two-sided) with a critical value of Z = 2.680, the last point is not an outlier (Z = 1.676). Figure 8 presents the linear regression obtained between the outputs and the targets. It can be noticed that the outputs closely follow the targets. The R² values were calculated and are presented in Figure 8 for the tr, ts and v phases of the 4-13-6 ANN model. Figure 9a illustrates the comparison between the actual and predicted generation of SW fractions. There are five major errors, illustrated by orange lines: one of them is for a testing target point and the others are for validation targets. The model is able to represent the targeted output.
Figure 9b illustrates the autocorrelation of error 1, which describes how the prediction errors are related in time. In Figure 9b only one value exceeds the confidence limit, and it occurs at zero lag. This is the pattern expected of a perfect prediction model, and it indicates that the prediction errors are essentially uncorrelated with each other (MathWorks, 2016). The model therefore appears adequate, since the correlations at non-zero lags fall within the confidence limits.
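The error autocorrelation check can be illustrated with a small sketch. It assumes a normalised autocorrelation (equal to 1 at zero lag) and an approximate 95% confidence bound of ±1.96/√n; both are common conventions and may differ from what the NNTS tool plots. The error values are made up.

```python
import math

def autocorrelation(errors, max_lag):
    """Normalised autocorrelation of the errors for lags 0..max_lag."""
    n = len(errors)
    mean = sum(errors) / n
    centred = [e - mean for e in errors]
    c0 = sum(c * c for c in centred)
    return [sum(centred[t] * centred[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

# Made-up prediction errors, for illustration only
errors = [0.02, -0.01, 0.015, -0.02, 0.005, -0.01, 0.01, -0.005, 0.0, 0.012]
acf = autocorrelation(errors, max_lag=4)

limit = 1.96 / math.sqrt(len(errors))  # approximate 95% bound (assumption)
outside = [k for k, r in enumerate(acf) if k > 0 and abs(r) > limit]
print(acf[0], outside)
```

An empty `outside` list would correspond to the situation described in the text, where only the zero-lag value exceeds the confidence limit.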
The results obtained after applying NIO and NARX indicated two different adequate models for predicting the generation of SW fractions. With NIO, the suitable ANN model is 4-11-6 (4 inputs, 11 hidden neurons and 6 outputs), while the best NARX architecture obtained was 4-13-6 (4 inputs, 13 hidden neurons and 6 outputs). Comparing the root mean squared error and coefficient of determination values obtained for both models, it can be observed that the RMSE values are lower for 4-13-6, especially for the tr and v steps, while the R² values are higher for 4-13-6. The error correlation values are also lower for the 4-13-6 model (see Figure 9b) than for the 4-11-6 model.
With the NIO model, eleven neurons in the hidden layer proved adequate (4-11-6), as can be seen in Figure 6. A linear regression equation between outputs and targets was obtained for the 4-11-6 model (Figure 6), and the tool provided another for the 4-13-6 model, which can be considered suitable (Figure 8). Regression equations of this kind were also reported in other solid waste prediction studies in the literature (Singh & Satija, 2016; Younes et al., 2015). Both models obtained in this study can be considered suitable, since they have high values of the coefficient of determination for predicting the generation of SW fractions. In this study the most suitable is the 4-13-6 model, which has R² = 0.94; in other studies the values obtained were R² = 0.97 (Younes et al., 2015), R² = 0.82 (Singh & Satija, 2016), R² = 0.81 (Zade & Noori, 2008) and R² = 0.64 (Shahabi et al., 2012).

Connection weights
The relative importance (RI) of each input variable in the network was determined by applying Garson's algorithm (GA), according to Olden and Jackson (2002), Gevrey et al. (2003), Olden et al. (2004) and Ibrahim (2013). The connection weights are presented in Table 4 and were obtained at the end of the training process by using NIO for the 4-11-6 model structure. To obtain the product P_ij (Table 5), the absolute value of the hidden-output layer connection weight was multiplied by the absolute value of the input-hidden layer connection weight. The Q_ij values were obtained by dividing P_ij by the sum for all the input variables. S_i is the sum of the Q_ij values. The results in Table 5, for the 4-11-6 model structure, indicate that the population aged 15 to 59 years is the most influential factor in the generation of waste fractions. After calculating the relative importance for the 4-13-6 model structure when NARX was used, it was observed that the municipal solid waste amount has the highest RI (28.6%), followed by the number of inhabitants (24.67%), the population aged 15 to 59 years (24.5%) and finally urban life expectancy (22.23%).
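The P, Q and S steps described above can be sketched in a few lines. This is a simplified illustration that assumes a single output neuron and expresses RI as S_i divided by the sum of all S_i; how the six outputs of the networks in this study were combined is not stated here, and the weights below are made up rather than taken from Table 4.

```python
def garson(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input for a single-output network.

    w_input_hidden[j][i]: weight from input i to hidden neuron j.
    w_hidden_output[j]:   weight from hidden neuron j to the output.
    """
    n_inputs = len(w_input_hidden[0])
    s = [0.0] * n_inputs
    for w_row, v in zip(w_input_hidden, w_hidden_output):
        p = [abs(w) * abs(v) for w in w_row]  # P_ij: product of absolute weights
        total = sum(p)
        for i in range(n_inputs):
            s[i] += p[i] / total              # Q_ij, accumulated into S_i
    return [100.0 * si / sum(s) for si in s]  # RI_i in percent

# Made-up weights: 2 inputs, 2 hidden neurons, 1 output
w_input_hidden = [[1.0, 3.0],
                  [2.0, 2.0]]
w_hidden_output = [0.5, -1.0]

ri = garson(w_input_hidden, w_hidden_output)
print(ri)  # [37.5, 62.5]
```

Because only absolute values are used, the result ranks inputs by the magnitude of their contribution and says nothing about whether an input increases or decreases the output.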
Previously published studies indicate that NARX is one of the most often applied methods (Menezes & Barreto, 2008; Shen & Chang, 2013; Singh & Satija, 2016; Younes et al., 2015), that it outperforms the standard neural-network-based predictors (Song & He, 2014), that it captures nonlinear data and different relationships (Younes et al., 2015), and that the recurrent connection is its most important property.
Since in this study only the social variables were selected for evaluation, in the following studies economic indicators such as gross domestic product per capita, employment and unemployment rates and other indicators (like domestic material consumption or waste separate collection system) that influence waste generation will be considered as input variables.

Practical applications and future research perspectives
Forecasting waste quantities and composition is an important step in the development of long-term municipal solid waste management systems in cities or regions. It is essential to predict solid waste generation in order to help decision makers plan and manage the waste properly, in accordance with waste management legislation. In this study, we wanted to answer the following questions:
Q1. Can the composition of municipal solid waste be determined with the help of ANN, based on input data such as the number of inhabitants, the age of the population, urban life expectancy and the total amount of waste generated?
Q2. Which of the two ANN models, NIO or NARX, can be successfully applied?
Q3. How can the structure of the optimal algorithm be determined?
Q4. Can we recommend the use of ANN for forecasting the composition of municipal waste?
This paper continues the series of studies on the forecast of municipal solid waste started for the city of Iasi in 2016. The study provides information for decision-makers to improve the municipal solid waste management system, and for future research.

Table 5. Connection weight products, relative importance, and rank of inputs for the 4-11-6 model structure when NIO was applied
In future studies, we will also consider other input data, such as economic data, waste collection system data and data on the behavior of residents. Since in this study the waste composition was forecast by applying ANN, in subsequent studies the results obtained in this paper will be compared with those obtained by applying regression analysis and the autoregressive integrated moving average (ARIMA) model.

Conclusions
In sustainable solid waste management, solid waste prediction is a key aspect. By knowing the amounts of the SW fractions that will be generated, decision makers will be able to plan and manage waste adequately. In this paper ANN was used to investigate and predict solid waste composition in Iasi, Romania. The following conclusions can be drawn:
- the NNTS tool is a complex instrument which can be easily applied for the prediction of SW fractions;
- the results obtained by applying the NARX and NIO models showed that both can be used for prediction, but NARX is more accurate;
- solid waste prediction depends on the variables which significantly influence the results. The proper input variables (such as number of residents, population aged 15 to 59 years, urban life expectancy and total municipal solid waste) should be selected considering their impact on model performance. Relevance, computational effort, training difficulty, dimensionality and comprehensibility are the key considerations. The input variables must describe the behavior of the output variables and should have a minimum degree of redundancy. It is also important that the optimum input variable set does not contain uninformative variables;
- the optimum structure should be determined by choosing the smallest RMSE values for training, testing and validation (compared with the other models investigated) and the highest R²; here it was a feedforward NARX network (4 inputs, 13 hidden neurons and 6 outputs, R² = 0.94) trained with the Levenberg-Marquardt algorithm;
- the NARX model can also be applied easily in other cities of the country because the waste composition is similar (the percentages are the same), whereas the amount of waste produced differs but is closely related to the input variables, which may vary slightly from one region to another;
- the ANN methodology can be used to determine the linear regression between the outputs and the targets, and also a time series response.