CASH FLOW PREDICTION FOR CONSTRUCTION PROJECT USING A NOVEL ADAPTIVE TIME-DEPENDENT LEAST SQUARES SUPPORT VECTOR MACHINE INFERENCE MODEL

Cash flow information is crucial for the decision making process in construction management. Due to the complexity and the dynamic progress of a construction project, forecasting cash flow demand throughout various phases of the project remains a challenging problem. This article presents a novel inference model, named Adaptive Time-dependent Least Squares Support Vector Machine (LS-SVM AT), for cash flow prediction. In the LS-SVM AT, Least Squares Support Vector Machine (LS-SVM) is integrated with an adaptive time function (ATF) to generalize the input-output mapping of cash flow. Since cash flow data are time-dependent, data points recorded in different periods can contribute dissimilarly to the training process of the prediction model. Thus, the role of the ATF is to determine the appropriate weight associated with each data point at a specific time period. By doing so, LS-SVM AT can better deal with the dynamic nature of the time series. Furthermore, to identify the optimal parameters for the inference model, a Differential Evolution (DE) based cross validation process is utilized in this research. Compared to other benchmark methods, the proposed model has identified the most appropriate time function and has yielded superior forecasting results. Therefore, LS-SVM AT can be a promising tool for construction managers in cash flow prediction.


Introduction
Construction projects are context-dependent and highly uncertain; this explains why the construction industry suffers the largest number of bankruptcies compared to other sectors of the economy (Boussabaine, Kaka 1998). The nature of construction projects, which is characterized by constant changes in the environment and by pressures to maintain schedules and reduce costs with increasingly complex construction techniques, makes project management a significant challenge.
In project management, cash is a critical factor that has a significant influence on project profitability (Hwee, Tiong 2002; Jiang et al. 2011). Poor cash flow control can lead to project failure for contractors because of a shortage of liquidity to support their daily activities (Khosrowshahi, Kaka 2007). Hence, reliable cash flow prediction over various phases of a project is desirable since it puts the project manager in a better position to identify potential financial problems and to develop appropriate strategies to mitigate the negative effects of such problems on project success (Hwang, Liu 2005).
For the purpose of project control, Russell et al. (1997) pointed out that one may identify and keep track of several time-dependent variables that change as construction progresses. Such a method can help managers monitor the project status and foresee undesirable events that may happen in the future. Nevertheless, predicting project performance dynamically in terms of cash flow is enormously challenging, because each time point is associated with numerous time-dependent variables.
Hence, an alternative for project control in terms of cash flow is to analyse the pattern of cash demand in the past and then extrapolate that pattern into the future. This, in essence, is the time series forecasting approach, which can effectively assist both short-term and long-term decision making (Williams 1994; Nam et al. 2007). Due to its practical importance, various approaches that employ traditional statistical methods as well as advanced artificial intelligence (AI) methods have been developed to tackle time series problems.
A survey carried out by Sapankevych and Sankar (2009) found that real-world time series are too complex to be modelled using traditional statistical methods. Thus, this demands more advanced forecasting algorithms such as Kalman filters, artificial neural networks (ANN), and support vector machines (SVM). The survey also pointed out that among AI methods, SVM has the ability to accurately forecast time series data, especially when the data of interest are non-linear and non-stationary.
Moreover, SVM has been utilized in a variety of applications in construction engineering (Cheng et al. 2010). Hence, this method can be promising for the cash flow problem. The principles of SVM are based on structural risk minimization and statistical learning theory. After being trained, SVM is capable of predicting the future value of cash demand. The advantages of this method are strong inference capacity and generalization. Nevertheless, SVM training requires solving a quadratic programming problem subject to inequality constraints. This means that the training process for large data sets is computationally expensive (Guo, Bai 2009).
LS-SVM has been proposed recently to overcome this drawback of SVM (Suykens et al. 2002; De Brabanter et al. 2010). One obvious advantage of LS-SVM is the alleviation of the computational cost. In its training process, a least squares cost function is proposed to obtain a linear set of equations in the dual space. Consequently, to derive the solution, it is required to solve a set of linear equations instead of the quadratic programming problem of SVM. This linear system can be efficiently solved by iterative methods such as conjugate gradient. Studies have been carried out to demonstrate the prediction capability of LS-SVM (Yu et al. 2009; Samui et al. 2011).
However, one shortcoming of LS-SVM is that it does not consider the unbalanced feature of real-world time series. In the construction industry, data related to cost are collected periodically in various phases of the project; thus, many factors of interest in construction lend themselves to time series behaviour (Khosrowshahi, Kaka 2007). During project execution, data recorded more recently could provide more information for the decision makers, because cost items reported in the distant past may be less relevant due to changes in the working environment. In cash flow prediction, data collected closer to the time of prediction should be more important than data that are more distant. Furthermore, the level of importance of data recorded in different phases of the project may not be similar. Thus, any forecasting technique applied to cash flow should take into account this very nature of time series.
To address the issue of unbalanced learning, Lin and Wang (2002) first proposed a weight, expressed in terms of a fuzzy membership, associated with each input point; this allows each point to contribute dissimilarly to the training process. Two time functions, namely linear and quadratic, are used to calculate those weights. Nevertheless, since each time series and each domain problem may possess different characteristics, utilizing deterministic types of time function may not yield desirable performance. Thus, this article aims to put forward a flexible form of time function, namely ATF, which is capable of adapting itself to different time series problems in an autonomous manner.
Additionally, in the AI field, it is recognized that tuning parameters play an important role in establishing the predictive model (Suykens 1999; Bishop 2006). These parameters control the model's complexity, and they need to be determined properly via cross-validation. In doing so, the main objective is to obtain an optimal model that is capable of producing the best predictive performance on new data. In this study, DE, a fast and effective stochastic optimizer proposed by Storn and Price (Price et al. 2005), is employed in the cross-validation process to achieve this objective.
Therefore, this article aims to propose a hybrid AI model that employs various advanced techniques to help project managers predict future cash demand. The second section of this paper describes the proposed approach, LS-SVM AT. The cross validation process for optimizing the model's parameters is presented in the third section. The fourth section demonstrates the application of the inference model in cash flow forecasting. The conclusions of our study are given in the final section.

Adaptive time-dependent least squares support vector machine
LS-SVM is a supervised learning technique proposed for solving both classification and regression problems. For the standard formulation of LS-SVM, readers are referred to the previous works of Suykens et al. (2002) and De Brabanter et al. (2010). This section aims to describe the LS-SVM AT, which incorporates LS-SVM with an adaptive form of time function.
Consider the following regression model that defines the mapping relationship between a response variable and independent variables:
$$y(x) = w^{T}\varphi(x) + b \qquad (1)$$
where: $x \in R^{n}$, $y \in R$, and $\varphi(\cdot)$ is the mapping to the high dimensional feature space.
Given a training dataset $\{x_k, y_k\}_{k=1}^{N}$, the formulation of LS-SVM AT can be given as follows:
$$\text{Minimize } J(w, e) = \frac{1}{2} w^{T} w + \gamma \frac{1}{2} \sum_{k=1}^{N} s_k e_k^{2} \qquad (2)$$
$$\text{Subject to } y_k = w^{T}\varphi(x_k) + b + e_k, \quad k = 1, \ldots, N,$$
where: $w \in R^{n}$ is the normal vector to the regression hyperplane; $b \in R$ is the bias; $e_k \in R$ is an error variable; $\gamma > 0$ denotes a regularization constant; $s_k \in [0, 1]$ is a time-dependent weight associated with an error variable.
It is noted that $s_k$ is a function of time, $s_k = f(t_k)$, where $t_k$ is the time period of the data point k. As mentioned earlier, real-world time series data can be unbalanced due to the fact that recent data can be more relevant than distant ones. Therefore, data points should be weighted differently according to time. This study introduces the ATF to determine the value of $s_k$ used in Eqn (2). Consider the case of a construction project: the total project duration can be divided into a number of completion periods (U). The proposed time function assigns small weighting values to data points in the initial phase of the project. Meanwhile, data points recorded in the later phase are coupled with greater weights because they possess more valuable information for the training process (Fig. 1). Using the proposed ATF, the duration of a completed project is divided into T time domains. It is noted that the number of time domains (T) is not necessarily equal to the number of completion periods (U).
For the first time domain $D_1$, the time function has two free parameters: the initial value and the slope. The weights within $D_1$ are calculated as follows: (3) where: $s_k$ denotes the weighting value associated with the data point k; $t_k$ is the time period; $s_o$ is the initial weighting value, which is usually set relatively close to 0; $a_1$ represents the slope parameter.
Meanwhile, the time functions for the other domains only need the slope parameter to specify their shape. The weights within a time domain $D_v$ are calculated in the following way: (4) where: $s_{o,v-1}$ is the maximum weighting value in the previous time domain; $a_v$ represents the slope parameter.
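The idea behind the ATF can be illustrated with a minimal Python sketch. The exact forms of Eqn (3) and Eqn (4) are not reproduced here, so the sketch assumes a piecewise-linear function: within each domain the weight starts from the maximum weight of the previous domain (or from $s_o$ in the first domain) and grows with the domain's slope, clipped to [0, 1]. The function name `adaptive_time_function`, the `boundaries` argument, and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def adaptive_time_function(t, boundaries, s0, slopes):
    """Assumed piecewise-linear ATF: one slope per time domain D_v.

    t          : time periods of the data points
    boundaries : T + 1 increasing values delimiting the T time domains
    s0         : initial weighting value (close to 0)
    slopes     : slope a_v for each of the T domains
    """
    t = np.asarray(t, dtype=float)
    weights = np.zeros_like(t)
    start_value = s0  # weight at the beginning of the current domain
    for v, slope in enumerate(slopes):
        lo, hi = boundaries[v], boundaries[v + 1]
        # The first domain includes its left boundary; later ones do not.
        if v == 0:
            in_domain = (t >= lo) & (t <= hi)
        else:
            in_domain = (t > lo) & (t <= hi)
        raw = start_value + slope * (t[in_domain] - lo)
        weights[in_domain] = np.clip(raw, 0.0, 1.0)
        # Maximum weight of this domain becomes the start of the next one.
        start_value = float(np.clip(start_value + slope * (hi - lo), 0.0, 1.0))
    return weights

# 20 completion periods split into T = 5 time domains (illustrative values).
periods = np.arange(1, 21)
weights = adaptive_time_function(
    periods, boundaries=[1, 4, 8, 12, 16, 20], s0=0.1,
    slopes=[0.1, 0.05, 0.02, 0.0, 0.0],
)
```

With these illustrative slopes the weights rise quickly in the early domains and then level off, which is the kind of shape the slope parameters are meant to let the optimizer discover.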
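To solve the optimization problem stated in Eqn (2), one can construct the Lagrangian and derive the dual problem. Similar to the original LS-SVM (Suykens et al. 2002), the Lagrangian is given by:
$$L(w, b, e; \alpha) = \frac{1}{2} w^{T} w + \gamma \frac{1}{2} \sum_{k=1}^{N} s_k e_k^{2} - \sum_{k=1}^{N} \alpha_k \left\{ w^{T}\varphi(x_k) + b + e_k - y_k \right\} \qquad (5)$$
where $\alpha_k$ are Lagrange multipliers.
The conditions for optimality are given by:
$$\begin{cases} \dfrac{\partial L}{\partial w} = 0 \rightarrow w = \sum_{k=1}^{N} \alpha_k \varphi(x_k) \\ \dfrac{\partial L}{\partial b} = 0 \rightarrow \sum_{k=1}^{N} \alpha_k = 0 \\ \dfrac{\partial L}{\partial e_k} = 0 \rightarrow \alpha_k = \gamma s_k e_k, \quad k = 1, \ldots, N \\ \dfrac{\partial L}{\partial \alpha_k} = 0 \rightarrow w^{T}\varphi(x_k) + b + e_k - y_k = 0, \quad k = 1, \ldots, N \end{cases} \qquad (6)$$
After elimination of e and w, the following linear system is obtained:
$$\begin{bmatrix} 0 & 1_v^{T} \\ 1_v & \Omega + V_{\gamma} \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \qquad (7)$$
where: $y = [y_1; \ldots; y_N]$, $1_v = [1; \ldots; 1]$, $\alpha = [\alpha_1; \ldots; \alpha_N]$, and $V_{\gamma} = \mathrm{diag}\left(\frac{1}{\gamma s_1}, \ldots, \frac{1}{\gamma s_N}\right)$.
The kernel function is applied as follows:
$$\Omega_{kl} = \varphi(x_k)^{T}\varphi(x_l) = K(x_k, x_l)$$
The resulting LS-SVM AT model for function estimation is expressed as:
$$y(x) = \sum_{k=1}^{N} \alpha_k K(x_k, x) + b$$
where $\alpha_k$ and b are the solution to the linear system (7). The kernel function that is often utilized is the Radial Basis Function (RBF) kernel, given as follows:
$$K(x_k, x_l) = \exp\left(-\frac{\|x_k - x_l\|^{2}}{2\sigma^{2}}\right)$$
where $\sigma$ denotes the kernel function parameter. It is noted that the regularization constant ($\gamma$), the kernel function parameter ($\sigma$), the initial weight ($s_o$), and the slope parameters ($a_v$) are the tuning parameters of the proposed LS-SVM AT. In our research, these tuning parameters are automatically optimized by the DE based cross validation process, which is described in the next section of the article.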
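The training step described above amounts to assembling and solving one linear system. The following NumPy sketch illustrates this under stated assumptions: the weighted dual system has the standard weighted LS-SVM form (the diagonal term is 1/(γ s_k)), the RBF kernel uses the 2σ² convention, and every weight s_k is strictly positive. The function names `rbf_kernel`, `fit_weighted_lssvm`, and `predict_lssvm` are hypothetical, and a direct solver is used instead of conjugate gradient for brevity.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between rows of A and rows of B (assumed 2*sigma^2 form)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))

def fit_weighted_lssvm(X, y, s, gamma, sigma):
    """Solve [[0, 1^T], [1, Omega + diag(1/(gamma*s))]] [b; alpha] = [0; y]."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    # Small time weights s_k inflate the diagonal, so old points are fitted loosely.
    system[1:, 1:] = rbf_kernel(X, X, sigma) + np.diag(1.0 / (gamma * s))
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]  # bias b, multipliers alpha

def predict_lssvm(X_train, alpha, b, sigma, X_new):
    """Evaluate y(x) = sum_k alpha_k K(x_k, x) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy usage on synthetic data (not the cash flow database).
X_demo = np.linspace(0.0, 3.0, 20).reshape(-1, 1)
y_demo = np.sin(X_demo[:, 0])
s_demo = np.linspace(0.2, 1.0, 20)  # later points receive larger weights
b_demo, alpha_demo = fit_weighted_lssvm(X_demo, y_demo, s_demo, gamma=100.0, sigma=0.5)
pred_demo = predict_lssvm(X_demo, alpha_demo, b_demo, 0.5, X_demo)
```

In this sketch the training residual of each point equals alpha_k / (γ s_k), so a point with a small time weight is allowed a larger error than a recent, heavily weighted point.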

Differential Evolution
Differential Evolution (DE) is an Evolutionary Algorithm designed for real parameter optimization (Price et al. 2005). The DE algorithm relies on a novel crossover-mutation operator, based on the linear combination of three different individuals and one subject-to-replacement parent (or target vector) (Becerra, Coello 2006). The crossover-mutation operator yields a trial vector (or child vector), which competes with its parent in the selection operator. The selection process is performed via selection between the parent and the corresponding offspring (Mezura-Montes et al. 2004). The algorithm of DE is depicted in Figure 2. In this figure, NP represents the size of the population; $X_{j,i}$ is the j-th decision variable of the i-th individual in the population; g is the current generation; and D denotes the number of decision variables. $rand_j(0,1)$ is a uniform random number lying between 0 and 1; and rnb(i) is a randomly chosen index ranging between 1 and NP.
In the selection process, the trial vector is compared to the target vector (or the parent) (Storn, Price 1997). If the trial vector yields a lower objective function value than its parent, then the trial vector replaces the target vector. The selection operator is expressed as follows:
$$X_{i,g+1} = \begin{cases} U_{i,g} & \text{if } f(U_{i,g}) < f(X_{i,g}) \\ X_{i,g} & \text{otherwise} \end{cases} \qquad (15)$$
where: $X_{i,g}$ represents the parent vector at generation g; $U_{i,g}$ denotes the trial vector at generation g; $X_{i,g+1}$ is the chosen individual which survives to the next generation (g+1).
The optimization process iterates until the stopping criterion is satisfied. The user can set the type of this stopping condition. Commonly, maximum generation (G max ) or maximum number of function evaluations (NFE) can be applied as the stopping condition. When the optimization process terminates, the final optimal solution is available for the user assessment.
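The mutation, crossover, and selection loop described in this section can be sketched in Python as follows. Figure 2 is not reproduced in the text, so this sketch assumes the common DE/rand/1/bin variant with a maximum-generation stopping criterion; the function name `differential_evolution` and the default values of the mutation factor `F` and crossover rate `CR` are illustrative assumptions, not settings reported by the study.

```python
import numpy as np

def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           g_max=100, seed=0):
    """Minimize `objective` with an assumed DE/rand/1/bin scheme."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    n_vars = len(bounds)
    # Random initial population inside the search box.
    pop = lower + rng.random((pop_size, n_vars)) * (upper - lower)
    cost = np.array([objective(x) for x in pop])
    for _ in range(g_max):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than the target.
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), lower, upper)
            # Binomial crossover; one randomly chosen variable always comes
            # from the mutant so the trial differs from its parent.
            take_mutant = rng.random(n_vars) < CR
            take_mutant[rng.integers(n_vars)] = True
            trial = np.where(take_mutant, mutant, pop[i])
            # Selection: the trial replaces the parent only if it is better.
            trial_cost = objective(trial)
            if trial_cost < cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = int(np.argmin(cost))
    return pop[best], float(cost[best])

# Toy usage: minimize a 3-variable sphere function.
best_x, best_cost = differential_evolution(
    lambda x: float(np.sum(x**2)), [(-5.0, 5.0)] * 3, g_max=200
)
```

In the cash flow setting, the decision vector would hold the tuning parameters of the model and the objective would be the cross-validation fitness, rather than the toy sphere function used here.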

Differential Evolution based cross validation
In machine learning, one important goal is to construct a prediction model that can deliver the best generalization. Model performance on the training data set is not necessarily a good indicator of the predictive performance on testing data, due to the problem of over-fitting (Bishop 2006). Over-fitting arises when a regression model fits the training set very well but performs poorly on a new data set. Hence, to build a desirable prediction model, one commonly used technique is S-fold cross-validation (Suykens et al. 2002; Bishop 2006; Samarasinghe 2006). Employing this approach, the training data is divided into S folds; this allows a proportion (S - 1) / S of the available data to be used for training, while the remaining portion of the data is used for assessing model performance.
Since our study employs LS-SVM AT as the regression machine, its tuning parameters consist of the regularization parameter $\gamma$, the RBF kernel parameter $\sigma$, the initial weight $s_o$, and the slope parameters $a_v$. The proposed cross validation approach utilizes DE (Price et al. 2005) to automatically explore various combinations of ($\gamma$, $\sigma$, $s_o$, and $a_v$) and to identify the optimal set of these tuning parameters. In the following, the DE-based cross-validation (Fig. 3) is described in detail.
In the step of data processing, the training data set is divided into S folds; in our study, S is selected to be five. In each run, one fold is used as a validating set; meanwhile, the other folds are used for training the model. In LS-SVM AT training, the machine is utilized to learn the mapping function between input and output for each run. After the training process, the proposed method is applied to predict the output of the validating set. In order to determine the optimal tuning parameters, the objective function used in the step of fitness function evaluation combines the training error and the validating error of each run k. The training and validating errors are computed as the root mean square error:
$$RMSE = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(Y_j^{P} - Y_j^{A}\right)^{2}} \qquad (17)$$
where $Y_j^{P}$ and $Y_j^{A}$ represent the predicted and actual value for the j-th output. In addition, N represents the number of training data points used in each run.
The fitness function, in essence, represents the tradeoff between model generalization and model complexity. It is worth noting that a very close fit to the training set may reflect high model complexity, and a complex model tends to suffer from over-fitting (Suykens et al. 2002; Bishop 2006). Thus, incorporating the error on the validating data can help identify the model that balances minimizing training error against achieving generalization capability.
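A minimal Python sketch of such a fitness evaluation is given below. The exact expression of the objective function is not reproduced here, so the sketch assumes that the training and validating RMSE are summed and averaged over the S folds. The names `cv_fitness`, `fit`, and `predict` are hypothetical, and a simple least squares line stands in for the LS-SVM AT model so that the example remains self-contained.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean square error between two vectors."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def cv_fitness(X, y, fit, predict, n_folds=5, seed=0):
    """Assumed fitness: mean over folds of (training RMSE + validating RMSE)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    total = 0.0
    for k in range(n_folds):
        val_idx = folds[k]
        trn_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = fit(X[trn_idx], y[trn_idx])
        e_tr = rmse(predict(model, X[trn_idx]), y[trn_idx])  # training error
        e_va = rmse(predict(model, X[val_idx]), y[val_idx])  # validating error
        total += e_tr + e_va
    return total / n_folds

# Stand-in model: ordinary least squares with an intercept.
def fit_line(A, b):
    return np.linalg.lstsq(np.c_[A, np.ones(len(A))], b, rcond=None)[0]

def predict_line(coef, A):
    return np.c_[A, np.ones(len(A))] @ coef

X_cv = np.linspace(0.0, 1.0, 25).reshape(-1, 1)
y_cv = 2.0 * X_cv[:, 0] + 1.0
fitness_value = cv_fitness(X_cv, y_cv, fit_line, predict_line)
```

An optimizer such as DE would call a function of this kind once per candidate parameter set, with `fit` closing over the candidate values of the tuning parameters.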
In each generation, the DE algorithm carries out the mutation, crossover, and selection processes to guide the initial population towards the final optimal solution. The search terminates when the current generation g reaches the maximum number of generations $G_{max}$. After being optimized, the prediction model is ready to be used in the next step.

Experimental results
This section of the article illustrates the performance of the proposed inference model LS-SVM AT in real-world cash flow prediction problems. The database used in the paper was collected from a construction contractor in Taipei from 1996 to 2006. Herein, the standard cumulative cost-time curves were employed for cash flow forecasting. As cash flow is recorded sequentially, it features characteristics of time series. Hence, LS-SVM AT, which fuses LS-SVM and ATF, is a potential solution to the problem at hand.
The employed database contains the percentage of expenditure cash flow taken from 13 high-rise building projects. The LS-SVM AT utilizes 10 projects as the training set and 3 projects as the testing set. Every project was separated into 20 sections; each section represents a uniform period of 5% of total project completion. It is noted that for weighting data, the total project duration is divided into 5 time domains (T = 5).
To predict future cash flow demand, three sequential periods of expenditure cash flow were utilized as input patterns. There are 17 input patterns in a completed project, from the first set (1, 2, 3) to the final set (17, 18, 19). Prediction results are represented by the cumulative cash flow ratio of the 4th through the 20th periods. Hence, 170 data points are employed to train the inference model and 51 data cases are used for testing its performance. Table 1 illustrates the cash flow data for one project in the database. In Table 1, the three inputs $X_1$, $X_2$, and $X_3$ represent the pattern of expenditure cash flow in the past; meanwhile, the output Y denotes the forecasted cash demand of the project.
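The construction of input patterns described above is a sliding window over the 20 periods of each project. The following Python sketch illustrates it; the function name `make_patterns` is hypothetical, and the series used in the example is a placeholder sequence of period indices rather than actual cash flow data.

```python
import numpy as np

def make_patterns(cumulative_ratio, window=3):
    """Turn one project's series into (X1, X2, X3) -> Y patterns."""
    series = np.asarray(cumulative_ratio, dtype=float)
    n_patterns = len(series) - window
    # Row i holds periods i+1 .. i+window; its target is period i+window+1.
    inputs = np.array([series[i:i + window] for i in range(n_patterns)])
    targets = series[window:]
    return inputs, targets

# Placeholder series: the period indices 1..20 make the windowing easy to read.
demo_series = np.arange(1, 21)
X_patterns, y_targets = make_patterns(demo_series)
n_training_points = 10 * len(y_targets)  # 10 training projects
n_testing_points = 3 * len(y_targets)    # 3 testing projects
```

Each project yields 17 patterns, so 10 training projects give 170 data points and 3 testing projects give 51, matching the counts stated above.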
To illustrate that LS-SVM AT is capable of delivering accurate predictive results, the proposed model is benchmarked against the Evolutionary Support Vector Machine Inference Model (ESIM) (Cheng, Wu 2009). ESIM is established based on the standard SVM without any weighting mechanism. Additionally, LS-SVM models that utilize the linear and quadratic time functions proposed by Lin and Wang (2002) are also used for result comparison; the two models are named LS-SVM L and LS-SVM Q. The formulations of the linear and the quadratic function are shown in Eqn (18) and Eqn (19), respectively:
$$s_i = f_l(t_i) = \frac{1 - \sigma}{t_m - t_1} t_i + \frac{t_m \sigma - t_1}{t_m - t_1} \qquad (18)$$
$$s_i = f_q(t_i) = (1 - \sigma)\left(\frac{t_i - t_1}{t_m - t_1}\right)^{2} + \sigma \qquad (19)$$
where: $t_1$ and $t_m$ denote the first and the last time period of a project, respectively; $\sigma$ represents the time function parameter, which is the lower bound of the weighting values. It is worth noting that when LS-SVM is integrated with the two aforementioned time functions, all of the model tuning parameters, including the regularization parameter, the kernel function parameter, and the time function parameter, are selected via the DE based cross validation process.
Fig. 3. Differential Evolution based cross-validation process
Moreover, to quantify the modelling accuracy of each approach, Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) have been employed as evaluation criteria. RMSE (Eqn (17)) is computed by summing the squared deviations between the predicted and the observed outputs, averaging them, and then taking the square root. Since the errors are squared before the average deviation is calculated, more weight is given to data points with larger prediction errors. Thus, RMSE is effective for identifying undesirably large deviations. Meanwhile, MAPE quantifies the modelling performance by computing the ratio between the deviation and the actual output. The magnitude of MAPE is not affected by the unit of the predicted output; hence, it can express the relative difference between the actual and the predicted cash flow. MAPE is calculated as follows:
$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|Y_i^{A} - Y_i^{P}\right|}{Y_i^{A}} \qquad (20)$$
where: $Y_i^{A}$ and $Y_i^{P}$ are the actual and predicted value for output i, respectively; n denotes the number of data points.
When the training process terminates, LS-SVM AT as well as the other benchmark models, including LS-SVM L, LS-SVM Q, and ESIM, can be employed for predicting new cases of project cash flow. Since the first three models utilize time functions for dealing with unbalanced learning, the optimal ATF for LS-SVM AT is illustrated in Figure 4; meanwhile, the optimal shapes of the linear and quadratic time functions for LS-SVM L and LS-SVM Q are shown in Figure 5 and Figure 6, respectively.
The result comparison is provided in Table 2. The proposed model has achieved the most desirable outcome in both the training and the testing set. For the training set, the RMSE and MAPE of LS-SVM AT are 0.020 and 4.540, respectively. When predicting testing cases, LS-SVM AT obtains 0.022 for RMSE and 4.731 for MAPE. Additionally, since the training error and testing error of LS-SVM AT are relatively close to each other, the results indicate that the inference model does not suffer from over-fitting. This suggests that the DE based cross validation process is effective in achieving a balance between model generalization and model complexity.
Furthermore, the prediction results of the other two LS-SVM models, which also utilize time functions, are better than those of ESIM, which is not equipped with a time function. Thus, the conducted experiments indicate that time functions are able to enhance the prediction capability. In addition, because the forecasting outcome of LS-SVM L is slightly better than that of LS-SVM Q, it can be seen that model performance may vary when different types of time function are employed.
The detailed results for the 51 testing cases, including actual and predicted cash flow, are provided in Table 3. Herein, the result deviation is calculated as the absolute difference between the actual and predicted cash flow at a time period. The average deviations of LS-SVM AT, LS-SVM L, LS-SVM Q, and ESIM are 1.83%, 2.30%, 2.51%, and 3.58%, respectively. Furthermore, the maximum deviations of the four models are 4.85%, 10.32%, 14.34%, and 8.98%. Hence, the prediction performance of LS-SVM AT in terms of result deviation is considerably better than that of the other approaches. If 5% is set as the error threshold for predicting project cash flow, only the proposed inference model satisfies this threshold for every testing case.
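The two evaluation criteria used in this section, RMSE and MAPE, can be computed as in the following minimal Python sketch. It assumes the standard definitions of both measures, with MAPE expressed as a percentage; the function names and the three-point example are illustrative only and are not taken from the project database.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error: penalizes large deviations more heavily."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100.0)

# Illustrative cumulative cash flow ratios (made-up numbers).
actual_demo = [0.10, 0.20, 0.40]
predicted_demo = [0.11, 0.18, 0.40]
rmse_demo = rmse(actual_demo, predicted_demo)
mape_demo = mape(actual_demo, predicted_demo)
```

Because MAPE divides each deviation by the actual value, the same absolute error counts more in early periods, when the cumulative ratio is still small, than in late periods.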

Discussion
Based on the results shown above, it can be seen that the LS-SVM AT, which is fused with the newly developed ATF, provides the best forecasting outcome. Thus, this form of time function, illustrated in Figure 4, is deemed best suited to the cash flow prediction problem. The ATF appears to be divided into two regions that are significantly different from each other. In the first region, from the 1st to the 7th time period, the weighting values increase gradually from 0.44 to 0.96. In the second region, the values of the time function stabilize, which is reflected in an apparently straight line. Additionally, it can be observed that the optimal time function is concave-down. Meanwhile, the quadratic time function (Fig. 6), due to the restriction of its functional form, is always concave-up. This explains why the quadratic function cannot capture a trend similar to that of the proposed adaptive method.
Furthermore, this weighting strategy of the ATF is understandable in the context of a construction project. In the initial stage of a project, fewer tasks have commenced and thus only a small proportion of the work has been carried out. The works on site at this stage are mostly preparatory activities; financial transactions and working information are less likely to appear. Thus, the weighting value for data points collected in this period should be low. When the project moves forward, it reaches a stable state in which most of the main activities are executed. Therefore, information collected in the later time periods should be considered more valuable and given greater weights. Accordingly, the ATF is found to best reflect this feature, as it delivers the best forecasting results.
Moreover, since LS-SVM AT is a hybrid AI model, the approach can be quite complex for practising managers. Nevertheless, construction management is a complicated field, and predicting project performance is by no means an easy task. Therefore, it is very challenging to construct a simple model that yields highly accurate forecasting performance. Even though the proposed model is relatively complicated to establish, with more effort on software engineering a user-friendly interface can be integrated into the model; this would make LS-SVM AT a more promising tool for practical project management.
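The concave-up restriction of the quadratic time function can be checked numerically. The Python sketch below assumes the linear and quadratic time functions in the form given by Lin and Wang (2002), in which the weight grows from the lower bound σ at the first period to 1 at the last period; the function names and the value σ = 0.2 are illustrative assumptions.

```python
import numpy as np

def linear_time_function(t, sigma):
    """Linear weights rising from sigma at t[0] to 1 at t[-1]."""
    t = np.asarray(t, dtype=float)
    t_first, t_last = t[0], t[-1]
    return ((1.0 - sigma) / (t_last - t_first)) * t + (
        (t_last * sigma - t_first) / (t_last - t_first)
    )

def quadratic_time_function(t, sigma):
    """Quadratic weights rising from sigma at t[0] to 1 at t[-1]."""
    t = np.asarray(t, dtype=float)
    t_first, t_last = t[0], t[-1]
    return (1.0 - sigma) * ((t - t_first) / (t_last - t_first)) ** 2 + sigma

periods = np.arange(1, 21)
linear_weights = linear_time_function(periods, sigma=0.2)
quadratic_weights = quadratic_time_function(periods, sigma=0.2)
# Positive second differences mean the curve is concave-up for any sigma < 1.
second_differences = np.diff(quadratic_weights, n=2)
```

For any lower bound below 1 the second differences of the quadratic weights are positive, so this family cannot reproduce a curve that rises quickly and then flattens, which is the shape reported for the optimal ATF.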

Conclusions
This article has presented a novel approach, named LS-SVM AT, to assist construction managers in dealing with project cash flow forecasting. The proposed approach, which fuses LS-SVM and ATF, is developed as an intelligent model specifically designed for time series data. LS-SVM AT predicts future project cash demands by learning from patterns in past cases. Moreover, the DE search algorithm is utilized in the cross validation process to identify the most appropriate tuning parameters without the need for a trial-and-error process. The experimental results and performance comparison have demonstrated the strong potential of the new inference model. Currently, LS-SVM AT has a limitation: the model is built using a single database collected from one construction contractor in Taipei. Although the data are quite homogeneous and capable of facilitating cash flow estimation effectively, more historical cases from different contractors should be incorporated to enhance the generalization of the prediction model. In addition, all of the recorded data used for cash flow prediction come from high-rise building projects. Hence, data collected from other types of construction project, such as highways and tunnelling structures, are worth investigating, because other project types may possess different characteristics. Nevertheless, the procedure of collecting new data cases requires time and effort. Therefore, we consider these to be promising directions for future research.