DEVELOPING A CONSTRUCTION-DURATION MODEL BASED ON A HISTORICAL DATASET FOR BUILDING PROJECT

Delay in large building projects remains a common problem. It is especially severe in steel reinforced concrete (SRC) building projects, which continue to be promoted as a new structural system in Taiwan’s construction industry. The aim of this study is to develop a feasible contract duration model based on a small number of SRC building cases. A logical approach is employed to select and confirm a “good” regression model, identified when project characteristics are known and external uncertainties are reasonably estimated. Several diagnostics are applied to examine the aptness of the model before inference. Cross-validation is used to validate the appropriateness of the selected variables and the magnitudes of the regression coefficients. The mean squared prediction error (MSPR) is used to measure the predictive ability of the proposed model, and the result shows that the predictive ability of the selected regression model is adequate. Finally, several cases are used to test the predictive accuracy of the proposed model, and the results show that the actual construction duration is considerably close to the duration predicted by the model. It is concluded that the proposed duration model is applicable to SRC construction projects with reasonable reliability.


Introduction
People in Taiwan experienced an unprecedented catastrophe, an earthquake measuring 7.3 on the Richter scale, on September 21, 1999. Damage caused by the Chi-Chi Earthquake included thousands of deaths and severe injuries, with 44,338 houses completely destroyed, 41,336 houses severely damaged, and a total of US $9.2 billion worth of damage. As a result, builders and architects of modern buildings that collapsed were detained and charged by the authorities. After the earthquake, a more conservative structural frame system, with frame members of Steel Reinforced Concrete (SRC) offering better safety, gradually started to be adopted in the construction industry. In these SRC building contracts, clients set aside additional time for construction, but construction delays still generally occurred. The contract duration needed for an SRC building does not simply comprise the duration of steel structure construction and the duration of RC construction. Hence, the difficulties and complexities of SRC building in the construction phase frequently give rise to delay. This leads to a building cost far higher than that of traditional RC and steel structures, which is justified only when owners take durability and building safety into account. Therefore, SRC structures are not popular in Taiwan today, with only a few cases at the time of this research, although promotion of new SRC structures in Taiwan's construction industry is still ongoing. Moreover, there is no national specification on the appropriate construction duration for SRC structures, which causes problems regarding construction duration for owners and constructors. This leads to cases falling behind the contract schedule and gives rise to disputes.
Construction duration has been regarded as one of the main criteria for assessing the performance of building projects in the construction industry (Bromilow 1969; Dissanayaka, Kumaraswamy 1999; Kaka, Price 1991; Love et al. 2005; Ng et al. 2001; Ogunsemi, Jagboro 2006; Walker 1995). A project is considered successful if it is completed on time, is within budget, and meets the specified quality standards (Chan, Kumaraswamy 1996, 1997). Understandably, schedule overrun brings about project cost overrun. Although industry participants are aware of the importance of duration in the construction phase of projects, a significant proportion of contracts have not met the stipulated period. As early as the 1960s, Bromilow's research report showed that only 12.5% of building contracts were completed within the scheduled completion dates and that the overall average actual time was more than 40% longer than the original schedule (Bromilow 1969). Four decades later, the inability to complete on time is still a prevailing problem. Al-Khalil and Al-Ghafly (1999) reported, through a comparison of project outcomes with their original schedules, that the overrun of completed public projects approached 70% of the original schedule. Odusami and Olusanya (2000) concluded that most projects in the metropolis experienced an average delay of 51% of the planned duration. Blyth et al. (2004) stated that 50% of contracts were completed later than the stipulated durations. Iyer and Jha (2006) observed that over 40% of construction projects were behind the original schedule and that delays lasted for months. Completed projects lasting much longer than the original schedule result in various disadvantages, such as additional cost, reduced contractor's profit, loss of reputation, and delay of the client's operating plan. Many disputes arise from the various causes of construction delay (Ng et al. 2001).
Construction duration overrun is problematic in the construction industry and generates much concern. This state of delay is still universal in the performance of building projects (Aibinu, Odeyinka 2006).
Several methods exist for estimating the contract duration of building construction projects, such as the period expected by the client, special considerations, the time required for the project work to be done, and the time taken as recorded in historical project information (Kumaraswamy, Chan 1995; Love et al. 2005). In practice, most methods of estimating project duration in the industry depend on the subjective skill and cognition of the estimators and planners rather than on objective assessment. Khosrowshahi and Kaka (1996) stated that forcing the project into a desired time mold can lead to adverse consequences, giving rise to a chain reaction which affects the performance of the organization in other areas. Underestimation of the project duration brings additional penalties, disputes, etc. for the contractors and clients. On the other hand, overestimation of the project duration may cost the organization its competitiveness in the industry. Both could have undesirable effects on project performance and the achievement of project objectives.
An accurate and reasonable contract duration may avoid higher bid costs and decrease the possibility of disputes between contractors and clients. It is also useful to the contractor in assessing the risks of meeting the client's requirements. The aim of this study is to develop a feasible and suitable project contract duration model based on a historical data set, enabling clients and contractors to estimate construction duration more accurately.

Literature review
There is a wide variety of approaches to dealing with the factors affecting construction durations, and several prediction models have been developed from these factors. Bromilow (1969) established a relation between the duration and cost of building contracts, known as Bromilow's time-cost (BTC) model. The BTC model revealed that the time taken to construct a project is highly correlated only with the construction size as measured by the final cost. Ng et al. (2001) revisited the BTC model with newer project data reflecting improved productivity, checked its appropriateness for various data subgroups including those of a project type, and compared it with previous models developed in different time periods. The results show that different parameter estimates are needed for different project types.
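As a hedged illustration of the time-cost relation described above, the BTC model is commonly written as T = K·C^B, which becomes a straight line after taking logarithms: log T = log K + B·log C. The sketch below fits that line by ordinary least squares in plain Python. The function name `fit_btc` and the sample costs and durations are hypothetical; they are not taken from Bromilow's data or from this study.

```python
import math

def fit_btc(costs, durations):
    """Fit T = K * C**B by least squares on log10(T) = log10(K) + B*log10(C)."""
    xs = [math.log10(c) for c in costs]
    ys = [math.log10(t) for t in durations]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                     # slope of the log-log line (exponent B)
    k = 10 ** (y_bar - b * x_bar)     # intercept back-transformed to K
    return k, b

# Hypothetical data generated exactly from T = 100 * C**0.3 (arbitrary cost units).
costs = [1.0, 2.0, 4.0, 8.0]
durations = [100 * c ** 0.3 for c in costs]
k, b = fit_btc(costs, durations)
```

Because the illustrative data lie exactly on the curve, the fit recovers the generating values of K and B; real project data would scatter around the line.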
Researchers have pointed out that the BTC model potentially falls short by not considering factors other than cost when establishing the construction time for a given project (Ng et al. 2001; Walker 1995). Nkado (1992, 1995) further developed the construction time model by categorizing the activities during the construction phase into work groups: substructure; superstructure; cladding/envelope; finishes; M&E services; and their sequential start-start lag times. The durations of these activity groups can be predicted from 11 variables: gross floor area; area of ground floor; approximate excavated volume; building height; number of stories; end use; cladding type; presence of atrium; building location; intensity of services and site accessibility. It was claimed that the model could provide an objective basis for evaluating the implications of the clients' stipulated completion times at early stages of design. Khosrowshahi and Kaka (1996) assumed that a project can be defined in terms of a series of variables that characterize the project. They identified the most influential variables and combined a number of them, such as scope; floor; start-months; horizontal-across; build-ability; frame; project-operation; units; abnormality and log cost, to develop and propose a model to estimate project duration. Based on these variables, Chan and Kumaraswamy (1999a, b) carried out several investigations of the public housing building construction process to identify a set of significant variables influencing the construction duration of projects. The durations of the primary work packages, i.e., piling; pile caps/raft; superstructure; E&M services; finishes and their respective sequential start-start lag times, were modeled in terms of the identified sets of critical factors by multiple linear regression, and it was concluded that the model developed was applicable to the public housing industry. Love et al. (2005) analyzed certain significant variables, i.e., project duration; project type; procurement method; tender type; gross floor area (GFA) and the number of stories, and proposed an alternative model to replace the BTC model. They concluded that the GFA and the number of stories in a building were key determinants of time performance in forecasting the construction duration of a project. Blyth et al. (2004) developed a duration prediction model based on project characteristics by combining the twenty-one most influential project variables. They concluded that the prediction model can achieve a 7% maximum in predicting over duration. However, Kaka and Price (1991) revealed that private buildings varied significantly and fitted poorly in this model, suggesting that further classification of projects may be required to improve the accuracy of the relationship among the relevant variables. Ogunsemi and Jagboro (2006) argued that the BTC model is non-linear in form and introduced a breakpoint between two linear models, forming a piecewise linear model to improve the accuracy of the BTC model. In a questionnaire survey of the factors affecting the performance of construction projects, the climate condition at the site was identified as the most important factor by owners, consultants and contractors because it affected the productivity and time performance of the project (Enshassi et al. 2009). Zavadskas et al. (2010) assumed that risk could cause cost and time overruns in construction projects. They divided project risks into three groups: external, project and internal risk, where weather is considered an external risk. According to the aforementioned literature, there are a number of variables which may act individually or in combination to influence project duration. Therefore, it is worthwhile to study further how these variables apply to the construction duration of SRC building projects.
The extent of the effect of those variables depends on the nature of the project and on external uncertainties. Since construction projects fall into diverse categories such as building, civil and others, data from different categories are not homogeneous. This study therefore focuses on private SRC building projects to build a reasonable construction duration model.

Methodology of the model development
Multiple linear regression analysis is a widely used statistical procedure for determining the relationship among relevant variables (Blyth et al. 2004; Chan, Kumaraswamy 1997, 1999a; Love et al. 2005). It is difficult to obtain a single best combination from the large pool of potential independent variables by a purely objective approach. The identification of suitable candidate variables requires some intuition and practical experience. Because regression models are used for different purposes, no particular subset of explanatory variables is best in every case. A descriptive use of a regression model typically emphasizes precise estimation of the regression coefficients, whereas a predictive use focuses on the prediction errors (Neter et al. 1996; Siegel 2003). In general, building a regression model includes the selection of explanatory variables, diagnostics for examining the appropriateness of the fitted regression model, remedial measures when the model conditions are not met, and validation of the regression model.
In this study, two types of variables are incorporated into the construction duration prediction model. The first type consists of project characteristics that have some influence on construction duration performance, extracted from the discussions of previous researchers. The second type consists of uncertain factors which cannot be foreseen by the clients or contractors during construction. Several factors investigated, together with common practical knowledge in the construction industry, have been used in the duration prediction models: project characteristics include construction cost, duration, size of project, etc., whereas the uncertain variables are defined as the external condition and internal condition.
Variables in models need to be properly defined. The contract duration is defined as the period from the agreed construction commencement date to the planned completion date. The actual construction duration of a project is considered to start on the date when contract commencement takes effect and to end on the date of practical completion on site. The main reason for using the agreed construction commencement date is that project activities are not usually continuous from project inception to contract commencement, owing to preparation or to waiting on unexpected events, even though the contract has already become effective. The contract initial cost is taken as the tender price or construction budget, whereas the actual cost is the final cost in the construction phase. The contract final cost could differ from the contract initial cost when variations during the construction phase are taken into consideration. Apart from the aforesaid variables, the identification of other potentially suitable variables requires intuition and a practical approach.
When projects are executed in different periods, it is necessary to adjust for price fluctuations to avoid disparities and to consider the discounted values of project cost in relation to a particular year; the 2006 contract price indices are employed as the common base for the data set.
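The adjustment to a common base year can be sketched as a simple index ratio. This is a minimal illustration under stated assumptions: the function name `to_base_year` and the index values below are hypothetical and are not the published Taiwanese contract price indices.

```python
def to_base_year(cost, index_of_cost_year, index_of_base_year):
    """Rescale a cost to base-year prices using a contract price index ratio."""
    return cost * index_of_base_year / index_of_cost_year

# Hypothetical index values; 2006 is treated as the base year, as in the text.
price_index = {2003: 80.0, 2006: 100.0}

cost_2003 = 1_000_000.0
cost_in_2006_prices = to_base_year(cost_2003, price_index[2003], price_index[2006])
```

With these illustrative numbers, a contract priced in 2003 is scaled up by the ratio of the 2006 index to the 2003 index before it enters the regression data set.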
A total of fifty-six building projects with a total value exceeding US $212,977,449 were collected for analysis. All of these projects were procured using the traditional design-bid-build approach. The database represents new buildings in Taiwan completed between 2000 and 2006. The contract cost of the projects ranges from $842,356 to $11,969,230, with a mean (M) of $3,803,169 and a standard deviation (SD) of $2,477,694. The buildings ranged from 2,251 m² to 51,904 m² in gross floor area (GFA), were 2 to 14 stories high, and took 197 to 1080 days to construct. Results showed that only a few contracts were completed on or before the date originally expected. The time performance of the steel reinforced concrete (SRC) building projects was far worse than expected.
The premises for the building projects considered in this research are:
1) All of the contractors are competent and effective in setting up the construction process and work within the same norms.
2) Bulk materials such as steel, windows, etc. are provided without delay by the supplier chosen by the client.
3) All buildings have the same structural system (SRC), and most building materials are similar.
In multiple linear regression analysis, one important objective is to identify the best combination of variables from the large pool of potential independent variables. The SPSS (10.0) software package offers several techniques that can be used as alternatives for finding the best solution. In this study, the forced entry technique is adopted to conduct the multiple regression analysis. Different combinations of variables may produce different results during the development of the statistical models. A significance level of 10% is set for the tests.
There is a significant predictive relationship between project construction duration and a relatively small number of measurable parameters of a project and its environment. The following steps were used to identify a suitable combination of the constituent variables of the model:
1) Determine the dependent variable and search the data set for possible predictor (independent) variables.
2) Inspect the histogram of the data to check the normality of the distribution; if the data are not normal, transform the variables into a suitable form such as the logarithmic scale.
3) Inspect the scatter plot to examine the linearity of the data; if the relationship is not linear, transform one or two variables into a suitable form such as the logarithm, so that a linear relationship results and the linear regression technique can be used.
4) Fit the regression equation to identify the variables which pass the statistical tests and can reasonably explain the variation within the data, continuing until $R^2$ or adjusted $R^2$ ($\bar{R}^2$) has reached a saturation level.
The framework of the model construction described above is shown in Fig. 1.

Building the regression model
In this study, the aim is to build a feasible and applicable model of the stipulated contract duration for clients and contractors using a predictive regression algorithm. We present a strategy for the model-building process which involves several phases. The first phase identifies a subset of three explanatory variables drawn from the characteristics of the building construction contract and fits a regression model to them. The first-phase regression model is fitted with the following results: $\bar{R}^2$ = 0.908, F = 172.057. This fit is considered acceptable. In the next step, the regression model formed in the first phase is applied to fit the actual duration; the result is considered "not good", with $\bar{R}^2$ = 0.658, F = 33.321. When the explanatory variable contract cost is replaced with actual cost, the result is only "slightly better" than the former, with $\bar{R}^2$ = 0.698, F = 43.416. After extensive trials, no other project characteristic was found to raise the values of $\bar{R}^2$ and F. It is inferred that some unknown factors strongly influence the efficacy of the fitted regression model. In accordance with the literature, it is judged that construction-duration performance could be influenced by a pool of uncertain external or internal factors in the construction phase. Subsequently, two factors, the number of change orders and rainy days, are formulated and incorporated one after another into the third-phase regression model. The number of change orders was taken as the average number of change orders of all cases during the construction phase, while rainy days were estimated as the average number of rainy days that halted construction over the same calendar period (i.e. the same commencement month and completion month as the case) in the preceding three years.
This process leads to the selection and identification of the ultimate regression model, fitted with the following "good" results: $\bar{R}^2$ = 0.920, F = 115.315. The procedures of operation are illustrated using the following notation: $\hat{Y}_i$ is log (predictive construction duration); $X_1$ is log (contract initial cost); $X_2$ is GFA / expected contract duration; $X_3$ is number of stories; $X_4$ is modified contract duration, i.e. (estimated rainy days + expected contract duration) / expected contract duration; $X_5$ is change order; (…) is standard error of the estimated coefficient; $\bar{R}^2$ is adjusted $R^2$; n is number of cases. * The actual construction duration is substituted by the expected contract duration during the 1st phase.
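To make the variable definitions concrete, the following sketch assembles the five predictors X1 to X5 for a single project. It is only an illustration: the function name `build_predictors`, the choice of a base-10 logarithm, and the sample project values are assumptions, since the text does not state the logarithm base or list individual cases.

```python
import math

def build_predictors(initial_cost, gfa, stories, expected_days,
                     est_rainy_days, change_orders):
    """Return [X1, X2, X3, X4, X5] for one project, following the text's definitions."""
    x1 = math.log10(initial_cost)                          # assumption: base-10 log
    x2 = gfa / expected_days                               # GFA per contract day
    x3 = stories                                           # number of stories
    x4 = (est_rainy_days + expected_days) / expected_days  # modified contract duration
    x5 = change_orders                                     # number of change orders
    return [x1, x2, x3, x4, x5]

# Hypothetical project, for illustration only.
x = build_predictors(initial_cost=1_000_000.0, gfa=10_000.0, stories=10,
                     expected_days=500, est_rainy_days=50, change_orders=2)
```

The resulting vector would be combined with the fitted coefficients to obtain the predicted log duration; the coefficients themselves are not reproduced here.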

Criteria for model selection
For variable reduction, it is necessary to identify a small group of "good" regression models according to some specified criteria. These criteria provide time-saving algorithms for identifying the "best" subset. When regression models are compared, more than one criterion is considered in evaluating the possible subsets of independent (X) variables. In this study, the criteria used with the regression selection procedure to compare the regression models are $R^2$, adjusted $R^2$ ($\bar{R}^2$), AIC (Akaike Information Criterion) and mean square error (MSE).
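The four criteria can all be computed from the error sum of squares of a candidate subset. The sketch below uses standard textbook forms, with p counting the regression parameters including the intercept; the AIC expression n·ln(SSE) − n·ln(n) + 2p is one common form and is an assumption here, as is the function name `selection_criteria`. The input values are hypothetical.

```python
import math

def selection_criteria(sse, ssto, n, p):
    """Compute R^2, adjusted R^2, MSE and AIC for one candidate subset.

    sse  : error sum of squares of the fitted subset model
    ssto : total sum of squares of the response
    n    : number of cases
    p    : number of parameters (predictors + intercept)
    """
    r2 = 1.0 - sse / ssto
    adj_r2 = 1.0 - (n - 1) / (n - p) * sse / ssto
    mse = sse / (n - p)
    aic = n * math.log(sse) - n * math.log(n) + 2 * p
    return {"r2": r2, "adj_r2": adj_r2, "mse": mse, "aic": aic}

# Hypothetical sums of squares for a five-predictor model fitted to 56 cases.
crit = selection_criteria(sse=20.0, ssto=100.0, n=56, p=6)
```

A subset is preferred when it has a high $R^2$ and adjusted $R^2$ together with a low MSE and AIC.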
The potential X variables included six explanatory variables, leading to $2^6$ = 64 possible subset regression models that could be formed in the selection procedure. Table 2 gives a summary of the possible subsets obtained by the all-possible-regressions approach (Neter et al. 1996). In the "best" subsets algorithms, the four criteria point to the same "best" subset, ($X_1$, $X_2$, $X_3$, $X_4$, $X_5$), which may be regarded as the tentative regression model. The selection of the final regression model depends greatly upon the diagnostic results. Residual plots and diagnostic checks are performed mainly to identify influential outlying observations, multicollinearity, heteroskedasticity, etc., and to examine the appropriateness of the fitted regression model.
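The count of candidate models in an all-possible-regressions search can be checked by enumerating every subset of the predictor pool. In this sketch the labels are placeholders; in particular "X6" stands for the sixth candidate variable, which the text does not name.

```python
from itertools import combinations

# Placeholder labels for the six candidate predictors ("X6" is not named in the text).
candidates = ["X1", "X2", "X3", "X4", "X5", "X6"]

# Every subset size from 0 (intercept-only model) to 6 (all predictors).
subsets = [combo
           for k in range(len(candidates) + 1)
           for combo in combinations(candidates, k)]

n_models = len(subsets)
```

Each tuple in `subsets` would be fitted in turn and scored with the selection criteria, after which the best few subsets are carried forward to the diagnostics.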

Diagnostics
When a regression model is built, it is important to examine the aptness of the model for the data before inferences are made. In this section, some graphic methods for studying the appropriateness of the model built, such as the linearity of the regression function or the normality of the error terms, as well as several formal statistical tests are discussed.
Scatter Plot Matrix and Correlation Matrix. A scatter plot matrix facilitates the study of the relationships among the variables through the scatter plots within a row or a column. Scatter plots of the response variable against each predictor variable can help determine the nature and strength of the bivariate relationships between each of the predictor variables and the response variable. The scatter plots can also reveal gaps and outliers in the data points. Table 3 and Fig. 2 show that some of the predictor variables are correlated with each other. The degree of linear association among the predictor variables is moderate or relatively low.

Fig. 2. Scatter plot matrix among variables
Residual. The following plots of residuals are taken from several informal diagnostic plots of residuals to provide information on any types of departures from the linear regression model.
1) The residual plot against the fitted value in Fig. 3 shows no evidence of serious departures from the model. 2) The normal probability plot of residuals in Fig. 4 illustrates a slight departure from linearity. However, this departure from normality was not considered to have a serious impact on the inferences to be made from the regression.

Fig. 4. Normal P-P Regression Standardized Residual Plot
Test for Heteroskedasticity. The homoskedasticity assumption for multiple linear regression states that the variance of the unobservable error, conditional on the explanatory variables, is constant. When homoskedasticity fails, the standard errors are no longer valid for constructing confidence intervals and t statistics. Similarly, F statistics are no longer F distributed and the Lagrange multiplier (LM) statistic no longer has an asymptotic chi-square distribution (Wooldridge 2002). That is, the statistics used to test hypotheses are not valid in the presence of heteroskedasticity. Many tests for heteroskedasticity have been suggested over the years. In this study, the Breusch-Pagan test is applied (Wooldridge 2002). We assume that the ideal assumption of homoskedasticity holds and require the data to tell us otherwise. The steps of the test are abbreviated here. Regressing the squared OLS (ordinary least squares) residuals on the independent variables produced R-squared = 0.082; thus, LM = 4.592, and the p-value = 0.287. The p-value (the smallest significance level at which the null hypothesis would be rejected) exceeds the desired significance level. Therefore, we fail to reject the null hypothesis of homoskedasticity in the proposed model at the 10% level. We may conclude that heteroskedasticity is not a problem for the study model.
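The LM statistic reported above appears consistent with the usual Breusch-Pagan form, LM = n × R², where R² comes from the auxiliary regression of the squared residuals on the predictors. The short sketch below reproduces only that arithmetic; the function name `breusch_pagan_lm` is hypothetical, and the p-value is not recomputed because it requires a chi-square distribution function from a statistics library.

```python
def breusch_pagan_lm(n_cases, r_squared_aux):
    """LM statistic: number of cases times R^2 of the auxiliary regression."""
    return n_cases * r_squared_aux

# Values quoted in the text: 56 cases, auxiliary R-squared = 0.082.
lm = breusch_pagan_lm(56, 0.082)

# Decision rule at the 10% level, using the p-value reported in the text.
reported_p_value = 0.287
reject_homoskedasticity = reported_p_value < 0.10
```

The p-value would normally be read from the upper tail of a chi-square distribution whose degrees of freedom equal the number of predictors in the auxiliary regression.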
Identifying Outlying Observations. Outlying or extreme cases may involve large residuals and have dramatic effects on the fitted least squares regression function. There is a need to identify the outlying cases carefully and decide whether they are to be retained or eliminated. A case may be regarded as outlying with respect to its Y value, its X value, or both. Not all outlying cases have a strong influence on the fitted function. The following steps were performed to determine whether the regression model under consideration is heavily influenced by one or a few cases in the data set.
Identifying Outlying Y Observations. Frequently, the detection of outlying Y observations is based on an examination of the residuals. In this study, we use the studentized deleted residual as the means of diagnosing outlying Y observations (Neter et al. 1996). A formal test is performed by means of the Bonferroni test procedure to determine whether the case with the largest absolute studentized deleted residual is an outlier. If the regression model is appropriate, so that no case is an outlier due to a change in the model, each studentized deleted residual will follow the t distribution. The studentized deleted residuals in Fig. 5a show that cases 21 and 27 have the largest absolute values. We test whether case 27, which has the largest absolute studentized deleted residual, is an outlier resulting from a change in the model, using the Bonferroni simultaneous test procedure with a family significance level of α = 0.10: t(1-α/2n; n-p-1) = t(0.9992; 50) = 3.506, where n is the number of cases and p is the number of parameters.
Since |t(27)| = 2.474 < 3.506, we may conclude that case 27 is not an outlier. Case 21 is likewise found not to be an outlier.
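The Bonferroni decision rule can be sketched as follows. This is a minimal illustration: the helper names are hypothetical, and the critical value 3.506 is taken from the text rather than recomputed, because the Python standard library has no t-distribution quantile function.

```python
def bonferroni_level(alpha, n_cases):
    """Probability level 1 - alpha/(2n) at which the t quantile is read."""
    return 1.0 - alpha / (2.0 * n_cases)

def is_outlier(abs_studentized_deleted_residual, critical_t):
    """A case is flagged only if its |t| exceeds the Bonferroni critical value."""
    return abs_studentized_deleted_residual > critical_t

level = bonferroni_level(alpha=0.10, n_cases=56)

# Values quoted in the text for case 27.
case_27_flagged = is_outlier(2.474, 3.506)
```

Dividing α by 2n keeps the family-wise significance level at α when every one of the n cases is, in effect, tested in both tails.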
Identifying Outlying X Observations. The leverage value is a useful indicator in a multivariable setting for deciding whether or not a case is an outlier with respect to its X values. Cases with leverage values greater than 2p/n (= 0.214) are considered outlying with regard to their X values (Neter et al. 1996). The leverage values in Fig. 5b show that cases 53 and 55 have the largest values, 0.325 and 0.394. Although these values exceed 0.214, they do not exceed 0.5, and hence indicate moderate leverage. We need to ascertain how influential those cases are in the fitting of the regression function. The following three measures of influence are widely used in practice, each based on the omission of a single case to measure its influence.
1) Influence on a Single Fitted Value (DFFITS): The guideline for identifying influential cases indicates that a case is influential if the absolute value of DFFITS exceeds 1 for small to medium data sets (Neter et al. 1996). Fig. 5c shows that the DFFITS values of cases 27, 49, 51, 53 and 56 were much lower than 1. We may conclude that those cases were not influential enough to require remedial action.
2) Influence on All Fitted Values (Cook's distance): The Cook's distance measure considers the influence of any one case on all fitted values. The Cook's D values in Fig. 5d show that case 49 has the largest distance value, D = 0.1596, which is lower than the critical value of 0.904. It may be concluded that case 49 does not influence the regression fit.
3) Influence on the Regression Coefficients (DFBETAS): The guideline for identifying influential cases indicates that a case is influential if the absolute value of DFBETAS exceeds 1 for small to medium data sets (Neter et al. 1996). All values of DFBETAS were much lower than 1. We may claim that there were no influential cases which require remedial action.
All three influence measures (DFFITS, Cook's distance, and DFBETAS) indicate that no case is influential in the fitting of the regression function.
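The screening rules used in this subsection reduce to a few threshold comparisons, sketched below with the values quoted in the text. The helper names are hypothetical, and p is taken as 6 (five predictors plus the intercept), which is the value that reproduces the 0.214 cutoff for n = 56.

```python
def leverage_cutoff(n_cases, n_params):
    """Rule-of-thumb cutoff 2p/n for outlying X observations."""
    return 2.0 * n_params / n_cases

def classify_leverage(h, cutoff):
    """Label a leverage value relative to the 2p/n cutoff and the 0.5 guideline."""
    if h > 0.5:
        return "above 0.5"
    if h > cutoff:
        return "moderate"
    return "not outlying"

def exceeds_guideline(value, limit=1.0):
    """|DFFITS| or |DFBETAS| above 1 flags an influential case (small/medium data)."""
    return abs(value) > limit

cutoff = leverage_cutoff(n_cases=56, n_params=6)
labels = [classify_leverage(h, cutoff) for h in (0.325, 0.394)]  # cases 53 and 55

# Cook's distance for case 49 compared with the critical value quoted in the text.
cook_influential = 0.1596 > 0.904
```

A case flagged by the leverage rule is only a candidate; it calls for remedial action when the influence measures also exceed their guidelines.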
Multicollinearity Diagnostics (VIF). The variance inflation factor (VIF) is a widely used formal method of detecting the presence of multicollinearity. The largest VIF value among all X variables is often used as an indicator of the severity of multicollinearity. A maximum VIF value in excess of 10 is frequently taken as an indication that multicollinearity may unduly influence the least squares estimates. From Table 4, it is observed that all VIF values are lower than 10, indicating that there is no serious multicollinearity in the model.
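For reference, the VIF of a predictor is conventionally defined as 1 / (1 − R_j²), where R_j² is the coefficient of determination obtained when that predictor is regressed on all the other predictors. The sketch below applies the rule of thumb described above; the R_j² values are hypothetical and are not the entries of Table 4.

```python
def vif(r_squared_j):
    """Variance inflation factor from the R^2 of X_j regressed on the other X's."""
    return 1.0 / (1.0 - r_squared_j)

# Hypothetical auxiliary R^2 values for three predictors (not taken from Table 4).
aux_r2 = {"X1": 0.20, "X2": 0.50, "X3": 0.75}
vifs = {name: vif(r2) for name, r2 in aux_r2.items()}

# Rule of thumb from the text: a maximum VIF above 10 signals serious multicollinearity.
serious_multicollinearity = max(vifs.values()) > 10
```

A VIF of 10 corresponds to R_j² = 0.9, that is, 90% of the variation in one predictor being explained by the others.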

Model validation
Validation of the regression model involves the appropriateness of the variables selected, the magnitudes of the regression coefficients, the predictive ability of the model, and the like. Cross-validation is used to validate a regression model by splitting the data into two sets. The number of cases in the model-building set should be at least 6-10 times the number of variables in the pool of predictor variables (Neter et al. 1996); the other set is the model-validation set. In the present study, the data collected are not sufficient to make an equal split for the five independent variables selected to develop the regression model. Therefore, the validation data set was made smaller than the model-building data set. A set of thirty cases was used for estimating the regression model, and a set of twenty-six cases was used to validate the stability of the model. The regression coefficients of the two models can be compared for consistency in Table 5. Neter et al. (1996) stated that one means of measuring the predictive ability of the selected regression model is to use this model to predict each case in the validation data set and then to calculate the mean of the squared prediction errors, denoted by MSPR (mean squared prediction error):

$$\mathrm{MSPR} = \frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n}$$

where $Y_i$ is the value of the response variable in the ith validation case; $\hat{Y}_i$ is the predicted value for the ith validation case based on the model-building data set; n is the number of cases in the validation data set.
If the mean squared prediction error MSPR is fairly close to the MSE based on the regression fit to the model-building data set, the error mean square MSE for the selected regression model is not seriously biased and gives an appropriate indication of the predictive ability of the model (Neter et al. 1996). It was found that MSPR = 0.002572 is quite close to MSE = 0.002610, indicating that the predictive ability of the selected regression model is adequate.
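The MSPR comparison can be sketched in a few lines. MSPR is computed here as the mean of the squared differences between the observed and predicted responses over the validation cases; the function name `mspr` and the observed and predicted log durations are hypothetical, while the final ratio uses the two values reported in the text.

```python
def mspr(observed, predicted):
    """Mean squared prediction error over the validation cases."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n

# Hypothetical log durations for three validation cases (illustration only).
observed = [2.50, 2.60, 2.70]
predicted = [2.55, 2.55, 2.75]
value = mspr(observed, predicted)

# Ratio of the reported MSPR to the reported MSE of the model-building fit.
ratio = 0.002572 / 0.002610
```

A ratio near 1 suggests that the MSE of the model-building fit is not a seriously optimistic estimate of the prediction error on new cases.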

Discussion of the results
Some limitations on the applicability of the proposed model arise from the sample size and the range of building functions that it encompasses. The sample size of 56 is a limitation, but the adjusted R-squared of 0.920 indicates that the sample was sufficient to produce a significant model for predicting construction duration. The coefficient of GFA per contract duration is negative, which indicates that construction duration tends to decrease as GFA per contract duration increases for the project sample. The same outcome was obtained by Love et al. (2005). The coefficients of the other explanatory variables are positive, which indicates that project construction duration tends to increase with them.
During the construction stage, change orders may greatly affect project duration; the more change orders there are, the greater the influence. At the model-building stage of this research, the expected number of change orders was taken as the average number of change orders during the construction stage across all 56 cases. SRC buildings in Taiwan are still at an early stage of development. Because of its characteristics, the SRC construction process is far more complicated than that of a traditional RC structure and leads to more change orders. The average number of change orders is as high as 2.7, and this value appears in the model-building equations of this research. When the proposed model is used to estimate duration, the expected number of change orders may be judged from the contents of the construction design and tender documents. If no change order is expected, the number of change orders may be set to zero.
Rainy days are included in the prediction model of this research because they have received little attention in the literature. However, climate conditions do affect the productivity of construction projects, and rainy days do influence schedule implementation. Taiwan has two rainy seasons, the plum rain season and the typhoon season. When a construction project falls within these two seasons, it is difficult to keep construction on schedule. This factor is therefore singled out and treated as a variable.
The modified contract duration shown in equation 1 is the sum of the estimated rainy days and the construction days during the expected construction period of the contract. The estimated number of rainy days is the average number of rainy days over the past three years between the same start and finish dates. For example, if the expected construction duration of an SRC building is ten months, from February 1, 2009 to November 30, 2009, the estimated number of rainy days is the average number of rainy days between the same start and finish dates in 2006, 2007, and 2008. Fig. 6 illustrates the accuracy of the model, with low deviations and residuals. The result shows that the model can effectively predict the construction duration of SRC building projects.
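The rainy-day estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `estimated_rainy_days` and `modified_contract_duration`, the representation of weather records as a collection of rainy dates, and the sample dates are all hypothetical and are not taken from the study.

```python
from datetime import date


def estimated_rainy_days(start, end, rainy_dates, years_back=3):
    """Average rainy-day count over the same calendar window in prior years.

    Assumption: the window never starts or ends on February 29, since
    date.replace would raise ValueError when shifted to a non-leap year.
    """
    counts = []
    for k in range(1, years_back + 1):
        window_start = start.replace(year=start.year - k)
        window_end = end.replace(year=end.year - k)
        counts.append(sum(1 for d in rainy_dates if window_start <= d <= window_end))
    return sum(counts) / years_back


def modified_contract_duration(construction_days, rainy_days):
    """Estimated rainy days plus construction days, as described in the text."""
    return construction_days + rainy_days


# Hypothetical rainy-day records (not real weather data).
records = [
    date(2008, 3, 1), date(2008, 6, 10), date(2008, 12, 25),
    date(2007, 5, 5),
    date(2006, 1, 15), date(2006, 2, 1), date(2006, 11, 30),
]

# Expected construction period: February 1, 2009 to the last day of November 2009.
rain = estimated_rainy_days(date(2009, 2, 1), date(2009, 11, 30), records)
print(rain, modified_contract_duration(300, rain))
```

Averaging over the same calendar window keeps the seasonal pattern, so a contract that spans the plum rain and typhoon seasons receives a larger allowance than one that avoids them.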

Fig. 6. Predicted Construction Duration vs. Actual Construction Duration
To further confirm the predictive ability of the proposed model in practical application, 11 other, similar SRC construction projects were used for testing. These projects were not in the data set used for modeling because they were completed after the sampling. Their basic data and prediction outcomes are as follows: initial contract costs of US $843,750 to US $20,090,750, expected contract durations of 300 to 900 days, gross floor areas of 1,579 to 60,380 m², 3 to 15 stories, 1 to 9 change orders, and actual construction durations of 292 to 1,060 days. The percentage error of the forecast construction duration ranged from -7.37% to 3.30%. The results show that the model in this study is effective in forecasting construction duration for SRC structures. The test also revealed one case in which the contract duration had been unreasonably underestimated by the client, whereas the construction duration actually required was very close to the duration predicted by the proposed model. The model could serve clients and contractors as an objective and reliable tool for estimating the duration actually required and for evaluating the contract duration of SRC buildings.
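The error percentage used in this test can be illustrated with a short sketch. The text does not state how the error percentage is defined; the sketch assumes it is the predicted duration minus the actual duration, divided by the actual duration. The function names and the sample durations are hypothetical, and only the range of -7.37% to 3.30% comes from the text.

```python
def percent_error(predicted_days, actual_days):
    """Signed prediction error as a percentage of the actual duration.

    Assumption: (predicted - actual) / actual * 100; the paper does not
    give the formula or the sign convention.
    """
    return (predicted_days - actual_days) / actual_days * 100.0


def within_reported_range(error_pct, low=-7.37, high=3.30):
    """Check an error against the range reported for the 11 test cases."""
    return low <= error_pct <= high


# Hypothetical (predicted, actual) durations in days, not the study's cases.
cases = [(515, 500), (930, 1000), (400, 500)]
for predicted, actual in cases:
    err = percent_error(predicted, actual)
    print(predicted, actual, round(err, 2), within_reported_range(err))
```

Under this convention a negative value means the model predicted a shorter duration than was actually needed.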

Conclusions
In this study, a set of 56 SRC construction cases was used to develop a construction-duration prediction model for SRC buildings. This research identifies the significant factors that can be derived from building project characteristics and uncertain factors. A logical approach is employed to select the "good" regression model when the contract cost, the gross floor area, and the number of stories are known, while the numbers of change orders and rainy days are rationally estimated. Necessary diagnostics are adopted to examine the aptness of the model before inference. Cross-validation is used to test the appropriateness of the variables selected and the magnitudes of the regression coefficients. The MSPR is also used to measure the predictive ability of the proposed model, and the result shows that its predictive ability is adequate. Furthermore, 11 additional, newly finished cases are used to test the predictive accuracy of the model individually, and the result shows that the construction duration actually required is very close to the predicted duration. With the process derived above, clients can readily determine a suitable and applicable contract duration for an SRC building. The model can give contractors an objective basis for assessing the completion duration and for deciding what policy to implement for an SRC construction project. It can also facilitate a rapid appraisal of the effect of design changes and weather factors on the timely performance of building projects. In other words, it could allow clients and contractors to make arrangements in advance to alleviate the influence of external and internal uncertainty. Overall, it is concluded that the development process is a pragmatic approach, and that the prediction model is a useful, fast, cost-efficient, and relatively easy forecasting tool for practical construction management.