AN INTEGRATED WEB-BASED DATA WAREHOUSE AND ARTIFICIAL NEURAL NETWORKS SYSTEM FOR UNIT PRICE ANALYSIS WITH INFLATION ADJUSTMENT


Introduction
The competitive nature of the construction industry requires every contractor to prepare accurate cost estimates. Estimates become highly uncertain in countries such as Turkey, where the inflation rate is significantly high. High inflation, which is inevitably reflected in continual increases in material, labour and equipment costs, introduces a strong element of risk and uncertainty into the economic planning of construction projects. This effect is particularly evident in the construction industry, which by its nature is volatile and difficult to manage (Mohamed et al. 1994). The estimator must therefore anticipate future inflation and include it in the bid prices. Estimation is an important part of planning and an important determinant of the effort and time required to do the job. The accuracy of the estimate greatly affects the ability to deliver on time and within the constraints of the budget (Chassiakos and Sakellaropoulos 2008). The expertise of estimators is crucial in this process (Polat and Donmez 2010a). It is therefore reasonable that contractors attach great importance to training estimators, which can help to provide clients with better prices (Polat and Donmez 2010b).
In the construction industry, cost estimates are characterized by rapidly changing labour, material and special machinery costs, and by variation from one location to another. The building and construction industry does not always run smoothly and has many ups and downs. Many factors contribute to the unpredictability of this sector, and it is therefore very important to operate as safely as possible (Chassiakos and Sakellaropoulos 2008). Contractors are often compelled to offer the lowest price to gain a competitive advantage over their rivals, because in the traditional contracting environment the hit rate in competitive bidding is predominantly determined by how low a contractor can bid relative to other bidders (Nassar 2003; Skitmore and Smyth 2007; Arditi et al. 2008; Šiškina et al. 2009; Plebankiewicz 2009). Thus, price is the most important basis for differentiation amongst contractors (Polat and Donmez 2010b). An accurate estimate plays a vital role in preparing solid groundwork for the construction industry. To reduce these risk factors, accurate unit price analysis with an integrated cost index that adjusts for the effect of inflation is needed.
An estimate prepared months, or possibly a year or more, ago will need to be updated to current price levels. Knowing the future is something every company dreams of: it would help the company realize its full potential and would also ensure that it does not put a foot wrong. It is impossible to predict the future accurately. It is, however, not very difficult to predict a 'very near' outcome based on previous and current scenarios (Chassiakos and Sakellaropoulos 2008). The adjustment factors in these instances can be assessed from actual historical information on how construction prices (including materials, labour and equipment) moved over the period since the estimate was prepared. Another category of cost assessment is the forward projection of scheme prices, either to the date of tender or further in time to the completion date of the construction. Therefore the estimator must anticipate future inflation and include it in the bid prices (Adeli 1990).
To overcome these problems, the appropriate data, such as material costs, labour wages, productivity, cost indices and historical data, must be available to the cost estimator at the appropriate time. The majority of construction firms that store these data keep them in spreadsheets. Such systems are good for storing and managing data but are not very effective at integrating and analyzing them. Moreover, the data in these systems were found to be non-integrated and stored in forms and formats that made it difficult for decision makers to make quick decisions (Azhar 2005). Another reason for the low performance of these systems is the use of traditional Online Transaction Processing (OLTP) database technology. OLTP databases are suitable for answering "who" and "what" type questions, but they are not very effective in answering "what-if", "why" and "what next" type queries (Ahmad and Azhar 2005).
This paper outlines the development of a prototype system, called "DANUP", which integrates a Web-based Data Warehouse and Artificial Neural Networks for Unit Price Analysis with inflation adjustment. Data warehousing is an in-advance approach to the integration of data from multiple, possibly very large, distributed, heterogeneous databases and other information sources. In this approach, selected information from each source is extracted in advance, filtered and transformed as needed, merged with relevant information and loaded into a repository, the Data Warehouse (Theodoratos and Sellis 1999).
It is well known that information becomes a valuable resource only when it is utilized. For example, Song et al. (2002) introduced the concept of reusing building documents from the design and construction phases. Soibelman and Kim (2002) explored knowledge discovery by generalizing rules from existing databases. Chau et al. (2003) conducted research on the application of data warehousing techniques. Ahmad et al. (2004) used data warehousing in a decision support system for site selection in land development projects.
Accurate cost estimation needs a large amount of data and information, which can be obtained through the interaction and cooperation of different specialties and human resources. An important element in this interaction is the information management and communication process, which is a determining factor in the efficiency of cooperation among human resources. Thamhain and Wileman (1986) stated that communicating effectively among task groups is the third most important factor for the success of a project. To achieve this, more emphasis should be given to information technology tools. Database and internet technologies provide substantial capabilities in this direction. In this paper, web-based data warehouse technology is used in developing the proposed system to ensure the accuracy of the data used in estimation through the interaction of the different specialties and human resources involved in the project.

Web-based Knowledge Management
The essence of knowledge management is not only to produce information, but to capture data at the source, transmit it to a data warehouse, analyze it, and then communicate the information to those who can act on it. The tool that can accomplish this goal is the Internet. The Internet provides an excellent vehicle for corporate data communication and collaboration. It also facilitates business transactions and marketing exploration. The Internet interface (the Web browser) ensures that the right knowledge gets to the right people (those who can act) in the right time frame, with a focus on ease of use. In addition to the Internet, the concepts of the Intranet and the Extranet have been developed. An Intranet means that a company uses Internet technology for communication within the company; it usually has firewall protection that restricts access to internal corporate employees. The Extranet further extends the Intranet capability by allowing outside companies to gain access to selected internal corporate data (Chou and Lin 2002).

Objectives of the Research
The research aims at providing a relatively accurate and dependable source of information on unit prices in the construction industry. The main objectives of this research are:
− Reverting and keeping the historical cost data;
− Developing a reliable and easy to use source for unit price analysis that will be trusted and used in the construction market;
− Allowing contractors to see the methodology used, the assumptions made and the conditions of the study, to make it easier for them to apply these to their own cases.

Methodology & System Development
The developed system consists of three integrated models: a web-based data warehouse, an artificial neural network model, and the user interfaces and web applications. The first model is a Web-based Data Warehouse developed using Microsoft Visual Studio 2005, which establishes the main structure of the program. Microsoft Visual Studio is an Integrated Development Environment (IDE) from Microsoft which can be used to develop console and graphical user interface applications along with Windows Forms applications, web sites, web applications, and web services for all platforms supported by Microsoft Windows (Skibo et al. 2006). The main unit of data storage is a database, which is a collection of tables with typed columns. The databases of the system are created using Microsoft SQL Server 2000, a relational database management system (RDBMS) produced by Microsoft (Davidson 2001).
The second model predicts cost indices: an artificial neural network model was developed for forecasting the cost indices during the project period. The artificial neural network back propagation algorithm is implemented in the MATLAB package. After the model is trained and tested, it is used to predict the future cost indices, and the output is stored in the appropriate tables of the Data Warehouse to be used for cost adjustments.
The third model covers the analysis and interface design. The web pages of this model are designed and developed with ASP.NET, a web application framework developed by Microsoft that allows programmers to build dynamic web sites, web applications and web services (Walther 2003). The programming that connects the databases, user interfaces and web applications is done in C# (C Sharp). C# is a multi-paradigm programming language that encompasses functional, imperative, generic, object-oriented (class-based) and component-oriented programming disciplines (Walther 2003).
4.1. The Stages of Data Warehouse Architecture
A data warehouse system comprises the data warehouse and all components used for building, accessing and maintaining the data warehouse. The stages of data warehouse architecture are data acquisition from internal and external sources, data warehousing, and data access. The DANUP data warehouse architecture is shown in Fig. 1. As Fig. 1 shows, the internal and external data sources are operational databases, historical data and flat files (i.e., spreadsheets or text files). In the first stage, data extraction and integration are used to extract and transfer data from the sources. In the second stage, the data are loaded and stored into the data warehouse using various third party loaders such as SQL loader. The center of a data warehouse system is the data warehouse itself. The warehouse is then used to populate the various subject oriented data marts. A data mart is a repository of data gathered from operational data and other sources that is designed to serve a particular community of knowledge workers. In the third stage, the integrated data can be accessed by the end user for reporting and analysis requirements (Inmon and Kelley 1993).
The internal sources are the existing databases of the relevant departments in the company, including the purchasing department, plant department, estimating and tendering department and human resources department. From these databases, the required data on material costs, equipment costs, labor costs, the productivity of labor and equipment, and the amount of materials required for a unit of activity are obtained directly. The external sources are the databases of the Ministry of Public Works and Settlement of Turkey and the database of the State Institute of Statistics in Turkey. The data obtained in this way are the historical and current Building Construction Cost Indices (BCCI).
Material, labor and equipment resource types that are not included in the system can be entered using the Resource menu, as explained in Section 4.3.

The DANUP Database Structure
There are many methods to generate the unit price of an item. In this research each item is divided into three subdivisions: materials, labor and equipment. The total required quantities of each kind or class of material, labor or equipment in a unit are found and multiplied by their individual unit costs. The total costs of the three sub-heads are summed to give the estimated cost of the item of work. After a careful examination of the data required for preparing unit prices, DANUP was designed with 13 tables: six fact tables and seven dimension tables. The six fact tables, which give detailed information, are Cost, Material, Labor, Equipment, Project Information and Cost Indices; they are described in Table 1. The seven dimension tables, which provide descriptive information, are Time, Location, Description, Supplier, Category, Item and Quantity; they are described in Table 2.
Fig. 1. The DANUP Data Warehouse Architecture
After defining the fact tables, dimension tables and sub dimension tables, they are designed and the relationships between them are established. The snowflake schema is adopted in this research to allow easy and rapid transformation of data from operational databases into the data marts. The dimension tables are connected to the fact table through foreign keys. The same technique is used to connect sub dimension tables with dimension tables and the relationship between all tables is established to form a snowflake schema. The heart of the system which consists of various tables and the relationship between these tables is shown in Fig. 2.
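The snowflake arrangement described above can be illustrated with a small Python sketch that builds an in-memory SQLite database containing one fact table (Material), one dimension table (Item) and one sub-dimension table (Category), linked through foreign keys. The column names, codes and sample rows are assumptions made for illustration only; the actual DANUP tables are those described in Tables 1 and 2 and Fig. 2, and the system itself runs on Microsoft SQL Server rather than SQLite.

```python
import sqlite3

# Hypothetical, simplified snowflake schema. Column names and rows are
# illustrative assumptions, not the actual DANUP table definitions.
SCHEMA = """
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE item (
    item_id     INTEGER PRIMARY KEY,
    code        TEXT NOT NULL,
    description TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
CREATE TABLE material (
    material_id INTEGER PRIMARY KEY,
    item_id     INTEGER NOT NULL REFERENCES item(item_id),
    unit        TEXT NOT NULL,
    unit_price  REAL NOT NULL
);
"""


def build_demo_warehouse():
    """Create the demo tables in memory and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO category VALUES (1, 'Construction material')")
    conn.execute("INSERT INTO item VALUES (1, 'M-001', 'Cement', 1)")
    conn.execute("INSERT INTO item VALUES (2, 'M-002', 'Sand', 1)")
    conn.execute("INSERT INTO material VALUES (1, 1, 'ton', 90.0)")
    conn.execute("INSERT INTO material VALUES (2, 2, 'm3', 15.0)")
    conn.commit()
    return conn


def prices_by_category(conn, category_name):
    """Walk fact -> dimension -> sub-dimension through the foreign keys."""
    query = """
        SELECT i.code, i.description, m.unit, m.unit_price
        FROM material AS m
        JOIN item AS i ON i.item_id = m.item_id
        JOIN category AS c ON c.category_id = i.category_id
        WHERE c.name = ?
        ORDER BY i.code
    """
    return conn.execute(query, (category_name,)).fetchall()
```

The query joins the fact table to its dimension and then to the sub-dimension, which is the access pattern a snowflake schema implies: descriptive attributes are normalized into their own tables and reached through key chains.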
A data warehouse collects all of the data into one system, organizes the data so it is consistent and easy to read, keeps "old" data for historical analysis and makes access to and use of data easy so that users can do it themselves.
The model is developed using the database management and web development capabilities of Microsoft Visual Studio 2005. All authorized users can log in to the system, which is installed on the company's main server, through an internet connection. The user interfaces are designed to be as user friendly as possible, so that users do not need to be computer professionals to work with the system. The data warehousing capability of the program helps the company to keep a copy of every piece of work in the database. All these data are prepared for use in different reports and future projects.

Artificial Neural Networks
Contractors need to develop a contractor's cost index that can monitor the price fluctuation in a specific contract's cost (Park et al. 2010). The research focuses on the development of a framework for automated unit price analysis, cost estimating and an integrated cost index. To achieve this objective, the cost indices for the last ten years were added to the developed model, and the cost indices for the project period were predicted using an Artificial Neural Network (ANN) as the forecasting method. ANNs have been widely applied in various areas. Neural network models consist of simple computational units organized into a sequence of layers and interlinked by a system of connections. They are capable of determining the relations between the input and output parameters (Sonmez and Ontepeli 2009). An ANN is built in three layers: an input layer, one or more middle (hidden) layers, and an output layer. Each layer consists of several neurons, which are interconnected by sets of correlation weights. The input layer's neurons receive their activation from the environment, while the activation levels of neurons in the hidden and output layers are computed as a function of the activation levels of the neurons feeding into them. The information received as input is transferred to the hidden layer, and an output is produced through the transfer function. Learning (or training) takes place by adjusting the weights of the connections between neurons (Al-Tabtabai 1998).

Back Propagation Algorithm in MATLAB
Different available neural network models were investigated to find a suitable one. Many software packages implement the back propagation algorithm; however, many of them are large, need to be compiled and are sometimes difficult to understand. The MATLAB package was chosen for implementing the back propagation algorithm because the MATLAB environment is easy to use and offers numerous functional and technological facilities (Jankovski and Atkočiūnas 2010). In addition, with the graphical capability of MATLAB, the network parameters can be plotted to see what is going on inside a specific network. MATLAB is commercial software developed by MathWorks Inc.; it is an interactive software package for scientific and engineering numeric computation (MathWorks 2004; Nazari and Ersoy 1992; Rumelhart et al. 1986). The Back Propagation Network (BPN) model is used in the applied MATLAB algorithm. Among the different models, BPN is the most popular and has the highest success rate. A BPN learns by example: the algorithm is given examples of what the network is expected to do, and it changes the network's weights so that, when training is finished, the network gives the required output for a particular input. A BPN model is composed of several layers of neurons. Each layer contains a predetermined number of neurons. Every neuron in a layer connects to all neurons in the adjacent layers. The network is first initialized by setting all its weights to small random numbers. Next, the input pattern is applied and the output is calculated. Because all the weights are initially random, the calculated output generally differs greatly from the target. The error of each neuron is then calculated and used to change the weights in such a way that the error gets smaller. This part is called the reverse pass (Nazari and Ersoy 1992). The process is repeated until the error is minimal.
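The training cycle described above (random initialization, forward pass, error calculation, reverse pass, repetition) can be sketched in a few lines of plain Python. This is only an illustrative sketch and not the MATLAB code used in the research: the class name `TinyBPN`, the sigmoid transfer function, the learning rate and the stopping threshold are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPN:
    """Illustrative three-layer network: inputs -> hidden -> one output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Small random initial weights; the last entry of each row is a bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_pattern(self, x, target, lr):
        hidden, out = self.forward(x)
        # Error term of the output neuron (sigmoid derivative is out * (1 - out)).
        delta_out = (target - out) * out * (1.0 - out)
        # Reverse pass: propagate the error back to each hidden neuron,
        # using the output weights as they were before this update.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        hb = hidden + [1.0]
        for j in range(len(self.w_out)):
            self.w_out[j] += lr * delta_out * hb[j]
        xb = list(x) + [1.0]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] += lr * delta_hidden[j] * xb[i]


def mse(net, patterns):
    return sum((t - net.predict(x)) ** 2 for x, t in patterns) / len(patterns)


def train(net, patterns, lr=0.5, target_mse=1e-3, max_epochs=5000):
    """Loop over the patterns until the error falls below the target."""
    for _ in range(max_epochs):
        for x, t in patterns:
            net.train_pattern(x, t, lr)
        if mse(net, patterns) < target_mse:
            break
    return mse(net, patterns)
```

For instance, `train(TinyBPN(2, 5), patterns)` with a list of `([input_1, input_2], target)` pairs scaled to the interval (0, 1) adjusts the weights pattern by pattern and returns the final mean square error.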

Number of Used Input Groups
The cost indices calculated in this research cover all the different types of costs. In total, the prices of 295 items were gathered from the Ministry of Public Works and Settlement of Turkey. Of these 295 items, 20 relate to labor, 7 to machinery, 146 to construction material and 122 to installation material. The codes of the main groups and subgroups used in this study were formed by the Division of Construction Statistics according to the classification made by the Ministry of Public Works and Settlement.

Prediction of Building Construction Cost Indices (BCCI)
The developed artificial neural network model is used to predict the Building Construction Cost Indices (BCCI) for the project period, starting from the last published data. The available past BCCI data were obtained from the State Institute of Statistics of Turkey and divided into two sets, the training set and the test set. The MATLAB package is used for writing the code of the algorithm that trains all the neural networks. The Back Propagation Model is used in developing the networks, so a three layer back propagation neural network is created for training. It has an input layer of two neurons, a hidden layer of five neurons and an output layer of one neuron. The training data set is continuously looped through the network, and after every predefined number of iterations the test set data are passed through the evolved network to generate an output. The error of each neuron is then calculated. Training is stopped when the error falls below the target value. The total error is evaluated by adding up the errors of each individual neuron and then of each pattern in turn, as shown in Fig. 3.
The network keeps training until the total error falls to a pre-determined low target value, and then it stops. Once the network has been fully trained, the test set is used to validate the network. Two different sets of data are used for training and testing; the results obtained from the training and testing sets are shown in Tables 3 and 4 and in Fig. 4. As can be observed, the error rate converges to a relatively small Mean Square Error (MSE): MSE Training = 0.019 and MSE Test = 0.022, which was judged to be acceptable. After the network is validated, it is used to generate the predicted cost indices. The last two published BCCI values are used as inputs to generate the output for the next quarter. The newly generated BCCI and the last published one then become the new inputs. This procedure is repeated until the BCCI has been generated for the whole project period. The results obtained for the years 2005 to 2007 are shown in Table 4. The developed model performs the prediction of cost indices for various data; however, Table 4 shows only the predicted cost indices for general construction materials. The developed model can be used for forecasting the BCCI for the whole project period (Baalousha and Mohamed 2007).
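The rolling procedure described above, in which each forecast is fed back as an input for the next quarter, can be sketched as follows. The function `forecast_indices` and the stand-in predictor `linear_trend` are hypothetical names introduced for this example; in the actual system the one-step predictor would be the trained network, not a straight-line extrapolation.

```python
def forecast_indices(predict_next, last_two, n_quarters):
    """Roll a two-input, one-step predictor forward over n_quarters.

    predict_next: callable taking (older_index, newer_index) and returning
                  the index for the following quarter (a hypothetical
                  stand-in for the trained network).
    last_two:     the two most recently published index values, oldest first.
    """
    older, newer = last_two
    forecasts = []
    for _ in range(n_quarters):
        nxt = predict_next(older, newer)
        forecasts.append(nxt)
        # The newest known value and the fresh forecast become the next inputs.
        older, newer = newer, nxt
    return forecasts


def linear_trend(older, newer):
    """Assumed placeholder predictor: extend the last quarter-to-quarter change."""
    return newer + (newer - older)
```

With the placeholder predictor, `forecast_indices(linear_trend, (100, 110), 3)` returns `[120, 130, 140]`. Because every forecast after the first depends on earlier forecasts, prediction errors can accumulate over long project periods.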

Calculation of Adjustment Factors
The indices presented in this research are specifically designed for constructional works. The indices are used to escalate or inflate various project cost features to current or future price levels.
Inflation indices use a base year to generate index values, from which adjustment factors are derived. The ratio of two index values expresses the proportion between costs in different periods. The future cost of an item is calculated from its known current cost, the index value for the current period and the predicted index value for the future period, as shown in the following formula:
\[ \text{Estimated unit cost} = \text{Known unit cost} \times \frac{\text{Cost index}_{\text{quarter A}}}{\text{Cost index}_{\text{quarter B}}} \]
where: Cost index quarter A represents the predicted cost index for the quarter for which the unit costs are to be estimated; Cost index quarter B represents the cost index for the quarter from which the known unit costs are taken.
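As a worked illustration, and assuming that the adjustment factor is the ratio of the cost index for quarter A (the target quarter) to the cost index for quarter B (the quarter of the known cost), the calculation can be sketched in Python. The function names and the numbers in the usage note are hypothetical.

```python
def adjustment_factor(index_a, index_b):
    """Ratio of the target-quarter index (A) to the known-cost-quarter index (B)."""
    if index_b <= 0:
        raise ValueError("cost index for quarter B must be positive")
    return index_a / index_b


def adjusted_unit_cost(known_cost, index_a, index_b):
    """Escalate a known unit cost from quarter B to quarter A."""
    return known_cost * adjustment_factor(index_a, index_b)
```

For example, a unit cost of 200.0 known at an index of 100.0 and escalated to a predicted index of 150.0 gives `adjusted_unit_cost(200.0, 150.0, 100.0)`, which evaluates to 300.0.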

Program Menus and Interfaces
The third model of the developed system starts with an interactive user interface. The user interface is linked to the tables of the data warehouse to provide a flexible and friendly working environment. DANUP has different menus and submenus which allow the user to perform the unit price analysis of varying activities and the project cost estimation. Three types of resources, namely, material, equipment and labor, are used in the unit price analysis of each activity.
In the first stage, all types of resources are entered into the system using the "Resource" menu, as depicted in Fig. 5. To make the unit price analysis easier to understand, the resources are identified by their codes as defined in the Unit Price Analysis Book published by the Ministry of Public Works and Settlement in Turkey. The units, unit prices and any additional information available for each type of resource are entered into the system through the Resource menu.
In the second stage, after all of the resources have been defined, the unit prices are analyzed activity by activity. For a specific activity, the required resources are selected from the previously created resource list and entered into the system using the "Package Formation" menu. Each resource is added through this page one by one, as shown in Fig. 6.
The resources are described by their codes, units, quantities, unit prices and inflation adjustment factors.
Once the quantity is entered, the system automatically retrieves the unit price, unit and adjustment factor of the specified resource from the database.
The system adjusts the unit price of each resource automatically using the adjustment factor, and then the total resource price for that specific activity is calculated. The system adds up the prices of all resources required for a specific activity to find the total unit price of the activity. Since users can see how the unit price is analyzed and derived, they regard the system as trustworthy.
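The unit price calculation described above can be sketched as follows: each resource line contributes its quantity multiplied by its unit price and its adjustment factor, and the contributions are summed per resource type and overall. The `ResourceLine` class, its field names and the sample codes are assumptions for illustration and do not reflect the actual DANUP data model.

```python
from dataclasses import dataclass


@dataclass
class ResourceLine:
    code: str
    kind: str                 # "material", "labor" or "equipment"
    quantity: float           # quantity required per unit of the activity
    unit_price: float         # known unit price of the resource
    adjustment_factor: float = 1.0  # inflation adjustment (1.0 = no change)

    def total(self):
        """Inflation-adjusted price of this resource for one unit of activity."""
        return self.quantity * self.unit_price * self.adjustment_factor


def activity_unit_price(lines):
    """Return the subtotal per resource type and the total activity unit price."""
    subtotals = {"material": 0.0, "labor": 0.0, "equipment": 0.0}
    for line in lines:
        subtotals[line.kind] += line.total()
    return subtotals, sum(subtotals.values())
```

Keeping the three subtotals visible mirrors the point made in the text: the user can follow how the unit price of an activity is built up from its material, labor and equipment components.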
In the third stage, "Project Formation", the system performs the total project cost estimation. The bill of quantities is entered as input data or imported from another computer program. Each project is defined by a specific code, and the information related to the project is added by the user, as shown in Fig. 7.
In the "Project Formation" menu, the activities required for the project are entered by their codes one by one. The system automatically retrieves the total quantity from the bill of quantities and the activity unit price obtained previously in the "Package Formation" stage. The DANUP system then estimates the total cost of the project (see Fig. 7).
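The project-level calculation can be sketched as the sum, over all activities, of the bill of quantities amount multiplied by the activity unit price. The function name `project_total_cost`, the dictionary layout and the activity codes are hypothetical choices for this example.

```python
def project_total_cost(bill_of_quantities, activity_unit_prices):
    """Sum quantity x unit price over all activities in the bill of quantities.

    bill_of_quantities:   dict mapping activity code -> total quantity
    activity_unit_prices: dict mapping activity code -> analyzed unit price
    """
    missing = [code for code in bill_of_quantities
               if code not in activity_unit_prices]
    if missing:
        raise KeyError(f"no unit price analysis for activities: {missing}")
    return sum(quantity * activity_unit_prices[code]
               for code, quantity in bill_of_quantities.items())
```

For example, a project with 10.0 units of an activity priced at 55.0 and 4.0 units of an activity priced at 12.5 has a total estimated cost of 600.0. Raising an error for activities without a unit price is a design choice made here so that an incomplete analysis is not silently costed at zero.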
The user can look up any resource, activity or project information through the Search section. As shown in Fig. 8, when the code of a resource, an activity or a project is entered, the system returns all the relevant information already stored in its data warehouse.
When required, the unit price or quantity of any coded resource, activity or project can be updated or deleted using the "Update/Delete" menu, as depicted in Fig. 9.

Conclusions
Construction cost estimation is one of the most information-dependent processes. Cost estimation involves a large number of activities and requires the employment of several human resources with various specializations. Thus, communication plays a vital role in the accuracy of the cost estimation process. To overcome communication deficiencies, an integrated Web-based Data Warehouse System called DANUP has been developed. It combines database and internet technology to exploit the potential of data centric web databases in enhancing the communication process during the project development period. It can be concluded that the system brings the following main benefits:
i) The updated unit price is calculated automatically whenever there is an increase in the price of material, labor, equipment, etc.;
ii) The Web-based System improves effectiveness through an increase in intellectual specialization within a company;
iii) Data are kept up to date by modifying the system;
iv) The Data Warehouse ensures that data are measured and indexed properly, which results in faster analysis of the data;
v) Application of the system delivers crucial time savings. The time that would otherwise be spent by the estimator on thousands of repetitive hand calculations, and on calculations that require an expert's reasoning and judgment, can be used for other purposes;
vi) Unit prices can be adjusted for the future by taking inflation into account;
vii) The estimator's knowledge and expertise are available for use by other project team members;
viii) The accuracy and precision of the system are much higher than those of hand calculation;
ix) A system for trustworthy cost estimation is available.