MEASURING INFORMATION DEPENDENCY FOR CONSTRUCTION ENGINEERING PROJECTS

Information dependency may be the most important key for managing information exchange to reduce project risks. Studies to date have not identified an objective, quantitative surrogate for measuring information dependency. This paper suggests an approach to measure information dependency through the productivity relationships among various disciplines in heavy industrial engineering projects. As part of a Construction Industry Institute (CII) study, the authors identified the information exchange patterns of engineering disciplines. Based on these patterns, the authors used productivity relationships to discover the information dependency among various engineering disciplines and afterwards conducted a survey for validation. Both results show significant and consistent evidence suggesting that: 1) information of the equipment and piping disciplines is statistically dependent, unlike that of the other paired disciplines; and 2) productivity relationship can be a legitimate surrogate to measure information dependency between the equipment and piping disciplines. As such, this study opens a research trajectory for the improvement of engineering productivity.


Introduction
Effective exchange of information is critical to successful engineering for heavy industrial projects such as chemical manufacturing, electrical generation, gas distribution, and oil refining (CII 2008). Project engineering complexity has increased dramatically with technology development. This complexity can be attributed to process design, system integration, construction method selection and even sustainability considerations. As a result, coordination of the engineering process requires intensive collaboration among various disciplines. Given the enormous uncertainty common in many large complex projects, intensive information exchange among the different engineering disciplines creates high project risks. For sequential engineering activities, if a task fails to meet its performance expectations, it will most likely affect the performance of the subsequent tasks (Ortiz et al. 2009). Furthermore, engineering errors in one task may produce a significant amount of rework for others. Thus, to effectively allocate project contingencies and accurately predict project schedule and cost, project managers should be aware of interdisciplinary information dependency (Kim, Gibson Jr. 2003; Ortiz et al. 2009; Watermeyer 2002).
Knowledge of information dependency among different engineering tasks serves an important purpose for sequence optimization and interface management. In previous studies, organizations optimized work processes (including overlapping tasks, activity sequence reordering, etc.) to effectively compress project schedules (Austin et al. 1996; Hegazy et al. 2001; Oloufa et al. 2004; Sanvido, Norton 1994). For example, the Design Structure Matrix (DSM) is perhaps the most well-known method to optimize engineering networks and deliver products with high quality and low cost (Bashir et al. 2009); however, quantified and precise associations in an engineering network are required to make the model reliable. As a result, from a technical perspective, the application of DSM has been considered premature in the construction industry because measures of information dependency based on empirical data are lacking.

Definitions of information dependency
Many studies have attempted to define and quantify information dependency among engineering tasks. For instance, Pekericli et al. (2003) identified characteristics of information dependencies: task sensitivity, timing of created information, parties involved, frequency of communication, type and format of information, and method and bandwidth of information delivery. Based on these characteristics, the study proposed a number of factors to model information dependencies. Nonetheless, the study falls short by not using real data for validation. Zhang et al. (2006) developed an approach to measure the dependency strength of coupled tasks during new product development. In that study, the authors mathematically defined influence parameters on task output, parameter change, feedback change and expected task change. Additionally, the authors developed an equation representing dependencies with the predefined influence parameters. To simulate the impact of communication on project performance, Ortiz et al. (2009) set the probability of information exchange from the SimVision User Guide, considering the experience of the general contractors, project scope and size. With this foundation, the authors developed an approach to assist project managers in designing project networks; however, the probability value was selected according to a guide rather than based on empirical data, and therefore the results may not be conclusive. Bashir et al. (2009) developed a metric to quantify the level of complexity of projects that involve many interdependent tasks. However, the metric did not perform satisfactorily, and ultimately the authors recognized that their experiment should be performed with more diverse, actual project data.

Information dependency based on empirical data is imperative
In summary, most of the studies characterized task dependencies in terms of communication frequency as well as the amount of shared information. Engineering tasks that share a significant amount of information are highly dependent and require intensive collaboration; thus, the performance of a preceding task may affect the performance of its successors. For instance, a common engineering parameter of an oil refining plant is the nozzle specification shown on piping layouts and equipment configurations. Different types of equipment may have different nozzle specifications, and the piping layout is designed accordingly. Although these dependency characteristics have been explored in the building industry (Bashir et al. 2009; Pekericli et al. 2003), limited research has addressed engineering information dependencies for heavy industrial projects. According to Liao (2008), heavy industrial projects include oil refining plants, chemical manufacturing facilities, and power generation plants, while building projects include office buildings, laboratories, etc. In addition, engineering processes for heavy industrial projects differ from those of building projects, so the lessons learned from the building industry for information modeling, while valuable, are limited in application within the industrial sector.

Research objective and hypothesis
The objective of this research is to establish that engineering discipline information is interdependent and that productivity correlations among the disciplines can be used to establish these dependencies. Thus, the hypothesis is that relationships between the predecessor and successor engineering disciplines can be modeled through correlation analysis of engineering productivity performance for the disciplines.

Methodology
A rigorous literature review was conducted to capture knowledge related to information exchanges, and the resulting patterns were summarized. The authors then collected productivity data through CII's ongoing program, the Engineering Productivity Metric System (EPMS). Regression analyses were performed on the productivity data to examine the patterns of information flow among engineering disciplines, and dependencies of information flow were modeled with linear regression. Afterwards, a survey was conducted at CII training sessions and workshops, and the results of the regression models were compared with the survey results for validation. Lastly, conclusions were drawn and recommendations for future research were made.

The Engineering Productivity Metric System (EPMS)
In 2002, with the collaborative input of many industry experts, the Construction Industry Institute (CII) commenced development of a standardized Engineering Productivity Metric System (EPMS) for the purpose of benchmarking engineering productivity. The EPMS defines engineering productivity as the ratio of direct engineering work hours to issued-for-construction (IFC) quantities (Kim 2007). Direct engineering work hours refers to the work hours for activities such as deliverable production, site investigations, meetings, planning, constructability, engineering rework, and requests for information (RFI). Indirect engineering work hours, by CII's definition, include activities such as document control and quality management and are excluded from productivity calculations (Kim 2007).
The EPMS consists of a set of metrics for six major disciplines which account for the majority of the engineering work for industrial construction and which are often on the critical path. These disciplines are concrete, steel, electrical, piping, instrumentation, and equipment. As noted, all of the metrics are defined as engineering work hours per issued-for-construction quantity, and these quantities are measured in various units. For instance, piping is measured in linear feet and equipment is measured in pieces. The EPMS uses a hierarchical metric structure in which every discipline has its underlying metrics: Level II (discipline), Level III (sub-category), and Level IV (element). Level I is a project-level summary and is not addressed in this paper. The major advantage of a hierarchical EPMS is that engineering productivity data can be collected flexibly at various levels of detail and can be aggregated to the discipline level (Kim 2007).
Two points clarify the data used in this study. First, only the Level II metrics are utilized because of data availability. In the metric hierarchy, the lower the level, the greater the data precision; at the lower levels, however, the sample sizes become limiting. To satisfy the minimum sample size required for regression analysis, some data precision was sacrificed. Second, although the EPMS tracks concrete and steel separately, most CII companies track concrete and structural hours together as a single civil discipline. Thus, concrete and steel engineering productivity were normalized and combined into a single civil discipline for this research (Liao et al. 2009).
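As a minimal illustration of the ratio defined above, the following Python sketch computes a discipline-level metric as direct engineering work hours divided by the issued-for-construction quantity. The function name and the numerical inputs are hypothetical and are not taken from the EPMS database. Because the metric is expressed in hours per unit, a lower value indicates better productivity.

```python
def engineering_productivity(direct_work_hours: float, ifc_quantity: float) -> float:
    """Return direct engineering work hours per issued-for-construction quantity.

    The metric is an hours-per-unit ratio, so a lower value means
    that fewer engineering hours were spent per unit designed.
    """
    if ifc_quantity <= 0:
        raise ValueError("IFC quantity must be positive")
    return direct_work_hours / ifc_quantity


# Hypothetical project figures, for illustration only.
piping_ep = engineering_productivity(12_000, 40_000)  # hours per linear foot
equipment_ep = engineering_productivity(3_000, 150)   # hours per piece

print(f"piping:    {piping_ep:.2f} h per linear foot")
print(f"equipment: {equipment_ep:.2f} h per piece")
```

The two results are expressed in different units and cannot be compared directly, which is why the metrics are normalized before any cross-discipline analysis.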
Several major engineering firms have submitted their data and employed these metrics to benchmark their productivity against the EPMS database. After six years of data collection from 2002, a significant amount of engineering productivity data has been collected from various engineering organizations using EPMS. This data provides a significant opportunity to examine engineering information dependencies via productivity relationship among various disciplines.

Software used in data preparation/analyses
Data preparation is the essential foundation for effective data analysis. In this research, engineering productivity data were first stored in a secured Microsoft SQL Server 2005 database. Next, engineering productivity data tables were exported and saved as Microsoft Access files for ease of query. After further preparation, tables were exported to Microsoft Excel because of its high compatibility with statistical packages. SPSS was utilized to perform data analyses. Given the relatively small sample size compared to other research fields, a p-value of 0.1 was adopted as the acceptance level for significance tests in this study, balancing the chance of identifying a false relationship against the chance of missing a significant correlation (Bobko 2001).

The EPMS database
A total of 112 heavy industrial projects with engineering productivity data were submitted to the EPMS database from 2002 to 2008. The total installed cost of all projects is US$ 4.5 billion. Table 1 presents the distribution of these projects by respondent type, project type (process or non-process), project nature (addition, grass roots, or modernization), and also project size.
Contractors submitted the majority of data with a total of 92 projects whereas owners submitted only 20. Based on the observation of the PM team, the data disparity by respondent is primarily because contractors are better staffed to track engineering productivity and more readily have access to the data. All projects submitted were heavy industrial projects which are further classified into two major categories: process and non-process. Process projects include projects such as chemical manufacturing, oil refining, pulp & paper and natural gas processing projects. Non-process projects include power and environmental remediation projects. This taxonomy was developed based on Watermeyer's definition, which defined non-process projects as those that yield products that cannot economically be stored (Watermeyer 2002). Process projects comprise the majority of the productivity dataset with a total of 77, and the remaining 35 are non-process projects. An analysis of project nature reveals that 37 are additions, 53 are modernizations, and 22 are grass roots. In accordance with CII convention, a project with a budget greater than five million dollars is categorized as a large project. Accordingly, 68 projects were categorized as large projects (greater than five million dollars) and the remaining 44 projects were categorized as small ones (less than five million dollars).
A distribution of direct engineering work hours by discipline was also produced and is presented in Fig. 1. The piping discipline accounts for the majority of work hours with 45%, a substantially higher percentage of the total hours than other disciplines. This distribution may not be typical of most projects but is reasonable since these are industrial construction projects.

Patterns of information flows
According to Watermeyer (2002), for heavy industrial projects, equipment is either engineered or selected from catalogues provided by vendors. Once equipment information, such as installed locations or configurations, becomes available, plant layout drawings are developed. At this point, civil engineering, instrumentation (control) engineering, piping engineering and electrical engineering become involved for equipment support, process and layout engineering (Skinner 1968; Watermeyer 2002). As shown in Fig. 2, equipment is generally a long-lead item whose engineering information sits upstream, whereas piping, civil, instrumentation and electrical follow. However, the information flow among the downstream disciplines is project-sensitive. In other words, it is difficult to generalize the sequence in which these downstream disciplines exchange information.
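The flow pattern described above can be written down as a small dependency matrix. The sketch below is illustrative only: the binary encoding and the convention that a row receives information from a column are assumptions made for this example and are not the authors' representation of Fig. 2. Flows among the downstream disciplines are left empty because the text describes them as project-sensitive.

```python
DISCIPLINES = ["equipment", "piping", "civil", "instrumentation", "electrical"]

# Upstream discipline -> disciplines that receive its information.
FLOWS = {"equipment": ["piping", "civil", "instrumentation", "electrical"]}


def build_dependency_matrix(disciplines, flows):
    """Return a square 0/1 matrix where entry [r][c] = 1 means that
    discipline r depends on information released by discipline c."""
    index = {name: i for i, name in enumerate(disciplines)}
    size = len(disciplines)
    matrix = [[0] * size for _ in range(size)]
    for source, receivers in flows.items():
        for receiver in receivers:
            matrix[index[receiver]][index[source]] = 1
    return matrix


matrix = build_dependency_matrix(DISCIPLINES, FLOWS)
for name, row in zip(DISCIPLINES, matrix):
    print(f"{name:<16}{row}")
```

Replacing the binary entries with measured dependency strengths is the step that requires an empirical measure of information dependency.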

Discovery of information dependency
Given the patterns of information flow from equipment to the other disciplines, the embedded information dependency was then examined through the productivity relationships of the various disciplines, using regression rather than simple correlations. Three major steps were conducted: 1) the authors worked closely with the Productivity Metrics team (PM team), an ad-hoc committee of the CII BM&M committee, to select project characteristics as control variables for the regression analyses, making comparisons among the models more credible; 2) engineering productivity metrics were transformed and aggregated prior to regression analysis; and lastly 3) the regression models were developed between the equipment (upstream) discipline and the downstream disciplines. The relationships among the downstream disciplines were not included in this study because theoretical evidence for their information flow is insufficient.

Selection and coding of project characteristics
The authors worked with the PM team to select control variables for the regression analyses. Project type and project size were selected because they are key surrogates of engineering complexity, which significantly affects productivity (Liao 2008). Accordingly, four regression models were developed, each relating one of the four downstream disciplines to the upstream (equipment) discipline, project type and project size. The general form is given in Eq. (1), where "Downstream EP_i" denotes the engineering productivity of the i-th downstream discipline and the coefficients are estimated separately for each model:
$$\text{Downstream EP}_i = \beta_{0i} + \beta_{1i}\,(\text{Equipment EP}) + \beta_{2i}\,(\text{Project type}) + \beta_{3i}\,(\text{Project size}) + \varepsilon_i \qquad (1)$$

Transformation and aggregation of engineering productivity metrics
The EPMS consists of engineering productivity metrics with various units. For example, piping productivity (design hours per linear foot), equipment productivity (design hours per equipment piece), and instrumentation productivity (design hours per tagged device) are already discipline-level metrics. The electrical and civil disciplines, however, require aggregation from their underlying metrics. Because the distributions of the underlying metrics are positively skewed, a z-score method developed by Liao et al. (2009) was used: the data were normalized with a natural log transformation to produce a standard normal distribution and then aggregated to the discipline level. Quantile-quantile probability plots (Q-Q plots) were next utilized to examine metric normality. Through this process, five engineering productivity metrics (equipment, piping, civil, instrumentation, electrical) were prepared for further analysis.
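The transformation step can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Liao et al. (2009): each metric is log-transformed and standardized with its sample mean and sample standard deviation, and the aggregation to the discipline level is taken as an unweighted mean of the z-scores a project reports. The function names and the missing-value convention (None) are hypothetical.

```python
import math
from statistics import mean, stdev


def log_z_scores(values):
    """Natural-log transform a positively skewed metric, then standardize
    it so that the transformed values have mean 0 and standard deviation 1."""
    logs = [math.log(v) for v in values]
    mu, sigma = mean(logs), stdev(logs)
    return [(x - mu) / sigma for x in logs]


def aggregate_to_discipline(z_rows):
    """Average the available z-scores of each project across the underlying
    metrics of one discipline. None marks a metric the project did not
    report. The unweighted mean is an assumption made for this sketch."""
    result = []
    for row in z_rows:
        reported = [z for z in row if z is not None]
        result.append(mean(reported) if reported else None)
    return result


# Hypothetical hours-per-unit values for one underlying metric.
metric = [0.8, 1.1, 1.5, 2.4, 6.0]
print([round(z, 2) for z in log_z_scores(metric)])
```

After this step every metric is unitless, so metrics measured in hours per linear foot and hours per piece can enter the same regression model.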

Regression analyses
Project type and size characteristics were incorporated in the models with the transformed and normalized productivity data, and regression analyses were performed. Multicollinearity was checked using the Variance Inflation Factor (VIF) to guard against instability of the regression coefficients. All VIFs were less than two, below the rule-of-thumb threshold of four, indicating no excessive correlation between the independent variables in any model. As shown in Table 2, the equipment-piping model is significant (R² = 0.5, β = 0.5, p < 0.1): 50% of the variability of piping productivity can be explained by equipment productivity, controlling for project type and size, while the other 50% may be explained by factors not captured in the model, for instance, drawing review. The result also indicates that when equipment engineering productivity improves by 1 standard deviation (i.e. saves 2.72 engineering hours per piece of equipment), piping engineering productivity improves by 0.5 standard deviations (i.e. saves 1.65 engineering hours per linear foot of pipe). For many projects, when equipment is under development and changes take place to accommodate requirements of the project, the piping engineering team may experience a significant amount of modification to piping layout, joint, or material engineering.
Although project type and size may have a partial impact on civil and electrical engineering productivity, no statistical evidence was found to support an impact of equipment engineering productivity on the other downstream disciplines. These results indicate relatively weak relationships between equipment and these disciplines.
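The regression and multicollinearity checks described in this section can be sketched in plain Python. The study itself used SPSS; the code below is an independent illustration, and the 0/1 coding of project type and size as well as all data values are hypothetical. The VIF of a predictor is computed as 1 / (1 - R²), where R² comes from regressing that predictor on the remaining ones.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Ordinary least squares with an intercept.
    Returns (coefficients with the intercept first, R squared)."""
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - y_bar) ** 2 for t in y)
    return beta, 1.0 - ss_res / ss_tot


def vif(x_rows, j):
    """Variance inflation factor of predictor j."""
    target = [row[j] for row in x_rows]
    others = [[v for i, v in enumerate(row) if i != j] for row in x_rows]
    _, r_squared = ols(others, target)
    return 1.0 / (1.0 - r_squared)


# Hypothetical rows: [equipment z-score, process project (0/1), large project (0/1)].
X = [[-1.2, 1, 0], [-0.4, 0, 1], [0.3, 1, 1], [0.9, 0, 0],
     [1.5, 1, 0], [-0.8, 0, 1], [0.1, 1, 1], [0.6, 0, 0]]
piping_z = [-0.9, -0.1, 0.4, 0.2, 1.1, -0.7, -0.2, 0.6]  # hypothetical

beta, r_squared = ols(X, piping_z)
print("coefficients:", [round(b, 3) for b in beta])
print("R squared:", round(r_squared, 3))
print("VIFs:", [round(vif(X, j), 2) for j in range(3)])
```

Significance testing of the coefficients is omitted here because it requires the t distribution, which is not available in the Python standard library.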

Validation of the measurement of information dependency
A survey was conducted to collect feedback from industry for validation of the results. A Likert scale ranging from 1 (very weak) to 5 (very strong) was used to assess the strength of information dependency, characterized by the communication frequency and the amount of shared parameters between paired disciplines. The survey was distributed to industry practitioners at CII training sessions and workshops in 2008. A total of 60 respondents completed the survey. The major functions performed by the respondents' organizations were: engineering, 50 percent; engineering-procurement-construction (EPC), 40 percent; and other, 10 percent. Functions of the other respondents include construction management and vendor (or supplier). All the respondents have more than five years of experience in construction engineering, indicating credible feedback for this study. Average information dependency scores for the paired disciplines, as determined through the survey, are presented in Fig. 3. The results indicate that equipment-piping has the highest mean information dependency with a score of 4.15, whereas equipment-instrumentation, equipment-electrical, and equipment-civil had lower dependency ratings of 3.71, 3.65, and 3.81, respectively. A one-way Analysis of Variance (ANOVA) test comparing the means across the groups, shown in Table 3, revealed a significant difference (F = 4.75, df = 3, p < 0.1). Because a difference in group means was established (Table 4), a post hoc test with the Tukey method was applied. Pair-wise comparisons in Table 5 demonstrate that the average response for equipment-piping is significantly higher than all other responses, indicating that the experts considered that a significantly larger number of parameters is shared between the equipment and piping disciplines and thus more intensive communication/collaboration is required in this relationship than in the other paired relationships.
Comparing the results of the productivity analyses and the survey yields an interesting finding. Productivity relationships indicate that information of piping engineering is significantly dependent on that of equipment; however, no statistical evidence was discovered for the relationships between equipment and the civil, instrumentation, and electrical disciplines. The survey data demonstrate that information dependency between the piping and equipment disciplines is statistically higher than the others. Because the productivity data and survey data are consistent, productivity relationships can be regarded as a legitimate surrogate for measuring information dependency.
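The one-way ANOVA comparison can be illustrated with a short sketch that computes the F statistic from the between-group and within-group sums of squares. The Likert responses below are hypothetical and are not the survey data; with four discipline pairs the between-group degrees of freedom equal 3, as reported above. The p-value and the Tukey post hoc test are omitted because they require the F and studentized range distributions, which the Python standard library does not provide.

```python
def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df)
    for a list of groups of numeric responses."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Likert responses (1 = very weak, 5 = very strong).
responses = {
    "equipment-piping": [5, 4, 4, 5, 4, 3],
    "equipment-instrumentation": [4, 3, 4, 3, 4, 4],
    "equipment-electrical": [3, 4, 4, 3, 4, 3],
    "equipment-civil": [4, 4, 3, 4, 4, 3],
}

f_stat, df_b, df_w = one_way_anova(list(responses.values()))
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w})")
```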

Discussion
Both the regression models and the survey results demonstrate a significant dependency between the equipment and piping disciplines, suggesting that information released from the equipment discipline significantly affects piping engineering. For instance, if a change occurs in equipment engineering, piping parameters such as layout or material selections may change significantly. A prudent project manager should prioritize the equipment-piping engineering interface when allocating limited management resources. Practices such as early freezing of equipment information and precise transformation of equipment information are highly recommended to avoid unnecessary risks in heavy industrial projects.
However, productivity relationships between equipment and the other downstream disciplines (civil, electrical, instrumentation) do not show significant results. This does not mean that information between them is irrelevant. Equipment information may still affect these other disciplines; however, their engineering may not be as "sensitive" as that of the piping discipline when equipment changes occur. Nonetheless, productivity relationships among the downstream disciplines should be further explored at various levels of detail when sufficient data are available.

Conclusions and Recommendations
Information dependency is critical to engineering management, where task sequencing methodologies or the prioritization of interface management may apply. In this study, the authors conducted data analyses on productivity relationships as well as a survey, and the results were consistent. These results support the argument that productivity relationship can be a legitimate measure of information dependency, at least between the equipment and piping disciplines, and thus mark an important milestone for design research. Project managers can verify important management interfaces and allocate resources accordingly, thereby improving engineering performance. Future studies can use this approach to: 1) discover information dependencies at the element level when more data become available; and 2) develop a design structure matrix to optimize the engineering sequence at various levels with results derived from this research.