BIG DATA IN CIVIL ENGINEERING: A STATE-OF-THE-ART SURVEY

Data generation has increased drastically over the past few years. Data management has also grown in importance, because extracting significant value from a huge pile of raw data is of prime importance when making decisions. This article reviews the concept of Big Data. The Thomson Reuters Web of Science Core Collection academic database was used to survey publications that contained the keywords “BIG DATA” and were included in the Web of Science category “Engineering”. The publications were analysed by year, country, journal, author, language and funding agency.


Introduction
Generating information from acquired data is vitally important for regulating everyday life. Business enterprises in particular need to store data and transform it quickly and properly into information bases in order to achieve objectives such as being more competitive in the market, producing new products and being innovative. As the number of data sources grows, so does the amount of data acquired. Storing and processing data has therefore become difficult, and classical approaches are no longer capable of doing it. Large amounts of widely varied data can be stored, managed and processed using Big Data technologies. In addition, Big Data delivers proper information quickly and offers advantages and convenience to firms, researchers and consumers by taking the properties of Volume, Value, Variety, Veracity and Velocity into consideration (Ozkose et al. 2015).

Understanding of Big Data
Big Data is defined differently across the literature. Several definitions are in use: (1) Big Data is the amount of data beyond the ability of technology to store, manage and process efficiently. (2) Big Data is high-volume, high-velocity and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization. (3) Big Data technologies are new-generation technologies and architectures designed to extract value efficiently from multivariate, high-volume data sets by providing high-speed capture, discovery and analysis. As these definitions suggest, certain points must be taken into consideration for Big Data sets: the data should be complex, of multiple kinds, and of considerable size. Conventional methods therefore have difficulty analysing Big Data sets, and new methods and technologies are needed (Ozkose et al. 2015).
Big Data is a term for large and complex data sets, from music downloads to medical records and social media messages. Big Data is usually described by the four V's: 1. Volume: scale of data; 2. Velocity: analysis of streaming data; 3. Variety: different forms of data; 4. Veracity: uncertainty of data (Moreno-Sandoval et al. 2015).
Big Data can be divided into five classes according to its characteristics: Data Sources (Web & Social, Machine, Sensing, Transactions and IoT), Content Format (Structured, Semi-Structured and Unstructured), Data Stores (Document-oriented, Column-oriented, Graph based and Key-value), Data Staging (Cleaning, Normalization and Transform) and Data Processing (Batch and Real time) (Ozkose et al. 2015).
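The five classes above can be written down as a simple lookup structure. The following is only a minimal sketch: the names `BIG_DATA_TAXONOMY`, `classes_of` and `describe` are assumptions introduced for illustration and do not come from Ozkose et al. (2015); only the class and category labels are taken from the text.

```python
# Illustrative encoding of the five-class taxonomy listed in the text.
# All identifiers here are hypothetical; only the labels come from the source.
BIG_DATA_TAXONOMY = {
    "Data Sources": ["Web & Social", "Machine", "Sensing", "Transactions", "IoT"],
    "Content Format": ["Structured", "Semi-Structured", "Unstructured"],
    "Data Stores": ["Document-oriented", "Column-oriented", "Graph based", "Key-value"],
    "Data Staging": ["Cleaning", "Normalization", "Transform"],
    "Data Processing": ["Batch", "Real time"],
}


def classes_of(category):
    """Return every class whose category list contains the given label."""
    return [cls for cls, cats in BIG_DATA_TAXONOMY.items() if category in cats]


def describe(dataset_tags):
    """Map each tag of a (hypothetical) data set to its taxonomy class or classes."""
    return {tag: classes_of(tag) for tag in dataset_tags}


if __name__ == "__main__":
    # A made-up data set tagged with three category labels.
    for tag, classes in describe(["Sensing", "Semi-Structured", "Real time"]).items():
        print(tag, "->", classes)
```

A label that does not appear in the taxonomy simply maps to an empty list, which makes untagged or misspelled categories easy to spot.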
Big data are worthless unless they are managed and analysed to extract useful information. Gandomi and Haider (2015) divide the overall process of extracting insights from big data into five stages, as shown in Figure 1. These five stages form two main sub-processes: data management and analytics. Data management involves the processes and supporting technologies used to acquire and store data and to prepare and retrieve it for analysis. Analytics refers to the techniques used to analyse big data and acquire intelligence from it (Gandomi, Haider 2015).
There are several methods of Big Data analysis, distinguished by the way the data are acquired: text analytics, audio analytics, video analytics, social media analytics and predictive analytics. Among these, Ozkose et al. (2015) define text analytics and predictive analytics as follows:
- Text analytics is used for information retrieval from data.
- Predictive analytics estimates the future on the basis of current or historical data. It is used to capture relationships in data and to discover patterns. Predictive analytics, which is primarily based on statistical methods, is highly applicable in many disciplines.
Big Data is used efficiently in many fields of activity, such as the automotive industry; hi-tech; the oil and gas industry; the telecommunication sector; medicine and healthcare; media and show business; the travel and transport sector; social media and online services; and the information and communication sector. Civil Engineering is one of the fields in which Big Data can be sourced and transformed into useful information.

Research methodology
In this paper, the literature on Big Data has been reviewed comprehensively on the basis of papers indexed in the Thomson Reuters Web of Science academic database. Following the methodological analysis (Fig. 2) of the entire body of collected publications, articles were reviewed from the first international publications in the area up to the present (January 2016). The research attempts to answer the following questions: (1) How are the papers distributed by year of publication? (2) How are the papers distributed by country? (3) How are the papers distributed by author? (4) How are the papers distributed by journal? (5) How are the papers distributed by funding agency?
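Each of the five questions amounts to counting records grouped by one field. The sketch below illustrates that counting step under stated assumptions: the records, field names (`year`, `country`, `journal`, `authors`, `funding_agency`) and the function `distribution` are hypothetical and do not reproduce the Web of Science export format or the data analysed in this paper.

```python
from collections import Counter

# Made-up bibliographic records; names and values are placeholders only.
records = [
    {"year": 2013, "country": "USA", "journal": "Journal A",
     "authors": ["Author 1", "Author 2"], "funding_agency": "Agency X"},
    {"year": 2014, "country": "China", "journal": "Journal B",
     "authors": ["Author 3"], "funding_agency": "Agency Y"},
    {"year": 2014, "country": "China", "journal": "Journal A",
     "authors": ["Author 1"], "funding_agency": None},
    {"year": 2015, "country": "USA", "journal": "Journal A",
     "authors": ["Author 2", "Author 3"], "funding_agency": "Agency Y"},
]


def distribution(records, field):
    """Count records per value of `field`, most frequent first.

    List-valued fields (such as authors) contribute one count per element;
    records where the field is missing or None are skipped.
    """
    counts = Counter()
    for rec in records:
        value = rec.get(field)
        if value is None:
            continue
        counts.update(value if isinstance(value, list) else [value])
    return counts.most_common()


if __name__ == "__main__":
    for field in ("year", "country", "authors", "journal", "funding_agency"):
        print(field, distribution(records, field))
```

Counting multi-author papers once per author is a design choice of this sketch; a fractional-counting scheme would give different author totals.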

Number of publications by different databases and by year
The analysis of the subject of Big Data was carried out online. In total, 5160 publications, including 1619 articles, were found in different databases (Fig. 3). Most publications were found in the Web of Science database, which contained 2664 indexed publications on the topic of Big Data (Fig. 4) as of 15 January 2016, covering all types of documents and including 1060 articles (Table 1). Table 1 also lists 590 publications on Civil Engineering, of which 139 are articles.
As depicted in Figure 3, the first scientific research on the topic of Big Data was published in 1974. Research in the area has grown rapidly during the last ten years: the number of publications on Big Data increased from one or two papers per year to 110 in 2012. More than 90 per cent of the publications appeared in the last three years (2013-2015).
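The proportion of articles among all publications follows directly from the counts reported above. The short sketch below only redoes that arithmetic; the function name `article_share` and the dictionary labels are assumptions made for illustration.

```python
def article_share(articles, publications):
    """Return the percentage of publications that are articles, to one decimal."""
    if publications <= 0:
        raise ValueError("publications must be positive")
    return round(100.0 * articles / publications, 1)


# (articles, all publications) as reported in the text.
counts = {
    "all databases": (1619, 5160),
    "Web of Science": (1060, 2664),
    "Civil Engineering": (139, 590),
}

shares = {name: article_share(a, p) for name, (a, p) in counts.items()}

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct}% of publications are articles")
```

On these counts, articles account for roughly 31 per cent of publications across all databases, about 40 per cent in the Web of Science and about 24 per cent of the Civil Engineering publications.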
As depicted in Figure 5, the first scientific research on the topic of Big Data in Civil Engineering was published in 2006. Research in the area has grown rapidly during the last ten years, with the largest output in 2013 (195 articles).
The analysis then examined the distribution of the "BIG DATA + Civil Engineering" topic by country; the results are given in Figure 6.
The authors listed in Table 2 published articles on the use of Big Data in Civil Engineering.
Table 3 lists the journals in the ISI Web of Science database that published articles on the use of Big Data in Civil Engineering. In total, the articles were published in 73 journals. The largest number of articles (15) was published in IEEE Network. The International Journal of Production Economics and IEEE Transactions on Knowledge and Data Engineering share second place, with 6 publications each.
Table 4 shows the number of publications on the use of Big Data in Civil Engineering in the Web of Science Core Collection database by funding agency. In total, the articles were funded by 84 agencies. The leader is the National Natural Science Foundation of China (20 articles), followed by the National Science Foundation with 8 publications.