AN APPROACH OF ENSURING INTEROPERABILITY OF MULTI-DIMENSIONAL DATA WAREHOUSES FOR MONITORING OF WATER RESOURCES

- The structure of interoperable data warehouses for monitoring of water quality is proposed.
- A new approach to monitoring water reservoirs, with evaluation of the impacts of water pollution and tracking of the ecological balance, is presented.
- A representation of data structures for the evaluation of biological elements in water reservoirs is provided.
- A dataflow and a step-wise algorithm for monitoring and handling of water treatment data are proposed.
- The structures of data warehouses with a web portal connection for more adaptable monitoring and management of water resources are analysed.
- Discharges of municipal, domestic and industrial wastewater into surface waters during 2014–2018 are presented.

Abstract. The realization of a working on-line system for evaluating water pollution processes requires representing the complexity of this phenomenon. The multi-layered structure of distributed information systems operating under different responsibilities poses problems for developing an adequate decision support system (DSS) that works on-line. The aim of this research is to develop an interoperable infrastructure of multi-dimensional data warehouses working on-line for the integration of monitoring data, creating the conditions for more adequate decisions. The design approach includes the construction of a knowledge base with several layers of representation, including the domain-specific ontology, which are needed for evaluating the impacts of water pollution. The results are integrated with the structures of the ISMA database, EuroWaterNet, and the Water Resource Management Information System (WRMIS), following the requirements of the EU Water Framework Directive (2004).
The presented results on the integration of information sources and collaboration workflows help in the search for suitable indicators that reveal situations of water pollution from wastewater bodies, and can help to retrieve the main objects that influence the monitoring of effluxes in the Baltic Sea.


Introduction
Several scientific problems arise in the development of big data warehouses, such as ensuring the interoperability of multi-dimensional data structures that maintain data flows from different data sources in the water management sector; developing adequate decision support models; and synchronizing real data streams (Kadadi et al., 2014; Khouri et al., 2012). Procedures of water bodies that help to retrieve operatively working rules for the treatment of situations and for decision making still remain a problem. The development of an adaptable DSS that integrates all working information systems (ISs) is proposed for the water sector, with particular attention to the processes of wastewater pollution evaluation. Such efforts are grounded in achieving the common goals of the Helsinki Convention adopted in 1994 (HELCOM, 1994) for the countries responsible for preserving and restoring the ecosystems of the Baltic Sea.
The complexity of assessing environmental pollution situations is directly related to the evaluation criteria, the classification of water resources, the dynamics of sewage contamination processes, and the correct treatment of pollution factors at all environmental points. The requirements for a strategically integrated view of the use of distributed data warehouses are formulated on the basis of the results of previously realized projects and systems, such as the DANCEE project (Danish Co-operation for the Environment in Eastern Europe, 2004), the EU PHARE Twinning project (Andersen et al., 2002), the EuroWaterNet infrastructure (Nixon et al., 1998), and the ISMA system (Dzemydienė et al., 2016). Requirements remain for a multi-component infrastructure of systems working on-line that supports the synchronous and asynchronous activities of the responsible organizations. This poses several challenges: how to integrate data warehouses, how to provide complex semi-automatic e-services, and how to influence control procedures.
The Baltic Sea Action Plan (BSAP) focuses on activities that help to reduce the pollution loads of nitrogen and phosphorus compounds, which have become the main cause of eutrophication processes in the Baltic Sea (HELCOM, 2007); it was updated for the new programming period on 18 December 2020 (HELCOM, 2021). The BSAP is an ambitious program for restoring the good ecological status of the Baltic marine environment by 2021. However, the assessment of situations is complex, because it requires on-line monitoring of water bodies and evaluation of sampling data according to the measured concentrations of water pollutants. Statistical analysis shows that the concentrations of nitrogen and phosphorus compounds are growing and contribute to a higher incidence of algal blooms and other related processes.
The Baltic Sea is divided into seven parts, each with its own reduction targets and maximum allowable pollution per country. The major part of Lithuania's pollution (more than 85%) enters the Baltic Proper, and the rest enters the Gulf of Riga. These Lithuanian commitments were transferred to the Water Development Program for 2017-2023, approved by Resolution No. 88 of the Government of the Republic of Lithuania of 1 February 2017.
The aim of this research is to develop the infrastructure of a framework that can integrate several kinds of heterogeneous sources of monitoring data, and to analyse the influence of factors on the assessment of water quality in some regions of the Baltic Sea. We assume that creating a domain-specific ontology can help in water treatment processes and become an important instrument for achieving the interoperability of distributed data warehouses. An important question arises: how to create a multi-level computer-based ontology. This question shapes the main objectives:
- to provide a conceptual description of commonly understandable and unifying data structures of primary data (as the primary level of the ontology);
- to develop the conceptual structures of meta-models, common to all participating authorities, of the functioning of integrated data warehouse structures (as repositories of big data storages); such structures become the basis for developing the second level of the domain ontology;
- to develop the data obtaining and transferring algorithms for an interchangeable web-based working area;
- to reveal the data processing patterns for decision support and to represent them in algorithms with the properties of synchronous and asynchronous processing of event activations.
Previously obtained results on the construction of the multi-component infrastructure of the framework are presented in articles that unify a network of sensor-based structures, such as buoys working on a wireless internet platform in the waters of the Baltic Sea (Gricius et al., 2015), and that present the results obtained from the previously proposed Water Resource Management Information System (WRMIS) structure with the implementation of a multi-dimensional infrastructure for information support (Dzemydiene & Maskeliunas, 2011; Dzemydienė et al., 2015).
The WRMIS was developed for the retrospective analysis of contamination processes of water resources. However, some aspects of the evaluation of situations, including diagnostic possibilities, have to be analysed in more detail. In this research, the components of the decision support information infrastructure are described with the aim of assisting in the evaluation of wastewater contamination. Some issues in the development of a computer-based ontology are described, which help in the search for indicators that allow a more accurate assessment of water pollution problems. We assume that the proposed methodology (approach) for creating a multi-level structure of the domain-specific ontology is quite novel in comparison with previously described works. The results show examples of the design of meta-data models of the data warehouse repositories through the integration of several kinds of workflow models of processes. The efforts are directed at integrating the different data warehouses, which are distributed according to the different responsibilities of institutions, for more adequate on-line decision support.

Related works
Efforts to develop a unified, coherent framework for solving interrelated problems of water pollution in the area of water environment protection accelerated with the implementation of the EU Water Framework Directive (COM, 2012; Maia, 2017). The goals of the sustainable development strategies of Agenda 21 (United Nations Division for Sustainable Development, 1992) and Agenda 2030 (United Nations General Assembly, 2015) inspired our work. The strategy for sustainable development provided by the United Nations (UN) is also intended to "influence the development of more effective means, for ensuring the growth of the economy, but with restriction of some activities of enterprises, which will held the increasing of usage of natural resources, and will not correspond for requirements of not exceeding of the limits of environmental pollution" (Agenda 2030 (United Nations General Assembly, 2015)). National governments try to meet the requirements of the EU and the UN in order to preserve a cleaner environment with higher-quality water resources. To realize these plans, it is important to use the full set of means, including state, municipal and private activities and EU structural funds for environmental protection, and to work more effectively (Ministry of Environment of the Republic of Lithuania, 2015; The Government of the Republic of Lithuania, 2020).
The development of the unified framework for the water management sector includes the components of the European Environment Information and Observation Network responsible for the ReportNet structure, i.e. the European network of data collecting and reporting (EIO-NET), which reports statistical data at different levels for national and European statistical infrastructures. The development work started with complementary mandates and possibilities for mutually beneficial cooperation, aiming at streamlined data collection, a reduced reporting burden, and multiple use of the data provided by EU Member States. Efforts to integrate the structures for data and information flows were made in different research projects, such as DANCEE (Danish Co-operation for the Environment in Eastern Europe, 2004), the EU PHARE Twinning project (Andersen et al., 2002), and the European environment information system Reportnet (Saarenmaa et al., 2002), by implementing innovations in strategic co-ordination actions (Swanson et al., 2004). All EU countries provide data about specific environmental changes to the ReportNet structures and to the Eurostat system.
The levels of representation of the dynamic aspects of observable processes (i.e., monitoring subsystems) are connected with the WRMIS and other systems, which help to realize communication between data warehouses (DWs) in multiple-objective decision making processes (Dzemydiene & Maskeliunas, 2011; Dzemydienė et al., 2015). However, the implementation of the WRMIS infrastructure revealed some unsolved issues, especially in the interoperability of data warehouses during monitoring processes. In this research we try to develop some additional means for solving the interoperability problems at the level of synchronous and asynchronous transfer of data into and from the data warehouses.
Factors arising from air and soil pollution are directly related to the quality of open water reservoirs and also influence the quality of groundwater bodies. With respect to such influencing factors, some works can be mentioned in the area of development of DSS for air pollution, as well as the works of Marčiulaitienė et al. (2016) and Mėžinė et al. (2019) on the analysis of pollution in the Baltic Sea region. The ideas of the MULINO project (Mysiak et al., 2005) are helpful in the area of environment protection, paying more attention to concrete aspects of water resources management.
A methodology for designing environmental DSS is proposed by Poch et al. (2004). Interesting decision support tools for waste management and treatment are proposed by den Boer et al. (2007). Proposals for the allocation of water quality monitoring stations in a water region of Brazil are described by de Souza Fraga et al. (2019).
An approach to evaluating the quality of surface water using GIS and a heavy metal pollution index (HPI), with a model of the recognition process for a coal mining area of India, is presented by Tiwari et al. (2015). How pipe replacements in the network influence the quality of the water supply network is presented in the study by Van Dijk and Hendrix (2016). These related studies influenced the description of some additional features of the recognition and treatment components in our system.
Research efforts have also been made in the development of on-line monitoring network infrastructures, combining such possibilities with artificial intelligence systems and smart services. On-line monitoring facilities, extended with the multiple functions of smart services, can help to remove limitations on developing an adaptable DSS in the future for more operative protection of the water sector.

The requirements for infrastructure of information systems in wastewater treatment sector
The principal requirements for developing the main components of the unified framework for water management and treatment are given below. The extension of the functionalities of the Water Resource Management Information System (WRMIS) is based on the following main principles:
- The techniques and the architecture selected should comply with the directions of development not only in Lithuania, but also in the EU. This includes compliance with the guidelines and requirements identified by the European Environmental Agency (EEA) on the future architecture of European environmental reporting.
- Users should have access to data through the Internet and Intranet.
- Access to the DW should be organized by establishing the WRMIS Portal, with proper links to the participating data warehouses.
- The repository of the unified framework data should be stored in one place, and operations on the databases should be performed by means of transactions. This requirement ensures that all information generated by the system is always based on the best available data.
- The various types of data providers to the system require that special software be available for them. This can be achieved by implementing a Client/Server system based on services that include the participating data warehouses.
- The fact that data are stored in a unified place and are operated on the Internet/Extranet platform implies that the databases and allocated tools have to be secured by restricted access.
The distribution of the database systems is based on the integration facilities of the developing unified data warehouse, with possibilities for the provision of web-based services.
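The transaction requirement above can be illustrated with a minimal sketch. It assumes an SQLite store and a hypothetical `measurement` table; the table layout, site identifiers and values are illustrative only and are not taken from the WRMIS design. The point is only that a batch of monitoring values is either stored completely or not at all.

```python
import sqlite3

def store_measurements(conn, rows):
    """Insert a batch atomically: either every row is stored or none is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO measurement (site_id, determinand, value) "
                "VALUES (?, ?, ?)",
                rows,
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Hypothetical schema; the CHECK constraint rejects negative concentrations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    " site_id TEXT NOT NULL,"
    " determinand TEXT NOT NULL,"
    " value REAL NOT NULL CHECK (value >= 0))"
)

# Illustrative values only.
ok = store_measurements(conn, [("R001", "total_N", 2.4), ("R001", "total_P", 0.11)])
# The second batch contains an invalid value, so the whole batch is rolled back.
rejected = store_measurements(conn, [("R002", "total_N", 1.9), ("R002", "total_P", -1.0)])
count = conn.execute("SELECT COUNT(*) FROM measurement").fetchone()[0]
```

With this arrangement a partially delivered batch never becomes visible to reports, which is one way to read the requirement that information should always be based on the best available data.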
The main structure in Figure 1 represents the communication process between institutions and the distribution of data between the warehouses, which work under the responsibility of the web portal with specific server applications. The portal shall serve as the main source of environmental data, and it is proposed that this part of the application be placed in the Ministry of Environment and maintained by the staff of the Information Management Division.
Such an infrastructure is recommended because it will improve both the quality of the data and the value of the reports. It will eliminate some of the tedious and time-consuming work associated with collecting data from different sources and bringing them into the required format so that they can be used in a common context. The WRMIS will also make it possible to establish easy and flexible procedures for data capture.
The future reporting and administration of the water bodies will be complex, and it will be necessary to combine information from many different institutions and sectors. It is therefore desirable that the data be integrated in a system from which reliable information and raw data are easily accessible.
The collaboration processes integrate the databases, data warehouses and information systems of the stakeholders in the water management and wastewater treatment sector, i.e. the responsible institutions of the EU and Lithuania, according to the requirements of the EU Water Framework Directive (COM, 2012).
The main information flows and collaboration processes of such infrastructure are presented in Figure 1.
The WRMIS prototype provides users with information on: water quality in rivers and lakes; outlets from Waste Water Treatment Plants (WWTPs) and industries (only urban WWTPs > 2000 p.e.); groundwater; river basins; and protected areas.
The system makes it possible to combine these data.
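As a hedged illustration of combining these data, the sketch below groups WWTP outlets by river basin and applies the 2000 p.e. threshold mentioned above. The record layout, the field names and the example plants are assumptions made for the illustration and do not come from the WRMIS prototype.

```python
from collections import defaultdict

PE_THRESHOLD = 2000  # only urban WWTPs above 2000 p.e. are covered by the prototype

def outlets_by_basin(wwtps, threshold=PE_THRESHOLD):
    """Group the names of reportable WWTPs by the river basin they discharge to."""
    grouped = defaultdict(list)
    for plant in wwtps:
        if plant["population_equivalent"] > threshold:
            grouped[plant["river_basin"]].append(plant["name"])
    return dict(grouped)

# Hypothetical records, for illustration only.
plants = [
    {"name": "Plant A", "river_basin": "Basin 1", "population_equivalent": 15000},
    {"name": "Plant B", "river_basin": "Basin 1", "population_equivalent": 1200},
    {"name": "Plant C", "river_basin": "Basin 2", "population_equivalent": 2500},
]
result = outlets_by_basin(plants)
```

Here `result` keeps Plant A under Basin 1 and Plant C under Basin 2, while Plant B falls below the threshold and is left out.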

Structures of algorithms for monitoring and data handling according to the requirements of WFD
An important objective is to develop a commonly understandable structure of the monitoring data and to integrate such data into a decision making process that proceeds quite operatively. The proposed algorithms are presented in pursuit of the objective of ensuring qualitative monitoring of water quality and environmental pressures (Figure 2 and Figure 3).

Figure 1. The simplified structure of communication between the main components of the WRMIS prototype system
The algorithm has a stepwise structure for the monitoring and handling of data (Figure 2). The steps are organized according to the requirements and recommendations of the Water Framework Directive (COM, 2012), through the cycles of elaborating and revising the River Basin Management Plan (RBMP) and the corresponding data needs. The algorithm ensures that the right steps are taken for data handling and data assessment by providing target information that helps in the identification and management of environmental problems.
The WRMIS is elaborated to fulfil the following requirements:
- to create an appropriate infrastructure for easy data flow between the institutions;
- to select trusted data protection means for secure data storage and transfer;
- transparency;
- scalability and flexibility;
- minimization of redundant data;
- facilities to retrieve data for reporting;
- to develop the structure and possibilities for making data publicly available.
If decisions are taken on the basis of unreliable information, the result in many cases is unnecessary, costly or ineffective measures.
The data flow structure between the authorities for the monitoring and management of data objects is presented in Figure 3. The structure of the workflows is prepared according to the requirements of the WFD and the results of the WRMIS prototype development, using the results of some EU-funded projects and their component views, such as the DANCEE project (Danish Co-operation for the Environment in Eastern Europe, 2004) and the EU PHARE Twinning project (Andersen et al., 2002).
The developed algorithm serves as the data processing pattern for the decision support process and can be understood as a computer-based protocol for the synchronous and asynchronous processing of monitoring data for open water reservoirs. It influences the future development of more detailed algorithms. Implementing a base of commonly recognizable rules in the recognition processes that define the status of water quality becomes the aim in creating a more automated system.
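The distinction between synchronous and asynchronous processing can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol of the paper: the source names, the record layout, the delays and the validation rule are all hypothetical. Transfers from several sources run concurrently, while each received record is validated in an ordinary synchronous step.

```python
import asyncio

def validate(record):
    """Synchronous step: accept only records with a numeric, non-negative value."""
    value = record.get("value")
    return isinstance(value, (int, float)) and value >= 0

async def fetch_from_source(name, records, delay):
    """Stand-in for an asynchronous transfer from one institution's database."""
    await asyncio.sleep(delay)  # simulates network latency
    return name, records

async def collect(sources):
    # All transfers are awaited concurrently; gather keeps the input order.
    batches = await asyncio.gather(
        *(fetch_from_source(name, recs, delay) for name, recs, delay in sources)
    )
    return {name: [r for r in recs if validate(r)] for name, recs in batches}

# Hypothetical sources and values, for illustration only.
sources = [
    ("REPD-1", [{"value": 2.1}, {"value": None}], 0.02),
    ("REPD-2", [{"value": 0.4}], 0.01),
]
accepted = asyncio.run(collect(sources))
```

A slow source does not block the others in this arrangement, and invalid records are dropped before anything reaches the warehouse.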
The algorithm includes the classification of environmental regions and protection requirements, which makes the treatment more complex. Implementing such a classification in a more automated decision support process is very important for the recognition of context information. The development of more detailed algorithms is an objective for our future research.
It is also important to specify in more detail the responsibilities within the data flow and workflow processes. We present the structure of such data flows with detailed responsibilities for monitoring data, for reporting data about water quality, and for other important data in the whole water management and treatment process (Figure 3).
The data flow structure presented in Figure 3 is quite general. The authorities responsible for monitoring water reservoirs follow this structure, but for retrieving more concrete information the conceptual models require more detailed structures of multi-layered ontological representations.

Representation of knowledge base structures by patterns of domain specific ontology
The reporting and administration of the water bodies is complex, and it is necessary to combine information from many different institutions and sectors. It is therefore desirable that data be integrated in a system from which reliable information and raw data are easily accessible. An information management system has to be prepared for data exchange between different Lithuanian institutions and international institutions.
The structures of repositories working on-line are based on macro-level models and ontologies. At the level of developing the domain-based ontology for data warehouses, the requirements of interoperability of conceptual structures need to be realized. Consequently, the

[Figure labels: Monitoring stations; Sampling points; Determinands / parameters; Dates; Limit values; Protected areas; Pressure; Management of Water Resources; Obligations]

Pollution by other substances identified as being discharged in significant quantities into the body of water.
An example of the representation of data structures of the primary level of the domain-specific ontology of water pollutants, which are needed for implementation in repositories, is developed by representing the object classes for the evaluation of biological elements of water reservoirs, and is illustrated in Figure 4 using the Unified Modelling Language (UML).
The quality elements applicable to artificial and heavily modified surface water bodies shall be those applicable to whichever of the four natural surface water categories (i.e., rivers, lakes, transitional waters and coastal waters) most closely resembles the heavily modified or artificial water body concerned.
More detailed interoperable structures are developed for the synchronous implementation of activities with data flow elements, and the water quality monitoring data ontology was developed (Figure 5) and applied.
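A primary-level ontology of this kind can be approximated in code by a few linked classes. The sketch below is an assumption-laden illustration: the class names follow concepts mentioned in the text (monitoring stations, sampling points, determinands, limit values, dates), but the attributes, the identifiers and the numeric values are hypothetical and are not taken from Figure 4 or Figure 5.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class Determinand:
    code: str
    unit: str
    limit_value: float  # hypothetical threshold used only for this sketch

@dataclass(frozen=True)
class Sample:
    determinand: Determinand
    taken_on: date
    value: float

    def exceeds_limit(self):
        return self.value > self.determinand.limit_value

@dataclass
class SamplingPoint:
    point_id: str
    samples: list = field(default_factory=list)

@dataclass
class MonitoringStation:
    station_id: str
    water_category: str  # rivers, lakes, transitional waters or coastal waters
    sampling_points: list = field(default_factory=list)

    def exceedances(self):
        """Return every sample at this station whose value is above its limit."""
        return [s for p in self.sampling_points for s in p.samples if s.exceeds_limit()]

# Illustrative objects only.
nitrate = Determinand(code="NO3", unit="mg/l", limit_value=50.0)
point = SamplingPoint("SP-1", [
    Sample(nitrate, date(2018, 5, 14), 12.0),
    Sample(nitrate, date(2018, 6, 11), 61.5),
])
station = MonitoringStation("ST-1", "rivers", [point])
flagged = station.exceedances()
```

Keeping the limit value on the determinand rather than on each sample means that one definition is shared by every station that reports that determinand.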
The assessment of pressures in surface water bodies is based mainly on: -  The usage of computer-based, domain-specific ontology for representing the conceptual structures of water pollutants and water quality monitoring data ontologies are helpful in all data collection points, distributed water quality monitoring databases and WRMIS portal. Ontology ensures smooth integration of data across all components of the Water Resource Management Information System.
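One simple way to picture how a shared ontology supports integration is a mapping from each source's local field names to common terms. The sketch below assumes two hypothetical source formats (`repd_local` and `lab_export`) and a hypothetical set of four common terms; none of these names come from the WRMIS ontology itself.

```python
# Common vocabulary shared by all components (hypothetical terms).
ONTOLOGY_TERMS = {"site_id", "determinand", "value", "unit"}

# Per-source mappings from local field names to the common terms (hypothetical).
SOURCE_MAPPINGS = {
    "repd_local": {"station": "site_id", "param": "determinand",
                   "result": "value", "units": "unit"},
    "lab_export": {"SiteCode": "site_id", "Analyte": "determinand",
                   "Conc": "value", "Unit": "unit"},
}

def to_common(record, source):
    """Translate one source record into the shared vocabulary."""
    mapping = SOURCE_MAPPINGS[source]
    unified = {mapping[key]: val for key, val in record.items() if key in mapping}
    missing = ONTOLOGY_TERMS - set(unified)
    if missing:
        raise ValueError(f"missing ontology terms: {sorted(missing)}")
    return unified

# Illustrative records only.
a = to_common({"station": "R001", "param": "total_N", "result": 2.4, "units": "mg/l"},
              "repd_local")
b = to_common({"SiteCode": "R001", "Analyte": "total_N", "Conc": 2.4, "Unit": "mg/l"},
              "lab_export")
```

Both records end up with identical keys, so downstream components can compare or merge them without knowing which institution supplied them.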

Assessment of influencing factors on surface water quality in open reservoirs
Environmental monitoring in Lithuania (including water monitoring) is carried out according to the State Environmental Monitoring Program (The Ministry of Environment of the Republic of Lithuania, 1998) and following Directive 2000/60/EC (The European Parliament and the Council of EU, 2020). The Program specifies the main monitoring principles, goals, structure, sites, parameters, measuring units, sampling and reporting rates, responsible institutions, data usage, etc.
The data on water quality are collected from many sources. There are eight main REPD water quality data collection points under the supervision of the Environmental Quality Assessment and Modelling Division of the JRC. The unifying water quality monitoring database, named "VANMON", stores data from 103 sites, primarily in rivers but also in a few lakes. In all REPDs most of the data are captured in similar local databases. Four times a year the data are transmitted to the JRC, into the central warehouse of databases.
The main task in providing on-line analytical support and enabling smart services is to define a clear and adequate structure for the repository of the unified data warehouse, which works as a mainframe server. Data from various online transaction processing (OLTP) applications and other sources are selectively extracted from the different databases and organized in the data warehouse for use by analytical applications, which are adapted to the different user roles as defined by the limits of their responsibilities.
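The two ideas in this paragraph, selective extraction from OLTP sources and role-dependent access, can be sketched together. The role names, dataset names and record fields below are hypothetical and serve only to make the sketch runnable; they do not describe the actual WRMIS roles.

```python
# Hypothetical roles and the datasets each may read.
ROLE_DATASETS = {
    "repd_operator": {"surface_water"},
    "jrc_analyst": {"surface_water", "wastewater", "groundwater"},
}

def extract(oltp_rows, datasets):
    """Selective extraction: keep wanted datasets, drop empty values and extra fields."""
    return [
        {"dataset": r["dataset"], "site_id": r["site_id"], "value": r["value"]}
        for r in oltp_rows
        if r["dataset"] in datasets and r.get("value") is not None
    ]

def query(warehouse, role):
    """Return only the rows that the given role is allowed to see."""
    allowed = ROLE_DATASETS.get(role, frozenset())
    return [row for row in warehouse if row["dataset"] in allowed]

# Illustrative operational rows only.
oltp_rows = [
    {"dataset": "surface_water", "site_id": "R001", "value": 2.4, "operator_note": "ok"},
    {"dataset": "wastewater", "site_id": "W010", "value": 35.0, "operator_note": ""},
    {"dataset": "wastewater", "site_id": "W011", "value": None, "operator_note": "pending"},
]
warehouse = extract(oltp_rows, {"surface_water", "wastewater"})
operator_view = query(warehouse, "repd_operator")
analyst_view = query(warehouse, "jrc_analyst")
```

An unknown role receives an empty result in this sketch, which is the conservative default for restricted access.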
Analyses are made on both water samples and sediment samples. General chemical analysis of water samples is performed at the regional level, but chemical analysis of heavy metals and organic pollutants is performed mainly in the laboratories of the JRC in Vilnius and Klaipėda, which are equipped for sufficient quality assurance.
Surface water pollution is the process of discharging household and industrial wastewater into surface water bodies.
Water pollution indicators, their appropriate choice and the assessment tools should be taken into account when looking for appropriate pollution accounting methods and the related decision-making processes. Without the creation and modification of wastewater treatment plants and sewage collection networks, and without further measures to reduce agricultural pollution, pollution will not be reduced to the required level.
Environmental monitoring in Lithuania (including water monitoring) is carried out according to the National Environmental Monitoring Program of the MoE (approved by Governmental Resolution No. 27 of 1998 07 01). It specifies the main monitoring principles, goals, structure, sites, parameters and measuring units, sampling and reporting rates, responsible institutions, data usage, etc. Data on the monitoring of surface waters are stored in the VANMON database. Data flows for the VANMON database with river and lake monitoring data comply with the Environmental Monitoring Program.

Figure 5. The scheme of water quality monitoring data ontology for data collections of WRMIS warehouses

The ordinary data flow cycle of monitoring data to be stored in the VANMON database is as follows:
- Monthly sampling at about 103 sites (primarily in rivers, but also in a few lakes) by eight REPDs;
- Sending of monitoring samples to REPD laboratories and, in more complicated cases (i.e. heavy metals, pesticides, phenols, etc.), to the laboratory of the Environmental analysis service of the JRC of the MoE in Vilnius;
- The JRC laboratory sends the analysis results to the corresponding REPD;
- All REPDs send quarterly monitoring data (by e-mail and, at the same time, as a signed paper version) to the Environmental quality assessment division of the JRC of the MoE.
Data received from the REPDs are stored in the VANMON database. The data flows for the DB of water consumption and emissions of the JRC (which covers annual data on wastewater discharges and use of water, rivers, wastewater treatment plants, and industries) are regulated by "Regulations of data collection, database and of summaries forming, annual report preparation for statistical accountability forms No. 1 -Water, No. 2 -Air, No. 3 -Waste", approved by MoE order No. 150 of 17.10.1996.
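The data flow cycle described above can be sketched as two small functions: one that routes a sample to the appropriate laboratory and one that groups monthly results into quarterly batches for reporting. The group names, record fields and dates are assumptions for the illustration; the set of "more complicated" groups lists only the three named in the text, although the text indicates there are others.

```python
from collections import defaultdict
from datetime import date

# Only the groups named explicitly in the text; the real list is longer ("etc.").
CENTRAL_LAB_GROUPS = {"heavy_metals", "pesticides", "phenols"}

def route_sample(determinand_group):
    """Choose the laboratory for a sample according to its determinand group."""
    if determinand_group in CENTRAL_LAB_GROUPS:
        return "JRC laboratory"
    return "REPD laboratory"

def quarter_of(day):
    """Map a calendar date to its quarter number (1 to 4)."""
    return (day.month - 1) // 3 + 1

def quarterly_batches(results):
    """Group monthly results by (year, quarter) for the quarterly report."""
    batches = defaultdict(list)
    for r in results:
        key = (r["sampled_on"].year, quarter_of(r["sampled_on"]))
        batches[key].append(r)
    return dict(batches)

# Hypothetical monthly results, for illustration only.
results = [
    {"site_id": "R001", "group": "nutrients", "sampled_on": date(2018, 1, 15)},
    {"site_id": "R001", "group": "heavy_metals", "sampled_on": date(2018, 3, 12)},
    {"site_id": "R001", "group": "nutrients", "sampled_on": date(2018, 4, 16)},
]
batches = quarterly_batches(results)
labs = [route_sample(r["group"]) for r in results]
```

In this example the January and March results fall into the first quarter of 2018 and the April result into the second, and only the heavy metals sample is routed to the JRC laboratory.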
As an illustration, the distribution of discharges of municipal, domestic and industrial wastewater into surface waters by counties of the Republic of Lithuania during 2014-2018 (in thousands of m³ of water effluxes) is presented in Table 1. The analysis is based on official data of the Department of Statistics of the Republic of Lithuania (Official Statistics Portal, 2019). Most of the pollution of water bodies is caused by agricultural activities, whose negative impact has not been significantly reduced so far (Resolution No. 1247 of the Government of the Republic of Lithuania, 16.09.2009).
The main goal is to reduce the pollution of the Baltic Sea; it is based on the reduction of sewage pollution and soil pollution indicators, according to the requirements of the Baltic Convention, which covers all countries located around the Baltic Sea. These indicators must be understood in the same way by all countries when assessing pollution. According to its HELCOM commitments, the Republic of Lithuania should reduce some kinds of emissions to the Baltic Sea. For example, in 2016 Lithuania set a goal to reduce wastewater pollution by achieving certain quality results: it is planned to reduce the quantities of some components that are important for the pollution of the Baltic Sea, which would amount to up to 8970 tons of nitrogen and 1470 tons of phosphorus in wastewater effluxes by 2021, by following the state-  Figure 6.
Forecasts for achieving adequate wastewater quality are not encouraging, given the current impact of soil contamination. The monitoring of soil contamination in EU Member States can cover about 250,000 potentially polluting operational sources. If the current trends in soil contamination continue across the EU, studies indicate that the amount of polluted soil that must be cleaned will increase by 50% by 2025. These factors also influence the contamination of water bodies (WBs). The national reports of EU countries show that, for example, heavy metals and mineral oil are the most commonly found components of soil contamination, while mineral oils and chlorinated hydrocarbons are the most commonly found pollutants in groundwater. The WBs belonging to the intermediate Curonian Lagoon and coastal Baltic Sea categories are not in good condition. The results of the state monitoring show that as many as 51% of the river water bodies and 40% of the lake water bodies do not meet the criteria for good status. The amounts of polluted sewage effluxes drained into open WBs by counties of Lithuania during 2014-2018 are shown in Figure 7.
After the modernization of wastewater treatment plants in the major cities of Lithuania, the impact of point pollution sources (urban or corporate wastewater) on the status of water bodies has significantly decreased. However, this has not happened with pollution from agricultural fields (diffuse sources): at present, diffuse pollution is the biggest source of water pollution, and its importance and impact, as the data show, are increasing.

Conclusions
The presented results give a picture of the complex problems under consideration. The infrastructure of the decision support system for wastewater evaluation and situation diagnosis is complex and multi-dimensional, and work on it is ongoing. The developed approach should be implemented to meet the requirements of interoperability of distributed data warehouses in water resource management processes; it contains the proposal to develop a multi-layered computer-based domain ontology with patterns of data obtaining processes. The practical implications of the presented WRMIS concern the decisions made by the responsible institutions of Lithuania and of the other HELCOM member countries responsible for the protection of the water and marine environment of the Baltic Sea.
The novelty of the obtained results lies in the development of knowledge base structures and DW maintenance algorithms, which help in assessing the monitoring data requirements in the wastewater sector by providing an integrated collaboration infrastructure of data warehouses and supporting their interoperability. According to the results of the analysis of sewage accounting and pollution indicators, pollution and the impact of economic actors on pollutants in the Baltic Sea region are not decreasing. Legislation should provide for more effective ways of taxing polluting activities. More attention should be paid to strategic and tactical environmental planning and to the operational control of the economic and ecological balance. It is recommended to monitor the impact of pollution, to regularly monitor the operation of cleaning systems, and to keep track of the indicators of ecological balance.
The European Environment Information and Observation Network is responsible for ReportNet, the infrastructure for supporting and improving data and information flows. All countries provide data about specific environmental changes. The Lithuanian Geological Survey, the Lithuanian Hydrometeorological Service, the Ministry of Environment of the Republic of Lithuania, and the Marine Research Centre are the institutions responsible for providing and maintaining such data in Lithuania. The Water Resource Management Information System combines a multi-dimensional infrastructure of information.