PREDICTION AND OPTIMIZATION OF A DESULPHURIZATION SYSTEM USING CMAC NEURAL NETWORK AND GENETIC ALGORITHM

Highlights

- Based on 44,640 sets of data, a ten-input two-output CMAC model was built for the WFGD process with high accuracy.
- The proposed GREA-UDM-CMAC-LNA-GA methodology is logical and effective for model building and optimization of high-dimensional, data-generating systems.
- Compared with the optimal operating parameters in the original data, the economic cost optimized with GA was reduced by more than 30% under the same input conditions and constraints.

Abstract. In this paper, taking desulphurizing ratio and economic cost as two objectives, a ten-input two-output prediction model was structured and validated for a desulphurization system. A cerebellar model articulation controller (CMAC) neural network and a genetic algorithm (GA) were used for model building and cost optimization respectively. In the model building process, grey relation entropy analysis and the uniform design method were used to screen the input variables and study the model parameters separately. Traditional regression analysis and the proposed location number analysis method were adopted to analyze the output errors of the experiment group and predict the results of the test group. Results show that the regression analyses fit the experiment group results closely, while their fitting accuracies for the test group differ considerably. As for location number analysis, a power function between output errors and location numbers fitted the data of both the experiment and test groups well for SO2. The prediction model was initialized by the location number analysis method, then validated, and a cost optimization case was performed with GA subsequently. The result shows that the optimal cost obtained from GA could be reduced by more than 30% compared with the original optimal operating parameters under the same constraints.


Introduction
In recent years, under the pressure of environmental issues, many scholars have been committed to the optimization of environmental facilities in thermal power plants.
As the most widely used desulphurization method, wet flue gas desulphurization (WFGD) has naturally drawn much attention. So far, researchers have built systematic theoretical models, performed detailed numerical simulations, carried out complex simulation experiments and verified the models for the system (Kiil et al., 1998; Warych & Szymanowski, 2001; Gutiérrez et al., 2006; Dou et al., 2009; Wang & Dai, 2018; Liu et al., 2019). More recently, Wu et al. (2011) studied an external recurrent neural network (RNN) for pH control in the absorber. Yang et al. took advantage of a Radial Basis Function (RBF) neural network to tune PID parameters online (Yang et al., 2016). Cheng and colleagues applied a Back-Propagation (BP) network that monitors the pH of the tank directly to predict the desulphurization process, and used the Levenberg-Marquardt (LM) algorithm to improve the BP network. They also reproduced the control strategy using an RNN, as Wu et al. did (Cheng & Xie, 2017). Fu et al. (2019) used a long short-term memory neural network to establish an efficiency prediction model and obtained higher accuracy than the least squares support vector machine and RNN models. Yet these studies focus only on the traditional parameter pH to ensure the sulfur removal rate, ignoring the important factor of energy consumption or capital cost. Setting the desulphurization cost as the first goal while satisfying different sulfur removal rates, Warych and Szymanowski (2002) proposed a systematic cost calculation methodology and used a BP network for the optimization of specific operation parameters. Guo et al. (2019) used a BP network and the particle swarm optimization algorithm to model and optimize the desulphurization efficiency and comprehensive cost of a WFGD system, respectively. In the small number of neural-network-based optimization studies for WFGD, BP networks are absolutely dominant.
Although BP networks can realize the modelling of a WFGD system, they are widely recognized as suffering from slow learning speed and easily falling into local minima (D. Jin & Lin, 2012; W. Jin et al., 2000). The desulphurization system produces a large amount of data all the time, which exposes these weaknesses of BP networks. Moreover, the lack of a clear guiding theory for choosing the numbers of hidden layers and neurons in the construction process restricts the generalization of the obtained results. Therefore, model selection is an important issue in intelligent-computing research on WFGD.
Modelled on the structure and function of the cerebellum, CMAC was put forward by Albus for manipulator control problems (Albus, 1975a). Different from other neural networks, CMAC belongs to table-reference techniques. It uses a memory system to achieve the input-output mapping and has the characteristic of partial (local) learning (Albus, 1975b). CMAC has the advantages of fast learning and response, no local minimum problem, suitability for large amounts of data and fair generalization ability (Albus, 1975b; Ching-Tsan & Chun-Shin, 1996). As can be seen, these characteristics fit the needs of the desulphurization system well. Therefore, the application of CMAC to the desulphurization system can reasonably be expected to achieve satisfying prediction accuracy, which is further favorable for process optimization. However, studies on the error bound of CMAC are insufficient (Lin & Chiang, 1997; Lin & Wang, 1996). The physical storage size required on the computer is usually the first parameter to be determined (Lin & Wang, 1996; Tamura et al., 2017). This practice is feasible for simple models and small data volumes; for multi-dimensional, data-generating objects, however, it is conceivably much less effective.
The slow learning speed and local-minimum problem of BP networks restrain the volume of operation data they can handle. CMAC can process large volumes of data, but the deficient studies of its error bound limit its application. If an appropriate method could be used to predict the error of CMAC, the large amount of data would no longer be a disaster but an advantage for multi-dimensional, data-generating objects. The uniform design method (UDM) has a simple and mature error analysis system, which is unmatched for analyzing and predicting errors on the basis of limited experiments (Fang et al., 2000). It is therefore highly appropriate to use the UDM to analyze and predict the output error of CMAC for complex systems. In short, using UDM to construct a CMAC model for WFGD not only provides a new mathematical modeling method for the process, but also promotes research on the error analysis of CMAC, which will further promote its application in similar objects.
Based on the above concepts, and in contrast to existing research, this paper applies CMAC to build a prediction model for the WFGD system with desulphurizing ratio and economic cost as two objectives. The data volume is further expanded compared with published studies: a total of 44,640 sets of original data are used. To be specific, in the establishment of the prediction model, since grey relation entropy analysis (GREA) can make full use of individual information to obtain overall proximity (Liu & Forrest, 2010; Biswas et al., 2014), the method was applied to screen the model input variables. To reduce the number of experiments and analyze errors effectively, UDM was then applied to design and study the CMAC model parameters. Common linear regression and quadratic linear regression were subsequently used to analyze the obtained results. At the same time, considering the importance of the memory space for CMAC, the size of the memory space and the output errors were also innovatively connected and investigated (termed the location number analysis, LNA, method). With its simple logic, mature theoretical system and global optimality, GA remains one of the main optimization tools (Mostofi & Hasanlou, 2017; Armaghani et al., 2018; Munroe et al., 2019). Hence, after the completion of the CMAC prediction model, GA is finally utilized to optimize the cost. A logical and scientific analysis methodology was thus proposed and attempted, following the process GREA-UDM-CMAC-LNA-GA. The practice of the methodology will provide reliable experience and reference for many other similar systems with high dimension and large amounts of data.

Double-loop wet flue gas desulphurization process
The data analyzed in this paper come from an ultra-supercritical unit in Jiangsu province, China. The unit adopts the WFGD mechanism with a double-loop desulphurization tower; the schematic is shown in Figure 1. In this system, when the flue gas enters the desulphurization tower and goes upwards, it meets water and dissolves into it. The dissolved sulfur dioxide then releases hydrogen ions, which combine with carbonate to form carbon dioxide. Subsequently, sulfite reacts with calcium ions and oxygen to generate calcium sulfate precipitates. The reactions in these processes are described as R1-R3:

SO2 + H2O → H+ + HSO3-  (R1)
H+ + CaCO3 → Ca2+ + HCO3-, HCO3- + H+ → CO2↑ + H2O  (R2)
Ca2+ + SO3^2- + 1/2 O2 + 2H2O → CaSO4·2H2O↓  (R3)
The entire reaction system operates continuously based on the interactions of these reactions. In addition, the double-loop tower is divided into independent upper and lower absorption spaces, which cooperate with and complement each other.
Two traditional indicators are adopted to evaluate the performance of the system, namely the desulphurizing ratio (R_ds) and the calcium-sulfur ratio (R_cas), defined as follows:

R_ds = (C_is·V_in - C_os·V_out) / (C_is·V_in)  (1)

R_cas = m_CaCO3 / m_SO2  (2)

where C_is and C_os represent the inlet and outlet concentrations of SO2 respectively, mg/m3; V_in and V_out represent the total inlet and outlet volumes of flue gas respectively, 10^4 m3/h; m represents the mole number.
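The two indicators can be computed directly from the measured quantities. A minimal sketch follows; the function and variable names are illustrative, not from the paper, and the numeric values are typical magnitudes rather than actual plant data:

```python
def desulphurizing_ratio(c_in, v_in, c_out, v_out):
    """R_ds: fraction of inlet SO2 removed.
    c_*: SO2 concentration (mg/m^3); v_*: flue gas volume (10^4 m^3/h)."""
    return 1.0 - (c_out * v_out) / (c_in * v_in)

def calcium_sulfur_ratio(mol_caco3, mol_so2):
    """R_cas: moles of CaCO3 consumed per mole of SO2 absorbed."""
    return mol_caco3 / mol_so2

# Illustrative magnitudes close to those reported later in the paper:
r_ds = desulphurizing_ratio(c_in=1581.0, v_in=240.0, c_out=17.87, v_out=245.0)
r_cas = calcium_sulfur_ratio(mol_caco3=1.03, mol_so2=1.0)
```

With these illustrative inputs, r_ds comes out near the paper's reported average removal rate of about 0.988.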

CMAC neural network
The parameters that need to be determined in a CMAC model include the input and output variables, the quantization level (Q_l) of each input variable, the generalization parameter (c) and the learning rate (ε). Take the two-dimensional model in Figure 2 as an example to illustrate the principle of CMAC; under higher dimensions, the squares become hypercubes (Albus, 1975a, 1975b; Ching-Tsan & Chun-Shin, 1996). For every dimension, the input data is first quantified by Q_l into quantization numbers. According to the different partition modes of the quantization layers, the quantization numbers are then matched with block codes. Subsequently, the block codes of the same layer from all dimensions are combined together (Ab, Dd, Ef). The combination of each layer generates a feature code. Every feature code corresponds to a memory location, which stores one or more (depending on the number of output variables) weight values. The collection of all possible combinations constitutes the memory space, and the weight matrix is formulated at the same time. For a certain input vector, the sum of the selected weights is the prediction output. The number of quantization layers coincides with the value of c. The total location number (ln), i.e. the size of the memory space, can be generalized as:

ln = Σ_{j=1..c} Π_{i=1..n} Q_bn(i,j)  (3)

where n represents the dimension; Q_bn(i,j) represents the quantization block number, i.e. the total number of block codes of dimension i in quantization layer j, approximately ⌈Q_l/c⌉; the symbol ⌈ ⌉ refers to rounding up to an integer. The prediction and the calculation of the weights (w) form an iterative process:

f(y) = Σ_{k=1..c} w_k  (4)

w' = w + ε·(y - f(y))/c  (5)

in which f(y) is the prediction value of the model for the output variable, y is the actual value of the set, and w' represents the weight after iteration.
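The mechanism above (overlapping quantization layers, feature codes addressing a weight table, prediction as the sum of c weights, and the per-presentation weight update) can be sketched in miniature. This is not the paper's model: the class name, the hash-table stand-in for the physical memory space, and the toy two-dimensional data are all illustrative assumptions.

```python
from collections import defaultdict

class TinyCMAC:
    """Minimal CMAC sketch: c overlapping quantization layers; each layer
    contributes one address (layer index + block codes of all dimensions),
    and a dict stands in for the physical memory space."""
    def __init__(self, c, q_levels, lo, hi, lr=0.34):
        self.c, self.q, self.lo, self.hi, self.lr = c, q_levels, lo, hi, lr
        self.w = defaultdict(float)

    def _quantize(self, x):
        # Map each input onto integer quantization numbers 0..q-1.
        return [min(self.q - 1, int((xi - l) / (h - l) * self.q))
                for xi, l, h in zip(x, self.lo, self.hi)]

    def _addresses(self, x):
        q = self._quantize(x)
        # Layer j shifts the block boundaries by j before grouping c cells,
        # which realizes the overlapping-layer generalization.
        return [(j,) + tuple((qi + j) // self.c for qi in q)
                for j in range(self.c)]

    def predict(self, x):
        # Prediction is the sum of the c selected weights.
        return sum(self.w[a] for a in self._addresses(x))

    def train(self, x, y):
        # Update in the style of Formula (5): spread the error over c weights.
        err = y - self.predict(x)
        for a in self._addresses(x):
            self.w[a] += self.lr * err / self.c

model = TinyCMAC(c=4, q_levels=32, lo=[0.0, 0.0], hi=[1.0, 1.0])
for _ in range(200):
    for x, y in [([0.2, 0.3], 1.0), ([0.7, 0.8], 2.0)]:
        model.train(x, y)
```

After repeated presentations the stored weights converge so that each training point is reproduced, while nearby inputs that share blocks receive generalized predictions.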
In the model building process, for ease of comparison with the uniform emission standard, the output variables are Cost and C_os rather than R_ds, where Cost is defined as follows.
Cost = 600 × (0.1·M_CaCO3 + 0.35·P_c) / m_SO2  (6)

in which M_CaCO3 is the mass of CaCO3, kg; m_SO2 is the mole number of SO2, mol; P_c represents the power consumption, kW·h, which consists of the stirrer power, slurry circulating power, oxidation air power and gypsum discharge pump power. The prices of CaCO3 and auxiliary power are taken as 100 CNY/t (i.e. 0.1 CNY/kg) and 0.35 CNY/(kW·h). Multiplying the value by 600 avoids operation loss and error. The advantage of defining cost in this way is that different loads can be compared in the same class. In addition, the mean relative error (MRE) (Ansari et al., 2018) index is used as the training target so as to modify and improve the accuracy of the model, namely:

MRE = (1/N) Σ_{i=1..N} |f(y_i) - y_i| / y_i  (7)

in which N is the total number of data sets used. The advantage of taking MRE as the indicator lies in its emphasis on relative error, which weakens the influence of the actual values of the data. In this way, the results obtained for the WFGD system can be extended to other similar objects.
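The scale-free property of the MRE training target is easy to see in a few lines; the helper name below is illustrative:

```python
def mean_relative_error(predicted, actual):
    """MRE over N samples: the average of |f(y_i) - y_i| / y_i."""
    assert len(predicted) == len(actual) and len(actual) > 0
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)

# Relative error is scale-free: both pairs below give the same MRE,
# so outputs of very different magnitude (e.g. Cost vs. C_os) are comparable.
mre_small = mean_relative_error([1.01, 0.99], [1.0, 1.0])
mre_large = mean_relative_error([1010.0, 990.0], [1000.0, 1000.0])
```

This is exactly why the paper argues MRE-based conclusions transfer to other objects: the indicator is insensitive to the absolute magnitude of the data.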

Practice of GA in CMAC
The general processes of GA are population initialization, fitness calculation, genetic evolution and population convergence (Whitley, 1994; Deb et al., 2002). However, the application of the GA-CMAC combination in the industrial field needs some necessary preparation, including: (1) Determine the values of the characteristic parameters to be optimized; (2) Select similar samples and determine the maximum and minimum values of the remaining input variables over the similar samples; (3) Calculate the quantization interval length (Q_il) of each variable based on the extrema of all preprocessed data and Q_l, namely:

Q_il,i = (MAX_i - MIN_i) / Q_l,i  (8)

in which MAX_i and MIN_i are the maximum and minimum values of the preprocessed data for variable i.
(4) Determine the new quantization levels Q_l', given as:

Q_l,i' = ⌈(MAX_i^sim - MIN_i^sim) / Q_il,i⌉  (9)

where MAX_i^sim and MIN_i^sim are the extrema of variable i over the similar samples. Optimization is performed within the extremum values of the similar samples, while data outside this interval are outside the optimization range. Therefore, different from the Q_l in the experiment and test groups, the Q_l' here is greatly reduced.
(5) Determine the binary digit numbers (N_bd). The resolution of the binary array expression should be no coarser than the quantization interval; that is, 2^N_bd should be no less than Q_l'. The binary digits of all remaining variables are linked together as the individual's genes.
(6) Fill the weight matrix, which means filling the possibly experienced weight space, bounded by the extrema of the similar samples under the specific characteristic values. The filling principle is that the weight of an untraversed location equals the average of the traversed weights adjacent to it; in other words, the weight of an n-dimensional untraversed hypercube is determined by its 2n coplanar hypercubes. The untraversed initial values are set to 1, which is much larger than the maximum of the traversed weights. The advantage of this initialization is to avoid false local optima caused by a large number of small weights, which is beneficial to the subsequent optimization step.
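Steps (3)-(5) of this preparation can be sketched as follows; the function name and the example numbers are illustrative assumptions, not values from the paper:

```python
import math

def ga_preparation(global_min, global_max, q_level, sim_min, sim_max):
    """Steps (3)-(5): quantization interval length from the global extrema,
    reduced quantization levels over the similar-sample interval, and the
    binary digits needed so that 2**n_bd covers those levels."""
    q_il = (global_max - global_min) / q_level           # step (3)
    q_level_new = math.ceil((sim_max - sim_min) / q_il)  # step (4): Q_l'
    n_bd = max(1, math.ceil(math.log2(q_level_new)))     # step (5): 2**n_bd >= Q_l'
    return q_il, q_level_new, n_bd

# e.g. a variable spanning 0..1000 with Q_l = 2000, similar samples in 480..520:
q_il, q_new, n_bd = ga_preparation(0.0, 1000.0, 2000, 480.0, 520.0)
```

Note how the similar-sample interval shrinks 2000 global levels to only 80 local ones, which is the "greatly reduced Q_l'" effect described in step (4); 7 binary digits then suffice to encode the variable in an individual's genes.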

Data preparation
The data variables obtained from the power station include unit load (MW); inlet and outlet flue gas parameters: volume (V_in, V_out), sulphur concentration (C_is, C_os), oxygen content (%), humidity (%) and temperature (°C); liquid tank parameters: slurry density (ρ_t, kg/m3), pH and liquid level (H, m); circulating slurry volume (V_c, t/h); volume and density of the limestone supplement slurry (V_s, t/h and ρ_s, kg/m3); current parameters: currents of the oxidizing air blower (I_o, A), stirrer (I_s, A) and gypsum discharge pump (I_gdp, A); oxidizing air volume (m3/h); defogging pressure difference (Pa); and some other parameters. Owing to the two-stage absorption tower, the primary parameters are suffixed with -1 and the secondary parameters with -2, for example pH-1 and pH-2.
The data selection period is August 2018. The data is recorded at one-minute intervals and totals 44,640 sets. The variation of the unit load is shown in Figure 3.

Data preprocessing
Raw data is always incomplete and inconsistent, which not only hinders data analysis directly but also undermines the conclusions drawn from it. To improve the quality of the data to be mined, the raw data should first be preprocessed. This article illustrates the preprocessing through the case of slurry density. The raw and processed slurry density data are shown in Figure 4; black and red dots represent raw and processed data respectively.
As can be seen from Figure 4, although the raw data basically reflects the trends, it has distinct features of high noise and multiple interruption intervals, which greatly hinders the subsequent model building and data analysis. To address these problems, the characteristic values of the interruption intervals were first obtained, including the lengths of the intervals, the variable values at the starting and ending points, and the extreme values. Then, a similar interval was found for each interruption interval; a similar interval is one with close extreme values, a close span of variable values, and the same or opposite trend. Subsequently, interpolation or resampling was used to fill the discontinuous intervals. After that, the overall data samples were fitted and filtered. In the fitting and filtering process, the Savitzky-Golay filter was adopted (Schafer, 2011), with the sliding window width set to 201 and the order of the fitting polynomial set to 3 in this case.
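One plausible implementation of the filtering step uses SciPy's `savgol_filter` with the stated window width of 201 and polynomial order 3. The synthetic noisy signal below merely stands in for the raw slurry-density series; the actual plant data is not reproduced here:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 2000)            # stand-in for a minute-resolution series
true = 1100.0 + 30.0 * np.sin(t)            # slow underlying trend (illustrative)
raw = true + rng.normal(0.0, 5.0, t.size)   # high-noise "raw" measurements

# Savitzky-Golay: local least-squares polynomial fit in a sliding window.
# The paper's settings: window width 201 (must be odd), polynomial order 3.
smooth = savgol_filter(raw, window_length=201, polyorder=3)
```

Because the filter fits a low-order polynomial locally rather than simply averaging, it suppresses noise while preserving the shape of slow trends, which matches the behavior visible in Figure 4.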
After preprocessing, the data can be applied to the subsequent modeling and optimization processes. Some feature values of the processed data are shown in Table 1.

Experimental design
GREA is generally considered an objective and effective system analysis method and is widely used in multifactor correlation analysis (Deng, 1982, 1989; Liu et al., 2011). Here, the method was used to analyze the correlation of cost with the variables. Based on the results, the variables load, C_is, V_in, I_o, ρ_s, H-1, pH-1, ρ_t-1, V_s-1 and V_c were adopted as the model inputs. As introduced in section 2.2, the Q_l of these ten parameters, together with the generalization parameter c, give 11 parameters to be determined. Cost and C_os are employed as the output variables as usual. After preliminary testing, it proved appropriate to take ε as 0.34. All 44,640 sets of data are used.

Experiment group
For multi-factor, multi-level experiments, the uniform design method has incomparable advantages in saving time and experimental materials (Fang, 1994; Fang et al., 2000; Li et al., 2004). According to the principle of the uniform design method, 13 levels are necessary for 11 variables (factors). Since U* tables generally have less uniformity discrepancy, the U*_12(12^11) table was selected. The specific data is shown in Table 2.
Regression analysis is the common method applied to the results of uniform design (Li et al., 2003). Linear regression analysis was used first, with the basic formula:

Y = b_0 + Σ_{i=1..11} b_i·x_i  (10)

where x_i represents each variable and Y refers to the output MRE. Based on the calculation principles specific to UDM (Fang, 1994), the results of the linear regression analysis are shown in Table 3.
In the formulas, the values of b_0 are actual, while the remaining coefficients are all enlarged by 10^5 times. From the results, the fitting formulas of Cost and C_os have certain similarities: (1) the signs of their coefficients are consistent; (2) both coefficients of parameter c are much larger than those of the other ten parameters. On the one hand, the results illustrate the importance of c for CMAC; on the other hand, they prove that the MRE index can indeed weaken the influence of the sample data, which is beneficial to the generalization of the obtained conclusions.
Subsequently, quadratic linear analysis was also used for the error analysis, the basic formula of which is:

Y = b_0 + Σ_{i=1..11} b_i·x_i + Σ_{i=1..11} Σ_{j=i..11} b_ij·x_i·x_j  (11)

Using the quadratic stepwise method of quadratic linear analysis, the error prediction formulas of cost (Y_c, Formula 12) and SO2 (Y_s, Formula 13) were obtained.
For the convenience of comparison, the fitting results of the two regression methods and the actual values of the experiment group are drawn in Figure 5.

Test group
In order to further verify the prediction ability of the formulas obtained from the experiment group, a test group was then conducted. The parameter levels of the test group were likewise derived from the U*_12(12^11) uniform table. The specific data is shown in Table 4. The data, learning rate and other parameters remain the same as in the experiment group.
The prediction formulas obtained in section 4.1.1 were further used to predict the MREs of the test group. Both actual and predicted values are drawn in Figure 6.
As can be seen from Figure 6, for cost, the linear regression formula still agrees closely with the actual results, although it does not completely follow their trends. In terms of SO2, the prediction errors of the linear regression formula deviate increasingly from the actual data; the maximum even reached 8.6%. Whether the error will continue to increase is inevitably questioned. The straightforward explanation for this phenomenon is that the coefficients of some variables are somewhat large, which can be further attributed to the use of a uniform design table with relatively few levels. Generally speaking, for UDM, the more levels there are, the stronger the independence between experiments, and the more predictive the obtained formulas naturally are. It can be expected that a more accurate result would be obtained by increasing the number of levels (Fang et al., 2000). The reason why no similar deviation appears for cost may be ascribed to the data: since cost is calculated rather than measured, it is relatively easier to predict. Even worse, the prediction formulas of the quadratic stepwise method show completely wrong trends for both cost and SO2.
This demonstrates that the regression formulas are merely fits of the experimental data and do not imply a deep physical relationship. The reason the method achieves such high consistency with the experiment group results lies simply in its greater number of compound modes of the variables. Attempts with other quadratic stepwise methods yielded even worse results. There is no doubt that quadratic linear analysis is not suitable for this case.
Quadratic linear regression has no error prediction ability, while linear regression has progressively larger prediction errors. In order to obtain a more accurate fitting curve for SO2, new analysis approaches must be explored based on the current results. Section 4.2 explains in more detail why a more accurate error prediction model for SO2 is needed.

Location number analysis method
Considering the fact that CMAC belongs to table-reference techniques, the size of the "table" reflects the level of detail of the mapping and determines the space available to store information (Tao et al., 2002). The prediction error was therefore tentatively related to the location number, calculated by Formula 3. The location numbers of the previous experiment and test groups are shown in Table 5.
Based on the results of the experiment group, the relationship between the prediction error of SO2 and the location number was fitted as a power function:

Y_s = a·ln^b  (14)

where a and b are fitted coefficients. The prediction results of the two groups are presented in Figure 7.
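A power-law relationship of this kind is typically fitted by linear least squares in log-log space. The sketch below demonstrates the procedure on synthetic data; the coefficients 2.5 and -0.08 are illustrative placeholders, not the paper's fitted values:

```python
import numpy as np

def fit_power_law(location_numbers, errors):
    """Fit Y = a * ln**b by linear least squares on log-transformed data:
    log Y = log a + b * log ln is a straight line, so polyfit of degree 1
    recovers slope b and intercept log a."""
    logx, logy = np.log(location_numbers), np.log(errors)
    b, log_a = np.polyfit(logx, logy, 1)
    return np.exp(log_a), b

# Synthetic data following a known power law over location numbers
# spanning the magnitudes discussed in the paper (around 1e10..1e12):
ln = np.array([1e10, 5e10, 1e11, 5e11, 1e12, 5e12])
ys = 2.5 * ln ** -0.08
a, b = fit_power_law(ln, ys)
```

Fitting in log space is what gives the least-squares behavior noted in the text: relative accuracy is balanced across the wide span of location numbers rather than being dominated by the largest ones.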
From Figure 7, the LNA predictions are valid in trend but often deviate noticeably from the other prediction results for the experiment group data. For the test group, however, the predictions of the fitting formula coincide closely with the actual values. It is easy to see that the location numbers of the test group are generally close to 10^12, which suggests that the location number prediction formula has high accuracy in that neighborhood. Reviewing diagram (a) in Figure 7, the location number predictions are also highly consistent in some other areas, for example at numbers 2, 3, 4, 8, 11 and 12. Obviously, the least squares principle is at work. In general, the location number fitting formula keeps the trend correct over a wide span of location numbers while maintaining high precision in specific areas.
Location number curve fitting produces satisfactory prediction results, which confirms the importance of the size of the "table". From the results, the error and the location number are related by a power function. However, this result is based on the current range of location numbers; whether it holds over a wider range can be studied further in the field of CMAC error analysis. For this paper, a more credible SO2 error prediction formula has been obtained.
Comparing the above methods comprehensively: quadratic linear analysis has the highest consistency with the experiment results but the worst predictions for the test group; linear regression analysis has relatively poorer similarity with the experimental data but maintains valid predictions; and the location number analysis method keeps the trend correct over a wide space while maintaining high precision in specific areas.

Model initialization
Now that the error prediction formulas have been settled, and the cost and SO2 share the same CMAC structure and similar output precision, the next step is to determine the allowable error line. The average sulfur removal rate of the preprocessed data is 0.9887, and the SO2 concentration at the chimney inlet is 17.87 mg/m3. If the allowable deviation is 1%, that is, if the sulfur removal rate is reduced by 1%, the average outlet SO2 concentration would be 34.64 mg/m3, which still meets China's national ultra-low emission standard of 35 mg/m3 for SO2. Therefore, the model parameters are determined such that the predicted SO2 error is 1%.
As can be seen from Figure 7a, the location number fitting formula for SO2 has high prediction accuracy near the 1% error; the formula is therefore adopted for model determination. The Q_l of the 10 input parameters are fixed at the minimum values from the uniform design table, and the final model parameters were determined by adjusting only the value of c. The calculated c should be no more than 259. Hence, the model parameters are acquired as listed in Table 6.
Using the constructed CMAC model, the cost and SO2 errors are 0.9741% and 0.9812% respectively. The output SO2 error is close to the objective error. The results demonstrate that this dual-output model already has high accuracy, and there is no need to adjust the parameters further. At this point, the CMAC prediction model has been initialized.

Model validation
Model validation is an essential step in the model building process. In order to verify the model parameters determined above, validation was then conducted. The model was trained with a randomly selected 95% of the preprocessed data, and the correctness test was performed with the remaining 5% (2,232 sets). Except for the data volume, the remaining parameters are the same as in the experiment and test groups. The prediction results are drawn in Figure 8.
It should be noted that each output of CMAC is the sum of c weights. Since the amount of data is much smaller than the weight space and the validation model was built with 95% of the data, some locations are inevitably not traversed for the remaining 5%.
As can be seen from Figure 8, the results for cost and SO2 show similar patterns: the MRE is approximately a straight oblique line in the number of untraversed locations, which embodies the CMAC prediction principle. This shows that large prediction errors are mainly caused by non-ergodicity. The MREs of SO2 and cost for the 991 sets of completely traversed data are 1.118% and 1.204% respectively, which are larger than the design error. This may be due to a lack of generalized traversed data, resulting in insufficient generalization performance. In the optimization process, all sets of data are in service; at that point, if 5% of the data were randomly selected, the predicted MREs would in theory be 0.9741% and 0.9812%. That means the error curve has no influence on the optimization process. Therefore, although the errors are larger than the design values, this is caused not by inappropriate CMAC model parameters but by non-ergodicity; hence, the model parameters are finally determined. In addition, the importance of data volume for the construction of the CMAC model can be clearly perceived. Although it is unrealistic to increase the amount of data enough to completely fill the weight matrix, as the amount of data increases, the number of traversed weights within the range of each variable will be greatly increased across working conditions. In that case, even if some locations are not traversed, their weights can be predicted and filled by appropriate methods. Therefore, it is worthwhile to enlarge the data volume in future work.
A total of 44,640 sets of data were used in the construction of the prediction model, yet the model still faces the problem of insufficient data volume; this conversely demonstrates the capacity of CMAC for mathematical modeling of large amounts of data (Ching-Tsan & Chun-Shin, 1996). This characteristic is obviously very important for systems that produce large amounts of data. Compared with most published studies based on small data volumes, the CMAC model proposed in this paper appears more practical.

GA optimization model
One of the main purposes of model building is process optimization. After the preparation described in section 2.3, the binary arrays can be linked to the actual CMAC model input data through the extremum values, and the optimization can then be conducted through the general GA steps. In this case, the characteristic parameters are Load, C_is and V_in, and the similarity intervals are selected as Load ±5, C_is ±50 and V_in ±15.
As can be seen from Figure 3, the load was in rapid transition, which greatly limits the number of similar samples. With more similar samples, the ranges of the parameter values would be wider, and the results would be more global and less affected by deviation points. Hence, the horizontal interval near the 25th is the best period for selecting the samples to be optimized. The operating condition (Load, C_is, V_in) = (701.73, 1262.03, 237.93) is selected as the optimization objective. The results of the preparation process are shown in Table 7. After completion of the preparatory phase, specific evolutionary rules need to be developed. In the genetic evolution process, the population size is kept at 200, and the population is initialized with random binary arrays. Individual fitness is calculated by Formula 15. In the sixth step of the preparation phase, most of the vacant weights are initially assigned large values and then determined by iteration, so these weights tend to be much bigger, which makes it difficult to meet the required sulfur removal rate. To avoid this situation, a strict evolutionary rule is adopted: the top 10 percent of samples by fitness (the first 20) are identified as strong biological samples, and the next 20 samples as medium biological samples. Half of the offspring are generated entirely by random combinations of strong parents; the other half are generated equally by strong-medium and medium-medium combinations. Gene intersections are randomly selected, and the segments before and after the intersection are copied from the respective parents. The gene mutation rate is set to 5%, with the mutation point also random.
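The evolutionary rule described above can be sketched as one generation of a simple GA. This is an illustrative toy, not the paper's optimizer: the fitness function is a placeholder (the real one comes from Formula 15 via the CMAC model), and the gene length and function names are assumptions.

```python
import random

def next_generation(population, fitness, mutation_rate=0.05):
    """One generation under the paper's rule: rank by fitness, take the top
    10% as 'strong' and the next 10% as 'medium' parents, breed half of the
    offspring strong-strong and the rest strong-medium / medium-medium,
    with single-point crossover and a 5% per-child random mutation."""
    ranked = sorted(population, key=fitness, reverse=True)
    n = len(population)
    strong, medium = ranked[:n // 10], ranked[n // 10: n // 5]

    def crossover(p1, p2):
        point = random.randrange(1, len(p1))
        child = p1[:point] + p2[point:]        # copy parents around the cut
        if random.random() < mutation_rate:    # mutate one random gene
            i = random.randrange(len(child))
            child[i] ^= 1
        return child

    offspring = [crossover(*random.sample(strong, 2)) for _ in range(n // 2)]
    offspring += [crossover(random.choice(strong), random.choice(medium))
                  for _ in range(n // 4)]
    offspring += [crossover(*random.sample(medium, 2))
                  for _ in range(n - len(offspring))]
    return offspring

random.seed(1)
pop = [[random.randint(0, 1) for _ in range(16)] for _ in range(200)]
score = lambda ind: sum(ind)                   # toy fitness: count of ones
for _ in range(30):
    pop = next_generation(pop, score)
best = max(sum(ind) for ind in pop)
```

Restricting parenthood to the top 20% implements the "strict" selection pressure the paper uses to escape the inflated vacant weights, while the 5% mutation keeps enough diversity for the population to keep improving.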
Due to the fast transitions of working conditions during the operation of the desulphurization system and the discontinuous operation of the pumps, it is inevitable that some recorded values are smaller than normal. From Formula 6, cost is a superposition of factors; in these situations the calculated fitness would therefore be larger than the actual value, which would naturally affect the genetic variation process and ultimately the optimization results. To avoid the influence of such abnormal data, the optimization was performed with an additional R_cas constraint: before the selection of strong and medium samples, the population is first screened, and individuals with R_cas < 0.9 are deleted. To preserve the global character of the optimization, a relatively conservative R_cas threshold is selected.
In the process of optimization, the CMAC model was constructed based on the parameters determined in Table 6, all 44640 sets of data were used, and the learning rate was kept at 0.34. After several optimization experiments and comparative analysis of the results, the optimal parameters and their fitness curves were obtained. The optimal fitness in each generation is shown in Figure 9. As can be seen, the optimal fitness rises rapidly over the first ten generations, except for a short pause in the 6th and 7th generations, which might be because a local optimum was encountered. After 13 generations of genetic evolution, the optimal fitness reached a stable value of 35.94, which means the optimum operating condition had been found under the given input conditions.
At the same time, the lowest-cost parameters satisfying the ultra-low emission standard were found among the similar samples for comparison. The optimized results using GA (condition 1) and the optimum operating parameters in the original data (condition 2) are listed in Table 8.
The cost and R ds of the two optimum conditions are 50.04 and 0.9905, and 71.88 and 0.9883, respectively. From the two sets of data, most parameters of condition 1 have decreased to some extent, while V s-1 increases due to the low level of ρ t-1 . From the perspective of cost, the reductions of V c and ρ t-1 mean savings of actual materials, while the reductions of V c and I o represent a reduction in power consumption. From Formula 6, together they lead to a reduction in cost. In fact, this success should be attributed to a deeper reason: the excellent generalization and prediction ability of CMAC (Zhou et al., 2018). Because of CMAC's extraordinary generalization ability, untraversed operating conditions gain predicted values, and optimization can be carried out smoothly. In addition, this result also reflects the global optimization capability of genetic algorithms (Horton et al., 2018).
From this optimization case, it can be seen that the model is expected to give appropriate adjustable parameters, such as V c and V s-1 , in time under different conditions. This will provide guidance for actual engineering operation and help the system ensure a higher sulfur removal rate at lower cost, thereby achieving energy conservation. Of course, the above results are based on the data and the specific object: an absorber of a 1000 MW plant with a double-loop desulphurization process. For other objects or desulfurizing towers with different design loads and structures, or even the same object with different data, the dimensions or input variables of the prediction model may differ. However, no matter how the objects and data change, the method proposed in this article has been shown to be correct and feasible.
This article is only a tentative application of the CMAC neural network to modeling systems with large amounts of data, and several details can be further studied and improved. Specifically, the data used in the article are monthly data, which is still small compared with the research object, which generates data continuously; larger volumes of data are worth studying in future research. More importantly, the characteristics of the research subject may change gradually, which means that the closer the data are to the present, the higher their value, and conversely, the older the data, the lower their value. Therefore, it is meaningful to weight the prediction accuracy of different data periods to keep the prediction model closer to the current object.
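As a minimal sketch of this recency-weighting idea, older samples could be discounted with an exponential decay. The 30-day half-life below is an arbitrary assumption for illustration, not a value from this study:

```python
def recency_weights(ages_days, half_life=30.0):
    """Exponential-decay sample weights: a sample half_life days old
    counts half as much as a fresh one (half_life is a tunable assumption)."""
    return [0.5 ** (age / half_life) for age in ages_days]

# Weight training samples collected 0, 15, 30 and 60 days ago
weights = recency_weights([0, 15, 30, 60])
```

Such weights could, for instance, scale each sample's contribution to the model error metric or to the CMAC weight updates, so that recent operating data dominate the fitted model.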

Conclusions
Using 44640 sets of data, a ten-input two-output CMAC model was built for the WFGD system based on GREA and UDM. In addition to traditional regression analysis, the location number analysis method was used to predict errors and determine model parameters, and cost optimization was then conducted using GA. From the results, the following conclusions can be drawn: 1) From the high accuracy of the prediction model and the significant energy saving in the optimization case, CMAC is suitable for the mathematical modeling of such multi-dimensional, data-generating systems. The GREA-UDM-CMAC-LNA-GA methodology is logical and effective for model building and optimization. 2) The power function between the SO 2 error and the location number, and its accurate predictions for the test group, prove the correctness of the location number analysis method, which is more convenient and effective for determining CMAC model parameters than regression analysis. 3) With a mature theoretical basis, GA is promising for parameter optimization. In the optimization case, the optimal cost obtained with GA was reduced by more than 30% compared with the original optimal operating parameters under the same constraints, which fully shows the feasibility and effectiveness of GA. The table-reference principle of CMAC fully extracts and utilizes the information in historical data; its successful implementation in the WFGD system provides a new, feasible, and reliable solution for reducing economic cost in industry. Besides, the use of the MRE index gives the analysis results in this paper reference value for similar systems.

Funding
This work has been financially supported by the National Key R&D Program of China (2018YFC1901200).

Author contributions
Baosheng Jin and Yueyang Xu conceived the study and were responsible for the design and development of the data analysis. Yueyang Xu was responsible for data collection and analysis. Zhiwei Kong, Yong Zhang and Xudong Wang were responsible for data interpretation. Zhiwei Kong wrote the first draft of the article.

Disclosure statement
The authors declare no competing financial interest.