A GENETIC PROGRAMMING APPROACH FOR ESTIMATING ECONOMIC SENTIMENT IN THE BALTIC COUNTRIES AND THE EUROPEAN UNION

In this study, we introduce a sentiment construction method based on the evolution of survey-based indicators. We make use of genetic algorithms to evolve qualitative expectations in order to generate country-specific empirical economic sentiment indicators in the three Baltic republics and the European Union. First, for each country we search for the non-linear combination of firms’ and households’ expectations that minimises a fitness function. Second, we compute the frequency with which each survey expectation appears in the evolved indicators and examine the lag structure per variable selected by the algorithm. The industry survey indicator with the highest predictive performance is production expectations, while in the case of the consumer survey the distribution between variables is multi-modal. Third, we evaluate the out-of-sample predictive performance of the generated indicators, obtaining more accurate estimates of year-on-year GDP growth rates than with the scaled industrial and consumer confidence indicators. Finally, we use non-linear constrained optimisation to combine the evolved expectations of firms and consumers and generate aggregate expectations of year-on-year GDP growth. We find that, in most cases, aggregate expectations outperform recursive autoregressive predictions of economic growth.


Introduction
Agents' actions tend to depend on their expectations (Doronina Koltan et al., 2013). In a context of high uncertainty like the current one, agents' expectations become crucial for economic analysis. As expectations are not directly observable, they are usually obtained through surveys. Because survey expectations capture the beliefs and intentions of economic agents prior to the publication of GDP growth figures, they are especially useful for economic forecasting.
Since 1961, the European Commission (EC) has conducted surveys of businesses and consumers in the member states of the European Union (EU). These surveys ask respondents whether they expect a wide variety of economic variables to increase or decrease, thus facilitating comparability between countries' economic conditions. For each question, a diffusion index called the balance is calculated as the difference between the response percentages of the extreme categories, and is published regularly at the end of each month.
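As a minimal illustration of the balance statistic described above, the following Python sketch computes the difference between the percentage of respondents expecting an increase and the percentage expecting a decrease. The three-category layout, the function name and the response counts are assumptions made for the example and are not taken from the EC questionnaires.

```python
def balance(n_increase, n_unchanged, n_decrease):
    """Balance: percentage of 'increase' answers minus percentage of 'decrease' answers."""
    total = n_increase + n_unchanged + n_decrease
    pct_increase = 100.0 * n_increase / total
    pct_decrease = 100.0 * n_decrease / total
    return pct_increase - pct_decrease


# Hypothetical survey wave: 45 firms expect an increase, 35 no change, 20 a decrease.
print(balance(45, 35, 20))  # 45% - 20% = 25.0
```

A positive balance indicates that optimistic answers outnumber pessimistic ones; the share of respondents reporting no change does not enter the index directly.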
Soft computing, as opposed to traditional computing, deals with approximate models and gives solutions to complex real-life problems (Ibrahim, 2016). Soft computing is based on techniques such as artificial neural networks (Safa et al., 2020), support vector regression (Jović et al., 2016; Shamshirband et al., 2014) and genetic algorithms (Shariati et al., 2020). The main motivation of the study is to propose a methodology for the construction of indicators of economic sentiment based on the modelling of agents' expectations through genetic algorithms.
The main advantage of the proposed approach is that it does not require knowledge of the underlying interconnections between different expectations as a precondition for its application. As a result, it allows evolving a wide range of business and consumer expectations without any required knowledge regarding their interactions in order to generate non-linear country-specific economic sentiment indicators. Therefore, the proposed data-based approach is especially suitable in this case, in which there is no a priori functional relationship between the survey balances and the year-on-year growth rate of GDP.
The aim of the paper is three-fold. First, we propose an alternative approach for economic sentiment construction based on symbolic regression (SR) and genetic programming (GP). This procedure is based on a heuristic search of the optimal functional form that combines qualitative survey data in order to estimate economic growth. The main motivation to use this approach lies in its flexibility, given that the search algorithm not only selects those survey indicators that best describe economic growth dynamics, but also combines the selected variables without imposing any type of restriction on the models or the parameters. With this aim, we use all survey variables from the industry and consumer surveys in the three Baltic States (Estonia, Latvia and Lithuania) and the EU to evolve country-specific non-linear economic sentiment indicators for firms and households. We design two alternative SR experiments for each country: one for the industry survey in order to produce an evolved industry confidence indicator and another one for the consumer survey to generate an evolved consumer confidence indicator.
Second, we use the obtained evolved expressions to compute the frequency distribution and the lag structure of the different survey variables selected by the genetic algorithms to track economic growth in each of the Baltic countries and the EU. The proposed empirical approach searches the space of all models and expresses the potential solutions in the form of computer programs, which are represented as tree structures in which every internal node holds an operator function and every terminal node an operand. This general representation scheme makes mathematical expressions easy to evolve and evaluate. Third, with the aim of evaluating the performance of the evolved indicators, we analyse their out-of-sample predictive capacity, comparing it with that obtained using the EC confidence indicators. In addition, we use Granger causality tests to evaluate the predictive power of firms' and consumers' expectations of economic growth. Finally, we combine firms' and consumers' expectations by means of a generalised reduced gradient non-linear algorithm that yields the optimal weights in order to generate aggregate expectations of economic growth, which we compare with recursive autoregressive forecasts used as a benchmark.
As a result, we provide researchers and practitioners with three outcomes: (i) evolved sentiment indicators that yield estimates of year-on-year GDP growth rates using survey expectations as the sole input; (ii) insight into the relative importance of the indicators of the industry and the consumer surveys of each country, as well as the lag structure of the selected variables determined by the algorithm; and (iii) the relative weight of the expectations of both firms and consumers in order to improve the forecast accuracy of predictions of economic growth in each country. These results are intended to highlight the value of the information coming from tendency surveys for the economic analysis, and also to provide researchers with an alternative way of generating agents' expectations through empirical methods.
Given that applications of GP in macroeconomics have been scarce, in this study we apply GP to construct indicators of economic sentiment in the Baltic countries and the EU. With this aim, we use the survey expectations collected in the balances to generate non-linear confidence indicators through an evolutionary process in which no assumptions are made about agents' expectations.
The paper is divided into three sections. Next, the applied methodology and the design of the experiments are described. Section 2 presents the dataset and assesses the evolved expressions. Results of the forecasting comparison are provided in Section 3. Finally, the main conclusions and future lines of research are presented.

Experimental setup
As stated by Petković et al. (2016), soft computing techniques have been shown to be proficient in numerical mapping between data and variables of nonlinear frameworks. Among them, hybrid intelligent systems such as adaptive neuro-fuzzy inference systems, which enhance the ability to automatically learn and adapt, are increasingly being used by researchers in various fields (Toghroli et al., 2014). Genetic algorithms are another soft computing technique that is increasingly being used, owing to their flexibility for analysing unknown relationships between variables.
GP is a heuristic search technique for evolving programs that can be regarded as a generalisation of genetic algorithms (see Katebi et al. (2020) for a comparative analysis of metaheuristic optimisation algorithms). This optimisation approach is based on the representation of programs in tree structures. SR, as opposed to conventional regression analysis, which is based on a certain ex-ante model specification, uses GP to search for relationships between a given set of variables and evolves the functions until reaching a solution that can be described as the functional form that best approximates the interactions between the variables of the system. This function not only allows relevant links to be visualised but also makes it possible to detect unknown relationships. Given the suitability of genetic algorithms for detecting patterns in large data sets and for the automatic resolution of optimisation problems, GP is increasingly being applied in more areas (Alexandridis et al., 2017; Eliiyi et al., 2009; Fernández et al., 2019; Pan et al., 2019). Most of its applications in economics have been made in finance (Acosta-González & Fernández, 2014; Larkin & Ryan, 2008; Vasilakis et al., 2013). Applications of GP in macroeconomics have been scarce (Álvarez-Díaz & Álvarez, 2003, 2005; Chen et al., 2012; Duda & Szydło, 2011; Kotanchek et al., 2010; Koza, 1992; Kronberger et al., 2011; Marković et al., 2017). See Claveria et al. (2017) for a review of recent applications of GP in economics.
Genetic algorithms are the most popular type of evolutionary algorithm. This type of algorithm selects the fittest programs for reproduction to produce new and fitter offspring that become part of the new generation. GP applies operations analogous to natural genetic processes to an initial population of programs: reproduction, crossover and mutation. Operations are applied recursively, generation after generation. The process terminates either when a specific program reaches a desired fitness level or when a predefined number of generations is reached.
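The tree representation and the crossover operation can be illustrated with a short Python sketch. It is only a stand-alone illustration under stated assumptions: trees are encoded as nested tuples, division is protected against a zero denominator (a common GP convention that the text does not specify), and the example expressions are hypothetical rather than taken from the evolved indicators.

```python
import operator
import random


def protected_div(a, b):
    """Division that returns 1.0 for a near-zero denominator (assumed convention)."""
    return a / b if abs(b) > 1e-9 else 1.0


OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "div": protected_div}


def evaluate(tree, inputs):
    """Internal nodes are (op, left, right); leaves are variable names or constants."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, inputs), evaluate(right, inputs))
    if isinstance(tree, str):
        return inputs[tree]
    return tree  # numeric constant


def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))


def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(node)


def crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one randomly chosen subtree of b."""
    path_a, sub_a = rng.choice(list(subtrees(a)))
    path_b, sub_b = rng.choice(list(subtrees(b)))
    return replace_at(a, path_a, sub_b), replace_at(b, path_b, sub_a)


# Hypothetical parents built from survey balances X5, X14, X3 and X4.
parent_a = ("add", "X5", ("mul", 0.5, "X14"))
parent_b = ("div", "X3", ("sub", "X4", 2.0))
print(evaluate(parent_a, {"X5": 10.0, "X14": 4.0}))  # 10 + 0.5 * 4 = 12.0
child_a, child_b = crossover(parent_a, parent_b, random.Random(0))
print(child_a, child_b)
```

Because crossover exchanges whole subtrees, every offspring is again a valid expression, which is what makes this representation convenient to evolve.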
In this study, we use GP to find the relationship between a wide spectrum of expectations and year-on-year GDP growth rates. This approach allows deriving algebraic expressions that optimally combine survey expectations to monitor economic outcomes. See Figure 1 for a visual description of the experiment.
To implement the GP experiments, the following items must be predetermined:
1. Initial population - A random population of 70000 functions is generated, from which the best 10000 individuals are selected.
2. Fitness function - The mean square error (MSE) is used to assess the fitness of each individual.
3. Strategy for the selection of parents - The tournament method is used to guarantee the diversity in the population. As a result, the best two of three randomly selected individuals are mated.
4. Probability of a new generation - The mutation probability is set at 0.20 and the crossover probability at 0.75.
5. Termination criterion - A maximum of 100 generations is set.
On the assumption of having established a minimum fitness as a stopping criterion, if no individual reached it, the process would be repeated using the new generation as the population, so that the fitness of the population would improve in successive generations. Macias-Escobar et al. (2019) showed the effectiveness of population evolvability to solve dynamic optimization problems.
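The settings listed above can be sketched as a generic generation loop in Python. This is a minimal stand-alone sketch, not the DEAP-based implementation used in the study: the individuals in the demo are plain numbers acting as constant forecasts (a stand-in for expression trees), the target values are hypothetical, and the initial step of generating 70000 functions and keeping the best 10000 is not reproduced.

```python
import random

P_CROSSOVER = 0.75      # crossover probability stated in the text
P_MUTATION = 0.20       # mutation probability stated in the text
TOURNAMENT_SIZE = 3     # three randomly selected individuals per tournament
MAX_GENERATIONS = 100   # termination criterion stated in the text


def mse(predictions, targets):
    """Mean square error, used as the fitness measure (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def next_generation(population, fitness, mate, mutate, rng):
    """Build a new population of the same size by tournament, mating and mutation."""
    offspring = []
    while len(offspring) < len(population):
        contenders = rng.sample(population, TOURNAMENT_SIZE)
        first, second = sorted(contenders, key=fitness)[:2]  # best two of three
        if rng.random() < P_CROSSOVER:
            first, second = mate(first, second, rng)
        offspring.extend([first, second])
    offspring = offspring[:len(population)]
    return [mutate(ind, rng) if rng.random() < P_MUTATION else ind
            for ind in offspring]


def evolve(population, fitness, mate, mutate, rng, generations=MAX_GENERATIONS):
    """Iterate for a fixed number of generations and return the fittest survivor."""
    for _ in range(generations):
        population = next_generation(population, fitness, mate, mutate, rng)
    return min(population, key=fitness)


# Toy demo with hypothetical data: each individual is a constant growth forecast.
targets = [2.1, 1.8, 2.5, 3.0]


def toy_fitness(c):
    return mse([c] * len(targets), targets)


def toy_mate(a, b, rng):
    share = rng.random()
    return share * a + (1 - share) * b, share * b + (1 - share) * a


def toy_mutate(c, rng):
    return c + rng.gauss(0.0, 0.1)


rng = random.Random(42)
start = [rng.uniform(-5.0, 5.0) for _ in range(30)]
best = evolve(start, toy_fitness, toy_mate, toy_mutate, rng)
print(best, toy_fitness(best))
```

The loop has no elitism, so the best individual of one generation is not guaranteed to survive into the next; whether the original experiments preserve elites is not stated in the text.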
During the evolutionary process there is a trade-off between the level of precision sought and the complexity of the structure of the resulting models. In order to achieve simple and easy-to-apply functional forms: (i) we limit the arithmetic operators to elementary functions (addition, subtraction, product and division), and (ii) we introduce regularisation terms in the slope, the curvature, and the complexity of the inferred functions. See Ardia et al. (2019) and Hastie et al. (2017) for a justification of the need to regularise. We use the Distributed Evolutionary Algorithms in Python (DEAP) package developed by Fortin et al. (2012).
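The trade-off between precision and complexity can be illustrated with a penalised fitness function. The sketch below is an assumption-laden simplification: it penalises only the number of nodes of the expression tree, the penalty weight is arbitrary, and the slope and curvature terms mentioned above are not reproduced because their exact form is not given in the text.

```python
def count_nodes(tree):
    """Number of nodes in a tree encoded as nested (op, left, right) tuples."""
    if isinstance(tree, tuple):
        return 1 + sum(count_nodes(child) for child in tree[1:])
    return 1


def penalised_fitness(mse_value, tree, complexity_weight=0.01):
    """MSE plus a complexity penalty; complexity_weight is a hypothetical setting."""
    return mse_value + complexity_weight * count_nodes(tree)


# Two hypothetical candidates with the same MSE: the simpler one scores better.
simple = ("add", "X5", "X14")
complex_ = ("add", "X5", ("mul", ("div", "X14", "X5"), ("sub", "X5", 1.0)))
print(penalised_fitness(0.8, simple), penalised_fitness(0.8, complex_))
```

With a penalty of this kind, a larger expression is retained only if its gain in accuracy outweighs its additional size.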
For each country, we relate two sources of information: annual GDP growth rates and responses from businesses and consumers about the expected direction of change for a wide variety of economic variables. First, we run the experiment for the industry survey and then we repeat the experiment for the consumer survey indicators. As a result, we infer two analytical expressions per country: one that links firms' expectations to economic growth, and another that links consumers' expectations to economic growth.

Evolved economic sentiment indicators
In this section we present the output of the GP experiments undertaken, in which the genetic algorithms searched the space of mathematical expressions to find patterns across survey variables that best tracked economic growth dynamics. We used year-on-year growth rates of seasonally adjusted GDP provided by Eurostat (2019) and all monthly and quarterly data from the EC industry and consumer surveys (see Table 1). In both surveys, results are aggregated in balances. For the present study, all survey data were aggregated quarterly. The sample period goes from 2003.Q1 to 2019.Q2, but as the last fourteen quarters are used as the out-of-sample period to evaluate forecast accuracy, we estimate the SRs in the period 2003.Q1-2015.Q4.
[Figure 1. Flow of the GP experiment: mutate all individuals with a probability P_mut; iterate for the size of the population; tournament: select 3 random individuals from the population; mate the two best.]
We ran two different SR experiments for each country. In the first, we regressed GDP growth on the industry survey indicators to generate evolved industrial confidence indicators that give estimates of firms' expectations regarding the evolution of economic activity (Exp.IND). In the second, we did the same with the consumer survey indicators to infer evolved consumer confidence indicators that provide quantitative estimates of households' economic growth expectations (Exp.CONS). The obtained evolved indicators for firms and consumers are displayed in Table 2.
With respect to the evolved economic indicators presented in Table 2, we observe differences between the industrial and consumer confidence indicators: the former show more complex structures, mostly non-linear, including ratios between the survey variables. Regarding the lag structure, the selected variables appear both contemporaneously and lagged, with no clear pattern.
We also find that variable X5 from the industry survey ("production expectations for the months ahead") is the most frequent in the evolved industry indicators, followed by the quarterly variable X14 ("competitive position outside EU"). Regarding consumer expectations, variables X3 ("assessment of the general economic situation over the last 12 months"), X4 ("expectation about the general economic situation over the next 12 months"), X7 ("unemployment expectations over the next 12 months") and X9 ("major purchases over the next 12 months") are the variables with the highest predictive power.
Next, we compute the relative frequency with which each survey variable appears in the evolved indicators. Results are summarised in Figure 2. It can be seen that the frequency distribution of the consumer survey variables is multi-modal.
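The relative frequency of each survey variable can be obtained by counting its occurrences across the evolved expressions. The following Python sketch shows one way of doing so; the expression strings and the `_lag1` suffix are hypothetical placeholders and do not reproduce the indicators reported in Table 2.

```python
import re
from collections import Counter


def variable_frequencies(expressions):
    """Share of all variable occurrences accounted for by each variable."""
    counts = Counter()
    for expr in expressions:
        counts.update(re.findall(r"X\d+", expr))  # matches X5, X14, ...
    total = sum(counts.values())
    return {var: n / total for var, n in counts.items()}


# Hypothetical evolved expressions for two countries.
evolved = ["X5 + X14 / X5_lag1", "X3 - X5"]
print(variable_frequencies(evolved))  # X5: 3 of 5 occurrences, X14 and X3: 1 of 5 each
```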

Empirical results
In this section we assess the performance of the evolved indicators in four respects. First, we compare the evolution over time of the expectations generated with the proposed indicators with those obtained with the indicators constructed by the EC. Next, we run Granger causality tests to evaluate whether the evolved expectations are useful for forecasting economic growth. Third, we use a non-linear algorithm to compute the weights that optimally combine the evolved expectations of firms and consumers to generate aggregate expectations of economic growth. Fourth, we evaluate the out-of-sample performance of the evolved agents' expectations for several forecast horizons by comparing them to recursive AR forecasts used as a benchmark. Finally, we discuss the obtained results.
Since the output of the evolved confidence indicators is expressed as expected annual GDP growth rates, we regressed the GDP growth of each country on the survey variables that are part of the EC confidence indicators during the in-sample period (2003.Q1 to 2015.Q4). The OLS estimates of the weights of the components of the respective confidence indicators of each country allow us to scale the EC confidence indicators so that they are directly comparable with the evolved confidence indicators. The last fourteen quarters of the sample (2016.Q1 to 2019.Q2) are used as the out-of-sample period. The vertical line in Figure 3 marks the beginning of the out-of-sample period (2016.Q1). We can observe that both indicators move closely together with GDP over the sample period, with the sole exception of the scaled consumer confidence indicator in the EU, which shows a poor out-of-sample performance.

Assessment of the performance of the evolved indicators
Next, we compute the out-of-sample root mean square forecasting error (RMSFE) obtained with the different confidence indicators. To test whether the difference in accuracy is statistically significant, we additionally compute the Diebold-Mariano (DM) statistic of predictive accuracy (Diebold & Mariano, 1995), applying the modification proposed by Harvey et al. (1997). Table 3 compares the predictions of two models, evolved confidence and scaled confidence, and shows that the results differ between industry and consumers. For the industry confidence indicators, only in Lithuania does the evolved indicator generate significantly lower forecast errors than the scaled confidence indicator. For the consumer confidence indicators, evolved indicators significantly outperform the scaled confidence indicators in all countries except Estonia.
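The two accuracy measures can be sketched in a few lines of Python. The sketch assumes a squared-error loss differential and a long-run variance built from autocovariances up to lag h - 1, with the small-sample correction of Harvey et al. (1997); the forecast errors in the example are hypothetical, and the loss function and settings actually used for Table 3 are not stated in the text.

```python
import math


def rmsfe(forecasts, actuals):
    """Root mean square forecasting error."""
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals))


def modified_dm(errors_a, errors_b, h=1):
    """DM statistic on squared errors with the Harvey et al. (1997) correction.

    Negative values indicate that model A has the lower squared errors.
    The statistic is usually compared with a Student t distribution with n - 1
    degrees of freedom. For h > 1 the variance estimate can turn negative in
    very short samples, in which case the square root is undefined.
    """
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, h))
    dm = d_bar / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm


# Hypothetical one-step-ahead forecast errors of two competing indicators.
evolved_errors = [0.1, -0.2, 0.1, 0.3]
scaled_errors = [1.0, -0.9, 1.2, 0.8]
print(modified_dm(evolved_errors, scaled_errors, h=1))
```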
By means of Granger causality, Dubinskas and Stungurienė (2010) showed that the impact of the financial crisis was greater in Latvia and Estonia than in Lithuania. To complement the previous analysis, we use a Granger causality framework to measure the incremental predictive power of the evolved expectations with respect to the past values of the series to be predicted. Note that the use of year-on-year growth rates of GDP may generate autocorrelated error terms that could bias the results of the Granger causality tests.
In Table 4 we present the F-test results for four different forecast horizons (h = 1, 2, 3 and 4 quarters). Again, we observe notable differences between firms' and consumers' expectations.
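For reference, the F-statistic of a Granger causality test compares a restricted regression (GDP growth on its own past values) with an unrestricted one that adds past values of expectations. The Python sketch below computes the statistic from the two residual sums of squares; the function name and the numbers are hypothetical, and the exact lag specification behind Table 4 is not given in the text.

```python
def granger_f_statistic(rss_restricted, rss_unrestricted,
                        n_restrictions, n_obs, n_params_unrestricted):
    """F = [(RSS_r - RSS_u) / q] / [RSS_u / (T - k)]."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - n_params_unrestricted)
    return numerator / denominator


# Hypothetical example: 50 observations, 5 parameters in the unrestricted model,
# 2 lags of expectations excluded in the restricted model.
print(granger_f_statistic(120.0, 100.0, 2, 50, 5))
```

A large value of the statistic relative to the F distribution with (q, T - k) degrees of freedom leads to rejecting the null hypothesis that expectations do not Granger-cause GDP growth.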
In the case of consumers' expectations, expectations Granger-cause GDP growth for Lithuania and the EU at all forecast horizons, which implies that including past values of expectations improves predictions of the reference series based only on its own past values. For Latvia and Estonia, this is only the case for h = 2. Conversely, consumers' expectations are informed by economic growth in Estonia and Latvia, but not in Lithuania and the EU. Regarding firms' expectations, we find bidirectional causality in all countries for h = 3. For the remaining forecast horizons we find that GDP growth informs firms' expectations in all cases, but not the other way around. These results corroborate the ones obtained in the previous out-of-sample comparison.
Finally, we combine the evolved expectations of firms and consumers by means of a weighted average. Gelper and Croux (2010) have shown that an ad hoc calculation of the aggregation weights of the components of the European Economic Sentiment Indicator (ESI) constructed by the EC makes it possible to improve its predictive capacity. In order to find the relative weights of firms' and consumers' evolved expectations, we use non-linear constrained optimisation (Kwiatkowski, 1992). Specifically, we apply a generalised reduced gradient algorithm that minimises the sum of the squared forecast errors. We impose two restrictions on the weights: that they are non-negative and that they sum to one. The obtained weights are reported in Table 5. It can be seen that while for Estonia and the EU the weights obtained for industry expectations are higher than those for consumer expectations, in Latvia and Lithuania they are quite similar.
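With only two weights that are non-negative and sum to one, the constrained least-squares problem has a closed-form solution, which the following Python sketch uses in place of a generalised reduced gradient solver. This substitution is made purely for illustration; the function name and the data are hypothetical.

```python
def optimal_weights(firms, consumers, actual):
    """Weights (w, 1 - w) minimising the sum of squared errors of
    w * firms + (1 - w) * consumers, subject to 0 <= w <= 1."""
    diff = [f - c for f, c in zip(firms, consumers)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        return 0.5, 0.5  # identical series: every split yields the same forecast
    # Unconstrained minimiser of sum((y - c) - w * (f - c))^2 over w.
    w = sum((y - c) * x for y, c, x in zip(actual, consumers, diff)) / denom
    w = min(1.0, max(0.0, w))  # the objective is convex in w, so clipping is optimal
    return w, 1.0 - w


# Hypothetical expectations and realised growth rates.
firms = [2.0, 4.0, 6.0]
consumers = [0.0, 0.0, 0.0]
actual = [1.5, 3.0, 4.5]
print(optimal_weights(firms, consumers, actual))  # (0.75, 0.25)
```

When the unconstrained solution falls outside the unit interval, the clipped weight places all the mass on one of the two sets of expectations.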
We combine evolved firms' and consumers' expectations by applying the relative weights displayed in Table 5 and generate aggregate expectations of economic growth (Exp.Agg). We compare the obtained expectations to recursive AR forecasts used as a benchmark for several forecast horizons (h = 2, 4 and 8 quarters). We use the Akaike Information Criterion (AIC) for model selection, considering up to a maximum of 8 lags.
We perform a recursive forecasting exercise to evaluate the predictive performance of the evolved expectations across agents for different time horizons. In Table 6 we present the results. Given that the official GDP data are usually published with a delay of around 70 days, expectations are compared with two-quarter-ahead forecasts (h = 2). As the evolved sentiment indicators are not re-estimated after the last in-sample period (2015.Q4), we also include one- and two-year-ahead forecasts (h = 4 and h = 8). While for shorter horizons the evolved indicators only outperform the AR forecasts used as a benchmark in half of the countries, as h increases the evolved aggregate expectations yield lower forecast errors in all countries. These results are in line with those obtained by Gelper and Croux (2010), who found that alternative aggregation schemes of survey variables improved their relative forecasting performance at longer forecast horizons.
Our results provide mixed evidence about the predictive power of survey expectations, and are in line with recent research. While some authors such as Lacová and Král (2015) and Breitung and Schmeling (2013) did not find sufficient evidence regarding the informative content of survey expectations, most studies offer evidence to the contrary, both for consumption in Chile (Acuña et al., 2020) and Indonesia (Juhro & Iyke, 2020), as well as for inflation in Brazil and Turkey (Altug & Çakmakli, 2016) and for GDP in the EA (Claveria et al., 2007; Girardi et al., 2015).

Discussion
In this study, the informative content of survey expectations is analysed. On the one hand, we evaluate their predictive capacity, both for companies and consumers. On the other, we examine the role that machine learning techniques can play to generate more sophisticated aggregation schemes, allowing the construction of sentiment indicators with greater predictive capacity. The main contribution of the present study is threefold. First, we provide researchers with country-specific evolved confidence indicators that generate quantitative estimates of economic growth. Second, we give insight regarding the relative importance of the variables of the industry and the consumer surveys in the Baltic economies. Additionally, we provide information on the lag structure of the variables selected by the algorithm to form part of the evolved indicators. Finally, we evaluate the relative importance of the expectations of both firms and households to mirror economic growth dynamics in each country.
Some of these issues are of great importance when using qualitative survey expectations for economic analysis, and have been previously addressed by Gelper and Croux (2010). The authors used dynamic factor analysis and partial least squares to aggregate the information coming from the survey indicators that are used by the EC to construct the ESI. When comparing the alternative aggregation schemes to the ESI, the authors found that the predictive performance of the ESI was comparable to that of both alternatives, especially for short forecast horizons.
In this study, we focus on the construction of country-specific data-driven sentiment indicators making use of all available information from the industry and the consumer surveys. In spite of the different approach and the different methods used for the construction of sentiment indicators, we also obtain mixed evidence regarding the explanatory power of qualitative survey data on expectations, and observe differences across countries. When aggregating firms' and consumers' expectations, we obtain similar results in Estonia and Latvia for short forecast horizons, while in Lithuania the results are closer to those obtained for the EU. Estonia and Latvia are also the countries in which aggregate expectations show the greatest incremental predictive power when compared to autoregressive forecasts of economic growth. Furthermore, we show the ability of GP to solve optimisation problems for economic analysis. This research links with previous works by Claveria et al. (2018), Duda and Szydło (2011) and Kotanchek et al. (2010), who made use of GP to generate empirical models for monitoring economic activity.
Regarding the implications for economic policy, we provide researchers and practitioners with a set of country-specific confidence indicators. The proposed indicators are easy to implement, and allow transforming the qualitative expectations of companies and consumers into estimates of year-on-year GDP growth rates. Additionally, the proposed data-driven approach does not make any assumption regarding economic agents' behaviour. The fact that the confidence indicators have been generated independently for businesses and consumers allows policy makers to obtain information on both the demand and supply sides of the economy.

Conclusions
The main objective of the paper is three-fold. First, we aim to provide researchers with an empirical modelling approach that allows generating economic sentiment indicators, which transform qualitative survey expectations into estimates of annual GDP growth. Second, we assess the predictive performance of the obtained evolved indicators, providing new evidence regarding the predictive capacity and the lag structure of each of the variables contained in the industry and the consumer surveys conducted by the European Commission (2009). Third, we evaluate the out-of-sample forecasting performance of agents' expectations by comparing them to the official confidence indicators and to autoregressive forecasts used as a benchmark.
Firms' and consumers' expectations about the expected direction of change of a wide range of variables are independently evolved to derive country-specific empirical confidence indicators for the industry and consumers. The analysis is done for the three Baltic economies and the European Union. The methodology is based on the application of evolutionary algorithms that search for the optimal non-linear combination of survey expectations that best tracks the evolution of economic outcomes in each economy.
In order to evaluate the information content of the different variables contained in the surveys and their optimal lag structure, we examine the relative frequency with which they are selected by the algorithm for the generation of the evolved mathematical expressions. We find that firms' production expectations is the variable from the industry survey most frequently selected, both contemporaneously and lagged. We also observe that most questions of the consumer survey appear in the indicators, presenting a multi-modal distribution. The assessments and expectations about the general economic situation and the expectations of unemployment and major purchases are the variables that show the highest predictive power.
Then, we compare the out-of-sample forecast accuracy of firms' and consumers' evolved expectations to the scaled industry and consumer confidence indicators constructed by the European Commission. We also observe differences across countries and between industry and consumer expectations. For the former, only in Lithuania do evolved industry confidence indicators produce significantly lower forecast errors than the scaled confidence indicator, while for the consumer survey that is the case in Latvia and the European Union.
We evaluate the ability of the empirical confidence indicators to predict economic growth in each country. With this aim, we apply Granger causality tests and find that economic growth informs firms' expectations in all cases, but not the other way around. Regarding consumers' expectations, for all forecast horizons, expectations Granger-cause GDP growth in Lithuania and the European Union, which implies that including past values of expectations improves predictions of the reference series based only on their own past values.
Finally, we combine firms' and consumers' evolved expectations to generate aggregate expectations using a generalised reduced gradient nonlinear algorithm that computes the optimal weight in each country. We assess the out-of-sample forecasting performance of aggregate expectations by comparing their accuracy to that of recursive autoregressive forecasts used as a benchmark. We find that the relative performance of aggregate expectations improves as the forecast horizon increases.
These findings aim to improve the forecasting potential of business and consumer survey data in the Baltic countries and the European Union, and to provide new tools to construct indicators that anticipate future demand growth for planning purposes. While we have shown the usefulness of genetic programming for solving optimisation problems, we want to stress the empirical nature of the obtained evolved expressions. Another issue derived from the flexibility of the data-driven approach lies in the fact that there may be multicollinearity between the variables of the expressions generated by the algorithms. Accordingly, an issue left for further research is the design and implementation of recursive experiments to check the robustness of the evolved expressions.

Funding
This work was supported by the Spanish Ministry of Science and Innovation under Grant PID2019-107579RB-I00.

Author contributions
The manuscript is a joint work of the three co-authors. All authors read and approved the final manuscript.

Disclosure statement
The authors declare that they have no competing interests.