A Global Optimization Method Based on the Reduced Simplicial Statistical Model

Abstract. A simplicial statistical model of multimodal functions is used to construct a global optimization algorithm. The search for the global minimum in multidimensional space is reduced to a search over the edges of the simplices covering the feasible region, combined with refinement of the cover. The refinement is performed by subdividing selected simplices, taking into account the point where the objective function value has been computed at the current iteration. For the search over the edges, the one-dimensional P-algorithm based on the statistical smooth function model is adapted. In contrast to a recently proposed algorithm, the statistical model here describes the behaviour of the objective function not over the whole simplex but only over its edges. Testing results of the proposed algorithm are included.


Introduction
Although the majority of global optimization methods are constructed using various heuristics, approaches based on theoretical models of multimodal objective functions remain in the focus of theoretically oriented researchers. To maintain some average optimality of the global search, statistical models of multimodal objective functions are needed. For the basic theoretical ideas and their implementations we refer to the monographs [5,6,9,10]. A disadvantage of global optimization algorithms aimed at some form of optimality is the time-consuming auxiliary computations needed to plan the current iteration. The computational burden depends on the complexity of computing the optimality criterion with respect to a model of the objective function, e.g. the probability of improving the estimate of the global minimum at the current iteration. For stochastic models and radial basis function models, these computations include the inversion of a matrix whose order equals the number of objective function values computed during the previous iterations [13]. The computational difficulty caused by the inversion of large matrices can be avoided using simplified multidimensional statistical models [14,15]; however, the auxiliary computations also remain rather intensive in this case. The reduction of auxiliary computations is an urgent and challenging problem of statistical-models-based global optimization.
A sub-class of one-dimensional models, namely the Markovian stochastic functions, has computationally favourable properties that enable (avoiding the inversion of large matrices) the development of efficient one-dimensional global optimization algorithms. An example of that sub-class is the Wiener process, which was the first model used for this purpose [10]. However, for smooth functions the P-algorithm based on the smooth function model has better convergence properties, as shown in [1]. Our idea is to apply the efficient one-dimensional P-algorithm based on the smooth function model as a subroutine of a multidimensional search algorithm. The search is performed over the edges of an iteratively refined simplicial covering of the feasible region. To justify the use of the P-algorithm for the one-dimensional minimization, a reduced statistical model is used which models the behaviour of the objective function not over the whole feasible region but only over the edges of the current simplicial covering.
Some other advantages of a simplicial covering of the feasible region are outlined in [17]. Since a simplex is the polyhedron in n-dimensional space with the minimal number of vertices, simplicial partitions are preferable when the values of the objective function at all vertices of the partition are used during optimization. Simplicial partitions may be used to triangulate feasible regions of non-rectangular shape defined by linear inequality constraints, which may be exploited to avoid symmetries in optimization problems [16]. Simplicial statistical models may be used to select a candidate simplex [8] in a simplicial branch-and-bound algorithm with Lipschitz bounds [7], which reduces memory requirements and makes optimization faster compared with the common best-first selection strategy.

The Reduced Simplicial Statistical Model
Let us consider the global minimization (GM) problem $\min_{x \in A} f(x)$, $A \subset \mathbb{R}^n$, where $f(\cdot)$ is a continuous function and $A$ is a compact set. As shown in [12], the uncertainty about an objective function typical for 'black box' optimization can be represented by a statistical model of $f(x)$. However, the implementation of multidimensional GM algorithms based on the classical statistical models (random fields) is challenging because of time-consuming auxiliary computations. The simplicial statistical model proposed in [15] is computationally simpler than a Gaussian random field; however, it can be simplified further, as shown below.
Let us briefly introduce the P-algorithm based on the simplicial statistical model [15]. At the $(k+1)$-th minimization iteration, $A$ is covered by simplices $S_j$, $j = 1, \ldots, m$, and $k$ values of the objective function have been computed at the vertices $x_i$ of the simplices. A family of Gaussian variables $\xi_x$ is chosen as the statistical model of the objective function, where the mean of $\xi_x$, $x \in S_j$, is piecewise linear over $S_j$ and the variance of $\xi_x$, $x \in S_j$, is piecewise quadratic over $S_j$. The next value of $f(x)$ is computed at the point of the simplex selected by maximizing the probability $P_k(x) = \mathbf{P}\{\xi_x \le y_{ok}\}$, where $y_{ok} = \min_{i=1,\ldots,k} f(x_i) - \varepsilon$ and $\varepsilon$ denotes an improvement threshold. For a discussion of the choice of the value of $\varepsilon$ we refer to [2,10]. The maximization of $P_k(x)$ over a particular simplex is facilitated by the properties of $P_k(x)$ considered as the objective function of the auxiliary maximization problem. The selected simplex is partitioned by one of several partitioning methods. Experience in testing the version of the P-algorithm proposed in [15] has shown that the maximum point of $P_k(x)$ is frequently attained on an edge of the considered simplex. In such a case the selected simplex is partitioned into two simplices using the maximum point of $P_k(x)$ as a new vertex of both descendant simplices. Even when the maximum point is an interior point of the considered simplex, a partition using the maximum point as a pivot is not necessarily superior to bisection using, e.g., the middle point of the longest edge as the new vertex of the descendant simplices.
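The improvement probability that drives the selection is just the Gaussian distribution function evaluated at a standardized threshold. A minimal sketch, with hypothetical names (`mu` and `sigma` stand for the mean and standard deviation of the model at a point $x$; how they are obtained from the piecewise linear/quadratic model is not shown):

```python
import math

def improvement_probability(mu, sigma, y_best, eps):
    """P{xi_x <= y_best - eps} for a Gaussian variable xi_x ~ N(mu, sigma^2).

    mu, sigma: mean and standard deviation of the statistical model at x;
    y_best: smallest objective value computed so far;
    eps: improvement threshold.
    A degenerate sigma = 0 is treated as a point mass at mu.
    """
    y_ok = y_best - eps
    if sigma <= 0.0:
        return 1.0 if mu <= y_ok else 0.0
    z = (y_ok - mu) / sigma
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

The candidate point maximizing this probability balances a low predicted mean against a high model uncertainty, which is what makes the criterion suitable for global search.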
Let us construct a GM algorithm based on a simplicial covering of the feasible region and the bisection of the selected simplex using a point on an edge as a new vertex of both descendant simplices. Then only a statistical model defined over the edges of the simplices is needed for the construction of the algorithm. Based on the comparison of various statistical models in [11] and on our past experience, we chose the statistical smooth function model defined in [1] to model the behaviour of the objective function over the edges of the simplicial partition. That model is a stationary Gaussian stochastic process with correlation function $r(\cdot)$ satisfying the following regularity assumptions: $r(t) = 1 - \lambda_2 t^2/2 + \lambda_4 t^4/4! + O(t^{4+a})$ for finite $\lambda_2$, $\lambda_4$, and $a > 1$, as $t \to 0$, and $r(t)\log(t) \to 0$ as $t \to \infty$.

The P-algorithm Based on a Reduced Simplicial Statistical Model
The search for the global minimum at the $(k+1)$-th minimization iteration is considered. The feasible region is covered by $m$ simplices, and $k$ function values have been computed at the vertices of the simplices. Assume that the next point for the computation of $f(x)$ can be chosen on an edge only, and that the behaviour of $f(x)$ over an edge is modeled by the smooth function model. The latter assumption implies that the maximum point of $P^{ij}_k(x) = \mathbf{P}\{\xi(x) \le y_{ok}\}$, where $x$ belongs to the line segment connecting the vertices $x_i$ and $x_j$, can be computed by a simple formula [1]. An appropriate monotonically increasing function of $\max_x P^{ij}_k(x)$ can be used as the criterion for the selection of simplices; let $\gamma_{ij}$ denote such a criterion, monotonically related to $\max_x P^{ij}_k(x)$. Therefore, at the current search iteration the vertices $x_i$ and $x_j$ are selected by maximizing the criterion $\gamma_{ij}$, and the point $x_{k+1}$ for the next computation of the objective function value is defined as the maximizer of $P^{ij}_k(x)$ over the selected edge.

The point $x_{k+1}$ is not only the site of the current computation of the objective function value but also the pivot of the refinement of the simplicial covering. The refinement means the bisection of all simplices sharing the selected edge: the point $x_{k+1}$ is the common vertex of all new simplices.

The termination condition of a search method is defined according to the general concept of the method. For example, the termination conditions of local descent methods normally require the fulfilment of necessary minimum conditions with high precision. Such a termination condition, however, would not necessarily be rational for a GM method: many GM methods are not very efficient at the local refinement of the identified local minimizers. Therefore, hybrid methods seem favourable, where a GM method is used to find rough approximations of the main minimizers, which are then refined by a local minimization method.
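The refinement step described above can be sketched as follows, with hypothetical data structures: simplices are tuples of vertex indices, the selected edge is a pair of indices, and the placement of $x_{k+1}$ on the edge (by the formula of the one-dimensional P-algorithm) is assumed to be done elsewhere.

```python
def bisect_sharing_simplices(simplices, edge, new_vertex):
    """Refine a simplicial cover: every simplex containing the selected
    edge is split into two descendants sharing the new vertex; all other
    simplices are kept unchanged.

    simplices: list of tuples of vertex indices;
    edge: pair (i, j) of vertex indices of the selected edge;
    new_vertex: index of the point x_{k+1} placed on that edge.
    """
    i, j = edge
    refined = []
    for s in simplices:
        if i in s and j in s:
            rest = [v for v in s if v not in (i, j)]
            # Replace one endpoint of the edge by the new vertex in each
            # descendant, so both share new_vertex as a common vertex.
            refined.append(tuple(sorted(rest + [i, new_vertex])))
            refined.append(tuple(sorted(rest + [j, new_vertex])))
        else:
            refined.append(s)
    return refined
```

For example, bisecting the diagonal shared by the two triangles of a triangulated square splits both triangles, so the cover of two simplices becomes a cover of four with the new vertex common to all of them.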
Although in the present paper we do not consider the hybridization, the termination condition will be defined supposing the refinement of the best or several best candidate solutions by a local method.
During the search, the edges incident to the best function values found become the shortest ones because of the point placement rule (3.2). The current point for the computation of the objective function value is selected by the P-algorithm on the edge where an improvement is most probable. Therefore, the selection of a short edge indicates that it is likely located close to the true minimizer. A theoretical investigation of the relation between the length of the shortest edge and the distance between the best point found and the true minimizer is extremely difficult. Nevertheless, such a relation can be conjectured, and it seems reasonable to stop the GM search when the length of the selected edge becomes shorter than a predefined Δ, assuming that the current best point then belongs to the Δ-vicinity of a true minimizer not containing other local minimizers.
The choice of the parameter ε is motivated by its influence on the search strategy of the original one-dimensional P-algorithm, where an increase of ε makes the search more global, and, conversely, a vanishing ε implies a stronger concentration of trial points in the vicinity of the best point found, ensuring better asymptotic convergence of the candidate solution to the minimizer [2,10]. The proposed multidimensional algorithm consists of single steps of the one-dimensional P-algorithm repeated with different input data, and it is intended for roughly locating a global minimizer, not for its precise computation. Therefore, ε should be chosen sufficiently large. An acceptable empirical rule of thumb is to choose the value equal to half the difference between the average of the computed objective function values and the current minimum.
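The rule of thumb above is a one-liner; a sketch (the function name is ours):

```python
def improvement_threshold(values):
    """Empirical rule of thumb from the text: eps equals half the
    difference between the average of the objective function values
    computed so far and the current minimum.
    """
    avg = sum(values) / len(values)
    return 0.5 * (avg - min(values))
```

Since the average shrinks toward the minimum as trial points concentrate near good candidate solutions, this choice of ε automatically becomes smaller as the search progresses.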
The search by the constructed algorithm is illustrated by the results of minimizing two well-known test functions. The first test function considered, by Rastrigin, is defined by the formula $f(x) = x_1^2 + x_2^2 - \cos(18x_1) - \cos(18x_2)$; it has more than 30 local minima in the feasible region. Contour lines of this function are presented in Figure 1a. The global minimum is equal to −2, and it is attained at the point (0, 0).

With the parameter defining the very rough termination condition Δ = 0.1d, where d is the length of the diagonal of the feasible region, the algorithm stopped after 71 computations of objective function values. The best value found was −1.9986, and the distance between the best point found and the true solution was 0.001d. The simplicial covering of the feasible region after termination is shown in Figure 1b. Such a fast location of the global minimizer with high precision can be considered accidental. However, the continuation of the search (by decreasing Δ) shows a rational concentration of points in the attraction regions of the main local minima. In the case of Δ = 0.05d the algorithm stopped after 182 computations of objective function values, and the simplicial covering is shown in Figure 2a; the previous approximation was not improved. The distribution of the 361 points where the objective function values were computed before stopping with Δ = 0.03d is shown in Figure 2b. A further concentration of points in the regions of attraction of the main local minima does not seem rational and can be explained by the inadequacy of the statistical model for describing the local behaviour of the objective function. For this test function the rational termination condition is defined by Δ = 0.05d.
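A direct check of this test function, assuming the classical two-dimensional Rastrigin form used in the test collection [3] (the exact formula does not survive in the text; the form below matches the stated global minimum of −2 at the origin):

```python
import math

def rastrigin2(x1, x2):
    """Two-dimensional Rastrigin test function in the classical form
    (an assumption here); global minimum -2 attained at (0, 0)."""
    return x1 * x1 + x2 * x2 - math.cos(18.0 * x1) - math.cos(18.0 * x2)
```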
A similar experiment has been performed with another well-known test function, by Branin. The function is defined by the formula $f(x) = \bigl(x_2 - \frac{5.1}{4\pi^2}x_1^2 + \frac{5}{\pi}x_1 - 6\bigr)^2 + 10\bigl(1 - \frac{1}{8\pi}\bigr)\cos(x_1) + 10$. The minimization was performed similarly to the previous case. The algorithm with Δ = 0.1d, Δ = 0.05d, and Δ = 0.03d stopped after 65, 151, and 293 computations, respectively. The best function values found were 0.5343, 0.4620 and 0.4056. The distances between the best point found and the global minimizer were 0.0279d, 0.0096d, and 0.0117d. The search results are illustrated by Figures 3b and 4. The results of this experiment support the conclusion of the previous experiment that the choice Δ = 0.05d is rational.
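For reproducibility, the standard form of the Branin function (an assumption: only the fragment "cos(x1) + 10" of the formula survives in the text) with its known minimum value of about 0.397887, which the reported best values 0.5343, 0.4620, 0.4056 approach:

```python
import math

def branin(x1, x2):
    """Branin test function in its standard form (an assumption here).
    Three global minimizers, e.g. (pi, 2.275), with value ~0.397887."""
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    t = 1.0 / (8.0 * math.pi)
    return (x2 - b * x1 ** 2 + c * x1 - 6.0) ** 2 \
        + 10.0 * (1.0 - t) * math.cos(x1) + 10.0
```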

Testing Results
Various test problems (n ≥ 2) for global optimization from [3,4] have been used in our experiments. Test problems with n = 2 and n = 3 are numbered according to [3]; for n ≥ 4, function names from [4] are used. The results are shown in Table 1. The known global minimum is denoted by f*, the approximation found by the proposed algorithm is denoted by f_min, the relative distance between the known minimizer and the best point found is denoted by ΔX/d, and the last column shows the number of function evaluations (NFE).
Let us comment on the testing results for n = 2 and n = 3 first. The tolerance in the termination condition was set to Δ = 0.05d, based on the discussion in the previous section. In all cases except problems 13 and 23, the global minimum point was found with precision better than Δ. The number of function calls is considerably smaller than for a Lipschitz model-based algorithm with rougher precision [3]; this advantage is gained at the price of forgoing the guaranteed solution with prescribed accuracy (of any problem with a known Lipschitz constant). Failures can occur in specific cases which, we believe, are not very common in practical applications. For example, the best point found for test problem 13 was (0.0762, 0.1829) with function value 0.0599, while the true minimum point is (0.0115, 0.0144) with function value 0.0001. Let us mention that the function value at the point (0.01, 0.01), equal to 1.1304, was computed by the algorithm. Such sharp changes of function values (similar to a discontinuity, with respect to changes of the arguments of size 0.001) are not foreseen by the statistical model used. Similar considerations apply to test problem 23. For the other test problems, both the approximation of the global minimum and the approximation of the global minimizer are close to the known ones.

The number of function calls increases with the dimensionality of the problems. This is natural, since the number of simplices (say, simplices of a fixed edge length) needed to cover a unit cube also increases exponentially with n. Assuming that the number of local minimizers of the targeted objective functions remains of the same order as for the considered two- and three-dimensional test functions, the tolerance in the termination condition can be increased. The results in Table 1 show that with Δ = 0.1d the four- and five-dimensional test problems were solved successfully, with the algorithm stopping after a relatively modest number of function evaluations.
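The growth of the cover with dimension can be made concrete: the standard (Kuhn) triangulation of the unit n-cube, a common way to build an initial simplicial cover, consists of n! simplices. A sketch illustrating this count (the connection to the algorithm's initial cover is our illustration, not stated in the text):

```python
import math

def kuhn_cover_size(n):
    """Number of simplices in the standard (Kuhn) triangulation of the
    unit n-cube: n!.  Illustrates how even the initial simplicial cover
    grows rapidly with the dimension n."""
    return math.factorial(n)
```

Already for n = 5 the initial cover has 120 simplices, which is consistent with the observed increase of function evaluations with dimensionality.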

Conclusions
A simplicial covering of the feasible region is suitable for adapting one-dimensional statistical models of multimodal functions to multidimensional global optimization. The proposed algorithm outperforms the Lipschitz model-based algorithm with respect to the number of function calls; this advantage is gained at the price of possible failures for atypical problems.