Multiple Solutions of Mixed Variable Optimization by Multistart Hooke and Jeeves Filter Method

In this study, we propose a multistart method based on an extended version of the Hooke and Jeeves (HJ) algorithm for computing multiple solutions of mixed variable optimization problems. The inequality and equality constraints of the problem are handled by a filter set methodology. The basic ideas present in the HJ algorithm, namely the exploratory and pattern moves, are extended to consider two objective functions and to handle continuous and integer variables simultaneously. This proposal is integrated into a multistart method as a local search procedure that is repeatedly invoked to converge to different global and non-global optimal solutions starting from randomly generated points. To avoid repeated convergence to previously computed solutions, the concept of region of attraction of an optimizer is implemented. The performance of the new method is tested on benchmark problems. Its effectiveness is emphasized by a comparison with a well-known solver. 2164 M. Fernanda P. Costa et al. Mathematics Subject Classification: 90C11; 90C26; 90C56


Introduction
This paper presents a multistart method based on a derivative-free local search procedure, the Hooke and Jeeves method [8], to locate multiple solutions of nonlinear constrained mixed variable optimization (MVO) problems.This type of problem contains both continuous and integer decision variables and we use the following mathematical form min f (x, y) subject to g i (x, y) ≤ 0, i = 1, . . ., m h j (x, y) = 0, j = 1, . . ., p l x ≤ x ≤ u x , l y ≤ y ≤ u y where x, l x , u x ∈ R nx , y, l y , u y ∈ Z ny (in particular y ∈ {0, 1} ny ), n x is the number of continuous variables, n y is the number of integer variables, l x and l y are the vectors of the lower bounds for the continuous and discrete variables respectively, and u x and u y are the vectors of the corresponding upper bounds.We will use n to represent the total number of variables of the problem.We do not assume that the functions f , g i , i = 1, . . ., m and h j , j = 1, . . ., p are convex and thus many global and local (non-global) solutions may exist.It is not an easy task to compute just one solution of the MVO problem, to compute multiple solutions (global and local) in an MVO context is therefore one of the most complex optimization problems.When addressing the issue of computing just one global optimal solution of an MVO problem, stochastic and deterministic methods are available.The stochastic methods to compute global solutions are nowadays common and easy to implement.Their convergence properties do not depend on the structure of the problem at hand.Studies about the convergence properties of stochastic methods are capable of guaranteeing, in the limit, convergence to a global solution with probability one.One of the most used stochastic algorithms is the so called multistart.It is very popular since it can be used in a wide range of applications and it is easy to implement.Genetic and evolutionary algorithms for computing a global optimum of problems like (1) are presented in [7,11].Known swarm intelligence based optimization algorithms, such as ant colony optimization, have been used to solve MVO problems [14].Heuristics are also common.For example, in [15], an effective exhaustive enumeration method where only a portion of all candidate suboptimal solutions realized during the search are examined and poor points are discarded in favor of more promising ones, is proposed.
The most common deterministic approaches to compute a global solution of MVO problems are branch and bound techniques, outer approximation, general Benders decomposition and extended cutting plane methods.The main advantage of these approaches is that they may guarantee to find the global optimum, although they also require large amounts of computational time.In general, the complexity rises exponentially with the dimension of the problem.
Recent derivative-free methods for locating a local minimizer of MVO problems are presented in [1,12].In the first paper, the generalized pattern search algorithm for linearly constrained (continuous) optimization was extended to mixed variable problems and the constraints are treated by the extreme barrier approach.In the second paper, a minimization of a penalty function distributed along all the variables is performed.Continuous and discrete search procedures, as well as a penalty parameter updating rule are the most relevant issues of the therein presented method.
In this paper, we address the issue of locating multiple solutions of MVO problems by using a multistart paradigm and extending the Hooke and Jeeves (HJ) method to inequality and equality constrained problems and integer variables.The HJ method, developed for unconstrained optimization, uses exploratory and pattern moves to locally search for a point that improves the objective function value relative to a central point of the search [8,9].When the variables of the problem must satisfy a set of bound or/and inequality and equality constraints, the basic HJ algorithm should be extended to ensure that only feasible points are considered as potential solutions.The extension herein presented uses a reformulation of problem (1) as a bound constrained bi-objective optimization problem and includes a filter methodology [2,3,5] to assess objective and constraint violation functions to promote convergence to feasible and optimal solutions.Furthermore, discrete values for the integer variables are ensured by using a specific pattern of points spread over the search space.The multistart HJ filter method presented in this study is an extension of the work proposed in [4], in the sense that multiple solutions have now to be located in a single run of the algorithm targeting both global and local minimizers.A multistart method is a suitable stochastic strategy to locate multiple solutions.Since some solutions may be computed more than once, the concept of region of attraction is integrated into the algorithm to avoid repeated convergence to previously computed solutions.
The remaining part of this paper is organized as follows.In Section 2 we present the multistart method and in Section 3 we list our strategy for considering integer variables and for using the filter methodology within the HJ local search.We report in Section 4 the results of the numerical experiments carried out with a benchmark set of MVO problems, as well as some comparisons with another multistart based method.Finally, we conclude the paper in Section 5.

Multistart method
This section presents the main ideas of a multistart method that uses the concept of region of attraction to avoid repeated convergence to previously computed solutions.This is the most popular stochastic method when multiple solutions are required.To compute multiple solutions, the simplest idea is to repeatedly call/invoke a local search procedure starting each time from a different randomly selected point, so that hopefully different optimal solutions will be found.However, the same solutions may be computed over and over again.Convergence to previously computed solutions can be avoided by implementing a clustering technique which aims to define prohibited regions based on the closeness to the previously computed solutions [16,17,18].Each time a point is randomly selected from these prohibited regions, it will be discarded since the implementation of the local search procedure will eventually produce one of the previously computed solutions.This way the number of calls to the local search procedure is reduced.The ultimate goal of a multistart algorithm is to invoke the local search procedure as many times as the number of optimal solutions of problem (1).
Our implementation of a multistart method for MVO problems defines the randomly selected starting points as follows: x i = (l x ) i + λ((u x ) i − (l x ) i ) for i = 1, . . ., n x y j = (l y ) j + τ j for j = 1, . . ., n y where the notation (l x ) i stands for the component i of the vector l x , λ is a random value uniformly distributed in [0, 1] and τ j is a number randomly selected from the set {0, 1, . . ., ((u y ) j − (l y ) j )}.In what follows, we will use the following notation: X = (x, y) T to represent the vector of the n variables, L = (l x , l y ) T and U = (u x , u y ) T to represent the vectors of the bounds on the variables.
The region of attraction of a minimizer, X * = (x * , y * ) T , associated with a local search procedure L, is defined as: where L(X) produces the minimizer X * after invoking the local search procedure L starting at the point X.This means that if a point X is randomly selected from the set [L, U ] and belongs to the region of attraction A of the minimizer X * , then this minimizer would be obtained when L is invoked starting from that X.Thus, the main idea is to invoke the local search procedure only when the randomly selected point does not belong to any of the regions of attraction of already computed minimizers or, equivalently, to the union of those regions of attraction, since they do not overlap.However, it is not an easy task to compute the region of attraction A of a minimizer X * .An alternative process consists of estimating the probability, p, that a sampled point will not belong to the union of the regions of attraction of the previously computed minimizers.This is estimated by the probability that the sampled point will not belong to the region of attraction of the nearest to X minimizer, herein denoted by X * o (where o is used to denote the index of this particular minimizer), i.e., where s is the number of the already computed minimizers and A o is the region of attraction of the minimizer X * o (see details in [18]).In practice, the value of p is approximated by which is the probability that the sampled point does not belong to the sphere centered at the minimizer X * o with radius R o .For any i, let R i represent the maximum attractive radius of the minimizer (j) i is one of the sampled points that converged to the minimizer X * i after invoking the local search procedure.Given X, let d i = X − X * i be the distance between the point X and the minimizer X * i .If d i < R i then the point X is likely to be inside the region of attraction of X * i and the local search procedure ought not to be invoked since the probability of converging to the minimizer X * i is high.However, if the direction from X to X * i is ascent then X is likely to be outside the region of attraction of X * i and the local search procedure ought to be invoked, starting from X, since a new minimizer could be detected with high probability [18].Thus, the estimated value of the probability that X / where is a function that depends on the directional derivative of f along the direction from X to X * i , r i is the number of times the minimizer X * i might have been recovered thus far, and Φ is a function that: • tends to 0 as d i R i approaches 0, • tends to 1 as d i R i approaches 1, • tends to 0 as r i increases.Here, we use the function Φ proposed in [18] for 0 and compute (X, X * i ) using forward differences to estimate de gradient of f at X, δf (X), (X, . We consider the direction from X to X * i ascent if (X * i − X) T δf (X) > 0. Algorithm 1 displays the main steps of the multistart algorithm.The set ∆ * , empty at the beginning of the iterative process, contains the computed minimizers that are different from the previous ones.To check if a computed minimizer, X * , has been previously identified the following proximity conditions must hold for any l in the indices set of previously identified minimizers {1, . . ., s}, and a small γ * > 0. Although this type of methods is simple, they would not be effective if a bad stopping rule is used.The main goal of a stopping rule is to make the algorithm to stop when all minimizers have been located with certainty.Further, it should not require a large number of calls to the local search procedure to decide that all minimizers have been found.A simple stopping rule, adopted from [10], uses an estimate of the fraction of uncovered space, P (s) = s(s+1) t(t−1) , where s gives the number of (different) computed minimizers and t represents the number of times the local search procedure has been invoked.The multistart algorithm then stops if P (s) ≤ , for a small > 0. However, if this condition is not satisfied the algorithm is allowed to run for K max iterations.
The next section addresses the issue related with the local search procedure, L. It is a derivative-free pattern search type method that is prepared to handle inequality and equality constraints by means of a filter methodology [2,3,5,7].The method extends the Hooke and Jeeves approach, as outlined in [8,9], for solving a mixed integer constrained optimization problem.The herein presented study is an extension of the work proposed in [4], in the sense that multiple optima of problems like (1) are required rather than just a global optimum.Two crucial tasks are to be carried out: one is related with the definition of the pattern in the HJ search procedure to take care of continuous and discrete variables simultaneously, and the other uses the filter methodology to handle the inequality and equality constraints.In what follows this extension will be called 'mixed-HJ-filter'.

The 'mixed-HJ-filter' method
The local search procedure is an iterative method that is applied to a sampled point (x, y) and provides a trial point (x + , y + ) that is a global or local mini- Randomly generate X ∈ [L, U ] using (2); 5: Set o = arg min j=1,...,s d j ≡ X − X * j ; 6: if the direction from X to X * o is ascent then if rand(0, 1) < p then 16: end if

25:
Set k = k + 1; 26: until s(s + 1)/ ((t(t − 1)) ≤ or k ≥ K max mizer of the problem (1).Using a filter methodology [5], the basic idea is to reformulate (1) as a bi-objective optimization problem where θ(x, y) = g(x, y) + 2 2 + h(x, y) 2 2 is the nonnegative constraint violation function and v + = max{0, v}.The filter technique incorporates the concept of nondominance, borrowed from the field of multi-objective optimization, to build a filter set that is able to accept trial approximations if they improve the constraint violation or the objective function value.A filter F is a finite set of points (x, y), corresponding to pairs (θ(x, y), f (x, y)), none of which is dominated by any of the others.A point (x, y) is said to dominate a point (x , y ) if and only if θ(x, y) ≤ θ(x , y ) and f (x, y) ≤ f (x , y ).
The most relevant steps of the 'mixed-HJ-filter' algorithm are described in the remaining part of this section.At the beginning of the iterative process, the filter is initialized to F = {(θ, f ) : θ ≥ θ max }, where θ max > 0 is an upper bound on the acceptable constraint violation.

Exploratory moves
Based on the current point (x, y), the search begins with a central point, set as (x, ȳ) = (x, y), and defines a sequence of trial approximations along the unit coordinate vectors e i ∈ R n with a fixed step size α x ∈ (0, 1]: where D ∈ R nx×nx is a weighting diagonal matrix.The search for acceptable trial points follows the rules: • when searching along each e i to find a trial point that is acceptable by the filter, the positive direction is tried first; • each time a point (x + , y + ) falls outside [L, U ] the point is projected onto the search space [L, U ]; • if the trial point improves over (x, ȳ), reducing θ or f by a certain amount, i.e., if one of the conditions holds, for fixed constants γ θ , γ f ∈ (0, 1), and is acceptable by the filter, (x + , y + ) is accepted and replaces (x, ȳ); • the search goes on along the remaining unit coordinate directions, as described above; • if, on the other hand, the trial approximation (x + , y + ) is dominated by the current filter, the search for a trial point is repeated, according to (7), but along the negative direction, before passing to the next unit coordinate direction.We also note that whenever a point is acceptable, the point is added to the filter F, and all dominated points are removed from the filter.
When the search along the n coordinate vectors terminates, the most nearly feasible point (it may be a feasible one) among the accepted trial points is selected.Let that point be denoted by (x inf , y inf ).If (x inf , y inf ) = (x, y), the search is termed successful and the vector (x inf , y inf ) − (x, y) defines a promising direction, known as pattern direction; otherwise the search is called unsuccessful, and a restoration phase is invoked.
We note that if a sequence of trial points is feasible, the condition (8) guarantees that the trial approximation (x + , y + ) must satisfy the second condition in order to be acceptable.This way the optimal solution is guaranteed.

Pattern moves
When a successful search occurs, a move along the pattern direction is carried out.The search along the n unit coordinate vectors, using (7) and the above described rules, is conducted using (x inf , y inf ) + ((x inf , y inf ) − (x, y)) as the central point.At the end, the most nearly feasible point among the generated trial points is selected.When a new acceptable (x inf , y inf ) is found then it is accepted as the new iterate, replaces (x, ȳ), and the pattern move is repeated.

Restoration phase
When it is not possible to find a non-dominated trial point, a restoration phase is invoked.In this phase, the most nearly feasible point in the filter, (x inf F , y inf F ), is recuperated and the search along the n coordinate vectors, as previously described and using (7), is repeated but with (x inf F , y inf F ) as the central point.When a non-dominated trial point is found, it becomes the central point for the next iteration.Otherwise, the iteration is unsuccessful, the search returns back to the current (x, ȳ), the step size α x is reduced, and a new search consisting of exploratory moves and pattern moves is repeated taking (x, ȳ) as the central point.

Termination rule
When α x is reduced within an unsuccessful iteration, it may fall below a sufficiently small positive tolerance, α min .This is an indication that the first-order convergence has been attained [9] and the 'mixed-HJ-filter' algorithm may terminate.
In this section, we present the full description of the problems, the results obtained by the proposed algorithm and a comparison with the Multistart function from MATLAB TM Optimization Toolbox.The proposed 'MS+m-HJ-f' algorithm found two solutions.The global solution was found in all the 30 runs and the local in 24.The best global function value -6.666657 (with a constraint violation of 0.000000e+00) and minimizer (6.6665758e-01, 6) was produced by the 'mixed-HJ-filter' algorithm, in one of the 15 local 'mixed-HJ-filter' calls of run 28, after 0.861 seconds (sec), 604 function evaluations (feval) and 44 iterations (iter).On the other hand, the best value for the local solution, -5.000000 (with a constraint violation of 0.000000e+00) was attained at the point (4.000000e+00, 1) in one of the 18 local calls of run 20, after 1.273 sec, 369 feval and 31 iter.
When solving this problem, our algorithm found two optimal solutions, one global and one local.The global was located in all runs and the local only in two out of the 30 runs.(In the other runs, the algorithm terminated with a solution that is far away from the expected optimal value.)The best value 189.2946 (with a constraint violation of 0.000000e+00) for the global minimum was reached in one of the 5 local calls of the 11th run, after 4.532 sec, 683 feval and 28 iter of the 'mixed-HJ-filter' algorithm.The located minimizer was (0.000000e+00, 1.666417e+01, 100).On the other hand, the best value for the local solution 291.7167 (with a violation of 4.551885e-11) was found during one call of the local procedure (out of 15) of the run 16, after 1771 feval, 66 iter and 7.793 sec.The attained minimizer was (1.544511e+01, 6.831081e+00, 218).

Problem 4.5
(Example 13 in [13], f1 in [7]) with 2 known solutions (1 global and 1 local) The 'MS+m-HJ-f' algorithm identified two solutions, one global and one local.The global was located in all runs and the local was located only in two runs.The point (5.00000000e-01, 1), with a best global value of 2.00000000 and a constraint violation of 0.000000e+00, was attained during one of the 9 local calls of run 18, after 0.531 sec, 261 feval and 29 iter of the 'mixed-HJfilter' algorithm.The best local function value 2.236128 (with a constraint violation of 0.000000e+00) was reached at (1.118064e+00, 0), during one of the 12 local calls of run 8, after 0.709 sec, 549 feval and 41 iter.(Example 15 in [13], f9 in [7], Test Problem 1 in [6]) with 1 global and 1 local as reported in [6] The 'MS+m-HJ-f' algorithm located in 28 runs a global minimum around 7.66718 and in all runs a local around 8.240.However, in 14 of the 30 runs, a minimum around 8.476 was identified by our algorithm.In one of the 15 local calls of run 24, the algorithm converged to (1.118035e+00, 1.310370e+00, 0, 1, 1), yielded the best global value 7.667178, with a constraint violation value of 4.99375e-12, and required 2054 feval, 63 iter and 4.369 sec.The best value found for one of the local minima was 8.240262 with a constraint violation value of 4.178278e-11, at the point (5.000063e-01, 2.080083e+00, 1, 0, 1), after 2376 feval, 48 iter and 7.230 sec (during one of the 9 local calls of run 18).From the results, it is possible to conclude that the point (1.118026e+00, 2.080097e+00, 0, 0, 0) corresponds to the other local minimizer, with a best function value of 8.476343 and a constraint violation of 1.159699e-09, and was produced after 1912 feval, 50 iter and 4.646 sec of a local call (out of 15) of run 13.(Example 14 in [13], f6 in [7], Test Problem 3 in [6]) with 1 global solution and some local solutions (9 were reported in [13] although one of the reported local minimizers is the point (0.20, 0.800, 1.908, 1, 1, 0, 1) with function value 4.580, whereas the reported global is (0.2, 0.8, 1.907878, 1, 1, 0, 1) with a function value of 4.579582.) The proposed 'MS+m-HJ-f' algorithm located the global minimum around 4.5796 in all runs.Some local minima have been identified: one around 5.807 in 6 runs, other around 5.780 in 4 runs and another around 7.768 in 4 runs.The best value for the objective function value at the global minimizer (1.999892e-01, 7.999976e-01, 1.907877e+00, 1, 1, 0, 1) was 4.579609 (with no violation) and was obtained during one local call (out of 9) of run 3.The 'mixed-HJfilter' algorithm required 5486 feval, 76 iter and 28.430 sec.To reach the best value of the local minimum 5.806857 at (0, 5.010788e-01, 1.498921e+00, 1, 1, 1, 1) the local 'mixed-HJ-filter' algorithm required 6039 feval, 81 iter and 25.929 sec (in one of the 21 calls of run 27).To reach the best value 5.780268 at the local minimizer (7.000959e-01, 7.999291e-01, 1.499948e+00, 0, 1, 1, 0), 2217 feval, 59 iter and 2.904 sec were required in one of the 12 local calls of run 24.Finally, the best value found for the other local minimum was 7.815788 (with no violation) at the point (7.814582e-01, 1.280481e-01, 1.499893e+00, 0, 0, 1, 0) after 3950 feval, 62 iter and 6.565 sec (in one of the 21 local calls of run 12).
We now include a comparison with the results obtained by the Multistart function from MATLAB, where we have integrated our 'mixed-HJ-filter' algorithm as its local solver.The Multistart function accepts Optimization Toolbox functions fmincon, fminunc, lsqnonlin and lsqcurvefit as local solvers, but none of these are prepared to handle integer variables.For a fair comparison we set the number of randomly generated starting points in the Multistart function to the same value used in our 'MS+m-HJ-f' algorithm, which was 21 (= K max + 1).In the Multistart function, a new minimizer is identified by checking its objective function value and the vector itself with those of previously located optimal solutions.We set that tolerance to the value herein used for γ * .Table 1 aims to compare the efficiency of both multistart kind algorithms.For each problem, the table shows averaged values over the 30 runs: HJcall avg (average number of local 'mixed-HJ-filter' calls), 'N f e avg ' (average overall number of function evaluations of a run) and 'T avg ' (average overall time of a run, in seconds) only for our algorithm.For example, when solving Problem 4.1, the 'mixed-HJ-filter' algorithm was invoked on average 16.5 times per run (out of a maximum of 21 (=K max + 1)).The proposed 'MS+m-HJ-f' requires on average 11513 function evaluations and 33 seconds per run.On the other hand, we observe that the Multistart function invokes the local 'mixed-HJ-filter' procedure in every iteration and is consequently more expensive in terms of function evaluations than 'MS+m-HJ-f'.We may conclude from the results of Table 1 that the proposed 'MS+m-HJ-f' behaves favorably when compared with the Multistart function.Now, Table 2 reports for each identified optimal (global or local) solution: 'SR' (success rate -the percentage of runs that found that particular solution at least in one call to the local procedure, out of 30), 'f avg ' (the average of the best f values obtained during the runs where the solution was identified) and 'HJ−N f e avg ' (the average number of function evaluations required by the 'mixed-HJ-filter' procedure while converging to those best solutions).Regarding the global solutions, the results show that both algorithms have equal 'SR' although the 'f avg ' produced by the Multistart function are slightly better than those obtained by our algorithm.We remark however that 'MS+m-HJ-f' is more effective since those values of 'SR' were obtained with less local 'mixed-HJ-filter' calls and fewer function evaluations (recall HJcall avg and 'N f e avg ' respectively).As far as the local minima are concerned, we conclude that: i) both algorithms found almost the same number of minima (Multistart function did not find one of the locals in Problem 4.7); ii) 'MS+m-HJ-f' has a higher 'SR' in 3 out of 11 minima, while Multistart function has a higher 'SR' in 5 (and equal 'SR' in 3 minima); iii) Multistart function produces slightly better average function values mostly (at a cost of more 'mixed-HJ-filter' calls and overall function evaluations, see Table 1).

Conclusions
We presented a multistart kind algorithm that uses an extended version of the HJ method, as a local search procedure, for computing multiple solutions of MVO problems.The multistart algorithm aims to randomly generate points from the search space and to invoke a local search procedure to converge to the multiple solutions of the problem.To avoid repeated convergence to an already identified solution, the algorithm uses the concept of region of attraction of a minimizer.To be able to handle continuous and discrete variables simultaneously and the inequality and equality constraints, the extended HJ method uses a specific pattern of points spread over the search space and integrates a filter set methodology by solving a bound constrained bi-objective optimization problem each time it is invoked.
The new algorithm has been tested with benchmark problems and compared with the Multistart function from MATLAB, where we have integrated our 'mixed-HJ-filter' algorithm as the local solver in the Multistart function.
From the comparison, we conclude that the herein presented multistart kind algorithm, based on the 'mixed-HJ-filter' algorithm, behaves rather well and is more effective, in the sense that it is able to identify the same number of solutions using on average less local 'mixed-HJ-filter' calls and fewer function evaluations than the other solver in comparison.