DEVELOPING A HYBRID DATA MINING APPROACH BASED ON MULTI-OBJECTIVE PARTICLE SWARM OPTIMIZATION FOR SOLVING A TRAVELING SALESMAN PROBLEM

Abstract. A traveling salesman problem (TSP) is an NP-hard optimization problem, so intelligent and heuristic methods are needed to solve such a hard problem in less computational time. This paper proposes a novel hybrid approach that combines data mining (DM) with multi-objective particle swarm optimization (MOPSO), called intelligent MOPSO (IMOPSO). The first step of the proposed IMOPSO is to find efficient solutions by applying the MOPSO approach. Then, the GRI (Generalized Rule Induction) algorithm, a powerful association rule mining method, is used to extract rules from the efficient solutions of the MOPSO approach. Afterwards, the extracted rules are applied to improve the solutions of the MOPSO for large-sized problems. The proposed approach (IMOPSO) conforms to a standard data mining framework called CRISP-DM and is tested on five standard bi-objective problems. The results of this approach are compared with those obtained by the MOPSO approach, and they show the superiority of the proposed IMOPSO in obtaining more and better solutions than the MOPSO approach.


Introduction
A traveling salesman problem (TSP) is a traditional and well-known optimization problem in the field of operations research. There are n cities, and the distances between cities are specified and known. In this paper, a symmetric TSP is considered, in which the distance from city i to city j is equal to the distance from city j to city i. A salesman starts from one arbitrary city, visits all cities exactly once, and at the end returns to the first city. In other words, the aim of a TSP is to find a tour between cities that minimizes the total distance travelled by the salesman. This problem can be represented by a graph, in which cities are the vertices and the route between two cities is an edge; the weight of each edge is the distance between the two cities it connects. A Hamiltonian tour is a tour that visits all vertices exactly once. Therefore, in this case, the purpose is to find a Hamiltonian tour such that the sum of edge weights in the tour is minimized. The input information is the distance matrix that gives the distance between any two cities. It can be obtained from the coordinates of cities in two- or three-dimensional space. Each city is specified by horizontal and vertical indices in a two-dimensional plane, and the distance between each pair of cities is the Euclidean distance between the two corresponding points.
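As a concrete illustration of the last point, the distance matrix can be built from 2-D city coordinates and a tour evaluated against it. The following is a minimal Python sketch; the function names are illustrative, not from the paper:

```python
import math

def distance_matrix(coords):
    """Build a symmetric Euclidean distance matrix from 2-D city coordinates."""
    n = len(coords)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            d[i][j] = d[j][i] = math.hypot(dx, dy)  # symmetric TSP: d[i][j] == d[j][i]
    return d

def tour_length(tour, d):
    """Total weight of the Hamiltonian tour, including the return edge to the start."""
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))
```

In the bi-objective setting discussed below, two such matrices would be built, one per objective.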
There are many studies in the literature that use intelligent approaches, such as artificial neural networks (ANNs), for solving TSPs. Masutti and de Castro (2009) developed a modified version of an immune self-organizing neural network for solving a TSP; the results show that the performance of their proposed algorithm is better than that of other neural network methods in the literature. Leung et al. (2004) applied an expanding self-organizing map, called ESOM, to instances ranging from 50 to 2400 cities; the results show the superiority of the proposed approach over several other SOM approaches in the literature. Jin et al. (2003) proposed an integrated SOM (ISOM) with a new learning rule that combines three learning procedures available in the literature. Yan and Zhou (2006) applied a three-tier multi-agent approach to obtain solutions for TSPs; the tiers are an ant colony optimization agent, a genetic algorithm agent, and a fast local searching agent, and the results indicate good performance in terms of both solution quality and computational time. Tan et al. (2006) developed an improved multi-agent approach to solve large-scale TSPs; it uses three kinds of agents with different functions: generating a new solution, optimizing the current solution group, and refining the best solution. The experimental results show the good performance of the proposed approach. Liu et al. (2006) developed a hybrid of particle swarm optimization (PSO) and a memetic algorithm for solving TSPs, which also includes a simulated annealing (SA) based local search.
In the real world, there is usually more than one objective function; for example, it may be necessary to minimize distance, cost, time and risk simultaneously. It is then necessary to consider more than one distance matrix between cities in order to minimize multiple objectives. In this paper, like Cheng et al. (2011), Samanlioglu et al. (2008), Jozefowiez et al. (2008) and Zhong et al. (2010), a bi-objective TSP is considered. In multi-objective problems, there is no single best solution; instead, a collection of solutions is considered as the best. This collection, called the set of non-dominated (efficient) solutions, is related to the dominance concept explained below. Consider two solutions A and B in a minimization multi-objective problem, and suppose that the following two conditions hold:
a) Every objective value of solution A is less than or equal to the corresponding objective value of solution B.
b) At least one objective value of solution A is strictly less than the corresponding objective value of solution B.
In this case, solution A is said to dominate solution B; indeed, solution B has no advantage over solution A. If no solution dominates solution A, then solution A is called a non-dominated solution.
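The dominance test above translates directly into code. This is a small sketch for minimization problems, with illustrative function names:

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization): a is no worse
    in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    """Filter a list of objective vectors down to the non-dominated ones."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]
```

For example, with bi-objective vectors (1, 3), (3, 1), (2, 2) and (4, 4), only (4, 4) is dominated (by (2, 2)); the other three form the non-dominated set.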
The aim of solving multi-objective problems is to find non-dominated solutions, and this has been a prominent approach in the literature in recent years. Jaszkiewicz (2002) presented a genetic local search for multi-objective optimization problems to create non-dominated solutions; in each iteration, a local search is applied to the generated offspring in order to increase the quality of solutions, and the efficiency of the proposed approach was examined on TSP instances. Yang et al. (2008) considered a dynamic multi-objective TSP (DMO-TSP) of a mobile communication network whose attributes change dynamically; the attributes are the number of cities and the degree of conflict between objectives. They proposed a parallel form of a multi-algorithm co-evolution strategy (MACS) for solving this complicated model.
It is evident from the literature that the multi-objective particle swarm optimization (MOPSO) approach has not been used for solving multi-objective TSPs, and that the useful data mining (DM) approach has not been applied effectively to TSPs. It is worth noting that DM is a collection of computational techniques that can be used for finding knowledge, hidden patterns and rules in data in different sciences. Ince and Aktan (2009) introduced and applied some data mining techniques in their research. In recent years, the data mining approach has also been used for optimization purposes. In this paper, one of the data mining techniques is used for extracting rules from the non-dominated solutions of a multi-objective TSP (MOTSP). Indeed, this paper presents a hybrid approach consisting of the MOPSO procedure and the data mining process for solving the MOTSP. Since DM is an intelligent approach for solving problems, the proposed approach is called intelligent MOPSO (IMOPSO). The three major steps of the proposed IMOPSO for solving MOTSPs are as follows:
Step 1: Solving some MOTSPs with the MOPSO approach.
Step 2: Extracting rules from the non-dominated solutions obtained in Step 1 in order to establish a rule set for each problem.
Step 3: Applying the extracted rules to improve the solutions of the MOPSO approach for larger-sized problems.
Particle swarm optimization

This method simulates a moving group of fish or birds; an individual and the group are called a particle and a swarm in PSO, respectively. In comparison with genetic algorithms (GAs), a particle and a swarm are similar to a chromosome and a population, respectively. In PSO, the particles created in the first iteration are not excluded and remain until the end. In each iteration, every particle has a position and a velocity, and the positions of particles are updated in order to obtain better solutions. The best position found by each particle is stored as its personal best (pbest) position, and the best position found by all particles is stored as the global best (gbest) position. The best position is the one with the minimum/maximum objective function value. The symbols of the PSO procedure are as follows:
x_{i,1}, x_{i,2}, …, x_{i,n}: n continuous decision variables (position of particle i);
v_i: velocity vector in the i-th iteration;
pbest_i: vector that stores the best position of the particle during the iterations;
gbest_i: best position of all particles during the iterations;
c_1, c_2: predefined coefficients;
r_1, r_2: random numbers between 0 and 1, generated for each particle in each iteration;
w: inertia factor, which can be equal to one.
The basic PSO approach for solving single-objective problems is stated as follows:
1) Initial particles are generated randomly.
2) Initial velocities of particles are zero.
3) In each iteration, the velocity of each particle is computed by:

v_i = w·v_i + c_1·r_1·(pbest_i − x_i) + c_2·r_2·(gbest_i − x_i). (1)

4) The position of each particle is updated by using the following equation:

x_i = x_i + v_i. (2)
5) The above process is repeated until a termination condition occurs. This condition is usually a maximum number of iterations.
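Steps 1 to 5 can be sketched as follows for a single-objective continuous minimization problem. This is a minimal illustration, not the paper's implementation; the function name and default parameter values are assumptions:

```python
import random

def pso(f, dim, n_particles=20, n_iters=80, c1=2.0, c2=2.0, w=1.0):
    """Minimal single-objective PSO following steps 1-5 above (minimization)."""
    # step 1: random initial particles; step 2: zero initial velocities
    x = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    gbest = min(pbest, key=f)[:]
    for _ in range(n_iters):                      # step 5: fixed iteration budget
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # step 3: velocity update pulls toward pbest and gbest
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]                # step 4: position update
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest
```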
PSO is suitable for continuous variables, whereas a TSP is a problem with integer variables, so it is necessary to modify PSO to make it applicable to TSPs. For this purpose, the rank ordered value (ROV) method is used, as in Liu et al. (2006). For solving an n-city TSP, a string with n numbers, called the original string, is defined. The numbers of this string are in the [0, 1] range, and each number corresponds to one city. For each original string, a tour of n cities is defined by the ROV method, which performs three steps:
1) Sorting the numbers of the original string in ascending order.
2) Specifying the rank of each number in ascending order.
3) Creating a tour with the ranks of the cities in ascending order.
In this paper, a multi-objective particle swarm optimization (MOPSO) algorithm is used for solving a multi-objective traveling salesman problem (MOTSP). For this purpose, a crowding distance (CD) factor is defined on the basis of the concept given in Deb et al. (2002). This factor specifies how crowded a solution is with respect to other solutions; in other words, it is a density estimator used for non-dominated solutions. Consider a collection that includes m non-dominated solutions. The CD factor for each solution is calculated by the following steps:
1) For each objective, sort the solutions in ascending order.
2) The CD for the first and last solutions in this order is equal to ∞ (in a practical implementation, it can be set to a big number).
3) For the other solutions, the CD is calculated by the relation shown below:

CD_i = Σ_k [f_k(i+1) − f_k(i−1)] / (f_k^max − f_k^min), (3)

where f_k(i+1) and f_k(i−1) are the values of the k-th objective for the neighbours of solution i in the sorted order, and f_k^max and f_k^min are the maximum and minimum values of the k-th objective.
The steps of MOPSO are explained as follows:
1) Initial particles are generated randomly.
2) Initial velocities of particles are zero.
3) Evaluate all particles and select the non-dominated solutions from the swarm. Non-dominated solutions are stored in a pool called the repository. In each iteration, new non-dominated solutions are added to the repository, and if any current solution of the repository is dominated by a new solution, it is deleted from the repository. The capacity of the repository is limited and is defined by the user. Suppose that the number of non-dominated solutions is larger than the capacity of the repository, so some solutions must be deleted (or excluded). In this situation, the non-dominated solutions are sorted in ascending order of their CD factor, and the solutions with the smaller CD factors are excluded. This means that the solutions that are more crowded are deleted and the solutions that are less crowded remain, which results in more diversification of the search process during the algorithm.
4) The pbest of each particle is updated. In the first iteration, pbest is equal to the initial position of the particle. In the next iterations, pbest for each particle is updated by three simple rules: a) if the current position dominates the pbest solution, pbest is set to the current position; b) if the pbest solution dominates the current position, pbest remains unchanged; c) if neither dominates the other, one of them is selected randomly as pbest.
5) In each iteration, the velocity of each particle is calculated by:

v_i = w·v_i + c_1·r_1·(pbest_i − x_i) + c_2·r_2·(rep_H − x_i). (4)

There is one main difference in the velocity equation between single- and multi-objective problems: in multi-objective problems, there is no single global best solution.
Instead, there is a repository of non-dominated solutions. H refers to one of the solutions selected from the repository. There are several ways of selecting a solution from the repository at random; in this paper, similar to Tsou et al. (2007), it is selected from the less crowded solutions. For this purpose, the solutions in the repository are sorted on the basis of their CD factors; then the 10% of solutions with the largest CD factors (i.e., the least crowded ones) are specified, and H is selected from them randomly. So, rep_H is a vector stating the position of the selected solution and is used in Eq. (4).
6) The position of each particle is updated by using the following equation:

x_i = x_i + v_i. (5)
7) The above process is repeated until a termination condition occurs. This condition is usually a maximum number of iterations.
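The crowding-distance steps and the repository truncation of step 3 can be sketched as follows. This follows the Deb et al. (2002) density estimator; the function names are illustrative:

```python
def crowding_distance(front):
    """Crowding distance of each objective vector in a non-dominated front:
    boundary points get infinity; interior points sum, over the objectives,
    the normalized gap between their two neighbours in sorted order."""
    m = len(front)
    cd = [0.0] * m
    for k in range(len(front[0])):
        order = sorted(range(m), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        cd[order[0]] = cd[order[-1]] = float("inf")   # step 2: boundary solutions
        if hi == lo:
            continue
        for pos in range(1, m - 1):                   # step 3: interior solutions
            i = order[pos]
            cd[i] += (front[order[pos + 1]][k] - front[order[pos - 1]][k]) / (hi - lo)
    return cd

def truncate_repository(front, capacity):
    """Drop the most crowded solutions (smallest CD) when the repository
    exceeds its user-defined capacity."""
    if len(front) <= capacity:
        return front
    cd = crowding_distance(front)
    keep = sorted(range(len(front)), key=lambda i: cd[i], reverse=True)[:capacity]
    return [front[i] for i in sorted(keep)]
```

Keeping the high-CD (less crowded) solutions is what preserves diversity in the repository, as described in step 3.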
It is necessary to tune some parameters before running the algorithm. It is recommended that the number of particles be set between 20 and 80 and the number of iterations (swarms) between 80 and 120 (Coello, Lechuga 2002). In this paper, we set the swarm size to 20 and the number of iterations to 80. The c_1 and c_2 coefficients are equal to 2. The capacity of the repository should be defined by the user; in this paper, it is set to 20.

Data mining process
The data mining process is carried out on the basis of a standard procedure called the CRISP-DM algorithm, which is explained in previous studies, such as Olson and Delen (2008), Nisbet et al. (2009), Mladenić (2003), Han and Kamber (2006), Gupta (2006), Lin et al. (2008), Maimon and Rokach (2005), Riccia (2000) and Larose (2006). The six steps of this algorithm are as follows:

Business understanding
In this phase, the objective of the data mining process is defined. Usually, a business objective is considered in data mining studies, so this phase is named business understanding. In this study, however, the objective is to find rules in the non-dominated solutions of some TSP examples. In other words, the goal of the data mining study in this paper is finding suitable rules and patterns in the non-dominated solutions of a TSP.

Data understanding
In this phase, an understanding of the data set is obtained. There are two cost matrices between cities in each TSP instance; usually, the cost matrix is the distance matrix between cities. So there are two distance matrices, denoted D_{n,1} and D_{n,2}, that give the distances between cities; in an n-city TSP, they are n×n matrices. In a bi-objective TSP, each edge that connects two cities has two weights, one from each distance matrix. For example, consider the p-q edge that connects cities p and q. S_p is the set of edges that connect p to the other cities. Sum_{p,1} is the sum of the weights of the edges in S_p on the basis of the D_{n,1} matrix; similarly, Sum_{p,2} is the sum of the weights of the edges in S_p on the basis of the D_{n,2} matrix.
The considered data set is a table consisting of rows and columns, called records and fields, respectively. Each record represents one edge between two cities, and each field represents one edge attribute. In this paper, nine attributes are considered for edges. The A field is a binary (0 or 1) field that indicates whether an edge exists in a non-dominated solution: if an edge exists in a non-dominated solution, the value of this field is 1; otherwise, it is 0. As said before, the goal of the data mining process is finding rules in non-dominated solutions, so it is necessary to focus on the edges that exist in non-dominated solutions. Therefore, the DM analysis is performed only on edges with A = 1, i.e., the edges that exist in a non-dominated solution. In this paper, the max-min method is used for normalization. For example, consider a set of n values x_1, x_2, …, x_n, whose normalized values are denoted nx_1, nx_2, …, nx_n; min-x and max-x are the minimum and maximum values of the set, respectively. Each value is normalized by using Eq. (6) stated as follows:

nx_i = (x_i − min-x) / (max-x − min-x). (6)
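The max-min normalization can be sketched in code; the function name is illustrative:

```python
def max_min_normalize(values):
    """Max-min normalization: map each value x_i into [0, 1] using
    nx_i = (x_i - min_x) / (max_x - min_x)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)   # degenerate case: all values identical
    return [(x - lo) / (hi - lo) for x in values]
```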

Data preparation
Usually, a raw data set is not suitable for performing data mining algorithms. Data preparation provides a standard framework for decision making and comparison. Since data in different problems have different scales, it is necessary to convert values to standard, normalized values; thus, in this phase, the preparation of the data set is performed. The value of field A is 0 or 1, and the values of the F1, F2, G1 and G2 fields are already in the [0, 1] range, so there is no need to change these values further.

Modeling
In this phase, a suitable data mining algorithm is performed on the normalized data set and the results are obtained. Association rule mining (ARM) algorithms are applied to a data set of records and fields to efficiently extract suitable rules that explain the relationships between fields. In general, for applying ARM algorithms, fields are specified as input or output fields (factors), and the ARM algorithms present if-then rules to explain the relations between them. For example, consider the rule: if "B < x" then "A = 1". The antecedent of the rule is "B < x", where B is an input field; "A = 1" is the consequent of the rule, where A is an output field. Rules have two major indices, called support and confidence. For the above simple rule, support is the percentage of records in which the "B < x" condition occurs; this value is denoted Y. Let X denote the percentage of records in which both the "B < x" and "A = 1" conditions occur. Confidence is then equal to X divided by Y. Indeed, confidence is the accuracy of the rule and is a good measure of how reliable a rule is. Support and confidence are two important criteria for selecting suitable and efficient rules: rules with high support are frequent, and rules with high confidence have high accuracy.
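Support and confidence as defined above can be computed with a short sketch; the representation of records as dictionaries and of the rule as a pair of predicates is an illustrative assumption:

```python
def support_confidence(records, antecedent, consequent):
    """Support and confidence of the rule 'if antecedent then consequent'.
    Support is the fraction of records matching the antecedent (Y in the
    text); confidence is X / Y, where X is the fraction of records matching
    both antecedent and consequent."""
    n = len(records)
    y = sum(1 for r in records if antecedent(r))
    x = sum(1 for r in records if antecedent(r) and consequent(r))
    support = y / n
    confidence = x / y if y else 0.0
    return support, confidence
```

For instance, for the rule "if B < 0.5 then A = 1" over four records of which three satisfy the antecedent and two also satisfy the consequent, support is 0.75 and confidence is 2/3.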
The goal of this study is to find suitable rules about the edges in non-dominated solutions.
In other words, the purpose is to specify which edges have a greater chance of being in a non-dominated solution. So the output (goal) field is the existence of an edge in a non-dominated solution: field A is the output field, and the other fields that are attributes of edges (i.e., F1, F2, G1 and G2) are input fields. There are two major ARM algorithms, namely Apriori and GRI (Generalized Rule Induction). It should be mentioned that the Apriori algorithm does not accept continuous fields; since the input fields include continuous values, this algorithm is not applicable to the considered data set, and the GRI algorithm is used for extracting rules. The GRI method is introduced and applied in previous studies, such as Larose (2005), Abbas et al. (2002) and Bramer (2007, 1999).
To perform the GRI algorithm, the SPSS Clementine 11.1 software is used.

Evaluation
In this phase, the results of the previous phase are evaluated and analyzed. For each non-dominated solution, the rules that include the term "A = 1" as a consequent are considered. Since the number of rules is large, it is necessary to select some of them for further analysis. In this paper, since confidence is a good criterion for rule selection, a threshold on rule confidence is defined: rules whose confidence is lower than the threshold are excluded, and rules whose confidence is higher than the threshold are stored for further analysis in the next step. The threshold on rule confidence is user-defined and depends on the nature of the considered problem and the required accuracy. In this paper, 70% satisfies the required accuracy and is considered as the minimum threshold for the confidence of rules.
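The confidence-threshold filtering described above can be sketched as follows, assuming an illustrative dictionary representation for extracted rules:

```python
def select_rules(rules, threshold=0.70):
    """Keep the rules whose confidence meets the user-defined threshold
    (70% in this paper) and sort the survivors by confidence, best first.
    Each rule is a dict with 'antecedent', 'consequent' and 'confidence'."""
    kept = [r for r in rules if r["confidence"] >= threshold]
    return sorted(kept, key=lambda r: r["confidence"], reverse=True)
```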

Deployment
In this phase, the results of the previous steps (i.e., the extracted rules) are used for solving MOTSP problems. Indeed, the set of rules extracted from the efficient solutions of an MOTSP problem is considered. For example, consider RS_n, the rule set obtained from the efficient solutions of an n-city bi-objective problem; this rule set contains m rules, as shown in Table 1.
After establishing a rule set, it is used to solve another k-city (k > n) problem. The following steps explain the IMOPSO approach:
1) First, a k-city bi-objective TSP problem is solved with the MOPSO method. The obtained efficient solutions constitute a set called ES_k; suppose that ES_k contains p efficient solutions.
2) For the i-th (1 ≤ i ≤ p) solution of the ES_k set, consider the j-th (1 ≤ j ≤ m) rule of RS_n. If the consequent of the j-th rule is "False", the next rule is selected. If the consequent of the j-th rule is "True", the rule states the conditions under which two cities (e.g., x and y) can be adjacent with probability CR_j.
3) Therefore, it is probable that cities x and y are adjacent in the efficient solutions of the k-city bi-objective TSP problem. If cities x and y are adjacent in the i-th solution, it remains unchanged.
4) If cities x and y are not adjacent in the i-th solution, a random number RI is generated and compared with the confidence of the j-th rule. If RI ≤ CR_j, then the i-th solution is changed so that cities x and y become adjacent: in the tour, one of the cities adjacent to x (e.g., z) is selected, and the positions of y and z are exchanged to reach a new tour in which x and y are adjacent.
5) The previous steps (i.e., Steps 2, 3 and 4) are performed several times to obtain a diverse set of solutions.
6) At the end, the new set of solutions is explored to select the efficient solutions from it.
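Step 4 above, which modifies a tour so that two cities named in a rule become adjacent, can be sketched as follows. The function name and the tour representation (a list of city indices) are illustrative assumptions:

```python
import random

def apply_rule(tour, x, y, rule_conf, rng=random.random):
    """Step 4 above: if cities x and y are not adjacent in the tour, then with
    probability rule_conf (RI <= CR_j) swap y with a neighbour z of x so that
    x and y become adjacent. Returns the (possibly new) tour."""
    n = len(tour)
    ix, iy = tour.index(x), tour.index(y)
    if tour[(ix + 1) % n] == y or tour[(ix - 1) % n] == y:
        return tour                      # step 3: already adjacent, keep unchanged
    if rng() > rule_conf:
        return tour                      # RI > CR_j: rule not applied this time
    new_tour = tour[:]
    iz = (ix + 1) % n                    # pick a neighbour z of x
    new_tour[iz], new_tour[iy] = new_tour[iy], new_tour[iz]
    return new_tour
```

Passing a deterministic `rng` makes the probabilistic branch testable; in normal use the default `random.random` is kept.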
Afterwards, the new efficient solutions obtained from the IMOPSO approach are compared with the previous efficient solutions obtained by the MOPSO approach in order to find the final set of efficient solutions. This comparison specifies how the proposed hybrid approach improves the ability to reach efficient solutions.

Conclusions
This paper has proposed an integrated intelligent approach for solving a multi-objective traveling salesman problem (MOTSP). The approach combines data mining and multi-objective particle swarm optimization (MOPSO). First, five problems were solved by the MOPSO approach. Then, data mining (DM) was used to find knowledge in the efficient solutions of the MOTSPs; this DM based on MOPSO was called intelligent MOPSO (IMOPSO), a novel hybrid approach. The GRI algorithm, an association rule mining algorithm, was performed, and the extracted knowledge was expressed as if-then rules, which were then used for solving new problems. The process of extracting rules and applying them to improve the solutions of the MOPSO approach was stated within a standard data mining framework, called the CRISP-DM algorithm. The proposed approach was compared with the MOPSO approach, showing that the IMOPSO approach has two major benefits. First, it produces new efficient solutions and therefore increases the number of non-dominated (efficient) solutions. Second, most solutions of the MOPSO approach are dominated by the efficient solutions of the IMOPSO approach, so the IMOPSO approach presents better solutions. In addition, only a few solutions of the IMOPSO approach were dominated by solutions of the MOPSO approach. In other words, the IMOPSO approach produced solutions that were better than those of the MOPSO approach in terms of both solution quality and quantity. Table 7 shows that 91, 90, 64.5 and 43% of the efficient solutions of the MOPSO approach are dominated by the efficient solutions of the IMOPSO approach when applying the EX29, EX42, EX48 and EX76 rule sets, respectively. In addition, only 0, 0, 11.5 and 15% of the efficient solutions of the IMOPSO approach are dominated by the efficient solutions of the MOPSO approach when applying the EX29, EX42, EX48 and EX76 rule sets, respectively.
Furthermore, in multi-objective problems, finding many efficient solutions is a major benefit, and the IMOPSO approach provides more efficient solutions than the MOPSO approach. Applying the proposed hybrid approach to other optimization problems can be suggested for future research. Furthermore, it is suggested to develop a rule-based optimization approach that uses other rule-extracting techniques during the optimization process.

Abdorrahman HAERI is a PhD student in the Department of Industrial Engineering at the University of Tehran. His research interests are data mining and soft computing methods, such as genetic algorithms and particle swarm optimization. His previous papers concerned applying data mining methods, such as association rule mining, to optimization and decision-making problems.