BUS ROUTE DESIGN IN SMALL DEMAND AREAS

The paper deals with the situation in which a sparsely populated area needs public transport service. It is necessary to design a bus route that passes through the area and meets accessibility and efficiency requirements. The article presents a mathematical formulation of the problem in terms of network theory, together with two exact and several heuristic methods for finding a solution. The paper shows that the problem is NP-hard, and therefore computing experience is outlined.


Introduction
Network design problems can be considered one of the main streams of recent network research. A similar situation is faced when discussing transport networks, both freight (Chen, Zeng 2010; Janáček, Gábrišová 2009) and passenger ones (e.g. Matuška 2010). Some authors focus on design techniques, e.g. Fan and Machemehl (2006), while others deal with the evaluation of already existing networks, e.g. Hu et al. (2010).
Among the problems introduced above, a notable position belongs to the optimal choice of some elements of the given network. Janáček and Gábrišová choose a compact node subset as a location of facilities. Other authors look for routes in networks: Jakimavičius and Burinskienė (2010, 2009a, 2009b) find several types of the shortest route (in km or hours) for a traveller information system, Matis (2010) resolves the street routing problem of freight transport, and Szűcs (2009) applies it to cooperative transport systems.
Routing problems are especially important for public transport, mainly for the urban one. Erlander and Scheele (1974) supposed a set of routes constructed manually by a transport engineer, whereas Cipriani et al. (2005) let the computer create a set of routes, each one as the shortest path connecting a pair of important (= large demand) nodes. Both dealt with areas of sufficient demand. Yang et al. (2007) use a similar approach. They start with the O-D matrix and aim for a set of routes maximizing the number of direct travellers per unit of length. The paper by Fan and Machemehl (2006) deals with network design minimizing a complex objective function encompassing user costs, operator costs and unsatisfied demand costs. Borndörfer et al. (2007) describe the use of the column-generation approach for a similar problem. Agrawal and Mathew (2004) present a parallel genetic algorithm.
Our approach is slightly different. We do not assume demand sufficient for designing a set of (quasi straightforward) routes. Therefore, our goal is to construct a single (curvilinear) route satisfying the passengers distributed in the given area. Our model differs from those suggested by Peško (2003, 2004). In the first one, Peško seeks a circular route passing through all demand points, while we admit reducing this set. In the second one, Peško allows refusing a part of demand, whereas we do not.
Suppose that there is a sparsely populated area in need of a new bus route. The area might be a new residential district consisting of family houses and other sources of passenger demand, e.g. shops, sports and cultural centres, offices etc. The points of passenger demand can be represented by a set of vertices $V$ of network (graph) $G = (V, E, q, d)$, where edge set $E$ represents walkways between neighbouring vertices, $q(v)$, $v \in V$, is the size of passenger demand (outgoing and ingoing together) during a time unit and $d(e)$ is the length of edge $e$. Some vertices, forming set $W \subseteq V$, are suitable for bus stopping. The road segments connecting neighbouring vertices from $W$ are denoted by $F$ and the length of $e \in F$ by $\delta(e)$. Graph $G_W = (W, F, \delta)$ represents the network suitable for bus service in the area.
However, the desired bus route cannot pass through all possible bus stops $w \in W$ since service would become too expensive. Therefore, public administration and the management of a transport company are looking for a shorter route $r = (s_1, \dots, s_n)$, $s_i \in W$, covering demand 'satisfactorily' both for the company and passengers. In other words, the new route has to meet two contradictory claims: economy for the provider and accessibility for the clients, i.e. passengers.
Our problem is related to the one introduced by Schöbel (2005), which starts from demand points as we do. The difference is that Schöbel begins with a given network and looks for the location of stops, whereas we start with given possible stops and look for the best 'simple' network represented by a route. A generalization of Schöbel's approach can be found in the paper by Groß et al. (2009).

Economical Aspects of Bus Route Design
The economy claim can be oriented in two directions:
1.1.1: a minimal length of the route with a minimum number of stops;
1.1.2: minimal frequency, i.e. the number of services (courses) during one time unit.
This paper deals with 1.1.1 since 1.1.2 can be solved independently of route design.

Accessibility
Accessibility for passengers is usually postulated in one of the following ways:
1.2.1: for a given limit $\lambda$, each passenger source or destination ought to have its closest bus stop at a distance of not more than $\lambda$;
1.2.2: for a given distance limit $\lambda$ and percentage limit $p$, the percentage of passengers whose source or destination lies farther than $\lambda$ from the closest bus stop ought to be less than $p$;
1.2.3: for a given distance $\lambda$, the mean distance of passenger sources and destinations from the closest bus stop ought not to be more than $\lambda$.
In the sequel, we shall deal with form 1.2.3 only. Nevertheless, our methodology is easily adjusted to 1.2.1 or 1.2.2. As concerns 1.2.1, however, we have to emphasize the results of Černá (2003a, 2003b) and Černá et al. (2007) showing that rigorous insistence on 1.2.1 often leads to solutions that are inefficient for both passengers and providers.

Optimization Problem
Following 1.1 and 1.2, we can formulate a mathematical version of the problem: find a bus route $r = (s_1, \dots, s_n)$ such that the mean distance of passenger sources and destinations from the closest bus stop does not exceed $\lambda$ and the length of $r$ is minimal.

Basic Problem
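Let $G = (V, E, q, d)$ be a (non-oriented) graph with demand function $q: V \to [0, \infty)$ and length $d: E \to (0, \infty)$. Let $d(u, v)$ be the distance of $u, v \in V$ obtained by extending the edge lengths to shortest paths between vertices. Let $W \subseteq V$ and let $G_W = (W, F, \delta)$ be a graph with edge length $\delta$ (not necessarily equal to $d$, even on $E \cap F$). For each $S \subseteq W$, let $\delta(S)$ be the length of the shortest path connecting the vertices of $S$ on $G_W$. Let $\lambda \in (0, \infty)$ and $q = \sum_{v \in V} q(v)$. The problem is to find $S \subseteq W$ such that:
2.1.1: $\frac{1}{q} \sum_{v \in V} q(v) \min_{s \in S} d(v, s) \le \lambda$;
2.1.2: $\delta(S) \to \min$.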
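To make the accessibility requirement concrete, the following Python sketch computes the demand-weighted mean distance from the vertices to their closest chosen stop, i.e. the quantity bounded in form 1.2.3. It is only an illustration: the function names, the use of the Floyd-Warshall routine to obtain $d(u, v)$ and the four-vertex network are assumptions, not part of the paper, and the total demand is assumed to be positive.

```python
from itertools import product

INF = float("inf")


def all_pairs_distances(vertices, edges):
    """Floyd-Warshall shortest-path distances d(u, v) on an undirected graph.

    `edges` maps a pair (u, v) to the edge length d(e).
    """
    dist = {(u, v): (0.0 if u == v else INF)
            for u, v in product(vertices, repeat=2)}
    for (u, v), length in edges.items():
        dist[u, v] = min(dist[u, v], length)
        dist[v, u] = min(dist[v, u], length)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i, k] + dist[k, j] < dist[i, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def mean_walking_distance(stops, demand, dist):
    """Demand-weighted mean distance from each vertex to its closest stop."""
    total = sum(demand.values())  # assumed to be positive
    walked = sum(q * min(dist[v, s] for s in stops)
                 for v, q in demand.items())
    return walked / total


# Hypothetical network: a path a - b - c - d with unit edge lengths.
vertices = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
demand = {"a": 10, "b": 20, "c": 30, "d": 40}

dist = all_pairs_distances(vertices, edges)
# Stops at b and c: (10*1 + 20*0 + 30*0 + 40*1) / 100 = 0.5
print(mean_walking_distance({"b", "c"}, demand, dist))
```

A set of stops is then acceptable for a given distance limit whenever the returned value does not exceed that limit.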

Appended Problem
If problem 2.1 has several solutions, then the task is to select among them a set $S$ with the minimum number $|S|$ of elements.

Methods of a Solution to Problems 2.1 and 2.2
Since problem 2.1 contains the open travelling salesman problem (OTSP) as a sub-problem, it is NP-hard in the sense of Garey and Johnson (1979). Therefore, in addition to the integer linear programming (LP) model and the combinatorial exact method we shall start with, we have to propose some heuristics for finding a solution.

Exact Method (EM)
The basic principle of EM is a pass of the 'Depth-First-Search' type through the possible subsets of vertices, solving OTSP for the subsets fulfilling 2.1.1.
Initial step: We find the first $S$ fulfilling 2.1.1. Then, solving OTSP, we obtain the first record $\delta(S)$.
Recursive step: Once a set $S$ fulfilling 2.1.1 is found, each of its extensions is omitted, since an extension cannot shorten length $\delta(S)$. Then, we look for the next $S$ fulfilling 2.1.1 in the adjacent branch of the solution structure.
The computational complexity of this method is very high and depends on the number of vertices and distance limit $\lambda$. The resulting computational time was acceptable for our small test networks but increased very rapidly for more difficult problems. Therefore, we had to make some optimizations in order to speed up the algorithm; we outline them briefly here. Further details can be found in Přibyl (2009).
The first optimization is omitting the extended sets mentioned in the recursive step.
The second optimization is based on the recursive construction of the path solving OTSP. Following each step, the length achieved so far is compared with the record. Once the record is reached, construction is interrupted.
Optimal solutions achieved using the method introduced above are very important for testing the heuristic methods described in 3.2 to 3.4.
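The two optimizations described above can be sketched in Python as follows. This is a simplified illustration rather than the implementation of Přibyl (2009): it enumerates subsets by increasing size instead of depth-first, the names `shortest_open_path` and `exact_method` are hypothetical, and `walk` and `ride` are assumed to be precomputed shortest-path distance tables on the walking and riding networks.

```python
from itertools import combinations


def shortest_open_path(stops, ride, record=float("inf")):
    """Length of the shortest open path visiting every stop once (OTSP).

    A partial path is abandoned as soon as its length reaches the record.
    Returns `record` unchanged if no path shorter than `record` exists.
    """
    stops = list(stops)
    best = record

    def extend(last, remaining, length):
        nonlocal best
        if length >= best:      # record reached: interrupt construction
            return
        if not remaining:
            best = length
            return
        for nxt in remaining:
            extend(nxt, remaining - {nxt}, length + ride[last, nxt])

    for start in stops:
        extend(start, frozenset(stops) - {start}, 0.0)
    return best


def exact_method(candidates, demand, walk, ride, limit):
    """Search all subsets of candidate stops for the shortest feasible route."""
    total = sum(demand.values())
    feasible_sets = []
    best_len, best_set = float("inf"), None
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            s = frozenset(subset)
            if any(f <= s for f in feasible_sets):
                continue        # extension of a feasible set: cannot be shorter
            mean = sum(q * min(walk[v, w] for w in s)
                       for v, q in demand.items()) / total
            if mean > limit:
                continue        # accessibility requirement violated
            feasible_sets.append(s)
            length = shortest_open_path(s, ride, best_len)
            if length < best_len:
                best_len, best_set = length, s
    return best_set, best_len


# Hypothetical example: four candidate stops on a line, with G_W equal to G.
pos = {"a": 0, "b": 1, "c": 2, "d": 3}
dist = {(u, v): float(abs(pos[u] - pos[v])) for u in pos for v in pos}
demand = {"a": 10, "b": 20, "c": 30, "d": 40}

best_set, best_len = exact_method(list(pos), demand, dist, dist, limit=0.5)
print(sorted(best_set), best_len)  # ['b', 'c'] 1.0
```

Because subsets are visited in order of increasing size and the record is replaced only by a strictly shorter route, ties in length are resolved in favour of fewer stops, in the spirit of the appended problem.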

General Greedy Heuristics (GGH)
Initial step: We put $S = \{m\}$ where $m = m(G_W)$ is the median of $G_W$.
Recursive step: If $S$ fulfils 2.1.1, then we consider $S$ a solution.
If $S$ does not fulfil 2.1.1, we find $w \in W \setminus S$ such that $\delta(S \cup \{w\})$ is minimal, put $S \cup \{w\} \to S$ and turn to the recursive step again. If $w$ does not exist, i.e. if $S = W$, then the problem is unsolvable.
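A minimal Python sketch of the greedy loop follows. It rests on two assumptions that the text does not confirm: the median is taken to be the candidate stop minimizing the demand-weighted total walking distance, and the quantity minimized when adding a vertex is taken to be the length of the shortest open route through the enlarged set. All function names are hypothetical, and the route length is computed by brute force, which is only practical for very small sets.

```python
from itertools import permutations


def route_length(stops, ride):
    """Shortest open path through all stops, by brute force (small sets only)."""
    if len(stops) < 2:
        return 0.0
    return min(sum(ride[a, b] for a, b in zip(p, p[1:]))
               for p in permutations(stops))


def mean_walk(stops, demand, walk):
    """Demand-weighted mean distance to the closest chosen stop."""
    total = sum(demand.values())
    return sum(q * min(walk[v, s] for s in stops)
               for v, q in demand.items()) / total


def median(candidates, demand, walk):
    """Assumed definition: minimizer of the demand-weighted total distance."""
    return min(candidates,
               key=lambda w: sum(q * walk[v, w] for v, q in demand.items()))


def general_greedy(candidates, demand, walk, ride, limit, start=None):
    """Grow the stop set from the median (or `start`) until it is feasible."""
    chosen = {median(candidates, demand, walk) if start is None else start}
    while mean_walk(chosen, demand, walk) > limit:
        rest = [w for w in candidates if w not in chosen]
        if not rest:
            return None     # S = W and still infeasible: unsolvable
        # Assumed criterion: the enlarged set with the shortest route.
        chosen.add(min(rest, key=lambda w: route_length(chosen | {w}, ride)))
    return chosen


# Hypothetical example: four candidate stops on a line, with G_W equal to G.
pos = {"a": 0, "b": 1, "c": 2, "d": 3}
dist = {(u, v): float(abs(pos[u] - pos[v])) for u in pos for v in pos}
demand = {"a": 10, "b": 20, "c": 30, "d": 40}

result = general_greedy(list(pos), demand, dist, dist, limit=0.5)
print(sorted(result))  # ['b', 'c']
```

The optional `start` argument allows the same heuristic to be rerun from a vertex other than the median.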

Neighbour Greedy Heuristics (NGH)
We shall use the notation $N(S) = \{w \in W : w \notin S, \exists s \in S, (w, s) \in F\}$ for the neighbourhood of $S$ in $G_W$.
Initial step: put S = S = {m} where $m = m(G_W)$.
Recursive step: if S fulfils 2.1.1, then we consider S a solution. In case it does not, we look for: If such a vertex does not exist, the heuristic is not able to solve the problem and we stop. In case it does, we put

Integer LP Model
Problem 2.1 can be formulated by means of integer linear programming.
We suppose that $m \le n$ are positive integers, $V = \{1, 2, \dots, n\}$ is the set of all vertices and $W = \{1, 2, \dots, m\} \subseteq V$. For each pair $i \in V$, $j \in V$, the given number $d_{ij}$ represents the walking distance between nodes $i$ and $j$. Similarly, $\delta_{ij}$ represents the riding distance on set $W$. Certainly, $d_{ii} = 0$ and $\delta_{jj} = 0$ for $i \in V$ and $j \in W$. Moreover, $\lambda > 0$ represents the upper bound of the mean walking distance. Finally, $F' \subseteq W \times W$ represents the arcs of the digraph $D_G = (W, F', \delta)$ derived from $G_W = (W, F, \delta)$ so that oriented arcs $(u, v) \in F'$ and $(v, u) \in F'$ if the edge $(u, v) \in F$, whereas the length remains unchanged.

Interpretation of the LP Problem
In (3.1), value $D$ represents a 'big number', i.e. a substitute for infinity. $Q$ represents the total number of passengers.
Value $u_j = 1$ expresses the fact that the j-th vertex is chosen for the route and $u_j = 0$ means the opposite.
Value $v_{ij} = 1$ says that the j-th vertex is the closest vertex of the route to the i-th vertex.
Value $x_{jk} = 1$ says that the k-th vertex is the immediate successor of the j-th vertex on the route.
$y_j$ is the ordinal number of the j-th vertex on the route.
Value $z$ in (3.2) represents the total length of the resulting route.
(3.3) expresses the constraint of accessibility.
(3.4) ensures that the closest vertex is chosen among the ones of the route.
(3.5) and (3.6) choose the closest vertex of the route to the i-th vertex.
(3.7) ensures that the k-th vertex is the successor of the j-th vertex only if both vertices are chosen.
(3.8) ensures that the route does not continue after the last vertex and (3.9) does the same before the first vertex.
(3.10) and (3.11) ensure that, on the route, the immediate successor of the vertex with ordinal number $y_j$ is assigned ordinal number $y_j + 1$.
(3.12) ensures that the vertices not belonging to the route have no ordinal numbers.
It follows that each feasible solution of problem 2.1 can be expressed by values $u_j$, $v_{ij}$, $x_{jk}$ and $y_j$ fulfilling constraints (3.3) to (3.11) and vice versa. Therefore, the problem is solvable by integer linear programming, which implies that problem 2.1 is NP-easy. Since we have shown that it is also NP-hard, we can deduce that the decision version of the problem is NP-complete.
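The interpretation of the variables can be illustrated by encoding a given ordered route into values of $u$, $v$, $x$, $y$ and $z$. The Python sketch below does only that; it does not reproduce constraints (3.3) to (3.12). The function name `encode_route` is hypothetical, and two conventions are assumptions: ordinal numbers start at 1, and a vertex outside the route is given the ordinal number 0.

```python
def encode_route(route, vertices, candidates, walk, ride):
    """Express an ordered route by the variables u, v, x, y and the length z."""
    on_route = set(route)

    # u[j] = 1 if the j-th candidate vertex is chosen for the route.
    u = {j: int(j in on_route) for j in candidates}

    # x[j, k] = 1 if k is the immediate successor of j on the route.
    x = {(j, k): 0 for j in candidates for k in candidates if j != k}
    for j, k in zip(route, route[1:]):
        x[j, k] = 1

    # y[j] = ordinal number of j on the route (assumed 0 when j is not on it).
    y = {j: 0 for j in candidates}
    for position, j in enumerate(route, start=1):
        y[j] = position

    # v[i, j] = 1 if j is the closest vertex of the route to vertex i.
    v = {(i, j): 0 for i in vertices for j in candidates}
    for i in vertices:
        closest = min(route, key=lambda j: walk[i, j])
        v[i, closest] = 1

    # z = total riding length of the route.
    z = sum(ride[j, k] * flag for (j, k), flag in x.items())
    return u, v, x, y, z


# Hypothetical example: four vertices on a line, all of them candidate stops.
pos = {"a": 0, "b": 1, "c": 2, "d": 3}
dist = {(i, j): float(abs(pos[i] - pos[j])) for i in pos for j in pos}

u, v, x, y, z = encode_route(["b", "c"], list(pos), list(pos), dist, dist)
print(u)  # {'a': 0, 'b': 1, 'c': 1, 'd': 0}
print(y)  # {'a': 0, 'b': 1, 'c': 2, 'd': 0}
print(z)  # 1.0
```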

Experimental Results
The Table summarizes the results of the heuristic and exact methods described above on 9 tested networks. The diameter of these networks is about 30 km and the number of vertices is 20. Distance limit $\lambda$ was set to 4.
With these parameters given, computational time depends on the network layout. For the exact method, it varies between 50 and 120 minutes. For all heuristic methods, it is less than 1 second. Optimal solutions $S_{opt}$ obtained by the exact method (EM) are highlighted in bold and the best solutions obtained by the heuristic methods are highlighted in bold italics in the Table. In addition, we assume that $G_W$ equals $G$ in all cases. The last column of the Table shows whether the median $m(G_W)$ of the network lies on the optimal route found by the exact method. One can see that in 2 cases out of 9, i.e. about 22%, it does not. In these cases, heuristic methods that start in the median and keep it in the resulting route cannot reach the optimum.
Since their duration is very short for networks of about 20 vertices, the repeated use of the same heuristic, each time starting in a different vertex, can be reasonable. Fig. 1 shows the results for Network 3. The numbers in brackets are the numbers of passengers, randomly generated between 0 and 100.
Fig. 1. Results for Network 3 (legend: the optimal solution obtained by the exact method; the best solution obtained by the combined heuristic method)
The integer LP model was verified on several independent randomly generated networks with $m = n \in \{11, 12, 13, 15\}$ vertices using the freeware solver 'LPSolve'. The obtained results were identical with those obtained by the exact method presented in 3.1 on the same hardware. Fig. 2 shows the computational times of the LP model and EM for the test networks. The LPSolve run on the network with $m = n = 20$ vertices did not finish even after several days.
Fig. 2 shows several interesting features. First, the shape of the graphs corresponding to EM is regular and 'not surprising'. On the contrary, the graphs for LP are quite 'random'.
Further, the EM graphs demonstrate exponential dependence, as can be expected. However, the LP ones generally seem to increase faster than exponentially. These observations are consistent with the rule that a solution method designed directly for a particular type of problem works better than a general one.

Conclusions
The paper shows that the design of the first public transport route covering a new residential district can be formulated as a network problem and solved by means of several exact and heuristic techniques. The authors believe this is good news for public administration and transport companies. Their actual problem, however, may differ from the basic problem 2.1.
For example, the area of weak demand may be connected with a more densely populated district of the town by a road with a given location of the first stop $w_1$. Then, the formulation can be slightly modified by requiring that the designed route start in $w_1$. The necessary modifications of all methods 3.1 to 3.5 are very simple.
The same can be said about the case when the area lies between two more populated districts and two 'obligatory' stops $w_1$ and $w_2$ (the first and last terminals) are given.
On the other hand, if it is necessary to design, for example, two routes, it will not be easy to modify our models and methods for this purpose. Hence, this case could stimulate further research.