PREDICTION OF PASSENGER FLOW ON THE HIGHWAY BASED ON THE LEAST SQUARE SUPPORT VECTOR MACHINE

A support vector machine (SVM) is a machine learning method based on statistical learning theory and structural risk minimization. The support vector machine improves on earlier methods because it can solve practical problems involving small samples, high dimensionality, nonlinearity and local minima. The article applies the theory and method of support vector machine regression and establishes a regression model based on the least squares support vector machine (LS-SVM). By predicting passenger flow on the Hangzhou highway in 2000–2008, the paper shows that the LS-SVM regression model has much higher prediction accuracy and reliability, and therefore can effectively predict passenger flow on the highway.


Introduction
The prediction of passenger flow on the highway is a very important dynamic analysis. It plays a key role in the rational allocation of resources and in managing the investment structure of a corporation. However, the factors influencing passenger flow are rather complex. Traditional prediction methods include straight-line extrapolation, exponential smoothing, regression analysis and the moving average (Chatfield 2003). However, these models have limitations in solving highly nonlinear problems, so they cannot meet objective requirements. In recent years, prediction methods such as the grey system (Lin, Liu 2005; Zhang, Shi 2005) and neural networks (Vlahogianni et al. 2005; Junevičius, Bogdevičius 2009; Cigizoglu 2005; Ma et al. 2004) have been introduced. These methods offer good solutions to nonlinear problems; nevertheless, they have shortcomings. Although artificial neural networks have nonlinear mapping ability and generalizing capacity, algorithm training is very slow, and the errors of the learning process easily converge to a local minimum. Therefore, artificial neural networks cannot ensure the accuracy of the learning process. In addition, an artificial neural network can only guarantee empirical risk minimization under conditions of limited samples. With a large number of samples, artificial neural networks easily run into the curse of dimensionality; they can hardly generalize or explain the obtained results due to overfitting (Peng et al. 2007). The essence of grey prediction is exponential growth prediction (Xie, Liu 2005). Grey prediction requires that the original time series be a non-negative monotonic function that accords with an exponential law; otherwise, the conditions of grey prediction fail to be satisfied. For these reasons, it is necessary to seek new ways to accurately predict passenger flow on the highway.
The Support Vector Machine (SVM), based on statistical learning theory, was originally proposed by Vapnik (1998) and is a new type of classification and regression tool. As the best small-sample learning theory, SVM has been successfully applied in areas such as pattern recognition, function approximation and financial time series (Kim, Sohn 2009; Vapnik 1998). Compared with the neural network, SVM can well solve problems of small sample size, high dimensionality, nonlinearity and local minima (Suykens, Vandewalle 1999). However, when dealing with large-scale sample problems, SVM also faces difficulties. Taking quadratic programming (QP) as an example, QP must perform matrix operations on the kernel function in each iteration, but the memory required for the kernel matrix grows with the square of the number of samples. Owing to the accumulation of iteration error, the accuracy of the algorithm may become unacceptable. The least squares support vector machine (LS-SVM) is an extension of SVM. Compared with standard SVM, LS-SVM substitutes equality constraints for the inequality constraints of the SVM algorithm, so the solution of the QP problem is transformed directly into the solution of a set of linear equations. The article is divided into five sections. Section 2 contains the basic principle of SVM and LS-SVM. Section 3 constructs the prediction model of passenger flow on the highway according to the characteristics of highway passenger transport. Section 4 introduces passenger flow on the Hangzhou highway, makes a prediction and presents experimental results. Finally, Section 5 summarizes the paper.

Introduction to SVM and LS-SVM
SVM (Cao, Tay 2003) is a new machine learning method based on statistical learning theory. The basic idea by which SVM (Vapnik 1999) solves regression problems is to map the input space into a higher-dimensional feature space by a nonlinear mapping. In the higher-dimensional space, SVM utilizes the principle of structural risk minimization to construct a linear decision function and perform linear regression in the new feature space.

SVM
Consider a training dataset {(x_i, y_i), i = 1, 2, …, l}, where: x_i ∈ R^n are the input feature vectors and y_i ∈ R are the target values.
Our goal is to construct a regression function that represents the dependence of the sample output y on the inputs x. Let us define the form of this function as:

f(x) = w · φ(x) + b, (2)

where: w is the weight vector; b is the bias; φ(·) is the nonlinear mapping into feature space. In order to measure deviations between the estimate and the target value, we first define the ε-insensitive loss function in the following form:

|y − f(x)|_ε = max{0, |y − f(x)| − ε}. (3)

Then, we find the optimal w and b that approximate the minimum of the empirical risk function R_emp, i.e. the mean of the loss function:

R_emp = (1/l) · Σ_{i=1..l} |y_i − f(x_i)|_ε. (4)

Supposing that all training data can be fitted by the linear function within a tube of width ε according to Eq. (2), the regression problem is converted into minimizing the decision function expressed by Eq. (5):

min (1/2)·||w||² + C · Σ_{i=1..l} (ξ_i + ξ_i*)
s.t. y_i − w · φ(x_i) − b ≤ ε + ξ_i;
w · φ(x_i) + b − y_i ≤ ε + ξ_i*;
ξ_i, ξ_i* ≥ 0, i = 1, …, l, (5)

where: C > 0; ξ_i and ξ_i* are the upper and lower slack variables; the first term of Eq. (5) is responsible for finding a smooth solution, while the second one minimizes the training errors (C is the trade-off parameter between the terms).
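As a quick illustration of the loss defined above (a minimal sketch of our own, not from the paper; the function name and sample values are hypothetical): deviations smaller than ε cost nothing, while larger deviations are penalized linearly by the excess over ε.

```python
# epsilon-insensitive loss: residuals inside the epsilon tube are free,
# residuals outside it are charged only for the part exceeding epsilon.
def eps_insensitive_loss(y, f, eps):
    return max(0.0, abs(y - f) - eps)

# With eps = 0.5, a residual of 0.25 is ignored, a residual of 1.5 costs 1.0.
print(eps_insensitive_loss(10.0, 10.25, 0.5))  # 0.0
print(eps_insensitive_loss(10.0, 11.5, 0.5))   # 1.0
```

This tolerance band is what makes SVM regression robust to small noise around the target values.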
Consequently, the Lagrangian can be formed as:

L = (1/2)·||w||² + C · Σ_i (ξ_i + ξ_i*) − Σ_i α_i (ε + ξ_i − y_i + w · φ(x_i) + b) − Σ_i α_i* (ε + ξ_i* + y_i − w · φ(x_i) − b) − Σ_i (η_i ξ_i + η_i* ξ_i*), (6)

where α_i, α_i*, η_i, η_i* ≥ 0. According to the duality theorem, Eq. (6) can be converted into the following dual problem with an objective function and constraints:

max −(1/2) · Σ_i Σ_j (α_i − α_i*)(α_j − α_j*) K(x_i, x_j) − ε · Σ_i (α_i + α_i*) + Σ_i y_i (α_i − α_i*)
s.t. Σ_i (α_i − α_i*) = 0; 0 ≤ α_i, α_i* ≤ C, (7)

where: α_i and α_i* are the Lagrange multipliers.
After solving Eq. (7), the regression function of Eq. (2) takes the form:

f(x) = Σ_{i=1..l} (α_i − α_i*) K(x_i, x) + b. (8)

LS-SVM Regression
LS-SVM (Burges 1998) is a variant of SVM proposed by Suykens. The standard SVM regression by Vapnik is modified so as to transform the QP problem into a linear problem (Suykens, Vandewalle 1999, 2000). These modifications are formulated in the definition of LS-SVM as follows:

min J(w, ξ) = (1/2)·||w||² + (γ/2) · Σ_{i=1..l} ξ_i²
s.t. y_i = w · φ(x_i) + b + ξ_i, i = 1, …, l, (9)

where: w is the weight vector; ξ_i is the error variable; b is the bias and γ is an adjustable constant. From Eq. (9), the following Lagrange function can be formed:

L(w, b, ξ, α) = J(w, ξ) − Σ_{i=1..l} α_i (w · φ(x_i) + b + ξ_i − y_i), (10)

where: α_i are the Lagrange multipliers.
The conditions for optimality are the following:

∂L/∂w = 0 → w = Σ_i α_i φ(x_i);
∂L/∂b = 0 → Σ_i α_i = 0;
∂L/∂ξ_i = 0 → α_i = γ ξ_i;
∂L/∂α_i = 0 → w · φ(x_i) + b + ξ_i − y_i = 0. (11)

After eliminating w and ξ, the corresponding linear equation set (a Karush–Kuhn–Tucker system, see Suykens et al. 2002) is:

[ 0   1ᵀ        ] [ b ]   [ 0 ]
[ 1   Ω + γ⁻¹·I ] [ α ] = [ y ], (12)

where: y = (y_1, …, y_l)ᵀ; 1 = (1, …, 1)ᵀ; α = (α_1, …, α_l)ᵀ and Ω_ij = K(x_i, x_j). Finally, the regression model of LS-SVM can be obtained in the form:

y(x) = Σ_{i=1..l} α_i K(x, x_i) + b. (13)
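To make the linear-system view concrete, the following sketch (our own illustration, not the authors' code; plain numpy with an RBF kernel and hypothetical toy data) assembles the KKT linear system above and solves it directly instead of running a QP solver:

```python
import numpy as np

def rbf(X1, X2, sigma):
    # RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    # Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    l = len(y)
    A = np.zeros((l + 1, l + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(l) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(Xnew, X, alpha, b, sigma):
    # y(x) = sum_i alpha_i K(x, x_i) + b
    return rbf(Xnew, X, sigma) @ alpha + b

# Hypothetical toy data: fit a noiseless sine curve.
X = np.linspace(0, 2 * np.pi, 30).reshape(-1, 1)
y = np.sin(X).ravel()
b, alpha = lssvm_fit(X, y, gamma=100.0, sigma=0.5)
pred = lssvm_predict(X, X, alpha, b, sigma=0.5)
print(float(np.max(np.abs(pred - y))))  # small training residual
```

Note that the equality constraints leave only one dense linear solve: no iterative QP loop is needed, which is exactly the computational advantage LS-SVM claims over standard SVM.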

Construction of the Regression Model Based on LS-SVM
According to the basic principle of the LS-SVM regression problem, the regression process of SVM is shown in Fig. 1. The actual steps are as follows: Step 1: Determination of influencing factors and data samples. Based on the goals of prediction, we determine the factors that influence the prediction goals and form the training and testing datasets.
Step 2: Scaling data. In order to increase computing speed and prediction accuracy and to avoid too high or too low characteristic values, we need to scale the sample dataset.
The transformation equation is shown as follows:

x′_ij = (x_ij − x_i^min) / (x_i^max − x_i^min),

where: x′_ij is the scaled value of the index or passenger flow; x_i^min and x_i^max are the minimum and maximum of the index values and passenger flow respectively.
Step 3: Determine the input vectors of LS-SVM and establish the mapping from the input vector x_n = [x_{n−1}, … to the output. Step 4: Selecting the kernel function. The selection of the kernel function and its parameters both directly influence the generalization capacity of LS-SVM. In the LS-SVM model, the commonly used kernel functions include the following:

Linear: K(x, y) = x · y;
Polynomial: K(x, y) = (x · y + 1)^d;
Radial basis function (RBF): K(x, y) = exp(−||x − y||² / (2σ²));
Sigmoid: K(x, y) = tanh(κ x · y + θ).

Step 5: Parameter optimization and model application. We train on the sample dataset by utilizing the LS-SVM model. Through cross validation, we find the best training parameters of LS-SVM, including the penalty coefficient (C), the kernel function parameter (σ), etc., and input the testing dataset to predict passenger flow by using Eq. (13).
Step 6: Performance evaluation. The accuracy of the model is evaluated as follows:

Prediction error: e_i = y_i* − y_i;
Root mean square error: RMSE = sqrt( (1/n) · Σ_{i=1..n} (y_i* − y_i)² );
Prediction accuracy: A_i = 1 − |y_i* − y_i| / y_i,

where: y_i is the actual value of the sample; y_i* is the estimated value according to the LS-SVM model; n is the number of prediction values.
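The evaluation measures of Step 6 can be sketched as follows (our own illustration; the sample figures are hypothetical, not from the paper's dataset):

```python
import math

def rmse(actual, predicted):
    # Root mean square error over n predictions.
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

def accuracy(actual, predicted):
    # Per-point prediction accuracy: 1 minus the absolute relative error.
    return [1.0 - abs(p - a) / a for a, p in zip(actual, predicted)]

# Hypothetical actual vs. predicted passenger-flow values.
actual = [100.0, 110.0, 120.0]
predicted = [98.0, 113.0, 118.0]
print(rmse(actual, predicted))
print(accuracy(actual, predicted))
```

RMSE summarizes the whole test period in one number, while the per-point accuracy shows whether any single year is predicted poorly.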

Architecture of the Index System
The prediction of passenger flow on the highway is based on historical data and influencing factors and, therefore, must accord with the characteristics and development trend of passenger flow on the highway (Junevičius, Bogdevičius 2009).
In general, a highway traffic system is a complex one (Junevičius, Bogdevičius 2007). There are many influencing factors (Fig. 2) in predicting passenger flow on the highway, including economic factors (for example, the urban economic development level) and non-economic factors (urban scale and urban population): 1. Urban economic development level. Demand for highway passenger transport mainly comes from two different fields: production and consumption. Economic development not only increases production activities, but also stimulates consumer travel and drives a stable increase in travelling passengers along with economic development.

2. Urban scale and urban population. The quantity and structural changes of the urban population will cause changes in traffic demand. Normally, when the frequency of travelling remains unchanged, population increase will cause a rise in highway passenger flow. In addition, with an increase in the non-agricultural population, rural surplus labour will transfer to the town and result in an increase in passenger flow on the highway.

The sample data come from the Yearbook (1998–2008) and are provided by the Hangzhou Bureau (http://www.hzstats.gov.cn). We have selected eight prediction indexes that greatly influence passenger flow: population (agricultural population, non-agricultural population), gross domestic product (primary industry, secondary industry and tertiary industry), per capita disposable income, the total retail sales of social consumer goods, and civilian vehicle ownership. Specific descriptions of the prediction indexes are shown in Table 1 and those of the sample data in Fig. 3.

Data Preprocessing
First, we construct a training dataset of eight inputs and one output referring to the data obtained within the period 2000–2005. Then, we build the corresponding testing sample set using data collected for the period 2006–2008 and apply the LS-SVM model for training and testing.
Second, we scale the data of the training and testing datasets for passenger flow on the Hangzhou highway, because scaling prevents attributes in greater numeric ranges from dominating those in smaller ranges and avoids numerical difficulties during calculation. The data displayed in the article are scaled to (0, 1). The normalized results are shown in Table 2.
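The normalization described here is plain min–max scaling, as in Step 2 above; a minimal sketch of our own (the yearly figures are hypothetical):

```python
def min_max_scale(values):
    # Map each value into [0, 1] via (x - min) / (max - min).
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical yearly passenger-flow figures.
flow = [5200.0, 5600.0, 6100.0, 6800.0]
print(min_max_scale(flow))  # first element 0.0, last element 1.0
```

In practice the scaling parameters (min and max) must be taken from the training set only and reused on the testing set, so that the test years are transformed consistently.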

Regression Prediction of LS-SVM
One of the most important factors in building the prediction regression model using LS-SVM is the selection of the kernel function. In general, there are four main types of kernel: linear, polynomial, radial basis function (RBF) and sigmoid. In this article, the RBF kernel function is used as the default kernel, because RBF has a greater advantage than the other kernel functions when prior knowledge is lacking. First, the RBF kernel can map nonlinear samples into a higher-dimensional feature space and deal with samples when the relation between the regression target and an attribute is nonlinear (Suykens et al. 2002). Second, in terms of performance, the linear kernel is a special case of RBF (Keerthi, Lin 2003): the linear kernel with parameter C performs similarly to the RBF kernel with parameters (C, γ). Third, the number of hyperparameters influences the complexity of the regression model, and the polynomial kernel has more hyperparameters than the RBF kernel (Kim et al. 1999). Finally, the RBF kernel has fewer numerical difficulties because the kernel value lies between zero and one; on the contrary, the polynomial kernel value may go to infinity or zero when the degree is high. Therefore, the RBF kernel is used for building the default prediction regression model for passenger flow on the highway.
When the RBF kernel is selected as the default function, the two parameters (C, γ) associated with the RBF kernel also need to be decided. The upper bound C and the kernel parameter γ play an important role in the performance. Keerthi and Lin (2003) suggested a practical guideline for SVM using grid search and cross validation: cross validation can prevent the overfitting problem, while grid search avoids an exhaustive parameter search and may find good parameters in acceptable computational time. In addition, the parameters (C, γ) are independent, and thus grid search can be easily parallelized. The article uses the methods of RBF, cross validation and grid search to determine the optimal parameters of LS-SVM in accordance with the procedures of mesh generation and gradual refinement. First, we use a coarse grid (Fig. 4) and find that the best (C, γ) is (32, 0.0039) with a cross validation rate of 98.56%. Next, we use a finer grid search (Fig. 5) and establish that the best (C, γ) is (1, 0.125) with a cross validation rate of 99.5971%. After the best (C, γ) is found, the whole training dataset is retrained to generate the final regression result of passenger flow on the highway for the period 2006–2008 (Table 3).
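The coarse-then-fine procedure can be sketched generically as follows (our own illustration: `score` stands in for the cross-validation rate of an LS-SVM fit, replaced here by a hypothetical smooth function so that the example is self-contained; the grid bounds are also our own choice):

```python
import itertools

def grid_search(score, c_grid, g_grid):
    # Evaluate score(C, gamma) on every grid point; keep the best pair.
    return max(itertools.product(c_grid, g_grid), key=lambda p: score(*p))

def refine(center):
    # Finer geometric grid around a coarse optimum (half-octave spacing).
    return [center * 2.0 ** k for k in (-1, -0.5, 0, 0.5, 1)]

# Hypothetical stand-in for a cross-validation rate, peaked at C=1, gamma=0.125.
def score(c, g):
    return -((c - 1.0) ** 2 + (g - 0.125) ** 2)

coarse_c = [2.0 ** k for k in range(-5, 6)]   # 2^-5 ... 2^5
coarse_g = [2.0 ** k for k in range(-8, 3)]   # 2^-8 ... 2^2
c0, g0 = grid_search(score, coarse_c, coarse_g)          # coarse pass
c1, g1 = grid_search(score, refine(c0), refine(g0))      # fine pass
print((c1, g1))  # (1.0, 0.125)
```

Because every (C, γ) evaluation is independent, the inner loop parallelizes trivially, which is the property noted above.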

Results
This article uses the regression method of LS-SVM to predict passenger flow on the highway for the period 2006–2008 and measures the prediction results by four performance indexes (Table 4). The obtained data show that the LS-SVM model has high prediction accuracy and fitting degree (Fig. 6). In addition, by adjusting the constants of the LS-SVM model (Table 5), the article reduces the error and strengthens the smoothness of the regression function as far as possible. Therefore, we conclude that it is feasible and effective to predict passenger flow on the highway using the regression method of LS-SVM.

Conclusions
1. LS-SVM, based on statistical learning theory, is a machine learning method with a strict theoretical basis that is able to solve problems of small sample size, high dimensionality, nonlinearity and local minima. Owing to these merits, the article constructs a nonlinear regression model based on LS-SVM and uses an example of nine samples to make a prediction. The obtained result shows that the prediction method based on LS-SVM is feasible and accurate and can provide a new method for predicting passenger flow on the highway.
2. Passenger flow on the highway is an important index that reflects passenger capacity on the highway and is of great importance for grasping the development trend, characteristics and rules of passenger flow on the highway. However, the prediction methods for passenger flow on the highway have so far been based on the total amount. In fact, the prediction of the total passenger flow is only one aspect of prediction. While discussing passenger flow, we should also consider spatial position and passenger distribution, because these are important factors in formulating a development plan for the highway and arranging stations.
3. A traffic system on the highway is a complex one. There are many influencing factors in predicting passenger flow on the highway, including economic, non-economic, quantitative and qualitative factors. To some extent, the method discussed in the article is mainly based on data, so some limitations can be observed. If some qualitative analyses were added, they could compensate for the shortage of purely quantitative methods.