SPATIAL ANALYSIS OF SUBWAY PASSENGER TRAFFIC IN SAINT PETERSBURG

The purpose of the paper is to create a clear visualization of passenger traffic for the Saint Petersburg subway system. This visualization can be used to better understand passenger flow and to make more informed decisions in future planning. The research is based on officially published information about passenger traffic at subway stations for 2016 and 2018. The visualization was created with a variety of methods and software: Voronoi diagrams (QGIS software), social gravitation potential (R programming language), presentation of the gravitation potential as a relief (Blender software), and service zones of ground transport accessibility (2GIS, QGIS and the Mapbox mapping platform). The authors propose using the intersection between the service zones and the isolines of social gravitation potential as an instrument for spatial analysis of traffic data. The analysis showed that the current development of the subway system does not correspond to passenger distribution. All stations were classified according to their accessibility, and proposals for future directions of development were made.

Introduction
Public transport development is one of the most essential aims of urban planning. Car traffic takes up too much city space. Research shows that road system design aimed only at cars leads to more traffic, more traffic jams and, as a result, more accidents (Goodwin & Noland, 2003). Most modern urban projects aim to decrease the number of cars and to close certain city districts to traffic. However, the successful creation of this new type of city requires an adequate combination of ground and subway public transport. Development of the public transport system decreases car traffic, makes the city environment much more comfortable for residents and provides opportunities not just for traffic mobility but for social mobility as well.
In Russia, the balance between car roads and the public transport system is a more complicated issue than in other countries. Large city agglomerations have a high level of urbanization, which is combined with poor financing of the transport system. Only the Moscow agglomeration can afford to carry out big infrastructure projects such as new city railway systems or a new subway circle line. In Saint Petersburg, the second-largest agglomeration in Russia, the situation is quite the opposite: most projects are stagnating because of a lack of funding and a complete lack of coordination in the city government's actions.
The subway system is one of the best illustrations of the problems facing the development of the public transport system. Table 1 compares the number of stations planned for construction under the 2008 subway development scheme with the actual situation at the end of 2019. It has to be noted that subway development in Saint Petersburg is a very difficult and long process because of unfavorable geological conditions (Kozin, 2017). Therefore, under conditions of low financing and slow construction, it is necessary to choose the directions of transport system development correctly. In Saint Petersburg the situation is becoming critical because of the enormous number of people per subway station (Table 2). The subway system acts as a kind of skeleton of the public transport system: the population uses all the other types of public transport (city railways, tramlines, buses) to reach city districts far from subway stations.
Modern research on public transport networks can be divided into a few directions: network analysis using graph theory, spatio-temporal analysis of passenger flow using big data, and analysis of the robustness of public transport networks. Graph theory usually provides the framework for analyzing network properties and topological structure (Derrible, 2012; Levinson, 2012). Complex analysis of passenger flow is often done using big data from various providers: cell phone location data, smart card data, etc. (Xiao & Yu, 2018; Shin, 2020). This research is often aimed at supporting smart city policy. The robustness and resilience of public transport networks is also a very important topic in the face of natural and human-made disasters (Chopra et al., 2016; Yang et al., 2015; Gonzalez-Navarro & Turner, 2016). Usually, the main data for city transport system development is the amount of passenger traffic at existing stations. However, semantic data is hard for non-specialists to perceive, especially public organizations or the general public.
Therefore, it is necessary to visualize passenger flow in a comprehensible way to make it useful for public debate and future city planning. Currently, advances in traffic data acquisition and improvements in geostatistical graphics are leading to a growing number of traffic visualization projects, which are helpful for the analysis of transport networks. There are visual analytics systems based on smart card and social network data in China (Zhang et al., 2020), where traffic data is presented as a multilayered 3D visualization based on the spatiotemporal preferences of passengers. There is also a visualization of smart card data and Twitter activity for the Tokyo Metro (Itoh et al., 2014). N. Andrienko and G. Andrienko work on a large number of data visualization projects, some of which are connected to passenger traffic. These relate to personal activity and the SoBigData initiative (Andrienko et al., 2020), spatial connectedness by public transport (Andrienko et al., 2019) and the visualization of transport network graphs (Andrienko et al., 2016). Origin-destination (OD) data visualization and passenger flow simulation are also very popular (Pérez-Messina et al., 2020; Massobrio & Nesmachnow, 2020). Some authors present a micro-prediction approach to predict an individual passenger's destination station and arrival time (Lin et al., 2017). There are a number of experimental interactive projects that examine patterns of passenger behavior (Barry & Card, 2014; Chong, 2015; Bumgardner, 2016; Dataveyes, 2013). Most of these projects are based on public open data. Some projects aim only to represent daily or weekly patterns (Chong, 2015; Bumgardner, 2016), while others also analyze how delays affect passenger flow (Barry & Card, 2014). A more complex project (Dataveyes, 2013) challenges subway users to check whether their perception of time and space is based on objective facts. Traffic is visualized as a combination of a heatmap and a DEM (elevation is the number of passengers). The interpolation is well performed and the DEM can be explored in 3D mode. Unfortunately, there is no connection between the traffic visualization and the city map, which makes spatial research impossible.

Materials and methods
Unfortunately, there are not many resources containing reliable and open information about passenger traffic in the Saint Petersburg subway. Therefore, for the purpose of the research the data is taken from two sources: Kommet agency, the main advertising company in the Saint Petersburg subway (Kommet agency, 2019), and official statistics presented by Saint Petersburg Metro (Saint Petersburg metro, 2017). The former contained information for 2018 and the latter for 2016. It should be noted that the data contains some oddities and discrepancies, which are probably caused by different methods of calculation. For most stations, traffic in 2016 is higher than in 2018 by about 200 000 passengers per month. Traffic at most stations has not changed much, but some stations show substantial decreases or increases. Passenger traffic for transfer stations was combined (Figure 1 and Figure 2).
There are several reasons for the changes in traffic. For example, the traffic increase at Parnas, Devyatkino and Ladozhskaya stations is connected with the dramatic growth of urban areas near the stations. The traffic increase at Pl. Vosstaniya / Mayakovskaya may be explained by the intensification of traffic at Moskovskiy Railway Station, which receives all passengers from Moscow. The symmetric increases and decreases at Petrogradskaya and Gorkovskaya, and at Vladimirskaya / Dostoyevskaya and Pushkinskaya / Zvenigorodskaya (as well as the decrease at Spasskaya / Sennaya Pl. / Sadovaya), may be explained by redistribution of traffic between stations. The decrease at Vasileostrovskaya is connected with the construction of the second entrance to Sportivnaya station. The decrease in traffic at Prospekt Prosvescheniya is probably caused by redistribution of traffic from suburban areas to other stations (Parnas, Begovaya).
Semantic explanations and graphs are not very clear, especially if one is not familiar with the location of the stations and their connections. Such a presentation takes the data out of context and loses vital information about the location of stations. It is therefore necessary to create a visualization that shows traffic flow in a way that can be understood by the public.
The simplest method of visualization is just to draw points on the map and label them (Figure 3). This approach shows the location of the stations and which stations have the biggest passenger flow.
QGIS software and the Mapbox mapping platform (Mapbox, 2020) are used for initial preparation and visualization of the collected data. This visualization is preferable to the table form but still does not show the distribution of passenger density. One way to visualize traffic is with flow maps (Flowmap.blue, 2020), which represent aggregated numbers of movements between geographic locations. However, although passenger traffic is known for each station, there is no data about traffic direction or about the origins and destinations of passenger flows. Without this data it is impossible to create an accurate flow map. Therefore, passenger flow is presented as a digital elevation model (DEM). The passenger traffic of every station acts as the height of that point, and the terrain between the points can be recreated and visualized by interpolation.
A number of different economic, demographic and social factors influence passenger flow, with the subway stations acting as gravity points. Each station has its own gravitational force. Bearing that in mind, the data can be interpolated using demographic gravitation, or social physics. Social physics was inspired by the principle of isomorphism: one set of principles and laws can be applied to the social world as well as the natural world (Barnes & Wilson, 2014).
John Q. Stewart was one of the first researchers to propose applying the principles and laws of physics to social processes. He found that some objects, such as schools and colleges, have a gravitational potential similar to the gravitational influence of planets (Stewart, 1947). Such a potential can be calculated according to the mathematical formulas developed by Newton, Lagrange and others.
Note (figure panels 2016 and 2018): the first number is the ID of the station according to Figure 1, the second is the number of passengers per month in thousands of people; point colour corresponds to the line.
The data between known points should be interpolated, which helps to find and present the spatial pattern. Stewart potentials (Stewart, 1942) are used for this purpose. These potentials help to smooth complex spatial patterns. Calculations were done in RStudio with the SpatialPosition package (Pebesma & Bivand, 2005; Bivand et al., 2013), which is designed for building spatial position models. The package computes Stewart potentials with two spatial interaction functions: Pareto and exponential. A test visualization with the Pareto model showed that it suits neither the data nor the goal of the research. The exponential function gave good results. Interaction between points is defined according to formula (1):
$$ f(d) = \exp\left(-\alpha\, d^{\beta}\right) \qquad (1) $$
where d is the distance between points. The user should choose two parameters: β and dist, the distance at which the density of probability of the spatial interaction function equals 0.5 (which is used to calculate α). Parameter β is the impedance factor of the spatial interaction function, and it determines what the results look like. A number of equipotential (isopleth) maps were created to determine the correspondence between the data and the value of β (Figure 4). On these maps, isolines connect places with the same potential. The breaks were calculated as equal intervals.
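The potential calculation described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow: it assumes the exponential interaction function has the form exp(-α·d^β), with α chosen so that the function equals 0.5 at the user-supplied distance, and it assumes the potential at a point is the sum of station traffic weighted by that function. The station coordinates, traffic values, the `span` value and all function names are hypothetical.

```python
import math


def alpha_from_span(beta, span):
    """Alpha such that the interaction function equals 0.5 at distance `span`."""
    return math.log(2) / span ** beta


def interaction(distance, beta, span):
    """Exponential spatial interaction: exp(-alpha * distance ** beta)."""
    return math.exp(-alpha_from_span(beta, span) * distance ** beta)


def stewart_potential(point, stations, beta, span):
    """Sum of station masses, each weighted by the interaction function."""
    px, py = point
    total = 0.0
    for sx, sy, mass in stations:
        d = math.hypot(px - sx, py - sy)
        total += mass * interaction(d, beta, span)
    return total


def equal_interval_breaks(values, n_classes):
    """Class breaks that split the value range into equal intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    return [lo + i * step for i in range(n_classes + 1)]


# Hypothetical stations: (x in km, y in km, passengers per month in thousands).
stations = [(0.0, 0.0, 1000.0), (4.0, 0.0, 500.0)]

# Potentials along a line of sample points, then equal-interval breaks.
sample_points = [(x * 0.5, 0.0) for x in range(-4, 13)]
potentials = [stewart_potential(p, stations, beta=3.5, span=1.0)
              for p in sample_points]
breaks = equal_interval_breaks(potentials, n_classes=5)
```

In a real workflow the sample points would form a regular grid over the city, and the breaks would define the isolines of the isopleth map.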
The results show that the larger β is, the closer the isolines are to circles and the smaller the area they cover. For β = 1 almost the whole area is covered by isolines, whereas for β = 6 most of them are concentrated just around the stations. The best results were achieved for β = 3 and β = 4: the calculated isolines show the flow of people within the subway system but do not redistribute the flow to areas where there are no subway stations. The final maps were therefore created for β = 3.5 (Figure 5). These final maps were drawn using the Tanaka method, or illuminated contour method (Tanaka, 1950).
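A short worked equation may help explain this behaviour. It assumes the exponential interaction function has the form exp(-α·d^β) and that α is fixed by requiring the function to equal 0.5 at the chosen distance dist; both are assumptions about the model rather than statements taken from the paper's own formula.

```latex
\alpha = \frac{\ln 2}{\mathrm{dist}^{\beta}}
\quad\Longrightarrow\quad
f(d) = \exp\!\left(-\alpha\, d^{\beta}\right) = 2^{-(d/\mathrm{dist})^{\beta}}
```

Under these assumptions, at twice the reference distance the interaction is 2^(-2) = 0.25 for β = 1 but 2^(-64) ≈ 5.4·10^(-20) for β = 6. At half the reference distance it is about 0.71 for β = 1 and about 0.99 for β = 6. A larger β therefore makes the function closer to a step at d = dist, which is consistent with isolines collapsing into near-circles around individual stations.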
The resulting maps clearly show the structure of passenger flow and the main points with the biggest gravitational potential.
Most of the central stations (Nevski pr, Pl. Vosstaniya, Sennaya pl.) were combined into one cluster with the biggest passenger traffic. It can be clearly seen that the city center is very well covered by subway traffic, whereas the areas between some stations on Line 5 and Line 2 stay empty. Some of the empty spaces in subway coverage are due to natural barriers: for example, Primorskaya and Novokrestovskaya are located far from each other on different islands with no direct connection. Because of the unique location of Saint Petersburg, big bodies of water define the structure of its transport systems. A DEM-like surface was created from the isopleth maps. For smoothing, extra isolines were added by increasing the number of breaks. Two raster images were calculated: a monochrome one that is later used as the digital elevation raster and a colored one for a clearer DEM presentation. The model was created with the Blender creation suite (Huffman, 2019). These models are not very smooth because they were created from isopleth maps rather than from an actual DEM raster (Figure 6). This type of visualization challenges the conventional view and perception of the city (Dataveyes, 2013).
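As an illustration of the monochrome elevation raster mentioned above, the sketch below rescales a grid of potential values to integer gray levels. It is an assumption about how such a raster could be produced, not a description of the authors' processing; the function name and the sample grid are hypothetical.

```python
def to_heightmap(potential_grid, levels=256):
    """Rescale a 2D grid of potentials to integer gray levels 0..levels-1."""
    flat = [value for row in potential_grid for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat surface has no relief: every cell gets gray level 0.
        return [[0 for _ in row] for row in potential_grid]
    return [[round((value - lo) * (levels - 1) / (hi - lo)) for value in row]
            for row in potential_grid]


# Hypothetical 2 x 2 grid of potentials (thousands of passengers per month).
grid = [[0.0, 250.0], [500.0, 1000.0]]
heightmap = to_heightmap(grid)
```

The brightest cell corresponds to the highest potential, so a 3D tool that displaces a surface by gray level would turn the busiest stations into peaks.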
All the maps above were created using absolute numbers of passenger flow, but different stations serve areas of different sizes. Therefore, it is important to calculate the relative number of passengers per square meter. A Voronoi diagram was built to draw the boundaries of the service area for each station. With this method of plane partition, each station is assigned a cell in which every point is closer to that station than to any other. The biggest problem with this partition is that it does not take external factors into account, for example natural barriers (Figure 7). For most stations located close to the Gulf of Finland or to big rivers such as the Neva, the created polygons extend across these natural barriers. However, most passengers would not cross the river or the gulf to get to the closest subway station. Another issue is that the ground public transport system is oriented toward certain subway stations, for example those located near railway stations. Therefore, the Voronoi diagram sometimes has to be corrected using local information about the traffic system and city topography. The diagram was corrected using the 2GIS project (2GIS, 2020) to evaluate the transport accessibility of subway stations. The data was prepared in QGIS software and visualized via Mapbox (Figure 8). After that, the surface areas and the relative number of passengers per square meter were calculated. This parameter was set as the height of 3D polygons for the Mapbox visualization (Figure 9).
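The idea of a Voronoi service area and of traffic per unit area can be sketched in Python by assigning each cell of a regular grid to its nearest station. This is a raster approximation chosen for brevity, not the polygon-based QGIS procedure, and it does not model the corrections for natural barriers or ground transport. The station names, coordinates and traffic values are hypothetical.

```python
import math
from collections import Counter


def nearest_station(point, stations):
    """Name of the station closest to `point` (Euclidean distance)."""
    px, py = point
    return min(
        stations,
        key=lambda name: math.hypot(px - stations[name][0],
                                    py - stations[name][1]),
    )


def service_areas(stations, x_range, y_range, cell):
    """Approximate Voronoi cell areas by counting grid cells per station."""
    counts = Counter()
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    for i in range(nx):
        for j in range(ny):
            centre = (x_range[0] + (i + 0.5) * cell,
                      y_range[0] + (j + 0.5) * cell)
            counts[nearest_station(centre, stations)] += 1
    return {name: counts[name] * cell * cell for name in stations}


# Hypothetical stations (coordinates in km) and monthly traffic (thousands).
stations = {"A": (1.0, 1.0), "B": (3.0, 1.0)}
traffic = {"A": 900.0, "B": 300.0}

areas = service_areas(stations, (0.0, 4.0), (0.0, 2.0), cell=0.5)
density = {name: traffic[name] / areas[name] for name in stations}
```

Here both stations receive the same service area, so the difference in density comes entirely from the difference in traffic.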
Our final idea is to combine the isopleth graphic interpretation with the service areas to find the correspondence between them for different stations. Isopleth areas are theoretical, or "ideal", whereas service areas are "practical". The two layers were therefore overlaid and intersected in QGIS (Figure 10).

Results and discussion
Isopleth coverage shows what part of a service area intersects with the isopleth area. These intersections might be considered as access areas for stations. Relative sizes of areas were calculated in relation to the sum of service and isopleth areas, respectively (Figure 11).
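A minimal sketch of the coverage measure, assuming both layers have been rasterised to the same grid so that each area is a set of cell indices. This set-based approach is a simplification of a polygon intersection in a GIS, and the example areas are hypothetical.

```python
def coverage(service_cells, isopleth_cells):
    """Share of a service area that also lies inside the isopleth area."""
    if not service_cells:
        return 0.0
    return len(service_cells & isopleth_cells) / len(service_cells)


# Hypothetical rasterised areas: each cell is a (column, row) index.
service = {(c, r) for c in range(4) for r in range(4)}       # 16 cells
isopleth = {(c, r) for c in range(2, 6) for r in range(4)}   # 16 cells, 8 shared
share = coverage(service, isopleth)
```

A share of 1.0 would mean the whole service area lies within the isopleths, while a low share marks a station whose service area extends well beyond them.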
Closer inspection of the table shows that the two stations with the biggest service areas (Prospekt Prosvescheniya and Ladozhskaya) serve more than 17% of the city, with relative passenger traffic of about 3% at each station. However, some stations located in the city center (Nevsky Prospekt / Gostiny Dvor) have almost the same relative number of passengers, but their service areas are much smaller. This difference can mostly be explained by the location of the stations. Prospekt Prosvescheniya and Ladozhskaya serve densely populated areas as well as some suburban areas, and so they attract many people from a large territory. Nevsky Prospekt / Gostiny Dvor is a transfer station, so it receives a lot of passengers on their way from one line to another, but it is located near several other stations.
In addition, it is important to analyze the possible correlation between the size of the service area and isopleth coverage. As can be seen on the chart (Figure 12), there is a significant correlation between the two variables. These results do not rule out the influence of other factors. Most of the service areas with 100% coverage are in the city center. More than 50% of the city area corresponds to stations with isopleth coverage of less than 50% (Figure 13). In contrast, central stations with coverage of 90-100% absorb more than 25% of passenger traffic. Weakly covered service areas (<50%) account for 37% of traffic, which is also a significant value. It is important to note that central stations carry a large amount of transit and tourist traffic. Therefore, the real relative traffic of city residents at weakly covered stations is higher.
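The relationship between service area size and isopleth coverage could be quantified with a correlation coefficient. The sketch below computes Pearson's r in plain Python. The paper does not state which coefficient was used, so this choice is an assumption, and the sample values are hypothetical rather than taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical stations: service area (sq. km) and isopleth coverage (%).
service_area = [2.0, 5.0, 10.0, 20.0, 40.0]
isopleth_coverage = [100.0, 90.0, 70.0, 45.0, 20.0]
r = pearson(service_area, isopleth_coverage)
```

With these made-up values r is negative, meaning that larger service areas go with lower coverage; the real data would need to be substituted to reproduce the relationship shown in the chart.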
The weakest points in the subway system are the stations with poorly covered service areas. Their area and traffic distributions are presented in the diagrams (Figure 14a and 14b).
It should be noted that most of this area and traffic relates to the North-East, East and North of the city (Figure 15). Some issues with access to subway stations in the South of the city were partly relieved by new stations (opened in November 2019). However, this is not reflected in the analysis because there is no official information about their passenger traffic. The South-East part of the city is not very populated and is mostly covered by industrial sites or railway infrastructure, so there is not much need for access to the subway. In contrast, the South-West part mostly consists of residential areas, but for most of it the isopleths do not reach even half of the service area. It should therefore be the main direction of development for the subway system (Figure 16).
It is interesting to compare our findings with the plan of subway system development according to the Government of Saint Petersburg (Figure 17). Of course, development directions toward the North and North-West do exist, but their time of realization is "after 2030", which means "never" in city management practice. Even part of the Circle line is planned to be constructed earlier.

Conclusions
City planning should rely on research into where people go, how passenger traffic flows and which areas are underserved. One aspect of such research should be a clear presentation of data. In this paper, we analyzed the best way to present data about the passenger traffic of a subway system. Our results show that current development plans do not correspond with the official statistics. Sometimes, subway system development depends on strategic decisions. For example, development plans were reconsidered when Russia was selected to host the 2018 FIFA World Cup. Novokrestovskaya station was constructed specifically to connect the World Cup stadium to the subway system. Now it has no significant traffic except during football games. Another example is the plan to construct a station near the marine passenger terminal. The terminal is located far from residential areas and serves a small number of cruise ships, so traffic will not be very significant. In contrast, large overpopulated areas have no subway stations, so hundreds of thousands of passengers need to reach subway stations by ground transport.
It is difficult to compare this study to others in the field because of the lack of data. Most researchers have the advantage of using big data sources. These might include not just passenger traffic at each station but also aggregated data on passenger flow from station to station, derived from phone locations or smart card data. The analysis might include how passenger flow corresponds with the speed of trains and how delays affect traffic. Unfortunately, public datasets in Russia are very rare and mostly do not contain detailed information, so at this stage such research is virtually impossible. Another possible area of future research would be to investigate how accessible the whole public transport system is in different areas of the city. Some areas not served by the subway have excellent ground transport systems.
It is possible to compare this visualization to existing ones. The closest one is the Paris metro visualization made by Dataveyes (2013). It has better interpolation and a smoother surface, but this can be explained by the high density of stations in Paris (249 in Paris vs. 23 in Saint Petersburg for the city center). In addition, the Dataveyes project is not matched to a city map, so it cannot support spatial analysis. In contrast, our method aims to connect traffic data and spatial features. The visual presentation illustrates the main difference between the systems: stations in the Paris metro are spaced evenly, so there are no extreme elevations, and the notable peaks are related to railway stations and train hubs. In contrast, the small number of stations and the high population density make the same visualization for Saint Petersburg rather different. There are many extreme elevations located on the outskirts, which illustrates the lack of stations in rapidly developing city districts. Other parts look smooth because traffic there is insignificant, creating an almost flat surface in comparison with the overcrowded stations.