PEDESTRIAN SAFETY EVALUATION OF SIGNALIZED INTERSECTIONS USING SURROGATE SAFETY MEASURES

The large proportion of pedestrian fatalities has led researchers to seek improvements in pedestrian safety at intersections. This paper proposes a methodology to evaluate crosswalk safety at signalized intersections using Surrogate Safety Measures (SSM) under mixed traffic conditions. The required pedestrian, traffic, and geometric data were extracted from a videographic survey conducted at signalized intersections in Mumbai (India). Post Encroachment Time (PET) values for each pedestrian were segregated into three categories for estimating pedestrian-vehicle interactions, and the Cumulative Frequency Distribution (CDF) was plotted to calculate the threshold values for each interaction severity level. A Cumulative Logistic Regression (CLR) model was developed to predict the pedestrian mean PET values in the crosswalk at signalized intersections. The proposed model was validated with a new signalized intersection, and the results showed that the proposed PET ranges and model are appropriate for Indian mixed traffic conditions. To assess the suitability of the model framework, model transferability was tested with data collected at a signalized intersection in Kolkata (India). This study can help to rank the severity level of pedestrian safety in the crosswalk and to improve the existing facilities at signalized intersections.


Introduction
Accidents are undesirable events that lead to death or injury of road users and to property damage. Accidents occur when a series of contributing factors coincide; if these factors are removed, the probability of accidents can be reduced. The behavior and characteristics of pedestrians and traffic are the main contributing factors in accident occurrence at intersections. In India, more than 141500 people died and 488700 people were injured in road traffic accidents in 2014, which exceeds the accidental death figures of previous years (NCRB 2015). Mumbai has the highest number of "accidental deaths", including pedestrian fatalities, among the 53 major cities in India (Mohan et al. 2015). According to Mumbai traffic police records, 54% of the 3040 fatal accidents recorded in Mumbai from 2007 to 2010 took place at or close to intersections. The most vulnerable entity at intersections in India is the pedestrian. Therefore, the large proportion of pedestrian fatalities has led researchers to seek improvements in pedestrian safety at intersections. Pedestrian safety has been difficult to assess for innovative traffic treatments at signalized intersections because of the lack of statistical models for estimating conflict severity and the lack of consensus on what constitutes a safe or unsafe pedestrian facility. Pedestrian safety can be assessed by using the concept of Surrogate Safety Measures (SSM). SSM rest on the idea that accidents develop from interactions, which are situations where the probability of an accident is high. SSM are proactive indicators that reflect the safety of a pedestrian facility, and the key term used in these proactive studies is the interaction.
An interaction is defined as a recognizable condition in which two road users approach each other in time and space in such a way that there is a risk of collision if their actions remain unchanged (Marisamynathan, Vedagiri 2015; Vedagiri, Killi 2015). The primary objective of this study is to develop a mathematical model for estimating pedestrian safety by using SSM at signalized intersection crosswalks in developing countries. This paper is divided into several sections. First, a review of the literature on pedestrian safety at signalized intersections, SSM and the concept of Post Encroachment Time (PET) is provided. The next section focuses on the site selection, data collection and extraction process. This is followed by the definition of PET values for pedestrians with respect to vehicle type at signalized intersections. The paper then presents and discusses the development of a Cumulative Logistic Regression (CLR) model for estimating the pedestrian PET category. The final sections present an application of the research and concluding remarks.

Literature review
In a road network, pedestrian safety is commonly analyzed in terms of the number of crashes between pedestrians and vehicles (Kocourek, Padělek 2016; Roshandeh et al. 2016). However, this approach fails to quantify actual safety conditions for pedestrians because accident data and exposure measures are often unavailable. Most existing studies have examined pedestrian safety through historical accident data, in terms of frequency and severity. A few studies list the limitations of using accident data for pedestrian safety analysis, such as low sample means, underreporting, misallocation and misclassification (Persaud et al. 2013). Pedestrian safety evaluation at intersections can be grouped into two major methods: the accident rate method and the conflict method. Both methods have limitations: they require large amounts of data, need secondary data, and offer limited evaluation of findings and a lower level of accuracy (Wang, Abdel-Aty 2008).
To overcome these problems, proactive methods have been introduced that do not require any accident or secondary data and rely on surrogate measures of safety. Surrogate measure techniques observe non-crash events that are related in a reliable way to actual crashes and convert those non-crash events into a corresponding crash severity level. Surrogate measures provide better and more precise alternative safety indicators and are used to identify risk factors in before-after control studies (Vedagiri, Killi 2015; Nadimi et al. 2016; Zangenehpour et al. 2016). Most existing studies have adopted the Traffic Conflicts Technique (TCT) as the SSM to quantify the conflict severity level. TCT measures conflicts, which occur much more frequently than crashes and provide information on the relative risk at particular facilities (Song, Yang 2011; Cafiso et al. 2015; Fu et al. 2016). However, TCT has limitations when the conflict occurs between a pedestrian and a vehicle, because of the complex movement dynamics, the grouping, and the non-rigid and less organized nature of pedestrians. Therefore, some existing studies have utilized PET to measure the severity index of conflicts. The PET between two road users can be used as a surrogate measure; it is defined as the period of time from the moment when the first road user leaves the conflict area until the second road user reaches it (Ni et al. 2013; Cafiso et al. 2015; Fu et al. 2016; Vedagiri, Killi 2015; Nadimi et al. 2016; Zangenehpour et al. 2016).
Intersections have a large impact on pedestrian-vehicle interaction, which influences pedestrian safety and severity. Existing studies that relate PET to interaction severity report low model fitness or do not report it at all. Therefore, there is a need to develop and test a new surrogate safety measure for pedestrian crosswalks at signalized intersections, particularly in India. The major contribution of this paper is to propose accurate PET values for pedestrians from real-world data and to improve the accuracy of pedestrian PET prediction by proposing a mathematical model under mixed traffic conditions. The specific objectives of this research are to: (1) define the PET values for pedestrian severities in the crosswalk based on collected video data (by using a cumulative distribution function plot), (2) develop the CLR model to estimate pedestrian severity categories at crosswalks, (3) evaluate the safety level of the existing crosswalks at selected signalized intersections based on the PET values and model proposed in this study.

Data collection and analysis
Effective data on pedestrian behavior is required to improve the safety of pedestrians while crossing the signalized intersections. The data collection process requires a careful procedure to ensure the accuracy of the data. This section describes the process, which consists of the following steps: (1) site selection and videographic survey, (2) data extraction and analysis. Additional details for each step are provided as follows.

Site selection and videographic survey
To fix the required number of intersections for safety model development, the existing literature on pedestrian behavior and safety modeling at signalized intersections was reviewed; the number of study intersections in these works varies from one to ten (Muraleetharan et al. 2005; Abu Sa'a 2007; Zhang et al. 2009; Huang, Ma 2010; Chen et al. 2011; Ling et al. 2012; Nagraj, Vedagiri 2013; Marisamynathan, Vedagiri 2018). To cover the variation in pedestrian, traffic and roadway characteristics, five signalized intersections were selected in this study. The study locations were selected in the central part of Mumbai. All intersections in this study were typical four-arm signalized intersections where at least one approach was defined as an arterial. The major crosswalk with proper marking was considered for safety analysis at the selected intersections. The study locations were operated with shared signal phases and free left-turning movements for vehicles. A ramp between the sidewalk and the crosswalk was provided at all selected study locations. In addition, there was no exclusive lane for left-turning vehicles and no lane marking at any of the study locations. The selected study locations are Link Road Junction (A), Malad Junction (B), Mahim Junction (C), Mahatma Gandhi Road Junction (D) and Holkar Junction (E). The details of the selected study locations, pedestrian flow and signal timing are presented in Table 1.
Data were collected from the selected study locations by conducting field measurements and a videographic survey. To investigate pedestrian safety at crosswalks, peak-hour videos were recorded at the five selected signalized intersections. Based on a reconnaissance survey and secondary data, a two-hour duration was chosen, from 8:00 am to 10:00 am, which covers the opening times of schools, colleges, offices and commercial markets in Mumbai. Two hours of video data were therefore collected at each study location, and the pedestrian volume was extracted from the collected videos. Finally, the peak hour was fixed with respect to the maximum pedestrian volume at each location. For the videographic survey, two Sony cameras were used in HD resolution at 30 images/s. The cameras were set up on both sides of the crosswalk and covered pedestrian movement in both directions (upstream to downstream and downstream to upstream) at the selected crosswalk in each intersection. The camera setup points are shown in Figure 1.

Data extraction and analysis
The required data were extracted from the collected videos by using AVS video editor software (https://www.avs4you.com/avs-video-editor.aspx). The software provided 20 images/s, so 72000 images were extracted from each 1 h video. Since two cameras were used, a total of 144000 images were extracted per location. Based on earlier studies and field observations, several possible factors influencing pedestrian safety in the crosswalk at signalized intersections were identified for Indian conditions; the selected variables and their descriptions are shown in Table 2 with encoded parameters.
The categories of pedestrian crossing speed were selected based on the 15th and 50th percentile crossing speeds. According to the Highway Capacity Manual (2010), the 15th percentile crossing speed is used for the design of pedestrian signal timing, and the 50th percentile crossing speed is used for calculating the pedestrian level of service with respect to safety and comfort. Therefore, the 15th and 50th percentile crossing speeds were used to define the categories of pedestrian speed at the crosswalk. From the field data, the 15th and 50th percentile crossing speeds are 1.08 and 1.27 m/s respectively. For modeling and further analysis, the pedestrian crossing speed is divided into three categories: (1) less than 1.08 m/s, (2) between 1.08 and 1.27 m/s, (3) more than 1.27 m/s. A total of 1398 pedestrians were clearly observed in the recorded video, and the listed parameters were extracted for each pedestrian. In addition, pedestrians using the crosswalk during the pedestrian green phase were considered compliant and coded as 0, and pedestrians using the crosswalk during the pedestrian non-green phase were considered noncompliant and coded as 1.
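The encoding described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the treatment of speeds exactly equal to 1.08 or 1.27 m/s (assigned here to category 2) is an assumption, since the text does not state how boundary values were handled.

```python
def speed_category(speed_mps, p15=1.08, p50=1.27):
    """Map a crossing speed (m/s) to category 1, 2 or 3.

    Thresholds are the 15th and 50th percentile speeds reported in the text.
    Assumption: boundary values fall into category 2.
    """
    if speed_mps < p15:
        return 1
    if speed_mps <= p50:
        return 2
    return 3


def compliance_code(crossed_during_green):
    """Return 0 for a compliant pedestrian (green phase), 1 otherwise."""
    return 0 if crossed_during_green else 1


print(speed_category(1.15), compliance_code(False))
```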
To check the reliability of the sample, a sample size test was performed at the 95% confidence level. Assuming a margin of error of 3%, a response rate of 50% and a population of 20000 pedestrians yields a required sample size of approximately 1014 pedestrians. The required sample size (1014 pedestrians) is smaller than the number of pedestrians actually observed (1398 pedestrians), which indicates that the collected sample is adequate for developing a pedestrian severity model and for the further statistical analysis in this study.
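The paper does not name the formula behind this figure. As a hedged check, Cochran's sample size formula with a finite population correction (an assumption about the method, with z = 1.96 for the 95% level) reproduces the reported value from the stated inputs:

```python
import math


def required_sample_size(population, margin=0.03, proportion=0.5, z=1.96):
    """Cochran's formula with finite population correction (assumed method)."""
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Shrink it for a finite population of the given size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(required_sample_size(20000))  # 1014, matching the reported sample size
```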

PET values for pedestrian at signalized intersections in India
The surrogate measure of safety used in this study to evaluate the severity of pedestrian-vehicle interaction in the crosswalk is PET. PET is defined as the time gap between the arrival of the pedestrian at the collision point and the arrival of the first vehicle at the same collision point in the crosswalk, or vice versa. The schematic outline of pedestrian PET at an intersection is presented in Figure 2. The crosswalk is divided into three strips from curb to median. The strip near the curb (which also carries the left-turn movement) is considered the first strip, the middle lane the second strip and the strip near the median the third strip. In the same way, on the other side, the crosswalk from median to curb is divided into three strips. The strips were marked in the video while extracting data with the AVS video editor software, and the PET values for each pedestrian were extracted from the collected video. To calculate more accurate PET values, grid lines were plotted between the strips by using the software and values were extracted for each grid position. PET values are calculated as the time difference between the two road users (pedestrian and vehicle) passing through the particular strip where their paths intersect. A smaller PET value implies a higher probability of the pedestrian interacting with a vehicle in the crosswalk.
Notes: C/W - crosswalk; UtoD - upstream to downstream; DtoU - downstream to upstream.
To provide meaningful results, PET values are divided into three categories, defined as: (1) highly dangerous, (2) dangerous or conflict, (3) safe or no conflict. A Cumulative Frequency Distribution (CDF) plot was used to calculate the threshold value for each interaction severity category. The CDF of the mean PET values was plotted and the results are shown in Figure 3. The threshold values for each PET category were defined with respect to the 15th and 50th percentile values from the CDF plot. In addition, PET values were calculated for each vehicle type (two-wheeler, car, auto, bus and light commercial vehicle); the results, which are suitable for Indian mixed traffic conditions, are presented in Table 3. From Figure 3, the PET values at the 15th and 50th percentile frequencies are 2 and 5.5 s. Thus, this study sets an arbitrary threshold of 5.5 s on PET for interaction between pedestrian and vehicle in the crosswalk.
Notes: *, ** denote significance at the 95 and 99% confidence levels respectively; UtoD - upstream to downstream; DtoU - downstream to upstream; LCV - light commercial vehicle; HCV - heavy commercial vehicle.
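Reading thresholds off a cumulative frequency distribution amounts to taking percentiles of the observed PET values. The sketch below uses the nearest-rank percentile definition, which is an assumption (the paper reads the values from a plot), and the sample values are made up and deliberately constructed so that the two percentiles coincide with the reported 2 and 5.5 s; they are not the study's data.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest observation whose cumulative frequency reaches pct percent."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up PET observations in seconds, for illustration only.
sample_pet = [0.8, 1.5, 2.0, 2.4, 2.9, 3.3, 3.8, 4.4, 5.0, 5.5,
              6.2, 6.8, 7.5, 8.1, 9.0, 9.7, 10.4, 11.2, 12.0, 13.5]

highly_dangerous_max = percentile_nearest_rank(sample_pet, 15)
interaction_max = percentile_nearest_rank(sample_pet, 50)
print(highly_dangerous_max, interaction_max)
```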
PET values were segregated into three categories for estimating pedestrian-vehicle interactions, as presented in Table 3. The categories are distinguished by the pedestrian's response. If pedestrians do not change their crossing speed, the PET is considered as no interaction. If pedestrians increase or decrease their crossing speed, the PET is considered as an interaction. If pedestrians stop in the crosswalk, the PET is considered as a highly dangerous interaction. The study found that PET ≤ 2 s corresponds to a highly dangerous interaction, 2 < PET ≤ 5.5 s to an interaction, and PET > 5.5 s to no interaction.
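As a minimal sketch of how a PET value could be computed from frame indices and then classified with the thresholds above: the function names and the example frame numbers are hypothetical, and the 20 images/s rate is the extraction rate reported for the video software.

```python
FRAMES_PER_SECOND = 20  # extraction rate reported for the video software


def pet_from_frames(pedestrian_frame, vehicle_frame, fps=FRAMES_PER_SECOND):
    """Time gap (s) between pedestrian and vehicle passing the same strip."""
    return abs(vehicle_frame - pedestrian_frame) / fps


def pet_category(pet_seconds):
    """Classify a PET value with the 2 s and 5.5 s thresholds of this study."""
    if pet_seconds <= 2.0:
        return "highly dangerous interaction"
    if pet_seconds <= 5.5:
        return "interaction"
    return "no interaction"


# Hypothetical example: the vehicle passes 54 frames after the pedestrian.
pet = pet_from_frames(pedestrian_frame=1200, vehicle_frame=1254)
print(pet, pet_category(pet))
```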
Compared with other vehicle types, autos have lower PET values in all categories, which indicates that a pedestrian has a lower probability of interacting with an auto in the crosswalk. Two-wheelers have higher PET values, which means that the probability of pedestrian-vehicle interaction is high due to the lane-changing behavior and speed of two-wheelers. From Table 3, the severity of conditions at existing pedestrian crosswalks at signalized intersections can be evaluated, and the impact of each vehicle type on pedestrian safety can be estimated for Indian traffic conditions.

CLR model for pedestrian PET category
Once PET is categorized, CLR is applied to control for the effects of other significant variables such as traffic and pedestrian characteristics. CLR is one of the most commonly used statistical models for severity and ranked data analysis (Ye et al. 2015). For the pedestrian severity rating, the PET category has a hierarchical ordering that varies from 1 to 3. The discrete nature of the PET category rules out the use of conventional linear regression, which requires the dependent variable to be continuous. Therefore, the CLR technique is adopted in this study because it supports the modeling of discrete variables with hierarchical ordering. The Pearson correlation, Spearman's correlation and ANalysis Of VAriance (ANOVA) tests were performed to identify the significant variables that influence the PET category at crosswalks. The Pearson correlation and Spearman's correlation tests were performed by using the Statistical Package for the Social Sciences (SPSS 16.0, https://www.ibm.com/analytics/spss-statistics-software) at the 95% confidence level. Table 2 shows the results of the Pearson correlation tests with the PET category as the dependent variable and the remaining selected variables as independent variables.
Many existing studies have utilized Pearson correlation techniques to identify the correlation between continuous and categorical data or between categorical and categorical data (Bian et al. 2009; Ren et al. 2011; Lipovac et al. 2013; Nagraj, Vedagiri 2013). The same technique is adopted in this study to measure the correlation between PET values and the other independent variables (continuous and categorical data). Pearson's correlation coefficient is a measure of the strength and direction of the association between two variables measured on at least an interval scale. It is defined as the ratio of the covariance of the two variables to the product of their standard deviations. The value of the Pearson correlation indicates the direction of the association between two variables, and the Sig-value indicates its significance at the 0.05 or 0.01 level (2-tailed).
In addition, Spearman's correlation and one-way ANOVA tests were performed to identify the correlation between PET values (categorical data) and categorical independent variables. ANOVA is used to test whether there is any significant difference between the means of two or more independent groups. It is based on mean squares and variance. An F-value greater than F-critical and a p-value less than 0.05 indicate a significant difference between groups at the 95% confidence level. From Table 2 and the ANOVA test results, approaching vehicle direction (X1), approaching vehicle position (X2), approaching vehicle type (X3), pedestrian age (X4) and speed type (X5) were found to have a significant effect on the pedestrian PET category in crosswalks at signalized intersections. The Pearson correlation coefficient can take values from +1 to -1. A value of 0 indicates that there is no association between the two variables. A value greater than 0 indicates a positive association, that is, as the value of one variable increases, so does the value of the other variable. A value less than 0 indicates a negative association, that is, as the value of one variable increases, the value of the other variable decreases.
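The two statistics described above can be written out directly from their definitions. The sketch below is a plain-Python illustration with hypothetical function names; the study itself ran these tests in SPSS, and the p-values and critical F-values that SPSS reports are not computed here.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy inputs for illustration only (not study data).
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A coefficient near +1 or -1 from `pearson_r` corresponds to the strong positive or negative association described in the text, and a large F value signals that the group means differ by more than the within-group scatter would explain.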
The five significant variables identified in the correlation tests were considered the most probable primary factors affecting pedestrian safety at crosswalks, and these variables were used to develop the CLR model. The model was developed in SPSS 16.0 using 80% of the collected data. The proposed model for the pedestrian PET category in the crosswalk at signalized intersections is described by Equations (1) and (2), where i = 1, ..., 5.
The results of the developed CLR model are described in Table 4.
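To illustrate how a cumulative logit model turns predictor values into category probabilities, the sketch below assumes the parameterization logit P(Y <= j) = threshold_j - sum(beta_i * x_i). That sign convention, the function name, and every numeric threshold and coefficient are placeholders for illustration; they are not the fitted values of Table 4.

```python
import math


def cumulative_logit_probabilities(thresholds, coefficients, x):
    """Return [P(Y=1), ..., P(Y=J)] for an ordinal outcome with J categories.

    Assumed form: logit P(Y <= j) = thresholds[j] - sum(beta_i * x_i).
    """
    eta = sum(b * xi for b, xi in zip(coefficients, x))
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [1 / (1 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probabilities = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probabilities.append(cumulative[j] - cumulative[j - 1])
    return probabilities


# Placeholder values only: two thresholds (three PET categories), five predictors.
thresholds = [-1.0, 1.5]
coefficients = [0.4, -0.2, 0.1, 0.3, 0.5]
x = [1, 2, 1, 2, 3]  # encoded X1..X5 for one hypothetical pedestrian

probs = cumulative_logit_probabilities(thresholds, coefficients, x)
predicted_category = probs.index(max(probs)) + 1
print(probs, predicted_category)
```

The predicted category is taken here as the one with the highest probability, which is one common choice and also an assumption about how predictions were derived.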
Statistical performance tests were carried out, and the variables considered in the model were significant with a p-value of less than 0.05. The pseudo R² values of Cox & Snell, Nagelkerke and McFadden are 0.501, 0.533 and 0.412 respectively, which indicates an overall goodness of fit. In addition, the standard errors were small and the Wald values were significant at the 95% confidence level.
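For reference, the three pseudo R² measures can be computed from the log-likelihoods of the intercept-only and fitted models using their standard definitions. The log-likelihood values and sample size in the example call are hypothetical, since the paper reports only the resulting statistics.

```python
import math


def pseudo_r_squared(ll_null, ll_model, n):
    """Cox & Snell, Nagelkerke and McFadden pseudo R-squared values.

    ll_null: log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model
    n: number of observations
    """
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    # Nagelkerke rescales Cox & Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))
    mcfadden = 1 - ll_model / ll_null
    return {"cox_snell": cox_snell, "nagelkerke": nagelkerke, "mcfadden": mcfadden}


# Hypothetical log-likelihoods, for illustration only.
print(pseudo_r_squared(ll_null=-100.0, ll_model=-50.0, n=100))
```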
Of the field-observed data, 80% were used to develop the CLR model and the remaining 20% were used to validate it. The predicted PET categories were compared with the field-observed data, and statistical performance tests were carried out in Origin Pro 9.0 software (https://www.originlab.com/origin). The calculated values of the Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Pearson's R and R² were 7.64%, 1.9980, 0.7087 and 0.5022 respectively. The error values are small, which indicates that the predicted severities reflect the existing condition of pedestrians in the crosswalk at signalized intersections.
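The two error measures used in the validation follow their usual definitions, sketched below with hypothetical function names. The example inputs are toy numbers, not the study's observed and predicted values.

```python
import math


def mape(observed, predicted):
    """Mean Absolute Percentage Error, in percent (observed values must be non-zero)."""
    return 100 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Square Error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Toy values for illustration only.
observed = [2.0, 4.0]
predicted = [1.0, 5.0]
print(mape(observed, predicted), rmse(observed, predicted))
```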
Based on the proposed PET values and the respective categories, the selected study locations were ranked to understand the existing level of pedestrian safety at each location. From the analysis results, the PET category at all locations was found to be 2, which means that all locations are in dangerous condition for pedestrians. Based on the mean PET values, the locations listed in Table 1 are ranked in ascending order of mean PET, so that rank 1 is the most dangerous:
- rank 1: Mahatma Gandhi Road Junction (location ID: D, mean PET value 2.15 s);
- rank 2: Mahim Junction (location ID: C, mean PET value 2.29 s);
- rank 3: Holkar Junction (location ID: E, mean PET value 2.32 s);
- rank 4: Malad Junction (location ID: B, mean PET value 2.41 s);
- rank 5: Link Road Junction (location ID: A, mean PET value 2.55 s).
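The ranking can be reproduced by sorting the reported mean PET values in ascending order. The values below are those stated in the text; the variable names are only illustrative.

```python
# Mean PET values (s) per location ID, as reported in the text.
mean_pet_by_location = {"A": 2.55, "B": 2.41, "C": 2.29, "D": 2.15, "E": 2.32}

# Smaller mean PET means a more dangerous crosswalk, so sort ascending.
ranking = sorted(mean_pet_by_location, key=mean_pet_by_location.get)

for rank, location in enumerate(ranking, start=1):
    print(rank, location, mean_pet_by_location[location])
```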

Discussion
The pedestrian age parameter is divided into three groups (child, adult and elderly pedestrian) by visual appearance. The results clearly show that the PET of adult pedestrians is higher than that of the other age groups, and their average crossing speed is also higher, which allows them to interact with vehicles in the crosswalk. The pedestrian crossing speed has a significant influence on PET at crosswalks, assuming that the crossing speed stays the same or increases throughout the crosswalk. A reduction or increase in speed would result in an interaction. Three directional traffic movements (through, right turning and left turning) were considered for the analysis of pedestrian PET. The results indicate that pedestrians did not have difficulty when the approaching vehicle came from a turning direction during pedestrian red phases. Drivers of turning vehicles also reduced their speed while turning, and the volume of such vehicles is low compared to through-movement vehicles, which helps pedestrians to find the minimum PET needed to use the crosswalk during the pedestrian non-green phase. The approaching vehicle position is divided into three strips from curb to median. The strip near the curb (which also carries the left-turn movement) is considered the first strip, the middle lane the second strip and the strip near the median the third strip. Likewise, the crosswalk from median to curb is divided into three strips. When the lane of the approaching vehicle is closer to the pedestrian (first strip), the possibility of interaction between pedestrian and vehicle is lower. Usually, pedestrians judge the PET to the approaching vehicle from the first strip of the crosswalk and decide on that basis not to comply with the traffic signal. However, for lanes far from the pedestrian (second and third strips), they are unable to identify a suitable PET and encounter interactions with vehicles, which may result in accidents or delay to pedestrians.

Application
The main objective of this study is to evaluate the safety of pedestrians in the crosswalk at signalized intersections. The application was carried out with data collected at a new crosswalk at Chembur Nakka Junction (F) in Mumbai. A videographic survey was conducted, and data for a total of 60 pedestrians were extracted by using video editor software.
Problem: Are signalized intersections with crosswalk safe for pedestrians in Chembur Nakka Junction?
... Table 2), and the severity level of the observed PET value of 2.67 s is also 2, which fits the predicted value.
Recommendation for improving safety: From the analysis results, it is concluded that the crosswalk at Chembur Nakka Junction in Mumbai is "not safe". The results indicate that conditions for pedestrians are dangerous and that the probability of a pedestrian interacting with a vehicle in the crosswalk is high. Therefore, immediate remedial measures, such as prohibiting the free left turn and providing an exclusive signal phase for pedestrians, are needed to improve pedestrian safety at the crosswalks of Chembur Nakka Junction.

Transferability of proposed model framework
Model transferability is an essential attribute, because transferring the proposed model estimates to a new location can reduce or eliminate the need for large data collection and model development efforts in the application context. The proposed CLR-based pedestrian severity model was transferred to another city, Kolkata. Model transferability was tested with data collected at a typical four-arm signalized intersection (Sarat Bose Road Xing Junction) in Kolkata, another major metro city of India. A videographic survey was conducted during the peak hour on a weekday and covered all four approaches at the study location. The selected location has two-way traffic, bi-directional pedestrian movement, and a pedestrian shared signal phase system with a fixed-time signal. For model transferability, video data were extracted for three approaches, North (from Gariahat), South (from Kalighat) and West (from Rabindra Sarovar), and the respective sample sizes are presented in Table 5. The geometric characteristics of the selected location, with snapshots of each approach, are presented in Figure 4.
Instead of extracting all possible variables, only the five parameters listed in Table 4 were extracted from the collected videos, and the average values of the independent variables are tabulated in Table 5.
The transferability of the developed model was evaluated using field data, and the results are presented in Table 6. Table 6 shows that the estimated mean PET values were close to the field values and that the estimated PET categories matched the field-observed PET categories. In addition, the statistical performance of the proposed model against the field values was analyzed in Origin Pro 9.0 software. The calculated MAPE and RMSE were 18.05% and 0.4515 respectively. The acceptable difference between the values indicates that the developed model is capable of estimating the pedestrian severity level at signalized intersections in developing countries.
Finally, a sensitivity analysis of the model variables was performed with the data collected in Mumbai and Kolkata. The ranking of each variable was compared between the two cities. The results indicate that both cities have the same ranking of variables, in the following order:
- severity rank 1: pedestrian crossing speed;
- severity rank 2: approaching vehicle direction;
- severity rank 3: pedestrian age;
- severity rank 4: approaching vehicle type;
- severity rank 5: approaching vehicle position.
From the ranking results, pedestrian crossing speed has more impact on the pedestrian severity value than the other variables. Therefore, the first remedial measure should aim to reduce the impact of this variable and thereby improve pedestrian safety. Similarly, the other variables can be addressed in their ranked order. Pedestrian safety can thus be improved by addressing the most significant variables at the selected signalized intersections in developing countries.

Conclusions
This research investigated the safety effectiveness of pedestrian crosswalks using a pedestrian-vehicle interaction methodology based on video data. In this study, the PET value of the pedestrian is considered as a surrogate safety measure for defining the severity of the conflict between pedestrian and vehicle in the crosswalk at signalized intersections. The proposed methodology consisted of three major steps: (1) video data were collected at the selected study locations and the required data were extracted by using video editor software, (2) threshold values of PET for each severity condition were defined by using a CDF plot, (3) a statistical model of the pedestrian severity category was developed to identify the safety level of pedestrians in crosswalks. In addition, mean PET values for pedestrians were calculated with respect to each vehicle type. The study found that if the PET value is less than or equal to 2 s, there is a high probability of interaction between pedestrian and vehicle in the crosswalk, and if the PET value is greater than 5.5 s, there is no chance of interaction between pedestrian and vehicle in the crosswalk. A CLR model was developed for finding the mean PET values of pedestrians in crosswalks at signalized intersections, with a McFadden pseudo R² value of 0.412 and a validation R² value of 0.5022.
The results showed that the proposed mean PET ranges and the CLR model are appropriate for Indian mixed traffic conditions. This study can help to rank the severity level of pedestrian safety in crosswalks and to improve the existing facilities at signalized intersections. In addition, it can be used to revise PET values in traffic simulation tools such as VISSIM and SUMO and in driving simulators so that they better suit Indian conditions.