OPERATING SPEED AS A KEY FACTOR IN STUDYING THE DRIVER BEHAVIOUR IN A RURAL CONTEXT

The research aims to explore the effects of geometric road features on driver speed behaviour in order to identify unsafe road segments where high reductions in speed between successive road elements occur. The sample involves two-lane rural roads on flat terrain (vertical grade less than 5%) in Southern Italy, totalling 184 km without spiral transition curves between the tangent segments and circular elements. The testing was carried out on 567 study sites, of which 248 are on circular curves and 319 on tangents. Speed data were collected with a laser detector under the following environmental and traffic conditions: dry roads, free flow conditions, daylight hours and good weather conditions. The main goal was to calibrate and validate different operating speed prediction models: a) one model on tangent segments; b) one model on circular curves; c) a single model to be used on both tangents and circular curves. The validation process involved almost 10% of the total road network length, which was removed from the calibration phase. The speed measurements of each of the first two datasets (a, b) were grouped into ten homogeneous substrates, while for the remaining dataset (c) sixteen substrates were defined by using a hard c-means algorithm. Two statistical criteria were used to remove anomalous operating speed values from each group of the three datasets, namely, the Chauvenet criterion and the Vivatrat method. The first criterion was preferred in the final process of model selection: its filtering produced more homogeneous samples that guaranteed a higher correlation coefficient and lower residuals of the predictive models during the validation phase than the Vivatrat method. The models were developed using an Ordinary Least Squares (OLS) method.
The explanatory variables were total segment length, lane width, curvature of the road element, the curvature change rate on homogeneous road segments, and the number of residential driveways per km. ANOVA and additional synthetic statistical parameters were assessed to check the effectiveness of using a single general model to predict operating speeds at the same time on tangents and on circular curves alike. The results suggested the reliability of this hypothesis and its effectiveness in bringing advantages during the application phase.


Introduction and Literature Review
The prediction of human behaviour takes on a key role in the optimal management of traffic, maintenance and planning of the best intervention strategy. User behaviour is difficult to predict as it is influenced by human, infrastructural and environmental factors. Studies have shown that one of the parameters that most influence safe driving is operating speed, and being able to predict it is one of the main factors in highway safety assessment at the design phase. One way to compensate for human information processing limitations is to design roadway environments in accordance with driver expectations: a road designed according to the Rules and, consequently, according to drivers' expectations has good alignment consistency.
The use of operating speeds in lieu of, or in addition to, design speed has been suggested and implemented in many countries when dealing with design consistency. One of the ways in which operating speeds are used in ensuring design consistency is through the use of speed profiles. Speed-profile models are used to detect speed inconsistencies along road alignments. Design inconsistencies are identified on the speed profile when there are large differentials in operating speeds between successive alignment features (Fitzpatrick et al. 2000).
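As an illustration of how a speed profile can be screened for inconsistencies, the following minimal sketch flags pairs of successive elements where the operating speed drops sharply. The function name and the threshold value are hypothetical; the threshold is an input left to the analyst and is not a value taken from the cited studies.

```python
def flag_speed_drops(v85_profile_kmh, max_drop_kmh):
    """Return indices i where the operating speed falls by more than
    max_drop_kmh between element i and element i + 1 of the profile."""
    flagged = []
    for i in range(len(v85_profile_kmh) - 1):
        drop = v85_profile_kmh[i] - v85_profile_kmh[i + 1]
        if drop > max_drop_kmh:  # only reductions are of interest here
            flagged.append(i)
    return flagged


# Illustrative profile (km/h) along five successive elements;
# the 20 km/h threshold is an assumption made for this example only.
inconsistent = flag_speed_drops([95, 70, 72, 50, 90], max_drop_kmh=20)
```

Each flagged index marks the upstream element of a pair that would deserve closer inspection.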
Geometric design refers to the selection of roadway elements that include horizontal alignment, vertical alignment, cross section, and the roadside of a highway or street. In general terms, good geometric design means providing the appropriate level of mobility and land use access for motorists, cyclists, and pedestrians while maintaining a high degree of safety. Roadway design must also be cost effective in today's fiscally constrained environment. The goal is to provide geometric street designs that 'look and feel' like the intended purpose of the roadway. Such an approach produces geometric conditions that should result in operating speeds consistent with driver expectations and commensurate with the function of the roadway. It is envisioned that a complementary relationship would then exist between design speed, operating speed, and posted speed limits (Fitzpatrick et al. 2003).
The research presented here explores the effects produced by the road elements of a two-lane rural road network (tangent segments, circular curves, total element length, mean curvature change rate, width of travel lanes plus shoulders, residential driveways per km, radius of the geometric element) on speed levels and vehicle trajectories. The basic aim is to determine whether such a relationship exists and, if so, what its consequences are for road safety in terms of high operating speed variation between two successive road elements. Spiral transition curves, tangent-to-curve transition sections, and curve-to-curve transition curves are missing along the whole length of the two-lane rural roads studied. These types of roads were built before the current Road Design Standard (Ministero delle Infrastrutture e Dei Trasporti 2001) became law. These Rules require the design and introduction of spiral transition curves along the horizontal alignment length because they help to provide a gradual change in the alignment curvature and also avoid abrupt speed variations while negotiating the change of curvature. Passetti and Fambro (1999) investigated whether spiral transitions can affect the speed at which vehicles move along horizontal curves on rural two-lane roadways. Passenger car free-flow speeds on spiral transition curves were compared with the speeds of free-flow passenger cars on circular curves with similar geometric characteristics. Using regression techniques, it was concluded that spiral transitions did not significantly affect the 85th percentile speed of drivers on horizontal curves, although spiral transition curves may affect vehicle speeds as the curve radius decreases.
Few studies in the scientific literature have dealt with a complete speed profile for roads without spiral transition curves (Dell' Acqua, Russo 2010; Figueroa, Tarko 2005a, 2005b). Using an iterative process, they obtained a deceleration/acceleration transition length divided between the approach/departure tangent to/from the horizontal circular curve and the circular element. They subsequently calibrated predictive speed models on tangents and curves by removing speed values falling within the transition portions of the tangents and circular curves from the database. Models available in the scientific literature make it possible to reproduce real driver speed behaviour on the horizontal alignment in order to carry out safety evaluations (Arndt et al. 2011; Chen, S., Chen, F. 2010) and road consistency assessments along the whole length (Boonsiripant et al. 2011; Dell' Acqua et al. 2013).
Many researchers have dealt with driver speed behaviour on rural roads to identify all the possible factors that can be directly linked to safety conditions during travel (Dell' Acqua, Russo 2011) and to analyze changing driver behaviour in relation to rapid evolutionary change in the surrounding environment (Török 2011). Promising methodologies that can be employed to improve roadway safety performance and to choose the best improvements have been proposed. Designing highways to influence driver-operating speed effectively through environmental feedback is a key research field requiring special attention (Stamatiadis, Hartman 2011).
Speed has been found to have a very large effect on road safety, probably larger than any other known risk factor. Speed is a risk factor in all accidents, ranging from the smallest fender-bender to fatal crashes. The effect of speed is greater in accidents leading to serious injury and fatal events than in property-damage only accidents (Elvik et al. 2004; Weiss et al. 2014). Louah et al. (2009) developed an equation applicable to circular curves and independent tangents: predicted $V_{85}$ values are a function of the curve radius, constrained to an asymptotic value as the curve radius tends to infinity. In fact, for approach tangents longer than 500 m, a constant value is given by the equation with the asymptotic value of 102 km/h. Complementary fits were then carried out using uncorrected data, introducing the year and the speed limit as explanatory variables. Castro et al. (2013) developed an operating speed model for passenger cars on circular curves by collecting data on 42 elements of two-lane rural highways in South-West Spain, integrating laser, GPS and GIS systems. They compared this model with several models developed previously in other European countries. Morris and Donnell (2014) proposed operating speed prediction models for passenger cars and trucks on multilane highways combining horizontal curves and steep vertical grades. The findings indicate that the radius of the horizontal curve and an increase in the right shoulder width appear to have a larger influence. Higher posted speed limits were associated with higher truck and passenger car operating speeds.
Despite the widespread acceptance and use of speed limits throughout the world, there has been no consensus among practitioners concerning the methods and techniques that should be used to select the most appropriate speed limit for a particular facility. Statutory limits are based on the concept that uniform categories of highways can operate safely at certain maximum speeds under ideal conditions. Most engineering approaches to speed limit setting are based on the 85th percentile speed. The typical procedure is to set the speed limit at or near the 85th percentile speed of free-flow traffic. Adjustments intended to either increase or decrease the speed limits may be made depending on infrastructure and traffic conditions. Specifically, earlier research had shown that travelling at or around one standard deviation above the mean operating speed yields the lowest crash risk for drivers. Furthermore, crash risk increases rapidly for drivers travelling two standard deviations or more above or below the mean operating speed. Therefore, the 85th percentile speed separates acceptable speed behaviour from unsafe speed behaviour that disproportionately contributes to crash risk. Variable speed limits are speed limits that change, using dynamic sign messages, based on road, traffic, and weather conditions. Variable speed limits offer considerable promise in restoring the credibility of speed limits and improving safety by restricting speeds under adverse conditions. Variable speed limit systems may use sensors to monitor prevailing traffic and/or weather conditions, and input from transportation professionals and law enforcement in posting appropriate enforceable speed limits on dynamic message signs (Forbes et al. 2012).
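The link between "one standard deviation above the mean" and the 85th percentile can be made explicit under the assumption, not stated in the source, that free-flow speeds are approximately normally distributed with mean $\mu$ and standard deviation $\sigma$:

```latex
\Phi(1) \approx 0.841
\quad\Longrightarrow\quad
\mu + \sigma \ \text{is roughly the 84th percentile},
\qquad
V_{85} \approx \mu + 1.036\,\sigma .
```

Here $\Phi$ denotes the standard normal cumulative distribution function. For a real speed sample the empirical percentile would normally be used instead of this approximation.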

Data Collection
This study is a continuation of a 2010-2011 (Dell' Acqua, Russo 2010, 2011) research project that developed procedures to predict speed factors for horizontal curves and tangents on low-volume roads reproducing real driver speed behaviour at each section of the horizontal alignment with geometric variables and non-geometric variables.
A total of 184 km of two-lane rural road network in Southern Italy was investigated in the research project presented here: 509 speed measurements were collected, 42% of which were on tangents (total of 212 investigated road sections) and 58% on circular curves (total of 297 investigated road sections). A total of almost 7090 hours of speed data were collected in 3 years. As explained later in the manuscript, a procedure available in the scientific literature (Figueroa, Tarko 2005a, 2005b) was implemented to define transition zones at each circular curve. This methodology has made it possible to identify portions of circular curves and adjacent tangents perceived as transitions approaching or leaving the curve. Speed data collection was carried out using laser detectors in specific environmental and traffic conditions where drivers can reach best driving performance (Dell' Acqua 2015): i.e., dry roads, free flow conditions, daylight hours, and good weather conditions. Time headways of 5 s or more were used to identify free-flow vehicles (Figueroa, Tarko 2005b). The average number of speed observations made was 300, with at least 100 observations in free flow conditions at each spot. Laser detectors were hidden from the view of drivers and placed on a tripod beside the road for two to three hours. The detector emitted and received a pair of laser beams and recorded the time, instantaneous vehicle speed, vehicle length and travelling direction for each vehicle.
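The free-flow screening and the operating speed calculation described above can be sketched as follows. This is a minimal illustration, assuming that the records refer to a single travel direction at one site and that the first recorded vehicle is treated as free-flowing; the function and argument names are hypothetical.

```python
import numpy as np


def free_flow_v85(arrival_times_s, speeds_kmh, min_headway_s=5.0):
    """Return (V85, number of free-flow vehicles) for one site and direction."""
    t = np.asarray(arrival_times_s, dtype=float)
    v = np.asarray(speeds_kmh, dtype=float)
    order = np.argsort(t)          # make sure records are in time order
    t, v = t[order], v[order]
    # Headway = time gap to the preceding vehicle. The first vehicle has no
    # predecessor, so its headway is set to infinity (assumption of this sketch).
    headway = np.diff(t, prepend=-np.inf)
    free = headway >= min_headway_s
    # 85th percentile of the free-flow speeds (linear interpolation).
    return float(np.percentile(v[free], 85)), int(free.sum())
```

In practice a site would be retained only if the returned count reaches the minimum of 100 free-flow observations mentioned in the text.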
The laser detector was placed on the beginning section, middle section and end section of each tangent segment and circular curve. Driver speed behaviour approaching the curve and leaving the curve has been carefully examined in previous research by the authors (Dell' Acqua, Russo 2010): the number of sections surveyed was increased from the beginning and end section of each curve. The first change was from 30 to 50 m, the second from 80 to 120 m and the third change from 140 to 200 m. Motorcycles and trucks were eliminated from the database because the main goal is to calibrate operating speed prediction models on tangents and circular curves for the highest percentage of the traffic component of the average annual daily traffic. At least 100 free flow speeds were measured at each site. The rural roads examined in this manuscript are located on flat terrain with a vertical grade of less than 5%. Roadway grades have a different effect on vehicle speeds, depending on vehicle and roadway characteristics. For example, passenger cars can generally negotiate gradients of 5% or less without considerable reductions in vehicle speeds, while heavy-duty trucks are affected significantly by gradients because of their inferior operating capability. The vertical grades collected for the Italian case study make it possible to disregard the effect of the vertical grade on the operating speed value for passenger cars only.
Vertical grades and increasing lane width appeared to have a more significant influence on truck operating speeds than on passenger car speeds (Morris, Donnell 2014). Some guidelines are available in the scientific literature where the effect of grade on vehicle performance and a list of road types which would be suitable for these grades are shown in Table 1 (DTMR 2015).
Other research works have shown how the maximum road grades are determined by vehicle performance, particularly heavy vehicles, and level of service criteria. On high-speed roads, grades of up to 3% provide road users with a good level of service, and minimise the adverse effects of speed variance between different types of vehicle. On roads with more modest operating speeds, grades of up to about 6% do not usually cause noticeable problems with speed variance. Grades steeper than 10% often cause speed variance problems. The main problem is the very slow uphill speeds of heavy vehicles, but there is also the potential for high downhill speeds on steep grades and the safety problems associated with these (NZ Transport Agency 2000).
The calibration network involved 172 km of the total, while the remainder was used to validate the models. Speed observations on curves with a radius greater than 500 m were included in the same database because driver behaviour is very similar to that adopted on the tangents analysed in terms of acceleration and deceleration motions leaving and approaching the circular curve (Dell' Acqua, Russo 2010). The experimental analysis proved that curves with a radius greater than 500 m were not binding elements of the horizontal alignment or restrictive for high driving performance because the resulting deceleration rate approaching the curve and/or acceleration rates leaving the curve were less than 0.1 m/s².
On tangent segments, calculated operating speeds vary from 35 to 110 km/h with a mean value of 79 km/h. The mean roadway width (travel lanes plus shoulders) is 7.51 m. Tangents have a minimum length of 22 m and a maximum length of 2.6 km. On circular curves, calculated operating speeds vary from 20 to 112 km/h with a mean value of 63 km/h. A careful analysis showed that on tangent lengths greater than 1 km the operating speed value adjusts to an asymptotic value of 110 km/h while on circular curves with a radius of less than 60 m the operating speed value adjusts to asymptotic value equal to 44 km/h; therefore, these study sites were not taken into account in calibrating operating speed prediction models like the points on the transition zones. Table 2 shows the descriptive statistics of the mean features observed on the investigated road network.
The variables in Table 2 are as follows: L is the total element length; W is the travel lanes plus shoulders; RES is the number of residential driveways per km; R is the radius of the geometric element; 1/R is the curvature of the geometric element; CCR is the sum of the absolute values of angular changes on the horizontal alignment divided by the total length of the road section; $CCR_s$ is the curvature change rate of the single circular curve with transition curves (Wolhuter 2015) equal to 63700/R without transition curves.
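The constant 63700 can be recovered with a short derivation. The sketch below assumes that angles are expressed in gon (200 gon = $\pi$ rad), that R and the curve length $L_c$ are in metres, and that the result is in gon/km; these units are the usual convention for this indicator but are not stated explicitly in the text.

```latex
CCR_s
= \frac{\dfrac{L_c}{R}\cdot\dfrac{200}{\pi}}{L_c/1000}
= \frac{200\cdot 1000}{\pi R}
\approx \frac{63\,700}{R}\ \text{gon/km},
\qquad
\text{e.g. } R = 200\ \text{m} \;\Rightarrow\; CCR_s \approx 318.5\ \text{gon/km}.
```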
The curvature change rate is a variable that reflects the mean tortuosity of the horizontal alignment of a road. Studies are available in the literature where CCR is used to divide the sample into homogeneous segments, to investigate the road consistency (Fitzpatrick et al. 2000), and as one of several explanatory variables to calibrate operating speed prediction models on tangents and circular curves (TRB 2011). In order to identify homogeneous segments, we referred to the indications of the German standard (RAS-L:1995).
A diagram is plotted: on the x-axis there is the road distance expressed in km, and on the y-axis there is the cumulative of the absolute value of the angular changes. The slope of each fitting line with the highest coefficient of determination calculated using the Ordinary Least Squares (OLS) method shows the curvature change rate of a homogeneous road segment. No more than three homogeneous road segments were identified on the investigated rural roads.
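The slope-based definition of the curvature change rate can be sketched as follows. This is a minimal illustration for one candidate segment, assuming the angular change of each element is already known; the search for the segment boundaries that maximise the coefficient of determination is left to the analyst, and the function name is hypothetical.

```python
import numpy as np


def segment_ccr(distance_km, angle_change):
    """Fit a straight line to cumulative |angular change| versus distance.

    Returns (slope, r_squared); the slope is the CCR of the segment in
    angle units per km.
    """
    x = np.asarray(distance_km, dtype=float)
    y = np.cumsum(np.abs(np.asarray(angle_change, dtype=float)))
    slope, intercept = np.polyfit(x, y, 1)   # ordinary least squares line
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(1.0 - ss_res / ss_tot)
```

Comparing the coefficient of determination obtained for different candidate boundaries is one way to decide where one homogeneous segment ends and the next begins.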

Data Analysis
During the study carried out in 2010 (Dell' Acqua, Russo 2010), $V_{85}$ profiles were drawn for both travel directions for all selected roadways, and a careful analysis was carried out to identify the real transition zone used by drivers to decelerate on approaching a curve and to accelerate on leaving the curve for each circular element. The study confirmed that two-lane rural roads without spiral transition curves between tangent and circular curves, as well as between successive circular curves, have transition zones that can include a portion of the circular horizontal curve and of the adjacent tangent. It was found that the mean value of the deceleration transition length $L_d$ is 115 m and the mean value of the acceleration transition length $L_a$ is 130 m. In particular, $L_d$ is divided as follows: 60% on the approach tangent, measured from the beginning section of the circular curve, and 40% on the circular element, while $L_a$ is divided as follows: 51% on the departure tangent, measured from the end section of the circular curve, and 49% on the circular curve. It was also observed that the mean deceleration rate adopted by drivers entering circular elements is 0.70 m/s² and the mean acceleration rate adopted by drivers leaving the curves is 0.68 m/s². Therefore, once the deceleration transition length and acceleration transition length for each circular curve have been measured on the basis of the results of the previous work, assuming a uniform motion along the transition distance, the transition segments are identified at each circular element, and operating speed prediction models on tangents and horizontal curves are calibrated using the remainder of the collected speed values. The potential geometric elements used to study driver speed behaviour were 80 tangent segments, 40 circular elements and 70 tangent-curve-tangent transitions identified during the study of deceleration and acceleration actions.
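One possible reading of the transition-zone construction is sketched below. It assumes a constant deceleration or acceleration rate over the transition distance, so that the length follows from the two operating speeds, and then splits the length between tangent and curve using the mean shares reported above. The function names are hypothetical and the sketch is not the authors' exact procedure.

```python
def transition_length_m(v_from_kmh, v_to_kmh, rate_ms2):
    """Distance needed to change speed at a constant rate (v2^2 = v1^2 + 2*a*s)."""
    v1 = v_from_kmh / 3.6   # km/h -> m/s
    v2 = v_to_kmh / 3.6
    return abs(v2 ** 2 - v1 ** 2) / (2.0 * rate_ms2)


def split_transition(length_m, share_on_tangent):
    """Split a transition length into (part on tangent, part on circular curve)."""
    on_tangent = length_m * share_on_tangent
    return on_tangent, length_m - on_tangent


# Mean values reported in the text: 115 m deceleration length, 60% on the tangent.
tangent_part, curve_part = split_transition(115.0, 0.60)
```

With the mean rates of 0.70 m/s² and 0.68 m/s², the first function gives the length over which speed observations would be excluded from the tangent and curve datasets.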
As a result of the transitions identified at each circular curve: a) 141 sections out of a total of 297 placed on the circular curves fell within the identified transition zones (156 speed sections belonged to real circular curves without transition length), and b) 53 sections out of a total of 212 placed on the tangent segments fell within the identified transition zones (159 speed sections belonged to real tangent segments without transition length). The main goal was to calibrate and validate different operating speed prediction models for safety management: a) one model on tangent segments, b) one model on circular curves, and c) a single model to be used on both tangents and circular curves. Before moving to the calibration phase, Pearson's correlation index, defined as the ratio of the covariance of two variables to the product of their respective standard deviations, was estimated. The most significant variables for the $V_{85}$ are in bold type in Table 3.
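The correlation index used for this screening can be computed directly from its definition. The sketch below is a minimal illustration with hypothetical names; it uses population (not sample) covariance and standard deviations, which cancel in the ratio.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the std devs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return float(covariance / (x.std() * y.std()))
```

A value close to +1 or -1 indicates a strong linear association between a candidate explanatory variable and the operating speed, while a value near 0 indicates a weak one.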
The speed measurements of each of the first two datasets (a, b) were grouped into ten homogeneous substrates, while for the remaining dataset (c) sixteen substrates were defined by using a hard c-means algorithm. Clustering is basically the classification of similar objects or, in other words, the partitioning of datasets into clusters so that the data in each cluster share some common trait. The hierarchical, partitioning and mixture model methods are the three major types of clustering processes applied for organising data. The choice to apply a particular method generally depends on the type of output desired, the known performance of the method with a particular type of data, the available hardware and software facilities, and the size of the dataset (Rao, Vidyavathi 2010).
k-means or hard c-means clustering is basically a partitioning method applied to analyse data, and treats observations of the data as objects based on locations and the distance between various input data points. It partitions the objects into mutually exclusive clusters (K) in such a fashion that objects within each cluster remain as close as possible to each other but as far as possible from objects in other clusters. Each cluster is characterized by its centre point i.e. the centroid (Ghosh, Dubey 2013).
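The partitioning idea can be sketched with a plain implementation of the standard k-means iteration. This is an illustrative sketch only: the initialisation from randomly chosen points, the Euclidean distance and the stopping rule are assumptions, and it is not claimed to reproduce the software or settings used in the study. In practice the clustering variables would normally be rescaled first so that no single variable dominates the distance.

```python
import numpy as np


def hard_c_means(points, n_clusters, n_iter=100, seed=0):
    """Assign each row of `points` to exactly one of `n_clusters` clusters."""
    rng = np.random.default_rng(seed)
    x = np.asarray(points, dtype=float)
    # Start from n_clusters distinct observations chosen at random.
    centroids = x[rng.choice(len(x), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance of every point to every centroid.
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)            # hard (exclusive) assignment
        new_centroids = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break                               # assignments have stabilised
        centroids = new_centroids
    return labels, centroids
```

Each road section then belongs to a single cluster, whose centroid summarises the typical values of the clustering variables within that group.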
A cluster analysis was carried out before moving to the calibration phase of the operating speed prediction models for assembling different objects into groups in such a way that the degree of association between two objects is at its maximum when belonging to the same group, and minimum otherwise. By following this procedure, the cluster is treated as the sampling unit so the analysis can be carried out on a population of clusters. The main objective is to reduce the scattering of measures, focusing on the mean value of the cluster by increasing sampling efficiency. Many authors such as Azimi, Zhang (2010) and Prassas et al. (1996) implemented cluster analysis to examine traffic data.
After several iterations to maximize the reliability and effectiveness of the prediction models, and reflecting the results in Table 2, the variables used to cluster road sections using a hard c-means algorithm falling within tangents and circular curves, as well as sections on tangents and on circular curves, were joined to calibrate  Figure shows the process adopted to develop the operating speed models.
For each cluster, two filtering criteria (the Chauvenet criterion and the Vivatrat method) were then applied to remove anomalous data before moving on to the calibration phase of the predicted operating speed models. In the literature, a number of statistical anomaly detection approaches have been developed to identify anomalous data (Kasunic et al. 2011), including statistical control chart techniques (3-sigma outlier, moving range, SPI/CPI outlier), Grubbs, Rosner, and Dixon tests, and Tukey box plots.
A comparative approach was adopted to filter the operating speed values at each investigated road section for each defined cluster: two different criteria were applied in order to check the reliability of the techniques and, provided no large differences emerged, to identify and remove anomalous speed points. The results, as will be shown later, confirmed that only small differences exist in terms of the number and value of the removed anomalous speed values. This confirms the consistency of the filtering techniques and indicates that removing a few data points from the sample to optimize the statistical reliability of the models introduces a negligible error in the model calibration. No more than 5% of the total operating speed values were removed.
Although the number of the anomalous speed values at each cluster was small and no huge difference was observed by using two different filtering techniques, statistical tools helped to identify the best dataset during the calibration and validation phase. The coefficient of determination of the models, the significance level of the explanatory variables, the performance diagrams, and the outcomes of the validation procedure based on the residuals' analysis, as well as ANOVA, helped to identify the more suitable models from among those calibrated.
The Chauvenet criterion is based on Eq. (1): a value $v_{i,\mathrm{cluster}_j}$ is not rejected if its distance from the mean is less than $d^*$ (Foti, Gianino 1999). Table 4 shows the values $v_{i,\mathrm{cluster}_j}$ belonging to the 7th cluster of the tangent-only segments and their distance from the corresponding mean $v_{\mathrm{mean},\mathrm{cluster}_j}$.
With regard to the example shown above, the third operating speed value is anomalous and was rejected.
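The rejection rule can be illustrated with the textbook form of the Chauvenet criterion, in which a value is rejected when the expected number of observations at least as far from the mean is below one half. The sketch assumes normally distributed values within a cluster and uses the sample standard deviation; whether this matches the exact form of Eq. (1) is an assumption, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def chauvenet_threshold(n):
    """Standardised distance d* such that n * P(|Z| > d*) = 0.5."""
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))


def chauvenet_keep(values):
    """Return a list of booleans: True where the value is NOT rejected."""
    m, s = mean(values), stdev(values)
    d_star = chauvenet_threshold(len(values))
    return [abs(v - m) / s < d_star for v in values]


# Illustrative speeds (km/h) for one cluster; not data from the study.
keep = chauvenet_keep([78, 80, 79, 81, 80, 79, 82, 80, 81, 120])
```

For a cluster of ten values the threshold is close to 1.96 standard deviations, so only values far from the cluster mean are discarded.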
The Vivatrat method (Vivatrat 1978) is widely used in geotechnical engineering. The method is based on estimating a range of values that can be considered fluctuations of the 'regular' measurements, as opposed to values judged 'abnormal'. This procedure can be summarized as follows:
- divide the speed measurements for each subset (a, b, c) into homogeneous substrates (clusters) by using the explanatory variables defined above;
- order the measurements for each cluster by increasing calculated $V_{85}$ values;
- determine the mean and standard deviation of the operating speed distribution for each substrate (cluster);
- determine the 'representative dispersion' $S_r$ for each cluster, defined as the minimum value among expressions in which $S_{i-1}$, $S_i$ and $S_{i+1}$ are the standard deviations for the (i-1)-th cluster of the dataset (a, b, c), the i-th cluster and the (i+1)-th cluster, respectively;
- for each cluster, remove the measurements outside the range $\mu_i \pm A \cdot S_r$, where $\mu_i$ is the mean of the measurements belonging to cluster i, $S_r$ is the representative dispersion (standard deviation) and A is a coefficient that defines the amplitude of the semi-interval considered acceptable for the values assumed by the measurements.
The parameter A must satisfy 0.5 < A < 2.5. Table 5 shows the results from the Chauvenet and Vivatrat criteria for each cluster of the three datasets.
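The final filtering step of the Vivatrat procedure can be sketched as follows. Because the expressions defining the representative dispersion are not reproduced here, the sketch takes $S_r$ as an input supplied by the analyst; the function name and the default value of A are hypothetical.

```python
from statistics import mean


def vivatrat_keep(values, s_r, a=2.0):
    """Return True for values inside the range mean +/- a * s_r.

    s_r is the representative dispersion of the cluster, computed elsewhere;
    a must lie strictly between 0.5 and 2.5, as required by the method.
    """
    if not 0.5 < a < 2.5:
        raise ValueError("A must satisfy 0.5 < A < 2.5")
    mu = mean(values)
    half_width = a * s_r
    return [abs(v - mu) <= half_width for v in values]
```

Larger values of A widen the accepted range and therefore remove fewer measurements from each cluster.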
Moving to the calibration phase, the models were calibrated iteratively by applying the OLS method to a non-linear multiple regression, using the Gauss-Newton algorithm, which is based on a Taylor series expansion of the proposed function (TRB 2011). Table 6 shows a summary of the calibrated operating speed models. The outputs of the Chauvenet criterion were preferred due to the number of significant variables introduced, the higher value of $\rho^2$ and the lower residuals during the validation phase. Eq. (4) was selected to estimate the operating speed on tangent elements, Eq. (6) for circular curves and Eq. (11) for tangent elements + circular curves.
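The Gauss-Newton iteration can be illustrated with a small generic sketch. At each step the model is linearised around the current parameters (first-order Taylor expansion) and a linear least-squares problem is solved for the parameter update. The exponential model used in the example is purely hypothetical and is not one of the equations in Table 6.

```python
import numpy as np


def gauss_newton(model, jacobian, beta0, x, y, n_iter=50, tol=1e-10):
    """Minimise sum((y - model(beta, x))**2) by Gauss-Newton iterations."""
    beta = np.asarray(beta0, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    for _ in range(n_iter):
        residual = y - model(beta, x)
        J = jacobian(beta, x)                      # d(model)/d(beta), one row per point
        step, *_ = np.linalg.lstsq(J, residual, rcond=None)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta


# Hypothetical model form, for illustration only: V = b0 * exp(-b1 * x).
def example_model(beta, x):
    return beta[0] * np.exp(-beta[1] * x)


def example_jacobian(beta, x):
    e = np.exp(-beta[1] * x)
    return np.column_stack([e, -beta[0] * x * e])
```

The method needs a reasonable starting point; with poor initial parameters the plain iteration may fail to converge, which is why damped variants are often preferred in practice.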

Results
The three models that best fit the operating speed datasets were validated on geometric elements not included in the calibration phase by estimating three synthetic statistical parameters as shown in Table 7.
The procedure focuses on the analysis of the residuals, i.e., the difference between the operating speed predicted by the model and the operating speed actually surveyed on the same roadway segment (Hauer et al. 2004).
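A residual-based validation can be sketched as follows. The three summary measures computed here (mean error, mean absolute deviation and mean squared error) are common choices and are assumptions made for illustration; they are not necessarily the three synthetic parameters reported in Table 7.

```python
import numpy as np


def residual_summary(predicted_kmh, observed_kmh):
    """Summarise residuals defined as predicted minus observed speed."""
    residuals = (np.asarray(predicted_kmh, dtype=float)
                 - np.asarray(observed_kmh, dtype=float))
    return {
        "mean_error": float(residuals.mean()),                  # bias
        "mean_abs_deviation": float(np.abs(residuals).mean()),  # average size
        "mean_squared_error": float((residuals ** 2).mean()),   # penalises large errors
    }
```

Values closer to zero indicate that the model reproduces the speeds surveyed on the validation sites more faithfully.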
Finally, a variance analysis (ANOVA) was carried out to ascertain whether the last equation can replace the first two models for predicting operating speed on tangent elements and circular curves.
In the ANOVA, $\bar{Y}_i$ denotes the sample mean in the i-th dataset; $n_i$ is the number of observations in the i-th dataset; $\bar{Y}$ denotes the overall mean of the data; $k$ denotes the number of datasets; $Y_{ij}$ is the j-th value in the i-th of the $k$ datasets (three datasets exist, but the comparison was carried out in pairs); $n$ is the overall sample size (see the results in Table 5 after the filtering phase for each dataset).
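For reference, the symbols defined here match the usual one-way ANOVA F statistic. The formula below is that textbook form, offered as an assumption about the expression used rather than a transcription of the authors' equation:

```latex
F =
\frac{\displaystyle \sum_{i=1}^{k} n_i \left(\bar{Y}_i - \bar{Y}\right)^2 \Big/ (k-1)}
     {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_i\right)^2 \Big/ (n-k)}
```

The numerator measures the variability between the dataset means and the denominator the variability within the datasets; a small F value indicates that the datasets being compared do not differ significantly.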
The ANOVA results shown in Table 8 confirm that the last equation (Eq. (11)) predicts operating speed values comparable to those of Eq. (6) on circular curves and to those of Eq. (4) on tangents.

Conclusions
The research aims to create a procedure to identify unsafe road segments on a rural road network, as well as potential countermeasures, in order to improve the safety management process according to Directive 2008/96/EC. The $V_{85}$ profiles can be used to develop safety studies of existing two-lane rural roads.
By assessing the difference between the operating speed predicted by the models and the design speed according to Italian standards, it becomes possible to plan measures that improve roadway conditions by acting on some of the explanatory factors, thereby reducing the gap between the two speed values.
The number of possible strategies by which the phenomenon and its outcomes can be controlled, according to our results, is equal to the number of explanatory variables employed in the model. It is not possible to act on one explanatory variable to reduce hazardous conditions on the road network without also considering how this variation might affect the influence of the remaining explanatory variables on operating speed and, consequently, on the predictive model. For example, according to Eq. (11), it is possible to reduce $V_{85}$ without changing CCR and curvature by reducing the length. In the same way, it was observed that a reduction of the CCR and the curvature, with a fixed value for the length of the roadway segment, is associated with a higher $V_{85}$ value.