Abstract
Accurate estimation of percentile operating speeds on arterial roads is crucial for calibrating crash-prediction models, evaluating eligibility conditions for traffic safety countermeasures, and informing speed-management decisions. Existing percentile-speed estimation models were developed for specific regions, so their transferability to other geographical jurisdictions needs to be evaluated. To fill this gap, this study develops and validates an 85th-percentile speed (V85) model for non-freeway arterials using probe vehicles, field surveys, and roadway data. A dataset comprising sixty spot-speed surveys (forty-two urban and eighteen rural) collected in Maryland from 2019 to 2025 was matched to corresponding INRIX segment speeds, roadway geometric attributes, and traffic volumes. Ordinary-least-squares regression analyses were conducted for urban and rural settings, incorporating key variables including INRIX segment speed, posted speed limit, directional annual average daily traffic, segment length, lane width, access density, signal density, and functional-class indicators. The proposed model demonstrated high predictive accuracy for both urban and rural segments, achieving substantial error reductions compared with the baseline Texas A&M Transportation Institute model and a locally calibrated model. The findings also show that the posted speed limit is essential in rural V85 estimation and remains useful in urban contexts. The model also supports network screening for segments that may warrant speed management or safety review.
Introduction
Traffic speed is one of the most readily observed reflections of how drivers perceive and respond to their roadway environment. Among the available operating-speed metrics, the 85th percentile speed (V85) plays an important role. It is fundamental to many applications, including the setting of appropriate speed limits ( 1 ), conducting crash and safety analyses, and designing effective traffic-calming strategies. Accurate estimation of V85 enables transportation agencies to make informed decisions aimed at enhancing roadway safety, efficiency, and overall traffic management.
Traditional methods for estimating V85 typically rely on direct field measurements. Although effective, these methods are often costly, time-consuming, and logistically challenging, especially when covering large roadway segments. Thus, there is a need for alternative, efficient modeling approaches that can reliably estimate V85 using readily available data sources. Existing models, such as the Texas A&M Transportation Institute (TTI) percentile speed estimation model (11), have been developed and calibrated primarily within specific geographic contexts. Despite these developments, their effectiveness and accuracy when applied to different regions remain unclear, indicating a gap in understanding the transferability of such models—particularly in contexts with varied road characteristics and regional driving behaviors.
The objective of this study is to evaluate the applicability of an existing percentile-speed estimation model in Maryland and to develop a Maryland-calibrated V85 model for non-freeway arterials using probe vehicles, field surveys, and roadway data. The study contributes in two ways. First, it provides a practical screening approach that can estimate V85 with a limited number of local speed surveys, supporting agencies in identifying segments that may warrant speed management or safety review. Second, it examines two issues that remain important in the literature: the transferability of out-of-region percentile-speed models and the role of posted speed limit (PSL) in V85 estimation across urban and rural arterial settings. Although the model offers practical estimation capabilities, it is not intended to substitute for formal speed studies. The model cannot capture detailed speed distributions, free-flow speeds, or segment-level speed changes resulting from specific treatments or evolving roadway conditions (e.g., roadside development or parking practices).
The remainder of this paper is structured as follows: the next section reviews relevant literature on V85 estimation methods and model transferability. This is followed by a description of the research area and data collection. The methodology section details the data processing and model development approach. Results and key implications are then presented, along with an analysis of the model’s performance. The paper concludes with key findings, practical recommendations, and directions for future research.
Literature Review
Role of the 85th Percentile Speed
The 85th percentile speed (V85) is the speed at or below which 85% of vehicles travel under free-flow conditions. It is widely used by transportation engineers to represent a reasonable and safe operating speed on a given road ( 2 ). Common applications include speed limit setting, crash and safety analysis, and traffic-calming or speed-management strategies. In the United States, the Manual on Uniform Traffic Control Devices (MUTCD) recommends that PSL be set within 5 mph of the measured V85 ( 3 ). Minor downward adjustments are allowed to account for contextual factors such as a high crash history, roadway geometry (e.g., curves), development density, and the presence of pedestrians. In safety analysis, V85 is frequently used to characterize the speed environment. For example, the Highway Safety Manual (HSM) includes procedures that adjust crash predictions based on deviations from expected V85 ( 4 ). In traffic calming and speed management, measures such as vertical deflections (e.g., speed humps, speed tables, and raised crosswalks) often require V85 as an input to determine appropriate design and effectiveness ( 5 ).
Models for Estimating 85th Percentile Speed
Models for estimating V85 are categorized into statistical models and artificial intelligence (AI)/machine learning models. These models aim to represent how drivers respond to their surrounding environment. The objective is to capture key contextual factors—such as intersection frequency, roadside development, and land use—that influence driver speed choice.
Statistical models often use variables related to road geometry, roadside environment, traffic characteristics, and occasionally surface and weather conditions. Common predictors include curve radius, curvature, grade, lane width, roadside object density, driveway density, adjacent land use, annual average daily traffic (AADT), and PSL. Early models focused on rural highways and used linear or polynomial regression. For example, Morrall and Talarico ( 6 ) and Islam and Seneviratne ( 7 ) modeled V 85 as a function of horizontal curve geometry. Urban and suburban models use multiple linear regression and include variables such as the number of lanes, presence of a median, lane and shoulder width, roadside characteristics, intersection density, and land use. Several studies have examined the relationship between roadway characteristics and operating speeds, with differing views on the inclusion of the PSL in prediction models. Wang et al. used a mixed-effects regression model and found that lane count increased operating speed, whereas roadside object density, driveway density, intersection density, the presence of sidewalks, and on-street parking reduced it ( 8 ). Speeds were also higher in commercial and residential zones compared with park or office areas. Although PSL is often correlated with the V 85 , they excluded it from their model because of its strong correlation with geometric design variables, which could introduce endogeneity and reduce the interpretability of other covariates. In contrast, Himes et al. evaluated whether PSL should be included in speed models using ordinary least squares (OLS) regression and simultaneous equations ( 9 ). They demonstrated that excluding PSL leads to omitted variable bias and overestimation of other variables’ effects, particularly geometric factors. Their results showed that PSL significantly influences both mean speed and speed variance. 
Additional testing indicated that PSL can be treated as an exogenous variable, supporting its inclusion in operating speed models without introducing endogeneity concerns. Similarly, the NCHRP 15-18 study found PSL to be the only statistically significant predictor of operating speed among several roadway variables ( 10 ).
More recently, Fitzpatrick et al. developed regression models using INRIX probe data to estimate 85th percentile speed and average speed ( 11 ). The models account for roadway type (freeway versus non-freeway) and setting (urban versus rural), incorporating variables such as INRIX average speed, segment length, number of lanes, lane width, AADT, truck percentage, directional factor, access and signal density, and functional classification. The models demonstrated strong explanatory power, with adjusted R2 values ranging from 0.72 to 0.88. Lan and Zhao studied methods for selecting an appropriate reference speed to calculate highway performance measures using vehicle probe data ( 12 ). The study focused on both freeways and arterials. For freeways, V85 during low-volume periods was identified as the most consistent and reliable reference. For arterials, however, reference speed estimation remained challenging because of inconsistent speed patterns caused by signal control, access points, and variable traffic conditions. Probe data availability was also lower and less reliable, especially during nighttime. The research team recommended using V85 for freeway performance evaluation and emphasized the need for further research on arterial methodologies.
AI and machine learning models, particularly artificial neural networks (ANNs), have been applied to capture non-linear relationships among input variables. Singh et al. developed an ANN model for two-lane rural highways in Oklahoma using inputs such as lane and shoulder width, traffic volume, PSL, skid resistance, roughness, and crash statistics ( 13 ). Models that included PSL and geometric features achieved higher accuracy. Their results showed that wider lanes and higher PSL increased V85, whereas higher traffic volume, crash rates, and poor pavement conditions reduced it. Semeida found that ANN models reduced V85 prediction errors for both passenger cars and trucks compared with regression models ( 14 ).
Several contextual factors should be considered when applying or developing V 85 models. A key distinction exists between urban and rural environments. Urban roadways typically include more frequent intersections, higher access density, pedestrian activity, and traffic controls, which tend to lower vehicle speeds. In contrast, rural roads often feature uninterrupted flow, wider lanes, and fewer roadside conflicts, leading to higher speeds. Even within the same environment type, functional road classification plays a critical role in shaping speed distributions. For example, a collector street—which generally has lower design standards, frequent stop control, and residential access—tends to exhibit lower V 85 than a major arterial with coordinated traffic signals and limited access. Similarly, models developed for multilane highways or freeways consider different variables such as interchange spacing, terrain, and vehicle mix, compared with models for two-lane rural roads, where combined horizontal and vertical alignment may be more influential. Vehicle composition also varies across functional classes, with rural highways and freeways typically carrying a higher proportion of heavy vehicles, affecting the speed profile. In addition, factors such as driver behavior, population type (e.g., areas near retirement communities or university campuses), level of enforcement (e.g., presence of police or automated cameras), availability of off-street parking, and environmental conditions (e.g., lighting and weather) are difficult to quantify but can significantly affect speed behavior. Although these factors may affect how drivers choose their speeds, they are often excluded because of data limitations. Given these complexities, model recalibration or appropriate adjustments are needed to keep predictions reliable under different road and operational conditions.
In summary, statistical models have evolved from simple regressions to advanced methods such as mixed-effects models, while AI-based models such as ANNs offer improved accuracy by capturing non-linear interactions. As V85 predictions can vary by environment and road type, key contextual factors (such as urban versus rural settings, functional classification, and enforcement or driver characteristics) should be considered, and models should be adjusted accordingly.
Model Transferability and Local Calibration
Cross-regional transfer of traffic prediction models is an alternative for regions with limited data resources. By transferring models trained in data-rich areas, agencies can save on both data collection and development costs. However, simple transfers without adaptation often result in large prediction errors because of differences in road capacity, traffic demand, and other contextual factors ( 15 ). To address this, transferred models must be adjusted to reflect the target region’s characteristics.
A common approach to evaluating model transferability involves a combination of statistical tests and comparative performance assessments. For example, researchers may test whether key model parameters estimated in the source region remain statistically valid in the target context using chi-square or t-tests. Another standard criterion is predictive accuracy—specifically, how closely the transferred model’s output aligns with observed traffic data in the new region. This is often benchmarked against a model trained directly with local data. Beyond goodness-of-fit, some frameworks also assess policy sensitivity—whether the transferred model responds realistically to policy or operational changes such as speed limit modifications or shifts in travel demand. This helps determine whether the transferred model can approximate the behavioral dynamics of a locally developed model under changing conditions ( 16 ). Through such evaluations, analysts can decide whether a model is ready for direct deployment or requires redevelopment.
Local calibration, even when applied minimally, has been shown to enhance the performance of pre-trained speed prediction models. By integrating region-specific data—such as driver behavior patterns, road geometries, or fleet compositions—models can more accurately reflect real-world conditions on both highways and arterial networks. An example is the crash prediction model in the HSM, whose performance significantly improves when calibrated with local crash records. However, local calibration also carries certain risks. Excessive tuning to local data may lead to overfitting, reducing the model’s generalizability and causing accuracy to degrade under different conditions. Moreover, when local data are scarce or of poor quality, the calibration process becomes challenging. It may require additional optimization procedures, trial-and-error, or the use of advanced algorithms. Consequently, a practical strategy is to start with a validated external model and apply limited, targeted local calibration. In many real-world scenarios, partial calibration can yield substantial improvements in predictive performance without the resource burden of full model redevelopment.
Research Area
The study focuses on the State of Maryland, concentrating on its state and U.S. numbered highways, which together form the region’s highway network. Within this network, the study focuses on arterial roads because of the following key considerations:
Larger discrepancy between posted and actual speeds: Arterial roads tend to show more variation between PSL and observed V85, making them more suitable targets for predictive modeling.
Higher complexity in roadway environments: Arterials involve intersections, traffic signals, variable lane configurations, and typically lack full access control or median separation. These factors create more complex speed dynamics compared with freeways.
Stronger relevance to policy interventions: Speed-management strategies, such as speed limit modifications and enforcement, are more frequently applied to arterials, given their direct interaction with local traffic and pedestrian environments.
More careful interpretation of probe data required on arterials: Unlike freeways, arterials involve various factors, such as signals and unsignalized access points, that complicate the direct use of probe data for estimating free-flow speeds or V85.
Figure 1 illustrates the roadway network in Maryland and the locations of the spot speed surveys. The map displays sixty survey sites (forty-two urban and eighteen rural) and twelve model testing sites (eight urban and four rural). Together, these locations represent the arterial segments used to develop and test the speed percentile forecasting model.

Figure 1. Maryland roadway network and locations of all speed survey sites.
Methodology
The methodology framework is shown in Figure 2. Data are collected from multiple sources, including speed study reports, probe vehicle data, road geometry, and traffic volumes. Key variables include V85, segment length, lane width, curb presence, signal and driveway density, PSL, and traffic factors such as AADT, D-factor, and K-factor.

Figure 2. Overview of the research methodology.
After data processing, the selected variables are used to build a regression-based estimation model that predicts spot V85. The model is evaluated through performance analysis and statistical tests, including normality tests and outlier analysis, to ensure reliability and identify segments where the model performs poorly.
The final model is applied for segment screening by comparing estimated speeds with PSLs. This helps identify road segments that may require attention for speed management or safety review, supporting data-driven decision-making in transportation planning and operations.
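The screening step described above can be sketched in a few lines. This is a minimal illustration rather than the study's procedure: the `Segment` fields, the example values, and the 5 mph threshold are all assumptions introduced here (the threshold loosely echoes the MUTCD "within 5 mph" guidance, but the study does not state the value it uses).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str            # hypothetical segment label
    psl_mph: float       # posted speed limit
    v85_est_mph: float   # model-estimated 85th-percentile speed


def screen_segments(segments, threshold_mph=5.0):
    """Return (name, excess) pairs where estimated V85 exceeds PSL by more
    than threshold_mph, sorted with the largest excess first."""
    flagged = [
        (s.name, s.v85_est_mph - s.psl_mph)
        for s in segments
        if s.v85_est_mph - s.psl_mph > threshold_mph
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up values for illustration only; not taken from the study.
segments = [
    Segment("Segment A", psl_mph=40, v85_est_mph=47.5),
    Segment("Segment B", psl_mph=30, v85_est_mph=33.0),
    Segment("Segment C", psl_mph=50, v85_est_mph=58.0),
]
flagged = screen_segments(segments)
for name, excess in flagged:
    print(f"{name}: estimated V85 exceeds PSL by {excess:.1f} mph")
```

Ranking by excess speed is one possible prioritization; an agency could equally rank by crash history or exposure once segments are flagged.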
The proposed framework clarifies the model structure by specifying the variable combinations, functional forms, and separate regression equations for urban and rural arterials. It provides selected parameter estimates from the literature, but also highlights the need to adjust these coefficients through local calibration when possible, so that local driving behavior and roadway conditions are reflected. These components together make the framework transferable but flexible, supporting practitioners who rely on probe data and limited spot-speed surveys for estimating operating speeds across diverse arterial conditions.
Data Collection and Processing
This study aims to develop a forecasting model for percentile speeds. To this end, four datasets were utilized: (1) Maryland speed study reports, (2) probe data, (3) road geometry data, and (4) traffic volume.
Maryland Speed Study Reports
Ground-truth spot V85 values were obtained from Maryland Department of Transportation speed-study reports. Sixty survey sites distributed across both freeways and non-freeway corridors were sampled during campaigns carried out between 2019 and 2025, providing location-specific observations of V85 and the posted speed limit against which model estimates can be compared.
Probe Data
Probe-based explanatory variables, sourced from INRIX Probe Data Analytics, include average speed, reference speed, segment length, functional class, and timestamps for each road segment across the state. Although continuous observations are available from January 1, 2019, to April 1, 2025, only data corresponding to the year of each speed study were extracted for analysis. One-minute records were aggregated into hourly intervals to align with the spot speed-survey data.
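The aggregation of one-minute records into hourly intervals can be sketched as follows. The record layout (ISO-8601 timestamp plus speed) and the use of a simple arithmetic mean are assumptions made for illustration; the study does not specify the aggregation statistic.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_hourly(records):
    """Average one-minute speeds into hourly bins.

    records: iterable of (ISO-8601 timestamp string, speed in mph).
    Returns a dict mapping the start of each hour (datetime) to the mean speed.
    """
    bins = defaultdict(list)
    for timestamp, speed in records:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        bins[hour].append(speed)
    return {hour: mean(speeds) for hour, speeds in sorted(bins.items())}


# Hypothetical one-minute records for a single segment.
records = [
    ("2021-05-03T08:00:00", 40.0),
    ("2021-05-03T08:01:00", 42.0),
    ("2021-05-03T09:00:00", 38.0),
]
hourly = aggregate_hourly(records)
```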
Road Geometry Data
Road geometry can also influence percentile speeds. If a road segment has numerous traffic signals or other interruptions, its segment-level (space) speed is reduced and can fall significantly below the percentile spot speed. In this study, road geometry data, including lane width, the number of curbs, signal density, and driveway density, were collected from Google Maps for the corresponding time periods.
Traffic Volume Data
Traffic volume data were obtained from publicly available state-level AADT databases, organized by functional classification. The dataset includes the K-factor, the proportion of daily traffic occurring during the peak hour, and the D-factor, which indicates the directional distribution of peak-hour traffic. Additionally, the number of lanes was recorded for each segment.
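As a worked illustration of how these factors combine, the conventional directional peak-hour volume is AADT × K × D, which can then be divided by the number of lanes in that direction. The function names and input values below are hypothetical, and whether the study derives its directional per-lane volume variable in exactly this way is not stated in the text.

```python
def directional_peak_hour_volume(aadt, k_factor, d_factor):
    """Peak-hour volume in the peak direction (veh/h): AADT x K x D."""
    return aadt * k_factor * d_factor


def volume_per_lane(volume, lanes_in_direction):
    """Spread a directional volume evenly over the lanes in that direction."""
    return volume / lanes_in_direction


# Hypothetical segment: 20,000 veh/day, K = 0.10, D = 0.60, two lanes per direction.
ddhv = directional_peak_hour_volume(20000, 0.10, 0.60)
ddhv_per_lane = volume_per_lane(ddhv, 2)
print(f"{ddhv:.0f} veh/h in the peak direction, {ddhv_per_lane:.0f} veh/h/lane")
```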
In this section, influential factors associated with the percentile speed estimation are extracted from existing datasets.
where
For urban roads, the road types are categorized into urban principal arterial, urban minor arterial, urban major collector, urban minor collector, and urban local roads.
Regression Model
To estimate the V85 on roadway segments, we employed an OLS regression model. The dependent variable is the observed spot V85 from field surveys, and the independent variables include traffic volume, roadway geometry, and functional classification attributes derived from INRIX segment data and other sources.
The general model is written as:
With the Maryland data listed in the previous section, the V85 model for rural and urban non-freeway segments is proposed using two Maryland spot-speed datasets. The rural dataset comprises surveys from eighteen two-lane and multilane rural highway segments, including minor arterials and major collectors. The urban dataset comprises surveys from forty-two urban highway segments, including two-lane and multilane principal arterials, minor arterials, and major and minor collectors. Each survey site corresponds to a homogeneous roadway segment where geometry, traffic control, and roadside conditions remain approximately constant within the influence area of the speed-measurement station. At each site, field observations were used to obtain the spot 85th-percentile speed, which served as the dependent variable for model development.
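For readers who want to see the estimation step concretely, the sketch below fits an OLS model by solving the normal equations with plain Python. It is an illustration only: the two predictors (probe speed and PSL), the synthetic observations, and the coefficients used to generate them are invented here and are not the study's data or estimates.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] minimising the sum of squared errors."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic sites: (probe speed, PSL) in mph. V85 is generated from an
# invented relationship, 8 + 0.5 * probe + 0.6 * PSL, purely for illustration.
rows = [(38, 40), (42, 40), (45, 50), (50, 50), (33, 30), (36, 35)]
y = [8 + 0.5 * probe + 0.6 * psl for probe, psl in rows]
coefficients = fit_ols(rows, y)
```

In practice a statistics package would also report standard errors and significance levels; this sketch recovers only the point estimates.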
Rural Maryland Model
The initial model specification included all available variables that were theoretically relevant and consistently observed across the rural sites in the previous section. To assess multicollinearity, we first examined the Pearson correlation matrix (Figure 3) among continuous variables (excluding dummy indicators) and then computed variance inflation factors (VIFs) for all predictors, retaining K−1 dummies for each categorical variable to avoid the dummy-variable trap ( 17 ). Following common practice, VIF values above ten were treated as evidence of severe multicollinearity

Figure 3. Correlation matrix for rural Maryland model: (a) correlation matrix with all rural variables and (b) correlation matrix without driveway density.
In the rural dataset, driveway density
Table 1. Variance Inflation Factors for the Rural Maryland Model
Note: NA = not available.
In summary, based on Pearson correlation and VIF analysis for the rural Maryland model, the final rural model retained segment length
Urban Maryland Model
The initial urban model specification included all variables that were theoretically relevant and consistently observed across the urban sites. As in the rural case, we first examined the Pearson correlation matrix (Figure 4) among continuous variables excluding dummy indicators and then computed VIFs for all predictors, retaining K−1 dummies for each categorical variable to avoid the dummy variable trap. VIF values above ten were treated as indicative of severe multicollinearity.
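The VIF screening used for both the rural and urban specifications can be illustrated with a short sketch. It relies on the standard identity that the VIFs of a set of predictors are the diagonal elements of the inverse of their correlation matrix. The predictor names and values below are hypothetical, and the code is not the authors' implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """columns: dict of predictor name -> values. Returns name -> VIF."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical predictor values for five sites.
columns = {
    "signal_density": [1, 2, 3, 4, 5],
    "access_density": [2, 1, 4, 3, 5],
}
result = vifs(columns)
severe = [name for name, value in result.items() if value > 10]  # threshold from the text
```

With only two predictors the VIF reduces to 1 / (1 − r²), so a pairwise correlation of 0.8 gives a VIF of about 2.8, well below the threshold of ten.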

Figure 4. Correlation matrix for urban Maryland model: (a) correlation matrix with all urban variables and (b) correlation matrix with speed difference.
In the initial urban specification, directional AADT per lane
Table 2. Variance Inflation Factors for the Urban Maryland Model
Note: NA = not available.
In summary, based on the Pearson correlation and VIF analysis for the urban Maryland model, the final urban specification retains segment length
Results
In this section, the model’s performance is first evaluated by comparing it with the estimation model proposed by TTI and a TTI model calibrated using Maryland data. Next, a normality test and outlier analysis are conducted to identify scenarios in which the model performs less effectively. Finally, a case study demonstrates how the model can be applied in practice, using an example to screen other segments.
Performance Analysis
To evaluate the performance of the speed estimation model proposed in this study, we compare it against the percentile speed estimation model developed by TTI, including both the original TTI model and a recalibrated version fitted with Maryland-specific data. The original TTI model (TTI) serves as the baseline, applying parameter estimates derived from the original study. The calibrated Maryland model (Cali_TTI) is a modified version of the TTI model, refitted using Maryland-specific data while excluding speed limit as an explanatory variable.
Table 3 summarizes the estimated coefficients for three model specifications. Each specification is estimated separately for urban (N = 42) and rural (N = 18) arterial segments.
Table 3. Estimated Coefficients and Confidence Levels by Area Type for V85 Models
Note: Cali_TTI = calibrated Maryland model; TTI = Texas A&M Transportation Institute; NA = not available.
***p ≤ 0.01; **p ≤ 0.05; *p ≤ 0.10.
From Table 4, in urban areas, all models achieve relatively high R2 values, indicating good explanatory power. Although the TTI model achieves a higher R2 than the Maryland model, its mean absolute percentage error (MAPE) is much larger; the TTI model therefore tracks the overall speed pattern well but systematically over- or underestimates speeds, which leads to greater prediction error. In rural areas, the Cali_TTI model performed poorly because, without the speed limit, it could not capture actual operating speeds. The strong correlation (0.71) between PSL and actual V85 indicates that the speed limit is the most important predictor. This is confirmed by the model results: excluding PSL reduced the R2 to 0.014, whereas including it improved the R2 to 0.751. Unlike the TTI context, where geometric variables or stronger INRIX data effectively captured operating speeds, the lower INRIX correlation (0.42) in Maryland necessitates PSL as a calibration tool. Probe data therefore remain valuable but require PSL as a correction factor for reliable rural estimation.
Table 4. R2 and Mean Absolute Percentage Error (MAPE) of Urban and Rural V85 Estimation Models
Note: Cali_TTI = calibrated Maryland model; TTI = Texas A&M Transportation Institute.
With regard to prediction error, the Maryland model clearly outperforms the alternatives in both settings. In urban segments, the Maryland model yields the smallest MAPE (4.7%), with the Cali_TTI model slightly higher (5.0%) and the original TTI model much less accurate (18.7%). In rural segments, the Maryland model again produces the lowest MAPE (2.5%), compared with 7.9% for the TTI model and 4.4% for the Cali_TTI model. It is worth noting that although the rural Cali_TTI model shows a low MAPE (4.4%), this should be interpreted with caution. Its negligible R2 suggests the model fails to explain the variance in operating speeds, implying that the low error rate is merely an artifact of the data distribution rather than a sign of predictive reliability. Thus, when PSL is omitted, prediction errors increase in both rural and urban applications, and the rural Maryland case in particular shows that a model calibrated with PSL provides substantially more reliable estimates of spot 85th-percentile speed.
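The caution about reading a low MAPE alongside a negligible R2 can be made concrete with a small sketch. The observations below are invented: when observed speeds span a narrow range, a model that simply predicts a constant value scores a low MAPE while explaining none of the variance. R2 is computed here as 1 − SSE/SST, which is an assumption about the definition used.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical rural sites with a narrow spread of observed V85 (mph).
observed = [50, 52, 48, 51, 49]
# A "model" that ignores every predictor and always returns 50 mph.
predicted = [50] * len(observed)

error_pct = mape(observed, predicted)
fit = r_squared(observed, predicted)
print(f"MAPE = {error_pct:.1f}%, R2 = {fit:.3f}")
```

The constant predictor has a MAPE under 3% yet an R2 of zero, which is why the two measures need to be read together.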
Taken together, these results suggest that the posted speed limit is an important predictor in both urban and rural contexts, especially when only a small number of local calibration sites are available. In urban segments, congestion, signal density, and geometric design undoubtedly shape operating speeds, yet the Maryland model that includes PSL still achieves the lowest MAPE. This indicates that even where traffic control and geometry play a dominant role, PSL provides additional explanatory power and helps stabilize the model under limited-sample conditions. In rural segments, the effect of PSL is even more pronounced: omitting it leads to substantially larger prediction errors, whereas the Maryland model with PSL captures observed V85 much more accurately. This pattern is consistent with the idea that rural PSLs tend to align closely with prevailing driver behavior under relatively simple and unconstrained operating conditions.
In addition, the statistical relationship we observe between PSL and V85 reflects the type of connection addressed in the MUTCD guidance, which states: “On a freeway, expressway, or rural highway (outside urbanized locations or conditions), the speed limit that is posted within a speed zone should be within 5 mph of the 85th-percentile speed of free-flowing motor-vehicle traffic under the following conditions.” This reflects a widely adopted practice in which PSLs are often adjusted based on measured operating speeds. Although our model does not determine a causal direction between PSL and V85, it reveals a strong statistical relationship. As V85 increases, there may be upward pressure to raise PSL; similarly, higher PSL segments tend to show higher V85. This bidirectional relationship suggests that speed limits not only influence driver behavior but may also reflect it.
In brief, although PSL was excluded from the previous model, it serves as a useful variable in both urban and rural Maryland. Incorporating PSL improves predictive reliability and helps bridge the gap between probe-based speed indicators and field-measured spot speeds. This finding aligns with the literature ( 8 ): in urban areas, PSL is strongly shaped by geometric and operational constraints but still carries residual information about drivers’ speed choices, whereas in rural areas, where alignment and control are simpler, PSL can reasonably be treated as an independent term that directly captures the intended operating regime.
Normality Test and Outlier Analysis
To evaluate the suitability of the regression model and verify that key assumptions hold, a normality assessment was conducted on the prediction errors. Figure 5 shows that the residuals are approximately normally distributed, with a symmetric, bell-shaped pattern centered around zero and no evident skewness or extreme outliers. This visual observation is supported by the results of three established normality tests. The Shapiro–Wilk test yielded a test statistic of 0.981 and a p-value of 0.360, so the null hypothesis of normality cannot be rejected. The D’Agostino–Pearson omnibus test produced a p-value of 0.204, further supporting the assumption of normality. The Anderson–Darling test returned a test statistic of 0.486, which is below all critical values (minimum = 0.548), indicating no significant deviation from a normal distribution.

Figure 5. Prediction error distribution.
In addition to the normality check, an outlier analysis was conducted to identify conditions where the model may underperform and to understand sources of large prediction errors. Using the standard deviation of prediction errors (σ = 2.51 mph) and a mean residual of 0.20 mph, a 95% interval for individual prediction errors was defined as [−4.72, 5.11] mph. Prediction errors lying outside this range were flagged as outliers for further investigation.
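The reported bounds are consistent, up to rounding of the reported inputs, with the usual normal-theory range of mean ± 1.96σ. The sketch below assumes that multiplier (it is inferred from the numbers, not stated in the text) and flags a hypothetical list of prediction errors.

```python
def error_bounds(mean_error, sd_error, z=1.96):
    """Lower and upper bounds of the mean +/- z * sd range (z = 1.96 assumed)."""
    half_width = z * sd_error
    return mean_error - half_width, mean_error + half_width


def flag_outliers(errors, lower, upper):
    """True for each prediction error that falls outside [lower, upper]."""
    return [e < lower or e > upper for e in errors]


# Mean residual and standard deviation as reported in the text (mph).
lower, upper = error_bounds(0.20, 2.51)

# Hypothetical prediction errors (mph); not the study's residuals.
errors = [-5.0, 0.0, 4.5, 5.2]
flags = flag_outliers(errors, lower, upper)
```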
Six segments illustrating a range of deviations between predicted and actual V85 values were selected to examine contexts associated with both high and minimal prediction errors (Table 5):
Comparison of Prediction Accuracy Across Selected Segments
Segments with minimal prediction errors, such as MD 14 at Cloverdale and US 1 (north of Dr. Patel Drive), represent typical arterial environments where the model effectively captures V85. The rural segment (MD 14) features linear geometry with negligible roadside interference, resulting in a predictable driving environment where behavior is heavily dependent on Vp and roadway class. Similarly, the urban segment (US 1) exhibits the characteristics of a straight suburban arterial. Despite the presence of commercial access points and townhouse entrances, the segment maintains near free-flow conditions, as the disturbances from access points are insufficient to significantly disrupt traffic momentum. In both instances, the absence of unexpected geometric constraints ensures that Vp serves as the primary determinant, allowing the model to accurately estimate V85.
The model underestimates V85 at MD 12 at Coolspring United Methodist Church and MD 586 at Bushey Drive, with prediction gaps of 2.6 mph and 4.5 mph, respectively. For the rural segment (MD 12), the underestimation is likely driven by a directional disparity in Vp. With opposing traffic streams subject to different regulatory limits (30 mph versus 40 mph), the environment lacks uniformity, potentially prompting drivers to disregard the lower limit in favor of a higher V85 consistent with the faster direction. Meanwhile, the urban segment (MD 586) represents a high-density residential corridor with frequent signalized intersections (DSig = 4). This suggests that the model tends to underestimate speeds in sections with aggressive driving behavior. Although geometric variables such as DSig imply lower speeds, actual driving patterns in this corridor consistently exceed the speed reductions expected from the signal density, resulting in observed speeds that are higher than model predictions.
Conversely, the model overestimated speeds for MD 363 in Dames Quarter. This rural major collector segment represents a transition zone, with the PSL decreasing from 50 mph to 40 mph. It appears the model does not fully account for the influence of transitional speed limits, which often cause drivers to slow down in anticipation of the upcoming lower limit. Similarly, an overestimation occurred along MD 450 at the Cornerstone Church Drive segment. Although the segment is designated as an urban major principal arterial, the actual V85 was lower than the estimated speed. For this segment, the PSL increases from 30 mph to 45 mph along the eastbound direction, so drivers leaving the lower-speed zone may not yet have accelerated to the higher limit, resulting in an observed V85 that is lower than the model predicts.
Overall, this analysis illustrates that the accuracy of V85 prediction is not solely dependent on physical roadway attributes but is also influenced by nuanced driver behavior and context-specific factors. These findings suggest that careful calibration and clearer subclassification of variables (e.g., differentiating high- and low-impact driveways) are needed when generalized models are applied to specific sites.
Case Study: Application of the Model for Segment Screening
This case study demonstrates the utility of the proposed model as a planning and screening tool for identifying road segments with substantial discrepancies between PSLs and predicted V85 (Figure 6). The model can be useful in contexts where spot speed studies are not available, supporting data-driven decisions such as setting or modifying speed limits, calibrating crash-prediction models, and determining eligibility conditions for traffic safety countermeasures.

Difference between speed limits and estimated V85.
For demonstration purposes, the study defines extreme cases as those where the predicted V85 exceeds the PSL by more than 12 mph, or where the predicted V85 falls below the speed limit. A total of two cases are identified below:
MD 97 @ Old Hanover Rd (V85 is 15.6 mph Higher than the Speed Limit)
This segment lies within a rural speed transition zone where the posted speed limit drops from 45 mph to 35 mph. Drivers approaching from the upstream 45 mph section tend to maintain their higher speed as they enter the lower-speed zone, so most vehicles substantially exceed the 35 mph limit. The site is located on a two-lane, two-way rural highway with noticeable rolling grades, and the combination of downhill approaches and limited vertical sight distance likely contributes to the observed high operating speeds. A high V85 in this context may pose a safety concern for both through traffic and turning vehicles. Potential countermeasures include extending the speed reduction over a longer distance, enhancing advance warning and speed limit signage, and considering automated speed enforcement.
MD 30BU @ Trenton Mill Rd (V85 is 1.3 mph Lower than the Speed Limit)
This segment is located on an urban arterial where the upstream section is posted at 30 mph and includes a signalized railroad crossing. Drivers approaching from the upstream side tend to be cautious because of the rail crossing and associated traffic control, often decelerating in advance and proceeding carefully through the intersection. As a result, when vehicles enter the downstream segment, which is posted at 50 mph, their speeds may still be recovering from the upstream control and have not yet increased to match the higher posted speed limit. In this case, the observed V85 being slightly below the 50 mph PSL does not immediately indicate a safety concern; rather, it reflects prudent driver behavior in response to upstream constraints. Nonetheless, it may suggest an opportunity to review the coordination of signals, signing, and speed zoning along the corridor to ensure that speed expectations are clear and that operating speeds transition smoothly between adjacent segments.
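The screening rule used in this case study can be sketched as a small Python function. This is only an illustration of the stated thresholds (predicted V85 more than 12 mph above the PSL, or below the PSL). The function name and category labels are hypothetical, the V85 values for the two named segments are derived from the differences and posted limits stated above, and the third segment is invented for contrast.

```python
def screen_segment(psl_mph, predicted_v85_mph, exceed_threshold_mph=12.0):
    """Classify a segment by the gap between predicted V85 and the PSL."""
    diff = predicted_v85_mph - psl_mph
    if diff > exceed_threshold_mph:
        return "v85_far_above_psl"
    if diff < 0:
        return "v85_below_psl"
    return "not_flagged"


# (name, PSL in mph, predicted V85 in mph)
segments = [
    ("MD 97 @ Old Hanover Rd", 35.0, 50.6),     # 15.6 mph above a 35 mph PSL
    ("MD 30BU @ Trenton Mill Rd", 50.0, 48.7),  # 1.3 mph below a 50 mph PSL
    ("hypothetical segment", 40.0, 46.0),       # invented, for contrast
]

flags = {name: screen_segment(psl, v85) for name, psl, v85 in segments}
for name, flag in flags.items():
    print(f"{name}: {flag}")
```

Keeping the threshold as a parameter lets an agency tighten or relax the screening criterion without changing the classification logic.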
Conclusions
This study developed a Maryland-calibrated regression model for estimating 85th-percentile speed (V85) on non-freeway arterials and benchmarked it against both the original and locally calibrated TTI models. By incorporating PSL and a directional AADT term, the proposed model lowered the mean absolute error to roughly 2 mph, about 60% below the uncalibrated TTI model and 25% below the Maryland-tuned model, while relying on only 60 spot-speed surveys. Outlier diagnostics further revealed that the largest residuals arise in speed-transition zones, segments with sparse driveway counts but active roadside development, and sites affected by probe-data dropouts.
This study confirms that direct model transfer is not always feasible and helps resolve the ambiguity about the role of PSL in operating-speed models. Although geometric features effectively substituted for PSL in the original Texas context, they did not provide comparable predictive power in rural Maryland. These results indicate that transferability requires context-aware local calibration, including treating PSL as an essential predictor in rural environments.
The findings have practical implications for transportation agencies. First, agencies can deploy the model as a cost-effective screening tool, using readily available probe data and roadway attributes to prioritize locations for detailed study. Second, the quantified influence of PSL provides a transparent criterion for deciding when its inclusion is essential, for example, on homogeneous rural corridors, and when a leaner specification may suffice on dense urban non-freeways. Third, the model’s diagnostic capability directs practitioners toward specific countermeasures such as improved transition-zone signing, driveway-access management, or enforcement in lanes with marked directional imbalance.
This study has several limitations. The analysis is confined to non-freeway arterials because concurrent probe-speed and spot-survey data were unavailable for limited-access facilities; extending the approach to freeways remains a priority. Although rigorous filtering was applied, residual noise and gaps in the probe dataset could bias coefficient estimates. In addition, the model treats all access points equally and does not distinguish between high- and low-impact driveways, which may bias its estimates. Weighted or land-use-based driveway classifications could improve accuracy.
Future work should therefore pursue the following directions. First, additional spot-speed data on freeways would permit testing the model’s applicability across the full roadway hierarchy. Second, systematic experiments are needed to formalize when PSL improves prediction accuracy and when it can be safely omitted, balancing parsimony and performance. Third, integrating land-use data may refine driveway and access metrics, potentially eliminating the rural outliers identified here. Fourth, as larger datasets become available, exploring hybrid approaches that pair interpretable regression with non-linear methods (e.g., machine-learning residual correctors) could further improve prediction accuracy while maintaining transparency. Finally, as comparable spot-speed and probe datasets from other states become available, the proposed framework could be evaluated through cross-jurisdictional validation (e.g., training on Maryland and testing on another state, and vice versa) to assess which components are transferable and which require region-specific recalibration.
Footnotes
Authors’ Note
The authors acknowledge that a large language model, ChatGPT, was used and only used in improving the language when preparing the manuscript. The authors acknowledge the limitations of language models, and the accuracy, validity, and appropriateness of written language have been rigorously verified by the authors. The manuscript was prepared with assistance from the ChatGPT o3 model for code development and manuscript language improvement.
Author Contributions
The authors confirm contributions to the paper as follows: study conception and design: Y. Zhang and Y. Choi; data collection: Y. Zhang and Y. Choi; analysis and interpretation of results: Y. Zhang and Y. Choi; draft manuscript preparation: Y. Zhang, Y. Choi, and X. Yang. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Xianfeng (Terry) Yang is a member of Transportation Research Record’s Editorial Board. All other authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Data Accessibility Statement
The data that support the findings of this study are available from the corresponding author on reasonable request, subject to applicable restrictions.
