Abstract
Urban traffic congestion increases pollution, fuel consumption, and travel time, highlighting the limitations of traditional signal systems that rely on outdated schedules and insufficient sensors. This study proposes the Dynamic Traffic Adaptive Signal Optimization (DTASO) method, which leverages multi-source data fusion to dynamically adjust signal timings based on real-time traffic data from cameras, sensors, and historical trends. Powered by Adaptive Decentralized Deep Reinforcement Learning (ADDRL), DTASO improves traffic flow and system performance by continuously learning and adapting to changing traffic patterns. Experimental results demonstrate that DTASO outperforms conventional signal optimization approaches in reducing congestion and enhancing traffic efficiency, offering a scalable and intelligent solution for urban traffic management.
Keywords
Introduction
With the acceleration of urbanization, traffic congestion1,2 has become a key issue affecting residents’ travel efficiency and sustainable urban development. Traditional traffic signal systems3,4 rely on fixed timing and simple sensors and struggle to cope with complex, changing traffic flow patterns, resulting in low traffic efficiency, longer delays, and more environmental pollution. Intelligent, adaptive traffic signal control methods are therefore urgently needed to improve the responsiveness and operating efficiency of traffic systems.5,6 In recent years, the development of artificial intelligence2,7 and data fusion technology has provided new solutions for traffic management, making it possible to dynamically optimize signal timing based on multi-source data.
This study proposes DTASO, built on the ADDRL framework, which integrates camera, sensor, GPS, and historical traffic data to achieve real-time perception of urban traffic flow8,9 and optimization of the signal control strategy. The method adopts a decentralized deep reinforcement learning (DRL) architecture that gives each intersection autonomous decision-making capability while allowing signal timing to be adjusted in coordination, significantly improving traffic flow and reducing vehicle delays. Compared with traditional methods, DTASO shows clear advantages in system performance, scalability, and environmental friendliness, and represents a new direction for intelligent traffic signal control.
Literature survey
In recent years, with the development of artificial intelligence and data fusion technology, traffic signal control has gradually evolved towards intelligence and adaptability. In existing studies, federated learning combined with reinforcement learning has been used for coordinated control of traffic signals at multiple intersections.10,11 Global performance is improved by aggregating local models in the cloud, but the approach relies on communication infrastructure and converges slowly. Neural network-based methods12,13 such as GRA-BPNN, Hybrid LSTM, and CNN-LSTM have also been widely used in traffic flow prediction and signal optimization. Although they perform well in spatiotemporal feature extraction, they struggle to cope with dynamic traffic changes14,15 and have scalability bottlenecks. In contrast, MPLight adopts a modular DRL framework and has demonstrated better traffic efficiency than traditional timing control in tests at multiple urban intersections.
The DTASO algorithm proposed in this study is based on the ADDRL framework and integrates camera, sensor, GPS and vehicle network data to achieve decentralized real-time signal timing optimization. This method not only overcomes the scalability problem of centralized control systems, but also improves state perception accuracy and system robustness through multi-source data fusion. Experimental results show that DTASO outperforms existing mainstream models in key indicators such as average travel time, traffic volume, and congestion frequency, demonstrating stronger dynamic adaptability and system stability. This work continues and expands the current research paradigm of traffic signal optimization, providing a more practical technical path for future intelligent transportation systems.
Research methodology
Dynamic traffic adaptive signal optimization (DTASO) algorithm
The DTASO algorithm, built on a multi-source data fusion model, consists of three stages: traffic analysis and prediction, signal timing adjustment based on ADDRL prediction, and dynamic signal control and optimization. The architecture is shown in Figure 1.

Proposed DTASO algorithm based traffic signal optimization.
The DTASO algorithm works based on the fusion of real-time traffic data to change signal timings dynamically. It uses advanced data fusion algorithms to combine inputs from several sources, including cameras, sensors, and historical patterns. The algorithm adjusts signal timings for shifting traffic conditions by continuously analyzing and predicting traffic patterns using ADDRL techniques. The program can optimize signal schedules through reinforcement learning for better traffic flow.
Multi-source data fusion model
Data sources
The CV Pilot data now available to the public via ITS DataHub includes Basic Safety Messages (BSM) and Traveler Information Messages (TIM), along with Signal Phase and Timing (SPaT) information. These messages are sent using Dedicated Short-Range Communications (DSRC). With permission, users can access more restricted datasets from the CV Pilots on the Secure Data Commons.
Traffic signals can gather information from several sources. Traffic cameras: these record vehicle types, movement patterns, and traffic density in real time and deliver live footage of the intersection. Inductive loop sensors: these are embedded in the pavement to detect the presence of vehicles. Global Positioning System (GPS) and mobile data: these track vehicle movements and compile data on traffic patterns, speeds, and routes taken. Weather stations: these provide information on weather that affects driving conditions.
Data fusion
The term “data fusion,” or “sensor fusion,” in traffic management refers to combining and integrating data from multiple sensors and data sources to build a comprehensive picture of traffic dynamics and conditions. Sensor fusion combines data from several sensors, including GPS, video feeds, and vehicle identification sensors, into a single view of traffic conditions. Information fusion merges data from several sources to obtain deeper insights. For example, weather and traffic flow data can be combined to better understand how the weather affects road conditions.
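One common way to combine several estimates of the same quantity is an inverse-variance weighted mean, sketched below. This is an illustration only: the text does not state which fusion rule DTASO uses, and the `Reading` class, the source names, and the variance values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sensor's estimate of the same quantity (e.g. queue length in vehicles)."""
    source: str
    value: float
    variance: float  # assumed measurement noise (must be > 0); larger means less trusted


def fuse(readings):
    """Inverse-variance weighted mean of several estimates of one quantity.

    Returns (fused_value, fused_variance).
    """
    if not readings:
        raise ValueError("need at least one reading")
    weights = [1.0 / r.variance for r in readings]
    total = sum(weights)
    fused_value = sum(w * r.value for w, r in zip(weights, readings)) / total
    return fused_value, 1.0 / total


# Hypothetical queue-length estimates for one approach lane.
readings = [
    Reading("camera", 12.0, 4.0),
    Reading("loop", 10.0, 1.0),
    Reading("gps", 15.0, 4.0),
]
value, variance = fuse(readings)
```

The fused estimate is pulled towards the most reliable source (here the loop detector), and its variance is lower than that of any single input, which is the usual motivation for fusing sensors rather than picking one.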
A realistic urban traffic network was recreated using data from the Connected Vehicle (CV) Pilot program on the ITS DataHub, including BSMs, TIMs, SPaT, and others. Vehicle-to-infrastructure (V2I) interactions were simulated in SUMO (Simulation of Urban Mobility) to mimic junctions equipped with Dedicated Short-Range Communications. To assess the DTASO algorithm at demand levels of 500–2000 vehicles/hour, the experimental environment combined simulated traffic flows with actual CV data comprising vehicle trajectories, signal statuses, and priority requests. DTASO was compared against conventional fixed-time and adaptive signal systems; the main indicators were SPaT distribution delay, junction queue reduction, and emergency vehicle conflict resolution (Table 1).
Experimental settings for CV pilot data.
Data pre-processing
During data preprocessing, data from multiple sources are cleaned and integrated. Since the data may contain missing values, outliers, or inconsistent information, data cleaning is performed, including removing erroneous data, filling missing values, and correcting records that deviate from the normal range. In addition, to facilitate subsequent analysis and modeling, the data are standardized so that data from different sources can be compared on a unified scale.
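The cleaning and standardization steps described above can be sketched as follows. The specific choices here (median imputation, clipping to a plausible range, z-score scaling) are assumptions made for illustration; the text does not specify which techniques were used, and the function name and sample values are hypothetical.

```python
import statistics


def clean_and_standardize(values, lower, upper):
    """Fill gaps, clip out-of-range values, then z-score standardize.

    values: list of floats, with None marking a missing reading.
    lower/upper: assumed plausible range for this signal.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = statistics.median(observed)  # median is robust to the outliers still present
    filled = [fill if v is None else v for v in values]
    clipped = [min(max(v, lower), upper) for v in filled]  # pull outliers into range
    mean = statistics.fmean(clipped)
    std = statistics.pstdev(clipped)
    if std == 0:
        return [0.0 for _ in clipped]
    return [(v - mean) / std for v in clipped]


# Hypothetical speed readings in km/h; None is a dropped sample, 400 a sensor glitch.
speeds_kmh = [42.0, None, 38.0, 400.0, 45.0]
z = clean_and_standardize(speeds_kmh, lower=0.0, upper=130.0)
```

After this step every source has zero mean and unit variance, so readings from, say, loop detectors and GPS traces can be compared on the same scale.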
The study extracts features such as lane borders and distance from video data to determine when a vehicle changes lanes. This procedure supports driver behavior analysis and traffic monitoring, which are important for safety precautions and traffic management (Figure 2).

General multi-source data fusion model.
Data decision
Decision-level fusion occurs when the system uses reasoning, inferencing, and selection to make a decision. It includes the processes of estimation, prediction, and classification. Traffic signal control, which manages coordinated and isolated intersections, uses judgments from three systems: accident information, priority vehicle transition, and route information.
ADDRL framework based on DTASO algorithm
The detailed implementation method of the ADDRL framework based on the DTASO algorithm first involves the collection and fusion of multi-source data. The system obtains real-time traffic data through multiple sensors such as cameras, GPS, loop detectors and weather stations, including key indicators such as the number of vehicles, speed, waiting time and weather conditions. After preprocessing, these heterogeneous data are uniformly input into the state space of ADDRL to form a complete representation of the current traffic state. Subsequently, ADDRL uses a deep neural network to approximate the Q function, and combines the experience replay buffer and the target network mechanism to improve the training stability. In each decision cycle, the agent predicts the Q value of each feasible signal timing scheme based on the current state, and selects the optimal action to execute based on the ε-greedy strategy.
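Two of the mechanisms named above, the experience replay buffer and ε-greedy action selection, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the class and function names, the buffer capacity, and the Q-values are hypothetical, and the Q-network itself is replaced by a fixed list of numbers.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a random minibatch without replacement."""
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-Q action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
q_values = [0.2, 1.5, -0.3, 0.9]  # hypothetical Q-values for four timing schemes
greedy_action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)

buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=greedy_action, reward=-1.0, next_state=step + 1)
batch = buffer.sample(batch_size=2, rng=rng)
```

Sampling minibatches at random from the buffer breaks the correlation between consecutive transitions, which is the usual reason replay is paired with a target network to stabilize training.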
In the process of dynamic signal control and optimization, the ADDRL framework continuously updates its strategy through continuous interactive learning to maximize the long-term cumulative reward. The reward function design comprehensively considers the local intersection efficiency and the global traffic flow balance. Whenever the signal timing is adjusted, the system collects new traffic feedback data to update the neural network parameters and complete a closed-loop learning iteration.
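A reward that weighs local intersection efficiency against global flow balance could take a form like the one below. The text does not give the actual reward function, so everything here is an assumption: the use of queue length as the local cost, the imbalance term against neighbouring intersections, and the weights `w_local` and `w_global`.

```python
def reward(local_queue, neighbor_queues, w_local=1.0, w_global=0.5):
    """Negative cost: penalize the agent's own queue and imbalance with neighbours.

    local_queue: vehicles queued at this intersection.
    neighbor_queues: queued vehicles at adjacent intersections (may be empty).
    """
    if neighbor_queues:
        mean_neighbor = sum(neighbor_queues) / len(neighbor_queues)
        imbalance = abs(local_queue - mean_neighbor)
    else:
        imbalance = 0.0
    return -(w_local * local_queue + w_global * imbalance)


balanced = reward(10, [10, 10])    # same load everywhere: only the local term counts
unbalanced = reward(10, [4, 6])    # this intersection is holding more than its share
```

With these weights an agent is rewarded both for shortening its own queue and for not pushing congestion onto, or hoarding it from, its neighbours.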
In reinforcement learning, and particularly in DRL-based traffic signal optimization with frameworks such as ADDRL, the Q-learning update in equation (1) is an essential component. It provides the mathematical basis for learning the best signal timings, adjusting to shifting traffic conditions, and continuously improving decision-making to improve traffic flow and lessen congestion.
The Q-value of taking action $a$ in state $s$ is represented by $Q(s, a)$.
The Q-function, $Q(s, a; \theta)$, is approximated by a deep neural network with parameters $\theta$.
The network uses gradient descent to adjust its parameters by minimizing the loss.
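For orientation, the standard textbook form of the Q-learning update is sketched below. It is given under the assumption that the usual notation applies: $\alpha$ is the learning rate, $\gamma$ the discount factor, $r$ the reward received after taking action $a$ in state $s$, and $s'$ the resulting state. These symbols follow common convention and are not confirmed by the source.

```latex
Q(s, a) \leftarrow Q(s, a)
  + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

The bracketed term is the temporal-difference error: the gap between the bootstrapped target $r + \gamma \max_{a'} Q(s', a')$ and the current estimate. In a deep variant, the table lookup is replaced by a network and this error drives the gradient step.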
In the ADDRL framework for independent traffic signal control, each traffic signal is treated as an agent, as illustrated in Figure 3. It engages in closed-loop multi-agent policy interactions with the traffic environment. The control strategy is obtained by mapping traffic conditions (such as waiting time and total delay) to the corresponding optimal control actions (such as phase shift, cycle length modification, and green time extension). After taking an action, the agent receives a feedback reward and adjusts its policy, repeating this until it converges to the best control policy. During decision-making, the agent balances exploiting previously learnt policies with exploring actions it has not yet tried. RL algorithms can also be applied to a transportation network with numerous intersections. In that case, traffic signal agents have the dual goals of learning each agent's optimal policy and optimizing the overall traffic condition, while keeping all other parameters constant.

ADDRL framework for traffic signal optimization.
Traffic Flow Models: Various models, such as the cell transmission model (CTM) or traffic simulation models, represent traffic dynamics. These models can estimate the current traffic state and forecast future traffic situations. The CTM tracks vehicle passage between road segments to describe traffic flow, as shown in equation (3).
The number of vehicles in segment $i$ at time step $t+1$ equals the number in that segment at step $t$, plus the vehicles entering from the upstream segment, minus the vehicles leaving to the downstream segment.
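A single update step of a cell transmission model can be sketched as below, following the standard formulation in which the flow into a cell is limited by what the upstream cell can send, the boundary's maximum flow, and the space left in the receiving cell. The parameter names, the free-outflow boundary condition at the last cell, and the example numbers are assumptions for illustration, not values taken from the study.

```python
def ctm_step(n, inflow, capacity, max_flow, delta=1.0):
    """Advance cell occupancies by one time step.

    n: vehicles currently in each cell, ordered upstream to downstream.
    inflow: vehicles wanting to enter the first cell this step.
    capacity: maximum vehicles each cell can hold.
    max_flow: maximum vehicles that can enter each cell per step.
    delta: ratio of backward wave speed to free-flow speed (at most 1).
    """
    cells = len(n)
    # y[i] is the flow into cell i; y[cells] is the flow leaving the last cell.
    y = [0.0] * (cells + 1)
    for i in range(cells):
        sending = inflow if i == 0 else n[i - 1]
        receiving = delta * (capacity[i] - n[i])
        y[i] = max(0.0, min(sending, max_flow[i], receiving))
    y[cells] = min(n[-1], max_flow[-1])  # assumed free outflow at the downstream end
    # Conservation: new count = old count + flow in - flow out.
    return [n[i] + y[i] - y[i + 1] for i in range(cells)]


# Three cells; the middle one is full, so nothing can enter it this step.
state = ctm_step(
    n=[5, 20, 2],
    inflow=4,
    capacity=[20, 20, 20],
    max_flow=[6, 6, 6],
)
```

In this example the full middle cell blocks its upstream neighbour, so the first cell accumulates vehicles while the middle cell drains into the third, which is how the model reproduces queue spillback.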
Signal timing adjustments based on ADDRL prediction
Training a neural network to approximate the Q-function supports well-informed decisions about signal modifications when forecasting optimal signal timings with the ADDRL model. The Q-function represents the quality of taking a particular action in a specific state, as shown in equation (4). In the context of ADDRL, a neural network approximates this function.
The neural network's training aims to reduce the error between the target and predicted Q-values. The network's parameters ($\theta$) are adjusted by gradient descent to reduce this error.
The loss function, denoted as $L(\theta)$, measures the error between the target and predicted Q-values.
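The training step described above can be illustrated with a deliberately tiny stand-in for the network: a linear Q approximator with one weight vector per action, updated by one gradient-descent step on the squared error between target and prediction. The squared-error loss, the separate frozen target parameters, and all names and numbers are assumptions made for this sketch; the study's actual network and loss are not specified in the text.

```python
def q_values(theta, state):
    """Linear Q approximator: one weight vector per action."""
    return [sum(w * x for w, x in zip(weights, state)) for weights in theta]


def td_target(reward, next_state, target_theta, gamma):
    """Bootstrapped target: reward plus discounted best next-state Q-value."""
    return reward + gamma * max(q_values(target_theta, next_state))


def sgd_step(theta, target_theta, transition, gamma=0.9, lr=0.1):
    """One gradient-descent step on the squared TD error for one transition.

    Returns (updated_theta, loss_before_update).
    """
    state, action, reward, next_state = transition
    target = td_target(reward, next_state, target_theta, gamma)
    error = target - q_values(theta, state)[action]
    loss = error ** 2
    updated = [list(weights) for weights in theta]
    # d(loss)/d(w) = -2 * error * x, and only the chosen action's weights are involved.
    updated[action] = [w + lr * 2 * error * x for w, x in zip(theta[action], state)]
    return updated, loss


theta = [[0.0, 0.0], [0.0, 0.0]]          # two actions, two state features
target_theta = [[0.0, 0.0], [0.0, 0.0]]   # frozen copy used only for targets
transition = ([1.0, 0.5], 1, -2.0, [0.5, 1.0])  # (state, action, reward, next_state)

theta_after, loss_before = sgd_step(theta, target_theta, transition)
_, loss_after = sgd_step(theta_after, target_theta, transition)
```

Replaying the same transition after the update yields a smaller loss, which is the behaviour gradient descent is meant to produce; holding `target_theta` fixed between updates mirrors the target network mechanism.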
Dynamic signal control and optimization
Real-time timing adjustments for traffic signals are made possible by dynamic signal control that considers traffic flow. Based on the anticipated traffic patterns, the system dynamically modifies the timings of traffic lights. For example, adaptive phasing adjusts the length of the green, yellow, and red light cycles according to traffic volume and flow. Priority management allocates resources based on the current demand, prioritizing public transportation, key roads, emergency vehicles, and pedestrian crossings. In this case, optimization entails adjusting the timing of the traffic signals to produce the best results feasible based on the fused multi-source data. Figure 4 shows the ADDRL framework of multi-intersection traffic signal control.
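Adaptive phasing with a priority override can be sketched as a proportional split of green time. This is a simplified, rule-based illustration rather than the learned ADDRL policy: the function name, the minimum-green floor, and the way a priority phase is guaranteed extra time are all assumptions.

```python
def allocate_green(demand, cycle, min_green, priority_phase=None, priority_green=0.0):
    """Split a cycle's green time across phases in proportion to demand.

    demand: vehicles queued per phase.
    cycle: total green time available in seconds (lost time already removed).
    min_green: floor given to every phase.
    priority_phase: optional phase index guaranteed at least priority_green seconds.
    """
    phases = len(demand)
    floors = [min_green] * phases
    if priority_phase is not None:
        floors[priority_phase] = max(min_green, priority_green)
    spare = cycle - sum(floors)
    if spare < 0:
        raise ValueError("cycle too short for the required minimum greens")
    total = sum(demand)
    if total == 0:
        shares = [1.0 / phases] * phases  # no demand: split the spare time evenly
    else:
        shares = [d / total for d in demand]
    return [floor + spare * share for floor, share in zip(floors, shares)]


normal = allocate_green([30, 10, 0, 20], cycle=100, min_green=10)
# An emergency vehicle is approaching on phase 2, which currently has no queue.
with_priority = allocate_green(
    [30, 10, 0, 20], cycle=100, min_green=10, priority_phase=2, priority_green=25
)
```

In the second call the empty phase still receives 25 s because of the priority request, and the remaining time is shared among the other phases according to their queues.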

Framework of ADDRL for multi-intersection signal control.
The ADDRL framework dynamically applies the best signal timings by adjusting to and learning from shifting traffic conditions. The benefits of dynamic signal control and optimization include the following. Adequate traffic flow: adapting quickly to current circumstances keeps traffic moving more fluidly, cutting down on traffic jams and travel times. Resource utilization: signal timing optimization maximizes the effectiveness of the current infrastructure without significant physical alterations. Environmental impact: improved traffic flow decreases fuel consumption and emissions, which benefits the environment.
Experiment and results
Traffic flow efficiency
The study selects the following two representative measures to analyze traffic flow efficiency. Travel time: when assessing the effectiveness of a signal control approach, the most commonly used metric is the mean travel time of all vehicles in the system. Throughput: the total number of trips completed by vehicles during the simulation. A higher throughput means more vehicles finished their journeys in a given timeframe, which suggests a better control technique.
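Both measures can be computed directly from per-vehicle trip records, as sketched below. The record layout (departure and arrival times in seconds) and the decision to average only over completed trips are assumptions for illustration; a simulator may instead count the time accrued by vehicles still in the network.

```python
def traffic_metrics(trips, horizon):
    """Return (mean travel time, throughput) for trips finished within the horizon.

    trips: list of (depart_time, arrive_time) in seconds; arrive_time is None
           for vehicles still in the network when the simulation ends.
    horizon: simulation end time in seconds.
    """
    finished = [(d, a) for d, a in trips if a is not None and a <= horizon]
    throughput = len(finished)
    if throughput == 0:
        return float("nan"), 0
    mean_travel_time = sum(a - d for d, a in finished) / throughput
    return mean_travel_time, throughput


# Hypothetical trip log: the third vehicle never reached its destination.
trips = [(0, 200), (50, 350), (100, None), (120, 400)]
mean_time, throughput = traffic_metrics(trips, horizon=3600)
```

Reporting the two together matters: a controller could lower mean travel time simply by letting fewer vehicles finish, and throughput exposes that.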
Because of its improved reward design and simultaneous feedback learning from the environment, ADDRL stands out among transportation and reinforcement learning approaches. ADDRL outperforms other RL techniques in optimizing strategy by reducing congestion, notably between entering and exiting lanes. Phase-specific pressure is also taken into account by the DTASO algorithm, but ADDRL shows a much more significant performance disparity with improved throughput and trip time. This benefit results from ADDRL's capacity to ignore the assessment of previous acts extracted from the environment.
Figure 5 compares the performance of the various approaches in the five synthetic traffic data settings. Lower average travel time is better, whereas higher throughput is better. Our approach obtains the best results for both throughput and travel time. The AIMD system12 cannot be compared because its high computing cost and complexity prevent it from scaling to large networks. In contrast, our suggested technology, ADDRL, allows for the efficient and proper management of thousands of lights.

Comparison of traffic time and efficiency.
Congestion control
Relieving congestion in urban settings requires a multi-pronged approach to traffic flow regulation. DTASO continuously adjusts signal timings based on real-time traffic data from several channels, including sensors, cameras, and historical patterns, which reduces congestion. The fundamental goal of this strategy is to learn how traffic conditions change and to adapt to those changes by combining data from various sources. The DTASO algorithm employs an ADDRL architecture to provide adaptive signal regulation. This can make it easier to assess and predict traffic characteristics, because the system changes signal timings dynamically depending on real-time traffic data. Because it uses information from many sources, the approach can adjust and react well to changing traffic patterns. Compared with traditional signal optimization methods, DTASO appears to offer significant advantages in traffic flow, congestion reduction, and overall system performance.
Notably, the algorithm adjusts the length of its reaction time according to the degree of congestion: heavier congestion needs longer to be mitigated successfully. The system also scales its cycle-time increments with the strength of the congestion impulses. Modest impulses lead to smaller cycle-time increments, to avoid creating vehicle bottlenecks, whereas severe congestion impulses lead to larger additive increases. The study shows that using ADDRL to manage traffic on intricate road networks with several links is feasible.
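The scaling of cycle-time increments with congestion severity can be sketched as a bounded additive update. The linear scaling, the severity measure in the range 0 to 1, and the step and cycle limits are all assumptions; the text describes the behaviour qualitatively and does not give the rule or its constants.

```python
def next_cycle_time(cycle, congestion, base_step=5.0, min_cycle=40.0, max_cycle=180.0):
    """Additively lengthen the cycle in proportion to congestion severity.

    cycle: current cycle length in seconds.
    congestion: severity in [0, 1], e.g. a normalized queue length.
    """
    severity = min(max(congestion, 0.0), 1.0)
    step = base_step * severity  # mild congestion gives a small step, severe a large one
    return min(max(cycle + step, min_cycle), max_cycle)


mild = next_cycle_time(90.0, congestion=0.2)     # small increment
severe = next_cycle_time(90.0, congestion=1.0)   # full increment
capped = next_cycle_time(178.0, congestion=1.0)  # limited by max_cycle
```

Keeping the increment small under mild congestion avoids overreacting and creating bottlenecks on the cross streets, while the upper bound stops the cycle from growing without limit under sustained heavy load.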
This improvement (shown in Figure 6) results from DTASO's capacity to modify signal timings in response to real-time traffic information, which helps to reduce traffic during peak hours and in unforeseen situations. DTASO can efficiently handle congestion by incorporating data from several sources and applying adaptive signal control procedures via the ADDRL framework. This strategy offers a workable way to improve traffic signal management in urban environments by providing real-time flexibility under changing traffic conditions. The result is less traffic congestion, better traffic flow, and improved overall urban mobility.

Frequency of congestion occurrences at various intersections (Int) of roads.
System performance
Regarding traffic management, system performance includes the effectiveness, efficiency, and flexibility of the tactics used to reduce traffic and improve urban mobility. The underlying idea is that efficient traffic control is essential to reducing the adverse effects of traffic on urban settings’ travel times, fuel usage, and pollution levels. Traditional traffic signal systems often fail during peak hours or unforeseen traffic situations because they rely on antiquated schedules and basic sensor-based operations, which may lead to decreased efficiency.
Compared to conventional methods of signal optimization, DTASO has shown significant improvements in system performance, traffic flow, and congestion reduction (see Figure 7). With DTASO, there may be a better way to regulate traffic signals in metropolitan areas. This is due to its ability to integrate several data sets. A traffic management system's effectiveness depends heavily on its capacity to adapt to changing traffic conditions.

Comparison of performances of various machine learning frameworks.
In traffic management, “system performance” covers how well innovative algorithms like DTASO work, how they adapt to real-time traffic data, and how far they have been shown to reduce congestion and improve overall urban mobility. Dynamically changing signal durations for better traffic flow management requires complex data integration. Over time, this will improve traffic management in heavily populated areas.
Safety enhancement and user satisfaction
The ADDRL-based DTASO algorithm improves traffic management and user satisfaction.
The comparison of congestion frequencies at different intersections is shown in Figure 8.

Congestion frequency comparison of different intersections.
The DTASO algorithm excels in congestion control at multiple intersections. In tests at five different intersections, the congestion frequency of traditional signal systems ranged from 18 to 22 times/hour, while the DTASO system significantly reduced this number to between 1 and 4 times/hour. This shows that the ADDRL framework can effectively respond to real-time traffic conditions and significantly reduce vehicle conflicts and queues by dynamically adjusting signal timing. Especially in high-traffic areas (such as Int.5), DTASO shows stronger adaptability and regulation capabilities, and its congestion control effect is better than other methods.
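The reported ranges imply bounds on the relative reduction in congestion frequency, which the short calculation below makes explicit. Only the four endpoint values come from the text; since the per-intersection pairing is not given here, the result is a worst-case to best-case range rather than a measured figure.

```python
def relative_reduction(before, after):
    """Fractional drop from before to after."""
    return (before - after) / before


traditional = (18, 22)  # congestion events per hour, range reported in the text
dtaso = (1, 4)          # congestion events per hour with DTASO, range reported in the text

smallest = relative_reduction(traditional[0], dtaso[1])  # 18 -> 4
largest = relative_reduction(traditional[1], dtaso[0])   # 22 -> 1
print(f"{smallest:.1%} to {largest:.1%}")  # prints 77.8% to 95.5%
```

Even under the least favourable pairing of the endpoints, the drop is more than three quarters.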
Comparison of traffic performance indicators.
The comparison of traffic performance indicators is shown in Table 2.
In terms of overall traffic performance, DTASO outperforms existing models in core indicators such as average travel time, traffic volume, and congestion frequency. Compared with traditional neural network models such as GRA-BPNN and CNN-LSTM, DTASO shortens the average travel time to 262 s, which is about 26% less than the best-performing traditional model; at the same time, the traffic volume is increased to 6170 vehicles/hour. These results indicate that the DTASO method, based on multi-source data fusion and DRL, not only improves the efficiency of local intersections but also achieves better traffic flow scheduling and resource allocation at the global road network level, with good scalability and practical application potential.
Conclusion
Implementing efficient traffic management systems is essential to address long-standing urban issues such as pollution, waste, and congestion. Traditional traffic signal systems, reliant on outdated schedules and basic sensors, are no longer sufficient. The DTASO method leverages multi-source data fusion to dynamically adjust signal timings, offering a revolutionary approach. Powered by ADDRL, DTASO integrates real-time data from cameras, sensors, and historical traffic patterns to optimize signal control. Compared to conventional methods, DTASO significantly improves system performance, reduces congestion, and enhances transportation efficiency. Its data-driven adaptability makes it a major advancement in urban traffic management. However, further development is needed to ensure its effectiveness in real-time big data processing and resilience against unexpected changes. Once fully validated, DTASO promises to reduce emissions, improve public transit, and deliver smarter, more sustainable urban mobility.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
