Abstract
When an elevator car operates at high speed, it can experience unpredictable horizontal vibrations caused by guide rail unevenness, guide shoe nonlinearity, and load variations, all of which affect both safety and ride comfort. To suppress the horizontal vibration of the car, this paper proposes an optimal dynamic semi-active control method based on the deep deterministic policy gradient (DDPG) algorithm. First, a dynamic model of the elevator car’s horizontal vibration system is established that accounts for the nonlinear coupling between the guide rail, guide shoe, car frame, and car body, and the unevenness of the guide rail is simulated using power spectral density. Second, a Markov decision process model is formulated for horizontal vibration control, with the state space, action space, and reward function designed for deep reinforcement learning. Third, to improve generalization performance, the DDPG algorithm is enhanced by normalizing the state space of the elevator; the controller learns the optimal dynamic semi-active control strategy by interacting with the simulation environment and achieves real-time control of the car’s horizontal vibration. Simulation results show that, compared with passive control, Mixed SH-ADD control, PID control, and DQN-based control, the DDPG-based method reduces the average horizontal acceleration of the elevator car by 68.90%, 45.46%, 21.35%, and 20.95%, respectively, effectively mitigating horizontal vibration.
Keywords
Introduction
During the operation of high-speed elevators, the car is affected by guide rail unevenness, the nonlinear behavior of guide shoes, and load fluctuations, all of which introduce uncertainties that result in unpredictable horizontal vibrations. These vibrations not only cause dizziness, tinnitus, and other adverse reactions in passengers, seriously degrading ride comfort, but also damage components, threatening the long-term safe and stable operation of the elevator system. Suppressing the horizontal vibration of the elevator car has therefore become a key issue in high-speed elevator research.
Researchers have conducted numerous studies on the horizontal vibration of elevator cars. Tusset et al. (2023) introduced a Nonlinear Energy Sink (NES) into a four-degree-of-freedom elevator horizontal vibration control system and explored how the parameter configuration of the NES influences vibration reduction under specific external disturbances. Su et al. (2023) optimized the damping parameters using the Sparrow Search Algorithm and used electromagnetic active rolling guide shoes to control the horizontal vibration amplitude of the car. Tang et al. (2023) took control cost and system performance as objective functions, designed and optimized a fractional-order PID controller with a multi-objective genetic algorithm, and verified its vibration reduction through numerical simulations. However, these methods cannot learn from the random excitation of the elevator system, which makes it difficult for them to cope with dynamic disturbances under actual working conditions; an intelligent and effective vibration reduction scheme is still lacking.
Santo et al. (2018) studied the horizontal nonlinear response of a three-degree-of-freedom vertical transportation model under the excitation of guide rail deformation and proposed a control strategy based on the State Dependent Riccati Equation (SDRE). Zhao et al. (2024) designed a controller for the horizontal vibration response induced by guide rail excitation based on acceleration feedback and optimized the control cost to reduce power consumption. Wang et al. (2021) proposed a predictive sliding mode controller based on adaptive fuzzy theory and carried out numerical simulations for typical guide rail excitations. However, the Gaussian random functions used in these studies differ significantly from the actual guide rail excitations, making it difficult to deal with the complex multi-source external disturbances of the car system when the elevator is running at high speed, thus limiting their practical applications.
He et al. (2022) proposed an adaptive sliding mode control scheme to suppress the horizontal vibration of high-speed elevator cars and verified its effectiveness. To handle the uncertainty of the external excitation of the elevator system, Zhang et al. (2021) designed a BP-PID controller based on a linear prediction model for intelligent active control and analyzed the control effect using MATLAB/Simulink. However, when facing the complex nonlinear elevator system and a variable external environment, the control accuracy and response speed of these methods decline. Moreover, training the BP neural network requires a large amount of real data, which limits its practicality. Traditional methods such as PID control (Tang et al., 2023), fuzzy control (Zhang et al., 2024), and robust control (Feng et al., 2009) also have limitations when dealing with the vibration of high-speed elevator cars.
Optimized control methods based on deep reinforcement learning have achieved fruitful results in the field of continuous dynamic control, demonstrating great potential and advantages (Cai et al., 2023; Hagmar et al., 2023; Tang et al., 2024; S Wang et al., 2024; Y Wang et al., 2024; Yin et al., 2024; Zhu et al., 2024). In the field of vibration control, for instance, the hybrid RL controller designed by Panda et al. (2024) shows superior performance in the vibration control of multi-dimensional complex structures. The semi-active suspension controller based on deep reinforcement learning proposed by Lee et al. (2022) significantly improves the riding comfort of vehicles. Aiming at the problem of high-frequency and high-dimensional continuous vibration control, Shu et al. (2024) proposed a multi-reward mechanism to assist the lightweight network in finding an approximately optimal control strategy. Based on transfer learning, Ren et al. (2024) further explored the wake-induced vibration in the staggered configuration, confirming that DRL can provide a general solution for controlling flow-induced vibration. In response to the problem of intensified vibration between the pantograph and the catenary, Wang et al. (2024b) proposed an improved Deep Deterministic Policy Gradient (IDDPG) and verified the steady-state performance of the controller through simulations. However, in the field of horizontal vibration control of high-speed elevator carriages, there are relatively few related studies, and a systematic theoretical framework has not yet been formed. Therefore, conducting research by combining the capabilities of deep reinforcement learning with the control requirements of horizontal vibration of high-speed elevator carriages is expected to achieve more advanced and efficient control schemes and improve the operation quality of elevators.
The purpose of this paper is to address the horizontal vibration problem of high-speed elevator cars through a DRL approach. This study targets vibration arising from uncertain factors such as uneven guide rails, nonlinear behavior of guide shoes, and load fluctuations. We propose an optimal dynamic semi-active control method based on DDPG to mitigate horizontal vibrations in high-speed elevator cars. The damping coefficient of the semi-active guide shoe is explicitly constrained by the output of the strategy network, facilitating real-time control of horizontal vibrations.
The structure of this paper is as follows: The section “Horizontal vibration control model for high-speed elevators” models the horizontal vibration dynamics of a high-speed elevator car and simulates uneven guide rail excitations. The section “Optimal dynamic control of car horizontal vibration based on DDPG” formulates a Markov decision process to address the vibration issue, proposes a control method based on an enhanced DDPG algorithm, and analyzes controller convergence. The section “Simulation result and analysis” compares the simulation results of passive control, Mixed Semi-Active Control (SH-ADD), PID control, Deep Q-Network (DQN)-based control, and DDPG-based control, highlighting the superior performance of the enhanced DDPG-based controller. The section “Conclusion” summarizes the study. The appendix provides the parameters of the vibration model and DRL algorithm used.
Horizontal vibration control model for high-speed elevators
Nonlinear dynamics model of car horizontal vibration based on semi-active guide shoes
Figure 1 illustrates the horizontal vibration system of a high-speed elevator. Rubber blocks between the car and frame, along with roller guide shoes connecting the frame to the guide rail, jointly reduce vibrations. Because materials such as rubber behave nonlinearly and the semi-active guide shoes contain controllable actuators, the nonlinear behavior of the guide shoe is crucial for vibration control. Semi-active guide shoes adjust damping in real time based on the vibration state, improving control effectiveness. To investigate the horizontal vibration of the elevator car under various uncertainties, the following assumptions are made: (1) The car and car frame are considered rigid. (2) The semi-active guide shoe is modeled as a nonlinear spring-damper system. (3) Given that the dynamic models for front-back and left-right vibrations of the car are nearly identical, this paper primarily focuses on the left-right horizontal vibration.
Figure 1. Schematic structure of the horizontal vibration system in a high-speed elevator (1. Wire rope; 2. Guide rail; 3. Roller guide shoe; 4. Car frame; 5. Rubber block; 6. Car).
Figure 2 shows the equivalent physical model of the horizontal vibration of a high-speed elevator car. Here, $m_w$ is the mass of the car body, $m_b$ is the mass of the car frame, $m_s$ is the mass of the guide shoe, $k_w$ is the stiffness of the rubber block, $c_w$ is the damping coefficient of the rubber block, $k_b$ is the stiffness coefficient of the guide shoe, $c_{b1}$ and $c_{b2}$ are the controlled damping coefficients of the left and right guide shoes, $k_s$ is the stiffness coefficient of the guide wheel, $x_{r1}$ and $x_{r2}$ are the left and right guide shoe displacements, $x_{s1}$ and $x_{s2}$ are the left and right guide shoe excitations, $x_b$ is the car frame displacement, and $x_w$ is the car displacement.
Figure 2. Equivalent physical model of horizontal vibration of high-speed elevator car.
According to Newton’s second law, the system dynamics equation can be obtained as shown in equation (1):
Considering that the horizontal vibration of a high-speed elevator car is a nonlinear multiple-input multiple-output control problem, we will focus on the horizontal vibration of the car. The state-space model of this system can be expressed as equation (2):
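As a rough illustration of how a model of this kind can be integrated numerically, the following sketch assembles a linearized four-mass system from the symbols defined for Figure 2. It is not equation (1) or equation (2): the spring and damper forces are assumed linear, the connection topology (car to frame through the rubber block, frame to each guide shoe, each guide shoe to the rail through the guide wheel) is an assumption read off the symbol list, the numerical values come from column A of the appendix with the roller mass and roller stiffness standing in for $m_s$ and $k_s$, and the damping value of 1000 N·s/m is purely hypothetical.

```python
import numpy as np

# Column A of the appendix parameter table. Using the roller mass and
# roller stiffness for m_s and k_s is an assumption of this sketch.
PARAMS = dict(m_w=1200.0, m_b=750.0, m_s=10.0,
              k_w=5.0e5, c_w=320.0, k_b=4.0e4, k_s=6.0e5)


def derivative(state, x_s, c_b, p=PARAMS):
    """Time derivative of [x_w, x_b, x_r1, x_r2, v_w, v_b, v_r1, v_r2].

    x_s = (x_s1, x_s2) are the rail excitations, c_b = (c_b1, c_b2) the
    controlled damping coefficients. All forces are assumed linear.
    """
    x_w, x_b, x_r1, x_r2, v_w, v_b, v_r1, v_r2 = state
    f_rubber = p["k_w"] * (x_w - x_b) + p["c_w"] * (v_w - v_b)
    f_shoe1 = p["k_b"] * (x_b - x_r1) + c_b[0] * (v_b - v_r1)
    f_shoe2 = p["k_b"] * (x_b - x_r2) + c_b[1] * (v_b - v_r2)
    f_wheel1 = p["k_s"] * (x_r1 - x_s[0])
    f_wheel2 = p["k_s"] * (x_r2 - x_s[1])
    a_w = -f_rubber / p["m_w"]
    a_b = (f_rubber - f_shoe1 - f_shoe2) / p["m_b"]
    a_r1 = (f_shoe1 - f_wheel1) / p["m_s"]
    a_r2 = (f_shoe2 - f_wheel2) / p["m_s"]
    return np.array([v_w, v_b, v_r1, v_r2, a_w, a_b, a_r1, a_r2])


def rk4_step(state, x_s, c_b, dt, p=PARAMS):
    """One classical Runge-Kutta step; excitation and damping held constant."""
    k1 = derivative(state, x_s, c_b, p)
    k2 = derivative(state + 0.5 * dt * k1, x_s, c_b, p)
    k3 = derivative(state + 0.5 * dt * k2, x_s, c_b, p)
    k4 = derivative(state + dt * k3, x_s, c_b, p)
    return state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


if __name__ == "__main__":
    state = np.zeros(8)
    damping = (1000.0, 1000.0)      # hypothetical value in N*s/m
    peak = 0.0
    for _ in range(1000):           # 1 s with a 1 mm offset on the left rail
        state = rk4_step(state, (1.0e-3, 0.0), damping, dt=1.0e-3)
        peak = max(peak, abs(derivative(state, (1.0e-3, 0.0), damping)[4]))
    print("peak car acceleration (m/s^2):", peak)
```

In a reinforcement learning setting, a function such as `rk4_step` would play the role of the environment transition, with the damping pair supplied by the controller at every step.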
Simulation of guide rail unevenness excitation
The elevator system is complex, with factors such as uneven guide rail excitation, nonlinearity of the guide wheel and guide shoe, and variations in load capacity contributing to the horizontal vibration of the car. Studies indicate that the guide rail unevenness is the primary cause of this vibration (Zhang et al., 2018). Guide rail excitation can be classified into four patterns: elastic bending, joint step, joint tilt, and surface wear. Among these, joint step and surface wear are the most prevalent; therefore, this paper simulates these two types of guide rail excitations as inputs for subsequent experiments.
Joint step excitation
Given the influence of rail installation errors and other factors, rail step excitation is quite common. In this paper, we simulate the step excitation of the guide rail, as illustrated in Figure 3, to assess the effectiveness of the proposed damping method. The simulation assumes that the left rail of the elevator car has two different degrees of installation error, while the right rail remains smooth.
Figure 3. Step excitation of the guide rail.
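A step profile of this kind can be generated in a few lines. The sketch below is only illustrative: the step times and heights are hypothetical placeholders, not the values behind Figure 3, and it assumes that each installation error shifts the rail profile by a constant offset from the moment the car passes the joint.

```python
import numpy as np


def step_excitation(t, step_times, step_heights):
    """Piecewise-constant rail profile built from cumulative steps.

    Each (time, height) pair adds a constant offset from that time onward.
    """
    t = np.asarray(t, dtype=float)
    profile = np.zeros_like(t)
    for t0, height in zip(step_times, step_heights):
        profile += height * (t >= t0)
    return profile


t = np.linspace(0.0, 10.0, 1001)
# Hypothetical joints at 3 s and 7 s with 1 mm and 2 mm steps (left rail).
x_s1 = step_excitation(t, step_times=[3.0, 7.0], step_heights=[1.0e-3, 2.0e-3])
# The right rail is assumed smooth, as in the text.
x_s2 = np.zeros_like(t)
```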
Surface wear excitation
Surface wear excitation of the guide rail introduces significant uncertainty, posing challenges for the horizontal vibration control of high-speed elevator cars. In this paper, we propose a simulation method based on power spectral density. This method models the surface wear excitation of the actual guide rail by controlling both the frequency components and the phase, and they are defined by equations (3) and (4), respectively:
The guide rail surface wear excitation is obtained by superimposing all frequency components:
Figure 4. Simulated surface wear excitation of the guide rail.
To more accurately describe the statistical characteristics of the guide rail surface wear excitation, the periodogram method is applied to estimate the one-sided power spectral density of the simulated guide rail surface wear excitation signal, and the guide rail surface wear excitation power spectral density function is fitted as equation (7):
Figure 5. Displacement power spectral density of guide rail surface wear.
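The two steps described above, superimposing frequency components with prescribed amplitudes and phases and then estimating a one-sided power spectral density with the periodogram, can be sketched as follows. This is a generic harmonic-superposition example under stated assumptions, not a reproduction of equations (3), (4), or (7): the function `placeholder_psd` is a hypothetical spectrum, the amplitude rule $\sqrt{2 S(f_k)\,\Delta f}$ and the uniformly distributed random phases are the common textbook choice, and the sampling settings are arbitrary.

```python
import numpy as np


def placeholder_psd(f):
    """Hypothetical one-sided displacement PSD in m^2/Hz (not equation (7))."""
    return 1.0e-8 / (1.0 + f ** 2)


def synthesize(psd_fn, freqs, df, t, rng):
    """Sum of cosines with amplitudes sqrt(2*S(f)*df) and random phases."""
    amps = np.sqrt(2.0 * psd_fn(freqs) * df)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    angles = 2.0 * np.pi * np.outer(freqs, t) + phases[:, None]
    return amps @ np.cos(angles)


def periodogram_one_sided(x, dt):
    """One-sided periodogram PSD estimate of a real, evenly sampled signal."""
    n = x.size
    psd = (dt / n) * np.abs(np.fft.rfft(x)) ** 2
    # Double every bin except DC (and the Nyquist bin when n is even).
    last = -1 if n % 2 == 0 else None
    psd[1:last] *= 2.0
    return np.fft.rfftfreq(n, d=dt), psd


n, dt = 4096, 1.0e-3
t = np.arange(n) * dt
df = 1.0 / (n * dt)                    # frequency resolution of the record
freqs = df * np.arange(1, 201)         # 200 components on the FFT grid
x = synthesize(placeholder_psd, freqs, df, t, np.random.default_rng(0))
f_est, psd_est = periodogram_one_sided(x, dt)
```

Because the component frequencies are placed on the FFT grid, the periodogram recovers the placeholder spectrum at those bins without leakage; with arbitrary frequencies or a measured signal, windowing and averaging would normally be needed.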
Optimal dynamic control of car horizontal vibration based on DDPG
Markov decision process modeling for horizontal vibration in elevator car
Reinforcement learning is mathematically founded on the Markov decision process (MDP), which satisfies the Markov property. An MDP is generally defined by its state space, action space, state transition probability function, and reward function. The horizontal vibration control system of a high-speed elevator can be modeled as an MDP. At each time step t, the state is derived from the state-space model and the guide rail unevenness excitation model presented in the section “Horizontal vibration control model for high-speed elevators”:
The state space includes the gradients of the excitation changes on both sides of the guide rail, denoted as
The action space is given by equation (9):
The state transition probability function can be defined according to equation (1). Given the input guide rail excitation and the current horizontal vibration state of the high-speed elevator car, the vibration state at the next time step is determined by solving the system of ordinary differential equations.
The horizontal vibration control system for high-speed elevator cars aims to reduce horizontal vibrations of the car. In compliance with ISO 8100-34:2021 and ISO 2631-1:1997, the ride comfort metric is evaluated using vibration acceleration. Additionally, system safety considerations are incorporated into the reward function through displacement and velocity terms. The reward function is defined by equation (10).
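To make the structure of such a reward concrete, the sketch below penalizes acceleration, velocity, and displacement of the car with a weighted sum of squares. Equation (10) is not reproduced here, so both the quadratic form and the weights are assumptions chosen only for illustration.

```python
def reward(acc, vel, disp, weights=(1.0, 0.1, 0.1)):
    """Negative weighted sum of squared car acceleration, velocity, displacement.

    The weights are hypothetical; a larger first weight emphasizes ride
    comfort, the other two stand in for the safety-related terms.
    """
    w_acc, w_vel, w_disp = weights
    return -(w_acc * acc ** 2 + w_vel * vel ** 2 + w_disp * disp ** 2)


# A calmer state earns a higher (less negative) reward.
calm = reward(acc=0.05, vel=0.01, disp=0.001)
rough = reward(acc=0.50, vel=0.05, disp=0.005)
```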
DDPG-based generalized control method for car horizontal vibration
The horizontal vibration control of a high-speed elevator car involves a continuous action space, which makes it a continuous control problem. This paper uses the DDPG algorithm (Lillicrap et al., 2015) to obtain the optimal dynamic control strategy for the horizontal vibration of the high-speed elevator car; the training process is shown in Figure 6.
Figure 6. Training flow of high-speed elevator car horizontal vibration control based on DDPG.
DDPG is an actor-critic method with a strategy network (actor) and a value network (critic). The strategy network selects an action based on the state observed from the environment, while the value network estimates the expected return of the current state-action pair. The strategy network and the value network update their parameters through equations (11) and (12):
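For orientation, the quantities that drive these updates in standard DDPG (Lillicrap et al., 2015) can be written down without any neural network code. The sketch below is not equations (11) and (12) themselves: plain arrays stand in for network outputs, the function names are hypothetical, and the discount factor of 0.95 is the value listed in the appendix.

```python
import numpy as np


def td_targets(rewards, next_q, dones, gamma=0.95):
    """Bootstrapped targets y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    next_q holds the target critic evaluated at the target actor's action.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q = np.asarray(next_q, dtype=float)
    dones = np.asarray(dones, dtype=float)
    return rewards + gamma * (1.0 - dones) * next_q


def critic_loss(q_pred, targets):
    """Mean squared error that the value network minimizes."""
    q_pred = np.asarray(q_pred, dtype=float)
    return float(np.mean((q_pred - targets) ** 2))


def actor_objective(q_of_policy_actions):
    """Mean Q(s, mu(s)) that the strategy network tries to increase."""
    return float(np.mean(q_of_policy_actions))


y = td_targets(rewards=[-0.2, -0.1], next_q=[-1.0, -0.5], dones=[0, 1])
loss = critic_loss(q_pred=[-1.0, -0.2], targets=y)
```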
High-speed elevator systems may encounter different excitations and environmental perturbations during operation, leading to differences between the state space encountered by the controller during training and that during actual operation. In this paper, state data are normalized before being input to the strategy network, ensuring that the input data across different elevator operating conditions maintain the same scale, thereby enhancing the generalization performance of the controller. State normalization is achieved by calculating the moving average and variance over a specified time window, allowing states to be dynamically adjusted over time:
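One simple way to realize a moving-window normalization of this kind is sketched below. The class name, window length, and the small constant `eps` are assumptions; the source does not state them at this point.

```python
from collections import deque

import numpy as np


class WindowNormalizer:
    """Normalizes each state by the mean and variance of a sliding window."""

    def __init__(self, window=200, eps=1.0e-8):
        self.history = deque(maxlen=window)   # oldest states drop out
        self.eps = eps                        # avoids division by zero

    def __call__(self, state):
        state = np.asarray(state, dtype=float)
        self.history.append(state)
        data = np.stack(self.history)
        mean = data.mean(axis=0)
        var = data.var(axis=0)
        return (state - mean) / np.sqrt(var + self.eps)


normalizer = WindowNormalizer(window=2)
first = normalizer([1.0, 10.0])     # a single sample maps to zeros
second = normalizer([3.0, 30.0])    # both components map to about 1
```

The two components in the usage lines differ by a factor of ten in scale yet produce the same normalized value, which is the property the strategy network relies on when operating conditions change.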
To enhance algorithm performance and mitigate the overestimation problem, two target networks are utilized during training, with their parameters updated every k rounds through a soft update method:
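A soft update blends the online parameters into the target parameters instead of copying them. The sketch below assumes the convention $\theta' \leftarrow \tau\theta + (1-\tau)\theta'$ and represents each network as a list of arrays; the function names and the helper for the update interval are hypothetical. Which convention the soft update rate listed in the appendix refers to is not stated here, so the value of `tau` below is only an example.

```python
import numpy as np


def soft_update(target_params, online_params, tau):
    """In-place update: target <- tau * online + (1 - tau) * target."""
    for target, online in zip(target_params, online_params):
        target *= (1.0 - tau)
        target += tau * online


def maybe_soft_update(step, k, target_params, online_params, tau):
    """Applies the soft update only on every k-th training round."""
    if step % k == 0:
        soft_update(target_params, online_params, tau)
        return True
    return False


target = [np.zeros(3), np.zeros((2, 2))]
online = [np.ones(3), np.ones((2, 2))]
soft_update(target, online, tau=0.01)   # target moves 1% toward online
```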
In this paper, experience replay is used to store the past experience
When the elevator is operating at high speed, factors such as variations in passenger load, air resistance, and guide rail unevenness may induce nonlinear vibrations in the car system. To address this, noise is added to the output of the strategy network so that the action output satisfies equation (19):
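The idea of perturbing the strategy network output while keeping the damping command admissible can be sketched as follows. Equation (19) is not reproduced here; the Gaussian noise model, the clipping step, and the damping bounds are all assumptions of this example (the original DDPG paper used an Ornstein-Uhlenbeck process for exploration).

```python
import numpy as np


def noisy_action(mu, sigma, c_min, c_max, rng):
    """Adds zero-mean Gaussian noise to the actor output and clips the result.

    mu is the deterministic action (damping coefficients), sigma the noise
    standard deviation, and [c_min, c_max] the admissible damping range.
    """
    mu = np.asarray(mu, dtype=float)
    noise = rng.normal(0.0, sigma, size=mu.shape)
    return np.clip(mu + noise, c_min, c_max)


rng = np.random.default_rng(0)
# Hypothetical damping range of 200 to 3000 N*s/m for both guide shoes.
action = noisy_action([1500.0, 2900.0], sigma=200.0,
                      c_min=200.0, c_max=3000.0, rng=rng)
```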
Training convergence of horizontal vibration controllers
The parameters, network structure, and hardware configuration used in training are detailed in the Appendix. The algorithm underwent 5000 iterations, and Figure 7 shows the cumulative reward profile for each iteration. In the early stages of training, the cumulative reward values exhibit considerable fluctuations, which are mainly due to the strategy network exploring new control strategies. As the training progresses, the cumulative reward values gradually increase and stabilize in the later stages. This indicates that our deep reinforcement learning algorithm learns optimal control strategies to improve the performance of horizontal vibration control in the elevator car.
Figure 7. Total reward curve of the training process.
Simulation result and analysis
Horizontal vibration responses of the car induced by guide rail surface wear
In this study, a horizontal vibration model for a high-speed elevator car is built using Python, with the guide rail unevenness excitation from the section “Horizontal vibration control model for high-speed elevators” as input. We compare passive control, Mixed SH-ADD (Savaresi and Spelta, 2007), PID control, DQN-based control (Feng et al., 2009), and DDPG-based control to analyze the time-domain responses of acceleration under guide rail surface wear (Figure 8(a)). The frequency-domain response of acceleration is shown in Figure 9(a).
Figure 8. Time-domain response of car horizontal vibration.
Figure 9. Frequency-domain response of car horizontal vibration.

The figures show that Mixed SH-ADD, a semi-active control algorithm whose performance approaches the filtering limit, significantly reduces peak vibration acceleration compared with passive control and suppresses horizontal vibration across the entire frequency domain. The PID controller reduces acceleration markedly in the high-frequency band, but the frequency-domain analysis shows that its suppression in the low-frequency region is weaker than that of the other methods. The DQN-based controller produces smaller acceleration oscillations throughout the operating period; however, it can only handle discrete action spaces, which limits its applicability to continuous control problems such as car horizontal vibration suppression. In contrast, the DDPG-based controller handles high-dimensional continuous state-action spaces and achieves the best dynamic control performance among the methods compared. The frequency-domain analysis confirms that the proposed DDPG controller keeps the car’s horizontal vibration amplitude at the lowest level across all frequency bands, indicating its advantage in multidimensional vibration control.
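Percentage reductions of the kind reported for the average horizontal acceleration can be computed from two acceleration records as sketched below. The choice of the mean absolute value as the "average" is an assumption (a root-mean-square measure would be an equally plausible reading), and the signals in the usage lines are synthetic.

```python
import numpy as np


def mean_abs(signal):
    """Mean absolute value of an acceleration record."""
    return float(np.mean(np.abs(np.asarray(signal, dtype=float))))


def reduction_percent(baseline, controlled):
    """Relative reduction of the mean absolute acceleration, in percent."""
    base = mean_abs(baseline)
    return 100.0 * (base - mean_abs(controlled)) / base


t = np.linspace(0.0, 1.0, 1000)
passive = 0.20 * np.sin(2.0 * np.pi * 5.0 * t)      # synthetic baseline
controlled = 0.05 * np.sin(2.0 * np.pi * 5.0 * t)   # synthetic controlled case
gain = reduction_percent(passive, controlled)
```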
Table 1. Comparison of displacement, velocity, and acceleration responses across different control methods.
Horizontal vibration responses of the car induced by guide rail step excitation
To assess the impact of installation errors, the proposed method is tested using guide rail step excitation simulations. Figures 8(b) and 9(b) present the time-domain and frequency-domain responses of horizontal vibration acceleration. Although guide rail step excitation was not included in training, the normalization of the input elevator state data enables the DDPG-based controller to cope with step disturbances caused by installation errors, resulting in better control performance than the other methods.
Verification of generalized performance
To assess the generalization of the DDPG-based controller, its performance was tested on three elevator types under varying load conditions (Figure 10). The controller effectively suppressed horizontal vibrations in all scenarios, demonstrating its ability to adapt robustly to complex disturbances and provide a safe, smooth, and comfortable ride.
Figure 10. Mean acceleration responses influenced by multiple factors.
In summary, we compare the effectiveness of various control methods on the horizontal vibration of the car under guide rail surface wear excitation, guide rail step excitation, and various uncertainties. The proposed DDPG-based method significantly reduces the acceleration peak-to-peak value, shortens stabilization time, and enhances resistance to disturbances. These improvements contribute to smoother operation of the elevator car and increased safety.
Conclusion
To address the horizontal vibration of high-speed elevator cars caused by uncertain excitations from uneven guide rails, nonlinear guide shoes, and complex operating environments, a semi-active control method based on an enhanced DDPG algorithm is proposed. The main conclusions are as follows:
(1) A nonlinear horizontal vibration model of a high-speed elevator car is established, and the uneven excitation of the guide rail is simulated. By considering the nonlinearity of the semi-active guide shoe and the power spectral density characteristics of real guide rail surface wear excitation, the model represents the complex external perturbations experienced by the high-speed elevator during operation and reveals the key factors influencing the car’s horizontal vibrations.
(2) A Markov decision process model for controlling horizontal vibrations in elevator cars is developed. This model characterizes the state space, action space, and reward function related to vibration control, providing a framework for addressing the horizontal vibration of elevator cars with deep reinforcement learning methods.
(3) The DDPG algorithm is enhanced through state normalization to improve the controller’s generalization performance. State data are normalized before being input into the strategy network, enabling the controller to handle high-dimensional input data during training. The enhanced algorithm shows stable control performance across varying environmental conditions, demonstrating adaptability and robustness.
Future research may explore integrating energy-efficient metrics into the control framework and validating the strategy under real-world operational conditions to further enhance its engineering applicability.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Shandong Province (grant number ZR2023ME174).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data to support the findings will be made available upon reasonable request for academic use by contacting the corresponding author.
Appendix
The elevator state parameters and training hyperparameters used in this study are listed in Tables 2 and 3. Figure 11 illustrates the algorithm’s network structure. The network weights were trained for approximately 2 h, with the software and hardware configurations detailed in Table 4.

Table 2. Elevator state parameters.

| Parameters | Unit | A | B | C |
|---|---|---|---|---|
| Mass of car $m_w$ | kg | 1200 | 750 | 2000 |
| Mass of car frame $m_b$ | kg | 750 | 500 | 1000 |
| Mass of roller $m_z$ | kg | 10 | 5 | 15 |
| Stiffness coefficient of rubber block $k_w$ | N/m | 5.0e5 | 4.0e5 | 5.0e5 |
| Damping coefficient of rubber block $c_w$ | N·s/m | 320 | 320 | 320 |
| Stiffness coefficient of guide shoe $k_b$ | N/m | 4.0e4 | 4.0e4 | 5.0e4 |
| Stiffness coefficient of roller $k_z$ | N/m | 6.0e5 | 5.0e5 | 6.0e5 |

Table 3. Hyperparameters used in training.

| Parameters | Value |
|---|---|
| Number of episodes | 500 |
| Batch size | 128 |
| Buffer size | 10e5 |
| Optimizer | Adam |
| Learning rate (Q-network) | 2e-5 |
| Learning rate (Strategy-network) | 2e-6 |
| Discount factor (γ) | 0.95 |
| Soft update rate (τ) | 0.99 |

Figure 11. Structure of networks.

Table 4. Hardware and software configurations used for training.

| Software/Hardware | Version/Model |
|---|---|
| CPU | 11th Gen Intel(R) Core(TM) i5-11400H |
| Python | 3.10 |
| TensorFlow | 2.10.0 |
| GPU | NVIDIA GeForce RTX 3050 |
| CUDA | 11.7 |
| cuDNN | 8.5.0 |
