Abstract
Heating, ventilation, and air conditioning (HVAC) systems offer the greatest potential for energy savings in building services. However, conventional thermostat control in offices often fails to balance comfort and efficiency. To address this issue, a model predictive control (MPC) framework is proposed to improve thermal comfort in office buildings through predictive thermostat regulation. An Extreme Learning Machine (ELM)-based predictive model is developed to forecast indoor thermal comfort conditions, which is embedded within a receding horizon optimization structure to enable real-time control decisions. To efficiently solve the underlying optimization problem, the Gray Wolf Optimizer (GWO) algorithm is adopted due to its favorable convergence characteristics. A high-fidelity EnergyPlus simulation model is constructed to capture the dynamic behavior of the indoor thermal environment, from which comprehensive datasets are generated for model training and validation. The parameters of the ELM model are further refined using GWO to enhance forecasting accuracy. The integrated predictive model and MPC strategy are implemented within a co-simulation environment, enabling bidirectional coupling between the MPC controller and the EnergyPlus thermal model. Furthermore, a pilot field experiment is conducted in a real-world office building to validate the applicability of the system. Simulation and experimental results demonstrate that the proposed approach significantly enhances occupant thermal comfort while maintaining energy efficiency, evidencing the effectiveness of the combined data-driven prediction and bio-inspired optimization strategy. The methodological integration of ELM-based prediction, GWO-driven optimization, and practical field validation represents a novel adaptive control framework for enhancing indoor thermal comfort.
Introduction
The building sector is a major contributor to global energy consumption and environmental impact, accounting for approximately 40% of global energy use and 30% of greenhouse gas (GHG) emissions in industrialized countries, with the built environment contributing to 40% of annual global GHG emissions (Cho et al., 2024). Within this sector, heating, ventilation, and air conditioning (HVAC) systems are responsible for 50%–70% of operational energy use, making them the most energy-intensive component of building services (Yang et al., 2022). Among HVAC systems, room air conditioners (RACs) are widely utilized in small to medium-scale environments, including offices, residential units, and hotels, due to their effectiveness in regulating indoor thermal conditions. By maintaining indoor temperature and humidity within acceptable ranges, RACs play a crucial role in improving both energy efficiency and the thermal comfort of occupants (Yang et al., 2021).
Thermostats, as the key control interface of HVAC systems, have a significant influence on thermal comfort and energy consumption. The integration of advanced control algorithms into thermostat systems has enabled dynamic temperature setpoint adjustment, leading to improved comfort levels and reduced energy usage (Seri et al., 2021). Recent advancements have leveraged building energy simulation tools such as EnergyPlus to develop and test optimization-based thermostat control strategies. These efforts integrate building information models, advanced thermal comfort models, and optimization algorithms to optimize control performance under various indoor and climatic conditions (Bagheri-Esfeh and Dehghan, 2022; Elehwany et al., 2024). EnergyPlus is widely adopted in the building simulation community due to its ability to simulate complex indoor environmental conditions, thermal sensations, and HVAC system performance based on detailed physical and behavioral parameters. Control strategies developed within EnergyPlus have shown strong alignment with real-world building operation outcomes (Wang et al., 2024). For example, Smarra et al. (2018) proposed a machine learning-based model predictive control (MPC) strategy, whose effectiveness was validated using EnergyPlus. Yeon et al. (2019) integrated EnergyPlus and MATLAB via the Building Controls Virtual Test Bed (BCVTB) co-simulation platform to implement artificial neural network-based shutter control. Cetin et al. (2019) developed an on/off air conditioner controller using the built-in energy management system of EnergyPlus, improving HVAC operational efficiency.
Further efforts have expanded control optimization by incorporating multiple system components. Tian et al. (2019) applied multi-objective optimization to jointly adjust HVAC setpoints and ceiling fan speeds during summer operation, achieving significant energy savings. Sun et al. (2024b) analyzed the operating characteristics of split-type air conditioners in offices and formulated a rule-based control strategy for thermostats based on thermal comfort evaluation. These studies demonstrate the potential of combining EnergyPlus with machine learning and advanced control theories to realize energy-efficient and optimized building operation. Thermostat operation in EnergyPlus closely reflects real-world HVAC control logic. Setpoints are used to determine cooling and heating demands based on thermal zone loads. For single setpoint systems, cooling/heating is triggered when the zone temperature deviates from the user-defined temperature. For dual setpoint systems, both heating and cooling thresholds are evaluated to determine system response (Jiang et al., 2025). EnergyPlus thermostats, like physical devices, support schedule-based, zonal, and real-time control (Kontes et al., 2017; Rhodes et al., 2015). In practical applications, however, temperature setpoints are often manually configured by occupants. In office settings, where users may be focused on work activities, manual control frequently results in suboptimal or static settings, which can cause overcooling or overheating, thereby reducing comfort and increasing energy use.
To address these challenges, adaptive and predictive thermostat control strategies are essential for dynamically maintaining indoor thermal comfort while reducing energy consumption. This has led to growing interest in the adoption of advanced control approaches such as rule-based control, model-free control, and MPC, which can outperform traditional static or manual methods (Afram and Janabi-Sharifi, 2014; Michailidis et al., 2023). Rule-based strategies rely on predefined rules to adjust HVAC operation based on thermal states. Initially developed from expert knowledge and system behavior analysis, modern rule-based control has evolved through the integration of machine learning and data mining. For example, Alimohammadisagvand et al. (2018) evaluated multiple rule-based demand response methods for residential HVAC systems, while Sha et al. (2024) proposed a moving-window algorithm for indoor pollutant detection, adjusting airflow rates using data-mined rules. Zhu et al. (2025) enhanced rule-based control using offline-trained machine learning models to improve demand response performance while maintaining computational efficiency. While conventional rule-based systems lack flexibility for dynamic environments, their combination with learning-based algorithms enhances adaptability and extends applicability to more complex systems (Lu et al., 2023).
Model-free control approaches, such as artificial neural networks, fuzzy logic, and reinforcement learning, eliminate the need for explicit mathematical modeling. They are particularly well-suited to nonlinear, time-varying HVAC dynamics (Michailidis et al., 2023). Kumar and Kurian (2023) proposed a neural network-based predicted mean vote (PMV) predictor optimized through Bayesian algorithms to support real-time HVAC control. Safdari et al. (2025) introduced a Weather-Adaptive Fuzzy Control (WAFC) method that incorporates environmental and operational factors to balance comfort and energy use. Nguyen et al. (2024) implemented a deep reinforcement learning approach using a phased policy gradient algorithm to control HVAC systems, achieving effective performance without explicit physical modeling. However, model-free methods may struggle with interpretability, constraint handling, and physical accuracy, which can compromise stability and increase energy consumption.
It is noteworthy that many office buildings operate under an intermittent HVAC regime. Since the HVAC system is typically deactivated during nighttime or unoccupied hours, the building envelope and indoor environment are frequently in a non-steady state at the onset of daily operation, accompanied by significant thermal accumulation. This leads to pronounced transient thermal fluctuations during the initial occupancy period, characterized by a rapid rise in operative temperature and the accumulation of latent heat loads (humidity). Conventional static setpoints or simplistic rule-based control strategies often overlook the building’s thermal inertia and lag effects, making it difficult to provide optimal regulatory responses during such transient phases. Therefore, developing an MPC strategy to forecast environmental evolution trends and execute receding horizon optimization is of substantial practical importance for enhancing both thermal comfort and energy efficiency in intermittent office environments.
Among advanced control strategies, MPC has emerged as a leading solution for predictive and energy-aware HVAC control due to its ability to incorporate system dynamics, constraints, and optimization objectives (Yao and Shekhar, 2021). By minimizing a cost function over a receding horizon, MPC enables anticipatory regulation of HVAC performance (da Fonseca et al., 2021). Recent efforts have improved MPC applicability through integration with deep learning, semantic modeling, and robust control. Tang et al. (2025) proposed a physics-informed deep learning MPC for optimizing VAV systems. Wan et al. (2025) established a semantic information model integrating BIM with MPC configuration. Ostadijafari and Dubey (2021) proposed a tube-based MPC formulation to address uncertainties in building thermal dynamics. The core of MPC lies in accurate modeling of HVAC systems (Bamdad et al., 2023). Physics-based models, derived from thermodynamics and heat transfer principles, provide physically interpretable representations of system behavior. For instance, Jiang et al. (2022) developed a thermodynamic model for thermoelectric cooling systems, while Lee and Lam (2014) proposed a simplified model validated through error analysis. Catano et al. (2013) and Xu and Chen (2013) also advanced physically grounded HVAC models for accurate prediction and optimization.
In contrast, data-driven models use operational data and machine learning techniques to capture system behavior without requiring detailed physical understanding. Yang et al. (2024) employed a quasi-Newton particle swarm optimization algorithm to develop a fuzzy dynamic model for air conditioners. Kocyigit (2015) and Sholahudin et al. (2019) explored neural network-based models for fault diagnosis and efficiency optimization. Zhao et al. (2014) used the extreme learning machine (ELM) to develop fast and accurate models of air conditioner performance, and Shao et al. (2012) constructed component-wise neural network models to enhance control robustness. While knowledge-based models often rely on simplifying assumptions and generalize poorly, data-driven models offer higher predictive accuracy and adaptability for dynamic control tasks. Despite growing research on HVAC modeling, most studies focus on state variables such as efficiency or energy use. However, for building occupants, perceived comfort is often the most critical performance indicator. Therefore, the development of dynamic models specifically aimed at predicting and regulating human thermal comfort is of growing importance.
To address the challenge of real-time thermal comfort regulation in office buildings, this study proposes a predictive thermostat control framework based on MPC, combining a data-driven comfort prediction model and metaheuristic optimization. The main contributions of this study can be highlighted as follows.
(1) Development of an ELM-based thermal comfort prediction model. An Extreme Learning Machine is utilized to construct a high-fidelity predictor that captures the complex nonlinear relationships among indoor environmental variables, occupancy, and PMV indices. This data-driven model facilitates accurate and rapid comfort forecasting suitable for real-time receding horizon control.
(2) Design of a GWO-driven rolling optimization strategy with hardware protection mechanisms. A predictive control strategy driven by the Gray Wolf Optimizer is developed to iteratively compute optimal temperature setpoints. By integrating control action penalties and temporal constraints, this approach effectively balances thermal comfort requirements with energy efficiency and equipment longevity in dynamic indoor environments.
(3) Integration and validation of the proposed framework via co-simulation and field experimentation. The predictive control strategy is implemented within an EnergyPlus-MPC co-simulation environment and further validated through a pilot field experiment in a real-world office building. This comprehensive dual-validation process demonstrates the effectiveness of the method in enhancing occupant comfort and reducing energy consumption while confirming its engineering feasibility on standard embedded hardware.
The remainder of this paper is structured as follows:
Section 2 details the development of the ELM-based comfort prediction model and the MPC optimization mechanism. Section 3 presents a case study using a five-zone office building simulation in EnergyPlus to validate and compare the proposed method against benchmark strategies. Section 4 discusses the strategy’s nonlinear adaptability and multi-objective performance. Finally, Section 5 concludes with insights on limitations and future directions, including multi-zone expansion and renewable energy integration.
Methodology
MPC is an advanced control strategy that leverages a predictive model of system behavior to perform rolling horizon optimization. By continuously solving an optimal control problem at each time step, MPC enables real-time adjustments of control inputs to maintain desired system performance over a prediction horizon. When applied to indoor environmental regulation, particularly in office buildings, MPC requires a dynamic model capable of capturing the nonlinear and time-varying relationships between system inputs and indoor thermal comfort outcomes. This predictive capability is essential for anticipatory regulation and maintaining a thermally comfortable indoor environment with minimal energy overhead.
Prediction model for indoor thermal comfort
To support the implementation of MPC for predictive thermostat control, this study constructs a data-driven indoor thermal comfort prediction model based on the ELM algorithm. The proposed ELM model is designed to learn and approximate the complex nonlinear mapping between HVAC control inputs (e.g. thermostat setpoints) and perceived thermal comfort, measured by indices such as PMV. Owing to its single-hidden-layer feedforward architecture and analytically determined output weights, ELM offers fast training speed and good generalization performance, making it suitable for real-time predictive control scenarios.
In this model, the input features include indoor temperature and humidity, outdoor weather conditions, occupancy rate, and air conditioner power level. The output is the predicted thermal comfort index at the next time step. This framework enables the controller to evaluate the comfort implications of alternative setpoints over the prediction horizon. The structure of the ELM-based thermal comfort prediction model is depicted in Figure 1, illustrating the flow of input variables, hidden layer transformation, and output prediction.

Architecture of the ELM prediction model.
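As a rough illustration of the training principle described above, the following Python sketch builds a single-hidden-layer ELM in which the input weights and biases are drawn at random and left fixed, and only the output weights are computed in closed form. It is a minimal sketch under stated assumptions, not the implementation used in this study: the class name `TinyELM`, the sigmoid activation, the number of hidden nodes, and the small ridge term are illustrative choices. The ridge-regularized normal equations are used here in place of the exact Moore-Penrose pseudoinverse so that the example needs only the standard library.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Hypothetical minimal ELM regressor with one output (e.g. next-step PMV)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are random and never trained.
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.ridge = ridge  # assumed regularization strength
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        # Sigmoid activation of each hidden node for one input vector.
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        n_hidden = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(n_hidden)] for i in range(n_hidden)]
        hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
        self.beta = solve_linear(hth, hty)
        return self

    def predict(self, x):
        return sum(bi * hi for bi, hi in zip(self.beta, self._hidden(x)))


if __name__ == "__main__":
    # Toy one-dimensional regression problem, unrelated to the building data.
    xs = [[-2.0 + 0.1 * i] for i in range(41)]
    ys = [math.sin(x[0]) for x in xs]
    model = TinyELM(n_inputs=1, n_hidden=8).fit(xs, ys)
    print(model.predict([0.5]), math.sin(0.5))
```

Because training reduces to one linear solve, refitting such a model is cheap, which is the property that makes ELM attractive inside a receding horizon loop.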
The input layer of ELM comprises the indoor temperature
where
where
Once the number of hidden-layer nodes is determined, ELM only needs to randomly initialize
where
The minimum is achieved when
ELM-based predictive control of indoor thermal comfort
The concept of receding horizon optimization, introduced by Richalet et al. (1978), laid the theoretical foundation for MPC in process systems. Since then, this framework has garnered significant attention and has been extensively developed and applied across a wide range of engineering domains. Building upon this theoretical paradigm, researchers have proposed various predictive control approaches tailored to different applications. Despite differences in model structure and optimization techniques, these methods generally follow a unified control architecture comprising three key components: a predictive model, a receding horizon optimization process, and a feedback correction mechanism.
In the context of indoor thermal environment regulation within office buildings, the MPC-based thermostat control strategy similarly follows this tripartite structure (Wang et al., 2023). First, a reliable predictive model is required to estimate future thermal comfort states based on current environmental conditions and HVAC operational variables. Second, an optimization algorithm is employed to determine the optimal sequence of temperature setpoints over the prediction horizon, aiming to maximize comfort while minimizing energy consumption. Finally, a feedback mechanism corrects for model inaccuracies and external disturbances by updating the control inputs in real time based on observed system behavior.
In this study, the predictive model is realized using ELM, as detailed in Section 2.1. The optimization process is conducted in a rolling horizon manner, where at each control interval, the system solves for a sequence of optimal setpoints using the comfort forecasts provided by the ELM model. Only the first setpoint of the optimized sequence is applied to the system, and the process is repeated at the next interval to reflect updated measurements and predictions. This feedback mechanism ensures adaptability to dynamic indoor conditions and occupant behavior.
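The receding horizon procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the controller used in this study: the one-step comfort model `predict_pmv`, the setpoint grid, the horizon length, the move penalty weight, and the exhaustive search (used here in place of GWO to keep the example short) are all hypothetical stand-ins.

```python
from itertools import product

CANDIDATES = [22.0, 23.0, 24.0, 25.0, 26.0]  # hypothetical setpoint grid, degC
HORIZON = 3                                  # hypothetical prediction horizon


def predict_pmv(pmv, setpoint):
    """Hypothetical one-step comfort model standing in for the ELM predictor."""
    target = (setpoint - 24.0) / 3.0  # assumed steady-state PMV for a setpoint
    return pmv + 0.5 * (target - pmv)


def sequence_cost(pmv, prev_setpoint, sequence, move_weight=0.01):
    """Sum of squared predicted PMV plus a penalty on setpoint changes."""
    cost = 0.0
    for setpoint in sequence:
        pmv = predict_pmv(pmv, setpoint)
        cost += pmv ** 2 + move_weight * (setpoint - prev_setpoint) ** 2
        prev_setpoint = setpoint
    return cost


def mpc_step(pmv, prev_setpoint):
    """Optimize the whole sequence, then return only its first setpoint."""
    best = min(product(CANDIDATES, repeat=HORIZON),
               key=lambda seq: sequence_cost(pmv, prev_setpoint, seq))
    return best[0]


def run_closed_loop(pmv0, setpoint0, steps):
    """Repeat the optimization at every interval from the latest state."""
    pmv, setpoint, history = pmv0, setpoint0, []
    for _ in range(steps):
        setpoint = mpc_step(pmv, setpoint)
        pmv = predict_pmv(pmv, setpoint)  # the toy plant equals the model here
        history.append((setpoint, pmv))
    return history


if __name__ == "__main__":
    for setpoint, pmv in run_closed_loop(pmv0=1.0, setpoint0=26.0, steps=6):
        print(setpoint, round(pmv, 3))
```

In a real deployment the state passed to `mpc_step` would come from measurements rather than from the model itself, which is what provides the feedback correction.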
The overall control framework, which integrates ELM-based comfort prediction, rolling optimization, and real-time feedback correction, is illustrated in Figure 2. This architecture forms the core of the proposed data-driven predictive control strategy and enables dynamic regulation of indoor thermal comfort in office environments.

Air conditioner control flowchart.
In Figure 2,
The MPC process for indoor office comfort is as follows: Over a prediction horizon of
The controller utilizes GWO to solve for the thermostat setpoints of indoor air conditioner over a prediction horizon of
GWO is based on the hunting behavior of wolf packs, in which the wolves are divided into four hierarchical levels, each strictly executing their respective tasks. The highest rank, denoted as
In the hunting process,
where
For the framework of ELM-MPC, a nonlinear optimization objective function is built, with GWO employed to solve for the optimal thermostat temperature setpoint sequence over the prediction horizon, as defined by:
where
where
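The hierarchy-guided search used in this section can be sketched in Python as follows. This is a minimal, generic version of the gray wolf position update under stated assumptions, not the optimizer configuration used in this study: the function name `gwo_minimize`, the population size, the iteration count, the linear decay of the coefficient `a` from 2 to 0, the re-ranking of the current pack at every iteration, and the toy quadratic cost are illustrative choices. In the proposed framework the cost would instead be the ELM-based MPC objective evaluated over the prediction horizon.

```python
import random


def toy_cost(setpoints):
    """Hypothetical cost: squared deviation from an assumed neutral 24 degC."""
    return sum((s - 24.0) ** 2 for s in setpoints)


def gwo_minimize(cost, lower, upper, n_wolves=20, n_iter=100, seed=0):
    """Minimize `cost` inside box bounds with a basic gray wolf update."""
    rng = random.Random(seed)
    dim = len(lower)
    wolves = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
              for _ in range(n_wolves)]
    best, best_cost = None, float("inf")
    for it in range(n_iter):
        # The three best wolves of the current pack lead the hunt.
        ranked = sorted(wolves, key=cost)
        alpha, beta, delta = (w[:] for w in ranked[:3])
        alpha_cost = cost(alpha)
        if alpha_cost < best_cost:
            best, best_cost = alpha[:], alpha_cost
        a = 2.0 * (1.0 - it / n_iter)  # exploration factor decays from 2 to 0
        for wolf in wolves:
            for d in range(dim):
                pos = 0.0
                for leader in (alpha, beta, delta):
                    big_a = 2.0 * a * rng.random() - a
                    big_c = 2.0 * rng.random()
                    dist = abs(big_c * leader[d] - wolf[d])
                    pos += leader[d] - big_a * dist
                # Average the three leader-guided moves and clip to the bounds.
                wolf[d] = min(max(pos / 3.0, lower[d]), upper[d])
    final = min(wolves, key=cost)
    if cost(final) < best_cost:
        best, best_cost = final[:], cost(final)
    return best, best_cost


if __name__ == "__main__":
    sequence, value = gwo_minimize(toy_cost, [20.0] * 3, [28.0] * 3)
    print([round(s, 2) for s in sequence], value)
```

The box bounds play the role of the admissible setpoint range, and the length of the decision vector corresponds to the number of setpoints in the horizon.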
Case study
Parameter settings
To evaluate the performance of the proposed ELM-based MPC strategy, a five-zone office building is adopted as the reference simulation model. The building configuration is derived from a standardized commercial building prototype developed and periodically revised by the U.S. Department of Energy in collaboration with industry experts. The selected site represents a cold and dry climatic region, providing a representative testing environment for thermostat control under heating-dominant conditions.
The office building has a floor plan of 18.46 m × 27.69 m, with a floor-to-ceiling height of 3.05 m. The thermal model of the building is constructed using the OpenStudio–SketchUp Plugin, which enables detailed spatial zoning and component specification. The geometric and zoning layout of the building is depicted in Figure 3. For simulation purposes, the building is divided into five thermal zones: one central zone and four perimeter zones. The central zone is characterized by relatively stable indoor thermal conditions, due to its limited exposure to external disturbances such as solar gains and infiltration. This thermally buffered environment allows for the isolation and evaluation of control algorithm performance with minimal interference from unpredictable external influences, such as wind-driven ventilation or direct solar radiation. As such, the central zone serves as the primary testbed for assessing the accuracy, adaptability, and energy efficiency of the proposed control strategy.

Target area diagram.
By focusing on a representative office setting with clearly defined boundary conditions and thermal behaviors, the simulation framework ensures both reproducibility and relevance to practical building control scenarios.
The building depicted in Figure 3 was modeled using EnergyPlus version 22.1. To comprehensively capture the dynamic thermal characteristics of the built environment, the modeling process incorporated detailed material properties of the envelope, including thickness, thermal conductivity, density, and specific heat capacity. The building envelope primarily consists of (1) exterior walls, (2) roof, (3) interior walls, (4) ceiling, (5) ground, (6) external windows, and (7) doors. The specific thermophysical parameters are summarized in Table 1.
Building envelope parameters.
Within the presented building model, continuous dynamic heat exchange is maintained between the Core Zone and its surrounding areas through envelope components including internal walls, floor slabs, and ceilings. According to the data presented in Table 1, a typical lightweight envelope system primarily composed of gypsum boards and partition walls is utilized, resulting in a relatively limited sensible heat storage capacity within the structural elements. Fundamental thermal inertia is nevertheless provided to the space by the internal thermal mass and the 0.102 m thick concrete floor slab. Furthermore, the response delay and evolutionary characteristics of the indoor temperature under transient conditions are shaped by the thermal equilibrium process of the large indoor air volume, coupled with the operational constraints of the HVAC system upon reaching capacity limits (Ghofrani et al., 2020; Verbeke and Audenaert, 2018). This physical environment, established through the coupling of envelope thermal performance, internal mass components, and HVAC system dynamics, serves as a rigorous basis for evaluating the effectiveness of control strategies leveraging structural thermal inertia for peak load shifting and the maintenance of thermal comfort.
The operation of room air conditioners is subject to multiple internal and external thermal influences, including occupant-generated heat, internal loads from plug-in equipment, and air exchange resulting from room infiltration. Among these, occupant presence and activity patterns—characterized by metabolic intensity, spatial distribution, and duration of stay—exert a considerable impact on the indoor thermal environment. Infiltration, as an uncontrolled ventilation mechanism, affects both the thermal and moisture balance of indoor air, thereby altering the cooling or heating demand imposed on HVAC systems. Similarly, internal heat gains from lighting and appliances elevate the operative air temperature, directly influencing HVAC system activation frequency and energy consumption levels.
To ensure realistic and robust simulation results, the parameters related to occupancy, infiltration rates, internal equipment and lighting loads, and system operation schedules must be carefully calibrated. These inputs form the basis for evaluating HVAC energy consumption under various occupancy and usage scenarios. By systematically optimizing these parameters, it becomes possible to assess thermal comfort outcomes and energy efficiency trade-offs in typical office environments. The key parameter settings used in this study are summarized in Table 2, providing a foundational reference for model configuration and simulation-based analysis.
Indoor environment parameters.
In this study, the operation of the air conditioning system strictly adheres to a typical office occupancy schedule to simulate realistic intermittent thermal load characteristics. The system is activated at a specific time each day, by which point the indoor environment has typically drifted away from the comfort zone due to nocturnal heat gains, resulting in substantial sensible and latent load fluctuations during the initial start-up phase. Although the core zone is relatively isolated from direct external meteorological disturbances, the internal transient perturbations, such as those triggered by sudden occupancy, equipment activation, and the cooling response of the HVAC system, remain highly nonlinear and time-varying. This provides a representative test scenario for validating the superiority of the proposed MPC-GWO algorithm in managing non-steady-state thermal processes.
The air conditioning system employed within the indoor environment depicted in Figure 3 is a packaged rooftop air conditioner (PSZ-AC). This PSZ-AC system operates as a self-contained HVAC unit designed to provide both cooling and heating functionalities for single-zone applications. Typically, the unit is housed within a weatherproof cabinet and installed either on the rooftop or at ground level, with conditioned air delivered to the indoor space through an associated duct network. The operational principle and key components of the PSZ-AC system are schematically illustrated in Figure 4.

Working principle of PSZ-AC.
As illustrated in Figure 4, the PSZ-AC integrates both outdoor and indoor units into a compact, single-system configuration, rendering it well-suited for applications in small office spaces and residential buildings. The system operates on the principles of a vapor compression refrigeration cycle. Key components include the compressor, condenser, expansion valve, evaporator, and a four-way reversing valve, which collectively facilitate the thermodynamic process. Within the condenser, the refrigerant undergoes a phase change, releasing or absorbing heat to or from the ambient outdoor environment depending on the mode of operation. Subsequently, the refrigerant passes through the expansion valve before entering the direct-expansion evaporator located on the indoor side. Conditioned air is propelled through the evaporator coil by a supply fan, enabling heat exchange between the refrigerant and indoor air.
Thermal energy transfer occurs via refrigerant piping between outdoor and indoor components, and the conditioned air is delivered into the occupied zone. Integrated temperature sensors continuously monitor indoor air temperature, enabling the control system to dynamically adjust operational parameters—including compressor speed, four-way valve position, and air supply rate—within predefined setpoint ranges. This closed-loop control mechanism ensures accurate thermostatic regulation of indoor temperature, maintaining occupant comfort in both cooling and heating modes.
During the HVAC system modeling and control simulation, several idealized assumptions were made for the Packaged Single Zone Air Conditioner (PSZ-AC) in order to focus on validating the logical effectiveness of the MPC framework. Specifically, the cooling and heating outputs are assumed to respond instantaneously to changes in control setpoints, neglecting time delays associated with compressor startup, refrigerant cycle stabilization, heat exchanger thermal processes, and duct distribution. While these assumptions simplify the transient physical processes of the terminal equipment, they effectively eliminate low-level dynamic disturbances. This isolation allows for a more objective evaluation of how setpoint optimization strategies impact the macro-level indoor thermal balance and PMV indices, ensuring the clarity and reliability of the control performance assessment. The operational parameters of the PSZ-AC system are summarized in Table 3.
Operational parameters of the PSZ-AC system.
Validation of the thermal comfort prediction model
This study utilized the EnergyPlus Co-simulation Toolbox to construct a platform for data acquisition and closed-loop control verification (Dostal and Baumelt, 2019). The co-simulation framework was deployed within a standard computing environment, with specific hardware configurations detailed in Table 4. The control algorithms were developed in MATLAB 2021b, establishing real-time bidirectional data exchange with EnergyPlus 22.1 via the BCVTB middleware. To comprehensively evaluate the generalization capability of the prediction model and the real-time feasibility of the control strategy proposed in Section 2, a full-year simulation was conducted across all operating conditions using the selected reference building model.
Data collection ranges and computational environment.
Table 4 presents the detailed specifications of the data acquisition scope and the computational environment used in the experiments. Performance tests indicated that training the ELM model on the specified platform required only 0.9724 seconds, which suggests that the model could be retrained within a single sampling interval. Furthermore, a single optimization step of the ELM-MPC controller was completed in approximately 1.25 seconds. This is more than two orders of magnitude shorter than the 10-minute control interval (600 s / 1.25 s = 480), which confirms the potential of the proposed strategy for real-time operation at the algorithmic level.
Using EnergyPlus, a total of 52,560 simulation data points were generated over a 1-year period, capturing variations in indoor and outdoor environmental conditions as well as corresponding human thermal comfort responses. This comprehensive dataset reflects a wide range of operational scenarios, enabling robust model development and validation. For the purpose of model training and evaluation, the dataset was divided into two subsets: the first 36,792 data points (70%) were used for model training and parameter identification, while the remaining 15,768 data points (30%) were reserved for testing to assess the generalization performance of the model. The parameter optimization for the prediction model was performed using the ELM algorithm, which efficiently estimates the network weights to minimize prediction error. To evaluate the accuracy of the trained ELM model, 100 data points were randomly selected from each of the training and testing datasets. For each subset, the predicted PMV values were compared against the corresponding actual values derived from the EnergyPlus simulations. The results were visualized as curves, allowing a direct comparison between predicted and reference PMV values and thus providing insight into the predictive capability of the model under varying indoor thermal conditions.
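The dataset size and split quoted above can be checked with a few lines of arithmetic. The sketch assumes a 10-minute sampling interval, equal to the control interval stated earlier, and a non-leap year of 365 days; both are assumptions used only to reproduce the reported totals, and the variable names are illustrative.

```python
# Assumed sampling interval (minutes) and simulation length (days).
SAMPLING_MINUTES = 10
DAYS = 365

samples_per_hour = 60 // SAMPLING_MINUTES
total_points = DAYS * 24 * samples_per_hour

# Reported chronological split: first block for training, remainder for testing.
train_points = 36_792
test_points = total_points - train_points
train_fraction = train_points / total_points

print(total_points, train_points, test_points, train_fraction)
```

Under these assumptions the totals match the reported 52,560 points, and the split corresponds to 70% training and 30% testing data taken in chronological order.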
As illustrated in Figures 5 and 6, the ELM prediction model demonstrates a high level of accuracy in predicting the subsequent time step’s PMV by leveraging current inputs, including indoor and outdoor temperature and humidity conditions, occupancy rate, air conditioner power consumption, and the present PMV value. This predictive capability enables the model to capture the nonlinear and dynamic relationships between environmental variables and occupants’ thermal comfort.

Training dataset prediction results.

Testing dataset prediction results.
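To quantitatively assess the model’s performance, three standard statistical metrics were employed: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the Coefficient of Determination ($R^2$). In their standard form, these are

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where $y_i$ is the reference PMV value, $\hat{y}_i$ is the predicted PMV value, $\bar{y}$ is the mean of the reference values, and $n$ is the number of samples.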
Prediction errors of ELM prediction model.
As presented in Table 5, the proposed ELM-based prediction model exhibits strong performance across both the training and test datasets. The low values of MAE and RMSE, combined with R
Simulation and comparative analysis of control strategies
To assess the performance of the proposed control strategy, it was embedded within a co-simulation framework integrating MATLAB and EnergyPlus. This configuration allowed the model predictive controller to interact dynamically with the building thermal environment model in real time. Within the co-simulation loop, EnergyPlus receives the thermostat setpoints generated by the external MPC algorithm and returns sequential simulation outputs, including indoor and outdoor air temperatures, occupant thermal comfort indices (e.g. PMV), HVAC system energy consumption, and occupancy profiles. To evaluate controller performance under representative extreme conditions, two design days were selected based on the standard definitions provided by the reference building model: the winter design day (December 21, with an outdoor dry-bulb temperature of
For validation purposes, the proposed control strategy was benchmarked against two alternative approaches: (1) the rule-based temperature setpoint control method developed by Sun et al., and (2) the thermal comfort-based temperature setting strategy recommended in ASHRAE Standard 55. This comparative analysis facilitates a comprehensive assessment of energy performance and thermal comfort compliance across varying operational scenarios.
The proposed MPC framework incorporates GWO within a rolling horizon optimization mechanism to continuously refine temperature setpoints in response to dynamic environmental and occupancy conditions. As depicted in Figures 7 to 12, the MPC-based controller adaptively adjusts thermostat setpoints within a narrow and responsive range, effectively maintaining PMV within the ASHRAE 55 recommended comfort band of

Summer typical day personnel comfort and indoor occupancy.

Summer typical day air conditioner energy consumption and setpoints.

Summer typical day zone operative temperature and relative humidity.

Winter typical day personnel comfort and indoor occupancy.

Winter typical day air conditioner energy consumption and setpoints.

Winter typical day zone operative temperature and relative humidity.
In Figure 7, on a typical summer day with rising outdoor temperatures and full office occupancy, the PMV values under both the Baseline and ASHRAE 55 strategies exhibit a distinct upward trend, climbing above 0.5 and approaching 1.0. The rule-based control from Sun et al. demonstrates a limited regulatory effect, partially mitigating this trend but still failing to maintain comfort within the optimal (−0.5, +0.5) range. This failure can be attributed to the inability of traditional fixed-setpoint controls to anticipate the cumulative effects of thermal loads. In contrast, the proposed ELM-MPC strategy effectively suppresses these fluctuations through its predictive control actions.
Figure 8 illustrates the control mechanism responsible for the superior comfort performance. In contrast to the static setpoint of the Baseline strategy, the setpoint trajectory generated by the MPC is highly dynamic. During periods of lower load, such as morning hours or midday breaks, the MPC raises the setpoint temperature (to 24°C–
Figure 9 highlights the discrepancy between air temperature and operative temperature, a key challenge in thermal control. This discrepancy is caused by thermal radiation from the building envelope as it releases stored solar heat, which elevates the operative temperature even when the air temperature is stable. Under the Baseline strategy, this effect leads to a significant and undesirable rise in the afternoon operative temperature. In contrast, the ELM-MPC strategy leverages its predictive model to anticipate this delayed radiant heat gain. It compensates proactively by lowering the air temperature, thus maintaining a stable operative temperature. A secondary benefit of this precise control is improved humidity regulation; the strategy avoids the dry discomfort associated with excessive dehumidification, keeping relative humidity within a comfortable range.
Figure 10 illustrates the PMV control performance of the strategies during a typical winter day. The PMV curves for the Baseline and ASHRAE 55 strategies exhibit a distinct downward trend, frequently falling below the lower comfort threshold of −0.5. This failure indicates that maintaining a fixed indoor air temperature alone is insufficient to counteract severe external cold loads. The rule-based control strategy from Sun et al. succeeds in elevating overall PMV levels, but its performance is compromised by significant overshooting. In contrast, the ELM-MPC strategy is robust: it maintains the PMV within the optimal comfort range of (−0.5, +0.5) throughout the day, including the challenging morning start-up and evening cooling phases, thereby preventing discomfort caused by perceived coldness.
Figure 11 reveals that winter setpoint optimization operates on a logic distinct from that of summer. Rather than simply pursuing high setpoints for heating, the MPC strategy seeks an energy-efficient balance by determining the lowest possible energy path that still meets PMV requirements. Unlike the rule-based approach from Sun et al., the setpoint modulation of the MPC is gradual and smooth, which avoids the power spikes caused by aggressive HVAC operation. When occupancy levels drop or thermal loads diminish, the controller promptly adjusts the setpoint downward to minimize unnecessary heating output.
Figure 12 illustrates the detailed thermal environment during winter conditions. Under the Baseline strategy, the operative temperature exhibits a marked decline. This drop is primarily driven by low surface temperatures on external walls and windows, which induce a significant cold radiation effect. By proactively elevating the air temperature setpoint to counteract this radiative cooling, the ELM-MPC strategy consistently maintains an operative temperature 1°C–
To quantitatively evaluate the performance of the proposed ELM-MPC control strategy, a comparative analysis was conducted based on air conditioner power consumption and thermal comfort levels during the full simulation period under typical summer and winter design conditions. Thermal comfort was classified according to the evaluation standard for indoor thermal environment in civil buildings (Li et al., 2014), where a PMV value within the range of
Quantitative performance statistics of different strategies.
As detailed in Table 6, where bold values indicate the optimal performance achieved among the evaluated strategies for each specific metric, the ELM-MPC strategy demonstrates a superior ability to balance energy efficiency and thermal comfort under extreme seasonal conditions, significantly outperforming conventional methods. In the summer design-day scenario, the proposed strategy excelled as a balanced solution. For energy consumption, it used only 10.91 kWh, marking substantial reductions of 36.3%, 21.1%, and 16.2% compared to the baseline control (17.12 kWh), ASHRAE Standard 55 (13.83 kWh), and Sun et al. (13.02 kWh), respectively. This efficiency did not compromise comfort. The strategy sustained Level 1 comfort (PMV within (−0.5, +0.5)) for 280 minutes, a duration that was 16.7%–33.3% longer than the three benchmarks. Furthermore, it extended the Level 2 acceptable duration to 1350 minutes, matching ASHRAE 55 and surpassing the baseline and Sun et al. by 23.9% and 16.4%. This comprehensive improvement highlights how the pre-cooling capability of ELM-MPC effectively counteracts building thermal inertia during summer start-up, thereby preventing afternoon overheating.
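The summer energy savings quoted above follow directly from the reported daily totals. The short sketch below recomputes them; the helper name `reduction_pct` is hypothetical, and the kWh figures are the ones stated in the text.

```python
def reduction_pct(reference_kwh, proposed_kwh):
    """Relative energy saving of the proposed strategy versus a reference, in percent."""
    return 100.0 * (reference_kwh - proposed_kwh) / reference_kwh


# Summer design-day energy use reported in the text (kWh).
summer_benchmarks = {"Baseline": 17.12, "ASHRAE 55": 13.83, "Sun et al.": 13.02}
elm_mpc_kwh = 10.91

summer_savings = {
    name: round(reduction_pct(kwh, elm_mpc_kwh), 1)
    for name, kwh in summer_benchmarks.items()
}
for name, pct in summer_savings.items():
    print(f"{name}: {pct}% less energy with ELM-MPC")
```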
Under winter design-day conditions, by contrast, the strategy prioritized thermal comfort without excessive energy use. Although the ASHRAE 55 strategy consumed the least energy (5.49 kWh), this advantage is misleading because it was achieved at the severe expense of occupant comfort. The ELM-MPC strategy consumed a moderate 6.54 kWh, which was still 9.7% and 5.9% lower than the baseline and Sun et al., respectively, while delivering far better comfort. It sustained Level 1 comfort for 680 minutes, whereas the baseline and ASHRAE 55 never reached this state (0 minutes). For Level 2 comfort, it provided 830 minutes of acceptable conditions, 16.6 times the 50 minutes achieved under ASHRAE 55. This robust performance is attributed to the strategy’s ability to mitigate cold radiation from the building envelope, a capability that the fixed-setpoint ASHRAE 55 approach lacks in severe cold.
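The comfort durations reported in Table 6 can be obtained by counting time steps whose PMV falls inside each band. The sketch below shows one way to do this. It assumes a 10-minute step and treats Level 2 as cumulative (all steps with PMV in (−1, +1), including Level 1 steps); this reading is inferred from the reported summer durations (280 and 1350 minutes within a 1440-minute day) and is not stated explicitly in the text. The function name and the sample PMV values are hypothetical.

```python
def comfort_minutes(pmv_series, step_minutes=10):
    """Return (Level 1 minutes, Level 2 minutes) for a PMV time series.

    Level 1: PMV in (-0.5, +0.5). Level 2: PMV in (-1, +1), counted
    cumulatively so that every Level 1 step is also a Level 2 step (assumption).
    """
    level1 = sum(step_minutes for p in pmv_series if -0.5 < p < 0.5)
    level2 = sum(step_minutes for p in pmv_series if -1.0 < p < 1.0)
    return level1, level2


# Hypothetical PMV samples, one per 10-minute step.
sample_pmv = [0.0, 0.4, 0.7, 1.2, -0.6, -1.1]
level1_min, level2_min = comfort_minutes(sample_pmv)
print(level1_min, level2_min)
```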
Experimental validation and comparative analysis
To further validate the feasibility and effectiveness of the proposed predictive control strategy in a real-world environment, the radiation-convection dual-terminal air-source heat pump experimental platform at the Shandong Key Laboratory of Smart Buildings and Energy Efficiency was used for data acquisition and for testing the control strategy. The experimental setup is depicted in Figure 13.

On-site photograph of the experimental setup.
The experimental site is a typical office space measuring 8.0 m by 5.2 m, divided into two independently temperature-controlled zones designated East and West. A single air-source heat pump unit serves as the heating and cooling source. The system uses a radiant-convective dual-terminal configuration that combines three fan coil units (FCUs), which handle sensible and latent heat loads, with a radiant floor system. Full variable-frequency drive capability allows flexible operation. During testing of the proposed control strategy, only the fan coil supply air mode was activated, and 24-hour experiments were conducted in heating mode. Figure 14 illustrates the air conditioning system used in the test facility.

Schematic diagram of the experimental radiant-convective dual-terminal HVAC system.
The data acquisition and control system of the experimental platform is built on a distributed IoT architecture and uses a Raspberry Pi 4 Model B as the edge computing node (Xing et al., 2022). High-precision sensors collect real-time indoor temperature and humidity data, which are transmitted via an RS-485 bus. The control algorithm runs on the host computer, and the calculated optimal setpoint is transmitted wirelessly to the Raspberry Pi, which relays it to the air conditioning terminal units as Modbus RTU commands to control the terminal fans. Table 7 lists the air-source heat pump parameters, and Table 8 lists the sensor models, measurement ranges, and accuracy specifications.
Air-source heat pump parameters.
Sensor types and parameters.
The experiment was conducted under winter operating conditions, with external meteorological parameters representative of a typical cold winter in the Jinan region. The controller performed optimization calculations every 10 minutes, dynamically adjusting the heating setpoint of the air conditioning system according to real-time measurements of indoor temperature, humidity, and occupancy. Normal office occupancy and electrical equipment operation were maintained throughout the experiment to introduce realistic thermal disturbances and thereby evaluate the control strategy under transient conditions.
As illustrated in Figures 15 and 16, the temperature setpoint output by the controller exhibits pronounced pre-regulation characteristics. Prior to personnel arrival at 8:00, the controller smoothly raises the set temperature from the night-time energy-saving level of approximately 18°C–

Fan temperature setpoint and PMV curve for zone 1.

Fan temperature setpoint and PMV curve for zone 2.
The evolution of the indoor thermal environment is further illustrated in Figures 17 and 18. The actual indoor air temperature closely follows the setpoint trajectory, demonstrating excellent tracking performance. The control action penalty term and the 10-minute control interval produce a smoother temperature profile, effectively eliminating the sawtooth oscillations typically observed in conventional fixed-frequency air conditioning systems. This smooth regulation contributes to extending the operational lifespan of the variable-frequency compressor. Although indoor relative humidity naturally decreases with rising temperature, it remains within the optimal range of 30%–45% throughout the heating period. The proposed controller therefore avoids the indoor dryness and discomfort associated with excessive winter heating, indicating that thermal comfort is sustained without compromising humidity-related environmental quality.

Indoor air temperature and relative humidity fluctuations measured in zone 1.

Indoor air temperature and relative humidity fluctuations measured in zone 2.
Figure 19 shows the real-time power consumption profile of the air conditioning system recorded during the experiment. The air-source heat pump exhibits the continuous regulation typical of variable-frequency operation: rather than switching abruptly between zero and the rated value, the power output is modulated smoothly in line with the controller’s setpoint commands (see Figures 15 and 16) and with dynamic variations in indoor and outdoor thermal loads. This continuous operating mode validates the 10-minute control step and the action penalty mechanism at the hardware level. It also helps to stabilize the grid load and reduces the risk of mechanical wear on the compressor. Energy consumption is relatively high during night-time hours (00:00–08:00). This is primarily attributed to the degradation of the heat pump’s Coefficient of Performance (COP) at the very low outdoor temperatures of winter nights. In the absence of solar radiation and of internal heat gains from occupants and lighting, more electrical energy is required to sustain the night-time set temperature. Overall, the observed power curve confirms that the proposed control strategy achieves smooth, fine-grained actuation under complex environmental conditions while ensuring thermal comfort.

Real-time power consumption fluctuations exhibited by the air conditioning system.
Discussion
The proposed control strategy demonstrates a substantial enhancement in indoor thermal comfort while maintaining energy efficiency, confirming its effectiveness in balancing occupant satisfaction and HVAC performance. By leveraging the EnergyPlus simulation platform for precise PMV computation, the method integrates an Extreme Learning Machine (ELM) for accurate prediction of thermal comfort and employs a Gray Wolf Optimizer (GWO) to enable rolling horizon optimization within the proposed model predictive control (MPC) framework. Through iterative optimization and feedback correction, the system adaptively adjusts thermostat setpoints in response to environmental and occupancy variations, thereby sustaining thermal comfort within recommended ranges without incurring excessive air conditioner energy consumption. Unlike conventional fixed-setpoint or rule-based strategies, the proposed controller considers both the physical constraints of thermostat operation and the dynamic response of indoor comfort conditions. Compared to the rule-based method developed by Sun et al., the ELM-MPC strategy exhibits approximately 30% improvement in control precision, reflecting its superior responsiveness to transient disturbances and variable indoor loads.
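The interplay between prediction and rolling horizon optimization described above can be sketched in a few lines. The code below is an illustrative toy, not the controller used in this study: `gwo_minimize` is a compact version of the standard Gray Wolf Optimizer update, `predict_pmv` is a hypothetical linear surrogate standing in for the trained ELM, and the cost weights, setpoint bounds (18°C to 26°C), and horizon length are assumptions.

```python
import numpy as np


def gwo_minimize(cost, lower, upper, n_wolves=15, n_iter=40, seed=0):
    """Compact Gray Wolf Optimizer: wolves move toward the three best solutions."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    wolves = rng.uniform(lower, upper, size=(n_wolves, lower.size))
    leaders = np.empty((0, lower.size))
    leader_cost = np.empty(0)
    for it in range(n_iter):
        fitness = np.array([cost(w) for w in wolves])
        # Keep the three best solutions seen so far (alpha, beta, delta).
        pool = np.vstack([leaders, wolves])
        pool_cost = np.concatenate([leader_cost, fitness])
        top = np.argsort(pool_cost)[:3]
        leaders, leader_cost = pool[top], pool_cost[top]
        a = 2.0 * (1.0 - it / n_iter)  # exploration factor decays from 2 toward 0
        new_pos = np.zeros_like(wolves)
        for leader in leaders:
            r1 = rng.random(wolves.shape)
            r2 = rng.random(wolves.shape)
            A = 2.0 * a * r1 - a
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)
            new_pos += leader - A * D
        wolves = np.clip(new_pos / 3.0, lower, upper)
    return leaders[0], float(leader_cost[0])


def predict_pmv(pmv, setpoint):
    """Hypothetical one-step surrogate (heating mode) standing in for the ELM."""
    return 0.7 * pmv + 0.1 * (setpoint - 22.0)


def horizon_cost(setpoints, pmv0):
    """Comfort deviation plus a small, assumed energy proxy over the horizon."""
    pmv, total = pmv0, 0.0
    for u in setpoints:
        pmv = predict_pmv(pmv, u)
        total += pmv**2 + 0.01 * (u - 18.0)
    return total


def run_mpc(pmv0, n_steps=6, horizon=3):
    """Receding horizon loop: optimize a sequence, apply only its first setpoint."""
    pmv, applied = pmv0, []
    for k in range(n_steps):
        best_u, _ = gwo_minimize(lambda u: horizon_cost(u, pmv),
                                 [18.0] * horizon, [26.0] * horizon, seed=k)
        applied.append(float(best_u[0]))
        pmv = predict_pmv(pmv, best_u[0])  # the toy plant equals the model here
    return applied, pmv


applied_setpoints, final_pmv = run_mpc(pmv0=-1.0)
print([round(u, 2) for u in applied_setpoints], round(final_pmv, 3))
```

In a co-simulation or field deployment, the line that updates `pmv` would instead read the measured state returned by the building model or sensors, which is where the feedback correction enters.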
From a comparative perspective, the strategy offers two primary advantages over traditional fixed-temperature control:
(i) Improved thermal comfort: By dynamically adjusting setpoints in response to fluctuating outdoor and indoor conditions, the controller significantly increases the duration in which PMV remains within the Level 2 comfort band of (−1, +1). Static setpoint strategies, in contrast, lack the adaptability to maintain comfort across diverse operating conditions.
(ii) Enhanced energy efficiency: On the summer design day, the proposed method achieves a 21.1% reduction in air conditioner energy consumption relative to the standard ASHRAE 55-based control (10.91 kWh versus 13.83 kWh), demonstrating the energy-saving potential of data-driven predictive control.
On a typical summer day, the four control strategies exhibit distinct differences in their ability to maintain comfort. Driven by rising outdoor temperatures and the accumulation of internal heat loads from occupants, PMV values under the Baseline and ASHRAE 55 strategies drift significantly upward. They frequently breach the +0.5 threshold and approach +1.0, signifying an uncomfortably warm environment. The primary cause of this performance gap is the reactive nature of traditional fixed setpoint control. Such systems cannot anticipate the delayed heating effects resulting from the thermal inertia of the building envelope. Although the rule-based strategy from Sun et al. offered improved flexibility and reduced PMV fluctuations compared to fixed methods, it still suffered from significant oscillatory behavior. This limitation stems from the reliance on predefined logic thresholds, which cannot account for continuous future changes in the environment. As a result, maintaining precise thermal neutrality under complex dynamic loads remains difficult for these rule-based systems. In contrast, the ELM-MPC strategy proposed in this study demonstrated superior control performance. Benefiting from the precise predictions of the ELM model and the rolling optimization capability of the GWO algorithm, this strategy enables the early identification of thermal load trends and facilitates proactive intervention. It consistently maintains the PMV tightly around zero, effectively suppressing significant thermal comfort fluctuations. This capability ensures a high-quality thermal environment in the office throughout the day.
On a typical winter day, the four control strategies again exhibit significant disparities in their ability to maintain thermal comfort. Faced with the dual challenges of a cold climate and cold radiation from the building envelope, the Baseline and ASHRAE 55 strategies show clear limitations. The PMV curves for these methods display a distinct downward trend, frequently dropping below the −0.5 comfort threshold during the critical morning start-up and evening cooling phases, indicating a perceptible sensation of cold. This finding indicates that relying solely on a fixed indoor air temperature setpoint is insufficient to compensate for the cold radiation effects induced by low-temperature surfaces on the building envelope. The resulting perceived temperature is lower than the design intent, causing discomfort for occupants. The rule-based control strategy proposed by Sun et al. mitigates this issue to a certain degree. By implementing rule adjustments, this method raises indoor temperatures and thereby prevents severe cold discomfort. However, its control logic is conservative and tends to overshoot. To ensure comfort, the strategy often maintains elevated setpoint temperatures for extended durations, which results in persistently high PMV values near +0.5 during the middle of the day. Such behavior not only compromises control accuracy but also leads to excessive heating energy consumption. In contrast, the ELM-MPC strategy proposed in this study achieves a better balance between robustness and energy efficiency. Sensitive to operative temperature trends, the strategy uses its predictive model to determine the necessary heat compensation. It eliminates the underheating issues of the Baseline approach and avoids the overheating tendencies observed in the method by Sun et al., consistently keeping the PMV within the optimal range of (−0.5, +0.5). This predictive, on-demand heating mechanism ensures high-quality thermal comfort throughout the day while using energy efficiently.
To mitigate the potential impact of high-frequency setpoint adjustments on the operational lifespan of HVAC equipment, this study incorporates multiple protective mechanisms within the optimization framework. First, a control action penalty term is explicitly integrated into the objective function (equation (8)) to suppress aggressive setpoint fluctuations. Previous research indicates that in practical engineering deployments, increasing the weight of this penalty coefficient can enhance system robustness and reduce fan energy consumption, albeit with a marginal trade-off in temperature tracking precision (Wei et al., 2022). Second, a control interval of 10 minutes is adopted for the modern inverter-based HVAC system modeled in this study. Experimental evidence suggests that this frequency facilitates smooth part-load regulation and effectively circumvents the mechanical wear associated with the rapid on-off cycling typical of traditional fixed-frequency controls (Bohara et al., 2023). It is important to acknowledge that the PSZ-AC system modeled in this study assumes an idealized instantaneous response for heating and cooling outputs, implying that changes in the thermostat setpoint are realized immediately without hardware latency. In practical deployments, however, compressors typically require a ramp-up or ramp-down period of approximately 6 seconds (Sun et al., 2024a). The control interval of 10 minutes adopted in this framework provides a sufficient temporal margin to accommodate such transient processes. Future research will integrate detailed component-level dynamics, including compressor cycling constraints and minimum run times, to further validate the hardware feasibility of the proposed control trajectories.
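The effect of a control action penalty can be illustrated with a small numerical sketch. The exact form of equation (8) is not reproduced here, so the weighted quadratic sum below, the function name `mpc_objective`, and all weights and sample values are assumptions chosen only to show the mechanism.

```python
def mpc_objective(pmv_pred, energy_pred, setpoints, prev_setpoint,
                  w_comfort=1.0, w_energy=0.1, w_action=0.5):
    """Assumed objective: comfort deviation + energy + setpoint-change penalty."""
    comfort = sum(p**2 for p in pmv_pred)
    energy = sum(energy_pred)
    # Setpoint moves, starting from the setpoint applied in the previous interval.
    moves = [setpoints[0] - prev_setpoint]
    moves += [b - a for a, b in zip(setpoints, setpoints[1:])]
    action = sum(m**2 for m in moves)
    return w_comfort * comfort + w_energy * energy + w_action * action


# Two hypothetical candidates with identical predicted comfort and energy,
# differing only in how aggressively the setpoint moves.
pmv_pred = [0.2, 0.1, 0.0]
energy_pred = [0.4, 0.4, 0.4]
smooth_cost = mpc_objective(pmv_pred, energy_pred, [22.0, 22.5, 23.0], prev_setpoint=22.0)
jumpy_cost = mpc_objective(pmv_pred, energy_pred, [24.0, 20.0, 24.0], prev_setpoint=22.0)
print(round(smooth_cost, 3), round(jumpy_cost, 3))
```

With everything else equal, the oscillating candidate is penalized heavily, so an optimizer minimizing this objective prefers the gradual trajectory; raising `w_action` strengthens that preference at the cost of slower tracking.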
According to the evaluation standard for indoor thermal environment in civil buildings (Li et al., 2014), the PMV range of (−0.5, +0.5) defines optimal thermal comfort. The simulation results indicate that the ELM-MPC approach significantly extends the duration within this optimal comfort band compared to the baseline and ASHRAE 55 strategies. Moreover, when benchmarked against the rule-based method of Sun et al. on the summer design day, the proposed controller achieves a 16.2% reduction in energy consumption while maintaining comparable thermal comfort performance, further validating its practical effectiveness. The engineering feasibility of the proposed control strategy is also confirmed through experimental validation in a real-world office environment. The experimental results show that the controller generates smooth and effective control commands despite hardware limitations and sensor noise, thereby maintaining the indoor PMV within the comfort range. The relatively high energy consumption recorded during night-time testing highlights the need for future integrated optimization involving both indoor and outdoor units.
Despite the encouraging performance of the proposed control strategy, several limitations must be considered for practical deployment. First, the current study relies on a standardized five-zone office building model, which may not fully capture the variability present in real-world settings. Factors such as uncertainties in building envelope thermal properties and unpredictable indoor air infiltration rates could lead to deviations in thermal comfort prediction and control effectiveness. Second, although computational efficiency at the current prediction step size is demonstrated, a substantial increase in the computational overhead associated with the GWO algorithm may be induced by the extension of the prediction time horizon. Consequently, the exploration of rolling optimization strategies specifically tailored for prolonged temporal horizons is recommended as a direction for future research. Third, the present strategy is exclusively focused on thermal comfort and does not incorporate indoor air quality (IAQ) indicators such as
In conclusion, this study contributes a novel and effective framework for predictive indoor thermal comfort regulation based on model predictive control. The ELM-based comfort prediction model and the integration of a GWO-driven rolling optimization process represent notable methodological advancements. Beyond its theoretical value, the proposed control approach holds significant potential for real-world applications, particularly in the context of building energy retrofits and smart building systems. As building automation technologies continue to evolve, the findings of this study are expected to support broader implementation in advanced HVAC systems, fostering both energy conservation and improved occupant well-being.
Conclusion
This paper presents an innovative thermostat control strategy for office buildings that integrates an Extreme Learning Machine (ELM) with Model Predictive Control (MPC) to achieve a dynamic balance between indoor thermal comfort and air conditioner energy consumption. A multi-parameter PMV prediction model was developed, and a rolling horizon optimization mechanism driven by the Gray Wolf Optimizer (GWO) was designed to optimize thermostat setpoints as a multi-objective problem considering both comfort deviation and energy cost. Model parameters were trained using data generated via EnergyPlus-based co-simulation, and the strategy was validated under typical seasonal conditions. The results demonstrate that the proposed ELM-MPC control strategy exhibits superior seasonal adaptability. It effectively mitigates overheating risks in summer through pre-cooling and counteracts cold radiation effects in winter via precise heat compensation. Consequently, the strategy significantly improves the duration of thermal comfort (PMV within
Despite these promising outcomes, the current work has limitations. It does not fully account for component-level energy consumption characteristics (e.g. compressor and fan dynamics) or consider multi-zone control within complex building environments. Future work will address these gaps by incorporating game-theoretic frameworks and advanced multi-objective optimization algorithms to establish coordinated control between component-level energy use and multi-zone thermal comfort. Furthermore, the integration of digital twin technology into the EnergyPlus co-simulation platform will be explored, enabling real-time adaptation based on measured data and occupant behavior for more adaptive and responsive HVAC control in office buildings.
Footnotes
Acknowledgements
This research was funded by the National Natural Science Foundation of China (No. 62473236).
Author contributions
All authors contributed to the overall conception and design of the study. Dongrun Yang was responsible for the original draft preparation and conducted the experimental work. Ming Wang provided supervision and oversight throughout the research process. Yixuan Yu and Mingyuan Wang contributed to data curation and preprocessing. Qianchuan Zhao led the development of the methodological framework. Xuehan Zheng was involved in manuscript review and editing. He Gao supported the project by providing necessary resources. All authors have read and approved the final version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was financially supported by the National Natural Science Foundation of China (No. 62473236).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
