Abstract
This study investigates the integration of solar energy into smart grids using artificial intelligence (AI) to improve energy management and production control. Accurate forecasting of key parameters enhances solar power generation efficiency and reduces losses, supporting the transition from traditional power systems. Initially, two deep learning models, the gated recurrent unit network (GRUNet) and the long short-term memory network (LSTMNet), were evaluated. LSTMNet demonstrated superior performance across three error metrics: mean absolute percentage error (MAPE), mean absolute error, and mean squared error. To further enhance forecasting accuracy, two hybrid models were developed: the hybrid convolutional neural network with long short-term memory net (HCLNet), which combines a convolutional neural network with long short-term memory (LSTM), and the hybrid autoencoder LSTMNet (HAELNet), an autoencoder-LSTM framework. These models were trained and validated using one year of real solar power plant data. Results showed that HAELNet outperformed HCLNet, achieving the lowest MAPE values for daily power generation, grid connected power generation, and solar radiance of 1.221, 2.282, and 2.131, respectively. HAELNet's improved accuracy is attributed to its ability to capture complex patterns and long-term dependencies in time-series data. The study emphasizes the critical role of machine learning in the evolving energy sector and its potential to support sustainability goals by optimizing renewable energy forecasting and reducing greenhouse gas emissions. Overall, the findings highlight the value of advanced AI models for efficient and reliable solar energy integration.
Introduction
Cities that strive to be sustainable require a growing supply of energy, which often leads them to turn to green energy sources. Among the many advantages of renewable sources such as solar electricity are lower energy costs and reduced greenhouse gas emissions (Zhang et al., 2024). On the other hand, renewable generation is seasonal and unpredictable, which is an obstacle to grid integration. Solar energy is a dependable and efficient resource, and the falling cost of solar panels makes it particularly valuable (Hajji et al., 2021). Given these benefits, solar energy has been increasingly promoted on the global market in recent years (Emam et al., 2023).
Precise forecasting is essential for the smooth integration of solar energy into power systems. Forecasts help match power output to demand by informing storage and reserve management, reserve estimation, and grid stability measures. Deep learning algorithms have become prominent in time-series forecasting because they make precise predictions from past data. Publicly available solar energy datasets play a critical role in enhancing the precision of energy output forecasts and, by extension, improving the reliability of power grid operations. The availability of such comprehensive historical datasets enables the deployment of advanced forecasting methodologies that can anticipate solar energy production with greater temporal and spatial accuracy. These forecasts are instrumental in aligning generation profiles with dynamic demand patterns, thereby facilitating the secure and efficient integration of intermittent solar resources into modern electricity networks (Hajji et al., 2022). Accurate short- and long-term predictions are essential for a range of grid management functions, including reserve allocation, load balancing, and the optimal operation of energy storage systems (Wazirali et al., 2023). Achieving such predictive accuracy, however, depends not only on the availability of rich historical data but also on the application of robust computational techniques. In recent years, deep learning models have demonstrated considerable success in time-series forecasting tasks, particularly through their ability to model nonlinear temporal dependencies and deliver fine-grained, point-wise predictions. To fully leverage these capabilities, there is a growing imperative to expand access to high-quality, annotated datasets on solar energy production. Doing so would not only support methodological advancements but also contribute to the broader goals of grid resilience and energy transition.
Recent advances in renewable energy forecasting have leveraged a variety of deep learning and hybrid techniques to improve accuracy and operational efficiency. Singh et al. (2024) demonstrated that machine learning models can effectively manage energy and forecast power in microgrids integrating multiple distributed energy resources, highlighting their potential to optimize grid performance. Mfetoum et al. (2024) employed multilayer perceptron neural networks to enhance solar irradiance prediction in Central Africa, emphasizing the integration of meteorological data to improve forecasts. Guermoui et al. (2024) analyzed multi-scale fusion methods, underscoring their importance for photovoltaic (PV) power forecasting across diverse contexts. Mouloud et al. (2024) assessed hybrid bidirectional deep learning architectures for short-term global horizontal irradiance forecasting, illustrating their superior capability to capture temporal dependencies. Complementing these efforts, Molu et al. (2024) proposed a hybrid deep learning approach with Bayesian optimization that advanced short-term solar irradiance forecasting accuracy. Kheldoun et al. (2024) introduced a combined convolutional neural network (CNN)-BiGRU framework for seasonal irradiance prediction, showcasing the value of fusing convolutional and recurrent networks. In parallel, Coban et al. (2023) applied deep learning to forecast electric vehicle energy consumption, reflecting the expanding scope of intelligent prediction in energy systems. Khelifi et al. (2023) explored a hybrid TVF-EMD-ELM strategy for short-term PV power forecasting, revealing its effectiveness compared to conventional methods. Al-Suod et al. (2025) demonstrated that artificial neural networks (ANNs) can accurately forecast mining plant energy consumption, emphasizing the versatility of ANNs in industrial contexts. Alharbi et al. (2024) proposed combining multilayer perceptrons with a waterwheel plant algorithm for hyperparameter tuning to forecast building energy efficiency. Finally, Sharma et al. (2023) presented solar power forecasting using GD and LM-artificial neural networks under diverse weather conditions, confirming the advantages of integrating advanced neural architectures for more resilient energy predictions.
Recent developments driven by artificial intelligence (AI) have influenced various fields, including object recognition (Peng et al., 2024), image classification (Afif et al., 2020), electricity market forecasting (Saroha et al., 2021), and, in particular, time-series forecasting (Wang et al., 2019). Solar energy production forecasting is one of the most researched application domains for AI (Chen et al., 2022), and its results have been comparable to those of conventional techniques. This is because AI models can learn from historical data and establish strong relationships among the relevant variables. Classical AI techniques, together with extreme learning machines (Ahsan et al., 2025), support vector machines (Deo et al., 2016), and fuzzy neural networks (Sharifian et al., 2018), are the most widely used to model the complicated nonlinear dynamics in time-series data and make predictions.
Shallow models are practical and easy to understand; however, in solar energy forecasting they suffer from limited learning capacity and heavy reliance on manual feature selection. Because they depend on skilled feature engineering during development, they have proven unable to represent and handle fluctuating solar energy data accurately. Furthermore, their poor generalization leads to inaccurate predictions under complicated weather changes. Although they work well on smaller datasets, shallow models often overfit when applied to larger and more diverse datasets. Shallow models such as ARIMA struggle with large historical solar energy output datasets, which results in poor performance (Mukilan et al., 2023). Because these methodologies are not efficient enough under the conditions of a solar power plant, AI methods such as deep learning are called for; they are more reliable owing to their ability to handle large volumes of data and their capacity for generalization (Feng et al., 2020). Because it can learn from large datasets, supports unsupervised learning, and allows high-level abstraction, deep learning is one of the successful AI methods used in various disciplines (Afif et al., 2022). Deep learning is more powerful and flexible than traditional models, as evidenced by its proliferation in numerous applications such as indoor object detection, fatigue detection (Hu et al., 2024), and time-series forecasting (Kumar et al., 2024). Deep learning models such as CNNs (El Alani et al., 2021), long short-term memories (LSTMs) (Harrou et al., 2020), and autoencoders (Saffari et al., 2021) have proved to be very effective in addressing the solar energy forecasting challenge.
The study proposes a novel hybrid deep learning architecture, hybrid autoencoder LSTMNet (HAELNet), specifically designed to enhance the accuracy and robustness of solar energy production forecasts. Accurate forecasting is a fundamental prerequisite for the reliable and cost-effective integration of intermittent renewable sources into smart grid infrastructures. In this context, HAELNet integrates the representational strengths of autoencoders with the sequential learning capabilities of LSTM networks, thereby enabling the model to extract latent features while effectively modeling temporal dependencies. The model is trained on a year-long dataset collected from an operational solar power facility, with the aim of forecasting three critical parameters: Daily power generation (DPG), grid connected power generation (GCPG), and solar radiance (SR). While long short-term memory network (LSTMNet) provides the baseline forecasts, HAELNet refines these outputs through its hybrid mechanism, yielding significant performance gains. Quantitative evaluation using standard error metrics, including mean absolute error (MAE), mean squared error (MSE), and mean absolute percentage error (MAPE) demonstrates that HAELNet consistently surpasses established models such as gated recurrent unit network (GRUNet), LSTMNet, and hybrid convolutional neural networks with LSTMNet (HCLNet). Importantly, the contributions of this study extend beyond algorithmic improvements. By enabling more accurate and stable solar forecasting, HAELNet enhances grid scheduling efficiency, supports real-time energy dispatch, and facilitates the scalable deployment of renewable energy systems. In doing so, it not only bridges the gap between model accuracy and operational utility but also contributes to the development of a more resilient, sustainable, and economically viable energy ecosystem. The key contributions of this study are as follows:
This study proposes HAELNet, a novel hybrid model that integrates autoencoders with LSTM networks. HAELNet effectively combines deep feature extraction with long-term temporal learning, outperforming existing models such as GRUNet, LSTMNet, and HCLNet.
Extensive experiments using real-world solar power data demonstrate HAELNet's superior accuracy in forecasting the key parameters DPG, GCPG, and SR, as measured by MAPE, MAE, and MSE.
By adopting the HAELNet approach, the study significantly improves prediction precision, showcasing the practical impact of machine learning in optimizing solar energy forecasting and supporting more efficient operation of solar power systems.
The improved performance of HAELNet highlights its potential to facilitate the seamless integration of solar energy into smart grids. This contributes to sustainable energy development, efficient grid management, and progress toward carbon reduction goals.
A comprehensive analysis of the proposed framework and its underlying methodologies is presented in the “Proposed methodology” section. This section details the architectural design, provides formal mathematical formulations, and offers a preliminary comparative assessment of the models introduced. The “Case study” section describes the solar plant under study and elaborates on the dataset employed. The “Results and discussion” section offers a comprehensive analysis of the outcomes, including losses and projections. Finally, the “Conclusion” section encapsulates the study's concluding reflections and insights.
Literature review
To effectively integrate renewable energy into a grid, which is increasingly essential due to the rising energy demands of modern cities, a dependable forecasting system is necessary. Diverse methodologies have been proposed to develop high-efficiency solar energy forecasting systems. Researchers have employed several machine learning techniques and data from many power and energy facilities to enhance forecast accuracy (Pérez et al., 2021). Although several studies, such as the work of Prajapati and Tiwari (2021), employ a single statistical ARIMA framework to predict the parameters of solar plants, their findings remain ambiguous because of the difficulty of handling large datasets. Zafar et al. (2024a, 2024b) proposed a hybrid methodology that integrates LSTM and ARIMA for the estimation of short-term electrical power in PV facilities. This strategy aims to use the benefits of both techniques to enhance forecast precision. However, the evaluation's limited scope may hinder its broader applicability. Employing an alternative technique, Ahsan et al. (2024) used a Bi-LSTM model to predict several features of solar plant power generation. Nonetheless, the reliance of this model on a single input feature may constrain its ability to account for additional variables influencing energy generation. Likewise, Gupta et al. (2021) compared an LSTM neural network with ARIMA and random forest models and advocated the network's application in forecasting energy production in solar PV power plants. Machine learning methods have likewise been applied to other electricity generation datasets. Boudia et al. (2020) advocate the extreme learning machine (ELM) technique over ARIMA and LSTM for short-term wind power forecasting because of its superior accuracy.
Nonetheless, the study overlooks the computational complexity and feasibility of the models, while the repeatability and reliability of the results are compromised by the lack of transparency in the training and preprocessing methodologies (Gao et al., 2022; Zheng et al., 2021).
Deep learning-based techniques are considered in this study due to their superior performance over traditional methods. A novel approach introduced by Singla et al. (2024) combines iterative filtering with a bidirectional LSTM for irradiance forecasting. While the bidirectional LSTM improves sequence learning by processing data in both directions, its increased complexity and memory requirements make it less efficient for longer time-series datasets or real-time scenarios. In addition, its inherent complexity and sensitivity to hyperparameters make the model challenging to fine-tune for diverse datasets. Qu et al. (2021) proposed a temporal distributed gated recurrent network for solar energy forecasting, incorporating daily fluctuation extraction, scenario generation, and one-day-ahead forecasting components. However, the gated recurrent network's reliance on structures that are prone to vanishing gradients over long sequences can reduce its efficacy when dealing with highly dynamic solar power datasets. Additionally, Bhutta et al. (2024) developed a hybrid CNN-LSTM model that effectively captures spatial and temporal features for solar energy forecasting. While CNNs excel at capturing local spatial dependencies and LSTM networks effectively model sequential patterns, the integration of these architectures in hybrid models often results in elevated computational demands. This increased complexity can constrain their practical deployment, particularly in scenarios requiring real-time processing or operating under resource-constrained conditions. In a related study (Rai et al., 2022), a sequence-to-sequence autoencoder architecture incorporating gated recurrent units (GRUs) was proposed to accommodate forecasting over variable temporal horizons.
Although GRUs offer computational efficiency due to their simplified architecture relative to LSTMs, their reduced gating structure may compromise the model's ability to retain and leverage long-range temporal dependencies, an essential requirement in domains characterized by prolonged seasonal patterns. Similarly, the hybrid framework presented in the work of Singla et al. (2023) demonstrates an effective synergy for modeling both temporal and spatial features in solar irradiance forecasting. However, the GRU's limited capacity to capture complex temporal relationships may restrict its performance in datasets with high seasonal or diurnal variability. In another approach (Zheng et al., 2020), a multi-regional forecasting system was developed by combining bidirectional LSTM networks with a particle swarm optimization (PSO) algorithm. While the bidirectional LSTM component enhances the model's sequence learning capabilities, the inclusion of PSO introduces additional computational overhead. This added complexity, although potentially beneficial for optimization, raises concern about the model's scalability and feasibility in real-time or large-scale deployment contexts.
In another study, Zhang et al. (2020) produced approximate day-ahead forecasts of solar plant production using a hybrid model that merges LSTM and autoencoders. The hybrid framework, named persistence and autoencoder LSTM (AE-LSTM), outperforms conventional methods owing to its ability to handle noise and uncertainty in data (Zafar et al., 2023). The self-supervised learning architecture of the AE-LSTM shows competence in predicting complex meteorological situations. In a similar vein, further research (Zhang et al., 2023) used machine learning models to anticipate PV system output power, concentrating on hybrid deep learning techniques that use autoencoder LSTM models for time-series forecasting.
Renewable energy, particularly solar power, is vital for global sustainability. Accurate forecasting using AI and machine learning is crucial for grid stability and cost reduction. Deep learning improves accuracy, addressing solar power's randomness. However, gaps exist, notably in Autoencoder utilization and hybrid model optimization, hindering further integration and optimization of solar power into grids.
Proposed methodology
In this study, conducted on a stationary time-series dataset, four machine learning models were employed: GRUNet, LSTMNet, HCLNet, and the proposed HAELNet. The dataset covers three crucial variables, with values for a full year: “DPG (kWh)”, “GCPG (MW)”, and “SR (MJ·m−2)”. These real-time readings were collected at a large solar power plant. Preprocessing was applied to ensure that the data form a continuous time sequence.
Data preprocessing constitutes a foundational phase in the machine learning pipeline, wherein raw, often unstructured data is systematically transformed into a refined and analyzable format. This step is indispensable for ensuring model accuracy, stability, and generalizability. Real-world datasets frequently exhibit imperfections such as missing entries, noise, outliers, and heterogeneities in scale all of which can obscure meaningful patterns and degrade model performance if left unaddressed. To mitigate these challenges, a range of preprocessing strategies is employed. Data cleaning involves the imputation or removal of missing values and the identification and treatment of anomalies, thereby preserving data integrity. Normalization and feature scaling are essential to harmonize the range of input variables, preventing models, particularly those sensitive to feature magnitudes from being unduly influenced by disproportionately scaled attributes. Feature engineering further enriches the dataset by constructing informative variables that encapsulate latent relationships, thereby enhancing the model's capacity to learn relevant structures. Following preprocessing, the dataset is typically partitioned into training, validation, and test subsets. This methodological separation facilitates robust model development, hyperparameter tuning, and rigorous performance evaluation. Beyond enhancing predictive accuracy, effective preprocessing promotes model generalization, curbs overfitting, particularly when coupled with augmentation techniques, and reduces computational burden, which is critical for scalability. Thus, preprocessing is not merely preparatory; it is a prerequisite for developing reliable, efficient, and deployable machine learning solutions (Zafar et al., 2024a, 2024b).
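The preprocessing steps described above can be illustrated with a minimal sketch. It assumes forward-fill imputation, min-max scaling, and a sliding-window layout for one-step-ahead forecasting; the source does not state which specific techniques were applied, and the function names, the look-back length, and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple


def forward_fill(series: List[Optional[float]]) -> List[float]:
    """Replace missing readings (None) with the last observed value."""
    filled: List[float] = []
    last: Optional[float] = None
    for value in series:
        if value is None:
            if last is None:
                raise ValueError("series starts with a missing value")
            value = last
        filled.append(value)
        last = value
    return filled


def min_max_scale(series: List[float]) -> Tuple[List[float], float, float]:
    """Scale values to [0, 1]; also return (min, max) so the scaling can be inverted."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in series], lo, hi


def make_windows(series: List[float], lookback: int) -> Tuple[List[List[float]], List[float]]:
    """Build (input window, next value) pairs for one-step-ahead forecasting."""
    count = len(series) - lookback
    inputs = [series[i:i + lookback] for i in range(count)]
    targets = [series[i + lookback] for i in range(count)]
    return inputs, targets


raw = [10.0, None, 30.0, 20.0, 40.0]  # illustrative values, not plant data
clean = forward_fill(raw)
scaled, lo, hi = min_max_scale(clean)
windows, targets = make_windows(scaled, lookback=2)
```

In practice the scaling bounds would normally be computed on the training portion only and then reused for the test portion, so that no information from the test period leaks into training.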
The dataset was partitioned into training (80%) and testing (20%) subsets, with an additional validation set employed for performance assessment under previously unseen conditions. Model development commenced with training on the designated subset, followed by iterative tuning to enhance predictive accuracy and minimize common error metrics. Subsequent validation enabled assessment of model generalizability beyond the training environment. For forecasting solar power generation over the subsequent annual cycle, the trained models were used to extrapolate the time series, producing forward-looking predictions. Four distinct architectures, GRUNet, LSTMNet, HCLNet, and the proposed HAELNet, were evaluated. These forecasts are instrumental for solar plant operators, informing strategic decisions related to energy pricing, system planning, operational reliability, and resource optimization. Empirical results indicate that HAELNet consistently outperforms the benchmark models across all evaluation metrics, including MAE, MAPE, MSE, and root mean squared error (RMSE), demonstrating superior forecasting precision. A comparative performance summary is presented in Figure 1, which provides a visual depiction of the methodology and outputs. The structural and algorithmic foundations of the models, including their respective architectures and mathematical formulations, are discussed comprehensively in the subsequent subsections.

Process for visualizing the results through the GRUNet, LSTMNet, HCLNet, and HAELNet models. GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
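The 80%/20% partition and the error metrics named in this section can be sketched as follows. The split is assumed to be chronological (no shuffling), which is common for time series but is not stated in the source; the function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float], train_fraction: float = 0.8) -> Tuple[List[float], List[float]]:
    """Split a time series in order: the earliest part for training, the rest for testing."""
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


train, test = chronological_split(list(range(10)))  # 8 training points, 2 test points

actual = [100.0, 200.0, 400.0]      # illustrative values, not plant data
predicted = [110.0, 180.0, 400.0]
scores = {
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

Because MAPE divides by the observed value, it is only meaningful for series that stay away from zero, which is worth checking before comparing models on it.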
Functional process of GRUNet
GRUNet is a neural network architecture that uses GRU layers to process sequential data and is optimized through hyperparameter tuning and advanced feature engineering. It is based on recurrent neural networks (RNNs) and features varying numbers of GRU layers alongside additional layers for feature extraction. To overcome the difficulties of preserving long-term memory and enabling efficient backpropagation in RNNs, Cho et al. introduced the GRU in 2014, as illustrated in Figure 2. GRUs use only an update gate and a reset gate, have fewer parameters, and do not have separate memory cells as LSTM networks do. Compared with LSTMs, GRUs seek to improve sequence learning while streamlining the network architecture (Brahma and Wadhvani, 2020). The update gate controls how much of the previous hidden state carries over into the current one, a role comparable to the combined forget and input gates of an LSTM, and thereby regulates which information is preserved for future states. The reset gate controls how much of the past state is ignored when the candidate state is computed, deciding which historical information to keep and which to discard. Aslam et al. (2020) give a comprehensive exposition of the equations that describe these gates.

GRU (gated recurrent unit) internal structure.
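The gate behaviour described in this subsection can be made concrete with a minimal sketch of one GRU time step, written for a single input and a single hidden unit so that every weight is a scalar. It follows the standard GRU formulation rather than any equations specific to GRUNet; the function name, the parameter layout, and the weight values are hypothetical.

```python
import math
from typing import Dict, Tuple

Params = Dict[str, Tuple[float, float, float]]  # gate name -> (input weight, recurrent weight, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x: float, h_prev: float, params: Params) -> float:
    """One GRU step for scalar input x and scalar hidden state h_prev."""
    wz, uz, bz = params["update"]
    wr, ur, br = params["reset"]
    wh, uh, bh = params["candidate"]

    z = sigmoid(wz * x + uz * h_prev + bz)              # update gate: how much new state to admit
    r = sigmoid(wr * x + ur * h_prev + br)              # reset gate: how much past state to use
    h_cand = math.tanh(wh * x + uh * (r * h_prev) + bh) # candidate state built from reset-scaled past
    # Blend old and candidate states (some references swap the roles of z and 1 - z).
    return (1.0 - z) * h_prev + z * h_cand


# Illustrative weights only; a trained network would learn these values.
params: Params = {
    "update": (0.5, 0.1, 0.0),
    "reset": (0.3, 0.2, 0.0),
    "candidate": (0.8, 0.4, 0.0),
}
h = 0.0
for x in [0.2, 0.4, 0.1]:
    h = gru_step(x, h, params)
```

With all weights and biases set to zero, both gates sit at 0.5 and the candidate is zero, so each step simply halves the previous hidden state; this makes a convenient sanity check.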
Functional process of LSTMNet
LSTMNet is a neural network architecture that augments the standard LSTM cell with additional layers, allowing the network to be tailored for improved time-series forecasting performance. It uses LSTM layers as its main mechanism. The LSTM cell, depicted in Figure 3, is an RNN unit designed to handle long sequences with dependencies: its coupled elements decide which information is stored or discarded over time. The key elements are the input, forget, and output gates, which direct the cell's internal information flow. In addition, the memory cell stores and selectively updates data, retaining important information and discarding what is not needed, so that long-term relationships can be extracted efficiently from sequential data (Harrou et al., 2020).

Long short-term memory cell structure.
The LSTM cell comprises three gates, the forget gate, the input gate, and the output gate, which dictate the route data take as they move within the cell. Zheng et al. (2020) give a comprehensive exposition of the equations that describe these gates. The input gate controls how much new information from the present input and the previous hidden state is stored in the memory cell; both are passed through a sigmoid activation function, which allows the gate to keep the most important input elements and discard the rest. Using the present input and the previous hidden state, the forget gate applies a sigmoid function to decide which data to delete from the memory cell. To remove unnecessary information, it multiplies the previous memory state by the gate's output. The output gate determines how much of the memory cell's content passes to the next hidden state, using the present input and the prior hidden state and employing sigmoid activation and
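The three gates described in this subsection can be sketched for one LSTM time step with a single input and a single hidden unit. The sketch follows the standard LSTM formulation, in which the hidden state is the output gate multiplied by the tanh of the cell state; it is not taken from the equations of the cited work, and the function name, parameter layout, and weight values are hypothetical.

```python
import math
from typing import Dict, Tuple

Params = Dict[str, Tuple[float, float, float]]  # gate name -> (input weight, recurrent weight, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float, params: Params) -> Tuple[float, float]:
    """One LSTM step for scalar input x; returns the new (hidden state, cell state)."""

    def pre_activation(gate: str) -> float:
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    f = sigmoid(pre_activation("forget"))       # share of the old cell state to keep
    i = sigmoid(pre_activation("input"))        # share of the new candidate to store
    o = sigmoid(pre_activation("output"))       # share of the cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content

    c = f * c_prev + i * g                      # forget old content, add new content
    h = o * math.tanh(c)                        # hidden state passed to the next step
    return h, c


# Illustrative weights only; a trained network would learn these values.
params: Params = {
    "forget": (0.2, 0.1, 1.0),
    "input": (0.5, 0.3, 0.0),
    "output": (0.4, 0.2, 0.0),
    "candidate": (0.9, 0.5, 0.0),
}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:
    h, c = lstm_step(x, h, c, params)
```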
Structure of CNN
The CNN structure, depicted in Figure 4, relies on factors like layer quantity and kernel size for performance. It begins with convolution and pooling operations to extract intricate features from input data, then the model forwards the extracted features to the final classification layer. CNNs excel at extracting hierarchical features from both one-dimensional sequences and two-dimensional data, demonstrating their versatility. The architecture includes convolutional layers for capturing spatial features, activation layers to introduce non-linearity, pooling layers to reduce the dimensionality of feature maps, and fully connected layers to integrate local and global features. CNNs’ adaptability is evident in tasks like solar energy forecasting, where 1D convolution facilitates streamlined time series analysis, showcasing the network's versatility in diverse applications (Bhutta et al., 2024).

Convolutional neural network (CNN) architecture.
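The convolution, activation, and pooling stages described in this subsection can be sketched for a one-dimensional series as follows. The kernel here is fixed by hand for illustration, whereas a CNN learns its kernels during training; as in most deep learning libraries, the operation slides the kernel without flipping it. All function names and values are hypothetical.

```python
from typing import List


def conv1d_valid(signal: List[float], kernel: List[float]) -> List[float]:
    """Slide the kernel over the signal without padding (output is shorter than the input)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values: List[float]) -> List[float]:
    """Activation layer: keep positive responses, zero out negative ones."""
    return [max(0.0, v) for v in values]


def max_pool1d(values: List[float], size: int) -> List[float]:
    """Pooling layer: keep the largest value in each non-overlapping block."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


signal = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]  # illustrative values, not plant data
kernel = [-1.0, 1.0]                        # responds to the step-to-step increase
features = relu(conv1d_valid(signal, kernel))
pooled = max_pool1d(features, size=2)
```

In this example the kernel extracts the increase between consecutive readings, and pooling halves the length of the feature map while keeping the strongest response in each pair.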
Functional process of HCLNet
CNNs are structured with input, convolutional, pooling, fully connected, and output layers, and focus on direct visual pattern detection in pixel images (Bhutta et al., 2024). The CNN-LSTM model, as depicted in Figure 5, merges CNN layers for feature extraction with LSTM layers for sequence prediction, facilitating tasks such as activity detection and time-series prediction (Brahma and Wadhvani, 2020; Singla et al., 2023). This architecture includes input layers for feature extraction plus LSTM layers for modeling temporal dependencies, providing an efficient approach to handling spatial and temporal relationships in sequential data. The CNN component extracts spatial characteristics, while the LSTM component focuses on modeling temporal patterns, making the model suitable for tasks where both kinds of pattern are crucial. However, its effectiveness diminishes when working with data that lack meaningful spatial patterns; in that case an autoencoder LSTM is better suited, as it excels in dimensionality reduction, feature extraction, denoising, anomaly detection, and data reconstruction.

Structure of hybrid convolutional neural networks with long short-term memory net (HCLNet) model.
Structure of HAELNet
The integration of autoencoders and LSTM networks creates HAELNet, a powerful neural network framework illustrated in Figure 6. Autoencoders simplify input data by encoding it into a compact form and then reproducing it (Saffari et al., 2021). HAELNet incorporates LSTM cells to capture temporal correlations in sequential data. In the encoder, LSTM cells process the input data in sequential order and generate a compressed representation, effectively reducing the dimensionality. To minimize reconstruction errors, the decoder LSTM cells closely mirror the input sequence while reconstructing it. During training, a set of input sequences is used alongside a loss function, such as MSE, to guide optimization. After training, the autoencoder LSTM can effectively compress and reconstruct input sequences, preserving essential information in a lower-dimensional form. It is particularly effective for applications such as sequence prediction, anomaly detection, and feature extraction from time-series data, owing to its ability to combine the data compression capabilities of autoencoders with the temporal modeling strengths of LSTM networks. Overall, it is a useful model for data analysis, pattern detection, and data reconstruction.

Structure of hybrid autoencoder LSTMNet (HAELNet) model.
Mathematical explanation of HAELNet
The autoencoder, typically used for feature extraction, processes the input data X = {x1, x2, …, xk} with a function f to learn its underlying structure. The encoder then creates a sequence T = {t1, t2, …, tk} that represents the data, which is subsequently decoded to reconstruct the original data as Y = {y1, y2, …, yk}. Once trained, the encoder is used on its own to extract properties of the original data, improving data quality within the group (Zheng et al., 2020).
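For orientation, a generic autoencoder can be written in the notation of this paragraph as shown below. This is the textbook dense form, given as an assumption rather than as the exact formulation used for HAELNet, whose encoder and decoder are built from LSTM cells. The symbols W_e, b_e (encoder weights and bias), W_d, b_d (decoder weights and bias), and the decoder activation g are introduced here for illustration only; the last line is the mean squared reconstruction error mentioned earlier as a possible training loss.

```latex
\begin{aligned}
T &= f\left(W_e X + b_e\right) \\
Y &= g\left(W_d T + b_d\right) \\
\mathcal{L}(X, Y) &= \frac{1}{k} \sum_{i=1}^{k} \left(x_i - y_i\right)^2
\end{aligned}
```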
Autoencoder equation (1) shows encoding and equation (2) depicts the decoding procedures. These equations include weights (
Equation (3) consolidates the prior cell
The variable
Case study
Structure of solar plant
The Sapphire Solar Power Plant (SSPP) is a 100 MW facility situated in Chakwal, Punjab, Pakistan, initiated by the Sapphire Group. With over 400,000 solar panels spread across 650 acres, it integrates seamlessly with the national grid via a 132 kV transmission line. Operational since April 2018, the SSPP annually produces approximately 165 GWh of clean energy, reducing the reliance of Pakistan on petroleum and coal. This endeavor exemplifies Pakistan's commitment to combating climate change and highlights the feasibility of significant solar power projects in that region.
Installed equipment details at solar plant
Detailed information about the equipment in place is furnished below.
The solar PV facility analyzed in this study comprises over 400,000 Trina Solar modules rated at 330 Wp, reflecting the prevailing deployment of polycrystalline silicon (poly-Si) technology during the plant's commissioning phase. These modules are characterized by a conversion efficiency typically ranging between 16% and 18%, a temperature coefficient in the range of −0.39% to −0.43%/°C, and an annual performance degradation rate estimated at 0.5% to 0.8% per year (Chen et al., 2022).
For the development of accurate and operationally meaningful forecasting models, particularly those aimed at long-term energy yield prediction, it is essential to explicitly integrate such device-specific parameters. Module efficiency establishes the baseline for instantaneous power output under standard test conditions, whereas the temperature coefficient provides a quantitative measure of output variability under thermal stress, a critical consideration in regions with elevated ambient temperatures. Perhaps most importantly, the inclusion of degradation rates is indispensable in preventing systematic overestimation of energy output over time. Ignoring this factor can significantly compromise the credibility of multi-year projections, thereby undermining efforts in financial planning, maintenance scheduling, and long-term energy resource integration. The deliberate incorporation of these characteristics enhances the model's fidelity and supports more resilient and economically viable energy system planning.
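The effect of these device-specific parameters can be illustrated with a short worked calculation. The module count, module rating, plant capacity, and annual energy are taken from the plant description in this section; the cell temperature of 45 °C, the mid-range coefficient of −0.41%/°C, the mid-range degradation rate of 0.65% per year, the compounding form of degradation, and all function names are assumptions made for illustration.

```python
def dc_capacity_mwp(module_count: int, module_wp: float) -> float:
    """Total DC nameplate capacity in MWp."""
    return module_count * module_wp / 1e6


def temperature_factor(cell_temp_c: float, coeff_pct_per_c: float, reference_c: float = 25.0) -> float:
    """Relative output versus standard test conditions (25 degrees C cell temperature)."""
    return 1.0 + (coeff_pct_per_c / 100.0) * (cell_temp_c - reference_c)


def degradation_factor(years: float, annual_rate_pct: float) -> float:
    """Remaining output share after compounding annual degradation (an assumed model)."""
    return (1.0 - annual_rate_pct / 100.0) ** years


def capacity_factor(annual_energy_gwh: float, capacity_mw: float, hours: float = 8760.0) -> float:
    """Annual energy divided by the energy at continuous full-capacity operation."""
    return annual_energy_gwh * 1000.0 / (capacity_mw * hours)


dc_mwp = dc_capacity_mwp(400_000, 330.0)        # at least 132 MWp, since the plant has "over 400,000" modules
temp_factor = temperature_factor(45.0, -0.41)   # about 0.918 at an assumed 45 degrees C cell temperature
deg_factor = degradation_factor(10, 0.65)       # about 0.937 after ten years at an assumed 0.65% per year
cf = capacity_factor(165.0, 100.0)              # about 0.188 for 165 GWh per year from 100 MW
```

Under these assumptions, a hot day alone removes roughly 8% of the rated output and ten years of degradation remove a further 6%, which is why a long-term forecast that ignores both would tend to overestimate yield.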
These panels are efficiently integrated with centralized inverters by Sungrow, renowned for their top-tier solar inverters, converting DC power into AC electricity suitable for the grid. The solar panels are mounted on precision-engineered stationary-tilt mount systems from Array Technologies, optimizing energy generation. Step-up transformers from Siemens elevate the voltage of solar-generated energy before grid injection, while ABB's meticulously designed switchgear ensures safety and reliability. The power plant is equipped with an advanced monitoring and control system from Huawei, allowing remote equipment oversight and energy output optimization, emphasizing operational efficiency.
Data analysis
Histograms are analytical tools for assessing the properties of a dataset, revealing its spread, central tendency, and overall distribution. They expose abnormalities, show how frequently values fall within each range, and make patterns easy to recognize and compare. Their layout resembles that of a bar chart, with each bar representing the frequency of values in one range.
In Figure 7(a), the histogram displays the distribution of DPG in kilowatt-hours (kWh), with the horizontal axis segmented into ranges from 0 to 700,000 kWh and the vertical axis indicating frequency up to 150. Figure 7(b) illustrates the occurrence of GCPG in megawatts (MW), with counts ranging from 0 to 200 and power values segmented from 0 to 100 MW, showing how the grid power values are distributed. In Figure 7(c), the histogram for SR spans 0 to 30 MJ/m² on the abscissa and 0 to 100 on the vertical axis, revealing the distribution and spread of radiance readings. The histogram's shape indicates the variability and symmetry of the radiance values, with skew to one side reflecting a concentration of lower or higher radiance levels and symmetry suggesting an even distribution.

(a–c). Data analysis for solar farm parameters utilizing a histogram.
Box plots are an effective instrument for summarizing and graphically displaying data distributions. Figure 8(a) to (c) provides insights into three distinct attributes of the solar power plant dataset, with each box plot conveying the central tendency and spread of one attribute.

(a–c). Data analysis for solar farm parameters utilizing a box plot.
In the “DPG in kWh” box plot of Figure 8(a), the y-axis ranges from 0 to 700,000 kWh. The lower whisker extends from about 340,000 kWh to the first quartile (Q1) at about 450,000 kWh, the value below which the lowest 25% of the data fall. The median (Q2) lies near 500,000 kWh and the third quartile (Q3) near 530,000 kWh, so the box between Q1 and Q3 contains the middle 50% of the data. The upper whisker extends from 530,000 to about 610,000 kWh and covers the upper range of the data above Q3. In the “GCPG in MW” box plot of Figure 8(b), the y-axis ranges from 0 to 90 MW. The lower whisker spans 60 to 70 MW (Q1), the median lies near 73 MW, Q3 lies near 78 MW, and the upper whisker extends from 78 to 88 MW. In the “SR in MJ/m²” box plot of Figure 8(c), the y-axis ranges from 0 to 30 MJ/m². The lower whisker spans 14 to 20 MJ/m² (Q1), the median lies near 22.5 MJ/m², Q3 lies near 23.5 MJ/m², and the upper whisker extends from 23.5 to 27 MJ/m².
These box plots offer a detailed and intuitive representation of the respective attributes, enabling viewers to identify potential outliers, variations, or trends within the dataset's distribution characteristics.
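The quantities that a box plot displays can be computed directly. The sketch below is illustrative only: the helper name `box_plot_summary` is hypothetical, the sample values are made up rather than taken from the plant dataset, and the 1.5 × IQR rule for flagging outliers is the common plotting convention, assumed here rather than confirmed by the study.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles, IQR, and 1.5*IQR outliers of a sample."""
    # method="inclusive" treats the sample as the full population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Made-up sample: nine regular readings and one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(box_plot_summary(sample))
```

The box spans `q1` to `q3`, the line inside it marks `median`, and any value listed under `outliers` would be drawn as an individual point beyond the whiskers.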
The findings obtained from all models are provided in graphical form in the following section 5.
Results and discussion
A detailed dataset was collected from the solar power facility, with real-time data on three vital parameters: “DPG (kWh)”, “GCPG (MW)”, and “SR (MJ/m²).” These parameters were used to predict the subsequent year's DPG, GCPG, and SR. Initially, two benchmark models with varying layers were employed: GRUNet and LSTMNet. These models were chosen following a comprehensive review of the relevant literature, which confirmed their precision and efficacy for the goals of the study. The comparative tables and visual representations demonstrated that LSTMNet performed better than GRUNet. Two hybrid approaches were then used to improve on LSTMNet: HCLNet, which combines a convolutional neural network with LSTM, and our proposed combination of an autoencoder and LSTM (HAELNet). All models showed good agreement between predictions and real data, but visual analyses and comparisons showed that HAELNet outperformed the HCLNet, LSTMNet, and GRUNet models. The hybrid HAELNet model stood out because of its markedly improved predictive accuracy and lower error rate. The results, tables, and visual displays from each model are presented in the remainder of this section, highlighting the advancements made by the HAELNet model.
During the experimental stage of this work, specific hardware and software configurations were employed. The hardware setup consisted of an Intel(R) Core i5-12500H processor, 32 GB of RAM, and an NVIDIA RTX 3050 GPU with 8 GB of memory. The deep learning models were developed in Python 3.13 using the TensorFlow 2.18 and Keras 3.6.0 libraries. These hardware and software components were chosen carefully, since a suitable configuration is important for obtaining correct and reliable results.
The primary details of this model's intrinsic parameters and its structural architecture are presented in Table 1. The proposed hybrid model, HAELNet, is designed with a deep learning architecture consisting of several layers: Five LSTM layers, one RepeatVector layer, and a final Dense layer for output prediction. The initial three LSTM layers (lstm_77, lstm_78, and lstm_79) have output dimensions of 10, 8, and 1 neurons, respectively, with 530, 458, and 42 trainable parameters. The next two LSTM layers (lstm_80 and lstm_81) also output 10 neurons each, contributing 530 and 890 parameters, respectively. A RepeatVector layer prepares the sequence for the output layer, which is a Dense layer with one neuron (output shape: None, 1, 1) and 18 parameters.
Training details of proposed architecture.
MSE: mean squared error; MAE: mean absolute error; LSTM: long short-term memory.
The model was trained on a dataset split 80% for training and 20% for testing, using a batch size of 32. The Adam optimizer was employed with a learning rate of 0.001, and MSE was used as the loss function. Model performance was evaluated using both MSE and MAE metrics. The network comprises 2468 trainable parameters in total, ensuring it has sufficient capacity to learn meaningful temporal patterns while maintaining computational efficiency. This architecture is specifically tailored to enhance the accuracy of solar power generation forecasting while keeping training time and resource usage within practical limits.
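The reported total can be checked against the per-layer counts given above. The sketch below only tallies the numbers stated in the text. The key `dense_output` is a hypothetical label for the final Dense layer, and the RepeatVector layer is entered with zero parameters because it has no trainable weights.

```python
# Per-layer trainable parameter counts as reported in the text.
layer_params = {
    "lstm_77": 530,
    "lstm_78": 458,
    "lstm_79": 42,
    "lstm_80": 530,
    "lstm_81": 890,
    "repeat_vector": 0,   # RepeatVector only tiles its input; nothing to train
    "dense_output": 18,   # hypothetical label for the final Dense layer
}

total_params = sum(layer_params.values())
print(total_params)
```

The sum agrees with the 2468 trainable parameters reported for the network.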
The study employs the principles of statistical learning theory (James et al., 2023) and the concept of generalization, which refers to a model's ability to perform accurately on unseen data, not just the data it was trained on. To promote robust model evaluation and prevent overfitting, the dataset was systematically divided into two subsets: 80% for training and 20% for testing. The training set was used to help the models GRUNet, LSTMNet, HCLNet, and the proposed HAELNet learn underlying patterns and temporal dependencies within the data. The testing set, which the models had not previously encountered, was used to assess their predictive performance. This evaluation strategy ensures that the models are not only learning effectively but also maintaining the ability to generalize to new, real-world data. Such a structured methodology supports unbiased and credible assessment of model effectiveness in forecasting solar energy parameters.
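A minimal sketch of such an 80/20 split is given below. The text does not state whether the split was chronological or shuffled; the sketch assumes a chronological split, which is the usual choice for time series because it keeps future observations out of the training set. The function name `chronological_split` and the placeholder data are hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Split an ordered sequence into leading training and trailing test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    split_index = int(len(series) * train_fraction)
    # No shuffling: every test observation comes after every training observation.
    return series[:split_index], series[split_index:]


# Placeholder daily series covering one year (day indices stand in for readings).
daily_values = list(range(365))
train, test = chronological_split(daily_values)
print(len(train), len(test))
```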
The losses encountered by each model were measured and displayed using graphical representations.
The mathematical representation of the MAE is given in Equation 11 (Gao et al., 2022).
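For reference, the standard definition of the MAE is reproduced below. The notation is an assumption and may differ from that of Equation 11: n is the number of samples, y_i the actual value, and ŷ_i the predicted value.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```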
Figure 9(a) and (b) display the MAE loss and validation MAE loss graphs, respectively, revealing the performance of the GRUNet, LSTMNet, HCLNet, and hybrid HAELNet models for DPG prediction. Despite consistent patterns across all models, the hybrid HAELNet demonstrates superior performance, with the minimal MAE loss of 0.1173 for “DPG (kWh),” compared to HCLNet (0.1401), LSTMNet (0.2223), and GRUNet (0.2994). Similarly, the hybrid HAELNet also shows the lowest validation MAE loss of 0.1113 for “DPG (kWh),” compared to HCLNet (0.1215), LSTMNet (0.2208), and GRUNet (0.2916), as indicated in Table 2. This highlights HAELNet's accuracy and robustness in predicting solar production, owing to its precise predictions and trend identification capabilities.

(a). Actual MAE comparison of daily power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (b) Validation MAE comparison of daily power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (c) Actual MAE comparison of grid connected power generation through GRUNet, LSTMNet, HCLNet, HAELNet. (d) Validation MAE comparison of grid connected power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (e) Actual MAE comparison of solar radiance through GRUNet, LSTMNet, HCLNet, HAELNet models. (f) Validation MAE comparison of solar radiance through GRUNet, LSTMNet, HCLNet, HAELNet models. GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; MAE: mean absolute error; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
MAE and validation MAE comparison table among GRUNet, LSTMNet, HCLNet and hybrid HAELNet.
GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; MAE: mean absolute error; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
Figure 9(c) and (d) show the MAE loss and validation MAE loss graphs, respectively, for the “GCPG” parameter, with consistent patterns across models. According to the comparison in Table 2, the hybrid HAELNet model is the most accurate, with the lowest MAE of 0.0609 and the lowest validation MAE of 0.0599. This again demonstrates the HAELNet model's strong ability to predict the GCPG parameter accurately.
The hybrid HAELNet model also achieves the greatest accuracy in predicting the “SR” parameter, with an MAE loss of 0.1125, compared with 0.1461 for HCLNet, 0.1618 for LSTMNet, and 0.2145 for GRUNet. Similar results were found for the validation MAE, where the HAELNet model scored the lowest value of 0.1111. The hybrid model's precision in forecasting the “SR” parameter, relative to HCLNet and the other two models (LSTMNet and GRUNet), can be seen in Figure 9(e) and (f).
Figure 10(a) and (b) present the MSE loss and validation MSE loss for the “DPG” parameter across different models, with the hybrid HAELNet clearly showing superior performance. Table 3 confirms that the HAELNet model has the lowest MSE value of 0.0244 and validation MSE value of 0.0241. In contrast, the HCLNet, LSTMNet, and GRUNet models exhibit higher MSE values of 0.0333, 0.0976, and 0.1170, and validation MSE scores of 0.0329, 0.0971, and 0.1164, respectively. This strongly demonstrates that the HAELNet model not only offers superior accuracy but also greater efficiency in predicting the “DPG” parameter.

(a) Actual MSE comparison of daily power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (b) Validation MSE comparison of daily power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (c) Actual MSE comparison of grid connected power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (d) Validation MSE comparison of grid connected power generation through GRUNet, LSTMNet, HCLNet, HAELNet models. (e) Actual MSE comparison of solar radiance through GRUNet, LSTMNet, HCLNet, HAELNet models. (f) Validation MSE comparison of solar radiance through GRUNet, LSTMNet, HCLNet, HAELNet models. GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; MSE: mean squared error; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
MSE and validation MSE comparison table among GRUNet, LSTMNet, HCLNet and hybrid HAELNet.
GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; MSE: mean squared error; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
Figure 10(c) and (d) compare the MSE and validation MSE for the “GCPG” parameter across the models, highlighting the strong performance of the HAELNet hybrid model. Table 3 supports this observation, showing that the HAELNet framework attains the lowest MSE value of 0.0096. In contrast, the MSE values for the HCLNet, LSTMNet, and GRUNet models are higher, at 0.0098, 0.0386, and 0.389, respectively. Similarly, the HAELNet model has the lowest validation MSE value of 0.0094, followed by HCLNet (0.0097), LSTMNet (0.0380), and GRUNet (0.0383). This demonstrates the improved accuracy and efficiency of the HAELNet model in predicting the “GCPG” parameter.
Table 3 indicates that the hybrid HAELNet model has the lowest MSE of 0.0248 for the “SR” parameter, below the MSE scores of the HCLNet (0.0321), LSTMNet (0.5Nou), and GRUNet (0.849) models. Similarly, the hybrid HAELNet model has the lowest validation MSE of 0.0246. Figure 10(e) and (f) visually represent the MSE outcomes, further highlighting the HAELNet model's superior performance in reducing MSE across different phases. These findings underscore the effectiveness and precision of the HAELNet model in enhancing forecasting accuracy compared to the other models.
The mathematical formulation of the MSE is given in Equation 12 (Qu et al., 2021).
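For reference, the standard definition of the MSE is reproduced below. The notation is an assumption and may differ from that of Equation 12: n is the number of samples, y_i the actual value, and ŷ_i the predicted value.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```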
Table 4 presents a comparative analysis of the models' MAPE and RMSE outcomes for a 24-h-ahead time step. Our proposed hybrid model, HAELNet, demonstrates lower MAPE and RMSE values, indicating its superiority over the other models. Moreover, a close examination of the graphical depictions highlights a remarkable degree of resemblance between predicted and actual data, indicating that accuracy surpasses 95% and that no outliers were observed. These findings reinforce the suitability and reliability of the data derived from the solar power plant, providing a sturdy foundation for further analysis and decision-making.
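The two metrics compared in Table 4 can be sketched as follows, using their standard definitions. The function names `mape` and `rmse` and the sample numbers are hypothetical, and expressing MAPE in percent is an assumption about how the table reports it.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, i.e. the square root of the MSE."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up actual and predicted values, not taken from the plant data.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing parameters with different units such as kWh, MW, and MJ/m², whereas RMSE is expressed in the units of the data it is computed on.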
MAPE and RMSE comparison table among GRUNet, LSTMNet, HCLNet and hybrid HAELNet.
GRUNet: gated recurrent unit network; LSTMNet: long short-term memory network; MAPE: mean absolute percentage error; HCLNet: hybrid convolutional neural networks with LSTMNet; HAELNet: hybrid autoencoder LSTMNet.
Figures 11 to 13 provide a detailed overview of the results from the study of the 100 MW solar plant, showing forecast outcomes based on the data assessment. In this study, four ML models, GRUNet, LSTMNet, HCLNet, and the proposed hybrid autoencoder-LSTM model (HAELNet), were trained using 80% of a year's real-time data across three essential parameters: “Daily Power Generation (DPG) in kWh,” “Grid Connected Power Generation (GCPG) in MW,” and “Solar Radiance (SR) in MJ/m².” The remaining 20% of the data was set aside for validation and testing. The first part of each figure compares the 60-day test data from the solar facility with the corresponding predictions. The second part compares the forecasts for the forthcoming year with the actual data from that year.

Test and predicted data comparison for a solar plant's daily power generation.

Test and predicted data comparison for a solar plant's grid connected power generation.

Test and predicted data comparison for a solar plant's solar radiance.
Figure 11 illustrates the variation in the “DPG (kWh)” parameter, with the vertical axis ranging from 100,000 to 600,000 kWh and the horizontal axis showing the days. The graph compares the forecasts from the GRUNet, LSTMNet, HCLNet, and hybrid HAELNet models with the actual data from the 60-day test period and with the actual data of the following year. The predictions from the hybrid HAELNet model are closely aligned with both the test data and the next-year data, with minor discrepancies in certain areas. Predictions from the HCLNet and LSTMNet models are less accurate, possibly due to their narrower ranges, but they perform better than GRUNet. Given these assessments, the next step is to use the trained models to predict DPG for the entire next year. Moreover, Table 4 reveals that the HAELNet model attains the lowest MAPE value of 1.221, while the HCLNet model records 1.891, the LSTMNet model 2.713, and the GRUNet model 2.894. These findings support the proposed hybrid HAELNet model as the most accurate in predicting DPG values.
Figure 12 presents the “GCPG (MW)” parameter, with the vertical axis ranging from 40 to 90 MW and the abscissa denoting the days. The graph compares the forecasts from the GRUNet, LSTMNet, HCLNet, and hybrid HAELNet models with the empirical data from the 60-day testing interval and with the actual data from the subsequent year. The predictions from the hybrid HAELNet model closely mirror both the test data and the next-year data, showing only minor deviations in certain areas. In contrast, the HCLNet model's predictions do not match the test data as closely, though they are more accurate than those from the LSTMNet model. Based on this comparative analysis, the next step is to forecast GCPG for the entire upcoming year using the trained models. Furthermore, Table 4 shows that the HAELNet model has the lowest MAPE value at 2.282, followed by HCLNet, marginally higher at 2.384, LSTMNet at 2.434, and GRUNet, the highest at 2.512. The hybrid HAELNet model thus provides the closest fit to the test data, demonstrating its higher accuracy and precision in GCPG forecasting.
The “SR (MJ/m²)” parameter is examined in Figure 13, with the horizontal axis representing the days and the ordinate spanning 0 to 30 MJ/m². The figure shows the predictions of the GRUNet, LSTMNet, HCLNet, and hybrid HAELNet models alongside the actual data from the 60-day testing period and the subsequent year's data. The forecast from the HAELNet model closely matches both the test data and the next-year data, with only minor variations in a few areas. The predictions of the HCLNet model are more accurate than those of the LSTMNet model, which in turn outperforms GRUNet, but they are less consistent with the test data than those of HAELNet. Following this comparative study, SR predictions for the forthcoming year will be generated using the trained models. As can be seen in Table 4, the HAELNet hybrid model has the lowest MAPE value at 2.131, while the MAPE of the HCLNet model is higher at 2.604, that of LSTMNet is noticeably higher at 3.362, and that of GRUNet is the highest at 3.813. This demonstrates the high accuracy of the hybrid HAELNet model in predicting radiance values, as well as its excellent agreement with the test data and low variability.
Figure 14(a) to (c) display scatter plots that juxtapose the actual data with the predicted data generated by our proposed HAELNet model. The results indicate that the predicted data closely align with the actual data in both the testing and prediction stages, showing minimal discrepancies. This underscores the model's accuracy and validates its efficacy.

Actual and predicted data comparison via scatter plot.
Limitations
One notable limitation of this study is the use of historical data from a utility-scale solar power plant equipped with 330 Wp panels. Although this data reflects actual operational conditions and provides a realistic testing environment for our proposed forecasting models, it may not fully represent the performance and efficiency profiles of newer, high-capacity PV technologies. As solar technology continues to evolve, newer panels offer improved efficiency, better thermal management, and reduced land and installation requirements.
Another limitation of this study concerns the panel density required for the 100 MW capacity, utilizing over 400,000 panels across roughly 650 acres. This panel density increases associated costs related to installation, regular maintenance, cleaning, and land acquisition, factors that are especially critical in densely populated or land-scarce areas. While the core strength of our work lies in validating the HAELNet model's ability to capture temporal patterns and forecast solar power accurately, future research should explore its application to data from modern facilities using higher-efficiency panels. This will help evaluate the model's scalability and effectiveness in newer PV environments, ensuring its continued relevance and adaptability to ongoing advancements in solar technology and deployment strategies.
Conclusion
Accurate forecasting plays a crucial role in integrating renewable energy sources like solar power into modern electrical grids. Due to the variable and weather-dependent nature of solar energy, precise prediction of the key parameters, DPG, GCPG, and SR, is essential for maintaining grid stability, optimizing production, and ensuring efficient energy management. This study introduces HAELNet, a novel hybrid model that combines the strengths of autoencoders and LSTM networks to improve forecasting performance. Empirical evaluations reveal that HAELNet consistently outperforms the benchmark models, GRUNet, LSTMNet, and HCLNet, across multiple metrics. Specifically, HAELNet achieved the lowest RMSE values: 0.1564 for DPG, 0.0979 for GCPG, and 0.1576 for SR, demonstrating its superior predictive capability. These results underscore HAELNet's practical utility in real-world energy systems. Beyond technical accuracy, improved forecasting enables better energy scheduling, reduces operational uncertainties, lowers economic losses, and facilitates smoother integration of solar energy into smart grids. It supports grid resilience and long-term sustainability by enabling proactive energy management strategies. Overall, this research highlights how advanced machine learning models like HAELNet can drive meaningful progress in renewable energy forecasting, contributing to cleaner, smarter, and more sustainable power systems.
Future research should focus on applying diverse deep learning architectures, including hybrid and fusion models, to time-series forecasting of renewable energy. In particular, validating the proposed AI models using datasets from modern PV facilities with high-efficiency panels will be essential to assess their scalability, adaptability, and practical relevance in evolving solar energy environments.
Footnotes
Author contributions
Ahsan Zafar, Aamina Ahsan, and Muhammad Zain Yousaf: Conceptualization, methodology, software, visualization, investigation, writing—original draft preparation. Mohit Bajaj and Wajid Khan: Data curation, validation, supervision, resources, writing—review & editing. Zahid Ullah, Mustafa Abdullah, and Ievgen Zaitsev: Project administration, supervision, resources, writing—review & editing.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
