Abstract
In blast furnace smelting, the silicon content of molten iron is an important indicator of the furnace temperature trend. Because the smelting process is multi-scale, non-linear, strongly coupled and subject to large time delays, control of the silicon content in hot metal is often unsatisfactory. An effective and accurate control method is therefore very important for blast furnace smelting. This paper proposes a prediction and control model for the silicon content in blast furnace hot metal based on GRA–LSTM–BAS. Firstly, the original data set is processed using wavelet analysis and normalisation. Secondly, gray relational analysis (GRA) is used to analyse the correlation between the candidate input variables and the silicon content in order to determine the input parameters of the model. Subsequently, a long short-term memory (LSTM) prediction model is established to obtain silicon content values at future times through feedback correction. The model is trained and tested on data collected on site and compared with a support vector machine (SVM) model. The results show that the LSTM model predicts the silicon content in hot metal quickly and accurately and provides useful guidance for actual blast furnace production. Finally, the control model for the silicon content in molten iron is optimised iteratively using the beetle antennae search (BAS) algorithm. The results are fed back and the model is updated in real time according to the errors, forming a closed-loop controller that maintains the silicon content in molten iron at an appropriate level and achieves optimal control.
Introduction
The silicon content of molten iron is the main indicator of the thermal state of the blast furnace and an important indicator of pig iron quality.1,2 Keeping it within a certain range plays an important role in accurately controlling the blast furnace temperature, improving the quality of the molten iron and reducing the coke ratio. Because of the high-pressure, high-dust environment of the blast furnace ironmaking process, relying solely on the experience of the furnace manager to judge the silicon content in molten iron can lead to delayed and inaccurate results, and improper operation may then cause slipping, collapse or even scrapping of the molten iron.3 Therefore, it is necessary to establish an accurate and reliable prediction model for the silicon content in molten iron that reflects the deviation of the internal temperature and index parameters from their expected values and provides operators with information about the furnace temperature and molten iron quality.4,5
Regarding the above issues, domestic and international scholars have done a great deal of research. Ralf proposed a time series approach applicable to silicon content and temperature,6 which not only improves prediction accuracy but also provides a useful framework for practical modelling and control of the production process. Tathagata successfully applied the partial least squares (PLS) technique to silicon content prediction, using the structure of the data set to identify the main process variables, handle large amounts of highly collinear data and assist with data reduction.7 Jian combined neural networks with qualitative analysis to construct a neural network model for blast furnace prediction systems through causal analysis and qualitative reasoning.8 The subspace method has also been used to identify a model between input and output variables in order to predict the silicon content in molten iron.9 In addition, the silicon content has been studied in terms of its chaotic10 and fractal11,12 characteristics.
These studies provide an important theoretical basis for the control of furnace temperature. Among current control methods, the most popular approach to furnace temperature control is fuzzy control.13 However, this scheme suffers from poor control accuracy and serious overshoot, which make it difficult to control blast furnace smelting accurately. To address the shortcomings of existing methods for modelling and controlling the blast furnace ironmaking process, Zhou et al. established a data-driven non-linear subspace model based on LS-SVM. This model can effectively guide the blast furnace smelting process and be used to build a closed-loop furnace temperature control system.14 Bo Yang et al.15 used a neural network combustion optimisation control method to control a storage heater and made the system adaptive by adjusting the neural network power coefficients in real time. Furthermore, mechanistic and semi-mechanistic models based on blast furnace production have also been developed,16 such as the Wu model developed by the French Iron and Steel Research Institute,17 the E index model proposed by the Belgian Metallurgical Research Center18 and the TS index model developed by Sumitomo, Japan.19
However, these models all have certain shortcomings, such as long calculation time, poor interpretability, overfitting, convergence to local optima and slow training, so they cannot deliver optimal control. In order to improve the accuracy and stability of silicon content prediction and control, a new model based on gray relational analysis (GRA), long short-term memory (LSTM) and beetle antennae search (BAS) is proposed in this paper. GRA is applied to the selection of model input parameters to provide effective data for model building. In this way, the LSTM model can better capture key features in the input data, which improves its accuracy and generalisation ability and yields reliable prediction results. At the same time, a rolling optimisation controller is designed based on the BAS algorithm so that the silicon content in molten iron can be both predicted and controlled.
Mathematical processing
In this paper, the production data of the #4 blast furnace of a steel plant in Chengde from January 1, 2022 to December 31, 2022 are used as the research basis. The data cover 69 different parameters, including the distribution system, raw fuel data and blast furnace status data, with more than 550,000 records in total. Production record data are monitored on an hourly basis, testing data are stored per furnace batch, and real-time monitoring data are recorded at second-level frequency. The blast furnace is a complex system that is non-linear, has large time delays and has multiple distributed parameters. It contains many sensors, which work in a harsh environment. Because of the resulting interference, the collected data are often accompanied by a large amount of noise.20 This noise can interfere with the model and reduce its performance. To meet the modelling accuracy requirements, the data collected online must be processed to remove this noise. The data are then normalised to reduce the differences in scale between variables, which is of great significance for modelling.
De-noise processing
At present, among the many signal de-noising methods, wavelet analysis is widely used and has achieved good results.21 Wavelet analysis can evaluate signals in both the time and frequency domains, offers high localisation and multi-resolution properties, and overcomes many of the drawbacks of classical Fourier analysis. Each wavelet transform decomposes the signal into an approximation component and a detail component, where the approximation is the lower-frequency component and the detail is the higher-frequency component. Thus, by repeatedly applying the wavelet transform to the approximation component, the high-frequency part of the signal can be stripped away and noise reduction can be achieved.22 The effect of wavelet noise reduction on wind pressure, top pressure and oxygen enrichment is shown in Figure 1.

Diagram of wavelet analysis before and after processing.
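The decomposition and reconstruction described above can be illustrated with a minimal sketch. It assumes a Haar wavelet, two decomposition levels and soft thresholding of the detail coefficients; the paper does not state which wavelet basis, level or threshold was used, so these choices and all function names are illustrative only.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar normalisation factor


def haar_step(signal):
    """One Haar level: split into approximation (low-frequency) and detail (high-frequency)."""
    approx = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * S, (a - d) * S])
    return out


def soft_threshold(coeffs, thr):
    """Shrink coefficients towards zero; small (noise-like) ones become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def wavelet_denoise(signal, levels=2, thr=0.1):
    """Denoise a signal whose length is divisible by 2**levels (assumed threshold thr)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        # Keep decomposing the approximation component, as described in the text.
        approx, detail = haar_step(approx)
        details.append(soft_threshold(detail, thr))
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx


noisy_top_pressure = [1.0, 1.05, 0.95, 1.0, 1.02, 0.98, 1.0, 1.0]  # made-up values
smoothed = wavelet_denoise(noisy_top_pressure)
```

With a threshold of zero the function returns the input unchanged, which is a quick way to check the transform pair before tuning the threshold on real data.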
Normalisation process
Normalising the data is necessary in order to prevent errors caused by varying data attributes. Several methods exist for normalising the data, and in this paper, we employ the method of maximum and minimum values to scale the data between (0, 1), which narrows the range of values and simplifies model computation. This is shown in Eq. (1).

Normalisation result chart.
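As a small illustration of the maximum and minimum method, the usual form is x' = (x - x_min) / (x_max - x_min). The sketch below assumes this standard form; the function name and the sample values are hypothetical, and the endpoints 0 and 1 are attained by the smallest and largest samples.

```python
def min_max_normalise(values):
    """Scale a sequence linearly so that its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


wind_temperature = [1100, 1150, 1200]  # made-up values in degrees Celsius
scaled = min_max_normalise(wind_temperature)
```

In practice the minimum and maximum would be taken from the training set only and reused for the test set, so that both are scaled consistently.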
Model building
Selection of input variables
Mechanistic analysis
Predicting and modelling the silicon content in blast furnace hot metal is challenging because of the many parameters that affect it and the fluctuations in furnace conditions. To overcome this challenge, some theoretical analysis is needed to identify the input variables that contribute most to the model's predictive ability. This, in turn, can guide the optimisation of feature engineering. State parameters, which result from the control parameters, reflect the current operating state of the blast furnace. Many parameters affect the smooth operation of the blast furnace. The main factors affecting the silicon content in molten iron are:
1) Pulverised coal injection amount
2) Wind temperature
3) Gas utilisation rate
4) Theoretical combustion temperature
The injected coal amount refers to the quantity of pulverised coal injected into the blast furnace through the tuyere during the smelting process. When pulverised coal is injected, it must absorb a large amount of heat to reach the temperature of its surroundings, which lowers the heat in the furnace and hence the silicon content of the iron.23
The wind temperature refers to the temperature of the hot air blast. Increasing the wind temperature is a crucial technical and economic indicator in the current smelting industry. Raising the wind temperature increases the physical heat brought in by the blast, replacing some of the heat generated by coke combustion. As a result, the consumption of coke is reduced, leading to a decrease in the amount of gas produced, which, in turn, reduces the heat carried away by the gas, thereby increasing the furnace's heat input.24 A decrease in silicon content in the iron is a consequence of this process.
The utilisation rate of blast furnace gas is a measure of the conversion of CO to CO2 in the gas–solid phase reduction reaction inside the blast furnace during the smelting process. It is an important indicator for evaluating the degree of indirect reduction reaction inside the furnace. The gas utilisation rate determines the level of selectable silicon, as well as production, consumption and carbon indicators. Increasing the gas utilisation rate reduces the coke ratio and the amount of SiO2 brought in by coke, directly reducing the silicon content in pig iron.
The theoretical combustion temperature refers to the temperature reached when all the heat released by the complete combustion of combustibles with air under adiabatic conditions is used to heat the combustion products. The theoretical combustion temperature in the tuyere raceway represents the temperature in the raceway, and it has a significant impact on the silicon content in hot metal. All variables that affect this temperature also affect the silicon content. Because high coal injection rates reduce the temperature in the tuyere raceway, reducing coal injection increases it. Therefore, the amount of coal injection is also an important factor affecting the silicon content in hot metal.25
During the ironmaking process in a blast furnace, various factors affect the migration behaviour of silicon, ultimately determining the level of silicon in hot metal. When analysing input variables for the model, different processing methods are used for these parameters depending on their impact on hot metal silicon content based on blast furnace process parameters. Parameters that have little impact on hot metal silicon content, such as furnace bottom temperature and bosh temperature, are directly eliminated. Parameters that have a significant impact on hot metal silicon content, such as pulverised coal injection rate, wind temperature, gas utilisation rate and theoretical combustion temperature, are selected as input parameters for the model. For parameters that are related to hot metal silicon content in the process but whose correlation is uncertain, they require further analysis. Based on an analysis of the blast furnace ironmaking process, the following factors related to hot metal silicon content are selected as gray correlation analysis factors: coke ratio, air volume, overall furnace pressure difference, permeability, top temperature, top pressure, oxygen enrichment, CO and carbon content (hot metal).
Gray correlation analysis
GRA is a method for quantitatively describing and comparing the developmental trend of a system. Its basic idea is to determine the degree of geometric similarity between a reference data column and several comparative data columns to determine the closeness of their relationship, which reflects the correlation between curves. This method is used to determine the degree of correlation between iron silicon content and input variables. The calculation method of correlation degree is as follows:
Step 1: Process the raw data
The first step is to normalise the sequences of iron silicon content and the sequences of the main factors affecting the iron silicon content by taking the average, that is, dividing the data in each sequence by the average of that sequence. The resulting new sequences are the mean sequences of each sequence. The normalised data is then divided into two categories: reference sequences and comparison sequences.
Assuming that the reference column is the data for the silicon content of the iron after the homogenisation process:
Step 2: Calculate the gray correlation coefficient
After the first step of processing, at time
Step 3: Find the correlation
Correlation is shown as Eq. (5):
Step 4: Relevance ranking
The calculated correlation values
Correlation degree analysis of influencing factors of silicon content.
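The four steps above can be sketched in code. This is a minimal illustration that assumes the common Deng form of the gray relational coefficient with a resolution coefficient of 0.5; the paper's own equations and symbols are not reproduced here, and the function name, the resolution value and the sample data are assumptions.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Rank comparison sequences by their gray relational grade to a reference sequence.

    reference: list of silicon content values
    factors:   dict mapping a factor name to a list of the same length
    rho:       resolution coefficient (0.5 is a common default, assumed here)
    """
    def mean_normalise(seq):
        # Step 1: divide every value by the mean of its own sequence.
        mean = sum(seq) / len(seq)
        return [v / mean for v in seq]

    x0 = mean_normalise(reference)
    diffs = {
        name: [abs(a - b) for a, b in zip(x0, mean_normalise(seq))]
        for name, seq in factors.items()
    }
    all_diffs = [d for ds in diffs.values() for d in ds]
    d_min, d_max = min(all_diffs), max(all_diffs)

    grades = {}
    for name, ds in diffs.items():
        if d_max == 0.0:
            grades[name] = 1.0  # every sequence coincides with the reference
            continue
        # Step 2: relational coefficient at each time point.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        # Step 3: the grade is the mean coefficient over time.
        grades[name] = sum(coeffs) / len(coeffs)

    # Step 4: rank from most to least related.
    return dict(sorted(grades.items(), key=lambda kv: kv[1], reverse=True))


silicon = [0.40, 0.45, 0.50, 0.55]                  # made-up values
candidates = {"coke_ratio": [8, 9, 10, 11],          # made-up values
              "top_temperature": [11, 10, 9, 8]}
ranking = grey_relational_grades(silicon, candidates)
```

Factors whose grade falls below a chosen cut-off would then be dropped from the model inputs; the cut-off itself is a modelling decision that this sketch does not fix.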
According to the results of the mechanistic analysis, the pulverised coal injection amount, wind temperature, theoretical combustion temperature and gas utilisation rate are selected as input parameters of the model. Based on the gray correlation analysis, the coke ratio, carbon content (molten iron), furnace pressure difference, permeability, top pressure and oxygen enrichment are also selected as input variables. In summary, 10 process parameters are used as input parameters for the neural network prediction model, as shown in Table 2. To facilitate the establishment of the model, these parameters are divided into state parameters and control parameters according to process knowledge.
The table of model input parameters.
The model building of [Si] content prediction
LSTM neural network
The LSTM model is a type of recurrent neural network (RNN) designed to process time series and make predictions.26 It was proposed mainly to solve the vanishing gradient problem of RNNs, and it has specially designed memory units and a more precise information transmission mechanism than a plain RNN. Figure 3 shows the LSTM network structure, which mainly consists of forget gates, input gates, cell update gates and output gates. In the figure, a represents the state of the previous time step neuron, b represents the output of the previous time step neuron, and c represents the current input value.

LSTM network structure diagram.
The forgetting gate
The input gate
The undetermined cell state
After the forgetting gate and the input gate, the outputs of both update the cell state
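The gate computations described above can be made concrete with a sketch of a single LSTM time step. It assumes the standard LSTM formulation (sigmoid gates, tanh candidate state, element-wise state update); the parameter names such as `Wf` and `bf` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Compute W @ v + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden output and the new cell state."""
    z = h_prev + x_t  # list concatenation: previous output followed by current input
    f = [sigmoid(a) for a in affine(params["Wf"], z, params["bf"])]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], z, params["bi"])]    # input gate
    g = [math.tanh(a) for a in affine(params["Wc"], z, params["bc"])]  # candidate cell state
    o = [sigmoid(a) for a in affine(params["Wo"], z, params["bo"])]    # output gate
    # Forget part of the old state, then add the gated candidate state.
    c_t = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h_t = [ot * math.tanh(ct) for ot, ct in zip(o, c_t)]
    return h_t, c_t


# Tiny example: 3 inputs, 2 hidden units, all-zero parameters (illustrative only).
hidden, inputs = 2, 3
params = {k: [[0.0] * (hidden + inputs) for _ in range(hidden)] for k in ("Wf", "Wi", "Wc", "Wo")}
params.update({k: [0.0] * hidden for k in ("bf", "bi", "bc", "bo")})
h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], params)
```

With all parameters at zero every gate outputs 0.5 and the candidate state is zero, so the cell state is simply halved, which makes the role of the forget gate easy to see.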
Model construction and result analysis
In this paper, an iron silicon content prediction model is constructed based on LSTM. Ten parameters including blast furnace differential pressure, permeability, coke ratio, carbon content (molten iron), top pressure, gas utilisation rate, theoretical combustion temperature, injected coal amount, oxygen enrichment rate and wind temperature are used as inputs, and the iron silicon content is the output. To avoid overfitting, the hidden layer is set to only one layer, and the number of neurons in the hidden layer is set to 32. The model uses mean squared error (MSE) as the loss function and adopts the adaptive Adam algorithm for optimisation during error backpropagation.
The predictive performance of a model should be evaluated from multiple perspectives. Commonly used evaluation metrics include the coefficient of determination (R2), root mean squared error (RMSE), MSE, mean absolute error (MAE), correlation coefficient (Corr), hit rate (e), recall rate, area under the curve (AUC), entropy, etc. These metrics can objectively evaluate the accuracy of the model and assess its performance in different tasks. They can also be used to compare different models and, especially when the models differ significantly, to choose the better one. However, using too many evaluation metrics can introduce noise into the evaluation, reducing the accuracy and stability of the model. It can also make the model susceptible to local optima rather than global optima, which reduces the model's precision and reliability.
Therefore, this study uses SVM as a comparative model and selects six indicators for evaluation: R2, RMSE,27 MAE, Corr, e and modelling time (T). R2 measures the goodness of fit of the fitted curve to the original data, with a value close to 1 indicating a better fit. RMSE represents the overall deviation between the true and predicted values of the samples. MAE evaluates the accuracy of the regression model in predicting real-valued variables, with a smaller MAE indicating higher prediction accuracy. Corr measures the linear correlation between variables and lies in the range [-1, 1]. A value of 0 means there is no linear correlation; a value of 1 means complete positive correlation, with changes in the same direction and of the same magnitude; a value of −1 means complete negative correlation, with changes in the opposite direction and of the same magnitude. For the task involved in this study, e represents the proportion of samples for which the difference between the predicted and true silicon content is within 0.01.
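The five accuracy indicators can be computed as in the sketch below (modelling time T would be measured separately with a timer). It uses the standard definitions of R2, RMSE, MAE and the Pearson correlation coefficient; the function name and the sample values are illustrative, and the tolerance of 0.01 for the hit rate follows the text.

```python
import math


def evaluate(y_true, y_pred, tol=0.01):
    """Return R2, RMSE, MAE, Pearson correlation and hit rate for two equal-length lists.

    Assumes y_true and y_pred are not constant, otherwise R2 and Corr are undefined.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)              # total sum of squares
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "Corr": cov / math.sqrt(ss_tot * ss_pred),
        "hit_rate": hits / n,
    }


measured = [0.40, 0.45, 0.50, 0.55]   # made-up silicon contents
predicted = [0.40, 0.45, 0.50, 0.60]  # made-up predictions
scores = evaluate(measured, predicted)
```

Reporting the hit rate alongside RMSE is useful because a model can have a low average error while still missing the tolerance band on a noticeable share of samples.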
The test set data were separately input into the two trained models, and the results are shown in Figure 4. The trends predicted by both models are almost identical to the real data trends, indicating that both LSTM and SVM accurately capture the data trends. As shown in Table 3, the coefficient of determination of the LSTM model is closer to 1, indicating a better fit and better interpretability of the model. In terms of relative prediction error, although both models achieved a hit rate of over 70%, the overall hit rate of the SVM model is much lower than that of the LSTM model, and the relative prediction error of the LSTM model is relatively stable. On the other hand, the relative prediction error of the SVM model showed a large deviation when predicting for a longer time, indicating that the LSTM model has higher prediction accuracy.

Prediction results of the model for each index.
Prediction ability of each model to Si content.
Study of optimal control model for [Si] content
Theoretical basis of [Si] content control
With the development of technology, China's blast furnace ironmaking performance has improved continuously, and some of the problems encountered in blast furnace ironmaking have been well solved. As technology advances, many steel companies are shifting their focus to low-silicon smelting, because the fuel ratio, and therefore the production cost, is closely related to the silicon content of the iron. If the iron in the blast furnace contains less silicon, the fuel ratio for ironmaking, and with it the production cost, will not be too high. In addition, the silicon content in the iron affects the performance of the steel. High silicon content can increase the hardness and strength of the steel, but it reduces its toughness. Therefore, to obtain the best performance, it is necessary to control the silicon content in the iron.
The control system consists of three parts: a prediction module, a PID control module and an adjustment module. The prediction module uses the prediction model established in the prediction section to predict the silicon content of the molten iron. The PID control module then adjusts the raw material composition according to the deviation of the predicted silicon content from the target value. The main principle of the PID controller for controlling the silicon content in the iron is achieved by feedback control. By detecting changes in the silicon content of the iron, the control variables: blast temperature and the amount of oxygen enrichment are adjusted according to certain rules to maintain the silicon content within a specified range.
The PID controller regulates the silicon content in the molten iron by adjusting the wind temperature and oxygen enrichment. For instance, when high silicon content is detected, high wind temperature can be used to produce low silicon pig iron. The high wind temperature not only reduces the coke rate, increases production and enhances the physical heat in the furnace, but also moves the high-temperature zone in the furnace downwards, improves furnace conditions, expands the indirect reduction zone and shifts the softening-melting zone (lower part) of the blast furnace downwards, which is beneficial for smelting low-silicon pig iron. 28 Therefore, the PID controller can precisely control the silicon content in the molten iron to maintain it within the specified range. The adjustment module adjusts the composition of raw materials in the furnace based on the control results of the PID control module.
However, there is no unified standard for the silicon content in pig iron. Steel companies should determine the optimal silicon content of pig iron based on their own situation and furnace conditions, taking into account the smelting effect and the cost of pig iron. In China's current situation, the silicon content in pig iron is generally high (the possibility of the silicon content in pig iron being too low has not been considered yet). At present, the silicon content in pig iron may be appropriate at 0.3% to 0.4%. 29
Optimised design of PID controller based on BAS algorithm
BAS algorithm
BAS is an intelligent bionic optimisation algorithm proposed in 2017.30 It is inspired by the foraging behaviour of longhorn beetles and initialises the system with a set of random solutions. By comparing the odor concentration perceived by the beetle's two antennae, it updates the position of the beetle's next flight and moves towards the global optimum through iteration. Compared to other intelligent algorithms, BAS does not require gradient information and has a simple principle that is easy to implement, which results in a faster convergence rate. The search steps are as follows:
Step 1: Set the position of the beetle in D-dimensional space as
Step 2: Define the left and right antenna positions. As shown in Eq. (12), the positions of the two antennae are defined as
Step 3: The search capability of the BAS algorithm is affected by the step size, and a suitable step size
Step 4: Calculate the odor intensity of the left and right antennae of the longhorn beetle according to the fitness function
Step 5: During the iterative process of the BAS algorithm, the position updating rule is shown in Eq. (16). To determine whether to accept the proposed updated position, compare the fitness value at the current position with the fitness value at the proposed updated position. If the fitness value at the proposed updated position is better than that at the current position, update the current position to the proposed updated position. Otherwise, keep the current position unchanged.
Step 6: Check the termination conditions. The termination conditions of the algorithm are usually set according to the following cases. Firstly, the maximum number of iterations has been reached. Secondly, within a certain number of iterations, the value of the algorithm remains unchanged and a better solution cannot be found. Thirdly, the algorithm has converged.
Where,
BAS has been widely and successfully applied to function optimisation, neural network training for pattern classification, fuzzy system control and other application domains. In this paper, the BAS algorithm is used to optimise the design of the PID controller parameters. The specific implementation steps are shown in Figure 5.

Flow chart of BAS algorithm.
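The search steps can be sketched as follows for a minimisation problem. This is a minimal illustration of the commonly published BAS update (random unit direction, two antenna probes, signed step, geometric step decay), combined with the accept-only-if-better rule described in the text. The function name, the decay factor `eta`, the initial antenna length `d0` and the initial step `step0` are assumed values, not the paper's settings.

```python
import math
import random


def bas_minimise(f, x0, d0=1.0, step0=1.0, eta=0.95, iters=100, seed=0):
    """Beetle antennae search for a minimum of f, starting from x0."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    d, step = d0, step0
    for _ in range(iters):  # termination: maximum number of iterations
        # Random unit vector giving the orientation of the antennae.
        b = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in b)) or 1.0
        b = [v / norm for v in b]
        # Positions of the left and right antennae.
        x_left = [xi + d * bi for xi, bi in zip(x, b)]
        x_right = [xi - d * bi for xi, bi in zip(x, b)]
        # Odor intensity difference decides which side to move towards.
        diff = f(x_left) - f(x_right)
        sign = (diff > 0) - (diff < 0)
        candidate = [xi - step * bi * sign for xi, bi in zip(x, b)]
        f_candidate = f(candidate)
        # Accept the move only if the fitness improves.
        if f_candidate < fx:
            x, fx = candidate, f_candidate
        # Shrink antenna length and step size so the search becomes finer.
        d = eta * d + 0.01
        step = eta * step
    return x, fx


def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = bas_minimise(sphere, [3.0, -2.0])
```

Because only one beetle is simulated and each iteration needs three fitness evaluations, the method is cheap per iteration, at the price of sensitivity to the step-size schedule.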
PID control
PID control is a common control method. Based on proportional control, PID control introduces integration to eliminate errors, and adds derivative control to improve system stability.31 The PID controller itself is a linear controller.32

PID control structure diagram.
The feedback control process is achieved through a PID controller. This operates by detecting changes in the silicon content of molten iron and adjusting the control variable according to a set of rules, in order to maintain the silicon content within a specified range. Firstly, the PID controller detects the silicon content in the molten iron and sends a feedback signal to the control system. The control system then adjusts the control variable appropriately based on the feedback signal, in order to regulate the silicon content in the molten iron. This process is repeated continuously to ensure that the silicon content in the molten iron remains within the specified range.
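The feedback loop described above can be sketched with a discrete PID controller, using the textbook control law u = Kp·e + Ki·∫e dt + Kd·de/dt with e the deviation from the setpoint. The first-order plant below is a purely hypothetical stand-in for the silicon response of the furnace, and the gains, time constant and setpoint of 0.35 are illustrative values rather than the paper's.

```python
class PID:
    """Discrete PID controller with rectangular integration and a backward difference."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # No derivative action on the first sample, when no previous error exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid, setpoint, steps, dt, tau=5.0, gain=1.0):
    """Closed loop with an assumed first-order plant: tau * dy/dt = gain * u - y."""
    y = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, y)        # controller reads the measured output
        y += dt * (gain * u - y) / tau     # plant responds to the control variable
    return y


controller = PID(kp=2.0, ki=0.5, kd=0.0, dt=0.1)
final_output = simulate(controller, setpoint=0.35, steps=600, dt=0.1)
```

The integral term is what removes the steady-state offset here; with proportional action alone this plant would settle below the setpoint.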
Design of the optimal control model
The PID controller's parameters are optimised using the BAS algorithm, as shown in Figure 7. The bridges between the BAS algorithm and the Simulink model are the PID controller parameters and the control system performance indicators. The optimisation process is as follows: firstly, the new positions generated by the BAS algorithm are assigned to the PID controller parameters

Diagram of optimised control prediction model.
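The coupling between the search algorithm and the control loop can be illustrated with a self-contained sketch in which a Python simulation stands in for the Simulink model. Each beetle position is interpreted as a gain triple (Kp, Ki, Kd), the closed loop is simulated, and a performance index is returned as the fitness. The first-order plant, the ITAE index, the non-negativity clamp and all numerical settings are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def itae_cost(gains, setpoint=0.35, dt=0.1, steps=300, tau=5.0):
    """Simulate a PID loop on an assumed first-order plant and return the ITAE index."""
    kp, ki, kd = gains
    y, integral, cost = 0.0, 0.0, 0.0
    prev_err = setpoint - y
    for k in range(steps):
        err = setpoint - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        u = kp * err + ki * integral + kd * deriv
        y += dt * (u - y) / tau
        if not math.isfinite(y) or abs(y) > 1e6:
            return float("inf")             # treat an unstable loop as the worst fitness
        cost += (k * dt) * abs(err) * dt    # time-weighted absolute error
    return cost


def bas_tune(cost, x0, d=0.5, step=0.5, eta=0.95, iters=60, seed=1):
    """Tune non-negative gains with a basic beetle antennae search."""
    rng = random.Random(seed)
    x, fx = list(x0), cost(x0)
    for _ in range(iters):
        b = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in b)) or 1.0
        b = [v / norm for v in b]
        left = [max(0.0, xi + d * bi) for xi, bi in zip(x, b)]
        right = [max(0.0, xi - d * bi) for xi, bi in zip(x, b)]
        diff = cost(left) - cost(right)
        sign = (diff > 0) - (diff < 0)      # zero if both probes are equally bad
        cand = [max(0.0, xi - step * bi * sign) for xi, bi in zip(x, b)]
        f_cand = cost(cand)
        if f_cand < fx:                     # keep the new gains only if the index improves
            x, fx = cand, f_cand
        d, step = eta * d + 0.01, eta * step
    return x, fx


initial_gains = [1.0, 0.1, 0.0]             # hypothetical starting point
tuned_gains, tuned_cost = bas_tune(itae_cost, initial_gains)
```

Because a move is accepted only when the index improves, the tuned cost can never exceed the cost of the starting gains; how much it improves depends on the plant model and on the step-size schedule.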
Simulation and result analysis
Model building
The control of silicon content in molten iron is achieved using a PID optimisation control method based on the BAS algorithm. In Section ‘Mathematical processing’, accurate prediction of silicon content in molten iron was achieved, enabling the controlled object to stabilise within a certain range and providing a basis for controlling the silicon content in molten iron. Based on this, a BAS controller was designed, which has the advantages of high control accuracy and fast control response. Firstly, the parameters of the BAS controller are set according to the following values: prediction steps

Model of the control system in the Simulink environment.
Firstly, the function (assignin) is called to assign the values of the newly generated positions to the variable
Analysis of simulation results
The optimisation process is shown in Figure 9. The optimal controller parameters obtained are: Kp = 0.000638, Ki = 0.000358, Kd = 0.067. These parameters are used in the Simulink model for simulation experiments, and the response curve obtained is shown in Figure 10. The simulation results show that the BAS controller is superior to the conventional PID controller in terms of overshoot, rise time and response speed. The BAS controller was the first to complete the tracking of the setpoint step signal and maintain stability. In addition, the conventional PID controller fell into a local optimum first, whereas the BAS-optimised PID controller was able to leave the region of a local optimum and continue searching other regions. Although its convergence rate has decreased, its local convergence ability has been enhanced, and the training ability has been improved. For unstable controlled objects, the optimal PID controller designed by the BAS algorithm makes the selection of

The response curves of the two furnaces’ system.

Comparison of adaptation change curves.
Conclusions
In order to tap the potential value of historical production data in steel mills and achieve timely guidance for production, this paper combines big data technology with machine learning to establish a silicon content prediction system suitable for enterprise production. The system achieves good prediction results and, based on this, a BAS-based PID controller for silicon content in molten iron was designed to strive to control the silicon content at a lower level, which can better meet the needs of enterprise production and bring economic benefits. The research process yielded the following conclusions:
1) In the process of using big data technology to analyse and mine blast furnace smelting data, data processing methods are particularly important. For industrial big data, there is a strong correlation between data and process, so it is necessary to fully consider the characteristics of the process and the impact of external factors on the data to ensure that the processed data are of good quality. Relying purely on feature selection by big data technology is not comprehensive. It is therefore necessary to analyse the results of feature selection from the perspective of the process, to avoid overlooking important parameters that seldom change but still affect production. This paper used process theory analysis and gray correlation analysis to optimise the parameters of the prediction model, select the optimal input parameters, and finally complete the construction of the blast furnace iron silicon content prediction model.
2) A blast furnace iron silicon content prediction model is established by analysing the characteristics of the prediction target and selecting the LSTM algorithm for the construction of the prediction model. The prediction model is validated on test samples, showing that the algorithm adapts well to the non-linear, large-lag and multi-operating-condition nature of the process. Compared with the traditional SVM method, the prediction accuracy reaches more than 90%, and the relative prediction error is only 0.0199, which indicates good prediction performance. This provides a theoretical basis for accurately predicting the blast furnace iron silicon content.
3) In response to the problems of large time delay and lag in industrial production, a BAS-based PID control strategy for iron silicon content is proposed, which overcomes the poor adaptability of traditional PID control. Matlab simulation analysis shows that the BAS-based PID control algorithm performs better than the traditional PID control algorithm, with a shorter response time and better convergence accuracy.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: financial support from the National Natural Science Foundation of China Youth Fund Project (52004096).
