Safety risk prediction based on improved particle swarm optimization and least squares support vector machine

Abstract

To address low accuracy, difficult parameter determination, and inefficient data use in safety risk prediction, this study proposes an intelligent prediction framework. It uses rough set theory to preprocess data (eliminating incomplete records, discretizing continuous data, and removing redundancy) and introduces an improved particle swarm optimization (IPSO) algorithm, which adds a mutation operator, to optimize the key parameters of the least squares support vector machine (LSSVM), forming the IPSO-LSSVM model. Experiments on 43 risk samples (33 for training, 10 for testing) show that the model outperforms the BP neural network and the standard PSO-LSSVM and converges faster. The work enriches the integrated intelligent algorithms available for safety risk prediction and provides a technical reference for improving prediction accuracy.

Keywords

Support vector machine, particle swarm optimization, rough set, safety risk prediction, least squares

Introduction

Rockburst prediction technology is based on in-depth study of the causes and formation mechanism of rockbursts, drawing on coal geological exploration data and recorded rockburst cases.^1–3 Using statistics on rockburst causes to identify the core triggering factors, assessing rockburst severity in coal geological formations in advance, and clarifying its spatial distribution can guide rockburst safety management and prevention technology and offer a reliable, targeted foundation for rockburst prevention and control.^4,5

Apart from the empirical analogy method, traditional rockburst prediction methods can be roughly divided into the following two categories:

(1) The first type is the local detection method based on the drilling cuttings method, which estimates the stress state from the amount of drill cuttings and then evaluates the rockburst risk. The drilling cuttings method is simple, reliable, intuitive, and widely used. Its disadvantage is that its predictions are discontinuous in time and space, and the monitoring results are easily affected by human factors. It is therefore used only as an auxiliary method for local rockburst prediction.

(2) The second type is the systematic monitoring method, which uses the various emission signals produced by coal and rock bursting and strain softening, and includes geophysical monitoring methods (geomagnetism, geotemperature, and geoelectricity). It relies on the fact that the stress failure of a rock mass structure is preceded by a series of radiation phenomena, which are treated as precursors of stress failure. In general, monitoring points are set up underground and linked into a network that continuously monitors the rockburst hazard area. The recorded microseismic and ground sound signals are converted into electrical pulse signals and transmitted to the surface monitoring station, where a computer automatically processes and analyzes the data to predict the hazard. Although this method achieves continuity in time and space, it is expensive, the equipment is difficult to manage and maintain, and it is difficult to analyze the data and evaluate the stress state of the coal and rock. A large amount of accumulated empirical data must be tested, analyzed, and collated before the prediction results can be evaluated accurately.

All of these methods for evaluating and predicting rockburst risk have limitations and cannot prevent and control rockburst hazards well. The root cause is that they do not capture the mechanism governing rockbursts and do not combine qualitative and quantitative measures to predict and prevent disasters. To integrate closely with actual projects, research on field test methods is of great significance.^6–8 How to use the measured information to predict rockburst (impact ground pressure) remains the key research direction of field test technology. In recent years, with the rapid development and continuous updating of intelligent algorithms, many engineering problems have been addressed with them.

In view of this, this paper studies rockburst, a dangerous dynamic disaster, in the underground driving roadways of coal mines. To address the shortcomings of current prediction methods in this field, it systematically analyzes the influencing indices (such as mining depth and geological structure) closely related to the formation and development of rockburst in coal mines where rockburst disasters have occurred, and applies intelligent algorithm theory to rockburst prediction.

Theoretical background

Statistical learning theory provides effective solutions to small-sample modeling problems. The support vector machine (SVM)^9–13 and the support vector regression machine^14–18 are essential components of statistical learning. For rockburst problems with a limited number of samples, the SVM model offers a promising approach to prediction. The method only needs highly representative sample data and places low demands on understanding the internal mechanism of the rockburst. It is also unnecessary to model explicitly the complex nonlinear relationship between each correlation index and rockburst pressure. Based on the risk minimization principle, the model predicts the underground rockburst state and its trend, which saves considerable labor and material resources and improves prediction efficiency. In this way, it becomes possible to predict the rockburst level at a specific time and place in the mining area. However, typical rockburst data are often scarce in practice, so with small samples it is necessary to account for the time-varying and nonlinear characteristics of rockburst data in order to make accurate and reliable predictions. The promotion and application of such prediction models have become a focus of current research.

This paper feeds the data obtained after rough set attribute reduction into the SVM model and constructs a model of the nonlinear relationships. This addresses the weak theoretical basis of traditional rockburst prediction methods, which rely only on test results and human experience to grade rockburst risk.

Statistical learning theory

Statistical learning theory^19–21 studies the patterns of machine learning and was once regarded as a largely empirical, experimental science. After in-depth research by a large number of researchers, however, it has been found to have significant advantages in addressing small-sample (non-asymptotic) problems. Starting from data, it seeks general patterns in limited samples and uses patterns that cannot be derived from first principles to analyze the objects under study, so as to predict future data or data that are not easily observed.

VC dimension

The VC dimension is an index of the learning performance of a function set. Statistical learning theory defines it in order to study how fast, and how reliably, the optimal empirical risk converges to the optimal expected risk as the number of samples tends to infinity. It describes the capacity of a set of functions.

The intuitive definition of the VC dimension is as follows: for an indicator function set, if there exist n samples that the functions in the set can separate in all $2^{n}$ possible ways, the function set is said to shatter these n samples, and the VC dimension is the maximum number of samples n that it can shatter. If the function set can shatter any number of samples, its VC dimension is infinite.

The VC dimension reflects the learning capacity of the function set. The smaller the VC dimension, the simpler the learning machine and the smaller its capacity. However, there is currently no general method for calculating the VC dimension, and the VC dimension of complex function sets is difficult to determine; usually it can be calculated exactly only for some particular function sets. How to obtain the VC dimension of an arbitrary function set therefore remains a focus of research in statistical learning theory.
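The shattering definition can be checked by brute force for a very small function set. The sketch below is only an illustration and is not part of the prediction model: it assumes the hypothetical function set of one-sided thresholds on the real line, f(x) = +1 if x > t and −1 otherwise, and the helper names are invented for this example. It enumerates every labeling the set can produce on a group of distinct points and compares the count with 2^n.

```python
def threshold_labelings(points):
    """Return all labelings of `points` produced by f(x) = +1 if x > t else -1.

    `points` must be distinct real numbers. Only thresholds below, between,
    and above the sorted points need to be tried, because any other threshold
    gives the same labeling as one of these.
    """
    pts = sorted(points)
    cuts = [pts[0] - 1.0]
    cuts += [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]
    cuts += [pts[-1] + 1.0]
    return {tuple(1 if x > t else -1 for x in points) for t in cuts}


def is_shattered(points):
    """True if the threshold set realizes all 2^n labelings of the n points."""
    return len(threshold_labelings(points)) == 2 ** len(points)


if __name__ == "__main__":
    print(is_shattered([0.0]))       # a single point
    print(is_shattered([0.0, 1.0]))  # two points
```

For this function set a single point can be shattered, but two points cannot, because the labeling (+1, −1) with the left point positive is never produced; the VC dimension of the set is therefore 1.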

The generalization bounds

The generalization bounds in statistical learning theory refer to the relationship between the empirical risk $R_{e m p} (a)$ and the actual risk $R (a)$ . Studying them helps analyze the performance of the learning machine and lays the foundation for developing new algorithms. For all functions in the indicator function set, with probability at least 1−η, the empirical risk $R_{e m p} (a)$ and the actual risk $R (a)$ satisfy formula (1).

R (a) \leq R_{e m p} (a) + \sqrt{\frac{h (\ln (2 n / h) + 1) - \ln (η / 4)}{n}} = R_{e m p} (a) + φ (n / h)

(1)

In the formula, n is the number of data samples, and h is the VC dimension.

From formula (1), the bound on the actual risk consists of the empirical risk $R_{e m p} (a)$ and the confidence range $φ (n / h)$ . For a given confidence level 1−η, the confidence range $φ (n / h)$ is determined by the VC dimension h of the learning machine and the number of training samples n. When n is fixed, the VC dimension h increases with the complexity of the learning machine, so n/h decreases and $φ (n / h)$ increases; approximating the actual risk by the empirical risk then incurs a relatively large error. Therefore, to obtain a smaller actual risk, it is necessary to reduce both the empirical risk and the VC dimension, narrowing the confidence range $φ (n / h)$ and improving generalization to future data.
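The effect of h and n on the confidence range can be made concrete by evaluating the second term of formula (1) directly. The following is a minimal sketch: the function name `confidence_term` and the chosen values of h and η are assumptions made for illustration, while n = 33 matches the training set size stated in the abstract.

```python
import math


def confidence_term(n: int, h: int, eta: float) -> float:
    """Second term of formula (1): sqrt((h*(ln(2n/h) + 1) - ln(eta/4)) / n)."""
    if not 0 < h <= n:
        raise ValueError("require 0 < h <= n")
    if not 0.0 < eta < 1.0:
        raise ValueError("require 0 < eta < 1")
    numerator = h * (math.log(2.0 * n / h) + 1.0) - math.log(eta / 4.0)
    return math.sqrt(numerator / n)


if __name__ == "__main__":
    n = 33  # training set size used in this paper
    for h in (2, 5, 10, 20):  # illustrative VC dimensions (assumed values)
        print(h, round(confidence_term(n, h, eta=0.05), 3))
```

For fixed n the term grows with h, and for fixed h it shrinks as n grows, which is the trade-off described above.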

Structural risk minimization

From equation (1), to minimize the bound on the actual risk, the empirical risk should be minimized and the learning machine (function set) with the smallest VC dimension should be used; in practice, these two requirements are contradictory. Minimizing the empirical risk calls for a wider set of functions, whereas a narrow confidence range calls for a function set with a small VC dimension. The search for the optimal solution must therefore control two conflicting factors in order to minimize the actual risk. To solve this problem, statistical learning theory introduces a new inductive principle, namely the principle of structural risk minimization (SRM).

The basic idea of SRM is as follows: if the number of training samples n is fixed, the actual risk $R (a)$ is controlled by the VC dimension h and the empirical risk $R_{e m p} (a)$ . To minimize the actual risk, the empirical risk should be small and, at the same time, h should be as small as possible so that the confidence range $φ (n / h)$ is as small as possible. The specific method is as follows:

Decompose the function set $S = \{f (x, a), a \in Λ\}$ into a sequence of function subsets with a nested structure:

S_{1} \subset S_{2} \subset \dots \subset S_{k} \subset \dots

(2)

Here, $S_{k}$ is a subset of the function set; the functions in the subset $S_{k}$ are bounded, and its VC dimension $h_{k}$ is finite. The subsets are ordered by the size of their VC dimension:

h_{1} \leq h_{2} \leq \dots \leq h_{k} \leq \dots

(3)

Within each subset the confidence range is the same, and as the complexity of the subsets increases, the empirical risk decreases while the confidence range grows. SRM therefore selects the subset in which the sum of the empirical risk and the confidence range is smallest, so that the bound on the actual risk reaches its minimum.

SVM model

SVM is a machine learning method based on the principle of SRM.^16–18 It effectively addresses problems related to limited samples, non-linearity, and high dimensionality. Essentially, training an SVM solves a convex quadratic optimization problem, so any extremum found is guaranteed to be a global extremum and local optima are avoided.

Fundamentals of support vector machines

An SVM is a learning machine based on statistical learning theory and the optimal classification surface for linearly separable data. It controls the actual risk by minimizing the empirical risk while also taking the minimization of the confidence range as a goal.

The SVM applies a nonlinear transformation to the input sample vectors, mapping them into a high-dimensional feature space; this mapping is carried out implicitly through an appropriate kernel function. The optimal classification hyperplane is then constructed in the feature space, so that a linearly inseparable problem is replaced by a linearly separable one. A well-chosen kernel function also avoids the “curse of dimensionality” that explicit computation in the feature space would incur. Intuitively, the SVM constructs the optimal classification hyperplane from the support vectors, the samples closest to the separating plane, so that the hyperplane lies midway between the two classes with the maximum margin.

Linearly separable optimal classification hyperplane

The SVM was initially developed for binary classification problems.^22–25 A given sample set is assumed to consist of positive and negative samples. The goal is to find the unique maximum-margin hyperplane that linearly separates the positive and negative samples.^26,27 Finding the optimal hyperplane is transformed into a convex quadratic optimization problem, which is solved using the Lagrange function.

The optimal classification hyperplane is constructed as follows. Given a training sample set $T = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{N}, y_{N})\}$ , in which the input feature vector is $x_{i} \in R^{n}$ and $y_{i} \in \{+ 1, - 1\}$ , n is the dimension of the input space and $y_{i}$ is the class label of each input vector: if the input sample $x_{i}$ belongs to the first category, $y_{i}$ takes the value +1; if $x_{i}$ belongs to the second category, $y_{i}$ takes the value −1. The optimal classification surface is the hyperplane that separates the two types of samples completely and correctly; it is defined as $ω^{T} x + b = 0, ω \in R^{n}, x \in R^{n}, b \in R$ , where $ω$ is an adjustable coefficient vector, $ω^{T}$ denotes its transpose, and all vectors are represented as column vectors.

Define the linear classification model $g (x) = ω^{T} x + b$ ; the quantity $g (x_{i}) / ∥ ω ∥$ gives the signed distance from the input sample $x_{i}$ to the optimal classification surface H. When $g (x)$ is normalized so that $| g (x_{i}) | \geq 1$ for all samples, maximizing the classification interval requires minimizing $∥ ω ∥$ , the Euclidean norm of the weight vector $ω$ of the hyperplane.

Constraints on g(x) are:

\begin{cases} ω^{T} x_{i} + b \geq + 1, & y_{i} = + 1 \\ ω^{T} x_{i} + b \leq - 1, & y_{i} = - 1 \end{cases}

(5)

Formula (5) can be combined into a single constraint, namely:

y_{i} (ω^{T} x_{i} + b) \geq 1, \quad i = 1, 2, \dots, N

(6)

The distance from H₁ and H₂ (the planes parallel to H that pass through the nearest samples of each class) to H is $1 / ∥ ω ∥$ , so the classification interval is $2 / ∥ ω ∥$ . Maximizing the classification interval is therefore equivalent to minimizing $∥ ω ∥^{2} / 2$ , subject to all training samples being separated correctly. Constructing the optimal classification surface is thus equivalent to solving the quadratic programming problem in the following formula:

\begin{aligned} & \min_{ω, b} ϕ (ω) = \frac{1}{2} ∥ ω ∥^{2} = \frac{1}{2} (ω^{T} \cdot ω) \\ & \text{s.t.} \quad y_{i} (ω^{T} x_{i} + b) \geq 1, \quad i = 1, 2, \dots, N \end{aligned}

(7)

The solution for parameters $ω$ and b can be transformed into a dual problem, which is simpler:

Construct the Lagrange function:

L (ω, b, α) = \frac{1}{2} ∥ ω ∥^{2} - \sum_{i = 1}^{N} α_{i} [y_{i} (ω^{T} \cdot x_{i} + b) - 1]

(8)

In the above formula, $α_{i} \geq 0$ are the Lagrange multipliers. At the saddle point, the partial derivatives of L with respect to $ω$ and b must be equal to 0, and the complementary slackness condition must hold for each $α_{i}$ , so from equation (8) we get:

\begin{cases} \frac{\partial L (ω, b, α)}{\partial ω} = ω - \sum_{i = 1}^{N} α_{i} y_{i} x_{i} = 0 \\ \frac{\partial L (ω, b, α)}{\partial b} = - \sum_{i = 1}^{N} α_{i} y_{i} = 0 \\ α_{i} [y_{i} (ω^{T} x_{i} + b) - 1] = 0, \quad i = 1, 2, \dots, N \end{cases}

(9)

Then:

\begin{cases} ω = \sum_{i = 1}^{N} α_{i} y_{i} x_{i} \\ \sum_{i = 1}^{N} α_{i} y_{i} = 0 \end{cases}

(10)

Substitute equation (10) into equation (8), and replace the original optimization problem with solving the dual problem:

\begin{aligned} & \max_{α} L (α) = \sum_{i = 1}^{N} α_{i} - \frac{1}{2} \sum_{i = 1}^{N} \sum_{j = 1}^{N} α_{i} α_{j} y_{i} y_{j} (x_{i} \cdot x_{j}) \\ & \text{s.t.} \quad \sum_{i = 1}^{N} α_{i} y_{i} = 0, \quad α_{i} \geq 0, \quad i = 1, 2, \dots, N \end{aligned}

(11)
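As a worked step, the substitution that leads from equation (8) to the dual objective can be written out term by term. The expansion below uses only the symbols of equations (8) and (10) and is intended as a reading aid rather than an additional result.

```latex
\begin{aligned}
L(ω, b, α) &= \frac{1}{2} ω^{T} ω - \sum_{i=1}^{N} α_{i} y_{i}\, ω^{T} x_{i} - b \sum_{i=1}^{N} α_{i} y_{i} + \sum_{i=1}^{N} α_{i} \\
&= \frac{1}{2} ω^{T} ω - ω^{T} ω - 0 + \sum_{i=1}^{N} α_{i}
\qquad \text{using } ω = \sum_{i=1}^{N} α_{i} y_{i} x_{i},\ \sum_{i=1}^{N} α_{i} y_{i} = 0 \\
&= \sum_{i=1}^{N} α_{i} - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} α_{i} α_{j} y_{i} y_{j} (x_{i} \cdot x_{j})
\end{aligned}
```

The samples enter the final expression only through the inner products $(x_{i} \cdot x_{j})$ , which is what later allows a kernel function to replace them.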

The optimal solution of equation (11) must satisfy the complementary slackness condition:

α_{i} [y_{i} (ω^{T} x_{i} + b) - 1] = 0, \quad i = 1, 2, \dots, N

(12)

Equation (11) is a quadratic function optimization problem under inequality constraints and has a unique solution. If the optimal solution is $α_{i}^{*}$ , then $ω^{*} = \sum_{i = 1}^{N} α_{i}^{*} y_{i} x_{i}$ ; the samples $x_{i}$ whose $α_{i}^{*}$ is nonzero are the support vectors.

The optimal classification function can then be obtained as

f (x) = \operatorname{sgn} [(ω^{*})^{T} \cdot x + b^{*}] = \operatorname{sgn} [\sum_{i = 1}^{N} α_{i}^{*} y_{i} (x_{i} \cdot x) + b^{*}]

(13)

Here, N is the number of training samples (only the support vectors, for which $α_{i}^{*} > 0$ , contribute to the sum), $α_{i}^{*}$ is the optimal value of the Lagrange multiplier, and $b^{*}$ is the classification threshold, which is obtained from $b^{*} = y_{j} - \sum_{i = 1}^{N} α_{i}^{*} y_{i} (x_{i} \cdot x_{j})$ for any support vector $x_{j}$ .

The category of a test sample is determined by the sign of $f (x)$ . The principles for judging the category are:

\begin{cases} f (x) > 0, & x \in \text{positive class} \\ f (x) = 0, & x \text{ lies on the classification hyperplane} \\ f (x) < 0, & x \in \text{negative class} \end{cases}

Nonlinear support vector machines

For linearly inseparable problems, a nonlinear mapping function $Φ : R^{n} \to H$ is usually used to map the linearly inseparable training samples from the original data space to a high-dimensional feature space H,^27–29 and the classification problem is then solved in that feature space.

Although mapping a complex low-dimensional problem into a high-dimensional feature space simplifies the classification problem, the cost of computing inner products grows with the dimension of the space, so solving for the optimal classification surface in the high-dimensional feature space becomes much more complicated. Introducing a kernel function makes the problem equivalent to a computation in the original space: the explicit inner product in the high-dimensional feature space is avoided, and the computational complexity is reduced accordingly.

Kernel function of SVM

The successful application of SVMs in statistical learning benefits from two characteristics: first, they use the principle of structural risk minimization to construct the maximum-margin classification hyperplane; second, they use inner-product kernel functions, so the explicit expression of the nonlinear mapping need not be known. Nonlinear problems are solved through the kernel function: a vector x in the N-dimensional input space does not enter the computation through its original features but through its image in the high-dimensional space, where the samples become linearly separable. In the linearly separable problem, the samples enter only through inner products, so it is unnecessary to derive the specific form of the nonlinear transformation. As long as the kernel function satisfies Mercer's theorem, it can replace the feature-space inner product while being computed in the original input space, resulting in a suitable nonlinear algorithm.

In the SVM, the kernel function determines the structure of the high-dimensional feature space. Different kernel functions yield learning machines with different nonlinear decision surfaces. That is to say, the choice of kernel function and its parameters directly affects the complexity and generalization performance of the classifier. The kernel functions in common use mainly include the following categories:

(1) Polynomial kernel function: one form is the p-order non-homogeneous polynomial kernel function $K (x, x_{i}) = [(x \cdot x_{i}) + 1]^{p}$ , and the other is the p-order homogeneous polynomial kernel function $K (x, x_{i}) = (x \cdot x_{i})^{p}$ ; both yield a polynomial pattern classifier.

The polynomial kernel function has been widely applied with SVMs in many fields. However, when the dimension is high, the amount of computation increases sharply, which slows down the operation and sometimes makes the problem difficult to solve at all; that is, the “curse of dimensionality” occurs. The polynomial kernel function is therefore mainly suited to low-dimensional problems.

(2) Radial basis function (RBF)

K (x, x_{i}) = \exp (- \frac{∥ x - x_{i} ∥^{2}}{2 σ^{2}})

(14)

A radial basis function is a scalar function that is symmetric along the radial direction. In the above formula, $x_{i}$ is the center of the kernel function, and $σ$ is the kernel width coefficient.

The RBF is a monotonic function of the Euclidean distance from the kernel function center $x_{i}$ . The kernel width coefficient $σ > 0$ controls the radial range over which the function has influence. If $σ$ is very small, the kernel is highly localized and the model tends to overfit the training samples, which makes the generalization level worse; if $σ$ is very large, the kernel varies slowly and the model behaves much like one built on the linear kernel function $K (x, x_{i}) = (x \cdot x_{i})$ .

(3) Sigmoid kernel function:

K (x, x_{i}) = \tanh [d (x \cdot x_{i}) + c]

(15)

In the above formula, d and c are the kernel function coefficients and $\tanh$ is the hyperbolic tangent function; this kernel function originates from the multi-layer perceptron neural network model.

The choice of kernel function and its parameters has a profound impact on the learning performance and generalization ability of the SVM model. The polynomial kernel function has more parameters than the RBF, which increases the difficulty of solving practical problems, and the Sigmoid kernel function tends to generalize poorly in practical applications, whereas the RBF kernel function has good locality and regression performance and often gives more accurate predictions.
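The three kernel families above can be written as short functions, which makes their parameters explicit. This is a minimal sketch assuming NumPy; the function names and default parameter values are chosen for illustration and are not taken from the paper.

```python
import numpy as np


def polynomial_kernel(x, xi, p=2, homogeneous=False):
    """Polynomial kernel: (x . xi)^p if homogeneous, else [(x . xi) + 1]^p."""
    dot = float(np.dot(x, xi))
    return dot ** p if homogeneous else (dot + 1.0) ** p


def rbf_kernel(x, xi, sigma=1.0):
    """RBF kernel of equation (14): exp(-||x - xi||^2 / (2 * sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def sigmoid_kernel(x, xi, d=1.0, c=0.0):
    """Sigmoid kernel of equation (15): tanh(d * (x . xi) + c)."""
    return float(np.tanh(d * np.dot(x, xi) + c))


if __name__ == "__main__":
    a, b = [1.0, 2.0], [2.0, 0.5]
    print(polynomial_kernel(a, b, p=2))
    print(rbf_kernel(a, b, sigma=1.0))
    print(sigmoid_kernel(a, b, d=0.5, c=0.0))
```

The RBF kernel has a single parameter σ, while the polynomial and Sigmoid kernels each need two choices (p and the homogeneous form, or d and c), which is one reason the RBF kernel is the easier one to tune.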

Least squares support vector machine

Although the SVM was first proposed for the classification problem, introducing a loss function extends it to the function regression problem, providing a new approach to modeling problems.^28–32

The least squares support vector machine (LSSVM)^33–37 replaces the inequality constraints of the SVM with equality constraints, so that all training samples are fitted by a linear function in the feature space to within an error term. Training then reduces to directly solving a system of linear equations, and it is no longer necessary to solve a quadratic programming problem to find the optimal hyperplane. As a result, the computational complexity is reduced and the learning speed is improved, which makes the method suitable for predicting the risk level of rockburst.

The solution process of the LSSVM algorithm^38–42 is as follows:

Given a training sample set $D = \{(x_{i}, y_{i}) ∣ i = 1, 2, \dots, M\}$ , where $x_{i} \in R^{n}$ is the input data and $y_{i} \in R$ is the corresponding output data. Similar to SVM, the input samples are mapped into a high-dimensional feature space by a nonlinear mapping $ϕ (x)$ and fitted there by a linear function $f (x) = ω^{T} ϕ (x) + b$ . LSSVM formulates training as a quadratic optimization problem under equality constraints:

\begin{aligned} & \min_{ω, b, e} J (ω, e) = \frac{1}{2} ω^{T} ω + \frac{1}{2} C \sum_{i = 1}^{M} e_{i}^{2} \\ & \text{s.t.} \quad y_{i} = ω^{T} ϕ (x_{i}) + b + e_{i}, \quad i = 1, 2, \dots, M \end{aligned}

(16)

Here, $ω$ is the weight vector, $e_{i} \in R$ is the error variable, b is the bias, and C is the penalty factor, which controls the trade-off between model complexity and fitting error. To solve the quadratic optimization problem of equation (16), the constrained optimization problem is replaced by an unconstrained one by defining the Lagrange function:

L (ω, b, e, α) = \frac{1}{2} ω^{T} ω + \frac{1}{2} C \sum_{i = 1}^{M} e_{i}^{2} - \sum_{i = 1}^{M} α_{i} [ω^{T} ϕ (x_{i}) + b + e_{i} - y_{i}]

(17)

According to the optimality conditions, the partial derivatives of the Lagrange function with respect to each variable are set equal to zero, which gives:

\begin{cases} \frac{\partial L}{\partial ω} = 0 \to ω = \sum_{i = 1}^{M} α_{i} ϕ (x_{i}) \\ \frac{\partial L}{\partial b} = 0 \to \sum_{i = 1}^{M} α_{i} = 0 \\ \frac{\partial L}{\partial e_{i}} = 0 \to α_{i} = C e_{i}, \quad i = 1, 2, \dots, M \\ \frac{\partial L}{\partial α_{i}} = 0 \to ω^{T} ϕ (x_{i}) + b + e_{i} - y_{i} = 0, \quad i = 1, 2, \dots, M \end{cases}

(18)

Eliminating $ω$ and $e_{i}$ from equation (18), and defining $K (x_{i}, x_{j}) = ϕ (x_{i})^{T} ϕ (x_{j})$ as a kernel function that satisfies Mercer's theorem, where the RBF kernel function $K (x_{i}, x_{j}) = \exp (- ∥ x_{i} - x_{j} ∥^{2} / (2 σ^{2}))$ is selected, the optimization problem is transformed into solving the linear equations:

\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & Ω + C^{- 1} I \end{bmatrix} \begin{bmatrix} b \\ α \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}

(19)

where $\mathbf{1}$ is the M-dimensional vector of ones, I is the M × M identity matrix, and

\mathbf{1} = [1, \dots, 1]^{T}, y = [y_{1}, \dots, y_{M}]^{T}, α = [α_{1}, \dots, α_{M}]^{T}, Ω = \{K_{i j} = K (x_{i}, x_{j})\}_{i, j = 1}^{M}

Finally, solving this linear system gives $α_{i}$ and b, and the model prediction output of the LSSVM is obtained as:

y (x) = \sum_{i = 1}^{M} α_{i} K (x, x_{i}) + b

(20)
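Equations (19) and (20) translate almost directly into a few lines of linear algebra. The sketch below assumes NumPy and the RBF kernel selected above; the function names `rbf_matrix`, `lssvm_fit`, and `lssvm_predict`, the toy sine data, and the values of C and σ are illustrative assumptions, not the implementation or settings used in this study.

```python
import numpy as np


def rbf_matrix(A, B, sigma):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, C, sigma):
    """Solve the linear system of equation (19) and return (b, alpha)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    M = X.shape[0]
    A = np.zeros((M + 1, M + 1))
    A[0, 1:] = 1.0                                         # first row: [0, 1^T]
    A[1:, 0] = 1.0                                         # first column: [0; 1]
    A[1:, 1:] = rbf_matrix(X, X, sigma) + np.eye(M) / C    # Omega + C^{-1} I
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]


def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Equation (20): y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_matrix(X_new, X_train, sigma) @ alpha + b


if __name__ == "__main__":
    # Toy regression data (assumed for illustration only).
    X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
    y = np.sin(2.0 * np.pi * X[:, 0])
    b, alpha = lssvm_fit(X, y, C=1000.0, sigma=0.2)
    print(lssvm_predict(X, b, alpha, [[0.25], [0.75]], sigma=0.2))
```

Because training is a single call to a linear solver, the cost of evaluating one (C, σ) pair is small, which is what makes a population-based search over these two parameters practical.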

Rough set theory

Rough set theory is a mathematical tool for describing incompleteness and uncertainty.^19–22 It can effectively analyze imprecise, incomplete, inconsistent, and otherwise uncertain information, and further analyze and reason about data to uncover hidden information and reveal inherent knowledge rules. Using rough set theory, the various factors influencing impact pressure are reduced to eliminate redundant information and obtain the main factors affecting impact pressure. A front-end preprocessor is thus constructed to provide simplified sample data for the SVM prediction model.

Rough set analysis requires discrete values in the decision table, so continuous impact pressure index data must be discretized. Effective discretization methods can reduce the time and space overhead of algorithms, improve the clustering ability of samples, and enhance learning accuracy. Discretizing a continuous attribute means dividing its value domain by breakpoints into multiple intervals and assigning each interval a distinct code, thereby obtaining discrete attribute values. This paper adopts the fuzzy C-means clustering (FCM) algorithm^18,29–31 for fuzzy clustering partition of the sample space, as it has strong adaptability and is easy to operate. This algorithm has been widely applied in many fields such as image segmentation and system identification.
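The discretization of one continuous index with FCM can be sketched as follows. This is a simplified illustration under stated assumptions: it clusters a single attribute, uses the common fuzzifier m = 2, and initializes the cluster centers evenly across the data range so that the result is deterministic (standard FCM usually starts from random memberships). The function name `fcm_1d`, the number of clusters, and the sample values are hypothetical.

```python
import numpy as np


def fcm_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy C-means on one continuous attribute.

    Returns (centers, memberships, codes): sorted cluster centers, the
    membership matrix of shape (n_clusters, n_samples), and the discrete
    code of each sample (index of its highest-membership cluster).
    """
    x = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(x.min(), x.max(), n_clusters)  # assumed initialization
    for _ in range(n_iter):
        # Membership update: u_kj is proportional to d_kj^(-2/(m-1)).
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)
        # Center update: mean of the samples weighted by u^m.
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    order = np.argsort(centers)  # code 0 = lowest interval
    centers, u = centers[order], u[order]
    return centers, u, np.argmax(u, axis=0)


if __name__ == "__main__":
    depth = [410.0, 425.0, 430.0, 610.0, 640.0, 655.0, 880.0, 900.0, 915.0]
    centers, u, codes = fcm_1d(depth, n_clusters=3)
    print(centers)
    print(codes)
```

The membership matrix keeps the graded information about how strongly each value belongs to each interval, while the codes are what would be written into the rough set decision table.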

Particle swarm optimization

In the LSSVM algorithm, the penalty factor C and the kernel function width coefficient σ need to be determined manually. Their selection has a significant impact on both the prediction accuracy and the computational efficiency of the algorithm. Typically, grid search combined with cross-validation is employed to obtain optimal parameters, but this method is heavily influenced by human factors and is complex and time-consuming to implement. Therefore, this paper uses the global search capability of the particle swarm optimization (PSO) algorithm^37–40 to optimize these two parameters of the LSSVM, with the aim of improving its performance in predicting impact pressure.

The PSO algorithm randomly initializes a swarm of particles and then searches for the best solution iteratively. In each iteration, every particle tracks its individual best position (pbest) and the global best position (gbest); that is, the particle updates its velocity and position based on the best solution it has found so far and the best solution found by the swarm, gradually converging towards the optimal solution.
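The pbest and gbest update described above can be sketched in a few lines. Note that this is the standard PSO with an inertia weight, not the improved variant with a mutation operator proposed in this paper; the function name `pso_minimize`, the coefficients w, c1, c2, the swarm size, and the toy objective are all assumptions made for illustration. In the setting of this paper, the objective would be a validation error of the LSSVM evaluated at a candidate (C, σ).

```python
import numpy as np


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using standard (unmodified) PSO.

    bounds: sequence of (low, high) pairs, one per dimension.
    Returns (gbest_position, gbest_value).
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, size=(n_particles, lo.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity: inertia + pull towards pbest + pull towards gbest.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)  # keep particles inside the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    return gbest, gbest_val


if __name__ == "__main__":
    # Toy objective standing in for a validation error over two parameters.
    def toy(p):
        return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

    best, value = pso_minimize(toy, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
    print(best, value)
```

A mutation step, as in the improved algorithm, would typically perturb some particles after the position update to help the swarm escape local optima; it is omitted here because its exact form is not given in this section.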

Methods

Theoretical foundations

Based on a comprehensive analysis of historical rockburst cases and relevant literature,^43–50 we selected seven geological and mining indicators closely related to the occurrence of rockburst as input features for the model: mining depth, geological structure type, coal seam thickness, drill cuttings volume, maximum principal stress, uniaxial compressive strength, and elastic deformation energy index. The model uses rough set theory as a front-end data processor combined with LSSVM to predict the hazard level of rockburst. The aim is to provide a practical basis for strengthening rockburst risk prediction and a reliable and effective decision support method. PSO is employed to automatically search for and determine the optimal parameters of the LSSVM model, which are crucial for its predictive performance.

Attribute reduction in rough set theory removes redundant condition attributes without changing the basic classification and decision-making ability of the data set, which reduces the input dimension of the subsequent LSSVM model and simplifies the model. The prediction model is constructed as follows: first, rough set theory is used as a front-end processor to remove redundancy from the impact indices (eliminating duplicate samples and samples that provide no new information, and reducing redundant attributes); then the least squares support vector machine is used as a back-end processor to perform regression fitting on the simplified sample information.

Safety prediction model

Overall process

The model of this study is mainly based on conventional indicators that are easily obtainable during geological exploration and mining processes (such as depth, geostress, etc.), and does not include electromagnetic radiation and other data from geophysical monitoring methods. This is both a simplification of this study and a direction that can be expanded in the future.

Together, these components constitute a rough set-particle swarm SVM prediction model. The learning process for the rockburst hazard level and the steps of the prediction algorithm are as follows:

Step 1: Eliminate redundancy in sample information: obtain and complete specific sample information, eliminate incomplete data, ensure that the decision attributes of the remaining sample condition attribute sets are not missing, and perform normalization processing.

We removed any samples containing missing values to ensure data integrity. Here, “sample information redundancy” means that if two samples have exactly the same indicators (such as mining depth, geostress value, etc.) and risk level, then one of them can be considered redundant data and removed to avoid misleading model training.

Step 2: Discretize the continuous rockburst hazard sample information in the data by constructing the membership degree using the FCM algorithm.

For continuous indicators, we use the FCM algorithm to automatically divide them into several fuzzy sets, with each data point having a membership degree to each fuzzy set, thus achieving soft discretization of the data, which can better preserve the uncertainty information of the data than hard partitioning (such as equidistant discretization).

Step 3: Reduce the discretized decision table using rough set theory, eliminating redundant samples and irrelevant condition attributes without affecting the dependency between the decision attributes and the condition attributes in the decision table. The final attribute decision table is obtained by reduction to the core.

The reduction algorithm based on the discernibility matrix is used to reduce the attributes of the discretized decision table. By constructing the discernibility matrix, the algorithm finds out all the attribute combinations that can distinguish different decision categories, and then solves the minimum attribute reduction set, so as to eliminate those redundant condition attributes that do not affect the decision classification ability.

Step 4: Extract training and testing samples from the reduced decision table, select the radial basis function (RBF) kernel, and use the improved particle swarm optimization (IPSO) algorithm to determine the optimal penalty factor and kernel width coefficient. The training samples are used to fit the LSSVM regression estimation function $y (x) = \sum_{i = 1}^{M} α_{i} K (x, x_{i}) + b$, which yields the model parameters $α_{i}$ and $b$ and establishes the training sample regression model.

Step 5: Finally, use the test samples to evaluate the output of the LSSVM rockburst risk level model in terms of convergence speed and prediction accuracy. If the prediction requirements are not met, repeat steps (1)–(5) until the final model is satisfactory.

The prediction error is verified with the mean absolute percentage error (MAPE). Let the sample time sequence be $x_{1}, x_{2}, \dots, x_{l}$ and the corresponding real values be $f (x_{1}), f (x_{2}), \dots, f (x_{l})$; MAPE is defined in equation (21).

MAPE = \frac{1}{l} \sum_{i = 1}^{l} | \frac{f (x_{i}) - y_{i}}{y_{i}} |

(21)
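To make Step 4 more concrete, the following is a minimal NumPy sketch of LSSVM regression with an RBF kernel. It assumes the standard LSSVM formulation, in which $α$ and $b$ are obtained from the linear system $[0, 1^{T}; 1, K + I/C] [b; α] = [0; y]$, and an RBF kernel of the form $\exp(-\|x - x'\|^{2} / (2σ^{2}))$; the paper does not restate either form here, so both are assumptions. The function names, the toy data, and the values of C and σ are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / (2 * sigma^2))."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, C, sigma):
    """Solve the LSSVM linear system and return (alpha, b)."""
    m = X.shape[0]
    system = np.zeros((m + 1, m + 1))
    system[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    system[1:, 0] = 1.0                      # bias column
    system[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(m) / C
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[1:], solution[0]

def lssvm_predict(X_train, alpha, b, sigma, X_new):
    """Evaluate y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Hypothetical toy data: 12 one-dimensional samples of a linear target.
X_train = np.linspace(0.0, 1.0, 12).reshape(-1, 1)
y_train = 1.0 + 2.0 * X_train[:, 0]
alpha, b = lssvm_fit(X_train, y_train, C=100.0, sigma=0.5)
y_fit = lssvm_predict(X_train, alpha, b, 0.5, X_train)
```

In this formulation the training residual of sample i equals $α_{i} / C$, so a larger penalty factor forces a closer fit to the training data, while σ controls how far the influence of each training sample extends.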

The characteristics of this model are as follows. Rough set theory requires no additional artificial assumptions; its reduction mechanism mines the classification rules and internal correlations of the rockburst hazard input samples to extract their implicit knowledge. It not only removes the abnormal data to which the support vector machine algorithm is sensitive, but also eliminates large-scale, high-dimensional redundant attributes, redundant information, and noise in the samples. Constructing membership degrees with FCM during rough set discretization enhances the distinction between individual samples. This front-end processor therefore improves the subsequent LSSVM regression accuracy and, through dimension reduction and redundancy elimination, reduces the learning burden and training time. The SVM serves as the back-end information processing system, and the combination compensates for the complexity of samples in practical applications and for the limited fault tolerance and interference resistance of the SVM on such samples. The introduction of the swarm algorithm improves both the accuracy on the test set and the efficiency of parameter selection.
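The FCM-based soft discretization of Step 2 can be illustrated with a minimal one-dimensional fuzzy c-means sketch in plain Python. The fuzzifier m = 2, the choice of three clusters, the deterministic initialization with evenly spaced centers, and the mining-depth values are all assumptions made for this example; the paper does not report these settings.

```python
def fcm_memberships(values, centers, m=2.0):
    """Membership of each value in each cluster for fixed centers."""
    power = 2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # The value coincides with a center: share membership among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            table.append([h / sum(hits) for h in hits])
        else:
            table.append([1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists])
    return table

def fcm_1d(values, n_clusters=3, m=2.0, max_iter=200, tol=1e-9):
    """Fuzzy c-means on scalar data; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Assumption: start with centers spread evenly over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(max_iter):
        u = fcm_memberships(values, centers, m)
        new_centers = [
            sum(row[k] ** m * x for row, x in zip(u, values)) / sum(row[k] ** m for row in u)
            for k in range(n_clusters)
        ]
        converged = max(abs(a - b) for a, b in zip(centers, new_centers)) < tol
        centers = new_centers
        if converged:
            break
    return centers, fcm_memberships(values, centers, m)

# Hypothetical mining depths in meters (not taken from the paper's data set).
depths = [420.0, 450.0, 470.0, 610.0, 640.0, 660.0, 880.0, 900.0, 930.0]
centers, memberships = fcm_1d(depths)
# A hard label can be derived as the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
```

Each row of `memberships` sums to one, so a sample near a cluster boundary keeps partial membership in both neighbours instead of being forced into a single interval, which is the advantage over equidistant discretization noted in Step 2.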

The core of rough set theory lies in simplifying the dataset and extracting core decision rules by identifying and removing redundant attributes that do not contribute or contribute very little to decision-making, without altering the data's classification ability. This is akin to slimming down the data by removing noise and irrelevant information. Through attribute reduction in rough sets, the number of input features has been reduced. This not only reduces the input dimension and computational complexity of the LSSVM model, but more importantly, eliminates redundant and noisy information, allowing the model to focus more on core decision attributes, thereby improving prediction accuracy and generalization ability.
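The discernibility-matrix reduction of Step 3 can be sketched as follows. This is a small illustration under stated assumptions, not the authors' implementation: the decision table is hypothetical, attributes are referred to by column index, and the minimal reduct is found by exhaustive search, which is only practical for a handful of attributes such as the seven indicators used here.

```python
from itertools import combinations

def discernibility_entries(rows, decisions):
    """For every pair of samples with different decisions, collect the set of
    condition attributes (column indices) on which the two samples differ."""
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(a for a in range(len(rows[i])) if rows[i][a] != rows[j][a])
            if diff:  # an empty set would mean the table is inconsistent for this pair
                entries.append(diff)
    return entries

def core_attributes(entries):
    """Core: attributes that are the only way to tell some pair apart."""
    return set().union(*(e for e in entries if len(e) == 1))

def minimal_reduct(rows, decisions):
    """Smallest attribute subset that intersects every discernibility entry."""
    entries = discernibility_entries(rows, decisions)
    n_attrs = len(rows[0])
    for size in range(n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            chosen = set(subset)
            if all(chosen & e for e in entries):
                return chosen
    return set(range(n_attrs))

# Hypothetical discretized decision table with three condition attributes.
# Column 2 duplicates column 0, so it carries no additional information.
rows = [(0, 0, 0), (0, 1, 0), (1, 1, 1), (1, 0, 1)]
decisions = [1, 2, 3, 2]
core = core_attributes(discernibility_entries(rows, decisions))
reduct = minimal_reduct(rows, decisions)
```

In this toy table the redundant third column is dropped, and the remaining two attributes still distinguish every pair of samples that belong to different risk levels.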

The specific flow is shown in Figure 1. Figure 1 shows the complete process of the proposed IPSO-LSSVM prediction model, clearly presenting the various stages and their interrelationships from data input, rough set preprocessing, IPSO parameter optimization to LSSVM model training and prediction.

Figure 1.

Safety prediction flow.

Optimizing LSSVM parameter by IPSO

In the later stages of iteration, the standard particle swarm algorithm tends to cluster particles near a local optimum, resulting in loss of population diversity and premature convergence. To enhance the algorithm's ability to escape from local optima, this paper introduces a mutation operator into the PSO framework; the resulting algorithm is referred to as IPSO. The specific operation is as follows: after the velocity and position of the particles are updated in each iteration, some particles are mutated with a certain probability. The mutation applies a Gaussian perturbation that randomly shifts the position of the particle. This strategy increases population diversity and helps particles escape local extremum points, while an adaptive mutation probability avoids excessive random disruption of the convergence direction.

The detailed steps are as follows:

(1) Initialization. Randomly initialize the velocity and position of each particle in the swarm. The historical best position of each particle, $P_{i}$, is set to its current position, and the historical best position of the entire swarm is set as $P_{g}$.

The performance of LSSVM is highly dependent on the penalty factor C (which controls the trade-off between model complexity and training error) and the kernel function parameter σ (which determines the radial range of the RBF kernel). To obtain the optimal (C, σ) combination, we use the IPSO algorithm to conduct a global search within a preset empirical range (e.g. C ∈ [0.1, 1000], σ ∈ [0.01, 100]).

Initialize the kernel parameter vector $(C, σ)$ according to the empirical value range, set the acceleration factors $c_{1}$ and $c_{2}$ , the inertia weight w, the convergence factor $χ$ , and the maximum number of iterations $T$ .

(2) Evaluate the population. The evaluation function is the fitness function, which is used to distinguish the quality of the individuals in the population. The fitness value of a particle is inversely related to the quality of its position: the smaller the fitness value, the better the position. The individual extreme value pbest of each particle stores the best fitness value and position found by that particle, and the global extreme value gbest stores the fitness value and position of the individual with the best fitness value among all the individual extreme values. The fitness function can be defined as:

f = \sqrt{\frac{\sum_{i = 1}^{m} {({\hat{y}}_{i} - y_{i})}^{2}}{m}}

(22)

Among them, ${\hat{y}}_{i}$ and $y_{i}$ represent the training output value and the actual output value of LSSVM, respectively.

(3) Comparison of fitness values. For each particle, compare its current fitness with the fitness of the best position it has visited, and update pbest if the current position is better. Then compare all current individual extreme values pbest with the global extreme value gbest, and update gbest.

(4) Particle state update.

(5) Check whether the end conditions are met. If the maximum number of iterations is reached or the fitness value is less than the given precision, the iteration is stopped and the optimal solution is output. Otherwise, return to step (2) to continue the iteration. Specifically, the iteration stops when either of the following two conditions is met: (i) the preset maximum number of iterations $T$ is reached (set to 2000 in this paper); (ii) over a number of successive iterations (e.g. 50), the change in the global optimal fitness value is less than the preset minimum threshold (set to 1×10⁻⁶ in this paper).

(6) The optimal position of the particle found, that is, the optimal parameter vector $(C, σ)$ , is assigned to LSSVM.

We set the population size to 20, the acceleration factors to $c_{1} = c_{2} = 2$, and the maximum number of iterations to $I_{max} = 2000$. The inertia weight w decreases gradually from 0.9 to 0.4 as the number of iterations increases, and the target accuracy is 0.0001.
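Steps (1)–(6) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation. Step (4) uses the standard PSO velocity and position update, the inertia weight decreases linearly, and the default swarm size, acceleration factors, and stopping thresholds follow the settings above. The velocity clamp (20% of each search range), the linearly decaying mutation probability, the mutation scale, and the surrogate objective `toy_cv_rmse` are hypothetical choices, since the paper does not specify them; in practice the objective would be the cross-validated RMSE of an LSSVM trained with the candidate (C, σ).

```python
import math
import random

def ipso_minimize(fitness, bounds, swarm_size=20, max_iter=2000,
                  c1=2.0, c2=2.0, w_start=0.9, w_end=0.4,
                  mutation_prob=0.1, mutation_scale=0.1,
                  stall_iters=50, stall_tol=1e-6, seed=0):
    """Minimize fitness(position) with PSO plus Gaussian mutation."""
    rng = random.Random(seed)
    dim = len(bounds)
    span = [hi - lo for lo, hi in bounds]
    v_max = [0.2 * s for s in span]  # assumption: clamp velocity to 20% of the range

    # Step (1): random initialization; each pbest starts at the initial position.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]

    for t in range(max_iter):
        progress = t / max(1, max_iter - 1)
        w = w_start - (w_start - w_end) * progress  # linearly decreasing inertia weight
        p_mut = mutation_prob * (1.0 - progress)     # assumption: decaying mutation rate
        for i in range(swarm_size):
            # Step (4): standard velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max[d], min(v_max[d], v))
                pos[i][d] += vel[i][d]
            # Mutation: Gaussian perturbation of the whole particle.
            if rng.random() < p_mut:
                for d in range(dim):
                    pos[i][d] += rng.gauss(0.0, mutation_scale * span[d])
            for d in range(dim):
                pos[i][d] = max(bounds[d][0], min(bounds[d][1], pos[i][d]))
            # Steps (2)-(3): evaluate, then update pbest and gbest.
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
            if f < gbest_f:
                gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
        # Step (5): stop if gbest improved by less than stall_tol over stall_iters iterations.
        if len(history) > stall_iters and history[-stall_iters - 1] - gbest_f < stall_tol:
            break
    return gbest, gbest_f, history  # Step (6): gbest holds the selected parameters

def toy_cv_rmse(params):
    """Hypothetical stand-in for the cross-validated RMSE of an LSSVM at (C, sigma)."""
    C, sigma = params
    return math.hypot(math.log10(C) - 2.0, math.log10(sigma))

best_params, best_rmse, curve = ipso_minimize(toy_cv_rmse, [(0.1, 1000.0), (0.01, 100.0)])
```

The returned `curve` records the global best fitness after each iteration and is non-increasing by construction, so it can be plotted directly as a convergence curve.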

Experimental results and analysis

We choose the classic BP neural network as the benchmark model because it is widely used in all kinds of prediction problems. The standard PSO-LSSVM is also selected for comparison to verify the effectiveness of the proposed improvement strategy (IPSO). Although other advanced methods exist, such as random forests and gradient boosting trees, this study focuses on performance improvement within the SVM framework, so the comparison is mainly with the closely related LSSVM-based model.

To analyze the prediction performance of the proposed IPSO-LSSVM rockburst prediction model, prediction and simulation of the rockburst hazard level are carried out on sample data, described as follows.

Data source: The data come from Meihuajing Coal Mine in Ningxia, China.

Data size: A total of 43 samples, each containing 4 features.

Feature description: The feature description is shown in Table 1.

Table 1.

Feature description.

Feature name	Units	Data type
Mining depth	meters	Continuous
Geological structure type	/	Discrete
Coal seam thickness	meters	Continuous
Drill cuttings volume	L	Continuous
Maximum principal stress	MPa	Continuous
Uniaxial compressive strength	MPa	Continuous
Elastic deformation energy index	/	Continuous

Output labels: Rockburst risk level (1 = weak risk, 2 = medium risk, 3 = strong risk).

Data partitioning: The samples are randomly partitioned into 33 training samples and 10 testing samples.

Given the small sample size, in order to more reliably evaluate model performance and prevent overfitting, five-fold cross-validation is used during the model training process. The training set is randomly divided into five parts, with four parts used for training and one part used for validation in turn. The average error of the five validations is used as the basis for selecting model parameters.
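The five-fold procedure described above can be sketched in plain Python. The helper names, the shuffling seed, and the `train_and_score` callback are hypothetical; the callback stands for whatever routine fits a model on the training indices and returns its error on the validation indices.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_error(n_samples, train_and_score, k=5, seed=0):
    """Average validation error over k folds.

    train_and_score(train_idx, val_idx) is a user-supplied callable that fits
    a model on train_idx and returns its error on val_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    errors = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        errors.append(train_and_score(train_idx, val_idx))
    return sum(errors) / k

# With 33 training samples, the five folds hold 7, 7, 7, 6, and 6 samples.
folds = k_fold_indices(33)
fold_sizes = [len(f) for f in folds]
```

During parameter search, the averaged error returned by `cross_validated_error` would serve as the fitness of a candidate (C, σ) pair, so that no single validation split dominates the choice.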

The IPSO-LSSVM model is compared with the PSO-LSSVM learning algorithm and the BP neural network prediction model in terms of training efficiency and prediction accuracy. Sample groups 1–33 are used to train the network model, and the rockburst hazard level is tested on sample groups 34–43 after 2000 iterations of training.

The hardware environment used in this performance comparison experiment is:

CPU: Intel Core i7-10700K (8 cores, 16 threads, 3.8 GHz base frequency, 5.1 GHz maximum turbo frequency).

GPU: NVIDIA GeForce RTX 3060 (12 GB GDDR6, CUDA acceleration supported).

Memory: 32 GB DDR4, 3200 MHz.

Operating system: Ubuntu 22.04 LTS.

Software environment: Python 3.9 + scikit-learn 1.3 + PyTorch 2.1 + NumPy 1.24.

The comparison of the prediction results of IPSO-LSSVM, BP neural network, and PSO-LSSVM prediction models is shown in Table 2.

Table 2.

Prediction result comparison of three prediction models.

No.	Actual value	BP neural network	PSO-LSSVM	IPSO-LSSVM
34	3	3.3378	3.2072	3.1067
35	2	2.2189	1.8571	2.1184
36	2	2.3429	2.2625	2.1318
37	1	1.2441	0.9526	1.0129
38	2
39	2
40	3
41	2
42	1
43	3

Mean absolute error (MAE), root mean square error (RMSE), and MAPE of three models are shown in Table 3.

Table 3.

Prediction performance comparison.

Metric	BP neural network	PSO-LSSVM	IPSO-LSSVM
MAE	0.3212	0.2079	0.1079
RMSE	0.3467	0.2357	0.1212
MAPE	10.71%	7.00%	3.60%

From the error results in Table 3, all three error indices of IPSO-LSSVM are the lowest, indicating that its prediction accuracy is the highest; PSO-LSSVM ranks second; the error indices of the BP neural network are the highest, and its prediction accuracy is the lowest of the three.
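For reference, the three error indices can be computed as in the following sketch, which applies them to the four test samples for which Table 2 lists predictions (No. 34–37). Because Table 3 covers all ten test samples, these partial values are not expected to reproduce it; the sketch only shows how the indices are calculated and that the ranking of the three models is the same on this subset. The function names are hypothetical, and MAPE follows equation (21) expressed as a percentage.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)

# Rows 34-37 of Table 2 (the only rows with listed predictions).
actual = [3, 2, 2, 1]
predictions = {
    "BP neural network": [3.3378, 2.2189, 2.3429, 1.2441],
    "PSO-LSSVM": [3.2072, 1.8571, 2.2625, 0.9526],
    "IPSO-LSSVM": [3.1067, 2.1184, 2.1318, 1.0129],
}
scores = {
    name: (mae(actual, pred), rmse(actual, pred), mape(actual, pred))
    for name, pred in predictions.items()
}
```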

The BP neural network is trained for 3000 epochs, whereas the PSO-LSSVM and IPSO-LSSVM models use particle swarm optimization only to tune their parameters; their internal LSSVM models do not involve iterative training. A comparative analysis of training time is shown in Table 4.

Table 4.

Comparative analysis of training time.

Model	Training time (seconds)	Convergence iterations	Efficiency improvement comparison
BP neural network	187	1927	Benchmark
PSO-LSSVM	124	1438	33.7% faster than BP
IPSO-LSSVM	68	598	63.6% faster than BP and 45.2% faster than PSO-LSSVM

The BP structure is simple, but it easily falls into a local optimum. PSO-LSSVM uses PSO to optimize the kernel parameters, avoiding manual parameter tuning. IPSO-LSSVM additionally introduces a mutation operator. Under the same hardware environment, BP neural network training takes about 187 s and PSO-LSSVM about 124 s, whereas IPSO-LSSVM takes about 68 s, because the mutation operator significantly improves the convergence speed and simplifies the kernel parameter optimization process. Its training time is therefore about 64% shorter than that of the BP neural network and about 45% shorter than that of the standard PSO-LSSVM.
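The percentages in Table 4 follow from the relative reduction in training time, (baseline time − model time) / baseline time. A short check using the reported times is given below; the function name is hypothetical.

```python
def percent_faster(baseline_seconds, model_seconds):
    """Relative reduction in training time, in percent."""
    return 100.0 * (baseline_seconds - model_seconds) / baseline_seconds

# Training times reported in Table 4 (seconds).
bp_time, pso_time, ipso_time = 187, 124, 68

pso_vs_bp = percent_faster(bp_time, pso_time)      # PSO-LSSVM relative to BP
ipso_vs_bp = percent_faster(bp_time, ipso_time)    # IPSO-LSSVM relative to BP
ipso_vs_pso = percent_faster(pso_time, ipso_time)  # IPSO-LSSVM relative to PSO-LSSVM
```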

From Figure 2, it can be seen that the curve decreases rapidly at the beginning, indicating that IPSO quickly locates a promising region in the early stage of the search. In the middle stage the rate of decline slows, as the algorithm gradually shifts from global exploration to local exploitation. The flat curve in the later stage indicates that the algorithm has converged without significant oscillation, demonstrating good stability.

Figure 2.

Convergence curve of fitness value (RMSE) with iteration number during IPSO optimization of LSSVM parameters.

The prediction results show that the IPSO-LSSVM prediction model has higher prediction accuracy and better generalization ability, and can accurately predict the hazard level of rockburst, achieving an ideal prediction effect. Although the model performs well on the current dataset, its generalization ability still needs further validation. There are significant differences in coal rock mechanical properties and geological structures among different mining areas, which may lead to a decrease in the predictive performance of the model. Therefore, when applying the model to a new mining area, it is necessary to fine-tune or retrain the model using historical data from that mining area.

Conclusions

This paper proposes an algorithm combining rough set theory and an improved particle swarm least squares SVM to predict rockburst risk levels from historical data. Accurate predictions from this model can provide a quantitative basis for mine safety management. For example, when the model predicts a high risk level, managers can take preventive measures in advance, such as pressure relief blasting and optimizing the mining layout, reducing the probability of rockburst accidents and helping to ensure the safety of personnel and equipment.

The main contributions of this article are as follows: (1) By combining rough set theory, the IPSO algorithm, and LSSVM, an integrated intelligent model is constructed for small-sample, nonlinear rockburst risk prediction problems. (2) An improved PSO algorithm, which introduces a mutation operator to prevent premature convergence, is proposed; it enhances the global search capability and convergence speed of parameter optimization. (3) Verification on actual data shows that the model outperforms the BP neural network and standard PSO-LSSVM models in prediction accuracy (MAPE of 3.60% versus 10.71% and 7.00%, respectively). The model proposed in this study provides theoretical support and a technical reference for the real-time prediction of rockburst risk. In the future, it can be embedded in a coal mine safety monitoring system to dynamically update the prediction results using data collected in real time (such as microseismic and ground stress data), so as to provide timely early warning information for on-site engineering personnel.

The prediction of rockburst is quite complex, and this paper attempts to apply an intelligent algorithm to the prediction of its risk level. However, this study has some limitations. First, because rockburst case data are difficult to obtain, this study used only 43 samples, all from a specific mining area. Such a small sample size limits the training of complex models and may result in an uneven distribution of samples across risk levels, which may reduce the model's ability to predict risk categories with fewer samples. In the future, more data from different mining areas should be collected to enhance the robustness of the model and to verify its generalization ability under a wider range of geological conditions. Second, although the improved PSO algorithm performs well, its parameters (such as the learning factors and population size) still need to be adjusted for specific problems. Third, it is important to check whether the sample sizes of the different risk levels in the dataset are balanced; with the small sample size used here, data balance is difficult to achieve. In future research, the Synthetic Minority Oversampling Technique (SMOTE) will be used to overcome this limitation.

Future research can be conducted from the following aspects: (1) Algorithm optimization: We will explore combining other heuristic algorithms such as simulated annealing and genetic algorithms with LSSVM, or developing hybrid optimization strategies to further improve model performance. (2) Data fusion and model generalization: We will combine multi-source heterogeneous data such as microseismic monitoring and electromagnetic radiation, and verify them in more mining areas with different geological conditions to improve the robustness and generalization ability of the model. The intelligent prediction framework adopted in this research, with its core idea of data preprocessing + parameter optimization + prediction model, is not only applicable to rockburst prediction, but can also be transferred to other engineering and scientific fields by adjusting input features. For example, in medical science,^33–39 this framework can be used for predicting drug efficacy or diagnosing diseases; in environmental science,^40–42 it can be used for water resource management or air quality warning; in engineering science,^43–50 it can be applied to geomechanical stability analysis or other geological hazard prediction. (3) Dynamic prediction: We will develop a dynamic prediction model capable of processing time series data to achieve real-time and online warning of rockburst risks.

Footnotes

ORCID iDs

Lianhui Li

Fred Brunson

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Natural Science Foundation of China (Grant No: 52165061) and the Research Initiation Project of Wenzhou Polytechnic (Grant No: RC202307).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request. Restrictions may apply to the availability of these data, which were used under license for this study.

References

1. Xiang. Rockburst prediction on the superimposed effect of excavation accumulation energy and blasting vibration energy in deep roadway. Shock Vib 2021; 2021. doi:10.1155/2021/6644590

2. Zhao, Chen, Zhang. Data-driven model for rockburst prediction. Math Probl Eng 2020; 2020. doi:10.1155/2020/5735496

3. Wang, et al. Rockburst prediction based on the KPCA-APSO-SVM model and its engineering application. Shock Vib 2021; 2021. doi:10.1155/2021/7968730

4. Ahmad, Katman, Al-Mansob, et al. Prediction of rockburst intensity grade in deep underground excavation using adaptive boosting classifier. Complexity 2022; 2022. doi:10.1155/2022/6156210

5. Tian, Chen, et al. Energy release analysis of a severe rockburst in a headrace tunnel crossing a tectonic stress zone. Shock Vib 2019; 2019. doi:10.1155/2019/8959845

6. X-B, Q-H. Risk assessment of the rockburst intensity in a hydraulic tunnel using an intuitionistic fuzzy sets-TOPSIS model. Adv Mater Sci Eng 2022; 2022. doi:10.1155/2022/4774978

7. Pan, Ren, Cai. Effect of joint density on rockburst proneness of the elastic-brittle-plastic rock mass. Shock Vib 2021; 2021. doi:10.1155/2021/5574325

8. Chen, Guo. Discussions on the complete strain energy characteristics of deep granite and assessment of rockburst tendency. Shock Vib 2020; 2020. doi:10.1155/2020/8825505

9. Nie, Bai, Nie, et al. Optimization of the economic and trade management legal model based on the support vector machine algorithm and logistic regression algorithm. Math Probl Eng 2022; 2022. doi:10.1155/2022/4364295

10. Olatunji, Owolabi. Barium titanate semiconductor band gap characterization through gravitationally optimized support vector regression and extreme learning machine computational methods. Math Probl Eng 2021; 2021. doi:10.1155/2021/9978384

11. Qin. Identification of accounting fraud based on support vector machine and logistic regression model. Complexity 2021; 2021. doi:10.1155/2021/5597060

12. Gyamerah. Two-stage hybrid machine learning model for high-frequency intraday bitcoin price prediction based on technical indicators, variational mode decomposition, and support vector regression. Complexity 2021; 2021.

13. Zhu. Action recognition, tracking and optimization analysis of training process based on the support vector regression model. J Healthc Eng 2022; 2022.

14. Mao, Sun, et al. Digital twin driven green performance evaluation methodology of intelligent manufacturing: hybrid model based on fuzzy rough-sets AHP, multistage weight synthesis, and PROMETHEE II. Complexity 2020; 2020: 1–24.

15. Hang, Sun, et al. A conjunctive multiple-criteria decision-making approach for cloud service supplier selection of manufacturing enterprise. Adv Mech Eng 2017; 9: 168781401668626.

16. Dun, Chen, et al. Short-term air quality prediction based on fractional grey linear regression and support vector machine. Math Probl Eng 2020; 2020. doi:10.1155/2020/8914501

17. Hang, Gao, et al. Using an integrated group decision method based on SVM, TFN-RS-AHP, and TOPSIS-CD for cloud service supplier selection. Math Probl Eng 2017; 2017: 1–14.

18. Enhancing the optimization of the selection of a product service system scheme: a digital twin-driven framework. Strojniski Vestnik-J Mech Eng 2020; 66: 534–543.

19. Wang, Zhang, et al. Support vector regression inverse system control for small wind turbine MPPT with parameters’ robustness improvement. J Control Sci Eng 2022; 2022. doi:10.1155/2022/2978380

20. Liu, Mei, et al. A new support vector regression model for equipment health diagnosis with small sample data missing and its application. Shock Vib 2021; 2021. doi:10.1155/2021/6675078

21. Lei, Mao. Digital twin in smart manufacturing. J Indust Inform Integr 2022; 26: 100289.

22. Liu, et al. Sustainability assessment of intelligent manufacturing supported by digital twin. IEEE Access 2020; 8: 174988–175008.

23. Dong. Support vector regression method for regional economic mid- and long-term predictions based on wireless network communication. Wirel Commun Mob Comput 2021; 2021. doi:10.1155/2021/1837681

24. Liu, et al. Crack prediction based on wavelet correlation analysis least squares support vector machine for stone cultural relics. Math Probl Eng 2021; 2021. doi:10.1155/2021/6638521

25. Xue, Zhang. The simplified expression of machine learning and multivariate statistical analysis based on the centering matrix. Math Probl Eng 2021; 2021. doi:10.1155/2021/5545061

26. Zhao. Machine learning theory in the strategic management of regional risk factors measurement. Mobile Inform Syst 2021; 2021. doi:10.1155/2021/2770830

27. Lin, Zhou. Research on the construction of deep learning based innovation and entrepreneurship education system in the internet plus era. Mobile Inform Syst 2022; 2022. doi:10.1155/2022/4552425

28. Chen. Semantic analysis of multimodal sports video based on the support vector machine and mobile edge computing. Wirel Commun Mob Comput 2022; 2022.

29. Sun, Yang, et al. Differentially private kernel support vector machines based on the exponential and Laplace hybrid mechanism. Secur Commun Netw 2021; 2021. doi:10.1155/2021/9506907

30. Mao. Big data supported PSS evaluation decision in service-oriented manufacturing. IEEE Access 2020; 8: 1. doi:10.1109/ACCESS.2020.2995063

31. Pan, Wei, Pan. Study on evaluation model of Chinese P2P online lending platform based on hybrid kernel support vector machine. Sci Program 2020; 2020. doi:10.1155/2020/4561834

32. Reddy, SSS, Kumar, Ghafoor, et al. CoySvM-(GeD): Coyote optimization-based support vector machine classifier for cancer classification using gene expression data. J Sens 2022; 2022. doi:10.1155/2022/6716937

33. Zhang, Chen, et al. Artificial intelligence-enabled innovations in cochlear implant technology: Advancing auditory prosthetics for hearing restoration. Bioeng Transl Med 2025; 10. doi:10.1002/btm2.10752

34. Ding, Wang, et al. The impact of magnesium on shivering incidence in cardiac surgery patients: A systematic review. Heliyon 2024; 10. doi:10.1016/j.heliyon.2024.e32127

35. Ghorbani, Asadi, et al. Investigating the predictive contribution of attitude towards life and belief system on self-resilience and psychological toughness of cancer patients about the mediating role of emotion regulation. In: 2023 IEEE 21st World Symposium on Applied Machine Intelligence and Informatics (SAMI), 2023, pp. 139–146.

36. Ghorbani, Minasyan, et al. Anti-diabetic therapies on dental implant success in diabetes mellitus: a comprehensive review. Front Pharmacol 2024; 15. doi:10.3389/fphar.2024.1506437

37. Aghabalyan, Ghorbani, Rituraj. Relationship of medicine and philosophy: mathematical modeling of moral structures-etometry. In: 2023 IEEE 17th International Symposium on Applied Computational Intelligence and Informatics (SACI), 23 May 2023.

38. Ghorbani, Chalabyan, et al. Application of analytical hierarchy process to the selection suitability of biological drug forms for psoriasis treatment. J Pharm Innov 2025; 20. doi:10.1007/s12247-025-09997-0

39. Rezaei, Azouji, et al. Application of the analytical hierarchy process in the management of private ambulance care systems in three selected European countries: a strategic decision-making framework. Front Public Health 2025; 13. doi:10.3389/fpubh.2025.1526586

40. Voskanyan, Ghorbani, Azodinia. Utilizing citizen-driven scientific endeavors for freshwater pollution surveillance: a case report of Lake Sevan, Armenia. In: 2024 IEEE 22nd World Symposium on Applied Machine Intelligence and Informatics (SAMI), 25 January 2024.

41. Voskanyan, Ghorbani, Azodinia. An investigation of the hydrochemical parameters for natural monuments. In: 2024 IEEE 11th International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC), 4 April 2024.

42. Wang, et al. Data-driven insights into climate change effects on groundwater levels using machine learning. Water Resour Manage 2025; 39: 3521–3536.

43. Beheshtian, Roodbari, et al. Comparative evaluation of machine learning and Bayesian deep learning methods for estimating ultimate recovery in shale well reservoirs. In: 2024 IEEE 11th International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC), 4 April 2024.

44. Hazbeh, Ghorbani, et al. Proposing a new model for estimation of oil rate passing through wellhead chokes in an Iranian heavy oil field. In: IEEE Joint 22nd International Symposium on Computational Intelligence and Informatics / 8th IEEE International Conference on Recent Achievements in Mechatronics, Automation, Computer Science and Robotics (CINTI-MACRo), 2022.

45. Ghorbani, Rajabi Behesht. Analysis of geomechanical processes of sand production for productive wells (study of Asmari reservoir in the Ahvaz oil field). Natural sciences, 2022. https://bulletin.am/wp-content/uploads/2022/04/2.pdf.

46. Beheshtian, Roodbari, et al. Advanced machine learning methods for accurate prediction of loss circulation in drilling well log. In: 2024 IEEE 11th International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC), 4 April 2024.

47. Beheshtian, Roodbari, et al. Machine learning prediction of gas hydrates phase equilibrium in porous medium. In: 18th IEEE International Symposium on Applied Computational Intelligence and Informatics (SACI), 2024.

48. Tehrani, Ghorbani, et al. Laboratory study of polymer injection into heavy oil unconventional reservoirs to enhance oil recovery and determination of optimal injection concentration. AIMS Geosci 2022; 8: 579–592.

49. Xie, et al. Predicting hydrocarbon reservoir quality in deepwater sedimentary systems using sequential deep learning techniques. Geomech Geophys Geo-energy Geo-Resour 2025; 11. doi:10.1007/s40948-025-01030-5

50. Deng, Wang, et al. Deep learning-driven analysis of petrophysical dynamics in pay zone quality and reservoir characterization. Nat Resour Res 2025; 34: 2047–2066.