Implementing machine learning algorithms to optimize sprint performance and biomechanical analysis of track and field athletes

Abstract

Sprint performance is a crucial component of athletic performance, especially in sports like track and field, football, and rugby, which require quick bursts of peak effort over short durations. Understanding the biomechanics of sprinting is essential for enhancing athletic performance, preventing injuries, and creating effective training plans. Traditional research on sprint evaluation often focuses on discrete measures while neglecting the intricate interactions between variables that evolve throughout the sprint. This study addresses these challenges by applying a machine learning (ML) algorithm, specifically the Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM), to predict and optimize the sprint performance of track and field athletes. The system analyzes essential biomechanical characteristics such as muscle activation patterns, joint angles, ground reaction forces, and stride length. Data were collected using wearable sensors and motion capture systems during standardized sprint trials, during which various biomechanical parameters were recorded. Standard preprocessing steps including noise removal and outlier detection were applied to the data. Power Spectral Density (PSD) was employed to extract features from the preprocessed data. The results demonstrate that the proposed method outperforms traditional algorithms in predicting sprinting efficiency and identifies complex, phase-specific changes in movement patterns. The model effectively analyzes the intricate biomechanics of sprinters’ movements to differentiate between various skill levels. Using Python software, the model achieved impressive performance metrics, including accuracy (94.5%), precision (92.7%), recall (93.6%), F1-score (92.1%), R² (0.92), and AUC (0.91), highlighting its robust predictive ability. 
This study illustrates how machine learning models can advance research in sprinting mechanics and provide insightful information to athletes and coaches seeking to improve performance.

Keywords

machine learning (ML); sprint performance; biomechanical; track and field athletes; Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM)

Introduction

Track and field sprinting performance is influenced by complex, interwoven biomechanical, physiological, and environmental determinants. Optimizing sprinting performance is a highly specific focus area within sports science, where even minor improvements can make a significant difference between an athlete who wins and one who loses.¹ Biomechanical analysis enables an understanding of the physical and mechanical aspects of athletes’ movements, allowing for tailored training interventions that optimize performance.² However, the complexity of human movement and the large volume of data involved make analyzing and optimizing sprint biomechanics challenging.³ Traditional performance analysis methods, such as video-based observations, manual biomechanical assessments, and basic statistical modeling, often fail to process large datasets or account for the multifactorial nature of sprinting. Such research can be time-consuming, subjective, and dependent on the analyst’s expertise, leading to inconsistent or poor training recommendations.⁴

Machine learning (ML) algorithms offer a novel approach to address these limitations by effectively handling large and complex datasets.⁵ Techniques such as supervised learning, unsupervised clustering, and predictive modeling enable the uncovering of relationships within biomechanical data that may not be identified using traditional approaches.⁶ These relationships can reveal the impact of various factors, such as stride lengths, ground reaction forces, and joint kinematics, on sprint performance with greater accuracy.⁷ The primary objective of this study is to develop ML algorithms aimed at optimizing sprint performance and conducting biomechanical analyses of track and field athletes.⁸ Advanced computational tools will enhance the performance analysis and provide actionable insights for athletes and coaches. This research strives to overcome the constraints of traditional approaches by providing a scalable and data-driven solution that enhances athletes’ sprinting performance.⁹ The application of ML in sports biomechanics represents a significant milestone, revolutionizing training practices and competitive strategies in track and field sports.¹⁰ However, there is a lack of specific examples of implemented ML models, quantified improvements, and detailed challenges related to integrating ML into biomechanical analysis for optimizing sprints, which limits its practical applicability.

This work aims to design and implement a novel machine learning (ML) algorithm, the Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM), which uniquely integrates Polar Bear optimization with multi-source kernel SVM. This innovative framework enhances the analysis of complex biomechanical parameters by optimizing kernel parameters for superior classification accuracy and predictive performance. The main contributions of this work are:

(1) This research introduces a new algorithm tailored to predict and optimize sprint performance, surpassing conventional methods and providing deeper insights into the complex dynamics of sprinting.

(2) Using wearable sensors and motion capture systems, it examines muscle activation, joint angles, ground reaction forces, and stride length to gain insights into sports performance and reduce risk factors for injuries.

(3) The research yields concrete suggestions for training programs that improve sprinting mechanics and differentiate between skill levels, leading to direct improvements in the sprint performance of track and field athletes.

Related work

Machine learning methods were used to simulate the velocity-time curve in 100-meter sprinting to overcome the limitations of traditional speed models.¹¹ Neural Network (NN) and Random Forest (RF) models forecast the acceleration phase. The study shed light on the dynamics of sprinting performance by identifying a significant inverse relationship between maximal velocity and final time. ML methods were also used to examine how morphometric characteristics affected the 20-meter sprint performance of first-grade primary school students.¹² Three experiments were carried out to find the best ML method for outcome prediction. With a minimum mean squared error (MSE) of 0.012, the correlation-based features showed strong linear relationships with the target variable and proved to be reliable predictors. The average physical fitness (PF) of 1489 male adolescent athletes who competed in five different sports was examined using six ML algorithms.¹³ With an F1-score of 0.87, an area under the curve (AUC) of 0.86, and an average accuracy of 90.14, the XGBoost model performed best. The expectation was that this would facilitate the creation of tailored training plans for athletes engaged in various sports. The transformation of sports science by artificial intelligence (AI), which enhances performance, education, and health management, was examined.¹⁴ AI was reported to improve menstrual cycle management, off-training behavior, injury prevention, talent identification, load efficiency, and sleep quality. The manuscript highlighted the necessity of interdisciplinary cooperation, context-specific AI technology, and AI-focused education. In-depth medical evaluations of 3661 athletes were used to examine their health parameters.¹⁵ RF and multinomial logistic regression ML techniques were used to identify the most important markers of fluid and blood biochemistry for phenotypic analysis and for characterizing the recovery process in the post-competitive period. The classification of catabolism and anabolism was found to be highly influenced by ornithine cycle characteristics and muscle metabolism factors.

Developments in wearable sensing and ML have made it possible to assess sports movement in the wild.¹⁶ This enables long-term monitoring for injury prevention as well as real-time feedback for technique analysis and performance improvement. A summary of methods for examining sports movement with wearable sensors and ML was given, along with information on how to set up measurement protocols, their advantages and disadvantages, and suggestions for building models from movement data. A deep learning and computer vision approach was proposed to record mass center velocities and step characteristics in sprinting and skeleton push starts.¹⁷ High levels of agreement were found when the approach was compared with customized markerless and marker-based methods. While step characteristics were similar, the evaluations revealed lower mass center velocities during pushing. Athlete movement data are essential for maximizing performance, reducing fatigue, and lowering the risk of injury.¹⁸ Sport-specific actions could be classified using computer vision and inertial sensor technology, and performance could be improved by athlete-dependent classification techniques. The training and assessment of supervised ML models that automatically identify running surfaces from inertial measurement unit sensor data was also described. The use of Internet of Things (IoT) and cloud computing to investigate common sports injuries sustained by track and field athletes was examined.¹⁹ Human sports injuries were analyzed using cluster analysis, which applies mathematical techniques to determine the association between samples based on features and similarity indicators. The findings show that the computational and IoT-based approach outperformed the multi-level model approach by 10% and 18%, respectively, in terms of time and accuracy. An automated technique that employs deep learning to acquire image coordinates and binocular positioning was proposed for capturing footwork data from badminton players.²⁰ The final positioning precision was 74.7%. The technique provided insights into how players cooperated to intercept projectiles by exposing inter-individual footwork adjustments during competitive performance.

ML techniques were also used to enhance sports injury monitoring. RF and Discrete Wavelet Transform (DWT) methods were proposed to generate a training set of sports injuries.²¹ A wearable gyroscope device was used to create a system for tracking the likelihood of injuries in athletes. The system determined an athlete’s predisposition to injury and gave advice on prevention, while offering low overall cost and high testing accuracy. A framework integrating deep reinforcement learning (DRL) and physics simulation to find diverse motion strategies for athletic skills such as high jumps was examined.²² The framework explored initial character states using a Bayesian diversity search method and constrained actions to natural poses using a Pose Variational Autoencoder. The approach enabled the development of new strategies without reward programming or motion examples. The relationship between the strength and speed characteristics of young, competitive track and field athletes was investigated.²³ It was found that among young, talented track and field athletes who live and train above 50° north latitude, the prevalence of vitamin D deficiency was lower than in other groups. Sports, medicine, and entertainment all depend on Motion Capture (MoCap), but its full potential has been hampered by high cost and lack of expertise,²⁴ particularly for novice and intermediate sports coaches. The difficulties encountered in creating reasonably priced MoCap systems for these levels were examined in order to launch a system that is simple to use and requires few resources. A commercially available Inertial Measurement Unit (IMU) was used to evaluate countermovement jump (CMJ) performance during the contraction phase.²⁵ Eight athletes performed CMJs while wearing the IMU on their fifth lumbar vertebra. For negative impulse phase time, contraction time, jump time, flight time, and minimum force, the results demonstrated excellent accuracy and correlation, with no statistical discrepancies between the IMU and the force plate (FP).

Problem statement

Research on sprint performance and biomechanics has capitalized on ML and wearable technologies to facilitate performance improvements, predict outcomes, and minimize injury risk. However, existing studies have mostly focused on a specific parameter, such as the acceleration phase, morphometric characteristics, or an isolated biomechanical feature, and therefore generally fail to capture the complex, dynamic interactions between variables throughout the sprint cycle. Traditional ML models also have serious deficiencies in phase-specific movement analysis, applicability across skill levels, and accuracy of outcome prediction. Even studies that use advanced ML techniques often fail to capture the subtle biomechanical variations that are important for optimizing sprinting efficiency. The proposed PB-MKSVM algorithm incorporates wearable sensor data, motion capture systems, and advanced preprocessing techniques to analyze key biomechanical parameters holistically. This approach recognizes complex, phase-specific patterns, thereby outperforming traditional algorithms and offering actionable insights into optimizing sprint mechanics and improving athlete performance.

Proposed system

The proposed system uses the Kinematics Motion dataset to perform biomechanical analysis of track and field athletes. Preprocessing applies Z-score normalization to standardize features such as muscle activation, joint angles, and stride length so that the data become comparable. Feature extraction uses PSD to extract key patterns from the biomechanical signals. The core of the system is the Polar Bear-tuned Multi-Source Kernel Support Vector Machine, in which the kernel parameters are optimized to improve the prediction accuracy for sprint performance. This hybrid approach achieves greater accuracy and convergence speed in classifying and predicting sprint biomechanics for athlete performance optimization. The sprint performance flow is depicted in Figure 1.

Figure 1.

Overall flow of the sprint performance prediction.

Dataset

The Kinematics Motion dataset comprises a sample of 150 track and field athletes, aged 18–30 years, across various sprinting levels (beginner, intermediate, and advanced). Data were collected using a combination of wearable sensors, which captured muscle activations and ground reaction forces, and a motion capture system that recorded joint angles and other biomechanical parameters during standardized sprint trials. Each athlete performed three 60-meter sprints, and the average values were used for analysis to ensure consistency. This type of data is well suited to an ML algorithm that looks for improvements in the sprinting performance of track and field athletes based on biomechanical variables such as stride length, muscle activation, and ground reaction forces.

Source: https://www.kaggle.com/datasets/yasserh/kinematics-motion-data.

Preprocessing using Z-score normalization

Z-score normalization standardizes the biomechanical data so that features such as muscle activation, joint angles, and stride length have comparable scales. This reduces bias during training, improves model performance, and enhances convergence in the PB-MKSVM-based sprint performance prediction. The technique is based on the mean and standard deviation of the data, and it is especially useful when the true minimum and maximum values of the data are unknown. The formula is given in equation (1).

W_{new} = \frac{W - \mu}{\sigma} = \frac{W - \mathrm{mean}(W)}{\mathrm{stdDev}(W)}

(1)

The normalized value is denoted by $W_{new}$, the original value by $W$, the standard deviation by $\sigma$, and the population mean by $\mu$.
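As a minimal sketch of equation (1), the normalization can be applied column by column to a feature matrix. The feature values below are hypothetical and only illustrate the shape of the data; the guard for constant columns is an added assumption, not part of the original method.

```python
import numpy as np

def z_score_normalize(W):
    """Standardize each column of W as in equation (1): (W - mean) / std."""
    W = np.asarray(W, dtype=float)
    mu = W.mean(axis=0)
    sigma = W.std(axis=0)  # population standard deviation (ddof=0)
    # Assumption: a constant column (sigma == 0) is mapped to zeros
    sigma = np.where(sigma == 0, 1.0, sigma)
    return (W - mu) / sigma

# Hypothetical data: rows are sprint trials, columns are
# joint angle (deg), stride length (m), muscle activation (a.u.)
features = np.array([[35.0, 2.10, 0.62],
                     [42.0, 2.25, 0.71],
                     [38.0, 2.05, 0.55],
                     [45.0, 2.40, 0.80]])
normalized = z_score_normalize(features)
```

After normalization every column has zero mean and unit standard deviation, so no single feature dominates the kernel computations because of its units.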

Feature extraction using Power Spectral Density (PSD)

Frequency-domain features are extracted with PSD to capture key patterns in biomechanical signals such as muscle activations and ground reaction forces, providing the foundation on which PB-MKSVM analyzes and predicts sprint performance. The output PSD of a linear system, $S_{zz}(x)$, is defined in equation (2).

S_{zz}(x) = {|G(x)|}^{2} S_{ww}(x)

(2)

where $S_{ww}(x)$ is the input PSD and $G(x)$ is the frequency response of the system. In equation (2), the mean power is equal to the area under $S_{zz}(x) / 2\pi$. The analysis is concerned with the relative variations between input and output energy in particular bandwidths. This bandwidth-localized power is conventionally represented by Root Mean Square ($RMS$) values, which are defined in equation (3).

RMS = \left\{ \frac{1}{2\pi} \int_{x_{1}}^{x_{2}} {|G(x)|}^{2} \, dx \right\}^{\frac{1}{2}}

(3)

where the band limits are denoted by $x_{1}$ and $x_{2}$. The analysis uses the bandwidth between the frequencies $x_{1}$ and $x_{2}$ that lie 12 dB below the $j^{th}$ resonance frequency of the intact framework. The resulting index can be used to detect changes in the framework. The RMS values for the structures being compared are denoted ${(RMS)}_{V}$ and ${(RMS)}_{C}$, and the index $DLM$ is defined in equation (4).

DLM = \sum_{j=1}^{M} \left| \frac{{({RMS}_{j})}_{V} - {({RMS}_{j})}_{C}}{{({RMS}_{j})}_{V}} \right|

(4)

where $| {({RMS}_{j})}_{V} - {({RMS}_{j})}_{C} |$ is the absolute difference between the $RMS$ values of the $j^{th}$ resonance mode across the structures. The $j^{th}$ term of the $DLM$ index represents the impact of the $j^{th}$ mode shape, and summing all of the contributions yields a sensitive measure. The index can be used to evaluate structural change, but it cannot differentiate between changes in stiffness and changes in mass. A position function $DL(w)$ can be introduced to estimate locations along the framework. It is defined in equation (5).

DL(w) = \sum_{j=1}^{M} \left| \frac{{({RMS}_{j})}_{V} - {({RMS}_{j})}_{C}}{{({RMS}_{j})}_{V}} \right| \times | \varphi_{j}(w) |

(5)

Equation (5) also contains the function $| \varphi_{j}(w) |$, which represents the absolute magnitude of the structure’s $j^{th}$ mode shape. The $DL(w)$ function can therefore be regarded as the sum of all the mode shapes, each weighted by the corresponding percentage change in $RMS$.
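The frequency-domain features described in this section can be sketched as follows. This is an illustrative example under stated assumptions: a simple FFT periodogram stands in for the PSD estimator (the estimator actually used is not specified), the band RMS is taken as the square root of the band-integrated PSD, and the test signal, sampling rate, and function names are hypothetical.

```python
import numpy as np

def periodogram_psd(signal, fs):
    """One-sided periodogram PSD estimate of a real signal sampled at fs Hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
    # Fold negative frequencies in: double every bin except DC (and Nyquist)
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    return freqs, psd

def band_energy_and_rms(freqs, psd, f_lo, f_hi):
    """Integrate the PSD over [f_lo, f_hi]; RMS is the square root of that power."""
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    df = freqs[1] - freqs[0]
    energy = float(np.sum(psd[mask]) * df)
    return energy, float(np.sqrt(energy))

def dlm_index(rms_v, rms_c):
    """Equation (4): sum of relative RMS changes over the M modes."""
    rms_v = np.asarray(rms_v, dtype=float)
    rms_c = np.asarray(rms_c, dtype=float)
    return float(np.sum(np.abs((rms_v - rms_c) / rms_v)))

# Hypothetical signal: a 2.5 Hz component plus a weaker 20 Hz component
fs = 100.0
t = np.arange(1000) / fs
signal = np.sin(2 * np.pi * 2.5 * t) + 0.2 * np.sin(2 * np.pi * 20.0 * t)

freqs, psd = periodogram_psd(signal, fs)
peak_hz = float(freqs[np.argmax(psd)])
low_energy, low_rms = band_energy_and_rms(freqs, psd, 0.0, 5.0)
high_energy, high_rms = band_energy_and_rms(freqs, psd, 15.0, 30.0)
```

A Welch-style averaged estimator would usually give smoother spectra for noisy sensor data; the plain periodogram is used here only to keep the sketch dependency-free.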

PB-MKSVM for enhanced sprint performance and biomechanical analysis

The hybrid Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM) optimizes sprint performance prediction and biomechanical analysis of track and field athletes by searching a hyperparameter space that includes the parameters C and γ, within the ranges C ∈ (0.1, 100) and γ ∈ (0.001, 10). This allows the model to explore the parameter landscape efficiently while converging on settings suited to capturing the dynamics of biomechanical data. Convergence was monitored through changes in accuracy, and the search stopped when the improvement stayed below a minimal threshold over a predetermined number of iterations. Inspired by natural hunting strategies, PB optimization enhances the MKSVM by tuning its parameters, which increases convergence speed and accuracy in sprint biomechanics prediction. The framework uses PB to fine-tune the kernel parameters and to select the most appropriate features of the biomechanical data, such as muscle activation, joint angles, and stride length. The MKSVM then uses the optimized features and kernels, including linear, polynomial, RBF, and sigmoid kernels, for accurate classification and prediction. PB optimization combines global search using ice floes, local search modeled on polar bear hunting behavior, and dynamic population control to refine the model's ability to classify and predict sprint performance across various scenarios. This hybrid approach maximizes predictive accuracy and robustness in biomechanical data analysis for sprint performance optimization. The hybrid PB-MKSVM process is described in Algorithm 1.

MKSVM framework for biomechanical data integration in sprint performance optimization

The Multi-Source Kernel Support Vector Machine (MKSVM) is designed to handle the complex, high-dimensional biomechanical data involved in sprint performance analysis. With a kernel-based approach, the effective integration of diverse features helps ensure precise predictions and robust differentiation between skill levels to optimize athletic performance. Two distinct methods yield distinct sets of attributes, which are then fed into the support vector machine (SVM) for training. A supervised SVM model uses classification algorithms to tackle two-class classification problems. Once an SVM model has been fitted to the labeled training data for each class, it is ready to classify new samples. The primary objective of the SVM algorithm is to find a hyperplane in an $M$-dimensional space that distinctly separates the data. A collection of mathematical functions known as kernels is used in SVM computations. These kernels transform complex biomechanical input data into forms suitable for accurate classification, which addresses the complex dynamics involved in sprinting biomechanics. Different SVM algorithms use different types of kernel functions. The linear and nonlinear SVM kernels examined here for sprint data analysis are the linear, polynomial, radial basis function (RBF), and sigmoid kernels.

The linear kernel SVM is used for linearly separable data, that is, data that can be split into two categories by a single straight line. To improve its generalization, it tries to maximize the margin between the two categories. According to equation (6), the linear kernel is the dot product of two vectors $y_{1}$ and $y_{2}$.

Z(y_{1}, y_{2}) = y_{1} \cdot y_{2}

(6)

The polynomial kernel SVM is a generalization of the linear kernel. It is described by equation (7).

Z(y_{1}, y_{2}) = (y_{1} \cdot y_{2} + 1)^{f}

(7)

where $f$ is the polynomial’s degree and $y_{1}$ and $y_{2}$ are the input vectors. For many nonlinear applications, the Gaussian RBF kernel is a comparable kernel that provides strong linear separation in higher dimensions, as represented in equation (8).

Z(y_{1}, y_{2}) = \exp(- \alpha {\| y_{1} - y_{2} \|}^{2})

(8)

In equation (8), $\alpha > 0$ and $\alpha = \frac{1}{2 \sigma^{2}}$. The value of the kernel depends only on the distance from the origin or from a particular point. The adjustable parameter $\sigma$ significantly affects the kernel’s performance and needs to be tuned for the particular problem; careful tuning ensures optimal kernel performance for modeling sprint biomechanics. The sigmoid kernel SVM uses the $\tanh$ function and can serve as a stand-in for a neural network, as shown in equation (9).

Z(y_{1}, y_{2}) = \tanh(\alpha \, {y_{1}}^{T} y_{2} + c)

(9)

Here, $c$ is the intercept constant, while $\alpha$, typically set to $1/M$ (where $M$ is the dimension of the data), denotes the slope. This formulation serves as the foundation for advanced analyses of sprint dynamics.
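The four kernels in equations (6) to (9) can be written directly as functions of two feature vectors. The sketch below assumes the polynomial kernel raises its argument to the power $f$ and that the RBF width is $\alpha = 1 / (2 \sigma^{2})$. The `combined_kernel` function is purely hypothetical: the text does not specify how the kernels are merged, so an equal-weight sum is shown only as one possible reading of a multi-kernel model.

```python
import numpy as np

def linear_kernel(y1, y2):
    """Equation (6): dot product of the two vectors."""
    return float(np.dot(y1, y2))

def polynomial_kernel(y1, y2, f=3):
    """Equation (7): (y1 . y2 + 1) raised to the degree f."""
    return float((np.dot(y1, y2) + 1.0) ** f)

def rbf_kernel(y1, y2, sigma=1.0):
    """Equation (8): exp(-alpha * ||y1 - y2||^2) with alpha = 1 / (2 sigma^2)."""
    alpha = 1.0 / (2.0 * sigma ** 2)
    diff = np.asarray(y1, dtype=float) - np.asarray(y2, dtype=float)
    return float(np.exp(-alpha * np.dot(diff, diff)))

def sigmoid_kernel(y1, y2, alpha=None, c=0.0):
    """Equation (9): tanh(alpha * y1^T y2 + c); alpha defaults to 1 / M."""
    y1 = np.asarray(y1, dtype=float)
    if alpha is None:
        alpha = 1.0 / y1.size
    return float(np.tanh(alpha * np.dot(y1, y2) + c))

def combined_kernel(y1, y2, weights=(0.25, 0.25, 0.25, 0.25)):
    """Hypothetical multi-kernel: a weighted sum of the four kernels above."""
    values = (linear_kernel(y1, y2), polynomial_kernel(y1, y2),
              rbf_kernel(y1, y2), sigmoid_kernel(y1, y2))
    return float(np.dot(weights, values))

# Hypothetical two-feature samples
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
k_lin = linear_kernel(a, b)
k_poly = polynomial_kernel(a, b, f=2)
k_rbf = rbf_kernel(a, b, sigma=1.0)
k_sig = sigmoid_kernel(a, b)
```

In a PB-tuned setup, quantities such as `sigma`, `f`, `c`, and the kernel weights would be among the values the optimizer searches over.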

PB optimization for sprint performance and biomechanical analysis

Polar Bear (PB) optimization is used to select the parameters of the MKSVM, giving better accuracy and faster convergence in the prediction and analysis of sprint biomechanics. PB optimization is proposed for the MKSVM framework to improve convergence speed and prediction as well as the handling of complex biomechanical data. The polar bear optimization method is a nature-inspired swarm-intelligence metaheuristic. The following four steps give a mathematical model of polar bear hunting behavior: initial population, global search using ice floes, local search, and dynamic population control.

Initial population

First, the initial population of polar bears is created at random; exploitation, exploration, and dynamic population control are then used to identify the best solution in the search space. Each polar bear with $y$ coordinates is represented as $Z = (z_{0}, z_{1}, z_{2}, \dots, z_{y-1})$. At the $j^{th}$ iteration, ${(Z_{i}^{j})}^{(o)}$ denotes the $i^{th}$ coordinate of the $o^{th}$ polar bear. The initial polar bear population forms the set of candidate solutions, which is refined through exploration and exploitation to optimize sprint performance and biomechanics.

Global search using ice floes

A bear in search of food typically explores its surroundings. When food is scarce, it relocates to a large, stable ice floe that can support its weight for longer periods. When looking for food, polar bears drift on ice floes toward areas with a greater chance of finding seals to hunt. The global search space can be explored parsimoniously by mimicking these strategic movements on ice floes. The drift is directed toward the current best solution in the population. This behavior is represented mathematically in equation (10).

{(Z_{i}^{j})}^{(o)} = {(Z_{i}^{j-1})}^{(o)} + \mathrm{sign}(u) \, \alpha + \rho

(10)

The movement of the $o^{th}$ bear along the $i^{th}$ coordinate in the $j^{th}$ iteration toward the optimum is represented by ${(Z_{i}^{j})}^{(o)}$. The distance between the current bear and the best bear is denoted by $u$; $\alpha$ is a value in $(0, 1)$, and $\rho$ is a value in $(0, u)$. The distance between two points $Z^{(j)}$ and $Z^{(i)}$ is calculated using the Euclidean metric, as expressed in equation (11).

c(Z^{(j)}, Z^{(i)}) = \sqrt{\sum_{l=0}^{y-1} {(Z_{l}^{(j)} - Z_{l}^{(i)})}^{2}}

(11)

The global search approach uses the polar bear movement in optimizing ML algorithms to improve the efficiency of sprint performance and biomechanical analysis.
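One global search step from equations (10) and (11) can be sketched as below. Several details are assumptions made for illustration: $\alpha$ and $\rho$ are drawn uniformly at random and shared across coordinates, a move is kept only if it improves fitness, and the fitness function is a hypothetical stand-in for the classification accuracy of the tuned model.

```python
import numpy as np

def euclidean_distance(a, b):
    """Equation (11): Euclidean distance between two bears."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def global_step(bear, best, fitness, rng):
    """Equation (10): drift a bear, keeping the move only if fitness improves."""
    bear = np.asarray(bear, dtype=float)
    u = euclidean_distance(bear, best)   # distance to the current best bear
    alpha = rng.uniform(0.0, 1.0)        # alpha in (0, 1)
    rho = rng.uniform(0.0, u)            # rho in (0, u)
    candidate = bear + np.sign(u) * alpha + rho
    # Assumption: greedy acceptance, with higher fitness being better
    return candidate if fitness(candidate) > fitness(bear) else bear

def fitness(z):
    """Hypothetical objective with its maximum at (2, 2)."""
    return -float(np.sum((np.asarray(z, dtype=float) - 2.0) ** 2))

rng = np.random.default_rng(0)
bear = np.array([0.0, 0.0])
best = np.array([1.5, 1.5])
moved = global_step(bear, best, fitness, rng)
```

In the PB-MKSVM setting, each bear would encode a candidate set of kernel parameters, and `fitness` would be replaced by a validation score of the SVM trained with those parameters.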

Local search

When hunting in the Arctic, polar bears move asymmetrically, approaching their prey underwater, on land, or on ice. Polar bears are among the biggest and most dangerous non-aquatic predators, capable of swimming hundreds of kilometers without stopping. The local search strategy models these flexible movements in the bear’s surroundings to improve solution quality around the global optimum. The trifolium equation is used to model this behavior. It has two parameters: $\beta \in (0, 0.3)$ is the polar bear’s visual range, and $\theta_{0} \in (0, \frac{\pi}{2})$ is the tumbling angle. Using these two values, the polar bear’s radius $q$ is computed in equation (12).

q = 4 \beta \cos \theta_{0} \sin \theta_{0}

(12)

For every spatial coordinate, the system of equations (13) describes the individual movements.

\begin{cases}
z_{0}^{new} = z_{0}^{old} \pm q \cos(\theta_{1}) \\
z_{1}^{new} = z_{1}^{old} \pm [q \sin(\theta_{1}) + q \cos(\theta_{2})] \\
z_{2}^{new} = z_{2}^{old} \pm [q \sin(\theta_{1}) + q \sin(\theta_{2}) + q \cos(\theta_{3})] \\
\dots \\
z_{y-2}^{new} = z_{y-2}^{old} \pm [\sum_{l=1}^{y-2} q \sin(\theta_{l}) + q \cos(\theta_{y-1})] \\
z_{y-1}^{new} = z_{y-1}^{old} \pm [\sum_{l=1}^{y-2} q \sin(\theta_{l}) + q \sin(\theta_{y-1})]
\end{cases}

(13)

where the $y$ coordinates of each solution are governed by the angles $\theta_{1}, \theta_{2}, \dots, \theta_{y-1}$, which are values drawn from a uniform distribution over $(0, \pi)$. A bear’s local position is updated by first applying the equations with the $(+)$ sign and comparing fitness. If the resulting fitness is smaller than the initial value, the $(-)$ sign is used instead, and the procedure continues until the best solution is obtained. This local search approach models polar bear movements in refining ML models to improve sprint performance.
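The local search of equations (12) and (13) can be sketched as follows. The sketch assumes that every trigonometric term is scaled by $q$ and that the offset for the last coordinate ends in a sine term, which is the usual form of this coordinate system. The fitness function and the rule of keeping the old position when neither sign improves are hypothetical choices for illustration.

```python
import numpy as np

def trifolium_radius(beta, theta0):
    """Equation (12): q = 4 * beta * cos(theta0) * sin(theta0)."""
    return 4.0 * beta * np.cos(theta0) * np.sin(theta0)

def local_offsets(q, thetas):
    """Equation (13): offsets for y coordinates built from y - 1 angles."""
    thetas = np.asarray(thetas, dtype=float)
    y = thetas.size + 1
    offsets = np.empty(y)
    for k in range(y - 1):
        # q * (sin(theta_1) + ... + sin(theta_k)) + q * cos(theta_{k+1})
        offsets[k] = q * np.sum(np.sin(thetas[:k])) + q * np.cos(thetas[k])
    # Last coordinate (assumed form): every term is a sine
    offsets[y - 1] = q * np.sum(np.sin(thetas))
    return offsets

def local_step(bear, fitness, rng):
    """Try the (+) move first, then the (-) move; keep whichever improves fitness."""
    bear = np.asarray(bear, dtype=float)
    q = trifolium_radius(rng.uniform(0.0, 0.3), rng.uniform(0.0, np.pi / 2))
    offsets = local_offsets(q, rng.uniform(0.0, np.pi, size=bear.size - 1))
    for candidate in (bear + offsets, bear - offsets):
        if fitness(candidate) > fitness(bear):
            return candidate
    return bear  # assumption: stay in place if neither sign helps

def fitness(z):
    """Hypothetical objective with its maximum at (1, 1, 1)."""
    return -float(np.sum((np.asarray(z, dtype=float) - 1.0) ** 2))

rng = np.random.default_rng(0)
bear = np.array([0.5, 0.5, 0.5])
refined = local_step(bear, fitness, rng)
```

Because $\beta$ is capped at 0.3, each local step is small, which suits fine adjustment of kernel parameters near a promising solution.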

Dynamic population control

The PB optimization method relies on reproduction of the best individuals in the population and removal of the worst, while only 75% of the overall population is created at random at the start. Dynamic population sizing balances exploration and exploitation by managing the population size in relation to the number of fitness evaluations. To reflect the severe Arctic environment, PB optimization controls the population by allowing individuals to die off. To determine whether an individual dies or reproduces, a value $\gamma$ is introduced, where $\gamma$ is a random number in $[0, 1]$. The effect of the harsh Arctic environment on the population is represented by equation (14).

\begin{cases} \text{Reproduction}, & \text{if } \gamma > 0.75 \\ \text{Death}, & \text{if } \gamma < 0.25 \end{cases}

(14)

The population size is not allowed to drop below half of its initial size. Equation (15) takes the average of the best solution at the $j^{th}$ iteration, ${(Z_{i}^{j})}^{(best)}$, and a randomly chosen individual ${(Z_{i}^{j})}^{(o)}$ from the top 10% of the population, excluding the best one, to give the reproduced individual ${(Z_{i}^{j})}^{(rep)}$. Reproduced individuals replace dead ones to maintain the population size.

{(Z_{i}^{j})}^{(rep)} = \frac{{(Z_{i}^{j})}^{(best)} + {(Z_{i}^{j})}^{(o)}}{2}

(15)

Dynamic population control in PB optimization manages model diversity, which improves the efficiency of ML algorithms for sprint performance and biomechanical analysis.
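The reproduction and death rules of equations (14) and (15) can be sketched as a single population update. The size limits, the way the top-10% pool is rounded for small populations, and the fitness function are hypothetical choices made for illustration.

```python
import numpy as np

def reproduce(best, partner):
    """Equation (15): the offspring is the average of the best bear and a partner."""
    return (np.asarray(best, dtype=float) + np.asarray(partner, dtype=float)) / 2.0

def control_population(population, fitness, min_size, max_size, rng):
    """Equation (14): reproduce if gamma > 0.75, remove the worst if gamma < 0.25."""
    ranked = sorted(population, key=fitness, reverse=True)  # best bear first
    gamma = rng.uniform(0.0, 1.0)
    if gamma > 0.75 and len(ranked) < max_size:
        # Assumption: the top-10% pool holds at least one bear besides the best
        top = max(1, int(0.1 * len(ranked)))
        partner = ranked[rng.integers(1, top + 1)]  # index 0 (the best) is excluded
        ranked.append(reproduce(ranked[0], partner))
    elif gamma < 0.25 and len(ranked) > min_size:
        ranked.pop()                                # the worst bear dies
    return ranked

def fitness(z):
    """Hypothetical objective with its maximum at the origin."""
    return -float(np.sum(np.asarray(z, dtype=float) ** 2))

rng = np.random.default_rng(0)
max_size = 8
# 75% of the full population is created at random at the start
population = [rng.uniform(-1.0, 1.0, size=2) for _ in range(int(0.75 * max_size))]
min_size = len(population) // 2   # never drop below half of the initial size

sizes = []
for _ in range(50):
    population = control_population(population, fitness, min_size, max_size, rng)
    sizes.append(len(population))
```

The population therefore fluctuates between the two limits while the best solutions are repeatedly blended into new candidates.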

Performance evaluation

The system had 16 GB RAM, an Intel Core i7 processor, and an NVIDIA GPU. Python-based software, TensorFlow, and scikit-learn were used to implement the ML algorithms as shown in Table 1.

Table 1.

Experimental setup.

Component	Specification
RAM	16 GB
Processor	Intel Core i7
GPU	NVIDIA GPU
Software	Python-based (TensorFlow, scikit-learn)

The confusion matrix shows how the ML model performs activity recognition, including walking and running activities, which serve as transitional movement patterns relevant to understanding sprint biomechanics. Evaluating these classifications allows us to comprehensively assess the biomechanical features associated with different phases of a sprint, emphasizing the importance of muscle activation and joint dynamics in varying speed conditions. Diagonal values correspond to correct classifications; 13,259 of the instances are correctly classified as walking, and 13,082 of the instances are correctly classified as running. Off-diagonal values are misclassifications. 115 instances of walking were classified as running, and 121 instances of running were classified as walking. The high accuracy of this model is an indicator of its ability to distinguish movement patterns. This degree of precision is vital in the context of sprint performance optimization, as accurately analyzing biomechanical data is critical to improving athletic performance and identifying minute variations in movement. The confusion matrix of the sprint performance is depicted in Figure 2.

Figure 2.

Confusion matrix of sprint performance.
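The standard classification metrics can be derived from the confusion matrix counts reported above. In this sketch, treating walking as the positive class is an assumption made only to label the four cells; the function name is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Counts from the text, with walking assumed to be the positive class:
# 13,259 walking correct, 13,082 running correct,
# 115 walking classified as running, 121 running classified as walking
metrics = binary_metrics(tp=13259, fp=121, fn=115, tn=13082)
```

The same function applies to any two-class split of the data, for example when comparing sprinting skill levels pairwise.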

The PSD method was used to extract key biomechanical features from the dataset, with an emphasis on parameters that are crucial for sprint performance and biomechanical analysis. These features are peak frequency, bandwidth, and energy in the low (0–5 Hz) and high (15–30 Hz) frequency bands, as well as the RMS for each band. The key parameters analyzed are hip and knee joint angles, stride length, muscle activation (quadriceps), and vertical ground reaction force. The hip joint angle has a peak frequency of 2.3 Hz, with most of its energy in the low-frequency band (0.45) and a corresponding RMS of 0.33. The knee joint angle has a peak frequency of 1.8 Hz with significant low-frequency energy. Stride length and muscle activation show different frequency profiles, with energy and RMS values that vary across the bands. The ground reaction force peaks at 4.2 Hz, which points to its importance in sprint propulsion. These features provide detailed information on biomechanical patterns that can be used to improve predictions of sprint performance with the PB-MKSVM algorithm. The biomechanical feature extraction using PSD for sprint performance is displayed in Table 2.

Table 2.

Biomechanical feature extraction using PSD for sprint performance.

Biomechanical parameter	Peak frequency (Hz)	Bandwidth (Hz)	Energy in low-frequency band (0–5 Hz)	Energy in high-frequency band (15–30 Hz)	RMS (low-frequency band)	RMS (high-frequency band)
Hip joint angle	2.3	5.0	0.45	0.15	0.33	0.20
Knee joint angle	1.8	4.5	0.50	0.20	0.36	0.18
Stride length (m)	1.5	3.2	0.60	0.15	0.40	0.14
Muscle activation (Quadriceps)	3.0	6.5	0.40	0.10	0.31	0.17
Ground reaction force (Vertical)	4.2	8.0	0.55	0.10	0.38	0.15
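The kind of features listed in Table 2 can be illustrated with a short sketch. This section does not state the PSD estimator, the sampling rate, or how band energy was normalized, so everything below is an assumption: a plain DFT periodogram, a hypothetical 100 Hz sampling rate, band power taken as the integral of the one-sided PSD, and a synthetic two-tone signal rather than real sensor data. Bandwidth is omitted because its definition is not given here. In practice a library routine such as Welch's method would usually replace the hand-written periodogram.

```python
import cmath
import math


def periodogram(signal, fs):
    """Return (freqs, psd) of a real signal as a one-sided DFT periodogram."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        # DC and (for even n) Nyquist bins are not doubled in a one-sided PSD.
        unpaired = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 if unpaired else 2.0
        freqs.append(k * fs / n)
        psd.append(scale * abs(coeff) ** 2 / (fs * n))
    return freqs, psd


def psd_features(signal, fs, bands):
    """Peak frequency plus power and RMS for each named frequency band."""
    freqs, psd = periodogram(signal, fs)
    df = fs / len(signal)  # frequency resolution in Hz
    peak_bin = max(range(len(psd)), key=psd.__getitem__)
    features = {"peak_frequency_hz": freqs[peak_bin]}
    for name, (lo, hi) in bands.items():
        power = sum(p for f, p in zip(freqs, psd) if lo <= f <= hi) * df
        features[f"{name}_power"] = power
        features[f"{name}_rms"] = math.sqrt(power)
    return features


# Synthetic example: a 2.5 Hz component (amplitude 1.0) plus a 20 Hz
# component (amplitude 0.3), sampled for 2 s at a hypothetical 100 Hz.
FS = 100
signal = [math.sin(2 * math.pi * 2.5 * i / FS)
          + 0.3 * math.sin(2 * math.pi * 20 * i / FS)
          for i in range(200)]
bands = {"low": (0, 5), "high": (15, 30)}  # band edges from Table 2
features = psd_features(signal, FS, bands)
print(features)
```

For this synthetic signal the peak frequency is the 2.5 Hz component, and the low band carries far more power than the high band, which mirrors the low-frequency dominance reported in Table 2.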

Figure 3 shows the time in seconds for the training and validation datasets over 100 epochs. Training time starts above 150 seconds, decreases sharply over the first epochs, and settles below 50 seconds after about 20 epochs. Validation time follows the same pattern, starting at about 100 seconds and leveling out at about 50 seconds. The training time graph is illustrated in Figure 3(a). The convergence time peaks at about 50 seconds at around 100 epochs, indicating that the process has stabilized and that the model has reached its best performance by the end of the training phase. The convergence time graph is illustrated in Figure 3(b). These curves suggest that the efficiency of the ML model increases considerably in the early epochs and then stabilizes during both training and validation as it converges to a reliable solution for performance and biomechanical optimization.

Figure 3.

Sprint Performance: (a) Training time (sec) (b) Convergence time (sec).

The proposed PB-MKSVM model was compared with two existing methods, RF²⁶ and the hybrid Convolutional Neural Networks and Long Short-Term Memory (CNN-LSTM) model,²⁶ for optimizing sprint performance and biomechanical analysis of track and field athletes.

Accuracy, precision, recall, and F1-score were determined for each model. The RF model achieved an accuracy of 88.1%, a precision of 86.5%, a recall of 87.2%, and an F1-score of 86.8%, which indicates solid predictive ability. The CNN-LSTM model improved on this with an accuracy of 92.4%, a precision of 90.2%, a recall of 91.7%, and an F1-score of 90.9%. The proposed PB-MKSVM model outperformed both, with an accuracy of 94.5%, a precision of 92.7%, a recall of 93.6%, and an F1-score of 92.1%, demonstrating superior ability in optimizing performance and analyzing the biomechanics of sprint athletes. The comparison values are shown in Figure 4 and Table 3.

Figure 4.

Comparison of ML models for sprint performance.

Table 3.

Numerical outcomes of ML models for sprint performance and biomechanical outcome of athletes.

Method	Accuracy (%)	Precision (%)	Recall (%)	F1-score (%)
RF²⁶	88.1	86.5	87.2	86.8
CNN-LSTM²⁶	92.4	90.2	91.7	90.9
PB-MKSVM [Proposed]	94.5	92.7	93.6	92.1
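For a single class, the F1-score is the harmonic mean of precision and recall, so the Table 3 values can be cross-checked. The sketch below assumes that single-class definition; this section does not say whether the reported scores were averaged per class or per fold, and the variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) taken from Table 3.
reported = {
    "RF": (86.5, 87.2, 86.8),
    "CNN-LSTM": (90.2, 91.7, 90.9),
    "PB-MKSVM": (92.7, 93.6, 92.1),
}

for name, (p, r, f1_reported) in reported.items():
    f1_computed = round(f1_score(p, r), 1)
    print(f"{name}: computed F1 = {f1_computed}, reported F1 = {f1_reported}")
```

Under this assumption the harmonic mean reproduces the reported F1-scores for RF and CNN-LSTM. For PB-MKSVM it gives about 93.1 against the reported 92.1, a gap that could arise if the reported score was averaged across classes or folds rather than computed from the pooled precision and recall.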

The R² (coefficient of determination) and AUC metrics were used to compare the performance of the machine learning models in optimizing sprint performance and biomechanical analysis of track and field athletes. RF achieved an R² of 0.82 and an AUC of 0.78, meaning that the model was reasonably accurate in its predictions. The CNN-LSTM model improved these values to an R² of 0.89 and an AUC of 0.86, and the proposed PB-MKSVM model outperformed both with an R² of 0.92 and an AUC of 0.91, thus showing better model performance in athlete analysis. The R² and AUC values are depicted in Figure 5 and Table 4.

Figure 5.

R² and AUC results for sprint performance.

Table 4.

R² and AUC outcome of sprint performance.

Method	R²	AUC
RF²⁶	0.82	0.78
CNN-LSTM²⁶	0.89	0.86
PB-MKSVM [Proposed]	0.92	0.91
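As a reminder of what the two metrics in Table 4 measure, the sketch below computes R² from its definition (one minus the ratio of residual to total sum of squares) and AUC as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The toy inputs are purely illustrative and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy regression targets and predictions (illustrative values only).
print(r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))

# Toy binary labels and classifier scores (illustrative values only).
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

R² applies to continuous predictions, while AUC applies to ranked scores for a binary outcome, so the two columns in Table 4 describe different aspects of model quality.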

Discussion

PB-MKSVM uses multiple kernels, tuned by the optimization algorithm, to capture complex, phase-specific biomechanical interactions, enabling precise prediction and optimization of sprint performance metrics. The RF²⁶ method does not model temporal or phase-specific biomechanical interactions, and thus it cannot provide high precision and accuracy in recognizing complex motion patterns. It achieved an accuracy of 88.1%, a precision of 86.5%, a recall of 87.2%, an F1-score of 86.8%, an R² of 0.82, and an AUC of 0.78. CNN-LSTM²⁶ is computationally intensive, tends to overfit when data are scarce, and does not generalize well across athletes with varying skill levels. It achieved an accuracy of 92.4%, a precision of 90.2%, a recall of 91.7%, an F1-score of 90.9%, an R² of 0.89, and an AUC of 0.86.

PB-MKSVM integrates multi-source kernel learning to compensate for RF's inability to capture dependencies in temporal features. Its phase-specific analysis has a lower computational demand because the Polar Bear algorithm optimizes the kernel parameters while preserving robust precision and recall. Compared with CNN-LSTM, the model generalizes better across a wide range of datasets and skill levels in sprint performance.
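The idea of combining several kernels can be sketched as a weighted sum of base kernels. This section does not list the kernels, weights, or parameters used in PB-MKSVM, so the linear, RBF, and polynomial kernels below, their default parameters, and all function names are hypothetical stand-ins. The weights and parameters are the kind of quantities a metaheuristic tuner could search over.

```python
import math


def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel; gamma is an assumed default, not a paper value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel with assumed degree and offset."""
    return (linear_kernel(x, y) + coef0) ** degree


def combined_kernel(x, y, weights):
    """Convex combination of the three base kernels.

    Non-negative weights that sum to 1 keep the result a valid kernel.
    """
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    w_lin, w_rbf, w_poly = weights
    return (w_lin * linear_kernel(x, y)
            + w_rbf * rbf_kernel(x, y)
            + w_poly * poly_kernel(x, y))


def gram_matrix(samples, weights):
    """Pairwise combined-kernel values for a list of feature vectors."""
    return [[combined_kernel(a, b, weights) for b in samples]
            for a in samples]


# Two hypothetical feature vectors and an arbitrary weight setting.
samples = [[1.0, 2.0], [0.5, -1.0]]
print(gram_matrix(samples, weights=(0.2, 0.5, 0.3)))
```

A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, with the tuner proposing candidate weights and kernel parameters and scoring each candidate on validation data.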

Limitations and generalizability

Limitations of our study include the potential biases associated with wearable sensor data, such as sensor placement variability and ambient noise, which may affect the accuracy of biomechanical measurements. To mitigate this, we conducted preliminary tests to ensure consistent sensor placement and calibration across trials. Additionally, to address overfitting, we employed k-fold cross-validation during model training, which allowed us to evaluate the model’s performance on unseen data, and we regularized the weights to preserve generalizability. Performance testing was conducted across diverse athlete populations, including both male and female athletes of various skill levels and body types, so that the model’s predictive capability could be assessed across a spectrum of training conditions and demographics.
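The k-fold procedure mentioned above can be sketched as follows. The number of folds, the shuffling seed, and the function name are assumptions, since this section does not report them. The sketch only produces index splits, and model fitting is left out.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is held out exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i, val_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        yield train_idx, val_idx


# Example: 10 hypothetical trials split into 5 folds.
for fold, (train_idx, val_idx) in enumerate(k_fold_indices(10, 5)):
    print(f"fold {fold}: train={sorted(train_idx)} val={sorted(val_idx)}")
```

Each model is trained on the training indices and scored on the held-out indices, and the k scores are then averaged to estimate performance on unseen data.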

Conclusion

By applying the PB-MKSVM approach to optimize sprint performance in track and field athletes, the proposed ML model successfully identifies changes in intricate biomechanical variables, such as muscle activation patterns, joint angles, ground reaction forces, and stride length, with phase-specific precision. The results are more detailed than those of the traditional methods used to date for assessing sprint performance and skill differentiation; the ML algorithm enhances understanding of sprinting biomechanics and provides valuable tools to help athletes and coaches improve performance and reduce injuries. The proposed ML model predicted sprint performance with strong metrics: accuracy (94.5%), precision (92.7%), recall (93.6%), F1-score (92.1%), R² (0.92), and AUC (0.91). These results reflect the model’s ability to distinguish between skill levels and to predict sprinting efficiency accurately, making it a useful resource for athletes and coaches seeking further performance improvements. The limitations include reliance on accurate biomechanical data from wearable sensors, the difficulty of feature extraction, and possible overfitting of machine learning models due to limited training data. Future work includes extending the dataset with more varied athlete profiles, incorporating real-time monitoring systems, and enhancing the PB-MKSVM algorithm to further increase prediction accuracy and personalization.

Footnotes

ORCID iD

Xiangwei Chen

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Conflicting interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The authors declare that the data supporting the findings of this study are available within the article. The raw/derived data supporting the findings of this study are available from the corresponding author upon request.

References

1. Galvan‐Alvarez, Gallego‐Selles, Martinez‐Canton, et al. Physiological and molecular predictors of cycling sprint performance. Scand J Med Sci Sports 2024; 34(1): e14545. DOI: 10.1111/sms.14545.

2. Bortone, Moretti, Bizzoca, et al. The importance of biomechanical assessment after Return to Play in athletes with ACL-Reconstruction. Gait Posture 2021; 88: 240–246. DOI: 10.1016/j.gaitpost.2021.06.005.

3. Lichtwark, Schuster, Kelly, et al. Markerless motion capture provides accurate predictions of ground reaction forces across a range of movement tasks. J Biomech 2024; 166: 112051. DOI: 10.1016/j.jbiomech.2024.112051.

4. Liu, et al. The impact of different velocity losses on post-activation performance enhancement (PAPE) effects in sprint athletes: a pilot randomized controlled study. Sports 2024; 12(6): 157. DOI: 10.3390/sports12060157.

5. Soh, Japar, et al. Maximizing the performance of badminton athletes through core strength training: unlocking their full potential using machine learning (ML) modeling. Heliyon 2024; 10(7): e35145.

6. Kapinski, Jaskulski, Witkowska, et al. Towards Achilles tendon injury prevention in athletes with structural MRI biomarkers: a machine learning approach. Sports Med Open 2024; 10(1): 118. DOI: 10.1186/s40798-024-00786-6.

7. Brown, Hume, Brughelli. Clinical determinants of knee joint loads while sidestepping: an exploratory study with male rugby union athletes. Adv Rehabil Sci Pract 2024; 13: 27536351241267108. DOI: 10.1177/27536351241267108.

8. Thelkar. Leveraging artificial neural networks for enhanced athlete performance evaluation through IMU data analysis. Heliyon 2024; 10(15): e34826.

9. Sandamal, Arachchi, Erkudov, et al. Explainable artificial intelligence for fitness prediction of young athletes living in unfavorable environmental conditions. Results in Engineering 2024; 23: 102592. DOI: 10.1016/j.rineng.2024.102592.

10. Chen, Dai. Utilizing AI and IoT technologies for identifying risk factors in sports. Heliyon 2024; 10: e32477. DOI: 10.1016/j.heliyon.2024.e32477.

11. Tam, Yao. Advancing 100m sprint performance prediction: a machine learning approach to velocity curve modeling and performance correlation. PLoS One 2024; 19(5): e0303366. DOI: 10.1371/journal.pone.0303366.

12. Kurtoğlu, Eken, Çiftçi, et al. The role of morphometric characteristics in predicting 20-meter sprint performance through machine learning. Sci Rep 2024; 14(1): 16593. DOI: 10.1038/s41598-024-67405-y.

13. Lee, Chang, Lee, et al. Essential elements of physical fitness analysis in male adolescent athletes using machine learning. PLoS One 2024; 19(4): e0298870. DOI: 10.1371/journal.pone.0298870.

14. Mateus, Abade, Coutinho, et al. Empowering the sports scientist with artificial intelligence in training, performance, and health management. Sensors 2024; 25(1): 139. DOI: 10.3390/s25010139.

15. Petrovsky, Pustovoyt, Nikolsky, et al. Tracking health, performance and recovery in athletes using machine learning. Sports 2022; 10(10): 160. DOI: 10.3390/sports10100160.

16. Dorschky, Camomilla, Davis, et al. Perspective on “in the wild” movement analysis using machine learning. Hum Mov Sci 2023; 87: 103042. DOI: 10.1016/j.humov.2022.103042.

17. Needham, Evans, Cosker, et al. Development, evaluation and application of a novel markerless motion analysis system to understand push-start technique in elite skeleton athletes. PLoS One 2021; 16(11): e0259624. DOI: 10.1371/journal.pone.0259624.

18. Worsey, Espinosa, Shepherd, et al. One size doesn’t fit all: supervised machine learning classification in athlete-monitoring. IEEE Sens Lett 2021; 5(3): 1–4. DOI: 10.1109/LSENS.2021.3060376.

19. Common sports injuries of track and field athletes using cloud computing and internet of things. Int J Comput Intell Syst 2023; 16(1): 70. DOI: 10.1007/s44196-023-00257-y.

20. Luo, Davids, et al. Vision-based movement recognition reveals badminton player footwork using deep learning and binocular positioning. Heliyon 2022; 8(8): e10089.

21. An investigation of an athlete injury likelihood monitoring system using the random forest algorithm and DWT. Technol Health Care 2024; 22: 1–5. DOI: 10.3233/THC-231789.

22. Yin, Yang, Van De Panne, et al. Discovering diverse athletic jumping strategies. ACM Trans Graph 2021; 40(4): 1–7. DOI: 10.1145/3450626.3459817.

23. Bezuglov, Shoshorina, Lazarev, et al. Does vitamin D affect strength and speed characteristics and testosterone concentration in elite young track and field athletes in the North European summer? Nutr J 2023; 22(1): 16. DOI: 10.1186/s12937-023-00848-7.

24. Maduwantha, Jayaweerage, Kumarasinghe, et al. Accessibility of motion capture as a tool for sports performance enhancement for beginner and intermediate cricket players. Sensors 2024; 24(11): 3386. DOI: 10.3390/s24113386.

25. Miranda-Oliveira, Branco, Fernandes. Accuracy of inertial measurement units when applied to the countermovement jump of track and field athletes. Sensors 2022; 22(19): 7186. DOI: 10.3390/s22197186.

26. Burenbatu. Comparative analysis of biomechanical patterns in sprinting: a machine learning approach to optimize running performance in track athletes. Mol Cell BioMech 2024; 21(1): 321. DOI: 10.62617/mcb.v21i1.321.