Abstract
Sprint performance is a crucial component of athletic performance, especially in sports like track and field, football, and rugby, which require quick bursts of peak effort over short durations. Understanding the biomechanics of sprinting is essential for enhancing athletic performance, preventing injuries, and creating effective training plans. Traditional research on sprint evaluation often focuses on discrete measures while neglecting the intricate interactions between variables that evolve throughout the sprint. This study addresses these challenges by applying a machine learning (ML) algorithm, specifically the Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM), to predict and optimize the sprint performance of track and field athletes. The system analyzes essential biomechanical characteristics such as muscle activation patterns, joint angles, ground reaction forces, and stride length. Data were collected using wearable sensors and motion capture systems during standardized sprint trials, during which various biomechanical parameters were recorded. Standard preprocessing steps including noise removal and outlier detection were applied to the data. Power Spectral Density (PSD) was employed to extract features from the preprocessed data. The results demonstrate that the proposed method outperforms traditional algorithms in predicting sprinting efficiency and identifies complex, phase-specific changes in movement patterns. The model effectively analyzes the intricate biomechanics of sprinters’ movements to differentiate between various skill levels. Using Python software, the model achieved impressive performance metrics, including accuracy (94.5%), precision (92.7%), recall (93.6%), F1-score (92.1%), R2 (0.92), and AUC (0.91), highlighting its robust predictive ability. 
This study illustrates how machine learning models can advance research in sprinting mechanics and provide insightful information to athletes and coaches seeking to improve performance.
Keywords
Introduction
Track and field sprinting performance is influenced by complex, interwoven biomechanical, physiological, and environmental determinants. Optimizing sprinting performance is a highly specific focus area within sports science, where even minor improvements can make a significant difference between an athlete who wins and one who loses. 1 Biomechanical analysis enables an understanding of the physical and mechanical aspects of athletes’ movements, allowing for tailored training interventions that optimize performance. 2 However, the complexity of human movement and the large volume of data involved make analyzing and optimizing sprint biomechanics challenging. 3 Traditional performance analysis methods, such as video-based observations, manual biomechanical assessments, and basic statistical modeling, often fail to process large datasets or account for the multifactorial nature of sprinting. Such research can be time-consuming, subjective, and dependent on the analyst’s expertise, leading to inconsistent or poor training recommendations. 4
Machine learning (ML) algorithms offer a novel approach to address these limitations by effectively handling large and complex datasets. 5 Techniques such as supervised learning, unsupervised clustering, and predictive modeling enable the uncovering of relationships within biomechanical data that may not be identified using traditional approaches. 6 These relationships can reveal the impact of various factors, such as stride lengths, ground reaction forces, and joint kinematics, on sprint performance with greater accuracy. 7 The primary objective of this study is to develop ML algorithms aimed at optimizing sprint performance and conducting biomechanical analyses of track and field athletes. 8 Advanced computational tools will enhance the performance analysis and provide actionable insights for athletes and coaches. This research strives to overcome the constraints of traditional approaches by providing a scalable and data-driven solution that enhances athletes’ sprinting performance. 9 The application of ML in sports biomechanics represents a significant milestone, revolutionizing training practices and competitive strategies in track and field sports. 10 However, there is a lack of specific examples of implemented ML models, quantified improvements, and detailed challenges related to integrating ML into biomechanical analysis for optimizing sprints, which limits its practical applicability.
This work aims to design and implement a novel machine learning (ML) algorithm, the Polar Bear-tuned Multi-Source Kernel Support Vector Machine (PB-MKSVM), which integrates Polar Bear optimization with a multi-source kernel SVM. The framework enhances the analysis of complex biomechanical parameters by optimizing kernel parameters for better classification accuracy and predictive performance. The main contributions of this work are: (1) It introduces a new algorithm tailored to predicting and optimizing sprint performance, which surpasses conventional methods and provides deeper insight into the complex dynamics of sprinting. (2) Using wearable sensors and motion capture systems, it examines muscle activation, joint angles, ground reaction forces, and stride length to gain insight into sports performance and to reduce injury risk factors. (3) It offers concrete suggestions for training programs that improve sprinting mechanics and differentiate skill levels, directly improving the sprint performance of track and field athletes.
Related work
Machine learning methods were used to simulate the velocity-time curve in 100-meter sprinting to overcome the limitations of traditional speed models. 11 Neural Network (NN) and Random Forest (RF) models forecast the acceleration phase. The study shed light on the dynamics of sprinting performance by identifying a significant inverse relationship between maximal velocity and final time. ML methods were used to examine how morphometric characteristics affected the 20-meter sprint performance of first-grade primary school students. 12 Three experiments were carried out to find the best ML method for outcome prediction. With a minimum mean squared error (MSE) of 0.012, the correlation-based features showed strong linear relationships with the target variable and proved to be reliable predictors. Six ML algorithms were used to examine the average physical fitness (PF) of 1489 male adolescent athletes who competed in five different sports. 13 With an F1-score of 0.87, an area under the curve (AUC) of 0.86, and an average accuracy of 90.14, the XGBoost model performed best. These results were expected to facilitate the creation of tailored training plans for athletes engaged in various sports. Artificial intelligence (AI) is transforming sports science by enhancing performance, education, and health management. 14 Reported applications include menstrual cycle management, off-training behavior, injury prevention, talent identification, load efficiency, and sleep quality. The manuscript highlighted the need for interdisciplinary cooperation, context-specific AI technology, and AI-focused education. In-depth medical evaluations of 3661 athletes were used to examine their health parameters. 15 RF and multinomial logistic regression were used to identify the most important markers of fluid and blood biochemistry for phenotypic analysis and for characterizing the recovery process in the post-competitive period. The classification of catabolism and anabolism was found to be strongly influenced by ornithine cycle characteristics and muscle metabolism factors.
Developments in wearable sensing and ML have made in-the-wild assessment of sports movement possible. 16 This enables long-term monitoring for injury prevention as well as real-time feedback for technique analysis and performance improvement. The review summarized methods for examining sports movement with wearable sensors and ML, along with guidance on setting up measurement protocols, their advantages and disadvantages, and suggestions for building models from movement data. A deep learning and computer vision approach was proposed to record mass center velocities and step characteristics in sprinting and skeleton push-starts. 17 High levels of agreement were found when the approach was compared with customized marker-less and marker-based methods. While step characteristics were similar, the evaluations revealed lower mass center velocities during pushing. Athlete movement data are essential for maximizing performance, reducing fatigue, and lowering injury risk. 18 Sport-specific actions can be classified using computer vision and inertial sensor technology, and performance can be improved by athlete-dependent classification techniques. The study described the training and assessment of supervised ML models that automatically identify running surfaces from inertial measurement unit sensor data. Internet of Things (IoT) and cloud computing were used to investigate common sports injuries sustained by track and field athletes. 19 Sports injuries were analyzed using cluster analysis, which applies mathematical techniques to determine the association between samples based on features and similarity indicators. The findings showed that the computational, IoT-based approach outperformed the multi-level model approach in time and accuracy, by 10% and 18%, respectively. An automated technique was proposed for capturing footwork data from badminton players, using deep learning to acquire image coordinates and binocular positioning. 20 The final positioning precision was 74.7%. The technique provided insight into how players cooperated to intercept projectiles by exposing inter-individual footwork adjustments during competitive performance.
ML techniques have also been used to enhance sports injury monitoring. RF and Discrete Wavelet Transform (DWT) methods were used to generate a training set of sports injuries. 21 A wearable gyroscope device was used to build a system for tracking the likelihood of injury in athletes. The system determined an athlete’s predisposition to injury and gave prevention advice, at an affordable overall cost and with high testing accuracy. A framework integrating deep reinforcement learning (DRL) and physics simulation was examined for discovering diverse motion strategies for athletic skills such as high jumps. 22 The framework explored initial character states using a Bayesian diversity search method and constrained actions to natural poses using a Pose Variational Autoencoder. The approach enabled the discovery of new strategies without reward programming or motion examples. The relationship between the strength and speed characteristics of young, competitive track and field athletes was investigated. 23 It was found that among young, talented track and field athletes who live and train above 50° north latitude, the prevalence of vitamin D deficiency was lower than in other groups. Sports, medicine, and entertainment all depend on Motion Capture (MoCap), but its full potential is hampered by high cost and lack of experience, 24 particularly for novice and intermediate sports coaches. The difficulties of creating reasonably priced MoCap systems for these levels were examined in order to launch a system that is simple to use and requires few resources. A commercially available Inertial Measurement Unit (IMU) was used to evaluate countermovement jump (CMJ) performance during the contraction phase. 25 Eight athletes performed CMJs while wearing the IMU on their fifth lumbar vertebra. For negative impulse phase time, contraction time, jump time, flight time, and minimum force, the results showed excellent accuracy and correlation, with no statistically significant differences between the IMU and the force plate (FP).
Problem statement
Research on sprint performance and biomechanics has drawn on ML and wearable technologies to facilitate performance improvements, predict outcomes, and minimize injury risk. However, existing studies have mostly focused on a specific parameter of the acceleration phase, on morphometric characteristics, or on isolated biomechanical features, and these generally fail to capture the complex, dynamic interactions between variables throughout the sprint cycle. Traditional ML models also show serious deficiencies in phase-specific movement analysis, applicability across skill levels, and accuracy of outcome prediction. Even studies that use advanced ML techniques often fail to capture the subtle biomechanical variations that matter for optimizing sprinting efficiency. The proposed PB-MKSVM algorithm incorporates wearable sensor data, motion capture systems, and advanced preprocessing techniques to analyze key biomechanical parameters holistically. This approach recognizes complex, phase-specific patterns, thereby outperforming traditional algorithms and offering actionable insights for optimizing sprint mechanics and improving athlete performance.
Proposed system
The proposed system uses the Kinematics Motion dataset to perform biomechanical analysis of track and field athletes. Preprocessing applies Z-score normalization to standardize features such as muscle activation, joint angles, and stride length so that the data become comparable. Feature extraction uses PSD to extract the key patterns in the biomechanical signals. The core of the system is the Polar Bear-tuned Multi-Source Kernel Support Vector Machine, in which Polar Bear optimization estimates the optimal kernel parameters to improve the prediction accuracy for sprint performance. This hybrid approach achieves greater accuracy and faster convergence in classifying and predicting sprint biomechanics for athlete performance optimization. The sprint performance flow is depicted in Figure 1.
Figure 1. Overall flow of the sprint performance prediction.
Dataset
The Kinematics Motion dataset comprises a sample of 150 track and field athletes, aged 18–30 years, spanning three sprinting levels (beginner, intermediate, and advanced). Data were collected using a combination of wearable sensors, which captured muscle activations and ground reaction forces, and a motion capture system that recorded joint angles and other biomechanical parameters during standardized sprint trials. Each athlete performed three 60-meter sprints, and the average values were used for analysis to ensure consistency. These data are well suited to ML algorithms that seek improvements in the sprinting performance of track and field athletes from biomechanical variables such as stride length, muscle activation, and ground reaction forces.
Source: https://www.kaggle.com/datasets/yasserh/kinematics-motion-data.
Preprocessing using Z-score normalization
Z-score normalization standardizes the biomechanical data so that features such as muscle activation, joint angles, and stride length have comparable scales. This reduces bias during training, improves model performance, and speeds convergence in the PB-MKSVM-based sprint performance prediction. The technique is based on the mean and standard deviation of the data, and it is especially useful when the true minimum and maximum values of the data are unknown. The formula is given in equation (1).
$$z = \frac{x - \mu}{\sigma} \tag{1}$$
The normalized new value is denoted by $z$, where $x$ is the original value and $\mu$ and $\sigma$ are the mean and standard deviation of the feature.
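As a minimal sketch of this preprocessing step, the snippet below standardizes one feature column with the usual Z-score definition (value minus mean, divided by standard deviation). The function name `z_score`, the use of the population standard deviation, and the stride-length values are illustrative assumptions, not details taken from the study.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize a feature column: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant feature carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical stride lengths in metres for five sprint trials.
stride_length_m = [1.9, 2.1, 2.0, 2.3, 2.2]
normalized = z_score(stride_length_m)
```

After this transformation the column has zero mean and unit standard deviation, so features measured in different units contribute on a comparable scale.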
Feature extraction using Power Spectral Density (PSD)
Frequency-domain features are extracted with PSD to capture the key patterns in biomechanical signals such as muscle activations and ground reaction forces, providing the foundation on which PB-MKSVM analyzes and predicts sprint performance. The output PSD for a linear system
Equation (5) also contains the function
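To illustrate how PSD features can be obtained from a sampled signal, the following sketch computes a one-sided periodogram with a direct discrete Fourier transform and sums it over a frequency band. It is a generic periodogram estimate, not a reconstruction of the study's equations (2) to (5); the function names, the 100 Hz sampling rate, and the synthetic 10 Hz test signal are assumptions made for the example.

```python
import cmath
import math

def periodogram(signal, fs):
    """One-sided periodogram PSD estimate via a direct DFT (O(n^2))."""
    n = len(signal)
    mean_value = sum(signal) / n
    centered = [s - mean_value for s in signal]  # remove the DC offset
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2 / (fs * n)
        if 0 < k < n / 2:
            power *= 2  # fold negative frequencies into the one-sided estimate
        freqs.append(k * fs / n)
        psd.append(power)
    return freqs, psd

def band_power(freqs, psd, low, high):
    """Approximate power in [low, high) Hz: PSD bins times the bin width."""
    df = freqs[1] - freqs[0]
    return sum(p * df for f, p in zip(freqs, psd) if low <= f < high)

# Synthetic stand-in for a biomechanical signal: a 10 Hz sine sampled at 100 Hz.
fs = 100.0
t = [i / fs for i in range(200)]
emg_like = [math.sin(2 * math.pi * 10 * ti) for ti in t]
freqs, psd = periodogram(emg_like, fs)
peak_hz = freqs[psd.index(max(psd))]
power_5_15 = band_power(freqs, psd, 5.0, 15.0)
```

Band powers such as `power_5_15` are one common way to turn a PSD into a fixed-length feature vector for a classifier. For long recordings an FFT-based routine would be preferable to this quadratic-time loop.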
PB-MKSVM for enhanced sprint performance and biomechanical analysis
The hybrid PB-MKSVM optimizes sprint performance prediction and biomechanical analysis of track and field athletes over a hyperparameter search space that includes the parameters C and γ, with ranges defined as C ∈ (0.1, 100) and γ ∈ (0.001, 10). This lets the model explore the parameter landscape efficiently while converging on settings suited to the dynamics of biomechanical data. Convergence was assessed by monitoring changes in accuracy, with the search stopping when improvement stayed below a threshold for a predetermined number of iterations. PB, which is inspired by natural hunting strategies, enhances the MKSVM by optimizing its parameters, increasing both convergence speed and accuracy for sprint biomechanics prediction. The framework uses PB to fine-tune the kernel parameters and to choose the most appropriate features of the biomechanical data, such as muscle activation, joint angles, and stride length. The MKSVM then uses the optimized features and kernels for classification and prediction, drawing on several kernel types, including linear, polynomial, RBF, and sigmoid. PB optimization combines a global search using ice floes, a local search modeled on polar bear hunting behavior, and dynamic population control, which together improve the ability to classify and predict sprint performance across scenarios. This hybrid approach maximizes predictive accuracy and robustness in biomechanical data analysis for sprint performance optimization. The hybrid PB-MKSVM process is described in Algorithm 1.
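The search space and stopping rule described above can be sketched as follows. The ranges for C and γ come from the text; the log-uniform sampling, the function names, and the default values of `min_delta` and `patience` are assumptions, since the study does not report the threshold or the number of iterations it used.

```python
import math
import random

C_RANGE = (0.1, 100.0)       # range for C given in the text
GAMMA_RANGE = (0.001, 10.0)  # range for gamma given in the text

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def has_converged(accuracy_history, min_delta=1e-3, patience=10):
    """True when accuracy improved by less than min_delta over the last
    `patience` iterations (both defaults are hypothetical)."""
    if len(accuracy_history) <= patience:
        return False
    best_before = max(accuracy_history[:-patience])
    best_recent = max(accuracy_history[-patience:])
    return best_recent - best_before < min_delta

rng = random.Random(0)
candidate = {
    "C": sample_log_uniform(*C_RANGE, rng),
    "gamma": sample_log_uniform(*GAMMA_RANGE, rng),
}
```

Sampling on a logarithmic scale is a common choice when a range spans several orders of magnitude, as both ranges do here, because it gives small and large values equal coverage.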
MKSVM framework for biomechanical data integration in sprint performance optimization
The Multi-Source Kernel Support Vector Machine (MKSVM) is designed to handle the complex, high-dimensional biomechanical data involved in sprint performance analysis. Its kernel-based approach integrates diverse features effectively, which supports precise predictions and robust skill-level differentiation for optimizing athletic performance. Two distinct methods yield distinct sets of attributes that are then fed into the support vector machine (SVM) for training. A supervised SVM model uses classification algorithms to solve two-class classification problems. After an SVM model is given sets of labeled training data for each class, it is ready to classify new data. Finding a hyperplane in an
The Linear Kernel SVM is used for linearly separable data, that is, data that can be split into two categories by a single straight line. To improve its capacity for generalization, it tries to maximize the margin between the classes. According to equation (6), the linear function is the dot product of two vectors
The Polynomial Kernel SVM is an extended representation of the linear kernel. It is given in equation (7).
In equation (8),
Here, c represents the intercept constant, while α, typically set to 1/M (where M is the dimension of the data), denotes the slope. This formulation serves as the foundation for advanced analyses of sprint dynamics.
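The kernel types named in this section can be written compactly in their standard textbook forms, as in the sketch below. These are not reconstructions of the study's equations (6) to (8); the function names, the default parameter values, and the equal-weight combination in `combined_kernel` are assumptions for illustration. The slope α defaults to 1/M in the sigmoid kernel, following the convention stated above.

```python
import math

def linear_kernel(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, alpha=1.0, c=1.0, degree=3):
    """(alpha * <x, y> + c) ** degree."""
    return (alpha * linear_kernel(x, y) + c) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, alpha=None, c=0.0):
    """tanh(alpha * <x, y> + c), with alpha defaulting to 1 / M."""
    if alpha is None:
        alpha = 1.0 / len(x)
    return math.tanh(alpha * linear_kernel(x, y) + c)

def combined_kernel(x, y, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four base kernels (weights are hypothetical)."""
    values = (
        linear_kernel(x, y),
        polynomial_kernel(x, y),
        rbf_kernel(x, y),
        sigmoid_kernel(x, y),
    )
    return sum(w * v for w, v in zip(weights, values))

# Two hypothetical, already standardized feature vectors.
athlete_a = [0.2, -1.1, 0.7]
athlete_b = [0.4, -0.9, 0.1]
similarity = combined_kernel(athlete_a, athlete_b)
```

In a tuned multi-kernel model, the weights and the per-kernel parameters would be among the quantities chosen by the optimizer rather than fixed by hand.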
PB optimization for sprint performance and biomechanical analysis
Polar Bear (PB) optimization is used to select the parameters of the MKSVM, giving better accuracy and faster convergence in the prediction and analysis of sprint biomechanics. PB optimization is proposed for the MKSVM framework to improve convergence speed and prediction accuracy and to handle complex biomechanical data. The polar bear optimization method is a nature-inspired swarm intelligence metaheuristic. The following four steps give an analytical representation of polar bear hunting behavior: initial population, global search using ice floes, local search, and dynamic population control.
Initial population
Create the initial population of polar bears at random first, and then use exploitation, exploration, and a dynamic population search method to identify the most effective solution in the search space. The representation of each polar bear with
Global search using ice floes
A bear in search of food typically looks around its neighborhood. If food is scarce, it relocates to a sizable, stable ice floe. When looking for food, polar bears drift on ice floes that can support their weight for long periods toward areas with a greater chance of finding seals to hunt. The algorithm mimics these strategic movements on ice floes to explore the global search space parsimoniously. The drift moves directly toward the current best solution in the population. The mathematical representation of this behavior is given in equation (10).
The
The global search approach uses the polar bear movement in optimizing ML algorithms to improve the efficiency of sprint performance and biomechanical analysis.
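The drift toward the current best solution can be sketched as below. This is a simplified stand-in rather than the study's equation (10): each bear moves a random fraction of the way toward the best position, and the move is kept only if it improves the fitness. The function names, the step distribution, and the sphere test function are assumptions; in the PB-MKSVM setting the fitness would be something like a validation error as a function of the kernel parameters.

```python
import random

def drift_toward_best(bear, best, rng):
    """Move a random fraction of the way from `bear` toward `best`."""
    step = rng.uniform(0.0, 1.0)
    return [x + step * (b - x) for x, b in zip(bear, best)]

def global_search_step(bear, best, fitness, rng):
    """Accept the drifted position only if it lowers the fitness value."""
    candidate = drift_toward_best(bear, best, rng)
    return candidate if fitness(candidate) < fitness(bear) else bear

def sphere(position):
    """Toy fitness to minimize; stands in for a model's validation error."""
    return sum(x * x for x in position)

rng = random.Random(1)
bear = [4.0, -3.0]
best = [0.5, 0.2]
moved = global_search_step(bear, best, sphere, rng)
```

Because the candidate lies on the segment between the bear and the best position, it can never end up farther from the best solution than where it started.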
Local search
When hunting in the Arctic, polar bears move asymmetrically, approaching their prey underwater, on land, or on ice. Polar bears are among the biggest and most deadly non-aquatic predators, capable of swimming hundreds of kilometers without stopping. The local search strategy models these flexible movements within a bear’s surroundings to improve solution quality around the global optimum. The Trifolium equation is used to model this behavior of polar bears. This equation has two different parameters:
For every spatial coordinate, the following system of equations (13) describes the individual movements.
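A rough sketch of the local search is given below. The radius formula r = 4a·cos(φ0)·sin(φ0) follows the form commonly quoted for Polar Bear Optimization, and the ranges for the visibility distance `a` and the angle `phi0` are likewise assumptions, because the study's own equations (11) to (13) are not available here. The per-coordinate update is deliberately simplified to a single cosine offset per dimension.

```python
import math
import random

def trifolium_radius(a, phi0):
    """Search radius r = 4 * a * cos(phi0) * sin(phi0)."""
    return 4.0 * a * math.cos(phi0) * math.sin(phi0)

def local_search_step(bear, fitness, rng, a_max=0.3):
    """Try one move inside the bear's field of view; keep it only if better."""
    a = rng.uniform(0.0, a_max)            # visibility distance (assumed range)
    phi0 = rng.uniform(0.0, math.pi / 2)   # viewing angle (assumed range)
    r = trifolium_radius(a, phi0)
    angles = [rng.uniform(0.0, 2 * math.pi) for _ in bear]
    offsets = [r * math.cos(phi) for phi in angles]
    for sign in (1.0, -1.0):               # try the "+" move, then the "-" move
        candidate = [x + sign * d for x, d in zip(bear, offsets)]
        if fitness(candidate) < fitness(bear):
            return candidate
    return bear

def sphere(position):
    """Toy fitness to minimize."""
    return sum(x * x for x in position)

rng = random.Random(7)
position = [1.0, 1.0]
history = [sphere(position)]
for _ in range(50):
    position = local_search_step(position, sphere, rng)
    history.append(sphere(position))
```

With `a` capped at 0.3 the radius never exceeds 0.6, so each step stays in a small neighborhood, which is what makes this phase an exploitation step rather than an exploration step.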
Dynamic population control
The PB optimization method depends on the reproduction of the best members of a population and the death of the worst, while 75% of the overall population is created at random at the start. Dynamic population sizing balances exploration and exploitation by managing the population size in relation to the number of fitness evaluations. Because of the severe arctic environment, PB optimization controls the population by allowing individuals to die. To determine whether an individual dies or reproduces, a constant
The size of the population will not drop below half of what it was at the beginning. Equation (15) utilizes the median of the most effective solution at
Dynamic population control in PB optimization manages model diversity, which improves the efficiency of ML algorithms for sprint performance and biomechanical analysis.
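The birth-and-death mechanism can be sketched as follows. The 75% initial population and the lower bound of half the initial size come from the text; the thresholds 0.25 and 0.75 on the random constant, the choice of mating partner from the top 10%, and the averaging rule for the offspring are assumptions based on the usual description of Polar Bear Optimization, not on the study's equations (14) and (15).

```python
import random

def control_population(population, fitness, min_size, max_size, rng):
    """Apply one death-or-reproduction event and return the ranked population."""
    ranked = sorted(population, key=fitness)   # best (lowest fitness) first
    kappa = rng.random()                       # random constant in [0, 1)
    if kappa < 0.25 and len(ranked) > min_size:
        ranked.pop()                           # the weakest bear dies
    elif kappa > 0.75 and len(ranked) < max_size:
        top = ranked[: max(1, len(ranked) // 10)]
        partner = rng.choice(top)
        child = [(a + b) / 2.0 for a, b in zip(ranked[0], partner)]
        ranked.append(child)                   # offspring of the best bear
    return ranked

def sphere(position):
    """Toy fitness to minimize."""
    return sum(x * x for x in position)

rng = random.Random(42)
max_size = 8
# Start with 75% of the maximum population, created at random.
population = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(6)]
sizes = []
for _ in range(200):
    population = control_population(population, sphere, min_size=3,
                                    max_size=max_size, rng=rng)
    sizes.append(len(population))
```

Keeping the size between a floor and a ceiling limits the number of fitness evaluations per iteration while still letting the population shrink when progress stalls and grow around promising solutions.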
Performance evaluation
Experimental setup.
The confusion matrix shows how the ML model performs activity recognition, including walking and running, which serve as transitional movement patterns relevant to understanding sprint biomechanics. Evaluating these classifications allows a comprehensive assessment of the biomechanical features associated with different phases of a sprint, emphasizing the importance of muscle activation and joint dynamics under varying speed conditions. Diagonal values correspond to correct classifications: 13,259 instances are correctly classified as walking, and 13,082 instances are correctly classified as running. Off-diagonal values are misclassifications: 115 instances of walking were classified as running, and 121 instances of running were classified as walking. The high accuracy of the model indicates its ability to distinguish movement patterns. This degree of precision is vital for sprint performance optimization, because accurate analysis of biomechanical data is critical to improving athletic performance and identifying minute variations in movement. The confusion matrix of the sprint performance is depicted in Figure 2.
Figure 2. Confusion matrix of sprint performance.
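The counts quoted above are enough to recompute the standard classification metrics for this two-class walking/running task. The sketch below does so with walking treated as the positive class, which is an arbitrary choice made for the example; the function name is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1 from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Counts quoted in the text, with walking as the positive class:
# 13,259 walking correct, 115 walking -> running,
# 121 running -> walking, 13,082 running correct.
walking_metrics = binary_metrics(tp=13259, fn=115, fp=121, tn=13082)
```

With these counts, 26,341 of 26,577 instances are classified correctly, which corresponds to an accuracy of about 99.1% on the walking/running task.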
Biomechanical feature extraction using PSD for sprint performance.
Training time in seconds was recorded for the training and validation datasets over 100 epochs. The training time initially exceeds 150 seconds but decreases sharply as the epochs pass, settling below 50 seconds after about 20 epochs. The validation time follows the same pattern, starting at about 100 seconds and leveling out at about 50 seconds. The training time graph is illustrated in Figure 3(a). The convergence time reaches its peak value of 50 seconds at around 100 epochs, indicating that the process stabilizes and attains optimal performance by the end of the training phase. The convergence time graph is illustrated in Figure 3(b). This suggests that the efficiency of the ML model increases considerably in the early epochs and stabilizes during both training and validation as it converges to a reliable solution for performance and biomechanical optimization.
Figure 3. Sprint performance: (a) training time (sec); (b) convergence time (sec).
The proposed PB-MKSVM model is compared with two existing methods for optimizing sprint performance and biomechanical analysis of track and field athletes: RF 26 and the hybrid Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) model. 26
Precision, accuracy, F1-score, and recall metrics are determined for each model. The RF model achieved accuracy (88.1%), precision (86.5%), recall (87.2%), and F1-score (86.8%), which indicates reasonably strong predictive ability. The CNN-LSTM model showed improved performance, with accuracy (92.4%), precision (90.2%), recall (91.7%), and F1-score (90.9%). The proposed PB-MKSVM model outperformed both, with accuracy (94.5%), precision (92.7%), recall (93.6%), and F1-score (92.1%), demonstrating superior ability in performance optimization and biomechanical analysis of sprint athletes. The comparison values are shown in Figure 4 and Table 3.
Figure 4. Comparison of ML models for sprint performance.
Table 3. Numerical outcomes of ML models for sprint performance and biomechanical outcome of athletes.
The R2 (coefficient of determination) and AUC metrics were used to compare the performance of the machine learning models in optimizing sprint performance and biomechanical analysis of track and field athletes. RF achieved an R2 of 0.82 and an AUC of 0.78, meaning that the model was reasonably accurate in its predictions. The CNN-LSTM model improved these values to an R2 of 0.89 and an AUC of 0.86, and the proposed PB-MKSVM model outperformed both with an R2 of 0.92 and an AUC of 0.91, thus showing better model performance in athlete analysis. The R2 and AUC values are depicted in Figure 5 and Table 4.
Figure 5. R2 and AUC results for sprint performance.
Table 4. R2 and AUC outcomes for sprint performance.
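For readers who want to reproduce these two metrics on their own predictions, the sketch below gives the standard definitions: R2 as one minus the ratio of residual to total sum of squares, and AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function names and the toy inputs are assumptions for illustration and are unrelated to the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example: four samples with binary labels and classifier scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for small evaluation sets; a rank-based formula gives the same value more efficiently on large ones.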
Discussion
PB-MKSVM uses multiple kernels, tuned by the optimization algorithm, to capture complex, phase-specific biomechanical interactions, enabling precise prediction and optimization of sprint performance metrics. The RF 26 method does not model temporal or phase-specific biomechanical interactions, and thus cannot provide high precision and accuracy in recognizing complex motion patterns. It achieved accuracy (88.1%), precision (86.5%), recall (87.2%), F1-score (86.8%), R2 (0.82), and AUC (0.78). CNN-LSTM 26 is computationally intensive, tends to overfit when data are scarce, and does not generalize well across athletes with varying skill levels. It achieved accuracy (92.4%), precision (90.2%), recall (91.7%), F1-score (90.9%), R2 (0.89), and AUC (0.86).
PB-MKSVM integrates multi-source kernel learning to compensate for the inability of RF to capture dependencies among temporal features. Its phase-specific analysis has a lower computational demand because the Polar Bear algorithm optimizes the kernel parameters while preserving robust precision and recall. Compared with CNN-LSTM, the model generalizes better across the range of datasets and skill levels involved in sprint performance.
Limitations and generalizability
Limitations in our study include the potential biases associated with wearable sensor data, such as sensor placement variability and ambient noise, which may affect the accuracy of biomechanical measurements. To mitigate this, we conducted preliminary tests to ensure consistency in sensor placement and calibrations across trials. Additionally, to address overfitting, we employed k-fold cross-validation during model training, which allowed us to evaluate the model’s performance on unseen data and to regularize weights to preserve generalizability. Performance testing was conducted across diverse athlete populations, including both male and female athletes across various skill levels and body types, ensuring that our model’s predictive capability is robust across a spectrum of training conditions and demographics.
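The k-fold procedure mentioned above can be sketched as an index splitter. The sample count of 150 matches the number of athletes in the dataset; the value k = 5, the fixed seed, and the function name are assumptions, since the study does not state how many folds were used.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx

# 150 athletes split into 5 folds (k = 5 is a hypothetical choice).
splits = list(k_fold_indices(150, 5))
```

Each athlete appears in exactly one test fold, so every prediction used for evaluation comes from a model that did not see that athlete during training. When several trials per athlete are present, splitting by athlete rather than by trial avoids leaking information between the training and test sets.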
Conclusion
Using the PB-MKSVM approach to optimize sprint performance in track and field athletes, the proposed ML model successfully identifies changes in intricate biomechanical variables, such as muscle activation patterns, joint angles, ground reaction forces, and stride length, with phase-specific precision. The results are more detailed than those of traditional methods for sprint performance and skill differentiation; the ML algorithm deepens the understanding of sprinting biomechanics and gives athletes and coaches valuable tools for improving performance and reducing injuries. The proposed ML model predicted sprint performance with the following metrics: accuracy (94.5%), precision (92.7%), recall (93.6%), F1-score (92.1%), R2 (0.92), and AUC (0.91). These results demonstrate the model’s ability to distinguish between skill levels and to predict sprinting efficiency correctly, making it a useful resource for athletes and coaches seeking further performance improvements. The limitations include reliance on accurate biomechanical data from wearable sensors, the difficulty of feature extraction, and possible overfitting of the ML model due to limited training data. Future work includes extending the dataset with more varied athlete profiles, incorporating real-time monitoring systems, and enhancing the PB-MKSVM algorithm to further increase prediction accuracy and personalization.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Conflicting interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The authors declare that the data supporting the findings of this study are available within the article. The raw/derived data supporting the findings of this study are available from the corresponding author at request.
