Abstract
This research aimed to map the application of machine learning (ML) methods in the metallurgical industry from 2000 to 2024, identifying key topics, trends, and future directions. A mixed-methods approach, combining bibliometric analysis, text mining, and content analysis, was employed. A total of 341 articles from Scopus and 249 from Taylor & Francis were reviewed following the Preferred Reporting Items for Systematic reviews and Meta-Analyses method and an ad hoc selection of 10 key articles. The analysis revealed five main research areas: (1) Advances in ML and materials science; (2) innovation in additive manufacturing; (3) applications of ML algorithms in steel metallurgy; (4) predictive analytics and modeling; and (5) artificial intelligence and deep learning applied to metallurgy. Based on these findings, five future research directions were proposed, including process optimization using deep learning, prediction of alloy behavior, sustainable waste management, integration of digital twins, and addressing ethical and regulatory challenges.
Introduction
The metallurgical industry has experienced steady growth due to the increasing demand for advanced materials, particularly in key sectors such as construction, automotive, aerospace, and the manufacturing of technological devices. This growth is not recent, as metallurgy has historically been one of the foundational pillars in industrial and technological development, dating back to the Industrial Revolution, when advancements in iron and steel production marked a pivotal change in modern societies. 1 During the 20th century, the emergence of advanced alloys and the refinement of metallurgical processes, such as heat treatment and welding, drove sectors such as aviation and defense. 2
In this context, and during this fourth industrial revolution, machine learning (ML) methods have emerged as revolutionary tools for optimizing processes, reducing costs, and discovering new material solutions. 3 This field, which combines computational algorithms and advanced statistics, has been successfully integrated into metallurgy to address challenges related to alloy design, optimization of manufacturing processes, and predictive analysis of mechanical and chemical properties.
The specific contributions of ML to the metallurgical industry are diverse and significant:
Advanced alloy design: ML allows for the exploration of millions of possible chemical combinations to develop new alloys with specific properties, such as high strength, low density, or improved corrosion resistance. Models such as random forests and support vector machines (SVMs) have facilitated the design of high-entropy alloys (HEAs) and superalloys for applications in sectors such as aviation and medicine. 4
Optimization of manufacturing processes: In additive manufacturing (AM), ML has optimized parameters such as cooling rate and material density, reducing defects like porosity and improving the mechanical properties of the produced parts. 5 In traditional processes such as casting and rolling, regression algorithms and decision trees have enabled the identification of optimal operating conditions, maximizing product quality and minimizing energy consumption. 6
Prediction of material properties: ML enables the prediction of mechanical and chemical properties, such as tensile strength, hardness, and ductility, based on chemical composition and processing parameters. This accelerates the development of new materials and reduces the need for physical experimentation. 7
Prevention of structural failures: Deep learning tools have been applied to analyze structural monitoring data, predicting failures in metallic components and increasing reliability in critical sectors such as aerospace and automotive. 8
Advanced quality control: In industrial environments, ML has been used for real-time product analysis, automatically detecting defects and ensuring higher quality standards. 3
Sustainability in metallurgical processes: By optimizing processes and reducing waste, ML contributes to the sustainability of the industry by minimizing energy and resource consumption, aligning with trends toward a circular economy. 9
The main advantage of ML in this sector is its ability to handle large volumes of complex, nonlinear data, which are typical characteristics of metallurgical systems. For example, techniques such as supervised, unsupervised, and deep learning enable the identification of hidden relationships in experimental data and the prediction of material behavior under different conditions. 10 These tools have been applied in areas such as the development of high-strength, low-alloy steels, the design of superalloys and HEAs, and the prediction of structural failures in metallic components. 8 In the field of AM, ML has revolutionized quality control and process efficiency. 3 This manufacturing method, which constructs structures layer by layer, heavily relies on precise parameters such as cooling rate and material density. Through learning algorithms, it is possible to optimize these parameters, resulting in parts with improved mechanical properties and a significant reduction in defects. 5 For instance, models based on deep neural networks have been used to predict and minimize defects like porosity, thus improving the reliability of the fabricated parts.
Another important application of ML in metallurgy is the design of new materials. Traditional approaches to alloy development are often costly and time-consuming, as they involve multiple iterations of trial and error. With the help of ML, it is possible to predict the ideal chemical composition of new alloys with specific properties. Models such as random forests and SVMs enable the analysis of millions of possible combinations and the selection of the most promising ones. 4 This has been particularly useful in the development of lightweight and strong alloys, such as titanium alloys and HEAs, which find applications in the aerospace and medical industries. In addition, ML has also been implemented in the optimization of traditional metallurgical processes, such as casting, rolling, and heat treatment. 9 Through predictive models, optimal operating conditions, such as temperature and cooling time, can be determined to maximize the quality of the final product and minimize energy consumption. For example, decision trees and regression techniques have been used to optimize the hot rolling process, improving the uniformity of the final products and reducing waste rates. 6 In specific cases of high-strength steels, ML algorithms have been used to predict critical properties such as hardness, tensile strength, and ductility based on chemical composition and processing parameters. This not only accelerates the design process but also reduces the need for costly physical experiments. 7 Similarly, in the development of superalloys for gas turbines, deep learning models have been crucial in identifying the ideal compositions that can withstand high temperatures and corrosive environments, thereby increasing the efficiency and lifespan of these components.
However, the implementation of these technologies is not without challenges. One of the main obstacles is the need for high-quality data in sufficient quantities to effectively train the models. 9 In metallurgy, data generation can be a costly and slow process, limiting the potential of ML in certain areas. Additionally, model interpretability remains a critical challenge, as many advanced algorithms, such as deep neural networks, operate as “black boxes,” making it difficult to understand the underlying relationships between system variables. 5
ML refers to a subset of artificial intelligence (AI) techniques that enable computer systems to learn patterns from data without being explicitly programmed. In materials science, ML is distinct from traditional computational modeling (such as finite element analysis (FEA) or thermodynamic simulations) in that it requires large datasets and statistical training to make predictions or classify outcomes.11–16 Unlike rule-based systems, ML can uncover nonlinear relationships and hidden patterns in complex metallurgical data.
Despite the challenges noted above, the future of metallurgy with ML looks promising. Current research is focused on combining these methods with emerging technologies such as AI, quantum computing, and advanced predictive analytics. These integrations have the potential to further revolutionize the sector, enabling the design of entirely new materials and more sustainable and efficient manufacturing processes. Ultimately, ML is not only transforming metallurgy but also laying the foundation for a smarter and more sustainable approach to managing metal resources.
Therefore, the present study aims to conduct a bibliometric analysis of the existing literature on ML methods in the metallurgical industry, using the Scopus and Taylor & Francis databases, with the objectives of: (i) capturing the scientific background of research on ML methods in metallurgy, identifying key themes and trends over the last 24 years; (ii) providing a comprehensive overview of the existing literature on the subject; and (iii) proposing future directions in the field. The research questions this study seeks to answer are: What have been the key themes and trends in research on ML methods in metallurgy over the last 24 years? What are the main contributions of ML methods to the metallurgical industry? What are the future directions in ML models for the metallurgical industry?
The rest of the article is structured as follows: The “Methodology” section presents the methodology, the “Results and discussions” section shows the bibliometric results and key research trends, and the “Conclusions” section provides the conclusions and future directions related to ML applications in metallurgy.
Methodology
Search strategies
The search string was specifically designed around the application of ML methods in the metallurgical industry, guiding the selection of keywords related to this topic. In addition, supplementary terms commonly associated with the application of ML methods in metallurgy, such as “metals,” were included. The search was conducted in the fields of title, abstract, and keywords within the bibliographic databases of both Scopus and Taylor & Francis. The initial search resulted in a total of 590 documents, distributed between Scopus (341) and Taylor & Francis (249), as shown in Figure 1.

Search chain for the information on the application of machine learning methods in the metallurgical industry.
Inclusion and exclusion criteria
In accordance with the objectives of the review, only peer-reviewed articles and literature reviews written in English were included. Books, book chapters, reports, conference proceedings, dissertations, editorials, and unpublished manuscripts were excluded. Both empirical and theoretical or conceptual studies were considered if they addressed the application of ML methods in the metallurgical industry. All articles published between 2000 and 2024 that were relevant to the research on the application of ML methods in metallurgy were included. Figure 2 provides a detailed description of the inclusion and exclusion criteria used in this study.

Inclusion and exclusion criteria.
Selection procedure
The study selection process was illustrated using the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flow diagram. 17 In the first stage, 590 articles were identified after conducting the search in the databases. Additionally, 10 extra publications that were not found through the keyword search were included in the sample, having been identified in the reference lists of the selected articles. After ensuring there were no duplicates and confirming the consistency of the search protocol, 18 a total of 600 articles were retained, and any discrepancies in their evaluation were resolved. In the final phase, 203 potentially eligible articles were fully reviewed, and 44 were excluded during the full-text review process, resulting in a final sample of 159 articles (see Figure 3).
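The record counts reported in this selection procedure can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; the variable names are illustrative, and the number of records screened out before full-text review is derived here from the reported totals rather than stated in the text.

```python
# Counts reported in the selection procedure.
scopus = 341
taylor_francis = 249
from_reference_lists = 10

identified = scopus + taylor_francis          # records found by the database search
pooled = identified + from_reference_lists    # after adding reference-list finds

full_text_reviewed = 203
excluded_full_text = 44
final_sample = full_text_reviewed - excluded_full_text

# Derived value (assumption: every pooled record not reviewed in full text
# was removed during earlier screening; the text does not report this number).
screened_out = pooled - full_text_reviewed

print(identified, pooled, screened_out, final_sample)
```

A check of this kind makes it easy to confirm that each stage of the flow diagram is consistent with the next.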

Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) methodology in the selection of documents.
Software used for bibliometric analysis
RStudio is an integrated development environment for the R programming language, widely used for statistical analysis and data visualization. Its main function is to facilitate the creation, debugging, and execution of R code, enabling complex analysis and visualizations to be carried out efficiently. RStudio includes tools for data manipulation, graph generation, statistical modeling, and the development of interactive applications such as dashboards and web apps with Shiny. Additionally, it integrates a variety of specialized packages (such as tidyverse for data manipulation and ggplot2 for visualization), allowing users to conduct advanced analysis in research and data science efficiently and reproducibly. 19 In this bibliometric analysis, RStudio was used to process and analyze bibliographic data through the specialized package “bibliometrix.” This tool allowed us to generate temporal trend graphs, such as the evolution of the number of publications and citations by year, facilitating the identification of key periods in research. It also enabled us to analyze the geographic distribution of scientific production, representing countries with the highest contributions through heatmaps. It was also used to analyze productivity indicators, such as the collaboration index between authors or institutions, providing tables and collaborative network graphs.
VOSviewer is software specialized in creating and visualizing bibliometric maps, primarily used to analyze co-citation, co-authorship, and keyword networks. Its functionality allows researchers to build maps showing relationships between authors, publications, and terms, which helps identify patterns in large bibliographic datasets. Its intuitive visual interface makes it possible to explore research topics and trends in detail, providing a structured view of scientific production in a specific field. VOSviewer is compatible with data from databases such as Scopus and Web of Science and allows exporting maps in various formats. 20 VOSviewer was mainly used for visualizing bibliometric network maps, such as keyword co-occurrence. The keyword co-occurrence map helped us identify the most relevant and emerging topics, while citation networks highlighted the most influential articles or authors.
It is important to note that RStudio and the bibliometrix package were used only for bibliographic data processing and visualization, not for applying ML algorithms to metallurgical datasets. Similarly, VOSviewer was used for network mapping, while the ML techniques referenced in the literature (e.g., neural networks, decision trees) are those applied by other authors to solve specific metallurgical problems.
Text mining
Text mining is a technique that extracts meaningful information and patterns from large volumes of textual data, transforming unstructured content into structured and useful knowledge. This process combines natural language processing, ML, and statistical analysis methods to identify relationships, topics, and trends within texts. It is widely used in various fields such as health sciences, business, and social sciences, allowing researchers and analysts to interpret textual data efficiently.21,22 By automating text analysis, text mining helps uncover hidden knowledge that would be difficult to identify manually, and it is particularly effective when applied to large datasets like academic articles, social media posts, and emails.
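As an illustration of the kind of processing text mining involves, the sketch below tokenizes a handful of abstracts, removes stop words, and counts term frequencies. It is a simplified example under stated assumptions and not the pipeline used in this study: the stop-word list and the two sample abstracts are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "in", "and", "for", "to", "a", "with", "is", "using"}

def tokenize(text):
    """Lowercase the text and return alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def term_frequencies(documents):
    """Count how often each term appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts

# Hypothetical sample abstracts.
abstracts = [
    "Machine learning for the prediction of tensile strength in steel.",
    "Deep learning for defect detection in steel surfaces.",
]
freq = term_frequencies(abstracts)
print(freq.most_common(3))
```

Term counts of this kind are the raw material for keyword clouds and topic identification; a production workflow would typically add stemming or lemmatization and multi-word term detection.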
Content analysis
Following the methods proposed by authors such as Krippendorff 23 and Mayring, 24 a qualitative content analysis was conducted to complement the quantitative findings and deepen the understanding of the topic under study. This approach allows for the identification of patterns, themes, and structures of meaning within the analyzed texts, providing a richer perspective on the theoretical and conceptual orientations dominating the research field. The most relevant articles were classified into thematic groups, and each group was examined in detail to identify key trends and recurring approaches, allowing for the construction of a comprehensive and coherent view of the field of study.
Results and discussions
To meet the proposed objectives, the results are organized into “Scientific context and trends in ML application in metallurgy,” “Identification of key topics and trends in research,” and “Future directions for research on the application of ML methods in the metallurgical industry” sections, corresponding to each of the specific research objectives.
Scientific context and trends in ML application in metallurgy
The analysis captures the scientific context of studies related to the application of ML methods in the metallurgical industry, highlighting relevant topics and predominant trends over the last 24 years.
Analysis of publication progress
Figure 4 shows the evolution in the number of publications and citations received between 2009 and 2024, revealing four clearly distinct stages: an inactivity phase, an initial low-impact phase, a period of stability, and a recent surge in scientific production.

Annual number of articles published and total citations from 2009 to 2024.
Analysis of leading journals and authors
A total of 90 journals have contributed to the 159 articles selected on the topic of ML applications in the metallurgical industry between 2000 and 2024. As shown in Figure 5, the top 10 journals have contributed 60 publications, which represent 37.74% of the total publications in this field. Furthermore, a subsequent analysis indicates that only 14 journals have 3 or more publications, accounting for 15.56% of all the sources reviewed, reflecting the concentration of knowledge within a small but influential group of journals. Among these, Computational Materials Science tops the list with 9 publications, representing 5.66% of the total publications. It is followed by Canadian Metallurgical Quarterly, with 8 articles, representing 5.03% of the publications. Other notable journals include Materials Today Communications and Mineral Processing and Extractive Metallurgy Review, each with 7 publications (4.40% each). They are followed by Journal of Alloys and Compounds, Journal of Materials Science and Technology, Materials and Design, Metallurgist, and Virtual and Physical Prototyping, each contributing 5 publications (3.14% each). Finally, Metals and Materials International rounds out the list with 4 articles, representing 2.52% of the publications.
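The percentage shares in this paragraph follow directly from the raw counts, and can be re-derived as in the sketch below. The counts are those reported in the text; the helper name `share` is illustrative.

```python
total_articles = 159
total_journals = 90

# Publication counts per journal, as reported in the text.
top_journals = {
    "Computational Materials Science": 9,
    "Canadian Metallurgical Quarterly": 8,
    "Materials Today Communications": 7,
    "Mineral Processing and Extractive Metallurgy Review": 7,
    "Journal of Alloys and Compounds": 5,
    "Journal of Materials Science and Technology": 5,
    "Materials and Design": 5,
    "Metallurgist": 5,
    "Virtual and Physical Prototyping": 5,
    "Metals and Materials International": 4,
}

def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

top10_total = sum(top_journals.values())
print("Top 10 journals:", top10_total, "articles,", share(top10_total, total_articles), "%")
print("Journals with 3+ articles:", share(14, total_journals), "% of sources")
for name, count in top_journals.items():
    print(name, share(count, total_articles))
```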

Major publications in scientific journals from 2000 to 2024.
A total of 618 authors have contributed to this field of study, of which only 25 have contributed 3 or more publications, representing 4.05% of the total authors in this field. Table 1 highlights the top 10 most relevant authors, both by the number of articles published and the impact of their work. The most prolific researchers in the field are Wang and Li, who lead with 7 and 6 articles, respectively. Most of the listed authors began their contributions after 2020, indicating a recent and growing interest in the topic. Other leading authors include Ji Y., Liu Y., and Wang H., each with 4 articles published. Additionally, it is notable that citations are concentrated among a small group of authors, such as Kim M. S., Koo G., Lee C., and Lee S. J., each with 176 citations, reflecting their impact and recognition in the field. The m-index, which is consistent at 0.2 for all authors, suggests a regular and stable contribution in relation to the number of publications and the researchers’ active time. Overall, the data highlight an emerging community with a recent and growing focus on the topic, with a select group of authors leading in terms of impact.
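For readers unfamiliar with the m-index, it is commonly defined as an author's h-index divided by the number of years since their first publication. The sketch below shows one way to compute both indicators. It is an illustration only: the convention of counting the first publication year as year 1 is an assumption (tools differ on this point), and the citation counts in the example are hypothetical, not data from Table 1.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def m_index(citations, first_pub_year, current_year):
    """h-index divided by years active (assumed: first year counts as year 1)."""
    years_active = current_year - first_pub_year + 1
    return h_index(citations) / years_active

# Hypothetical author: one cited paper, first published in 2020, evaluated in 2024.
print(h_index([10, 8, 5, 4, 3]))
print(m_index([12], first_pub_year=2020, current_year=2024))
```

Under this convention, an author with an h-index of 1 and five active years obtains an m-index of 0.2, which shows how a uniform value can arise among authors who entered the field at about the same time.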
Top 10 authors and cited authors in terms of published articles.
The number of citations received by an article is a key indicator for identifying the most influential publications in a research area. Table 2 presents the top 10 most cited articles within the dataset, highlighting their relevance in the application of ML to various fields in metallurgy. The most cited article, “Automated defect inspection system for metal surfaces based on deep learning and data augmentation,” 25 has accumulated 176 citations and addresses the automation of defect inspection on metal surfaces using deep learning, emphasizing its impact on the manufacturing industry. Another influential article is “Invited review: Machine learning for materials developments in metals additive manufacturing,” 26 with 161 citations, which reviews the role of ML in metal AM and highlights its potential to improve material design and development. Meanwhile, “Intelligent vehicle power control based on machine learning of optimal control parameters and prediction of road type and traffic congestion,” 27 with 115 citations, explores ML applications in predicting traffic conditions and optimizing energy consumption in vehicles, standing out in the field of vehicular technology. In the area of corrosion prediction, the article in reference 28, with 101 citations, presents an ML-based model for estimating corrosion rates by optimizing input parameters, providing a valuable tool for the design of durable materials. Other notable works include reference 29, on automated detection in flat metal materials using computer vision, and reference 30, on phase prediction in HEAs, with 76 and 42 citations, respectively.
Top 10 most cited articles regarding the application of machine learning methods in the metallurgical industry.
Total citations.
Citations per year.
Analysis of collaboration between institutions and countries
Between 2000 and 2024, a total of 206 institutions from 41 countries contributed to research on ML applications in metallurgy, with China leading in participation. Only 31 institutions (15.05%) published three or more articles. Table 3 lists the top 10 contributors, led by the University of Science and Technology Beijing with 59 publications (37.11%), followed by Northeastern University (USA) with 42 (26.42%). Other notable institutions include Jeonbuk National University (South Korea), Central South University, and Taiyuan University of Science and Technology (China). The dominance of Chinese and U.S. institutions highlights their leadership in this research area. The international distribution of contributions is illustrated in Figure 6.

Geographical distribution of literature on the application of machine learning methods in the metallurgical industry.
Top 10 institutions in terms of articles published.
Keyword analysis
Figure 7 is a keyword cloud representing the most prominent topics at the intersection of ML and metallurgy. In this cloud, the size of each term indicates its relevance and frequency of appearance in the analyzed studies. Words like “machine learning,” “forecasting,” “steel metallurgy,” and “learning algorithms” stand out as the central themes in this research field. This keyword analysis, based on the co-occurrence of terms, helps identify priority areas such as material property prediction, alloy design, and the use of advanced techniques like neural networks and decision trees. While this representation provides an overview of the key concepts in this field, it neither addresses the temporal evolution of these topics nor details how they interact in specific contexts, such as AM, high-strength steels, and materials microstructure analysis.

Keyword cloud from articles on the application of machine learning methods in the metallurgical industry.
Figure 8 shows a co-occurrence network of keywords in the field of ML applied to metallurgy, where colors indicate the average research year, with lighter shades representing more recent topics and darker shades for older ones. Keywords such as “machine learning,” “learning systems,” “steel metallurgy,” and “learning algorithms” stand out for their continued relevance over time. Other terms, such as “neural networks,” “high-entropy alloys,” and “additive manufacturing” highlight the growing interest in emerging areas that combine data science and metallurgy. Additionally, significant connections are observed between terms such as “tensile strength,” “powder metallurgy,” and “decision trees,” reflecting the interrelationship between material property optimization, process design, and the application of advanced ML techniques. This network illustrates both the persistence of fundamental topics and the emergence of innovative approaches, adapting to the demands for efficiency, sustainability, and innovation in metallurgical research.
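The link weights in a keyword co-occurrence network of this kind are based on how many documents mention two keywords together. The sketch below counts such pairs for a few hypothetical records. It is a simplified illustration and not the VOSviewer procedure, which additionally normalizes, clusters, and lays out the network.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count, for every keyword pair, the number of documents containing both."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical (alphabetical) order.
        unique = sorted(set(k.lower() for k in keywords))
        pairs.update(combinations(unique, 2))
    return pairs

# Hypothetical author-keyword lists for three documents.
records = [
    ["machine learning", "steel metallurgy", "tensile strength"],
    ["machine learning", "additive manufacturing"],
    ["Machine Learning", "steel metallurgy", "decision trees"],
]
links = cooccurrence(records)
for pair, weight in links.most_common(3):
    print(pair, weight)
```

Each counted pair corresponds to an edge in the network, and the count to its weight; node size would typically reflect how often a keyword occurs overall.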

Co-occurrence network of keywords on the application of machine learning methods in the metallurgical industry.
Identification of key topics and trends in research
This section addresses the second objective of the study by analyzing keyword co-occurrence data (Figure 9), which reveals five key research clusters at the intersection of ML and metallurgy: (1) Applications in titanium alloys, superalloys, and HEAs using neural networks and genetic algorithms to optimize microstructure and mechanical properties. (2) AM and process control, with ML techniques enhancing performance, sustainability, and precision in metal printing. (3) ML-driven prediction in steel metallurgy, particularly for high-strength, low-alloy steels, using methods like decision trees and random forests. (4) Forecasting models such as SVMs and regression for analyzing relationships between processing, composition, and final properties. (5) Broader applications of AI and deep learning in metals and alloys to improve properties like tensile strength and wear resistance. These clusters illustrate the growing convergence between materials science and ML, emphasizing interdisciplinary efforts to drive innovation and sustainability in the metallurgical industry (see Table 4).
Cluster 1: Advances in machine learning and materials science in titanium alloys, superalloys, and high-entropy alloys

Co-occurrence map of keywords from titles and abstracts.
Key research topics on the application of machine learning methods in the metallurgical industry.
Recent advances in ML have significantly accelerated the design and optimization of advanced metallic systems such as HEAs, superalloys, and titanium alloys. These materials are crucial in sectors like aerospace, biomedical, and energy, where mechanical performance under extreme conditions is essential. Traditional trial-and-error methods are increasingly limited by the complexity of these alloys. ML offers a scalable solution to explore vast compositional and process parameter spaces with greater efficiency.
In the case of HEAs, ML has been used to predict and optimize mechanical properties like hardness and wear resistance based on elemental composition and microstructure. For instance, Tong et al. 35 applied support vector regression (SVR) to Cr-V wear-resistant alloys, achieving high predictive accuracy (R = 0.979) in estimating wear resistance. This was made possible by incorporating microstructural descriptors such as carbide distributions into the training data. Similarly, Duarte et al. 36 developed ML models to classify the magnetic loss behavior of non-oriented steels during recrystallization, successfully using synthetic data to overcome sample scarcity. The compositional complexity of HEAs, as discussed by Pei et al., 37 creates opportunities for inverse design through neural networks trained on microstructure images, enabling the development of alloys with optimized phase stability. David et al. 38 further demonstrated how computer vision systems integrated with AI can automate billet tracking in rolling mills, enhancing traceability and process control in real time.
In the field of superalloys, which must maintain strength and stability at high temperatures, ML has contributed to identifying compositions that resist deformation and cracking. Regression models and genetic algorithms have been used to optimize strain-life fatigue parameters, as shown by Basan and Marohnić. 39 Ganesan et al. 40 emphasized the role of ML in ultra-precision cutting tool design, where understanding tool wear mechanisms is critical. Ahmed et al. 41 applied CatBoost to predict hydrogen embrittlement in pipeline steels using a dataset from 47 alloys, achieving high R² values and identifying hydrogen pressure and tensile strength as dominant features in degradation risk. Bobbili and Madhu 42 demonstrated how decision trees and SVMs accurately predicted failure modes in austempered ductile iron, with F1-scores reaching 1.0 under optimal tuning.
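The studies above report accuracy with different metrics, notably the correlation coefficient R and the coefficient of determination R², which are not interchangeable. The sketch below computes both, together with the root mean square error (RMSE), on a small hypothetical set of measured and predicted values. It is an illustration under stated assumptions and does not reproduce any of the cited models or datasets.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (spread_t * spread_p)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the predicted property."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical data: every prediction is offset by a constant bias of +1.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(pearson_r(measured, predicted), r_squared(measured, predicted), rmse(measured, predicted))
```

In this example the predictions are perfectly correlated with the measurements, yet R² is low because of the constant bias, which is why a high R alone does not guarantee accurate predictions.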
Titanium alloys also benefit from ML integration. These alloys are valued for their high specific strength and corrosion resistance, making them ideal for aerospace and biomedical components. Kačur et al. 3 used ML to predict melt temperature and carbon concentration in basic oxygen furnaces, improving endpoint estimation for titanium-based steels. Huang et al. 43 applied RBF neural networks to model the hot deformation behavior of powder metallurgy titanium HEAs, finding that grain boundary sliding dominates at low strain rates and high temperatures. Radha 44 focused on corrosion prediction in magnesium alloys for battery anodes using XGBoost, achieving a Spearman coefficient of 1.0 and an R² of 0.9931, highlighting ML's ability to manage degradation in electrochemical environments. Meanwhile, Chen et al. 45 developed ML models for predicting Charpy impact toughness in low-alloy steels, incorporating alloy composition, heat treatment, and physical properties into symbolic regression models. Related work advanced this further by integrating physical metallurgy (PM) with particle swarm optimization and SVMs to model the effect of copper content on recrystallization in antibacterial stainless steels; the resulting PM-PSO-LSSVM model accurately predicted stress behavior and provided process optimization insights.
Finney et al. 46 demonstrated how genetic algorithms can optimize wavefront control in spectroscopy, significantly improving signal reliability in metallic copper and showing that ML also benefits the instrumentation used in alloy characterization. Alarifi 47 explored AM of composite alloys, noting that ML-assisted print speed optimization enhances both resolution and production efficiency in 3D printed components. These developments are echoed by Garel et al., 48 who trained ML models on thin-film compositions of Nb-Ti-Zr-Cr-Mo HEAs, predicting optimal hardness-ductility trade-offs across compositional gradients, thus expanding the design space of multinary alloys.
Despite these successes, integrating ML into materials science still faces barriers. One study highlighted the challenge of defect prediction in laser powder bed fusion (LPBF), proposing a digital twin model combining neural networks and reinforcement learning to anticipate fusion quality in real time. Bhattacharya et al. 49 addressed the data scarcity issue in composite materials by training regression models on Cu-ZrB₂ composites produced by powder metallurgy, achieving >80% prediction accuracy on hardness despite limited sample sizes. Adnan et al. 34 proposed hybrid optimization algorithms like SVR-SAMOA to refine hyperparameters in small hydrometallurgical datasets, reducing RMSE by over 20%.
Cluster 2: Innovations in additive manufacturing and process control: material properties and advanced methods
AM, also known as 3D printing, has transformed the production of metallic components by enabling the fabrication of complex geometries with high precision, reduced waste, and customizable designs, particularly in aerospace, automotive, and biomedical applications.50,51 However, the performance of AM-produced parts strongly depends on the optimization of process parameters, including laser power, scan speed, powder morphology, and cooling rates, which directly affect the microstructure and mechanical properties of the final product.9,52,53 ML techniques have become vital tools for managing this complexity. They enable real-time monitoring, predictive modeling, and process optimization, allowing manufacturers to achieve consistent quality across diverse materials and geometries. For instance, Balaraman et al. 50 proposed a feature-driven ML framework incorporating laser pyrometer data and machine settings to predict the density of maraging steel parts printed via LPBF. Using Random Forest, SVR, and MLP models with hyperparameter optimization and feature selection, their approach achieved high predictive accuracy (R² = 0.948) and revealed the dominance of physical and machine-dependent features over optical thermal descriptors.
Several studies highlight the need for accurate microstructural control. The formation of metastable phases due to rapid cooling can enhance strength but also introduce residual stresses that compromise long-term durability.54,55 Deep learning techniques have been employed to link microstructural images with mechanical properties, enabling robust predictions under varying process conditions. 56 These efforts are supported by works that underscore the value of discrete element modeling in simulating powder behavior and surface deformation during metal fusion. The integration of human metallurgical expertise with ML has also proven effective in addressing data scarcity: physics-guided ML models, enriched with domain-specific rules, have been shown to improve prediction accuracy in AM despite limited datasets. Similarly, Wang et al. 57 employed extreme gradient boosting combined with a seagull optimization algorithm and SHAP analysis to identify critical alloying elements and processing variables influencing mechanical properties in steel.
Applications extend to sintering and post-processing. Kamal et al. 58 applied Random Forest regression to predict sintered density in nickel-alloyed bronze, enabling precise control of porosity and mechanical strength. In the design of high-entropy and multi-principal element alloys, Si et al. 59 used high-throughput ML models to determine the influence of atomic size mismatch and modulus difference on strength and ductility, achieving prediction errors below 10%. These approaches facilitate accelerated material discovery by narrowing down viable compositions prior to costly experimental trials. Powder variability and surface defects are also major concerns. Rodriguez et al. 60 emphasized the importance of sensor-assisted ML to monitor roll surface wear and identify process instabilities in high-pressure grinding rolls. In line with this, Bornikov et al. 61 integrated Bayesian networks with ML algorithms to predict superheat temperatures in continuous casting, even under uncertain input conditions, a critical capability for real-time process control in metallurgical plants.
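The atomic size mismatch mentioned above in connection with multi-principal element alloys is commonly expressed in the high-entropy alloy literature as δ = 100·sqrt(Σ cᵢ (1 − rᵢ/r̄)²), with r̄ = Σ cᵢ rᵢ. The sketch below computes this descriptor. Whether Si et al. used exactly this form is an assumption, and the atomic radii are approximate, illustrative values rather than an authoritative dataset.

```python
import math


def atomic_size_mismatch(fractions, radii):
    """Return delta (%) = 100 * sqrt(sum c_i * (1 - r_i / r_bar)^2).

    fractions: atomic fractions c_i that sum to 1.
    radii: atomic radii r_i in any consistent length unit.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("atomic fractions must sum to 1")
    # Composition-weighted mean radius.
    r_bar = sum(c * r for c, r in zip(fractions, radii))
    variance = sum(c * (1.0 - r / r_bar) ** 2 for c, r in zip(fractions, radii))
    return 100.0 * math.sqrt(variance)


# Illustrative equimolar five-element alloy; radii in angstroms are
# approximate placeholder values (assumption, not taken from the review).
radii = {"Co": 1.25, "Cr": 1.28, "Fe": 1.26, "Mn": 1.27, "Ni": 1.24}
fractions = [0.2] * len(radii)

delta = atomic_size_mismatch(fractions, list(radii.values()))
print(f"delta = {delta:.2f} %")
```

Descriptors of this kind are typically computed for every candidate composition and passed to the ML model as input features alongside the raw composition.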
Beyond prediction, ML supports sustainability. Bruinsma et al. 62 proposed a techno-economic model combining electrodialysis with reverse osmosis for salt recovery in metal refineries, demonstrating how advanced statistical modeling can reduce environmental impact and improve circularity. In AM-specific contexts, Wang and Xiong 63 modeled stacking fault energy in over 300 austenitic steels using ensemble ML, highlighting the complex, non-linear effects of minor alloying elements on deformation mechanisms, information that is critical for tailoring materials to AM environments. The integration of ML into welding, fusion, and joining technologies has also shown promise. Anandan and Manikandan 64 used Random Forest regression to predict peak temperature during friction stir welding, helping to ensure a sound microstructure with minimal defects. These methods complement findings by Amalia et al., 65 who discussed the role of ML in optimizing mechanical treatment and safety in battery recycling processes, a model transferable to metal powder reuse in AM. Finally, the future of AM lies in smart, hybrid manufacturing systems. ML-enhanced real-time feedback loops reduce defect rates and increase throughput, while new additives improve powder flowability and fusion behavior.61,63 The convergence of additive and subtractive methods, supported by ML-based process maps, promises greater flexibility and automation.9,52,64 These developments reflect a shift toward data-centric production environments where ML augments human decision-making and optimizes the entire production cycle.
Cluster 3: Applications of learning algorithms in steel metallurgy: prediction and development of high-strength, low-alloy steels
The optimization of high-strength low-alloy (HSLA) steels is a crucial endeavor in metallurgy due to the growing industrial demand for materials that exhibit superior strength-to-weight ratios, durability, and corrosion resistance. HSLA steels are widely applied in sectors such as automotive, construction, and energy infrastructure, where performance under stress and harsh environments is critical. 66 ML methods are becoming increasingly pivotal in the development of these steels, offering new paths for predictive alloy design, process optimization, and microstructural control. Recent research demonstrates the effectiveness of supervised learning algorithms, including SVMs, artificial neural networks (ANNs), and random forests, in modeling the relationships between alloy composition, processing parameters, and mechanical performance metrics such as tensile strength, ductility, and yield strength.67,68 For example, Zhi et al. 69 developed a deep ML architecture (DCCF-WKNNs) that outperformed traditional ANN, SVR, and random forest models in predicting corrosion rates in low-alloy steels. Their model could identify environmental threshold variables, such as pH and humidity, that cause significant shifts in corrosion behavior, demonstrating how ML can uncover nonlinear interactions that are difficult to detect through conventional modeling approaches.
In another notable study, an ML framework was constructed for predicting the corrosion rate of low-alloy steels in atmospheric conditions. By transforming traditional compositional descriptors into physically meaningful features and employing SHAP analysis to interpret variable importance, the authors significantly improved model generalizability across varying conditions. Their optimized XGBoost-based model showed excellent predictive accuracy and provided actionable insights into the design of more corrosion-resistant steel grades. Design optimization has also been enhanced through ML by enabling multi-objective predictions. Machaka 30 demonstrated how classifiers such as SVM, random forests, and neural networks could be used to classify and predict phase formation in high-entropy steels. Using a dataset derived from over 400 peer-reviewed experimental studies, the models achieved classification accuracies exceeding 95%, highlighting their potential for use in accelerated materials discovery. Similarly, Bernicky et al. 70 developed an ANN model to analyze flame emission spectra and predict elemental compositions in smelting powders with remarkable precision (<2% error), facilitating better feedstock control in the production of HSLA steels.
Case-specific studies provide further validation of ML's capacity to improve material design and processing. For instance, Dewangan et al. 71 used ANN models trained on nanoindentation data to predict the creep behavior of tungsten-containing HEAs, which share design challenges with HSLA steels. The model closely matched experimental data, showing that creep resistance can be effectively predicted from microstructural features. In a similar vein, Shen et al. 72 conducted a comparative analysis of PM models and ML algorithms. They found that while PM offers deeper mechanistic insights, ML models, especially those trained on large, diverse datasets, excel at generalization and accuracy. A hybrid PM-ML strategy provided optimal results, combining physical interpretability with predictive power. Process parameter optimization is another key domain where ML has delivered tangible benefits. Mutel et al. 73 applied a neural network-based Mask R-CNN to segment powder particles in SEM micrographs, linking their morphology to powder flow characteristics relevant to AM of steels. This type of predictive control can ensure consistency in feedstock quality, which is critical in advanced forming processes such as LPBF. Similarly, Mandal et al. (2023) demonstrated how SVM and decision tree classifiers could predict phase stability in HEAs based on thermodynamic and configurational parameters. These models helped identify robust alloy combinations with desirable mechanical properties and phase stability, reducing the need for costly and time-consuming experimental trials.
ML also plays a growing role in the assessment and improvement of structural integrity under extreme conditions. Sarwar et al. 74 showed that deep learning, when integrated with FEA, significantly improves the prediction accuracy for stress corrosion cracking in steel pipelines. This integrated AI-FEM approach allows for real-time evaluation of material degradation mechanisms and offers practical advantages for energy sector applications where HSLA steels are extensively used. Beyond prediction, ML is also enabling a shift from empirical to knowledge-based design by revealing hidden correlations. Diao et al. 28 employed feature reduction and creation techniques to construct ML models capable of accurately predicting corrosion behavior of low-alloy steels in marine environments. Their method not only enhanced predictive accuracy but also facilitated broader material generalization, making it applicable across different steel families. These applications underscore the transformational role of ML in steel metallurgy. By shifting from intuition-driven to data-driven strategies, engineers can explore vast composition-process-property spaces with unprecedented efficiency. As databases expand and deep learning architectures mature, these models will continue to improve in both predictive power and interpretability, paving the way toward smarter, faster, and more sustainable development of HSLA steels.50,75
Cluster 4: Predictive analysis and learning models in decision methods, random forests, and support vector machines
In the field of predictive analytics, the use of ML models has become a fundamental tool for analyzing large volumes of data, identifying patterns, and making informed decisions. This cluster focuses on three of the most prominent techniques: decision trees, random forests, and SVMs. These tools are applied across various disciplines, from industry to scientific research, with the aim of improving efficiency, optimizing processes, and making accurate predictions. 10 Predictive analytics is a branch of data science that uses statistical and ML techniques to predict future outcomes based on historical data. Predictive models allow organizations to anticipate trends, identify risks and opportunities, and make strategic decisions with greater confidence. 32 Decision trees, random forests, and SVMs are fundamental algorithms within supervised learning, where the goal is to build models capable of predicting values or classifying new data based on previous examples. 76 Each of these methods has its own strengths and applications, which are detailed below.
Decision trees are one of the most intuitive and easy-to-interpret algorithms in ML. 77 They work by splitting the data into subsets based on feature values until reaching final decisions or predictions. Each branch in the tree represents a condition based on a specific attribute, and the terminal nodes (leaves) contain the prediction or classification. 78 Decision trees are especially useful because of their ability to handle both categorical and continuous variables, making them ideal for complex problems across multiple domains such as industry, healthcare, and finance. 79 Additionally, they provide clear, visual interpretations of decision-making processes, which facilitates communication of results to stakeholders. 80 In the industrial domain, decision trees are used, for example, in optimizing manufacturing processes and quality control. 81 However, decision trees have a significant drawback: they are prone to overfitting. This means they can learn the details of the training data too well, which reduces their ability to generalize to new data. To overcome this limitation, more advanced algorithms such as random forests are used. 82 Random forests are an extension of decision trees that combine multiple trees to improve the accuracy and robustness of the model. 83 This method uses the concept of bagging (bootstrap aggregating), where multiple trees are constructed from random samples of the dataset, and their predictions are combined to produce a more accurate and reliable result. 84 One key advantage of random forests is their ability to handle noisy data and large volumes of variables without losing accuracy. 69 Additionally, they are less susceptible to overfitting due to their focus on aggregating multiple trees. In industrial applications, random forests are ideal for tasks such as predicting machinery failures, material quality analysis, and logistical optimizations. 85
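The bagging idea described above can be illustrated with a deliberately small sketch: each "tree" is a depth-1 decision stump fitted to a bootstrap sample, and the ensemble predicts by majority vote. This is an assumption-laden simplification rather than a full random forest (there is only one feature, so no random feature subsampling), and the hardness-like readings and labels are hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-feature threshold classifier (a depth-1 decision tree).

    samples: list of (x, label) pairs with label in {0, 1}.
    Returns (threshold, left_label, right_label) with the fewest training errors.
    """
    xs = sorted({x for x, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for t in candidates:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]


def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right


def fit_bagged_stumps(samples, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(fit_stump(bootstrap))
    return forest


def predict_forest(forest, x):
    """Aggregate the individual stump predictions by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical data: normalized reading -> 1 if the part was rejected, else 0.
samples = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
           (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
forest = fit_bagged_stumps(samples)
print(predict_forest(forest, 0.5), predict_forest(forest, 5.0))
```

Because every stump sees a slightly different resample, individual thresholds vary, and the vote averages out that variation. That averaging is the mechanism behind the reduced overfitting attributed to random forests.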
SVMs are another popular model in ML. 86 Unlike decision trees and random forests, which operate by recursively partitioning the data, SVMs seek to find an optimal hyperplane that divides the data into different classes. The goal is to maximize the margin between different categories, which improves the accuracy and generalization of the model. 87 SVMs are particularly useful for binary or multiclass classification problems, and they are effective even with high-dimensional datasets. 29 Their strength lies in their ability to handle cases where the classes are not linearly separable, using kernels to project the data into higher-dimensional spaces where separation is possible. 88 In practical applications, SVMs are widely used in predictive analytics in fields such as structural failure prediction, medical diagnosis, and fraud detection. 89 For example, in the automotive industry, SVMs can classify defects in components based on images and sensor data, helping ensure product quality. 33
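The margin-maximization principle can be sketched with a linear soft-margin SVM trained by subgradient descent on the regularized hinge loss. This is an illustrative simplification under stated assumptions: production solvers use dedicated optimizers and kernels, neither of which is shown here, and the two standardized sensor features, the labels, and the hyperparameter values are hypothetical.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean(hinge loss) by full-batch subgradient descent.

    points: list of feature tuples; labels: +1 or -1 for each point.
    Returns the weight vector w and the bias b of the separating hyperplane.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # point lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify by the side of the hyperplane w.x + b = 0 on which x falls."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical standardized sensor features: +1 = sound part, -1 = defective.
points = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
          (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print(predict(w, b, (2.5, 2.5)), predict(w, b, (-2.5, -2.5)))
```

Only points with a margin below 1 contribute to the update, which reflects why the fitted boundary depends on the samples closest to it (the support vectors) rather than on the whole dataset.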
While SVMs are highly accurate and robust, their complexity can be a disadvantage, as they require more computational power and may be harder to interpret than other models such as decision trees. 66 However, their ability to handle high-dimensional data and provide optimized solutions makes them indispensable in certain contexts where precision is critical. The choice between decision trees, random forests, and SVMs depends on the type of problem faced, the quantity and quality of available data, and the specific objectives of the analysis. 90 Decision trees are ideal for problems where interpretability and speed are important. They are more suitable for scenarios where the dataset is relatively small and a quick, understandable model is needed. 91 Random forests offer better accuracy and are more suitable for complex problems with large datasets and many variables. Their ability to handle noisy data makes them valuable in situations where robustness is essential. 92 SVMs are preferred for classification tasks with high-dimensional data and where precision is crucial. However, their computational complexity and lower interpretability must be considered. 93
In practice, these algorithms are rarely used in isolation. More advanced predictive systems combine multiple models and methods to achieve better results. 32 For example, random forests can be used to identify important variables, followed by SVMs to classify the data with high precision. Additionally, the use of cross-validation techniques and hyperparameter optimization helps ensure that the models are robust and generalize well to new data. 94
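Cross-validation and hyperparameter optimization, as mentioned above, can be combined in a few lines. The sketch below selects the number of neighbors for a simple one-dimensional k-nearest-neighbor regressor by 5-fold cross-validated RMSE. The regressor, the candidate values, and the synthetic linear data are all assumptions made for illustration and do not come from the reviewed studies.

```python
import random


def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_idx, val_idx); every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def cv_rmse(xs, ys, k, n_folds=5):
    """Cross-validated RMSE of the k-NN regressor for one value of k."""
    squared_errors = []
    for train, val in k_fold_splits(len(xs), n_folds):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in val:
            prediction = knn_predict(train_x, train_y, xs[i], k)
            squared_errors.append((prediction - ys[i]) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


def select_k(xs, ys, candidates, n_folds=5):
    """Grid search: keep the hyperparameter with the lowest validation error."""
    scores = {k: cv_rmse(xs, ys, k, n_folds) for k in candidates}
    return min(scores, key=scores.get), scores


# Synthetic, noise-free placeholder data (hypothetical process parameter -> property).
xs = [0.5 * i for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

best_k, scores = select_k(xs, ys, candidates=(1, 3, 5))
print(best_k, scores)
```

The key design point is that the score used to pick the hyperparameter is always computed on samples the model did not see during fitting, which is what guards against selecting an overfitted configuration.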
These practices have become standard in the construction of predictive systems across a range of industries, from manufacturing to healthcare.
Cluster 5: AI and deep learning applied to metallurgy and metals
Currently, AI and deep learning are transforming research and industrial processes in various fields, including metallurgy and metals. This cluster focuses on how these advanced technologies are revolutionizing the way metallic materials are developed, processed, and improved, opening new opportunities to optimize processes, enhance material properties, and explore new frontiers in materials science. 95 AI has become a key tool in metallurgy, where it is used to tackle complex problems related to the development and processing of metals. AI algorithms can analyze large volumes of data on material properties, chemical composition, and processing conditions to identify patterns that would be difficult to detect using traditional methods. For example, in predicting mechanical properties such as tensile strength, hardness, and ductility, AI can provide highly accurate predictive models based on experimental data. 96 This allows researchers to adjust alloy compositions and processing parameters to achieve specific properties without the need for multiple physical experiments. 97 Furthermore, AI is also essential in metal processing, helping to optimize process control. In applications such as the manufacture of complex alloys, AI can predict and adjust critical parameters such as casting temperature or cooling time, ensuring uniformity and quality in the final product. 98
Deep learning is a branch of AI that relies on ANNs with multiple layers (deep neural networks). These networks have the ability to learn complex data representations, making them especially useful in applications where the data is highly nonlinear or has intrinsic characteristics that are difficult to model with traditional approaches. 99 In metallurgy, deep learning is applied in various areas, such as predicting mechanical properties from microstructural data. For example, one study 6 developed a convolutional neural network to accurately predict tensile strength and hardness of 316L stainless steel based on metallographic images obtained from LPBF processes. Deep neural networks can model complex relationships between alloy composition, processing parameters, and resulting properties. This is crucial in the development of new materials where direct experimentation would be costly and slow. 100
The microstructure of metallic materials plays a crucial role in their final properties. Deep learning is used to analyze microstructure images, identifying phases, defects, and key features with unprecedented precision. This allows for advanced material characterization, which in turn facilitates the design of alloys with enhanced properties. 101 In processes such as AM or powder metallurgy, deep learning is used to control variables in real-time, improving process efficiency and reducing the risk of defects. 102 One of the most significant use cases of AI and deep learning in metallurgy is the improvement of advanced metallic alloys. These technologies enable the optimization of the development of superalloys and titanium alloys, which are essential in industries such as aerospace and automotive due to their high strength and low density. 103 Another area where AI has demonstrated its value is in real-time quality control and diagnostics. Deep learning-based systems can analyze sensor data and images during the manufacturing process to detect defects or deviations in material properties, allowing for immediate adjustments and minimizing waste. 104 Additionally, AI tools are transforming the management and utilization of recycled metals. By analyzing data and predicting behaviors, it is possible to classify and optimize the use of recycled materials, ensuring they meet the necessary standards for specific applications. 105 The combination of AI and deep learning is enabling more efficient exploration of the vast alloy design space. Rather than relying solely on physical experiments or traditional simulations, deep learning algorithms can predict optimal combinations of alloying elements that maximize properties such as strength, hardness, or corrosion resistance. For example, in the study of HEAs, which consist of multiple elements in nearly equimolar proportions, the design space is so vast that it would be impractical to explore it fully without the help of AI. 
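The size of the HEA design space noted above can be made concrete with a rough count. The sketch below is a back-of-the-envelope estimate under explicit assumptions: a hypothetical pool of 30 candidate elements, five-component alloys, and compositions restricted to a 5 at.% grid with every element present at no less than 5 at.%. None of these numbers come from the reviewed studies.

```python
from math import comb


def element_systems(n_candidates, n_components):
    """Number of distinct element sets of the given size."""
    return comb(n_candidates, n_components)


def gridded_compositions(n_components, step_percent):
    """Ways to split 100 % into n_components positive multiples of step_percent.

    This is a 'stars and bars' count: with u = 100 / step units to distribute
    and every element receiving at least one unit, there are C(u - 1, n - 1)
    possibilities.
    """
    if 100 % step_percent != 0:
        raise ValueError("step_percent must divide 100")
    units = 100 // step_percent
    return comb(units - 1, n_components - 1)


# Hypothetical settings, chosen only to illustrate the order of magnitude.
systems = element_systems(n_candidates=30, n_components=5)
per_system = gridded_compositions(n_components=5, step_percent=5)

print(f"element systems:         {systems:,}")
print(f"compositions per system: {per_system:,}")
print(f"total candidate alloys:  {systems * per_system:,}")
```

Even on this coarse grid the count exceeds half a billion candidates before any processing variable is considered, which illustrates why exhaustive experimental screening is impractical and why model-guided pre-selection is attractive.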
Deep neural networks can help quickly identify promising compositions that can then be synthesized and experimentally tested. 106 Despite significant advances, the application of AI and deep learning in metallurgy still faces several challenges. One of the main challenges is the availability and quality of data. AI and deep learning models require large amounts of labeled data to train effectively, and in some cases, this data may be difficult to obtain or expensive to generate. 107 Another important challenge is the interpretability of the models. Although deep neural networks are extremely powerful, they are often considered “black boxes” due to the difficulty in understanding how they arrive at their predictions. 108 This can be problematic in the industry, where it is crucial to understand the reasoning behind decisions to ensure safety and reliability. 27 However, as technology advances, so do the tools to address these challenges. The development of AI explainability techniques and the integration of AI with other methodologies, such as computational simulation, are helping to overcome these limitations. 109
Future directions for research on the application of ML methods in the metallurgical industry
Based on the insights derived from the analysis of available information on the impact of ML in the metallurgical industry, this section proposes several future directions that suggest new areas of exploration contributing to the technological and sustainable advancement of the industry. The analysis identified several knowledge gaps that require further attention, and the following research lines are proposed as priorities for future studies to address critical challenges and maximize the potential of ML in metallurgy:
Optimization of metallurgical processes using deep learning: The metallurgical industry faces complex challenges related to real-time process control and optimization. While traditional supervisory models have been implemented, deep learning technologies, such as convolutional and recurrent neural networks, offer great potential to predict and optimize critical variables like metal purity, energy efficiency, and reduction of emissions.110,111 Future research could focus on developing customized algorithms that integrate sensor data from real industrial environments, enabling continuous and adaptive improvement.
Prediction of the behavior of metal alloys using ML: The properties of metal alloys depend on multiple variables, such as composition and processing conditions. Despite advances in computational modeling, predicting the behavior of new alloys remains a challenge. 112 The application of supervised and unsupervised learning models that combine experimental data with simulations is suggested to generate robust predictions and accelerate the development of advanced materials with specific applications.113,114
Sustainability and efficiency in metallurgical waste management: ML can also play a key role in managing by-products and industrial waste, such as metallurgical tailings.115,116 In this context, it is recommended to explore automated classification technologies based on computer vision and AI to identify recycling and reuse opportunities. 117 Additionally, research integrating intelligent systems to monitor and reduce the carbon footprint of metallurgical operations would be highly relevant.118,119 For example, ML-based optimization in casting and rolling has been shown to reduce energy consumption by up to 15%, while defect prediction systems contribute to minimizing scrap and material losses. However, future studies should incorporate standardized sustainability metrics such as energy savings, waste reduction rates, or CO₂ emissions to better quantify these environmental benefits.
Integration of digital twins in metallurgical operations: The implementation of digital twins in the metallurgical sector offers significant opportunities to simulate and predict the behavior of entire systems, from raw material extraction to final production. 120 Future studies should investigate how to combine these models with ML techniques to optimize process design, forecast equipment failures, and improve operational efficiency.121,122 Despite promising academic developments, the industrial implementation of AI and ML in metallurgy faces significant challenges. These include limited access to high-quality operational data, integration difficulties with legacy manufacturing systems, lack of trained personnel, and organizational resistance to adopting data-driven decision-making. Addressing these barriers requires close collaboration between academia and industry, gradual implementation through pilot programs, and upskilling initiatives to bridge technical and cultural gaps.
Ethical and regulatory challenges associated with the use of AI in metallurgy: Although ML technologies promise to transform the industry, there are ethical and regulatory challenges that require attention. 111 Future research should address industrial data privacy, algorithm transparency, and the need for regulatory frameworks that govern their implementation. 116 This is especially important in emerging economies, where advanced technologies often face adoption barriers. 113 In addition to regulatory concerns, interdisciplinary collaboration between materials scientists and data scientists presents its own set of challenges. These include communication gaps due to domain-specific terminology, differences in methodological assumptions, and contrasting priorities such as experimental validation versus model performance. Overcoming these barriers requires the development of shared frameworks, cross-disciplinary training programs, and collaborative platforms that facilitate mutual understanding and integration of knowledge.
Design of functional and smart materials: Another promising direction for future research lies in the design of new classes of functional, magnetic, and smart materials. While ML and deep learning have demonstrated great potential in predicting mechanical and chemical properties, their application to multifunctional materials remains limited due to the scarcity of standardized, high-quality databases. These materials often exhibit complex multiphysics behaviors that are difficult to capture with current datasets. As a result, ML models in this domain tend to be less accurate and harder to generalize. Although simulations and ML can assist in pre-screening promising compositions, they cannot yet replace physical experimentation. Instead, they serve as valuable complementary tools that reduce development cycles and improve efficiency when combined with high-throughput experimentation and active learning. Future research should prioritize the development of open-access databases focused on functional materials to enable more robust and transferable ML models in this emerging area.
Conclusions
This study provides a comprehensive and updated perspective on the evolution of the application of ML methods in the metallurgical industry over the past 24 years, fulfilling three key objectives: (i) Capturing the scientific background of research on the application of ML methods in metallurgy, identifying the key themes and trends of the last 24 years; (ii) presenting an integrated view of the existing literature on the subject; and (iii) proposing future directions for research on the application of ML methods in metallurgy. To achieve this, a mixed-methods approach was adopted, encompassing bibliometric analysis, text mining, and content analysis, applied to a rigorously selected sample of 149 peer-reviewed articles obtained from the Scopus and Taylor & Francis databases, published between 2000 and 2024. To ensure the quality and relevance of the studies included, PRISMA filters were applied.
The findings of this research identified five key areas in the application of ML methods in the metallurgical industry: (1) advances in ML and materials science: exploring HEAs, superalloys, and titanium alloys; (2) innovation in additive manufacturing: process control, mechanical properties, and material performance; (3) applications of ML algorithms in steel metallurgy: prediction and development of high-strength, low-alloy steels; (4) predictive analytics and ML models: decision trees, random forests, and SVMs; and (5) AI and deep learning applied to metallurgy and metals. These areas span from the valorization of tailings and waste as new resources to the creation of regulatory frameworks supporting circular practices, as well as the adoption of digital technologies to optimize the efficiency of metal production and reduce tailings worldwide.
Regarding future research directions, several priority lines were highlighted to address fundamental aspects of the metallurgical industry and the application of ML. One of them is the optimization of metallurgical processes through deep learning, focusing on the prediction and real-time control of critical variables to maximize operational efficiency and reduce costs. The importance of predicting the behavior of metal alloys using advanced ML models is also emphasized, which will accelerate the development of new materials with improved properties. Another key aspect is sustainability and efficiency in metallurgical waste management, exploring intelligent systems for classification, recycling, and reducing the environmental impact of by-products generated in industrial processes. Additionally, the integration of digital twins in metallurgical operations is highlighted, enabling the simulation and prediction of the performance of complex systems to optimize processes and improve decision-making. Finally, the analysis of ethical and regulatory challenges related to the implementation of ML technologies in the sector is considered crucial, encouraging the development of regulatory frameworks that promote their adoption in a sustainable and responsible manner.
Footnotes
Ethical approval
This study did not involve human participants, human data, or human tissue and therefore did not require ethical approval.
Author contributions
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The datasets generated and analyzed during the current study are available upon reasonable request from the corresponding author.
