The balance between artificial and human intelligence in clinical practice

Abstract

Introduction:

Artificial intelligence (AI) is becoming increasingly integrated into clinical care in hand surgery. Its applications extend across diagnosis, planning, intraoperative assistance, postoperative monitoring, rehabilitation, prosthetics and education.

Applications:

In diagnostic imaging, AI improves the detection of distal radius and scaphoid fractures, estimates osteoporosis from hand radiographs, identifies triangular fibrocartilage complex injuries on magnetic resonance imaging, segments bones and cartilage, and supports dynamic wrist analysis; ultrasound- and neurophysiology-based models aid carpal tunnel syndrome diagnosis. Prognostic models predict outcomes after carpal tunnel release and after surgery for thumb carpometacarpal osteoarthritis, with mixed performance. Pre- and intraoperative applications include large language model-based triage and coding, navigation and phase/gesture recognition from surgical video, autonomous microsurgical prototypes and telemanipulator platforms for supermicrosurgery. Artificial intelligence-enabled telemonitoring (e.g. remote photoplethysmography) and video-based mobility tracking support postoperative care and rehabilitation. Vision-guided and multimodal sensing enhance myoelectric prosthesis control.

Risks:

Risks include data privacy and security, algorithmic bias (data, transposition, normative, annotation) and opacity, overreliance with automation bias and skill erosion, and unresolved legal and ethical questions (liability, conflicts of interest, compassion in care).

Conclusion:

Balanced adoption requires diversified datasets, privacy-preserving strategies (pseudonymization, differential privacy, federated learning), transparent reporting, AI literacy and ethics in medical education and interfaces that expose uncertainty and employ cognitive forcing functions. Post-deployment surveillance should track data drift, out-of-distribution inputs and performance using automated alerts and multidisciplinary review. Artificial intelligence should augment, never replace, clinical judgment, with explicit role delineation and continuous monitoring to safeguard equity and patient-centred outcomes.

Keywords

Artificial intelligence, clinical decision support, diagnostic imaging, data governance, machine learning ethics, hand surgery, surgical robotics

Introduction

Artificial intelligence (AI) is defined as the ability of machines to perform human-like tasks. Drawing particularly on machine learning and deep learning, it allows autonomous or semi-autonomous analysis and interpretation of data, which is especially relevant to the field of medicine (Table 1). The integration of AI into medical practice began as early as 1986, with technological advances in information management (Kalisman and Kalisman, 1986) and speech recognition (Akers, 1986). Since then, publications in clinical AI have grown exponentially (Figure 1), with the potential to profoundly transform the patient care pathway.

Table 1.

The hierarchical structure and key components of approaches in artificial intelligence, machine learning and deep learning.

Level			Element	Description
AI	Core AI Approaches		Rule-based systems	Follows explicit rules like ‘if this is true, do that’. Fully programmed by humans, they do not learn from data and always follow the same instructions
			Game playing	Capable of playing games like chess or Go by exploring possible moves, evaluating outcomes and choosing the best strategy
			Knowledge representation and reasoning	Stores facts and relationships (e.g. ‘all birds have wings’) to allow the system to infer new knowledge (e.g. ‘a sparrow has wings’)
			Propositional calculus	Uses true/false logic to draw conclusions from basic statements (e.g. if ‘A implies B’ and ‘A is true’, then ‘B is true’)
			Cognitive modeling	Attempts to create computer models that mimic human mental processes like memory or decision-making
			Planning	Involves calculating a sequence of steps or actions to reach a specific goal (e.g. a robot going from point A to B must plan its path while avoiding obstacles)
			Search algorithms	Used to explore many possible options to find a solution (e.g. finding the shortest path on a map or the best combination of pieces to solve a puzzle)
	Machine Learning (ML)	Core ML Approaches	Support vector machines	Separates data into groups by finding the best boundary between them (e.g. distinguishing photos of cats and dogs)
			Linear regression	Predicts a value based on other values, assuming a simple proportional relationship (e.g. estimating house price from size and age)
			Logistic regression	Similar to linear regression but used to predict categories by estimating the probability of each one (e.g. predicting whether an email is spam or not)
			Gaussian process regression	A more advanced method that predicts a value and shows how confident the system is in the prediction, using probabilistic models
			Random forest	Uses many models (decision trees) and combines their results to make more reliable predictions (e.g. like averaging the opinions of multiple experts)
			K-Means clustering	Organizes data into groups based on similarity, without knowing the group labels in advance (e.g. segmenting customers by shopping behavior)
		Deep Learning	Multi-layer perceptron	A type of neural network with multiple layers of units, where each layer learns more complex features, allowing the system to recognize varied patterns
			Radial basis function network	Another type of neural network that uses special functions to process data and recognize patterns or categories, often used when class boundaries are complex
			Generative adversarial network	Involves two networks in competition: one generates data (e.g. artificial images) and the other tries to detect if it’s fake, pushing the generator to create more realistic outputs
			Long short-term memory	Neural networks designed to handle sequences where order matters (e.g. text or signals), with an architecture that allows them to ‘remember’ important information over long periods
			Autoencoders	Neural networks that learn to compress information by removing unnecessary parts, then reconstruct it (e.g. for reducing data size or detecting anomalies)
			Convolutional neural network	Neural networks specially designed to analyse images, using filters that scan the image to automatically detect edges, textures and useful shapes
			Recurrent neural network	Neural networks that handle sequential data by taking into account previous inputs, which is essential for understanding text or a sequence of sounds
			Deep clustering	Groups data using representations learned by deep networks, often allowing better separation of clusters than traditional methods
			Deep reinforcement learning	Learns through trial and error using rewards and penalties, with deep networks helping the system choose actions to maximize long-term rewards
			Other DL architectures	Includes other deep learning architectures like Transformers, which are widely used today for tasks such as language translation and text generation

Figure 1.

Evolution of the number of scientific publications linking artificial intelligence and surgery, indexed in PubMed between 1986 and 2024. The x-axis represents the years and the y-axis the number of publications. After an initial increase peaking in 2013, a decline is observed until 2016, followed by exponential growth.

While this is a rapidly evolving field, this paper builds on previous work (Miller et al., 2023, 2025a) to analyse the benefits and risks of AI in hand surgery. The potential benefits are reviewed, including applications in diagnosis, surgical training and perioperative support, while addressing the challenges related to confidentiality, bias, technological dependence and ethical issues.

Benefits of AI

Artificial intelligence can contribute throughout the patient pathway. While some benefits are already in practice, others remain theoretical. Examples include detecting pathology, preoperative planning, robotic-assisted surgery, real-time analysis during surgery, providing personalized follow-up, early identification of postoperative complications and advancing research through the analysis of large datasets. Unlike humans, AI operates tirelessly and consistently, ensuring uninterrupted support across all stages of care.

Diagnostic accuracy

Artificial intelligence has been applied across various imaging modalities, most extensively to conventional radiography (Table 2). Its use for scaphoid fractures is particularly relevant given the potential consequences of a missed diagnosis and the challenges of detecting occult injuries. However, its effectiveness in identifying these fractures remains variable (Kraus et al., 2024; Langerhuizen et al., 2020). These studies should be interpreted cautiously to fully understand the scope and limitations of AI, as performance metrics alone can be misleading. It is important not to overstate the current capabilities of these models and to recognize that a binary outcome, such as fracture or no fracture, reflects probabilities rather than certainties. Therefore, an AI-generated report should not be regarded with the same consideration as a human-generated one (Miller et al., 2025b).
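The caution about performance metrics can be illustrated with a minimal sketch. It assumes a hypothetical fracture classifier with 90% sensitivity and 90% specificity applied to two hypothetical cohorts of 1000 radiographs; these numbers are illustrative only and are not taken from any of the cited studies. The names `ConfusionMatrix` and `expected_counts` are introduced here purely for the example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # fractures correctly flagged
    fp: int  # normal radiographs wrongly flagged
    fn: int  # fractures missed
    tn: int  # normal radiographs correctly cleared

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Positive predictive value: probability that a flagged case is a true fracture.
        return self.tp / (self.tp + self.fp)


def expected_counts(n: int, prevalence: float,
                    sensitivity: float, specificity: float) -> ConfusionMatrix:
    """Expected confusion matrix for a cohort of n cases (hypothetical helper)."""
    diseased = round(n * prevalence)
    healthy = n - diseased
    tp = round(diseased * sensitivity)
    tn = round(healthy * specificity)
    return ConfusionMatrix(tp=tp, fp=healthy - tn, fn=diseased - tp, tn=tn)


# Illustrative cohorts (assumed values): an enriched test set and a low-prevalence clinic.
enriched = expected_counts(1000, prevalence=0.50, sensitivity=0.90, specificity=0.90)
clinic = expected_counts(1000, prevalence=0.05, sensitivity=0.90, specificity=0.90)

for name, cm in (("enriched", enriched), ("clinic", clinic)):
    print(f"{name}: accuracy={cm.accuracy():.2f} PPV={cm.ppv():.2f}")
```

Under these assumed inputs, both cohorts yield 90% accuracy, yet the positive predictive value falls from 90% in the enriched set to about 32% in the low-prevalence clinic. This is one reason why a single headline accuracy figure says little about how a positive AI result should be weighed in practice.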

Table 2.

Applications of artificial intelligence in hand surgery and performance comparison with human experts.

Method	Application	Human performance	AI performance	Reference
Radiology	Distal radius fracture detection	Accuracy 93.7% Sensitivity 85.8% Specificity 92.2%	Accuracy 97.5% Sensitivity 90.2% Specificity 88.2%	Anttila et al. (2023b) Wong et al. (2024)
	Scaphoid fracture detection	Accuracy 84%	Accuracy 90.3%	Oeding et al. (2024)
	Perilunate dislocation detection	Accuracy 89% Specificity 88%	Precision 100% Sensitivity 100%	Majzoubi et al. (2024)
	Kienböck’s disease detection	Detectable from stage 2	Detectable from stage 1	Wernér et al. (2024)
	Intraosseous chondroma detection	/	/	Anttila et al. (2023a)
	Estimating bone density (osteoporosis)	/	/	Burton et al. (2023)
	Estimating bone maturity	/	95% agreement with ground truth	Larson et al. (2018)
CT scan	Measuring anatomical parameters of the distal radius	/	ICC 0.94–0.96	Suojärvi et al. (2021)
	Analysis of distal radioulnar joint	/	/	Roner et al. (2020)
	Wrist bone segmentation and labelling	/	/	Teule et al. (2024)
MRI	TFCC lesion detection	Accuracy 88.9% Specificity 85.2% Sensitivity 94.4%	Accuracy 90.7% Specificity 92.3% Sensitivity 88.2%	Lin et al. (2022)
	Carpal bone segmentation	/	Dice* 0.93 (healthy subjects) Dice 0.91 (osteoarthritis) Time ≈ 5 s/bone	Foster et al. (2018)
	Wrist cartilage segmentation	/	Dice 0.81, fails with osteoarthritis	Brui et al. (2020)
	Dynamic MRI-based carpal segmentation to detect scapholunate lesions	/	Dice 0.96	Radke et al. (2021)
Ultrasound	Diagnosing Palmer 1B-type TFCC lesions	/	Precision 85%	Shinohara et al. (2022)
	Tendon and synovial sheath segmentation in trigger finger ultrasound	/	Dice 0.93	Kuok et al. (2020)
	Carpal tunnel syndrome diagnosis using perineural and epineurium tissue analysis	/	Accuracy 0.96 Precision 0.99 Dice 0.86	Peng et al. (2024)
Neurophysiology	Carpal tunnel syndrome diagnosis using electromyographic analysis	/	94% < Precision < 97.1%	Bakalis et al. (2024) Tsamis et al. (2021)

*The Dice score is a number between 0 and 1 that shows how similar two shapes or areas are, often used to check how well an AI matches a human in tasks like medical image segmentation.

CT, computed tomography; MRI, magnetic resonance imaging; TFCC, triangular fibrocartilage complex.
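As a minimal sketch of how the Dice score in Table 2 is usually computed, the function below compares two binary segmentation masks using Dice = 2|A ∩ B| / (|A| + |B|). The function name `dice_score`, the toy masks and the convention for two empty masks are assumptions made for illustration; the cited studies may use different implementations.

```python
def dice_score(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks with the same number of pixels."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    overlap = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Convention assumed here: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2 * overlap / total


# Toy 3x4 masks (hypothetical): the human outline covers 4 pixels, the model 3 of them.
human = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
model = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]

print(round(dice_score(human, model), 3))  # 2*3 / (4+3) = 0.857
```

A score of 1 means the two outlines coincide exactly, and 0 means they share no pixels.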

Artificial intelligence has been used to identify perilunate dislocations on standard radiographs without error (Majzoubi et al., 2024). Current evidence indicates that AI can assist in identifying early osseous pathology with accuracy comparable to that of experienced radiologists (Wernér et al., 2024). Nonetheless, few studies have demonstrated clear superiority over expert clinicians. As large language and multimodal models continue to evolve, such advancements may become achievable, warranting discussion on how these technologies will ultimately integrate into clinical workflows and decision-making.

Applications of AI in computed tomography (CT) imaging show promise for wrist analysis, although studies remain too limited for reliable clinical use (Suojärvi et al., 2021; Teule et al., 2024).

While AI performs better with magnetic resonance imaging (MRI), it still lacks routine clinical application (Chen et al., 2024a). It can identify ligament injuries such as those of the triangular fibrocartilage complex (TFCC), automatically segment bones and cartilage with performance approaching that of experts, and analyse joint motion in real time to detect instabilities not visible on static images (Brui et al., 2020; Foster et al., 2018; Lin et al., 2022; Radke et al., 2021).

Ultrasound applications remain limited. Artificial intelligence has been able to detect Palmer type 1B TFCC lesions with accuracy similar to MRI, but it may not reliably distinguish complex foveal injuries (Shinohara et al., 2022). Other models can identify and segment tendons and sheaths in trigger finger with accuracy comparable to humans, but with no clear clinical advantage (Kuok et al., 2020).

When applying deep learning to the ultrasound diagnosis of carpal tunnel syndrome (CTS), some studies are limited by the absence of comparison with neurophysiology (Peng et al., 2024). When used alongside neurophysiology, it provides more accurate diagnostic results for CTS than ultrasound alone. Two studies have applied machine learning to neurophysiology: one analysing motor and sensory signals, with excellent performance (Bakalis et al., 2024) and the other focusing on electrodiagnostic criteria, with high accuracy (Tsamis et al., 2021).

Artificial intelligence has the potential to assist diagnosis and prognosis by integrating medical history, symptoms, clinical examination, imaging and laboratory results. Diagnostically, AI is already used in other specialties such as dermatology and neurology (Brancaccio et al., 2024). It can differentiate skin lesions with 72% accuracy, compared with 66% for dermatologists (Edge Health, 2024; Esteva et al., 2017), and can detect nerve palsies from photographs of the hand with up to 99% accuracy (Gu et al., 2022).

The use of AI in diagnosis prediction has been studied for CTS. One model using 11 clinical variables achieved 76.6% accuracy but remained inferior to neurophysiological examination (Park et al., 2021). Another study predicted postoperative outcomes using the Minimal Clinically Important Difference, achieving 71.8% accuracy for function and 75.9% for pain (Harrison et al., 2023). A third study compared AI with surgeon prediction of outcome at 6 months, with AI outperforming surgeons: 78% accuracy and 85% sensitivity vs. 65% and 72% (Loos et al., 2024). Conversely, in carpometacarpal osteoarthritis, AI performed worse than surgeons, with better performance when predicting function rather than pain after trapeziectomy (Loos et al., 2022).

Despite promising results, current AI applications in musculoskeletal imaging and diagnosis often rely on limited datasets, lack external validation and oversimplify clinical complexity. Reported accuracies may not generalize across populations or settings. Caution is warranted before integrating these tools into practice, as overreliance risks misdiagnosis and clinical overconfidence.

Surgical planning and assistance

Pre-operative

Artificial intelligence may support decision-making in surgical patients and assist in surgical planning where appropriate.

Large language models (LLMs), including ChatGPT and Google Gemini, are designed to process and generate natural language. One study compared their performance on 68 hypothetical hand injury cases, evaluating classification and treatment recommendations. When rated by a surgeon certified by the American Society for Surgery of the Hand, Gemini outperformed ChatGPT, with 70.6% correct classifications (vs. 26.5%) and greater treatment accuracy (97.8% vs. 88.9%). Although the absence of human expert comparison was a limitation (Pressman et al., 2024), the study demonstrated a baseline level of clinical ability.

Several studies have explored the use of AI for preoperative planning in hand surgery, but none has yet resulted in a clinically validated application (Ryhänen et al., 2025). Current AI analyses of the distal radioulnar joint remain largely experimental, with limited integration into surgical planning software (Roner et al., 2020). Liu et al. (2019a) proposed a system to assist with Kirschner wire placement for scaphoid fracture fixation, although final trajectory decisions still rely on the surgeon’s judgment. The future probably lies in hybrid approaches, where AI assists with quantitative modelling, trajectory optimization and anatomical risk prediction, while the surgeon integrates this information with intraoperative findings and clinical reasoning. In such settings, the AI-assisted surgeon may ultimately achieve greater precision and consistency than either AI or human expertise alone.

Perioperative

Artificial intelligence could improve visualization, provide real-time video-based guidance and automate selected operative tasks.

In knee arthroscopy, an approach potentially applicable to the wrist, AI allows simultaneous correction of image noise, blurring and colour imbalance, surpassing conventional techniques (Ali et al., 2023). Artificial intelligence techniques have also been shown to efficiently remove endoscopic smoke (Wang et al., 2023), improving clarity, a potential benefit for robotic minimally invasive peripheral nerve surgery. In the operating room, analysis of personnel flow can identify potential error sources and contribute to improving safety and efficiency. However, the use of cameras in this context raises privacy concerns, limiting acceptance and widespread adoption. To address this, tracking operating room movements while also automatically obscuring faces has been proposed (Bastian et al., 2023).

Analysis of surgical videos represents another application in the operating room. Using cameras integrated into surgical lights or on head mounts, recordings can train AI models to identify anatomical structures, operative phases and technical gestures. Examples of this include: segmentation of carpal bones in arthroscopy (Orgiu et al., 2024), automatic phase detection in distal radius fracture fixation (Graëff et al., 2025), interpretation of surgical gestures (Goodman et al., 2024) and assessment of microsurgical metrics (operative time, motion smoothness and distance travelled), with strong correlation to both expert-rated scores and surgeon experience (Sugiyama et al., 2024).
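To make the microsurgical metrics above more concrete, the following sketch computes distance travelled and one possible smoothness proxy from a tracked instrument-tip trajectory. The exact definitions used by Sugiyama et al. (2024) are not given here, so mean squared jerk (a commonly used smoothness measure, where lower values indicate smoother motion) is swapped in as an assumption. The function names, the sampling interval and the toy trajectories are hypothetical.

```python
import math


def path_length(points):
    """Total distance travelled along a sequence of (x, y) positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def mean_squared_jerk(points, dt):
    """Smoothness proxy (assumed metric): mean squared third finite difference of position."""
    if len(points) < 4:
        raise ValueError("at least four samples are needed to estimate jerk")
    total = 0.0
    for p0, p1, p2, p3 in zip(points, points[1:], points[2:], points[3:]):
        # Third-order forward difference approximates the third time derivative.
        jerk = [(d - 3 * c + 3 * b - a) / dt ** 3
                for a, b, c, d in zip(p0, p1, p2, p3)]
        total += sum(j * j for j in jerk)
    return total / (len(points) - 3)


# Toy trajectories sampled every 0.1 s (hypothetical values, arbitrary units).
steady = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
shaky = [(0.0, 0.0), (1.0, 0.5), (2.0, -0.5), (3.0, 0.5), (4.0, 0.0)]

for name, traj in (("steady", steady), ("shaky", shaky)):
    print(name, path_length(traj), mean_squared_jerk(traj, dt=0.1))
```

In this toy example the steady trajectory has zero jerk, whereas the shaky one covers a longer path with a markedly higher jerk value. Operative time would simply be the number of samples multiplied by the sampling interval.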

In microsurgery, the µSTAR autonomous robot performs microvascular anastomoses using a motorized suturing device, micro-camera and optical coherence tomography sensor. Tested on an ex vivo model, it completed 90% of sutures without human intervention, with precision matching that of surgeons, though with a longer average time per stitch (353 vs. 141 s) (Haworth et al., 2024).

Real-time feedback in postoperative care

Postoperatively, AI is taking an increasing role, from monitoring microvascular flaps and assessing joint mobility to optimizing rehabilitation protocols and coding surgical procedures.

Video-based techniques, such as remote photoplethysmography, allow monitoring of physiological parameters (perfusion, heart rate, oxygen saturation) in replantations and free flap surgeries, offering an alternative to traditional human observation (Chen et al., 2024b). Miniaturization of sensors now allows AI-powered portable flap telemonitoring. In a controlled ischemia simulation (tourniquet in healthy volunteers), remote photoplethysmography-driven AI detected vascular alterations with accuracy similar to pulse oximetry and manual evaluation (Lu et al., 2025).

Artificial intelligence is also being used to track changes in hand and wrist movements via video analysis (Exer AI, 2024) and evaluate joint movements (Dutrey et al., 2025). One AI system, combining wearable sensors with real-time acoustic feedback, corrects gait after hip arthroplasty and shortens recovery time (Alcaraz et al., 2018). A similar application has also been developed for hand rehabilitation (Bauknecht et al., 2025).

Large language models can also be used for coding hand surgery procedures. One study reported 100% accuracy for Perplexity.AI and 93.3% for Bard and Bing, while ChatGPT-4o and ChatGPT-3.5 reached 53.3% and 46.7%, respectively. For complex procedures, only Perplexity.AI and Bard achieved 60% (Isch et al., 2025).

Robotic surgery

In surgical robotics, telemanipulators replicate the surgeon’s movements in real time, enhancing precision and filtering tremors while keeping the surgeon in full control of each movement. Autonomous or semi-autonomous surgical robots, particularly in orthopaedics, can perform specific tasks such as implant positioning or trajectory planning based on preoperative data (Lim et al., 2025).

Although AI has yet to be integrated, telemanipulator-assisted surgery is now being applied to peripheral nerve injuries (neurolysis, repair, graft) through incisions smaller than 1 cm (Jiang et al., 2025). The Da Vinci® system, offering magnified 3D vision and high precision, is already used for such operations. New platforms are emerging, including Symani® (MMI™) and MUSA-3® (Microsure™). Symani is tailored for microsurgery on delicate structures (0.2–0.3 mm), with instruments offering seven degrees of freedom, active tremor suppression and motion scaling (Innocenti et al., 2023). MUSA-3 allows standard surgical instruments to be used with exceptional accuracy via an intuitive console and a stabilized robotic arm (van Mulken et al., 2020). These systems improve precision, safety and surgeon ergonomics in complex operations.

Early semi-autonomous robots were initially developed without artificial intelligence, including for wrist arthroscopy (Liverneaux et al., 2016) and percutaneous scaphoid screw fixation under navigation (Liverneaux, 2005). More recently, new semi-autonomous systems integrating AI have emerged, particularly for navigation-guided bone fixation. These robots enable precise fracture reduction and optimization of fixation implants, with indications including scaphoid fractures and non-unions (Liu et al., 2019b), perilunate fracture-dislocations (Yi et al., 2023), hamate fractures (Jie et al., 2022) and partial carpal arthrodeses (Gao et al., 2024).

To date, the only truly autonomous robots in surgery remain at the experimental stage, mainly used for performing microvascular anastomoses, such as the µSTAR robot.

Electronic hand prostheses

Currently, hand prostheses are controlled through voluntary contractions of residual limb muscles, producing electromyographic signals that are processed and translated into motor commands to drive the device. However, variability in electromyographic signals can limit accuracy, sometimes causing errors in determining the intended grasp type (Castellini et al., 2014).

To address these errors, alternative strategies are under development. For example, radio frequency identification technology offers reliable detection of objects placed near a sensor, but only if they are fitted with an embedded electronic chip (Trachtenberg et al., 2011). Since this is impractical, alternative solutions use AI-driven image recognition to identify objects in real time and select the correct grasp pattern. One approach uses three head-mounted cameras, although this raises ergonomic issues (Markovic et al., 2015). Another integrates a camera into the prosthetic palm, achieving 93.2% accuracy in object recognition (DeGol et al., 2016). A third combines image recognition with a multimodal sensing system (distance sensor, accelerometers, gyroscopes), achieving 91.8% success in object manipulation, 88.6% in functional tasks (YCB Gripper Benchmark) and an average grasp time of 0.73 s (Weiner et al., 2022).

Education and training

Artificial intelligence holds potential for training hand surgeons, reviewing scientific literature and educating patients.

Large language models can help create clinical scenarios, formal lectures and multiple-choice questions (Siu et al., 2023), and have been used to compare the difficulty of hand surgery board exams (Hasan et al., 2025). ChatGPT can deliver step-by-step guidance for procedures such as microvascular arterial anastomosis and thumb pollicization, but its inaccuracies risk misleading trainees (Mohapatra et al., 2023).

Training in Da Vinci® robotic surgery uses simulators such as Mimic® to objectively assess technical skills through a standardized global score calculated from parameters like applied force, collisions and instrument visibility (Egi et al., 2013). Artificial intelligence can identify surgeons’ expertise from simulated videos with 83–100% accuracy (Juarez-Villalobos et al., 2021). Robotic microsurgery training programs have also been implemented (Selber and Alrasheed, 2014).

Surgical workflow recognition applies AI to segment operations into discrete steps, enabling performance assessment, skill standardization and prompt error feedback (Garrow et al., 2021). Training such AI models demands substantial annotated datasets, a limitation partly mitigated using different AI learning strategies (Demir et al., 2023). The uncertainty- and cluster-aware temporal diffusion method enhances surgical workflow recognition by incorporating clustering and self-supervised spatial features, shortening training time while improving accuracy (Graëff et al., 2025).

Although most publishers forbid LLM-based peer-reviewing, some studies have explored it: with generic prompts, performance was poor, whereas targeted instructions enabled ChatGPT to produce evaluations comparable with those of human experts (Marrella et al., 2025).

Many patients turn to LLMs to better understand their condition, but quality hinges on accuracy, completeness and readability. ChatGPT 3.5 scored 4.83/6 for accuracy and 2/3 for completeness (Jagiella-Lodise et al., 2025) when asked about CTS, yet occasionally provided advice lacking scientific support, such as recommending non-steroidal anti-inflammatory drugs not endorsed by guidelines (Amen et al., 2024).

The readability of ChatGPT was lower than that of a Google search (Croen et al., 2025) and judged inferior to that of Mayo Clinic or WebMD for several conditions (CTS, trigger finger, Dupuytren’s disease, ganglion cyst) by surgeons (Pohl et al., 2024), but considered equivalent for CTS by patients (Pohl et al., 2025).

Risks

Data privacy and security

Training AI in healthcare relies on large datasets (Akyüz et al., 2024), raising issues of privacy and personal data use, where breaches can cause ethical concerns and direct harm to patients (Cohen et al., 2014). Whether access to patient data for training is legitimate depends on the purpose: public health vs. commercial gain (Faden et al., 2013). Even if patient data is fully anonymized, the issue of whether patients can opt out of their data being used for training or other purposes remains.

Public mistrust of health data use is warranted, given previous unauthorized sharing (Royal Free London NHS, Cambridge Analytica) (Dawson et al., 2019). Risks include discrimination by employers or insurers (Calo, 2011) and personal harms, including anxiety from exposure of sensitive details (Price and Cohen, 2019). Artificial intelligence may also infer information never disclosed, generating intrusive or inaccurate profiles outside current laws (Crawford and Schultz, 2014).

Patient consent for using health data is essential. Dynamic consent strengthens security by requiring authorization for each use but limits large-scale processing (Kaye et al., 2015). Broad consent enables wide data sharing, sometimes without direct oversight, as seen in some biobanks (Grady et al., 2015).

Bias and inequity

The integration of AI may introduce biases that undermine the reliability of results and compromise equity in healthcare, since large language models are trained on vast internet-based datasets that inherently mirror cultural and societal biases, particularly those rooted in Western perspectives and values (Table 3). Consequently, data biases can perpetuate these prejudices and social inequalities, leading to the exclusion of vulnerable groups. These forms of discrimination are often unintentional and systemic, making them difficult to detect and to challenge in court (Barocas and Selbst, 2016). Such biases have been documented in the prediction of recidivism, health status, insurability and disease risk (Kostick-Quenet and Gerke, 2022).

Table 3.

Main biases related to artificial intelligence in healthcare, from design to clinical use.

Intrinsic biases in AI	Biases related to human users
Data	Cognitive
Non-representative data (e.g. gender, ethnicity, socioeconomic status)	Overreliance on AI recommendations (automation bias)
Incomplete or imbalanced datasets	Disregard for relevant AI advice
Errors or subjectivity in manual annotation	Progressive loss of clinical expertise (erosion of professional skills)
Algorithmic	Ethical and regulatory
Underestimation or overestimation of performance (evaluation bias)	Reduction of human interaction in healthcare delivery
Reproduction of social or historical discrimination	Risk of inappropriate commercial use (profit-driven recommendations)
Lack of model transparency (black box effect)	Uncertainty regarding liability in the event of errors
Implicit prioritization of certain medical values (e.g. survival over quality of life)
Valid results in laboratory settings that are not reproducible in clinical practice
Privacy
Unintended inferences drawn from data
Excessive or intrusive profiling

Transparency in reporting how a model was developed and trained is important for safe clinical use. In one analysis of 1.7 million responses from nine LLMs to 1000 emergency medical cases, clinical recommendations were influenced by sociodemographic profile: with identical patient data, LLMs were more likely to propose unwarranted interventions for LGBTQIA+, Black, or homeless patients and to recommend advanced imaging for those with higher incomes (Omar et al., 2025).

Transposition bias arises when AI systems that perform well in laboratory settings fail to achieve comparable results in clinical practice. This gap is illustrated by the low proportion of randomized trials in hand surgery (26%) within a body of research dominated by retrospective studies (Keller et al., 2023) and by the limited clinical adoption of AI despite the rapid growth of the field (Nair et al., 2024).

Some algorithms embed normative bias by prioritizing certain values (longevity over quality of life), potentially disregarding patient preferences and reinforcing algorithmic paternalism (Quinn et al., 2021). Supervised learning, relying on error-prone manual annotations, may introduce annotation bias that undermines AI reliability (Hashimoto et al., 2018), as demonstrated by a wrist surgery workflow recognition study, showing substantial inter- and intra-annotator variability (Graëff et al., 2023).

Many AI systems function as ‘black boxes’, where recommendations are given without a clear explanation of how they are reached. This lack of transparency may be due to the complexity of the underlying algorithms or to proprietary reasons (Hassan et al., 2024).

Over-reliance on technology

The growth of AI has prompted safeguards against algorithmic bias, yet cognitive biases remain underexplored. Lacking critical expertise, users may develop trust bias: underuse (omission errors) or overreliance (commission errors) (Hasanzadeh et al., 2025).

Omission is less concerning at present, as AI’s role remains modest and the physician’s independent judgement continues to serve as the standard. Commission errors can be serious: studies on human–machine interaction show that users tend to accept the outputs of automated systems, even when unreliable (Gerke, 2021). Several factors contribute to automation bias: non-experts are more likely to trust algorithms than human advice, and users tend to follow ‘black box’ AI recommendations despite their lack of transparency. In contrast, expertise tends to mitigate this bias, and forming one’s own estimate beforehand reduces blind trust (Logg et al., 2019). Quantitatively, automation bias increases human error risk by 26% when AI is wrong (Goddard et al., 2012), and erroneous AI advice leads users to overturn 7% of initially correct assessments (Rosbach et al., 2025). Overreliance on AI can also lead to progressive clinical skill loss, or skill erosion (Samuel et al., 2024).

Regulatory and ethical concerns

Artificial intelligence poses challenges that extend beyond technical matters to legal uncertainties and ethical issues. Legally, regulatory frameworks lag behind technological progress, leaving liability for harm unclear (Cestonaro et al., 2023). Responsibility may be shared between the physician, the software supplier, the algorithm developer, and the data provider (Price et al., 2024). The lack of clear safety and liability frameworks slows the uptake of these technologies (Ahmed et al., 2023).

Ethically, several concerns emerge: (1) the potential replacement of healthcare professionals by automated systems (Chustecki, 2024); (2) the risk that some algorithms are deliberately biased to promote lucrative procedures or products contrary to clinical guidelines (Knudsen et al., 2024); and (3) the erosion of human qualities such as compassion in care relationships (Klugman and Gerke, 2022), although some hybrid systems aim to integrate human psychology with ‘artificial empathy’ (Morrow et al., 2023).

Prerequisites to restore balance between risks and benefits

To balance risks and benefits, AI requires rigorous oversight from design through to clinical use (pre- and post-development). An international consensus (FUTURE-AI), involving 117 experts from 50 countries, produced recommendations to mitigate cognitive, ethical, regulatory and AI-specific biases (Supplementary table 1) (Lekadir et al., 2025).

(1) Cognitive – many experts stress that medical education should include proficiency in AI, critical appraisal of its outputs, and competence in data science and decision-making (Grunhut et al., 2022). Training may need to focus more on information management than on rote memorization (Wartman and Combs, 2019). Explainable AI is intended to mitigate automation bias, but overly complex explanations may be disregarded, so explanations need to be clear and accessible (Vasconcelos et al., 2023). Cognitive forcing strategies, such as forming an initial judgement or displaying AI uncertainty, are more effective at mitigating this bias than explainable AI alone (Bucinca et al., 2021).

(2) Ethical – one study advocates teaching AI ethics in medicine using real-world cases (Katznelson and Gerke, 2021).

(3) Regulatory – evidence-based guidelines for the publication of clinical trials on medical AI have been proposed, including a description of AI type, clinical role, integration into the care pathway, data quality, algorithm version, human–AI interaction, error analysis and conditions of access to the tool or its code (Liu et al., 2020).

(4) Data-related – diversifying AI training datasets is essential (Cross et al., 2024). Privacy can be protected through pseudonymization (replacing direct identifiers), differential privacy (adding random noise to data) and federated learning (training a shared AI model on data held across multiple sites without pooling the data centrally). Regular audits and strengthened security standards help prevent misuse (Rieke et al., 2020). Integrating the necessary infrastructure into clinical systems is expensive and requires expertise. Reliance on multiple applications on personal devices that do not integrate with hospital systems limits usability.
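The three privacy techniques named in point (4) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method described by the cited authors: the function names (`pseudonymize`, `private_count`, `federated_average`), the keyed hash used for pseudonymization, the Laplace mechanism used for differential privacy and the sample-size-weighted averaging used for federated learning are all choices made here for clarity. A real deployment would also need key management, a privacy budget and secure communication between sites.

```python
import hashlib
import hmac
import random


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    Hypothetical helper: the same ID and key always give the same pseudonym,
    so records can still be linked without exposing the original identifier.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with added random noise (Laplace mechanism).

    A count changes by at most 1 when one patient is added or removed,
    so the noise scale is 1 / epsilon. Smaller epsilon means more noise.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


def federated_average(site_weights, site_sizes):
    """Combine model parameters trained at each site, weighted by sample size.

    Only parameters leave each site; patient-level data stay local.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the illustration is repeatable
    print(pseudonymize("MRN-000123", b"hypothetical-secret-key"))
    print(private_count(128, epsilon=1.0, rng=rng))
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this sketch the noise is added to an aggregate statistic rather than to individual records, which is one common way of applying differential privacy; the weighting by site size means that a hospital contributing more cases has proportionally more influence on the shared model.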

Monitoring the balance of AI

After deployment, AI performance can decline if real-world conditions differ from those used in training (US Food and Drug Administration, 2023). Two scenarios are recognized: data drift, where input data change gradually (e.g. more images of smokers for AI trained on adult chest radiographs), and out-of-distribution data, where inputs differ markedly from the training data (e.g. radiographs of knees or of children). These situations require post-deployment surveillance, and the FUTURE-AI consensus advocates continuous monitoring (Lekadir et al., 2025) (Table 4).

Table 4.

Recommendations from the international FUTURE-AI consensus for monitoring the performance of artificial intelligence in healthcare after market release.

Objective	Recommendation	Description
AI monitoring	Traceability 1	Implement risk management throughout the AI lifecycle by identifying potential risks (e.g. errors, misuse, technical failure) and defining concrete measures such as alerts or system shutdown mechanisms
	Traceability 2	Create comprehensive AI documentation, including a summary for professionals, a technical file for developers and a risk management report
	Traceability 3	Set up automatic and continuous quality control to detect errors in input data or outputs, and alert the user in case of anomalies
	Traceability 4	Schedule regular audits to verify AI reliability, detect drifts or performance loss, and update the model if needed
	Traceability 5	Implement an AI logging system – an automatic and secure recording of user actions, data used, and AI outputs – to ensure full traceability while respecting privacy
	General 4	Define a clear evaluation plan with test data strictly separated from training data, appropriate performance metrics, and comparison with current clinical practice

For example, data drift can be monitored by continuously tracking input data and AI performance with specialized tools that trigger alerts and initiate retraining when drift occurs (Sahiner et al., 2023). In one study, human CT scan reports were reviewed by an LLM to verify the performance of a pulmonary embolism detection AI model. Disagreements were tracked and alert thresholds were set to prompt a human review (Sorin et al., 2025). In practice, implementing post-deployment monitoring may be difficult and resource-intensive. It is also unclear who would be responsible for the task.
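The idea of tracking disagreements against an alert threshold can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the cited study: the class name `DisagreementMonitor`, the rolling window of 100 cases and the 10% alert rate are hypothetical values chosen for the example, and in practice they would be set by the team responsible for the tool.

```python
from collections import deque


class DisagreementMonitor:
    """Track how often the AI output disagrees with a reference reading.

    Hypothetical sketch: the reference could be the radiologist's report
    (or a label extracted from it). An alert is raised when the rolling
    disagreement rate exceeds a preset threshold.
    """

    def __init__(self, window: int = 100, alert_rate: float = 0.10):
        self.recent = deque(maxlen=window)  # keeps only the latest cases
        self.alert_rate = alert_rate

    def rate(self) -> float:
        """Share of recent cases in which AI and reference disagreed."""
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def record(self, ai_positive: bool, reference_positive: bool) -> bool:
        """Store one case; return True when a human review should be prompted."""
        self.recent.append(ai_positive != reference_positive)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and self.rate() > self.alert_rate


if __name__ == "__main__":
    monitor = DisagreementMonitor(window=10, alert_rate=0.2)
    for _ in range(10):
        monitor.record(ai_positive=True, reference_positive=True)
    for _ in range(3):
        alert = monitor.record(ai_positive=True, reference_positive=False)
    print(monitor.rate(), alert)
```

Waiting until the window is full before alerting is a design choice that avoids false alarms from the first few cases; a shorter window reacts faster to drift but is noisier.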

To monitor out-of-distribution data, one radiology study suggested encoding each image as a numerical vector, comparing it with a reference and flagging deviations. In testing across various scenarios, the system detected over 95% of anomalies and identified drift in under three days (Zamzmi et al., 2025).
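The principle of comparing image vectors with a reference can be illustrated as follows. This is a simplified sketch and not the method of the cited study: it assumes that each image has already been encoded as a list of numbers by some feature extractor (not shown), and it uses a hypothetical control limit of the mean reference distance plus three standard deviations, in the spirit of a control chart.

```python
import math
import statistics


class OODDetector:
    """Flag image vectors that lie unusually far from a reference set.

    Hypothetical sketch: `reference_vectors` stands for encodings of images
    similar to the training data; `k` sets how strict the control limit is.
    """

    def __init__(self, reference_vectors, k: float = 3.0):
        n = len(reference_vectors)
        # Centre of the reference set: mean of each vector component.
        self.center = [sum(col) / n for col in zip(*reference_vectors)]
        dists = [math.dist(v, self.center) for v in reference_vectors]
        # Control limit: mean distance plus k standard deviations.
        self.limit = statistics.mean(dists) + k * statistics.pstdev(dists)

    def is_out_of_distribution(self, vector) -> bool:
        """Return True when the vector lies beyond the control limit."""
        return math.dist(vector, self.center) > self.limit


if __name__ == "__main__":
    reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    detector = OODDetector(reference)
    print(detector.is_out_of_distribution([0.6, 0.4]))  # close to the reference
    print(detector.is_out_of_distribution([5.0, 5.0]))  # far from the reference
```

Counting how many recent images are flagged, rather than reacting to a single one, would be a natural extension for detecting gradual drift.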

Algorithm to restore balance

Clinical monitoring of AI should identify early performance declines via threshold alerts and generate regular reports on errors and performance. These should be reviewed by a multidisciplinary committee to steer evidence-based continuous improvement (van Leersum and Maathuis, 2025).

Equally important is defining the respective roles of AI and the clinician. Some tasks may be delegated to AI while preserving human control over the final decision (Tanaka et al., 2023). Artificial intelligence should never supplant human judgement, particularly given persistent biases affecting vulnerable or underrepresented groups (Mennella et al., 2024).

In conclusion, the integration of AI is inevitable in all areas of hand surgery. However, realizing its benefits demands vigilance: reliable data, clear regulations, continuous oversight and education. As a support tool, AI should remain under human control, in the service of patients.

Supplemental Material

Supplemental material, sj-docx-1-jhs-10.1177_17531934251401382 for The balance between artificial and human intelligence in clinical practice by Domenico Marrella, Turkka Anttila, Jorma Ryhänen, Robert Miller, Bo Liu and Philippe Liverneaux in Journal of Hand Surgery (European Volume)

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

Supplemental material for this article is available online.

References

1. Ahmed, Spooner, Isherwood, Lane, Orrock, Dennison. A systematic review of the barriers to the implementation of artificial intelligence in healthcare. Cureus. 2023, 15: e46454.

2. Akers GA. Using your voice: speech recognition technology in medicine and surgery. Clin Plast Surg. 1986, 13: 509–11.

3. Akyüz, Cano Abadía, Goisauf, Mayrhofer MT. Unlocking the potential of big data and AI in medicine: insights from biobanking. Front Med. 2024, 11: 1336588.

4. Alcaraz, Moghaddamnia, Poschadel, Peissig. Machine learning as digital therapy assessment for mobile gait rehabilitation. In: 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark. New York, IEEE, 2018: 1–6.

5. Ali, Jonmohamadi, Fontanarosa, Crawford, Pandey AK. One step surgical scene restoration for robot assisted minimally invasive surgery. Sci Rep. 2023, 13: 3127.

6. Amen, Torabian, Subramanian, Yang, Liimakka, Fufa. Quality of ChatGPT responses to frequently asked questions in carpal tunnel release surgery. Plast Reconstr Surg Glob Open. 2024, 12: e5822.

7. Anttila, Aspinen, Pierides, Haapamäki, Laitinen, Ryhänen. Enchondroma detection from hand radiographs with an interactive deep learning segmentation tool: a feasibility study. J Clin Med. 2023a, 12: 7129.

8. Anttila, Karjalainen, Mäkelä, et al. Detecting distal radius fractures using a segmentation-based deep learning model. J Digit Imaging. 2023b, 36: 679–87.

9. Bakalis, Kontogiannis, Ntais, Simos, Tsamis, Manis. Carpal tunnel syndrome automated diagnosis: a motor vs. sensory nerve conduction-based approach. Bioengineering (Basel). 2024, 11: 175.

10. Barocas, Selbst AD. Big data’s disparate impact. Calif Law Rev. 2016, 104: 671.

11. Bastian, Wang, Czempiel, Busam, Navab. DisguisOR: holistic face anonymization for the operating room. Int J Comput Assist Radiol Surg. 2023, 18: 1209–15.

12. Bauknecht, Moeller, Mentzel, Lebelt, Mack, Vergote. Physiotherapie für die Hosentasche: Effektivität von Heimübungen mittels KI-basierter Smartphone App zur Nachbehandlung von Handverletzungen – eine randomisierte, kontrollierte und offene Studie. Handchir Mikrochir Plast Chir. 2025, 57: 186–94.

13. Brancaccio, Balato, Malvehy, Puig, Argenziano, Kittler. Artificial intelligence in skin cancer diagnosis: a reality check. J Invest Dermatol. 2024, 144: 492–9.

14. Brui, Efimtcev, Fokin, et al. Deep learning-based fully automatic segmentation of wrist cartilage in MR images. NMR Biomed. 2020, 33: e4320.

15. Bucinca, Malaya MB, Glassman EL. (2021). To trust or to think: Cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–14). Yokohama, Japan (virtual), May 8–13, 2021.

16. Burton, Bodansky, Silver, Yao, Horwitz. Assessing bone mineral density using radiographs of the hand: a multicenter validation. J Hand Surg Am. 2023, 48: 1210–6.

17. Calo. The boundaries of privacy harm. Indiana Law J. 2011, 86: 1131–58.

18. Castellini, Artemiadis, Wininger, et al. Proceedings of the first workshop on peripheral machine interfaces: going beyond traditional surface electromyography. Front Neurorobot. 2014, 8: 22.

19. Cestonaro, Delicati, Marcante, Caenazzo, Tozzo. Defining medical liability when artificial intelligence is applied on diagnostic algorithms: a systematic review. Front Med. 2023, 10: 1305756.

20. Chen, Lim LJR, Lim RQR, et al. Artificial intelligence powered advancements in upper extremity joint MRI: a review. Heliyon. 2024a, 10: e28731.

21. Chen, Lim LJR, et al. Deep learning and remote photoplethysmography powered advancements in contactless physiological measurement. Front Bioeng Biotechnol. 2024b, 12: 1420100.

22. Chustecki. Benefits and risks of AI in health care: narrative review. Interact J Med Res. 2024, 13: e53616.

23. Cohen, Amarasingham, Shah, Xie. The legal and ethical concerns that arise from using complex predictive analytics in health care. Health Aff. 2014, 33: 1139–47.

24. Crawford, Schultz. Big data and due process: toward a framework to redress predictive privacy harms. BCL Rev. 2014, 55: 93–128.

25. Croen, Abdullah, Berns, Rapaport, Hahn, Barrett, Sobel AD. Evaluation of patient education materials from large-language artificial intelligence models on carpal tunnel release. Hand (NY). 2025, 20: 893–9.

26. Cross, Choma, Onofrey JA. Bias in medical AI: implications for clinical decision-making. PLOS Digit Health. 2024, 3: e0000651.

27. Dawson, Schleiger, Horton, et al. (2019). Artificial intelligence: Australia’s ethics framework – a discussion paper. Analysis & Policy Observatory. Canberra: CSIRO Data61. https://apo.org.au/node/229596 (Accessed: 26 November 2025).

28. DeGol, Akhtar, Manja, Bretl. (2016). Automatic grasp selection using a camera in a hand prosthesis. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 431–434). Orlando, Florida, USA, August 16–20, 2016.

29. Demir, Schieber, Weise, et al. Deep learning in surgical workflow analysis: a review of phase and step recognition. IEEE J Biomed Health Inform. 2023, 27: 5405–17.

30. Dutrey, Maximen, Mevel, Ropars, Dreano. Evaluation of the Rennes Universal Measurement Method (RUMM), an artificial intelligence application for hand joint angle assessment. J Hand Surg Eur. 2025, 50: 480–5.

31. Edge Health. (2024). Evaluating pathways for AI dermatology in skin cancer detection: a white paper. NHSE Outpatient Recovery and Transformation Programme. https://www.edgehealth.co.uk/wp-content/uploads/2024/08/Evaluating-Pathways-for-AI-Dermatology-in-Skin-Cancer-Detection.pdf (Accessed: 26 November 2025).

32. Egi, Hattori, Tokunaga, et al. Face, content and concurrent validity of the Mimic® dV-Trainer for robot-assisted endoscopic surgery: a prospective study. Eur Surg Res. 2013, 50: 292–300.

33. Esteva, Kuprel, Novoa, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017, 542: 115–8.

34. Exer AI. (2024, March). Exer AI collaborates with Mayo Clinic to advance standard of care for hand and wrist disorders with AI. https://www.exer.ai/news/exer-ai-collaborates-with-mayo-clinic (Accessed: 24 March 2025).

35. Faden, Kass, Goodman, Pronovost, Tunis, Beauchamp TL. An ethics framework for a learning health care system: a departure from traditional research ethics and clinical ethics. Hastings Cent Rep. 2013, 43(s1): S16–27.

36. Foster, Joshi, Borgese, Abdelhafez, Boutin, Chaudhari AJ. WRIST: a wrist image segmentation toolkit for carpal bone delineation from MRI. Comput Med Imaging Graph. 2018, 63: 31–40.

37. Gao, Lim RQR, Liu. A novel technique of arthroscopic-assisted four-corner fusion and robot-assisted fixation for scaphoid nonunion advanced collapse wrist: a single case study. Orthop Surg. 2024, 16: 490–6.

38. Garrow, Kowalewski, et al. Machine learning for surgical phase recognition: a systematic review. Ann Surg. 2021, 273: 684–93.

39. Gerke. Health AI for good rather than evil? The need for a new regulatory framework for AI-based medical devices. Yale J Health Policy Law Ethics. 2021, 20: 433–513.

40. Goddard, Roudsari, Wyatt JC. Automation bias: a systematic review of frequency, effect mediators, and mitigators. J Am Med Inform Assoc. 2012, 19: 121–7.

41. Goodman, Patel, Zhang, et al. Analyzing surgical technique in diverse open surgical videos with a multi-task real-time AI model capable of quantifying kinematic hand motion descriptors that distinguish surgeon skill levels. JAMA Surg. 2024, 159: e230626.

42. Grady, Eckstein, Berkman, et al. Broad consent for research with biological samples: workshop conclusions. Am J Bioeth. 2015, 15: 34–42.

43. Graëff, Daiss, Lampert, et al. Preliminary stage in the development of an artificial intelligence algorithm: variations between 100 surgeons in phase annotation in a video of internal fixation of distal radius fracture. Orthop Traumatol Surg Res. 2023, 109: 103564.

44. Graëff, Padoy, Liverneaux, Lampert. Introducing surgical workflow recognition in orthopaedic surgery with timestamp supervision. Comput Biol Med. 2025, 197(Pt A): 110995.

45. Grunhut, Marques, Wyatt AT. Needs, challenges, and applications of artificial intelligence in medical education curriculum. JMIR Med Educ. 2022, 8: e35587.

46. Fan, Cai, et al. Automatic detection of abnormal hand gestures in patients with radial, ulnar, or median nerve injury using hand pose estimation. Front Neurol. 2022, 13: 1052505.

47. Harrison, Geoghegan, Sidey-Gibbons, Stirling PHC, McEachan, Rodrigues JN. Developing machine learning algorithms to support patient-centered, value-based carpal tunnel decompression surgery. Plast Reconstr Surg Glob Open. 2023, 11: e4744.

48. Hasan, Ipaktchi, Meyer, Liverneaux. Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0. J Hand Microsurg. 2025, 17: 100258.

49. Hasanzadeh, Josephson, Waters, Adedinsewo, Azizi, White JA. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. NPJ Digit Med. 2025, 8: 154.

50. Hashimoto, Rosman, Rus, Meireles OR. Artificial intelligence in surgery: promises and perils. Ann Surg. 2018, 268: 70–6.

51. Hassan, Kushniruk, Borycki. Barriers to and facilitators of artificial intelligence adoption in health care: scoping review. JMIR Hum Factors. 2024, 11: e48633.

52. Haworth, Biswas, Opfermann, et al. (2024, October 10). Autonomous robotic system with optical coherence tomography guidance for vascular anastomosis. arXiv preprint, arXiv:2410.07493 [cs.RO], v1. https://arxiv.org/abs/2410.07493

53. Innocenti, Malzone, Menichini. First-in-human free flap tissue reconstruction using a dedicated microsurgical robotic platform. Plast Reconstr Surg. 2023, 151: 1078–82.

54. Isch, Lee, Self, et al. Artificial intelligence in surgical coding: evaluating large language models for current procedural terminology accuracy in hand surgery. J Hand Surg Glob Online. 2025, 7: 181–5.

55. Jagiella-Lodise, Suh, Zelenski NA. Can patients rely on ChatGPT to answer hand pathology-related medical questions? Hand (NY). 2025, 20: 801–9.

56. Jiang, Naito, Liverneaux. Advantages in precision, safety, and aesthetic outcomes of robot-assisted minimally invasive techniques in peripheral nerve microsurgery: a narrative review. Adv Technol Neurosci. 2025, 2: 122–7.

57. Jie, Hui, Dawei, Weiya. Treatment of the hook of hamate fracture with robot navigation: a note on technique. Acta Orthop Traumatol Turc. 2022, 56: 296–9.

58. Juarez-Villalobos, Hevia-Montiel, Perez-Gonzalez. (2021). Machine learning–based classification of local robotic surgical skills in a training tasks set. In Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 4596–4599). Virtual conference, hosted from Guadalajara, Mexico, November 1–5, 2021.

59. Kalisman. Data storage and retrieval. Clin Plast Surg. 1986, 13: 529–43.

60. Katznelson, Gerke. The need for health AI ethics in medical school education. Adv Health Sci Educ. 2021, 26: 1447–58.

61. Kaye, Whitley, Lund, Morrison, Teare, Melham. Dynamic consent: a patient interface for twenty-first century research networks. Eur J Hum Genet. 2015, 23: 141–6.

62. Keller, Guebeli, Thieringer, Honigmann. Artificial intelligence in patient-specific hand surgery: a scoping review of literature. Int J Comput Assist Radiol Surg. 2023, 18: 1393–403.

63. Klugman, Gerke. Rise of the bioethics AI: curse or blessing? Am J Bioeth. 2022, 22: 35–7.

64. Knudsen, Ghaffar, Hung AJ. Clinical applications of artificial intelligence in robotic surgery. J Robot Surg. 2024, 18: 102.

65. Kostick-Quenet, Gerke. AI in the hands of imperfect users. NPJ Digit Med. 2022, 5: 197.

66. Kraus, Anteby, Konen, Eshed, Klang. Artificial intelligence for X-ray scaphoid fracture detection: a systematic review and diagnostic test accuracy meta-analysis. Eur Radiol. 2024, 34: 4341–51.

67. Kuok, Yang, Tsai, et al. Segmentation of finger tendon and synovial sheath in ultrasound image using deep convolutional neural network. Biomed Eng Online. 2020, 19: 24.

68. Langerhuizen DWG, Bulstra AEJ, Janssen, et al. Is deep learning on par with human observers for detection of radiographically visible and occult fractures of the scaphoid? Clin Orthop Relat Res. 2020, 478: 2653–9.

69. Larson, Chen, Lungren, Halabi, Stence, Langlotz CP. Performance of a deep-learning neural network model in assessing skeletal maturity on pediatric hand radiographs. Radiology. 2018, 287: 313–22.

70. Lekadir, Frangi, Porras, et al. FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. BMJ. 2025, 388: e081554.

71. Lim, Liverneaux, Chen, Liu. Robotic hand surgery: current insights and future directions. J Hand Surg Eur. 2025, 50: 721–7.

72. Lin, Han, et al. Deep learning to detect triangular fibrocartilage complex injury in wrist MRI: retrospective study with internal and external validation. J Pers Med. 2022, 12: 1029.

73. Liu, Chen, Jiang, Tian. Robot-assisted percutaneous scaphoid fracture fixation: a report of ten patients. J Hand Surg Eur. 2019a, 44: 685–91.

74. Liu. Wrist arthroscopy for the treatment of scaphoid delayed or nonunions and judging the need for bone grafting. J Hand Surg Eur. 2019b, 44: 594–99.

75. Liu, Cruz Rivera, Moher, Calvert, Denniston; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020, 26: 1364–74.

76. Liverneaux. Le vissage percutané du scaphoïde assisté par ordinateur: étude expérimentale. Chir Main. 2005, 24: 169–73.

77. Liverneaux, Prunières, Hidalgo Diaz, Salazar Botero, Vernet, Facca. Feasibility of wrist arthroscopy using a new free hand camera robot-assisted prototype. Mathews J Orthop. 2016, 1: 010.

78. Logg, Minson, Moore DA. Algorithm appreciation: people prefer algorithmic to human judgment. Organ Behav Hum Decis Process. 2019, 151: 90–103.

79. Loos, Hoogendam, Souer, et al. Machine learning can be used to predict function but not pain after surgery for thumb carpometacarpal osteoarthritis. Clin Orthop Relat Res. 2022, 480: 1271–84.

80. Loos, Hoogendam, Souer, et al. Algorithm versus expert: machine learning versus surgeon-predicted symptom improvement after carpal tunnel release. Neurosurgery. 2024, 95: 110–7.

81. Zhang, Wang, et al. An Artificial Intelligence model for vascularity monitoring of postoperative flap. J Craniofac Surg. 2025, 36: 1527–32.

82. Majzoubi, Allègre, Wemmert, Liverneaux. A deep learning-based algorithm for automatic detection of perilunate dislocation in frontal wrist radiographs. Hand Surg Rehabil. 2024, 43: 101742.

83. Markovic, Dosen, Popovic, Graimann, Farina. Sensor fusion and computer vision for context-aware control of a multi degree-of-freedom prosthesis. J Neural Eng. 2015, 12: 066022.

84. Marrella, Jiang, Ipaktchi, Liverneaux. Comparing AI-generated and human peer reviews: a study on 11 articles. Hand Surg Rehabil. 2025, 44: 102225.

85. Mennella, Maniscalco, De Pietro, Esposito. Ethical and regulatory challenges of AI technologies in healthcare: a narrative review. Heliyon. 2024, 10: e26297.

86. Miller, Farnebo, Horwitz. Insights and trends review: artificial intelligence in hand surgery. J Hand Surg Eur. 2023, 48: 396–403.

87. Miller, Jackson, Vilic, Boyce, Shuaib. Artificial intelligence and machine learning capabilities in the detection of acute scaphoid fracture: a critical review. J Hand Surg Eur. 2025a, 50: 1129–33.

88. Miller, Kedgley, Farnebo, Stockmans, Zlotolow, Horwitz. Round table discussion. Integration of artificial intelligence into daily practice. J Hand Surg Eur. 2025b, 50: 1134–41.

89. Mohapatra, Thiruvoth, Tripathy, et al. Leveraging large language models (LLM) for the plastic surgery resident training: do they have a role? Indian J Plast Surg. 2023, 56: 413–20.

90. Morrow, Zidaru, Ross, et al. Artificial intelligence technologies and compassion in healthcare: a systematic scoping review. Front Psychol. 2023, 13: 971044.

91. Nair, Svedberg, Larsson, Nygren JM. A comprehensive overview of barriers and strategies for AI implementation in healthcare: mixed-method design. PLoS One. 2024, 19: e0305949.

92. Oeding, Kunze, Messer, et al. Diagnostic performance of artificial intelligence for detection of scaphoid and distal radius fractures: a systematic review. J Hand Surg Am. 2024, 49: 411–22.

93. Omar, Soffer, Agbareia, et al. Sociodemographic biases in medical decision making by large language models. Nat Med. 2025, 31: 1873–81.

94. Orgiu, Karkazan, Cannell, Dechaumet, Bennani, Grégory. Enhancing wrist arthroscopy: artificial intelligence applications for bone structure recognition using machine learning. Hand Surg Rehabil. 2024, 43: 101717.

95. Park, Kim, Lee, et al. Machine learning-based approach for disease severity classification of carpal tunnel syndrome. Sci Rep. 2021, 11: 17464.

96. Peng, Zeng, Lai, Huang. One-stop automated diagnostic system for carpal tunnel syndrome in ultrasound images using deep learning. Ultrasound Med Biol. 2024, 50: 304–14.

97. Pohl, Derector, Rivlin, et al. A quality and readability comparison of artificial intelligence and popular health website education materials for common hand surgery procedures. Hand Surg Rehabil. 2024, 43: 101723.

98. Pohl, Tarawneh, Johnson, Aita, Tadley, Fletcher DJ. Patient preferences for carpal tunnel release education: a comparison of education materials from popular healthcare websites and ChatGPT. Hand Surg Rehabil. 2025, 44: 102073.

99. Pressman, Borna, Gomez-Cabello, Haider, Forte AJ. AI in hand surgery: assessing large language models in the classification and management of hand injuries. J Clin Med. 2024, 13: 2832.

100. Price 2nd, Cohen IG. Privacy in the age of medical big data. Nat Med. 2019, 25: 37–43.

101. Price II, Gerke, Cohen. Liability for use of artificial intelligence in medicine. In: Solaiman, Cohen (Eds.) Research handbook on health, AI and the law. Cheltenham, Edward Elgar, 2024: 150–66.

102. Quinn, Senadeera, Jacobs, Coghlan. Trust and medical AI: the challenges we face and the expertise needed to overcome them. J Am Med Inform Assoc. 2021, 28: 890–4.

103. Radke, Wollschläger, Nebelung, et al. Deep learning-based post-processing of real-time MRI to assess and quantify dynamic wrist movement in health and disease. Diagnostics (Basel). 2021, 11: 1077.

104. Rieke, Hancox, et al. The future of digital health with federated learning. NPJ Digit Med. 2020, 3: 119.

105. Roner, Fürnstahl, Scheibler, Sutter, Nagy, Carrillo. Three-dimensional automated assessment of the distal radioulnar joint morphology according to sigmoid notch surface orientation. J Hand Surg Am. 2020, 45: 1083.e1–11.

106. Rosbach, Ganz, Ammeling, Riener, Aubreville. Automation bias in AI-assisted medical decision-making under time pressure in computational pathology. In: Palm, Breininger, Deserno, et al. (Eds.) Bildverarbeitung für die Medizin 2025. BVM 2025. Wiesbaden, Springer Vieweg, 2025: 129–34.

107. Ryhänen, Wong, Anttila, Chung. Overview of artificial intelligence in hand surgery. J Hand Surg Eur. 2025, 50: 738–51.

108. Sahiner, Chen, Samala, Petrick. Data drift in medical machine learning: implications and potential remedies. Br J Radiol. 2023, 96: 20220878.

109. Samuel, Norman, Vaknin, Saban. The physician-AI relationship: partnering for precision medicine via clinical decision support. Eur J Public Health. 2024, 34(Suppl 3): ckae144.1125.

110. Selber, Alrasheed. Robotic microsurgical training and evaluation. Semin Plast Surg. 2014, 28: 5–10.

111. Shinohara, Inui, Mifune, et al. Ultrasound with artificial intelligence models predicted Palmer 1B triangular fibrocartilage complex injuries. Arthroscopy. 2022, 38: 2417–24.

112. Siu AHY, Gibson, et al. Employing large language models for surgical education: an in-depth analysis of ChatGPT-4. J Med Educ. 2023, 22: 1.

113. Sorin, Korfiatis, Bratt, et al. Using a large language model for post-deployment monitoring of FDA approved AI: pulmonary embolism detection use case. J Am Coll Radiol. 2025, 22: 1404–14.

114. Sugiyama, Sugimori, Tang, et al. Deep learning-based video-analysis of instrument motion in microvascular anastomosis training. Acta Neurochir (Wien). 2024, 166: 6.

115. Suojärvi, Lindfors, Höglund, Sippo, Waris. Radiographic measurements of the normal distal radius: reliability of computer-aided CT versus physicians’ radiograph interpretation. J Hand Surg Eur. 2021, 46: 176–83.

116. Tanaka, Matsumura, Bito. Roles and competencies of doctors in artificial intelligence implementation: qualitative analysis through physician interviews. JMIR Form Res. 2023, 7: e46020.

117. Teule EHS, Lessmann, van der Heijden EPA, Hummelink. Automatic segmentation and labelling of wrist bones in four-dimensional computed tomography datasets via deep learning. J Hand Surg Eur. 2024, 49: 507–9.

118. Trachtenberg MS, Singhal, Kaliki, Smith RJ, Thakor NV. (2011). Radio frequency identification – an innovative solution to guide dexterous prosthetic hands. In Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 3511–3514). Boston, Massachusetts, USA, August 30–September 3, 2011.

119. Tsamis, Kontogiannis, Gourgiotis, Ntabos, Sarmas, Manis. Automatic electrodiagnosis of carpal tunnel syndrome using machine learning. Bioengineering (Basel). 2021, 8: 181.

120. US Food and Drug Administration. (2023). Methods and tools for effective postmarket monitoring of artificial intelligence (AI)-enabled medical devices. https://www.fda.gov/medical-devices/medical-device-regulatory-science-research-programs-conducted-osel/methods-and-tools-effective-postmarket-monitoring-artificial-intelligence-ai-enabled-medical-devices (Accessed: 26 November 2025).

121. van Leersum, Maathuis. Human centred explainable AI decision-making in healthcare. J Responsible Technol. 2025, 21: 100108.

122. van Mulken TJM, Schols, Scharmga AMJ, et al. First-in-human robotic supermicrosurgery using a dedicated microsurgical robot for treating breast cancer-related lymphedema: a randomized pilot trial. Nat Commun. 2020, 11: 757.

123. Vasconcelos, Jörke, Grunde-McLaughlin, Gerstenberg, Bernstein, Krishna. Explanations can reduce overreliance on AI systems during decision-making. Proc ACM Hum Comput Interact. 2023, 7(CSCW1): 129.

124. Wang, Sun. Surgical smoke removal via residual Swin transformer network. Int J Comput Assist Radiol Surg. 2023, 18: 1417–27.

125. Wartman, Combs. Reimagining medical education in the age of AI. AMA J Ethics. 2019, 21: E146–52.

126. Weiner, Starke, Rader, Hundhausen, Asfour. Designing prosthetic hands with embodied intelligence: the KIT prosthetic hands. Front Neurorobot. 2022, 16: 815716.

127. Wernér, Anttila, Hulkkonen, Viljakka, Haapamäki, Ryhänen. Detecting avascular necrosis of the lunate from radiographs using a deep-learning model. J Imaging Inform Med. 2024, 37: 706–14.

128. Wong, Zhu, Baltzer HL. The accuracy of artificial intelligence models in hand/wrist fracture and dislocation diagnosis: a systematic review and meta-analysis. JBJS Rev. 2024, 12: e24.00106.

129. Chen, Zhang, Liu. A novel mini-invasive technique of arthroscopic-assisted reduction and robot-assisted fixation for trans-scaphoid perilunate fracture dislocations. Orthop Surg. 2023, 15: 1203–09.

130. Zamzmi, Venkatesh, Nelson, et al. Out-of-distribution detection and radiological data monitoring using statistical process control. J Imaging Inform Med. 2025, 38: 997–1015.
