Current status and challenges of artificial intelligence application in managing children’s emotions and attention

Abstract

Machine learning (ML), a core component of artificial intelligence (AI), is increasingly being used to assess children’s emotions and attention, with potential applications in developmental monitoring and early identification of neurodevelopmental conditions such as autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD). This narrative review synthesizes studies published between 2012 and 2025 from PubMed, IEEE Xplore, and Web of Science. We examine multimodal data sources (including facial, speech, physiological, eye movement, and behavioral features) and computational approaches such as convolutional neural networks (CNNs), support vector machines (SVMs), and long short-term memory (LSTM) networks. These methods can capture behavioral and physiological signals and provide complementary information for assessing children’s emotional and attentional states, particularly in controlled settings. However, the current evidence remains heterogeneous, with many studies relying on limited or laboratory-based datasets, which may constrain real-world applicability. Key challenges include data bias, cross-cultural variability, ethical concerns, and the need for robust privacy protection and external validation. Recent work has explored integrating AI with virtual reality (VR), augmented reality (AR), and Internet of Things (IoT) technologies to support more adaptive monitoring systems. Nevertheless, these applications remain largely exploratory. Future research should prioritize real-world validation, pediatric-specific datasets, and interdisciplinary collaboration to better define the role of AI in children’s mental health and education.

Keywords

machine learning; emotion recognition; concentration detection; children’s mental health; prediction of mental illness

1. Introduction

1.1. The rise of artificial intelligence (AI) technology in children’s mental health

The rapid development of artificial intelligence (AI) has transformed daily life and is increasingly influencing mental health research involving children younger than twelve, a boundary chosen to approximate the end of middle childhood and reduce heterogeneity related to adolescence.¹ AI applications extend beyond emotion analysis and have shown promise in identifying psychological risk patterns, although their clinical applicability remains under active investigation.^2–5 AI technologies can extract critical information from diverse sources, potentially supporting more objective assessments of children’s mental health, particularly in research or controlled settings. For example, methods such as voice recognition, sentiment analysis, and facial expression recognition can capture emotional fluctuations in children, particularly when structured or multimodal data are available, and can thereby help mental health professionals identify potential early indicators of psychological risk in a timely manner.⁶

The most commonly used AI technologies today include support vector machine (SVM),⁷ random forest,⁸ and deep learning models. These techniques process large-scale data to support the prediction of behavioral patterns associated with psychological conditions, while their role in formal clinical diagnosis remains limited.^9,10 AI technologies can support real-time emotional monitoring and may assist in detecting subtle emotional fluctuations, thereby contributing to early identification and potential intervention planning, although their effectiveness in real-world settings remains under investigation.¹¹ For example, AI algorithms can help identify early signs of anxiety, depression, or other psychological disorders by processing multidimensional signals such as children’s language patterns, vocal characteristics, and social interactions.^12–14 Based on the results of these analyses, AI systems may support the generation of data-informed suggestions for intervention strategies, although such recommendations typically require validation and oversight by clinical professionals. The application of such AI technologies is profoundly important for children’s mental health, particularly as children often struggle to express their inner emotions and psychological states accurately through traditional methods such as questionnaires or face-to-face interviews. By integrating multimodal data, AI systems can continuously capture and analyze patterns in children’s daily behaviors, verbal communication, and physical gestures, helping professionals assess children’s mental health from multiple perspectives.¹⁵ This approach may facilitate the early identification of individuals at potential risk of psychological disorders¹⁶ and provides valuable data support for designing personalized treatment plans, making psychological interventions more targeted and effective.

Moreover, the application of these technologies to identify psychological disorders has been explored to support more personalized treatment planning. Using AI systems, professionals can track individual emotional fluctuation trends and develop dynamic treatment plans based on these changes, thereby potentially contributing to improved therapeutic planning.³ For example, when an AI system detects abnormalities in a child’s emotional state, it may provide alerts to relevant professionals, supporting timely follow-up to mitigate potential deterioration.

In summary, AI shows considerable potential, but its accuracy depends on dataset quality and can vary with cultural factors.^17,18 With further advances, these technologies are expected to support more personalized and potentially more accurate assessment and treatment support,³ which may contribute to improved mental health outcomes for children. Integrating AI technologies into child mental health management systems may eventually yield more precise and efficient models for identifying and treating disorders, and thereby improve mental health support.¹⁶

1.2. The critical role of emotion and attention management in children’s mental health

Childhood is a critical stage for personality formation and psychological development, as well as the foundational period for building various cognitive abilities and emotional regulation skills. During this phase, managing emotions and attention is not only essential for children’s overall development but also a key means of early identification of potential psychological issues and developmental delays.^19,20 First, effective emotional regulation helps children maintain a positive attitude when facing challenges and setbacks, enhancing their performance in learning, social interactions, and self-regulation.¹⁹ If children exhibit excessive anxiety, depression, or irritability when dealing with stress or social interactions, these may be early signs of underlying psychological disorders such as autism spectrum disorder (ASD) or attention deficit hyperactivity disorder (ADHD).²¹ By identifying these signs of emotional dysregulation early, educators, psychologists, and parents can intervene in a timely manner, providing appropriate psychological support and interventions to prevent these issues from worsening and affecting children’s long-term mental health. For emotional problems, timely identification and intervention can reduce the risk of psychological disorders such as anxiety and depression²² and mitigate their profound impacts on children’s learning and daily life.

In addition, attention deficits are often associated with underlying psychological or developmental issues. A lack of attention not only affects children’s academic performance but also may be a manifestation of ADHD. If children frequently struggle to concentrate during learning, are easily distracted, or exhibit significant difficulty in completing tasks, these signs may warrant further evaluation. Unresolved attention issues can hinder children’s academic development and lead to secondary emotional problems, such as academic anxiety or decreased self-confidence. Therefore, early identification and assessment of emotional and attention-related issues in children are crucial for preventing the development of potential psychological disorders and developmental delays.²³ By fostering collaboration among professionals in the psychology, education, and healthcare sectors, regular assessments of children’s emotions and attention can help identify underlying psychological issues early and enable appropriate interventions to prevent further deterioration. These early interventions not only support children’s healthy development during childhood but also lay a solid foundation for their future mental health and social adaptability.

In practice, artificial intelligence does not provide a definitive diagnosis independently; rather, it functions by employing advanced machine learning algorithms, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), to extract “digital biomarkers” from the multi-modal inputs shown in Figure 1, including facial expressions, voice, eye-tracking patterns, and physiological signals (e.g., EEG and ECG). In addition to conventional CNN architectures, advanced models such as 3D convolutional neural networks have been applied to capture spatiotemporal features in neuroimaging data.^24,25 These biomarkers often encompass subtle features imperceptible to the human eye, such as micro-expressions or atypical gaze scanning trajectories. As illustrated in the feature fusion stage of Figure 1, the system integrates these unstructured visual, auditory, and biological data points, transforming them into quantifiable features that may be associated with clinically relevant behavioral patterns, although direct mapping to standardized diagnostic criteria (e.g., DSM-5 or ICD-11) is not yet established. Ultimately, the processed outputs, which are presented as classification results and prediction outputs, serve as an analytical reference that provides clinicians with quantitative information to inform clinical decision-making.

Figure 1.

A conceptual schematic illustrating a potential multimodal AI framework for recognizing children’s emotions and attention. Rather than representing a standardized or fully validated pipeline, this figure integrates approaches reported in the literature to highlight how different data modalities may be combined. The framework begins with multimodal data inputs, which may include: (1) spatial data (e.g., images of facial expressions), (2) temporal data (e.g., speech signals and physiological signals such as heart rate variability), (3) behavioral data (e.g., gaze tracking), and (4) motion data (e.g., body movement patterns). These heterogeneous data types typically require modality-specific preprocessing, such as normalization, filtering, synchronization, and feature standardization. In the feature extraction stage, different computational techniques may be applied depending on the data modality. For example, convolutional neural networks (CNNs) are commonly used for visual feature extraction, while recurrent models such as long short-term memory (LSTM) networks may be used to capture temporal dependencies. Traditional machine learning methods may also be incorporated for supplementary feature processing. At the integration stage, multimodal features may be combined through feature-level or decision-level fusion strategies, with mechanisms such as cross-attention potentially enhancing interactions between modalities. These approaches aim to improve the robustness of inference, although their effectiveness remains dependent on the data quality and experimental conditions. The output layer may include estimates of emotional states, attention patterns, and related behavioral indicators. However, these outputs should be interpreted with caution, as the current systems have been largely developed and evaluated in controlled settings, and their generalizability to real-world pediatric contexts remains an active area of research. 
Overall, this schematic highlights the potential structure of multimodal AI systems in this domain and is intended to provide a conceptual overview rather than a definitive or clinically validated workflow.
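The cross-attention step mentioned in the caption can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the mechanism used by any specific reviewed study: it omits the learned projection matrices that real models apply, the function names `softmax` and `cross_attention` are hypothetical, and the two-dimensional embeddings are made-up values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: feature vectors from one modality (e.g., facial frames)
    keys, values: feature vectors from another modality (e.g., speech frames)
    Returns (attended, weights): one attended vector and one weight row per query.
    """
    d = len(keys[0])
    attended, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the value vectors.
        attended.append([sum(wj * v[i] for wj, v in zip(w, values))
                         for i in range(len(values[0]))])
    return attended, weights

# Hypothetical embeddings (illustrative values only).
face_frames = [[1.0, 0.0], [0.0, 1.0]]
speech_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, weights = cross_attention(face_frames, speech_frames, speech_frames)
```

Each facial frame thus receives a summary of the speech frames weighted by similarity, which is one way interactions between modalities can be modeled before fusion.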

Therefore, early identification, assessment, and intervention for emotional and attention-related issues in children are highly important for promoting their long-term healthy development.^12,23 Effective management of emotions and attention, early detection of psychological problems, and timely intervention may together contribute to improved psychological development and a potential reduction in long-term mental health risks. This approach may also support long-term developmental monitoring and well-being. Specifically, the digital biomarkers extracted through this pipeline correspond to distinct clinically relevant dimensions: facial micro-expressions may reflect emotional dysregulation relevant to anxiety- or depression-related screening; atypical gaze-scanning trajectories may indicate altered social attention patterns associated with ASD; heart rate variability (HRV) may index autonomic regulation under affective stress; and motor pattern irregularities have been explored as potential motor signatures of neurodevelopmental conditions. These correspondences remain under active validation and should be interpreted as exploratory rather than diagnostic.
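To make the HRV biomarker concrete, the sketch below computes two widely used time-domain HRV indices, RMSSD and SDNN, from a list of inter-beat (RR) intervals. The review does not state which HRV indices the cited studies used, so the choice of indices is an assumption, and the RR values are hypothetical.

```python
import math
from statistics import stdev

def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    return stdev(rr_ms)

# Hypothetical RR intervals in milliseconds (illustrative values only).
rr = [800, 810, 790, 805, 795]
hrv_rmssd = rmssd(rr)
hrv_sdnn = sdnn(rr)
```

Real recordings would first need artifact and ectopic-beat removal, which this sketch leaves out.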

1.3. The potential of AI technology in supporting children’s emotions and attention recognition

The application of AI technology in managing children’s emotions and attention has advantages that surpass those of traditional methods. First, these technologies may support more personalized approaches to emotion and attention management.¹⁶ Each child exhibits unique emotional responses, attention levels, and psychological challenges. Traditional psychological interventions often rely on experience and generalized strategies, and precisely tailoring approaches to individual needs is difficult.¹² In contrast, AI technologies can be applied to analyze large volumes of behavioral data to dynamically adjust intervention plans.²⁶

Second, the continuous monitoring capabilities of AI technologies offer higher temporal resolution and responsiveness than traditional methods for emotion and attention management. Traditional assessments often rely on periodic evaluations or observations, which may overlook subtle emotional changes or fail to capture issues as they arise. In contrast, AI systems equipped with various sensing devices can continuously monitor children’s facial expressions, vocal tones, and even physiological data, yielding real-time data that may offer insights into their emotional fluctuations and attention changes.²⁶ When abnormalities are detected, such as a sudden emotional downturn or difficulty concentrating, the AI system may generate alerts under predefined conditions. This responsiveness may facilitate earlier responses to emotional changes and help children regain focus promptly.

Third, the data-driven nature of AI systems can make emotion and attention management more evidence-based and precise. These systems can collect data not only from individual children but also from large-scale populations to extract valuable patterns and trends. Through long-term data accumulation, AI systems can identify patterns potentially associated with emotional fluctuations or attention deficits in children and inform the development of response strategies based on these findings. They may also contribute to the refinement of intervention strategies, although their effectiveness in real-world clinical settings remains to be validated. This data-driven approach may not only support the optimization of intervention strategies but also provide data-informed insights for parents and educators, helping them better understand children’s psychological needs and adopt more appropriate guidance measures in daily life.
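The idea of generating alerts under predefined conditions during continuous monitoring can be illustrated with a simple rule. The sketch below assumes a hypothetical per-interval attention score between 0 and 1 produced by an upstream model, and raises an alert whenever the rolling mean over the last few intervals falls below a threshold. The function name `alert_indices`, the window length, the threshold, and the score series are all illustrative assumptions rather than values taken from the reviewed studies.

```python
from collections import deque

def alert_indices(scores, window=3, threshold=0.35):
    """Return the indices at which the rolling mean of the last `window`
    scores drops below `threshold`."""
    recent = deque(maxlen=window)
    alerts = []
    for i, score in enumerate(scores):
        recent.append(score)
        # Only evaluate once a full window of observations is available.
        if len(recent) == window and sum(recent) / window < threshold:
            alerts.append(i)
    return alerts

# Hypothetical attention scores, one per monitoring interval.
attention_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.9]
alerts = alert_indices(attention_scores)
```

Averaging over a window rather than reacting to a single low reading is one way to reduce false alarms from momentary distraction; any such rule would still require review by a professional.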

The specific implementation process of AI technology in identifying children’s emotions and attention is illustrated in Figure 1, which outlines the complete data processing workflow of a multimodal AI model applied to emotion and attention recognition. The process begins with data input, which is categorized into four main types: spatial data (e.g., facial expression images used to analyze children’s visual emotional features)²⁷; temporal data (e.g., speech signals and physiological signals such as heart rate variability (HRV), which capture vocal tone and bodily responses)²⁸; behavioral data (e.g., gaze tracking, reflecting attention focus)²⁹; and motion data (e.g., hand or body movement patterns for assessing emotional and attentional states).³⁰ In the data preprocessing stage, specific strategies are applied to each data type. Spatial data undergo image normalization and noise reduction, temporal data are filtered and synchronized to ensure temporal consistency across multimodal data, and behavioral data are standardized and smoothed to reduce interference and enhance analysis accuracy. During feature extraction, different techniques are employed on the basis of the data characteristics. Convolutional neural networks (CNNs) are used to extract hierarchical visual features from images.³¹ Long short-term memory (LSTM) networks capture the temporal dependencies in speech and physiological signals,³² and traditional machine learning methods (e.g., support vector machines (SVMs) and random forests) are utilized to process supplementary features, such as behavioral patterns. In the fusion layer, multimodal features are integrated through feature fusion, leveraging cross-attention mechanisms to enhance correlations between different data sources. 
Furthermore, decision fusion strategies, such as weighted voting and ensemble learning, are employed to improve prediction stability and ensure more accurate emotion and attention analyses.³³ The final output of this conceptual framework may include three types of results: emotion recognition, attention detection, and psychological health assessments. The models can automatically identify children’s emotional states (e.g., happiness, sadness, and anger), track attention changes (e.g., focus or distraction), and provide quantitative assessments of psychological health. These outputs support psychological interventions and the development of personalized behavior improvement plans. This workflow demonstrates the potential and technical advantages of multimodal AI models in research on children’s psychology and behavior.
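The weighted-voting form of decision fusion described above can be sketched as follows. This is a minimal illustration that assumes each modality-specific model outputs class probabilities and that the per-model weights are chosen by hand. The model names, probabilities, and weights are hypothetical, and real systems may learn the weights or use other ensemble schemes.

```python
def weighted_vote(predictions, weights):
    """Decision-level fusion by weighted averaging of class probabilities.

    predictions: {model_name: {label: probability}}
    weights:     {model_name: non-negative weight}
    Returns (winning_label, fused_probabilities).
    """
    totals = {}
    for name, probs in predictions.items():
        w = weights[name]
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + w * p
    norm = sum(totals.values())
    fused = {label: value / norm for label, value in totals.items()}
    return max(fused, key=fused.get), fused

# Hypothetical outputs of three modality-specific models for one child.
predictions = {
    "cnn_face":    {"happy": 0.6, "sad": 0.4},
    "lstm_speech": {"happy": 0.3, "sad": 0.7},
    "svm_gaze":    {"happy": 0.2, "sad": 0.8},
}
weights = {"cnn_face": 0.5, "lstm_speech": 0.3, "svm_gaze": 0.2}
label, fused = weighted_vote(predictions, weights)
```

In this example the facial model favors "happy", but the combined evidence from speech and gaze outweighs it, which shows how fusion can stabilize a prediction when one modality is unreliable.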

Table 1 presents studies related to the use of AI methods in identifying common psychological disorders in children, such as ASD and ADHD. Overall, the application of AI technology has enabled the real-time identification of children’s emotions and attention with more personalized and scientific insights,³⁴ providing potential support for understanding and managing their mental health. Additionally, the continuous advancement of AI technologies opens new possibilities for future mental health management and educational models. This progress gives us reason to anticipate significant improvements in children’s mental health and learning outcomes as technology continues to evolve.

Table 1.

This table presents a summary of AI applications in detecting children’s mental health and learning disorders, covering four major areas: ASD, ADHD, anxiety/depression, and dyslexia.

Method	Disease
Method	Method definition	ASD	ADHD	Anxiety/depression	Dyslexia
Speech analysis	Words or possible audio features	CNN^35,36	DNN³⁷	Decision tree,¹² RF¹²	Multilayer perceptron¹⁴
Facial expression recognition	Facial imaging	CNN,³⁸ transfer learning,³⁹ label distribution learning⁴⁰	DCNN⁴¹	K-NN,⁴² DCNN^43,44	Bag of features (BOF)^33,42
Physiological signal monitoring	GSR, HRV, EEG signals, prefrontal cortex oxyhemoglobin	SVM,⁴⁵ K-NN^25,46,47	CNN⁴⁸	CNN,⁴⁹ SVM^43,50	3D-CNN⁵¹
Eye movement analysis	Pupil diameter and eye movement behavior	RF,⁵² K-means clustering,⁵³ SVM⁵⁴	CNN,^55,56 Canopy⁴⁴	Most studies address treatment rather than identification.	SVM^57–59
Behavior analysis	Individual behavior patterns recorded	R-CNN,⁶⁰ RF,⁶¹ decision tree⁶¹	CNN^62,63	Most studies address treatment rather than identification.	Most methods involve eye movements or gaze direction and are therefore not repeated here.
Gaze direction analysis	Eye direction	SVM,⁶⁴ decision tree,⁶⁵ DFNN⁶⁵	SVM⁶⁶	Most studies address treatment rather than identification.	LSTM⁵⁹

The technological functions include speech analysis, facial expression recognition, physiological signal monitoring, eye movement analysis, behavior analysis, and gaze direction analysis. Speech analysis uses technologies such as LENA, CNN, and MFCC for ASD and ADHD detection. Facial expression recognition employs CNNs and other techniques to analyze emotions, aiding in ASD, ADHD, and dyslexia detection. Physiological signal monitoring focuses on the GSR, HRV, and EEG signals, using methods such as SVM to detect anxiety, depression, and dyslexia symptoms. Eye movement analysis applies RF and SVM methods to study pupil diameter and eye behavior, particularly in ASD and ADHD research. Behavior analysis integrates CNNs and similar technologies for ASD and ADHD diagnosis. Gaze direction analysis leverages LSTM networks and other techniques to support ASD and ADHD research. These AI technologies provide diverse tools for diagnosing mental health and learning disorders, laying a strong foundation for future research.

2. Applications of AI technologies in children’s emotion analysis

We defined the age cutoff at 12 years to align with the developmental boundary of middle childhood. Research indicates that the age of 12 years serves as a critical transition point; while it concludes a period of consistent cognitive development, the subsequent onset of adolescence (typically age 13 years and older) introduces significant neurobiological and behavioral variances.¹ By limiting our inclusion criteria to children under 12 years of age, we aimed to minimize these confounding variables and focus on the efficacy of AI tools within a stable developmental window.

This review adopts a narrative synthesis approach to map the current landscape of AI applications in children’s emotion and attention management. We conducted a literature search across PubMed, IEEE Xplore, and Web of Science for studies published between 2012 and 2025, using keyword combinations including “artificial intelligence,” “emotion recognition,” “attention detection,” “children’s mental health,” and “machine learning.” Studies were considered for inclusion if they (1) focused on children aged 12 years or younger, (2) applied AI or machine learning methods to emotion or attention-related tasks, and (3) reported quantitative performance metrics. Studies were excluded if they focused exclusively on adult populations, relied solely on subjective assessments without computational components, or were unavailable in English. Given the heterogeneity of methodologies and outcome measures across included studies, a formal meta-analysis was not conducted; instead, representative studies were selected to illustrate technological trends and key findings, which are synthesized in Tables 2 and 3. Notably, the accuracy rates reported are predominantly derived from controlled experimental settings and may not fully generalize to real-world clinical environments.
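Because the inclusion criteria require quantitative performance metrics, and the tables below report accuracy, it may help to recall how accuracy relates to other common screening metrics. The sketch below derives accuracy, sensitivity, specificity, and balanced accuracy from a confusion matrix. The counts are hypothetical and are not taken from any reviewed study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both positive and negative cases are required")
    sensitivity = tp / (tp + fn)   # proportion of true cases detected
    specificity = tn / (tn + fp)   # proportion of non-cases correctly cleared
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

# Hypothetical imbalanced sample: 10 children with the condition, 90 without.
metrics = screening_metrics(tp=8, fp=5, tn=85, fn=2)
```

With these counts the accuracy is 0.93 while the sensitivity is only 0.80, which illustrates why accuracy alone can overstate screening performance when the groups are unbalanced.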

Table 2.

This table presents a summary of research achievements in detecting ASD from 2012 to 2025, covering various age groups, data sources, and methodologies.

Year	Age	Type	Method	Accuracy	Features
2012⁶⁷	< 12 years	Autism Diagnostic Observation Schedule (ADOS)	Weka	99.7%	For initial assessment and prioritization, to shorten diagnosis and provide earlier care than current methods can.
2014⁴	< 12 years	Behavioral traits	Observation-based classifier (OBC)	97%	Current diagnostic approaches have high diagnostic validity but are time consuming.
2016⁶⁸	2∼3 years	Behavioral traits	SVM	96.7%	These findings offer insight into a potential motor signature of ASD.
2016⁶⁹	Average 7 years	Eye tracking	SVM	88.51%	With certain constraints that may apply in clinical practice.
2016⁷⁰	Average 4 years	Hand movements	Random forest, regularized greedy forest	93%	Motion analysis to identify autism through tablet games.
2016⁷¹	9∼10 years	Facial expression	Relief-F	90%	Innovative data sources and multiple classification methods were used, but the sample size is small and uneven across groups.
2017⁷²	1∼2 years	Autism Diagnostic Observation Schedule (ADOS)	Linear regression, SVM	93%	Improves stability and reduces time complexity.
2018⁷³	2∼4 years	Behavioral traits	SVM	94%	Validation on children with existing diagnoses limited generalization to undiagnosed populations.
2019²⁶	1∼2 years	Modified Checklist for Autism in Toddlers, Revised (M-CHAT-R)	Feedforward neural network (FNN)	99.92%	A beneficial tool for automatic, efficient scoring that eliminates labor-intensive follow-up and human error, offering an advantage over previous screening methods.
2020⁷⁴	< 12 years	Eye gaze, facial expression	Random forest, SVM	91%	Multimodal framework improves recognition efficiency and reduces costs
2021³⁶	< 12 years	Voice	CNN	90%	Maintains high accuracy in uncontrolled environments.
2021³⁹	< 12 years	Facial expression	Transfer learning, k-means	92.10%	Can be used to identify ASD by analyzing facial features and eye contact.
2022⁷⁵	Average of 6 years	Autism Diagnostic Observation Schedule (ADOS)	LR, SVM	93.9%	A limitation of this study is the relatively small sample size. If the sample size continues to increase, the ML trained model will be more convincing.
2023⁷⁶	< 12 years	Autism-spectrum quotient-10	Random forest, SVM	92%	This study did not account for the potential influence of the cultural and linguistic differences on the accuracy of the model.
2025⁷⁷	< 12 years	Q-CHAT-10	LR, SVM	95%	The current study lacks external validation across geographically and culturally diverse datasets.

Studies have combined data on behavioral features, eye tracking, facial expressions, and speech with AI technologies to increase diagnostic efficiency. Early studies, such as that by D. P. Wall et al. in 2012, employed the ADOS tool and achieved an accuracy of 99.7%, highlighting the importance of early diagnosis. In 2014, M. Duda et al. analyzed behavioral features with 97% accuracy, whereas Alessandro Crippa et al. achieved 96.7% accuracy by analyzing motion features.⁶⁸ From 2016 onward, new data sources emerged. For example, Anna Anzulewicz et al. analyzed hand movements through tablet games, achieving 93% accuracy,⁷⁰ and Mahiye Uluyagmur-Ozturk et al. used facial expressions and the Relief-F algorithm, reaching 90% accuracy.^71,78 Deep learning and multimodal analysis dominated later research. In 2019, Luke E. K. Achenie et al. employed M-CHAT-R and neural networks, achieving 99.92% accuracy. In 2020, Jingying Chen et al. combined gaze tracking and facial expression analysis to improve diagnostic efficiency. In 2023, Hasan Alkahtani et al. used the Autism-Spectrum Quotient-10 scale with random forest and SVM, achieving 92% accuracy. Despite significant technological advancements, challenges such as limited sample sizes and insufficient cultural diversity remain. With increasing data availability, more efficient and widely applicable diagnostic technologies are expected in the future.
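The concern about limited sample sizes can be made concrete by attaching a confidence interval to a reported accuracy. The sketch below uses the Wilson score interval for a proportion. The accuracy value and both sample sizes are hypothetical, since the sample sizes of the studies in Table 2 are not listed here.

```python
import math

def wilson_interval(accuracy, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (accuracy + z2 / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z2 / (4 * n * n)) / denom
    return center - half, center + half

# Same hypothetical accuracy, evaluated on a small and a large test set.
small_study = wilson_interval(0.93, 40)
large_study = wilson_interval(0.93, 4000)
```

The interval for the small test set is far wider than for the large one, so two studies reporting the same accuracy may provide very different strength of evidence.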

Table 3.

This table presents a systematic comparison of five types of AI models used in recognizing children’s emotions and attention, highlighting their advantages, limitations, applicable data types, computational complexity, and typical application scenarios.

AI model	Feature
AI model	Advantages	Disadvantages	Input signal type (for children)	Processing complexity	Typical applications (child-specific)
SVM	• Performs well on small datasets. • Easy to train and interpret. • Quick to train.	• Limited scalability. • Not suitable for complex, high-dimensional data.	Text, voice, simple physiological signals (e.g., EEG).	Low: Fast and best for targeted, simple tasks.	• Early-stage emotional state classification (e.g., happy, sad). • Screening for ASD via EEG pattern detection.
CNN	• High accuracy in image-based tasks. • Robust for facial expression analysis. • Generalizes well with large datasets.	• Computationally expensive. • Large datasets required. • Poor interpretability.	Images, videos, eye tracking, and spatial data.	High: Requires GPU resources.	• Autism-related emotion recognition. • Classifying children’s facial micro-expressions. • Gaze tracking for attention detection.
LSTM	• Ideal for sequential data. • Captures temporal dependencies. • Real-time processing capable.	• Computationally intensive. • Requires significant sequential data.	Speech, EEG, ECG, and other time-series signals.	Moderate to High: Handles sequential complexity.	• Tracking ADHD children’s attention span. • Detecting emotional escalation (e.g., frustration) through speech or EEG.
GAN	• Enhances data diversity. • Generates realistic synthetic data. • Reduces overfitting.	• Training instability issues. • Requires extensive computational power.	Any type: Especially useful for images, videos, or rare data scenarios.	Very High: Demands significant tuning.	• Creating synthetic children’s emotional images for training CNNs. • Augmenting datasets for rare emotions (e.g., fear, anxiety).
Multimodal fusion	• Integrates diverse signal types. • Achieves robust predictions across domains. • Excellent generalization.	• Complex to train and optimize. • Requires synchronized multimodal datasets.	Multiple: Combines image, speech, and physiological signals (EEG/ECG).	Very High: Complex data preprocessing and modeling.	• Comprehensive emotion recognition using multimodal data. • Personalized emotional intervention plans for ADHD children. • Real-time mental health monitoring.

SVM models are known for their low computational complexity and stability, making them particularly suitable for small datasets and simple signal processing,⁷⁹ such as HRV and emotional text classification. These models are commonly applied in early emotion screening (e.g., binary classification of happiness and sadness) and autism screening on the basis of EEG patterns. However, their ability to handle high-dimensional and nonlinear data is limited. CNN models excel in processing spatial data (e.g., images and videos) because of their deep structures, which effectively extract hierarchical visual features. They are widely used in facial expression analysis, eye-tracking, and video-based emotion dynamics studies.⁶² For example, CNNs have been applied in emotion recognition and analyzing the micro-expressions of autistic children, as well as in attention assessments based on eye-tracking data.⁸⁰ However, these models require extensive labeled data and computational resources, which may restrict their use in resource-constrained environments. LSTM models are particularly effective for handling time series data because of their ability to capture temporal dependencies. They are suitable for analyzing speech and physiological signals⁸¹ and are often used for modeling temporal patterns to support emotion detection and behavioral pattern analyses, such as identifying emotional fluctuations (e.g., frustration or stress) from speech intonation or EEG signals.⁸² Despite their excellent performance in sequence modeling, LSTM models face challenges such as high dependency on data quality and consistency, along with significant training costs. 
GAN models have unique advantages in data generation, particularly for addressing data imbalance issues.⁸³ For example, GANs can generate synthetic samples of rare emotions (e.g., anxiety or fear) to increase the generalizability of emotion recognition models.³³ While they are effective at increasing data diversity and improving model robustness, GANs face challenges such as instability during training and high computational resource requirements. Multimodal fusion models focus on integrating multiple signal data types, such as facial expressions, speech, and physiological signals, to support emotion and attention recognition.⁸⁴ These models are applied to integrate multimodal data sources (e.g., speech, video, and physiological signals) for emotion recognition and mental health-related analyses, and have been explored in contexts such as autism-related emotion recognition.⁸⁵ Despite their advantages in terms of performance and accuracy, multimodal fusion models rely heavily on synchronized data processing and require substantial computational resources, limiting their scalability in real-world scenarios. These application scenarios represent commonly explored directions in the literature, though the specific implementations may vary across studies.
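One common way to combine modalities is late (decision-level) fusion, in which each modality produces its own class probabilities and the results are combined by a weighted average. The sketch below is a minimal illustration of that idea, not the method of any study cited here. The function name `late_fusion`, the emotion labels, the weights, and all probability values are hypothetical.

```python
from typing import Dict


def late_fusion(
    modality_probs: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted average of per-modality class probabilities.

    Modalities absent from `modality_probs` (e.g., a dropped sensor)
    are skipped, and the remaining weights are renormalized.
    """
    available = [m for m in modality_probs if weights.get(m, 0.0) > 0.0]
    if not available:
        raise ValueError("no usable modality")
    total_weight = sum(weights[m] for m in available)
    labels = list(modality_probs[available[0]].keys())
    fused = {}
    for label in labels:
        weighted_sum = sum(weights[m] * modality_probs[m][label] for m in available)
        fused[label] = weighted_sum / total_weight
    return fused


# Hypothetical outputs of three single-modality classifiers.
probs = {
    "face": {"calm": 0.6, "frustrated": 0.4},
    "speech": {"calm": 0.3, "frustrated": 0.7},
    "eeg": {"calm": 0.2, "frustrated": 0.8},
}
weights = {"face": 0.5, "speech": 0.3, "eeg": 0.2}  # assumed, not tuned

fused = late_fusion(probs, weights)
predicted = max(fused, key=fused.get)
```

Because each modality is classified separately, this form of fusion can still produce an estimate when one stream is missing. Feature-level fusion, by contrast, generally needs the synchronized data discussed above.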

This narrative review has several limitations. First, the literature selection was not exhaustive and may be subject to selection bias. Second, the heterogeneity of included studies limits direct comparisons across methods. Third, most reported findings are derived from controlled experimental settings, which may restrict generalizability to real-world clinical environments.

2.1. Developmental progress of AI-assisted diagnosis for children’s mental health

Over the past decade, AI technology has made significant advancements in diagnosing children’s mental health. As shown in Table 2, the progression of research from 2012 to 2025 highlights a clear trajectory, moving from single-method to multimethod approaches and from static to dynamic techniques. This section reviews key research developments in this field, laying the foundation for the subsequent technical discussions.

Early studies focused primarily on applying traditional machine learning methods. Wall et al. (2012) pioneered the use of the Weka analysis tool for analyzing Autism Diagnostic Observation Schedule (ADOS) data, achieving a diagnostic accuracy of 99.7%. This work opened new avenues for automated diagnosis.⁶⁷ While this groundbreaking study reported exceptional accuracy, its primary value lies in reducing diagnostic timelines and enabling early intervention. Duda et al. (2015) subsequently introduced an observation-based classifier (OBC), which achieved an accuracy of 97%.⁴ Crippa et al. (2015) achieved 96.7% recognition accuracy through motion feature analysis,⁶⁸ highlighting the advent of a new paradigm in multimodal feature analysis.

Researchers focused on the integrated analysis of multisource data between 2016 and 2018. Liu et al. (2016) introduced eye-tracking technology into the diagnostic process, improving diagnostic objectivity despite a reduction in accuracy to 88.51%.⁶⁹ Anzulewicz et al. (2016) achieved a 93% accuracy rate by analyzing hand motion patterns in tablet-based games, highlighting the value of behavioral data in the diagnosis.⁷⁰ Levy et al. (2017) further refined this approach by applying machine learning to identify stable subsets of behavioral features for automated ASD detection, achieving 93% accuracy.⁷² This phase of research was characterized by the diversification of data sources and the naturalization of data collection methods, laying a solid foundation for the application of deep learning technologies.

After 2019, deep learning techniques became a dominant focus in research.⁷⁵ Achenie et al. (2019) developed a feedforward neural network model based on the M-CHAT-R scale, achieving an impressive accuracy of 99.92% and significantly improving the efficiency of automated assessments.²⁶ Chen et al. (2020) integrated gaze fixation and facial expression analysis into a multimodal diagnostic framework, achieving high accuracy while optimizing computational efficiency.⁷⁴

Recent studies (2023–2025) have described higher levels of technical maturity. Alkahtani et al. (2023) reported, for the first time, 92% classification accuracy under specific experimental conditions on the ASQ-10 scale but noted potential limitations in cross-cultural applications.⁷⁶ Alzakari et al. (2025) reported 95% accuracy and emphasized the importance of validation across geographic and cultural diversity, suggesting a path for future research.⁷⁷

This development trajectory reflects the continuous advancements in AI-assisted assessment and screening technologies in terms of accuracy, practicality, and universality. However, current studies still face challenges such as limited sample sizes and insufficient cultural adaptability. As datasets grow and algorithms improve, these technologies may contribute to clinical practice, provided that further validation becomes available.

Based on the synthesis of current research and the high accuracy rates (frequently exceeding 90% in controlled experimental settings) presented in Table 2, one of the main potential applications of AI is large-scale preliminary screening. While these accuracy rates are promising, they should be interpreted with caution, as they are predominantly derived from controlled experimental settings and may not fully generalize to real-world clinical complexity, as noted in the limitations above. Nonetheless, they provide a quantitative basis for evaluating the potential of AI as a supplementary screening tool. This capability may support the identification of children at potential risk within the general population, particularly in preliminary screening contexts, and may help inform subsequent referral decisions, although its effectiveness in real-world clinical workflows remains to be validated. Furthermore, AI provides quantitatively derived indicators, which may be influenced by data quality and model design. Beyond the initial assessment, AI-driven identification extends to long-term monitoring of progress. By maintaining continuous data tracking, these systems may enable the observation of longitudinal trends in children’s emotional and attentional behaviors following therapeutic interventions, supporting a more dynamic and personalized management approach.
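The caution about reading experimental accuracy as screening performance can be made concrete with Bayes' rule: the share of positive results that are true positives (the positive predictive value) depends on how common the condition is in the screened group. The sketch below is illustrative only. The sensitivity, specificity, and prevalence values are assumptions chosen for the example and are not taken from Table 2 or from any cited study.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Probability that a positive screening result is a true positive."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Assumed: a classifier with 95% sensitivity and 95% specificity.
# Balanced experimental sample (half of the participants have the condition).
balanced = positive_predictive_value(0.95, 0.95, prevalence=0.50)

# General-population screening with an assumed prevalence of 2%.
screening = positive_predictive_value(0.95, 0.95, prevalence=0.02)

print(f"balanced sample PPV: {balanced:.2f}")
print(f"population screening PPV: {screening:.2f}")
```

Under these assumed numbers, the same classifier yields far more false positives per true positive in a population screen than in a balanced sample. This is one reason a positive result would feed a referral decision rather than stand as a diagnosis.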

2.2. Facial expression-based applications in children’s emotion recognition

Facial expression recognition (FER) serves as a commonly used approach for deriving quantitative behavioral indicators, supporting the automated analysis of affective states in children. By translating subtle facial cues into quantifiable data, FER may assist clinicians in identifying patterns of emotional dysregulation that might be overlooked during traditional subjective observations.

Facial expressions provide a direct insight into emotional expression, particularly in children, whose facial changes are highly dynamic and illustrative. From smiles to frowns, these subtle facial movements may reflect aspects of emotional states. Traditionally, the interpretation of these expressions has relied heavily on the subjective judgment of parents, teachers, and other caregivers. However, this approach is prone to bias and lacks the ability to provide continuous and comprehensive monitoring. With the advancement of AI technologies, particularly the application of deep learning algorithms, facial expression recognition has gradually transitioned from subjective interpretation to more structured and automated processing.⁴² This technological progress supports the use of AI systems to capture children’s facial expressions in real time and analyze patterns of facial muscle movement. Thus, these systems can be used to classify different emotional states, such as happiness, sadness, anger, and surprise.⁸⁶

Facial expression recognition techniques are currently categorized into two main types: traditional methods and neural network-based methods. Traditional methods typically rely on handcrafted features, such as image processing techniques and pattern recognition. These approaches rely on predefined features established by experts and use extracted facial features to recognize expressions.^78,87 While these methods can achieve good accuracy in specific scenarios and have advantages when working with smaller datasets, their primary limitation lies in their lack of generalizability. They struggle to handle variations in facial expressions across different contexts and environments.^42,87,88 In contrast, neural network-based methods, particularly CNNs, exhibit robust self-learning capabilities,^41,42,88 enabling the recognition of psychological disorders such as ASD.^39,40,89 Neural networks can automatically learn and extract features from large datasets, making them significantly more powerful in analyzing complex and diverse emotional expressions. This advantage becomes especially pronounced with larger datasets.^39,88 Additionally, CNNs are often combined with the facial action coding system (FACS) to perform more precise analyses of facial expressions. FACS is a technique that involves decomposing facial movements into distinct “action units,” which AI systems use to identify specific emotions.⁴¹ Thus, FACS serves as a fundamental tool in the field of emotion recognition. Furthermore, generative adversarial networks (GANs) are increasingly being applied to facial emotion recognition.^90,91 GANs can generate images of facial expressions corresponding to different emotional states, thereby providing more diverse data for training emotion recognition models.^83,92 Overall, these advanced models further enhance feature representation and improve recognition performance, particularly when large-scale datasets are available.
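To illustrate how action units can be linked to emotion labels, the sketch below matches a set of detected action units against a small table of prototype combinations. It is a simplified, rule-based illustration, not the approach of the cited systems, which typically learn this mapping from data. The prototype table uses commonly cited combinations in reduced form, and the function name `match_emotion` and the coverage threshold are hypothetical.

```python
from typing import Set

# Simplified, commonly cited action unit (AU) prototypes; assumed for illustration.
# AU6 cheek raiser, AU12 lip corner puller, AU1/AU2 brow raisers,
# AU4 brow lowerer, AU5 upper lid raiser, AU7 lid tightener,
# AU15 lip corner depressor, AU23 lip tightener, AU26 jaw drop.
PROTOTYPES = {
    "happiness": {6, 12},
    "sadness": {1, 4, 15},
    "surprise": {1, 2, 5, 26},
    "anger": {4, 5, 7, 23},
}


def match_emotion(active_aus: Set[int], min_coverage: float = 0.75) -> str:
    """Return the emotion whose prototype is best covered by the active AUs.

    Coverage is the fraction of a prototype's AUs that are active.
    If no prototype reaches `min_coverage`, return "undetermined".
    """
    best_label, best_score = "undetermined", 0.0
    for label, prototype in PROTOTYPES.items():
        coverage = len(prototype & active_aus) / len(prototype)
        if coverage > best_score:
            best_label, best_score = label, coverage
    return best_label if best_score >= min_coverage else "undetermined"


smile = match_emotion({6, 12, 25})  # AU25 (lips part) is not in any prototype
ambiguous = match_emotion({4})      # a lowered brow alone is not decisive
```

A fixed table like this ignores AU intensity, timing, and individual or cultural differences in expression, which is part of why learned models are preferred when enough data are available.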

Despite the promising potential of facial expression recognition technology in practical applications, several challenges remain. One major issue is the variability in human expressions. Owing to differences in individual facial muscle structures, there is significant variation in expression across individuals. Moreover, expressions are influenced by cultural, social, and biological factors, making accurate emotion recognition even more complex.⁴² Another issue is that most existing emotion recognition datasets lack authenticity. Many datasets are derived from laboratory settings or artificially synthesized scenarios,^93,94 and thus they may not accurately reflect emotional expressions in real-life contexts. Therefore, future studies should focus on developing more realistic datasets and improving algorithms to address real-world variability and uncertainties. The effectiveness of facial expression recognition systems is highly dependent on data quality and the robustness of feature representation.⁴² Within a typical recognition pipeline, the image preprocessing and feature extraction stages play critical roles, as they determine the system’s sensitivity to subtle emotional changes and directly affect the accuracy of subsequent classification. The choice of classifier is likewise central to the effectiveness of emotion recognition.

Looking ahead, FER holds potential for deployment across diverse domains—from mental health management and educational assessments to continuous emotional monitoring in rehabilitation contexts.^38,43,86 Nevertheless, its clinical utility as an objective tool for obtaining behavioral evidence currently remains constrained by the limited ecological validity of laboratory-derived datasets and the substantial cross-cultural and developmental variability in facial expressions, underscoring the need for more naturalistic, pediatric-specific training data before broader real-world deployment can be realized.

2.3. Speech-based applications in children’s emotion analysis

Speech and language analyses primarily function as a vehicle for developmental screening and behavioral quantification. By extracting acoustic features and linguistic patterns, these techniques offer reproducible metrics to detect early markers of neurodevelopmental conditions, such as ASD or language delays, which are essential for early-stage screening protocols.

Speech is one of the key pathways for emotional expression. In children, vocal features such as pitch, volume, and speech rate change with emotional state. These changes are often more subtle and concealed than facial expressions are, making it difficult for untrained listeners to accurately discern a child’s emotional state. The application of AI technologies in speech analysis focuses primarily on two areas: monitoring children’s language development and recognizing emotions. First, AI systems have displayed significant potential in tracking language development, particularly in predicting language acquisition delays and language disorders.^35,95 AI technologies have also been widely used for the early diagnosis of ASD.^32,78 Researchers extract features from speech behaviors collected via smart devices in everyday home environments to assess children’s language development and aid in determining a diagnosis.⁷³ These studies rely on rich sources of speech data gathered through various methods, including online games, social media posts, and recordings of infant cries.⁹⁶ Among these, the analysis of infant cries has gained particular attention. Through applications or professional recording equipment, AI systems can examine infant cries in detail to distinguish between different needs and emotional states, such as identifying whether the infant is in distress.^96,97 These technologies not only help parents or caregivers better understand an infant’s needs but also predict certain developmental abnormalities.

Speech emotion recognition technology relies on the ability of AI systems to extract and analyze vocal features such as pitch, rhythm, volume, and tone in detail. Among common techniques, LSTM networks, a type of recurrent neural network (RNN) specifically designed for sequential data, are widely used.⁹⁸ One of the core methods for extracting speech features is the Mel-frequency cepstral coefficient (MFCC),⁹⁹ which approximates the human ear’s perception of sound, converting speech signals into cepstral coefficients that effectively convey emotional information. By analyzing these extracted features, machine learning models such as SVMs⁷ and random forests⁸ can classify speaker emotions, making them well-suited for early research in speech emotion recognition. With advancements in deep learning, methods such as deep belief networks (DBNs)¹⁰⁰ and autoencoders¹⁰¹ have been widely adopted for extracting high-level speech features. These methods are particularly powerful when handling large volumes of speech data. Another significant technology is the emotional acoustic model, which specifically targets modeling emotional acoustic features. By analyzing aspects such as pitch, speed, and tone in speech, AI systems can accurately infer the speaker’s emotional state.¹⁰² In addition to these techniques, automated Language Environment Analysis (LENA) systems play a critical role in monitoring infants’ language development.^103–105 LENA systems assess infant vocal behaviors, track language development progress, and provide essential data for early intervention. The integration of these technologies not only enhances the accuracy of speech emotion recognition but also enables AI systems to identify children’s emotional states, including joy, anxiety, anger, and fatigue, providing valuable data for screening mental health and monitoring emotional development.^37,43
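The perceptual basis of MFCCs comes from the mel scale, which spaces frequency bands more finely at low frequencies than at high ones. A full MFCC pipeline typically also involves framing, a Fourier transform, a mel filterbank, a logarithm, and a discrete cosine transform. The sketch below covers only the mel-scale step. It uses one widely used form of the Hz-to-mel formula, and the function names, filter count, and frequency range are assumptions for illustration.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mels (one widely used variant of the formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Assumed settings: 10 filters covering 0 to 8000 Hz.
centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0)
for center in centers:
    print(f"{center:8.1f} Hz")
```

The printed centers are spaced closely at the low end and widely at the high end, reflecting the finer resolution of human hearing at lower frequencies.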

Currently, AI-driven speech analysis has achieved remarkable results in areas such as language delays, dyslexia, and ASD,^78,87 with high levels of accuracy. In particular, AI technology has been able to distinguish different needs by analyzing subtle variations in infant cries, providing caregivers with real-time assistance. However, the diversity of speech features presents challenges for this technology. Factors such as regional differences in languages and the influence of family environments can cause variations in speech patterns, imposing greater demands on the cross-linguistic applicability of AI technologies. While some studies have demonstrated the feasibility of AI technologies in cross-linguistic applications for languages such as German and Spanish,¹⁰⁶ validation is needed to determine whether these technologies can be effectively extended to more languages and diverse cultural contexts.

Overall, the application of AI technology in speech analysis not only demonstrates its potential in monitoring language development and supporting the identification of patterns associated with ASD but also highlights its value in analyzing emotions. By analyzing vocal features in detail, AI systems may detect patterns not easily perceived by untrained observers, which is highly important for enhancing the monitoring of children’s emotional development and enabling early interventions.¹⁴ While its promise as a developmental screening instrument is well-supported, the transition to scalable clinical tools depends on resolving persistent challenges in cross-linguistic generalizability, speaker variability, and the ecological validity of data collected outside controlled recording environments.

2.4. Physiological signal-based applications in children’s emotion monitoring

Physiological sensing (e.g., PPG, GSR, and EEG) is specifically positioned for longitudinal monitoring and the estimation of latent internal states. Unlike overt behavioral cues, physiological signals provide a continuous stream of data that reflects autonomic nervous system responses, allowing the tracking of cumulative stress and emotional fluctuations over extended periods.

Currently, in the research and application of children’s emotion management, physiological signals are also regarded as key indicators reflecting emotions, in addition to facial expression and speech analysis. These physiological signals provide valuable emotional information, as they are closely linked to the autonomic nervous system’s responses. By measuring various physiological signals, such as electrocardiograms (ECGs), electroencephalograms (EEGs), galvanic skin responses (GSRs), and respiration rates, we can indirectly assess children’s emotional states.^{24,25,45,48,78} For example, when children feel anxious, stressed, or tense, their heart rate often increases significantly, and their GSR intensifies, providing clear indicators of emotional fluctuations. Traditionally, measuring these physiological signals required professional clinical equipment and settings. However, with rapid technological advancements, particularly the emergence of wearable devices and AI technologies, monitoring and analyzing these data is no longer limited to clinical environments but can now be integrated into everyday life.

AI technologies have shown significant potential in analyzing physiological signals to assess patterns associated with psychological disorders in children, such as ASD, ADHD, anxiety, and depression. Physiological signals, including HRV, GSR, EEG, ECG, and respiration rates, reflect children’s emotional and psychological states, revealing potential mental health issues.^{45,48–50,78} AI systems leverage various techniques to process these data and identify emotional and psychological abnormalities in children. Traditional machine learning algorithms, such as SVMs⁷ and k-nearest neighbors (KNNs),¹⁰⁷ are often employed for the classification and analysis of physiological data, aiding in the identification of changes in emotional states, such as anxiety or depression. With the advancement of deep learning technologies, more sophisticated models, such as CNNs and RNNs, have been widely adopted to process temporal data from physiological signals. These models can automatically learn key features, with some studies reporting improved classification performance under specific conditions.^51,108 Moreover, multimodal fusion techniques integrate data from various physiological signals for a comprehensive analysis, which may support a more comprehensive evaluation of children’s mental health.^109,110 As AI systems become increasingly adaptive, they can construct personalized models on the basis of each child’s unique characteristics and long-term physiological responses. This capability allows for more precise predictions of psychological disorders and emotional recognition, offering robust support for early intervention and personalized treatment.
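Before a classifier such as an SVM or KNN can be applied, raw physiological recordings are usually reduced to summary features. For HRV, two standard time-domain features are SDNN (the standard deviation of inter-beat intervals) and RMSSD (the root mean square of successive differences). The sketch below computes both from a short list of inter-beat intervals. The interval values are invented for illustration, and real pipelines would first remove artifacts and ectopic beats.

```python
import math
from typing import List


def sdnn(rr_ms: List[float]) -> float:
    """Sample standard deviation of inter-beat (RR) intervals, in ms."""
    mean = sum(rr_ms) / len(rr_ms)
    variance = sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)


def rmssd(rr_ms: List[float]) -> float:
    """Root mean square of successive RR-interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds (not real recordings).
rr_intervals = [800.0, 810.0, 790.0, 805.0, 795.0]

# A feature vector like this could be passed to a downstream classifier.
features = [sdnn(rr_intervals), rmssd(rr_intervals)]
print(f"SDNN = {features[0]:.2f} ms, RMSSD = {features[1]:.2f} ms")
```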

The integration of AI technologies with wearable devices has led to revolutionary advancements in children’s emotion monitoring. These smart devices can continuously track children’s physiological signals and transmit data in real time to cloud-based AI systems for analysis. When irregularities such as significantly elevated heart rates or accelerated breathing patterns that persist over a specific period are identified, the AI system automatically generates reports and sends alerts to caregivers, enabling them to promptly recognize and address potential anxiety or stress.⁷⁸ Furthermore, AI systems can analyze long-term data to predict future emotional fluctuations, allowing for the proactive development of intervention plans to effectively mitigate the negative impacts of emotional issues.⁴⁶ These technologies not only increase the accuracy of emotion monitoring but also provide valuable data support for long-term emotional management. In summary, monitoring physiological signals is best understood as a longitudinal tracking modality, providing continuous objective data streams that reflect children’s internal emotional states over extended periods in ways that overt behavioral measures cannot. While its integration with wearable devices and AI-driven analysis shows considerable promise, its broader clinical deployment requires larger pediatric-specific datasets, standardized signal processing pipelines, and prospective validation in ecological settings such as classrooms and homes.
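The alerting behavior described above, in which an irregularity must persist for some time before caregivers are notified, can be sketched as a simple episode detector. The example below is a minimal illustration under stated assumptions. The function name, the threshold of 120 bpm, and the 30-second minimum duration are hypothetical and have no clinical standing; a deployed system would more plausibly use a baseline estimated for each child.

```python
from typing import List, Tuple


def sustained_elevation(
    samples: List[Tuple[float, float]],
    threshold_bpm: float,
    min_duration_s: float,
) -> List[Tuple[float, float]]:
    """Find episodes where heart rate stays above a threshold long enough.

    `samples` is a time-ordered list of (timestamp_s, heart_rate_bpm).
    Returns (start_s, end_s) for each run of consecutive above-threshold
    samples spanning at least `min_duration_s`.
    """
    episodes = []
    start = None
    last = None
    for timestamp, heart_rate in samples:
        if heart_rate > threshold_bpm:
            if start is None:
                start = timestamp
            last = timestamp
        else:
            if start is not None and last - start >= min_duration_s:
                episodes.append((start, last))
            start = None
    # Close an episode that is still open at the end of the recording.
    if start is not None and last - start >= min_duration_s:
        episodes.append((start, last))
    return episodes


# Hypothetical wearable readings, one sample every 10 seconds.
rates = [90, 95, 130, 132, 135, 131, 92, 128, 90]
readings = [(10.0 * i, float(rate)) for i, rate in enumerate(rates)]

alerts = sustained_elevation(readings, threshold_bpm=120.0, min_duration_s=30.0)
```

Requiring persistence suppresses alerts from brief spikes, such as the single elevated reading at 70 seconds in this example, at the cost of a short delay before a genuine episode is reported.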

3. Applications of AI systems in identifying children’s attention

3.1. Eye-tracking technology for children’s attention recognition

Eye-tracking technology serves as the primary data acquisition layer for objective attention measurement. This section describes how AI systems transform continuous eye movement streams into quantifiable indicators of children’s visual attention and concentration, starting from raw ocular signals (fixation points, gaze duration, and blink frequency) captured through hardware-based methods such as pupil center-corneal reflection (PCCR).

Eye-tracking technology is an essential tool for studying children’s visual attention, and its integration with AI technology enables more precise data analysis.²⁰ Currently, eye-tracking technology relies primarily on the pupil center-corneal reflection (PCCR) method, an optical technique that uses cameras to capture images of the eyes and determines the point of gaze by analyzing corneal and pupil reflections.²⁰ Through deep learning analysis of these eye movement data, AI systems can accurately track children’s eye movement trajectories, providing insights into their attention focus.^52,55 Additionally, methods such as random forests,^8,53 CNNs,⁵⁶ and VGG-Net¹¹¹ are widely applied in processing eye-tracking images. By analyzing the position and dynamics of the eyes,¹¹² AI systems can precisely predict children’s gaze and focus when they view different images or scenes, aiding researchers in better understanding children’s visual attention patterns.

In eye-tracking image processing, random forest is a commonly used machine learning method. By aggregating decisions from multiple decision trees, this technique effectively handles multidimensional features from eye-tracking data, such as eye position, movement speed, and changes in gaze points. The random forest model not only accurately predicts children’s gaze focus when they view different scenes but also assists researchers in analyzing their visual preferences and attention patterns, providing a stable and reliable approach for attention analysis. CNNs, including deeper variants such as VGG-Net, play critical roles in the deep analysis of eye-tracking images. CNNs are typically used to extract spatial features from images, such as eye movement direction and gaze point location. VGG-Net, a well-established deep CNN architecture, can be applied to explore detailed features in high-resolution images. Through these models, AI systems can precisely predict children’s visual focus on different stimuli and analyze their attention distribution patterns, offering a more in-depth evaluation of attention. By integrating deep learning models (e.g., CNN and VGG-Net), machine learning methods (e.g., random forest), high-precision hardware devices, and advanced algorithms, AI technology has significantly enhanced the application of eye tracking in assessing attention and social skills.

Eye-tracking technology measures children’s attention levels by capturing and analyzing their eye movements. High-precision cameras record eye movement trajectories, whereas specialized algorithms process patterns such as the gaze duration, blink frequency, and changes in fixation points.¹¹³ This technology provides an objective and accurate method for identifying and managing children’s attention, aiding in improving their learning efficiency.¹¹³ Moreover, the integration of AI and eye-tracking technologies has extensive applications in analyzing children’s social interaction abilities, particularly in the early detection of ASD.^53,114 By monitoring children’s visual focus, gaze duration, and eye movements, these technologies can identify abnormal behaviors in social interactions.^115,116 This is significant for detecting potential issues in children’s social skills or identifying atypical visual attention patterns.
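Metrics such as gaze duration and changes in fixation points are usually derived by first grouping raw gaze samples into fixations. One classic approach is dispersion-threshold identification (often abbreviated I-DT): a run of samples counts as a fixation if it lasts long enough and stays within a small spatial spread. The sketch below is a simplified version of that idea and is not the specific algorithm used in the cited studies. The units (milliseconds and pixels), thresholds, and sample values are assumptions.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # (time_ms, x_px, y_px)


def dispersion(window: List[Sample]) -> float:
    """Spatial spread of a window: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: List[Sample], max_dispersion: float, min_duration: float
) -> List[Dict[str, float]]:
    """Group time-ordered gaze samples into fixations (simplified I-DT)."""
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Find the shortest window starting at i that spans min_duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break  # not enough remaining data for another fixation
        if dispersion(samples[i : j + 1]) <= max_dispersion:
            # Extend the window while the gaze stays within the spread limit.
            while j + 1 < n and dispersion(samples[i : j + 2]) <= max_dispersion:
                j += 1
            window = samples[i : j + 1]
            fixations.append(
                {
                    "start": window[0][0],
                    "end": window[-1][0],
                    "x": sum(s[1] for s in window) / len(window),
                    "y": sum(s[2] for s in window) / len(window),
                }
            )
            i = j + 1
        else:
            i += 1
    return fixations


# Hypothetical gaze samples: two stable clusters separated by a jump.
gaze = [
    (0, 100, 100), (100, 101, 99), (200, 99, 101), (300, 100, 100), (400, 102, 100),
    (500, 300, 200), (600, 301, 201), (700, 299, 199), (800, 300, 200), (900, 300, 201),
]
fixations = detect_fixations(gaze, max_dispersion=10.0, min_duration=200.0)
durations = [f["end"] - f["start"] for f in fixations]
```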

Specifically, AI technology can be applied to analyze eye-tracking data, revealing that children with ASD often pay less attention to facial features, such as the eyes and mouth, during interactions—one of the common traits of ASD.^{24,45,109,117} Additionally, this technology can be applied to examine children’s eye movement responses to visual stimuli and detect abnormal visual processing patterns, which may also serve as potential indicators of ASD.¹¹⁸ These technologies provide powerful tools for early ASD diagnosis, assisting medical professionals in implementing timely interventions. In addition to ASD, eye-tracking technology has been applied to detect ADHD.^44,56 By analyzing eye movement patterns, researchers can identify issues such as inattention or visual distraction, which are typical symptoms of ADHD.²¹ Furthermore, some studies have integrated facial expressions, 3D body posture, and other information to conduct more in-depth analyses of children’s attention fluctuations during specific visual tasks. This approach provides more detailed data regarding their attention problems.^{57–59,119,120} The combined application of these technologies has significantly enhanced the ability to detect and analyze attention-related issues in children.
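Findings about reduced attention to the eyes and mouth are typically quantified with area-of-interest (AOI) analysis: each fixation is assigned to a predefined screen region, and the share of total fixation time per region is compared across groups. The sketch below shows that calculation in minimal form. The AOI rectangles, fixation coordinates, and durations are invented for illustration and do not come from any cited dataset.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def aoi_dwell_proportions(
    fixations: List[Dict[str, float]], aois: Dict[str, Box]
) -> Dict[str, float]:
    """Share of total fixation time spent in each AOI.

    Each fixation needs "x", "y", and "duration" keys. A fixation is
    credited to the first AOI that contains it; fixations outside every
    AOI are counted under "other".
    """
    totals = {name: 0.0 for name in aois}
    totals["other"] = 0.0
    for fixation in fixations:
        for name, (x_min, y_min, x_max, y_max) in aois.items():
            inside_x = x_min <= fixation["x"] <= x_max
            inside_y = y_min <= fixation["y"] <= y_max
            if inside_x and inside_y:
                totals[name] += fixation["duration"]
                break
        else:
            totals["other"] += fixation["duration"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: total / grand_total for name, total in totals.items()}


# Hypothetical AOIs on a face stimulus and three hypothetical fixations.
face_aois = {
    "eyes": (100.0, 80.0, 300.0, 140.0),
    "mouth": (150.0, 220.0, 250.0, 270.0),
}
observed = [
    {"x": 200.0, "y": 100.0, "duration": 300.0},  # on the eyes
    {"x": 210.0, "y": 240.0, "duration": 200.0},  # on the mouth
    {"x": 50.0, "y": 300.0, "duration": 500.0},   # outside both regions
]
proportions = aoi_dwell_proportions(observed, face_aois)
```

Proportions like these are descriptive measures of where a child looked. Whether a given value is atypical depends on age-matched comparison data and the specific stimulus.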

As the foundational data acquisition layer, eye-tracking technology provides the objective ocular signal streams upon which all subsequent gaze analyses depend. Its current maturity in pediatric research is promising, though consistent accuracy across naturalistic settings with younger or less cooperative children remains an ongoing technical challenge. Once eye-tracking data are accurately captured through these technological modalities, the subsequent challenge lies in interpreting the raw ocular movements through gaze analysis to understand a child’s specific attentional focus.

3.2. Behavioral analysis techniques for monitoring children’s attention

This section examines how AI-driven behavioral analysis techniques leverage computer vision and deep learning to quantify children’s attention through observable non-verbal cues, including body movements, postural dynamics, and fine-grained gestural patterns. By translating these behavioral signals into objective metrics, AI systems may support the monitoring of attentional states across both general educational and clinical contexts.

The application of AI technologies in behavioral analysis is becoming increasingly sophisticated, transitioning from simple movement detection to high-dimensional behavioral understanding. These technologies primarily leverage computer vision and deep learning to capture a wide array of nonverbal cues, including body movements, postural transitions, facial expressions, and micro-gestures.¹¹⁸ Central to this process are behavior recognition models that analyze both spatial features, such as body orientation and joint positions, and temporal dynamics, such as the duration and frequency of specific movements, to infer underlying behavioral states.⁶² Traditional machine learning algorithms, such as random forests⁸ and SVMs,⁷ continue to play a vital role by processing hand-crafted features extracted from video streams to classify fundamental behavioral patterns, including movement velocity and gaze direction.¹²¹ Furthermore, the integration of advanced deep learning architectures has significantly expanded the analytical scope. CNNs are employed to extract complex spatial hierarchies from image frames, effectively identifying patterns in posture and movement. In parallel, RNNs, particularly LSTM networks, are adept at capturing the sequential dependencies of behavior over time, allowing for the dynamic and continuous monitoring of children during various activities.^122,123

Modern AI systems facilitate comprehensive analyses by integrating multimodal data from various sensors to achieve higher precision. By synthesizing facial micro-expressions with postural stability, speech patterns, and even physiological signals, these approaches provide a holistic assessment of a child’s emotional and behavioral trajectory.^54,82 For instance, some advanced frameworks combine a behavioral analysis with eye-tracking technologies, such as electrooculography (EOG), to conduct more detailed analyses of changes in attention. These systems analyze specific eye-tracking metrics, such as gaze movement paths, fixation duration, and blink frequency, to assess visual attention states in real time. This multimodal integration allows behavior recognition models to move beyond binary “attentive vs. inattentive” classifications to a more nuanced quantification. These models can distinguish whether a child is genuinely focused, immersed in play, experiencing cognitive overload, or suffering from physical fatigue.^62,124 This high-precision analysis transforms subtle physical changes into objective, quantitative behavioral indicators, providing a more profound understanding of a child’s mental and cognitive state.^60,117

In practical educational environments, these quantitative metrics can support a responsive feedback system aimed at improving learning efficiency. AI-driven behavioral analysis enables the automated detection of subtle shifts in concentration during classroom activities or social interactions. When the system identifies a decline in a student’s level of focus, as indicated by increased head-tilting, decreased gaze stability, or task-unrelated hand movements, it can provide timely feedback to educators.^113,124 This data-driven insight may inform individualized strategies, such as adjusting instructional methods or task difficulty according to the child’s context. Moreover, these AI-driven methods are noninvasive and capable of long-term monitoring, providing objective data that complement traditional subjective observations. This continuous tracking helps in understanding how a child’s attention fluctuates throughout the day, enabling a more personalized approach to education and mental health support.^63,125,126

While primarily focused on general attention monitoring, these behavioral analysis techniques also provide insights in clinical and neurodevelopmental contexts. For children with developmental disorders such as ASD or ADHD, AI systems can identify specific atypical behavioral patterns that may be difficult for human observers to quantify consistently. For example, individuals with ADHD may display frequent physical restlessness and difficulty maintaining visual attention.^62,127 AI technologies can detect these common symptoms by monitoring attention fluctuations over extended periods, which may improve the diagnostic accuracy of clinical assessments.^48,63,121,128 By identifying differences in facial interactions and behavioral expressions, these systems assist medical professionals in designing personalized intervention plans and early treatment strategies.^58,117 In this capacity, AI serves as an auxiliary tool that complements professional judgment, helping to ground the early detection of psychological disorders, and interventions for them, in objective behavioral data.^60,124

In summary, the integration of traditional machine learning, deep learning models, and multimodal data enables automated, high-precision monitoring of children’s behavior and attention. These technologies can estimate children’s attention levels and flag abnormal behavior patterns, demonstrating practical value in both general education and specialized clinical support. By fostering a better understanding of children’s behavioral performance, these AI-driven approaches may support mental health development and improve learning outcomes. Collectively, behavioral analysis currently functions most reliably as a quantitative monitoring instrument in structured educational and clinical settings, with its translation to uncontrolled naturalistic environments remaining the primary barrier to scalable deployment.

3.3. Gaze direction-based applications in children’s visual attention recognition

Gaze direction estimation operates at the computational inference layer, building upon the raw ocular data captured by eye-tracking hardware. Rather than recording where the eye physically moves, the techniques described in this section reconstruct where a child is directing their attention in three-dimensional space by integrating head pose, pupil geometry, and multimodal sensor fusion, thereby enabling higher-precision assessments of attentional allocation and social engagement patterns.

Head pose-based gaze estimation techniques are applied to analyze the position and angle of a child’s head and infer their gaze direction. AI systems capture head movements through cameras and combine these data with eye-tracking information to accurately determine gaze points.^20,123 Additionally, pupil localization techniques estimate a child’s gaze focus by precisely identifying the pupil’s position and analyzing its movement direction. This approach is often integrated with CNNs or other deep learning algorithms, such as LSTM networks, to achieve more accurate gaze direction estimation.^54,59,128
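The geometric idea behind head pose-based estimation can be illustrated with a small sketch that converts head and eye angles into a gaze ray and intersects it with a screen plane. The coordinate convention (screen in the plane z = 0, child in front of it at z > 0), the additive combination of head and eye angles, and the function names are simplifying assumptions made here for illustration; the reviewed systems typically learn this mapping from data rather than computing it in closed form.

```python
import math
from typing import Optional, Tuple

Vector3 = Tuple[float, float, float]


def gaze_direction(head_yaw_deg: float, head_pitch_deg: float,
                   eye_yaw_deg: float = 0.0, eye_pitch_deg: float = 0.0) -> Vector3:
    """Unit gaze vector from head pose plus eye-in-head rotation.

    Angles are simply added, which is a rough approximation assumed for this sketch.
    Zero yaw and pitch means looking straight at the screen (the -z direction).
    """
    yaw = math.radians(head_yaw_deg + eye_yaw_deg)
    pitch = math.radians(head_pitch_deg + eye_pitch_deg)
    return (math.sin(yaw) * math.cos(pitch),
            math.sin(pitch),
            -math.cos(yaw) * math.cos(pitch))


def gaze_point_on_screen(head_pos: Vector3,
                         direction: Vector3) -> Optional[Tuple[float, float]]:
    """Intersect the gaze ray with the screen plane z = 0.

    Returns None when the child is looking away from (or parallel to) the screen.
    """
    hx, hy, hz = head_pos
    dx, dy, dz = direction
    if dz > -1e-9:
        return None
    t = -hz / dz  # distance along the ray to the plane
    return (hx + t * dx, hy + t * dy)
```

For example, with the head 0.6 m from the screen and turned 45 degrees to one side, the sketch places the gaze point 0.6 m off-center, which shows how strongly small head rotations shift the estimated point of regard.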

In higher-precision applications, 3D gaze estimation technology integrates AI methods with 3D modeling to calculate children’s gaze points by analyzing the three-dimensional spatial relationships between the head and eyes. This approach enables AI systems to more accurately predict gaze direction in three-dimensional environments.¹²⁹ Moreover, multimodal gaze analysis technology combines eye-tracking data, head posture, and facial expressions to comprehensively determine children’s gaze direction.⁷⁴ By integrating data from multiple sensors, this approach provides more accurate visual attention analysis in various environments.¹²⁸ The use of AI technologies for gaze direction detection has significant potential in the identification of psychological disorders. These technologies utilize high-resolution cameras and gaze-tracking algorithms to precisely capture and analyze individuals’ gaze directions, allowing for assessments of their psychological state. Abnormal changes in gaze behavior, such as difficulty maintaining focus, frequent shifts in gaze, or prolonged fixation on irrelevant objects, may be associated with certain psychological conditions, including ADHD^52,53,56,62,123 or ASD.^48,55,61,64

In clinical settings, AI technologies can assist professionals in tracking patients’ gaze patterns and comparing them with symptoms of specific psychological conditions.^52,130 For example, in social interaction scenarios, if a child is expected to maintain eye contact with their conversation partner but the system detects persistent gaze aversion or difficulty sustaining eye contact, this may indicate early signs of ASD. Similarly, for individuals with anxiety disorders, frequent changes in gaze direction or abnormal gaze patterns may prompt professionals to conduct additional psychological evaluations. This technology provides a noninvasive and continuous monitoring method that can identify potential psychological issues at an early stage and support professionals in making diagnoses. By automatically analyzing gaze data, AI systems can deliver critical behavioral indicators in real time, facilitating early intervention and treatment of psychological disorders. These technologies not only offer powerful tools for mental health management but also help parents and educators better understand children’s emotions and behavioral responses, enabling personalized support and guidance.

Despite these advances, the translation of gaze direction estimates from controlled laboratory settings to real-world pediatric environments remains a critical challenge. In unstructured settings such as classrooms or therapy rooms, the accuracy of the estimate decreases substantially due to environmental variability, including inconsistent illumination, the occlusion of facial landmarks, and the wide range of spontaneous head movements characteristic of young children.⁶⁵ Children with ASD or ADHD are particularly prone to rapid, unpredictable head rotations and reduced cooperation during calibration procedures, which introduces systematic noise into both head-pose-based and pupil-localization-based estimation pipelines.⁵⁶ Furthermore, most existing gaze estimation models have been trained on adult datasets or highly constrained child-specific paradigms, limiting their generalizability to the naturalistic, dynamic interaction scenarios most relevant to clinical and educational assessment.⁶⁶

Addressing these limitations requires more ecologically valid training datasets that reflect the behavioral diversity of pediatric populations and the development of robust, calibration-free estimation frameworks suitable for deployment outside of laboratory contexts. As gaze direction estimation matures from a precision instrument in controlled research to a scalable tool in applied settings, its capacity to serve as a reliable computational inference layer, translating raw ocular data into clinically actionable indicators of attentional allocation, will ultimately determine its utility in supporting the early identification of, and intervention for, children with developmental disorders.^56,65 Recent advances in self-supervised learning and cross-domain adaptation may offer promising pathways to mitigate the adult-to-pediatric domain gap, enabling pretraining on large-scale adult datasets followed by fine-tuning on limited pediatric samples. These approaches, alongside federated learning frameworks for privacy-preserving multi-site data aggregation, represent active research directions for advancing the ecological validity of pediatric gaze-based assessments.
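To make the federated learning idea concrete, the aggregation step can be sketched as a sample-size-weighted average of model parameters contributed by each site, in the spirit of federated averaging. The flat parameter lists, the function name, and the two-site example are hypothetical; a real multi-site pediatric deployment would additionally need local training, secure communication, and further privacy safeguards that this sketch does not cover.

```python
from typing import List, Sequence, Tuple


def federated_average(site_updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine per-site model parameters, weighting each site by its sample count.

    Each entry is (parameters, number_of_local_samples). Only parameters are
    shared; the raw recordings are assumed to stay at the contributing site.
    """
    total = sum(n for _, n in site_updates)
    if total <= 0:
        raise ValueError("at least one site must contribute samples")
    dim = len(site_updates[0][0])
    averaged = [0.0] * dim
    for params, n in site_updates:
        if len(params) != dim:
            raise ValueError("all sites must share the same model shape")
        for i, value in enumerate(params):
            averaged[i] += value * n / total
    return averaged
```

With one site holding 100 samples and parameters [1.0, 2.0], and another holding 300 samples and parameters [3.0, 4.0], the aggregate is [2.5, 3.5], so larger cohorts pull the shared model toward their local estimates.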

4. Challenges and future developments of AI technology

4.1. Challenges in emotion and attention recognition for children

Although AI technology has demonstrated significant potential in recognizing children’s emotions and attention, it still faces several critical challenges. The foremost issue is data privacy and security. AI systems require vast amounts of personal data for learning and analysis, including highly sensitive information such as children’s facial expressions, speech, behavioral patterns, and physiological signals.¹³¹ Ensuring the functionality of these systems while maximizing the protection of these data poses a significant challenge. Unauthorized access to or leakage of such data could have severe consequences for children and their families. To address this problem, stringent data protection measures must be implemented, including data encryption, access control, and anonymization, to ensure data security and privacy.
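One of the measures listed above can be illustrated with a brief sketch of keyed pseudonymization, in which direct identifiers are dropped and the child identifier is replaced by an HMAC-SHA256 digest. The field names, the `strip_record` helper, and the example record are hypothetical. Pseudonymization of this kind is weaker than full anonymization, because the remaining signals may still be identifying and anyone holding the key can re-link records, so it should be read as one layer among the encryption and access-control measures mentioned in the text.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical names of fields that directly identify a child.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address"}


def pseudonymize_id(child_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed digest that is stable for a given key."""
    return hmac.new(secret_key, child_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_record(record: Dict[str, object], secret_key: bytes) -> Dict[str, object]:
    """Drop direct identifiers and swap the child ID for its pseudonym."""
    cleaned = {key: value for key, value in record.items()
               if key not in DIRECT_IDENTIFIERS and key != "child_id"}
    cleaned["pseudonym"] = pseudonymize_id(str(record["child_id"]), secret_key)
    return cleaned
```

Using a keyed digest rather than a plain hash means that the pseudonyms cannot be reproduced by someone who merely guesses likely identifiers, provided the key is stored separately from the data.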

Ethical concerns represent another significant challenge. The application of AI technology in recognizing children’s emotions and attention must be approached with caution to avoid the risks of excessive surveillance or data misuse.¹³² Overmonitoring may impose stress on children, affect their sense of autonomy and privacy, and even distort their natural behavior. To mitigate these issues, the design and implementation of AI systems must carefully consider ethical implications, ensuring that the primary goal is to promote children’s healthy development rather than excessive control or commercial gain.¹⁶

Accuracy is another critical challenge that needs to be addressed. The effectiveness and reliability of AI systems in emotion and attention recognition depend directly on their accuracy. If the error rate of an AI system is too high, it could lead to incorrect judgments and interventions, potentially harming children’s learning and mental health.^131,133 Therefore, continuously improving the accuracy of AI technologies, reducing error rates, and ensuring stable performance across various contexts are key objectives for developers.

Table 3 presents a summary of the comparative characteristics of different AI models used in children’s emotion and attention recognition, including their advantages, disadvantages, applicable data types, computational complexity, and typical application scenarios. Comparative studies evaluating various classification models have been increasingly reported in the recent literature,^25,62,79–81 while other works have focused on benchmarking performance across datasets and feature representations.^33,82–85 Owing to their low computational complexity and stability, SVMs are particularly suitable for small datasets and simple signal processing tasks, such as HRV analysis and emotional text classification. These models are commonly applied in early emotion screening (e.g., binary classification of happiness and sadness) and ASD screening on the basis of EEG patterns. However, these methods are limited in handling high-dimensional and nonlinear data. CNN models excel at processing spatial data (e.g., images and videos), and their deep structures can effectively extract hierarchical visual features. Thus, these models are widely applied in facial expression analysis, eye tracking, and dynamic video emotion studies. LSTM networks, known for their ability to process time series data, are well suited to analyzing speech and physiological signals because they capture temporal dependencies. GAN models have unique advantages in data generation, particularly in addressing data imbalance issues, making them valuable in creating diverse and balanced datasets. Multimodal fusion models, which focus on integrating multiple types of signal data, effectively combine facial expressions, speech, and physiological signals to provide comprehensive solutions for emotion and attention recognition.
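As a simple illustration of the multimodal fusion idea, the following sketch performs decision-level (late) fusion by taking a weighted average of class probabilities produced by separate per-modality models. This is only one of several fusion strategies and is not claimed to be the approach of any specific study summarized in Table 3; the modality names, class labels, and weights are hypothetical.

```python
from typing import Dict, Optional


def late_fusion(modality_probs: Dict[str, Dict[str, float]],
                weights: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Weighted average of per-modality class probabilities.

    Modalities without an explicit weight default to 1.0, and a modality that
    is unavailable for a given recording can simply be left out of the input.
    """
    if not modality_probs:
        raise ValueError("at least one modality is required")
    weights = weights or {}
    labels = sorted({label for probs in modality_probs.values() for label in probs})
    fused = {label: 0.0 for label in labels}
    total_weight = 0.0
    for name, probs in modality_probs.items():
        w = weights.get(name, 1.0)
        total_weight += w
        for label in labels:
            fused[label] += w * probs.get(label, 0.0)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {label: value / total_weight for label, value in fused.items()}
```

For instance, if a facial model assigns 0.8 to "attentive" and a speech model assigns 0.6, equal weighting yields 0.7, whereas weighting the facial model three times as heavily yields 0.75. Feature-level fusion, in which signals are combined before classification, is an alternative design that this sketch does not cover.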

4.2. Future development and prospects of AI technology in mental health identification

In the future, AI technology is anticipated to play an increasing role in recognizing children’s emotions and attention. Its deeper integration with other emerging technologies is expected to extend its functionality and applicability, and multidisciplinary integration is likely to become a key development trend.³

New deep learning algorithms may have exploratory relevance for detecting children’s emotions and attention levels and supporting screening for potential psychological disorders, although such applications require rigorous validation in pediatric populations. Generative AI and multimodal learning models are among the technologies contributing to these developments. Models such as GPT-4¹³⁴ and Llama 2¹³⁵ have shown general multimodal and language-processing capabilities in non-pediatric or task-specific contexts. To date, the pediatric-specific validation of these general-purpose large language models remains limited; rigorous clinical evaluation in pediatric emotion and attention assessments is still forthcoming, and their direct deployment in child-focused clinical workflows should be regarded as exploratory. In exploratory ADHD-related research, LLM-integrated robotic platforms may generate interaction-based behavioral signals (e.g., attention patterns and emotional responses), although such approaches remain hypothetical for deployment in pediatric populations.¹³⁶ Similarly, graph neural networks (GNNs) and wearable-integrated deep learning systems have been proposed for modeling behavioral and physiological data, but their application in pediatric psychological assessments remains largely unvalidated.

AI recognition technology can be integrated with virtual reality (VR) and augmented reality (AR) technologies.^62,113,137 This integration has been associated not only with enhanced learning experiences but also with potential improvements in user engagement, although the evidence for therapeutic outcomes remains limited. For example, AI systems may be used to estimate users’ learning progress, attention levels, or emotional states, allowing the difficulty or content of virtual courses to be adjusted accordingly. AR technology can enrich abstract learning materials, for example by visualizing mathematical concepts or simulating historical scenes, to create immersive experiences for students. In language learning, the combination of AI and AR technologies can create realistic conversational environments, allowing users to practice dialogs in simulated everyday scenarios, which may contribute to improved learning engagement.¹³⁸

Similarly, in psychotherapy, these technologies have been explored for applications in mental health, rehabilitation, and behavioral therapy. These approaches show potential in addressing emotional conditions such as anxiety, stress, and depression, although the evidence remains heterogeneous. For example, AI systems may be used to monitor physiological responses (e.g., HRV, eye movements, or GSR) during immersive therapy sessions, enabling adaptive adjustments to intervention content based on user states. AR/VR technologies can create simulated environments that allow users to engage in structured interaction scenarios. A previous study has demonstrated the use of VR systems to support social interaction training in children with autism.¹³⁹ More broadly, these environments may be extended to other situations, although these applications remain less well established. In addition, VR-based interventions have been associated with potential improvements in emotional well-being in populations with anxiety and depression, although the current findings are preliminary and vary across study designs.¹⁴⁰

By analyzing patients’ emotional expressions and linguistic features, AI systems may support therapists in developing more tailored psychological intervention strategies. This technological integration opens new possibilities and may foster innovation across domains such as education and healthcare. As the accuracy of AI-based recognition improves alongside advancements in VR/AR technologies, the potential applications of these systems may continue to expand.

In addition, the establishment of policies and regulations will play a critical role in the future application of AI technologies. As these technologies become widely used in recognizing children’s emotions and attention, it is essential to develop relevant legal and policy frameworks to ensure their safe and lawful use. These policies should include regulations for data privacy protection, ethical guidelines for technology usage, and requirements for accuracy and reliability. Policymakers must work closely with technology developers, educators, and psychology experts to ensure that the application of AI technologies promotes advancements in children’s education and mental health while safeguarding their rights and development from potential adverse impacts.

The application of the Internet of Things (IoT) in recognizing children’s emotions and attention is becoming a key driver of more intelligent and collaborative monitoring systems. With the continuous advancement of IoT devices and the integration of multimodal sensing technologies, future systems may be able to capture children’s physiological, behavioral, and environmental data, such as HRV, GSR, speech features, and EEG signals,¹⁴¹ more completely, providing a broader data foundation for the dynamic monitoring of emotions and attention.

Future IoT systems may leverage advancements in edge computing and AI technologies to support real-time feedback, although their implementation in pediatric contexts remains an open research challenge. The technical feasibility of such architectures has been demonstrated in diverse IoT applications outside the healthcare domain that require near-real-time response, as illustrated in previous studies of safety monitoring and environmental sensing systems.^142,143 The synergy among IoT sensors is expected to evolve from simple unidirectional data transmission to more interactive networks capable of adaptively adjusting parameters based on situational needs. For example, when emotional fluctuations are detected, these systems may generate context-aware feedback, such as guiding a child through simple breathing exercises or adjusting ambient lighting to enhance focus.

Furthermore, with the widespread adoption of 5G and future communication technologies, the connectivity of IoT devices is expected to improve significantly, reducing data transmission latency and supporting near-real-time processing. Moreover, IoT-based data platforms are expected to integrate more efficient encryption technologies to ensure data privacy and security, which may increase parents’ and educational institutions’ confidence in the use of these systems. New IoT sensors, such as wearable devices capable of directly monitoring brainwave activity, may also become available, providing more diverse data sources for emotion and attention recognition. As these technologies continue to evolve, the integration of the IoT and emotional AI technologies will not only focus on detection and feedback but also move toward long-term data insights and predictive analytics, potentially contributing to technological support for children’s health and well-being.

In the future, the role of AI may evolve beyond its current function as an auxiliary “second opinion” tool toward a more integrated clinical support tool, particularly as the evidence and validation continue to develop. By leveraging multimodal feature fusion (Figure 1), current systems provide quantitatively derived indicators that assist clinicians in identifying risks. Future developments may focus on integrating longitudinal behavioral tracking to help clinicians better understand a child’s progress over time. This data-driven approach aims to complement traditional screening methods, helping to bridge the gap in pediatric mental health resources.

5. Conclusions

In conclusion, artificial intelligence has shown promising potential as an auxiliary decision-support tool in the management of children’s emotions and attention. The primary value of AI modalities, ranging from facial and speech analysis to physiological sensing, lies in their ability to provide objective behavioral indicators and facilitate longitudinal monitoring. These technologies do not aim to replace clinical judgment but rather to augment it by offering quantifiable, reproducible metrics that may help mitigate certain limitations associated with the subjectivity of traditional rating scales and intermittent clinical observations.

Despite these advancements, several critical challenges hinder the widespread clinical adoption of AI systems. The field currently grapples with significant dataset heterogeneity and a lack of ecological validity, as many high-accuracy models are trained in controlled laboratory settings that do not generalize to real-world environments like homes or classrooms. Furthermore, concerns regarding cross-cultural generalizability, data privacy, and the absence of large-scale external validation remain substantial barriers that must be addressed to ensure the ethical and robust application of these tools.

In the future, the evolution of AI in pediatric care must shift from mere model scaling to a focus on multimodal integration and clinically interpretable outputs. Future research should prioritize real-world validation and the development of privacy-preserving deployment strategies, such as federated learning. Emerging integrations with technologies such as virtual reality (VR), augmented reality (AR), and Internet of Things (IoT) systems may further support adaptive and context-aware monitoring environments, although these approaches remain largely exploratory. Moreover, the emergence of interactive technologies, including large language models (LLMs) and social robotics, suggests a transition from passive monitoring toward active systems that may inform individualized intervention strategies in research or exploratory settings. Ultimately, establishing a standardized framework for evidence-based AI will likely be important for bridging the gap between technological innovation and sustainable clinical practice.

Footnotes

Acknowledgments

The authors thank the National Science and Technology Council of Taiwan, the National Health Research Institutes, and collaborating institutions for their support and contributions to this study.

ORCID iDs

Yi-Ling Fan

Lun-De Liao

Author contributions

Conceptualization: Yi-Ling Fan and Lun-De Liao. Methodology: Yi-Ling Fan. Investigation: Yi-Ling Fan. Data curation: Yi-Ling Fan and Guan-Lin Wu. Writing – original draft preparation: Yi-Ling Fan and Ying-Ying Tsai. Writing – review and editing: all authors. Supervision: Ching-Han Hsu, Hui-Ju Chen, Fang-Rong Hsu, Hung-Yi Chiou, and Lun-De Liao. All authors have read and agreed to the published version of the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by the National Science and Technology Council of Taiwan under grant numbers 110-2221-E-400-003-MY3, 111-3114-8-400-001, 111-2314-B-075-006, 111-2221-E-035-015, and 111-2218-E-007-019; by the National Health Research Institutes of Taiwan under grant numbers NHRI-EX108-10829EI, NHRI-EX111-11111EI, and NHRI-EX111-11129EI; and by the Ministry of Economic Affairs of Taiwan under grant numbers MOHW 112-0324-01-30-06, MOHW 113-0324-01-30-06, and MOHW 113-0324-01-30-11.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Eccles. The development of children ages 6 to 14. Future Child 1999; 9(2): 30–44, Fall.

2. Noorbakhsh-Sabet, Zand, Zhang, et al. Artificial intelligence transforms the future of health care. The American Journal of Medicine 2019; 132(7): 795–801. https://doi.org/10.1016/j.amjmed.2019.01.017

3. Hoodbhoy, Masroor Jeelani, Aziz, et al. Machine Learning for Child and Adolescent Health: A Systematic Review. Pediatrics 2021; 147(1): e2020011833. https://doi.org/10.1542/peds.2020-011833

4. Duda, Kosmicki, Wall. Testing the accuracy of an observation-based classifier for rapid detection of autism risk. Translational Psychiatry 2014; 4(8): e424. https://doi.org/10.1038/tp.2014.65

5. Abd-Alrazaq, Alhuwail, Schneider, et al. The performance of artificial intelligence-driven technologies in diagnosing mental disorders: an umbrella review. npj Digital Medicine 2022; 5(1): 87. https://doi.org/10.1038/s41746-022-00631-8

6. Mengi, Malhotra. A systematic literature review on traditional to artificial intelligence based socio-behavioral disorders diagnosis in India: Challenges and future perspectives. Applied Soft Computing 2022; 129: 109633. https://doi.org/10.1016/j.asoc.2022.109633

7. Hearst, Dumais, Osuna, et al. Support vector machines. IEEE Intelligent Systems and their Applications 1998; 13(4): 18–28. https://doi.org/10.1109/5254.708428

8. Breiman. Random Forests. Machine Learning 2001; 45(1): 5–32. https://doi.org/10.1023/A:1010933404324

9. Bussu, Jones, Charman, et al. Prediction of autism at 3 years from behavioural and developmental measures in high-risk infants: a longitudinal cross-domain classifier analysis. Journal of Autism and Developmental Disorders 2018; 48(7): 2418–2433. https://doi.org/10.1007/s10803-018-3509-x

10. Tariq, Fleming, Schwartz, et al. Detecting developmental delay and autism through machine learning models using home videos of Bangladeshi children: development and validation study. Journal of Medical Internet Research 2019; 21(4): e13822. https://doi.org/10.2196/13822

11. Mansoor, Ansari. Early Detection of Mental Health Crises through Artificial-Intelligence-Powered Social Media Analysis: A Prospective Observational Study. Journal of Personalized Medicine 2024; 14(9): 958. https://doi.org/10.3390/jpm14090958

12. McGinnis, Anderau, Hruschak, et al. Giving voice to vulnerable children: machine learning analysis of speech detects anxiety and depression in early childhood. IEEE Journal of Biomedical and Health Informatics 2019; 23(6): 2294–2301. https://doi.org/10.1109/JBHI.2019.2913590

13. Song D-Y, Kim, Bong, et al. The use of artificial intelligence in screening and diagnosis of autism spectrum disorder: a literature review. Journal of the Korean Academy of Child and Adolescent Psychiatry 2019; 30(4): 145–152. https://doi.org/10.5765/jkacap.190027

14. Low, Bentley, Ghosh. Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investig Otolaryngol 2020; 5(1): 96–116, (in eng). https://doi.org/10.1002/lio2.354

15. Guo, Wang, et al. Development and application of emotion recognition technology—a systematic literature review. BMC Psychology 2024; 12(1): 95. https://doi.org/10.1186/s40359-024-01581-4

16. Reinhart, Bischops, Kerth, et al. Artificial intelligence in child development monitoring: A systematic review on usage, outcomes and acceptance. Intelligence-Based Medicine 2024; 9: 100134. https://doi.org/10.1016/j.ibmed.2024.100134

17. Figueiredo, Pereira, Frias, et al. Applications of artificial intelligence in emotion recognition in pediatrics health care: Scoping review. Journal of Pediatric Nursing 2025; 85: 593–606. https://doi.org/10.1016/j.pedn.2025.09.017

18. Yan, Ruan, Jiang. Challenges for Artificial Intelligence in Recognizing Mental Disorders. Diagnostics (Basel) 2022; 13(1): 2, (in eng). https://doi.org/10.3390/diagnostics13010002

19. Wang, Liu, DiStefano, et al. Utilizing Deep Learning and Oversampling Methods to Identify Children’s Emotional and Behavioral Risk. Journal of Psychoeducational Assessment 2021; 39(2): 227–241. https://doi.org/10.1177/0734282920951727

20. Chen J-c., P-q., Yao C-y., et al. Eye detection and coarse localization of pupil for video-based eye tracking systems. Expert Systems with Applications 2024; 236: 121316. https://doi.org/10.1016/j.eswa.2023.121316

21. Caldani, Acquaviva, Moscoso, et al. Reading performance in children with ADHD: An eye-tracking study. Annals of Dyslexia 2022; 72(3): 552–565. https://doi.org/10.1007/s11881-022-00269-x

22. Zeman, Cassano, Perry-Parrish, et al. Emotion regulation in children and adolescents. Journal of Developmental & Behavioral Pediatrics 2006; 27(2): 155–168. https://doi.org/10.1097/00004703-200604000-00014

23. Matsushita, Krebs VLJ, Carvalho. Artificial intelligence and machine learning in pediatrics and neonatology healthcare. Rev Assoc Med Bras (1992) 2022; 68(6): 745–750, (in eng). https://doi.org/10.1590/1806-9282.20220177

24. Gedam, Paul. A Review on Mental Stress Detection Using Wearable Sensors and Machine Learning Techniques. IEEE Access 2021; 9: 84045–84066. https://doi.org/10.1109/ACCESS.2021.3085502

25. Wang, Ren, Luo, et al. Deep learning-based EEG emotion recognition: Current trends and future perspectives. Frontiers in Psychology 2023; 14: 1126994, (in eng). https://doi.org/10.3389/fpsyg.2023.1126994

26. Achenie, Scarpa, Factor, et al. A machine learning strategy for autism screening in toddlers. Journal of Developmental & Behavioral Pediatrics 2019; 40(5): 369–376. https://doi.org/10.1097/DBP.0000000000000668

27. Goulart, Valadão, Delisle-Rodriguez, et al. Emotion analysis in children through facial emissivity of infrared thermal imaging. PLoS ONE 2019; 14(3): e0212928. https://doi.org/10.1371/journal.pone.0212928

28. Narayanan, Georgiou. Behavioral signal processing: Deriving human behavioral informatics from speech and language. Proceedings of the IEEE 2013; 101(5): 1203–1233. https://doi.org/10.1109/JPROC.2012.2236291

29. Oranje, Gorin, Jia, et al. Collecting, analyzing, and interpreting response time, eye-tracking, and log data. In: Validation of score meaning for the next generation of assessments. 2017, pp. 39–51.

30. Valenza, Scilingo. Autonomic nervous system dynamics for mood and emotional-state recognition: Significant advances in data acquisition, signal processing and classification (Series in BioEngineering). Springer, 2014.

31. Sun, Yang, et al. Hierarchical semantic image matching using CNN feature pyramid. Computer Vision and Image Understanding 2018; 169: 40–51. https://doi.org/10.1016/j.cviu.2018.01.001

32. Mao, Sejdić. A review of recurrent neural network-based methods in computational physiology. IEEE Transactions on Neural Networks and Learning Systems 2022; 34(10): 6983–7003. https://doi.org/10.1109/tnnls.2022.3145365

33. Younis, Mohsen, Houssein, et al. Machine learning for human emotion recognition: a comprehensive review. Neural Computing and Applications 2024; 36(16): 8901–8947. https://doi.org/10.1007/s00521-024-09426-2

34. AI-assisted emotion recognition: impacts on mental health education and learning motivation. International Journal of Emerging Technologies in Learning (IJET) 2023; 18(24): 34–48. https://doi.org/10.3991/ijet.v18i24.45645

35. Toki, Tatsis, et al. Applying neural networks on biometric datasets for screening speech and language deficiencies in child communication. Mathematics 2023; 11(7): 1643. https://doi.org/10.3390/math11071643

36. Welarathna, Kulasekara, Pulasinghe, et al. Automated Sinhala speech emotions analysis tool for autism children. In: 2021 10th International Conference on Information and Automation for Sustainability (ICIAfS). IEEE, 2021, pp. 500–505.

37. Toki, Tatsis, et al. Employing classification techniques on SmartSpeech biometric data towards identification of neurodevelopmental disorders. Signals 2023; 4(2): 401–420. https://doi.org/10.3390/signals4020021

38. Talaat. Real-time facial emotion recognition system among children with autism based on deep learning and IoT. Neural Computing and Applications 2023; 35(17): 12717–12728. https://doi.org/10.1007/s00521-023-08372-9

39. Akter, Ali, Khan, et al. Improved Transfer-Learning-Based Facial Recognition Framework to Detect Autistic Children at an Early Stage. Brain Sciences 2021; 11(6): 734. https://doi.org/10.3390/brainsci11060734

40. Chen, Guo, et al. Toward children’s empathy ability analysis: Joint facial expression recognition and intensity estimation using label distribution learning. IEEE Transactions on Industrial Informatics 2021; 18(1): 16–25. https://doi.org/10.1109/tii.2021.3075989

41.

Mayya

Pai

. Automatic facial expression recognition using DCNN. Procedia Computer Science 2016; 93: 453–461. https://doi.org/10.1016/j.procs.2016.07.233

42.

Sajjad

Ullah

FUM

Ullah

, et al. A comprehensive survey on deep facial expression recognition: challenges, applications, and future guidelines. Alexandria Engineering Journal 2023; 68: 817–840. https://doi.org/10.1016/j.aej.2023.01.017

43.

Sadeghi

Richer

Egger

, et al. Harnessing multimodal approaches for depression detection using large language models and facial expressions. npj Mental Health Research 2024; 3(1): 66. https://doi.org/10.1038/s44184-024-00112-8

44.

Yoo

Kang

Lim

, et al. Development of an innovative approach using portable eye tracking to assist ADHD screening: a machine learning study. Frontiers in Psychiatry 2024; 15: 1337595. https://doi.org/10.3389/fpsyt.2024.1337595

45.

Krupa

Anantharam

Sanker

, et al. Recognition of emotions in autistic children using physiological signals. Health and technology 2016; 6(2): 137–147. https://doi.org/10.1007/s12553-016-0129-3

46.

Welch

. Physiological signals of autistic children can be useful. IEEE Instrumentation & Measurement Magazine 2012; 15(1): 28–32. https://doi.org/10.1109/mim.2012.6145259

47.

Bairavi

Sundhara

. EEG based emotion recognition system for special children. Proceedings of the 2018 international conference on communication engineering and technology (ICCET '18), New York, NY, USA. Association for Computing Machinery, 2018, pp. 1–4. https://doi.org/10.1145/3194244.3194245

48.

Moghaddari

Lighvan

Danishvar

. Diagnose ADHD disorder in children using convolutional neural network based on continuous mental task EEG. Computer methods and programs in biomedicine 2020; 197: 105738. https://doi.org/10.1016/j.cmpb.2020.105738

49.

Faust

Ang

PCA

Puthankattil

, et al. Depression diagnosis support system based on EEG signal entropies. Journal of mechanics in medicine and biology 2014; 14(03): 1450035. https://doi.org/10.1142/s0219519414500353

50.

Acharya

Sudarshan

Adeli

, et al. A novel depression diagnosis index using nonlinear features in EEG signals. European neurology 2015; 74(1-2): 79–83. https://doi.org/10.1159/000438457

51.

Zahia

Garcia-Zapirain

Saralegui

, et al. Dyslexia detection using 3D convolutional neural networks and functional magnetic resonance imaging. Computer methods and programs in biomedicine 2020; 197: 105726. https://doi.org/10.1016/j.cmpb.2020.105726

52.

Wei

Cao

Shi

, et al. Machine learning based on eye-tracking data to identify Autism Spectrum Disorder: A systematic review and meta-analysis. Journal of Biomedical Informatics 2023; 137: 104254. https://doi.org/10.1016/j.jbi.2022.104254

53.

Meng

, et al. Machine learning-based early diagnosis of autism according to eye movements of real and artificial faces scanning. Frontiers in Neuroscience 2023; 17: 1170951. https://doi.org/10.3389/fnins.2023.1170951

54.

Zhao

Tang

Zhang

, et al. Classification of Children With Autism and Typical Development Using Eye-Tracking Data From Face-to-Face Conversations: Machine Learning Model Development and Performance Evaluation. J Med Internet Res 2021; 23(8): e29328. https://doi.org/10.2196/29328

55.

Lee

Shin

Park

, et al. Use of eye tracking to improve the identification of attention-deficit/hyperactivity disorder in children. Scientific Reports 2023; 13(1): 14469. https://doi.org/10.1038/s41598-023-41654-9

56.

Chen

Wang

Yang

, et al. Utilizing artificial intelligence-based eye tracking technology for screening ADHD symptoms in children. Frontiers in Psychiatry 2023; 14: 1260031. https://doi.org/10.3389/fpsyt.2023.1260031

57.

Prabha

Bhargavi

. Predictive model for dyslexia from fixations and saccadic eye movement events. Computer methods and programs in biomedicine 2020; 195: 105538. https://doi.org/10.1016/j.cmpb.2020.105538

58.

Jothi Prabha

Bhargavi

. Prediction of dyslexia from eye movements using machine learning. IETE Journal of Research 2022; 68(2): 814–823. https://doi.org/10.1080/03772063.2019.1622461

59.

Gomolka

Zeslawska

Czuba

, et al. Diagnosing Dyslexia in Early School-Aged Children Using the LSTM Network and Eye Tracking Technology. Applied Sciences 2024; 14(17): 8004. https://doi.org/10.3390/app14178004

60.

Ghafghazi

Carnett

Neely

, et al. AI-augmented behavior analysis for children with developmental disabilities: building toward precision treatment. IEEE Systems, Man, and Cybernetics Magazine 2021; 7(4): 4–12. https://doi.org/10.1109/msmc.2021.3086989

61.

Fabiano

Canavan

Agazzi

, et al. Gaze-based classification of autism spectrum disorder. Pattern Recognition Letters 2020; 135: 204–212. https://doi.org/10.1016/j.patrec.2020.04.028

62.

Joung

Chung

, et al. Diagnosis of ADHD using virtual reality and artificial intelligence: an exploratory study of clinical applications. Frontiers in Psychiatry 2024; 15: 1383547. https://doi.org/10.3389/fpsyt.2024.1383547

63.

Maniruzzaman

Shin

Hasan

MAM

. Predicting Children with ADHD Using Behavioral Activity: A Machine Learning Analysis. Applied Sciences 2022; 12(5): 2737. https://doi.org/10.3390/app12052737

64.

Banire

Al Thani

Qaraqe

. One size does not fit all: detecting attention in children with autism using machine learning. User Modeling and User-Adapted Interaction 2024; 34(2): 259–291. https://doi.org/10.1007/s11257-023-09371-0

65.

Varghese

Qaraqe

Al-Thani

. Attention Level Evaluation in Children With Autism: Leveraging Head Pose and Gaze Parameters From Videos for Educational Intervention. IEEE Transactions on Learning Technologies 2024; 17: 1737–1753. https://doi.org/10.1109/TLT.2024.3409702

66.

Stokes

Rizzo

Geng

, et al. Measuring Attentional Distraction in Children With ADHD Using Virtual Reality Technology With Eye-Tracking. Front Virtual Real 2022; 3: 855895. https://doi.org/10.3389/frvir.2022.855895

67.

Wall

Kosmicki

Deluca

, et al. Use of machine learning to shorten observation-based screening and diagnosis of autism. Translational psychiatry 2012; 2(4): e100. https://doi.org/10.1038/tp.2012.10

68.

Crippa

Salvatore

Perego

, et al. Use of Machine Learning to Identify Children with Autism and Their Motor Abnormalities. Journal of Autism and Developmental Disorders 2015; 45(7): 2146–2156. https://doi.org/10.1007/s10803-015-2379-8

69.

Liu

. Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework. Autism Res 2016; 9(8): 888–898. https://doi.org/10.1002/aur.1615

70.

Anzulewicz

Sobota

Delafield-Butt

. Toward the autism motor signature: Gesture patterns during smart tablet gameplay identify children with autism. Scientific reports 2016; 6(1): 31107. https://doi.org/10.1038/srep31107

71.

Uluyagmur-Ozturk

Rodopman Arman

Yilmaz

S.S.

, et al. ADHD and ASD classification based on emotion recognition data. 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2016, pp. 810–813.

72.

Levy

Duda

Haber

, et al. Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism. Molecular Autism 2017; 8(1): 65. https://doi.org/10.1186/s13229-017-0180-6

73.

Tariq

Daniels

Schwartz

, et al. Mobile detection of autism through machine learning on home video: A development and prospective validation study. PLoS medicine 2018; 15(11): e1002705. https://doi.org/10.1371/journal.pmed.1002705

74.

Chen

Liao

Wang

, et al. An Intelligent Multimodal Framework for Identifying Children with Autism Spectrum Disorder. Int. J. Appl. Math. Comput. Sci 2020; 30(3): 435–448. https://doi.org/10.34768/amcs-2020-0032

75.

Song

Jiang

, et al. A machine learning-based diagnostic model for children with autism spectrum disorders complicated with intellectual disability. Frontiers in psychiatry 2022; 13: 993077. https://doi.org/10.3389/fpsyt.2022.993077

76.

Alkahtani

Aldhyani

THH

Alzahrani

. Deep Learning Algorithms to Identify Autism Spectrum Disorder in Children-Based Facial Landmarks. Applied Sciences 2023; 13(8): 4855. https://doi.org/10.3390/app13084855

77.

Alzakari

Allinjawi

Aldrees

, et al. Early detection of autism spectrum disorder using explainable AI and optimized teaching strategies. Journal of Neuroscience Methods 2025; 413: 110315. https://doi.org/10.1016/j.jneumeth.2024.110315

78.

Landowska

Karpus

Zawadzka

, et al. Automatic Emotion Recognition in Children with Autism: A Systematic Literature Review. Sensors (Basel) 2022; 22(4): 1649. https://doi.org/10.3390/s22041649

79.

Salcedo Sanz

Rojo Álvarez

Martínez Ramón

, et al. Support vector machines in engineering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2014; 4(3): 234–267. https://doi.org/10.1002/widm.1125

80.

Ruan

Zhang

, et al. Can micro-expressions be used as a biomarker for autism spectrum disorder? Frontiers in Neuroinformatics 2024; 18: 1435091. https://doi.org/10.3389/fninf.2024.1435091

81.

Kanjo

Younis

Ang

. Deep learning analysis of mobile physiological, environmental and location sensor data for emotion detection. Information Fusion 2019; 49: 46–56. https://doi.org/10.1016/j.inffus.2018.09.001

82.

Ghosh

Banna

MHA

Rahman

, et al. Artificial intelligence and internet of things in screening and management of autism spectrum disorder. Sustainable Cities and Society 2021; 74: 103189. https://doi.org/10.1016/j.scs.2021.103189

83.

Douzas

Bacao

. Effective data generation for imbalanced learning using conditional generative adversarial networks. Expert Systems with applications 2018; 91: 464–471. https://doi.org/10.1016/j.eswa.2017.09.030

84.

Zhang

Yin

Chen

, et al. Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review. Information fusion 2020; 59: 103–126. https://doi.org/10.1016/j.inffus.2020.01.011

85.

Banos

Comas-González

Medina

, et al. Sensing technologies and machine learning methods for emotion recognition in autism: Systematic review. International journal of medical informatics 2024; 187: 105469. https://doi.org/10.1016/j.ijmedinf.2024.105469

86.

Gupta

Kumar

Tekchandani

. Facial emotion recognition based real-time learner engagement detection system in online learning context using deep learning models. Multimedia Tools and Applications 2023; 82(8): 11365–11394. https://doi.org/10.1007/s11042-022-13558-9

87.

Canal

Müller

Matias

, et al. A survey on facial emotion recognition techniques: A state-of-the-art literature review. Information Sciences 2022; 582: 593–617. https://doi.org/10.1016/j.ins.2021.10.005

88.

Deng

. Deep Facial Expression Recognition: A Survey. IEEE Transactions on Affective Computing 2022; 13(03): 1195–1215. https://doi.org/10.1109/TAFFC.2020.2981446

89.

Asmetha Jeyarani

Senthilkumar

. Eye Tracking Biomarkers for Autism Spectrum Disorder Detection using Machine Learning and Deep Learning Techniques: Review. Research in Autism Spectrum Disorders 2023; 108: 102228. https://doi.org/10.1016/j.rasd.2023.102228

90.

Goodfellow

Pouget-Abadie

Mirza

, et al. Generative adversarial networks. Commun. ACM 2020; 63(11): 139–144. https://doi.org/10.1145/3422622

91.

Hariharan

Karthic

Nalina

, et al. Hybrid deep convolutional generative adversarial networks (DCGANS) and style generative adversarial network (STYLEGANS) algorithms to improve image quality. 2022 3rd International Conference on Electronics and Sustainable Communication Systems (ICESC). IEEE, 2022, pp. 1182–1186.

92.

Mehralian

Karasfi

. RDCGAN: Unsupervised Representation Learning With Regularized Deep Convolutional Generative Adversarial Networks. 2018 9th Conference on Artificial Intelligence and Robotics and 2nd Asia-Pacific International Symposium, 2018, pp. 31–38. https://doi.org/10.1109/AIAR.2018.8769811

93.

Lucey

Cohn

Kanade

, et al. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops. IEEE, 2010, pp. 94–101.

94.

Bänziger

Scherer

. Using Actor Portrayals to Systematically Study Multimodal Emotion Expression: The GEMEP Corpus. In: Paiva ACR, Prada, Picard (eds). Affective Computing and Intelligent Interaction. Springer Berlin Heidelberg, 2007, pp. 476–487.

95.

Beccaluva

Catania

Arosio

, et al. Predicting developmental language disorders using artificial intelligence and a speech data analysis tool. Human–Computer Interaction 2024; 39(1-2): 8–42. https://doi.org/10.1080/07370024.2023.2242837

96.

Parga

Lewin

Lewis

, et al. Defining and distinguishing infant behavioral states using acoustic cry analysis: is colic painful? Pediatric research 2020; 87(3): 576–580. https://doi.org/10.1038/s41390-019-0592-4

97.

Joshi

Srinivasan

Vincent

PDR

, et al. A multistage heterogeneous stacking ensemble model for augmented infant cry classification. Frontiers in Public Health 2022; 10: 819865. https://doi.org/10.3389/fpubh.2022.819865

98.

Hochreiter

Schmidhuber

. Long short-term memory. Neural computation 1997; 9(8): 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

99.

Abdul

Al-Talabani

. Mel frequency cepstral coefficient and its applications: A review. IEEE Access 2022; 10: 122136–122158. https://doi.org/10.1109/access.2022.3223444

100.

Hua

Guo

Zhao

. Deep belief networks and deep learning. Proceedings of 2015 international conference on intelligent computing and internet of things. IEEE, 2015, pp. 1–4.

101.

Zhai

Zhang

Chen

, et al. Autoencoder and its various variants. 2018 IEEE international conference on systems, man, and cybernetics (SMC). IEEE, 2018, pp. 415–419.

102.

Lian

, et al. A Survey of Deep Learning-Based Multimodal Emotion Recognition: Speech, Text, and Face. Entropy 2023; 25(10): 1440. https://doi.org/10.3390/e25101440

103.

Cristia

Lavechin

Scaff

, et al. A thorough evaluation of the Language Environment Analysis (LENA) system. Behavior research methods 2021; 53(2): 467–486. https://doi.org/10.3758/s13428-020-01393-5

104.

Ganek

Eriks-Brophy

. Language ENvironment analysis (LENA) system investigation of day long recordings in children: A literature review. J Commun Disord 2018; 72: 77–85. https://doi.org/10.1016/j.jcomdis.2017.12.005

105.

Bastianello

Lorenzini

Nazzi

, et al. The Language ENvironment Analysis system (LENA): A validation study with Italian-learning children. Journal of child language 2024; 51(5): 1172–1192. https://doi.org/10.1017/s0305000923000326

106.

Rauschenberger

Baeza-Yates

Rello

. A Universal Screening Tool for Dyslexia by a Web-Game and Machine Learning. Frontiers in Computer Science 2022; 3: 628634. https://doi.org/10.3389/fcomp.2021.628634

107.

Guo

Wang

Bell

, et al. KNN model-based approach in classification. “OTM Confederated International Conferences” On the Move to Meaningful Internet Systems. Springer, 2003, pp. 986–996.

108.

Usman

Muniyandi

. CryptoDL: Predicting dyslexia biomarkers from encrypted neuroimaging dataset using energy-efficient residue number system and deep convolutional neural network. Symmetry 2020; 12(5): 836. https://doi.org/10.3390/sym12050836

109.

Shu

Xie

Yang

, et al. A Review of Emotion Recognition Using Physiological Signals. Sensors (Basel) 2018; 18(7): 2074. https://doi.org/10.3390/s18072074

110.

Hassani

Bafadel

Bekhatro

, et al. Physiological signal-based emotion recognition system. 2017 4th IEEE International Conference on Engineering Technologies and Applied Sciences (ICETAS). IEEE, 2017, pp. 1–5.

111.

Sharma

Giannakos

Dillenbourg

. Eye-tracking and artificial intelligence to enhance motivation and learning. Smart Learning Environments 2020; 7(1): 13. https://doi.org/10.1186/s40561-020-00122-x

112.

Tang

Lin

, et al. An eye detection method based on convolutional neural networks and support vector machines. Intelligent Data Analysis 2018; 22(2): 345–362. https://doi.org/10.3233/ida-173361

113.

Ngoc Anh

Tung Son

Truong Lam

, et al. A computer-vision based application for student behavior monitoring in classroom. Applied Sciences 2019; 9(22): 4729. https://doi.org/10.3390/app9224729

114.

Jenner

Farran

Welham

, et al. The use of eye-tracking technology as a tool to evaluate social cognition in people with an intellectual disability: a systematic review and meta-analysis. Journal of Neurodevelopmental Disorders 2023; 15(1): 42. https://doi.org/10.1186/s11689-023-09506-9

115.

Hosozawa

Tanaka

Shimizu

, et al. How children with specific language impairment view social situations: an eye tracking study. Pediatrics 2012; 129(6): e1453–e1460. https://doi.org/10.1542/peds.2011-2278

116.

Lohan

Sheppard

Little

, et al. Toward improved child–robot interaction by understanding eye movements. IEEE Transactions on Cognitive and Developmental Systems 2018; 10(4): 983–992. https://doi.org/10.1109/tcds.2018.2838342

117.

Gardner

Wacker

Boelter

. An evaluation of the interaction between quality of attention and negative reinforcement with children who display escape‐maintained problem behavior. Journal of Applied Behavior Analysis 2009; 42(2): 343–348. https://doi.org/10.1901/jaba.2009.42-343

118.

Sasson

Elison

. Eye tracking young children with autism. Journal of visualized experiments: JoVE 2012; 61: 3675. https://doi.org/10.3791/3675

119.

Zhang

Kong

Zhao

, et al. Auxiliary diagnostic system for ADHD in children based on AI technology. Frontiers of Information Technology & Electronic Engineering 2021; 22(3): 400–414. https://doi.org/10.1631/fitee.1900729

120.

J-Y

Tsai

Chen

, et al. Digital transformation of mental health therapy by integrating digitalized cognitive behavioral therapy and eye movement desensitization and reprocessing. Medical & Biological Engineering & Computing 2025; 63(2): 339–354. https://doi.org/10.1007/s11517-024-03209-6

121.

Mengi

Malhotra

. Artificial intelligence based techniques for the detection of socio-behavioral disorders: a systematic review. Archives of Computational Methods in Engineering 2022; 29(5): 2811–2855. https://doi.org/10.1007/s11831-021-09682-8

122.

Zhang

Gan

, et al. Human action recognition using convolutional LSTM and fully-connected LSTM with different attentions. Neurocomputing 2020; 410: 304–316. https://doi.org/10.1016/j.neucom.2020.06.032

123.

Ansari

Kasprowski

Obetkal

. Gaze Tracking Using an Unmodified Web Camera and Convolutional Neural Network. Applied Sciences 2021; 11(19): 9068. https://doi.org/10.3390/app11199068

124.

Lin

F-C

Ngo

H-H

Dow

C-R

, et al. Student Behavior Recognition System for the Classroom Environment Based on Skeleton Pose Estimation and Person Detection. Sensors 2021; 21(16): 5314. https://doi.org/10.3390/s21165314

125.

Lai

Chang

Tsai

, et al. Data fusion analysis for attention‐deficit hyperactivity disorder emotion recognition with thermal image and Internet of Things devices. Software: Practice and Experience 2021; 51(3): 595–606. https://doi.org/10.1002/spe.2866

126.

Krishnappa Babu

Di Martino

Aiello

, et al. Validation of a Mobile App for Remote Autism Screening in Toddlers. NEJM AI 2024; 1(10): AIcs2400510. https://doi.org/10.1056/aics2400510

127.

DuPaul

Ervin

. Functional assessment of behaviors related to attention-deficit/hyperactivity disorder: Linking assessment to intervention design. Behavior Therapy 1996; 27(4): 601–622. https://doi.org/10.1016/s0005-7894(96)80046-3

128.

Valenti

Sebe

Gevers

. Combining Head Pose and Eye Location Information for Gaze Estimation. IEEE Transactions on Image Processing 2012; 21(2): 802–815. https://doi.org/10.1109/TIP.2011.2162740

129.

Shoja Ghiass

Arandjelović

Laurendeau

. Highly Accurate and Fully Automatic 3D Head Pose Estimation and Eye Gaze Estimation Using RGB-D Sensors and 3D Morphable Models. Sensors 2018; 18(12): 4280. https://doi.org/10.3390/s18124280

130.

Fang

Duan

Shi

, et al. Identifying children with autism spectrum disorder based on gaze-following. 2020 IEEE International Conference on Image Processing (ICIP). IEEE, 2020, pp. 423–427.

131.

Ramgopal

Heffernan

Bendelow

, et al. Parental Perceptions on Use of Artificial Intelligence in Pediatric Acute Care. Academic Pediatrics 2023; 23(1): 140–147. https://doi.org/10.1016/j.acap.2022.05.006

132.

Deo

. Machine Learning in Medicine. Circulation 2015; 132(20): 1920–1930. https://doi.org/10.1161/circulationaha.115.001593

133.

Strauß

. From Big Data to Deep Learning: A Leap Towards Strong AI or ‘Intelligentia Obscura’? Big Data and Cognitive Computing 2018; 2(3): 16. https://doi.org/10.3390/bdcc2030016

134.

Lian

Sun

, et al. GPT-4V with emotion: A zero-shot benchmark for Generalized Emotion Recognition. Information Fusion 2024; 108: 102367. https://doi.org/10.1016/j.inffus.2024.102367

135.

Sandmann

Riepenhausen

Plagwitz

, et al. Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks. Nat Commun 2024; 15(1): 2050. https://doi.org/10.1038/s41467-024-46411-8

136.

Berrezueta-Guzman

Kandil

Wagner

. Integrating AI into ADHD Therapy: Insights from ChatGPT-4o and Robotic Assistants. Human-Centric Intelligent Systems 2025; 5(2): 230–245. https://doi.org/10.1007/s44230-025-00099-1

137.

Yeh

Lin

EHK

, et al. A Virtual-Reality System Integrated With Neuro-Behavior Sensing for Attention-Deficit/Hyperactivity Disorder Intelligent Assessment. IEEE Transactions on Neural Systems and Rehabilitation Engineering 2020; 28(9): 1899–1907. https://doi.org/10.1109/TNSRE.2020.3004545

138.

Asish

Kulshreshth

Borst

. Detecting distracted students in educational VR environments using machine learning on eye gaze data. Computers & Graphics 2022; 109: 75–87. https://doi.org/10.1016/j.cag.2022.10.007

139.

Lahiri

Warren

Sarkar

. Design of a Gaze-Sensitive Virtual Social Interactive System for Children With Autism. IEEE Transactions on Neural Systems and Rehabilitation Engineering 2011; 19(4): 443–452. https://doi.org/10.1109/TNSRE.2011.2153874

140.

Zeng

Pope

Lee

, et al. Virtual reality exercise for anxiety and depression: A preliminary review of current research in an emerging field. Journal of clinical medicine 2018; 7(3): 42. https://doi.org/10.3390/jcm7030042

141.

Lin

C-T

Wang

Chen

S-F

, et al. Design and verification of a wearable wireless 64-channel high-resolution EEG acquisition system with wi-fi transmission. Medical & Biological Engineering & Computing 2023; 61(11): 3003–3019. https://doi.org/10.1007/s11517-023-02879-y

142.

Kao

W-C

Fan

Y-L

Hsu

F-R

, et al. Next-Generation swimming pool drowning prevention strategy integrating AI and IoT technologies. Heliyon 2024; 10(18): e35484. https://doi.org/10.1016/j.heliyon.2024.e35484

143.

Liu

Wang

Chen

, et al. An IoT-based smart mosquito trap system embedded with real-time mosquito image processing by neural networks for mosquito surveillance. Front Bioeng Biotechnol 2023; 11: 1100968. https://doi.org/10.3389/fbioe.2023.1100968