Abstract
Background
Autism Spectrum Disorder (ASD) is a lifelong neurodevelopmental condition affecting social interaction, communication, and behavior, with traditional diagnosis relying on subjective and time-consuming behavioral assessments. Advances in neuroimaging have enhanced understanding of the brain mechanisms underlying ASD.
Objective
This systematic review aimed to comprehensively examine ASD classification datasets and recent advancements in ASD diagnosis using neuroimaging modalities, and to analyze machine learning techniques for ASD diagnosis to evaluate their diagnostic performance in terms of accuracy and Area Under the Curve (AUC).
Methods
The review followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A comprehensive literature search (2021–2025) was conducted across major databases, including Web of Science, IEEE Xplore, ACM, ScienceDirect, MDPI, and Springer.
Results
Out of 2,329 initially identified records, 825 were screened for eligibility after title and abstract review. The final analysis included 107 studies, which predominantly used structural and functional Magnetic Resonance Imaging, Electroencephalography, and multimodal datasets for ASD classification. The most common classifiers were Convolutional Neural Networks, Support Vector Machines, Random Forests, and hybrid Deep Learning (DL) models. Studies reported performance metrics such as accuracy and AUC, with many showing promising diagnostic results. Key limitations included small sample sizes, lack of external validation, dataset imbalance, and limited generalizability across multi-site datasets.
Conclusion
Neuroimaging-based Machine Learning (ML) offers strong potential for improving ASD diagnosis but faces challenges in reproducibility, interpretability, dataset variability, and clinical translation. Future work should focus on multi-site validation, explainable AI, and standardized evaluation to ensure reliable, real-world applications.
Introduction
Autism Spectrum Disorder (ASD) affects approximately 1 in 44 children in the United States.1 It is a neurodevelopmental condition characterized by persistent challenges in social interaction and cognitive functioning, along with repetitive behavioral patterns, and is typically diagnosed through behavioral assessment.2 Children with ASD often face lifelong cognitive difficulties and a reduced quality of life. Although the precise causes of ASD remain unclear, diagnosis is typically based on behavioral phenotypes and can be made as early as the second year of life.3,4 However, many children are diagnosed after this age,3,5 either because they do not initially meet diagnostic thresholds or because they are overlooked during early screenings.6 In particular, girls may exhibit behavioral phenotypes that differ from those of boys and are often more likely to mask their traits in order to conform to social norms.7–9 In some cases, healthcare data are also not fully and independently accessible.10,11
Clinical behavioral assessments, including the Autism Diagnostic Observation Schedule (ADOS) and the Autism Diagnostic Interview-Revised (ADI-R), remain the standard tools for diagnosing ASD. Despite their clinical utility, these evaluations are often labor-intensive, subjective, time-consuming, and dependent on clinical expertise. Early and accurate diagnosis is crucial for minimizing possible adverse effects and improving quality of life. Neuroimaging provides objective insights into structural and functional brain abnormalities associated with ASD, enabling the identification of reproducible neurological biomarkers. Studies have shown atypical brain connectivity, cortical thickness variations, and functional network disruptions in individuals with ASD, which cannot be reliably captured through behavioral observation alone.12–16 As the prevalence of ASD increases, the shortage of highly experienced clinical psychiatrists becomes a challenge, highlighting the significance of automated diagnostic systems. Artificial Intelligence (AI)-based diagnostic methods offer an effective approach to this challenge. Early diagnosis and intervention enhance a child’s cognitive, social, and educational development as well as brain growth,16–18 as illustrated in Figure 1. Although no definitive cure for ASD is available, early diagnosis combined with therapeutic intervention plays a significant role in improving outcomes for affected children. Therefore, neuroimaging, when combined with ML techniques, plays a crucial role in supporting early, objective, and data-driven ASD diagnosis.
Behavioral and phenotypic indicators in the diagnosis of autism spectrum disorder.
Neuroimaging has significantly advanced the understanding of the pathogenic processes involved in brain disorders,19–23 and it has also been used in diagnosing ASD.24–26 Neuroimaging refers to non-invasive techniques used to visualize the structure and function of the brain, and it includes functional, structural, multimodal, and electrical modalities. The most widely used neuroimaging technique for studying the brain is Magnetic Resonance Imaging (MRI). As a non-invasive imaging modality, MRI provides high-resolution, three-dimensional anatomical views of internal brain structures. It is frequently employed to assess brain development and detect structural abnormalities.27 MRI is classified into two categories: functional MRI (fMRI) and structural MRI (sMRI). fMRI measures fluctuations in blood oxygen levels, which helps reveal the abnormal connectivity patterns related to ASD.28 sMRI captures anatomical information and provides details about abnormalities in white matter, cortical thickness, and brain volume.29 Multimodal approaches combine fMRI and sMRI to provide both structural and functional insights, enhancing the diagnostic accuracy for ASD.30 Electrical techniques such as Electroencephalography (EEG) record real-time brainwave activity, which helps detect anomalies in neural timing and responses.31–33 EEG offers high temporal resolution and accessibility, making it useful for diagnosing ASD at early stages in children.34
Many review articles have focused on neuroimaging techniques for the early diagnosis of ASD, mostly highlighting Positron Emission Tomography (PET), EEG, sMRI, and fMRI.35–40 Nonetheless, a significant group of review studies examines several psychiatric and neurological conditions together, including ASD, Attention-Deficit/Hyperactivity Disorder (ADHD), Alzheimer’s disease, and schizophrenia, rather than focusing only on ASD.41–45 These review articles examine the significance of neuroimaging techniques in detecting neurological structure across multiple conditions, which limits their relevance to the ASD diagnosis procedure.
A few comprehensive reviews have aimed to identify imaging biomarkers associated with ASD, but they are often outdated or narrowly scoped.46–51 Moreover, despite growing evidence that hybrid or multimodal approaches enhance diagnostic Accuracy (Acc) and Area Under the Curve (AUC), most existing studies focus on individual neuroimaging modalities. The AUC measures the overall discriminative ability of a classification model. It represents the probability that the model assigns a higher score to a randomly selected positive case than to a randomly selected negative case. Unlike accuracy, AUC is threshold-independent and provides a more robust evaluation in scenarios involving class imbalance. Higher AUC values indicate stronger diagnostic performance.
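The probabilistic definition of AUC given above can be illustrated with a minimal sketch. The scores and labels below are hypothetical values chosen for illustration, not data from any reviewed study, and the function names are our own. The sketch counts, over all positive/negative pairs, how often the positive case receives the higher score (ties counted as one half), and contrasts this with accuracy at a fixed threshold.

```python
from itertools import product


def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_at_threshold(scores, labels, threshold):
    """Accuracy after converting scores to labels at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical, imbalanced toy data: 2 ASD cases (label 1), 6 controls (label 0).
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

auc = auc_from_scores(scores, labels)
acc_at_half = accuracy_at_threshold(scores, labels, 0.5)
# A threshold above every score predicts "control" for everyone.
acc_majority = accuracy_at_threshold(scores, labels, 1.1)
print(f"AUC={auc:.3f}, Acc@0.5={acc_at_half:.3f}, Acc(all control)={acc_majority:.3f}")
```

In this toy case, the classifier at threshold 0.5 and the trivial rule that labels every subject as a control obtain the same accuracy, whereas the AUC depends only on how the scores rank positives against negatives. This is the sense in which AUC is less sensitive to class imbalance than accuracy.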
This review aims to address these gaps by providing a focused overview of neuroimaging for ASD with special attention to recent advancements in multimodal neuroimaging techniques. It presents a current and comprehensive perspective on ASD diagnosis using neuroimaging, covering 107 key studies, primarily from 2021 to 2025. The main contributions are as follows:
1. This review article provides a comprehensive overview of ASD diagnosis techniques based on four primary neuroimaging modalities: fMRI, sMRI, EEG, and multimodal approaches.
2. It offers an in-depth analysis of key diagnostic features used in ML- and DL-based ASD classification.
3. The study includes a critical analysis of the methodological limitations inherent in existing ASD diagnosis techniques.
4. A comparative summary of diagnostic performance, in terms of Acc and AUC values, is presented based on findings reported in the literature.
5. An organized overview is provided of the most frequently used neuroimaging datasets for ASD diagnosis.
6. Finally, this article identifies potential concerns and future directions aimed at improving the robustness, generalizability, and clinical applicability of computational approaches in ASD diagnosis.
The organization of the paper is as follows: Section II outlines the research methodology in detail, including the research questions, research objectives, search strategy, quality assessment criteria, planning the review, reporting the review, and the inclusion and exclusion criteria. Section III focuses on neuroimaging based diagnosis approaches for ASD. The literature related to ASD diagnosis datasets is systematically compiled in Section IV. In Section V, potential concerns and future directions of ASD diagnosis technology are thoroughly examined.
Materials and methods
This review draws methodological inspiration from a prior survey,52 which has made a notable contribution to survey-based research in the medical domain utilizing ML and DL. ML and DL are computational approaches that enable automated pattern recognition and classification from complex data, and they are widely applied for extracting diagnostic biomarkers from neuroimaging data. To ensure a rigorous and reproducible process, we adopted a systematic review framework encompassing three core phases: planning, conducting, and reporting. These stages were instrumental in guiding the review and aligning it with the defined Research Questions (RQs).
Data extraction and analysis were performed on the studies selected through this process. The RQs served as a foundation for determining the scope and type of information to be collected. Specific data elements targeted during extraction were defined in the methodology and outcomes sections. Furthermore, the credibility of each study was assessed based on several factors, including the authors’ reliability, publication domain, scholarly recognition, publication year, and venue (journal, conference, or workshop).
We also reviewed the features used in diagnosing ASD to analyze different methodological approaches. The literature evaluation considered the limitations of existing studies, and we further examined the research gaps across the various models.
Planning the review
Identifying the needs associated with the goals of this study was a key part of the planning process for this review. We focused on ML to examine the current research limitations in ASD diagnosis methods that use neuroimaging data. The review followed the structured phases outlined in 53 and 54, which include reporting the review, applying quality assessment criteria, and defining the search strategy.
Search strategy
Search strategy and keyword selection for ASD-related literature.
Population
To begin the review, we carefully identified the population of studies to be included. “ASD” was the primary keyword used during the research investigation. This review ensures a comprehensive analysis of various critical scenarios influenced by technological advancements in the fields of neuroimaging and ASD diagnosis.
Methodology or technique
We used four modalities for ASD diagnosis: fMRI, sMRI, multimodal, and EEG based domain detection, as demonstrated in Figure 4. Multiple approaches were applied to extract and monitor data in order to obtain the necessary information from the selected literature. The methodology was further strengthened through the use of keywords such as “DL,” “ML,” “ASD Diagnosis,” “Neuroimaging Data,” and “ASD Datasets.”
The main objective of our data synthesis was to categorize and summarize the outcomes from the selected literature. The following components served as the foundation for the empirical analysis: 1) key features described in the techniques; 2) models and methods used in the various techniques; 3) datasets used by the models; 4) limitations reported for the techniques; and 5) AUC/Acc achieved by the various techniques.
Outcome
The potential outcomes of the research study are largely determined by the methods and procedures used for data collection and monitoring, which also influence the study’s applicability. To examine the potential applications of the research, the following keywords were used: ‘Asperger’s Syndrome Diagnosis,’ ‘Childhood Disintegrative Disorder Diagnosis,’ ‘Pervasive Developmental Disorder Diagnosis,’ ‘PDD-NOS (Pervasive Developmental Disorder – Not Otherwise Specified) Diagnosis,’ ‘Pervasive Developmental Disorder (PDD) Diagnosis,’ ‘Childhood Autism Diagnosis,’ ‘Kanner’s Autism Diagnosis,’ and ‘Atypical Autism Diagnosis.’
To gather all relevant information for our RQs, we performed every feasible combination of related search queries using keywords or phrases. Boolean terms such as “OR,” “AND,” and others were applied to study the research questions. For example: (ASD OR Autism Spectrum Disorder OR Autism OR Autistic Disorder) AND (ML OR DL OR ASD Diagnosis OR Neuroimaging Data OR ASD Datasets OR Graph Based) AND (Asperger’s Syndrome Diagnosis OR Childhood Disintegrative Disorder Diagnosis OR Pervasive Developmental Disorder Diagnosis OR PDD-NOS (Pervasive Developmental Disorder – Not Otherwise Specified) Diagnosis OR Pervasive Developmental Disorder (PDD) Diagnosis OR Childhood Autism Diagnosis OR Kanner’s Autism Diagnosis OR Atypical Autism Diagnosis).
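The structure of the example query above (OR within each keyword group, AND between groups) can be reproduced programmatically, which makes the search string easier to audit and reuse. The following is a minimal sketch; the function names are our own, and the quoting of multi-word phrases is an assumption added for illustration, since each database has its own query syntax and the original search strings are not reported in that form.

```python
def or_group(terms):
    """Join terms with OR inside parentheses; multi-word phrases are quoted.

    Quoting is an illustrative assumption, not part of the reported query.
    """
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine the OR-groups with AND, mirroring the example query."""
    return " AND ".join(or_group(g) for g in groups)


population = ["ASD", "Autism Spectrum Disorder", "Autism", "Autistic Disorder"]
technique = ["ML", "DL", "ASD Diagnosis", "Neuroimaging Data",
             "ASD Datasets", "Graph Based"]
outcome = [
    "Asperger's Syndrome Diagnosis",
    "Childhood Disintegrative Disorder Diagnosis",
    "Pervasive Developmental Disorder Diagnosis",
    "PDD-NOS (Pervasive Developmental Disorder – Not Otherwise Specified) Diagnosis",
    "Pervasive Developmental Disorder (PDD) Diagnosis",
    "Childhood Autism Diagnosis",
    "Kanner's Autism Diagnosis",
    "Atypical Autism Diagnosis",
]

query = build_query(population, technique, outcome)
print(query)
```

Keeping the three keyword groups as separate lists mirrors the population, technique, and outcome components of the search strategy, so a group can be revised without rewriting the whole string.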
Research questions and objectives used to analyze ASD diagnosis approaches.
Data extraction was based on the results from the literature and was then synthesized and linked to relevant comparisons to support the RQs. Following data analysis, various visualization tools, including tables, pie charts, and histograms, were used to enhance data presentation.
As illustrated in Figure 2, we identified 2,329 records published between 2021 and 2025 in various digital online repositories and retained only the most relevant research (n=107) on the use of neuroimaging techniques for ASD diagnosis.
PRISMA diagram representing the study selection procedure.
Quality assessment criteria
Regarding the study’s selection process, we focused on various inclusion and exclusion criteria, evaluating sources such as technical reports, journals, newsletters, magazines, languages, and digital online repositories. Properly applying these selection criteria ensured that the literature included was relevant and directly addressed the RQs. We verified that the outcomes of the selected literature were unbiased, appropriate, and based on clear selection standards. Quality assessment plans were developed to guide the determination of inclusion and exclusion criteria.
Inclusion criteria
• The defined phrases were used to search the title, abstract, or keywords.
• Studies were included if the term “ASD” appeared in any section of the literature, even if it was not part of the keywords, title, or abstract.
• Empirical research evidence was required.
• Only studies conducted in 2021 or later were included.
• Research using ML techniques was considered.
• Studies employing DL techniques were included.
• Studies focusing exclusively on ASD diagnosis were selected.
• Key phrases, articles, titles, journals, conferences, abstracts, and workshops relevant to ASD diagnosis were included.
Exclusion criteria
• Undergraduate project documents, PhD dissertations, and master’s theses were not considered as they are often unpublished and not peer-reviewed.
• Technical reports and institutional project documentation were excluded to focus on published academic research.
• Patents were excluded because they typically lack thorough experimental validation or scientific discussion.
• Articles not focused on diagnosing ASD through computational or neuroimaging methods were excluded to maintain relevance.
• Studies using unreliable or insufficient techniques were disqualified to preserve data integrity and research quality.
Study exclusion summary to minimize selection bias.
Reporting the review
The literature review emphasized the role of ASD diagnosis techniques in enhancing diagnostic performance. Figure 2 presents the collection of the literature, while Figure 3 highlights the most widely used ASD diagnosis classifiers. The tables provide detailed information on key methodological categories, including ASD diagnosis based on fMRI, multimodal modalities, and EEG. These approaches have been evaluated in terms of their contributions to Acc or AUC in ASD diagnosis, underscoring the importance of effective diagnostic methods.
An overview of classifiers for ASD diagnosis.
Results and discussion
Publication overview
A comprehensive literature search for studies published between 2021 and 2025 was performed across various digital online repositories in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, as illustrated in Figure 2. A total of 2,329 records were initially identified, with 478 duplicates removed. After title and abstract screening, 825 studies were assessed for eligibility. Studies without ML classifiers or without reported performance metrics were excluded. Ultimately, 107 studies were included in the final analysis, which focused on the use of neuroimaging techniques for ASD diagnosis.
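The four counts reported above imply the number of records leaving the pipeline at each stage. The short sketch below makes this arithmetic explicit. It assumes that the stages are strictly sequential and that no records were removed for reasons other than those stated; the derived per-stage values are therefore implied by the reported counts rather than reported directly, and the variable names are our own.

```python
# Counts stated in the text.
identified = 2329
duplicates_removed = 478
assessed_for_eligibility = 825
included = 107

# Derived counts (assumption: strictly sequential stages, no other removals).
after_deduplication = identified - duplicates_removed
excluded_at_screening = after_deduplication - assessed_for_eligibility
excluded_at_eligibility = assessed_for_eligibility - included

# Consistency check: every identified record is either removed or included.
total_accounted = (duplicates_removed + excluded_at_screening
                   + excluded_at_eligibility + included)

print("Records after deduplication:", after_deduplication)
print("Excluded at title/abstract screening:", excluded_at_screening)
print("Excluded at eligibility assessment:", excluded_at_eligibility)
```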
Year-wise distribution of selected studies (2021–2025).
ASD diagnosis techniques based on neuroimaging modalities
Neuroimaging has emerged as an essential tool in the diagnosis of ASD, offering valuable insights into structural and functional abnormalities within the brain. Commonly used modalities include fMRI, sMRI, EEG, and multimodal techniques that combine fMRI and sMRI. Each method offers a distinct perspective: fMRI captures patterns of brain activity, sMRI highlights structural differences, EEG provides high temporal resolution of neural dynamics, and multimodal techniques deliver both functional and structural insights into brain activity, as shown in Figure 4.
An overview of neuroimaging modalities for ASD diagnosis.
ASD diagnosis based on fMRI neuroimaging techniques
Summary of studies on ASD diagnosis using fMRI-based neuroimaging techniques.
In the context of ASD diagnosis, fMRI is commonly employed to detect atypical functional connectivity within brain regions responsible for social cognition, emotional regulation, communication, and executive functioning. Resting-state fMRI (rs-fMRI), which measures brain activity during rest (i.e., in the absence of task performance), is frequently used to investigate intrinsic neural networks. Among individuals with ASD, disrupted connectivity is often observed in the Default Mode Network (DMN), associated with self-referential processing, and in the salience network, which is involved in detecting socially and emotionally significant stimuli. These findings contribute to a deeper understanding of the neurological basis of ASD symptoms, including social withdrawal, repetitive behavior, and speech difficulties.
An ML-based method for detecting ASD using multisite fMRI data was proposed by Kang et al. 56 Their approach addressed inconsistencies across neuroimaging datasets to improve the generalizability of ASD classification frameworks, as discussed in 83. One important step is transforming fMRI data into a “glass brain” dataset to facilitate more effective feature extraction. To capture brain connectivity patterns, the study employs a LeNet5 Convolutional Neural Network (CNN) for feature extraction, followed by the construction of a subject-level partial correlation matrix. Kang et al. 56 further introduced a method known as sms partitioning to improve Acc across multiple datasets. This technique addresses inter-site variability, contributing to more robust model training. Model training and optimization depend largely on hyperparameters, so hyperparameter tuning plays a vital role in improving the performance of ML and DL models. Common strategies include grid search, which exhaustively evaluates predefined parameter combinations; random search, which samples parameter settings more efficiently across the search space; and Bayesian optimization, which selects promising configurations based on prior evaluations. These approaches help identify suitable learning rates, batch sizes, numbers of layers, and regularization parameters, thereby enhancing model stability and predictive performance in ASD neuroimaging tasks. In the method of Kang et al., 56 classification is performed using a Multi Layer Perceptron (MLP), which identifies complex data patterns. The model exhibited strong generalization performance, achieving 93% Acc on single-site data (OHSU site) and 83.5% across multi-site datasets. Despite these encouraging results, the study does not explicitly consider limitations such as inter-site data heterogeneity, limited dataset sizes, and potential overfitting risks.
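The difference between the grid search and random search strategies mentioned above can be made concrete with a minimal sketch. The search space, the function names, and in particular `validation_score` are hypothetical: the scoring function is a toy surrogate standing in for "train the model and return validation accuracy", and nothing here reproduces the tuning procedure of Kang et al. or any other reviewed study. Bayesian optimization is not shown.

```python
import itertools
import math
import random

# Hypothetical search space for an MLP-style classifier.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
    "hidden_layers": [1, 2, 3],
}


def validation_score(config):
    """Toy surrogate for validation accuracy (assumption, not a real model).

    It peaks at learning_rate=1e-3, batch_size=32, hidden_layers=2.
    """
    return (1.0
            - 0.10 * abs(math.log10(config["learning_rate"]) + 3)
            - 0.05 * abs(math.log2(config["batch_size"]) - 5)
            - 0.05 * abs(config["hidden_layers"] - 2))


def grid_search(space, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    keys = list(space)
    configs = [dict(zip(keys, values))
               for values in itertools.product(*(space[k] for k in keys))]
    return max(configs, key=score_fn), len(configs)


def random_search(space, score_fn, n_trials, seed=0):
    """Evaluate n_trials randomly sampled configurations."""
    rng = random.Random(seed)
    configs = [{k: rng.choice(v) for k, v in space.items()}
               for _ in range(n_trials)]
    return max(configs, key=score_fn)


best_grid, n_evaluated = grid_search(SEARCH_SPACE, validation_score)
best_random = random_search(SEARCH_SPACE, validation_score, n_trials=10)
print("Grid search evaluated", n_evaluated, "configurations:", best_grid)
print("Random search (10 trials) found:", best_random)
```

Grid search is guaranteed to find the best configuration in the grid but its cost grows multiplicatively with each added hyperparameter, whereas random search fixes the budget in advance and accepts that the best sampled configuration may be slightly worse.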
A Multi-Model Deep Ensemble Classifier (MMEC) for identifying ASD using fMRI data was introduced by Herath et al. 57 to address the limitations of traditional DL models, which often struggle to generalize due to the limited availability of labeled neuroimaging datasets, a limitation also noted in 84. To overcome this challenge, they implemented an ensemble-based classification method combined with transfer learning to increase both Acc and robustness in ASD detection. Herath et al. employed multiple DL architectures, including Inception V3, Residual Network (ResNet)50, MobileNet, and DenseNet, to extract diverse and meaningful features from fMRI data. Several ensemble techniques were evaluated to maximize model performance.
The averaging and weighted averaging strategies achieved an impressive 97.82% classification Acc, while a stacking-based ensemble method reached 97.78%. By utilizing pre-trained models from large-scale image recognition tasks, MMEC effectively overcomes the limitation of small datasets, offering a more robust and generalizable ASD classification framework, as described in 85.
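The averaging and weighted averaging strategies named above can be sketched in a few lines. The probabilities and weights below are hypothetical values for illustration and are not outputs of MMEC or of the base networks used by Herath et al.; the function names are our own, and the stacking variant, which trains a second-level model on the base-model outputs, is not shown.

```python
def average_ensemble(prob_lists):
    """Unweighted mean of per-model P(ASD) for each scan."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


def weighted_average_ensemble(prob_lists, weights):
    """Weighted mean of per-model P(ASD); weights are normalized to sum to 1."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, ps)) / total
            for ps in zip(*prob_lists)]


def to_labels(probs, threshold=0.5):
    """Convert probabilities to binary labels (1 = ASD, 0 = control)."""
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical P(ASD) from three base models for four scans.
model_probs = [
    [0.80, 0.40, 0.30, 0.60],
    [0.70, 0.55, 0.20, 0.45],
    [0.90, 0.35, 0.40, 0.30],
]

avg = average_ensemble(model_probs)
# Hypothetical weights, e.g. proportional to each model's validation accuracy.
weighted = weighted_average_ensemble(model_probs, weights=[0.6, 0.3, 0.1])

print("Averaging labels:         ", to_labels(avg))
print("Weighted averaging labels:", to_labels(weighted))
```

In this toy case the two strategies disagree on the last scan, because the weighted variant trusts the first model more. This is the design choice that distinguishes the two strategies: plain averaging treats all base models as equally reliable, while weighted averaging encodes prior confidence in each model.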
A novel diagnostic framework named Dual-Atlas Multi-Feature Learning Graph Neural Network (DML-GNN) was proposed by Liu et al., 58 employing a Graph Neural Network (GNN)-based approach for the classification of ASD and building on earlier studies.86,87 This model utilizes a dual-atlas mechanism to derive both global and local brain characteristics from neuroimaging data, thereby improving the analysis of functional and structural deviations associated with ASD. Through the integration of multiple neuroimaging features, their approach achieved a classification accuracy of 91.62% in differentiating ASD subjects from typical controls. The results highlight the effectiveness of combining multi-feature learning with GNN architectures for characterizing brain connectivity in the context of ASD diagnosis.
A DL architecture named 3T Dilated Inception Network (3T-DINet), specifically developed for ASD diagnosis using rs-fMRI data, was proposed by Kavitha and Siva. 59 Their model addresses key challenges found in existing techniques, such as overfitting, suboptimal diagnostic performance, and high computational overhead. The 3T-DINet model integrates dilated convolutions into an inception module that operates with three distinct dilation rates: low, medium, and high. These rates are used to extract multi-scale features from functional connectivity patterns, as also described in 88. To enhance feature representation and address the vanishing gradient issue, Kavitha and Siva 59 integrated ResNet blocks into the architecture. Additionally, they employed the Crossover-based Black Widow Optimization (CBWO) algorithm to fine-tune the model’s hyperparameters. The framework was assessed across five ASD datasets, where it outperformed several contemporary methods in diagnostic accuracy. These improvements are largely attributed to the synergy between residual learning, multi-scale feature extraction, and advanced optimization. As a result, the model offers a promising advancement in early ASD detection, with potential implications for improving treatment strategies.
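To illustrate how different dilation rates yield multi-scale features, the sketch below applies the same three-tap kernel to a one-dimensional signal at three dilation rates. This is a generic illustration of dilated convolution, not the 3T-DINet implementation: the rates 1, 2, and 4, the kernel, and the input signal are assumptions chosen for readability, and the function names are our own.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Valid (unpadded) 1-D cross-correlation with `dilation` spacing between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = list(range(10))   # toy input: 0, 1, ..., 9
kernel = [1, 0, -1]        # simple difference filter

# Three parallel branches, analogous to low, medium, and high dilation rates.
branches = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 4)}

for rate, output in branches.items():
    print(f"dilation={rate}, receptive field={receptive_field(3, rate)}, "
          f"output length={len(output)}")
```

With the same three weights, the receptive field grows from 3 to 5 to 9 input positions as the dilation rate increases, so each branch responds to structure at a different scale without adding parameters. An inception-style module would then concatenate the branch outputs, typically after padding them to a common length.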
An approach for diagnosing ASD using dynamic fMRI data, which captures time-varying brain activity often neglected by conventional static methods, was introduced by Wang et al. 60 They proposed the Masked Connection-based Dynamic Graph Learning Network (MCDGLN), which incorporates both static and dynamic brain connectivity features to enhance classification accuracy. In their method, Blood-Oxygen-Level-Dependent (BOLD) signals are segmented into temporal frames to extract dynamic brain characteristics. To isolate task-relevant connections and fuse dynamic functional connectivity, Wang et al. employed a Weighted Edge Attention (WEA) module. A self-attention-enabled Hybrid Graph Convolutional Network (HGCN) was used to derive topological features, emphasizing important neural patterns. Static connections were optimized using a task-specific mask that filtered out irrelevant signals, while an Adaptive Compression Encoder (ACE) emphasized critical features by compressing static information. Their model achieved a classification accuracy of 73.3% on 1,035 participants from the Autism Brain Imaging Data Exchange (ABIDE) I dataset, demonstrating superior performance over several prior methods. This work highlights the value of modeling dynamic connectivity and refining static patterns for more effective ASD diagnosis in neuroimaging research.
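The distinction between static and dynamic functional connectivity can be illustrated by segmenting region-wise time series into temporal frames and computing one correlation matrix per frame. The sketch below is a generic sliding-window illustration under our own assumptions (toy signals, window width, stride, and function names); it does not reproduce the MCDGLN pipeline or its attention modules.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_matrix(series):
    """ROI-by-ROI correlation matrix for a list of ROI time series."""
    n = len(series)
    return [[1.0 if i == j else pearson(series[i], series[j])
             for j in range(n)] for i in range(n)]


def dynamic_fc(series, width, stride):
    """One correlation matrix per temporal frame of `width` time points."""
    length = len(series[0])
    return [
        correlation_matrix([roi[start:start + width] for roi in series])
        for start in range(0, length - width + 1, stride)
    ]


# Toy signals for 3 ROIs over 8 time points (hypothetical values).
bold = [
    [1, 2, 3, 4, 4, 3, 2, 1],
    [2, 4, 6, 8, 1, 2, 3, 4],   # tracks ROI 0 in the first half, opposes it later
    [5, 3, 4, 1, 2, 6, 2, 3],
]

static_fc = correlation_matrix(bold)
frames = dynamic_fc(bold, width=4, stride=4)

print(f"static r(ROI0, ROI1) = {static_fc[0][1]:.2f}")
for t, frame in enumerate(frames):
    print(f"frame {t}: r(ROI0, ROI1) = {frame[0][1]:.2f}")
```

In this toy example the coupling between the first two regions reverses sign between the two frames, while the static correlation over the full series is weak. Time-varying structure of this kind is what static connectivity analysis averages away.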
By combining a Graph Convolutional Network (GCN) with multimodal Functional Connectivity (FC) data, Ma et al. 61 presented a novel model for diagnosing ASD. To enhance the relevance and precision of features used for classification, this approach focuses on FC data extracted from specific brain subnetworks associated with ASD, in contrast to earlier methods that typically rely on whole-brain FC information. Ma et al. 61 proposed an External Attention Network Readout (EANReadout) layer to address the variability observed in ABIDE datasets. By making it easier to explore potential patterns across subjects, this layer reduces the effect of dataset variability and strengthens the model’s robustness. The proposed framework was tested on the ABIDE dataset with 714 individuals and achieved an average classification Acc of 70.31%. The EANReadout layer outperformed typical readout layers, with an average Acc improvement of 4.32%.
To address the limitations of conventional methods that often overlook the distinct roles of positive and negative FC, Guan et al. 62 proposed a dual-view feature extractor. This component independently captures and integrates features from both positive and negative functional connectivity, providing a more nuanced and accurate representation of brain connectivity patterns associated with ASD, as mentioned in 89. The study uses a K-Nearest Neighbor (KNN) technique to create a dynamic population graph without depending on preset architectures. This approach enables a more adaptable and customized depiction of brain networks by dynamically determining the graph’s topology from the extracted features. Guan et al. 62 also used the Vrex technique and the PolyLoss function to address issues such as class imbalance and inter-site variability that are present in multi-site datasets. These approaches aim to enhance the model’s robustness and ability to generalize effectively across diverse datasets. A total of 1,102 participants from the ABIDE I dataset were used to test the proposed framework.
Wang et al. 63 created the Graph ASD Classifier (GAC), a classifier based on Graph Attention Networks (GATs). The authors apply a sample attention technique to address the data heterogeneity of multi-site fMRI datasets. Each brain image is transformed into a graph in which nodes represent the brain’s regions of interest (ROIs) and edges represent the functional connectivity between nodes, as in 90. To provide rich information for classification, each node is described by statistical and wavelet-based features extracted from the BOLD signal. To learn discriminative representations from the brain graphs, the GAC model uses a graph attention network. In contrast to traditional GATs, GAC improves robustness to the site-specific variability found in datasets such as ABIDE by incorporating a sample attention mechanism that gives training samples varying weights. This enables the model to concentrate more on difficult and informative samples. In addition, a node selection pooling strategy is employed to identify key brain regions that play a significant role in ASD classification, thereby enhancing both the interpretability and predictive accuracy of the model. The proposed model outperforms a number of state-of-the-art baseline models with an Acc of 70.07%, according to experiments on the ABIDE I dataset (children younger than 12 years old). The attention mechanism increases the interpretability of identified biomarkers in addition to improving generalization.
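The KNN-based population graph described earlier in this subsection for Guan et al. 62 connects each subject to the subjects whose extracted features are most similar, so the graph topology follows the data rather than a preset structure. The sketch below shows the general idea only. The Euclidean distance, the symmetrization of edges, the value of k, the toy features, and the function names are all assumptions made for illustration and are not taken from the original study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(features, k):
    """Build an undirected KNN graph as {subject index: set of neighbours}.

    Each subject is linked to its k nearest subjects; edges are symmetrized,
    so a subject can end up with more than k neighbours.
    """
    n = len(features)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (euclidean(features[i], features[j]), j),
        )
        for j in others[:k]:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Hypothetical 2-D subject embeddings forming two loose groups.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
graph = knn_graph(features, k=1)

for subject, neighbours in graph.items():
    print(subject, sorted(neighbours))
```

Because the neighbours are recomputed from the current features, the graph changes whenever the feature extractor changes, which is what makes such a population graph dynamic.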
To enhance the diagnosis of ASD, Wang et al. 64 proposed the IFC-GNN model, which integrates functional connectivity interactions into a multimodal graph neural network framework. By leveraging both spatial and temporal features derived from rs-fMRI data, the model constructs a comprehensive representation of brain connectivity patterns associated with ASD, as discussed in 91. Within the IFC-GNN architecture, a novel approach utilizing multimodal GCNs is employed to effectively process dynamic and spatial aspects of neural activity. This design enables the model to detect distinctive connectivity signatures that differentiate individuals with ASD from typically developing controls. By accounting for the inherent complexity of neural networks, as in 92, the method offers improved sensitivity to variations linked with neurodevelopmental disorders. The IFC-GNN model demonstrated a 10% performance improvement over traditional GNN models, achieving a classification accuracy of 80.66% on publicly available datasets. This work highlights the effectiveness of combining multimodal data and complex graph structures to better capture the heterogeneous nature of brain disorders.
Zhang et al. 65 introduced the ASD-SWNet model, which employs a shared-weight mechanism between an autoencoder (AE) and a CNN to integrate supervised and unsupervised learning strategies. In this framework, the autoencoder extracts stable, low-dimensional representations from fMRI data to facilitate joint training and activate the convolutional neural network’s weights for enhanced classification. To address the challenge of limited sample sizes commonly found in ASD datasets, the authors incorporated a data augmentation approach tailored for time-series medical data, as in 93. The performance of ASD-SWNet was evaluated using the ABIDE-I and ABIDE-II datasets, with leave-one-out cross-validation applied to the former and nested ten-fold cross-validation used for the latter. On the ABIDE-I dataset, the model achieved an Acc of 76.52% and an AUC of 0.81, surpassing several state-of-the-art methods, including Hi-GCN, ASD-SAENet, and ASD-DiagNet. Visualization through t-distributed stochastic neighbor embedding (t-SNE) further demonstrated ASD-SWNet’s ability to clearly separate individuals with ASD from healthy controls, showing well-defined clustering after classification.
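The two evaluation protocols mentioned above differ in how the data are partitioned. The sketch below generates index splits for k-fold cross-validation, for leave-one-out as the special case in which the number of folds equals the number of samples, and for nested cross-validation, where an inner loop on each outer training set is reserved for model selection. It is a generic illustration: the sample counts, the three inner folds, the contiguous unshuffled folds, and the function names are assumptions, and the text does not specify whether the leave-one-out protocol of ASD-SWNet is applied per subject or per site.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def nested_cv_indices(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    Inner splits are drawn only from the outer training indices, so the outer
    test fold never influences model selection.
    """
    for outer_train, outer_test in k_fold_indices(n_samples, outer_k):
        inner_splits = [
            ([outer_train[i] for i in fit], [outer_train[i] for i in val])
            for fit, val in k_fold_indices(len(outer_train), inner_k)
        ]
        yield outer_train, outer_test, inner_splits


# Leave-one-out is k-fold with k equal to the number of samples.
leave_one_out = list(k_fold_indices(5, 5))

# Nested ten-fold evaluation on 20 hypothetical samples, 3 inner folds.
nested = list(nested_cv_indices(20, outer_k=10, inner_k=3))
print(len(leave_one_out), "leave-one-out splits;", len(nested), "outer folds")
```

In practice the indices would be shuffled, and usually stratified by class and site, before splitting; that step is omitted here to keep the partition logic visible.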
To improve feature extraction and mitigate overfitting, Chandra et al. 66 introduced ASDC-Net, a generalized end-to-end CNN architecture comprising convolutional layers, batch normalization, and dropout. The model employs the Adam optimizer for training and incorporates the ReLU activation function to introduce non-linearity. Designed to recognize complex patterns in rs-fMRI data, the framework facilitates effective discrimination between individuals with ASD and typically developing controls. ASDC-Net was evaluated using multisite rs-fMRI datasets from the ABIDE database, achieving a classification accuracy of 76.72%, thereby demonstrating its diagnostic capability.
In a separate study, Zhou et al. 67 proposed the Multipattern Graph Convolutional Network (MPGCN) model for ASD detection using rs-fMRI data, inspired by 94. Unlike conventional GCNs that rely on a single Functional Brain Network (FBN) pattern, MPGCN extracts and integrates multiple FBN configurations to form a more comprehensive representation of atypical brain connectivity. This multipattern fusion addresses the limitations of single-pattern models by providing a richer depiction of neural network abnormalities associated with ASD. When tested on a dataset of 92 subjects from ABIDE, the model achieved an accuracy of 91.1% and an AUC of 0.9742, outperforming several state-of-the-art classifiers. The enhanced performance is attributed to the model’s ability to capture diverse connectivity features, although its generalizability may be constrained by the small sample size and thus warrants further validation on larger and more diverse datasets.
To address the challenges posed by high-dimensional FC features and limited dataset sizes, Khan and Shang 68 developed the ASD-GANNet framework, which utilizes a conditional Generative Adversarial Network (cGAN) to generate synthetic FC features. This data augmentation strategy helps expand the dataset and mitigates overfitting. The framework operates in two stages: first, the cGAN is trained on functional connectivity features extracted from the NYU dataset to generate synthetic data for each class; second, a Multi-Head Attention mechanism is applied during classification to focus on the most relevant features, thereby enhancing model interpretability. By streamlining the diagnostic process and eliminating the need for manual feature engineering, the end-to-end design of ASD-GANNet offers both efficiency and performance. When evaluated using 10-fold cross-validation on the ABIDE dataset, the model achieved an accuracy of 82%, with sensitivity and specificity of 82% and 81%, respectively. In a site-wise comparison, the method outperformed competing approaches in 10 out of 17 sites, suggesting good generalizability and robustness.
Tang et al. 69 proposed a hybrid diagnostic framework, GNN-LSTM, which combines a GNN with Long Short-Term Memory (LSTM) layers to capture both spatial and temporal dynamics in rs-fMRI data for the identification of ASD. To address the limitations of static functional connectivity analysis, their model uses a sliding window method to generate dynamic functional connectivity matrices. These matrices are subsequently processed through LSTM layers to model temporal dependencies and GNN layers to learn spatial relationships. The architecture incorporates a jump connection strategy to facilitate multi-scale information flow and employs dynamic graph pooling to highlight salient connectivity features. When evaluated on the ABIDE I and II datasets, the model achieved classification accuracies of 80.4% and 79.63%, respectively. This integration of spatial and temporal learning demonstrates the model’s capacity to characterize the complex neural dynamics associated with ASD.
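The sliding window step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window length, the stride, and the use of Pearson correlation as the connectivity measure are assumptions made for illustration, and the toy time series are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def dynamic_fc(roi_series, window, step):
    """Return one ROI-by-ROI correlation matrix per sliding window.

    roi_series: list of per-ROI time series, all of the same length.
    window, step: window length and stride in time points (hypothetical values).
    """
    n_rois = len(roi_series)
    n_time = len(roi_series[0])
    matrices = []
    for start in range(0, n_time - window + 1, step):
        segments = [s[start:start + window] for s in roi_series]
        matrix = [[1.0 if i == j else pearson(segments[i], segments[j])
                   for j in range(n_rois)] for i in range(n_rois)]
        matrices.append(matrix)
    return matrices

# Toy data (invented): 3 ROIs, 8 time points.
toy = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 2, 4, 6, 8, 10, 12, 14],  # rises with ROI 0
    [7, 6, 5, 4, 3, 2, 1, 0],     # falls as ROI 0 rises
]
windows = dynamic_fc(toy, window=4, step=2)
```

Each matrix in `windows` would then form one time step of the sequence passed to the temporal model.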
In a separate investigation, a specialized CNN framework was designed by Feng and Xu 70 to process high-dimensional neuroimaging data. The architecture includes convolutional layers, pooling layers, batch normalization, dropout, and fully connected layers. It was trained using data from the ABIDE I dataset, specifically 22,176 two-dimensional Echo-Planar Imaging (EPI) slices extracted from 4D fMRI scans. Their dataset comprised 70 typically developing controls and 56 children diagnosed with ASD, with preprocessing performed via the Connectome Computation System (CCS), incorporating slice-timing correction and spatial normalization. After training over 50 epochs on 17,740 samples, the model achieved notable performance metrics: 99.39% Acc, 98.80% recall, 99.85% precision, and a 99.32% F1 score. Analysis of the resulting feature maps confirmed the model’s capacity to extract hierarchical representations, highlighting its effectiveness in differentiating between ASD and control subjects.
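As a quick consistency check on the reported metrics, the F1 score can be recomputed from precision and recall as their harmonic mean. The sketch below uses only the precision and recall values quoted above; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the CNN of Feng and Xu, as fractions.
reported_precision = 0.9985
reported_recall = 0.9880
f1_percent = 100 * f1_score(reported_precision, reported_recall)
print(f"F1 = {f1_percent:.2f}%")  # prints: F1 = 99.32%
```

The recomputed value agrees with the 99.32% F1 score stated in the text.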
An innovative approach to constructing Functional Connectivity Networks (FCNs) from rs-fMRI time series data was proposed by Parui et al., 71 using correlation matrices between regions of interest (ROIs). Their method incorporates multiple brain atlases, including the Automated Anatomical Labeling (AAL), Harvard-Oxford, Eickhoff-Zilles, Talairach and Tournoux, Dosenbach 160, and Craddock 200 to account for individual variability and address the high dimensionality of fMRI data. To further refine the connectivity matrices, the authors introduced the Low-Estimated Rank Tensor (LERT) method, which organizes FCNs into a three-dimensional tensor and applies a low-rank approximation for dimensionality reduction. This enables the model to extract intrinsic similarities across individuals while enhancing the robustness of connectivity features used for classification. To identify the most informative ROIs, their framework aggregates predictions across atlases using a majority voting scheme. According to results on the ABIDE dataset, the proposed AI-based model achieved an Acc of 84.79% for ASD detection.
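The majority voting scheme across atlases can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: the function name, the binary labels, and the example predictions are hypothetical, and only three of the atlases are shown so that no ties occur.

```python
from collections import Counter

def majority_vote(per_atlas_predictions):
    """Combine per-atlas label lists into one label per subject.

    per_atlas_predictions: dict mapping atlas name -> list of predicted labels,
    one per subject, in the same subject order for every atlas.
    On a tie, the label encountered first among the votes wins.
    """
    columns = zip(*per_atlas_predictions.values())
    return [Counter(votes).most_common(1)[0][0] for votes in columns]

# Invented predictions for four subjects (1 = ASD, 0 = control).
predictions = {
    "AAL": [1, 0, 1, 0],
    "Harvard-Oxford": [1, 1, 0, 0],
    "Craddock 200": [0, 1, 1, 0],
}
final_labels = majority_vote(predictions)  # [1, 1, 1, 0]
```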
The effectiveness of various 3D data augmentation techniques for ASD classification using rs-fMRI was assessed by Jönemo et al. 72 In their study, a 3D CNN was trained using the ABIDE dataset, which consisted of 1,112 subjects, 539 with ASD and 573 controls. The authors applied several augmentation methods, including image flipping, rotation, brightness correction, scaling, and elastic deformation, to regional homogeneity (ReHo) maps derived from the rs-fMRI data. These augmented samples were used to evaluate whether such preprocessing techniques could enhance the model’s classification performance. Their findings indicated modest accuracy gains ranging from 0.6% to 2.9%, with overall accuracies between 62% and 66%. They concluded that while 3D data augmentation has the potential to improve performance, its benefits are marginal and highly influenced by other variables such as feature selection and preprocessing pipeline design.
Yousefian et al. 73 investigated the use of graph representation learning algorithms to detect ASD in rs-fMRI data. Acknowledging that ASD is characterized by changes in brain functional connectivity, they used graph embedding techniques to capture both local and global connectivity patterns. Their method aims to improve the robustness of ASD classification by generalizing across different recording settings and phenotypic differences. The researchers applied several graph embedding approaches, namely Anonymous Walk Embeddings, Node2vec, Struct2vec, multi Node2vec, and Graph2Img, to data from the ABIDE I and II datasets. The Graph2Img approach performed better than the others. Principal Component Analysis (PCA) was also applied to reduce dimensionality. Classification was then performed using a deep CNN based on the LeNet architecture. The best result, 80% Acc, was obtained on the ABIDE I dataset.
Stember et al. 77 presented a novel application of Deep Reinforcement Learning (DRL) using rs-fMRI data. Acknowledging the challenges presented by small fMRI datasets, the researchers sought to create a data-efficient model that could differentiate between neurotypical controls and individuals with ASD. A DRL classifier built on 100 graph-label pairs taken from the ABIDE dataset was compared with a supervised DL (SDL) model trained on the same dataset. On a variety of evaluation criteria, the DRL model performed noticeably better than the SDL approach. The DRL classifier achieved an F1 score of 76, as opposed to 67 for the SDL model, with a statistically significant p-value of 2.4 × 10⁻⁷. The DRL model showed progressive learning that transferred well to a separate testing set, whereas the SDL model rapidly overfitted the training data. This suggests that DRL is a useful method for data-limited settings, since it can learn efficiently from small training datasets.
A model that utilizes AAL, Bootstrap Analysis of Stable Clusters (BASC), and Power atlases to extract functional connectivity features from 866 subjects (402 with ASD and 464 controls) in the ABIDE dataset was presented by Subah et al. 78 A Deep Neural Network (DNN) classifier with two hidden layers, as discussed in 95, each containing 32 neurons, was trained using these features. To prevent overfitting, dropout and L2 regularization were applied. The model outperformed several other methods, achieving a mean Acc of 88%, sensitivity of 90%, F1-score of 87%, and an AUC of 96%.
Ahammed et al. 79 proposed a Bag-of-Features (BoF) technique to differentiate individuals with ASD from Typically Developing (TD) controls. This model captures local features from fMRI images, encodes them into a histogram of visual words, and employs a Support Vector Machine (SVM) classifier. The approach aims to identify spatial patterns in brain activity associated with ASD. By focusing on local characteristics and simplifying the input space, the BoF model manages the high dimensionality of fMRI data. These discriminative local features are then used to train the SVM classifier, which categorizes participants as either TD or ASD.
A novel domain adaptation approach for fMRI data, particularly addressing challenges within the ABIDE-II dataset using Fader Networks, was introduced by Pominova et al. 80 Given that ABIDE-II includes fMRI scans from multiple sites with different acquisition protocols, domain shifts can hinder model generalization. To mitigate this, they learned site-invariant latent representations of fMRI data via 3D convolutional AEs. These representations aim to minimize site-specific information while preserving critical brain features. The Fader Network utilizes an encoder-decoder framework, in which the encoder transforms fMRI data into a latent space representation, and the decoder reconstructs the original input from this latent encoding. An adversarial discriminator attempts to predict the site label from the latent space, while the encoder is trained to reduce this discriminator’s ability, effectively removing site-specific signals. This adversarial training encourages learning of domain-invariant features that better reflect the underlying brain pathology. The method demonstrated superior classification Acc and generalizability compared to conventional approaches on the ABIDE-II dataset. Haweel et al. 81 developed a Computer-Aided Diagnosis (CAD) system for ASD detection using fMRI data by combining CNN with Discrete Wavelet Transform (DWT). The system analyzes patterns of local and global brain activity elicited during a speech task to extract discriminative features associated with ASD. The DWT decomposes fMRI signals into temporal and frequency components, which the CNN then uses to learn hierarchical representations for classification.
Utilizing the ABIDE I dataset, which includes 871 samples from both ASD and TD individuals acquired via rs-fMRI, Bayram et al. 82 conducted their analysis. They implemented various DL architectures, including CNNs, LSTM networks, and multimodal models combining CNN and LSTM layers. These models were designed to capture the temporal and spatial characteristics inherent in fMRI data to improve ASD classification Acc. The results showed that the multimodal CNN-LSTM model outperformed the CNN and LSTM models individually, leveraging CNN’s strength in spatial pattern recognition and LSTM’s ability to model temporal dependencies for a more comprehensive analysis of fMRI data.
ASD diagnosis based on sMRI neuroimaging techniques
Summary of studies on ASD diagnosis using sMRI-based neuroimaging techniques.
Additionally, sMRI allows for detailed examination of specific brain regions such as the amygdala, hippocampus, and cerebellum, which have been implicated in ASD. Alterations in the size and shape of these structures correlate with ASD-related symptoms such as memory deficits, motor coordination problems, and emotional regulation difficulties. Structural scans and diffusion-based imaging further reveal altered white matter connectivity, suggesting disrupted communication and information processing in individuals with ASD. Thus, sMRI facilitates early and objective ASD identification by detecting consistent structural patterns. Sravani and Kuppusamy 96 proposed an advanced sMRI-based ASD diagnostic method. Their two-step approach, applied to the ABIDE dataset, involved preprocessing and model tuning. During preprocessing, the Canny Deriche Edge Detection (CDED) technique was used to enhance image quality by accurately identifying edges. The images were then resized and enhanced to optimize them for analysis, providing high-quality input data for model training. For model optimization, the authors combined Deep Convolutional Neural Networks (DCNNs) with a Dipper-Throated Particle Swarm Optimization (DTPSO) algorithm. This multimodal method aimed to efficiently tune DCNN parameters to improve differentiation between ASD and TD individuals. To enhance interpretability, Gradient-weighted Class Activation Mapping (Grad-CAM) was applied, providing insights into the model’s decision-making process, an essential aspect for clinical relevance. The CDED-DCNN-DTPSO model achieved 95.9% Acc, 96.5% precision, 95.9% sensitivity, 95.9% specificity, 96.2% F1-score, and 94.5% AUC, a significant improvement over models without CDED preprocessing.
Focusing on diagnosing ASD in children aged 5 to 10, Bahathiq et al. 97 identified robust neuroimaging biomarkers from sMRI data. Their study utilized data from the ABIDE I and II datasets, along with a local Saudi dataset from King Abdulaziz University (KAU) hospital. Multiple feature selection techniques, including Boruta, Grey Wolf Optimizer (GWO), and Recursive Feature Elimination with Cross-Validation (RFECV), were employed to enhance model Acc. Hyperparameter tuning was conducted using GWO and random search methods. The study also examined the effect of incorporating personal characteristics, such as age and gender, alongside sMRI features. Among seven evaluated ML models, the combination of SVM with GWO for feature selection and hyperparameter tuning performed best, achieving an average Acc of 71%. The model’s adaptability and generalizability were reinforced by testing on a separate local dataset and applying 10-fold cross-validation. This research contributes valuable insights into the neurological basis of ASD by identifying key brain regions associated with the disorder.
Manoj et al. 98 investigated the effectiveness of Morphological Distance Related Features (MDRF) compared to traditional Morphological Features (MF) for ASD classification using sMRI. Data from seven sites across the ABIDE-I and ABIDE-II databases underwent uniform preprocessing. Using the Destrieux atlas, the brain was parcellated into regions from which both MDRF and MF (e.g., surface area) were extracted. Various ML classifiers, including Random Forest (RF), SVM, and MLP, were evaluated on their ability to distinguish ASD from TD based on these features. Results indicated that MDRF notably outperformed MF. Specifically, the RF classifier achieved a single-site average Acc of 95.27% using MDRF compared to 91.78% with MF, and the overall average Acc across sites was 82.91% with MDRF versus 69.08% with MF. These findings highlight MDRF as a more reliable and generalizable feature set for ASD classification. Additionally, the study found that features from the right hemisphere and frontal lobe contributed more to classification, suggesting their importance in ASD-related morphological differences.
Variability. They developed three separate CNN models to explore demographic effects: Model 1 classified subjects based on gender and diagnosis (male-ASD, male-control, female-ASD, female-control); Model 2 segmented based on age groups (children-ASD, children-control, adults-ASD, adults-control); and Model 3 merged both demographic factors into an eight-class structure. Grid Search Optimization was used to tune model parameters, and performance was validated using five-fold cross-validation. Results showed that demographic-aware models significantly improved diagnostic accuracy. Model 2 achieved the highest Acc of 85.42%, while Model 1 reached 80.94%. Model 3, although more complex, yielded a lower Acc of 67.94%, likely due to reduced sample sizes per class.
To further improve classification performance, Jain et al. 101 proposed a multimodal diagnostic framework that integrates a Deep CNN with a Dwarf Mongoose (DM)-optimized ResNet (DM-ResNet). Their approach involves preprocessing sMRI images to remove non-brain tissues, followed by segmentation using multimodal Fuzzy C-Means (FCM) and Gaussian Mixture Model (GMM) techniques. Feature extraction was conducted using the VGG-16 network, and hyperparameters of the DM-ResNet were optimized via the DM algorithm. This pipeline yielded an Acc of 99.83% in classifying ASD using sMRI scans, highlighting the potential of combining advanced segmentation and optimization techniques.
In another study, Gogoi et al. 102 introduced a DL approach to identify ASD using the ABIDE I dataset, which includes unlabeled brain MRI scans from individuals with ASD. They employed five different DL architectures, namely VGG16, Inception v3, ResNet50, DenseNet121, and MobileNet, to analyze anatomical and functional variations. After clustering the data, the models were evaluated for classification performance, with the modified VGG16 network achieving the highest accuracy. The findings underscore the effectiveness of DL models in extracting discriminative features from MRI data and the superior performance of optimized architectures like VGG16.
To overcome the limitations of behavioral diagnostic methods, a digital pipeline for early and objective ASD detection was proposed by Nogay and Adeli. 103 Their two-stage approach involved preprocessing sMRI scans, applying quality control, data augmentation, cropping, and Canny Edge Detection (CED), followed by classification using a CNN optimized via grid search. This pipeline successfully isolated diagnostically significant patterns from neuroimaging data.
An ensemble deep CNN architecture optimized for sMRI-based ASD classification was introduced by Mishra and Pati. 104 Their strategy integrated on-the-fly data augmentation to enhance generalization and mitigate overfitting. Multiple CNN models were fine-tuned using various optimizers, including RMSProp, Adam, and SGD. Evaluation on the ABIDE I dataset showed that the Adam-optimized ensemble achieved the highest performance, with an Acc of 81.3% and an AUC of 0.84, demonstrating the efficacy of their real-time augmentation and ensemble framework.
To improve interpretability and performance, a model combining an attention mechanism with a 3D-ResNet was developed by Chen et al. 105 The attention module enabled the network to focus on diagnostically relevant brain regions, while the residual connections in 3D-ResNet facilitated the training of deeper architectures by addressing vanishing gradient issues.
Jain et al. 106 proposed an automated diagnostic method leveraging shape-based features extracted from the brainstem. Using T1-weighted MRI scans from the ABIDE dataset, they employed FreeSurfer for preprocessing and Spherical Harmonics for generating descriptors that capture the 3D geometry of the brainstem. Feature vectors were statistically refined via ANOVA and classified using an SVM with stratified k-fold cross-validation.
A Self-Attention GAN (SAGAN) trained exclusively on healthy control sMRI data was introduced by Oruganti et al. 107 The model reconstructed consecutive slices from axial, coronal, and sagittal planes, with reconstruction loss serving as an anomaly indicator. Various loss functions, including L2, cosine, and hybrid metrics, were tested. The SAGAN trained on combined axial and coronal planes achieved the best performance (Acc: 95.65%, AUC: 0.90), outperforming those trained on individual planes.
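The idea of using reconstruction loss as an anomaly indicator can be sketched as follows. This is an illustration under stated assumptions, not the SAGAN itself: the reconstructions are taken as given, slices are flattened to plain lists, and the mixing weight `alpha` and the decision threshold are hypothetical parameters that do not come from the study.

```python
import math

def l2_error(original, reconstructed):
    """Mean squared difference between two flattened slices."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

def cosine_distance(original, reconstructed):
    """1 - cosine similarity; close to 0 when the two slices point the same way."""
    dot = sum(a * b for a, b in zip(original, reconstructed))
    norm_a = math.sqrt(sum(a * a for a in original))
    norm_b = math.sqrt(sum(b * b for b in reconstructed))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def hybrid_score(original, reconstructed, alpha=0.5):
    """Weighted blend of the L2 and cosine errors (alpha is an assumed weight)."""
    return (alpha * l2_error(original, reconstructed)
            + (1 - alpha) * cosine_distance(original, reconstructed))

def is_anomalous(original, reconstructed, threshold):
    """Flag a slice whose reconstruction error exceeds an assumed threshold."""
    return hybrid_score(original, reconstructed) > threshold

# Invented intensities: a faithful and a poor reconstruction of the same slice.
slice_values = [0.2, 0.4, 0.6, 0.8]
good_reconstruction = [0.2, 0.4, 0.6, 0.8]
poor_reconstruction = [0.8, 0.6, 0.4, 0.2]
```

Because the generator sees only healthy controls during training, a scan that is reconstructed poorly receives a high score and is treated as a candidate ASD case.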
To tackle the challenges of high dimensionality and small sample sizes in sMRI-based classification, a recursive feature selection framework named RFS-MHDS was developed by Ali et al. 108 Using the ABIDE I dataset, morphological characteristics such as cortical thickness, surface area, and curvature were derived through the FreeSurfer processing pipeline. The RFS-MHDS method iteratively eliminated redundant features through random sampling and selection. Classification using ANN achieved an Acc of 82%, surpassing the 72.
A comparative analysis of left and right surface morphometric features derived from T1-weighted MRI scans was carried out by Misra and Pati. 109 These features were used to train two ML classifiers: Decision Tree (DT) and Random Forest (RF). Results revealed that RF achieved superior classification accuracy, indicating its effectiveness in capturing subtle morphometric differences associated with ASD.
An integrative CNN classifier utilizing individual Morphological Brain Networks (MBNs) was proposed by Gao et al. 110 Using the ABIDE I dataset comprising 518 ASD and 567 control participants, they extracted cortical thickness measures from 68 brain regions as defined by the Desikan-Killiany atlas. Pearson correlation matrices were computed to construct individual structural covariance networks. These MBNs were then input into a deep residual CNN model. Interpretability was enhanced via Grad-CAM, which visualized regions most influential to classification. The resulting architecture achieved a classification Acc of 71.8%, outperforming prior approaches focused solely on local morphological descriptors.
ASD diagnosis based on EEG neuroimaging techniques
Summary of studies on ASD diagnosis using EEG-Based neuroimaging techniques.
In the work of Sharifi et al., 111 EEG signals were collected using 129 channels from both autistic and neurotypical participants. To reduce the high dimensionality, the study selected 17 channels over the scalp. Bandpass filtering was applied to eliminate frequencies below 0.5 Hz and above 50 Hz, which minimized noise and irrelevant components. A CNN was then employed to extract features from the raw EEG signals. These extracted features were used to train five classifiers including SVM, Linear Discriminant Analysis (LDA), DT, Gaussian Naive Bayes (GNB), and RF. Among these, the models using SVM, GNB, and RF achieved 100% Acc.
A multimodal GCN named Rest-HGCN was introduced for ASD diagnosis using resting-state EEG data by Tang et al. 112 The EEG data were filtered into five bands: delta, theta, alpha, beta, and gamma. For each band, functional connectivity networks were constructed using the phase-locking value (PLV) method to assess synchronization across brain regions. Differential entropy features were also extracted to reflect local activation levels. The Rest-HGCN model consists of two branches: a data-driven branch built from the EEG data itself and a cognitive prior branch that incorporates brain functional networks. These two branches share information through an attention mechanism that emphasizes relevant features and suppresses irrelevant ones. The final classification used features combined from both branches. This method, when evaluated using the Autism Biomarkers Consortium for Clinical Trials (ABC-CT) dataset with k-fold cross-validation, achieved an accuracy of 87.12% in single-subject analysis and 85.32% in cross-experiment analysis.
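The phase-based synchronization measure used to build the connectivity networks can be sketched as below, using the standard definition of the phase-locking value: the magnitude of the time-averaged unit phasor of the phase difference between two channels. This sketch assumes the instantaneous phases have already been extracted upstream (for example with a Hilbert transform of the band-filtered signal, which is not shown), and the toy phase series are invented.

```python
import cmath
import math

def phase_locking_value(phases_a, phases_b):
    """PLV = |mean(exp(i * (phi_a - phi_b)))| over time; ranges from 0 to 1."""
    n = len(phases_a)
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total / n)

# Invented phase series (radians) for three channels, 100 time points each.
channel_a = [0.1 * k for k in range(100)]
channel_b = [p - 0.5 for p in channel_a]  # constant lag: fully locked
channel_c = [p + 2 * math.pi * k / 100    # drifting lag: no locking
             for k, p in enumerate(channel_a)]

locked = phase_locking_value(channel_a, channel_b)
unlocked = phase_locking_value(channel_a, channel_c)
```

A constant phase lag gives a value near 1, while a phase difference that sweeps uniformly around the circle gives a value near 0; computing this for every channel pair yields one connectivity matrix per frequency band.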
In a study conducted by Hao Luo et al., 113 EEG recordings from 20 toddlers with ASD and 25 healthy participants were collected using a 19-channel system based on the international 10–20 system. The recordings were made during both resting and sleep states. Preprocessing involved manual inspection, independent component analysis for artifact removal, and band-pass filtering between 0.1 Hz and 45 Hz. The signals were separated into Alpha, Theta, Alpha-1, Alpha-2, and Beta sub-bands. Magnitude-squared coherence was used to assess functional connectivity across channel pairs, and node strength was computed for each electrode. A sliding-window technique divided the signals into overlapping segments, from which six statistical features (mean, standard deviation, median, interquartile range, kurtosis, and skewness) were computed. This resulted in a total of 95 features per measure. Using SVM with recursive feature elimination for classification and feature selection, the best performance was achieved with a 3-second window and 50% overlap, yielding an average accuracy of 95.2%. Only a few features, especially from the frontal region and Beta band, were needed to reach 100% accuracy.
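The windowing and feature extraction step can be sketched as follows. The 3-second window and 50% overlap follow the text; everything else is an assumption made for illustration, including the function names, the toy sampling rate, the population (non-excess) definitions of skewness and kurtosis, and the quartile method, since the study's exact estimators are not stated here.

```python
import statistics

def sliding_windows(signal, fs, window_s=3.0, overlap=0.5):
    """Split a signal into fixed-length windows with fractional overlap."""
    size = int(window_s * fs)
    step = max(1, int(size * (1 - overlap)))
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def window_features(window):
    """Six summary statistics of one window (population moments)."""
    n = len(window)
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    q1, _, q3 = statistics.quantiles(window, n=4)
    if std == 0:
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((x - mean) / std) ** 3 for x in window) / n
        kurtosis = sum(((x - mean) / std) ** 4 for x in window) / n
    return {
        "mean": mean,
        "std": std,
        "median": statistics.median(window),
        "iqr": q3 - q1,
        "kurtosis": kurtosis,
        "skewness": skewness,
    }

# Toy signal (invented): 6 seconds sampled at a hypothetical 4 Hz.
toy_signal = [float(k % 5) for k in range(24)]
segments = sliding_windows(toy_signal, fs=4)
features = [window_features(w) for w in segments]
```

In the study these statistics were computed on connectivity and node-strength measures rather than on a raw toy signal, and the resulting feature vectors were passed to the SVM with recursive feature elimination.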
In the research by Alhassan et al., 114 an energy-efficient EEG-based system for ASD detection was developed using wearable sensors. Low-power portable devices were used to acquire signals, and real-time preprocessing was performed on-node to remove artifacts, reduce noise, and apply band-pass filtering. Features from statistical and frequency domains were extracted and compressed to save energy. These were used as input to ML models including SVM, LR, and DT, and the system achieved a classification accuracy of 96%, combining high detection accuracy with low energy consumption, making it suitable for real-time and large-scale applications.
In the work of Santhosh Peketi and Sanjay B. Dhok, 115 the Brain-Computer Interface for Autism (BCIAUT)-P300 dataset, which includes EEG signals from 15 individuals with ASD aged between 16 and 38, was utilized. Signals were recorded from eight parietal and central electrodes during a visual oddball task, as also discussed in 90. After preprocessing to remove artifacts, the data were segmented into epochs corresponding to target (P300) and non-target stimuli. Variational Mode Decomposition was applied to extract five modes per epoch. From these, 30 linear and nonlinear features were calculated from both time and frequency domains. Class imbalance was addressed using the SMOTE technique. Each mode’s features were evaluated using SVM with a fine Gaussian kernel, KNN, and DT. The fifth mode with SVM provided the best performance with 91.12% Acc, a 91.18% F1-score, and a 96.6% AUC.
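The class-balancing step can be illustrated with a simplified sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the implementation used in the study or in any library; the function name, the neighbour count `k`, the seed, and the toy feature vectors are assumptions.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: squared_distance(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Invented two-dimensional feature vectors for the minority (target) class.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_sketch(minority_samples, n_new=6)
```

In an oddball paradigm the target (P300) epochs are the rarer class, so synthetic samples of this kind would be added to that class before training the classifiers.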
As reported by Tawhid et al., 116 a publicly available EEG dataset with data from both ASD and control participants was used. Preprocessing involved re-referencing, filtering, artifact elimination, and normalization of the EEG signals. The Short-Time Fourier Transform (STFT) was then employed to convert the signals into two-dimensional spectrograms capturing time-frequency features. Two distinct classification strategies were subsequently evaluated. The ML pipeline extracted textural features from the spectrograms, reduced them using PCA, and tested six classifiers. Meanwhile, the DL pipeline used three CNN models trained directly on the spectrogram images. The highest-performing CNN achieved a classification accuracy of 99.15%, surpassing the best ML classifier, which reached 95.25%.
As presented by Sinha et al., 117 a computer-aided diagnostic method for ASD was developed using EEG data from 30 subjects (20 control and 10 ASD). Preprocessing included digital filtering using both FIR and IIR filters. Time-domain features such as mean, standard deviation, skewness, entropy, variance, and kurtosis were extracted. Frequency-domain features were obtained through the Discrete Wavelet Transform, breaking down signals into delta, theta, alpha, beta, and gamma bands. Multiple classifiers, including LDA, KNN, subspace KNN, and SVM were evaluated, and the subspace KNN achieved the highest Acc of 92.8%.
In the work of Esqueda-Elizondo et al., 118 attention detection in a 13-year-old child with ASD was investigated using an Emotiv Epoc+ headset. Signals were captured from electrodes F3, F4, P7, and P8 at 2048 Hz and later downsampled to 128 Hz after filtering. Behavioral recordings were synchronized with the EEG data. Power Spectral Density analysis was used to extract frequency-domain features. Relative power values from theta, alpha, and beta bands were used to compute 24 metrics, including the Theta–Beta Ratio and the Theta/(Alpha+Beta) ratio. These features were evaluated using eight ML models, such as those in 121, and the multilayer perceptron achieved the best performance with an accuracy of 92.98% and an AUC of 0.9299.
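Two of the band-power metrics named above can be written out directly. The sketch below assumes that absolute band powers have already been obtained from the Power Spectral Density; the function names and the numeric values are invented for illustration.

```python
def relative_band_power(band_power):
    """Normalize absolute band powers so that they sum to 1."""
    total = sum(band_power.values())
    return {band: power / total for band, power in band_power.items()}

def attention_ratios(relative_power):
    """Theta-Beta Ratio and Theta/(Alpha+Beta) from relative band powers."""
    theta = relative_power["theta"]
    alpha = relative_power["alpha"]
    beta = relative_power["beta"]
    return {
        "theta_beta_ratio": theta / beta,
        "theta_over_alpha_plus_beta": theta / (alpha + beta),
    }

# Invented absolute band powers for one electrode (arbitrary units).
absolute_power = {"theta": 6.0, "alpha": 3.0, "beta": 1.0}
ratios = attention_ratios(relative_band_power(absolute_power))
```

With these toy values the Theta–Beta Ratio is 6 and the Theta/(Alpha+Beta) ratio is 1.5; repeating such computations per electrode and per ratio is one plausible way to arrive at a larger metric set.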
Using a 19-channel DABO system, Zubair et al. 119 collected EEG signals from 10 children (eight with ASD and two controls) during several activities, including resting with eyes open/closed, watching emotional videos, hand clenching, and playing a memory game. Signals were filtered using an elliptic filter and decomposed into alpha, beta, and gamma bands. The alpha band was analyzed in detail due to its link to cognitive processes. Mel Frequency Cepstral Coefficients (MFCC), commonly used in speech analysis, were extracted to describe the EEG data. An MLP neural network was used for classification, and emotional states were analyzed using Russell’s valence-arousal model.
Alturki et al. 120 proposed a feature extraction approach combining Local Binary Pattern (LBP) and Common Spatial Pattern (CSP) for improved classification. This method was tested on EEG datasets for both ASD and epilepsy. Signals were preprocessed using a fifth-order Butterworth filter and ICA. CSP features were combined with energy, band power, and entropy-based features. Multiple classifiers, including KNN, SVM, LDA, and ANN, were tested. The combination of CSP and LBP performed best with KNN, achieving an average Acc of 98.46% for ASD and 98.62% for epilepsy.
ASD diagnosis based on multimodal neuroimaging techniques
Summary of studies on ASD diagnosis using multimodal-based neuroimaging techniques.
By integrating both structural and functional neuroimaging modalities, the method introduced by Song et al. 123 enhanced diagnostic Acc while simultaneously addressing the heterogeneity observed in individuals with ASD. Their framework employs two transformer models to capture the dynamic nature of brain activity over time. The first transformer processes temporal aspects of fMRI data, as also explained in 129, whereas the second merges these temporal features with spatial features extracted through a GCN applied to sMRI data. This co-training mechanism enables the model to generalize better across datasets by utilizing complementary information from both imaging types. Validation conducted on the ABIDE-I and ABIDE-II datasets confirmed the model’s performance, reaching 79.47% Acc, 78.97% precision, 82.11% recall, and an AUC of 0.85.
To examine both anatomical and functional connectivity patterns, a dual-branch GNN architecture was developed by Gao et al., 124 combining structural and functional MRI data. Their model yielded a classification accuracy of 73.9%. To further explore mechanisms distinguishing ASD from typical development, the authors implemented a perturbation model to identify brain imaging biomarkers. Notably, prominent fMRI features were localized in the frontal, temporal, occipital, and cerebellar regions, whereas the sMRI data revealed significant characteristics within the frontal, temporal, parietal, and occipital lobes. These results underscore the complementary roles of structural and functional modalities in detecting ASD-related brain changes. In addition, the study incorporated transcriptomic data with neuroimaging features using partial least squares regression and filtering techniques, uncovering gene associations related to “presynapse,” “behavior,” and “modulation of chemical synaptic transmission,” thereby shedding light on the neurogenetic basis of ASD.
A diagnostic model named Multi-Kernel Learning Fusion algorithm based on Recurrent Neural Network (RNN) and Gated Recurrent Unit (GRU) (MKLF-RAG) was introduced by Chen et al., 125 to classify ASD and identify critical brain regions. This approach fuses sMRI and fMRI data, utilizing RNN and GRUs for multimodal feature selection, followed by multi-kernel fusion to combine extracted information. The method was tested on the ABIDE dataset, comprising data from 1,111 participants, and achieved a diagnostic accuracy of 64.3%, illustrating its capability to improve ASD classification through DL and multimodal fusion techniques. A related approach was proposed by Yakolli et al., 126 where phenotypic data were integrated with sMRI and fMRI features in an automated classification framework. The model employed CNNs for processing neuroimaging inputs from the ABIDE datasets and reported high classification accuracies of 87% using sMRI and 88% using fMRI, indicating the strong capability of CNNs in multimodal brain data analysis.
Manikantan and Jaganathan 127 proposed a graph-based model where subjects are represented as nodes and inter-node connections are defined using radiomic similarity measures derived from sMRI data. Spatial-temporal features from rs-fMRI were processed using a multichannel 3D CNN to generate node attributes. Radiomic features such as texture, shape, and intensity were reduced using stacked AEs to manage dimensionality. To address variations across sites in the ABIDE dataset, the fusion strategy combined both structural and functional information. A GCN was used to learn discriminative embeddings for classification. Functional connectivity metrics such as Voxel-Mirrored Homotopic Connectivity (VMHC), Regional Homogeneity (ReHo), and Amplitude of Low-Frequency Fluctuations (ALFF) were used to describe neural synchrony. Incorporating structural similarities into graph edges helped reduce inter-site variability. The proposed model achieved an AUC of 0.87 and an accuracy of approximately 83.3%.
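The population-graph idea described above can be sketched in a few lines. The sketch assumes cosine similarity as the radiomic similarity measure and a fixed threshold for creating edges, both of which are illustrative choices and not details confirmed by the cited study. It then performs one step of row-normalized neighborhood averaging, which is the aggregation part of a graph convolution without learned weights.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_adjacency(edge_features, threshold):
    """Connect subjects whose similarity reaches the threshold (with self-loops)."""
    n = len(edge_features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0
        for j in range(i + 1, n):
            sim = cosine_similarity(edge_features[i], edge_features[j])
            if sim >= threshold:
                adj[i][j] = adj[j][i] = sim
    return adj


def propagate(adj, node_features):
    """One row-normalized aggregation step over the subject graph."""
    dim = len(node_features[0])
    out = []
    for row in adj:
        total = sum(row)
        out.append(
            [
                sum(row[j] * node_features[j][d] for j in range(len(row))) / total
                for d in range(dim)
            ]
        )
    return out


# Hypothetical data: three subjects, 2-D radiomic vectors, 1-D node attributes.
radiomic = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
node_attributes = [[1.0], [3.0], [10.0]]
adjacency = build_adjacency(radiomic, threshold=0.8)
smoothed = propagate(adjacency, node_attributes)
```

Subjects 0 and 1 have similar radiomic profiles and therefore share information, whereas subject 2 is connected only to itself and keeps its original attribute.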
To address the challenge of limited neuroimaging data, Alharthi and Alzahrani 128 proposed a slice-based strategy that involved extracting multiple slices from 3D sMRI and 4D fMRI volumes along the axial, coronal, and sagittal planes. These extracted slices were subsequently analyzed using 3D CNNs for both imaging modalities. In addition, the study evaluated the performance of four pretrained vision models (ConvNeXt, MobileNet, Swin, and ViT, of which Swin and ViT are transformer-based and ConvNeXt and MobileNet are convolutional) on sMRI data to explore the potential of transfer learning as discussed in. 130,131 The approach, validated on the NYU subset of the ABIDE dataset, yielded strong results. When using 50 central slices from fMRI volumes, the model achieved a classification accuracy of 87.10% and an F1-score of 82.61%. Using the complete set of mid-volume slices (excluding the peripheral views) resulted in 83.87% accuracy and a 77.27% F1-score. Among the pretrained models, ConvNeXt delivered the best performance when applied to the central sMRI slices.
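The slice-based strategy can be illustrated with a minimal sketch that selects a contiguous block of central slices along one axis of a 3D volume. The volume is represented as a nested list indexed as `volume[x][y][z]`, and the mapping of axes 0, 1, and 2 to sagittal, coronal, and axial planes is an assumption that depends on image orientation. The function names are hypothetical and are not taken from the cited study.

```python
def central_indices(length, count):
    """Indices of `count` contiguous slices centered along an axis of size `length`."""
    count = min(count, length)
    start = (length - count) // 2
    return list(range(start, start + count))


def take_slice(volume, axis, index):
    """Extract one 2D slice from a 3D nested list indexed as volume[x][y][z].

    Assumed orientation: axis 0 = sagittal, axis 1 = coronal, axis 2 = axial.
    """
    if axis == 0:
        return [list(row) for row in volume[index]]
    if axis == 1:
        return [list(plane[index]) for plane in volume]
    if axis == 2:
        return [[row[index] for row in plane] for plane in volume]
    raise ValueError("axis must be 0, 1, or 2")


# Hypothetical 4 x 3 x 2 volume whose voxel values encode their coordinates.
volume = [
    [[100 * x + 10 * y + z for z in range(2)] for y in range(3)]
    for x in range(4)
]

# Keep the two central slices along axis 0.
selected = central_indices(len(volume), 2)
central_slices = [take_slice(volume, 0, i) for i in selected]
```

For a 4D fMRI volume the same extraction would be applied per time point, or to a temporal summary of the volume. Discarding peripheral slices reduces the proportion of inputs that contain little or no brain tissue.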
Most frameworks are evaluated under controlled experimental conditions, which may not reflect real-world clinical complexity. Furthermore, limited interpretability of DL models reduces clinician trust and decision transparency. Therefore, greater emphasis on standardized protocols, large-scale validation, and clinically explainable models is essential to bridge the gap between research findings and practical ASD diagnosis.
ASD classification datasets
The datasets used for ASD diagnosis techniques vary depending on the neuroimaging modalities involved. These datasets are employed for training and testing the performance of ASD diagnostic methods. Small datasets increase overfitting and reduce generalizability. High variability (e.g., multi-site scanners, demographics, protocols) introduces domain shifts that can reduce model stability. Larger and diverse datasets improve robustness, but require harmonization to ensure consistent and reproducible ASD classification performance. Below is a detailed overview of the available datasets.
ABIDE dataset
The Autism Brain Imaging Data Exchange (ABIDE) is a widely used open-access neuroimaging dataset developed to support research on the diagnosis of ASD. 132 It was launched in 2012 as a collaborative initiative to collect and share pre-existing MRI data from research institutions worldwide. Its primary objective is to advance understanding of the neurological foundations of ASD by providing extensive, standardized data that promotes generalization and reproducibility across studies. The dataset includes phenotypic information such as age, gender, IQ, and diagnostic status, along with sMRI and fMRI data. The ABIDE dataset is divided into two parts: ABIDE I and ABIDE II.
ABIDE I
To facilitate investigations into ASD using neuroimaging data, the open-access ABIDE I dataset was introduced in August 2012. 133 Seventeen international research institutions collaborated to form ABIDE I, aiming to aggregate and share previously acquired MRI data to support an extensive study of the brain foundations of ASD. Researchers have access to sMRI, rs-fMRI, and a wide range of phenotypic information, including age, gender, IQ scores, and clinical diagnoses. The dataset’s reproducibility and generalizability have greatly benefited research in ML and neuroscience.
Comprising data from 1,112 participants, including 539 with ASD and 573 typically developing (TD) controls, ABIDE I spans a wide age range, from childhood through adulthood. Seventeen leading institutions worldwide, such as NYU Langone Medical Center, Stanford University, Yale, the University of Michigan, and the University of California (Los Angeles and San Diego), contributed to the dataset. Although each center employed its own imaging protocols, which introduced some variability, this also created a valuable resource for evaluating the consistency of findings across diverse populations and scanning conditions. The heterogeneity strengthens the reliability and external validity of conclusions related to ASD-associated brain alterations.
In terms of imaging content, the dataset includes rs-fMRI scans that capture brain connectivity patterns while subjects are at rest, enabling analysis of functional networks commonly implicated in ASD, such as the Default Mode Network (DMN) and the Salience Network. Meanwhile, the sMRI scans offer high-resolution anatomical images used to study features like cortical thickness, brain volume, and white and gray matter irregularities. To ensure compatibility with popular neuroimaging software and workflows, the dataset follows the Brain Imaging Data Structure (BIDS) standard. Beyond contributing to the development of advanced computational methods for classification and diagnosis, ABIDE I has played an essential role in uncovering structural and functional biomarkers associated with ASD.
ABIDE II
To address the limitations of the initial dataset and advance ASD research by providing larger, more balanced, and diverse neuroimaging data, ABIDE II was developed. Launched in 2017, 134 ABIDE II builds upon the primary goals of ABIDE I, which aimed to support robust and consistent neuroscience research through open-access, standardized brain imaging and phenotypic data. ABIDE II emphasizes developmental trajectories and clinical heterogeneity in ASD, expanding the number of imaging sites, offering more detailed phenotypic assessments, and covering a wider age range of participants.
The dataset includes neuroimaging data from 19 international research institutions, comprising approximately 1,044 individuals with a more balanced distribution of ASD and TD controls. A notable advancement in ABIDE II is the increased participation of women, which helps address the gender imbalance seen in ABIDE I. The broader age range, spanning early childhood to adulthood, allows researchers to examine how ASD-related brain abnormalities evolve over time.
ABIDE II provides high-quality sMRI and rs-fMRI data, along with comprehensive phenotypic evaluations such as assessments of multiple medical conditions, behavioral scales (e.g., ADOS, SRS), and cognitive scores. An important improvement in ABIDE II is the adoption of more standardized scanning protocols across participating sites, which enhances data comparability and reduces site-related variability, facilitating accurate multi-site analyses.
Similar to ABIDE I, ABIDE II is organized according to the BIDS standard, making it compatible with common neuroimaging analysis tools and easily accessible. Researchers have used ABIDE II to study functional connectivity differences in ASD, identify brain network biomarkers, and train ML models for ASD classification and subgrouping. Overall, ABIDE II plays a critical role in addressing the gaps left by ABIDE I, particularly in understanding development, providing richer phenotypic data, and improving demographic diversity. Numerous studies exploring neurodevelopmental mechanisms, brain network abnormalities, and ASD biomarkers have referenced this publicly available dataset.
KAU dataset
The datasets for ASD research from KAU provide valuable resources for developing ML-based diagnostic methods. These datasets include sMRI and EEG recordings from children and adolescents with ASD, as well as TD controls. 135 The sMRI dataset consists of T1-weighted scans from 33 subjects aged 5 to 10, acquired using 3T Siemens SKYRA and VERIO scanners. Diagnoses were established following DSM-5 criteria, excluding individuals with comorbid conditions such as ADHD or epilepsy to ensure clinical validity and consistency. The imaging data were converted into NIfTI format and organized according to the BIDS standard to maintain compatibility with modern neuroimaging software.
The dataset comprising EEG recordings includes data from 18 male subjects between the ages of 9 and 16, where 8 participants were diagnosed with ASD and 10 served as control subjects, as reported by the study in. 135 The signals were acquired using a 16-channel g.tec system with a sampling frequency of 256 Hz. Data were recorded under resting-state conditions and processed with appropriate filtering, making them suitable for examining neural activity patterns linked to ASD. Researchers have applied methods such as wavelet transformations and CNNs to study anomalies in brain rhythms, coherence, and entropy metrics using this dataset.
Importantly, the KAU datasets contribute to international collaboration and innovation in ASD diagnosis by providing unique demographic diversity to the field. These datasets are publicly accessible through the university’s Brain-Computer Interface (BCI) research portal.
ABC-CT dataset
The ABC-CT dataset was developed as a comprehensive multisite neuroimaging and behavioral resource to identify empirical biomarkers for ASD. 53 The dataset includes data from 280 children aged 6 to 11 (180 with ASD and 100 TD controls) collected over six months. It comprises EEG, eye-tracking (ET), and behavioral video recordings from five leading U.S. institutions: Yale, Duke, Boston Children’s Hospital, UCLA, and University of Washington. The dataset integrates ET to analyze eye movement patterns, automated behavioral coding to assess social interaction, and high-temporal-resolution EEG to capture neural oscillations during social tasks. The inclusion of DNA samples from participants and their biological parents also allows for future genetic research. The ABC-CT employs strict, standardized protocols to address ASD heterogeneity and focuses on evaluating biomarkers for clinical trials, particularly those that predict treatment outcomes. Its open-access structure is maintained through quarterly data uploads to the National Database for Autism Research (NDAR), making the dataset publicly available.
BCIAUT-P300 dataset
The BCIAUT-P300 is a multi-session, multi-subject EEG dataset developed specifically for the study of ASD using BCI technology based on the P300. 136 It contains data from 15 individuals with ASD, each of whom completed seven sessions of joint-attention training, for a total of 105 sessions. The data were collected during an intervention that used a virtual environment where participants responded to social cues via P300 signals to improve joint-attention abilities, which are often impaired in ASD. This large dataset supports the development and evaluation of ML and DL algorithms for EEG decoding, especially in multi-session and multi-subject contexts, enabling reliable and broadly applicable BCI systems for ASD populations. Following the 2019 IFMBE Scientific Competition, where several international teams competed to achieve the highest P300 signal classification Acc, the BCIAUT-P300 dataset was made publicly available. DL methods, particularly lightweight CNNs such as EEGNet, outperformed traditional ML techniques with accuracies above 90%. Owing to its multi-subject, multi-session design and its clinical relevance, the dataset serves as a valuable baseline for future research aiming to improve ASD diagnosis and intervention through EEG-based BCI. It also allows researchers to design personalized, adaptive BCI training protocols for individuals with ASD and to study the stability of brain signals over time.
Potential concerns and future directions
The diagnosis of ASD using fMRI, sMRI, EEG, and multimodal neuroimaging modalities is expected to change rapidly in the future due to advances in neuroimaging technology, ML, and multimodal integration. Below is an extensive review of recent developments and promising directions for each modality:
Potential concerns
The following are the potential concerns of ASD diagnosis.
Requirements of Geographical Data: To improve diagnosis and enable customized treatment, data collection and evaluation focused on ASD subgroups are urgently needed. Additionally, since the prevalence of ASD varies geographically, we suggest concentrating research efforts on regional datasets. 137
Requirements of Standardized Data Structure: ML classifiers need to be stable, meaning that modifications to the training data should not significantly affect the results. Stability is essential for classification studies because replicating results can be challenging due to variability in classifier predictions. 138 The repeatability of the model should also be considered. Several proposed approaches aim to harmonize data structure, analysis, and storage, including the Brain Imaging Data Structure (BIDS). 139
Multimodal Fusion: Multimodal fusion combines EEG, fMRI, and sMRI data to identify structural and functional abnormalities related to ASD. 140 Using DL techniques, this approach enhances the extraction and integration of complementary information across modalities, leading to more accurate and reliable classification.
Brain Age Prediction Models: Brain age prediction models, as described by Kumari et al., 141 utilize structural MRI data along with ML techniques to estimate an individual’s brain age. In individuals with ASD, differences between chronological age and predicted brain age may indicate abnormal neurodevelopment or underlying neurological irregularities. Such models present a valuable biomarker for identifying and tracking developmental abnormalities associated with ASD.
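The brain age gap can be made concrete with a deliberately simplified sketch. It assumes a single hypothetical sMRI-derived feature and an ordinary least-squares line fitted on TD controls. Practical brain age models use many features and nonlinear learners, so the numbers below are purely illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def brain_age_gap(slope, intercept, feature, chronological_age):
    """Predicted brain age minus chronological age (positive = older-appearing)."""
    predicted_age = slope * feature + intercept
    return predicted_age - chronological_age


# Hypothetical training data from TD controls: feature value vs. age in years.
control_features = [10.0, 20.0, 30.0, 40.0]
control_ages = [8.0, 12.0, 16.0, 20.0]
slope, intercept = fit_line(control_features, control_ages)

# Hypothetical test subject aged 15 with a feature value of 35.
gap = brain_age_gap(slope, intercept, feature=35.0, chronological_age=15.0)
```

In this toy example the fitted model predicts a brain age of about 18 years for a 15-year-old subject, giving a gap of about 3 years. Fitting on controls only is a common design choice so that the gap reflects deviation from typical development.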
Portable EEG Devices: Portable EEG devices are becoming useful tools for early ASD screening in everyday environments such as homes and schools. 142 These wearable devices enable continuous and non-invasive monitoring of brain activity, making assessments easier and more comfortable for children. Their convenience supports widespread early detection efforts beyond traditional healthcare settings.
Atlas-Free Learning Approaches: Data-driven models can now directly detect structural anomalies from sMRI data using atlas-free learning techniques that do not rely on predefined brain regions. 143 This flexibility allows for the identification of distinct patterns associated with ASD.
Cost Constraints of Neuroimaging-Based ASD Diagnosis: Neuroimaging modalities such as MRI and fMRI, widely used in ASD diagnostic research, involve high equipment costs, maintenance requirements, and trained personnel for acquisition and analysis. These expenses limit the accessibility of such diagnostic approaches in low-resource and rural healthcare settings. Even when ML models demonstrate high accuracy, the dependency on costly imaging infrastructure restricts their practical scalability. Future research should explore cost-effective imaging alternatives, optimized scanning protocols, and lightweight computational frameworks that can reduce the financial burden of deploying ML-based ASD diagnostic systems.
Regulatory and Ethical Challenges: The clinical adoption of ML-driven neuroimaging systems must comply with strict regulatory standards and ethical guidelines. Issues related to patient data privacy, informed consent, model validation, and approval from medical regulatory bodies present significant barriers. Additionally, the black-box nature of many DL models raises concerns regarding accountability and decision transparency in clinical decision-making. Addressing these regulatory and ethical challenges requires the development of explainable, validated, and standards-compliant AI frameworks suitable for healthcare deployment.
Usability and Clinical Integration: Another critical challenge is the usability of ML-based diagnostic tools by clinicians who may not possess expertise in ML or data science. Integrating these systems into existing clinical workflows, electronic health record systems, and diagnostic procedures remains complex. User-friendly interfaces, automated reporting, and clinician-interpretable outputs are essential for practical adoption. Future systems should prioritize human-centered design principles to ensure seamless interaction between AI tools and healthcare professionals.
Key Barriers to Clinical Translation of Neuroimaging-Based ASD Diagnosis: Translating neuroimaging-based ASD diagnosis into clinical practice still faces several key challenges. First, limited generalizability and small sample sizes reduce confidence in model reliability across diverse populations and imaging centers. Second, multi-site variability and lack of standardized acquisition protocols introduce inconsistencies that affect reproducibility and clinical trust. Third, model interpretability, computational complexity, and regulatory validation requirements make integration into routine clinical workflows difficult.
Future directions
The following are the future directions in ASD diagnosis.
Model Optimization: DL techniques rely heavily on hyperparameter tuning, which can significantly impact performance. The hyperparameter set consists of two components: network structural parameters (such as the number of hidden layers and units, activation function, and dropout rate) and model optimization parameters (such as batch size, optimization method, and learning rate). Hyperparameter optimization methods, including model-based techniques like Bayesian Optimization 144 and simpler search strategies such as grid search and random search, 145 are recommended to find a well-performing configuration.
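Grid search and random search over such a hyperparameter set can be sketched as follows. The search space values and the `validation_score` function are hypothetical stand-ins: in practice the score would come from training a model with the given configuration and evaluating it on held-out data. Bayesian optimization is not shown because it requires a surrogate model.

```python
import itertools
import random

# Hypothetical search space mixing structural and optimization parameters.
search_space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "dropout": [0.1, 0.3, 0.5],
    "batch_size": [16, 32],
}


def validation_score(config):
    """Hypothetical stand-in for 'train the model and return validation Acc'."""
    penalty = (
        abs(config["learning_rate"] - 1e-3) * 10
        + abs(config["dropout"] - 0.3)
        + (0.05 if config["batch_size"] != 32 else 0.0)
    )
    return 0.9 - penalty


def grid_search(space, score_fn):
    """Evaluate every combination and return (best_score, best_config)."""
    keys = sorted(space)
    best = None
    for values in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = score_fn(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best


def random_search(space, score_fn, n_trials, seed=0):
    """Evaluate `n_trials` randomly sampled configurations."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {k: rng.choice(v) for k, v in space.items()}
        score = score_fn(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best


grid_best = grid_search(search_space, validation_score)
random_best = random_search(search_space, validation_score, n_trials=5)
```

Grid search evaluates all 18 combinations here, whereas random search evaluates only five. With expensive neuroimaging models, the cost of each evaluation is what usually motivates random or model-based search over exhaustive grids.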
In addition to data-related challenges, effective optimization of DL models remains crucial for reliable ASD diagnosis. Many studies employ standard training settings without systematic hyperparameter tuning, architecture search, or regularization strategies tailored to neuroimaging data. Future research should emphasize advanced optimization techniques such as automated hyperparameter optimization, neural architecture search, cross-site validation strategies, and regularization methods that prevent overfitting to small datasets. Incorporating optimization-aware training pipelines can significantly enhance model stability, reproducibility, and generalization across diverse neuroimaging cohorts.
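One cross-site validation strategy is leave-one-site-out evaluation, in which every acquisition site is held out once as an unseen test set. The sketch below shows only the splitting logic. The site labels are illustrative, and the function name is hypothetical.

```python
def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each unique site."""
    for held_out in sorted(set(sites)):
        train = [i for i, s in enumerate(sites) if s != held_out]
        test = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train, test


# Hypothetical site label for each of five subjects.
sites = ["NYU", "NYU", "UCLA", "Yale", "UCLA"]
folds = list(leave_one_site_out(sites))

for held_out, train_idx, test_idx in folds:
    # A model would be trained on train_idx and evaluated on test_idx here.
    print(held_out, train_idx, test_idx)
```

Because no subject from the held-out site is seen during training, performance under this scheme is a more direct estimate of generalization to a new scanner or protocol than a random split that mixes sites.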
Data Augmentation: Although obtaining training samples is often challenging in many real-world situations, especially for neuroimaging data, DL techniques require a large number of samples to effectively train neural networks. One of the most commonly cited obstacles in applying DL to neuroimage analysis is the lack of sufficient training data. 146 To address this issue, data augmentation techniques have been proposed and are frequently used to increase the number of training samples.
Although several studies 72,146–148 employ basic data augmentation techniques such as rotation, flipping, noise addition, and synthetic sample generation, these approaches often do not fully capture the complex variability present in neuroimaging data. Future research should explore advanced augmentation strategies, including GANs, diffusion models, connectivity-preserving transformations, and modality-specific augmentation that respects anatomical and functional brain structures. Such intelligent augmentation methods can enhance data diversity while maintaining neurological validity, thereby improving the robustness and generalization of DL models for ASD diagnosis.
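Two of the basic augmentation techniques mentioned above, flipping and noise addition, can be sketched on a 2D slice represented as a nested list. The noise level, flip probability, and function names are hypothetical choices for illustration. Note that left-right flipping may not be anatomically neutral because of hemispheric asymmetry, which is one reason modality-specific augmentation is recommended.

```python
import random


def flip_left_right(image):
    """Mirror each row of a 2D image."""
    return [list(reversed(row)) for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each pixel."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]


def augment(image, n_copies, sigma=0.01, seed=0):
    """Create `n_copies` augmented versions; each is flipped with probability 0.5."""
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_copies):
        if rng.random() < 0.5:
            candidate = flip_left_right(image)
        else:
            candidate = [list(row) for row in image]
        augmented.append(add_gaussian_noise(candidate, sigma, rng))
    return augmented


# Hypothetical 2 x 3 slice.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
copies = augment(image, n_copies=4)
```

Augmentation should be applied only to the training split so that the evaluation data remain untouched.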
Diverse Training Data Sources: To enhance generalization, it may be essential to train models using datasets collected from multiple sources, as demonstrated in the work of Nielsen et al., 149 which supports adaptive classifier learning. Furthermore, later research by Li et al. 150 shows that applying multitask learning, where each data collection site is considered a distinct task, can help mitigate the effects of dataset variability.
Multiscale Entropy and Complexity Metrics: The nonlinear and dynamic features of brain activity, which are often altered in individuals with ASD, can be better understood using multiscale entropy and complexity metrics. 151 Traditional linear approaches may overlook subtle anomalies in brain signals, but these metrics can detect them. Therefore, they offer a promising opportunity to use EEG data to identify more precise and sensitive biomarkers for ASD diagnosis.
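Multiscale entropy is commonly computed by coarse-graining a signal at several time scales and calculating sample entropy at each scale. The sketch below follows that general recipe under stated assumptions: embedding dimension m = 2, a tolerance of 0.2 times the standard deviation of the original signal, and synthetic signals in place of real EEG. It is a quadratic-time illustration, not an optimized implementation.

```python
import math
import random


def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def sample_entropy(signal, m, tolerance):
    """Sample entropy with Chebyshev distance; returns inf if no matches exist."""
    n = len(signal)

    def count_matches(length):
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= tolerance:
                    count += 1
        return count

    matches_m = count_matches(m)
    matches_m1 = count_matches(m + 1)
    if matches_m == 0 or matches_m1 == 0:
        return float("inf")
    return -math.log(matches_m1 / matches_m)


def multiscale_entropy(signal, scales, m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained signal at each scale."""
    mean = sum(signal) / len(signal)
    std = math.sqrt(sum((v - mean) ** 2 for v in signal) / len(signal))
    tolerance = r_factor * std
    return [sample_entropy(coarse_grain(signal, s), m, tolerance) for s in scales]


# Synthetic signals: a perfectly regular series and Gaussian noise.
periodic = [0.0, 1.0] * 50
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

periodic_entropy = sample_entropy(periodic, m=2, tolerance=0.1)
noise_entropy = sample_entropy(noise, m=2, tolerance=0.2)
noise_profile = multiscale_entropy(noise, scales=[1, 2, 3])
```

The perfectly regular series yields zero sample entropy, whereas the noise series yields a clearly positive value. For EEG, the entropy profile across scales would be computed per channel and used as a feature vector.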
Dataset Size Limitations for DL: A critical requirement for the effective training of DL models is the availability of large, diverse, and well-annotated datasets. However, ASD neuroimaging research is often constrained by limited sample sizes, site-specific data collection, and demographic imbalance. Although multi-site datasets such as ABIDE have partially addressed this issue, the overall volume of publicly available ASD neuroimaging data remains insufficient for fully exploiting the potential of DL architectures. Future research should focus on large-scale data sharing initiatives, standardized acquisition protocols, and collaborative repositories to enable robust, generalizable, and clinically reliable DL-based ASD diagnostic models.
Trust, Transparency, and Explainable AI in ASD Diagnosis: Despite the high classification performance reported in many studies, the clinical adoption of ML models for ASD diagnosis depends heavily on trust, interpretability, and transparency. Black-box DL models often fail to explain which brain regions or connectivity patterns influence the decision-making process. Recent studies have started incorporating Explainable AI (XAI) techniques such as saliency maps, feature importance analysis, attention mechanisms, and graph interpretability methods to highlight neurological biomarkers contributing to ASD classification. Improving model transparency not only enhances clinician trust but also supports the validation of neurobiological findings, making ML-driven ASD diagnosis more reliable and clinically acceptable.
Conclusion
In this survey, we conducted a comprehensive systematic literature review on ASD diagnosis to provide an in-depth understanding of technological advancements in the AI-driven medical field. The classification of ASD diagnosis techniques was organized around four major neuroimaging modalities: sMRI, fMRI, EEG, and multimodal approaches. An empirical analysis was presented, covering key diagnostic features, reported limitations, and performance metrics such as Acc and AUC. In addition, we examined the most widely used datasets for ASD diagnosis and classification. We observed that the high dimensionality of neuroimaging data (fMRI, sMRI, EEG) further complicates model development, as insufficient feature selection and inadequate validation strategies may inflate performance metrics. Moreover, many studies lack external validation on independent cohorts, and the limited interpretability of DL models reduces clinical transparency and trust, thereby hindering real-world applicability. This article aimed to support clinicians and developers by guiding the selection of effective datasets and informing model improvements, ultimately contributing to more robust, accurate, and clinically applicable ASD diagnostic systems.
Footnotes
Ethical considerations
This study did not involve human participants or animals.
Author contributions
Naveed Ur Rehman Ahmed served as the lead author and primary investigator. He conceptualized the research idea, designed the overall methodology, developed the experimental framework, executed all core experiments, performed the main data analysis, and prepared the original manuscript draft. He also coordinated the research activities, ensured technical integration across components, and led the revisions and final refinement of the paper. Ayesha Tajammul contributed to data acquisition, preprocessing, and assisted in model development, evaluation, and preparation of supporting visual materials. Afzal Badshah supported algorithm optimization, provided key technical insights throughout the experimental phase, and assisted in the interpretation of model performance. Muhammad Saad contributed to the literature review, validation of results, and comparative assessment of the experimental outcomes. Abdulrahman Ahmed Gharawi assisted in writing, manuscript organization, and provided support in aligning the work with current research standards and domain requirements. Ammar Almutawa contributed to quality assurance, proofreading, and assisted in verifying methodological consistency and clarity. Sakher Ghanem supported data management, helped in preparing figures and tables, and contributed to the analysis of experimental results. Ali Daud supervised the entire study, ensured methodological rigor, provided critical feedback, and reviewed and edited the final manuscript for intellectual and technical accuracy. All authors reviewed and approved the final version of the manuscript.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Declaration of conflicting interests
The authors declare that there are no conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
