
A systematic literature review of neuroimaging coupled with machine learning approaches for diagnosis of attention deficit hyperactivity disorder

Abstract

Problem

Attention deficit hyperactivity disorder (ADHD) is the most commonly found neurodevelopmental condition among children, with an estimated global prevalence of 2.5% to 9%. While ADHD has been regarded as a lifelong condition, its early diagnosis elevates the probability of recovery and a normal life for children. Although clinical diagnosis remains the primary method, substantial developments in ADHD diagnosis have been made during the past decade.

Aim

Several imaging-based technologies and approaches have been presented in the existing literature including magnetic resonance imaging (MRI) and functional MRI (fMRI). In addition, the deployment of machine learning and deep learning models has paved the way for automated diagnosis of ADHD which increases both accuracy and robustness of ADHD detection. A comprehensive and systematic literature review (SLR) of imaging technologies and machine learning approaches is highly desired to comprehend the current status of such approaches concerning their potential and challenges to outline future directions. Although a substantial body of literature exists on imaging-based ADHD diagnosis, a comprehensive SLR on such approaches is scarce. This SLR aims to provide a comprehensive overview of imaging-based ADHD diagnosis with emphasis on machine learning approaches, reveal their pros and cons, and provide potential future research directions, thereby contributing to the scientific community to accelerate further research for ADHD diagnosis.

Methods

This SLR focuses on analyzing studies published between 2010 and 2023. For this purpose, the preferred reporting items for systematic reviews and meta-analyses (PRISMA) approach is followed in this study. Five eminent academic databases, Web of Science, ACM, SpringerLink, Elsevier, and PubMed, are selected for article search. The SLR follows a systematic methodology comprising article search, selection based on inclusion and exclusion criteria, and rigorous assessment for categorization.

Results

It is found that MRI and fMRI are the dominant approaches integrated with machine learning models for ADHD detection, and functional near-infrared spectroscopy (fNIRS) is also adopted by a few studies. Predominantly, the ATHENA preprocessing approach is used to preprocess MRI data before model training. Due to its public availability, the ADHD-200 dataset is widely used in the existing literature while a few studies utilized their self-collected datasets. Machine learning models are the choice of the majority of the studies; in particular, the support vector machine model has been widely used for ADHD detection. Feature fusion is observed to be a better choice for obtaining more accurate results.

Conclusion

Machine learning and deep learning models provide automated ADHD detection with better accuracy and robustness; however, such models are not generalizable and their performance varies with the locality of the data used for experiments. In addition, the heterogeneity of data collected from various devices is a challenge, and the use of a standard device may provide better solutions. The lack of labeled data also adds difficulties to training models. Besides the use of MRI and fMRI, other novel technologies should be explored for better ADHD detection performance.

Introduction

Attention deficit hyperactivity disorder (ADHD) is a neurodevelopmental condition commonly found in children and the second most common chronic illness [1]. The World Health Organization (WHO) report of the 2016 World Mental Health Survey involving ten countries showed a global prevalence of 2.8% for ADHD [2]. The prevalence of ADHD is found to be between 2.5% and 9% in the world population [3]. The report indicates a higher ADHD ratio in high-income countries, among people with low education, and among males. Similarly, 9.4% of children in the United States (US) are reported to have an ADHD diagnosis while 8.4% are reported to have ADHD [4, 5]. ADHD is more common in boys than girls, with a ratio of 2-4 boys for every girl. It is usually diagnosed in children using the guidelines from the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5). To diagnose ADHD, a child must show at least six symptoms of inattention or hyperactivity for at least 6 months in two different settings.

ADHD patients, both children and adults, exhibit various symptoms, but inattention and impulsive and hyperactive behavior are the most frequent, although their intensity may vary with the ADHD subtype [6]. ADHD has three subtypes: ADHD inattentive (ADHD-I), ADHD hyperactive (ADHD-H), and ADHD combined (ADHD-C) [7]. ADHD-I patients experience distraction and inattention but lack hyperactivity symptoms while ADHD-H patients exhibit hyperactivity and impulsivity but lack inattention. Finally, people suffering from ADHD-C exhibit all symptoms including impulsivity, hyperactivity, and inattention.

ADHD has been regarded as a lifelong disorder and diverse technologies and approaches have been designed for its diagnosis. Although clinical diagnosis remains the primary diagnosis method, other modalities have been adopted lately. Now, a diverse range of technologies is used for diagnosing ADHD including magnetic resonance imaging (MRI), electroencephalography (EEG), questionnaires, eye movement analysis, functional MRI (fMRI), electrocardiography (ECG), actimetry, etc. The adoption of a technology depends on several factors like complexity, desired level of accuracy, the willingness of the subjects to undergo a particular test, etc. Each of these techniques has associated advantages and challenges for diagnosing ADHD. Despite the substantial body of literature on ADHD, systematic literature review (SLR) studies on ADHD diagnosing approaches are rather scarce. Therefore, this study embarks on exploring the realm of machine learning and imaging modalities concerning the diagnosis of ADHD.

Neuroimaging using magnetic resonance imaging

During the past few decades, various methods have been designed to study the functioning of the human brain. These techniques include those that map the brain’s electrical activity and those that map the local physiological or metabolic changes caused by altered electrical activity in the brain. In the first category, non-invasive methods are used including EEG and magnetoencephalography (MEG). These methods provide excellent temporal resolution of neural processes but suffer from poor spatial resolution. fMRI falls into the second category of brain mapping techniques. It detects changes in regional blood flow, blood volume (using injected contrast agents), or blood oxygenation linked to neuronal activity. Blood oxygenation level-dependent (BOLD) fMRI, which mainly focuses on blood oxygenation, offers a spatial resolution of a few millimeters and a temporal resolution of a few seconds (due to the time it takes for the blood to respond) [8].

In early childhood, kids with ADHD often show behavioral changes that might prompt a pediatrician to order a clinical MRI scan. Such scans are initiated to check for structural issues causing cognitive or behavioral problems. Existing studies on ADHD have discovered patterns of anatomical differences and variations in brain function among various groups. Various types of MRIs have been utilized to study neurological disorders.

Structural MRI is a well-known technique used for research and clinical purposes in various patient groups with neurological disorders. It provides anatomical detail and clear contrast between gray and white matter which is useful for detecting various neurological disorders [9]. The short scan times make it suitable for quickly acquiring images that can be used for diagnosis. It is used to obtain information regarding anatomical location, cortical thickness, regional volumes, vascular lesions, and morphological changes.

The ability to visualize anatomical connections in the brain non-invasively has revolutionized functional neuroimaging. This visualization of brain tissues has become possible, thanks to diffusion magnetic resonance imaging (dMRI). The dMRI generates MRI-based maps showing the microscopic movements of water molecules in brain tissues [10]. These water molecules act like probes, thereby making it possible to get detailed information about the structure of healthy and diseased tissues.

fMRI detects changes in blood oxygen levels (BOLD) in the MRI signal that happen when neuronal activity changes, such as when the brain responds to a stimulus or performs a task. Many current uses of functional imaging are based on the idea that different behaviors and brain functions depend on the coordinated interaction of components within large-scale brain systems. These systems are spatially distinct but connected in functional networks. Task activation fMRI studies aim to create different neural states in the brain by changing visual, auditory, or other stimuli during the scan [11]. Activation maps are then created by comparing the signals recorded during these different states. To get accurate results, it is important to take each image quickly to prevent head movement and physiological processes like breathing and heartbeats from adding noise to the signals related to neural processing [12, 13].

Structural MRI is used to examine the sizes and shapes of various brain regions. On the other hand, diffusion MRI helps map out the white matter pathways connecting these regions. In addition, fMRI helps obtain images of those brain areas that are active at different times.

Research questions

This SLR investigates the following research questions (RQs):

  • RQ1: What are the key neuroimaging technologies and how do they differ in ADHD diagnostic capabilities?

  • RQ2: Which data preprocessing approaches are predominant for fMRI data?

  • RQ3: Which datasets are frequently used for fMRI-based ADHD diagnosis?

  • RQ4: Which machine learning and deep learning models have been utilized for ADHD diagnosis?

  • RQ5: Which feature extraction approaches proved to be most influential for ADHD diagnosis?

  • RQ6: How can ADHD diagnosis using imaging-based approaches be improved?

Existing surveys

The survey [6] discusses the genetics related to ADHD. Various models related to ADHD are elaborated on including the dopamine model, dopamine \(D_2\), generational association of dopaminergic genes, dopamine transporter gene, etc. Then the survey goes on to discuss various medical treatments for ADHD. In particular, pharmacological, polypharmacy, and multigenetic approaches, as well as combination therapies, are discussed. In addition, dietary treatment, vitamin supplements, herbal remedies, and nutraceuticals are also analyzed in detail.

A survey of automated diagnosis approaches concerning ADHD is presented in [14]. The study reviews various machine learning and deep learning approaches from the existing literature, as well as other diagnostic methods. The prime objective of the study is the use of artificial intelligence (AI) for ADHD diagnosis involving different kinds of data, for example, imaging data, physiological signals data, etc. In addition, various questionnaire types, simulators, and motion modalities are also covered.

The study [15] reviews the choice of various machine learning models and their evaluations concerning MRI-based ADHD diagnosis. The survey focuses on the clinical significance of the applied machine learning models. In addition, the choice of evaluation is discussed such as cross-validation, size of the dataset used for experiments, etc. The survey also evaluates the impact of different factors like training dataset size, class imbalance problem, etc. The study does not cover the neurological imaging aspect of ADHD diagnosis, its types, or the role of various feature extraction approaches in the context of machine learning models. Similarly, the benchmark datasets used in the existing studies, data collection, and processing pipelines are also not covered (Table 1).

Table 1 Comparative analysis of existing surveys

Research gaps

Even though a wide range of research works are found in the existing literature which also includes a few review articles, the following research gaps are found:

  • MRI and fMRI-based works are not reviewed extensively and several aspects are not covered. For example, Zhang-James et al. [15] focused on machine learning models from the sample size, training, and evaluation perspective only.

  • Feature engineering is the essence of machine learning models and the performance of machine learning models relies heavily on the choice of an appropriate feature extraction approach. However, ML models are not investigated from a feature engineering perspective.

  • Aspects related to ML models like feature fusion, feature reduction approaches, and data heterogeneity are also not very well covered.

  • Due to the publication of several articles during the past couple of years, an SLR is needed to select and discuss those articles concerning automated ADHD detection.

Contributions

The survey paper aims to explore the most widely acknowledged imaging modalities for imaging-based ADHD diagnosis. The choice of imaging modalities is based on the large body of published literature regarding ADHD diagnosis using various image-based technologies and approaches. Despite their wide use and applications, each imaging technology has its advantages and disadvantages, which will be explored in this survey. Besides that, this survey paper makes several significant contributions:

  i. Imaging Techniques: This study contributes to exploring the imaging technologies and approaches that are widely adopted in the existing literature for ADHD diagnosis.

  ii. Machine Learning: The literature is filled with various approaches, machine learning, deep learning, and transfer learning, which are employed for automated diagnosis of ADHD. A comprehensive review of such approaches along with their pros and cons is provided.

  iii. Feature Engineering: Feature engineering is part and parcel of machine learning approaches and the choice of feature engineering approach is based on several factors like nature of data, quality of data, desired accuracy, and selected model for diagnosis. A detailed discussion of feature engineering approaches and their suitability for ADHD diagnosis is carried out.

  iv. Performance Analysis: ADHD diagnostic methods require high performance with fewer false alarms. Performance metrics are important elements of diagnostic systems and various evaluation metrics, commonly utilized in the existing literature, are discussed in this survey.

  v. Dataset: The availability of publicly available datasets is essential to design and test novel machine learning models. This study provides a discussion on the publicly available datasets and their suitability for ADHD diagnosis.

  vi. Optimization: For more accurate ADHD diagnosis and model optimization, different research directions are formulated to help future research.

These motivations and contributions collectively aim to support and accelerate further research on automated, imaging-based ADHD diagnosis.

Further, this paper is divided into six sections. The methodology adopted for this SLR is described in “Methodology” section which is followed by the analysis of results in “Results” section. Afterward, ADHD diagnosis using fMRI data and MRI is discussed in “ADHD diagnosis using function magnetic resonance imaging data” and “ADHD diagnosis using magnetic resonance imaging and functional near infrared spectroscopy” sections, respectively. “Discussions of research questions” section provides a discussion of the formulated research questions while the conclusion is given in “Conclusions and future directions” section.

Methodology

This paper follows a systematic methodology to develop and evaluate the proposed approach. This section defines data sources, search strategy, selection of studies, and eligibility criteria for data extraction and analysis procedure. This study follows preferred reporting items for systematic reviews and meta-analysis (PRISMA) methodology, as outlined by Kitchenham in [16]. Figure 1 shows the sequence of various steps carried out in this survey. Starting with the query formulation, various well-known repositories are selected. Articles are searched from the selected repositories using the formulated query. Although the search query is defined after meticulous efforts, search results might contain irrelevant and redundant articles, which require further refining. Articles are refined manually by reading their abstracts and keywords to determine their scope. After the article refining, articles are classified with respect to the research questions. Afterward, articles are discussed from various aspects.

Fig. 1 The workflow of steps carried out in systematic literature review

Formulating search query

The search query is very important for retrieving all the relevant articles related to the selected topic. In addition, since the articles need to be selected from various repositories, it must be ensured that the query is effective for all sources. The following query is used to search articles in this survey:

((TI = adhd) OR (TI = attention deficit hyperactivity disorder) OR (TI = attention deficit/hyperactivity disorder) OR (TI = attention deficit-hyperactivity disorder) OR (TI = ad/hd) OR (TI = ad-hd)) AND ((TI/Abstract = machine learning) OR (TI/Abstract = deep learning) OR (TI/Abstract = artificial intelligence) OR (TI/Abstract = transfer learning))
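The query pairs a group of ADHD title terms with a group of method terms. A minimal sketch of assembling such a string programmatically is given here, assuming the intent is "any title term AND any method term". The function name `build_query` and the list names are illustrative and not part of the original study, and the field tags (`TI`, `TI/Abstract`) would need adapting to each repository's own search syntax.

```python
def build_query(title_terms, topic_terms):
    """Return a boolean query: (any title term) AND (any topic term)."""
    title_clause = " OR ".join(f"(TI = {t})" for t in title_terms)
    topic_clause = " OR ".join(f"(TI/Abstract = {t})" for t in topic_terms)
    # Wrap each OR-group in parentheses so AND applies to whole groups.
    return f"({title_clause}) AND ({topic_clause})"


# Terms copied from the query in the text (whitespace normalized).
TITLE_TERMS = [
    "adhd",
    "attention deficit hyperactivity disorder",
    "attention deficit/hyperactivity disorder",
    "attention deficit-hyperactivity disorder",
    "ad/hd",
    "ad-hd",
]
TOPIC_TERMS = [
    "machine learning",
    "deep learning",
    "artificial intelligence",
    "transfer learning",
]

query = build_query(TITLE_TERMS, TOPIC_TERMS)
print(query)
```

Generating the string from term lists keeps the parentheses balanced by construction and makes it easy to reuse the same terms across repositories.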

Data sources and strategies

We carried out a comprehensive search across various online electronic repositories including the Institute of Electrical and Electronics Engineers (IEEE) Xplore, Elsevier, ACM Digital Library, Springer, and PubMed. These repositories are selected because they are the most commonly used among scholars, host large article collections, and are widely used in existing surveys. We carefully expanded the search terms to ensure the inclusion of a wide range of relevant studies.

Selection of studies

We conducted an initial literature search and screened the titles and abstracts of each study. Potentially relevant studies were further assessed against the eligibility criteria. To reduce the risk of bias, the studies were assessed by three authors and were selected only if at least two authors agreed. Among the screened articles, those without full-text availability were excluded. The eligibility of the remaining full-text articles was assessed, resulting in the exclusion of some articles based on predefined criteria.

Eligibility criteria

In this study, the following criteria are employed for the articles’ inclusion:

  • Publications between 2010 and 2023 with the diagnosis of ADHD as the main theme of the study,

  • Publishing language is English; several articles in Chinese and French are excluded,

  • The focus of the study is ADHD diagnosis using a machine learning approach,

  • Published in a peer-reviewed journal or conference,

  • Only research articles are considered.

The exclusion of a study is based on the following criteria:

  • Survey, book chapters, books, and non-peer-reviewed articles are excluded,

  • Publications published without a peer review,

  • Unpublished studies like theses or dissertations,

  • Studies that do not perform ADHD diagnoses such as brain-to-computer interfaces, teaching solutions for ADHD individuals, games and simulations-based treatments for ADHD, etc.

For article selection, three authors evaluated the scope and quality of an article. To include or exclude an article, at least two authors agreed. In the case of a dispute, where there was no consensus among the authors, the article was removed.
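The agreement rule can be sketched as a small function. This is only an illustration under stated assumptions: each reviewer casts one vote, an "unsure" vote is a hypothetical way to represent a dispute, and the function name `screen` is not from the original study.

```python
from collections import Counter


def screen(votes, min_agree=2):
    """Return 'include' only when at least min_agree reviewers vote to include.

    Any other outcome, including a dispute with no agreement, is 'exclude'.
    """
    counts = Counter(votes)
    return "include" if counts["include"] >= min_agree else "exclude"


print(screen(["include", "include", "exclude"]))
print(screen(["include", "unsure", "exclude"]))
```

Treating every non-agreement as an exclusion mirrors the conservative choice described above, where an article without consensus is removed.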

Results

Figure 2 shows the results of the screening process for article search and selection from IEEE Xplore, WoS, ACM, Springer, and PubMed. Five well-known online repositories were searched using the formulated search string. Each repository contains a different number of published articles, and the search resulted in 102 articles from WoS, 287 articles from IEEE Xplore, 162 articles from ACM, 300 articles from Springer, and 202 articles from PubMed. Since the same article may be retrieved from different repositories, a duplicate removal step is carried out on all retrieved articles (n = 1053). Comparing article titles indicates that only 712 articles are unique while the remaining 341 are duplicates, which are removed from further processing. Afterward, the inclusion and exclusion criteria, as defined in “Eligibility criteria” section, are applied to further refine the selected articles. This process results in a total of 85 studies.
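The screening counts can be checked with simple arithmetic, and title-based duplicate removal can be sketched as follows. The normalization rule (lowercasing and collapsing punctuation) and the example records are assumptions for illustration; the text states only that article titles were compared.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for rec in records:
        key = normalize_title(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Per-repository hit counts as reported in the text.
hits = {"WoS": 102, "IEEE Xplore": 287, "ACM": 162, "Springer": 300, "PubMed": 202}
total = sum(hits.values())
unique_reported = 712
duplicates = total - unique_reported
print(total, duplicates)  # 1053 341

# Hypothetical records showing how near-identical titles collapse.
records = [
    {"title": "ADHD Diagnosis Using fMRI", "source": "WoS"},
    {"title": "ADHD diagnosis using fMRI.", "source": "PubMed"},
    {"title": "Deep Learning for ADHD", "source": "ACM"},
]
print(len(deduplicate(records)))  # 2
```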

Fig. 2 Results of search and selection of relevant articles for SLR

The selected articles were published between 2010 and 2023. The starting year is selected as 2010 because other review and survey papers cover the articles published before 2010. A distribution of selected articles for this survey is presented in Fig. 3, indicating that a higher number of articles were published during recent years with the maximum number of articles published in 2020. A higher number of publications is expected for the year 2023, as the current selection is only up to July 2023.

Fig. 3 Selected articles concerning publication year

The selected articles are published in journals and conferences. Figure 4 illustrates the publication venue of the selected papers. It shows that the majority of the selected articles, i.e., 62%, are published in peer-reviewed journals while the rest 38% are published in conference proceedings.

Fig. 4 Selected articles with respect to publication venue, conference vs journal

Figure 5 indicates the number of publications for each modality discussed in this survey including MRI, fNIRS, and fMRI. It can be observed that the fMRI modality has the highest number of published articles while the fNIRS has the lowest number of publications.

Fig. 5 Distribution of selected articles concerning imaging modality and publication venue

ADHD diagnosis using function magnetic resonance imaging data

ADHD diagnosis is an important research field that aims to provide timely and accurate diagnosis of ADHD and its various classes and help children, adolescents, and adults get appropriate treatment. Existing research works focus on various aspects to improve the accuracy and efficiency of ADHD diagnosis and prognosis. The categorization of such works depends on specific objectives and may vary concerning the aim of the analysis. In this study, we have divided fMRI-based works into the following categories:

  • Dimensionality reduction: Works that particularly focus on feature vector dimension reduction approaches like principal component analysis (PCA), etc.

  • Machine learning: Some research works especially focus on the use of specialized and customized machine learning approaches to increase ADHD diagnosis accuracy.

  • Feature engineering: Machine learning approaches require feature engineering which can substantially impact the performance of machine learning models. This category discusses those works that especially emphasize the implementation of feature selection and extraction approaches.

  • Feature fusion: Owing to the importance of feature engineering approaches for machine learning models, several works in the existing literature have been dedicated to multi-modal feature fusion. Such works incorporate features from multiple sources to obtain better ADHD classification.

  • Deep learning: These works utilize various deep learning approaches to overcome the issue of manual feature crafting and focus on customized and pre-trained deep learning models.

  • Sub-type classification: Predominantly, works on ADHD classification focus on discriminating between ADHD and healthy controls, and works on ADHD sub-type classification are very few. This category explores those works that primarily focus on classifying ADHD subjects into subtypes.

Dimensionality reduction-based approaches

The study [17] investigated different feature extraction approaches for the automated diagnosis of ADHD using fMRI data. Feature extraction is performed using Fast Fourier Transform (FFT), PCA and its variants, and FFT and PCA together. For experiments, resting state fMRI and phenotypic data including age, gender, IQ, etc. are considered. Various experiments are performed using a support vector machine (SVM) model involving phenotypic data and imaging data alone, as well as both types of data together. Experimental results indicate a 72.9% accuracy with phenotypic data for ADHD and control while 66.8% for three classes. The study shows that the use of fMRI images alone does not produce good results. A 76.0% accuracy is achieved when using both the FFT and PCA features from phenotypic and image data for ADHD and control while the accuracy is reduced to 68.6% for three classes.

In Bohland et al. [18], about 12,000 quantitative markers per subject are obtained from fMRI data, containing anatomical features and local and global network measures. Of various features used from the covariance matrix, the L1-norm regularization penalty is reported to be the most efficient. Experiments involve various approaches like cross-validation, two-sample t-test, and recursive feature elimination (RFE). Results show that the highest performance is obtained from non-imaging features. However, adding the imaging features improves the generalization of the results. Another important factor is the stratification by gender which can enhance the classification performance. Phenotypic features provide an area under the curve (AUC) of 0.81 while network features provide an AUC score of up to 0.72. Combining these features tends to improve the robustness.

The study [19] presents a machine learning approach to automatically classify the subjects into ADHD and typically developing (TD). For model training, structural and demographic features are used together. Structural features contain quantitative metrics while functional features contain Pearson correlation, nodal power spectra, etc. To improve the classification accuracy, feature ranking is performed using SVM RFE (SVM-RFE). Generalization of SVM is improved using radial basis function kernel on selective features. Predictions are made for each modality separately and are later combined using voting. Results show an accuracy of 55%, sensitivity of 33%, and specificity of 80%.
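Combining per-modality predictions by voting can be sketched as below. This is an illustrative majority vote only, not the exact scheme of [19]; the modality names, labels, and tie-breaking rule are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties resolve to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical predictions for three subjects from three modality-specific classifiers.
per_modality = {
    "structural": ["ADHD", "TD", "TD"],
    "functional": ["ADHD", "ADHD", "TD"],
    "demographic": ["TD", "ADHD", "TD"],
}

# Transpose so that each tuple holds all modality votes for one subject.
combined = [majority_vote(votes) for votes in zip(*per_modality.values())]
print(combined)  # ['ADHD', 'ADHD', 'TD']
```

With an odd number of modalities and two classes a tie cannot occur, which is one reason three voters is a convenient choice.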

A deep belief network (DBN) is proposed in [20] for ADHD diagnosis from ADHD-200, Neuro, OHSU, and Pittsburgh datasets. Preprocessing of data involves a Brodmann template to transform the data from 4D to 3D to reduce dimensionality. Later, FFT is applied to convert the data to the frequency domain. Lastly, frequency max-pooling is performed to select the frequency with maximum amplitude. Results show that the accuracies for Neuro, OHSU, and Pittsburgh are 44.4%, 80.88%, and 55.56%, respectively.
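One plausible reading of frequency max-pooling is to transform each regional time series to the frequency domain and keep the frequency with the largest amplitude. The sketch below uses a plain discrete Fourier transform for clarity. The synthetic signal, the sampling rate, and the decision to skip the zero-frequency (DC) bin are assumptions, and the exact procedure in [20] may differ.

```python
import cmath
import math


def dft_amplitudes(signal):
    """Return amplitudes of the non-negative-frequency DFT bins of a real signal."""
    n = len(signal)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        amps.append(abs(coeff))
    return amps


def dominant_frequency(signal, sampling_rate):
    """Return (frequency in Hz, amplitude) of the strongest non-DC bin."""
    amps = dft_amplitudes(signal)
    k = max(range(1, len(amps)), key=lambda i: amps[i])
    return k * sampling_rate / len(signal), amps[k]


# Hypothetical regional time series: a 0.05 Hz oscillation sampled every 2 s (0.5 Hz).
sampling_rate = 0.5
n_points = 100
series = [math.sin(2 * math.pi * 0.05 * t / sampling_rate) for t in range(n_points)]

freq, amp = dominant_frequency(series, sampling_rate)
print(round(freq, 3), round(amp, 1))
```

Reducing each time series to a single frequency turns a long temporal signal into one value per region, which is the dimensionality reduction this step aims for.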

Similarly, the study [21] follows a DBN-based approach for ADHD diagnosis from fMRI data. A total of 263 samples from ADHD-200 data are used for children between the ages of 7 and 21 years for experiments. Besides using the imaging features, the authors utilize several other features like handedness, IQ, medication status, verbal IQ, etc. to enhance the predictive performance of DBN. An accuracy of 63.68% is reported for the NYU dataset while the accuracy is improved to 69.83% for the NeuroImage dataset.

The study [22] presents an automated approach for ADHD classification using rs-fMRI of 56 drug-naive ADHD and 56 TD children. The study collects its own data of whole-brain T1-weighted 3D MPRAGE using a Siemens Trio 3-Tesla MRI scanner. The dataset is preprocessed using the GRETNA toolbox and SPM12 [23]. Later, the automated anatomical labeling (AAL) atlas is employed to obtain 116 ROIs. Local and global graph measures are calculated for the brain using graph theory. For this purpose, the betweenness centrality, clustering coefficient, path length, local efficiency, and degree centrality are calculated as local measures. In addition, global measures and small-world parameters are also obtained. Feature selection is performed using the RFE approach to determine impactful features. Classification is carried out using SVM, gradient boosting (GB), and RF. The GB model proves to be more accurate with a 78.2% accuracy, 75% sensitivity, and 80% specificity when used with RFE-based selective features.
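Two of the local graph measures named above, degree centrality and the clustering coefficient, can be sketched on a thresholded correlation matrix. The 4-region matrix and the 0.5 threshold are hypothetical; the study itself works with 116 AAL regions and a dedicated toolbox, so this only illustrates the definitions.

```python
def threshold_graph(corr, threshold):
    """Binarize a symmetric correlation matrix into adjacency sets (no self-loops)."""
    n = len(corr)
    return {
        i: {j for j in range(n) if j != i and abs(corr[i][j]) >= threshold}
        for i in range(n)
    }


def degree_centrality(adj):
    """Fraction of the other nodes that each node is connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def clustering_coefficient(adj):
    """Fraction of each node's neighbor pairs that are themselves connected."""
    coeffs = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[v] = 0.0
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs[v] = 2 * links / (k * (k - 1))
    return coeffs


# Hypothetical correlations between four regions of interest.
corr = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.7, 0.6, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
]
adj = threshold_graph(corr, threshold=0.5)
deg = degree_centrality(adj)
cc = clustering_coefficient(adj)
print(deg)
print(cc)
```

In this toy graph, region 2 is connected to every other region (degree centrality 1.0) but only one of its three neighbor pairs is linked, so its clustering coefficient is 1/3.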

The authors propose a machine learning-based solution for ADHD diagnosis in [24]. The study considers phenotype features such as age, sex, IQ, and medication status, from fMRI data for this purpose. The study utilizes the ADHD-200 dataset for Peking, Brown, KKI, NeuroImage, NYU, and OHSU sites with 776 subjects. Experiments are carried out using different feature sets, called personal characteristics selection (PCS) 1, 2, 3, and 4. Five machine learning models are investigated for their classification performance including logistic SVM, linear SVM, quadratic SVM, cubic SVM, and radial basis function SVM. In addition, LR, RF, DT-J48, KNN, and MLP are also utilized with genetic algorithms for model optimization. DT-J48 shows the best performance with an 84.41% accuracy, followed by the RF model with an 82.74% accuracy.

Machine learning approaches

The use of machine learning and deep learning approaches has been dominant in several applications, and healthcare and disease diagnosis in particular have readily adopted them. The study [25] provides a comprehensive framework for disabled patients. The proposed solution is based on federated learning and deep CNN (DCNN) to design a digital framework. Wheelchair bio-sensors are used for data collection and DCNN is evaluated concerning cost, accuracy, latency, etc. Results suggest a 25% energy reduction with a 99% disease prediction accuracy.

The study [26] designs a federated learning-based CNN-LSTM model to improve the detection accuracy of autism spectrum disorder. The authors also add an advanced encryption standard (AES) scheme to secure patients’ data in the proposed framework. Various Internet of Things (IoT) applications are also incorporated to improve performance. Results show an accuracy of 99%, which is better compared to existing frameworks. The authors present a detailed review of artificial intelligence (AI) approaches for diagnosing Parkinson’s disease in [27]. The focus is particularly on the use of data-driven AI approaches in the context of better diagnosis accuracy.

To avoid noise issues with the input data and lower classification accuracy, the study [28] introduces an automatic ADHD diagnosis system with high accuracy. Initially, the data is preprocessed using min-max normalization, followed by feature extraction which is performed using the fast independent component analysis. Finally, the authors adopt a deep extreme learning machine (DELM) for classifying the subjects into ADHD and healthy controls. DELM combines extreme learning (EL) with kernel and multilayer EL machine (MELM). Experimental results show an accuracy of 98.2% for DELM while efficiency-enhanced extreme learning machines (EEELM), SVM, and ANN obtain an accuracy of 84.2%, 74.2%, and 71.3%, respectively.
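Min-max normalization rescales each feature linearly to a fixed range, commonly [0, 1]. A minimal sketch follows; the target range, the handling of constant features, and the example values are assumptions rather than details taken from [28].

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range information; map it to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Hypothetical raw feature values.
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum would be computed on the training data only and then reused for test data, so that no test information leaks into preprocessing.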

The study [29] employs pattern classification on fMRI data for ADHD diagnosis. The data are collected from 30 adolescents with ADHD and 30 control subjects performing Stop tasks. Gaussian process classifiers (GPC) are used to classify the subjects using the whole-brain beta maps. Using brain activation features, an accuracy of 77% is reported while the sensitivity, i.e., the proportion of correctly classified ADHD subjects, is 90%. On the other hand, 63% of the control subjects are correctly classified.
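The relation between accuracy, sensitivity, and specificity can be illustrated with a confusion matrix. The counts below are an assumption: 27 of 30 ADHD subjects and 19 of 30 controls correctly classified is one set of counts consistent with the percentages reported for [29], but the study's actual confusion matrix is not given here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of ADHD subjects correctly identified
    specificity = tn / (tn + fp)  # share of controls correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts for 30 ADHD and 30 control subjects.
acc, sens, spec = classification_metrics(tp=27, fn=3, tn=19, fp=11)
print(f"{acc:.0%} {sens:.0%} {spec:.0%}")  # 77% 90% 63%
```

With balanced groups, accuracy is the average of sensitivity and specificity, which is why a 90% sensitivity and 63% specificity yield roughly 77% accuracy.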

Deep learning models suffer from the high dimensionality of fMRI data. Due to a lack of sufficient data, models experience overfitting and generalize poorly. The study [30] utilizes a generative model to model small datasets. In addition, a deep variational AE is designed to cope with the problem of small data. The model uses FBNs of fMRI data to discriminate between ADHD and healthy subjects. Using the ADHD-200 dataset for experiments, accuracies of 77.8%, 65.1%, 61.2%, and 60.8% are reported for the KKI, PU, NYU, and NI sites, respectively. Reportedly, the model shows better results for smaller datasets.

The authors present a deep learning-based ADHD diagnosis approach in [31] using rs-fMRI modality. Features are extracted using the convolutional AE which is customized for feature extraction. For classification, a novel interval type-2 fuzzy regression (IT2FR) model is proposed. The proposed IT2FR model is optimized using particle swarm optimization (PSO) and gray wolf optimization (GWO) approaches. For experiments, the University of California Los Angeles dataset is used. The performance of the proposed approach is compared with several other machine learning models including MLP, KNN, SVM, RF, DT, and adaptive neuro-fuzzy inference system (ANFIS) methods. Results show an accuracy of 72.71% using the IT2FR which is better than other employed models.

A model called deep channel self-attention factorization (Deep CSAF) is presented in [32] to obtain the most important non-linear features from the fMRI data of ADHD and healthy subjects for better classification. N-correlated self-attention CNNs are used to extract non-linear factors for this purpose. The proposed approach is a tensor factorization approach that can learn maximum factor matrices with little or no a priori knowledge. Experimental results using the ADHD-200 data indicate a superior accuracy of 99% while precision, recall, and F1 scores are 99.87%, 99.90%, and 99.88%, respectively.

A hybrid model is introduced in [33] which combines a 3D CNN and BiLSTM models to make a 3D CNN-BiLSTM model for accurate ADHD classification. The CNN model is used to extract features from fMRI which are fed into the BiLSTM model for training and classification. For experiments, NYU, PU, KKI, and NI sites of the ADHD-200 dataset are used. Experimental results show an accuracy of 75.4%.

The authors adopt a transfer learning approach for ADHD diagnosis using the fMRI data in [34]. The study focuses on improving classification accuracy by resolving the issue of class imbalance distribution. The authors adopt a pre-trained ResNet-50 model, a CNN variant, for the automatic classification of ADHD and control subjects. Experiments show an accuracy of 93.45% using ten-fold cross-validation.

Feature selection

Although various machine learning and deep learning architectures offer better accuracy, achieving the desired accuracy requires further effort, particularly concerning the extraction of important features from fMRI data. The study [35] pursues this goal and investigates impactful feature extraction from original fMRI data. A novel approach, the local binary encoding method (LBEM), is proposed to characterize functional interaction patterns from fMRI data. Later, a kernel extreme learning machine (ELM) is used for subject classification. For experiments, a public dense individualized and common connectivity-based cortical landmarks (DICCCOL) dataset is used [36] which contains fMRI data from 45 control and 23 ADHD children. Improved accuracies of 96.06% and 97.64% are obtained for the ADHD and control groups, respectively, showing the efficacy of the proposed approach.

Along the same lines, [37] proposes an intuitive feature selection method for ADHD diagnosis using resting state fMRI (rs-fMRI) data. After preprocessing the fMRI data, fractional amplitude of low-frequency fluctuation (fALFF) features are extracted from the rs-fMRI data, and a feature subset is then selected. The feature selection method is founded on the Relief algorithm and verification accuracy (VA-Relief). The data for 82 ADHD and 72 control subjects from the ADHD-200 dataset are used for experiments. Maximum accuracies of 77.92%, 80.52%, and 98.04% are obtained using Relief, minimum redundancy maximum relevance (mRMR), and VA-Relief algorithms, respectively.
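For readers unfamiliar with Relief, the sketch below shows the basic two-class algorithm on which such methods build: a feature gains weight when it differs between a sample and its nearest neighbor from the other class (nearest miss) and loses weight when it differs from its nearest neighbor of the same class (nearest hit). This is the plain Relief update, not the VA-Relief variant of [37]; the function name, the Manhattan distance, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def relief_weights(X, y):
    """Basic two-class Relief that visits every sample once.

    X: (n_samples, n_features), ideally scaled to [0, 1]; y: binary labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)   # Manhattan distance to all samples
        dist[i] = np.inf                      # never pick the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / n
    return w

# Synthetic example: feature 0 separates the classes, feature 1 is noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = np.column_stack([0.8 * y + 0.1 * rng.random(40), rng.random(40)])
w = relief_weights(X, y)
```

Features would then be ranked by weight and the top-ranked subset passed to a classifier.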

The use of spatio-temporal decomposition involving spatial filtering, such as the Fukunaga–Koontz transform and independent component analysis (ICA), has provided valuable insights for fMRI data. These approaches can decompose fMRI data into spatial and temporal features which can provide discriminative features for better classification. One limitation is their error-proneness in estimating covariance matrices. The study [38] presents a framework to robustly estimate the covariance matrices to reduce atypical anomalies. The proposed approach, the regularized spatial filtering method (R-SFM), uses Mahalanobis whitening to maintain spatial arrangements as well. The proposed approach is evaluated using the ADHD-200 dataset involving 947 subjects (580 boys and 367 girls) with spatial filtering method (SFM) and R-SFM. Results demonstrate a better performance with 70.39% accuracy using R-SFM while the SFM approach shows an accuracy of 64.54%.
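The two ingredients mentioned here, a regularized covariance estimate and Mahalanobis whitening, can be sketched as follows. Mahalanobis (symmetric) whitening uses W = C^(-1/2), which decorrelates the data while staying as close as possible to the original axes. The shrinkage toward a scaled identity used below is a common generic regularizer and is an assumption; it is not claimed to be the estimator used in [38]. All names and the toy data are hypothetical.

```python
import numpy as np

def shrunk_covariance(X, alpha=0.1):
    """Shrink the sample covariance toward a scaled identity (assumed regularizer)."""
    C = np.cov(X, rowvar=False)
    d = C.shape[0]
    return (1.0 - alpha) * C + alpha * (np.trace(C) / d) * np.eye(d)

def mahalanobis_whitening(C):
    """Return W = C^(-1/2), the symmetric (Mahalanobis) whitening matrix."""
    eigvals, eigvecs = np.linalg.eigh(C)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
C = shrunk_covariance(X, alpha=0.0)   # alpha=0 keeps the plain sample covariance
W = mahalanobis_whitening(C)
X_white = (X - X.mean(axis=0)) @ W
```

With alpha set to 0 the whitened data have identity covariance; a positive alpha trades exact whitening for a better-conditioned estimate when subjects are few relative to regions.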

The authors adopt the multi-task learning (MTL) paradigm in [39] to investigate its capability to classify autism spectrum disorder (ASD) and ADHD. To improve the discrimination between ADHD and control subjects, a graph-based feature extraction approach is also proposed to characterize the correlations that remove low-impact FC features. For the MTL-based framework, a multi-gate mixture-of-experts (MMoE) is employed which comprises two expert networks as an ensemble for learning various patterns from the data. Experimental results using the ADHD-200 dataset show an accuracy of 67.4% using 776 samples from the dataset. The mean accuracy for ASD and ADHD is 68.7% and 65.0%, respectively.

One challenge for ADHD diagnosis is the high dimensionality of rs-fMRI data. In addition, inter-classes have low separability while intra-class separability is high, which further complicates the classification process. Spatial transformation has shown promise in this regard but suffers from a lack of generalization, as most of the proposed approaches are tested on the ADHD-200 dataset alone. The study [40] contributes in this regard and introduces a metaheuristic spatial transformation (MST) approach that transforms the spatial filter problem into an optimization problem and resolves it using the genetic algorithm. Later, MST is used to obtain the highly impactful features with linear discriminant analysis (LDA). Results using ten-fold cross-validation produce a 72.10% classification accuracy.

The authors propose a deep learning-based approach in [41] for accurate classification of ADHD subjects. The study aims to demonstrate the importance of FC for accurate ADHD classification results which are interpretable. The proposed approach is an end-to-end approach that comprises a feature extractor, an FC network, and a classification network. The feature extractor is a convolutional neural network (CNN) that extracts features from brain regions. The FC network is used to learn FC between brain regions and uses Siamese-based similarity measures. The study uses ADHD-200 data involving 247 subjects (147 ADHD, 100 control) from NYU, 73 (36 ADHD, 37 control) from NI, and 136 (51 ADHD, 85 control) from the Peking site. Experimental results reveal a superior performance of the proposed model with accuracies of 67.9%, 62.7%, and 73.1% for the NI, Peking, and NYU sites.

The study [42] introduces a lightweight CNN model that follows an end-to-end approach. The proposed approach is based on hierarchical representation learning, called hierarchical and lightweight graph Siamese network (HLGSNet). Experiments use the ADHD-200 dataset to extract 116 anatomical regions for subjects. Temporal connections between brain regions are used to build the graphs which are later fed into a graph CNN for training. Testing experimental results involving 25 subjects (11 ADHD, 14 control) for NI and 41 subjects (29 ADHD, 12 control) from NYU sites show 68% accuracy for the NI site and 68.29% accuracy for the NYU site.
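To make the notion of a connectivity graph concrete, the sketch below builds a functional connectivity matrix from region-wise time series and thresholds it into an adjacency matrix. The 116 regions follow the text; Pearson correlation, the number of time points, the threshold, and the random stand-in data are assumptions, and this is not claimed to be the graph construction used by HLGSNet in [42].

```python
import numpy as np

n_rois, n_timepoints = 116, 150       # 116 regions as in the text; 150 is assumed
rng = np.random.default_rng(0)
ts = rng.normal(size=(n_timepoints, n_rois))   # stand-in for ROI time series

fc = np.corrcoef(ts, rowvar=False)    # (116, 116) Pearson correlation matrix
iu = np.triu_indices(n_rois, k=1)
fc_features = fc[iu]                  # 116 * 115 / 2 = 6670 unique connections

threshold = 0.2                       # hypothetical cut-off for keeping an edge
adjacency = (np.abs(fc) > threshold).astype(int)
np.fill_diagonal(adjacency, 0)        # no self-loops
```

The vector of 6670 unique connections also shows why FC-based classifiers face a dimensionality problem: it far exceeds the number of subjects available at any single ADHD-200 site.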

The study [43] considers FC features from fMRI data using various time templates (Atlases) to improve the classification accuracy of ADHD subjects. Various templates are used for FC feature analysis like CC400, CC200, and automatic anatomical labeling (AAL). Each template is marked by different ROI function connectivity maps. For feature extraction, the authors employ a local binary encoding method (LBEM). Finally, the subjects are classified using a hierarchical extreme learning machine (HELM). Experiments are performed involving 100 ADHD and 143 typically growing children from the ADHD-200 dataset from the Peking site. An accuracy of 96% is reported for ADHD while for healthy controls the accuracy is 98% using CC400 features.

Feature extraction holds central importance for the accuracy of ADHD classification and various pivotal features are engineered manually. However, manual feature engineering is a time-consuming and laborious task. The study [44] devises an approach called convolutional denoising autoencoder (CDAE), which is an automatic feature extraction approach. The CDAE extracts 3D spatial information features of fMRI data. Later, PCA is used to reduce the dimensions of features to avoid model overfitting. For classification, the authors utilize an adaptive boosting DT (AdaDT) which has multiple weak learners to build a strong model. The authors use Peking, KKI, NeuroImage, NYU, and OHSU sites of the ADHD-200 dataset for experiments. Results show 73.17%, 70.59%, 81.82%, 78.95%, and 82.35% accuracy for NYU, Peking, KKI, NeuroImage, and OHSU sites respectively.

The study [45] presents a subspace learning framework where a subspace is learned for ADHD and healthy subjects using different subspace measures. To enhance intra-class relationships, a graph embedding measure is adopted. The core idea of the study is to compare the energy of the FC feature of a subject to the energy of subspaces to find the subspace with a large energy. To improve the efficiency and stability of this process, the binary hypothesis is also employed to select discriminative FC. The ADHD-200 dataset is used to analyze the performance of the proposed approach which obtains very good results with a 90% accuracy.

The two core problems of ADHD classification from fMRI data, insufficient data, and noise disturbance, are handled in [46]. The study utilizes the FC features using an \(l_{2,1}\)-norm LDA model and binary hypothesis testing. The binary hypothesis framework is utilized to deal with the problem of smaller datasets for ADHD. For training, FCs are used without labeling for subspace learning. The authors use the concept of subspace energy to discriminate between ADHD and healthy subjects’ subspace. Experiments are carried out on the ADHD-200 dataset which shows an accuracy of 97.6% outperforming existing approaches.

The authors adopt a transfer learning-based approach for ADHD diagnosis and focus on Haralick features [47]. In addition, HOG features were also investigated for the same task. 2D texture images are generated from 3D MRIs where multiple slices are combined to get a 2D image. Four transfer learning CNN variants are utilized for experiments including AlexNet, VGGNet, ResNet, and GoogleNet. Experiments are performed on static MRI data from NPNeuroPsychiatric Hospital and the ADHD-200 dataset. ResNet shows the best performance with 100% accuracy, sensitivity, and specificity while decision tree (DT) shows 99% accuracy, sensitivity, and specificity using Haralick features.

The study [48] focuses on two important problems of ADHD classification, feature noise due to similarity with other mental disorders and small datasets. The authors introduce a deep learning-based architecture for accurate ADHD classification in conjunction with binary hypothesis testing. In addition, a modified AE is also designed in the study. The insufficient data problem is resolved by using the binary hypothesis framework. FC features are used for training while the AE network shows a better training process. Its purpose is to enhance the inter and intra-class difference and mitigate the influence of feature noise. Experiments using the ADHD-200 dataset, comprising subjects from NI, NYU, KKI, and PU, show an impressive accuracy of 99.6%.

The study [49] introduces two novel deep-learning approaches for ADHD classification involving various models. The first approach utilizes ICA with CNN to extract independent components. The extracted components are fed as features into another CNN model for training so as to classify the subjects into ADHD and control groups. The second approach involves using correlation AE which utilizes ROI-based correlations as training features. Both methods use the inter-voxel information from fMRI data. Experiments are carried out using two approaches along with other models like LR, SVM, etc. on the ADHD-200 dataset from NI, NYU, Peking, BU, KKI, and OHSU sites. Experimental results show 67% accuracy using the proposed ICA-CNN model while specificity, sensitivity, and precision are 89%, 42%, and 77% respectively. The correlation AE-based approach shows slightly better results with a 69% accuracy.

The authors consider FC of subjects using voxel size blood-oxygen-level-dependent (BOLD) signal in [50]. The authors introduce a modified bidirectional LSTM (BiLSTM) to use voxel features from active regions of RSN for subject classification. Experiments are performed using 28 active regions, in addition to the behavioral data of 40 subjects. Initial experiments indicate that the model can provide accurate results with an accuracy of 87.50% which is better than several complex models in the existing literature.

ADHD subtype classification

While the majority of the existing studies focus on classifying the subjects into ADHD and control subjects, comparatively less work is done for ADHD subtype classification. To bridge this gap, the study [51] employs genetic algorithm learning vector quantization 2 neural network (GA-LVQ2NN) to divide the subjects into various ADHD subtypes. Learning vector quantization 2 (LVQ2) can advantageously set the weight vectors for supervised learning but can lead to poor performance if weights are set improperly. The authors utilize a genetic algorithm (GA) to optimize weight vectors. The dataset is collected from House of Fatima involving 100 subjects with ADHD symptoms which are classified into high-intensity, medium-intensity, and low-intensity. Experimental results show an average accuracy of 80% for LVQ2NN and 89.5% for GA-LVQ2NN approaches.

The study [52] utilizes the fully connected cascade (FCC)-based artificial neural network (ANN) to carry out the classification of ADHD subjects into different subtypes. The research involves using different brain connectivity-based methods to identify the most impactful features to enhance prediction performance. The data from 173 ADHD-I subjects and 260 ADHD-C subjects are used from the ADHD-200 dataset. The study performs various experiments using balanced and unbalanced datasets with connectivity path weights, principal components, and latent variables features using SVM and ANN models. The accuracy for classifying subjects into ADHD and healthy groups is approximately 90% which increases to approximately 95% when classifying ADHD subtypes. FCC ANN performs better in comparison to the SVM classifier and is less influenced by balanced and unbalanced class distribution.

Complex network theory has been utilized to detect several neuropathological conditions including Alzheimer’s, ADHD, etc. For this purpose, predominantly, the correlation matrix from fMRI is utilized for subtype classification. Such approaches are computationally complex and contain personalized bias. The study [53] presents a deterministic approach for ADHD subtype classification using eigenvector centrality. The human connectome project (HCP) dataset is used and three classes, ADHD-C, ADHD-I, and ADHD-H, are considered in the experiments. ADHD-C and ADHD-I have 43 male subjects and 19 and 22 female subjects, respectively while ADHD-H has 6 males and 1 female subject. The classification is carried out using the classification and regression tree (CART) model using the eigenvector centrality measure. An average accuracy score of 0.675 is reported for male subjects while for female subjects the accuracy score is 0.771 for three classes of ADHD.
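Eigenvector centrality scores a node by the leading eigenvector of the adjacency matrix, so a region is central when it is connected to other central regions. The sketch below computes it for a symmetric, non-negative adjacency matrix; the function name and the five-node toy graph are hypothetical, and the way [53] derives its graphs from fMRI data is not reproduced here.

```python
import numpy as np

def eigenvector_centrality(A):
    """Leading eigenvector of a symmetric, non-negative adjacency matrix.

    The result has unit Euclidean norm and non-negative entries.
    """
    eigvals, eigvecs = np.linalg.eigh(np.asarray(A, dtype=float))
    v = eigvecs[:, np.argmax(eigvals)]
    return np.abs(v)   # the sign of an eigenvector is arbitrary

# Toy star graph: node 0 is connected to nodes 1..4.
A = np.zeros((5, 5))
A[0, 1:] = 1
A[1:, 0] = 1
centrality = eigenvector_centrality(A)
```

In this toy graph the hub (node 0) receives the highest score and the four leaves share an equal, lower score. The resulting per-region values could then serve as input features for a tree-based classifier such as CART.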

Another interesting work on ADHD subtype classification is [54] which aims to improve the ADHD subtype classification accuracy using structural MRI and fMRI data. The study proposes a hierarchical binary hypothesis testing (H-BHT) that uses FC as input signals. The proposed framework utilizes a two-stage process involving the DT model for subtype classification. The process involves finding important FC at both stages concerning each subtype. For experimental evaluation, fMRI data from the ADHD-200 dataset is used. A total of 490 subjects are selected comprising 275 TD, 97 ADHD-I, and 118 ADHD-C. The obtained accuracy varies between 93.3% and 98.8% for different sites of the ADHD-200 dataset; an average accuracy of 97.1% is reported with a kappa score of 0.947.

The authors utilize the neuroimaging features to find differences in ADHD subtypes and classify them in [55]. Experiments are conducted using the fMRI data of 34 subjects comprising 13 ADHD-IA subjects and 21 ADHD-C subjects. Connectivity measures for ADHD subjects were used with an SVM classifier which reports an accuracy of 91.8% for two classes. Brain regions are identified using the gambling punishment and emotion task paradigms. It is found that there are significant connectivity differences in frontal, cingulate, and parietal cortices between ADHD subtypes.

Feature fusion

Improving ADHD classification accuracy is an important aspect of the automated diagnosis of ADHD and various approaches have been presented in this regard including feature fusion, combining models, and data preprocessing. The study [56] proposes the use of multiple features to train machine learning models for better classification accuracy. The authors integrate imaging and non-imaging features from rs-fMRI data in this regard. The study investigates the automatic classification of ADHD by identifying the functional connectivity of ADHD and control subjects. In addition, the impact of using imaging and non-imaging features on classification accuracy is also studied. For determining functional connectivity, clustering algorithms including affinity propagation (AP) [57] and density peak algorithm [58] are used. The study uses the ADHD-200 dataset involving 189 ADHD and 243 control subjects while the synthetic minority oversampling technique (SMOTE) is applied to balance the class distribution. Experimental results with imaging and non-imaging features combined using the SVM model show accuracies of 86.7%, 52.7%, and 85.8% for the KKI, NI, and Peking sites of the ADHD-200 dataset. On the other hand, using imaging features only manages to obtain classification accuracy of 67.4%, 72.9%, 25.4%, and 85.3% for KKI, NI, NYU, and Peking. In addition, it is also observed that using a higher number of imaging features also tends to improve SVM performance.
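SMOTE balances classes by creating synthetic minority samples on the line segment between a minority sample and one of its nearest minority neighbors. The sketch below is a simplified re-implementation for illustration, not the reference algorithm or any library's API. The group sizes (189 ADHD, 243 control) follow the text, while the function name, the value of k, the feature dimension, and the random stand-in features are assumptions.

```python
import numpy as np

def smote_like(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between minority neighbors."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dist)[1:k + 1]      # skip the sample itself
        nb = X_min[rng.choice(neighbors)]
        gap = rng.random()                         # position along the segment
        synthetic[j] = X_min[i] + gap * (nb - X_min[i])
    return synthetic

rng = np.random.default_rng(1)
X_adhd = rng.normal(size=(189, 4))                 # stand-in minority features
X_new = smote_like(X_adhd, n_new=243 - 189)        # 54 samples close the gap
```

Oversampling should be applied to the training folds only; synthesizing samples before splitting lets near-copies of test subjects leak into training and inflates accuracy.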

Another similar endeavor is the study [59] that fuses spatial and temporal features to enhance the effectiveness of fMRI-based ADHD classification. For improved performance, a 4D convolutional neural network model is built which is based on granular computing. To comprehend the structure of rs-fMRI data, the authors designed different models for fusing spatial and temporal features such as feature pooling, long short-term memory (LSTM), and spatio-temporal convolution. In addition, a data augmentation approach is also proposed to augment rs-fMRI frames into short pieces. Experiments are performed using the ADHD-200 sample dataset and an accuracy of 71.3% is reported with an area under the curve (AUC) score of 0.80.

The study [60] uses the fractional amplitude of low-frequency fluctuation (fALFF) with fMRI data to improve the diagnosing capabilities of machine learning models. The authors propose multiple linear regressions to mitigate the confounding effects. The fALFF is integrated with the principal component analysis (PCA), Shannon entropy, and sample entropy to formulate features. Experiments use logistic regression (LR), linear SVM, radial basis function SVM (RBF-SVM), random forest (RF), and decision tree (DT) using the publicly available ADHD-200 dataset. The highest accuracies of 81.82%, 76.0%, 70.73%, and 68.63% are reported for KKI, NI, NYU, and Peking sites of the ADHD-200 dataset.
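As a reminder of what fALFF measures, it is the ratio of the spectral amplitude summed over a low-frequency band to the amplitude summed over the whole detectable frequency range. The sketch below assumes the commonly used 0.01 to 0.08 Hz band and a repetition time of 2 s; both values, the function name, and the sinusoidal test signals are assumptions and are not taken from [60].

```python
import numpy as np

def falff(signal, tr, low=0.01, high=0.08):
    """Fractional amplitude of low-frequency fluctuation for one time series."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                 # remove the constant offset
    freqs = np.fft.rfftfreq(signal.size, d=tr)      # frequencies in Hz
    amplitude = np.abs(np.fft.rfft(signal))
    band = (freqs >= low) & (freqs <= high)
    total = amplitude[1:].sum()                     # skip the zero-frequency bin
    return amplitude[band].sum() / total

tr = 2.0                                            # assumed repetition time (s)
t = np.arange(200) * tr
slow = np.sin(2 * np.pi * 0.05 * t)                 # inside the band
fast = np.sin(2 * np.pi * 0.20 * t)                 # outside the band
falff_slow, falff_fast = falff(slow, tr), falff(fast, tr)
```

A signal oscillating inside the band yields a value near 1 and one oscillating outside it a value near 0; real voxel time series fall in between.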

The authors introduce a feature selection framework in [61] which is based on the functional connectivity patterns. The proposed feature selection uses relative importance and weighted ensemble learning to reduce dimensionality. For experiments, the data are collected from 112 ADHD and 77 control adult subjects from the Sixth Hospital of Peking University. Similarly, the data from 106 ADHD and 73 control children are also gathered for experiments. For the ensemble model, gradient boosting (GB), RF, extra tree (ET), and extreme gradient boosting (XGBoost) are combined with 0.15, 0.30, 0.10, and 0.45 weights which provide a 70.1% accuracy. Most importantly, the study reports that the FC features derived from children can be used to diagnose adult subjects.
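One plausible reading of such a weighted ensemble is weighted soft voting, in which each model's predicted probability is multiplied by its weight and the results are summed. The weights below are the ones quoted in the text; the per-model probabilities are invented for illustration, and whether [61] combines the models in exactly this way is an assumption.

```python
import numpy as np

weights = np.array([0.15, 0.30, 0.10, 0.45])   # GB, RF, ET, XGBoost (from the text)

# Hypothetical P(ADHD) from each model (columns) for three subjects (rows).
probs = np.array([
    [0.60, 0.40, 0.70, 0.30],
    [0.80, 0.70, 0.60, 0.90],
    [0.20, 0.50, 0.40, 0.10],
])

ensemble_prob = probs @ weights                # weighted average per subject
predicted = (ensemble_prob >= 0.5).astype(int) # 1 = ADHD, 0 = control
```

Because the weights sum to 1, the ensemble output remains a valid probability. For the first subject, two of the four models vote for ADHD, yet the heavily weighted XGBoost and RF outputs pull the combined probability below 0.5.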

Deep learning models have proven to be more accurate than machine learning models; however, they require larger datasets and higher computational resources. The study [62] works on an alternative approach to deep learning models and investigates deep forest, also called gcForest, for ADHD classification. The gcForest, having shown excellent results for image processing tasks, is adopted for fMRI data to classify subjects into ADHD and control groups. The authors utilized 1-D functional connectivity and 3-D amplitude of low-frequency fluctuations (ALFF) as features for training the gcForest. For feature fusion, gcForest is modified to utilize the two concatenated features as input. In addition, the dataset is balanced using the synthetic minority over-sampling technique. The ADHD-200 dataset is used to test the performance of the proposed approach. Results show that using FC and ALFF features together produces better results for Peking, KKI, NYU, and NI sites with average accuracies of 64.87%, 82.73%, 73.17%, and 72.00%, respectively.

While several studies focused on different biomarkers and spatial and temporal features, phenotypic information is not investigated very well. For example, age and gender are known to be contributing features for ADHD diagnosis, but are seldom studied. In this regard, [63] focuses on integrating age and gender features into other features obtained from fMRI images using an attention mechanism. A convolutional variational autoencoder is used to jointly optimize the learning process using brain connectivity embedding and age and gender attributes. For experiments, the ADHD-200 dataset with 780 subjects is used with 256 subjects from NYU, 245 subjects from Peking, 112 subjects from OHSU, 94 subjects from KKI, and 73 subjects from NeuroIMAGE. The proposed model provides an accuracy of 76.42%, 78.43%, 94.54%, 83.33%, and 98.40% for NYU, Peking, OHSU, KKI, and NeuroIMAGE samples. An average accuracy of 86.22% is obtained by the proposed approach for ADHD and TD control groups.

In the realm of machine and deep learning, ensemble models are often reported to provide better results than single models. Using multiple models is advantageous, as these models can compensate for limitations of each other. The study [64] adopts a similar approach for ADHD classification by proposing a multi-network approach where multiple LSTM models are adopted. For feature selection, the authors adopt the Gaussian mixture model which clusters the regions of interest (ROIs). It is followed by the data augmentation approach where phenotypic information is also integrated to improve the classification accuracy. Experimental results using the ADHD-200 dataset with 947 subjects provide an accuracy score of 0.737 using an independent set validation approach.

In the existing literature, predominantly, studies focus on using single channel information such as FC features which means that these studies ignore using intrinsic information from fMRI. To fill this gap, [65] designs a two-stage model combining separate channel CNN (SC-CNN) and an attention-based network for accurate classification. The SC-CNN learns temporal features in the first stage while temporal-dependent features are learned by the attention network in the second stage. The ADHD-200 dataset is used from five sites including 104, 73, 262, 113, and 245 subjects from KKI, NI, NYU, OHSU, and Peking sites for experiments. An average accuracy of 68.6% is reported for all sites.

Another study on feature fusion is [66], where a modified AE is utilized for discriminating between ADHD and healthy controls. The authors combine FC features and non-imaging data from fMRI. A traditional AE is used to learn high-level features while a sub-network is leveraged to integrate high-level features with their labels. The sub-network also incorporates non-imaging features such as age, gender, and IQ to improve the robustness of the model. A binary hypothesis is also generated using two sets of high-level features. To discriminate between binary hypotheses, a variability score is utilized. Experiments are carried out using ADHD-200 data from NYU, KKI, and PU sites. Results show superior accuracy for all sites obtaining higher than 99% accuracy for NYU, PU, and KKI.

The study [67] presents a multimodal data fusion to boost the accuracy of ADHD classification. The authors design an ensemble model that uses transform for 3D images (Trans3D) and RF. The former is used for extracting spatio-temporal features while the latter extracts clinical features. Trans3D comprises a 3D-CNN model to extract volumetric spatial information which is transformed into patch embeddings. Temporal pooling is also used to fuse image tokens across time and get features from fMRI images. Compared to existing studies that utilize stand-alone models, the ensemble model operates on the principle of stacking and produces better results with a 74.5% accuracy using the ADHD-200 dataset.

A spatiotemporal method for the classification of subjects into ADHD and TD subjects is presented in [68]. The authors design a 3-dimensional CNN (3D-CNN) to extract the most suitable features from rs-fMRI data. For classification, the authors build a customized gated recurrent unit (GRU) model that can process 3D spatial and 1D temporal information efficiently. Using a 5-fold cross-validation on the ADHD-200 dataset, the proposed approach obtains 71.65% accuracy, 68.00% sensitivity, and 73.80% specificity. The model proves to show better generalizability than other approaches in the existing literature.

The study [69] works on capturing long-distance dependency (LDD) which is not covered by existing deep learning models due to the sequential nature of LDD. The authors propose a novel spatiotemporal attention AE (STAAE) to address the issue of obtaining LDDs from volumetric rs-fMRI. The proposed STAAE is an unsupervised framework that can model spatiotemporal sequence by decomposing rs-fMRI into spatial and temporal patterns. Although spatial patterns are widely investigated, temporal patterns are rarely investigated. A resting-state temporal template (RSTT) is formulated in this regard while an STAAE-based classification framework is utilized for ADHD classification. The study uses the ADHD-200 dataset for NYU, PU, KKI, NeuroImage, and OHSU sites. Results show that the proposed approach can achieve an accuracy of 72.5%. RSTTs obtained from one dataset work on other sites’ datasets, but accuracy varies depending upon the age differences that exist between subjects of various sites (Table 2).

Table 2 Summary of fMRI-based ADHD diagnosis approaches

The fMRI data availability for training deep learning models is a big challenge for ADHD classification. The authors propose three approaches in this regard for data augmentation [70]. The data augmentation is based on functional connection networks in the fMRI data which is further complemented by a deep feature fusion approach. The first approach uses the Gaussian noise, mixup, and sliding window methods for FCN data augmentation. These methods are selected to balance the variation of the sample distribution. The second approach leverages CNN and the graph attention network to obtain local and global features. In the end, two features are combined for ADHD classification. The proposed approach is tested using the ADHD-200 dataset involving NYU, NI, PKU, KKI, OHSU, and Pittsburgh sites. Results show that using the feature fusion+sliding window the accuracy can be improved even when using a smaller number of samples. The proposed approach also tends to improve the robustness. A detailed overview of all discussed research works concerning the adopted approach, dataset, target classes, accuracy, and contributions is provided in Table 3.
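The three augmentation methods named above can be sketched generically as follows: additive Gaussian noise, mixup (a convex combination of two samples), and sliding windows that turn one scan into several shorter connectivity matrices. The function names, noise level, window length, step size, and random stand-in data are all assumptions and do not reproduce the settings of [70].

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(x, sigma=0.01):
    """Return a copy of x perturbed by zero-mean Gaussian noise."""
    return x + rng.normal(scale=sigma, size=x.shape)

def mixup(x1, x2, lam):
    """Convex combination of two samples; lam lies in [0, 1]."""
    return lam * x1 + (1.0 - lam) * x2

def sliding_window_fc(ts, window, step):
    """ts: (timepoints, rois). Return one correlation matrix per window."""
    starts = range(0, ts.shape[0] - window + 1, step)
    return np.stack([np.corrcoef(ts[s:s + window], rowvar=False) for s in starts])

ts = rng.normal(size=(120, 10))                # stand-in ROI time series
windows = sliding_window_fc(ts, window=60, step=20)   # 4 matrices of 10 x 10
noisy = add_gaussian_noise(windows[0])
mixed = mixup(windows[0], windows[1], lam=0.3)
```

When mixup combines samples from different classes, the labels are usually mixed with the same coefficient; that step is omitted here for brevity.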

Table 3 Summary of MRI-based ADHD diagnosis approaches

ADHD diagnosis using magnetic resonance imaging and functional near infrared spectroscopy

A fully automatic ADHD diagnosis system is presented in [71]. The proposed approach comprises several steps starting with obtaining globally optimal segmentation of caudate in MRI images. A new segmentation approach is also proposed based on shape features which proves to be more accurate. Secondly, internal caudate segmentation is carried out using the shape feature. Lastly, ADHD diagnosis is carried out using extended volumetric features. Experiments utilized the URNC database with 39 ADHD, including 35 boys and 4 girls, and 39 control groups comprising 27 boys and 12 girls. Various strategies are used for ADHD diagnosis including decision stump (DS), geometric criterion, and SVM variants like linear, polynomial, and RBF kernels. The best results are obtained using ADA SVM which obtains an accuracy of 72.48%, specificity of 85.93%, and sensitivity of 60.07%. The study shows that the proposed shape features using the SVM and DS produce better results than the state-of-the-art geometric criterion approach.

While several studies provide accurate automated diagnosis of ADHD, MRI images involve high computational complexity thus involving longer processing time. The authors propose an extreme learning machine (ELM) in [72] for ADHD diagnosis with reduced computational complexity and investigate the impact of sample size on ELM and SVM. Special focus is placed on identifying brain segments to identify ADHD. For experiments, 3D images from 55 ADHD and 55 control subjects are acquired. The study extracts 340 cortical features from 68 brain segments. F-score and selective feature selection are used for selecting optimal features. Experimental results indicate an accuracy of 90.18% for ELM using 11 features while linear SVM shows an accuracy of 84.73%. RBF SVM is also used which provides an 86.55% accuracy. It is further pointed out that the frontal lobe, temporal lobe, occipital lobe, and insular show striking differences for ADHD and control subjects.
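F-score ranking scores each feature by the separation of the class means relative to the within-class variances and keeps the highest-scoring features. The sketch below uses one common definition of the F-score; whether [72] uses exactly this formula is an assumption. The group sizes (55 and 55), the 340 features, and the choice of 11 selected features follow the text, while the data are synthetic and one feature is made informative on purpose.

```python
import numpy as np

def f_score(X, y):
    """Per-feature F-score for binary labels (one common definition)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    between = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    within = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return between / within

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 55)                  # 55 controls, 55 ADHD as in the text
X = rng.normal(size=(110, 340))            # 340 features; values are synthetic
X[y == 1, 7] += 2.0                        # make feature 7 informative on purpose
scores = f_score(X, y)
top11 = np.argsort(scores)[::-1][:11]      # indices of the 11 best features
```

Note that 340 features over 68 segments corresponds to five measurements per segment. As with any filter method, the ranking should be computed inside each cross-validation fold to avoid optimistic bias.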

Contrary to many existing studies which focus on prefrontal-striatal circuit dysfunction, the study [73] investigates the involvement of various brain regions in ADHD. Special emphasis is placed on analyzing structural and functional networks of the brain and producing comparable results with existing studies. The study utilizes cortical landmarks and points out connectomics abnormalities in ADHD. The authors identify ROIs called dense individualized and common connectivity-based cortical landmarks (DICCCOL) to show group-wise similarity. The authors use 25 ADHD-C and 49 healthy control children between 8 and 14 years of age for experiments. Results show that stronger interactions between emotion networks and memory networks are found for ADHD children. It is concluded that ADHD is a dysfunction of the whole brain which centers upon the emotion network.

The study [74] investigates the use of structural T1 weighted brain scans to detect ADHD. A total of 68 participants were used for data collection including 34 each for ADHD and control groups. Mean threshold features were used involving the selection of voxels and the SVM classifier. The analysis involved the gray matter and white matter compartments, as well as, both combined. Results show an accuracy of 93% using the white matter, while the use of gray matter resulted in a 63% accuracy. The white and gray matter combined showed an accuracy of 81% to discriminate between ADHD and control groups.

The authors present an ADHD classification approach called the meta-cognitive neuro-fuzzy inference system (McFIS) in [75]. In addition, a feature extraction approach is also proposed which utilizes a binary coded genetic algorithm (BCGA). The BCGA is combined with ELM to form BCGA-ELM to make an optimal feature extraction approach. The hippocampus features from MRI are extracted using the proposed approach. Optimal features that contribute to the final label are selected using the BCGA-ELM approach which is used to train the McFIS model. Best solutions are found using the count of each voxel appearing in 10, 50, 100, and 200 solutions. Results show that McFIS shows better performance using 63 voxels from the top 50 solutions with 0.68 and 0.56 accuracy scores for training and testing.

A multiclassification approach is presented in [76] that utilizes a hierarchical ELM (H-ELM) classifier. Moreover, feature selection is performed using the SVM-based RFE approach. Performance comparison is carried out using SVM and ELM models as baseline models and 159 structural MRI images are used from the ADHD-200 MRI dataset. Experiments involve three classes including ADHD-TD, ADHD-I, and ADHD-C. Experiments show an accuracy of 60.78% for three classes.

The study [77] proposed a modified RFE approach for optimal feature selection for multi-class ADHD classification. The selected features are fed into an SVM classifier for classification. Experiments involve 90 subjects, 30 each from three sub-classes of ADHD. Experimental results indicate an 84.17% accuracy using the modified RFE-SVM feature selection and linear SVM classifier. It is also observed that cortical thickness and volume tend to be the most important features for classification.

The authors introduce a 3D CNN model for ADHD classification using MRI data in [78]. Keeping in view the limitations of CNN to learn discriminative features from raw data, the authors extract low-level features from functional MRI and structural MRI. In addition, the 3D CNN model helps to learn local spatial patterns. Later, functional and structural features are combined for a more discriminative representation of brain MRI. Results report an accuracy of 69.15% with a smaller training dataset. The authors suggest that the use of multi-modality data can enhance ADHD classification accuracy.

The authors in [79] use neuroanatomical signatures for the ADHD spectrum with the help of a machine-learning model for pattern classification. The data from structural MRI and diffusion tensor imaging is used for ADHD and healthy controls. SVM is used as the machine learning classifier on data from 67 ADHD and 66 healthy control subjects. Classification accuracy of 66% is reported on average while the accuracy for male subjects only is 74%.

The study [80] proposes a three-level CNN for accurate ADHD classification. Preprocessing of the MRI data involves skull stripping, Gaussian kernel smoothing, etc. The right caudate nucleus, left precuneus, and other regions are selected using a coarse segmentation approach. In the end, the proposed CNN is implemented in these regions for training and testing. A classification accuracy of 62.52% is reported using the proposed three-level CNN model.

Shape and texture features are used to classify subjects into ADHD and control groups in [81]. The study extracts several shape features like eccentricity, axis lengths, area, etc., in addition to texture features of mean, skewness, kurtosis, etc., as well as contrast and correlation features. A KNN model is used to obtain gray and white matter. In addition, local and global features are also obtained using texture and second-order statistics, respectively. Feature importance is determined using the PCA approach. Using a cross-validation approach on 26 subjects’ data, an accuracy of 100% is reported.

The study [82] preferred the interregional morphological patterns over regional patterns for ADHD classification from MRI. A surface-based analysis is carried out to extract these features. Features are prioritized using SVM concerning their importance and a total of 45 features are used for experiments. A hybrid machine learning model is implemented using a leave-one-out cross-validation approach. Experimental results show better performance with interregional morphological connectivity features. The proposed approach obtained an accuracy of 74.65%, while the sensitivity and specificity are 75% and 74.29%, respectively.
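Leave-one-out cross-validation, as used in [82], holds out one subject at a time, trains on the rest, and averages the per-subject outcomes. The sketch below shows the procedure with a nearest-centroid classifier as a stand-in; this classifier is an assumption chosen for brevity and is not the hybrid model of [82]. The feature vectors are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each subject once, train on the others, and average the hits."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-feature subjects: label 0 = control, 1 = ADHD.
X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]]
y = [0, 0, 0, 1, 1, 1]
loo_acc = leave_one_out_accuracy(X, y)
print(loo_acc)
```

Because every subject serves as the test case exactly once, this scheme is popular for the small samples typical of structural MRI studies, although any feature selection must also be repeated inside each fold to avoid optimistic estimates.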

In [83], sparse representation is leveraged for analyzing the functional connectivity of ADHD and healthy subjects using MRI data. A dictionary model is utilized to learn the feature space of both classes with the help of SVM-RFE. Model learning is improved using a penalty scheme to reduce the feature energy in the wrong feature space. Experimental results using an SVM classifier show a 75% accuracy.

The study [84] performs a diagnosis-agnostic homogeneity analysis for autism spectrum disorder (ASD), ADHD, and obsessive-compulsive disorder (OCD). For ADHD subjects, structural MRI data was recorded using a 3-Tesla Siemens Trio TIM scanner from 184 subjects. To analyze multi-dimensional brain-behavior data, the study designed a machine-learning pipeline that follows a three-step process. Initially, clustering is performed on group participants, which is followed by the bagging procedure to improve clusters. In the end, feature weights are calculated to determine cortical regions. To assess the agreement between labels and clusters, the study utilized several parameters such as normalized mutual information, homogeneity, completeness, etc. Analysis results indicate that diagnostic labels do not correspond to clusters and one cluster might contain subjects from different disorders.
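Normalized mutual information (NMI) quantifies how well a clustering agrees with a set of reference labels. The following sketch assumes the arithmetic-mean normalization, NMI = 2·I(U;V) / (H(U) + H(V)); the normalization actually used in [84] is not specified in this review, and the label vectors are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same subjects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (u, v), c in joint.items():
        mi += (c / n) * log(c * n / (count_a[u] * count_b[v]))
    return mi


def normalized_mutual_information(a, b):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0:
        return 1.0  # both assignments are constant, hence trivially identical
    return 2 * mutual_information(a, b) / (h_a + h_b)


# Hypothetical diagnoses and two hypothetical clusterings of six subjects.
diagnosis = ["ASD", "ASD", "ADHD", "ADHD", "OCD", "OCD"]
clusters_match = [0, 0, 1, 1, 2, 2]   # clusters reproduce the diagnoses
clusters_mixed = [0, 1, 0, 1, 0, 1]   # each cluster mixes all diagnoses
nmi_match = normalized_mutual_information(diagnosis, clusters_match)
nmi_mixed = normalized_mutual_information(diagnosis, clusters_mixed)
print(nmi_match, nmi_mixed)
```

An NMI near 0, as in the second hypothetical clustering, corresponds to the situation described above in which clusters cut across diagnostic labels.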

A multi-region-based ensemble approach is presented in [85] for accurate classification of ADHD subtypes. The approach utilizes features from structural MRI data of the amygdala, caudate, and hippocampus. For feature extraction, a feature-selecting genetic algorithm is employed to extract multiple feature sets. The best feature set and its corresponding ELM are later determined using a classifier-selecting genetic algorithm. The ensemble classifier combines the selected classifiers, which are joined using a risk-sensitive loss function. Various scenarios are used to analyze the performance of the proposed ensemble classifier. Results show that using a multi-region ensemble classifier, an accuracy of 81.24% can be obtained for three classes.

The study [86] designs a deep CNN model for improved performance concerning ADHD diagnosis. The authors employ a data augmentation approach to overcome the limitation of the smaller dataset for training. In addition, the study encodes gray-scale images to 3 channel images so that pre-trained CNN variants can be utilized with these images. Performance comparison of all deployed CNN variants indicates the superior performance of the proposed customized CNN model with an accuracy of 66.67% to classify the subjects into ADHD and healthy controls.

A 3D fractal dimension complexity map is used for automated ADHD classification in [87]. Fractal analysis has already been adopted for texture image analysis and can highlight the intrinsic structural information. The study uses the same concept for diagnosing ADHD using MRI images. The gray matter of structural MRI is used where the Hausdorff fractal dimension is obtained for further processing. A 3D CNN is designed to be used with these features and classify ADHD and healthy subjects. Experiments show an accuracy of 69.01% using the proposed 3D CNN and gray matter features.

The study [88] uses a machine learning-based approach for ADHD classification using MRI data. Following the preprocessing, a deep 3D CNN is used for training. Gray matter from structural MRI and fALFF from fMRI are obtained using the CNN model. Later, early and late fusion is followed and classification is carried out using SVM, KNN, and LDA models. Results indicate improved performance if personal characteristics are employed. In addition, multimodal data including early, and late fusion and personal characteristics can further elevate the performance. LDA is reported to show a 74.93% classification accuracy.
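The difference between early and late fusion can be made concrete with a small sketch. Here early fusion is taken to mean concatenating per-modality feature vectors before a single classifier, and late fusion to mean combining per-modality classifier scores; both definitions are the conventional ones and are assumed rather than taken from [88]. All numbers are hypothetical.

```python
def early_fusion(structural_features, functional_features):
    """Concatenate modality features into one vector for a single classifier."""
    return list(structural_features) + list(functional_features)


def late_fusion(modality_scores, weights=None):
    """Combine per-modality classifier scores with a (weighted) average."""
    if weights is None:
        weights = [1.0] * len(modality_scores)
    return sum(w * s for w, s in zip(weights, modality_scores)) / sum(weights)


# Hypothetical gray-matter and fALFF feature vectors for one subject.
gray_matter = [0.2, 0.7]
falff = [0.5, 0.1, 0.9]
fused_vector = early_fusion(gray_matter, falff)

# Hypothetical ADHD probabilities from two modality-specific classifiers.
combined_score = late_fusion([0.8, 0.6])
print(fused_vector, combined_score)
```

Personal characteristics such as age or sex could be appended to the fused vector in the early-fusion setting, which is one plausible reading of how they are incorporated.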

The limited number of labeled samples is one of the issues for MRI-based ADHD classification. To resolve this issue, [89] adopts a conditional adversarial domain adaptation network (CDAN). The objective is to learn impactful features from MRI data that can discriminate between ADHD and healthy subjects. Experiments involve MRI data from multiple sites to evaluate the generalizability of the proposed approach. For classification, SVM and DNN models are used along with different methods used with DNN including vanilla DNN, DNN-Z, domain adversarial NN (DANN), and conditional DANN with entropy. The best accuracy is reported as 73.9%.

The study [90] focuses on radiomic features from MRI data for ADHD subtype classification. Image preprocessing involves image reconstruction, correction, segmentation, etc. Feature extraction involves 1057 radiomic features from the gray matter of MRI data. Top impactful features are selected and classification is carried out using sequential backward elimination SVM (SBE-SVM). Experiments use the data of 88 patients. An accuracy of 84% is reported for two subtypes of ADHD.

The study [91] investigated archival MRI data in connection with the behavioral data for ADHD patient analysis. The objective was to evaluate the functional connectivity to differentiate ADHD patients from healthy subjects. Experiments involved 80 adults, including 55 with ADHD while the remaining 25 were healthy subjects. Classification is performed using a multilayer feedforward classifier with k-fold cross-validation. Reported scores for hit rate (HIT) and false alarm rate (FAR) are 0.86 and 0.04, indicating superior performance of the proposed approach.

Serotonin transporter (SERT) is reported to impact ADHD patients and the study [92] investigates the measure of SERT to classify ADHD subjects. During preprocessing RoIs were defined and the authors utilized an RF model for selecting features. Classification is also done using the RF model with a k-fold cross-validation approach. Several important features were selected that can potentially discriminate between ADHD patients and healthy subjects. RF shows a mean accuracy of 82% while the sensitivity and specificity scores are 0.75 and 0.86, respectively.

The study [93] uses a multimodal framework combining multiple machine learning models including Boruta. The Boruta is used for feature selection, along with multiple kernel learning to incorporate multiple features from structural and functional MRI. Various features are incorporated such as functional connectivity, macro and micro structural features, etc. which are used at the kernel level. SVM is used as the final classifier for ADHD and healthy subjects which obtains a 64.3% classification accuracy while the AUC score is 0.698.

The authors propose a hybrid system in [94] to improve the classification accuracy of ADHD disease. Experiments involve using the data of resting state MRI to classify them into ADHD and healthy subjects. The proposed system is based on a two-dimensional CNN model and a hybrid model that comprises 2D CNN and an LSTM model. Results suggest an average accuracy of 98.12% using the hybrid 2D CNN-LSTM model. In addition, the model shows superior performance with 97.50% sensitivity, 98.16% specificity, 97.72% F1 score, and 97.85% AUC.

The objective of [95] was to find new subgroups based on brain-behavior for similar neurological disorders including ASD, ADHD, and OCD. For this purpose, the study utilizes brain imaging features in conjunction with behavior features to discriminate between the three disorders. Normalized mutual information is used to determine the ranking of features that can contribute to determining inter-group similarity. The study identified transdiagnostic groups that exhibit homogeneous attributes within groups. It is observed that cortical thickness and inattention score are the top contributing features for the models. Using the top contributing features, various groups (ASD, ADHD, OCD) can be identified with a 42% to 86% mean sensitivity.

Along the same direction, the study [96] focuses on discriminating between ASD and ADHD, considering the impact of sex differences in each disorder. For this purpose, the authors consider the cognitive profiles of four groups, namely males and females within each of the ASD and ADHD disorders. Data from a total of 526 patients are analyzed, including 277 males and 86 females with ASD and 99 males and 64 females with ADHD. Higher IQ and perceptual organization are observed in ADHD males compared to females and males of ADHD and ASD, respectively. In addition, the percentage of the lowest score in a cluster is higher in females. The study also utilized an RF model to determine the importance of various features and found that autism quotient (AQ) is the most contributing attribute to discriminating between ASD and ADHD males. Similarly, for females, verbal and performance intelligence discrepancy is the leading feature.

Another similar study is [97] that investigated various subtypes found in ASD and ADHD by considering multiple features including behavior, brain imaging, and cognition. The study utilizes subjects from three different groups such as ASD, ADHD, and healthy controls, and employs unsupervised methods to cluster the subjects using an agglomerative hierarchical clustering approach. By clustering the subjects into various groups, the objective is to find new diagnostic boundaries that are not covered by the DSM. The study finds three subtypes that indicate the association of white matter with symptoms and neurocognition.

The study [98] investigates whether cortical thickness and volumetric features can be utilized as biomarkers to discriminate between ADHD patients and healthy controls. Volumetric features are obtained from gray matter with the help of the Automated Anatomical Labelling (AAL3) atlas. Similarly, cortical thickness (CT)-related features are extracted using the Destrieux atlas. For obtaining the most contributing features, minimum redundancy maximum relevance (mRMR) is used along with an ensemble feature selection (EFS) approach. Classification is performed using various machine learning models like linear SVM, radial-based SVM, and RF. Models are evaluated using a different set of features to determine the best model and most impactful features. Results indicate that a 75% accuracy can be obtained using CT and personal characteristics using radial-based SVM and SVM when EFS is utilized for feature extraction.

The authors propose an approach for performing ADHD classification in [99] using MRI images. Image preprocessing is carried out involving contrast adjustment, noise removal, etc. to improve segmentation performance. Skull stripping is carried out using the largest connected component (LCC). The authors utilized several approaches and showed that LCC is very efficient for skull stripping to obtain better segmentation. The segmented grey matter is then utilized to classify subjects. Performance evaluation is carried out using dice coefficients, as well as, the Jaccard index.
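The Dice coefficient and Jaccard index both measure the overlap between a predicted segmentation and a reference mask. A minimal sketch, representing each mask as a set of voxel coordinates, is given below; the masks are hypothetical and do not come from [99].

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def jaccard_index(pred, truth):
    """Jaccard = |A ∩ B| / |A ∪ B| for two sets of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0
    return len(pred & truth) / len(pred | truth)


# Hypothetical 2D masks given as (row, col) coordinates of foreground voxels.
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (1, 2)}
dice = dice_coefficient(predicted, reference)
jaccard = jaccard_index(predicted, reference)
print(dice, jaccard)
```

The two scores are monotonically related (J = D / (2 − D)), so they rank segmentation methods identically; Dice is simply the more lenient of the two for partial overlaps.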

The study [100] investigates new biomarkers for early ADHD diagnosis. The authors investigate whether the centrality of the phase-lag index concerning brain connectivity can be used as an important biomarker. This biomarker is found in beta and delta frequency bands. Another biomarker is the node betweenness centrality, found in the inter-site phase clustering connectivity. This biomarker is found in the delta and theta bands. Using these two biomarkers, six machine learning classifiers are implemented to classify the subjects into ADHD and healthy control. Results show a remarkable 99.17% accuracy.

Besides the fMRI and MRI approaches, functional near-infrared spectroscopy (fNIRS) is another brain imaging approach studied by several works. For example, the study by Crippa et al. (2017) uses the DYNOT Compact (NIRx, Berlin) NIRS device comprising 32 channels, 8 emitters, and 24 detectors for recording brain activity. Feature reduction is carried out using z-score data. Later, the PCA algorithm is used to obtain a smaller set of features. The authors use an SVM model for binary classification tasks, along with an ensemble SVM model where multiple models trained on different features are combined. An accuracy of 81% is reported for the ensemble classifier.

The study [101] utilizes the fNIRS data for ADHD diagnosis and proposes a half-fold windowing (HFW) method. The proposed method aims at obtaining sub-series and interval features. In addition, functional connectivity features are obtained using correlation, and wavelet coherence. Z-score normalization is used in conjunction with PCA for feature space reduction. In the end, fused features are utilized with an RF model to classify subjects as ADHD and control. The best accuracy of 87.4% is reported along with 83.5% precision, 95.7% recall, and 89.2% F1 score.
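The scores reported for [101] can be cross-checked against each other, assuming the standard definition of the F1 score as the harmonic mean of precision and recall. The short sketch below uses the precision and recall values quoted above; the function name is introduced here for illustration.

```python
def f1_from_precision_recall(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted for [101] in the text above.
f1 = f1_from_precision_recall(0.835, 0.957)
print(f"{100 * f1:.1f}%")  # prints "89.2%"
```

The result agrees with the reported 89.2% F1 score, which suggests the three figures were computed on the same evaluation split.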

Diagnosis of co-existing mental disorders including ASD and ADHD is considered in [102]. The proposed framework is based on a deep learning ensemble approach and uses the fNIRS data from 28 subjects gathered for experiments. The data are collected when children are asked to perform different tasks including periodic lines (PL), zigzag lines (ZL), etc. The hybrid model comprising CNN and BiLSTM models shows an accuracy of 94.0% under the PL task.

The study [103] explores a discriminant correlation analysis (DCA) supported feature fusion approach to classify ADHD patients. Feature fusion is performed using joint features from the 1-back condition and 0-back condition. Experiments involve data obtained from 25 ADHD and 25 control subjects. For classification, SVM is trained on the fused feature set which shows an accuracy of 88.0% to classify subjects into ADHD and healthy controls.

Discussions of research questions

This section discusses the outputs concerning the defined research questions in the context of automated classification of ADHD subjects and control groups.

RQ1: What are the key neuroimaging technologies and how do they differ in ADHD diagnostic capabilities?

ADHD is characterized by inattention, impulsivity, and hyperactivity. There are different studies from neuroanatomical as well as genetic disciplines concerning ADHD diagnosis; however, neuroimaging studies have increased substantially. The fMRI modality, comprising both rs-fMRI and fMRI, is the leading technology used for neuroimaging in the existing literature. Besides that, MRI, positron emission tomography (PET), and diffusion tensor imaging (DTI) are also employed.

The fMRI assesses the variations in brain metabolism which is characterized by oxygenated and deoxygenated blood fluctuations. The increase and decrease in brain activity across particular regions over time, called BOLD can be assessed by the fMRI [105]. While MRI can provide brain structure information, fMRI can be used to examine regional brain functions. For this reason, fMRI results from ADHD and control groups can provide better insights into understanding abnormalities associated with ADHD [106]. Several findings are reported concerning ADHD in the existing studies using fMRI, indicating lower blood flow in frontal regions for ADHD patients [107,108,109,110]. The fMRI is advantageous concerning the examination of temporally correlated brain activities and helps derive inferences for FC of different regions [111]. The differences in FC of various anatomical regions help find differences in ADHD and control subjects.
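Functional connectivity (FC) is commonly estimated as the temporal correlation between regional BOLD time series. The following sketch assumes Pearson correlation as the FC measure, which is a widespread choice but not the only one used in the reviewed studies. The region names and time series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long, non-constant time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def connectivity_matrix(roi_series):
    """Pairwise correlations between all regions, keyed by (region_a, region_b)."""
    names = list(roi_series)
    return {(a, b): pearson(roi_series[a], roi_series[b]) for a in names for b in names}


# Hypothetical BOLD time series for three regions of interest.
roi_series = {
    "frontal": [1.0, 2.0, 3.0, 4.0, 5.0],
    "parietal": [2.0, 4.0, 6.0, 8.0, 10.0],
    "occipital": [5.0, 4.0, 3.0, 2.0, 1.0],
}
fc = connectivity_matrix(roi_series)
print(fc[("frontal", "parietal")], fc[("frontal", "occipital")])
```

The entries of such a matrix, typically the upper triangle only, are what many of the reviewed classifiers receive as input features.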

PET, a functional neuroimaging approach, uses intravenous injections of radioactive isotopes into the blood. The PET scanner can detect positively charged particles in the brain [105, 112]. PET can provide more detailed analyses of neurotransmitter binding site density for better study of various brain regions [112].

DTI is yet another neuroimaging modality that is primarily employed to study white matter in ADHD individuals [111]. Unlike functional neuroimaging approaches like fMRI and PET, DTI assesses the axonal organization of the brain. It measures the translational motion of water molecules which enables us to draw inferences on the structural connectivity of the brain [113]. However, studies utilizing the DTI approach are few compared to fMRI and PET technologies.

Undoubtedly, neuroimaging technologies have laid the foundations of the physiological basis for ADHD diagnosis and shed light on differences in brain functionality concerning ADHD and control groups. However, there are several methodological concerns for these technologies. First, the neuroimaging procedures are not standardized and variations exist in mathematical formulations leading to different interpretations of results. In addition, signal thresholds, contrast colors, and statistical methods chosen for analysis also add to this variation.

Most studies interpret neuroimages concerning change or difference in brain activity, however, the normal brain activity is not well defined and there is no baseline activity [114]. This is further complicated by the fact that age, health, emotional states, sex, etc. also affect the baseline activity. Despite the results reported in existing studies involving fMRI, PET, and DTI, such approaches lack the sensitivity and specificity required for psychiatric evaluations. While each neuroimaging technology has pros and cons, the superiority of one technology may change with respect to the adopted model, statistical approach, and the data used for experiments. Despite that, fMRI is more widely adopted for ADHD diagnosis [115].

RQ2: Which data preprocessing approaches are predominant for fMRI data?

Data preprocessing is an important step to provide clean and appropriate data for training machine learning and deep learning models. It removes unnecessary and redundant data and cleans it of noise added during the data recording process. In the case of fMRI data, several standard libraries have been designed to preprocess the fMRI data before model training. Table 4 shows the details of various preprocessing pipelines used in the studies discussed in this survey.

Table 4 Studies, and preprocessing pipelines

The data preprocessing pipeline for fMRI data follows several steps to process the data before it can be used for model training. Often, it starts with the timing correction of slices, followed by the motion correction. Next, quality control is carried out to correct motion. Normalization is also performed for linear and non-linear spatial information. It can be followed by the coregistration and concatenation phases. In addition, extraction of functional images, correction of time drifts and physiological noise, resampling, and spatial smoothing can be utilized.
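As one concrete illustration of the drift-correction step, the sketch below removes a linear trend from a single voxel time series by ordinary least squares. A purely linear drift model is an assumption made for brevity; pipelines such as ATHENA, FSL, AFNI, and NIAK provide their own, more elaborate detrending and filtering options. The signal values are hypothetical.

```python
def detrend_linear(series):
    """Subtract the least-squares line from a time series of at least two samples."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


# Hypothetical voxel signal: a slow scanner drift plus a small fluctuation.
drift_only = [100.0, 101.0, 102.0, 103.0, 104.0]
with_signal = [100.0, 102.0, 102.0, 104.0, 104.0]
residual_drift = detrend_linear(drift_only)
residual_signal = detrend_linear(with_signal)
print(residual_drift, residual_signal)
```

After detrending, the residual of a pure drift is zero, while task- or rest-related fluctuations around the trend are preserved for the later steps.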

Table 4 shows that ATHENA [116] is the most widely used preprocessing pipeline for fMRI data used for classifying ADHD and control subjects. The ATHENA pipeline mainly focuses on processing the fMRI data; however, it can also be used for T1 images to obtain the transformation from subject space to MNI space. Studies [30, 40, 46, 48, 49, 66] utilize the ATHENA processing pipeline. Four studies [31, 41, 42, 70] have used the Functional MRI of the Brain (FMRIB) Software Library (FSL) [117] for data preprocessing. Similarly, Analysis of Functional NeuroImages (AFNI) [118] is used by [41, 42, 70]. The AFNI pipeline involves slice-time correction, registration and normalization of data, alignment of slices and motion correction, smoothing and masking, and scaling processes for fMRI data.

Neuroimaging analysis kit (NIAK) is another commonly used processing pipeline for fMRI data. The NIAK pipeline differs from the ATHENA pipeline in three ways. First, an automated approach is followed in the NIAK to detect physical noise while the ATHENA uses a regression approach. Secondly, contrary to the ATHENA, the NIAK does not use low-pass filtering. Lastly, the resolution used for functional volumes is 3 mm, compared to the 4 mm used by the ATHENA.

RQ3: Which datasets are frequently used for fMRI-based ADHD diagnosis?

The data holds central importance when considering ADHD classification using machine learning and deep learning models. Collecting fMRI data is difficult, as it requires specialized equipment containing various sensors to record brain activity and medical experts to perform the data collection. In addition, the procedure involves the evaluation of the participants as ADHD and control subjects before data collection. While different authors can utilize various datasets, it becomes very difficult to analyze the efficiency of a particular classification approach without a benchmark dataset. The studies that utilize their own dataset often do not make it public, and it is very difficult to recreate the results of the proposed approach. Table 5 shows the details of the datasets used for experiments in the studies covered in this survey.

Table 5 Studies and corresponding datasets

Table 5 indicates that 22 studies [24, 30, 32, 33, 40,41,42,43,44, 46,47,48,49, 54, 63,64,65,66,67,68,69,70] utilized the ADHD-200 dataset which is publicly available and serves as the benchmark dataset.

The ADHD-200 dataset is the most widely used dataset for fMRI-based ADHD diagnosis [125]. Prior to the release of this dataset, the majority of existing literature focused on biological markers and lacked a comprehensive pathophysiology model for ADHD. Predominantly, studies utilized small-sized self-collected datasets which were not available publicly thereby making it impossible to analyze the models on a common benchmark. In addition, such models lack robustness and generalizability and model reproducibility was a big challenge.

The ADHD-200 Sample was an excellent initiative to accelerate research and development of models for accurate diagnosis of ADHD via open data-sharing. Under this project, fMRI samples from children and adolescents aged 7 to 21 years were collected from eight different sites. The participants comprise 491 typically developing subjects and 285 subjects with ADHD. The sites for imaging include Kennedy Krieger Institute (KKI), NeuroImage (NI), New York University Medical Center (NYU), Bradley Hospital/Brown University, Oregon Health & Science University, Peking University (Peking), University of Pittsburgh, and Washington University in St. Louis. In total, the ADHD-200 dataset provides 776 rs-fMRI scans combined from the eight sites. Each site has a different number of participants concerning age, gender, and sample distribution. With the release of the ADHD-200 dataset, various approaches can now be tested for performance evaluation on a standard benchmark.

While the ADHD-200 dataset is predominantly used for fMRI-based ADHD classification, several studies collect and use their own datasets. For example, [53] used the HCP dataset, while [31] performed experiments using the UCLA dataset. The study [22] relied on its own collected data while [28] used the ABIDE-ADHD-200 dataset. Similarly, the authors in [47] run the experiments using the dataset collected from NPNeuroPsychiatric Hospital.

RQ4: Which machine learning and deep learning models have been utilized for ADHD diagnosis?

Manual evaluation of the subjects concerning ADHD is a laborious and time-consuming task, not to mention the fact that it requires expert psychologists. In addition, medical experts and other evaluators need to follow the rating scales determined by various medical institutes [126]. For example, the American Academy of Pediatrics (AAP) [127] and the American Academy of Child and Adolescent Psychiatry (AACAP) [128] provide a systematic approach to ADHD evaluation. Often, parents and teachers have to participate in the evaluation of the child for ADHD.

Machine learning and deep learning models play a central role in the automated classification of subjects into various ADHD classes. In the existing literature, various machine learning and deep learning approaches have been designed and adopted for the automated classification of ADHD and control subjects. A large variety of machine learning models is available and the choice of a particular model depends on the type of task, such as classification, regression, etc., the nature of data, and the desired objective, such as accuracy, computational complexity, etc. Machine learning models require feature extraction and often a dedicated feature extraction approach is used [129]. Table 6 shows a summary of all the models, both machine learning and deep learning, used in the discussed studies. The models and their accuracies are reported concerning various categories such as feature selection, feature fusion, subtype classification, etc.

Table 6 Machine and deep learning models and their corresponding accuracy for ADHD classification

Figure 6 shows the performance of machine learning models for classifying subjects into ADHD and control subjects. The accuracy depicted in Fig. 6 is reported from the original articles. It is also noteworthy to point out that for some articles, multiple values for accuracy are reported. Several of the studies performed experiments using ADHD-200 benchmark dataset which contains multi-site data, involving eight sites. Consequently, studies report the accuracy of particular site data. However, for comparison, a standard procedure is followed and accuracy values for multiple sites are averaged to report only one value in this case.
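The site-averaging step described above can be sketched as follows. The unweighted mean mirrors the procedure stated in the text; the sample-size-weighted variant is shown only for contrast and is not claimed to be what the reviewed studies or this survey used. The site names come from the ADHD-200 description, while the accuracies and sample sizes are hypothetical.

```python
def mean_site_accuracy(site_accuracy):
    """Unweighted mean of per-site accuracies (one value per study)."""
    return sum(site_accuracy.values()) / len(site_accuracy)


def weighted_site_accuracy(site_accuracy, site_sizes):
    """Mean of per-site accuracies weighted by the number of test subjects per site."""
    total = sum(site_sizes[s] for s in site_accuracy)
    return sum(site_accuracy[s] * site_sizes[s] for s in site_accuracy) / total


# Hypothetical per-site results for a single study.
site_accuracy = {"NYU": 0.60, "Peking": 0.80, "KKI": 0.70}
site_sizes = {"NYU": 200, "Peking": 100, "KKI": 100}
unweighted = mean_site_accuracy(site_accuracy)
weighted = weighted_site_accuracy(site_accuracy, site_sizes)
print(round(unweighted, 3), round(weighted, 3))
```

The two summaries diverge whenever a large site performs differently from the smaller ones, which is worth keeping in mind when comparing averaged accuracies across studies.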

Fig. 6 Machine learning models for ADHD classification

Accuracy statistics indicate that an accuracy score of higher than 0.95 is reported in several studies. For example, the study [28] reports an accuracy of 98.2% using the DELM model, which is the best among the machine learning models. Several other studies also report an accuracy, marginally lower than [28]. The study [37] shows an accuracy of 98.04% using fALFF with VA-Relief model while an accuracy of 98.0% is obtained in [43] using ELM model. On the other hand, [35] and [46] report accuracies of 97.6% and 96.06% using binary hypothesis and ELM models, respectively.

Table 7 provides the details of the machine learning models that are most widely used for ADHD classification. SVM is the most commonly employed machine learning model for that purpose, followed by the ELM model. Many studies have embraced the SVM model and provided moderate results for fMRI-based ADHD classification. SVM is well-known for its effectiveness in dealing with high-dimensional data. It can be particularly effective in cases where a limited number of samples are available for training. In addition, its kernel functions make it a good choice for non-linear classification. It provides robust results by using a hyperplane to separate the data into classes. It can learn complex relationships and can produce good results. These features make it well-suited for fMRI data. Despite its downside of having high computational complexity, it is widely used for ADHD classification.
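To make the role of the kernel and the separating hyperplane more tangible, the sketch below evaluates the decision function of an already-trained kernel SVM, f(x) = Σ αᵢ yᵢ K(svᵢ, x) + b, where the sign of f(x) gives the predicted class. The support vectors, dual coefficients, bias, and gamma value are hypothetical; training itself (solving for the coefficients) is not shown.

```python
from math import exp


def linear_kernel(x, z):
    """Inner product, which yields a linear decision boundary."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel, which allows non-linear boundaries."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svm_decision(x, support_vectors, dual_coefs, bias, kernel):
    """Signed score of x; positive means the positive class (here: ADHD)."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical trained model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
dual_coefs = [-1.0, 1.0]  # alpha_i * y_i for control (-1) and ADHD (+1)
bias = 0.0

score_near_adhd = svm_decision([1.9, 2.1], support_vectors, dual_coefs, bias, rbf_kernel)
score_near_ctrl = svm_decision([0.1, -0.1], support_vectors, dual_coefs, bias, rbf_kernel)
print(score_near_adhd > 0, score_near_ctrl < 0)
```

Swapping `rbf_kernel` for `linear_kernel` changes only the similarity measure, which is why the same classifier family appears with linear, polynomial, and RBF variants across the reviewed studies.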

Table 7 Machine learning models used for ADHD classification

Despite the success of machine learning models for fMRI-based ADHD diagnosis, these models need manual feature engineering to produce good results. The choice of appropriate features is very important, and the same model may perform very differently when the dataset or the feature engineering approach changes [130]. This leads to a lack of generalizability for machine learning models. In addition, machine learning models are limited in their ability to learn complex relationships among the attributes of a dataset. Deep learning models can address most of these issues. They do not require a particular feature engineering approach and can extract features on their own during training. Deep learning has also been adopted for ADHD diagnosis, and several models have been designed in this regard. Figure 7 shows deep learning models and their performance concerning ADHD classification. The best results are reported using ResNet-50 in [47] and two AE models in [48, 66], with accuracy reaching 100% for ResNet-50 and 99.6% for the AE models.

Fig. 7 Deep learning models for ADHD classification

CNN models are well known for image processing tasks and are reported to obtain excellent performance in the medical domain [131,132,133]. A CNN can learn intricate relationships among attributes without the need for manual feature engineering. It can process large amounts of data to produce more accurate results, and it tends to produce more robust results than machine learning models. AE is the second most widely used model for ADHD classification. AE follows a self-supervised learning approach and is particularly useful when very little training data is available. Several other models are also used for ADHD classification, such as DBN, ResNet, LSTM, GRU, and ANN. Nonetheless, CNN and AE remain the first choice of researchers to discriminate between ADHD and control subjects.

Table 8 shows the most frequently used deep learning models for ADHD classification. For brevity, we have grouped different variants of a model into a single category. For example, 3D CNN, 4D CNN, SC-CNN, etc. are all grouped into the single category of CNN model. Similarly, the AE variants presented in different studies are grouped as one for simplicity. However, ensemble models that combine multiple deep learning models are listed under separate categories.

Table 8 Deep learning models used for ADHD classification

Figure 8 shows the performance of the machine learning and deep learning models used in the MRI-based ADHD diagnosis studies covered in this survey. While the accuracy varies significantly, between 56% and 100%, the best performance is reported using the machine learning models KNN and SVM, followed by the deep learning model CNN-BiLSTM. On average, machine learning models tend to show better performance when MRI data is used for binary ADHD classification.

Fig. 8 Machine learning and deep learning models used for MRI-based ADHD classification

The distribution and frequency of works employing machine learning and deep learning models in MRI-based ADHD classification are provided in Table 9. As with fMRI-based ADHD diagnosis, SVM is the leading machine learning model in this case as well. Nine of the 26 studies [74, 77, 79, 83, 90, 93, 98, 103, 104] use the SVM model. These studies adopt different SVM variations such as linear, radial basis function, and SVM ensemble. Among deep learning models, CNN and its variants like 2D CNN, 3D CNN, and 3-level CNN have been adopted by four studies. RF and hybrid models have been used in three and two studies, respectively, while other models like ELM, KNN, DNN, and BiLSTM have each been adopted in a single study.

Table 9 Machine learning and deep learning models used for MRI-based ADHD diagnosis

Although studies have adopted diverse metrics for evaluating machine learning and deep learning approaches, accuracy, specificity, and sensitivity remain the most commonly used. Accuracy is the proportion of all subjects, both positive and negative, that a model classifies correctly. Sensitivity and specificity are two other metrics widely used in disease diagnosis. Sensitivity indicates the ability of an approach to identify an individual with a disease as positive. High sensitivity means that there will be few false negatives, so fewer cases are missed. Specificity, on the other hand, indicates the ability of an approach to identify an individual without the disease as negative. High specificity means that there will be few false positives. Among the studies reviewed in this survey, accuracy is the predominant metric for both machine learning and deep learning models.
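The three metrics can be computed from the confusion-matrix counts, as in the minimal sketch below. The function names and the toy labels are illustrative assumptions (1 denotes ADHD, 0 denotes control); they are not taken from any reviewed study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity (NaN if undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # share of ADHD subjects detected
        "specificity": ratio(tn, tn + fp),  # share of controls correctly cleared
    }


# Hypothetical labels: 4 ADHD subjects followed by 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

In this toy case one ADHD subject is missed and two controls are flagged, so accuracy alone (0.7) hides the difference between sensitivity (0.75) and specificity (about 0.67).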

RQ5: Which feature extraction approaches proved to be most influential for ADHD diagnosis?

The choice of a particular feature depends on the machine learning model selected and the type of data used for training. In the machine learning paradigm, the selection of an appropriate feature extraction approach is critically important. In addition, a multi-feature approach often yields better results, as reported in the existing literature [130, 134, 135].

Table 10 shows the studies that utilized feature fusion approaches to obtain superior performance compared to using a single feature. It is observed that FC features are the basic features utilized for MRI-based ADHD diagnosis. However, FC features alone do not provide the desired level of accuracy, which has led researchers to look for other features that complement them. Consequently, a rich variety of features has been explored, including phenotypic, spatial, temporal, and long-term dependency (LTD) features. In addition, Gaussian mixture, Shannon entropy, mixup, and sliding window approaches have also been utilized. Figure 9 shows the accuracy provided by various feature fusion approaches. It is observed that combining imaging features like FC with non-imaging features such as phenotypic features produces better results, as reported by [63, 66].

Table 10 Commonly used fused features for ADHD classification

Fig. 9 Reported accuracy of various feature fusion approaches
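A simple form of feature fusion is to standardize each feature group and concatenate the groups per subject. The sketch below assumes this concatenation scheme for illustration only; the reviewed studies use a variety of fusion strategies, and the FC values and phenotypic columns (age, IQ) are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance.

    Constant columns carry no information and are mapped to 0.0.
    """
    scaled_columns = []
    for column in zip(*rows):
        mu, sigma = mean(column), pstdev(column)
        scaled_columns.append(
            [(value - mu) / sigma if sigma > 0 else 0.0 for value in column]
        )
    return [list(row) for row in zip(*scaled_columns)]


def fuse_features(imaging, phenotypic):
    """Concatenate standardized imaging and phenotypic features per subject."""
    if len(imaging) != len(phenotypic):
        raise ValueError("both feature sets must describe the same subjects")
    return [a + b for a, b in zip(zscore_columns(imaging), zscore_columns(phenotypic))]


# Hypothetical data: 3 subjects, 2 FC values each, plus [age, IQ].
fc = [[0.10, 0.50], [0.30, 0.40], [0.20, 0.60]]
pheno = [[9.0, 100.0], [11.0, 110.0], [10.0, 90.0]]
fused = fuse_features(fc, pheno)
```

Standardizing before concatenation keeps features with large numeric ranges, such as IQ, from dominating distance-based or margin-based classifiers.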

Another important avenue in feature engineering is reducing feature dimensionality. One challenge in MRI-based ADHD diagnosis is the large feature set combined with a small number of labeled instances, which often leads to poor training of machine learning models. As a result, dimensionality reduction becomes an attractive solution. Table 11 shows the performance of various dimensionality reduction approaches when used with machine learning models.

Table 11 Dimensionality reduction approaches for ADHD diagnosis
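As a minimal illustration of reducing a wide feature set, the sketch below keeps only the k features with the largest variance across subjects. This variance filter is an assumed, deliberately simple stand-in and is not claimed to be one of the approaches listed in Table 11; the feature matrix is hypothetical.

```python
from statistics import pvariance


def top_variance_features(rows, k):
    """Keep the k columns with the largest variance.

    Returns the kept column indices (in ascending order) and the reduced rows.
    """
    n_features = len(rows[0])
    if not 0 < k <= n_features:
        raise ValueError("k must be between 1 and the number of features")
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced


# Hypothetical matrix: 3 subjects x 4 features (column 0 is constant).
X = [
    [0.1, 5.0, 1.0, 0.30],
    [0.1, 3.0, 1.1, 0.90],
    [0.1, 9.0, 0.9, 0.10],
]
keep, X_reduced = top_variance_features(X, k=2)
```

A variance filter ignores the class labels and the feature scales, so in practice it is usually applied after standardization or replaced by a projection or supervised selection method.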

How can ADHD diagnosis using imaging-based approaches be improved?

Currently, machine learning and deep learning approaches act as assistive tools for medical experts to speed up the diagnosis process of ADHD. In Europe, the average ADHD diagnosis time is approximately 6 months, as the subjects, parents, and teachers are interviewed multiple times using different questionnaires. Machine and deep learning approaches utilize various types of data, like neuroimaging, head movement, body movement, and eye saccade speed and frequency, to estimate the probability of ADHD among subjects, thereby speeding up the diagnosis process. Suggesting how to improve the performance of existing approaches, particularly those utilizing fMRI, MRI, and fNIRS, requires examining the challenges and limitations of the machine and deep learning approaches that utilize these modalities.

fMRI is the leading neuroimaging modality and has been extensively studied in the context of analyzing brain functions and related disorders. It has been widely adopted because it is a non-invasive approach that provides high spatial resolution for investigating brain functions. In addition, it does not carry any radiation-related risks for the subjects. Despite these advantages, obtaining fMRI data is expensive, and so is labeling it, which leads to smaller datasets being available for experiments.

Choice between rs-fMRI and task-based fMRI

Existing studies choose between rs-fMRI and task-based fMRI. Early fMRI research predominantly focused on task-based data collection, where subjects are asked to perform an active task involving memory, attention, etc. In contrast, rs-fMRI involves scanning the brain of a subject who is presumably at rest. Research indicates that even in the resting state, the BOLD signals show spatial and temporal fluctuations [136, 137]. Using rs-fMRI data, it is possible to find interindividual and intraindividual differences in brain function, making it a suitable tool for studying the impact of trauma and other conditions on the brain.

Heterogeneity of data collection devices for rs-fMRI

Despite the potential and recent use of rs-fMRI for ADHD diagnosis, different acquisition methods exist, with 1.5T, 3T, or 7T scanners being used, which makes it difficult to compare the performance of various models. For rs-fMRI, a repetition time of between 1 and 3 s is followed, i.e., the scan is repeated every 1 to 3 s. Even so, scans are carried out with eyes open, eyes closed, or eyes fixed on a fixation cross. Although the differences between data gathered under these conditions are moderate, they are measurable [138]. The best results have been reported using the eye-fixation approach, yet it is difficult to follow and varies from one subject to another. In addition, the MRI apparatus does not contain any eye-tracking device, which makes it difficult to estimate how accurately a subject maintains fixation. Consequently, existing fMRI-based ADHD diagnosis studies predominantly follow the closed-eyes approach.

Limitations of task-based fMRI

Task-based fMRI has been widely used in cognitive neuroscience to analyze brain activation in response to different tasks. It has been used to study group-level effects like brain activity in response to different stimuli. In addition, it has recently been applied to analyzing across-subject correlations with variables like cognitive performance and behavior. Nonetheless, the assumption that regional activation is a stable, trait-like measure has recently been challenged by several studies [139,140,141,142]. Although much research has been done on trait stability among adults, very little work addresses the trait stability of task-based fMRI in children, which leaves results based on task-based fMRI in children open to challenge.

Availability of labeled data

One of the challenges of using machine learning models in general, and deep learning models in particular, is the availability of labeled data to train them. In the case of ADHD diagnosis, finding labeled fMRI data is difficult. Researchers predominantly use their own collected datasets for experiments. These datasets are small and not publicly available, which makes reproducing the research almost impossible. Following the example of the ADHD-200 initiative, more fMRI datasets must be made publicly available so that the performance of various approaches can be investigated on such benchmarks.

Another course of action is to utilize data augmentation approaches such as the synthetic minority oversampling technique (SMOTE), where samples from the minority class are used to generate synthetic samples for class balancing. Existing works report elevated performance of machine learning and deep learning models using data augmentation approaches [143,144,145].
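The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched as below. This is a simplified, assumption-laden illustration (the function name, the default `k`, and the toy data are hypothetical), not the reference algorithm; full implementations are available in libraries such as imbalanced-learn.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples on segments between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [sample for sample in minority if sample is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority class with three 2-D feature vectors.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_like_oversample(minority, n_new=4)
```

Because every synthetic point lies between two real minority samples, oversampling should be applied to the training split only; otherwise information from test subjects leaks into training.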

Exploring novel technologies

Although fMRI is currently a widely used technology for ADHD diagnosis, novel technologies need to be explored for improved performance. For example, transcranial magnetic stimulation (TMS) has emerged as a promising technology for studying neurological and psychiatric disorders. Being a recent technology, TMS is not yet well investigated, although it has been applied together with fMRI [146]. Utilizing it with fMRI can provide valuable insights for diagnosing mental disorders. Similar to TMS, other approaches can be combined with fMRI to improve ADHD diagnosis accuracy. For example, physical movement data such as head movement and eye movement patterns can be combined with fMRI data to investigate their efficacy.

It is reported that the limited diagnostic power of fMRI creates bottlenecks for analyzing brain functional connectivity networks. At the same time, using ultra-high magnetic fields may provide better resolution to study brain functions [147]. Higher image resolution is associated with higher specificity and sensitivity, but magnetic fields above 7 Tesla may bring the challenge of high variability. Further investigation into ultra-high-field fMRI may prove influential in better understanding functional connections for ADHD diagnosis.

Challenges of functional brain organization

The studies that utilize fMRI data analyze the functional connectivity of the brain, which is linked to various brain disorders. However, to reduce the complexity of such data, the analysis is based on features in a lower dimensional space: the fMRI data is transformed into a lower dimensional space for analysis of the brain representation. The diversity of brain representations used in existing works makes it very difficult to reproduce the research, thereby limiting the scope of research findings [148]. Standardizing the use of particular brain representations can potentially overcome this issue. Furthermore, open-source projects sharing details of model implementations and brain representations can also broaden the scope of research findings.
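One widely used lower dimensional brain representation, assumed here purely for illustration, is a functional connectivity vector: the Pearson correlations between every pair of region-of-interest (ROI) time series. With N ROIs this yields N(N-1)/2 features per subject regardless of scan length. The ROI series below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def connectivity_vector(roi_series):
    """Flatten the upper triangle of the ROI-by-ROI correlation matrix."""
    n = len(roi_series)
    return [
        pearson(roi_series[i], roi_series[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


# Hypothetical subject: 3 ROIs, 4 time points each.
roi_series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # rises with ROI 0
    [4.0, 3.0, 2.0, 1.0],  # falls as ROI 0 rises
]
fc_vector = connectivity_vector(roi_series)
```

The choice of atlas that defines the ROIs changes both the length and the meaning of this vector, which is one reason results built on different representations are hard to compare.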

Limitations of machine and deep learning approaches

A review of existing works on ADHD diagnosis indicates that machine and deep learning models can be utilized to obtain an automated and accurate diagnosis of ADHD, yet the results provided by such models are not universal. That is, the performance of the models varies from one dataset to another. Similarly, the choice of an appropriate feature selection method greatly influences the outcome of such models. Often, extensive fine-tuning of various parameters is needed to obtain the best results.

Changing the underlying data is expected to substantially affect the performance of these models, indicating their lack of generalizability. Although such models assist medical experts in making the final decision, a higher classification accuracy would increase medical experts' reliance on such systems. For example, the data in the ADHD-200 dataset are collected from various sites, and no existing approach provides a similar ADHD diagnosis accuracy for all sites. The study [70] utilizes data augmentation as one solution to this issue and reports that the sliding window method can improve ADHD classification on cross-site data.
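Sliding-window augmentation can be sketched as cutting each subject's ROI time series into overlapping windows, with every window inheriting the subject's label. The window length, step, and toy data below are hypothetical and are not the settings used in [70].

```python
def sliding_windows(series, window, step):
    """Return consecutive windows of length `window`, advancing by `step`."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, step)
    ]


def augment_subject(roi_series, label, window, step):
    """Split one subject into several labeled samples, aligned across ROIs.

    Assumes every ROI series of the subject has the same length.
    """
    per_roi = [sliding_windows(ts, window, step) for ts in roi_series]
    n_windows = len(per_roi[0]) if per_roi else 0
    return [([roi[w] for roi in per_roi], label) for w in range(n_windows)]


# Hypothetical subject: 2 ROIs, 10 time points each, labeled ADHD (1).
subject = [list(range(10)), list(range(10, 20))]
samples = augment_subject(subject, label=1, window=4, step=2)
```

Windows cut from the same subject are highly correlated, so all windows of a subject should stay in the same train or test split when evaluating a model.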

Existing models for ADHD diagnosis cannot handle complex data representations from fMRI data. To avoid computational complexity and reduce processing time, brain functional connectivity network data is transformed into a lower dimensional feature space, which may lead to information loss. Dedicated efforts are needed to build more efficient machine learning and deep learning models that can handle complex brain representations without the need to transform them into a lower dimensional space.

Another challenge for machine learning and deep learning models is to increase diagnostic accuracy. Currently, such models serve to assist medical experts in accelerating their decisions and are not trusted for autonomous decisions. One hurdle is the lack of trust in the diagnosis results of such models. Machine learning and deep learning models lack generalizability, and their outcome is severely affected in the case of cross-site data. A more trustworthy and generalizable model is needed to overcome this issue.

Utility of neuroimaging-based ADHD diagnosis

Neuroimaging has greatly helped in understanding the differences in functional connectivity of various brain regions between ADHD and control subjects. The exact cause of ADHD has not yet been determined; neurologic, biological, and environmental factors are reported to interplay in this disorder [149, 150]. Several studies [151, 152] report a higher risk of ADHD for children whose first-degree relatives have ADHD. There is no single diagnosis technique for ADHD, and diagnosis often involves more than one type of test. For clinical practice, the United States (US) follows the DSM-V criteria, which involve field trials for assessing various symptoms among children with various sub-types of ADHD. Clinical diagnosis is more conclusive and deterministic compared to non-clinical approaches like neuroimaging.

Contrary to clinical diagnosis methods, which provide more conclusive results, neuroimaging and other modes of ADHD diagnosis function as assistive tools to help physicians speed up the diagnosis process. As reported in [153], the average time of ADHD diagnosis (from the first visit to the physician to the diagnosis) is 10.8 months in Europe. The shortest duration, 3.0 months, was reported for Italy, while the United Kingdom (UK) has the longest duration of 18.3 months. As such, the use of machine learning and deep learning integrated with various imaging tools aims to work as a “decision aid” and is reported in [153] to reduce the diagnosis duration substantially.

Limitations of current study

Although this study follows a systematic process, PRISMA, it may have the following limitations:

  • Great care has been taken while formulating the search string for the article search. Since the survey is about ADHD diagnosis approaches, “ADHD” was the central point of the query. To ensure that all related articles were found, we used different variations of ADHD like “attention deficit-hyperactivity disorder”, “attention deficit/hyperactivity disorder”, “ADHD”, “AD/HD”, “AD-HD”, etc. Even so, there is a probability that some relevant articles were missed due to the absence of such terms in their titles.

  • Survey papers always carry some probability of bias. Even though article selection was handled with care, with multiple authors applying the inclusion and exclusion criteria, the process may have some bias.

  • For article classification, the core technology or approach is considered, for example, MRI or fMRI. However, some articles may have been placed in the wrong group as a result, particularly those that involve more than one technology or approach.

Conclusions and future directions

ADHD is a neurological condition found in children and adults, with a higher prevalence in children. Although it is not reversible, proper diagnosis and appropriate treatment and care can ensure a normal life for individuals with ADHD. In this regard, a large number of technologies and approaches have been researched. This survey aims to provide a comprehensive evaluation of neuroimaging approaches like fMRI, one of the leading modalities studied for ADHD diagnosis, and of approaches based on fMRI data. The fMRI-based approaches are discussed with a particular focus on those that utilize machine learning or deep learning models for automated diagnosis of ADHD. The objectives of this study include exploring the leading machine learning and deep learning approaches applied to fMRI data, the feature engineering pipelines followed for ADHD detection, the datasets used for experiments, and finally the pros and cons of each approach and how they can be further improved.

While exploring these aspects of existing research, challenges and limitations of existing works have been identified, and a discussion is provided on how to improve existing approaches for better outcomes in ADHD diagnosis. This survey serves as a foundation for those who embark on ADHD diagnosis using machine learning approaches. Using the insights provided in this survey, one can find suitable models for ADHD diagnosis as well as available datasets recorded from different sensors, thereby leading to novel and more accurate diagnostic frameworks.

This study followed a systematic approach for article search from various repositories and outlined inclusion and exclusion criteria to include quality articles from existing works. Multiple authors selected articles to reduce bias and ensure quality; even so, the probability of missing good quality articles cannot be ruled out. Similarly, article classification may be prone to error, and the same article may appear in more than one category. In light of these discussions, we focus on the following points for future work:

  • Standardization for rs-fMRI and task-related fMRI data for ADHD diagnosis. Existing research predominantly used task-related fMRI; further research is needed for rs-fMRI.

  • The heterogeneity of data-collecting devices is a challenge that reduces the generalizability of diagnosis approaches. For reproducibility and to broaden the scope of research findings, standards on the use of particular data collection modalities are needed.

  • Except for the ADHD-200 initiative, large-size public data are not available which limits the research efforts in this domain. Large fMRI datasets should be made publicly available to accelerate research efforts.

  • Currently, fMRI is the leading neuroimaging technology used for ADHD diagnosis and further novel technologies need to be explored. In addition, physiological traits like eye and head-related data can be incorporated with fMRI data for further investigation.

  • Various brain representations are used in existing research, thereby reducing the reproducibility of the research and the scope of their research findings.

  • Novel and sophisticated models should be developed that can deal with complex brain functional connectivity networks from fMRI data.

Although fMRI, combined with machine learning models, is a widely used and indeed leading neuroimaging technology for ADHD diagnosis, better brain representations and more sophisticated models can increase the reliability of both fMRI and machine learning approaches.

Availability of data and materials

The data can be requested from the corresponding authors.

Code availability

Not applicable.

References

  1. Wolraich ML, Chan E, Froehlich T, Lynch RL, Bax A, Redwine ST, Ihyembe D, Hagan JF. ADHD diagnosis and treatment guidelines: a historical perspective. Pediatrics. 2019;144(4):e20191682.

  2. Fayyad J, Sampson NA, Hwang I, Adamowski T, Aguilar-Gaxiola S, Al-Hamzawi A, Andrade LH, Borges G, Girolamo G, Florescu S. The descriptive epidemiology of DSM-IV adult ADHD in the world health organization world mental health surveys. ADHD Attent Deficit Hyperactivity Disord. 2017;9:47–65.

  3. Magnin E, Maurs C. Attention-deficit/hyperactivity disorder during adulthood. Revue neurologique. 2017;173(7–8):506–15.

  4. Danielson ML, Bitsko RH, Ghandour RM, Holbrook JR, Kogan MD, Blumberg SJ. Prevalence of parent-reported ADHD diagnosis and associated treatment among US children and adolescents, 2016. J Clin Child Adolesc Psychol. 2018;47(2):199–212.

  5. Wolraich ML, McKeown RE, Visser SN, Bard D, Cuffe S, Neas B, Geryk LL, Doffing M, Bottai M, Abramowitz AJ. The prevalence of ADHD: its diagnosis and treatment in four school districts across two states. J Attent Disord. 2014;18(7):563–75.

  6. Blum K, Chen AL-C, Braverman ER, Comings DE, Chen TJ, Arcuri V, Blum SH, Downs BW, Waite RL, Notaro A. Attention-deficit-hyperactivity disorder and reward deficiency syndrome. Neuropsychiatr Dis Treat. 2008;4(5):893–918.

  7. Gadow KD, Drabick DA, Loney J, Sprafkin J, Salisbury H, Azizian A, Schwartz J. Comparison of ADHD symptom subtypes as source-specific syndromes. J Child Psychol Psychiatry. 2004;45(6):1135–49.

  8. Matthews PM, Jezzard P. Functional magnetic resonance imaging. J Neurol Neurosurg Psychiatry. 2004;75(1):6–12.

  9. Di Prospero ND, Kim S, Yassa MA. Magnetic resonance imaging biomarkers for cognitive decline in down syndrome. In: The neurobiology of aging and Alzheimer disease in down syndrome. Elsevier; 2022. p. 149–72.

  10. Le Bihan D. Looking into the functional architecture of the brain with diffusion MRI. Nat Rev Neurosci. 2003;4(6):469–80.

  11. Birn RM, Smith MA, Jones TB, Bandettini PA. The respiration response function: the temporal dynamics of FMRI signal fluctuations related to changes in respiration. Neuroimage. 2008;40(2):644–54.

  12. Chang C, Cunningham JP, Glover GH. Influence of heart rate on the bold signal: the cardiac response function. Neuroimage. 2009;44(3):857–69.

  13. Glover GH. Overview of functional magnetic resonance imaging. Neurosurg Clin. 2011;22(2):133–9.

  14. Loh HW, Ooi CP, Barua PD, Palmer EE, Molinari F, Acharya UR. Automated detection of ADHD: current trends and future perspective. Comput Biol Med. 2022;146: 105525.

  15. Zhang-James Y, Razavi AS, Hoogman M, Franke B, Faraone SV. Machine learning and MRI-based diagnostic models for ADHD: are we there yet? J Attent Disord. 2023;27(4):335–53.

  16. Kitchenham B. Procedures for performing systematic reviews. Keele, UK, Keele University. 2004;33(2004):1–26.

  17. Sidhu GS, Asgarian N, Greiner R, Brown MR. Kernel principal component analysis for dimensionality reduction in FMRI-based diagnosis of ADHD. Front Syst Neurosci. 2012;6:74.

  18. Bohland JW, Saperstein S, Pereira F, Rapin J, Grady L. Network, anatomical, and non-imaging measures for the prediction of ADHD diagnosis in individual subjects. Front Syst Neurosci. 2012;6:78.

  19. Colby JB, Rudie JD, Brown JA, Douglas PK, Cohen MS, Shehzad Z. Insights into multimodal imaging classification of ADHD. Front Syst Neurosci. 2012;6:59.

  20. Kuang D, He L. Classification on ADHD with deep learning. In: 2014 international conference on cloud computing and Big Data. IEEE; 2014. p. 27–32.

  21. Farzi S, Kianian S, Rastkhadive I. Diagnosis of attention deficit hyperactivity disorder using deep belief network based on greedy approach. In: 2017 5th international symposium on computational and business intelligence (ISCBI). IEEE; 2017. p. 96–9.

  22. Rezaei M, Zare H, Hakimdavoodi H, Nasseri S, Hebrani P. Classification of drug-naive children with attention-deficit/hyperactivity disorder from typical development controls using resting-state fMRI and graph theoretical approach. Front Hum Neurosci. 2022;16:948706.

  23. Wang J, Wang X, Xia M, Liao X, Evans A, He Y. Gretna: a graph theoretical network analysis toolbox for imaging connectomics. Front Hum Neurosci. 2015;9:386.

  24. Singh J, Kaur G, Kapoor N. Classification of attention deficit hyperactivity disorder using machine learning. In: 2022 IEEE 3rd global conference for advancement in technology (GCAT). IEEE; 2022. p. 1–8.

  25. Lakhan A, Hamouda H, Abdulkareem KH, Alyahya S, Mohammed MA. Digital healthcare framework for patients with disabilities based on deep federated learning schemes. Comput Biol Med. 2024;169: 107845.

  26. Lakhan A, Mohammed MA, Abdulkareem KH, Hamouda H, Alyahya S. Autism spectrum disorder detection framework for children based on federated learning integrated CNN-LSTM. Comput Biol Med. 2023;166: 107539.

  27. Ibrahim AM, Mohammed MA. A comprehensive review on advancements in artificial intelligence approaches and future perspectives for early diagnosis of parkinson’s disease. Int J Math Stat Comput Sci. 2024;2:173–82.

  28. Preetha P, Mallika R. Normalization and deep learning based attention deficit hyperactivity disorder classification. J Intell Fuzzy Syst. 2021;40(4):7613–21.

  29. Hart H, Chantiluke K, Cubillo AI, Smith AB, Simmons A, Brammer MJ, Marquand AF, Rubia K. Pattern classification of response inhibition in ADHD: toward the development of neurobiological markers for ADHD. Human Brain Mapping. 2014;35(7):3083–94.

  30. Qiang N, Dong Q, Sun Y, Ge B, Liu T. Deep variational autoencoder for modeling functional brain networks and ADHD identification. In: 2020 IEEE 17th international symposium on biomedical imaging (ISBI). IEEE; 2020. p. 554–7.

  31. Shoeibi A, Ghassemi N, Khodatars M, Moridian P, Khosravi A, Zare A, Gorriz JM, Chale-Chale AH, Khadem A, Rajendra Acharya U. Automatic diagnosis of schizophrenia and attention deficit hyperactivity disorder in RS-FMRI modality using convolutional autoencoder model and interval type-2 fuzzy regression. Cogn Neurodyn. 2023;17(6):1501–23.

  32. Ke H, Wang F, Ma H, He Z. ADHD identification and its interpretation of functional connectivity using deep self-attention factorization. Knowl Based Syst. 2022;250:109082.

  33. K UR, PAP. Hybrid deep learning classification model for attention-deficit-hyperactivity disorder using functional magnetic resonance imaging. In: 2023 international conference on intelligent systems for communication, IoT and security (ICISCoIS). 2023. p. 688–93. https://doi.org/10.1109/ICISCoIS56541.2023.10100467

  34. Uyulan C, Erguzel TT, Turk O, Farhad S, Metin B, Tarhan N. A class activation map-based interpretable transfer learning model for automated detection of ADHD from FMRI data. Clin EEG Neurosci. 2023;54(2):151–9.

  35. Li Y, Lian Z, Li M, Liu Z, Xiao L, Wei Z. ELM-based classification of ADHD patients using a novel local feature extraction method. In: 2016 IEEE international conference on bioinformatics and biomedicine (BIBM). IEEE; 2016. p. 489–92.

  36. Zhang X, Guo L, Li X, Zhang T, Zhu D, Li K, Chen H, Lv J, Jin C, Zhao Q. Characterization of task-free and task-performance brain states via functional connectome patterns. Med Image Anal. 2013;17(8):1106–22.

  37. Miao B, Zhang Y. A feature selection method for classification of ADHD. In: 2017 4th international conference on information, cybernetics and computational social systems (ICCSS). IEEE; 2017. p. 21–5.

  38. Aradhya AM, Subbaraju V, Sundaram S, Sundararajan N. Regularized spatial filtering method (r-sfm) for detection of attention deficit hyperactivity disorder (ADHD) from resting-state functional magnetic resonance imaging (rs-fmri). In: 2018 40th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE; 2018. p. 5541–4.

  39. Huang Z-A, Liu R, Tan KC. Multi-task learning for efficient diagnosis of ASD and ADHD using resting-state FMRI data. In: 2020 international joint conference on neural networks (IJCNN). IEEE; 2020. p. 1–7.

  40. Aradhya AM, Sundaram S, Pratama M. Metaheuristic spatial transformation (mst) for accurate detection of attention deficit hyperactivity disorder (ADHD) using RS-FMRI. In: 2020 42nd annual international conference of the IEEE engineering in medicine & biology society (EMBC). IEEE; 2020. p. 2829–32.

  41. Riaz A, Asad M, Alonso E, Slabaugh G. DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using FMRI. J Neurosci Methods. 2020;335: 108506.

  42. Jha RR, Nigam A, Bhavsar A, Jaswal G, Pathak SK, Kumar R. Hlgsnet: Hierarchical and lightweight graph siamese network with triplet loss for FMRI-based classification of ADHD. In: 2020 international joint conference on neural networks (IJCNN). IEEE; 2020. p. 1–7.

  43. Salman SA, Lian Z, Saleem M, Zhang Y. Functional connectivity based classification of ADHD using different atlases. In: 2020 IEEE international conference on progress in informatics and computing (PIC). IEEE; 2020. p. 62–6.

  44. Liu S, Zhao L, Wang X, Xin Q, Zhao J, Guttery DS, Zhang Y-D. Deep spatio-temporal representation and ensemble classification for attention deficit/hyperactivity disorder. IEEE Trans Neural Syst Rehabil Eng. 2020;29:1–10.

  45. Chen Y, Tang Y, Wang C, Liu X, Zhao L, Wang Z. ADHD classification by dual subspace learning using resting-state functional connectivity. Artif Intell Med. 2020;103: 101786.

  46. Tang Y, Li X, Chen Y, Zhong Y, Jiang A, Wang C. High-accuracy classification of attention deficit hyperactivity disorder with L2,1-norm linear discriminant analysis and binary hypothesis testing. IEEE Access. 2020;8:56228–37.

  47. Cicek G, Akan A. Deep learning approach versus traditional machine learning for ADHD classification. In: 2021 Medical Technologies Congress (TIPTEKNO). IEEE; 2021. p. 1–4.

  48. Tang Y, Sun J, Wang C, Zhong Y, Jiang A, Liu G, Liu X. ADHD classification using auto-encoding neural network and binary hypothesis testing. Artif Intell Med. 2022;123: 102209.

  49. Wang D, Hong D, Wu Q. Attention deficit hyperactivity disorder classification based on deep learning. IEEE/ACM Trans Comput Biol Bioinf. 2022;20(2):1581–6.

  50. Saurabh S, Gupta P. Deep learning-based modified bidirectional LSTM network for classification of ADHD disorder. Arab J Sci Eng. 2024;49(3):3009–26.

  51. Rahadian BA, Dewi C, Rahayudi B. The performance of genetic algorithm learning vector quantization 2 neural network on identification of the types of attention deficit hyperactivity disorder. In: 2017 international conference on sustainable information engineering and technology (SIET). IEEE; 2017. p. 337–41.

  52. Deshpande G, Wang P, Rangaprakash D, Wilamowski B. Fully connected cascade artificial neural network architecture for attention deficit hyperactivity disorder classification from functional magnetic resonance imaging data. IEEE Trans Cybern. 2015;45(12):2668–79.

  53. Saha P, Sarkar D. Characterization and classification of ADHD subtypes: an approach based on the nodal distribution of eigenvector centrality and classification tree model. Child Psychiatry Human Dev. 2022;55(3):622–34.

  54. Gao Y, Ni H, Chen Y, Tang Y, Liu X. Subtype classification of attention deficit hyperactivity disorder with hierarchical binary hypothesis testing framework. J Neural Eng. 2023;20(5): 056015.

  55. Park B-Y, Kim M, Seo J, Lee J-M, Park H. Connectivity analysis and feature classification in attention deficit hyperactivity disorder sub-types: a task functional magnetic resonance imaging study. Brain Topogr. 2016;29:429–39.

  56. Riaz A, Asad M, Alonso E, Slabaugh G. Fusion of FMRI and non-imaging data for ADHD classification. Comput Med Imaging Graphics. 2018;65:115–28.

  57. Frey BJ, Dueck D. Clustering by passing messages between data points. Science. 2007;315(5814):972–6.

  58. Rodriguez A, Laio A. Clustering by fast search and find of density peaks. Science. 2014;344(6191):1492–6.

  59. Mao Z, Su Y, Xu G, Wang X, Huang Y, Yue W, Sun L, Xiong N. Spatio-temporal deep learning method for ADHD FMRI classification. Inf Sci. 2019;499:1–11.

  60. Miao B, Zhang L, Guan J, Meng Q, Zhang Y. Classification of ADHD individuals and neurotypicals using reliable relief: a resting-state study. IEEE Access. 2019;7:62163–71.

  61. Yao D, Sun H, Guo X, Calhoun VD, Sun L, Sui J. ADHD classification within and cross cohort using an ensembled feature selection framework. In: 2019 IEEE 16th international symposium on biomedical imaging (ISBI 2019). IEEE; 2019. p. 1265–9.

  62. Shao L, Zhang D, Du H, Fu D. Deep forest in ADHD data classification. IEEE Access. 2019;7:137913–9.

  63. Gao M-S, Tsai F-S, Lee C-C. Learning a phenotypic-attribute attentional brain connectivity embedding for ADHD classification using RS-FMRI. In: 2020 42nd annual international conference of the IEEE engineering in medicine & biology society (EMBC). IEEE; 2020. p. 5472–5.

  64. Liu R, Huang Z-a, Jiang M, Tan KC. Multi-LSTM networks for accurate classification of attention deficit hyperactivity disorder from resting-state FMRI data. In: 2020 2nd international conference on industrial artificial intelligence (IAI). IEEE; 2020. p. 1–6.

  65. Zhang T, Li C, Li P, Peng Y, Kang X, Jiang C, Li F, Zhu X, Yao D, Biswal B. Separated channel attention convolutional neural network (SC-CNN-attention) to identify ADHD in multi-site RS-FMRI dataset. Entropy. 2020;22(8):893.

  66. Tang Y, Jiang J, Li M, Chen Y, Meng X. ADHD classification via auto-encoding network with non-imaging data fusion. In: 2021 Asia-pacific signal and information processing association annual summit and conference (APSIPA ASC). IEEE; 2021. p. 1328–32.

  67. Qin Y, Lou Y, Huang Y, Chen R, Yue W. An ensemble deep learning approach combining phenotypic data and FMRI for ADHD diagnosis. J Signal Process Syst. 2022;94(11):1269–81.

  68. Niu Y, Huang F, Zhou H, Peng J. Deep spatio-temporal method for ADHD classification using resting-state FMRI. In: 2022 IEEE 34th international conference on tools with artificial intelligence (ICTAI). IEEE; 2022. p. 1082–7.

  69. Qiang N, Dong Q, Liang H, Ge B, Zhang S, Zhang C, Gao J, Sun Y. A novel ADHD classification method based on resting state temporal templates (RSTT) using spatiotemporal attention auto-encoder. Neural Comput Appl. 2022;34(10):7815–33.

  70. Pei S, Wang C, Cao S, Lv Z. Data augmentation for FMRI-based functional connectivity and its application to cross-site ADHD classification. IEEE Trans Instrum Meas. 2022;72:1–15.

  71. Igual L, Soliva JC, Escalera S, Gimeno R, Vilarroya O, Radeva P. Automatic brain caudate nuclei segmentation and classification in diagnostic of attention-deficit/hyperactivity disorder. Comput Med Imaging Graphics. 2012;36(8):591–600.

  72. Peng X, Lin P, Zhang T, Wang J. Extreme learning machine-based classification of ADHD using brain structural MRI data. PLoS ONE. 2013;8(11):79476.

  73. Wang P, Zhu D, Li X, Chen H, Jiang X, Sun L, Cao Q, An L, Liu T, Wang Y. Identifying functional connectomics abnormality in attention deficit hyperactivity disorder. In: 2013 IEEE 10th international symposium on biomedical imaging. IEEE; 2013. p. 544–7.

  74. Johnston BA, Mwangi B, Matthews K, Coghill D, Konrad K, Steele JD. Brainstem abnormalities in attention deficit hyperactivity disorder support high accuracy individual diagnostic classification. Human Brain Map. 2014;35(10):5179–89.

  75. Sachnev V. An efficient classification scheme for ADHD problem based on binary coded genetic algorithm and McFIS. In: 2015 international conference on cognitive computing and information processing (CCIP). IEEE; 2015. p. 1–6.

  76. Qureshi MNI, Min B, Jo HJ, Lee B. Multiclass classification for the differential diagnosis on the ADHD subtypes using recursive feature elimination and hierarchical extreme learning machine: structural MRI study. PLoS ONE. 2016;11(8):0160697.

  77. Qureshi MNI, Lee B. Classification of ADHD subgroup with recursive feature elimination for structural brain MRI. In: 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE; 2016. p. 5929–32.

  78. Zou L, Zheng J, Miao C, Mckeown MJ, Wang ZJ. 3D CNN based automatic diagnosis of attention deficit hyperactivity disorder using functional and structural MRI. IEEE Access. 2017;5:23626–36.

  79. Chaim-Avancini T, Doshi J, Zanetti M, Erus G, Silva M, Duran F, Cavallet M, Serpa M, Caetano S, Louza M. Neurobiological support to the diagnosis of ADHD in stimulant-naïve adults: pattern recognition analyses of MRI data. Acta Psychiatrica Scand. 2017;136(6):623–36.

  80. Zhu L, Zhang L, Han Y, Zeng Q, Chang W. Study of attention deficit/hyperactivity disorder classification based on convolutional neural networks. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (J Biomed Eng). 2017;34(1):99–105.

  81. Cicek G, Akan A, Metin B. Detection of attention deficit hyperactivity disorder using local and global features. In: 2018 medical technologies national congress (TIPTEKNO). IEEE; 2018. p. 1–4.

  82. Wang X-H, Jiao Y, Li L. Diagnostic model for attention-deficit hyperactivity disorder based on interregional morphological connectivity. Neurosci Lett. 2018;685:30–4.

  83. Zhang Y, Tang Y, Chen Y, Zhou L, Wang C. ADHD classification by feature space separation with sparse representation. In: 2018 IEEE 23rd international conference on digital signal processing (DSP). IEEE; 2018. p. 1–5.

  84. Kushki A, Anagnostou E, Hammill C, Duez P, Brian J, Iaboni A, Schachar R, Crosbie J, Arnold P, Lerch JP. Examining overlap and homogeneity in ASD, ADHD, and OCD: a data-driven, diagnosis-agnostic approach. Transl Psychiatry. 2019;9(1):318.

  85. Sachnev V, Suresh S, Sundararajan N, Mahanand BS, Azeem MW, Saraswathi S. Multi-region risk-sensitive cognitive ensembler for accurate detection of attention-deficit/hyperactivity disorder. Cogn Comput. 2019;11:545–59.

  86. Zhu L, Chang W. Application of deep convolutional neural networks in attention-deficit/hyperactivity disorder classification: data augmentation and convolutional neural network transfer learning. J Med Imaging Health Inf. 2019;9(8):1717–24.

  87. Wang T, Kamata S-i. Classification of structural MRI images in ADHD using 3D fractal dimension complexity map. In: 2019 IEEE international conference on image processing (ICIP). IEEE; 2019. p. 215–9.

  88. Abdolmaleki S, Abadeh MS. Brain MR image classification for ADHD diagnosis using deep neural networks. In: 2020 international conference on machine vision and image processing (MVIP). IEEE; 2020. p. 1–5.

  89. Huang Y-L, Hsieh W-T, Yang H-C, Lee C-C. Conditional domain adversarial transfer for robust cross-site ADHD classification using functional MRI. In: ICASSP 2020-2020 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE; 2020. p. 1190–4.

  90. Lin C, Qiu J, Hou K, Lu W, Lu W, Liu X, Qiu J, Shi L. Structural MRI-based radiomics and machine learning for the classification of attention-deficit/hyperactivity disorder subtypes. Med Phys. 2020;47:554–554.

  91. McNorgan C, Judson C, Handzlik D, Holden JG. Linking ADHD and behavioral assessment through identification of shared diagnostic task-based functional connections. Front Physiol. 2020;11: 583005.

  92. Kautzky A, Vanicek T, Philippe C, Kranz G, Wadsak W, Mitterhauser M, Hartmann A, Hahn A, Hacker M, Rujescu D. Machine learning classification of ADHD and HC by multimodal serotonergic data. Transl Psychiatry. 2020;10(1):104.

  93. Zhou X, Lin Q, Gui Y, Wang Z, Liu M, Lu H. Multimodal MR images-based diagnosis of early adolescent attention-deficit/hyperactivity disorder using multiple kernel learning. Front Neurosci. 2021;15: 710133.

  94. Khullar V, Salgotra K, Singh HP, Sharma DP. Deep learning-based binary classification of ADHD using resting state MR images. Augment Human Res. 2021;6(1):5.

  95. Jacobs GR, Voineskos AN, Hawco C, Stefanik L, Forde NJ, Dickie EW, Lai M-C, Szatmari P, Schachar R, Crosbie J. Integration of brain and behavior measures for identification of data-driven groups cutting across children with ASD, ADHD, or OCD. Neuropsychopharmacology. 2021;46(3):643–53.

  96. Doi H, Kanai C, Ohta H. Transdiagnostic and sex differences in cognitive profiles of autism spectrum disorder and attention-deficit/hyperactivity disorder. Autism Res. 2022;15(6):1130–41.

  97. Zhang M, Huang Y, Jiao J, Yuan D, Hu X, Yang P, Zhang R, Wen L, Situ M, Cai J. Transdiagnostic symptom subtypes across autism spectrum disorders and attention deficit hyperactivity disorder: validated by measures of neurocognition and structural connectivity. BMC Psychiatry. 2022;22(1):102.

  98. Lohani DC, Rana B. ADHD diagnosis using structural brain MRI and personal characteristic data with machine learning framework. Psychiatry Res Neuroimaging. 2023;334: 111689.

  99. Priyanka R, Komarina R, Priya PA. MRI segmentation of human brain for diagnosis of ADHD. In: 2023 international conference on recent advances in electrical, electronics, ubiquitous communication, and computational intelligence (RAEEUCCI). IEEE; 2023. p. 1–7.

  100. Abedinzadeh Torghabeh F, Hosseini SA, Modaresnia Y. Potential biomarker for early detection of ADHD using phase-based brain connectivity and graph theory. Phys Eng Sci Med. 2023;46(4):1447–65.

  101. Wang J, Liao W, Jin X. Classification of ADHD using FNIRS signals based on functional connectivity and interval features. In: 2021 6th international conference on computational intelligence and applications (ICCIA). IEEE; 2021. p. 113–7.

  102. Gu Y, Miao S, Yang J, Li X. ADHD children identification with multiview feature fusion of FNIRS signals. IEEE Sens J. 2022;22(13):13536–43.

  103. Shin J, Konnai S, Maniruzzaman M, Hasan MAM, Hirooka K, Megumi A, Yasumura A. Identifying ADHD for children with coexisting ASD from FNIRS signals using deep learning approach. IEEE Access. 2023. https://doi.org/10.1109/ACCESS.2023.3299960.

  104. Crippa A, Salvatore C, Molteni E, Mauri M, Molteni M, Nobile M, Castiglioni I. The utility of a computerized algorithm based on a multi-domain profile of measures for the diagnosis of attention deficit/hyperactivity disorder. Front Psychiatry. 2017;8: 296167.

  105. Weyandt L, Weyandt LL. The physiological bases of cognitive and behavioral disorders: foundations of psychological and neurodegenerative disorders. Routledge; 2006.

  106. Rubia K. The dynamic approach to neurodevelopmental psychiatric disorders: use of FMRI combined with neuropsychology to elucidate the dynamics of psychiatric disorders, exemplified in ADHD and schizophrenia. Behav Brain Res. 2002;130(1–2):47–56.

  107. Rubia K, Halari R, Cubillo A, Mohammad A-M, Brammer M, Taylor E. Methylphenidate normalises activation and functional connectivity deficits in attention and motivation networks in medication-naive children with ADHD during a rewarded continuous performance task. Neuropharmacology. 2009;57(7–8):640–52.

  108. Rubia K, Halari R, Smith AB, Mohammad M, Scott S, Brammer MJ. Shared and disorder-specific prefrontal abnormalities in boys with pure attention-deficit/hyperactivity disorder compared to boys with pure CD during interference inhibition and attention allocation. J Child Psychol Psychiatry. 2009;50(6):669–78.

  109. Cubillo A, Halari R, Ecker C, Giampietro V, Taylor E, Rubia K. Reduced activation and inter-regional functional connectivity of fronto-striatal networks in adults with childhood attention-deficit hyperactivity disorder (ADHD) and persisting symptoms during tasks of motor inhibition and cognitive switching. J Psychiatr Res. 2010;44(10):629–39.

  110. Depue BE, Burgess GC, Willcutt EG, Ruzic L, Banich M. Inhibitory control of memory retrieval and motor processing associated with the right lateral prefrontal cortex: evidence from deficits in individuals with ADHD. Neuropsychologia. 2010;48(13):3909–17.

  111. Konrad A, Dielentheis TF, El Masri D, Bayerl M, Fehr C, Gesierich T, Vucurevic G, Stoeter P, Winterer G. Disturbed structural connectivity is related to inattention and impulsivity in adult attention deficit hyperactivity disorder. Eur J Neurosci. 2010;31(5):912–9.

  112. Zimmer L. Positron emission tomography neuroimaging for a better understanding of the biology of ADHD. Neuropharmacology. 2009;57(7–8):601–7.

  113. Mori S, Zhang J. Principles of diffusion tensor imaging and its applications to basic neuroscience research. Neuron. 2006;51(5):527–39.

  114. Gusnard DA, Raichle ME. Searching for a baseline: functional imaging and the resting human brain. Nat Rev Neurosci. 2001;2(10):685–94.

  115. Weyandt L, Swentosky A, Gudmundsdottir BG. Neuroimaging and ADHD: FMRI, PET, DTI findings, and methodological limitations. Dev Neuropsychol. 2013;38(4):211–25.

  116. NeuroImaging Tools & Resources Collaboratory; 2017. https://www.nitrc.org/plugins/mwiki/index.php/neurobureau:AthenaPipeline.

  117. Smith SM, Jenkinson M, Woolrich MW, Beckmann CF, Behrens TE, Johansen-Berg H, Bannister PR, De Luca M, Drobnjak I, Flitney DE. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage. 2004;23:208–19.

  118. Cox RW. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res. 1996;29(3):162–73.

  119. Bellec P, Chu C, Chouinard-Decorte F, Benhajali Y, Margulies DS, Craddock RC. The neuro bureau ADHD-200 preprocessed repository. Neuroimage. 2017;144:275–86.

  120. Penny WD, Friston KJ, Ashburner JT, Kiebel SJ, Nichols TE. Statistical parametric mapping: the analysis of functional brain images. London: Elsevier; 2011.

  121. Yan C-G, Wang X-D, Zuo X-N, Zang Y-F. DPABI: data processing & analysis for (resting-state) brain imaging. Neuroinformatics. 2016;14:339–51.

  122. Yan C, Zang Y. DPARSF: a MATLAB toolbox for "pipeline" data analysis of resting-state FMRI. Front Syst Neurosci. 2010;4:1377.

  123. Chao-Gan Y. Data processing assistant for resting-state FMRI (DPARSF). The R-fMRI Network; 2014.

  124. Shehzad Z, Giavasis S, Li Q, Benhajali Y, Yan C, Yang Z, Milham M, Bellec P, Craddock C. The preprocessed connectomes project quality assessment protocol-a resource for measuring the quality of MRI data. Front Neurosci. 2015;47:10–3389.

  125. ADHD-200 Sample. http://fcon_1000.projects.nitrc.org/indi/adhd200/.

  126. Demaray MK, Elting J, Schaefer K. Assessment of attention-deficit/hyperactivity disorder (ADHD): a comparative evaluation of five, commonly used, published rating scales. Psychol Schools. 2003;40(4):341–61.

  127. Committee on Quality Improvement, Subcommittee on Attention-Deficit/Hyperactivity Disorder. Clinical practice guideline: diagnosis and evaluation of the child with attention-deficit/hyperactivity disorder. Pediatrics. 2000;105(5):1158–70.

  128. Dulcan M. Practice parameters for the assessment and treatment of children, adolescents, and adults with attention-deficit/hyperactivity disorder. J Am Acad Child Adolesc Psychiatry. 1997;36(10):85–121.

  129. Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electr Markets. 2021;31(3):685–95.

  130. Raza A, Siddiqui HUR, Munir K, Almutairi M, Rustam F, Ashraf I. Ensemble learning-based feature engineering to analyze maternal health during pregnancy and health risk prediction. PLoS ONE. 2022;17(11):0276525.

  131. Rustam F, Siddique MA, Siddiqui HUR, Ullah S, Mehmood A, Ashraf I, Choi GS. Wireless capsule endoscopy bleeding images classification using CNN based model. IEEE Access. 2021;9:33675–88.

  132. Rustam F, Ishaq A, Munir K, Almutairi M, Aslam N, Ashraf I. Incorporating CNN features for optimizing performance of ensemble classifier for cardiovascular disease prediction. Diagnostics. 2022;12(6):1474.

  133. Alturki N, Umer M, Ishaq A, Abuzinadah N, Alnowaiser K, Mohamed A, Saidani O, Ashraf I. Combining CNN features with voting classifiers for optimizing performance of brain tumor classification. Cancers. 2023;15(6):1767.

  134. Rustam F, Mushtaq MF, Hamza A, Farooq MS, Jurcut AD, Ashraf I. Denial of service attack classification using machine learning with multi-features. Electronics. 2022;11(22):3817.

  135. Karim M, Missen MMS, Umer M, Sadiq S, Mohamed A, Ashraf I. Citation context analysis using combined feature embedding and deep convolutional neural network model. Appl Sci. 2022;12(6):3203.

  136. Biswal BB. Resting state FMRI: a personal history. Neuroimage. 2012;62(2):938–44.

  137. Biswal BB, Mennes M, Zuo X-N, Gohel S, Kelly C, Smith SM, Beckmann CF, Adelstein JS, Buckner RL, Colcombe S. Toward discovery science of human brain function. Proc Natl Acad Sci. 2010;107(10):4734–9.

  138. Patriat R, Molloy EK, Meier TB, Kirk GR, Nair VA, Meyerand ME, Prabhakaran V, Birn RM. The effect of resting condition on resting-state FMRI reliability and consistency: a comparison between resting with eyes open, closed, and fixated. Neuroimage. 2013;78:463–73.

  139. Herting MM, Gautam P, Chen Z, Mezher A, Vetter NC. Test-retest reliability of longitudinal task-based FMRI: implications for developmental studies. Dev Cogn Neurosci. 2018;33:17–26.

  140. Elliott ML, Knodt AR, Ireland D, Morris ML, Poulton R, Ramrakha S, Sison ML, Moffitt TE, Caspi A, Hariri AR. What is the test-retest reliability of common task-functional MRI measures? new empirical evidence and a meta-analysis. Psychol Sci. 2020;31(7):792–806.

  141. Noble S, Scheinost D, Constable RT. A guide to the measurement and interpretation of FMRI test–retest reliability. Curr Opin Behav Sci. 2021;40:27–32.

  142. Chaarani B, Hahn S, Allgaier N, Adise S, Owens M, Juliano A, Yuan D, Loso H, Ivanciu A, Albaugh M. Baseline brain function in the preadolescents of the ABCD study. Nat Neurosci. 2021;24(8):1176–86.

  143. Rustam F, Aslam N, De La Torre Díez I, Khan YD, Mazón JLV, Rodríguez CL, Ashraf I. White blood cell classification using texture and RGB features of oversampled microscopic images. Healthcare. 2022;10:2230.

  144. Lee E, Rustam F, Aljedaani W, Ishaq A, Rupapara V, Ashraf I. Predicting pulsars from imbalanced dataset with hybrid resampling approach. Adv Astron. 2021;2021:1–13.

  145. Shafique R, Rustam F, Choi GS, Díez IDLT, Mahmood A, Lipari V, Velasco CLR, Ashraf I. Breast cancer prediction using fine needle aspiration features and upsampling with supervised machine learning. Cancers. 2023;15(3):681.

  146. Mizutani-Tiebel Y, Tik M, Chang K-Y, Padberg F, Soldini A, Wilkinson Z, Voon CC, Bulubas L, Windischberger C, Keeser D. Concurrent TMS-FMRI: technical challenges, developments, and overview of previous studies. Front Psychiatry. 2022;13: 825205.

  147. Viessmann O, Polimeni JR. High-resolution FMRI at 7 tesla: challenges, promises and recent developments for individual-focused FMRI studies. Curr Opin Behav Sci. 2021;40:96–104.

  148. Bijsterbosch J, Harrison SJ, Jbabdi S, Woolrich M, Beckmann C, Smith S, Duff EP. Challenges and future directions for representations of functional brain organization. Nat Neurosci. 2020;23(12):1484–95.

  149. Castellanos FX, Giedd JN, Marsh WL, Hamburger SD, Vaituzis AC, Dickstein DP, Sarfatti SE, Vauss YC, Snell JW, Lange N. Quantitative brain magnetic resonance imaging in attention-deficit hyperactivity disorder. Arch Gen Psychiatry. 1996;53(7):607–16.

  150. Aylward EH, Reiss AL, Reader MJ, Singer HS, Brown JE, Denckla MB. Basal ganglia volumes in children with attention-deficit hyperactivity disorder. J Child Neurol. 1996;11(2):112–5.

  151. Biederman J, Faraone SV, Keenan K, Knee D, Tsuang MT. Family-genetic and psychosocial risk factors in DSM-III attention deficit disorder. J Am Acad Child Adolesc Psychiatry. 1990;29(4):526–33.

  152. Adesman AR. The diagnosis and management of attention-deficit/hyperactivity disorder in pediatric patients. Primary Care Companion J Clin Psychiatry. 2001;3(2):66.

  153. Hollis C, Hall CL, Guo B, James M, Boadu J, Groom MJ, Brown N, Kaylor-Hughes C, Moldavsky M, Valentine AZ. The impact of a computerised test of attention and activity (qbtest) on diagnostic decision-making in children and young people with suspected attention deficit hyperactivity disorder: single-blind randomised controlled trial. J Child Psychol Psychiatry. 2018;59(12):1298–308.

Acknowledgements

This work was supported in part by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A6A1A03039493, NRF-2022R1I1A1A01070998).

Author information

Contributions

IA conceptualization, methodology, and writing—original draft. SJ formal analysis and data curation and visualization. SR software, validation, and project administration. YP investigation, supervision, and writing—review & edit. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yongwan Park.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ashraf, I., Jung, S., Hur, S. et al. A systematic literature review of neuroimaging coupled with machine learning approaches for diagnosis of attention deficit hyperactivity disorder. J Big Data 11, 140 (2024). https://doi.org/10.1186/s40537-024-00998-3

  • DOI: https://doi.org/10.1186/s40537-024-00998-3

Keywords