DL-Brainage_review

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/366136217
Deep Learning for Brain Age Estimation: A Systematic Review
Preprint · December 2022

DOI: 10.48550/arXiv.2212.03868
CITATIONS READS
0 645
10 authors, including:
M. Tanveer Mudasir Ganaie

Indian Institute of Technology Indore Indian Institute of Technology Indore
204 PUBLICATIONS 5,328 CITATIONS 54 PUBLICATIONS 1,494 CITATIONS
SEE PROFILE SEE PROFILE
Iman Beheshti Tripti Goel

University of Manitoba National Institute of Technology, Silchar
76 PUBLICATIONS 1,635 CITATIONS 76 PUBLICATIONS 762 CITATIONS
SEE PROFILE SEE PROFILE
All content following this page was uploaded by M. Tanveer on 13 January 2023.
The user has requested enhancement of the downloaded file.

Deep Learning for Brain Age Estimation: A Systematic Review
M. Tanveera , M. A. Ganaieb , Iman Beheshtic , Tripti Goeld , Nehal Ahmada,e , Kuan-Ting
Laif , Kaizhu Huangg , Yu-Dong Zhangh , Javier Del Seri,j , Chin-Teng Link
a
Department of Mathematics, Indian Institute of Technology Indore, Simrol, Indore, 453552, India
b
Department of Robotics, University of Michigan, Ann Arbor, MI 48109, USA
c
Department of Human Anatomy and Cell Science, Rady Faculty of Health Sciences, Max Rady College of
Medicine, University of Manitoba, Winnipeg, MB, Canada
d
Biomedical Imaging Lab, National Institute of Technology Silchar, Assam 788010, India
e
Department of Electrical Engineering and Computer Science, National Taipei University of Technology,
Taipei, Taiwan
arXiv:2212.03868v1 [eess.IV] 7 Dec 2022
f
Department of Electronic Engineering, National Taipei University of Technology, Taipei, Taiwan
g
Data Science Research Center, Duke Kunshan University, China
h
School of Computing and Mathematical Sciences, University of Leicester, UK
i
TECNALIA, Basque Research and Technology Alliance (BRTA), 48160, Derio, Spain
j
University of the Basque Country (UPV/EHU), 48013, Bilbao, Spain
k
School of Computer Science, Faculty of Engineering and Information Technology, University of
Technology Sydney, Sydney, Australia
Abstract
Over the years, Machine Learning models have been successfully employed on neuroimaging
data for accurately predicting brain age. Deviations from the healthy brain aging pattern
are associated to the accelerated brain aging and brain abnormalities. Hence, efficient and
accurate diagnosis techniques are required for eliciting accurate brain age estimations. Sev-
eral contributions have been reported in the past for this purpose, resorting to different
data-driven modeling methods. Recently, deep neural networks (also referred to as deep
learning) have become prevalent in manifold neuroimaging studies, including brain age es-
timation. In this review we offer a comprehensive analysis of the literature related to the
adoption of deep learning for brain age estimation with neuroimaging data. We detail and
analyze different deep learning architectures used for this application, pausing at research
works published to date quantitatively exploring their application. We also examine different
brain age estimation frameworks, comparatively exposing their advantages and weaknesses.
Finally, the review concludes with an outlook towards future directions that should be fol-
lowed by prospective studies. The ultimate goal of this paper is to establish a common and
informed reference for newcomers and experienced researchers willing to approach brain age
estimation by using deep learning models.
Keywords: Brain age estimation, neuroimaging, machine learning, deep learning, deep
neural networks.
Email addresses: mtanveer@iiti.ac.in (M. Tanveer), mudasirg@umich.edu (M. A. Ganaie),

Preprint submitted to Elsevier December 9, 2022
1. Introduction
With the global rise of population, a spike of cases has been noted worldwide in a range of
non-fatal albeit disabling disorders, including neurodegenerative disorders such as cognitive
decline and dementia [1]. To deal with this challenge, there is a growing need to identify
the link between brain aging processes and neurodegenerative disease mechanisms. Methods
need to be devised for identifying the subjects who are at a higher risk of age-associated
deterioration, monitoring the progress of degradation, and devising appropriate treatments
to be prescribed for involved subjects. Age-related brain alterations play an important role in
the etiology of brain diseases. The discrepancy between the brain age and the chronological
age may reveal several risks of experiencing health related issues during the different stages
of life.
In this context, an unquestioned upsurge of data-driven models has been developed for
human health over the last couple of decades. Indeed, machine learning has been successful
in different modeling tasks related to brain age, mostly using magnetic resonance imaging
(MRI) scans of the brain. All such tasks fall below the overarching brain age estimation con-
cept which, through brain imaging data coupled with machine learning models, encompasses
techniques capable of predicting homogeneous trajectories in normal brains at an individual
level [2]. A brain age estimation framework typically employs a training set of cognitively
healthy participants together with supervised learning (i.e., a regression algorithm) to model
the correlation between extracted brain features (i.e., independent variables) and the real
age of the patient (i.e., dependent variable). This prediction model is then applied to in-
dependent test data to infer the brain age of patients unseen by the trained model. The
deviation (i.e., positively or negatively) from these typical trajectories, often called the brain
age delta (estimated brain age minus real age), have been linked to the status of the brain
[2]. A large number of studies have documented that a positive brain age delta is associated
to brain abnormalities and mortality [3], whereas a negative brain age delta is linked to a
younger brain [4]. The brain age metric has been successfully used to detect different neu-
rological and mental diseases [5]. Beyond the detection of neurological and mental illnesses,
the brain age metric has also been utilized for discovering the effects of lifestyles (such as
resilience, depressive symptoms, life satisfaction) on cognitively healthy brains [6].
It is worthwhile noting that the accuracy of interpreting brain age results relies on the
robustness of the brain age estimation framework in use. Indeed, a more accurate brain
age estimation framework can deliver more robust outcomes for clinical applicants. As a
result, developing a more precise brain age estimation framework for clinical applications
is required, so many research groups have attempted at improving brain age estimation
frameworks by embracing diverse machine learning techniques. In addition to the traditional
machine learning methods, deep learning has more recently become a popular methodology
in the field of neuroimaging, where it has been widely employed for different tasks such as
Iman.beheshti@umanitoba.ca (Iman Beheshti), triptigoel@ece.nits.ac.in (Tripti Goel),

nehalahmad@iiti.ac.in (Nehal Ahmad), ktlai@ntut.edu.tw (Kuan-Ting Lai),
kaizhu.huang@dukekunshan.edu.cn (Kaizhu Huang), yudongzhang@ieee.org (Yu-Dong Zhang),
javier.delser@tecnalia.com (Javier Del Ser), Chin-Teng.Lin@uts.edu.au (Chin-Teng Lin)
2
segmentation, lesion detection, and classification [7]. One of the most significant advantages
of deep learning models is their ability to combine feature extraction, feature reduction, and
prediction stages into uniform computational system capable of outperforming traditional
machine learning methods, especially when modeling highly complex data. Therefore, deep
learning has risen to prominence as a preferred technique for brain imaging studies, and
deep learning-based neuroimaging studies have increased with a steadily growing propensity
over the last decade [7].
Table 1: Published surveys reviewing brain age estimation studies.
Reviewed Papers Dataset Deep

Reference Year Models Comments
period reviewed Coverage Learning?
[8] 2019 2010-2018 40 Yes Traditional, Yes Focused on T1w-MRI preprocessing methods,
Deep Learning feature extraction techniques, and regression
algorithms. A brief discussion of deep learning
models (i.e., the CNN model) and associated
toolboxes (i.e., TensorFlow and PyTorch)
[9] 2021 2010-2021 n.a. No Traditional No Focused on T1w-MRI preprocessing methods, the
construction of a standard brain age study, and
clinical applications of brain age prediction
[5] 2021 2010-2021 84 No Traditional No Focused on different brain imaging modalities
(T1w-MRI, fMRI, DTI, PET, and SPECT),
feature extraction and reduction methods,
regression algorithms, bias adjustment methods,
and clinical applications of brain age prediction
Ours 2022 2017-2022 35 Yes Deep Learning Yes Review of deep learning architectures used in the
area, prospective lines of research
Legend: T1w-MRI: T1-weighted magnetic resonance imaging; fMRI: functional MRI; DTI: diffusion tensor imaging; PET: Positron
emission tomography; SPECT: single photon emission computed tomography
To date, only a few surveys have summarized existing brain age estimation studies
[8, 9, 5]. Of note, these surveys have mostly focused on traditional machine learning algo-
rithms (such as support vector regression, relevance vector regression, and Gaussian process
regression), feature extraction and data reduction techniques, as well as the clinical appli-
cation of the developed brain age techniques. Table 1 shows the list of published surveys
that have summarized brain age studies so far. In the light of the information provided in
the table and as per our examination of the literature contributed in the are, no survey has
previously offered a systematic review of the state-of-the-art related to deep learning for
brain age estimation. To the best of our knowledge, the present study is the first that ad-
dresses this niche by carefully inspecting different deep learning architectures that have been
adopted to estimate the brain age from neuroimaging data. Our detailed literature analysis
is further complemented with a proposal of future directions in this area, stimulated by our
informed conclusions drawn from the examined literature corpus. Our main contributions
can be summarized as follows:
• We categorise existing brain age estimation models based on the deep learning archi-
tectures used for the estimation. Moreover, we further classify such models depending
on whether they adopt a slice-based or a voxel-based approach when modeling their
input data.
3
• We examine in detail such estimation models by discussing about their modalities,
size of data in use, reported performance of the models, and the datasets used for
evaluation purposes.
• We examine every deep learning architecture used in the area, describing design pat-
terns in their different hyperparameters, including the number of hidden layers, acti-
vation functions, dense layers, loss functions and other choices alike.
• We enumerate several challenges that remain unsolved and trace directions that can
be followed for the advancement of data-based solutions for predicting the brain age
from image data.
The rest of this review is structured as follows. Section 2 describes briefly the deep
learning architectures referred later in the literature review. Section 3 establishes the taxon-
omy around which the literature examination is done, and reviews the publications that use
deep learning architectures for estimating the brain age from neuroimaging data. Finally,
limitations of existing studies and suggestions for future research paths are given in Section
4. Finally, Section 5 concludes this review with a summary of the main insights drawn from
the survey.
2. Deep Learning Architectures

Before proceeding with the literature review, it is necessary to establish concepts and
architectures related to deep neural networks that will be later referred to as such extensively
in the bibliographic study. This section provides a brief explanation of some popular deep
learning models for the sake of completeness and to ease the understanding of the results of
our examination.
As the technology develops day by day, the amount of global digital information increases
exponentially. The medical domain has arguably been among the sectors whose digital sub-
strate has grown sharply over the years. As a result, high-performance models have been
in large demand to ingest, process and extract knowledge from these data. For this pur-
pose, the advent and progressive maturity of deep learning models has gained a significant
momentum in the majority of the research areas, particularly in biomedical data process-
ing. Deep learning models have comprehensively been proven to advance remarkably in the
analysis of biomedical data for different diseases, such as Alzheimer [10], Parkinson [11] and
brain tumor classification [12]. Here, we give brief overview of some of these deep learn-
ing architectures used in brain age estimation, including Feed-Forward Neural Networks,
Convolutional Neural Network (CNN), VGGNet [13], ResNet [14], and ensembles of Deep
Learning models [15]. Such architectures, the application under focus in this review and the
different goals for which brain age can be estimated using Deep Learning architectures are
conceptually illustrated in Figure 1.
2.1. Feed-forward neural network

It is the simplest form of a neural network. The primary objective of a feed-forward
neural network is to compute the approximation of a function [16]. Feed-forward networks
4
are sequential functions or perceptrons assembled together in a chain structure and it is syn-
dicated with directed acyclic graph representing the way functions are associated together.
In this network, data flow in only one direction. The total number of layers in a network
represents the depth of the network, so that the higher the number of layers in the network
is, the deeper the network can be declared to be. The parameters defining such composed
functions in the network are adjusted to minimize a loss function defined over a training
dataset, by using a gradient backpropagation algorithm.
2.2. Convolutional Neural Networks, ResNet and VGGNet

The progressive advance in feed-forward neural networks eventually yielded CNNs as
the next generation of the wide family of neural computation models. CNN is one of the
most successful data-based models arising from the computer vision domain [17] which,
since its inception, has been extrapolated to other complex data domains, such as video
or time series modeling. It is widely being used in applications such as natural language
processing [18], voice processing [19] or text mining [20], to mention a few. In general, the
superior performance of CNNs when compared to other artificial neural networks hinges on
their capability to automate the feature extraction process from complex multidimensional
data, which is realized by extending the aforementioned gradient backpropagation algorithm
through layers that model spatial correlation within the data input to the network. Such
layers, referred to as convolutional, are the first in a stacked hierarchy that, together with
pooling layers and fully connected (dense) layers, compose the overall structure of a CNN
used for classification or regression tasks.
The fact that vanilla CNN models struggled to achieve good levels of classification accu-
racy when dealing with comprehensive datasets paved the way for the proposal of new CNN
architectures over the years. AlexNet [17] was the first CNN-based modern architecture that
came into picture, achieving unrivaled performance at the time in the well-known ImageNet
classification competition. Theoretically, neural networks with a large depth have a bet-
ter capability to extract features, potentially improving the model’s predictive performance
[21, 22]. However, when dealing with deep convolutional architectures, the chances of having
exploding and vanishing gradients become higher, being catastrophic for the model’s train-
ing process. To address these issues, Microsoft researchers introduced an enhanced CNN
architecture coined as ResNet [13, 23]. The ResNet architecture introduced residual blocks
and the concept of skip-connection networks. By virtue of skip connections, the activation
of a layer is connected to subsequent layers in the hierarchy by skipping a few layers in
between, hence forming a residual block. In other words, a ResNet architecture is composed
of stacks of residual blocks. The overall accuracy in ResNet is enhanced without loss of
accuracy or increase in the training and testing error.
Another CNN model called VGGNet was developed by [13], forging thereafter a series of
different models characterized by different depths and hyperparametric configurations. VG-
GNet reduced the memory consumption and model time complexity as it does not use local
response normalization algorithm. VGGNet flavors have different channel specifications,
supporting from 11 weight layers (each including 8 convolution with three fully connected
layers) to 19 weight layers (including 16 convolution layers with 3 fully connected layer
5
each). Even after having multiple variants of CNN models, researchers were enthusiastic to
enhance the model performance and generalization capability.
2.3. Ensemble Deep Learning

In order to further enhance the performance of the model over unseen query data, en-
semble learning soon entered the landscape of deep learning models. In general, ensemble
learning involves combining and training multiple models with the aim of improving gen-
eralization performance [24, 25, 26, 27, 28, 29]. This performance boost is due to the fact
that predictions from different networks modeling diversified knowledge about the training
dataset are combined together to give better performance [30]. Ensemble learning approaches
using deep neural networks can be found within the broad categories of bagging [31], boost-
ing [32] and stacking [33]. These networks are considered as the best in terms of performance
accuracy and robustness as compared to individual networks [34, 35]. Ensemble CNNs are
desirable because of the basic fact that the optimization problem of assigning weight to
each node may have different local minima [24, 36]. CNN ensembles have so far applied to
many modeling tasks, including character recognition, image analysis, face recognition and
medical image diagnosis.
Mental illness
Artificial Intelligence detection
Brain age estimation
Machine Learning
Neurological
disorders
Deep
Learning Model Effect of
lifestyle on
brain
Prescription
of cognitive
therapy
Figure 1: Conceptual diagram relating different Deep Learning architectures with the targeted application
(brain age estimation) and the ultimate purposes for which the application is sought.
3. Taxonomy and Literature Review

Departing from the deep learning architectures revisited above, we now delve into the
review of the most significant works for brain age estimation using such architectures. In
doing so, the section first settles the methodology followed to collect and organize all the
considered corpus of literature (Subsection 3.1), followed by a detailed study of the con-
tributions around a proposed taxonomy that sorts them depending on the deep learning
architecture in use (namely, CNN, VGGNet, ResNet, ensemble deep learning, and miscella-
neous), and then in terms of the input data given to the model (slice-based or voxel-based.
This literature review is given in Subsection 3.2.
6
3.1. Search Strategy
Our literature compilation was performed in March 2022, querying relevant bibliographic
databases (Google Scholar, PubMed) with search keywords tightly connected to the area un-
der study. Specifically, we considered “deep learning” combined with the following items:
“brain age estimation”, “brain age prediction”, “MRI”, “brain imaging”, and “neuroimag-
ing”. Studies not focusing on deep learning and neuroimaging were eliminated in the initial
screening. A total of 35 studies were retained and examined in detail.
3.2. Taxonomy and Literature Review

As a result of our literature study, we designed and arranged all works around the tax-
onomy shown in Figure 2. Two classification criteria can be distinguished: the first is the
deep learning architecture, which echoes the architectures revised in the previous section and
complements it with an additional category (miscellaneous) to account for alternative archi-
tectural choices not covered in such wide categories. The second classification criterion is the
format of the data input to the deep learning model, namely, slice- or voxel-based approaches.
In slice-based approaches, deep learning models are trained with the two-dimensional MRI
slices extracted from 3D MRI scans. Consequently, slice-based approaches contains less
number of trainable parameters, and hence imply a lower computational complexity of the
overall modeling solution. However, two-dimensional slices do not contain the whole brain’s
information (e.g. adjacent correlations between slices are not considered). By contrast, in
voxel-based approaches deep learning models are trained using three-dimensional MRI scans
or 3D MRI patches. 3D scans are first registered on a standard brain template to make the
dimensions of all the 3D MRI scans uniform. After that, the registered 3D scans are fed
to a deep learning models for training, wherein the deep learning model is configured with
neural processing elements capable of operating over 3D input data tensors. Voxel-based
approaches can model brain image data considering inter-slice information. However, the
number of trainable parameters is much higher than that of their slice-based counterparts.
Bearing this taxonomy in mind, we proceed with our systematic review and analysis of the
literature in the subsequent sections, following the two criteria of the taxonomy itself. A
summary of the modalities, learning models, datasets in use and reported results of the
analyzed studies is given in Table 2.
3.2.1. Convolutional Neural Networks

CNN has attracted researchers since 2017 for brain age estimation and become very
popular because of its automatic feature extraction capability and high-performance results.
• Slice-based CNN models:
When it comes to slice-based CNN approaches, 2D MRI scans are often used to train
two-dimensional CNN networks. The work in [37] highlighted the limitations of using 3D
CNN for brain age prediction, including the need for a large number of parameters and the
computational complexity of the training phase. As a result, it proposed a 2D recurrent
neural network (RNN) for brain age prediction. In the proposed model, a 2D CNN encodes
7
Deep Learning for
brain age estimation
CNN VGGNet ResNet Ensemble Deep Miscellaneous

Learning
Slice-based Slice-based Slice-based Slice-based Slice-based

[37,38,39] [46] [51] [61, 62] [73, 75]
Voxel-based Voxel-based Voxel-based Voxel-based Voxel-based

[40-43, 45] [47-50] [52-55] [64-68] [63, 78-83]
Figure 2: Literature taxonomy of Deep Learning models applied to the estimation of the brain age from
neuroimaging data.
important intra-slice features and RNN processes the ordered sequence of sagittal plane MRI
scans to learn inter-slice correlations. The weights of the model are initialized using Kaiming
weight initialization scheme, whereas training is done by using an Adam solver. The main
advantage of the proposed 2D CNN is the reduced number of parameters i.e. 1, 068, 065 as
compared to 2, 018, 000 for 3D CNN with better accuracy. The main limitation of the study
is the lack of consideration of the association of brain age with the neurological and other
health conditions.
Similarly, a slice-based CNN approach was developed in [38] by segmenting T1-weighted
MRI scans to patches. Pearson’s correlation is used to measure pairwise similarities between
the patches. The extracted metrics are fed to the complex network using the patch as the
node and correlation metric as weights. The model shows the most affected brain regions for
age prediction using the so-called Gedeon method. They found 10 most significant features
related to aging located in the left hemisphere. The main drawback of the study is the use
of a small cohort of only 488 subjects.
Further along this line, Dular et al. [39] recently compared four slice-based CNN models
for brain age estimation using transfer learning with domain adaptation and bias correction.
The authors generalized the CNN model performance for unseen new MRI data with a simple
transfer learning approach. However, the first and fourth CNN models were trained on full
resolution 3D T1-weighted MRI images, whereas the second model was trained on 2D MRI
by extracting 15 axial slices from each 3D MRI image. In the third model, the resolution of
the MRI image is downsampled to 95 × 79 × 78 for training the 3D CNN model. The input
to the different CNN models is of different types, which may affect the performance of the
proposed architecture.
• Voxel-based CNN models:
Voxel-based CNN networks are trained over 3D volume MRI brain scans. This is the
approach used in [40], which addressed brain age prediction by introducing complex net-
works using a structural connectivity model. 3D T1-weighted images are segmented into
8
patches and then Pearson correlation is used to calculate the pairwise similarities between
the patches. The pair of patches and similarity metric from the graph model is fed to the
deep learning model for classification. The developed model demonstrates the effect of aging
on different brain regions by exploiting the structural connectivity using a graph model. The
proposed model achieved robust results with only brain extraction and linear registration to
reduce the computational complexity. Furthermore, to assess the feature importance for age
prediction, the Gedeon method is used and represents 12 most informative patches related
to aging. The most informative patches of medial frontal gyrus, caudate, paracentral lobule,
putamen, cingulate, brainstem, sub-gyral regions, etc. are observed in Gedeon test. The
limitation of the study is the experiments have been performed only on the Autism spectrum
disorder dataset, which may not suffice for validating the association of age difference with
other disorders.
Another voxel-based strategy was considered in Cole et al. [41], where T1-weighted im-
ages were segmented into gray matter (GM) and white matter (WM) using SPM12 toolbox.
Results of this study evinced best performance using GM volume compared to WM, raw
and ensemble of WM and GM. They also investigated the heritability of the brain age es-
timation using GM, WM and combination of both. The main limitation of the study is
the use of only two scanner data to check the reliability of between-scanner performance of
the model. Moreover, the authors did not correlate any neuro-anatomical feature with the
predicted age difference, nor did the analysis consider in-scanner motion artefacts that can
affect the overall performance of the model in practical settings. Other works along this
line include [42], which segmented 1.5T T1-weighted MRI scans into GM, WM and cere-
brospinal fluid (CSF). They used GM as the input for training a 3D CNN network. Logistic
regression models and Cox proportional hazard models unveiled the association of age gap
with dementia. Attention maps are extracted from the trained networks using gradient-
weighted activation mapping, which depicted the changes in GM hyperintensity around the
hippocampus and amygdala with the age. On a negative note, the modeling proposal of this
work is less generalizable as the proposed CNN network is not able to handle images from
different datasets. In addition, the authors excluded dementia and stroke subjects while
training the CNN model.
The hybridization of models has been also explored for the estimation of brain age. An
example is the study in Pardakhti and Sajedi [43], proposing to hybridize a 3D CNN with
support vector regression [44] and Gaussian process regression [44]. They trained the model
using healthy subjects from the IXI dataset, and evaluating the model using 47 healthy
and 22 Alzheimer’s disease patients from the ADNI dataset to gauge the generalization
capabilities of the proposed hybrid model. However, the authors did not perform any regional
analysis to find specific brain areas related to the Alzheimer’s disease.
Finally, a recent step in the direction towards voxel-based CNN is Hong et al. [45], which
proposes a 3D-CNN architecture for the prediction of brain age in children from routine
brain MRI scans. The authors used data augmentation strategies to extend the data size
used to train the deep neural network. Moreover, to evaluate the correlation among the
adjacent slices of the brain scans, the authors sliced the 3D scans into multiple 2D slices
followed by training via a 2D CNN model. However, the performance of the 2D-CNN model
9
was found to be lower when compared to the 3D-CNN model. This signifies the fact that
the correlation among the adjacent slices is of utmost relevance for the brain age prediction.
Interestingly, Hong et al. [45] unveiled that predictions for children age under 2 years are
more accurate than the ones elicited for children aged over 2 years.
3.2.2. VGGNet
Because of the popularity of DL networks, researchers are attracted to pretrained DL
networks to get better performance results for brain age estimation. VGGNet [13] is the pre-
trained DL network having the fixed size convolutional layer (3 × 3), which allows modeling
small details of the input image. Moreover, the number of convolutional filters gets doubled
at each stack of the convolutional layer. We now discuss on slice-based and voxel-based
VGGNet models used for brain age estimation:
• Slice-based VGGNet models:
To the best of our knowledge, the work in [46] is the only one investigating the perfor-
mance of 2D VGGNet for brain age estimation. In doing so, the work relies on VGGNet for
extracting 2D slices from 3D volumetric images having the highest mutual information to
increase the speed of the network. A downside of this work is the use of only channel normal-
ization for the preprocessing of MRI images, ignoring motion artefacts, linear registration
and bias correction in the preprocessing that may lead to inaccurate results. Furthermore,
authors did not compare the performance of VGGNet with other pretrained networks.
• Voxel-based VGGNet models:
As opposed to slice-based strategies, voxel-based VGGNet models have been more fre-
quently used in the literature. To begin with, Feng et al. [47] investigated a large-scale
heterogeneous MRI dataset for brain age estimation using a 5-layer 3D VGGNet model. In
addition, the authors also conducted a region-wise age analysis, and demonstrated a nega-
tive association between frontal lobe measures like cortical thickness and the age difference.
Further, the association of predictive age difference and Benton Face Recognition scores
(BFRT) is used to measure the baseline visual memory in evaluating cognitive functions. In
conclusion, the study provides quantitative and region-based analysis for predicting brain
age and other neuroimaging applications. However, it only evaluated the BFRT score to
correlate the age deviation with the cognitive decline, which is not specific to a particular
disease. Another limitation is the inclusion of less data from very old individuals, excluding
data for individuals less than 18 years old.
Another more recent contribution exploring voxel-based VGGNets is [48], which explored
3D VGGNet models using the T1-weighted UK Biobank dataset. They examined the cor-
relation between the predicted age difference and 8, 787 non-image variables reporting on
the lifestyle, medical and physiological parameters, mental health self-report, and medical
history of the patient. The study uncovered a mild correlation between the age difference
and the image-derived phenotype (IDP) of multi-modalities. Attention gates are also inves-
tigated with linearly and non-linearly registered images to focus on the age-affected region in
10
the brain and suppress non-affected regions. Linearly registered MRI scans show the subtle
cortical changes which are not visible in nonlinear registered scans, proving the efficiency of
the utility of linear registered images for detecting cognitive decline. As a drawback of this
study, we highlight the use of IDPs to investigate the relation between the other modalities
and age difference, as IDPs consist of reduced voxel-wise information when compared to
T1-weighted 3D scans.
An interesting alternative was proposed by Peng et al. [49]: a lightweight, simple fully-
connected convolutional network (SFCN) inspired by the VGGNet model. SFCN uses
dropout, voxel shifting and mirroring to improve the performance results. In exchange,
the model requires more training time than other modeling alternatives at the time, which
is more than 50 hours using 2 specialized P100 GPU cards. In addition, the study did not
investigate potential associations between age difference and any health condition of the
subject.
Finally, we stop at the work by Jiang et al. [50], which segmented T1-weighted images
using the CorticalParcellation-Yeo2011 reference map, yielding 7 regions (cortex networks)
modeled each by a 3D VGGNet model. The results showed less mean absolute error using
the frontoparietal network, dorsal attention network, and default mode network. They also
investigated the association of GM to brain age-related changes using Pearson’s correlation.
However, the use of only 7 cerebral cortex networks ignores age effects emerging from fine-
grained subnetworks and subcortical networks.
3.2.3. ResNet
As explained in Subsection 2.2, ResNet overcomes training issues derived from vanishing
gradients by using hierarchical stacks of residual blocks. We now review works in which
brain age estimation has been approached by ResNet-based deep learning methods:
• Slice-based ResNet models:
Our literature collection only discovered a single work prospecting the adoption of slice-
based ResNets for brain age estimation. Shi et al. [51] explored the prediction of the fetal
brain age using ResNet as the basic building block of a CNN trained over T2-weighted
images. Furthermore, an attention map was extracted from the last stage of the CNN
network. As a result, the study found out an association between the predicted age difference
and fetal anomalies like small head circumference and malformations. The main limitation
of this work is the use of transverse MRI scans for acquiring the T2-weighted scans of fetal,
but some of the brain anomalies like corpus callosum and other midline brain structures are
poorly visible in the transverse plane. Another weakness is the use of the last menstrual
period as the ground truth of gestational age, which may be inaccurate.
• Voxel-based ResNet models:
The seminal work in Jónsson et al. [52] proposed the use of a 3D ResNet based multimodal
model for brain age estimation, which takes T1-weighted, WM, GM and Jacobian map as
the input to each CNN network. Also, sex and scanner information are added at the last
11
layer to improve the performance of the model. To avoid randomly initializing the CNN
network, the ensemble model is first trained on the Icelandic dataset, then fine-tuned using
transfer learning over the IXI dataset. Then the model is assessed on UK Biobank dataset,
and final decision is made by using majority voting among the trained networks. A genome-
wide association study is also done to link the predicted age difference to two main genome
sequence: rs1452628-T and rs2435204. The study also show the negative association of the
predicted age difference and neuropsychological time tests. A clear downside of the proposal
made in this study is its computational complexity derived from the use of an ensemble of
3D CNN networks.
Shortly thereafter, Kolbeinsson et al. [53] resorted to a voxel-based approach to inves-
tigate the extent to which six brain regions contribute to brain ageing, including the left
amygdala, right hippocampus, left cerebellar, left insular cortex and the left crus and ver-
mis. To this end, a permutation importance method was used. The authors explored the
age differences associated with hypertension, multiple sclerosis, systolic and diastolic blood
pressure measurement, and Type-I and II diabetes. Unfortunately, the use of permutation
based importance quantification is limiting for regional analysis, as it shows that boundary
affects if images are not correctly aligned.
More recently, two contributions have made further use of residual network architectures.
The first is [54], which proposed a ResNet based 2-layer 3D CNN architecture with minimal
preprocessing of T1-weighted images and used transfer learning to extract features. They
used only brain extraction and image cropping as the preprocessing steps to reduce the
computational complexity of the proposed model. Training is performed over the GNC
dataset and evaluated using BiDirect study, FOR2107/MACS, and the IXI datasets using
transfer learning to inspect the generalization potential of the model. Since 3D CNN was
only in use for age prediction, no comparison of the results to other pretrained deep learning
networks was done. Furthermore, the correlation of age difference with brain regions was
not investigated.
The other recent contribution related to voxel-based ResNet models is [55], which used
this architecture for identifying genetic factors linked to the brain aging. The study, based on
the genome-wide association study with predicted brain age, revealed that single nucleotide
polymorphisms (SNPs) from four independent locations are significantly correlated with
brain aging. However, the work only evaluated the genetic factor while ignoring other
aspects like lifestyle habits or diseases. For example, heavy smoking and the consumption
of alcohol are known to accelerate brain aging [56]. Likewise, diseases like diabetes and
schizophrenia play a catalytic role in brain aging as well [57, 58]. By contrast, physical
exercise improves the general health and slows the brain aging [59, 60].
3.2.4. Ensemble Deep Learning
Ensemble deep learning combine multiple networks to achieve improved robustness and
predictive performance, at the cost of a generally higher training complexity. Researchers
started to embrace ensemble deep learning to achieve better results for brain age estimation.
A deeper analysis of the works in this direction is done in what follows:
• Slice-based Ensemble Deep Learning models:
12
Two contributions from the reviewed literature fall within this first category. To begin
with, Ballester et al. [61] proposed slice-level brain age prediction via CNNs and linear
regression. Instead of using a whole brain image, the authors used slice-level predictions
to identify the brain regions that most contribute to the brain age gap. Moreover, authors
evaluated how the predictions are affected by factors like the slice index and plane, age, sex
and MRI data collection site. The study revealed that specific brain slices have more impact
on the prediction error. A stratified partitioning of the train and test datasets was shown
to remove the site effects and minimize the sex effects. Also, the overall error of the models
is affected by the choice of the MRI slice.
The second work is [62], which used an ensemble of CNNs to estimate brain age from
routine T2-weighted spin-echo brain MRI scans. The study suffers from the following limita-
tions [63]: 1) it averaged predictions across all slices, however, not all slices may contribute
equally to the final prediction, potentially resulting in subpar performance; 2) each slice is
treated independently from the rest of slices, which precludes the modelling of non-linear in-
teractions among the axially separated features. There are several shortcomings of brain age
models that are not developed for routine clinical MRIs. To begin with, models are mostly
trained over high-resolution volumetric T1-weighted scans with isotropic or near-isotropic
voxels. However, these type of scans are rarely the part of routine examinations. Secondly,
most models are trained and evaluated on the datasets which follow precise imaging pro-
tocols and participant inclusion criteria. Contrarily, routine imaging scans in hospitals are
heterogeneous in practice due to varying scanner vendors, different imaging protocols and
heterogeneous patients from multiple sites. Thirdly, state-of-the-art brain age models rely
on computationally intensive preprocessing techniques such as spatial normalization, skull
stripping and bias field corrections.
• Voxel-based Ensemble Deep Learning models:
Four contributions in the reviewed literature can be classified as voxel-based ensembles

of deep learning models. The first to highlight is [64], which empirically showed the relation
between input feature sets and the predictive performance of brain age estimation methods
under the framework of conventional machine learning models. The authors showed that the
combination of multiple input feature sets and objective-specific functions via an ensemble
deep learning framework results in an improved performance. Similarly, Couvy-Duchesne
et al. [65] utilized an ensemble of 7 classifiers, which were trained over 3D GM and WM maps
and vertex-wise measurements of surface area. They concluded that the best performance
was achieved by using ensemble deep learning with tuneable weights estimated by using
linear regression. Further, a combination of the ensemble predictions using random forest
improves the performance of the proposed model even further. Unfortunately, as in any
other ensemble deep learning method, a penalty in computational complexity yields from
the need for training 7 different networks.
Another example of voxel-based ensemble deep learning is [66], which proposed an en-
semble of multiple 3D CNN networks on 10, 176 subjects for chronological age prediction.
The model achieved a MAE of 3.07 years over unseen data. They located cerebrospinal
13
fluid cavities as an atrophy marker for age prediction. Furthermore, the authors aggregated
several explanation maps by averaging to form a population-based map and highlighted the
ventricles and cisterns as potential biomarkers related to early aging. This work considered
only cross-sectional data instead of longitudinal data; as a result, only between-subject age
variations were accounted for, leaving aside possible trajectories within the same subject.
Hofmann et al. [67] proposed multilevel ensemble models towards the interpretability of the
models in neuroimaging, showing better performance than their individual base learners.
The authors concluded that the voxels in and around the ventricles and at the border of
the brain to meningeal areas contributed strongly to the brain age. Also, voxels covering
cortical sulcal structures appeared to be more relevant in older subjects than in the younger
ones. Finally, Poloni et al. [68] developed two brain age estimation CNN models focused
on hippocampal age by using T1 brain MRI scans. Authors used data augmentation tech-
niques that provided ample amount of data to train the model.This approach explores an
association of prediction errors with different subjects such as cognitively normal, AD and
MCI subjects. They performed statistical analysis of delta estimation error among baseline
groups. In addition, authors correlated the results with clinically measured values and found
a negative correlation. This research provided the inference time of 0.12 seconds and the
overall processing time mentioned is below seven minutes.
3.2.5. Miscellaneous Deep Learning Models

Part of the literature has glimpsed at other pretrained deep learning networks for brain
age estimation, including Alexnet [17], U-Net [69], DenseNet [70], Autoencoder [71], Inception-
V1 [72], and EfficientNet. For completeness these works are briefly discussed next:
• Slice-based miscellaneous Deep Learning models:
Lin et al. [73] proposed 2D AlexNet [17, 74] for feature extraction, principal component
analysis (PCA) for feature selection and relevance vector machine with polynomial kernel
for classification using GM images extracted from T1-Weighted images. They tested the
Alzheimer’s patient data and found the predicted age difference of 3.13 years of Mild Cog-
nitive Impairment (MCI) patients and 6.08 years for Alzheimer’s Disease (AD) patients.
However, authors demonstrated the results only using Alexnet [17, 74] pretrained model,
which is a black-box and lacks reliable explanation. Furthermore, the authors did not inves-
tigate the performance on other pretrained networks to comparatively assess the efficiency
of their AlexNet-based proposal.
An interesting advance is the work by Lombardi et al. [75], which presented an explainable
deep learning networks with morphological features extraction. The authors investigated two
morphological descriptors – SHAP [76] and LIME [77] – for finding reliable biomarkers for
aging. The experimental analysis showed the significance of SHAP as the most reliable
descriptor for explaining the morphological aging process. However, to exploit 3D MRI in-
formation at the voxel level, CNN-based salience maps and layer-wise relevance propagation
mapping can be used instead of SHAP and LIME to highlight the region of interest in MRI.
In addition, a correlation analysis was performed to investigate the morphological features’
14
importance with the age. However, a quantitative analysis is still required to correlate the
most important regions for age prediction.
• Voxel-based miscellaneous Deep Learning models:
When using voxel-based information at their input, the diversity of miscellaneous models
reported to date is higher. We begin with Popescu et al. [78], which used a U-Net based
model for the local brain age prediction via incorporation of voxel information. The study
analysed the structural differences between the MCI and AD patients. Moreover, it also
evaluated the reliability of the local brain age predictions both within and between scanners.
The study revealed that for local brain predicted age difference (with Cohen’s d values in
consideration), accumbens, putamen, pallidum, hippocampus and amygdala proved to be
discriminative of the subcortical regions. Moreover, the between-scanner reliability is a
moderate factor. However, the drawback of this study is the requirement of a healthy
population from a given site, as a site change or side scanner effects may lead to the local
brain predicted age difference distribution not centred around zero.
Another very recent work in this assorted set of contributions is [63], which developed a
DenseNet-based model for brain age estimation based on routine clinical head examinations.
The use of these data ensures that the models are trained on a range of scanners and ac-
quisition protocols, and hence, the natural representation of the routine clinical population.
This study suffers from the following limitations: 1) as the model is trained on radiologically
normal for age brains, it remains unknown how the model will behave in individuals with
gross abnormalities; 2) the authors attributed the interpretability of the model to brain
parenchyma; however, no systematic analysis of the brain features responsible for the brain
age prediction is provided.
Lee et al. [79] used a modified 3D-DenseNet [70] with Adam optimizer (with β1 = 0.9 and
β2 = 0.99) and MAE loss function for the brain age prediction in normal aging and dementia.
The study focused on the age and modality specific maps of the CNN model which could
lead to the interpretation of the brain regions contributing most to the prediction of age
subgroup and modality type using occlusion sensitivity analysis. The occlusion sensitivity
analysis method masks the portion of a brain and evaluates its effect on the decisions of
the MAE change. The study concluded that for a younger group (30 − 40 and 40 − 50
years) posterior region with a peak at posterior cingulate cortex contributed higher, while
as in 50 − 60 and 70 − 80 years of age groups, the inferior frontal regions along with the
orbitofrontal and olfactory cortex contributed predominantly. Moreover, the metabolic data
proves to be more sensitive compared to MRI based data for the brain age prediction. The
limitation of this study is that it only evaluated on the neurodegenerative pathology, without
its evaluation on chronic systemic medical diseases and vascular diseases which could have
different patterns of brain aging.
Mouches et al. [80] unified the brain age prediction and age conditional template gener-
ation with a deterministic autoencoder. In age-specific templates, the authors showed the
natural age variations with an increased ventricular volume and wider sulci associated with
increasing age. The same authors proposed in [81] an autoencoder-based multimodal brain
15
age prediction model using MRI and angiography data. The authors exploited saliency
maps for investigating the contribution of cortical, subcortical and arterial structures for
the prediction. The study concluded that combining brain tissue and artery information
improves the performance of the models. Moreover, the lateral sulcus, fourth ventricle, and
medial temporal brain regions proved to be important morphological features. Varatharajah
et al. [82] employed a transfer learning approach via pretrained Inception-V1 [72] based 3D
feature extractor, followed by prediction via regressor and bucketed classification.
We end this section by examining [83], which proposed a regression model based on deep
relation learning. The basic idea behind this model is to learn the relationship between pairs
of different input images. Authors considered four relations among the non-linear pairs.
These relations were learned by the model simultaneously from a single neural network,
and it performed two separate tasks such as relation regression and feature extraction.
For the feature extraction process, EfficientNet was employed, while a transformer was
applied to model the relationship between pairs of images. The performance was evaluated
over different datasets with subjects aged between 20 and 72 years. Different performance
evaluation strategies such as estimation of brain age with different images, estimation of
brain age with known age references and estimation of brain age with the same pair of images
were also adopted. In their experiments, 2.38 was the lowest MAE score achieved. According
to [83], the deep relation learning model can provide better generalization performance for
brain age estimation with lower MAE than other state-of-the-art approaches.
16
Table 2: Summary of the main characteristics and results of the literature reviewed in this manuscript. ρ denotes Pearson’s correlation
coefficient, RMSE is root mean square error, and MAE stands form mean absolute error.
Performance
Dataset
Year Ref. Modality Model MAE(years) Other Database
size
measures
2017 [41] T1 3D-CNN 2,001 4.16 r= 0.96, Brain-Age Normative Control (BANC)
RMSE= 5.31
2017 [46] T1 VGGNet 1,099 4 – Aoba brain image research center and Sendai
Tsurugaya project
2018 [82] - 3D CNN with 12,988 – RMSE= 5.54 and ADNI

regression and 6.44
bucket
classification
2019 [42] T1 3D CNN 5,496 4.45 ± 3.59 – Rotterdam Study
2019 [52] T1 3D CNN 1,264 (Icelandic), 3.39 – Icelandic, IXI, UK Biobank

(ResNet) 15,040 (UK
Biobank), 544
(IXI)
2019 [38] T1 CNN-4L 484 4.7 ± 0.1 RMSE=6.2 ± 1.1, ABIDE, ADNI, BNU, ICBM, IXI
(20, 100, 50, 20) ρ = 0.95 ± 0.02
2020 [47] T1 3D CNN 10,158 4.06 r = 0.970 Cam-CAN, IXI, SALD, DLBS, OASIS-1, CoRR,
SchizConnect, ADNI, AIBL, OASIS-2, PPMI, NIFD,
17
BGSP and SLIM
2020 [48] T1 3D CNN 19,687 2.71 ± 2.10 – UK Biobank

(female),
2.91 ± 2.18
(male)
2020 [53] T1 3D CNN 21,382 2.87, 3.42 – UK Biobank

(ResNet)
2020 [37] – 2D-CNN + RNN 10,446 2.86 RMSE= 3.61 UK Biobank
2020 [43] T1 3D 562 5 – IXI

CNN/SVM/GPR
2020 [51] T2 Attention-based 659 0.767 weeks – Custom data (no benchmark)
ResNet
2020 [50] T1 3D CNN 1,454 5.55 – ABIDE, BNU, ICBM, IXI, OASIS
(VGG13)
2020 [66] T1 Ensemble of 3D 10,176 3.07 r = 0.98 ADNI, PPMI, ICBM,AIBL, SLIM, OASIS, CANDI,
CNN IDA, COBRE, CNP, CORR, FCP
2020 [65] T1 Ensemble of 2,640 3.33 – PAC2019 dataset

CNN, SVM,
linear unbiased
predictor
2020 [45] T1 3D CNN 220 67.6 days – Children’s Hospital of Nanjing Medical University
2021 [40] T1 3D CNN 1,016 2.19 ± 0.03 RMSE= ABIDE

2.91 ± 0.03, ρ =
0.890 ± 0.003
2021 [49] T1 Simple Fully 14,503 2.14 – UK Biobank
Convolutional
Network (SFCN)
2021 [54] T1 3D CNN 10,691 (training) 2.84 ± 0.5 – German National Cohort
(ResNet) + 2,173
(validation)
2021 [73] T1 AlexNet 594 4.51 r = 0.979 ADNI GO, ADNI 2, IXI, OASIS
2021 [75] T1 FCDNN 2,638 2.7 r = 0.86 PAC2019
2021 [84] T1 FFDNN 2,638 4.6 – PAC2019
2021 [39] T1 CNN+Transfer 2,543 3.3 r = 0.91 Multi-site dataset

Learning
2021 [85] T1 3D modified 4,127 3.43 ± 0.0545 – MAYO dataset

DenseNet (FDG PET),
4.2055 ± 0.2241
(MRI)
2021 [78] T1 U-Net inspired 3,500 9.751 – BANC, Dallas Lifespan Brain Study, Cambridge
FCNN Centre for Ageing and Neuroscience, Southwest
University Adult Lifespan Dataset, OASIS3, AIBL,
Wayne State, Local data from Imperial College
London
2021 [55] T1 ResNet 16,998 2.7 – UK Biobank
2021 [62] T2 3D CNN 1,530 4.22 (local), 9.96 r = 0.862 (local), Seoul National University Hospital data, IXI
(IXI) 0.861 (IXI)
18
2021 [61] – Modified – 4.62 – PHOTON-AI

ResNet18
2021 [86] T1 CNN+Autoencoder 2,118 4.95 – Private data (Pomerania, Germany), IXI
2021 [64] T1 Ensemble Deep 2,118 3.33 – PAC2019

Learning
2021 [67] T1 CNN 2637 3.38 − 5.07 – Private data
2022 [63] T2 DenseNet121 23,302 2.97 – Data from King’s College Hospital NHS Foundation
Trust (KCH), Guy’s and St Thomas NHS
Foundation Trust (GSTT)
2022 [81] T1 SFCN 2,074 3.85 ± 2.9 – T1-weighted MRI (CNNT1) and TOF MRA
(CNNTOF) datasets
2022 [83] T1 SFCN 6,049 2.38 ρ = 0.988 TMGHBCH, NIH-PD, ABIDE-I, BGSP, BeijingEN,
IXI, DLBS, OASIS
2022 [68] T1 EfficientNet 1,554 3.31 RMSE= 4.65 NAC, IXI and ADNI
4. Challenges and future directions
The literature review performed in this survey has exposed the growing interest arisen by
Deep Learning models for brain age estimation over the years. Despite its relative maturity,
our critical analysis of contributions to date have unveiled several research niches that still
deserve further attention by the research community. We herein list such challenges, together
with potential research avenues that can be traversed towards addressing them effectively
in the future (we refer to Figure 3 for a graphical summary):
Integration of
Explainability in Deep Learning
Multimodal Data
for Brain Age Estimation
▪ Brain age difference also depends on other factors like iron deposition,
▪ No information about input-output relationship brain connectivity, and metabolic variations
▪ Explainable Artificial Intelligence can explain the ▪ Combination with other modalities can provide new biomarkers
relationship, which will be beneficial for clinical 4.1
purposes and practical adoption of the model
Brain Age Difference

4.7 4.2 with cognitive decline
▪ Association of brain age difference to
mental health
Adversarial Perturbations Deep ▪ Brain region affected due to Alzheimer,
Parkinson and other neural disorders
& Uncertainty Estimation Learning for
brain age
▪ Develop robust and safe deep learning
estimation Computational
networks from adversarial perturbations
▪ Focus on uncertainty estimation for
4.6 4.3 Complexity
deep learning
▪ Use of 2D CNN instead of 3D CNN to reduce
computational complexity
▪ Use less preprocessing for routine MRI
Deep Learning scans to reduce the complexity of the model
Architecture 4.5 4.4 Significant Brain

▪ Specialized ensemble deep learning architectures
can be used for improved predictions Region Extraction
▪ A benchmark among deep learning frameworks
▪ Identify specific brain regions related to age gap
for brain age estimation is in need
▪ Study the effect of age groups and gender on the
regions affecting the brain age
Figure 3: Graphical representation of challenges and future directions in Deep Learning for brain age
estimation. Numbers refer to the subsection in Section 4 where each challenge is elaborated.
4.1. Integration of Multimodal Data

Most brain age estimation approaches published to date used exclusively T1-weighted
images, which can only expose structural atrophies of the brain organ. However, cognitive
decline is known to stringently depend on several other factors, such as the iron deposition,
brain connectivity, and metabolic variations. By virtue of their modular and highly flexible
design, deep learning models are naturally prone to the integration of multiple modalities
of information defined over heterogeneous domains, including space and time. Therefore,
the fusion of other modalities towards a more accurate estimation of the brain will surely
provide new insights and chances to explore other biomarkers that affect brain aging.
19
4.2. Association of Brain Age Difference with Cognitive Decline
In general, brain age estimation frameworks proceed by building a prediction model by
modeling brain scans of cognitively healthy participants. Recent studies have documented
that brain age is associated to mental health, life satisfaction and metabolic factors (e.g.,
diabetes) in cognitively healthy elderly [6]. On the other hand, these factors could cause
a bias in estimated brain age, which should be taken into account when building a brain
age estimation model from healthy data. The majority of deep learning-based brain age
studies have tested their techniques on AD patients who have extensive brain atrophies.
Albeit the deep learning models have shown outstanding performance on the training set
and cognitively healthy subjects, their dependability should be verified on other neurolog-
ical disorders with small volumetric alterations such as Parkinson’s disease, Huntington’s
Disease, Epilepsy and Down Syndrome, to mention an exemplifying few.
4.3. Computational Complexity

A small subset of the reviewed contributions has focused on 2D CNN models, mainly
aimed to reduce the computational complexity of the designed network when compared to 3D
CNN modelling counterparts. 2D slices can be extracted from volumetric MRI scans based
on mutual information and entropy of the slices. Furthermore, a performance comparison
between the axial, coronal, and sagittal planes of MRI scans can be explored, providing a
good visibility of the regions of interest.
On the other hand, most brain age estimators rely on computationally intensive pre-
processing techniques. However, only a few models have been developed for the estimation
of brain age from routine MRI scans. Hence, the community should focus on developing
efficient and accurate models for brain age estimation over routine MRI scans, potentially
encompassing artifacts that unless properly tackled, hinder the predictability of the brain
age from such data sources.
4.4. Significant Brain Region Extraction

The learning task underneath most brain age estimation models seeks the accurate pre-
diction of the chronological age of the brain. These models are generally used to predict
the age gap in non-healthy subjects for clinical purposes. The wider the age gap, the more
abnormalities exist in the brain. However, despite the accurate prediction eventually elicited
by the model, only a few studies analyze which brain regions contribute most to the estima-
tion of the brain age. The scarce reports where such an analysis was provided have revealed
that different regions of the brain are affected depending on the age group to which the
subject belong. Hence, future research can be devoted towards case studies reflecting the
effect of age groups and gender on brain age estimation.
4.5. New Deep Learning Architectures

As evidenced in our literature review, it has not been until recently when ensemble deep
learning architectures have been used to improve the brain age prediction accuracy. The most
straightforward approach to induce diversity among base learners working with a given data
modality is bagging. However, there are other approaches like negative correlation learning,
20
which can improve the performance of ensemble architectures, that remain unexplored for
this particular application.
Further along this line, the community lacks a comprehensive study that benchmarks the
performance of deep learning architectures on publicly available datasets, using unified met-
rics, evaluation protocols and several datasets. Hence, the area is in demand for a benchmark
between the different deep learning approaches published for brain age estimation. This will
not only contribute, in an informed fashion, to a consensus around the state-of-the-art deep
learning model for accurate brain age prediction, but also stimulate further efforts towards
improving such a model from different perspectives, including precision, trustworthiness and
training/inference efficiency.
4.6. Adversarial Perturbations and Uncertainty Estimation

Deep learning has been recently found to be vulnerable to certain imperceptible pertur-
bations called adversarial perturbations [87, 88]. Those examples corrupted with adversarial
perturbations are called adversarial examples. Attacks comprising such examples unleash
serious robustness challenge for deep learning methods, and have recently attracted great
interest in the community. Since brain age estimation methods mainly resort to existing
deep learning methods, they seldom consider their robustness against adversarial attacks.
Many proposals have been proposed to enhance and/or deliver a robust and safe deep learn-
ing model [89, 90]. We advocate for developments tackling such robustness issues when
applying deep learning models to brain age estimation.
Another important aspect of deep learning models relates to the propagation of the inher-
ent data (aleatoric) and modeling (epistemic) uncertainty to their predicted output. Failing
to provide understandable information about the confidence of a model in its produced esti-
mation jeopardizes the adoption of the model in practical applications where decisions made
from the estimations may have catastrophic consequences, as in clinical practice. Uncer-
tainty estimation techniques aim to improve the trustworthiness of the users in the output
of data-based models by informing about the confidence associated to each model [91]. For
regression tasks, this information takes the form of confidence intervals, which are usually
estimated for a given confidence level. Hence, prospective studies can focus on uncertainty
estimation techniques and metrics that are either model-agnostic (e.g. conformal predic-
tion or quantile regression) or specific for deep learning models (correspondingly, evidential
losses, Monte Carlo dropout or Bayesian networks), so as to increase the trustworthiness of
clinical practitioners on the brain age estimated by this family of models.
4.7. Explainability in Deep Learning for Brain Age Estimation

In general, deep learning models are considered “black boxes” due to the lack of an infer-
ence process that maps which components of a given input are relevant for the output value
issued by the model. When modeling complex data, the performance of algorithmically
transparent models (e.g. decision trees) usually lags behind that achievable by highly para-
metric deep neural networks. The need for techniques capable of explaining what already
trained models observe in their input to predict their outputs gave birth to the landscape
of post-hoc explainability methods [92]. Nowadays a plethora of approaches to furnish a
21
posteriori explanations for deep neural networks can be found in the literature, ranging
from concept relevance methods to model simplification approaches and counterfactual gen-
eration. Most of these techniques have been integrated in mainstream software libraries
used for the purpose. We firmly make the case for the inclusion of post-hoc explanations in
future studies proposing new neural architectures and/or datasets for brain age estimation:
this is necessary not only to inform decisions that may put at risk the life of patients [93],
but also to provide medical experts with information about new biomarkers that highly cor-
relate with brain ageing. Although exemplary attempts in this regard have been reported
recently, such as the use of SHAP and LIME [75], the community should look beyond purely
performance-driven research by ensuring that the model not only performs well, but also
performs reliably and that can be trusted.
5. Concluding Remarks and Outlook

This survey has reviewed brain age estimation frameworks encompassing deep learning
architectures at their core. Essentially, the estimation of the brain age from neuroimaging
data can be regarded as a predictive modeling task. The high dimensionality of neuroimag-
ing MRI scans and other modalities called for the adoption of different deep learning models
for the early diagnostic of several disorders directly or indirectly related to brain age esti-
mation. This review has attempted at collecting them in their entirety, examining in depth
how the brain age estimation problem has been approached in each published study, which
modeling patterns have been followed to solve them effectively, reported levels of predictive
performance and relevant architectural choices. As a result, a taxonomy has been proposed
to arrange and organize all the literature in a systematic manner, serving as a guide for
the critical assessment of the strong points and weaknesses of the reviewed works. As a
consequence of our in-depth review of the literature, we have also enumerated several re-
search directions in brain age estimation to drive future efforts in this topic. We hope this
manuscript enthrones as an useful reference for researchers and practitioners working on
brain age estimation via deep learning architectures, as well as for newcomers willing to join
forces around this exciting research area.
Acknowledgment
This work is supported by National Supercomputing Mission under DST and Miety,
Govt. of India under Grant No. DST/NSM/R&D HPC Appl/2021/03.29. We are grateful
to the Indian Institute of Technology Indore for the support and facility offered for the re-
search. J. Del Ser would like to thank the Spanish Centro para el Desarrollo Tecnologico
Industrial (CDTI, Ministry of Science and Innovation) through the “Red Cervera” Pro-
gramme (AI4ES project), as well as by the Basque Government through the ELKARTEK
program and the Consolidated Research Group MATHMODE (ref. IT1456-22).
References
[1] T. Vos, A. D. Flaxman, M. Naghavi, R. Lozano, C. Michaud, M. Ezzati, K. Shibuya, J. A. Salomon,
S. Abdalla, V. Aboyans et al., “Years lived with disability (ylds) for 1160 sequelae of 289 diseases and
22
injuries 1990–2010: a systematic analysis for the global burden of disease study 2010,” The Lancet, vol.
380, no. 9859, pp. 2163–2196, 2012.
[2] K. Franke and C. Gaser, “Ten years of BrainAGE as a neuroimaging biomarker of brain aging: what
insights have we gained?” Frontiers in Neurology, p. 789, 2019.
[3] M. Cearns, T. Hahn, and B. T. Baune, “Recommendations and future directions for supervised machine
learning in psychiatry,” Translational Psychiatry, vol. 9, no. 1, pp. 1–12, 2019.
[4] E. Luders, N. Cherbuin, and C. Gaser, “Estimating brain age using high-resolution pattern recognition:
Younger brains in long-term meditation practitioners,” NeuroImage, vol. 134, pp. 508–513, 2016.
[5] S. Mishra, I. Beheshti, and P. Khanna, “A review of neuroimaging-driven brain age estimation for
identification of brain disorders and health conditions,” IEEE Reviews in Biomedical Engineering,
2021.
[6] D. Sone, I. Beheshti, S. Shinagawa, H. Niimura, N. Kobayashi, H. Kida, R. Shikimoto, Y. Noda,
S. Nakajima, and S. Bun, “Neuroimaging-derived brain age is associated with life satisfaction in cog-
nitively unimpaired elderly: A community-based study,” Translational Psychiatry, vol. 12, no. 1, pp.
1–6, 2022.
[7] L. Zhang, M. Wang, M. Liu, and D. Zhang, “A survey on deep learning for neuroimaging-based brain
disorder analysis,” Frontiers in Neuroscience, p. 779, 2020.
[8] H. Sajedi and N. Pardakhti, “Age prediction based on brain MRI image: a survey,” Journal of Medical
Systems, vol. 43, no. 8, pp. 1–30, 2019.
[9] L. Baecker, R. Garcia-Dias, S. Vieira, C. Scarpazza, and A. Mechelli, “Machine learning for brain age
prediction: Introduction to methods and clinical applications,” EBioMedicine, vol. 72, p. 103600, 2021.
[10] M. Tanveer, B. Richhariya, R. Khan, A. Rashid, P. Khanna, M. Prasad, and C. Lin, “Machine learning
techniques for the diagnosis of Alzheimer’s disease: A review,” ACM Transactions on Multimedia
Computing, Communications, and Applications (TOMM), vol. 16, no. 1s, pp. 1–35, 2020.
[11] M. Tanveer, A. H. Rashid, R. Kumar, and R. Balasubramanian, “Parkinson’s disease diagnosis us-
ing neural networks: Survey and comprehensive evaluation,” Information Processing & Management,
vol. 59, no. 3, p. 102909, 2022.
[12] K. Muhammad, S. Khan, J. Del Ser, and V. H. C. De Albuquerque, “Deep learning for multigrade
brain tumor classification in smart healthcare systems: A prospective survey,” IEEE Transactions on
Neural Networks and Learning Systems, vol. 32, no. 2, pp. 507–522, 2020.
[13] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,”
ArXiv Preprint ArXiv:1409.1556, 2014.
[14] F. Emmert-Streib, Z. Yang, H. Feng, S. Tripathi, and M. Dehmer, “An introductory review of deep
learning for prediction models with big data,” Frontiers in Artificial Intelligence, vol. 3, p. 4, 2020.
[15] M. d. Condorcet, “Essay on the application of analysis to the probability of majority decisions,” Paris:
Imprimerie Royale, 1785.
[16] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press, 2016, http://www.
deeplearningbook.org.
[17] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural
networks,” Advances in Neural Information Processing Systems, vol. 25, 2012.
[18] Z. Zhao and Y. Wu, “Attention-based convolutional neural networks for sentence classification.” in
Interspeech, vol. 8, 2016, pp. 705–709.
[19] G. E. Dahl, D. Yu, L. Deng, and A. Acero, “Context-dependent pre-trained deep neural networks for
large-vocabulary speech recognition,” IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, no. 1, pp. 30–42, 2011.
[20] H. Peng, J. Li, Y. He, Y. Liu, M. Bao, L. Wang, Y. Song, and Q. Yang, “Large-scale hierarchical text
classification with recursively regularized deep graph-CNN,” in Proceedings of the 2018 World Wide
Web Conference, 2018, pp. 1063–1072.
[21] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance
on imagenet classification,” in Proceedings of the IEEE International Conference on Computer Vision,
2015, pp. 1026–1034.
23
[22] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “Decaf: A deep
convolutional activation feature for generic visual recognition,” in International Conference on Machine
Learning. PMLR, 2014, pp. 647–655.
[23] K. He, X. Zhang, and S. Ren, “Deep residual learning for image recognition,” in Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[24] L. K. Hansen and P. Salamon, “Neural network ensembles,” IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 12, no. 10, pp. 993–1001, 1990.
[25] T. G. Dietterich, “Ensemble methods in machine learning,” in International Workshop on Multiple
Classifier Systems. Springer, 2000, pp. 1–15.
[26] M. Ganaie, M. Hu, A. Malik, M. Tanveer, and P. Suganthan, “Ensemble deep learning: A review,”
Engineering Applications of Artificial Intelligence, vol. 115, p. 105151, 2022.
[27] Y. Cao, T. A. Geddes, J. Y. H. Yang, and P. Yang, “Ensemble deep learning in bioinformatics,” Nature
Machine Intelligence, vol. 2, no. 9, pp. 500–508, 2020.
[28] C. Ju, A. Bibaut, and M. van der Laan, “The relative performance of ensemble methods with deep
convolutional neural networks for image classification,” Journal of Applied Statistics, vol. 45, no. 15,
pp. 2800–2818, 2018.
[29] S. González, S. Garcı́a, J. Del Ser, L. Rokach, and F. Herrera, “A practical tutorial on bagging and
boosting based ensembles for machine learning: Algorithms, software tools, performance study, practical
perspectives and opportunities,” Information Fusion, vol. 64, pp. 205–237, 2020.
[30] S. Lee, S. Purushwalkam, M. Cogswell, D. Crandall, and D. Batra, “Why M heads are better than one:
Training a diverse ensemble of deep networks,” ArXiv Preprint ArXiv:1511.06314, 2015.
[31] L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[32] P. Bartlett, Y. Freund, W. S. Lee, and R. E. Schapire, “Boosting the margin: A new explanation for
the effectiveness of voting methods,” The Annals of Statistics, vol. 26, no. 5, pp. 1651–1686, 1998.
[33] D. H. Wolpert, “Stacked generalization,” Neural networks, vol. 5, no. 2, pp. 241–259, 1992.
[34] G. Huang, Y. Li, G. Pleiss, Z. Liu, J. E. Hopcroft, and K. Q. Weinberger, “Snapshot ensembles: Train
1, get m for free,” ArXiv Preprint ArXiv:1704.00109, 2017.
[35] Y. Ren, L. Zhang, and P. N. Suganthan, “Ensemble classification and regression-recent developments,
applications and future directions,” IEEE Computational Intelligence Magazine, vol. 11, no. 1, pp.
41–53, 2016.
[36] A. J. Sharkey, Combining artificial neural nets: ensemble and modular multi-net systems. Springer
Science & Business Media, 2012.
[37] P. K. Lam, V. Santhalingam, P. Suresh, R. Baboota, A. H. Zhu, S. I. Thomopoulos, N. Jahanshad,
and P. M. Thompson, “Accurate brain age prediction using recurrent slice-based networks,” in 16th
International Symposium on Medical Information Processing and Analysis, vol. 11583. International
Society for Optics and Photonics, 2020, p. 1158303.
[38] N. Amoroso, M. La Rocca, L. Bellantuono, D. Diacono, A. Fanizzi, E. Lella, A. Lombardi, T. Mag-
gipinto, A. Monaco, and S. Tangaro, “Deep learning and multiplex networks for accurate modeling of
brain age,” Frontiers in Aging Neuroscience, vol. 11, p. 115, 2019.
[39] L. Dular, Ž. Špiclin, and A. D. N. Initiative, “Improving across dataset brain age predictions using
transfer learning,” in International Workshop on PRedictive Intelligence In MEdicine. Springer, 2021,
pp. 243–254.
[40] L. Bellantuono, L. Marzano, M. La Rocca, D. Duncan, A. Lombardi, T. Maggipinto, A. Monaco, S. Tan-
garo, N. Amoroso, and R. Bellotti, “Predicting brain age with complex networks: From adolescence to
adulthood,” NeuroImage, vol. 225, p. 117458, 2021.
[41] J. H. Cole, R. P. Poudel, D. Tsagkrasoulis, M. W. Caan, C. Steves, T. D. Spector, and G. Montana,
“Predicting brain age with deep learning from raw imaging data results in a reliable and heritable
biomarker,” NeuroImage, vol. 163, pp. 115–124, 2017.
[42] J. Wang, M. J. Knol, A. Tiulpin, F. Dubost, M. de Bruijne, M. W. Vernooij, H. H. Adams, M. A.
Ikram, W. J. Niessen, and G. V. Roshchupkin, “Gray matter age prediction as a biomarker for risk of
dementia,” Proceedings of the National Academy of Sciences, vol. 116, no. 42, pp. 21 213–21 218, 2019.
24
[43] N. Pardakhti and H. Sajedi, “Brain age estimation based on 3D MRI images using 3D convolutional
neural network,” Multimedia Tools and Applications, vol. 79, no. 33, pp. 25 051–25 065, 2020.
[44] S. Kanwal, B. Khan, S. M. Ali, C. A. Mehmood, and M. Q. Rauf, “Support vector machine and gaussian
process regression based modeling for photovoltaic power prediction,” in 2018 International Conference
on Frontiers of Information Technology (FIT). IEEE, 2018, pp. 117–122.
[45] J. Hong, Z. Feng, S. H. Wang, A. Peet, Y. D. Zhang, Y. Sun, and M. Yang, “Brain Age Prediction
of Children Using Routine Brain MR Images via Deep Learning,” Frontiers in Neurology, vol. 11, no.
October, pp. 1–13, 2020.
[46] T.-W. Huang, H.-T. Chen, R. Fujimoto, K. Ito, K. Wu, K. Sato, Y. Taki, H. Fukuda, and T. Aoki, “Age
estimation from brain MRI images using deep learning,” in 2017 IEEE 14th International Symposium
on Biomedical Imaging (ISBI 2017). IEEE, 2017, pp. 849–852.
[47] X. Feng, Z. C. Lipton, J. Yang, S. A. Small, F. A. Provenzano, A. D. N. Initiative, and F. L. D. N. Ini-
tiative, “Estimating brain age based on a uniform healthy population with deep learning and structural
magnetic resonance imaging,” Neurobiology of Aging, vol. 91, pp. 15–25, 2020.
[48] N. K. Dinsdale, E. Bluemke, S. M. Smith, Z. Arya, D. Vidaurre, M. Jenkinson, and A. I. Namburete,
“Learning patterns of the ageing brain in MRI using deep convolutional networks,” NeuroImage, vol.
224, p. 117401, 2021.
[49] H. Peng, W. Gong, C. F. Beckmann, A. Vedaldi, and S. M. Smith, “Accurate brain age prediction with
lightweight deep neural networks,” Medical Image Analysis, vol. 68, p. 101871, 2021.
[50] H. Jiang, N. Lu, K. Chen, L. Yao, K. Li, J. Zhang, and X. Guo, “Predicting brain age of healthy adults
based on structural MRI parcellation using convolutional neural networks,” Frontiers in Neurology,
vol. 10, p. 1346, 2020.
[51] W. Shi, G. Yan, Y. Li, H. Li, T. Liu, C. Sun, G. Wang, Y. Zhang, Y. Zou, and D. Wu, “Fetal
brain age estimation and anomaly detection using attention-based deep ensembles with uncertainty,”
NeuroImage, vol. 223, p. 117316, 2020.
[52] B. A. Jónsson, G. Bjornsdottir, T. Thorgeirsson, L. M. Ellingsen, G. B. Walters, D. Gudbjartsson,
H. Stefansson, K. Stefansson, and M. Ulfarsson, “Brain age prediction using deep learning uncovers
associated sequence variants,” Nature Communications, vol. 10, no. 1, pp. 1–10, 2019.
[53] A. Kolbeinsson, S. Filippi, Y. Panagakis, P. M. Matthews, P. Elliott, A. Dehghan, and I. Tzoulaki, “Ac-
celerated MRI-predicted brain ageing and its associations with cardiometabolic and brain disorders,”
Scientific Reports, vol. 10, no. 1, pp. 1–9, 2020.
[54] L. Fisch, J. Ernsting, N. R. Winter, V. Holstein, R. Leenings, M. Beisemann, K. Sarink, D. Emden,
N. Opel, and R. Redlich, “Predicting brain-age from raw T1-weighted magnetic resonance imaging data
using 3D convolutional neural networks,” ArXiv Preprint ArXiv:2103.11695, 2021.
[55] K. Ning, B. A. Duffy, M. Franklin, W. Matloff, L. Zhao, N. Arzouni, F. Sun, and A. W. Toga,
“Improving brain age estimates with deep learning leads to identification of novel genetic factors
associated with brain aging,” Neurobiology of Aging, vol. 105, pp. 199–204, 2021. [Online]. Available:
https://doi.org/10.1016/j.neurobiolaging.2021.03.014
[56] K. Ning, L. Zhao, W. Matloff, F. Sun, and A. W. Toga, “Association of relative brain age with tobacco
smoking, alcohol consumption, and genetic variants,” Scientific Reports, vol. 10, no. 1, pp. 1–10, 2020.
[57] K. Franke, C. Gaser, B. Manor, and V. Novak, “Advanced brain age in older adults with type 2 diabetes
mellitus,” Frontiers in Aging Neuroscience, vol. 5, p. 90, 2013.
[58] H. G. Schnack, N. E. Van Haren, M. Nieuwenhuis, H. E. Hulshoff Pol, W. Cahn, and R. S. Kahn,
“Accelerated brain aging in Schizophrenia: a longitudinal pattern recognition study,” American Journal
of Psychiatry, vol. 173, no. 6, pp. 607–616, 2016.
[59] A. F. Kramer, K. I. Erickson, and S. J. Colcombe, “Exercise, cognition, and the aging brain,” Journal
of Applied Physiology, vol. 101, no. 4, pp. 1237–1242, 2006.
[60] E. B. Larson, L. Wang, J. D. Bowen, W. C. McCormick, L. Teri, P. Crane, and W. Kukull, “Exercise
is associated with reduced risk for incident dementia among persons 65 years of age and older,” Annals
of Internal Medicine, vol. 144, no. 2, pp. 73–81, 2006.
[61] P. L. Ballester, L. T. da Silva, M. Marcon, N. B. Esper, B. N. Frey, A. Buchweitz, and F. Meneguzzi,
25
“Predicting Brain Age at Slice Level: Convolutional Neural Networks and Consequences for Inter-
pretability,” Frontiers in Psychiatry, vol. 12, no. March, 2021.
[62] I. Hwang, E. K. Yeon, J. Y. Lee, R. E. Yoo, K. M. Kang, T. J. Yun, S. H. Choi, C. H. Sohn, H. Kim,
and J. h. Kim, “Prediction of brain age from routine T2-weighted spin-echo brain magnetic resonance
images with a deep convolutional neural network,” Neurobiology of Aging, vol. 105, pp. 78–85, 2021.
[Online]. Available: https://doi.org/10.1016/j.neurobiolaging.2021.04.015
[63] D. A. Wood, S. Kafiabadi, A. A. Busaidi, E. Guilhem, A. Montvila, J. Lynch, M. Townend, S. Agarwal,
A. Mazumder, G. J. Barker, S. Ourselin, J. H. Cole, and T. C. Booth, “Accurate brain-age models for
routine clinical MRI examinations,” NeuroImage, vol. 249, no. September 2021, 2022.
[64] C. Y. Kuo, T. M. Tai, P. L. Lee, C. W. Tseng, C. Y. Chen, L. K. Chen, C. K. Lee, K. H. Chou,
S. See, and C. P. Lin, “Improving Individual Brain Age Prediction Using an Ensemble Deep Learning
Framework,” Frontiers in Psychiatry, vol. 12, no. March, pp. 1–11, 2021.
[65] B. Couvy-Duchesne, J. Faouzi, B. Martin, E. Thibeau-Sutre, A. Wild, M. Ansart, S. Durrleman,
D. Dormont, N. Burgos, and O. Colliot, “Ensemble learning of convolutional neural network, support
vector machine, and best linear unbiased predictor for brain age prediction: ARAMIS contribution to
the predictive analytics competition 2019 challenge,” Frontiers in Psychiatry, vol. 11, 2020.
[66] G. Levakov, G. Rosenthal, I. Shelef, T. R. Raviv, and G. Avidan, “From a deep learning model back to
the brain—identifying regional predictors and their relation to aging,” Human Brain Mapping, vol. 41,
no. 12, pp. 3235–3252, 2020.
[67] S. M. Hofmann, F. Beyer, S. Lapuschkin, M. Loeffler, K.-R. Müller, A. Villringer, W. Samek, and A. V.
Witte, “Towards the Interpretability of Deep Learning Models for Human Neuroimaging,” bioRxiv,
no. June, p. 2021.06.25.449906, 2021. [Online]. Available: https://www.biorxiv.org/content/10.1101/
2021.06.25.449906v1%0Ahttps://www.biorxiv.org/content/10.1101/2021.06.25.449906v1.abstract
[68] K. M. Poloni, R. J. Ferrari, and A. D. N. Initiative, “A deep ensemble hippocampal cnn model for
brain age estimation applied to Alzheimer’s diagnosis,” Expert Systems with Applications, vol. 195, p.
116622, 2022.
[69] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmen-
tation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.
Springer, 2015, pp. 234–241.
[70] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional
networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017,
pp. 4700–4708.
[71] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol, and L. Bottou, “Stacked denoising
autoencoders: Learning useful representations in a deep network with a local denoising criterion.”
Journal of Machine Learning Research, vol. 11, no. 12, 2010.
[72] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabi-
novich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 2015, pp. 1–9.
[73] L. Lin, G. Zhang, J. Wang, M. Tian, and S. Wu, “Utilizing transfer learning of pre-trained AlexNet and
relevance vector machine for regression for predicting healthy older adult’s brain age from structural
MRI,” Multimedia Tools and Applications, vol. 80, no. 16, pp. 24 719–24 735, 2021.
[74] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet:
AlexNet-level accuracy with 50x fewer parameters and¡ 0.5 mb model size,” ArXiv Preprint
ArXiv:1602.07360, 2016.
[75] A. Lombardi, D. Diacono, N. Amoroso, A. Monaco, J. M. R. Tavares, R. Bellotti, and S. Tangaro,
“Explainable deep learning for personalized age prediction with brain morphology,” Frontiers in Neu-
roscience, vol. 15, p. 578, 2021.
[76] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” Advances in
Neural Information Processing Systems, vol. 30, 2017.
[77] M. T. Ribeiro, S. Singh, and C. Guestrin, “Anchors: High-precision model-agnostic explanations,” in
Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018.
26
[78] S. G. Popescu, B. Glocker, D. J. Sharp, and J. H. Cole, “A U-Net model of local brain-age,” bioRxiv,
2021.
[79] J. Lee, B. J. Burkett, H.-K. Min, M. L. Senjem, E. S. Lundt, H. Botha, J. Graff-Radford, L. R.
Barnard, J. L. Gunter, and C. G. Schwarz, “Deep learning-based brain age prediction in normal aging
and dementia,” Nature Aging, vol. 2, no. 5, pp. 412–424, 2022.
[80] P. Mouches, M. Wilms, D. Rajashekar, S. Langner, and N. Forkert, “Unifying brain age prediction and
age-conditioned template generation with a deterministic autoencoder,” in Medical Imaging with Deep
Learning. PMLR, 2021, pp. 497–506.
[81] P. Mouches, M. Wilms, D. Rajashekar, S. Langner, and N. D. Forkert, “Multimodal biological brain
age prediction using magnetic resonance imaging and angiography with the identification of predictive
regions,” Human Brain Mapping, no. September 2021, pp. 1–13, 2022.
[82] Y. Varatharajah, S. Baradwaj, A. Kiraly, D. Ardila, R. Iyer, S. Shetty, and K. Kohlhoff, “Predicting
Brain Age Using Structural Neuroimaging and Deep Learning,” bioRxiv, p. 497925, 2018. [Online].
Available: https://www.biorxiv.org/content/10.1101/546671v5
[83] S. He, Y. Feng, P. E. Grant, and Y. Ou, “Deep relation learning for regression and its application to
brain age estimation,” IEEE Transactions on Medical Imaging, 2022.
[84] A. Lombardi, A. Monaco, G. Donvito, N. Amoroso, R. Bellotti, and S. Tangaro, “Brain Age Predic-
tion With Morphological Features Using Deep Neural Networks: Results From Predictive Analytic
Competition 2019,” Frontiers in Psychiatry, vol. 11, no. January, pp. 1–15, 2021.
[85] J. Lee, B. Burkett, H.-k. Min, J. Graff-radford, C. Schwarz, and R. Petersen, “Deep learning-based
brain age prediction in normal aging and dementia.”
[86] P. Mouches, M. Wilms, D. Rajashekar, S. Langner, and N. D. Forkert, “Unifying Brain Age Prediction
and Age-Conditioned Template Generation with a Deterministic Autoencoder,” Proceedings of the
Fourth Conference on Medical Imaging with Deep Learning (MIDL 2021), vol. 143, pp. 497–506, 2021.
[Online]. Available: https://proceedings.mlr.press/v143/mouches21a.html
[87] X. Yuan, P. He, Q. Zhu, and X. Li, “Adversarial examples: Attacks and defenses for deep learning,”
IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 9, pp. 2805–2824, 2019.
[88] C. Lyu, K. Huang, and H.-N. Liang, “A unified gradient regularization family for adversarial examples,”
in 2015 IEEE International Conference on Data Mining. IEEE, 2015, pp. 301–309.
[89] X. Zhang, D. Wu, L. Ding, H. Luo, C.-T. Lin, T.-P. Jung, and R. Chavarriaga, “Tiny noise, big mistakes:
adversarial perturbations induce errors in brain–computer interface spellers,” National Science Review,
vol. 8, no. 4, p. nwaa233, 2021.
[90] Z. Qian, K. Huang, Q.-F. Wang, and X.-Y. Zhang, “A survey of robust adversarial training in pattern
recognition: Fundamental, theory, and methodologies,” arXiv preprint arXiv:2203.14046, 2022.
[91] B. Kim, R. Khanna, and O. O. Koyejo, “Examples are not enough, learn to criticize! criticism for
interpretability,” Advances in Neural Information Processing Systems, vol. 29, 2016.
[92] A. Barredo Arrieta, N. Dı́az-Rodrı́guez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia,
S. Gil-Lopez, D. Molina, R. Benjamins, R. Chatila, and F. Herrera, “Explainable Artificial Intelligence
(XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI,” Information Fusion,
vol. 58, pp. 82–115, 2020.
[93] A. Holzinger, M. Dehmer, F. Emmert-Streib, R. Cucchiara, I. Augenstein, J. Del Ser, W. Samek,
I. Jurisica, and N. Dı́az-Rodrı́guez, “Information fusion as an integrative cross-cutting enabler to achieve
robust, explainable, and trustworthy medical artificial intelligence,” Information Fusion, vol. 79, pp.
263–278, 2022.
27
View publication stats

DL-Brainage_review

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

DL-Brainage_review

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Deep Learning for Brain Age Estimation: A Systematic Review

Preprint · December 2022

M. Tanveer Mudasir Ganaie

SEE PROFILE SEE PROFILE

Iman Beheshti Tripti Goel

SEE PROFILE SEE PROFILE

The user has requested enhancement of the downloaded file.

Email addresses: mtanveer@iiti.ac.in (M. Tanveer), mudasirg@umich.edu (M. A. Ganaie),

Iman.beheshti@umanitoba.ca (Iman Beheshti), triptigoel@ece.nits.ac.in (Tripti Goel),

Reviewed Papers Dataset Deep

2. Deep Learning Architectures

2.1. Feed-forward neural network

2.2. Convolutional Neural Networks, ResNet and VGGNet

2.3. Ensemble Deep Learning

3. Taxonomy and Literature Review

3.2. Taxonomy and Literature Review

3.2.1. Convolutional Neural Networks

• Slice-based CNN models:

CNN VGGNet ResNet Ensemble Deep Miscellaneous

Slice-based Slice-based Slice-based Slice-based Slice-based

Voxel-based Voxel-based Voxel-based Voxel-based Voxel-based

• Slice-based VGGNet models:

• Voxel-based VGGNet models:

• Slice-based ResNet models:

• Voxel-based ResNet models:

• Voxel-based Ensemble Deep Learning models:

Four contributions in the reviewed literature can be classified as voxel-based ensembles

3.2.5. Miscellaneous Deep Learning Models

• Slice-based miscellaneous Deep Learning models:

• Voxel-based miscellaneous Deep Learning models:

2018 [82] - 3D CNN with 12,988 – RMSE= 5.54 and ADNI

2019 [42] T1 3D CNN 5,496 4.45 ± 3.59 – Rotterdam Study

2019 [52] T1 3D CNN 1,264 (Icelandic), 3.39 – Icelandic, IXI, UK Biobank

BGSP and SLIM

2020 [48] T1 3D CNN 19,687 2.71 ± 2.10 – UK Biobank

2020 [53] T1 3D CNN 21,382 2.87, 3.42 – UK Biobank

2020 [37] – 2D-CNN + RNN 10,446 2.86 RMSE= 3.61 UK Biobank

2020 [43] T1 3D 562 5 – IXI

2020 [65] T1 Ensemble of 2,640 3.33 – PAC2019 dataset

2021 [40] T1 3D CNN 1,016 2.19 ± 0.03 RMSE= ABIDE

2021 [75] T1 FCDNN 2,638 2.7 r = 0.86 PAC2019

2021 [84] T1 FFDNN 2,638 4.6 – PAC2019

2021 [39] T1 CNN+Transfer 2,543 3.3 r = 0.91 Multi-site dataset

2021 [85] T1 3D modified 4,127 3.43 ± 0.0545 – MAYO dataset

2021 [55] T1 ResNet 16,998 2.7 – UK Biobank

2021 [61] – Modified – 4.62 – PHOTON-AI

2021 [64] T1 Ensemble Deep 2,118 3.33 – PAC2019

2021 [67] T1 CNN 2637 3.38 − 5.07 – Private data

Brain Age Difference

Architecture 4.5 4.4 Significant Brain

4.1. Integration of Multimodal Data

4.3. Computational Complexity

4.4. Significant Brain Region Extraction

4.5. New Deep Learning Architectures

4.6. Adversarial Perturbations and Uncertainty Estimation

4.7. Explainability in Deep Learning for Brain Age Estimation

5. Concluding Remarks and Outlook

View publication stats

You might also like