Deep Learning Techniques in Extreme Weather Events: A Review
Shikha Verma, India Meteorological Department, Ministry of Earth Sciences, New Delhi
Kuldeep Srivastava, India Meteorological Department, Ministry of Earth Sciences, New Delhi
Akhilesh Tiwari, Indian Institute of Information Technology, Allahabad
Shekhar Verma, Indian Institute of Information Technology, Allahabad
arXiv:2308.10995v1 [physics.ao-ph] 18 Aug 2023
Abstract—Extreme weather events pose significant challenges, thereby demanding techniques for accurate analysis and precise forecasting to mitigate their impact. In recent years, deep learning techniques have emerged as a promising approach for weather forecasting and understanding the dynamics of extreme weather events. This review aims to provide a comprehensive overview of state-of-the-art deep learning in the field. We explore the utilization of deep learning architectures across various aspects of weather prediction, such as thunderstorms, lightning, precipitation, drought, heatwaves, cold waves and tropical cyclones. We highlight the potential of deep learning, such as its ability to capture complex patterns and non-linear relationships. Additionally, we discuss the limitations of current approaches and highlight future directions for advancements in the field of meteorology. The insights gained from this systematic review are crucial for the scientific community to make informed decisions and mitigate the impacts of extreme weather events.

Index Terms—Extreme Weather Events, Weather Prediction, Deep Learning

I. INTRODUCTION

Weather refers to short-term natural events that occur in a certain location and time, characterized by temperature, pressure, humidity, cloud cover, precipitation, wind speed and wind direction [1]. Extreme weather, on the other hand, refers to weather events that deviate significantly from the expected conditions. Instances of extreme weather include tropical cyclones, heatwaves, intense blizzards with heavy snowfall, excessive rainfall leading to flooding, and droughts [2] [3] [4] [5]. These occurrences pose serious challenges to society and the environment, requiring careful planning and necessary measures to mitigate their detrimental effects. Consequently, predicting weather holds significant importance.

Weather prediction relies on gathering information from weather stations, satellites, radar systems, weather balloons, and buoys to assess the current atmospheric conditions [1] [6]. Numerical weather prediction (NWP) utilizes mathematical models to simulate the behavior of the atmosphere, based on initial conditions collected from observational data [1] [6] [7]. Ensemble forecasting generates several forecasts with slight modifications in initial variables and model parameters to assess uncertainty and the likelihood of possible outcomes. Climate models utilize data assimilation, which integrates observational data with model output, to generate long-term weather trends with increased forecast precision.

Deep learning models are composed of multiple layers of interconnected artificial neurons [8]. The distinguishing feature of deep learning models is their ability to automatically learn and discover intricate patterns and features directly from the data, without the need for explicit feature engineering. This is achieved by passing the data through multiple layers of interconnected neurons, where each layer learns to extract increasingly abstract representations of the input data. The precision of weather forecasts is heavily dependent on historical data. However, the non-linear and complex nature of weather phenomena poses inherent challenges to achieving absolute precision. Although traditional methods, including statistical, dynamical and numerical models, have proven effective in forecasting weather events with considerable lead time [9], they encounter limitations in capturing intricate patterns and dynamics. As a result, accurate predictions remain difficult to achieve given the intricate nature of weather systems. Continuous advancements in research and the utilization of emerging technologies, such as deep learning, offer promising avenues for further enhancing weather prediction capabilities. This approach can enable more accurate weather predictions, including for severe weather events, supporting proactive measures to mitigate their impacts. Deep learning facilitates the integration of diverse data sources, including satellites, radars, and weather stations, to provide comprehensive and real-time meteorological insights for improved public safety and resilience.

This review is organized as follows: Section I introduces the paper by providing an overview of the challenges, advancements, and applications of weather forecasting using deep learning, as well as outlining the organization of the paper. Section II explores the realm of extreme weather events, highlighting the need for accurate weather prediction. In Section III, an extensive literature review is presented, focusing on the utilization of deep learning for extreme weather events. This section explores the existing research, highlighting the different approaches, models, and findings in the field. Section IV focuses on the challenges encountered in the field of deep learning for meteorology and weather forecasting. This section discusses the limitations, data issues, interpretability concerns, and other obstacles faced when applying deep learning techniques in this domain. Section V highlights how the integration of deep learning in extreme weather research advances predictive capabilities and emphasizes the potential of hybrid
models to enhance forecast accuracy and proactive mitigation strategies. Additionally, it outlines potential avenues for future research in weather forecasting using deep learning, discussing promising areas of exploration, methodologies, and potential advancements to enhance the accuracy and efficiency of deep learning models for weather prediction. Finally, Section VI offers a concise conclusion that summarizes the key findings and contributions of the study.

II. EXTREME WEATHER EVENTS

Exploring the fundamental elements that govern weather patterns and shape the dynamics of the atmosphere helps to obtain a broader perspective [1]. Temperature plays a pivotal role in determining the thermal state of the atmosphere, thereby exerting a substantial influence over the behavior of gases and liquids and over human comfort. Air pressure, commonly referred to as atmospheric pressure, significantly shapes weather patterns by creating low- and high-pressure systems. These systems contribute to a wide array of meteorological conditions, including notably adverse conditions in the case of low-pressure systems. Pressure differences give rise to winds that flow from high-pressure areas to low-pressure areas, enabling the essential mechanism of air circulation. Humidity, the amount of water vapour present in the atmosphere, is directly related to temperature; higher temperatures allow the air to hold more water vapour, leading to the formation of clouds. The extent of cloud cover plays a decisive role in modulating solar radiation reaching the Earth’s surface, thereby affecting the temperature profile and atmospheric dynamics. Precipitation, which includes rain, snow, and hail, occurs when moisture in the atmosphere condenses and is then released from clouds. This process constitutes a primary mechanism through which atmospheric water is returned to the Earth’s surface [10]. When these atmospheric elements surpass anticipated norms, they can trigger a diverse range of extreme weather events that carry significant consequences. These include heatwaves posing health risks and intensifying wildfires [11], intense precipitation leading to flooding and landslides, warm ocean conditions fueling cyclones with their devastating winds and storm surges, severe thunderstorms generating tornadoes and hailstorms, prolonged droughts affecting agriculture and water supply, snowstorms and blizzards disrupting transportation and infrastructure, and freezing rain causing damaging ice storms. Monitoring and understanding these factors are essential for early detection, preparedness, and effective management to mitigate the potential impacts of these diverse extreme weather phenomena [12]. To achieve this, ensemble forecasting emerges as a powerful tool, generating several forecasts with slight modifications in initial variables and model parameters to assess uncertainty and the likelihood of possible outcomes. This approach is complemented by the utilization of climate models, which employ data assimilation techniques. By integrating observational data with model outputs, these models generate long-term weather trends with increased forecast precision. These forecasts enable decision-makers and emergency response teams to better understand the level of risk associated with different weather events and make informed choices to protect communities and infrastructure.

III. DEEP LEARNING IN EXTREME WEATHER EVENTS

Deep learning models are revolutionizing extreme weather prediction by leveraging diverse data sources to accurately forecast events such as cyclones, heatwaves, heavy rainfall, and severe storms. Their ability to analyze complex patterns and relationships enables early warning systems and proactive mitigation strategies, with the potential to minimize the impacts of extreme weather on society and the environment.

A Deep Neural Network (DNN) has been developed for an early warning system to predict extreme weather events such as floods, droughts, and heatwaves. The DNN approach effectively downscales and bias-corrects coarse-resolution seasonal forecast ensembles, generating realistic, high-resolution climate information. The study demonstrates that the DNN model accurately predicts extreme values while preserving the physical relationships and trends in the variables [13]. Researchers aim to improve the forecast of severe convective weather (SCW), including thunderstorms, short-duration heavy rain, hail, and convective gusts, by employing a deep CNN algorithm to effectively extract the characteristics of SCW and achieve better forecast performance than traditional machine learning algorithms [14]. Deep learning methods have also been proposed for detecting and forecasting anomalies in spatiotemporal data. The choice of learning task depends on whether the anomalies are known or unknown. Anomalies are often imbalanced and require specific data pre-processing. Leveraging diverse data sources and modelling techniques, deep learning models exhibit strong capabilities in accurately forecasting flood occurrence, severity, and spatial distribution. These models play a crucial role in real-time monitoring, enabling timely response and effective mitigation strategies. A Wasserstein Generative Adversarial Network (WGAN) is utilized for downscaling tropical cyclone rainfall to hazard-relevant spatial scales [15]. Additionally, a hybrid approach combining WGAN and Variational Autoencoder GAN (VAEGAN) is introduced to enhance the resolution of rainfall measurements from 100 km to 10 km, showing realistic power spectra for various wavenumbers [16]. A deep learning-based technique employing a fully connected neural network is proposed to accurately predict rainfall-induced shallow landslides across Italy [17].

A. Deep Learning in Thunderstorm and Lightning

Thunderstorms are complex atmospheric phenomena characterized by a combination of thunder, lightning, heavy rainfall, and strong winds, with lightning resulting from electrical discharges within clouds or between clouds and the ground. In one study, two hybrid models, EEMD-ANN and EEMD-SVM, are developed for predicting thunderstorm frequency (TSF) in Bangladesh. These models utilize ensemble empirical mode decomposition (EEMD) to extract relevant features for accurate prediction. The EEMD-ANN model consists of an input layer with 11 variables, two hidden layers (4 and 2 neurons), a sigmoid activation function, and a learning rate of 0.1. EEMD-SVM, on the other hand, employs various kernel functions for effective handling of non-stationary TSF data. The input variables include CAPE, CPRCP, CRR, DP, KI,
Task                               Approach                       Ref
Thunderstorm Prediction            EEMD-ANN, EEMD-SVM, ARIMA      [18]
                                   LRCN-CNN, LSTM                 [19]
Thunderstorm Severity Prediction   LSTM-FC, CNN-LSTM, ConvLSTM    [20]
Lightning Prediction               RNN                            [21]
                                   ResNet                         [22]
Lightning Identification           CNN                            [23]
TABLE I: Deep Learning in Thunderstorm and Lightning
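Several of the studies listed in Table I train on strongly imbalanced labels (lightning or severe-storm events are rare) and report class-balanced cross-entropy, weighted binary cross-entropy, or Focal Loss as the training objective [19], [21], [22]. The following is a minimal pure-Python sketch of two such losses, not the implementation used in any of the cited papers. The function names and the default values of `pos_weight`, `gamma`, and `alpha` are illustrative assumptions; the review does not report the weights the authors used.

```python
import math


def weighted_bce(y_true, p_pred, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with extra weight on the positive (event) class.

    pos_weight is a hypothetical knob: values > 1 penalise missed events more.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class, so well-classified
    samples (p_t close to 1) are down-weighted by the (1 - p_t)**gamma factor.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


# Toy example: one lightning pixel (label 1) among three non-lightning pixels.
labels = [1, 0, 0, 0]
probs = [0.3, 0.1, 0.2, 0.05]
plain = weighted_bce(labels, probs)
weighted = weighted_bce(labels, probs, pos_weight=5.0)
focal = focal_loss(labels, probs)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half of the unweighted cross-entropy, which is a convenient sanity check; larger `gamma` shifts the training signal toward the hard, misclassified samples.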
PRCP, RH, ST, TSD, TT, and WS50. EEMD, based on the Hilbert-Huang transform, mitigates the challenges of EMD by introducing Gaussian white noise. This enables precise decomposition of time series data into intrinsic mode functions (IMFs), revealing underlying patterns. ARIMA models effectively handle non-stationary time series data. The hybrid models, EEMD-ANN and EEMD-SVM, capitalize on EEMD’s capabilities to handle non-stationary data and capture nonlinear relationships. These models outperform models such as ANN, SVM, and ARIMA in terms of prediction accuracy, with improvements ranging from 8.02% to 22.48% across TSF categories [18]. In another study, the aim is to predict severe thunderstorm occurrences through an innovative approach using lightning and radar video data in the Liguria region of Italy. The ensemble technique outperforms traditional methods that optimized standard quality-based scores. The architecture involves an LRCN, which combines a CNN for extracting spatial features and an LSTM network for analyzing sequential aspects. The training process spans 100 epochs, employing the Adam optimizer with a learning rate of 0.001 and a mini-batch size of 72. The training process incorporates a class-balanced cross-entropy loss function to fine-tune the model’s performance. The model’s reliability is validated using a historical radar video dataset comprising CAPPI images at 2 km, 3 km, and 5 km above sea level, demonstrating its effectiveness in probabilistic forecasting of severe thunderstorms [19]. Another study assesses the accuracy of various LSTM neural network variants in predicting thunderstorm severity from remote sensing weather data; its primary objective is to quantitatively forecast the intensity of thunderstorms by analyzing the frequency of lightning flashes using deep learning models. The study employs two main datasets: SALDN lightning detection network data and SAWS weather station data. These datasets are used to train and evaluate different LSTM neural network variants, including LSTM-FC, CNN-LSTM, and ConvLSTM models. The LSTM-FC model consists of three LSTM layers and one dense layer, with an optimizer based on the Adam algorithm. The activation function used is the Leaky Rectified Linear Unit with an alpha value of 0.15. Similarly, the CNN-LSTM model comprises two Conv2D layers, one LSTM layer, and one dense layer. This model also employs the Adam optimizer and utilizes the Leaky Rectified Linear Unit as the activation function, with an alpha value of 0.05. The ConvLSTM model is structured with two ConvLSTM2D layers and two dense layers. The same Adam optimizer is employed, and the activation functions are set to Leaky Rectified Linear Unit (with an alpha value of 0.05) and Rectified Linear Unit. The models are trained and evaluated using hourly lightning flash data and weather variables based on the MAE and MSE. Among the various LSTM model variants, the CNN-LSTM model outperforms the other models with an MAE of 51 flashes per hour because of its ability to capture spatio-temporal features, leading to more accurate predictions of thunderstorm severity [20].

To predict the occurrence of lightning, a data-driven neural network model called Attention-Based Dual-Source Spatiotemporal Neural Network (ADSNet) is introduced. ADSNet is designed for accurate hourly lightning forecasting and utilizes both numerical simulations and historical lightning observations. A diverse dataset, combining WRF simulation data with Cloud-to-Ground Lightning Location System (CGLLS) observations from North China, is employed. The model consists of dual RNN encoder-decoder units, several CNN modules, DCNN modules, and attention mechanisms. ConvLSTM is chosen for its adeptness in capturing intricate spatiotemporal dependencies. This framework is tailored for conducting 12-hour lightning forecasts in the North China region. The model adopts the Adam optimizer with an initial learning rate of 0.0001 and Weighted Binary Cross-Entropy as a loss function. Experimental results validate the superiority of ADSNet over baseline methods in terms of lightning forecast accuracy [21]. In another study, an approach known as Lightning Monitoring Residual Network (LM-ResNet) is introduced, leveraging deep learning for effective lightning location monitoring in Ningbo, China. By transforming the task into binary classification, radar data (PPI, CR, ET, V) and essential land attributes (DEM, aspect, slope, land use, NDVI) are harnessed to create a comprehensive lightning feature dataset. LM-ResNet employs Rectified Linear Unit (ReLU) activation for effective learning and addresses data imbalances through Focal Loss, a specialized cross-entropy-based loss function. The model’s training configuration includes an initial learning rate of 0.1 and utilizes the SGD optimizer with a batch size of 64, incorporating a momentum of 0.9 and a weight decay of 0.0004 to enhance learning while mitigating overfitting. The study demonstrates LM-ResNet’s superiority over competing architectures like GoogLeNet and DenseNet, highlighting its potential for accurate and reliable lightning incident tracking [22].

An approach called Lightning-SN is introduced, designed
for precise cloud-to-ground (CG) lightning identification using deep learning techniques. This model effectively utilizes S-band Doppler radar data and CG lightning records of the Ningbo area in Zhejiang Province, China, collected from August 2009 to December 2021 via the ADTD lightning positioning system. Lightning-SN leverages an encoder-decoder structure with 25 convolutional layers, five pooling layers, five upsampling layers, and a sigmoid activation function layer. The architecture capitalizes on symmetry, boundary preservation techniques, and a 1x1 convolution kernel in the final layer. The model’s optimization is driven by the Adam optimizer and guided by the GHM loss function. Training involves the backpropagation (BP) algorithm, employing iterative refinement and validation testing. Additionally, the study includes a comprehensive comparative analysis with other semantic segmentation algorithms (FCNN, DeepLab-V3, and BiSeNet) evaluated under identical conditions. Lightning-SN demonstrates substantial performance improvements over traditional threshold-based methods, particularly in scenarios involving high-resolution radar data [23].

B. Deep Learning in Precipitation

Precipitation is the process by which water, in either liquid or solid form, falls from the atmosphere to the Earth’s surface. Hail, snow, and rainfall are the three most common types of precipitation. The most frequent type of precipitation is rain, which occurs when water droplets congregate and become heavy enough to fall to the ground. When raindrops are pushed higher into the freezing parts of the sky during violent thunderstorms, they freeze and accumulate in layers, resulting in hailstones of varied sizes that can cause property and crop damage. Snow is formed when water vapour condenses directly into ice crystals in cold atmospheric conditions.

Accurate forecasting and understanding of precipitation patterns are critical for many industries, including agriculture, water resource management, and transportation.

1) Rainfall: Deep learning uses meteorological data such as historical rainfall records, satellite images, and atmospheric conditions. Rainfall forecasts give essential information for disaster preparedness, agricultural planning, water resource management, and climate modelling.

A nowcasting model is designed to address extreme weather phenomena encompassing both precipitation and landfalling hurricanes. The research employs a comprehensive dataset spanning five years (2015–2020) of radar observations over South Texas, including 22 hurricane events that occurred in the United States. The architectural design of the model comprises four core components: RNN, up-sample, down-sample, and convolution. The architecture is built upon a three-layer encoder-decoder structure, incorporating distinct filter arrangements for the RNN, while integrating convolution and deconvolution operations. GRU is selected as the foundational RNN unit, organised in multiple layers to effectively capture intricate spatiotemporal patterns. The model effectively predicted future radar reflectivity echo maps based on five preceding observations, enabling forecasts for up to a 3-hour lead time. Model parameter optimization is achieved using the Adam optimizer, with a learning rate of 10^-4 and a momentum of 0.5. To further enhance predictive performance, the research incorporates Balanced Mean Squared Error (B-MSE) and Balanced Mean Absolute Error (B-MAE) as loss functions. The model’s forecasting capabilities are evaluated using established metrics (HSS, CSI, POD, and FAR), all of which highlight its proficiency in precipitation nowcasting [24]. MetNet-2, a deep neural network-based weather model, outperforms existing physics-based models in predicting high-resolution precipitation up to 12 hours ahead. The study utilizes data sources such as MRMS, GOES-16, and HRRR datasets. Input observations from various sources, including radar, satellite, and assimilation features, are processed through a CNN to capture temporal dynamics. Efficient computation is achieved through model parallelism across 16 interconnected TPU cores, allowing accurate forecasts over a 512 km x 512 km target patch. The model’s architecture consists of three stacks of 8 residual blocks with exponentially increasing dilation factors. Operating within the Continental United States, MetNet-2 generates forecasts at a 2-minute frequency with a spatial resolution of 1 km. It operates within a probabilistic framework, producing categorical predictions across 512 precipitation levels for each target position. The model’s performance exceeds that of the High-Resolution Ensemble Forecast (HREF) when assessed using the Continuous Ranked Probability Score (CRPS) [25]. Another study uses deep learning techniques to merge precipitation data from diverse sources across the Tibetan Plateau, with the aim of enhancing data precision. The study explores three methodologies: ANN, CNN and a statistical Extended Triple Collocation (ETC) method. The neural network architecture employed consists of an ANN with four fully connected layers and a CNN enhanced by two additional convolutional layers to capture spatial features. To mitigate overfitting, dropout layers with a 0.1 dropout rate follow each fully connected or convolutional layer. The optimization employs the Adam algorithm with a learning rate of 0.0001, the RMSE serves as the loss function, and the ReLU function acts as the activation function. The hyperparameters consist of 500 epochs and a batch size of 2500. Meteorological and hydrological evaluations reveal that the CNN approach consistently demonstrates superior performance compared to the others, showcasing enhanced spatial distribution and heightened accuracy. The meteorological evaluation employs eight metrics: CC, BIAS, STDRATIO, MAE, RMSE, POD, FAR, and CSI. The hydrological assessment utilizes NSE and PBIAS for model parameter validation, with KGE employed to counter NSE’s flow peak bias and emphasize runoff variability [26].

Using the U.S. Weather Surveillance Radar-1988 Doppler (WSR-88D) observations dataset, researchers developed four DL models for Quantitative Precipitation Estimation (QPE) with a CNN-VGG architecture. These models, named RQPENetD1, RQPENetD2, RQPENetV, and RQPENetR, incorporate dense blocks, RepVGG blocks, and residual blocks. The architecture of RQPENetD1 features an initial convolution layer, four dense blocks with varying bottleneck layers, and
Task                                    Approach           Ref
Precipitation Forecast                  RNN                [24]
                                        CNN                [25]
Precipitation Data Merging              CNN, ANN           [26]
Quantitative Precipitation Estimation   CNN-based          [27]
TPW and CAPE Estimation                 MLP                [28]
Hailstorm Detection                     CNN, DNN           [29]
Hailstorm Forecast                      Autoencoder, CNN   [30]
                                        CNN                [31]
Hail Size Estimation                    PCA, BPNN          [32]
Cloud or Snow Identification            DeepLab-CRF        [33]
                                        CNN                [34]
                                        U-Net              [35]
                                        U-Net              [36]
                                        U-Net              [37]
                                        CNN                [38]
Snow Depth Estimation                   BPNN               [39]
                                        CNN, ResNet        [40]
                                        deep CNN           [41]
Snow Water Equivalent Estimation        ANN, ANFIS         [42]
                                        MNLR, NNGA         [43]
TABLE II: Deep Learning in Precipitation
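Several precipitation studies in Table II are verified with categorical scores such as POD, FAR, CSI, and HSS [24], [26], [27]. The review does not spell these out, so the sketch below uses the standard 2x2 contingency-table definitions from forecast verification; the function name, the `threshold` parameter, and the toy data are illustrative assumptions rather than anything taken from the cited papers.

```python
def contingency_scores(observed, forecast, threshold=0.5):
    """Categorical verification scores for a yes/no event.

    An event is counted where the value is >= threshold (for rainfall this
    could be an amount in mm/h; the threshold is an assumed parameter).
    Returns POD, FAR, CSI and HSS; a score is NaN if its denominator is zero.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for obs, fcst in zip(observed, forecast):
        obs_event = obs >= threshold
        fcst_event = fcst >= threshold
        if fcst_event and obs_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif fcst_event:
            false_alarms += 1
        else:
            correct_negatives += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        "POD": ratio(a, a + c),      # probability of detection
        "FAR": ratio(b, a + b),      # false alarm ratio
        "CSI": ratio(a, a + b + c),  # critical success index
        # Heidke skill score: accuracy relative to random chance
        "HSS": ratio(2 * (a * d - b * c), (a + c) * (c + d) + (a + b) * (b + d)),
    }


# Toy example with ten grid points (1 = event, 0 = no event).
obs = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
fc = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
scores = contingency_scores(obs, fc)
```

POD and CSI are best at 1 and FAR is best at 0; HSS is 1 for a perfect forecast and 0 for a forecast with no skill over chance. Because CSI ignores correct negatives, it is often preferred for rare events such as heavy rain or hail.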
transition layers for spatial reduction. It processes 3-D radar data from two elevation angles to estimate rainfall rate using a fully connected layer with adaptive average pooling and utilizes MSE as the loss function. RQPENetD2 shares a similar structure with dense blocks featuring (24, 16) and (36, 24) bottleneck layers, along with transition layers involving 1x1 convolution and average pooling. RQPENetV incorporates RepVGG blocks in a multi-branch structure across five stages, while RQPENetR utilizes residual modules in four sequential blocks with varying bottleneck layers for feature enhancement from 3-D radar data. The evaluation of RQPENet’s radar precipitation estimation includes metrics such as RMSE, MAE, CC, and NSE, along with additional atmospheric science metrics: POD, FAR, CSI, HSS, and GSS. The findings indicate the superior performance of the dense-block-based models, RQPENetD1 and RQPENetD2, compared to the residual-block and RepVGG-block-based models, as well as five traditional Z-R relations [27].

In a recent study, researchers propose an ANN model with incremental learning to derive total precipitable water (TPW) and convective available potential energy (CAPE) from GEO-KOMPSAT-2A satellite imagery over Northeast Asia. The study utilizes AMI satellite imagery, ERA5 data, and radiosonde observations for training and evaluation. An MLP feedforward backpropagation ANN model is employed for the retrieval algorithm. The model architecture includes an input layer with 20 neurons, a hidden layer with 40 neurons using a hyperbolic tangent activation function, and an output layer with a linear activation function. The optimization process utilizes the Adam optimizer with a mean squared error loss function. The accuracy assessment uses statistical metrics, including correlation coefficient, bias, and RMSE. The incremental ANN model demonstrates improved accuracy and stability compared to static learning methods, indicating its potential to accurately estimate TPW and CAPE [28].

2) Hail: Hail prediction helps to improve our understanding of and readiness for this dangerous weather occurrence. Improved hail forecasting accuracy allows advance warnings that help prevent possible damage to infrastructure, agriculture, and communities. Because hail storms may have significant socioeconomic consequences, integrating deep learning techniques into hail prediction models is critical for timely and effective risk management and disaster response measures.

In a test case study, researchers applied deep learning networks for hailstorm detection using CNN and DNN architectures. The approach involves training these networks on GOES satellite imagery and MERRA-2 atmospheric parameters to identify hail storms. Different architectures are utilized, including a CNN for processing satellite imagery and a DNN for atmospheric parameters, aimed at capturing pertinent features. The CNN architecture for satellite imagery uses four convolutional layers with ReLU activation functions, combined with max-pooling layers for downsizing and a batch normalization layer for streamlined training. Fully connected layers are also integrated into the architecture to enable classification. Concurrently, the DNN architecture for atmospheric parameters features four fully connected layers with ReLU activation, utilizing a Softmax function for classification. Both architectures converge within a merged network, amalgamating outputs from the CNN and DNN via concatenation and incorporating
additional fully connected layers for the final classification. This approach harnesses the capabilities of deep learning to enhance hail detection by merging multi-source data and recognizing spatial patterns. The CNN model achieves heightened precision by accurately identifying decreased infrared brightness temperatures linked to hail storms [29]. In an effort to forecast hailstorms, researchers introduced an architecture that comprises three distinct models: an Autoencoder (AE) with encoder and decoder layers, each containing 32 neurons; a CNN constructed with CNN layers featuring 64 and 32 filters; and an RF model characterized by an ensemble of decision trees and decision tree aggregation through majority voting. Both AE and CNN are optimized using the Adam optimizer and MSE as the loss function. The dataset utilized in the study consists of observations from the TRMM and reanalysis data from the ECMWF spanning one year. The selected attributes for training the models include convective potential energy, convective inhibition, wind shear within the 1–3 km range, and warm cloud depth. The study aims to predict global hailstorms using these models and to compare their performance in terms of accuracy, precision, and error rates. Contrary to expectations, RF outperforms the deep learning methods in terms of hailstorm prediction performance [30]. In a study focusing on severe hail prediction, researchers employed a CNN to encode spatial weather data and compared its performance with traditional statistical approaches like Logistic Mean and Logistic PCA. The dataset utilized in this study includes geopotential height, temperature, dewpoint, zonal wind, and meridional wind variables from the NCAR ensemble model output. These variables are collected at different pressure levels: 500 hPa, 700 hPa, and 800 hPa. The study uses upper-air dynamic and thermodynamic fields from an NCAR NWP model. The CNN architecture used in this study comprises three strided convolutional layers with 5x5 gridcell filters. A range of hyperparameters is tested, including the initial number of filters, dropout rates, activation functions (ReLU and Leaky ReLU), L2 norm regularization coefficients, and optimizers (Stochastic Gradient Descent and Adam) with different learning rates. The model’s evaluation is conducted using the Brier Score as the prediction error function, and standard probabilistic verification metrics are employed to assess the quality of probabilistic forecasts. The results demonstrate a significant enhancement in various measures of prediction skill achieved by the CNN architecture, leading to improved probabilistic predictions when compared

3) Snow: Understanding snowfall patterns is critical for many industries, including transportation, agriculture, and disaster planning, leading to more effective resource management and risk mitigation techniques. However, accurately differentiating between snow and clouds is difficult because of their comparable white appearance in satellite imagery, which makes precise discrimination especially important. Therefore, in an attempt to accurately identify cloud and snow in high-resolution remote sensing images, a study introduced the DeepLab v3+ neural network with a CRF model. The research utilized data from China’s Gaofen-1 (GF-1) satellite’s Wide Field of View (WFV) sensor, comprising four bands and a spatial resolution of 16 m, encompassing a total of ten images spanning three years. DeepLab v3+ adopts an encoder-decoder architecture and employs the Adam optimizer with a learning rate of 0.001, a batch size of 5, and 200 epochs. The study analyzes accuracy variations resulting from distinct loss functions, including the Cross Entropy (CE) loss function, Dice loss function, and Focal loss function. Evaluation metrics encompass Mean Intersection over Union (MIoU) and Mean Pixel Accuracy (MPA) to assess model performance. This methodology effectively mitigates misclassification issues, enhancing cloud and snow identification precision through refined boundary delineation and reduced isolated patches [33]. An end-to-end fully convolutional network with a multiscale prediction approach is proposed to differentiate cloud and snow using a dataset of 50 high-resolution Gaofen satellite images (13400×12000 pixels each), meticulously labeled for cloud and snow regions. The network adopts the VGG network architecture with stride reduction and atrous convolution techniques. Due to the frequent co-occurrence of snow and cloud in images, a pixel-level approach is employed, involving the replacement of the last two fully connected layers in the VGG model with two convolutional layers. The final layer employs a three-class softmax loss for classifying snow, cloud, and other land types, using batch normalization and rectified linear units. The Multiscale Prediction Module merges feature maps from diverse intermediate layers, functioning as an ensemble learning approach. This approach allows for the simultaneous utilization of low-level spatial information and high-level semantic information, enabling accurate differentiation between cloud and snow [34]. A deep learning-based method is developed by utilizing the Unet3+ network with Resnet50 and the Convolutional Block Attention
to the logistic PCA approach [31]. Module (CBAM) to accurately detect cloud and snow in
Accurately estimating hail size remains crucial for remote sensing images. This approach effectively eliminates
evaluating potential damage caused by hailstorms. In order interference information. The feature extraction process of
to address this challenge, a model consisting of two main UNet3+ includes five encoders with effective convolutional
components has been proposed. The PCA-based technique and pooling layers. In an enhanced version, multiple convo-
selects 18 features that strongly correlate with hail sizes, while lutional layers, regularization, ReLU activation, and residual
the BPNN regression model with a two-layer architecture and modules are added to each of the five encoders, resulting in
35 hidden layer neurons is employed to estimate the size of feature graphs. The decoders consist of convolution and acti-
hailstones from satellite images. Using a MSE loss function, vation layers. To address bias, a weighted cross-entropy loss is
the BPNN regression model achieves an R-squared value of employed, emphasizing cloud and snow regions. For enhanced
0.52 through linear fitting when assessing the correspondence focus and deeper feature extraction, the Convolutional Block
between predicted and observed Maximum Hail Diameters Attention Module (CBAM) is incorporated into ResNet50.
on the test set [32]. CBAM harnesses channel attention through global pooling,
multi-layer perceptron processing, and sigmoid activation to generate attention feature maps. The
model’s performance is assessed using various metrics, including Mean Intersection over Union
(mIoU), Mean Pixel Accuracy (mPA), Mean Precision (mPrecision), and Estimated Total Size. These
evaluations successfully mitigate interference data, resulting in accurate cloud and snow extraction
from diverse landforms within remote sensing images [35]. The effectiveness of U-Net based deep
learning models in delineating glacier boundaries and identifying snow/ice is demonstrated in a
study that developed ENVINet5 and ENVINet-Multi deep learning classifiers to analyze Landsat-8
satellite data over the Bara Shigri glacier region in Himachal Pradesh, India. The ENVINet5
architecture, based on a mask-based encoder-decoder U-Net model, is employed for single-class
categorization, while ENVINet-Multi is used for multi-class classification of features like snow,
ice, and barren areas. The ENVINet5 architecture comprises five levels with twenty-three
convolutional layers, incorporating input patches, feature maps, various convolutions, feature
fusion, max-pooling, co-convolution, and 1x1 convolutions. For ENVINet-Multi, training parameters
include 25 epochs, a patch sampling rate of 16, class weight of 2.5, loss weight of 0.5, 200 patches
per epoch, 464x464 pixel patch size, and 2 patches per batch [36].

The performance of U-Net, RF, and Sen2Cor models for snow coverage mapping is compared in a study
using Sentinel-2 satellite multispectral images across 40 diverse sites spanning all continents
except Antarctica. A Random Forest model is built using Bayesian Hyperparameter Optimization for
improved performance. The Sentinel-2 Level-2A product incorporates cloud and snow confidence masks
derived from Sen2Cor, which employs threshold tests on spectral bands, ratios, and indices like NDVI
and NDSI. A U-Net network architecture is employed, featuring an encoding path with repeated 3x3
convolutions, batch normalization, and ReLU activation, followed by 2x2 max pooling for
downsampling. The decoding path utilizes transpose convolutions for upsampling, concatenating with
corresponding encoding path features, and applying 3x3 convolutions with BN and ReLU. The final
layer comprises a 1x1 convolution for class prediction. Training involves weighted cross-entropy
loss and stochastic gradient descent with a learning rate of 0.01 and momentum of 0.9, resulting in
effective semantic segmentation. The model’s performance is assessed using precision, recall, F1
score, Intersection Over Union (IoU), and accuracy metrics. The results demonstrate that U-Net
models exhibit superior performance compared to RF and Sen2Cor in accurately mapping snow coverage
[37]. An open-source machine learning-based system for snow mapping, AutoSMILE, was developed to
automate the process using image processing, machine learning, deep learning, and visual inspection.
It was applied in a mountainous area in the northern Tibetan Plateau using RF and CNN algorithms,
achieving accurate snow cover mapping. The CNN architecture comprises four layers: convolutional
layers for feature extraction, activation layers like ReLU to expedite training, pooling layers for
non-linear downsampling, and fundamental components like fully connected and flatten layers. For
model evaluation, key metrics include producer’s accuracy (PA), user’s accuracy (UA), intersection
over union (IoU), and overall accuracy (OA) [38].

A deep learning approach is introduced to downscale snow depth retrieval across an alpine region by
integrating satellite remote-sensing data with diverse spatial scales and characteristics. The study
focuses on collaborative snow parameter retrieval in Northern Xinjiang, China, utilizing MODIS and
MWRI data. A three-hidden-layer neural network is designed with 20, 20, and 10 neurons in the
respective layers. The network processes resampled BTD, topographic, and meteorological data at a
500 m resolution, utilizing a sigmoid function for capturing nonlinear patterns. Backpropagation,
guided by MSE and SGD, optimizes the network to enhance the accuracy of snow depth observations. The
optimization process entails weight and bias adjustments, informed by a learning rate of 0.001. This
approach aims to boost the precision of snow depth observations through deep neural network
downscaling. Reference data from ground station snow depth measurements are employed to evaluate the
downscaling model’s performance and retrieval accuracy, employing assessment metrics encompassing
R2, RMSE, PME, NME, MAE, and BIAS [39].

A deep learning model is presented for ’area-to-point’ snow depth estimation, which integrates AMSR2
TB, MODIS, and NDSI data, achieving high accuracy with a spatial resolution of 0.005°. The model
utilizes CNN and residual blocks to capture spatial heterogeneity and leverage high-resolution snow
information from MODIS. The CNN comprises convolutional and ReLU activation layers, pooling for
downsampling, measures to prevent overfitting, and concludes with a fully connected layer for
output. The proposed deep residual network contains a 35-layer input patch, applies convolutions
with batch normalization and max pooling, followed by 4 residual blocks for feature extraction.
After adaptive average pooling and fully connected layers, it predicts snow depth at the patch
center. It has 9 convolutions and 4 fully connected layers, using ReLU activation except for the
linear output layer. The model is trained for 50 epochs, using a learning rate of 0.0001, with
exponential decay of 0.5 every 20 epochs, and a batch size of 32, while employing stochastic
gradient descent (SGD) as the optimizer. The evaluation metrics in this study include RMSE, MAE,
MBE, and R2. The results demonstrate that by incorporating spatial heterogeneity and leveraging
high-resolution MODIS snow cover data, the proposed model achieves promising accuracy in snow depth
estimation, with potential applicability to other regions [40].

A novel inverse method is presented for extracting snow layer thickness and temperature from passive
microwave remote sensing data. Utilizing convolutional, pooling, and fully-connected layers, the
study employs a ConvNet to inversely estimate the thickness and temperature of a snowpack from its
corresponding vertical polarization brightness temperature and horizontal polarization brightness
temperature. The model chooses the Adam optimizer with a learning rate of 0.01 to optimize the half
mean squared error loss function, and L2 regularization is employed to enhance prediction accuracy
by mitigating over-fitting. Furthermore, a comparative analysis is conducted between the ConvNet
outcomes and those of conventional ANN and SVM. The model assessment is carried
out using RMSE and R2 metrics, underscoring the effectiveness of the advanced ConvNet approach. The
utilized ANN architecture comprises three layers (input, hidden, and output) with 20 hidden-layer
units utilizing hyperbolic tangent basis functions [41].

A study on predicting snow water equivalent (SWE) in a semi-arid region of Iran was conducted using
regression, ANN, and adaptive neuro-fuzzy inference system (ANFIS) models. The study proposes a
three-layer ANN alongside ANFIS, which integrates ANN with fuzzy logic, utilizing a five-layer
structure based on the Sugeno model featuring two fuzzy if-then rules. In the ANFIS architecture, a
hyperbolic tangent activation function is utilized in the hidden layer, and optimal neuron counts
are ascertained for both hidden and input layers through iterative refinement. The handling of
numerous independent input variables in the ANFIS approach is accomplished using backpropagation
training and a sub-clustering method. The assessment of ANN, ANFIS, and regression models involves
statistical metrics such as MBE, MAE, RMSE, correlation coefficient, relative error percentage, and
Nash-Sutcliffe efficiency coefficient. The results demonstrate the superior performance of both ANN
and ANFIS models compared to the regression method, with ANN and ANFIS exhibiting similar prediction
accuracy for SWE [42]. In another study, researchers focused on estimating SWE in the Samsami basin
of Iran using MNLR, NNGA, and ANN architectures. The study aims to estimate snow water equivalent, a
critical component of water resources in mountainous areas, based on climatic and topographic
parameters such as elevation, slope, aspect, longitude, and latitude. The MNLR architecture employed
in the study aims to model the complex non-linear relationship between SWE and a set of independent
parameters. Moreover, four different ANN architectures are investigated: MLP for supervised
prediction with input, hidden, and output layers; GFF for efficient problem-solving through
multi-layer connections; RBF for rapid learning via self-organizing hidden layers; and MNN, a
specialized MLP with parallel sub-modules for specialized function and faster training. The study
evaluates these architectures to enhance the prediction of SWE values. The NNGA model utilizes
genetic algorithms to optimize neural network parameters, enhancing accuracy by iteratively refining
weights through selection, crossover, and mutation. Additionally, six diverse learning algorithms
are examined for training neural network components in the NNGA model. These algorithms include
Levenberg-Marquardt for adaptive MSE minimization, Delta-Bar-Delta for efficient step size
adaptation, Step for gradient descent with step size adjustment, Momentum for inertia-infused
gradient descent, Conjugate Gradient for second-order optimization, and Quickprop for error surface
curvature-based weight adjustments. The study evaluates three activation functions (Sigmoid, Tanh,
and Linear) and assesses all models using standard statistical criteria, including correlation
coefficient, RMSE, ratio of average estimated to observed values, and MAE. The NNGA model,
specifically the NNGA5 variant, proves to be the most effective approach, offering valuable insights
for water resource management in mountainous regions [43].

C. Deep Learning in Drought

Drought refers to an extended period of unusually low precipitation within the natural climate cycle
caused by deficiency in rainfall. It can have far-reaching impacts, including water shortages, crop
failure, livestock losses, increased wildfire risk, and ecosystem degradation. Drought can be
categorized into four types: meteorological, hydrological, agricultural, and socio-economic [44]
[45], with each type influenced by critical climatic factors such as increased evaporation,
transpiration, and insufficient precipitation. In this review, we will consider only meteorological
drought.

Meteorological drought forecasting is a complex process that involves analyzing various climatic and
environmental factors to anticipate future drought conditions. Statistical models and drought
indices, such as the Standardized Precipitation Index (SPI) or Palmer Drought Severity Index (PDSI)
[46], are utilized to quantify drought severity and monitor changes over time. These methodologies
contribute to a comprehensive understanding of drought dynamics and aid in the formulation of
effective drought management and adaptation measures. However, they often face challenges in
capturing the complexities of drought dynamics.

One study utilized ANFIS to forecast SPI-based drought indices using rainfall data from 10 stations
in Central Anatolia, Turkey. The ANFIS architecture consists of five layers: input, rule, average,
consequent, and output nodes. These models, named SPI-1, SPI-3, SPI-6, SPI-9, and SPI-12 for
different time scales, are designed to capture diverse drought patterns by integrating SPI and
precipitation data, addressing short-term, seasonal, and long-term variations. For each phase, a
total of 20 distinct models with varied input combinations are developed. The evaluation includes
performance metrics such as Root Mean Square Error (RMSE), Efficiency (E), and Correlation (CORR),
and comparisons are made against Feed Forward Neural Networks (FFNN) and Multiple Linear Regression
(MLR) models. The ANFIS models perform well, showing a notable ability to accurately identify dry
and wet periods, particularly across extended time scales [47]. In a different study, the focus is
on long-term Standardized Precipitation Index (SPI) drought forecasting (6 and 12 months lead time)
in the Awash River Basin, Ethiopia. The study compares the efficacy of five data-driven models:
ARIMA, ANN, SVR, WA-ANN, and WA-SVR. The ARIMA model involves three essential steps: model
identification, parameterization, and validation. The significant lags are determined using ACF and
PACF, guiding the selection of accurate and precise parameters. The ANN model employs an MLP
structure with input, hidden, and output layers, trained using the Levenberg-Marquardt (LM)
backpropagation algorithm. Lagged SPI values are utilized as input, with optimal input layer neurons
determined via trial and error, while hidden layer neurons are selected using empirical methods. The
SVR model employs a non-linear RBF kernel. The Wavelet Decomposition process encompasses CWT for
time-frequency representation and DWT for efficient computation. The transformed time series serves
as input for both ANN and SVR models. The performance of models
Task                      Approach                            Ref

SPI Forecast              ANFIS, FFNN, MLR                    [47]
                          ARIMA, ANN, SVR, WA-ANN, WA-SVR     [48]
                          ANN, SVR, WANN                      [49]
                          ANN, WANN, ARIMA, SARIMA            [50]
                          ARIMA, ANN, WANN                    [51]
                          EMD-DBN                             [52]
                          WP-ANN, WP-SVR                      [53]
SWSI and SIAP Forecast    ANN                                 [54]
SPEI Forecast             ANFIS, hybrid WT-ARIMA-ANN          [55]
TABLE III: Deep Learning in Drought
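
Several of the studies in Table III pass the SPI series through a discrete wavelet transform before feeding it to an ANN or SVR. The sketch below illustrates that preprocessing idea only. It assumes a single-level Haar wavelet for simplicity (the reviewed studies use other mother wavelets, such as db4), and the SPI values, the function names `haar_dwt`, `haar_idwt`, and `lagged_samples`, and the choice of two lags are all hypothetical.

```python
import math


def haar_dwt(series):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    approx = [(series[i] + series[i + 1]) / s for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / s for i in range(0, len(series), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level; recovers the original series up to rounding."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def lagged_samples(series, n_lags):
    """Build (inputs, target) pairs: n_lags past values predict the next one."""
    return [
        (series[i:i + n_lags], series[i + n_lags])
        for i in range(len(series) - n_lags)
    ]


# Hypothetical monthly SPI values, for illustration only.
spi = [0.3, -0.1, -0.8, -1.2, -0.5, 0.4, 1.1, 0.6]

approx, detail = haar_dwt(spi)              # low- and high-frequency subseries
reconstructed = haar_idwt(approx, detail)   # sanity check on the transform
samples = lagged_samples(approx, n_lags=2)  # candidate inputs for an ANN or SVR
```

In a wavelet-ANN pipeline the selected subseries would replace or augment the raw lagged SPI values as model inputs. Which subseries to keep, and how many decomposition levels to use, is a modelling choice that the reviewed studies make differently.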
is evaluated using RMSE, MAE, R2, and persistence. It is found that the WA-ANN model outperforms
other models for forecasting SPI values over lead times of 6 and 12 months [48]. In another study,
the WA-ANN models outperform alternative approaches, providing the most accurate forecasts for SPI 3
and SPI 6 values over lead times of 1 and 3 months. This underscores the effectiveness of the WA-ANN
architecture, characterized by 3 to 5 neurons in the input layer, 4 to 7 neurons in the hidden
layer, and a single neuron in the output layer, contributing to improved short-term drought
forecasting [49]. A comparative approach is employed to evaluate the performance of various
forecasting models for drought using SPI as the indicator. Three primary models are investigated:
ANN, WANN, and traditional stochastic models, namely ARIMA and Seasonal ARIMA (SARIMA). The focus is
on SPI-3, SPI-6, and SPI-12 time scales, and the impact of wavelet preprocessing on model accuracy
is explored for the Algerois Basin in North Algeria. ARIMA and Seasonal ARIMA models offer an
empirical framework for modeling and predicting complex hydrologic systems, with nonseasonal ARIMA
addressing stationary data and seasonal ARIMA handling nonstationarity through AR, MA operators, and
differencing parameters. An implementation of ANN-MLP involves a network structure comprising
interlinked input, hidden, and output layers. The WA-ANN model utilizes Discrete Wavelet (DW) inputs
derived from original SPI time series and corresponding un-decomposed SPI outputs, with a focus on
assessing the impact of various mother wavelets to enhance model efficiency. The model performance
is assessed using the Nash-Sutcliffe model efficiency coefficient (NSE), RMSE, and MAE as evaluation
metrics. The results demonstrate that the WANN model outperforms the ANN model for SPI-3 forecasts
over up to six months while the SARIMA model shows satisfactory results for SPI-12 forecasts with a
one-month lead time. However, all models experience reduced accuracy as lead times increase [50]. In
another study, three data-driven models, namely ARIMA, ANN, and WANN, are employed for drought
forecasting based on the SPI at two time scales (SPI-6 and SPI-12) in the north of the Haihe River
Basin. The effectiveness of the models is assessed using statistical tests like the
Kolmogorov-Smirnov (K-S) test, Kendall rank correlation, and correlation coefficients (R2). ARIMA
models with varying parameter combinations are employed, selecting the model that minimizes K–S
distance and maximizes Kendall rank correlation. The chosen model’s efficacy is validated through
ACF and PACF plots. The ANN model underestimates certain instances of extreme drought or extreme
precipitation, where the observed SPI values correspond to such extreme conditions. The WANN model
outperforms other models, exhibiting higher correlation, lower K–S distance, and enhanced Kendall
rank correlation. The comparison results show that the WANN model is the most suitable and effective
for forecasting SPI-6 and SPI-12 values in the study area [51]. Furthermore, a hybrid predictive
model is presented, combining EMD with a DBN for drought forecasting using the SSI across the
Colorado River basin. The DBN is constructed by stacking multiple RBMs on top of each other, and the
RBMs are trained using the contrastive divergence algorithm. EMD is utilized to decompose the data
into IMFs with varying frequencies. Some IMFs are found to contain noise or irrelevant information.
A denoising technique is proposed, involving the use of Detrended Fluctuation Analysis (DFA) scaling
exponents. A threshold (Hurst exponent 0.5) is applied to identify noisy IMFs, which are
subsequently eliminated, and the relevant IMFs are aggregated for reconstruction. The DBN model,
along with other models (MLP, SVR, EMD-MLP, EMD-SVR), is used to predict SSI-12 with lead times of
one and two months. The evaluation metrics include RMSE, MAE, and NSE, and the EMD-DBN model
outperforms all other models in the two-step ahead prediction [52].

Another study focuses on drought forecasting with lead times of 1 month and 6 months for the
Gulbarga district in Karnataka, India, using the SPI as the drought quantifying parameter. WPT is
employed to preprocess SPI time series, generating decomposed coefficients used as inputs for ANN
and SVR models. The SPI time series forecasting utilizes a BPNN with a 3-4-1 architecture. The
network’s weights and biases are determined using the gradient descent optimization algorithm,
incorporating an adaptive learning rate of 0.45, momentum rate of 0.15, and 5000 learning cycles.
The SVR utilizes a loss function based on Vapnik’s ε-insensitive approach and incorporates the
Gaussian radial basis function (RBF) kernel. This approach creates hybrid WP-ANN and WP-SVR models
for drought forecasting, with Daubechies 4 (db4) wavelet chosen as the mother wavelet. It is
observed that the hybrid WP-ANN model performs better than standalone
approaches, with the forecast accuracy decreasing as the lead time increases [53].

In another study, an ANN model is used to forecast drought indices, including the Standardised Water
Storage Index (SWSI) and the Standard Index of Annual Precipitation (SIAP). The dataset encompasses
rainfall and water level information originating from the Langat River Basin in Malaysia, covering a
time span of three decades (1986-2016). A feed-forward multilayer perceptron (MLP) structure is
employed, comprising input, hidden, and output layers. This architecture is trained using the
Levenberg–Marquardt (LM) back-propagation algorithm for both traditional artificial neural network
(ANN) models and the wavelet-based artificial neural network (W-ANN) models. In the W-ANN approach,
discrete wavelet transform (DWT) is applied to the input data, yielding subseries components.
Subsequently, pertinent components are chosen from these subseries and integrated into the MLP to
enhance the accuracy of forecasting. The outcomes demonstrate that the W-ANN model showcases
improved performance, achieving heightened correlation coefficients [54].

Another study employs two hybrid models, namely Wavelet-ARIMA-ANN (WAANN) and Wavelet-Adaptive
Neuro-Fuzzy Inference System (WANFIS), to predict the Standardized Precipitation Evapotranspiration
Index (SPEI) at the Langat River Basin for different time scales (1-month, 3-months, and 6-months).
The input data are subjected to wavelet decomposition at a level of three, and the resulting
components are employed as inputs for both ANN and ANFIS models. ANN models are constructed using
Bayesian regularization backpropagation with a total of 1000 training epochs, and the optimal number
of hidden neurons is identified through trial and error. The WANFIS involves normalizing decomposed
historical SPEI series as input for Sugeno-type Fuzzy Inference System (FIS), chosen for its
computational efficiency and compatibility with optimization techniques. This is followed by
applying the ANFIS algorithm with determined training parameters to enhance model performance. It is
found that the hybrid WT-ARIMA-ANN technique outperforms other models, providing better forecasts
for both short-term and mid-term drought indices (SPEI 1, SPEI 3, and SPEI 6) [55].

D. Deep Learning in Heatwave and Cold waves

Heatwaves are extreme weather events characterized by prolonged periods of excessively hot weather
[56], often accompanied by high humidity. During a heatwave, temperatures rise significantly above
the average for a particular location and persist for an extended period, typically several days.
Conversely, a coldwave is a meteorological phenomenon marked by a sudden and significant decrease in
air temperature at the Earth’s surface. This results in extremely low temperatures that can give
rise to hazardous weather conditions, including frost formation and the formation of ice. Both
events can have profound impacts on human health [12], [57], particularly in vulnerable populations
such as the elderly, children, and individuals with pre-existing health conditions, as well as on
infrastructure, agriculture, and ecosystems, and can even result in mortality for human beings and
livestock. Additionally, heatwaves strain power grids due to increased air conditioning use,
resulting in power outages, and can cause crop failures, wildfires and damage to infrastructure like
roads, bridges, railways, and airports [11]. The World Meteorological Organization’s (WMO) annual
report for 2023 highlighted the unprecedented heatwaves experienced in Europe during the summer,
exacerbated by abnormally dry conditions. Tragically, these extreme heat events resulted in over
15,000 excess deaths across several countries, including Spain, Germany, the United Kingdom, France,
and Portugal. These alarming findings underscore the pressing need for urgent and effective heatwave
mitigation strategies and adaptive measures to safeguard vulnerable populations in the face of
escalating climate challenges [58].

Monitoring and predicting heatwaves and cold waves are crucial for preparedness [59] and mitigating
potential risks, such as implementing appropriate measures to protect vulnerable populations and
ensuring the efficient functioning of critical systems during extreme cold episodes. Both events can
be predicted using a range of approaches, including statistical models [60], dynamic models such as
GCM [61] [62] and RCM [63] [64] [65], and machine learning techniques that analyze extensive
datasets to identify patterns associated with heatwave occurrences. Deep learning can be utilized to
extract features from various meteorological variables, such as temperature, humidity, wind
patterns, and atmospheric pressure, to forecast the likelihood, frequency, duration and intensity of
both heatwaves and coldwaves.

One study focuses on heatwave monitoring and prediction, employing index-based monitoring and an
LSTM-based prediction model in northern India up to 5-6 days ahead. The study employs IMD daily mean
gridded surface temperature data (1951–2020) and the NCMRWF-IMDAA reanalysis dataset for humidity
and wind data (1979–2020). The objective of this study is to develop an operational framework that
can monitor, track, and predict heatwaves in real-time over the Indian region, utilizing a
combination of temperature indices, synoptic information, and an LSTM-based prediction model. The
model’s performance is evaluated using a correlation coefficient and root mean square error (RMSE).
The results demonstrate that the proposed approach offers a promising way to enhance heatwave
preparedness and response strategies [66]. In another study, a GNN model is developed to predict
regional summer heatwaves in North America. By utilizing daily weather data of 91 stations across
CONUS and analyzing key meteorological variables, the model reduces computational burdens for
immediate heatwave warnings and facilitates fast decision-making. The model utilizes an
encoder-processor-decoder architecture for binary classification of heatwave events. Each node
within the graph corresponds to a weather station, while the model employs a GAL, a form of
nonlinear graph convolution. This GAL dynamically adjusts the adjacency matrix based on node
features via attention mechanisms, thereby enhancing its expressiveness. The GAL computes
specialized attention weights (AW) to capture interactions between nodes, encompassing influences
from neighbors, to neighbors, and historical data. Moreover, the model’s training incorporates a
soft F1-score metric, effectively combining recall and precision to
Task                               Approach                    Ref

Heatwave Forecast                  LSTM                        [66]
                                   GNN                         [67]
Heatwave and Cold wave Forecast    ConvNet, CapsNets           [68]
SAT Forecast                       CNN, CNN-RP, CNN-RP-BIN     [69]
SAT and LST Forecast               LSTM                        [70]
TABLE IV: Deep Learning in Heatwaves and Cold waves
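
Among the approaches in Table IV, the GNN heatwave classifier [67] is trained with a soft F1-score objective. The sketch below shows one common way to define such an objective, offered as an illustration rather than the exact formulation used in that study: the hard true-positive, false-positive, and false-negative counts are replaced by sums of predicted probabilities, which makes the score differentiable. The function name `soft_f1_loss`, the `eps` stabilizer, and the example labels and probabilities are assumptions.

```python
def soft_f1_loss(probs, labels, eps=1e-8):
    """Return 1 - soft F1 for binary labels and predicted probabilities.

    Probabilities stand in for hard 0/1 decisions, so the counts below are
    fractional and the loss varies smoothly with the model output.
    """
    tp = sum(p * y for p, y in zip(probs, labels))        # soft true positives
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))  # soft false positives
    fn = sum((1 - p) * y for p, y in zip(probs, labels))  # soft false negatives
    soft_f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - soft_f1


# Hypothetical heatwave labels (1 = heatwave day) for six station-days.
labels = [1, 0, 0, 1, 0, 0]
confident = [0.9, 0.1, 0.2, 0.8, 0.1, 0.0]  # mostly correct predictions
uncertain = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]  # uninformative predictions

loss_confident = soft_f1_loss(confident, labels)
loss_uncertain = soft_f1_loss(uncertain, labels)
```

Because heatwave days are rare, plain accuracy can look high for a model that never predicts an event. An F1-style objective penalizes missed events and false alarms together, which is the motivation for using it on imbalanced classes.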
mitigate bias and maximize the F1-score. As a result, the and optimal predictor variable subsets from exhaustive search.
GNN model achieves an impressive 90% accuracy in pre- The CNN combined with RP approaches accurately detects
dicting regional heatwave occurrences [67]. Another study maximum temperatures, indicating heatwaves, outperforming
employs ConvNet and CapsNet to predict heatwaves and classical CNN and other machine learning techniques [69].
cold waves, predicting the occurrence and region of extreme In another study, the investigation of the association between
weather patterns in North America. The study employs daily surface air temperature (SAT) and land surface temperature
data from the Large-Ensemble Community Project (LENS) (LST) considering land use during heat and cold wave events
for surface air temperature (T2m) and geopotential height at is undertaken. The author employs LSTM with a memory
500 mb (Z500) during boreal summer and winter months block containing forget, input, and output gates. These gates
from 1920 to 2005. The ConvNet architecture comprises 4 utilize sigmoid layers and pointwise multiplication to govern
convolutional layers with ReLU activation, where the last the flow of data across the cell and neural networks, effectively
two layers are followed by max-pooling (2x2, stride 1). The managing data dynamics. The study uses Terra and Aqua
output feeds into a fully connected neural network with 200 MODIS daytime and nighttime LST data, along with observed
neurons, featuring dropout regularization and L2 regularization air temperature data obtained from 79 weather stations under
to prevent overfitting. An adaptive learning rate is implemented the Korea Meteorological Administration spanning the years
through the Adam optimizer, while a softmax layer assigns 2008 to 2018. The performance of the model is evaluated
patterns to cluster indices based on the highest probability. using metrics such as R-squared, Root Mean Square Error
On the other hand, CapsNet includes two convolutional layers (RMSE), and Index of Agreement (IoA) [70].
with ReLU activation, followed by a primary capsule layer
(eight capsules with eight convolution layers), utilizing the E. Deep Learning in Tropical Cyclone
routing-by-agreement algorithm to convey information to a
secondary capsule layer for cluster probability prediction. The Tropical cyclones are low-pressure weather systems that
squash function introduces nonlinearity, and a decoding layer form over warm tropical oceans between latitudes 23.5
with three fully connected layers aids pattern reconstruction. degrees North and South, except in the South-Atlantic Ocean
The framework’s performance is evaluated using accuracy region [71].
and recall metrics and compared against CNN and logistic
regression. The CapsNet-based framework achieves notable 1) Frequency and Identification: A study focuses on
accuracy and recall in predicting extreme temperature events predicting TC frequency during the post-monsoon season
based on Z500 patterns [68]. based on large-scale climate variables such as geopotential
height, relative humidity, sea-level pressure, and zonal wind.
The challenge of long-term air temperature prediction in Three types of artificial neural networks, namely MLP, RBF,
summer using AI techniques is addressed in a study. and GRNN, are employed to develop prediction models.
ECMWF’s ERA5 reanalysis data spanning 1950 to 2021 for The research methodology involves selecting significant
Paris (France) and Córdoba (Spain) is employed, incorporating predictors using correlation analysis and utilizing historical
nine vital meteorological variables, including 2m air temper- TC frequency data from 1971 to 2013. The models are
ature, sea surface temperature, wind components (10m and trained with data from 1971 to 2002 and evaluated with
100m), mean sea level pressure, soil water layer, and geopo- independent data from 2003 to 2013. The MLP architecture
tential pressure. For each region, two experiments are carried consists of two hidden layers with five nodes in the first layer,
out: one for shorter-term prediction and another for prolonged three nodes in the second layer, and an output layer. The
prediction time-horizons, therefore possibly indicating a heat- RBF network employs radial basis functions with optimized
wave or a coldwave occurrence. The research explores a spread parameters of 0.6, while the GRNN employs a parallel
diverse array of nine modeling approaches, encompassing Lin- structure with spread factors of 0.2. Results demonstrate
ear Regression (LR), Lasso Regression (Lasso), Polynomial that the MLP model outperforms RBF and GRNN models
Regression (Poly), AdaBoost, Decision Trees (DT), Random across various evaluation metrics, showing lower RMSE,
Forest (RF), Convolutional Neural Network (CNN), CNN higher correlation, and better agreement between predicted
with Recurrence Plots (RP+CNN), and RP+CNN with binary and observed TC counts [72]. Furthermore, a multistage
fusion (RP+CNN+BIN). Each method is assessed using MSE, deep learning framework is proposed, incorporating a Mask
MAE, Pearson and Spearman rank correlation coefficients, R-CNN detector, a wind speed filter, and a classifier based
on CNN to detect TCs. The Mask R-CNN detector with
the R50 FPN model predicts TC locations, trained on RGB
satellite images, and generates predictions with class labels,
scores, segmentation masks, and bounding box coordinates.
A wind speed filter is applied to reduce false positives using
a threshold of 34 kt or higher. Cropped images based on
bounding box coordinates from the detector are fed to the
DenseNet169 CNN classifier to differentiate between true TCs
and non-TCs. This methodology is optimized using Bayesian
optimization techniques. The study uses Meteosat Visible
Infra-Red Imager (MVIRI) IR satellite images from Meteosat
5 and Meteosat 7 in the Indian Ocean Data Coverage (IODC)
region. The model is tested on a dataset of 171 images,
including 88 TCs, indicating promising performance [73].

2) Genesis Forecast: Traditionally, meteorologists rely on
various physical models and statistical techniques to predict
tropical cyclone (TC) genesis. However, these physical models
often have limitations and simplifications, which can affect
their accuracy in capturing the complex interactions and
dynamics involved in TC genesis. On the other hand, statistical
models have been used to analyze historical data and identify
patterns and relationships between different meteorological
variables and TC genesis. While statistical models can provide
valuable insights and correlations, they may struggle to capture
nonlinear and complex relationships present in the data.

Short-term tropical cyclogenesis forecasting plays a critical
role in predicting the formation and development of TCs
within a relatively short time frame. A study investigated
the detectability of TCs and their precursors using a CNN
model across different basins, seasons, and lead times. The
CNN architecture consists of four convolutional layers, three
pooling layers, and three fully connected layers, and finally
an output layer with two units for binary classification. The
Adam optimizer is applied to the CNN to update the network
parameters to minimize the binary cross-entropy loss
function. In the western North Pacific, the CNN successfully
detects TCs and their precursors during the period of July
to November, achieving high POD ranging from 79.0% to
89.1%, along with relatively low FAR ranging from 32.8% to
53.4%. Notably, the CNN exhibits impressive performance in
detecting precursors, with detection results of 91.2%, 77.8%,
and 74.8% for precursors occurring 2, 5, and 7 days before
their formation, respectively. This method displays promise for
studying tropical cyclogenesis and exhibits robust performance
even in regions with limited training data and short TC
lifetimes. However, the detection of TCs and their precursors
is found to be limited in cases where cloud cover is
extremely small (< 30%) or extremely large (> 95%). Considering
developing TCs and precursors as one category potentially
affects the ability to detect pre-TCs. Additionally,
model-specific biases are identified due to the CNN being trained
solely on the Nonhydrostatic ICosahedral Atmospheric Model
(NICAM) dataset. Notably, the detection performance in the
North Atlantic is relatively lower, which could be attributed
to the scarcity of training data and shorter lifetimes of TCs in
that particular region [74].

In the realm of long-term cyclogenesis forecasting, various
approaches have been employed to improve the accuracy of
predictions and provide insights into the behavior of TCs
over an extended period. Taking a distinctive route, a study
combines SOM and FFNN to investigate changes in TCs’ GPI
and its contributing factors for a global climate model. This
study introduces a comprehensive methodology employing
two types of artificial neural networks to project changes in
North Atlantic tropical cyclone genesis potential under the
warming climate of the twenty-first century. Through SOMs,
archetypal patterns of GPI-related environmental variables are
captured, arranging them on a two-dimensional grid to retain
data topology. Concurrently, FBNNs identify the relative
importance of these variables in driving projected GP changes.
SOMs’ training ensures the preservation of data relationships,
while FBNNs calculate variable relevance for GP outcomes.
The FBNNs’ training involves conveying input signals to
hidden-layer nodes, generating output via sigmoid functions.
The neural network framework NEVPROP4 is employed
for FBNN implementation. This dual-network approach
yields significant insights into the intricate trends of TC
genesis potential as they respond to evolving environmental
conditions [75].

3) Track Forecast: The accurate prediction of TC tracks
is crucial for effective disaster preparedness and response.
In recent years, deep learning techniques have emerged as
promising tools for improving TC track forecasting. In a
study, researchers aimed to utilize the neural oscillatory
elastic graph matching (NOEGM) technique for tropical
cyclone (TC) pattern identification, and a hybrid radial basis
function (HRBF) network integrated with a time difference
and structural learning (TDSL) algorithm for TC track
prediction. The HRBF network employs three layers,
with past network outputs introduced through time-delay
units and influenced by a decay factor. The evaluation
encompassed 120 TC cases spanning from 1985 to 1998.
The NOEGM model achieved noteworthy results, with 98%
accurate segmentation and 97% correct classification rates
for TC pattern recognition. The HRBF model showcased
an accuracy of over 86% in TC track and intensity mining.
In comparison with prevailing TC prediction models, the
proposed approach demonstrated substantial enhancements,
reducing forecast errors by more than 30% and achieving
a remarkable 14% enhancement in 48-hour track forecast
accuracy [76]. Utilizing an extensive dataset spanning 32
years of cyclone best track analysis, researchers constructed
an ANN model to forecast TC positions 24 hours in advance.
Notably, this model incorporates inputs from the two most
recent 6-hourly positions, along with the present latitude
and longitude, while predicting positions for a 24-hour lead
time. Through a systematic exploration, a range of both
linear and nonlinear transfer functions, such as Radial Basis
Function and linear least squares optimization, are evaluated.
Furthermore, different configurations of hidden layers and
neurons are experimented with to optimize performance.
The chosen linear neural network architecture, driven by a
pseudo-inverse learning algorithm, yields remarkable results,
achieving MAE as low as 0.75 degrees for latitude and
Task                    Approach              Ref

Frequency               MLP, RBF, GRNN        [72]
Identification          CNN, Mask R-CNN       [73]
Cyclogenesis Forecast   CNN                   [74]
                        SOM-FNN               [75]
Track Forecast          HRBF                  [76]
                        ANN                   [77]
                        GAN                   [78]
                        LSTM                  [79]
                        CNN                   [80]
Intensity Forecast      NN                    [81]
                        MLP                   [82]
                        CNN                   [83]
                        RNN                   [84]
                        CNN                   [85]
                        CNN                   [86]
                        double cascade CNN    [87]
                        CNN, VGG-19           [88]
                        CNN                   [89]
                        CNN                   [90]
                        CNN, VGG              [91]
RI Prediction           CNN                   [92]
                        RNN, LSTM             [93]
TABLE V: Deep Learning in Tropical Cyclone
0.87 degrees for longitude. The model’s effectiveness is
reinforced by a comparison of average errors: the Limited
Area Model (LAM), the National Centre for Environmental
Prediction-based Quasi-Lagrangian Model (QLM), and ANN
models exhibit errors of 132.6 km, 142.0 km, and 127.5 km,
respectively. These findings underscore the potential accuracy
of the ANN-based approach in cyclone tracking, particularly
within a 24-hour prediction window [77]. Rüttgers et al.
leveraged a Generative Adversarial Network (GAN) to
anticipate the paths of typhoons. This was accomplished
by merging satellite images from the KMA and reanalysis
data from the ECMWF dataset, covering the time span
from 1993 to 2017, with a focus on typhoons that could
impact the Korean Peninsula. Training data comprised
cropped segments of historical typhoon images, while
full-scale images were employed for testing purposes. The
GAN framework consisted of a generator, which utilized
multi-scale capabilities to generate diverse images, and a
discriminator to differentiate between authentic and generated
images. Inputs encompassed meteorological variables such
as Sea Surface Temperature (SST), Sea Pressure, Relative
Humidity (RH), Surface velocity field (zonal and meridional
components), Velocity field at 950 mb pressure level, and
Vertical wind shear (at 850 mb and 200 mb pressure levels).
The training process involved iteratively optimizing both
networks using distinct loss functions: L2 loss for quantifying
image disparities, gradient difference loss to amplify image
clarity, and adversarial loss to challenge the discriminator’s
ability to distinguish real from generated images. The
findings underscored the GAN’s efficacy in predicting
typhoon trajectories and cloud formations. Accuracy in
predicting typhoon center positions was assessed, revealing
that a majority fell within 80 km (65.5%), a notable portion
within 80-120 km (31.5%), and a smaller fraction exceeded
120 km (3.0%). The overall prediction error was significantly
reduced to 67.2 km, compared to 95.6 km when relying solely
on observational data. The GAN’s ability to anticipate cloud
movement patterns underscored its potential in capturing
dynamic phenomena [78]. An algorithm based on LSTM is
employed for 6-24 hour nowcasting of typhoon
tracks on historical typhoon data from 1949 to 2011 in
mainland China. The model’s architecture encompasses
three layers: input, hidden, and output, featuring 20 LSTM
cells in the hidden layer and 2 neurons in the output layer.
Through backpropagation through time (BPTT), errors
are minimized by comparing predictions with actual observed
tracks, thus offering a substantial advancement in typhoon
track prediction [79]. An innovative approach is introduced to
predict tropical cyclone movement over a 24-hour timeframe
by combining historical trajectory data and reanalysis
atmospheric images, particularly wind and pressure fields.
The technique involves adopting a dynamic frame of reference
that follows the storm center, thus enhancing the precision
of forecasts. The model’s versatility is demonstrated by its
capability to rapidly provide forecasts for newly emerging
storms, a crucial asset for real-time predictions. Leveraging
an extensive database spanning more than three decades and
over 3,000 storms, sampled at six-hour intervals, the approach
integrates past displacement data, metadata, wind fields, and
geopotential height fields to capture diverse information.
The methodology involves separate training of the Wind
CNN, Pressure CNN, and Past Tracks + Meta NN, followed
by integration into a fused network. Training incorporates
root mean square error (RMSE) as the loss function, with
regularization to prevent overfitting. This fusion network
not only enhances prediction accuracy but also significantly
reduces testing time, making it a promising advancement
for real-time forecasting in the realm of tropical cyclones [80].

4) Intensity Prediction: Tropical cyclone intensity prediction
is a critical aspect of forecasting, and various approaches
have been explored in recent research. In one study, an advanced
neural network model is employed to predict tropical cyclone
intensity changes in the western North Pacific, incorporating
climatology, persistence, and synoptic factors. The neural
network architecture consists of three layers: an input layer with
11 units representing climatology, persistence, and synoptic
predictors; a hidden layer with 11 units capturing complex
relationships; and an output layer predicting intensity changes.
The models analyzed include a multiple linear regression
model with climatology and persistence predictors (R-CP),
a neural network model with the same predictors (N-CP), a
multiple linear regression model with climatology, persistence,
and synoptic predictors (R-CPS), and a neural network model
incorporating all predictors (N-CPS). The performance of
these models is assessed through average intensity prediction
errors across different prediction intervals. The N-CPS model
demonstrates superior performance in predicting tropical
cyclone intensity changes, especially over shorter time intervals,
while the N-CP model shows slight superiority over the
R-CPS model [81]. Another study explores the use of MLP models capable
of forecasting intensity changes at 3-hour intervals beyond 72
hours in the North Indian Ocean, specifically in the Bay of
Bengal and Arabian Sea. The architecture of the
MLP model incorporates central pressure (CP), maximum
sustained wind speed (MSWS), pressure drop (PD), total ozone
column (TOC), and sea surface temperature (SST) as inputs
for predicting cyclone intensity. The model’s effectiveness is
assessed using metrics like RMSE and MAE, revealing the
MLP’s superior performance compared to other models like
RBFN, MLR, and OLR for forecasting cyclone intensity. The
models’ individual performances are evaluated for various
cyclones, accounting for varying sea surface temperatures
over the Arabian Sea and Bay of Bengal [82]. In a further study,
a CNN model is used to estimate the intensity of tropical
cyclones in the Atlantic and Pacific regions. The proposed
model uses a comprehensive dataset comprising two distinct
components: a collection of 48,828 infrared (IR) hurricane
images sourced from the Marine Meteorology Division of
the U.S. Naval Research Laboratory, and HURDAT2 data to
label these images. The model’s architecture integrates
convolutional layers with varying filter sizes and strides, followed
by strategic max-pooling for down-sampling. Complemented
by local response normalization and fully connected layers
with ReLU activation, the model incorporates regularization
techniques like dropout to prevent overfitting. Weight updates
are fine-tuned through SGD optimization with momentum, and
a specialized Softmax loss layer facilitates accurate multi-class
classification. By autonomously extracting pivotal features
from TC images, this methodology achieves better accuracy
and reduced RMSE, indicating a significant advancement in
tropical cyclone intensity estimation [83]. Additionally, a study
investigates the application of RNN for forecasting TC
intensity by leveraging historical observation data in the Western
North Pacific since 1949. The RNN architecture captures
intricate relationships among sequential elements (longitude,
latitude, and intensity) across input, hidden, and output
layers. Integrating a backpropagation through time optimization
algorithm, the model refines weights and biases. Employing
a cross-entropy loss function, it gauges disparities between
predicted and actual TC intensity, with the hidden layer
employing the tanh activation function. Notably, the model
achieves a 5.1 m/s error in 24-hour forecasts,
outperforming select dynamical models and closely
approximating subjective predictions [84]. A dual-branch CNN model
is proposed for estimating tropical cyclone (TC) intensity
in the Northwest Pacific. The model exhibits strong
performance for tropical storm and super typhoon categories but
demonstrates reduced accuracy for moderate intensity and
the weakest tropical depression category. The architecture of
the TCIENet model comprises two parallel CNN branches
designed for processing infrared and water vapor images. Each
branch includes essential modules for feature extraction, water
vapor attention, and intensity regression, with the overall goal
of capturing the intricate relationship between image patterns
and TC intensity. The training is facilitated by the Adam
optimizer, utilizing techniques such as Softmax operation, dropout
regularization, and L1 and L2 loss functions to enhance its
predictive capability. The research also delves into the impact
of diverse image sizes and model components on intensity
estimation accuracy, leveraging metrics like RMSE, MAE,
bias, and absolute error to evaluate the model’s effectiveness
[85]. Tian et al. presented a novel CNN-based hybrid model
designed to accurately estimate tropical cyclone intensity by
harnessing 46,919 infrared images sourced from the Northwest
Pacific and Atlantic Ocean. This architecture incorporates
a classification model, fine-grained regression models, and a
back-propagation neural network. The classification model
effectively categorizes TC samples into distinct intensity levels,
thereby guiding the selection of appropriate regression models.
The model’s optimization is carried out using the Adam
optimization algorithm, while a cross-validation loss function is
employed for both classification and regression tasks. Notably,
the model achieves exceptional accuracy and remarkably low
RMSE, outperforming the existing methodologies [86]. The
TCICENet model offers a novel approach to accurately
classifying and estimating tropical cyclone intensity using infrared
satellite images from the northwest Pacific Ocean basin. This
model adopts a cascading deep-CNN architecture consisting of
two essential components: TC intensity classification (TCIC)
and TC intensity estimation (TCIE). The TCIC module
employs convolutional layers to categorize TC intensity into
three specific classes, while the TCIE module, inspired by
a modified AlexNet structure, predicts intensity values across
different TC intensity categories. Notably, the TCIC module
employs a cross-entropy loss with L2 regularization, and the
TCIE module employs a SmoothL1 loss function for precise
intensity estimation. The model’s effectiveness is validated
using a dataset encompassing 1001 TCs from 1981 to 2019,
partitioned into distinct sets for training, validation, and
testing. Evaluation based on intensity estimation metrics reveals
impressive performance, achieving an overall root mean square
error of 8.60 kt and a mean absolute error of 6.67 kt in
comparison to best track data [87]. In a separate study, a CNN
model is utilized to predict the intensity levels of hurricanes
using IR satellite imagery data from HURSAT and wind speed
data from the HURDAT2 of the Greater Houston region.
The architecture involves sequential layers: input, convolution,
pooling, and fully connected, guided by ReLU activations,
MSE loss, and RMSProp/Adam optimizers. This facilitates
accurate hurricane intensity estimation and pattern recognition for
storm categorization by severity, achieving lower RMSE (7.6
knots) and MSE (6.68 knots) through batch normalization and
dropout layers. Additionally, a VGG19 model is employed
to evaluate the extent of damage and automate annotation of
satellite imagery data. The VGG19 model undergoes
fine-tuning for hurricane damage prediction and classification of
severe weather events. The optimization process is guided by
the Adam optimizer, utilizing MSE as the foundational loss
function. The models are subjected to rigorous evaluation,
encompassing a diverse set of metrics including RMSE, MAE,
MSE, and Relative RMSE. Notably, the model demonstrates
remarkable performance, achieving a 98% accuracy in
predicting hurricane damage and a 97% accuracy in classifying severe
weather events [88]. Furthermore, a model called DeepTCNet
is proposed, specifically designed for TC intensity and size
estimation in the North Atlantic by using IBTrACS and the
Hurricane Satellite dataset. The study harnesses CNN as the
core architecture within DeepTCNet to estimate TC intensity
and wind radii from IR imagery. Extensive experimentation
establishes VGGNet with 13 layers and compact (3 × 3)
convolutional filters as the optimal configuration, forming the
foundational structure for DeepTCNet. The evaluation presents
MAE for TC intensity estimation (measured in knots) on the
test dataset across various depths and kernel sizes in VGGNet’s
initial convolutional layer. Leveraging the Adam optimization
with default parameters, learning occurs through the
adoption of MAE as the loss function. This holistic approach
exemplifies the seamless fusion of physics-augmented deep
learning, culminating in enhanced TC analysis and prediction
capabilities [89]. A further study focuses on estimating TC intensity
using a CNN model. Satellite IR imagery and Best Track data
are employed to analyze 97 TC cases over the Northwest
Pacific Ocean from 2015 to 2018. The CNN architecture
encompasses an input layer, four convolutional layers, four
pooling layers, two fully connected layers, and an output layer,
resulting in the derivation of eight intensity values. Notably,
the multicategory CNN achieves an accuracy of 84.8% for TC
intensity estimation, which further improves to 88.9% through
conversion to a binary classification task [90]. Another study
proposes a CNN model for estimating TC intensity using
Himawari-8 satellite cloud products, including cloud optical
thickness (CLOT), cloud top temperature (CLTT), cloud top
height (CLTH), cloud effective radius (CLER), and cloud
type (CLTY). The model’s architecture is based on the VGG
framework, enhanced with attention mechanisms and
residual learning to improve precision while reducing parameter
count. The CNN comprises four convolutional blocks with
progressively larger filter sizes, integrating residual learning at
different levels and a Convolutional Block Attention Module
(CBAM) after a maximum pooling layer. Batch normalization
and dropout layers are employed to counter overfitting. The
model is optimized using the Adam optimizer and MAE loss
function. It undergoes training and tuning through six-fold
cross-validation and is evaluated on independent test data,
utilizing real-time typhoon track information from the western
North Pacific basin alongside Himawari-8 cloud products
[91]. These studies contribute valuable insights into improving
tropical cyclone intensity forecasts through the utilization of
advanced neural network models.

In addition to general tropical cyclone intensity forecasting,
a particularly challenging aspect is predicting Rapid
Intensification (RI), where a tropical cyclone undergoes a sudden and
significant strengthening over a short timeframe. RI is a critical
phenomenon due to its potential to escalate a relatively mild
storm into a highly destructive force, posing severe threats to
coastal communities and infrastructure.

To address the complexities of RI prediction, a CNN model
called TCNET was developed to enhance the prediction of RI
in tropical cyclones by extracting features from large-scale
environmental conditions. The study used ECMWF
ERA-Interim reanalysis data and the SHIPS database. TCNET’s
architecture consists of data filters, a customized sampler
(GMM-SMOTE), an XGBoost classifier, and hyperparameter
tuning. This model outperforms COR-SHIPS
and LLE-SHIPS in RI prediction, yielding superior results
in terms of kappa, PSS, POD, and FAR metrics. Moreover,
TCNET identifies previously unexplored variables, such as
ozone mass mixing ratio, that influence RI. The training
of TCNET involves backpropagation, utilizing mean square
error as the loss function and the Adam optimizer for weight
updates of the filters [92]. In another study, deep learning
models including RNN and LSTM were explored for
predicting tropical cyclone intensity and rapid intensification
(RI). The proposed approach involved convolutional layers for
autonomous feature extraction from satellite images, an RNN
block with ConvLSTM cells for feature evolution, and a final
output regressor composed of convolutional and dense layers
to forecast tropical cyclone intensity (Vmax) at +24 hours.
Additionally, the study introduced a deep learning ensemble
strategy involving 20 models with diverse designs, effectively
improving TC intensity and RI prediction by incorporating
both conventional and satellite-derived features. This ensemble
method offered intensity distributions for deterministic
predictions, RI likelihood estimation, and prediction uncertainty
assessment, yielding improved RI detection probabilities and
reduced false-alarm rates compared to operational forecasts for
western Pacific TCs [93].
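
As an illustrative aside, the categorical scores named above for RI prediction (kappa, PSS, POD, and FAR) can all be derived from a 2x2 contingency table of observed versus predicted events. The sketch below is not taken from any of the cited studies. It assumes FAR means the false alarm ratio FP / (TP + FP); some works use the same abbreviation for the false alarm rate FP / (FP + TN), so the cited papers may differ. The function names and the sample counts are hypothetical.

```python
def contingency_table(observed, predicted):
    """Count (tp, fp, fn, tn) from two equal-length sequences of 0/1 labels."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1      # event observed and predicted (hit)
        elif pred:
            fp += 1      # event predicted but not observed (false alarm)
        elif obs:
            fn += 1      # event observed but not predicted (miss)
        else:
            tn += 1      # correct negative
    return tp, fp, fn, tn


def contingency_scores(tp, fp, fn, tn):
    """Return POD, FAR, PSS and Cohen's kappa for a 2x2 contingency table."""
    n = tp + fp + fn + tn
    if min(tp + fn, tp + fp, fp + tn) == 0:
        raise ValueError("a score denominator is zero for this table")
    pod = tp / (tp + fn)       # probability of detection (hit rate)
    far = fp / (tp + fp)       # false alarm ratio (assumed definition)
    pofd = fp / (fp + tn)      # probability of false detection
    pss = pod - pofd           # Peirce skill score
    p_obs = (tp + tn) / n      # observed agreement
    # agreement expected by chance from the table margins
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_obs - p_exp) / (1.0 - p_exp)
    return {"POD": pod, "FAR": far, "PSS": pss, "kappa": kappa}


if __name__ == "__main__":
    # Hypothetical counts for a rare event such as RI: 30 hits,
    # 20 false alarms, 10 misses, 140 correct negatives.
    for name, value in contingency_scores(30, 20, 10, 140).items():
        print(f"{name}: {value:.3f}")
```

Because RI is rare, plain accuracy is dominated by correct negatives, which is one reason skill scores such as PSS and kappa are commonly reported alongside POD and FAR.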
IV. C HALLENGES various sources such as satellites, radar systems, automatic

The effective utilization of DL models in weather weather stations, numerical models and manual observations.
forecasting is accompanied by several challenges that require However, each of these sources employs its own distinct
careful consideration. In this section, we explore several key method of storing data. Satellites often employ formats like
challenges associated with the application of deep learning in TIFF or GeoTIFF, while radar data may utilize formats such
the field of weather forecasting. as HDF5 or NetCDF, and other sources could have unique
formats. To ensure the seamless operation of a weather
1) Data Availability: Data availability is crucial for prediction model, these different types of data must be
advancing the capabilities of deep learning models in harmonized, enabling the model to effectively learn from the
meteorological applications. Limited access to historical combined data and resulting in accurate and reliable weather
records, real-time observations, and specialized data sources forecasts.
can hamper model development and evaluation [94].
Addressing data availability challenges requires establishing 6) Model Explainability: Model explainibility refers to the
robust data-sharing frameworks, promoting data collaboration how a model process and transform input into corresponding
between meteorological organizations, and exploring output, making the process transparent and easy to compre-
innovative approaches to gather and enhance meteorological hend [95]. For instance, if the model predicts upcoming rain,
data. meteorologists need to comprehend the specific meteorological
variables and potential biases influencing its predictions. This
2) Data Quality: Ensuring data quality poses a significant understanding becomes crucial as the model transitions to real-
challenge for deep learning models in meteorology. Weather world application, allowing its developers to provide insights
data, obtained from various sources like weather stations, into its functioning.
satellites, and radars, may have limitations in terms of spatial
coverage, temporal resolution, and accuracy. Missing or V. D ISCUSSION AND F UTURE D IRECTIONS
inaccurate observations can introduce biases and errors. For
example, inadequate temperature measurements in remote The integration of deep learning methods into the study
regions due to limited weather station distribution can lead to of extreme weather events brings about a significant transfor-
incomplete climate models, potentially affecting the accuracy mation, expanding our understanding and predictive abilities.
of long-term weather predcitions. These advanced models not only excel in deciphering intricate
spatial relationships but also stand out in unraveling the
3) Model Architecture: DL models often consist of complex timing patterns inherent in meteorological phenom-
complex architectures with numerous layers and parameters, ena. This advancement holds the potential to greatly improve
posing challenges in their design and optimization for weather weather forecasting accuracy across a wide range of events,
forecasting. Determining the optimal network architecture, from thunderstorms and lightning occurrences to the tracking
selecting appropriate activation functions, and managing of tropical cyclones.
computational resources are critical tasks in developing efficient DL models for meteorological applications [8].

4) Hybrid Approach: Combining DL techniques with traditional physical models can leverage the strengths of both approaches, leading to more accurate and reliable predictions. DL models excel at learning complex patterns and capturing nonlinear relationships in large datasets [94], while traditional physical models provide valuable insights into the underlying physical processes. For instance, coupling a deep learning algorithm with an NWP model can allow the DL component to capture intricate spatial patterns in satellite imagery, while the physical model contributes its understanding of atmospheric physics. This collaborative approach offers the potential for more precise predictions of complex meteorological events. However, integrating DL models with existing frameworks raises challenges such as resource requirements, data quality, and interpretability.

At the heart of this innovation lies the natural capacity of deep learning models to identify and replicate non-linear relationships within the complex fabric of atmospheric data. By analyzing extensive sets of information, these models unearth hidden patterns and interactions that conventional methods struggle to capture. However, as we move forward with these promising advancements, it becomes crucial to address certain key challenges that must be overcome to fully realize the potential of deep learning in meteorology. One such challenge is the continuous need for high-quality and comprehensive data. The effectiveness of deep learning models relies on their exposure to a diverse array of carefully curated data points, which emphasizes the importance of creating robust data pipelines and well-organized datasets. Furthermore, the inherently complex nature of deep learning architectures presents a dilemma regarding their interpretability. Ensuring that the decisions made by these intricate models can be comprehended and validated by experts in meteorology remains an ongoing endeavor.
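
As a minimal sketch of the hybrid idea described under item 4 (Hybrid Approach), the following Python example treats a physical forecast as a baseline and trains a small statistical model on its residuals, so that the learned component only has to capture what the physical model misses. Everything here is an assumption made for illustration: the data are synthetic, the function names (`design_matrix`, `fit_residual_model`, `hybrid_forecast`) are hypothetical, and a quadratic least-squares fit stands in for the deep network. It is not the method of any study cited in this review.

```python
import numpy as np


def rmse(pred, truth):
    """Root mean square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


def design_matrix(x):
    """Columns [1, x, x^2]: a deliberately tiny stand-in for learned features."""
    return np.column_stack([np.ones_like(x), x, x ** 2])


def fit_residual_model(x, residual):
    """Least-squares fit of the physical model's error on the predictor x."""
    coef, *_ = np.linalg.lstsq(design_matrix(x), residual, rcond=None)
    return coef


def hybrid_forecast(physical, x, coef):
    """Physical forecast plus the learned correction."""
    return physical + design_matrix(x) @ coef


# Synthetic example (assumption): the "truth" has a nonlinear term
# that the simplified "physical" model does not represent.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, size=n)                 # hypothetical predictor
truth = 10.0 + 5.0 * x + 8.0 * x ** 2 + rng.normal(0.0, 0.5, size=n)
physical = 10.0 + 5.0 * x                         # misses the x^2 term

# Fit the correction on the first 300 samples, evaluate on the rest.
train, test = slice(0, 300), slice(300, n)
coef = fit_residual_model(x[train], truth[train] - physical[train])
hybrid = hybrid_forecast(physical[test], x[test], coef)

physical_rmse = rmse(physical[test], truth[test])
hybrid_rmse = rmse(hybrid, truth[test])
print(f"physical-only RMSE: {physical_rmse:.2f}")
print(f"hybrid RMSE:        {hybrid_rmse:.2f}")
```

In this toy setting the correction removes the systematic error of the baseline on held-out samples. In practice the residual model would be a trained network operating on gridded fields, and the resource, data quality, and interpretability concerns noted above would still apply.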
5) Data Heterogeneity: Data heterogeneity refers to the diversity of data sources, formats, and features, which can complicate the integration and analysis of different data types. For instance, in the development of a deep learning model for weather prediction, information must be accumulated from

The future of weather prediction calls for the exploration of ensemble techniques that combine the strengths of various models to produce more comprehensive and accurate forecasts. This pursuit involves developing innovative approaches that seamlessly integrate deep learning models with traditional numerical weather prediction methods, drawing on the well-established physical understanding of atmospheric processes. The convergence of deep learning capabilities with specialized insights in meteorology emerges as a fertile area for further exploration. Hybrid models that blend empirical meteorological knowledge with the computational power of deep learning offer a promising path to enhancing forecast accuracy and reinforcing our ability to handle the multifaceted impacts of extreme weather events.

In the broader context of this review, a clear message underscores the persistent drive for progress, necessitating ongoing collaboration and interdisciplinary synergy. By harnessing the capabilities of deep learning and pushing the boundaries of meteorological understanding, we are positioned to empower decision-makers and stakeholders with invaluable tools to proactively mitigate the far-reaching consequences of the ever-evolving realm of extreme weather events.

In weather forecasting, several research gaps need to be addressed to enhance the capabilities and effectiveness of these models. Firstly, in drought prediction, current studies lack long-term forecasting capabilities and are limited in spatial resolution. Improving these aspects is crucial to provide accurate and detailed information on drought conditions, enabling proactive mitigation measures. In the case of tropical cyclones, there is a notable absence of studies focused on pattern identification. Efforts should be directed towards reducing the cone of uncertainty by improving track accuracy, size estimation, and spatial distribution of cyclones. More research is needed in predicting rapid intensification (RI) and associated weather phenomena such as storm surge, floods, and quantitative precipitation forecasts. The lack of practical success stories in this area underscores the need for further investigation and advancements. In heatwave prediction, there is a paucity of research focused on the frequency and duration of heatwaves. The prediction of severe thunderstorms poses its own set of challenges. To improve forecasts in this domain, exploring other ensemble techniques, incorporating feature selection methods, and leveraging dynamic graph modeling approaches can be beneficial. Integrating data from multiple NWP models, along with high-resolution NWP models, holds promise for enhancing thunderstorm forecasts. Predicting the intensity, frequency, and location of lightning strikes requires further attention. Monitoring and predicting lightning strikes, especially in discretely distributed scenarios, remain complex tasks that require advanced techniques and data integration. Radar and satellite data play a crucial role in weather forecasting. However, challenges persist in utilizing radar data to make predictions without clear indications of initial convection. Exploiting the early-stage signals of convection using radar and satellite data can aid in improving forecast accuracy, particularly in mitigating false alarms. Cloud-related weather forecasting also faces challenges, including high computation time and resource requirements. Inefficient observations due to rain, strong winds, foggy conditions, and sunsets further hinder the efficiency of cloud-related forecasting methods. Addressing these challenges is essential to unlock the full potential of deep learning in cloud prediction. Further research is required to enhance hail detection, size estimation, forecasting, and damage assessment methods using deep learning techniques, despite recent advancements in hail-related studies. Improving models, such as NWP, remains a substantial challenge.

VI. CONCLUSION

This review highlighted the significant advancements and promising potential of deep learning techniques in the field of meteorology, specifically in extreme weather events. Deep learning models, such as CNN and RNN, demonstrated their effectiveness in various applications, including cyclone prediction, severe rainfall and hail prediction, cloud and snow detection, rainfall-induced flood and landslide forecasting, and more. The utilization of deep learning algorithms allowed researchers to extract intricate patterns and features from complex meteorological datasets, leading to improved accuracy and performance in weather prediction and analysis. These models showed remarkable skill in capturing spatial and temporal dependencies in weather data, enabling more accurate predictions of extreme events and enhancing our understanding of their underlying dynamics. Furthermore, deep learning methods offered advantages over traditional statistical approaches by automatically learning representations and hierarchies of features, eliminating the need for manual feature engineering. This allowed for more efficient and effective analysis of large-scale meteorological datasets, facilitating the development of advanced forecasting models and decision-support systems. Deep learning models also provided an alternative approach by directly learning the relationships between input observations and output variables from data, circumventing the computational bottlenecks and time lags associated with physics-based models. However, further advancements are needed to enhance the performance and efficiency of these models in weather forecasting applications. Closing these research gaps and advancing the field of deep learning in weather forecasting will contribute to more accurate, reliable, and timely predictions, ultimately benefiting various sectors and society as a whole.

ACKNOWLEDGMENT

The authors would like to express heartfelt gratitude to the India Meteorological Department (IMD) and the Indian Institute of Information Technology, Allahabad (IIIT Allahabad) for their invaluable support and contributions to this article. Their guidance, resources, and assistance have been instrumental in the successful completion of this research.

CONFLICTS OF INTEREST

The authors declare no conflict of interest.

ABBREVIATIONS

The following abbreviations are used in this manuscript:
ANFIS       Adaptive Neuro-Fuzzy Inference System
ANN         Artificial Neural Networks
ARIMA       Autoregressive Integrated Moving Average
BPNN        Back Propagation Neural Network
CAPE        Convective Available Potential Energy
CapsNets    Capsule Neural Networks
CBAM        Convolutional Block Attention Module
CFS         Climate Forecast System
CG          Cloud-to-Ground
CNN         Convolutional Neural Networks
Conv2d      2D Convolution Layer
CRF         Conditional Random Fields
CRPS        Continuous Ranked Probability Score
DBN         Deep Belief Network
DL          Deep Learning
DLNN        Deep Learning Neural Network
DNN         Deep Neural Network
EEMD        Ensemble Empirical Mode Decomposition
EMD         Empirical Mode Decomposition
ENSO        El Niño-Southern Oscillation
FAR         False Alarm Rate
FFNN        Feed Forward Neural Networks
GAN         Generative Adversarial Network
GNN         Graph Neural Network
GNSS        Global Navigation Satellite System
GPI         Genesis Potential Index
GRNN        Generalized Regression Neural Network
GRU         Gated Recurrent Unit
HR          Heavy Rain
HREF        High-Resolution Ensemble Forecast
IMD         India Meteorological Department
IMF         Intrinsic Mode Function
IR          Infrared
JTWC        Joint Typhoon Warning Center
LM-ResNet   Lightning Monitoring Residual Network
LRCN        Long-Term Recurrent Convolutional Network
LST         Land Surface Temperature
LSTM        Long Short-Term Memory
MAE         Mean Absolute Error
ML          Machine Learning
MLP         Multi-Layer Perceptrons
MLR         Multiple Linear Regression
MNLR        Multivariate Non-Linear Regression
NARX        Nonlinear Autoregressive Exogenous
NCEP        National Centers for Environmental Prediction
NIO         North Indian Ocean
NNGA        Neural Network-Genetic Algorithm
NWP         Numerical Weather Prediction
PCA         Principal Component Analysis
PDSI        Palmer Drought Severity Index
POD         Probability of Detection
POSH        Probability of Severe Hail
QPE         Quantitative Precipitation Estimation
RBF         Radial Basis Function
RBM         Restricted Boltzmann Machines
RF          Random Forest
RI          Rapid Intensification
RMSE        Root Mean Square Error
RNN         Recurrent Neural Networks
RP          Recurrence Plots
SAT         Surface Air Temperature
SCW         Severe Convective Weather
SGD         Stochastic Gradient Descent
SIAP        Standard Index of Annual Precipitation
SOM         Self-Organizing Maps
SPEI        Standardized Precipitation Evapotranspiration Index
SPI         Standardized Precipitation Index
SSI         Standardized Streamflow Index
SVM         Support Vector Machine
SWE         Snow Water Equivalent
SWG         Stochastic Weather Generator
SWSI        Standardised Water Storage Index
TC          Tropical Cyclone
TPW         Total Precipitable Water
VAEGAN      Variational Autoencoder GAN
VGG         Visual Geometry Group
WANN        Wavelet-Artificial Neural Network
WGAN        Wasserstein Generative Adversarial Network
WMO         World Meteorological Organization
WPT         Wavelet Packet Transform

REFERENCES

[1] WMO, “World Meteorological Organization.” [Online]. Available: https://public.wmo.int/en
[2] D. R. Easterling, G. A. Meehl, C. Parmesan, S. A. Changnon, T. R. Karl, and L. O. Mearns, “Climate extremes: observations, modeling, and impacts,” Science, vol. 289, no. 5487, pp. 2068–2074, 2000.
[3] M. Beniston and D. B. Stephenson, “Extreme climatic events and their evolution under changing climatic conditions,” Global and Planetary Change, vol. 44, no. 1-4, pp. 1–9, 2004.
[4] L. V. Alexander, X. Zhang, T. C. Peterson, J. Caesar, B. Gleason, A. Klein Tank, M. Haylock, D. Collins, B. Trewin, F. Rahimzadeh et al., “Global observed changes in daily climate extremes of temperature and precipitation,” Journal of Geophysical Research: Atmospheres, vol. 111, no. D5, 2006.
[5] C. Tebaldi, K. Hayhoe, J. M. Arblaster, and G. A. Meehl, “Going to the extremes: an intercomparison of model-simulated historical and future changes in extreme events,” Climatic Change, vol. 79, no. 3-4, pp. 185–211, 2006.
[6] IMD, “India Meteorological Department,” https://mausam.imd.gov.in/.
[7] L. Espeholt, S. Agrawal, C. Sønderby, M. Kumar, J. Heek, C. Bromberg, C. Gazen, R. Carver, M. Andrychowicz, J. Hickey et al., “Deep learning for twelve hour precipitation forecasts,” Nature Communications, vol. 13, no. 1, pp. 1–10, 2022.
[8] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[9] P. Bauer, A. Thorpe, and G. Brunet, “The quiet revolution of numerical weather prediction,” Nature, vol. 525, no. 7567, pp. 47–55, 2015.
[10] K. E. Trenberth, “Changes in precipitation with climate change,” Climate Research, vol. 47, no. 1-2, pp. 123–138, 2011.
[11] I. Abdin, Y.-P. Fang, and E. Zio, “A modeling and optimization framework for power systems design with operational flexibility and resilience against extreme heat waves and drought events,” Renewable and Sustainable Energy Reviews, vol. 112, pp. 706–719, 2019.
[12] D. Coumou and S. Rahmstorf, “A decade of weather extremes,” Nature Climate Change, vol. 2, no. 7, pp. 491–496, 2012.
[13] F. Heidari, Q. Lin, E. F. E. Sarmiento, A. Toreti, and E. Xoplaki, “Towards the development of an AI-based early warning system: a deep learning approach to bias correct and downscale seasonal climate forecasts,” Copernicus Meetings, Tech. Rep., 2023.
[14] K. Zhou, Y. Zheng, B. Li, W. Dong, and X. Zhang, “Forecasting different types of convective weather: A deep learning approach,” Journal of Meteorological Research, vol. 33, pp. 797–809, 2019.
[15] E. Vosper, P. Watson, L. Harris, A. McRae, R. Santos-Rodriguez, L. Aitchison, and D. Mitchell, “Deep learning for downscaling tropical cyclone rainfall to hazard-relevant spatial scales,” Journal of Geophysical Research: Atmospheres, p. e2022JD038163, 2023.
[16] E. Vosper, P. Watson, L. Harris, A. McRae, R. Santos-Rodriguez, L. Aitchison, and D. Mitchell, “Deep learning for downscaling tropical cyclone rainfall,” Copernicus Meetings, Tech. Rep., 2023.
[17] A. C. Mondini, F. Guzzetti, and M. Melillo, “Deep learning forecast of rainfall-induced shallow landslides,” Nature Communications, vol. 14, no. 1, p. 2466, 2023.
[18] M. A. K. Azad, A. R. M. T. Islam, M. S. Rahman, and K. Ayen, “Development of novel hybrid machine learning models for monthly thunderstorm frequency prediction over Bangladesh,” Natural Hazards, vol. 108, pp. 1109–1135, 2021.
[19] S. Guastavino, M. Piana, M. Tizzi, F. Cassola, A. Iengo, D. Sacchetti, E. Solazzo, and F. Benvenuto, “Prediction of severe thunderstorm events with ensemble deep learning and radar data,” Scientific Reports, vol. 12, no. 1, p. 20049, 2022.
[20] Y. Essa, H. G. Hunt, M. Gijben, and R. Ajoodha, “Deep learning prediction of thunderstorm severity using remote sensing weather data,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 4004–4013, 2022.
[21] T. Lin, Q. Li, Y.-A. Geng, L. Jiang, L. Xu, D. Zheng, W. Yao, W. Lyu, and Y. Zhang, “Attention-based dual-source spatiotemporal neural network for lightning forecast,” IEEE Access, vol. 7, pp. 158296–158307, 2019.
[22] M. Lu, Y. Zhang, M. Chen, M. Yu, and M. Wang, “Monitoring lightning location based on deep learning combined with multisource spatial data,” Remote Sensing, vol. 14, no. 9, p. 2200, 2022.
[23] Z. Qian, D. Wang, X. Shi, J. Yao, L. Hu, H. Yang, and Y. Ni, “Lightning identification method based on deep learning,” Atmosphere, vol. 13, no. 12, p. 2112, 2022.
[24] S. Yao, H. Chen, E. J. Thompson, and R. Cifelli, “An improved deep learning model for high-impact weather nowcasting,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 7400–7413, 2022.
[25] L. Espeholt, S. Agrawal, C. Sønderby, M. Kumar, J. Heek, C. Bromberg, C. Gazen, R. Carver, M. Andrychowicz, J. Hickey et al., “Deep learning for twelve hour precipitation forecasts,” Nature Communications, vol. 13, no. 1, pp. 1–10, 2022.
[26] T. Nan, J. Chen, Z. Ding, W. Li, and H. Chen, “Deep learning-based multi-source precipitation merging for the Tibetan Plateau,” Science China Earth Sciences, vol. 66, no. 4, pp. 852–870, 2023.
[27] W. Li, H. Chen, and L. Han, “Polarimetric radar quantitative precipitation estimation using deep convolutional neural networks,” IEEE Transactions on Geoscience and Remote Sensing, 2023.
[28] Y. Lee, M.-H. Ahn, and S.-J. Lee, “Incremental learning with neural network algorithm for the monitoring pre-convective environments using geostationary imager,” Remote Sensing, vol. 14, no. 2, p. 387, 2022.
[29] M. Pullman, I. Gurung, M. Maskey, R. Ramachandran, and S. A. Christopher, “Applying deep learning to hail detection: A case study,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 12, pp. 10218–10225, 2019.
[30] F. Pulukool, L. Li, and C. Liu, “Using deep learning and machine learning methods to diagnose hailstorms in large-scale thermodynamic environments,” Sustainability, vol. 12, no. 24, p. 10499, 2020.
[31] D. J. Gagne II, S. E. Haupt, D. W. Nychka, and G. Thompson, “Interpretable deep learning for spatial analysis of severe hailstorms,” Monthly Weather Review, vol. 147, no. 8, pp. 2827–2845, 2019.
[32] Q. Wu, Y.-X. Shou, L.-M. Ma, Q. Lu, and R. Wang, “Estimation of maximum hail diameters from FY-4A satellite data with a machine learning method,” Remote Sensing, vol. 14, no. 1, p. 73, 2021.
[33] Z. Wang, B. Fan, Z. Tu, H. Li, and D. Chen, “Cloud and snow identification based on DeepLab V3+ and CRF combined model for GF-1 WFV images,” Remote Sensing, vol. 14, no. 19, p. 4880, 2022.
[34] Y. Zhan, J. Wang, J. Shi, G. Cheng, L. Yao, and W. Sun, “Distinguishing cloud and snow in satellite images via deep convolutional network,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 10, pp. 1785–1789, 2017.
[35] M. Yin, P. Wang, C. Ni, and W. Hao, “Cloud and snow detection of remote sensing images based on improved UNet3+,” Scientific Reports, vol. 12, no. 1, p. 14415, 2022.
[36] V. Sood, R. K. Tiwari, S. Singh, R. Kaur, and B. R. Parida, “Glacier boundary mapping using deep learning classification over Bara Shigri glacier in western Himalayas,” Sustainability, vol. 14, no. 20, p. 13485, 2022.
[37] Y. Wang, J. Su, X. Zhai, F. Meng, and C. Liu, “Snow coverage mapping by learning from Sentinel-2 satellite multispectral images via machine learning algorithms,” Remote Sensing, vol. 14, no. 3, p. 782, 2022.
[38] H. Wang, L. Zhang, L. Wang, J. He, and H. Luo, “An automated snow mapper powered by machine learning,” Remote Sensing, vol. 13, no. 23, p. 4826, 2021.
[39] L. Zhu, Y. Zhang, J. Wang, W. Tian, Q. Liu, G. Ma, X. Kan, and Y. Chu, “Downscaling snow depth mapping by fusion of microwave and optical remote-sensing data based on deep learning,” Remote Sensing, vol. 13, no. 4, p. 584, 2021.
[40] D. Xing, J. Hou, C. Huang, and W. Zhang, “Estimation of snow depth from AMSR2 and MODIS data based on deep residual learning network,” Remote Sensing, vol. 14, no. 20, p. 5089, 2022.
[41] H. Yao, Y. Zhang, L. Jiang, H. T. Ewe, and M. Ng, “Snow parameters inversion from passive microwave remote sensing measurements by deep convolutional neural networks,” Sensors, vol. 22, no. 13, p. 4769, 2022.
[42] H. Ghanjkhanlo, M. Vafakhah, H. Zeinivand, and A. Fathzadeh, “Prediction of snow water equivalent using artificial neural network and adaptive neuro-fuzzy inference system with two sampling schemes in semi-arid region of Iran,” Journal of Mountain Science, vol. 17, no. 7, pp. 1712–1723, 2020.
[43] S. Marofi, H. Tabari, and H. Z. Abyaneh, “Predicting spatial distribution of snow water equivalent using multivariate non-linear regression and computational intelligence methods,” Water Resources Management, vol. 25, pp. 1417–1435, 2011.
[44] M. M. H. Khan, N. S. Muhammad, and A. El-Shafie, “Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting,” Journal of Hydrology, vol. 590, p. 125380, 2020.
[45] A. Gyaneshwar, A. Mishra, U. Chadha, P. D. Raj Vincent, V. Rajinikanth, G. Pattukandan Ganapathy, and K. Srinivasan, “A contemporary review on deep learning models for drought prediction,” Sustainability, vol. 15, no. 7, p. 6160, 2023.
[46] A. A. Pathak and B. Dodamani, “Comparison of meteorological drought indices for different climatic regions of an Indian river basin,” Asia-Pacific Journal of Atmospheric Sciences, vol. 56, pp. 563–576, 2020.
[47] U. G. Bacanli, M. Firat, and F. Dikbas, “Adaptive neuro-fuzzy inference system for drought forecasting,” Stochastic Environmental Research and Risk Assessment, vol. 23, pp. 1143–1154, 2009.
[48] A. Belayneh, J. Adamowski, B. Khalil, and B. Ozga-Zielinski, “Long-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet neural network and wavelet support vector regression models,” Journal of Hydrology, vol. 508, pp. 418–429, 2014.
[49] A. Belayneh, J. Adamowski, and B. Khalil, “Short-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet transforms and machine learning methods,” Sustainable Water Resources Management, vol. 2, pp. 87–101, 2016.
[50] S. Djerbouai and D. Souag-Gamane, “Drought forecasting using neural networks, wavelet neural networks, and stochastic models: case of the Algerois Basin in North Algeria,” Water Resources Management, vol. 30, pp. 2445–2464, 2016.
[51] Y. Zhang, W. Li, Q. Chen, X. Pu, and L. Xiang, “Multi-models for SPI drought forecasting in the north of Haihe River Basin, China,” Stochastic Environmental Research and Risk Assessment, vol. 31, pp. 2471–2481, 2017.
[52] N. A. Agana and A. Homaifar, “EMD-based predictive deep belief network for time series prediction: An application to drought forecasting,” Hydrology, vol. 5, no. 1, p. 18, 2018.
[53] P. Das, S. R. Naganna, P. C. Deka, and J. Pushparaj, “Hybrid wavelet packet machine learning approaches for drought modeling,” Environmental Earth Sciences, vol. 79, pp. 1–18, 2020.
[54] M. M. H. Khan, N. S. Muhammad, and A. El-Shafie, “Wavelet-ANN versus ANN-based model for hydrometeorological drought forecasting,” Water, vol. 10, no. 8, p. 998, 2018.
[55] Y. Soh, C. H. Koo, Y. Huang, and K. Fung, “Application of artificial intelligence models for the prediction of standardized precipitation evapotranspiration index (SPEI) at Langat River Basin, Malaysia,” Computers and Electronics in Agriculture, vol. 144, pp. 164–173, 2018.
[56] S. E. Perkins and L. V. Alexander, “On the measurement of heat waves,” Journal of Climate, vol. 26, no. 13, pp. 4500–4517, 2013.
[57] J. A. López-Bueno, M. Á. Navas-Martín, J. Díaz, I. J. Mirón, M. Y. Luna, G. Sánchez-Martínez, D. Culqui, and C. Linares, “The effect of cold waves on mortality in urban and rural areas of Madrid,” Environmental Sciences Europe, vol. 33, no. 1, pp. 1–14, 2021.
[58] WMO, “WMO annual report highlights continuous advance of climate change,” WMO Annual Report, Tech. Rep., 2023.
[59] C. Lavaysse, G. Naumann, L. Alfieri, P. Salamon, and J. Vogt, “Predictability of the European heat and cold waves,” Climate Dynamics, vol. 52, pp. 2481–2495, 2019.
[60] V. B. Dodla, G. C. Satyanarayana, and S. Desamsetti, “Analysis and prediction of a catastrophic Indian coastal heat wave of 2015,” Natural Hazards, vol. 87, pp. 395–414, 2017.
[61] R. D. Peng, J. F. Bobb, C. Tebaldi, L. McDaniel, M. L. Bell, and F. Dominici, “Toward a quantitative estimate of future heat wave mortality under global climate change,” Environmental Health Perspectives, vol. 119, no. 5, pp. 701–706, 2011.
[62] W. Nasim, A. Amin, S. Fahad, M. Awais, N. Khan, M. Mubeen, A. Wahid, M. H. Rehman, M. Z. Ihsan, S. Ahmad et al., “Future risk assessment by estimating historical heat wave trends with projected heat accumulation using SimCLIM climate model in Pakistan,” Atmospheric Research, vol. 205, pp. 118–133, 2018.
[63] A. Dosio, “Projection of temperature and heat waves for Africa with an ensemble of CORDEX regional climate models,” Climate Dynamics, vol. 49, no. 1-2, pp. 493–519, 2017.
[64] R. Vautard, A. Gobiet, D. Jacob, M. Belda, A. Colette, M. Déqué, J. Fernández, M. García-Díez, K. Goergen, I. Güttler et al., “The simulation of European heat waves from an ensemble of regional climate models within the EURO-CORDEX project,” Climate Dynamics, vol. 41, pp. 2555–2575, 2013.
[65] S. Singh, R. Mall, J. Dadich, S. Verma, J. Singh, and A. Gupta, “Evaluation of CORDEX-South Asia regional climate models for heat wave simulations over India,” Atmospheric Research, vol. 248, p. 105228, 2021.
[66] N. Narkhede, R. Chattopadhyay, S. Lekshmi, P. Guhathakurta, N. Kumar, and M. Mohapatra, “An empirical model-based framework for operational monitoring and prediction of heatwaves based on temperature data,” Modeling Earth Systems and Environment, vol. 8, no. 4, pp. 5665–5682, 2022.
[67] P. Li, Y. Yu, D. Huang, Z.-H. Wang, and A. Sharma, “Regional heatwave prediction using graph neural network and weather station data,” Geophysical Research Letters, vol. 50, no. 7, p. e2023GL103405, 2023.
[68] A. Chattopadhyay, E. Nabizadeh, and P. Hassanzadeh, “Analog forecasting of extreme-causing weather patterns using deep learning,” Journal of Advances in Modeling Earth Systems, vol. 12, no. 2, p. e2019MS001958, 2020.
[69] D. Fister, J. Pérez-Aracil, C. Peláez-Rodríguez, J. Del Ser, and S. Salcedo-Sanz, “Accurate long-term air temperature prediction with machine learning models and data reduction techniques,” Applied Soft Computing, vol. 136, p. 110118, 2023.
[70] J. Chung, Y. Lee, W. Jang, S. Lee, and S. Kim, “Correlation analysis between air temperature and MODIS land surface temperature and prediction of air temperature using TensorFlow long short-term memory for the period of occurrence of cold and heat waves,” Remote Sensing, vol. 12, no. 19, p. 3231, 2020.
[71] S. Verma, A. Agarwal, and K. Srivastava, “An adaptive approach to detect and track the cyclone path using remote sensing data,” in 2022 IEEE 19th India Council International Conference (INDICON). IEEE, 2022, pp. 1–4.
[72] S. Nath, S. Kotal, and P. Kundu, “Seasonal prediction of tropical cyclone activity over the North Indian Ocean using three artificial neural networks,” Meteorology and Atmospheric Physics, vol. 128, pp. 751–762, 2016.
[73] A. Nair, K. S. Srujan, S. R. Kulkarni, K. Alwadhi, N. Jain, H. Kodamana, S. Sandeep, and V. O. John, “A deep learning framework for the detection of tropical cyclones from satellite images,” IEEE Geoscience and Remote Sensing Letters, vol. 19, pp. 1–5, 2021.
[74] D. Matsuoka, M. Nakano, D. Sugiyama, and S. Uchida, “Deep learning approach for detecting tropical cyclones and their precursors in the simulation by a cloud-resolving global nonhydrostatic atmospheric model,” Progress in Earth and Planetary Science, vol. 5, no. 1, pp. 1–16, 2018.
[75] Z. K. Yip and M. Yau, “Application of artificial neural networks on North Atlantic tropical cyclogenesis potential index in climate change,” Journal of Atmospheric and Oceanic Technology, vol. 29, no. 9, pp. 1202–1220, 2012.
[76] R. S. Lee and J. N. Liu, “Tropical cyclone identification and tracking system using integrated neural oscillatory elastic graph matching and hybrid RBF network track mining techniques,” IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 680–689, 2000.
[77] M. Ali, C. Kishtawal, and S. Jain, “Predicting cyclone tracks in the North Indian Ocean: An artificial neural network approach,” Geophysical Research Letters, vol. 34, no. 4, 2007.
[78] M. Rüttgers, S. Lee, and D. You, “Prediction of typhoon tracks using a generative adversarial network with observational and meteorological data,” arXiv preprint arXiv:1812.01943, 2018.
[79] S. Gao, P. Zhao, B. Pan, Y. Li, M. Zhou, J. Xu, S. Zhong, and Z. Shi, “A nowcasting model for the prediction of typhoon tracks based on a long short term memory neural network,” Acta Oceanologica Sinica, vol. 37, pp. 8–12, 2018.
[80] S. Giffard-Roisin, M. Yang, G. Charpiat, C. Kumler Bonfanti, B. Kégl, and C. Monteleoni, “Tropical cyclone track forecasting using fused deep learning from aligned reanalysis data,” Frontiers in Big Data, p. 1, 2020.
[81] J.-J. Baik and J.-S. Paek, “A neural network model for predicting typhoon intensity,” Journal of the Meteorological Society of Japan. Ser. II, vol. 78, no. 6, pp. 857–869, 2000.
[82] S. Chaudhuri, D. Dutta, S. Goswami, and A. Middey, “Intensity forecast of tropical cyclones over North Indian Ocean using multilayer perceptron model: Skill and performance verification,” Natural Hazards, vol. 65, pp. 97–113, 2013.
[83] R. Pradhan, R. S. Aygun, M. Maskey, R. Ramachandran, and D. J. Cecil, “Tropical cyclone intensity estimation using a deep convolutional neural network,” IEEE Transactions on Image Processing, vol. 27, no. 2, pp. 692–702, 2017.
[84] B. Pan, X. Xu, and Z. Shi, “Tropical cyclone intensity prediction based on recurrent neural networks,” Electronics Letters, vol. 55, no. 7, pp. 413–415, 2019.
[85] R. Zhang, Q. Liu, and R. Hang, “Tropical cyclone intensity estimation using two-branch convolutional neural network from infrared and water vapor images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 58, no. 1, pp. 586–597, 2019.
[86] W. Tian, W. Huang, L. Yi, L. Wu, and C. Wang, “A CNN-based hybrid model for tropical cyclone intensity estimation in meteorological industry,” IEEE Access, vol. 8, pp. 59158–59168, 2020.
[87] C.-J. Zhang, X.-J. Wang, L.-M. Ma, and X.-Q. Lu, “Tropical cyclone intensity classification and estimation using infrared satellite images with deep learning,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 2070–2086, 2021.
[88] J. Devaraj, S. Ganesan, R. M. Elavarasan, and U. Subramaniam, “A novel deep learning based model for tropical intensity estimation and post-disaster management of hurricanes,” Applied Sciences, vol. 11, no. 9, p. 4129, 2021.
[89] J.-Y. Zhuo and Z.-M. Tan, “Physics-augmented deep learning to improve tropical cyclone intensity and size estimation from satellite imagery,” Monthly Weather Review, vol. 149, no. 7, pp. 2097–2113, 2021.
[90] C. Wang, G. Zheng, X. Li, Q. Xu, B. Liu, and J. Zhang, “Tropical cyclone intensity estimation from geostationary satellite imagery using deep convolutional neural networks,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–16, 2021.
[91] J. Tan, Q. Yang, J. Hu, Q. Huang, and S. Chen, “Tropical cyclone intensity estimation using Himawari-8 satellite cloud products and deep learning,” Remote Sensing, vol. 14, no. 4, p. 812, 2022.
[92] Y. Wei, R. Yang, and D. Sun, “Investigating tropical cyclone rapid intensification with an advanced artificial intelligence system and gridded reanalysis data,” Atmosphere, vol. 14, no. 2, p. 195, 2023.
[93] B.-F. Chen, Y.-T. Kuo, and T.-S. Huang, “A deep learning ensemble approach for predicting tropical cyclone rapid intensification,” Atmospheric Science Letters, vol. 24, no. 5, p. e1151, 2023.
[94] X.-W. Chen and X. Lin, “Big data deep learning: challenges and perspectives,” IEEE Access, vol. 2, pp. 514–525, 2014.
[95] S. Chakraborty, R. Tomsett, R. Raghavendra, D. Harborne, M. Alzantot, F. Cerutti, M. Srivastava, A. Preece, S. Julier, R. M. Rao et al., “Interpretability of deep learning models: A survey of results,” in 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). IEEE, 2017, pp. 1–6.

