GGCNN An Efficiency-Maximizing Gated Graph Convolutional Neural Network Architecture For Automatic Modulation Identification
Abstract— Automatic modulation identification (AMI) is a technique to detect the modulation type and order of a received signal, which has the potential to enhance cognitive radio capabilities for future generations of communication devices. However, AMI classifiers traditionally have exhibited low efficiency in low signal-to-noise ratio (SNR) environments. Hence, to address this problem we present our novel Gated Graph Convolutional Neural Network (GGCNN) classifier for feature-based AMI. This architecture includes a robust feature extraction stage to extract deep correlative patterns about the received symbols. Not only does this feature extraction stage use the temporal characteristics of the received symbols, but it also takes advantage of embedded signaling features from the received signal. In the proposed classifier, the received constellations are treated as a graph, allowing it to outperform state-of-the-art classifiers due to its strong performance in graph classification. This is observed clearly in the visualization of the extracted features, even for high-order modulation schemes. In this paper, we present our systematic research conducted for maximizing the efficiency obtainable by our classifier. Extensive simulation results demonstrate a significant accuracy improvement of 18.44 percentage points, and an efficiency increase by 60.78% for our GGCNN-AMI classifier compared to state-of-the-art classifiers in low-SNR environments.

Index Terms— Automatic modulation identification (AMI), feature-based (FB) classification, gated graph convolutional neural network (GGCNN), performance analysis, efficiency improvement.

I. INTRODUCTION

AMI enables receivers to blindly identify a received signal’s modulation scheme, without a priori information about the signal. This ability has been recognized as an important contribution towards cognitive radio technology [2]. This is because implementing AMI in receivers provides the transmitter the freedom to select the best modulation scheme from a wide pool of available modulation schemes based on the environmental conditions providing that the transmitter functions on a software-defined-equipped platform. Then, such changes in transmission characteristics do not need to be communicated with the receiver [1], [2]. In addition, providing this capability to a transceiver system can also further help in adapting other communication parameters such as sample rate in order to maximize achievable channel throughput [3]. Furthermore, AMI possesses multiple other civilian applications such as intelligent modem designs [4], spectrum sensing [5], safety monitoring or threat assessment [6], interference mitigation and dynamic spectrum access [7]. All these applications have either direct or indirect contributions towards the improvement of cognitive radio technology [8]. Out of these applications, interference mitigation plays a more prominent role; yet it traditionally has been a challenge for receivers. With the near-ubiquitous presence of wireless devices, communication systems face a significant problem in spectrum congestion [3]. Thus, by deploying AMI at the receiver and then conducting a sweep of all supported frequencies, the receiver can then
Authorized licensed use limited to: University of Nebraska - Lincoln. Downloaded on May 20,2024 at 19:49:13 UTC from IEEE Xplore. Restrictions apply.
6034 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 22, NO. 9, SEPTEMBER 2023
communications systems in low signal-to-noise ratio (SNR) environments [15]. On the other hand, data-driven AMI essentially involves deep learning models acting as classifiers that are robust in pattern recognition [16]. These types of AMI classifiers are trained before deployment for the modulations they support over the environments they are intended to be used in. From among the various deep learning models [17], convolutional neural networks (CNN) [18] have been broadly and successfully adopted for AMI by learning valid representations from complex data and mapping them into image characteristics and features [19]. Therefore, we select CNN-based deep learning models to review and assess their operations in improving the performance and efficiency of AMI in low-SNR environments.

A. Related Work
Wang et al. [20] proposed a data-driven deep learning model, which we refer to as DR-CNN in our paper, that combines two CNN-based models trained over different simulated datasets generated by the authors. Even though their method of combining CNN-based models performed slightly better than other AMI classifiers they investigated, their method achieves lower efficiency since their classification processing chain appears more computationally complex compared to other schemes in this domain due to the need for training and testing two separate CNN structures. Additionally, the use of simulated datasets rather than real-world captured datasets makes their approach and the trained classifier less applicable to real-world scenarios. Zhang et al. [21] proposed a dual-stream CNN-based model that utilizes long short-term memory (LSTM) to explore feature interactions of the received signal. However, while this model could increase the performance of their AMI classifier for low-order modulation schemes, such an increase does not occur for higher-order modulation schemes. This limits their model’s real-world applicability. Moreover, using dual-stream and LSTM structures in their model significantly increases their model’s computational complexity, which leads to lower efficiency for implementation in low-SNR environments. This model is referred to as LSTM-CNN in our paper. Meng et al. [22] proposed a CNN-based deep learning model, which is trained in two different steps to obtain a closer approximation to the optimal LB approach. This method exhibits a slight increase in the performance of the AMI classifier in low-SNR environments based on the evaluation they provided in their paper. On the other hand, this model’s high computational complexity due to dual-training of their model does not represent a practical implementation of an AMI classifier in a real-world scenario. In our paper’s comparative analysis, we refer to this method as TwoStep-CNN. Wang et al. [23] proposed a light-weight CNN-based model with smaller model sizes, obtained through reduction in kernel sizes or filter counts, which leads to lower computational complexity. In our paper, this model is referred to as Light-CNN. Moreover, in their model, they also took advantage of SNR estimation in the receiver structure. Due to its very light-weight feature extraction structure, this model performs poorly in low-SNR environments. Zhang et al. [24] proposed a CNN-based model that takes advantage of smooth pseudo Wigner-Ville distribution (SPWVD), which can represent both time and frequency features of the received signal. Considering a low-SNR environment and a high-order modulation scheme, SPWVD cannot be effective in carrying deep correlative information of the received constellation with a low number of samples due to the integration process in its mathematical representation. Improving their AMI classifier therefore increases the execution latency, which further implies that their classifier cannot be efficiently used in real-time classification for such environments and high-order modulation schemes. We refer to this model as SPWVD-CNN in our paper.

The closest model to CNN-based architectures that observes the input data as an image and exhibits the capability of learning convolutional features is the graph convolutional neural network (GCNN). In the only GCNN-based work for AMI [25], referred to as AMRGNN in this paper, the authors implemented a stacked-layer GCNN architecture with 2 layers followed by a softmax operation acting as the final classification stage. The presented results in this work show not only higher classification accuracy compared to CNN-based architectures, but also the likelihood of less computational complexity (fewer trainable parameters). This initial study of a GCNN-based model indicates the potential for improving the AMI general performance in low-SNR environments.

B. Problem Statement
The presented related works and their efforts to improve the AMI classifier performance can be categorized as follows:
• The first group contains those works adopting a heavy CNN-based architecture to extract deep correlative information from received symbols. Even though these models have made improvements in increasing the ultimate classification accuracy of their AMI classifier, they also significantly increased the computational complexity of their models’ structure. This further reduced their model’s efficiency, which we define as the tradeoff between classification accuracy and computational complexity, as well as its suitability for real-world implementations.
• The second group are those related works that adopted a light-weight CNN-based architecture to maintain or reduce their model’s computational complexity compared to literature works. These models fail to produce significant improvements in their ultimate AMI classifier’s classification accuracy, however.
Even though related works all have attempted to increase the ultimate performance of their AMI classifiers, their models fail to either produce significant improvements in classification accuracy or maintain the computational complexity, i.e., execution latency as defined in subsection B of section III in the context of AMI. As analyzed in subsection D of section IV, this results in a further decrease in efficiency of their AMI classifiers in low-SNR environments. This is especially true considering that high-order modulation schemes will further exacerbate these problems.

C. Contributions
To improve both the classification accuracy and the efficiency of the AMI classifier in low-SNR environments,
GHASEMZADEH et al.: GGCNN: AN EFFICIENCY-MAXIMIZING GATED GRAPH CONVOLUTIONAL NETWORK ARCHITECTURE FOR AMI 6035
In equation (2), where $L$ is the number of received symbols, the mean of the received symbols $X$ is computed
where $H_{t-1}$ is the old state of $H_t$, and where the update gate $Z_t \in \mathbb{R}^{n \times d}$ and hidden state $K_t \in \mathbb{R}^{n \times d}$ are computed based
Fig. 3. Graph convolutional blocks’ structure.

features measuring the differences between the vector nodes (received samples) of two consecutive constellations carry the geometric information of the constellations and their points (graph nodes). This can mathematically be seen in (11). Moreover, edge features also include the direction information of graph nodes, which can determine if nodes have a forward, backward, or undirected relationship. Determining the directionality of the nodes’ relationship will improve the training process because it can determine the correlation between graph nodes. The more accurate these features are, the higher the ultimate classification accuracy will be.

The adjacency matrix operation is conducted for every two received signal vectors (constellations), which will result in a set of adjacency matrices: $\{A^{(K)}_{\{1,2\}}, A^{(K)}_{\{3,4\}}, \ldots, A^{(K)}_{\{d-1,d\}}\}$. This set will be used in the next procedure, i.e., the graph convolution block, whose overall structure can be seen in Fig 3.

In each GCNN block, the graph convolutional blocks extract the information related to the frequency of the computed adjacency matrix characteristics from the previous operation in the GCNN block. Hence, a Fourier-based operation based on the adjacency matrix should be adopted. We let the eigenvectors $\{X_m\}_{m=0}^{N-1}$ of the Laplacian of adjacency matrix $A^{(K)}_{\{i,j\}}$ that satisfy the orthogonality condition be used as the decomposition bases in the Fourier transform instead of the conventional complex exponentials [37]. $N$ represents the dimension of the eigenvectors. Then, the Fourier transform and its inverse can be defined, respectively, as:

$$\hat{A}^{(K)}_{\{i,j\}}(\lambda_m) = \sum_{n=0}^{N-1} X_m^{T}(n)\, A^{(K)}_{\{i,j\}}(n) = X^{T} A^{(K)}_{\{i,j\}}, \tag{12}$$

$$A^{(K)}_{\{i,j\}} = \sum_{m=0}^{N-1} \hat{A}^{(K)}_{\{i,j\}}(\lambda_m)\, X_m(n) = X \hat{A}^{(K)}_{\{i,j\}}. \tag{13}$$

This can be interpreted as the expansion of $A^{(K)}_{\{i,j\}}$ in terms of the eigenvectors of the Laplacian [38]. Thus, the convolution over two consecutive adjacency matrices can be converted into a point-wise product in the Fourier domain as:

Due to the diagonality of either $A^{(K)}_{\{i,i+1\}}$ or $A^{(K)}_{\{i+2,i+3\}}$, equation (14) can be rewritten based on equation (15) as:

$$A^{(K)}_{\{i,i+1\}} * A^{(K)}_{\{i+2,i+3\}} = X\,\mathrm{diag}\big(\hat{A}^{(K)}_{\{i+2,i+3\}}(\lambda_0), \ldots, \hat{A}^{(K)}_{\{i+2,i+3\}}(\lambda_{N-1})\big)\,X^{T} A^{(K)}_{\{i,i+1\}}. \tag{16}$$

This operation can be considered as a convolution kernel, which can be implemented through a set of free parameters $\{\theta_m\}_{m=0}^{N-1} \sim \mathrm{diag}\big(\hat{A}^{(K)}_{\{i+2,i+3\}}(\lambda_0), \ldots, \hat{A}^{(K)}_{\{i+2,i+3\}}(\lambda_{N-1})\big)$ in the Laplacian eigenspace, i.e., Fourier domain. Moreover, these parameters can also be considered as a function of eigenvalues $G(\Theta) = \mathrm{diag}(\theta_0, \ldots, \theta_{N-1})$, which leads to a rewritten form of equation (16) as:

$$A^{(K)}_{\{i,i+1\}} * A^{(K)}_{\{i+2,i+3\}} = X\,G(\Theta)\,X^{T} A^{(K)}_{\{i,i+1\}}. \tag{17}$$

The above convolution kernel operation exhibits notable computational cost due to the eigendecomposition performed in each computation described by (17), as well as incompatibility with CNN local connections due to the relationship of these parameters to the global vertices in the defined Laplacian space [39]. Addressing these problems involves implementing a fast localized convolution based on low-order polynomial approximation. Therefore, $G(\Theta)$ can be expressed as:

$$G(\Theta) = \sum_{r=0}^{R} \theta_r \Lambda^{r}, \tag{18}$$

where $R$ is the polynomial order and $\{\Lambda\}_{r=0}^{R}$ is the vector of polynomial coefficients. Then, equation (17) can be rewritten as:

$$A^{(K)}_{\{i,i+1\}} * A^{(K)}_{\{i+2,i+3\}} = X\Big(\sum_{r=0}^{R} \theta_r \Lambda^{r}\Big)X^{T} A^{(K)}_{\{i,i+1\}}. \tag{19}$$

Due to the independence of free parameters and modified Fourier transform, we can obtain:

$$A^{(K)}_{\{i,i+1\}} * A^{(K)}_{\{i+2,i+3\}} = \sum_{r=0}^{R} \theta_r \big(X \Lambda^{r} X^{T}\big) A^{(K)}_{\{i,i+1\}}, \tag{20}$$

through which eigendecomposition computation is avoided, since the convolution kernel is then calculated by K number of multiplications instead. Incorporating the ReLU activation function after graph convolution operation with a layer-wise propagation rule can be expressed as:

$$Y_i^{(K)} = \sigma\Big(\sum_{i=0}^{P}\Big(\sum_{r=0}^{R} \theta^{(K)}_{\{i,i+1\}r}\big(X \Lambda^{r} X^{T}\big) A^{(K)}_{\{i,i+1\}}\Big) + b_j\Big), \tag{21}$$
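The step from the spectral form in equations (17)-(19) to the localized form in equation (20) can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes a combinatorial Laplacian $L = D - W$ built from a random symmetric weight matrix, and it reads $\Lambda^{r}$ as the r-th power of the diagonal eigenvalue matrix, so that $X \Lambda^{r} X^{T} = L^{r}$. The names `W`, `theta`, `y_spectral`, and `y_poly` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small graph: N nodes with random symmetric edge weights.
N = 6
W = np.triu(rng.random((N, N)), 1)
W = W + W.T                       # symmetric, zero diagonal

# Assumed combinatorial Laplacian L = D - W (symmetric, orthonormal eigenvectors).
L = np.diag(W.sum(axis=1)) - W

# Eigendecomposition L = X diag(lam) X^T; the columns of X act as the Fourier basis.
lam, X = np.linalg.eigh(L)

# Free parameters theta_0..theta_R of an order-R polynomial filter (arbitrary values).
R = 3
theta = rng.standard_normal(R + 1)

# Input to be filtered; stands in for the adjacency-matrix features A_{i,i+1}.
A = rng.standard_normal((N, N))

# Spectral route, equations (17)-(19): X G(Theta) X^T A with G = sum_r theta_r Lambda^r.
G = np.diag(sum(theta[r] * lam**r for r in range(R + 1)))
y_spectral = X @ G @ X.T @ A

# Localized route, equation (20): sum_r theta_r L^r A, with no eigenvectors required.
y_poly = np.zeros_like(A)
L_power_A = A.copy()              # holds L^r A, starting at r = 0
for r in range(R + 1):
    y_poly += theta[r] * L_power_A
    L_power_A = L @ L_power_A     # advance to the next power with one matrix product

print(np.allclose(y_spectral, y_poly))
```

Under these assumptions both routes give the same filtered output, but only the first one needs `np.linalg.eigh`; the second uses repeated matrix products, which is the saving the text attributes to the polynomial approximation.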
its principal functionality - selection of identified features with the least amount of repetition within the feature set and removal of all others - contradicts the goal of training the model to focus on tightly clustered features. If overfitting happens in an AMI classifier, it generally causes a reduced classification accuracy, especially when the data behaves more randomly, which is the case with received samples in low-SNR environments. Hence, a dropout is employed to prevent overfitting in the training process. A concatenation process is applied to pass the raw features extracted from received symbols by the adjacency matrix operation to the next GCNN block, where the same operation is implemented. Hence, the output of the graph convolutional block is:

$$V_i^{(K)} = \mathrm{Pooling}\big(Y_i^{(K)}\big) + A^{(K)}_{\{i,i+1\}}. \tag{22}$$

Network optimization and parameter training are conducted through minimizing the loss function for each batch of data and the gradient back-propagation algorithm, respectively. This procedure is similar to the conventional CNN models apart from the convolution operation, as can be seen in equation (20). As stated above, this modified convolution operation only involves the addition and multiplication of the adjacency matrix and training parameters. Hence, the two gradients can be obtained as:

$$\frac{\partial J}{\partial \theta_{\{i,i+1\}r}} = \big(X \Lambda^{r} X^{T}\big) A^{(K)}_{\{i,i+1\}} \frac{\partial J}{\partial A^{(K)}_{\{i+2,i+3\}}}, \tag{23}$$

and

$$\frac{\partial J}{\partial A^{(K)}_{\{i,i+1\}}} = \sum_{i=0}^{P}\Big(\frac{\partial J}{\partial A^{(K)}_{\{i+2,i+3\}}}\Big(\sum_{r=0}^{R} \theta_{\{i,i+1\}r}\big(X \Lambda^{r} X^{T}\big)\Big)\Big). \tag{24}$$

This training operation will exhibit the same learning computational complexity as conventional CNN models [40]. This fact helps limit the computational complexity of the designed GCNN sequence. To serve the same purpose, a dense layer with 16 units, Xavier uniform as the kernel initializer, and no activation function or kernel regularizer is implemented between each GCNN block. This operation is followed by a flatten layer to turn the high-dimensional space of extracted features into a one-dimensional vector that can be used in the subsequent GCNN block, which leverages the same operation as explained above. Finally, the features that were extracted and abstracted in the feature extraction stage will then be used in the classification module.

D. Classification Module
The classification module consists of a fully connected neural network with two layers. The first layer has 96 neurons while the second layer’s number of neurons is equivalent to the targeted number of modulation schemes to be classified. The Softmax activation function produces the final probability of detecting a given modulation type for a received signal.

With the functionality of each GGCNN-AMI classifier element detailed, we can observe that designing such a feature extraction stage capable of extracting signaling- and temporal-based features on top of the GCNN sequence, where the deep feature extraction and training processes are conducted by adjacency matrix and graph convolutional block operations, enables the proposed GGCNN-AMI classifier to operate with high robustness in low-SNR environments.

III. SIMULATION INTEGRATION, EVALUATION PROCEDURE AND EFFICIENT ARCHITECTURE
This section provides information about the simulation integration, such as the chosen dataset and the environment selected for training and testing. We also introduce the evaluation procedure along with its comparison factors in our performance analysis. Using this evaluation procedure, we will determine an efficient architecture of our proposed GGCNN-AMI classifier.

A. Simulation Integration
The RadioML 2018.01A dataset [41] is selected as the modulation signal reference for our evaluation effort. This dataset contains 24 digital and analog modulation schemes, each with 4096 signal waveforms at each individual SNR point. Each signal waveform includes 1024 separate complex IQ samples (2 × 1024). This creates a dataset of 2,555,904 vectors of modulated signal waveforms. Additionally, having separate complex IQ samples provides flexibility for the AMI classifier to process received symbols in their complex or real-valued forms. The number of training, validation and testing samples of the RadioML dataset are set to be 50%, 10% and 40%, respectively. We selected a smaller than usual training dataset compared to literature works, specifically with the aim to represent a real-world scenario where the available and usable samples for training of an AMI classifier might be few.

By studying the training procedure of most of the literature works, we realize that their models were each trained on a subset of modulation schemes available in the RadioML dataset. However, it should be noted that such limited training does not necessarily represent a real-world scenario, where typically a transceiver is capable of handling a wide range of modulation schemes and thus any AMI classifier similarly needs to be trained on a wide set of modulation schemes. Therefore, in order to achieve a more realistic view of the performance and as part of our comparative evaluation, we train our GGCNN-AMI model as well as the comparison models, which will be introduced in the next section, on all digital modulation schemes in the dataset. The training procedure is designed to be executed pre-deployment on a per-modulation scheme basis, over the entire supported SNR range. We perform this process for the proposed GGCNN-AMI classifier and all comparison models. This training procedure represents a more realistic scenario where the AMI classifier has no a priori knowledge of the SNR value for the environment where it will be deployed.

Our training, validating and testing environments are implemented using the deep learning library Keras running on top of TensorFlow, executed on our university’s supercomputing infrastructure, HCC Crane [42]. The training factors for our designed GGCNN-AMI classifier can be seen in Table II.
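The dataset figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers stated in the text (24 modulation schemes, 4096 waveforms per SNR point, 2,555,904 waveforms in total, and the 50%/10%/40% split). The number of SNR points is derived from those figures rather than taken from the text, and all variable names are hypothetical.

```python
# Figures quoted in the text for the RadioML 2018.01A dataset.
num_modulations = 24
waveforms_per_snr_point = 4096
total_waveforms = 2_555_904

# Implied number of SNR points per modulation scheme (derived, not quoted).
snr_points, remainder = divmod(
    total_waveforms, num_modulations * waveforms_per_snr_point
)

# 50% / 10% / 40% split; the test set takes whatever is left so the parts sum exactly.
train_count = total_waveforms * 50 // 100
validation_count = total_waveforms * 10 // 100
test_count = total_waveforms - train_count - validation_count

print(f"SNR points per modulation: {snr_points} (remainder {remainder})")
print(f"train={train_count}, validation={validation_count}, test={test_count}")
```

The division comes out exact, which suggests the quoted total is simply the product of modulation schemes, SNR points, and waveforms per SNR point. How the authors actually round the split sizes is not stated, so the integer rounding above is an assumption.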
Fig. 4. Training and validation accuracy and loss for our designed model during the model verification process (3 GCNN blocks, ELU as activation function, no-pooling, 16-QAM, 0 dB SNR).

Fig. 5. Average PCC based on the number of GCNN blocks.

TABLE III
EXECUTION LATENCY FOR 1-4 GCNN BLOCKS IN SECONDS

the final architecture. To accomplish this, we can simply investigate the PCC gain versus execution latency increase while stepwise increasing the number of GCNN blocks. Fig 5 shows the average PCC for all modulation schemes for these tests using 1 through 4 GCNN blocks. It should be noted that since at this point we have not yet conducted our investigation into determining the best choice for activation function and pooling operations, we will continue to use ELU and no-pooling for determining the best number of GCNN blocks.

As can be seen from Fig 5, increasing the number of GCNN blocks from 1 to 2 and from 2 to 3 results in a significant improvement of PCC by on average 15.45 and 17.09 percentage points (p.p.), respectively, across the entire range of SNR. However, having 4 GCNN blocks does not exhibit any substantial PCC gain. We observe only a gain of 2.81 p.p. on average compared to using 3 GCNN blocks, especially in lower SNR values, which is the focus of this paper. Investigating the execution latency, shown in Table III, alongside the results in Fig 5 provides confirmation that the selection of 3 GCNN blocks for the final model’s architecture provides the most effective and efficient performance outcome.

Based on Table III, adding a GCNN block will incur a relative increase of 8% in execution latency. The PCC improvement justifies this increase in execution latency for up to 3 GCNN blocks. Hence, we choose to use 3 GCNN blocks in our proposed model’s architecture.

In the second step of determining the efficient architecture, we will need to evaluate different options for activation function selection in the adjacency matrix, as shown in Fig 2 and equation (10), and for different pooling operations within the graph convolutional block, as shown in Fig 3 and equation (22). For this purpose, as mentioned earlier, we study four different activation functions (ReLU, LeakyReLU, GeLU and ELU) and three different pooling operations (max-, average- or no-pooling), resulting in 12 combinations that can be used in our model. For determining the most effective combination, we first need to investigate each combination’s probability of correct classification. We conduct this evaluation by averaging across all available modulation schemes within our dataset. This allowed us to include any modulation-specific effects in this selection process. The resulting PCC averaged across modulation schemes can be seen in Fig 6, plotted for all 12 aforementioned activation+pooling combinations.

We can observe from these results that purely in terms of accuracy, the combination of LeakyReLU + No Pooling
TABLE IV
NORMALIZED EXECUTION LATENCY OF 12 COMBINATIONS TO DETERMINE EFFICIENT ARCHITECTURE

TABLE V
COMPUTED EFFICIENCY FOR EACH AFOREMENTIONED COMBINATION
Fig. 7. Training and validation accuracy and losses of the comparison models.
TABLE VI
AVERAGE PERFORMANCE GAIN OF THE PROPOSED GGCNN-AMI CLASSIFIER COMPARED TO OTHER CLASSIFIERS
Fig. 11. Receiver operating characteristic (ROC), decision threshold versus detection probability, and precision versus recall plots for the proposed GGCNN model compared to related works.

TABLE VII
NORMALIZED EXECUTION LATENCY OF EVALUATED AMI CLASSIFIERS

D. Efficiency
Based on the definition of AMI classifier efficiency presented in the previous section, Fig 12 shows the computed ρ for each AMI classifier based on the lower and upper performance bounds, shown separately for training + validation and classification.

Fig 12(a) shows that the efficiency of the proposed GGCNN-AMI classifier is on average 63% and 87.84% higher than the comparison models’ training + validation and classification efficiency, respectively. Fig 12(b) indicates an average efficiency improvement of 43.23% and 49% for training + validation and classification efficiency, respectively, for the proposed GGCNN-AMI classifier compared to all other investigated comparison models.

To summarize, the efficiency of the proposed GGCNN-AMI classifier is on average 75.42% and 46.11% higher for lower and upper bounds, respectively, while considering both training + validation and classification execution latency compared to all comparison models. In other words, our GGCNN approach achieves on average a 60.78% higher efficiency, which further provides a compelling argument justifying the incurred increase in execution latency. In particular, the high efficiency of GGCNN for the lower bound confirms that the proposed GGCNN-AMI classifier was very successful in improving the AMI classifier performance for low-SNR environments, while not significantly increasing computational complexity.

Fig. 12. Computed efficiency for each AMI classifier based on lower and upper bounds for their respective training + validation and classification execution latency.

V. CONCLUSION
In this research, we proposed a novel architecture to maximize the efficiency of feature-based AMI in low-SNR environments. The proposed GGCNN-AMI classifier includes a feature extraction module, where embedded signaling-based and temporal features are extracted and appended to the original received signal. Additionally, a sequence of graph convolutional neural networks is implemented to extract the graph-based features. Considering the significant effect of the activation function and pooling operation on the performance of the AMI classifier, we adopted an efficient architecture
through investigating all combinations of activation functions [13] P. Ghasemzadeh, S. Banerjee, M. Hempel, M. Alahmad, and H. Sharif,
and pooling operations considered in this research. Visual- “Analysis of distribution test-based and feature-based approaches toward
automatic modulation classification,” in Proc. IEEE 30th Annu. Int.
ization of the scatter maps of the extracted features showed Symp. Pers., Indoor Mobile Radio Commun. (PIMRC), Sep. 2019,
the robustness of our designed feature extraction module in pp. 1–6.
producing highly distinguishable features, which will directly [14] S. Banerjee, J. Santos, M. Hempel, P. Ghasemzadeh, and H. Sharif,
“A novel method of near-miss event detection with software defined
impact the decision making bounds of the AMI classifier. radar in improving railyard safety,” Safety, vol. 5, no. 3, p. 55, 2019.
The proposed GGCNN-AMI classifier exhibits a significant [15] Y. A. Eldemerdash, O. A. Dobre, O. Üreten, and T. Yensen, “A robust
improvement of 20.89 p.p. and 16 p.p. in low-SNR envi- modulation classification method for PSK signals using random graphs,”
IEEE Trans. Instrum. Meas., vol. 68, no. 2, pp. 642–644, Feb. 2018.
ronments for 256QAM and BPSK, respectively, representing
[16] T. J. O’Shea and J. Hoydis, “An introduction to deep learning for
the lower and upper performance bounds. We also evalu- the physical layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4,
ated the execution latency and efficiency of our proposed pp. 563–575, Oct. 2017.
GGCNN-AMI classifier and compared them against those of [17] S. Banerjee, M. Hempel, P. Ghasemzadeh, and H. Sharif, “A novel
biomimicry-based analysis of D2D user association retention for achiev-
other state-of-the-art published AMI classifiers. We showed ing maximal throughput,” in Proc. 15th Int. Wireless Commun. Mobile
that the proposed GGCNN-AMI classifier achieves a signifi- Comput. Conf. (IWCMC), Jun. 2019, pp. 2036–2042.
cant efficiency improvement by, on average, 60.78% compared [18] S. Huang, “Automatic modulation classification using compressive con-
volutional neural network,” IEEE Access, vol. 7, pp. 79636–79643,
to all AMI classifiers investigated as part of this research. 2019.
[19] S. Huang, Y. Yao, Z. Wei, Z. Feng, and P. Zhang, “Automatic modulation
classification of overlapped sources using multiple cumulants,” IEEE
ACKNOWLEDGMENT Trans. Veh. Technol., vol. 66, no. 7, pp. 6089–6101, Jul. 2016.
[20] Y. Wang, M. Liu, J. Yang, and G. Gui, “Data-driven deep learning for
This study is being conducted at the University of Nebraska- automatic modulation recognition in cognitive radios,” IEEE Trans. Veh.
Lincoln by the faculty and students at the Advanced Telecom- Technol., vol. 68, no. 4, pp. 4074–4077, Apr. 2019.
munications Engineering Laboratory (www.TEL.unl.edu). [21] Z. Zhang, H. Luo, C. Wang, C. Gan, and Y. Xiang, “Automatic modula-
tion classification using CNN-LSTM based dual-stream structure,” IEEE
Trans. Veh. Technol., vol. 69, no. 11, pp. 13521–13531, Oct. 2020.
R EFERENCES [22] F. Meng, P. Chen, L. Wu, and X. Wang, “Automatic modulation
classification: A deep learning enabled approach,” IEEE Trans. Veh.
[1] L. Han, F. Gao, Z. Li, and O. A. Dobre, “Low complexity automatic Technol., vol. 67, no. 11, pp. 10760–10772, Sep. 2018.
modulation classification based on order-statistics,” IEEE Trans. Wireless [23] Y. Wang, J. Yang, M. Liu, and G. Gui, “LightAMC: Lightweight
Commun., vol. 16, no. 1, pp. 400–411, Jan. 2017. automatic modulation classification via deep learning and compressive
[2] S. Banerjee, M. Hempel, P. Ghasemzadeh, N. Albakay, and H. Sharif, sensing,” IEEE Trans. Veh. Technol., vol. 69, no. 3, pp. 3491–3495,
“High speed train wireless communication: Handover performance anal- Mar. 2020.
ysis for different radio access technologies,” in Proc. ASME Joint Rail [24] Z. Zhang, C. Wang, C. Gan, S. Sun, and M. Wang, “Automatic mod-
Conf., vol. 58523, 2019, Art. no. V001T03A006. ulation classification using convolutional neural network with features
[3] S. Amuru and C. R. da Silva, “A blind preprocessor for modulation clas- fusion of SPWVD and BJD,” IEEE Trans. Signal Inf. Process. Netw.,
sification applications in frequency-selective non-Gaussian channels,” vol. 5, no. 3, pp. 469–478, Sep. 2019.
IEEE Trans. Commun., vol. 63, no. 1, pp. 156–169, Jan. 2015. [25] Y. Liu, Y. Liu, and C. Yang, “Modulation recognition with graph
[4] R. Gupta, S. Kumar, and S. Majhi, “Blind modulation classification convolutional network,” IEEE Wireless Commun. Lett., vol. 9, no. 5,
for asynchronous OFDM systems over unknown signal parameters pp. 624–627, May 2020.
and channel statistics,” IEEE Trans. Veh. Technol., vol. 69, no. 5, [26] P. Ghasemzadeh, S. Banerjee, M. Hempel, and H. Sharif, “A novel deep
pp. 5281–5292, May 2020. learning and polar transformation framework for an adaptive automatic
[5] S. Banerjee, M. Hempel, N. Albakay, P. Ghasemzadeh, and H. Sharif, modulation classification,” IEEE Trans. Veh. Technol., vol. 69, no. 11,
“A framework for high-speed passenger train wireless network pp. 13243–13258, Nov. 2020.
radio evaluations,” in Proc. Joint Rail Conf., Apr. 2019, [27] V. N. Ioannidis, A. G. Marques, and G. B. Giannakis, “Tensor graph
Art. no. V001T08A003. convolutional networks for multi-relational and robust learning,” IEEE
[6] H. C. Wu, M. Saquib, and Z. Yun, “Novel automatic modulation Trans. Signal Process., vol. 68, pp. 6535–6546, 2020.
classification using cumulant features for communications via multipath [28] F. Gama, A. G. Marques, G. Leus, and A. Ribeiro, “Convolutional neural
channels,” IEEE Trans. Wireless Commun., vol. 7, no. 8, pp. 3098–3105, network architectures for signals supported on graphs,” IEEE Trans.
Aug. 2008. Signal Process., vol. 67, no. 4, pp. 1034–1049, Feb. 2019.
[7] S. Banerjee, M. Hempel, P. Ghasemzadeh, Y. Qian, and H. Sharif, [29] M. Gustineli, “A survey on recently proposed activation functions for
“A novel approach to social-behavioral D2D trust associations using deep learning,” 2022, arXiv:2204.02921.
self-propelled Voronoi,” in Proc. IEEE 90th Veh. Technol. Conf. (VTC- [30] P. Movva, “Survey on activation functions: A comparative study between
Fall), Sep. 2019, pp. 1–5. state-of-the-art activation functions and oscillatory activation functions,”
[8] S. Hu, Y. Pei, P. P. Liang, and Y.-C. Liang, “Deep neural network for Indian Inst. Inf. Technol., Sri City, India, Tech. Rep., 2022. [Online].
robust modulation classification under uncertain noise conditions,” IEEE Available: https://engrxiv.org/preprint/view/2250/version/3345
Trans. Veh. Technol., vol. 69, no. 1, pp. 564–577, Jan. 2020. [31] L. Datta, “A survey on activation functions and their relation with Xavier
[9] N. Gresset, H. Halbauer, J. Koppenborg, W. Zirwas, and H. Khanfir, and He normal initialization,” 2020, arXiv:2004.06632.
“Interference-avoidance techniques: Improving ubiquitous user experi- [32] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate
ence,” IEEE Veh. Technol. Mag., vol. 7, no. 4, pp. 37–45, Dec. 2012. deep network learning by exponential linear units (ELUs),” 2015,
[10] S. Rajendran, W. Meert, D. Giustiniano, V. Lenders, and S. Pollin, “Deep arXiv:1511.07289.
learning models for wireless signal classification with distributed low- [33] D. Hendrycks and K. Gimpel, “Gaussian error linear units (GELUs),”
cost spectrum sensors,” IEEE Trans. Cogn. Commun. Netw., vol. 4, no. 3, 2016, arXiv:1606.08415.
pp. 433–445, Sep. 2018. [34] Activation Functions Explained (GELU, SELU, ELU, RELU and More).
[11] Y. Wang, J. Wang, W. Zhang, J. Yang, and G. Gui, “Deep learning-based Accessed: Apr. 17, 2022. [Online]. Available: https://mlfromscratch.
cooperative automatic modulation classification method for MIMO com/activation-functions-explained
systems,” IEEE Trans. Veh. Technol., vol. 69, no. 4, pp. 4575–4579, [35] V. Garcia and J. Bruna, “Few-shot learning with graph neural networks,”
Apr. 2020. 2017, arXiv:1711.04043.
[12] M. Abu-Romoh, A. Aboutaleb, and Z. Rezki, “Automatic modulation [36] X. Zhang, C. Xu, X. Tian, and D. Tao, “Graph edge convolutional neural
classification using moments and likelihood maximization,” IEEE Com- networks for skeleton-based action recognition,” IEEE Trans. Neural
mun. Lett., vol. 22, no. 5, pp. 938–941, May 2018. Netw. Learn. Syst., vol. 31, no. 8, pp. 3047–3060, Aug. 2019.
Michael Hempel (Member, IEEE) received the Ph.D. degree in computer engineering from the University of Nebraska–Lincoln, Nebraska. He is currently working as a Research Assistant Professor at the Advanced Telecommunication Engineering Laboratory (TEL), University of Nebraska–Lincoln. He has authored or coauthored more than 150 publications in major international journals and conferences. His research interests include wireless communication protocol design and performance analysis, wireless multimedia services, and distributed computing. For his research in networking, he has also been developing various network simulation and analysis solutions for streaming media and WiFi/WiMAX technologies. He has served as a TPC member for numerous international conferences.
Honggang Wang (Fellow, IEEE) is currently a Professor at the University of Massachusetts (UMass) Dartmouth. His research interests include wireless health, body area networks, cyber security, mobile multimedia and cloud, wireless networks and cyber-physical systems, and big data in mHealth. He has also been serving as the Editor-in-Chief for IEEE Internet of Things Journal.