UPTEC F 19006

Degree project, 30 credits
19 March 2019

Deinterleaving pulse trains with DBSCAN and FART

Shad Mahmod
Abstract

Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet)
UTH-enheten

Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student

Studying radar pulses and looking for certain patterns is critical in order to
assess the threat level of the environment around an antenna. In order to study
the electromagnetic pulses emitted from a certain radar, one must first register
and identify these pulses. Usually there are several active transmitters in an
environment and an antenna will register pulses from various sources. In order
to study the different pulse trains, the registered pulses first have to be
sorted so that all pulses that are transmitted from one source are grouped
together.
This project aims to solve this problem using Density-Based Spatial Clustering
of Applications with Noise (DBSCAN) and to compare the results with those
obtained by Fuzzy Adaptive Resonance Theory (FART). We aim to further dig into
these methods and map out how factors such as feature selection and training
time affect the results. A solution based on the DBSCAN method is proposed which
allows online clustering of new points introduced to the system.
The methods are implemented and tested on simulated data. The data consists of
pulse trains from simulated transmitters with unique behaviors. The deployed
methods are then tested while varying the parameters of the models as well as
the number of pulse trains they are asked to deinterleave. The results of
applying the models are then evaluated using the adjusted Rand index (ARI).
The results indicate that in most cases using all possible data (in this case
the angle of arrival, radio frequency, pulse width and amplitudes of the pulses)
generates the best results. Rescaling the data further improves the result, and
tuning the parameters shows that the models work well when increasing the
number of emitters. The results also indicate that the DBSCAN method can be
used to get accurate estimates of the number of emitters transmitting. The
online DBSCAN generates a higher ARI than FART on the simulated data set but
has a higher worst-case computational cost.

Supervisor: Rickard Norberg
Subject reader: Niklas Wahlström
Examiner: Tomas Nyberg
ISSN: 1401-5757, UPTEC F 19006
Printed by: UPPSALA

Popular Science Summary
Radio detection and ranging (RADAR) is a system that uses electromagnetic
radiation to map its surroundings. A radar sends out a pulse and studies the
echo of the pulse. Depending on whether an echo arises, and if so how long it
took for the echo to arise, the radar can detect objects in the nearby
environment as well as an object's velocity and distance relative to the radar.
A radar can be used for a range of purposes such as gathering weather
information, geographic mapping of ground surfaces and detection of foreign
objects.
The purpose of a radar can be estimated by studying the radiation it emits.
Information about the purpose can be critical in situations with hostile units
in the surroundings. In many cases radars work by sending out pulses of
radiation in sequence. When an antenna listens for pulses, it detects and
registers pulses from all active units nearby, provided that the pulses are
strong enough. To be able to study the environment and assess the threat, the
pulses must be sorted so that all pulses from a given source are clustered
together.
This report studies two machine learning methods for sorting these pulses. Both
methods study different parameters of the pulses, such as their radio frequency
and amplitude. The first, density-based spatial clustering of applications with
noise (DBSCAN), looks for regions with a high density of pulses. If, for
example, it detects that many pulses occur with a frequency around 10 GHz and
an amplitude of -20 dB, it clusters them together. A modification of DBSCAN is
proposed in this report for identifying the source of a pulse in real time.
The second method, fuzzy adaptive resonance theory (FART), instead creates
something that can be likened to virtual pulses. For each radar the method
believes it has discovered, a virtual pulse is created with, for example, a
certain radio frequency and amplitude, which is to represent all pulses
belonging to this radar. When a new pulse is detected, it is compared with all
virtual pulses and assigned to the virtual pulse it most resembles according to
a set of criteria. The values of the chosen virtual pulse are then updated.
These methods have been studied by evaluating them on simulated data. The
methods have been evaluated according to how well they solve the problem when
the number of radars increases and when the input data is changed. The input
data has been changed partly by controlling which information about the pulses
is used. One can, for example, choose to use only information about the
amplitude of the pulses, or to use information about both amplitude and radio
frequency.
The results show that the newly proposed version of DBSCAN is a suitable
solution to the clustering problem. We further map out how both the proposed
method and FART are affected by changing a number of parameters, as well as the
computational cost of handling a new pulse.

Acknowledgements
I would like to thank all the people who have been supporting me in one
way or another throughout my project. My supervisor Rickard Norberg for
his guidance throughout this project and providing me with a deep well of
knowledge in the field of radar. My subject reader, Niklas Wahlström for tak-
ing so much time to read through my project and offering his thoughts and
suggestions to improve it. Without doubt doing much more than he initially
signed up for. My colleagues at Saab (especially Eric, Erik and Gustaf) for
always keeping their door open for discussing the field of machine learning.
Adam for being a great office-mate providing me with motivation and joy at
the workplace!


Contents

Abstract
Popular Science Summary
Acknowledgements

1 Introduction
1.1 Problem Statement

2 Radar Theory
2.1 Basic Radar Principle
2.2 Continuous Wave Radar
2.3 Pulsed Radar
2.3.1 Coherency in Pulsed Radar
2.3.2 Pulse Width
2.3.3 Radio Frequency
2.3.4 Amplitude
2.3.5 Angle of Arrival
2.3.6 Pulse Repetition Interval
2.4 Electronic Warfare

3 Deinterleaving of Radar Signals
3.1 Extracting Pulses from Data
3.2 Deinterleaving
3.3 Time of Arrival Methods
3.3.1 Sequence Search
3.4 Non-Time of Arrival Methods
3.5 Clustering
3.6 Density-Based Spatial Clustering of Applications with Noise
3.6.1 minPts-parameter
3.6.2 ε-parameter
3.6.3 Importance of Distance Function
3.7 Proposition for Online Version of DBSCAN
3.8 Fuzzy Adaptive Resonance Neural Network
3.8.1 Input Data
3.8.2 The F Layer
3.8.3 Training the Network
3.8.4 Classification Stage
3.9 Evaluating Clustering Quality
3.9.1 Rand Index
3.9.2 Adjusted Rand Index

4 Method
4.1 Data Generation
4.2 Emitter Definitions
4.2.1 Pulse Width
4.2.2 Pulse Repetition Interval
4.2.3 Frequency
4.2.4 Position
4.3 Adding Noise
4.4 Distance Function
4.5 Scaling the Features
4.6 Training and Validation Sets
4.7 Programming Environment

5 Results
5.1 DBSCAN
5.1.1 Parameter Tuning
5.1.2 Choosing Features
5.1.3 Scaling Features
5.1.4 Online DBSCAN
5.2 FART
5.2.1 Parameter Tuning
5.2.2 Choosing Features
5.2.3 Scaling Features
5.2.4 Online FART
5.3 Comparing DBSCAN and FART
5.3.1 Increasing Number of Emitters
5.3.2 Predicted Number of Emitters
5.3.3 Online Clustering

6 Discussion
6.1 Parameters
6.2 Number of Emitters
6.3 Trigonometric Features with DBSCAN
6.4 Online Versions
6.5 Computational Costs
6.6 Future Work

7 Conclusion

A Complexities
A.1 Fuzzy ART
A.2 DBSCAN
A.2.1 Classifying a Batch
A.2.2 Introducing a New Point

Bibliography

Chapter 1

Introduction

Radio detection and ranging (radar) equipment is used today for a variety of
tasks such as detecting airplanes, mapping surroundings and determining
weather conditions [1]. The basic radar works by transmitting electromagnetic
waves and listening to the echoes of these waves from the surrounding
environment. These waves can be transmitted continuously or in pulses, where
the radar toggles between transmission and reception. In certain scenarios it
is important to hinder the use of the electromagnetic spectrum by hostile
units while preserving it for friendly units. Actions taken with this purpose
are termed electronic warfare (EW) [2].
In an environment with active hostile units, the electromagnetic signals
transmitted from these units can give hints about their objectives and what
kind of vessels they are attached to [3]. Detecting the waves that these radars
transmit and analyzing them can therefore help in determining their threat
level. When a unit attempts this task, its antenna detects the signals from
several transmitters simultaneously and obtains a superposition of the pulses
that the radars transmit. An example is given in figure 1.1a where two radars
are active. In figure 1.1b the superimposed pulses, as seen by the receiver,
are depicted. As a first step in disentangling the pulses, a fast Fourier
transform can be used to sort the pulses according to frequency, but even then,
pulses with similar frequencies will be superimposed on each other. There is no
labeling encoded into the data and at first glance it is impossible to tell
which pulse belongs to which radar. Furthermore, the superposition adds the
first two pulses together. This entanglement of pulse trains is called
interleaving [3]. In a real environment there is also noise, and pulses can be
missing or spurious.
Information regarding an individual pulse, such as its time of arrival, pulse
width and amplitude, is extracted and stored in a pulse descriptor word. This
information is then used to label each individual pulse such that the pulses
emanating from a certain emitter have the same label. This task is called
deinterleaving, and to solve it one can look at different characteristics of
the pulses, such as their amplitude and the direction from which they arrived.
The methods of deinterleaving can generally be split into two categories. The
first only looks at the time of arrival of the pulses and tries to find
patterns in how the pulses occur. The sequence search, the pulse searching and
the cumulative difference histogramming algorithms are some examples of
algorithms that only look at the arrival times of the pulses [4]. The second
type of method, which is the focus of this project, looks at the different
characteristics of the pulses and matches similar pulses to each other. Fuzzy
Adaptive Resonance Theory, Fuzzy Min-Max Clustering and Self-Organizing Feature
Mapping are three examples of algorithms that look at different features of the
pulses and neglect the time of arrival [5].

Figure 1.1: Two pulse trains are superimposed upon each other. The original
pulse trains and how they would be registered by an antenna are depicted. Both
panels plot amplitude against time.
(a) Detected signals from two pulsed radars (Transmission A and Transmission
B). The antenna detects pulses from two different sources (dashed and full
lines). The pulses have the same pulse width but different amplitudes.
(b) Superimposed pulses as seen by the target. The superimposed pulses from
figure 1.1a. The receiving unit is not able to distinguish between which pulses
come from which source at first glance.

1.1 Problem Statement


The objective of this project is to determine how suitable two different
algorithms are for deinterleaving pulses continuously. The two algorithms are
density-based spatial clustering of applications with noise (DBSCAN) and the
fuzzy adaptive resonance theory (FART) network. The suitability of these
algorithms is judged by how well they perform according to the adjusted Rand
index, how rapidly they solve the task, and how robust the models are with
regard to the number of emitters.

Chapter 2

Radar Theory

A radar is a device that uses the characteristics of electromagnetic (EM) waves
to map its surroundings. By emitting waves and listening to echoes it can
detect objects such as trees, vehicles and raindrops at ranges up to hundreds
of kilometers [1]. The application areas of radar cover a wide variety of uses
such as weather prediction, navigational aid, and reconnaissance and
surveillance.

2.1 Basic Radar Principle


The basic principle behind radar utilizes the fact that the propagation speed
of electromagnetic radiation is constant through a medium. This means that if a
radar transmits a pulse and receives an echo after $\Delta t$ seconds it can
calculate the distance to the object, often called the target, which the
original pulse hit, through

\[ \text{distance} = \frac{c \Delta t}{2}, \]

where $c$ is the speed of the electromagnetic wave ($\approx 3 \times 10^{8}$
m/s in vacuum) and the division by 2 takes into account the fact that the pulse
has travelled to the target and back to the transmitter [1].
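As a small illustration of the range relation above, the following Python
sketch converts a round-trip delay into a distance. The function name
`echo_range` and the example delay are chosen here for illustration and do not
come from the thesis.

```python
# Speed of light in vacuum [m/s]; the text rounds this to 3e8 m/s.
SPEED_OF_LIGHT = 299_792_458.0


def echo_range(delta_t, c=SPEED_OF_LIGHT):
    """Return the distance [m] to a target whose echo arrives delta_t seconds
    after transmission. The division by 2 accounts for the round trip."""
    return c * delta_t / 2.0


if __name__ == "__main__":
    # An echo received 1 ms after transmission corresponds to roughly 150 km.
    print(f"{echo_range(1e-3) / 1000:.1f} km")
```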
A radar needs two basic components, a transmitter and a receiver, in order to
determine the range to an object. The transmitter's job is to send out the
signal to the surroundings and the receiver's job is to pick up the echoes from
the surroundings. The strengths of radar are that the user can detect targets
that are much further away than with visual instruments and that the detection
is less susceptible to weather conditions. Beyond detecting target distances, a
radar can be tailored to detect a target's velocity. This can be utilized to
distinguish moving targets from stationary scenery (e.g. trees and mountains).
Radars are classified into two general types: continuous wave (CW) radar and
pulsed radar. A CW radar transmits a wave at all times and simultaneously
listens for echoes. A pulsed radar transmits EM waves for a short duration and
then listens for echoes after each transmission. Pulsed radars can further be
categorized into non-coherent and coherent radar (see figure 2.1).

2.2 Continuous Wave Radar


A CW radar emits a wave continuously (i.e. without stopping the transmission)
while simultaneously listening for echoes, meaning that the transmitted wave
competes with the echo. It is undesirable for the receiver to receive the
transmitted signal directly from the antenna since it can interfere with the
detection of echoes and might damage sensitive equipment, as the transmitted
signal tends to be several orders of magnitude larger than the echo. CW radars
therefore need some isolation between the transmitter and receiver, which
limits the receiver's ability to detect pulses. This limitation is one of the
main reasons why pulsed radar is more popular than CW [2].

Figure 2.1: Chart showing the categorization of a radar: continuous wave or
pulsed wave, with pulsed radar further divided into coherent and non-coherent.

2.3 Pulsed Radar


A pulsed radar toggles between two states: transmitting and receiving (see
figure 2.2). The duration for which it transmits energy is called the pulse width
(PW) and a set of pulses is called a pulse train. The duration from the begin-
ning of the transmission of a pulse to the beginning of the next transmission is
called the pulse repetition interval (PRI) [1]. If pulses are transmitted at constant
intervals, the PRI is equal to 1 divided by the number of pulses transmitted
per second, also called pulse repetition frequency (PRF).

Figure 2.2: A pulse train consisting of three pulses emitted from a radar. The
first PRI (PRI0) and PW (PW0) are marked.

Figure 2.3: In a setting with several indistinguishable transmitted pulses, the
radar can have difficulties determining which transmitted pulse an echo stems
from. If the range of an object exceeds the maximum unambiguous range, the echo
will reach the receiver after a new pulse has been transmitted.
(a) A pulse (A0) is transmitted and an echo (B0) is received. Assuming that
there are no other transmitters in the area, the radar can confidently conclude
that B0 has arisen due to A0.
(b) Two indistinguishable pulses (A0 and A1) are transmitted and an echo (B0)
is received. There is no way for the radar to determine if B0 is an echo of A0
or A1. The calculated range could therefore take on two values and is
ambiguous.

After the PRI, the radar will transmit a new pulse. This introduces an
ambiguity if the pulses are similar, since the radar cannot tell which
transmitted pulse a detected echo originates from. When transmitting a pulse,
there is a maximum range that the pulse can travel to and return from before
the radar transmits another pulse. This range is called the maximum unambiguous
range ($r_u$) and can be calculated by

\[ r_u = \frac{c \cdot \mathrm{PRI}}{2}. \]

If a target lies beyond this range, the echo will appear to come from the next
pulse transmitted (see figure 2.3) [1]. This problem can be solved, for
instance, by changing the pulse repetition interval or by changing a parameter
of each pulse (e.g. giving each pulse its own distinct frequency).
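To make the relation between PRF, PRI and the maximum unambiguous range
concrete, here is a minimal Python sketch. The function names are hypothetical,
and `apparent_range` rests on an assumption made only for this illustration:
that the radar pairs each echo with the most recently transmitted pulse.

```python
SPEED_OF_LIGHT = 299_792_458.0  # [m/s]


def pri_from_prf(prf):
    """PRI [s] of a radar that transmits prf pulses per second at constant
    intervals."""
    return 1.0 / prf


def max_unambiguous_range(pri, c=SPEED_OF_LIGHT):
    """Maximum unambiguous range r_u = c * PRI / 2 in metres."""
    return c * pri / 2.0


def apparent_range(true_range, pri, c=SPEED_OF_LIGHT):
    """Range inferred when every echo is paired with the latest transmitted
    pulse (assumption of this sketch): the true range folds back modulo r_u."""
    return true_range % max_unambiguous_range(pri, c)


if __name__ == "__main__":
    pri = pri_from_prf(1000.0)  # 1 kHz PRF gives a PRI of 1 ms
    print(max_unambiguous_range(pri))     # roughly 150 km
    print(apparent_range(200_000.0, pri))  # a 200 km target appears much closer
```

A target inside the unambiguous range is reported at its true range, while a
target beyond it is folded back, which is the ambiguity described in the text.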

2.3.1 Coherency in Pulsed Radar


A pulsed radar can either be coherent or non-coherent. In coherent radars, the
phase between two signals is consistent [1]. A coherent transmission can be
thought of as a continuous sinusoidal wave in which the transmitted pulses are
cutouts of this wave (see figure 2.4). In a non-coherent radar, the phases
between two pulses can be shifted by any amount without regard to consistency.
When there is a relative velocity between the transmitting unit and the target,
there will be a distortion of the frequency of the echo of the original
transmission. This phenomenon is called the Doppler effect [1]. The change in
frequency can be obtained by

\[ f_D = f_O \frac{V_r}{c}, \]

where $f_D$ is the change in frequency, $V_r$ is the radial velocity between
the transmitter and the target, $c$ is the speed of light and $f_O$ is the
frequency of the original transmission [6]. Since $V_r \ll c$, $f_D$ tends to
be very small. The phase difference between the echo and a reference signal
will, however, be apparent for the duration of a pulse. By measuring the rate
of change of the phase difference, the frequency shift and thus the relative
radial velocity between the transmitting and receiving units can be obtained
[1]. Being able to determine the velocity is useful for many reasons. One is
that the radar can distinguish between moving targets and stationary targets
and can thus choose to neglect things such as trees and mountains.

Figure 2.4: Three categories of radars (panels from top to bottom: original
wave, coherent transmission, non-coherent transmission; amplitude against
time). In the topmost panel the radar transmits a continuous wave during the
whole time interval. In the second panel, the coherent radar only transmits the
full, red lines, whereas the dashed lines show what a continuous wave would
have transmitted. The last panel shows a radar transmitting at the same time
intervals as the previous one. However, here the full lines do not follow the
dashed lines in the broadcasting regions, so the radar is non-coherent.
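The size of the frequency shift can be illustrated with a short Python sketch
that evaluates the relation exactly as it is written above. The function names
and the example values (a 10 GHz transmission and a radial velocity of 300 m/s)
are assumptions made for this illustration only.

```python
SPEED_OF_LIGHT = 299_792_458.0  # [m/s]


def doppler_shift(f_o, v_r, c=SPEED_OF_LIGHT):
    """Change in frequency f_D = f_O * V_r / c, as stated in the text."""
    return f_o * v_r / c


def radial_velocity(f_d, f_o, c=SPEED_OF_LIGHT):
    """Invert the same relation to recover the radial velocity [m/s]."""
    return f_d * c / f_o


if __name__ == "__main__":
    f_o = 10e9   # transmitted frequency [Hz], hypothetical
    v_r = 300.0  # radial velocity [m/s], hypothetical
    f_d = doppler_shift(f_o, v_r)
    # The shift is about 10 kHz, i.e. roughly one millionth of f_O.
    print(f"f_D = {f_d:.1f} Hz, f_D / f_O = {f_d / f_o:.2e}")
```

The tiny ratio between the shift and the carrier frequency is why the text
turns to the rate of change of the phase difference rather than measuring the
frequency directly.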

2.3.2 Pulse Width


The radar's pulse width (PW) largely determines its limitations and strengths.
A shorter PW will result in a narrower echo, making it easier to distinguish
between several objects (see figure 2.5) [1]. A longer PW will in turn lead to
difficulties in separating objects since their echoes will overlap [1]. The
limitation of shorter PWs is that the amount of energy transmitted is lower,
which makes them harder to use in long-range applications (see equation 2.1).

2.3.3 Radio Frequency


Radars today operate with radio frequencies (henceforth called frequencies) in
the interval between 1 MHz and 300 THz, where most airborne radars operate with
frequencies between 100 MHz and 100 GHz [1]. The frequency heavily influences
the performance of the radar and is strongly correlated with its carrying
vessel. A lower frequency necessitates a larger antenna and higher energy
usage. The atmospheric attenuation increases with frequency; it is relatively
low for frequencies between 0.3 GHz and 10 GHz and becomes severe for
frequencies above 20 GHz. The frequency can change from pulse to pulse and even
within a single pulse.

Figure 2.5: An increase in the pulse width will increase the likelihood of an
overlap in the echoes of objects lying near each other and thus reduce the
resolution of the system. Here two pulses, A and B, are depicted arriving at TA
and TB (amplitude against time). In figure 2.5b the pulse width is larger than
in figure 2.5a.
(a) The echoes of two pulses, A and B, are depicted. Since the pulse widths are
narrow enough, the pulses are separated.
(b) The two pulses, A and B, are here depicted when their pulse widths are
increased.

2.3.4 Amplitude
The amplitude is calculated by measuring the signal energy. There are different
ways to do this; two are to set the amplitude to the peak registered signal
energy or to take the mean signal energy of a pulse. The signal energy, in a
setting where there is no loss (e.g. due to atmospheric attenuation and signal
processing), is described by the radar equation

\[ \text{Signal energy} = \frac{P_{avg} G^{2} \lambda^{2} \sigma}{(4\pi)^{3} R^{4}}, \tag{2.1} \]

where

$P_{avg}$ = average transmitted power [W]
$G$ = antenna gain
$\sigma$ = radar cross section of the target [m$^2$]
$R$ = range [m]
$\lambda$ = wavelength [m]
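Equation 2.1 can be evaluated directly. The Python sketch below is only an
illustration of its terms; the function name and all numerical values are
hypothetical and are not taken from the thesis.

```python
import math


def signal_energy(p_avg, gain, wavelength, rcs, r):
    """Evaluate equation 2.1 for a lossless setting.

    p_avg      average transmitted power [W]
    gain       antenna gain (dimensionless)
    wavelength wavelength [m]
    rcs        radar cross section of the target [m^2]
    r          range [m]
    """
    return p_avg * gain**2 * wavelength**2 * rcs / ((4 * math.pi) ** 3 * r**4)


if __name__ == "__main__":
    # Hypothetical radar parameters, chosen only to show the range dependence.
    near = signal_energy(p_avg=1e3, gain=1e3, wavelength=0.03, rcs=1.0, r=10_000.0)
    far = signal_energy(p_avg=1e3, gain=1e3, wavelength=0.03, rcs=1.0, r=20_000.0)
    # Doubling the range divides the received energy by 2**4 = 16.
    print(near / far)
```

The fourth-power dependence on range is the reason a lower transmitted energy,
for example from a shorter pulse width, limits long-range use.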

The signal energy, and thus the amplitude, of the registered signal must be
high enough that the signal can be set apart from background noise. The radar
cross section of the target is a measure of how large the target appears to the
radar of the transmitting vessel and depends on parameters such as the material
of the target and its size [1]. The radar cross section is only of importance
for the unit that is transmitting. For the target, the radar cross section is
not of importance and will not affect the registered pulse. However, the
position of the target relative to the incoming pulses does affect the
registered amplitude (e.g. if the antennas are located under a plane and the
pulses are coming from above, the amplitude is likely to be lower). Other
things such as blocking obstacles and clouds can affect the amplitude between
pulses.

2.3.5 Angle of Arrival


The angle of arrival (AOA) of an incoming pulse lies between 0 and 360 degrees
and can vary rapidly depending on the relative velocities and the distance
between the transmitting object and the target. For a radar warning receiver to
determine the AOA, it needs several antennas. The difference in intercepted
power at the antennas is used to calculate the AOA of an intercepted pulse [1].
The AOA of a pulse train is unique in comparison to the other parameters
since it is the only parameter which cannot be manipulated from pulse to pulse
(i.e. a transmitter at a certain position cannot "lie" about its position). Using
the AOA thus provides a strong correlation between pulses in a certain pulse
train. However, since both the transmitting unit and the receiving target can
be mobile (e.g. airplanes and ships), the likelihood of the AOAs of the pulses
of two different transmitters overlapping in a setting with several transmitters
is high.

2.3.6 Pulse Repetition Interval


Changes in the PRI are not necessarily intentional and can occur due to
internal fluctuations in the equipment that differentiate consecutive pulses
from each other. Usually changes in PRI are made to resolve range/velocity
ambiguities and to counteract jamming. For a jammer, knowing the arrival time
of the next pulse is valuable if it wants to transmit a pulse of its own before
the radar's pulse arrives.

2.4 Electronic Warfare


Electronic Warfare (EW) is a term for actions taken in order to prevent en-
emies from benefiting from the utilization of communications using electro-
magnetic waves while preserving the benefits for friendly units [3]. EW can
be categorized into Electronic Support (ES), Electronic Attack (EA) and Elec-
tronic Protection (EP). The purpose of ES is to extract information from the
received pulses in order to make tactical choices and support the two other
areas of EW. An example could be to determine the PRI of a radar, its position
or broadcasting frequency. EA entails actions taken to nullify the capabilities
of hostile radar equipment through various means. This includes using jam-
ming, flares and anti-radiation weapons. The purpose of EP is to protect the
sensors of friendly units from hostile units.
An example of how EW comes into play is seen in figure 2.6, where a radar is
able to detect a target after $t_i$ seconds and thus can determine the range to
it. If the target transmits pulses with the same characteristics as the
transmitted pulse, the radar receives two indistinguishable pulses and will not
be able to precisely locate the target (see figure 2.6b).

Figure 2.6: Jamming can be used by a target to affect a radar's ability to
determine range by transmitting pulses with similar characteristics as the
transmitted pulse from the hostile radar.
(a) The environment extracted from the radar when the target does not jam the
signal.
(b) The environment extracted from the radar when the target jams the signal by
sending out similar pulses.
(c) Transmission and echo of a pulsed radar (amplitude against time). The
larger pulse depicts the emitted signal and the smaller pulse depicts the echo
from the target. Since the target does not emit any signals, only the "true"
echo will be received and a good estimate of the target's position can be
extracted by the radar.
(d) Transmission and echo of a pulsed radar (amplitude against time). In this
scenario, the target transmits a pulse either before (green) or after (orange)
the transmitted pulse hits the target. The radar will detect the true echo and
the transmitted pulse but will not be able to determine which pulse is the true
echo. This leads to a positional ambiguity when determining the range.

Chapter 3

Deinterleaving of Radar Signals

3.1 Extracting Pulses from Data


A radar measures the power of the electromagnetic radiation in its environment
by sampling the power at certain time steps. A simulated example of such a
sampling is seen in figure 3.1, in which two pulses along with some white noise
are depicted. The red dashed line in the figure depicts a threshold which the
user can set. If the intercepted power exceeds the threshold, the system
interprets this occurrence as a detected pulse. Lowering this threshold makes
it easier for the system to detect weak signals but raises the probability of
the noise exceeding the threshold, raising a false alarm. Conversely, raising
this threshold will reduce the number of false alarms but will lead to the
system only detecting strong signals. A receiver trying to analyze the
environment will detect signals and create a pulse descriptor word (PDW)
containing information for each detected pulse. The PDW contains information
about the pulse such as its amplitude, TOA, AOA, PW and frequency. In figure
3.1, if the threshold is set as is, two pulses will be detected and two PDWs
will be retrieved (see table 3.1).

Figure 3.1: Simulated data showing how a sampling could look (amplitude [dBm]
against sample [n]). The dashed red line shows the threshold that the signal
and the added noise have to exceed in order for the detector to identify a
pulse.

          TOA [ms]  freq. [GHz]  PW [ms]  AOA [rad]  ampl. [dB]  sin     cos
Pulse 1:  0.967     7.747        0.262    -1.336     -15.338     -0.973  0.234
Pulse 2:  3.437     7.965        0.251    -1.394     -15.638     -0.984  0.176

Table 3.1: An example of two PDWs. The sin/cos parameters are obtained by
sin(AOA) and cos(AOA).
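The thresholding step described in this section can be sketched in a few lines
of Python. This is a simplified illustration under stated assumptions, not the
detector used in the thesis: the function `detect_pulses`, the dictionary keys
and the noise-free sample values are all hypothetical, the TOA and PW are
expressed in sample indices, and the amplitude is taken as the peak value of
each pulse.

```python
def detect_pulses(samples, threshold):
    """Return one PDW-like record per contiguous run of samples that exceeds
    the threshold. TOA and PW are given in units of samples."""
    pulses = []
    start = None  # index where the current pulse began, if any
    for n, value in enumerate(samples):
        if value > threshold and start is None:
            start = n  # rising edge: a pulse begins
        elif value <= threshold and start is not None:
            segment = samples[start:n]  # falling edge: the pulse has ended
            pulses.append({"toa": start, "pw": n - start, "ampl": max(segment)})
            start = None
    if start is not None:  # a pulse still active at the end of the record
        segment = samples[start:]
        pulses.append({"toa": start, "pw": len(segment), "ampl": max(segment)})
    return pulses


# Hypothetical record: a constant floor at -12 with one strong and one weak pulse.
samples = [-12.0] * 20
samples[3:7] = [9.0, 10.0, 10.0, 9.5]
samples[12:15] = [8.0, 8.5, 8.0]

pdws = detect_pulses(samples, threshold=5.0)
print(pdws)  # two records, one per pulse
print(len(detect_pulses(samples, threshold=9.2)))  # the weak pulse is missed
```

Raising the threshold from 5.0 to 9.2 drops the weaker pulse, which mirrors the
trade-off between false alarms and sensitivity described above.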

Henceforth a data point will be used to refer to a vector containing different
information about a specific pulse. A data set is a set of such points. The
different parameters (frequency, AOA etc.) will be called features.

Processing angles
Angles can be measured in either radians or degrees. In both units the angle is
periodic, which causes a problem. As an example, consider the case of an angle
of 360° = 0°. The models will see a large numerical discrepancy and assume that
a pulse with an angle equal to 0° is different from one with 360°. In order to
solve this problem the AOA is broken down into two features, sin(AOA) and
cos(AOA) (henceforth referred to as sin/cos).
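
The effect of the sin/cos encoding can be checked with a few lines of code. This
is an illustrative sketch only; the helper names `encode_angle` and
`feature_distance` are assumptions made for the example and are not part of the
implementation used in this work.

```python
import math


def encode_angle(aoa_rad):
    """Map an angle in radians to the two features (sin(AOA), cos(AOA))."""
    return math.sin(aoa_rad), math.cos(aoa_rad)


def feature_distance(a, b):
    """Euclidean distance between two (sin, cos) feature pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


zero = encode_angle(math.radians(0.0))
full_turn = encode_angle(math.radians(360.0))
opposite = encode_angle(math.radians(180.0))

# 0 and 360 degrees coincide after encoding, while 0 and 180 degrees are
# as far apart as two encoded angles can be.
print(feature_distance(zero, full_turn))
print(feature_distance(zero, opposite))
```

Note that the raw difference between 0° and 360° is as large as it can get,
whereas the encoded distance is zero up to floating-point rounding.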

3.2 Deinterleaving
The purpose of deinterleaving is to extract the correct pulse trains from an en-
vironment of mixed pulses [3]. In other words, the objective is to determine
which pulses are transmitted from the same transmitter. When this is done,
each pulse train can be analyzed separately. This analysis can render infor-
mation regarding the intended usage of the radar from which the pulse train
originates and instantiate countermeasures if a threat is detected.
To distinguish between different pulses, one needs to look at their
similarities and differences. There are several approaches to solving this
problem, and a general distinction is made between a) methods that rely only on
temporal data and look for patterns in the PRI (henceforth referred to as time
of arrival methods) and b) methods that disregard temporal data and look for
similarities and differences in other features such as AOA, RF and amplitude
(henceforth referred to as non-time of arrival methods).

3.3 Time of Arrival Methods


The methods that look at temporal data operate under the hypothesis that the
PRI of the pulses sent out from a certain transmitter is roughly constant.
Figure 3.2 depicts the TOAs of three interleaved pulse trains and shows what
the input could look like. Pulse trains are extracted by looking for patterns
in the detected pulses (e.g. pulses occurring every 5 ns).

The strength of these methods is that they are relatively fast since they only
handle one feature (TOA). They are however reliant on there being a pattern in
this feature, meaning that if an emitter changes its PRI frequently the deinter-
leaver will have great difficulties extracting the correct pulse train. Spurious
and missing pulses will further impair the methods’ ability to deinterleave.
Examples of algorithms belonging to this class are the sequence search, the
pulse searching and the cumulative difference histogramming algorithms [7]
[8].

FIGURE 3.2: The time of arrival of the pulses received during 4 ms (horizontal
axis: time of arrival [s]).

3.3.1 Sequence Search


Compared to the other TOA-methods, the sequence search (SS) is a computa-
tionally heavy algorithm that has a high success rate for deinterleaving pulse
trains consisting of constant PRIs [4]. The SS works by first estimating a PRI
($\widehat{PRI}$) by taking the difference in TOA between two pulses. It then
takes the TOA of the first pulse $t_0$ and searches for pulses at
$t_0 + \widehat{PRI}$, $t_0 + 2\widehat{PRI}$. Usually the algorithm has a
tolerance window, meaning that it will look for pulses in an interval (e.g.
$[t_0 + 0.95\widehat{PRI},\ t_0 + 1.05\widehat{PRI}]$). If no pulses are found
at these time steps it stops and repeats the process from $t_1$ and so on. If
the process continues and does not manage to find a valid sequence iterating
through all the $t_i$ as starting points, it tries a different PRI.
The second part of the algorithm evaluates if the "found" sequence is a
valid sequence. It does so by continuing the process of searching for pulses.
If it started from $t_k$ it now searches beyond $t_k + 2\widehat{PRI}$. If it
misses two times in a row it invalidates the sequence, whereas if it hits $n_k$
times, where $n_k$ is a predetermined threshold (recommended to be at least 5
[7]), it is determined to be an actual pulse train from a single emitter. The
pulse train is then removed from the original sequence and the process
restarts. In the worst-case scenario the complexity of this algorithm is
$O(P^2)$ where $P$ is the number of pulses to deinterleave.
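
The search described above can be sketched as follows. This is a simplified,
unoptimized illustration under stated assumptions and not the algorithm from
[7] verbatim: the function names, the default tolerance of 5% and the way a
train is accepted (at least `n_k` hits before two consecutive misses or the end
of the data) are choices made for this example.

```python
def find_pulse(toas, target, tol):
    """Index of the first TOA within [target - tol, target + tol], else None."""
    for idx, t in enumerate(toas):
        if abs(t - target) <= tol:
            return idx
    return None


def sequence_search(toas, rel_tol=0.05, n_k=5):
    """Find one constant-PRI pulse train in a sorted list of TOAs.

    Returns (pri_estimate, indices_of_train) or None if no train is found.
    """
    for start in range(len(toas)):
        for second in range(start + 1, len(toas)):
            pri = toas[second] - toas[start]      # PRI estimate from two pulses
            if pri <= 0:
                continue
            hits, misses, k = [start, second], 0, 2
            # Project forward until two consecutive misses or the end of the data.
            while misses < 2 and toas[start] + k * pri <= toas[-1] + rel_tol * pri:
                idx = find_pulse(toas, toas[start] + k * pri, rel_tol * pri)
                if idx is None:
                    misses += 1
                else:
                    hits.append(idx)
                    misses = 0
                k += 1
            if len(hits) >= n_k:                  # enough hits: accept the train
                return pri, hits
    return None


def deinterleave_toa(toas, rel_tol=0.05, n_k=5):
    """Repeatedly extract pulse trains; return (trains, leftover TOAs)."""
    remaining = sorted(toas)
    trains = []
    while True:
        found = sequence_search(remaining, rel_tol, n_k)
        if found is None:
            break
        pri, idxs = found
        trains.append((pri, [remaining[i] for i in idxs]))
        taken = set(idxs)
        remaining = [t for i, t in enumerate(remaining) if i not in taken]
    return trains, remaining


# Two hypothetical emitters with PRIs of 10 and 17 time units.
train_a = [10 * k for k in range(10)]
train_b = [4 + 17 * k for k in range(6)]
trains, leftover = deinterleave_toa(train_a + train_b)
print([pri for pri, _ in trains], leftover)
```

The linear scan in `find_pulse` keeps the sketch short; a practical
implementation would exploit the sorted TOAs (for instance with a binary search)
to limit the cost of each lookup.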

3.4 Non-Time of Arrival Methods


The non-time of arrival methods aim to solve the deinterleaving task by look-
ing for similarities and differences in the other features such as the amplitude
and PW. They operate with the underlying assumption that the pulses that
are transmitted during a short duration from a certain transmitter are similar
(e.g. similar RF and AOA). These algorithms perform poorly if the transmitted
pulses from a radar are significantly different from each other or if the pulses
transmitted by several different radars are similar. The amplitude, PW and
frequency of the pulses from figure 3.2 are shown in figure 3.3. For this
particular case, it is easier to distinguish the three different clusters in
figure 3.3 than in figure 3.2.

FIGURE 3.3: The amplitude, PW and frequency of the pulses from figure 3.2
(three-dimensional scatter plot of the pulses received during 4 ms; axes:
amplitude [dB], pulse width and frequency [GHz]).

If the pulses from an emitter are identical, then the points will coincide
exactly. As the discrepancy between the pulses increases, the groups formed
by these pulses become less dense and more spread out. The models can be
tailored to work better in such an environment by tuning the parameters (see
chapter 3.6 and 3.8) of the different models to make them more flexible.
If the pulses from two different emitters are very similar in many of the
aspects, they will become increasingly difficult to distinguish from each other.
This problem can be solved to some extent as well with parameter tuning by
making the model less flexible and putting a tighter constraint on how dif-
ferent pulses belonging to a certain pulse train can be. The pulses from an
extracted pulse train will consequently be very similar.
If two pulse trains are extremely similar over all the features, the
transmitters themselves will have problems distinguishing which pulses are
echoes of their own signals. It is thus rare that two radars operate with
identical settings in the same environment.

3.5 Clustering
The deinterleaving task is equivalent to finding the correct clustering. Let
$\mathbf{X}_i = \{X^i_1, X^i_2, \ldots, X^i_M\}$ denote a vector of size $M$
where each $X^i_j$ is one of the previously mentioned features of the $i$th
registered pulse. Let $DS = \{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_P\}$
be a set of $P$ points, then a clustering $C = \{C_1, C_2, \ldots, C_N\}$ on
$DS$ is the partitioning of $DS$ into $N$ clusters, such that all the points
$\mathbf{X}_i \in DS$ belong to a cluster $C_j \in C$. Furthermore, each point
is assigned to only one cluster.
In the context of deinterleaving, DS is the set of the received pulses where
each Xi corresponds to a received pulse. The task is to create these N clusters
where each cluster corresponds to a pulse train from a specific emitter. Then
each pulse Xi is to be associated to a certain cluster Cj . If the system outputs
the correct clustering, N will be the actual number of pulse trains and each
Xi will be put in the cluster with all the other pulses that came from the same
emitter.

3.6 Density-Based Spatial Clustering of Applications with Noise
Density-based spatial clustering of applications with noise (DBSCAN) is a
clustering algorithm that works by identifying high density regions in the data
space [9]. In this section a basic description of the algorithm is given. For
further reading the reader is referred to the original article. The strengths of
DBSCAN are that it does not require any information regarding the number
of clusters and that it can detect clusters of various shapes (see figure 3.4).
Furthermore, it also has the ability to label points as noise meaning that it is
useful in a setting where noise is present since a subset of the presented points
can be disregarded for further analysis. One weakness of the algorithm is that
it can have difficulties in settings in which the densities of different clusters
differ a lot (i.e. one cluster is concentrated whereas another one is sparse in
comparison).

FIGURE 3.4: Three clusters of two types of shapes are depicted. Two of the
clusters are formed as half moons whereas the third one is circular.

DBSCAN only requires two parameters from the user, ε and minPts. It also allows
the user to choose which distance function to use to compute the distance
between two points. ε determines the maximum distance that two points can have
from each other and still be regarded as neighbors, whereas minPts determines
the number of points that need to be in the neighborhood of a point for it to
be classified as a core point. A point that belongs to a cluster is either a
core point or a border point, where a core point is a point with at least
minPts neighbors within a distance ε (see figure 3.5) and a border point has
fewer neighbors but is itself a neighbor of a core point. The algorithm
iterates through the points and identifies clusters by looking for regions with
a high density of points.

FIGURE 3.5: The black circles show the ε-neighborhood for two different points,
p and q. With minPts = 5, q is a border point whereas p is a core point.

The pseudocode of the algorithm as described by Schubert et al. [10] is given
in algorithm 1, in which RangeQuery is a function that returns the points in a
data set DS that lie within a distance ε of a point Xi. The algorithm traverses
all points and if a point Xi is an unlabeled core point the algorithm labels
this point and all its neighbors as belonging to a cluster Ca. It then jumps to
the neighbors of Xi and if a certain neighbor Xj is a core point, the algorithm
will label all neighbors of Xj as belonging to Ca and continue the process for
these neighbors.
Permuting the order in which the data points are processed by the algo-
rithm will render a similar clustering, with the only potential difference being
the labels of some border points. If a cluster Ca is being labeled and a border
point Xi is a neighbor of a core point in Ca , Xi will be labeled as a point of Ca .
If Xi is later detected as a neighbor of a core point in Cb (a ≠ b), it will not be
relabeled. Had Cb been detected first, Xi would have been classified as a point
of Cb rendering a slightly different clustering. A possible scenario is depicted
in figure 3.6 in which the algorithm will end up with a cluster of circles and
another cluster of squares. The triangle point can be labeled as either a circle
or a square point depending on which cluster is detected first.
DBSCAN works by clustering all the data at once and is not designed to work
sequentially in a setting where new data is introduced, meaning that for radar
applications DBSCAN would have to receive P points and cluster them all at
once. The worst-case complexity is $O(P^2)$, with an average complexity of
$O(P \log P)$ when using R*-trees to determine the neighbors of any point [9].
R*-trees are a type of data structure used for indexing spatial data [11].

Algorithm 1 The pseudocode of the DBSCAN algorithm [10].

Input: DS: data set
Input: ε
Input: minPts
Input: dist: distance function
Data: label: labels of points, initially undefined
for each point Xi do
    if label(Xi) ≠ undefined then continue
    Neighbors N ← RANGEQUERY(DS, dist, Xi, ε)
    if |N| < minPts then continue
    c ← next cluster label
    label(Xi) ← c
    Seed set S ← N \ {Xi}
    for each Xj in S do
        if label(Xj) ≠ undefined then continue
        Neighbors N ← RANGEQUERY(DS, dist, Xj, ε)
        label(Xj) ← c
        if |N| < minPts then continue
        S ← S ∪ N
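
For readers who prefer running code, algorithm 1 can be transcribed almost line
by line. The sketch below is an illustration and not the library implementation
used in this work: it uses a brute-force range query instead of an R*-tree,
points that never receive a label keep the value `None` (which plays the role
of noise), and the names `range_query` and `dbscan` are chosen for this example.

```python
import math


def range_query(points, dist, q, eps):
    """Indices of all points within eps of points[q] (q itself included)."""
    return [j for j, p in enumerate(points) if dist(points[q], p) <= eps]


def dbscan(points, eps, min_pts, dist=math.dist):
    """Label points following algorithm 1; unlabeled (noise) points stay None."""
    labels = [None] * len(points)      # None plays the role of "undefined"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = range_query(points, dist, i, eps)
        if len(neighbors) < min_pts:
            continue                   # not a core point; may become a border point later
        cluster += 1                   # next cluster label
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        k = 0
        while k < len(seeds):          # the seed set grows while it is traversed
            j = seeds[k]
            k += 1
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = range_query(points, dist, j, eps)
            if len(nbrs) >= min_pts:   # j is a core point: expand through it
                seeds.extend(n for n in nbrs if labels[n] is None)
    return labels


# Two well-separated groups and one isolated point (hypothetical data).
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=3))
```

With a brute-force range query each call costs a pass over all points, which
gives the quadratic worst case mentioned above.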

3.6.1 minPts-parameter
As previously mentioned, the minPts parameter determines how many neighbors a
point Xi must have in order to be labeled as a core point. The higher this
number, the more densely packed the surroundings of Xi must be for it to be a
core point. It is therefore desirable to raise the value of minPts as the
signal-to-noise ratio (SNR) decreases in a setting with white noise, since
regions with only noise will be less dense than regions with signals [10].

3.6.2 ε-parameter
The ε parameter is used to determine whether two points, Xi and Xj, are
neighbors by checking if ε ≥ dist(Xi, Xj) for some distance function.
Therefore, the appropriate value of ε depends on the distance function.
Schubert et al. [10] recommend setting ε as small as possible.

3.6.3 Importance of Distance Function


The choice of distance function strongly influences how the similarity of two
pulses is determined. There is no generic distance function that works best
across all domains, and it is often down to an expert in the domain to choose a
suitable distance function [12]. For example, it could be the case that a
difference $\Delta = |X^i_{freq} - X^j_{freq}|$ in the frequency between two
pulses in the same cluster is insignificant whereas the same difference
$\Delta = |X^i_{PW} - X^j_{PW}|$ in the PW is deemed significant. The user
might then use a distance metric that captures this discrepancy.
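
One way to express such a preference is a weighted Euclidean distance. The
sketch below is only an illustration of the idea: the function name, the
feature order (frequency, PW) and the weights are assumptions made for this
example and were not used in this work.

```python
import math


def weighted_euclidean(x, y, weights):
    """Euclidean distance in which feature q contributes weights[q] * (x_q - y_q)^2."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


# Hypothetical weights: a PW difference counts 100 times more than an equally
# large frequency difference (in squared terms).
w = (1.0, 100.0)
p = (7.75, 0.25)
same_delta_in_freq = (7.80, 0.25)
same_delta_in_pw = (7.75, 0.30)

print(weighted_euclidean(p, same_delta_in_freq, w))
print(weighted_euclidean(p, same_delta_in_pw, w))
```

The same numerical difference of 0.05 then yields a distance that is ten times
larger when it occurs in the PW than when it occurs in the frequency.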

FIGURE 3.6: The triangle point (the border point) can be classified as
belonging to either cluster of core points depending on which cluster is
labeled first. If minPts = 8 then the triangle is a border point since it has
fewer than 8 neighbors but is a neighbor of a core point.

3.7 Proposition for Online Version of DBSCAN


In order to deal with sequential clustering of pulses, a modification of the
DBSCAN method, named online DBSCAN, is proposed. The goal of this method is to
be able to cluster an unlabeled point given a labeled set of points.
Furthermore, we want this set of points to be able to grow but only add points that
provide new information. As an example, consider the case of a set of pulses,
all with a frequency of around 10 GHz. A new point with a frequency of 10
GHz will provide no new information to the system regarding the behavior of
this cluster. However, a point with a frequency of 10.2 GHz might be deemed
sufficiently close to belong to the same cluster and could be an indicator that
the frequency of this cluster is increasing.
When dealing with real-time applications it is undesirable to process data in
large batches since the output is needed instantly. Waiting for the method to
gather data and process it will render the output outdated for most of the
data. Furthermore, with traditional DBSCAN all the previous data points must be
re-processed whenever a new data point is introduced. The online DBSCAN method
instead collects data for a time period t and labels this data using
traditional DBSCAN. This time period t will later be referred to as the
training time. New points are then introduced sequentially to the data set.
For a new point Xi, the algorithm finds the neighbors of Xi and, if any of them
are labeled, one of these labels is chosen to label Xi and its unlabeled
neighbors. If there are no neighbors, the data point is added to the data set.
If there are only neighbors without labels, the algorithm checks whether the
number of neighbors reaches a threshold (minPts) and labels them if the
condition is met.
Let DS be the set of points received during t, (ε_seq, minPts_seq) be the
parameters for the sequential part of the algorithm and (ε_init, minPts_init)
be the parameters for the initial part of the algorithm. Then the pseudocode of
the proposed algorithm is given in algorithm 2.

Algorithm 2 The pseudocode for the suggested DBSCAN with sequential clustering.

Input: DS: data set
Input: ε_init, minPts_init
Input: ε_seq, minPts_seq
Input: dist: distance function
Data: label: labels of points, initially undefined
DBSCAN(DS, ε_init, minPts_init, dist, label)
c ← max(label)
when new point Xi do
    Neighbors N ← RANGEQUERY(DS, dist, Xi, ε_seq)
    if |N| == 0 then
        label(Xi) ← Noise
        DS ← DS ∪ {Xi}
    else if (label(Xj) == Noise ∀ Xj ∈ N) and (|N| ≥ minPts_seq) then
        for all Xj ∈ N do
            label(Xj) ← c + 1
        c ← c + 1
    else if (label(Xj) == Noise ∀ Xj ∈ N) then
        label(Xi) ← Noise
        DS ← DS ∪ {Xi}
    else
        pick any Xj ∈ {Xj | Xj ∈ N, label(Xj) ≠ Noise}
        label(Xi) ← label(Xj)

Here DBSCAN() is the DBSCAN algorithm as initially proposed. To handle the P
points in the initial part of the algorithm, the complexity is that of DBSCAN.
It then takes $o_1(3 + P(2M + 2)) + (M + 1)o_2$ computations to handle a new
data point in the worst case (see appendix A.2.2), where P is the number of
points in DS, M is the number of features used, $o_1$ is the cost of a light
operation (e.g. +, -, min, max) and $o_2$ is the cost of a heavier operation
(e.g. *, /, sqrt). Note that for two points introduced at different times, the
only thing that will differ is P, as the size of the data set DS might have
grown between the handling of these points. Thus, the computational cost
depends heavily on the tendency of the method to add new points to the data set.
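
The sequential part of algorithm 2 can be sketched as a single function that
handles one new point. This is an illustrative reading of the pseudocode and
not the implementation evaluated in this work. Assumptions made here: the
initial labels are taken as given (they would come from a DBSCAN run on the
training data), noise is encoded as -1, "pick any" is resolved by taking the
first labeled neighbor, and in the branch that creates a new cluster the new
point is reported with the new label although, as in the pseudocode, it is not
stored in the data set.

```python
import math

NOISE = -1


def online_step(data, labels, x, eps_seq, min_pts_seq, next_label, dist=math.dist):
    """Process one new point x; return (label of x, next unused cluster label).

    `data` and `labels` are parallel lists and are modified in place.
    """
    neighbors = [j for j, p in enumerate(data) if dist(x, p) <= eps_seq]
    labeled = [j for j in neighbors if labels[j] != NOISE]
    if labeled:
        # Some neighbor already belongs to a cluster: inherit its label.
        return labels[labeled[0]], next_label
    if neighbors and len(neighbors) >= min_pts_seq:
        # Enough noise neighbors: together they form a new cluster.
        for j in neighbors:
            labels[j] = next_label
        return next_label, next_label + 1
    # No neighbors, or too few noise neighbors: store x as noise.
    data.append(x)
    labels.append(NOISE)
    return NOISE, next_label


# Hypothetical state after the training period: one cluster (label 0) and one noise point.
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
labels = [0, 0, NOISE]
print(online_step(data, labels, (0.05, 0.0), eps_seq=0.5, min_pts_seq=2, next_label=1))
```

Since only points without labeled neighbors are appended, the stored data set,
and with it the cost of each range query, grows only when a point falls outside
the known clusters.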

3.8 Fuzzy Adaptive Resonance Neural Network


The Fuzzy Adaptive Resonance Theory (FART) neural network is a type of
neural network that has been shown to be effective in clustering radar signals
[5] [13] [14]. The basic FART neural network consists of a hidden layer F of size
N, where N is the number of clusters (see figure 3.7) [15]. The input sample
Xi is of size 2M. Each node Fj has an associated weight vector w j and outputs
a measure, Tj , called the resonance between Xi and Fj . The network can be in
two states, training and classifying. When it is in classifying mode each new
Xi is simply labeled as belonging to the node Fj with the largest Tj . When it is
in training mode, either the weights of one of the nodes will be updated or a new
node will be created each time a new Xi is presented.
FIGURE 3.7: A representation of an ART network. The input sample consists of a
vector $\mathbf{X}_i = \{X^i_1, X^i_2, \ldots, X^i_{2M}\}$ and is mapped to the
output category through the hidden layer F, in which each node represents a
cluster. Each node j calculates $T_j$; this value and the associated weights of
that node are sent forward. If the network is in training mode, the
$\mathbf{w}_j$ associated with the largest $T_j$ is sent forward to the
orienting subsystem to determine if the vigilance criterion (governed by ρ) is
met. If so, $\mathbf{w}_j$ is updated and if not, the vigilance criterion is
tested for the next largest $T_j$. If there does not exist a $T_j$ that passes
this criterion, a new node is added to the hidden layer. If the network is not
in training mode, the index j of the largest $T_j$ represents the category to
which the input is mapped.

3.8.1 Input Data


The original input data to the model consists of a vector
$\mathbf{X}_i = \{X^i_1, X^i_2, \ldots, X^i_M\}$ of size M where each $X^i_j$
is one of the previously mentioned features. This vector is rescaled so that
each feature lies in [0, 1] and complement encoded by appending the complement
of $\mathbf{X}_i$, $\mathbf{X}_i \leftarrow [\mathbf{X}_i, 1 - \mathbf{X}_i]$.
Complement encoding the data is done to reduce proliferation of clusters (i.e.
nodes in the F layer) whereas rescaling is done in order to neutralize natural
magnitude differences in the data (e.g. frequencies in the GHz region versus
AOA lying in [0, 2π]) [15].
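
The two preprocessing steps can be sketched as follows. The helper names
`rescale` and `complement_encode` and the example values are assumptions made
for this illustration; the handling of a constant feature (mapped to 0) is
likewise a choice made here.

```python
def rescale(points):
    """Min-max rescale each feature (column) of a list of points to [0, 1]."""
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(p, lows, highs)]
        for p in points
    ]


def complement_encode(x):
    """Append the complement 1 - x to a rescaled vector x of size M (result has size 2M)."""
    return list(x) + [1.0 - v for v in x]


# Hypothetical raw data with M = 2 features per point.
raw = [[8.0, 0.0], [10.0, 3.0], [9.0, 6.0]]
encoded = [complement_encode(p) for p in rescale(raw)]
print(encoded)
```

A useful property of the complement-encoded vector is that its L1 norm always
equals M, independently of the feature values.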

3.8.2 The F Layer


Each node in the F layer represents a cluster. The size of the layer is
initially 0 and progressively increases as the network learns. Each neuron Fj
in F is associated with a vector of weights wj, of size 2M × 1, that is used to
calculate the resonance, Tj, between the input and the node.

3.8.3 Training the Network


At each iteration i during the training of the network, a new data vector
$\mathbf{X}_i$ is introduced to the network. First, the category choice
functions calculate the resonances between the data and the different categories

$$T_j = T_j(\mathbf{X}_i, \mathbf{w}_j, \alpha) = \frac{|\mathbf{X}_i \wedge \mathbf{w}_j|}{\alpha + |\mathbf{w}_j|},$$

where $\alpha > 0$ is the choice parameter that usually is set to a very small
value. Setting it to a small value will increase the tendency of the system to
opt for categories with large $\mathbf{w}_j$. $\wedge$ is the fuzzy AND
operator, defined as

$$\mathbf{a} \wedge \mathbf{b} = [\min(a_1, b_1), \min(a_2, b_2), \ldots, \min(a_n, b_n)],$$

for two vectors $\mathbf{a}$ and $\mathbf{b}$ of size n. For a vector
$\mathbf{q}$ of size n, $|\mathbf{q}|$ represents the L1 norm

$$|\mathbf{q}| = \sum_{k=1}^{n} |q_k|.$$

The neuron that has the highest $T_j$ is said to be the winning neuron. If two
neurons have the same $T_j$ the one with the smallest j wins. If the winning
neuron satisfies the vigilance criterion

$$\frac{|\mathbf{X}_i \wedge \mathbf{w}_j|}{|\mathbf{X}_i|} > \rho, \quad 0 \leq \rho \leq 1,$$

where $\rho$ is called the vigilance parameter, then the weights $\mathbf{w}_j$
are updated

$$\mathbf{w}_j^{new} = \beta(\mathbf{X}_i \wedge \mathbf{w}_j^{old}) + (1 - \beta)\mathbf{w}_j^{old},$$

where $\beta \in [0, 1]$ is the learning rate parameter and dictates how fast
the network learns. If the chosen node does not satisfy the vigilance
criterion, a new node is chosen by taking the next largest T. This process
repeats itself until the vigilance criterion is met, and if the criterion is
left unsatisfied for all j a new node q will be committed to represent a new
cluster with $\mathbf{w}_q = \mathbf{X}_i$. After having processed all the data
points one time, the same data points can be reintroduced a second time for
retraining. The number of epochs is then said to be 2. Depending on $\beta$,
the weights change by different amounts from the previous epoch. This process
can be repeated for any number of epochs until the weights have converged to
constant values. When $\beta = 1$ and the input is complement encoded, the
network will have converged after the first epoch.
The vigilance parameter determines how strict the model is when it comes
to creating new clusters. If the parameter is closer to 0, the model will not be
prone to creating new clusters since the current clusters will likely satisfy the
vigilance criterion and the reverse is true for a high value of ρ.
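
The training procedure of this section can be sketched in a few functions. This
is a minimal illustration of the equations above and not the implementation
evaluated in this work; the function names, the default values of α and β and
the toy samples are assumptions made for the example. The inputs are expected
to be rescaled and complement encoded already.

```python
def fuzzy_and(a, b):
    """Element-wise minimum of two vectors (the fuzzy AND operator)."""
    return [min(x, y) for x, y in zip(a, b)]


def l1(v):
    """L1 norm of a vector."""
    return sum(abs(x) for x in v)


def fart_train_step(weights, x, rho, alpha=1e-6, beta=1.0):
    """Present one sample x; return the index of the node it is assigned to.

    `weights` is a list of weight vectors (one per node) and is modified in place.
    """
    # Category choice: T_j for every node, largest first (ties: smallest j).
    scores = [l1(fuzzy_and(x, w)) / (alpha + l1(w)) for w in weights]
    order = sorted(range(len(weights)), key=lambda j: (-scores[j], j))
    for j in order:
        match = l1(fuzzy_and(x, weights[j])) / l1(x)
        if match > rho:  # vigilance criterion
            weights[j] = [beta * m + (1 - beta) * w
                          for m, w in zip(fuzzy_and(x, weights[j]), weights[j])]
            return j
    weights.append(list(x))  # no node passed: commit a new node with w_q = x
    return len(weights) - 1


def fart_classify(weights, x, alpha=1e-6):
    """Return the index of the node with the largest T_j (ties: smallest j)."""
    scores = [l1(fuzzy_and(x, w)) / (alpha + l1(w)) for w in weights]
    return max(range(len(weights)), key=lambda j: (scores[j], -j))


# Toy complement-encoded samples with M = 1 feature.
nodes = []
for sample in ([0.10, 0.90], [0.12, 0.88], [0.90, 0.10]):
    print(fart_train_step(nodes, sample, rho=0.8))
print(fart_classify(nodes, [0.85, 0.15]))
```

With a lower vigilance such as `rho=0.1`, the third sample would pass the
criterion for the existing node and no second node would be created.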
The computation time for handling a point for this part of the algorithm is

$$\text{Computation time} = o_1(6NM + 5M + 3N + 1) + o_2(N + 4M),$$

where N is the number of nodes in the F layer and M is the number of features
of the original input. If β = 1 it is reduced to

$$\text{Computation time} = o_1(6NM + 5 + 6N + 1) + o_2(N).$$

In the worst-case scenario, each data point renders a new node (i.e. is
classified as its own cluster).

Method                        Number of computations
FART (training)               o1(6CM + 4M + 3C + 1) + o2(C + 4M)
FART (training, β = 1)        o1(6CM + 5 + 6C + 1) + o2(C)
FART (classifying, β = 1)     o1(6M + 1)C + o2 C
DBSCANseq                     o1(3 + P(2M + 2)) + (M + 1)o2

TABLE 3.2: The number of computations for the different algorithms when a new
point is introduced.

3.8.4 Classification Stage


In order to determine the cluster to which a data point belongs, the algorithm
uses the category choice function T and maps the data point to the category j
with the highest Tj. Note that a data point might be mapped to a different
category in the classification stage than in the training stage since the
weights are adaptive and new clusters might have been introduced. Furthermore,
the vigilance criterion does not play a role in the classification stage.
When only classifying points, without updating the weights, the computational
cost of introducing a new point reduces to

$$\text{Computation time} = o_1(6M + 1)C + o_2 C,$$

where C, like N above, denotes the number of nodes in the F layer.

The results obtained with a FART network are deterministic but vary with the
parameters and depend on the order in which the data is processed, meaning that
the FART lacks consistency.

3.9 Evaluating Clustering Quality


The evaluation of clustering quality will be done through the usage of the
adjusted Rand index detailed below [16].

3.9.1 Rand Index


Let C be the correct clustering, i.e. the grouping of pulses by the emitters
they come from, and P be the clustering generated by the model. The similarity
between P and C is measured by pairwise evaluation of all the points in the
data set. For each pair Xi, Xj, the relationship falls into one of four types:
N11 - The pair belongs to the same cluster in C as well as in P

N10 - The pair belongs to the same cluster in C but not in P

N01 - The pair belongs to the same cluster in P but not in C

N00 - The pair belongs to the same cluster in neither C nor P
N11 and N00 represent correct pairwise clusterings. The Rand index (RI) is then
computed as [16]

$$RI = \frac{N_{11} + N_{00}}{N_{11} + N_{10} + N_{01} + N_{00}} = \frac{N_{11} + N_{00}}{\binom{n}{2}},$$

where n is the number of points in the data set.

The RI can take on values in [0, 1].


An example of a clustering is seen in figure 3.8a, where the correct clusters
are shown. In figure 3.8b the clusters as given by an arbitrary model are shown.
The RI of the clustering done by the model is 0.93.

(a) 15 different clusters (all circled in dashed lines) are depicted. In each
cluster there are only three points.

(b) The clustering of the pulses in figure 3.8a by randomly labeling the
points, rendering 31 unique clusters. Most of the clusters consist of only one
point. The RI of the clustering is 0.93 and the ARI is -0.025.

FIGURE 3.8: The real clusters are depicted in figure 3.8a and the estimated
clusters are depicted in figure 3.8b. There are 45 points in this setting.

3.9.2 Adjusted Rand Index


In settings where the number of clusters is relatively high in comparison to the
number of data points, any algorithm can obtain an optimistically high RI by
randomly assigning each data point to a unique cluster. In this case N11 is
low but N00 is large, resulting in an RI close to 1.
The adjusted Rand index (ARI) is the RI adjusted for chance, meaning that
the probability of randomly putting two pulses in clusters according to N11
or N00 is taken into account [17]. This is illustrated in figure 3.8 where the RI
obtained by randomly labeling points is 0.93 whereas the ARI is -0.025. The
ARI is obtained by

$$ARI = \frac{2(N_{00}N_{11} - N_{01}N_{10})}{(N_{00} + N_{01})(N_{01} + N_{11}) + (N_{00} + N_{10})(N_{10} + N_{11})}.$$
Since there is no constraint that N01 N10 ≤ N00 N11 , the ARI can take on a
negative value. The ARI equals 0 when a clustering is similar to a clustering
in which the labels are randomly generated. An ARI of 1 indicates that the
clustering is perfect.
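
Both indices follow directly from the four pair counts. The sketch below is a
straightforward illustration of the formulas in this section and is not the
evaluation code used in this work; the function names are chosen for the
example, and returning 1 when the ARI denominator is zero (which happens when
both clusterings put all points in one cluster, or each point in its own) is a
convention assumed here.

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Count the pair types (N11, N10, N01, N00) over all pairs of points."""
    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            n11 += 1
        elif same_true:
            n10 += 1
        elif same_pred:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(true_labels, pred_labels):
    n11, n10, n01, n00 = pair_counts(true_labels, pred_labels)
    return (n11 + n00) / (n11 + n10 + n01 + n00)


def adjusted_rand_index(true_labels, pred_labels):
    n11, n10, n01, n00 = pair_counts(true_labels, pred_labels)
    denom = (n00 + n01) * (n01 + n11) + (n00 + n10) * (n10 + n11)
    if denom == 0:
        return 1.0  # assumed convention for the degenerate case
    return 2 * (n00 * n11 - n01 * n10) / denom


# 15 true clusters of three points each; the "model" puts every point in its own cluster.
truth = [k // 3 for k in range(45)]
singletons = list(range(45))
print(rand_index(truth, singletons), adjusted_rand_index(truth, singletons))
```

In this example the RI is high although no pair of pulses from the same emitter
has been grouped together, while the ARI is 0.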

Chapter 4

Method

In order to test the performance of the models according to the ARI (see
chapter 3.9.2), the models had to be implemented. A library already existed for
DBSCAN, so only the proposed online version that can handle new points and the
FART network were implemented. Furthermore, the data sets used to test the
chosen models were all simulated. This was done due to the sensitive nature of
the data as well as the difficulty of obtaining real labeled data (i.e. it is
easy to gather data but difficult to know where it comes from). In this chapter
the simulation environment and data generation parameters are detailed.

4.1 Data Generation


The raw input (before rescaling and complement coding) to the models con-
sisted of PDWs. To generate the PDWs, the supervisor at Saab provided com-
pany software to handle this task. The inputs to this software were the de-
sired characteristics of the emitters generating pulses, called emitter defini-
tions. These emitter definitions include, but are not limited to, the frequency,
pulse width and relative position of the emitters. As an example, consider the
case with an antenna receiving pulses from 3 active transmitters. The input to
the software could look like the rows in table 4.1.

Emitter #   Frequencies [GHz]    Pulse Widths [ns]     ···
1           [8, 8.2]             [2.32, 2.34, 2.41]
2           [13, 13.1, 13.12]    [4.21]
3           [4]                  [9.32, 9.99, 10.01]

TABLE 4.1: An example of a set of emitter definitions that could be used as
input to the PDW generator.

The output would then consist of an interleaved pulse train containing the
pulses emitted from the specified emitters (see table 4.2). The software takes
into account the fact that both the antenna and the emitters are moving along
some unique trajectories.

4.2 Emitter Definitions


To generate the PDWs, emitter definitions have to be defined. For each emitter
definition, parameters linked to its PW, frequency, PRI and position were
defined. Several aspects of these definitions were set through guidance by
radar experts at Saab. The methodology used to choose the various parameters is
outlined in this section. These characteristics were randomly drawn from
various probability distributions. The parameters of these distributions are
summarized in table 4.3 and the distributions used are outlined throughout the
section.

labels   TOA        Frequency   PW          Amplitude   ···
1        0.000293   7.091410    83.090728   -39.203
2        0.001165   4.739233    83.424555   -45.485
1        0.001183   7.084387    85.887775   -39.200
3        0.004950   4.386671    9.469255    -31.184
...

TABLE 4.2: A snippet of an output from the PDW generator. The output consists
of several (here 4) PDWs, each containing information regarding a pulse.

4.2.1 Pulse Width


Each transmitter is allowed to transmit pulses with several different PWs. The
number of PWs, $n_{pw}$, is drawn at random between 1 and $n_{max}$. First a
base value of the PW is generated

$$PW = c_{pw} + x_{pw},$$

where $c_{pw} = 0.1$ is a constant representing a lower bound on the PW and
$x_{pw}$ is drawn from an exponential distribution with the following density
function

$$f(x_{pw}; \lambda_{pw}) = \frac{1}{\lambda_{pw}} \exp\left(-\frac{x_{pw}}{\lambda_{pw}}\right), \quad x_{pw} \geq 0, \qquad (4.1)$$

where $\lambda_{pw} = 70$ is the expected value of the generated variable
$x_{pw}$, meaning that the mean PW will be $c_{pw} + \lambda_{pw} = 70.1$. The
rationale behind choosing an exponential distribution is that the probability
of obtaining a certain PW decreases as the value increases. After this base PW
has been generated, $n_{pw}$ PWs are generated by multiplying the base PW with
a random offset of up to 10%

$$\mathbf{PW} = [PW \cdot o_0, \ldots, PW \cdot o_{n_{pw}}],$$

where $o_i \sim \mathrm{Uniform}(0.9, 1.1)$.

4.2.2 Pulse Repetition Interval


The PRIs are calculated through

$$PRI = PW(c_{PRI} + x_{PRI}),$$

where $x_{PRI}$ is drawn from an exponential distribution using $\lambda_{PRI}$
(see eq. 4.1) and $c_{PRI} = 10$ is constant. Since $x_{PRI} \geq 0$,
generating the PRI this way ensures that the PW is smaller than the PRI and
that the PW is at most a certain fraction ($1/c_{PRI}$) of the PRI. Note that
the number of PRIs is equal to the number of PWs for a certain pulse train and
that there can be up to $n_{max}$ different PWs.
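
The generation of PWs and PRIs described in sections 4.2.1 and 4.2.2 can be
sketched as follows. This is an illustration and not the software used to
generate the data. Assumptions made here: the number of PWs is drawn as a
uniform integer, exactly $n_{pw}$ values are produced, and a separate
$x_{PRI}$ is drawn for each PW; the function names are chosen for the example.

```python
import random


def generate_pws(rng, n_max=10, c_pw=0.1, lam_pw=70.0):
    """Draw a base PW and return n_pw variations of it (each within +/- 10%)."""
    # random.expovariate takes the rate, i.e. one over the expected value.
    base = c_pw + rng.expovariate(1.0 / lam_pw)
    n_pw = rng.randint(1, n_max)   # assumption: uniform integer in [1, n_max]
    return [base * rng.uniform(0.9, 1.1) for _ in range(n_pw)]


def generate_pris(rng, pws, c_pri=10.0, lam_pri=1.0):
    """One PRI per PW, PRI = PW * (c_PRI + x_PRI) with exponential x_PRI."""
    return [pw * (c_pri + rng.expovariate(1.0 / lam_pri)) for pw in pws]


rng = random.Random(0)             # fixed seed so the example is repeatable
pws = generate_pws(rng)
pris = generate_pris(rng, pws)
for pw, pri in zip(pws, pris):
    print(round(pw, 3), round(pri, 3), round(pw / pri, 4))
```

The last printed column is the ratio PW/PRI, which never exceeds
$1/c_{PRI} = 0.1$ because $x_{PRI}$ is non-negative.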
FIGURE 4.1: The histogram for the frequencies of a generated pulse train with
$|f_i| = 4$ after noise is added (number of occurrences versus frequency [GHz]).

As mentioned in section 2.3.6, an emitter can send out a pulse train with
several types of PRI modulations. The modulation of a certain pulse train is
chosen randomly. Note that the PRI is not used as an input to the evaluated
models. However, it affects how often a pulse occurs and will therefore impact
the number of pulses in the data set.
As an example, consider the case of generating the pulses for a radar with
certain characteristics for a specified time period. If the characteristics remain
unchanged except for the PRI, which is decreased, the number of pulses emit-
ted from this radar will increase. If one aims to cluster the pulses registered
during a time period, decreasing the PRI will lead to a higher computational
cost since more pulses have to be processed.

4.2.3 Frequency
The base frequency is drawn at random from the interval [2, 20] GHz, using a
distribution with a bump in [8, 11] GHz. A transmitter is allowed to transmit
up to 20 different frequencies, with this number being drawn from a uniform
distribution. These different frequencies, $f_i$, differ from the base
frequency by up to 5% and the pulse train will consist of frequencies around
each $f_i$ (see figure 4.1 for an example).

4.2.4 Position
The position is given by three coordinates (x, y, z) that are drawn from random
uniform distributions. In each direction, the maximum distance that an emitter
can have from the receiver is 100 km, meaning that the furthest Euclidean
distance between the receiver and the emitter is ≈ 173 km.

Parameter values
The values of the parameters presented in the previous sections are presented
in table 4.3.
Parameter    Value    Interpretation
n_max        10       Number of PWs
λ_pw         1/70     Expected value of PW added to the base PW
λ_PRI        1        Expected value of PRI added to the base c_PRI
c_pw         0.1      Base PW
c_PRI        10       PRI base fraction of PW

TABLE 4.3: The values for parameters used to generate data in the first data environment.

4.3 Adding Noise


Once the PDWs had been generated, noise was added to their different features. The
noise was additive for the AOA and the frequency, meaning that for the
frequency of a PDW, $X^i_{freq}$, a new frequency was obtained by

$$X^i_{freq} \leftarrow X^i_{freq} + noise_{freq},$$

where $noise_{freq} \sim Uniform(-0.6, 0.6)$ [GHz]. The AOA was treated in the same way with
$noise_{AOA} \sim Normal(\mu = 0, \sigma = 7)$ [°], where $\mu$ is the mean and $\sigma$ the standard deviation of the
distribution. The PW of each PDW was multiplied by a factor in order to account
for the noise,

$$X^i_{PW} \leftarrow X^i_{PW} \times noise_{PW},$$

where $noise_{PW} \sim Uniform(0.99, 1.01)$.
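The noise model above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `add_noise`, the array layout and the example values are hypothetical and not taken from the thesis code; only the three distributions come from the text.

```python
import numpy as np

def add_noise(freq_ghz, aoa_deg, pw, rng):
    """Return noisy copies of the frequency, AOA and PW arrays."""
    n = len(freq_ghz)
    noisy_freq = freq_ghz + rng.uniform(-0.6, 0.6, size=n)  # additive, GHz
    noisy_aoa = aoa_deg + rng.normal(0.0, 7.0, size=n)      # additive, degrees
    noisy_pw = pw * rng.uniform(0.99, 1.01, size=n)         # multiplicative
    return noisy_freq, noisy_aoa, noisy_pw

# Hypothetical pulse train: 1000 pulses with identical nominal features.
rng = np.random.default_rng(0)
freq = np.full(1000, 9.5)
aoa = np.full(1000, 30.0)
pw = np.full(1000, 2.0)
noisy_freq, noisy_aoa, noisy_pw = add_noise(freq, aoa, pw, rng)
```

Because the PW noise is multiplicative, its absolute size scales with the pulse width, whereas the frequency and AOA noise have the same spread for every pulse.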

4.4 Distance Function


The chosen distance function was the Euclidean distance function. Given two
points $X^i$ and $X^j$ with $2K$ features, the Euclidean distance is given by

$$\text{Euclidean Distance} = \sqrt{\sum_{q=1}^{2K} \left( X^i_q - X^j_q \right)^2}.$$

This distance metric was chosen since it seemed suitable for the generated data
and it performed sufficiently well on some initial trials. Further tailoring the
distance function to the data could be beneficial in real life, but doing so in
this environment could be misleading since the data is generated. The reason
for this is that the user specifies the distributions from which the pulses are
generated, and so the final model could be overfitted to work on this specific
environment.

4.5 Scaling the Features


The data is initially rescaled for both methods so that each feature lies in [0, 1].
For a data set DS, the largest value of each feature over all the points (e.g. the
largest frequency) is mapped to 1 and the smallest to 0. Each feature $j$ of each
point $X^i$ is rescaled as

$$X^i_j \leftarrow \frac{X^i_j - a_j}{b_j - a_j}. \qquad (4.2)$$

Let $Y_j$ be the vector containing all the values of the $j$th feature over all the points
(e.g. all amplitudes in our data set), then

$$a_j = \min(Y_j), \qquad b_j = \max(Y_j).$$

In other words, $a_j$ is the minimum and $b_j$ the maximum value that feature $j$ takes.
This assumes either that there is a previously collected data set DS that is
to be labeled, or that new points will be similar to the points in DS.
This means that the distance between two points will be dependent on the
maximum and minimum values of the different features in the data set. As
an example, consider the case where two pulses occur with frequencies 1 GHz
and 2 GHz respectively. The distance between these points after rescaling will
be strongly reliant on how the rest of the data set looks.
Another problem with using this methodology in an online setting is the
case of handling a new point $X^{new}$ whose features lie below the previously set
$a_j$ or above $b_j$. Then either a new $a_j$/$b_j$ can be set, which means that the $j$th
feature of all the points has to be rescaled, or $X^{new}_j$ can simply be set to 0 or 1.
In order to adjust for this reliance on the data set, custom $a_j$/$b_j$ can be
chosen so that the model uses these values. In this case, the assumption is
made that, for instance, the frequency cannot exceed 20 GHz and cannot be less
than 2 GHz, so that

$$a_{freq} = 2, \qquad b_{freq} = 20.$$

For the data set used in this report, these values can easily be set. In a real
environment, it would be much more difficult to know what the maximum frequency
would be, but a sufficiently good estimate can be retrieved by theoretical
analysis. An alternative approach could be to estimate these values based on the
pulses gathered during the most recent time period of a specified length.
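The dependence of the rescaled distance on the chosen bounds can be illustrated with a short sketch. The helper `rescale` and the example frequencies are hypothetical; the bounds 2 GHz and 20 GHz are the ones assumed in the text, and clipping corresponds to setting an out-of-range value to 0 or 1.

```python
import numpy as np

def rescale(values, a, b, clip=True):
    """Min-max rescale `values` to [0, 1] using fixed bounds a (min) and b (max)."""
    scaled = (np.asarray(values, dtype=float) - a) / (b - a)
    # Points outside [a, b] are clipped to 0 or 1 instead of re-fitting the bounds.
    return np.clip(scaled, 0.0, 1.0) if clip else scaled

freqs = np.array([3.0, 4.0])  # two pulses, in GHz

# Bounds taken from a small data set whose frequencies span 3 to 5 GHz.
local = rescale(freqs, a=3.0, b=5.0)
# Fixed bounds a_freq = 2 and b_freq = 20.
fixed = rescale(freqs, a=2.0, b=20.0)

gap_local = abs(local[1] - local[0])  # 0.5
gap_fixed = abs(fixed[1] - fixed[0])  # 1/18, roughly 0.056

# A new pulse above the upper bound is clipped to 1.
outlier = rescale([21.0], a=2.0, b=20.0)
```

The same two pulses end up nine times further apart when the bounds come from a narrow data set, which is why a distance threshold tuned for one set of bounds does not transfer to another.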

4.6 Training and Validation Sets


A training and a validation data set were generated. The training set was
used to tune the models whereas the validation set was used to validate
the results obtained from the training set. Each data set consisted of 10 ms
long pulse trains from around 7000 different emitters. Even though all emitters
transmit pulses for around 10 ms, the number of pulses varies greatly since the
PRIs differ.
When a simulation was done, a radar environment with $n_{radar}$
emitters was generated by randomly selecting $n_{radar}$ emitters from our data set
and extracting the PDWs of these emitters. These PDWs were used as input
for a particular setup of the models. The number of simulations done for each
setup varies between 5000 and 10000 different environments per setup with
up to 50 emitters (see chapter 5). The number of ways of choosing 50 emitters
out of 7000 is $\approx 5 \times 10^{127}$, indicating that it is unlikely that two identical
environments are generated by choosing 50 emitters from a set of 7000.
This means that the environments used to evaluate how the results change in e.g.
sections 5.1.1 and 5.1.2 might differ, but for all simulations within each section
the data set is the same. In both cases the environments were drawn from the same
underlying data set.
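The count of possible environments quoted above can be checked directly with exact integer arithmetic. The snippet assumes Python 3.8 or later, where `math.comb` is available; the variable names are illustrative.

```python
import math

# Number of ways to choose 50 emitters out of 7000 (exact integer).
n_environments = math.comb(7000, 50)

# Order of magnitude: number of decimal digits minus one.
order_of_magnitude = len(str(n_environments)) - 1

print(f"about {n_environments:.1e} possible environments")
```

With at most 10000 environments drawn per setup, the chance of drawing the same 50 emitters twice is negligible.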

4.7 Programming Environment


All the code was written in Python 3. The DBSCAN method and the ARI were
imported from the scikit-learn library (see footnotes 1 and 2). All the simulations were performed
on a computer running on Windows 10 with Intel Core i7-8700K 3.70 GHz and
NVIDIA GeForce GTX 1080 Ti.

1 https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html
2 https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html
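A minimal sketch of how these two scikit-learn components fit together is given below. The three synthetic blobs stand in for emitters and are not the thesis data; the parameter values are the ones reported in chapter 5, and `min_samples` is scikit-learn's name for minPts.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Three hypothetical emitters, each a tight blob in a rescaled 3-feature space.
centers = np.array([[0.2, 0.2, 0.2], [0.5, 0.8, 0.3], [0.8, 0.4, 0.7]])
X = np.vstack([c + rng.normal(0.0, 0.01, size=(50, 3)) for c in centers])
true_labels = np.repeat(np.arange(3), 50)

# eps is the neighbourhood radius; min_samples plays the role of minPts
# (scikit-learn counts the point itself among its neighbours).
labels = DBSCAN(eps=0.08, min_samples=3).fit_predict(X)

# The ARI is invariant to how the cluster ids are numbered.
ari = adjusted_rand_score(true_labels, labels)
```

Points that DBSCAN regards as noise receive the label -1, which the ARI treats as one more group.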

Chapter 5

Results

Two things that we can regulate will affect how well the different models perform on
the data set: the features and the models’ parameters. Regarding
the features, one can choose which features to use (e.g. only
amplitude and frequency) as well as how to process the features (e.g. scaling
the values). This chapter consists of two sections, one for DBSCAN and one
for FART, in which the impact of changing these values is described, and a
third section in which the best results for the two methods are compared. The
features available can be seen in table 3.1, except for the TOA, which is not used.

5.1 DBSCAN
5.1.1 Parameter Tuning
As previously mentioned, there are two parameters to specify when using
DBSCAN: ε and minPts. The optimal parameters were obtained by a grid search
in an environment with 12 emitters using all the features. In
figure 5.1 the results are depicted for two settings and it can be seen that ε is
far more influential on the results than minPts. The best results
were obtained by using ε = 0.08 and minPts = 3. This set of parameters was
validated to be optimal when varying the features.
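A grid search of this kind can be sketched as follows. The function `grid_search`, the candidate values and the toy data are assumptions made for illustration; the thesis averaged the ARI over thousands of simulated environments, whereas this sketch scores a single synthetic one.

```python
import itertools
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

def grid_search(X, true_labels, eps_values, min_pts_values):
    """Return (best_ari, best_eps, best_min_pts) over all parameter pairs."""
    best = (-1.0, None, None)
    for eps, min_pts in itertools.product(eps_values, min_pts_values):
        labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
        ari = adjusted_rand_score(true_labels, labels)
        if ari > best[0]:
            best = (ari, eps, min_pts)
    return best

# Toy stand-in for one simulated environment with four emitters.
rng = np.random.default_rng(1)
centers = np.array([[0.2, 0.2, 0.2], [0.8, 0.2, 0.5],
                    [0.5, 0.8, 0.3], [0.2, 0.6, 0.8]])
X = np.vstack([c + rng.normal(0.0, 0.02, size=(40, 3)) for c in centers])
y = np.repeat(np.arange(4), 40)

best_ari, best_eps, best_min_pts = grid_search(
    X, y, eps_values=[0.02, 0.05, 0.08, 0.2, 0.5], min_pts_values=[3, 5]
)
```

Ties are resolved in favour of the first parameter pair tried, so the order of the candidate lists matters when several settings reach the same ARI.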

FIGURE 5.1: Results obtained using DBSCAN to cluster the data using different parameter values. For each combination 5000 simulations were done where ε = 0.08 and minPts = 3. (Plot title: ARI obtained in a setting with 12 emitters; x-axis: epsilon; y-axis: ARI; curves: minPts = 3 and minPts = 5.)

5.1.2 Choosing Features


To determine which set of features is the most important, different combinations
of the features were tested. The best results can be seen in figure 5.2. The
features that yield the highest ARI differ between two regions. In the region
where the number of emitters is larger than 6, the results with the highest ARI
are obtained using all features. Dropping either sin or cos yields a
significantly better result when the number of emitters is lower than 7. The trend
seems to be that feature sets that obtain a higher ARI when the
number of emitters is less than 7 perform worse as the number of emitters
grows, compared to feature sets that perform poorly in the first region.

FIGURE 5.2: Results obtained using DBSCAN to cluster the data using different features and varying the number of emitters. For each combination 5000 simulations were done where ε = 0.08 and minPts = 3. (Plot title: DBSCAN with epsilon = 0.08, minpts = 3; x-axis: Number of emitters; y-axis: ARI; curves: freq., pw., amp.; sin, pw., freq., amp.; sin, cos, pw., freq., amp.; sin, amp., freq.)

5.1.3 Scaling Features


In figure 5.3 the inputs are rescaled according to equation 4.2 with
predetermined values of $a_j$ and $b_j$. These predetermined values were set to the global
min/max values in the whole data set and are here called flags.
The obtained results are significantly better than those in figure 5.2 when the number
of emitters is low. Note that the y-axis has different limits in figures 5.2 and 5.3.

5.1.4 Online DBSCAN


In this subsection, the whole data set consists of pulses gathered for 10 ms.
The data set is partitioned into two subsets, where one is used to familiarize
the algorithm with the environment and the second one is labeled. The results
obtained are seen in figure 5.4. The features used here were the frequency,
PW and amplitude. Using the sin/cos features yielded worse results.
FIGURE 5.3: Results obtained using DBSCAN to cluster the data using different features, rescaling the data with global flags and varying the number of emitters. For each combination 5000 simulations were done where ε = 0.08 and minPts = 3. The blue line with triangle markers uses all features and the orange line with square markers uses all features except cos. (x-axis: Number of emitters; y-axis: ARI.)

FIGURE 5.4: Results obtained using DBSCAN to cluster the data using the two sets of features, rescaling the data using global flags and changing the training time. For each combination 5000 simulations were done where $\epsilon_{init} = 0.08$, $minPts_{init} = 2$ and $\epsilon_{seq} = 0.05$, $minPts_{seq} = 2$. The number of emitters acting in this environment is 12. (x-axis: Training time [ms]; y-axis: ARI; curves: pw., ampl., freq. used; all features used.)

5.2 FART
5.2.1 Parameter Tuning
The two parameters to specify for the FART network are ρ and β.
As previously discussed, ρ controls how
strict the algorithm is when associating a point with a cluster. The higher
the value of ρ, the stricter the algorithm is when evaluating the vigilance
criterion and the more prone it is to create new clusters. The β
parameter regulates how many epochs the network takes before the weights
have converged. Since it is desirable that the algorithm executes as fast as
possible, β was set to 1. With β = 1, ρ = 0.9 yielded the highest ARI.
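To make the roles of ρ and β concrete, the following is a minimal sketch of the textbook Fuzzy ART update for one input point. It is not the thesis implementation: the class name `FuzzyART`, the method `partial_fit` and the choice parameter `alpha` (with its value) are assumptions introduced here for illustration.

```python
import numpy as np

class FuzzyART:
    """Minimal Fuzzy ART sketch (standard formulation, not the thesis code)."""

    def __init__(self, rho=0.9, beta=1.0, alpha=1e-3):
        self.rho = rho      # vigilance: higher means more, tighter clusters
        self.beta = beta    # learning rate: 1 corresponds to fast learning
        self.alpha = alpha  # small choice parameter (assumed value)
        self.weights = []   # one weight vector per cluster

    @staticmethod
    def _complement_code(x):
        x = np.asarray(x, dtype=float)
        return np.concatenate([x, 1.0 - x])

    def partial_fit(self, x):
        """Present one point with features in [0, 1]; return its cluster index."""
        X = self._complement_code(x)
        # |X ^ w_j| for every cluster, where ^ is the element-wise minimum.
        matches = [np.minimum(X, w).sum() for w in self.weights]
        # Choice function T_j = |X ^ w_j| / (alpha + |w_j|).
        choices = [m / (self.alpha + w.sum())
                   for m, w in zip(matches, self.weights)]
        # Visit clusters from the largest T_j down until one passes vigilance.
        for j in np.argsort(choices)[::-1]:
            if matches[j] >= self.rho * X.sum():
                w_old = self.weights[j]
                self.weights[j] = (self.beta * np.minimum(X, w_old)
                                   + (1.0 - self.beta) * w_old)
                return int(j)
        # No cluster resonates: create a new one initialised to the input.
        self.weights.append(X.copy())
        return len(self.weights) - 1

net = FuzzyART(rho=0.9, beta=1.0)
points = [[0.10, 0.20], [0.11, 0.21], [0.90, 0.80]]
labels = [net.partial_fit(p) for p in points]
```

In this small example the first two points are close enough to pass the vigilance test together, while the third one fails it and opens a second cluster. Lowering `rho` makes the test easier to pass and therefore reduces the number of clusters.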

5.2.2 Choosing Features

FIGURE 5.5: Results obtained using FART to cluster the data using different features, rescaling the data and varying the number of emitters. For each combination 5000 simulations were done where ρ = 0.9 and β = 1. (x-axis: Number of emitters; y-axis: ARI; curves: sin, pw., freq., amp.; sin, cos, pw., freq., amp.; freq., pw., amp.; Scaling all the features.)

Figure 5.5 depicts the results for the best combinations of features to use.
In the region with 6 or more emitters, more features yield better results. In
the other region, omitting trigonometric features yields better results. For all
combinations the results are significantly worse when there is a low number
of emitters. Note that the red curve with diamond markers depicts the case
where all features are rescaled (see the next section).

5.2.3 Scaling Features


Following the reasoning in section 5.1, global flags are introduced when
rescaling the features. The results improve significantly in the region with a low
number of emitters when global flags are introduced, as seen in figure 5.5.

5.2.4 Online FART


For the orange line with square markers in figure 5.6, the FART network was
trained on the data retrieved during the first 2 ms and then clustered the data collected
during the next 8 ms. As seen in the figure, the results became slightly worse,
but the method is expected to compute faster. In figure 5.7 the training time is changed in
an environment with 12 acting emitters.
FIGURE 5.6: Results obtained using FART to cluster the data using all features, rescaling the data with global flags and varying the number of emitters. For each combination 5000 simulations were done where ρ = 0.9 and β = 1. The orange line with square markers gathered data for 2 ms and clustered the data from the next 8 ms using the trained network. The blue line gathered data for 10 ms and used that network to cluster itself. (x-axis: Number of emitters; y-axis: ARI.)

FIGURE 5.7: Results obtained using FART to cluster the data using two sets of features, rescaling the data using global flags and varying the training time. For each combination 10 000 simulations were done where ρ = 0.9 and β = 1. There are 12 active emitters in the simulation. (x-axis: Training time [ms]; y-axis: ARI; curves: freq., pw., ampl. used; All features used.)

5.3 Comparing DBSCAN and FART


In the previous sections the best results for both methods were retrieved and
in this section they are further compared to each other.

5.3.1 Increasing Number of Emitters


The ARIs when there are up to 50 emitters are shown in figure 5.8. It is
evident that DBSCAN performs better than FART in most cases. As the number
of emitters increases, both methods yield lower ARIs. The decline
is larger for FART than for DBSCAN. Note that the limits on the y-axis
are different from the previous figure to illustrate the performance drop more
effectively.

FIGURE 5.8: Results obtained for online DBSCAN and FART when varying the number of emitters. For online DBSCAN: ε = 0.08 and minPts = 3. For FART: β = 1 and ρ = 0.9. The training time is 2 ms for both the methods. (Plot title: Best results obtained for DBSCAN and FART; x-axis: Number of emitters; y-axis: ARI; curves: all features, global flags: DBSCAN; all features, global flags: FART.)

5.3.2 Predicted Number of Emitters


In certain scenarios it can be of great value to predict how many emitters are
operating. Figure 5.9 shows how the two models with the best ARI perform
in determining the number of emitters. Note that the models can be tuned
to perform better in regard to this score (e.g. decreasing ρ will decrease the
number of clusters), but this will reduce the ARI. FART tends to overestimate the
number of emitters whereas DBSCAN shows the opposite behavior. The error
obtained using FART exceeds the one using DBSCAN, which lies closer to the
true value (depicted as a green dashed line).
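One way to turn a clustering into an estimate of the number of emitters is to count the distinct cluster labels. The sketch below assumes labels in the scikit-learn convention, where DBSCAN marks noise points with -1; whether the thesis counted noise in the same way is not stated, and the helper name is hypothetical.

```python
import numpy as np

def estimated_emitters(labels):
    """Count distinct cluster labels, ignoring the noise label -1."""
    unique = set(np.asarray(labels).tolist())
    unique.discard(-1)
    return len(unique)

# Three clusters (0, 1, 2) and one noise point.
n_est = estimated_emitters([0, 0, 1, -1, 2, 2, 1])
```

For FART there is no noise label, so the same function simply returns the number of nodes that received at least one point.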

5.3.3 Online Clustering


The effect of increasing the number of emitters for the online DBSCAN and
FART when training on 2 ms of the data is seen in figure 5.10. Once again, the results
decay with an increase in the number of emitters. In this scenario the decrease
is more evident than in figure 5.8.
FIGURE 5.9: The predicted number of emitters when changing the number of emitters for FART and DBSCAN. For DBSCAN (blue line, circle marker): ε = 0.08 and minPts = 3. For FART (orange line, square marker): β = 1 and ρ = 0.9. (Plot title: Predicted number of emitters; x-axis: Number of emitters; the ideal prediction is shown for reference.)

FIGURE 5.10: Best results obtained for DBSCAN and FART when varying the number of emitters. For DBSCAN: ε = 0.08 and minPts = 3. For FART: β = 1 and ρ = 0.9. (Plot title: Training on 2 ms of data; x-axis: Number of emitters; y-axis: ARI; curves: FART, all features; DBSCAN, pw., ampl., pw. [sic].)

Chapter 6

Discussion

6.1 Parameters
When evaluated on the ARI, both models perform well on the generated data
sets. For DBSCAN, when ε increases, the distance that two points can have
from each other and still be regarded as neighbors increases, which means that
the user allows less dense clusters to be identified. If the value is increased
further, separate clusters will be joined together and the number of clusters
will decrease, as the results in figure 5.1 indicate.

6.2 Number of Emitters


Since the initial parameter values were optimized to run in an environment
with more than 6 emitters, the model performs poorly in a setting with fewer
emitters and without global flags (see figures 5.2 and ??). In an environment
with a lot of emitters, the probability of the max/min values of a feature across
all the pulses being close is small. When the max/min values of a feature are
close and the data is rescaled accordingly, the models will perform poorly. This
is because the sampled data set will appear to be sparser than if the max/min
values had been far apart.

6.3 Trigonometric Features with DBSCAN


Both in the scenario where there are few emitters with no global flags and
when the training time is reduced, using the trigonometric features yields
worse results for the DBSCAN methods. In both cases the reason why the
result is significantly diminished is that the angle of arrival feature is prone
to be noisier. Originally, noise of ±7° was added to the registered AOA of each pulse,
meaning that two pulses from a pulse train can span ≈ 3.9% of the input space.
Since both the emitter and the antenna are moving in this environment, their
relative positions will shift in the span of the sampling. This shift will induce
further discrepancies in the AOA of the pulses in a pulse train. A single pulse
spans 14° but the average span of a pulse train is 45°.
Splitting up the AOA induces a loss in accuracy for the individual sin/cos
features. Consider the case where we have several pulses with AOA = 0°.
These pulses will lie in [−7, 7]° (spanning 14/360 ≈ 3.9% of the feature
space) due to the noise added to each pulse. The sin feature of this pulse train
will take on values in [−0.12, 0.12] (spanning 0.24/2 = 12% of the feature
space).
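The two percentages in this example can be reproduced with a few lines of arithmetic. The variable names are illustrative; the only inputs are the 7° noise amplitude and the ranges of the AOA (360°) and of the sine ([−1, 1]).

```python
import math

noise_deg = 7.0

# Fraction of the 360 degree AOA range covered by pulses in [-7, 7] degrees.
aoa_span = 2 * noise_deg / 360.0

# The sine of the same pulses lies in [-sin(7 deg), sin(7 deg)].
sin_half_width = math.sin(math.radians(noise_deg))

# Fraction of the sine range [-1, 1], which has width 2.
sin_span = 2 * sin_half_width / 2.0

print(f"AOA span: {aoa_span:.1%}, sin span: {sin_span:.1%}")
```

Near 0° the sine is at its steepest, so this is the least favourable case for the sin feature; the cos feature is least favourable near ±90°.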

When reducing the training time, the effect of the moving units becomes
apparent in the ARIs obtained using the online DBSCAN. In figure 5.4 the
curves show that omitting the sin/cos variables yields significantly better
results when the training time is low. For the FART network the trigonometric
functions have a positive effect on the ARI (see figure 5.7). One possible reason
for this is that FART does not use a distance function to determine the
similarity between a point and a cluster. It instead uses the resonance, which
might be less susceptible to the previously described phenomenon.

6.4 Online Versions


Looking at the results from the online versions depicted in figures 5.4, 5.7 and
5.10, it is evident that even though the training time is decreased, the ARIs
obtained are still high. This suggests that decreasing the training time is indeed
a viable approach. There is, however, a drop in the ARI which is more
evident when the number of emitters is increased, as seen in figure 5.10.

6.5 Computational Costs


The computational costs of the online DBSCAN and the FART network are not
entirely comparable since they both depend on how the methods have
performed on the previous pulses. Consider the case where the online DBSCAN
method adds each new point to the data set as a unique cluster and that it
has added $P$ points already. Then a new point $X^i$ added to the network has to
be compared to all $P$ previous points, rendering a high cost. Assume instead that in
this setting there were $P/20$ true clusters and that the algorithm only needed 2
pulses to identify a cluster. Then the algorithm would only have stored $P/10$ of
the pulses and the computational cost for $X^i$ would be reduced by a factor of 10.
The FART network has the advantage of having a classification stage which
does not change the current network at all whereas the online DBSCAN method
lacks such a setting. The advantage of this setting is that when switching from
training to classifying, the user can regulate how many computations each
new point will have to perform to be labeled. Performing the k-nearest neigh-
bor algorithm on the output of DBSCAN could be an alternative solution.
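The k-nearest neighbor alternative mentioned above could look roughly like the sketch below. It is an assumption about how such a classification stage might be built with scikit-learn, not something evaluated in this report; the synthetic data, `n_neighbors=3` and the decision to drop noise points are all illustrative choices.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
centers = np.array([[0.2, 0.3], [0.7, 0.8]])
train = np.vstack([c + rng.normal(0.0, 0.01, size=(30, 2)) for c in centers])

# Training stage: cluster the initial batch once.
labels = DBSCAN(eps=0.08, min_samples=3).fit_predict(train)

# Classification stage: fit k-NN on the non-noise points only. The stored
# clustering is never modified by new pulses.
clustered = labels != -1
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(train[clustered], labels[clustered])

new_pulses = np.array([[0.21, 0.29], [0.69, 0.81]])
predicted = knn.predict(new_pulses)
```

The cost of labeling a new pulse then depends on the number of stored training points rather than on all pulses seen so far, which mirrors the fixed classification cost of FART. A drawback of this sketch is that a pulse from a previously unseen emitter is always assigned to an existing cluster.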

6.6 Future Work


The methods described in this report should be applied to a more diverse data
set in order to further map their weaknesses and assess their suitability for
solving this task. Changes to the data set should be made especially in regard to
noise and spurious pulses added to the data set.
The results from batch-wise clustering in predicting the number of emitters,
as seen in figure 5.9, are very accurate, especially in the case of DBSCAN. This
offers an alternative functionality, which is to estimate the number of
transmitters. This information could be used to implement a set of new clustering
methods that need to know the number of emitters in an environment to
function, such as the K-means method.
It could also be of interest to evaluate the suitability of models that work
beyond 10 ms and perhaps operate over several seconds. A difficulty in this
setting is the amount of data that is accumulated (either in the creation of new
nodes for FART or through the addition of points in the online version of
DBSCAN). DenStream is an algorithm which performs density-based clustering
over an evolving data stream with noise and which could offer a solution to
this problem [18].

Chapter 7

Conclusion

This report proposes an online version of the Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) method that performs better than Fuzzy
Adaptive Resonance Theory (FART) on a simulated radar environment with regard to the
adjusted Rand index (ARI).
The simulations show that reducing the training time for the evaluated
methods leads to only a small drop in the ARI, which is appealing in a setting where
computational cost is of importance. The results also indicate that DBSCAN
can be used to give a good estimate of the number of acting emitters for the
generated radar environment.
The complexities of both these methods are mapped out and FART is shown
to have a lower worst-case complexity than the proposed solution.

Appendix A

Complexities

In this appendix, the derivation for the various complexities can be found. In
all implementations, the data is assumed to be rescaled so as to lie between 0
and 1. This derivation is based on the one by Granger et al. [5]. A distinction
is made between two types of computations:

• $o_1$: the time required to do elementary operations such as addition, comparisons and min/max functions.

• $o_2$: the time required to do advanced operations such as multiplication, division and square root.

In the subsequent computations, the assumption is made that memory is not
a constraint (i.e. we can reuse computations).

A.1 Fuzzy ART


Let $M$ denote the number of features used and $N$ denote the number of
clusters. The number of iterations at each step is seen in table A.1. The total
number of iterations when training a network is thus

$$\text{Computation time} = o_1 (M + N(6M + 1 + 1 + 1) + 4M + 1) + o_2 (N + 4M)$$
$$= o_1 (6NM + 5M + 3N + 1) + o_2 (N + 4M).$$

If the learning parameter β is equal to 1, the weight update costs nothing extra and we can reduce the computation time to:

$$\text{Computation time} = o_1 (M + N(6M + 1 + 1 + 1) + 1) + o_2 N$$
$$= o_1 (6NM + M + 3N + 1) + o_2 N.$$

When only classifying points, without updating the weights, the cost reduces
to:

$$\text{Computation time} = o_1 (6M + 1) N + o_2 N.$$
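The operation counts above can be evaluated for concrete values of $M$ and $N$ with a small helper. The function `fart_cost` is hypothetical and simply adds up the per-step counts in the unexpanded form of each expression; it returns the number of $o_1$ and $o_2$ operations separately rather than a time.

```python
def fart_cost(M, N, beta_is_one=False, classify_only=False):
    """Return (n_o1, n_o2): elementary and advanced operation counts."""
    if classify_only:
        # Only the choice functions are evaluated for each of the N clusters.
        return (6 * M + 1) * N, N
    # Complement coding (M), per-cluster work N(6M + 3), and rho|X| (1).
    o1 = M + N * (6 * M + 3) + 1
    o2 = N
    if not beta_is_one:
        # The weight update adds 4M operations of each kind when beta < 1.
        o1 += 4 * M
        o2 += 4 * M
    return o1, o2

# Example: 5 features and 12 clusters.
general = fart_cost(5, 12)
fast = fart_cost(5, 12, beta_is_one=True)
classify = fart_cost(5, 12, classify_only=True)
```

For these values the general case needs 422 elementary and 32 advanced operations, which matches the expanded form 6NM + 5M + 3N + 1 = 360 + 25 + 36 + 1.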

A.2 DBSCAN
A.2.1 Classifying a Batch
Let $M$ denote the number of features and $P$ be the number of data points.
The worst-case complexity of the method is approximately (derived from table
A.2)

$$\text{Computation time} = P \left( o_1 (5 + 2PM) + o_2 (M + 1) P \right).$$

In a batch of size $N$, the worst-case complexity is in the order of $O(N^2)$.

Step                                 Operations                    Iterations
Complement coding                    M o1                          1
Calculating choice functions T_j     o1(6M + 1) + o2               N
    X ∧ w_j                          2M o1                         N
    |·| ***                          2M o1                         N
    a + b                            1 o1                          N
    a / b                            1 o2                          N
Vigilance test                       o1(C + 1)
    ρ|X|                             o1 *                          1
    a > b                            o1                            N
Finding largest T_j                  o1                            N
Updating weights                     o1(4M) + o2(4M) or 0          1
    β(X ∧ w_j^old)                   2M o2 or 0 **                 1
    (1 − β)                          2M o1 or 0 **                 1
    (1 − β) w_j^old                  2M o2 or 0 **                 1
    a + b (vectors of len 2M)        2M o1                         1

TABLE A.1: *: |X| is always equal to 2M if our input is complement coded. **: If β = 1 then we can skip some steps since the new weights become $w_j^{new} = X \wedge w_j^{old}$. ***: Note that this operation is done twice and since all values lie between 0 and 1, the absolute value operation can be omitted.

A.2.2 Introducing a New Point


If a new point is introduced in accordance with the algorithm described in
section 3.7, the worst-case complexity is

$$\text{Computation time} = o_1 (3 + P(2M + 2)) + o_2 (M + 1) P.$$

Step                             Operations               Iterations per point
label(P) != undefined            o1                       1
Calculate distances              o1(2M) + o2(M + 1)       P
    X_i − Y_i                    M o1                     P
    a^2                          M o2                     P
    Σ_i^M a_i                    M o1                     P
    √a                           o2                       P
Comparing distances              o1                       P
Finding neighborhood             o1                       1
neighborhood.size                o1                       1
neighborhood.size < minPts       o1                       1
maxClusterId + 1                 o1                       1

TABLE A.2: The steps performed for each point in a data set in which each point will form a single cluster.

Step                             Operations               Iterations per point
Calculating distances            o1(2M) + o2(M + 1)       P
dist(p, q) < ε                   o1                       P
neighborhood.size == 0           o1                       1
q.ClusterId == −1                o1                       P (worst-case)
neighborhood.size ≥ minPts       o1                       1
AND                              o1                       1

TABLE A.3: The steps performed for each point in a data set in which each point will form a single cluster.

The computations for each step are outlined in table A.3 in which P is the size
of the data set DS consisting of the points used to do the initial clustering and
any points added to it.

Bibliography

[1] G. W. Stimson, H. D. Griffiths, C. J. Baker, and D. Adamy, Introduction
to airborne radar, Third edition. Scitech Publishing, 2014.
[2] D. C. Schleher, Introduction to electronic warfare. Artech House, 1990.
[3] D. Adamy, Ew 101: A first course in electronic warfare. Artech House, 2001,
vol. 101.
[4] K. Manickchand, “Multiple radar environment emission deinterleaving
and pri prediction”, PhD thesis, University of Cape Town, 2017.
[5] E. Granger, Y. Savaria, P. Lavoie, and M.-A. Cantin, “A comparison of
self-organizing neural networks for fast clustering of radar pulses”, Sig-
nal Processing, vol. 64, no. 3, pp. 249–269, 1998.
[6] R. G. Wiley, Electronic intelligence: The analysis of radar signals, Second
printing. Artech House, 1985.
[7] H. Mardia, “New techniques for the deinterleaving of repetitive sequences”,
in IEE Proceedings F-Radar and Signal Processing, IET, vol. 136, 1989, pp. 149–
154.
[8] D. Milojevic and B. Popovic, “Improved algorithm for the deinterleaving
of radar pulses”, in IEE Proceedings F-Radar and Signal Processing, IET,
vol. 139, 1992, pp. 98–104.
[9] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, et al., “A density-based algo-
rithm for discovering clusters in large spatial databases with noise.”, in
Kdd, vol. 96, 1996, pp. 226–231.
[10] E. Schubert, J. Sander, M. Ester, H. P. Kriegel, and X. Xu, “Dbscan re-
visited, revisited: Why and how you should (still) use dbscan”, ACM
Transactions on Database Systems (TODS), vol. 42, no. 3, p. 19, 2017.
[11] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, “The r*-tree:
An efficient and robust access method for points and rectangles”, in Acm
Sigmod Record, Acm, vol. 19, 1990, pp. 322–331.
[12] J. Friedman, T. Hastie, and R. Tibshirani, The elements of statistical learn-
ing, 10. Springer series in statistics New York, NY, USA: 2001, vol. 1.
[13] S. B. Damelin, Y Gu, D. C. Wunsch, and R. Xu, “Fuzzy adaptive res-
onance theory, diffusion maps and their applications to clustering and
biclustering”, Mathematical Modelling of Natural Phenomena, vol. 10, no. 3,
pp. 206–211, 2015.
[14] A. Ata’a and S. Abdullah, “Deinterleaving of radar signals and prf iden-
tification algorithms”, IET radar, sonar & navigation, vol. 1, no. 5, pp. 340–
347, 2007.
[15] G. A. Carpenter, S. Grossberg, and D. B. Rosen, “Fuzzy art: Fast stable
learning and categorization of analog patterns by an adaptive resonance
system”, Neural networks, vol. 4, no. 6, pp. 759–771, 1991.
[16] W. M. Rand, “Objective criteria for the evaluation of clustering methods”,
Journal of the American Statistical association, vol. 66, no. 336, pp. 846–
850, 1971.
[17] L. Hubert and P. Arabie, “Comparing partitions”, Journal of classification,
vol. 2, no. 1, pp. 193–218, 1985.
[18] F. Cao, M. Ester, W. Qian, and A. Zhou, “Density-based clustering over
an evolving data stream with noise”, in Proceedings of the 2006 SIAM
international conference on data mining, SIAM, 2006, pp. 328–339.
