Proceedings of SPIE: Target Classification With Data From Multiple Sensors
Oliver Drummond
ABSTRACT
The methods used in the classification of multiple small targets can be very different from the methods commonly used
in traditional pattern recognition. First, there may be characteristics of the features for each target class that can permit
simpler computations than other features. In addition, in classifying targets, the target tracks are updated as new data
becomes available and hence there can be a sequence of feature measurements that are available for the target
classification process. In addition, with multiple targets, the a priori information may be in a form that makes the
classification processing for one target dependent on the classification processing of other targets. These aspects of
target classification that make that processing different from traditional pattern recognition are the concern of this paper.
To limit the length of the paper, the scope is restricted to classification tasks that allow the linear-Gaussian assumption to
be used. Also, the data used in the classification process is restricted to features, i.e., no attributes, and it is assumed
that the tracker does not employ feature-aided tracking. While these assumptions simplify the discussion, the methods
described could be modified to address a broader scope of classification tasks.
Keywords: Multiple target tracking, features, target typing, detection, classification, recognition, identification,
discrimination, automatic target recognition (ATR), multiple models, Bayesian methods, Kalman filter.
1. Introduction
Target tracking and classification problems can be broadly categorized into four generic classes [1], as follows:
1. Sensor tracking of a single (bright) target
2. Tracking of targets that are large
3. Tracking of targets that are medium sized
4. Small target tracking.
These four classes are described in more detail in [2]. Note that the size indicated in this list is in terms of the number of
resolution elements or pixels. The algorithms used in the signal, image, track and classification processing for each of
these problems differ. One of the concerns in tracking small targets is the data association function since
misassociations can corrupt the tracks.
Since each class of tracking and classification problem poses different algorithm development issues, this paper will
concentrate on only one class of tracking, namely, tracking of small targets using either single or multiple target tracking
and classification methods. Multiple target tracking is a relatively new field. The first book dedicated exclusively to
multiple target tracking was published in 1986 [3] and a number of recent books are available [4,5,6]. In addition there
are numerous reports and papers in the open literature, in journals and conference proceedings (too numerous to be
listed here). This paper freely extracts and paraphrases material from some of the author’s prior documents [1,7,8,9].
For this paper, a small target is characterized as one that does not provide enough data for traditional automatic (or
assisted) target recognition (ATR) using a single frame of data [7]. In contrast, a target large enough for ATR typically
extends beyond a diameter of about 15 resolution elements, for example, larger than 10 by 10 pixels. Note that it is not
uncommon to refer to all objects as targets whether they are of interest or not. Small targets of concern in this paper
include point source targets and small-extended targets, including unresolved closely spaced objects.
Signal and Data Processing of Small Targets 2002, Oliver E. Drummond, Editor,
Proceedings of SPIE Vol. 4728 (2002) © 2002 SPIE · 0277-786X/02/$15.00 377
2. Background
3. Preliminaries
Different types of classification problems call for different types of processing. Those different types of classification
problems can be viewed in various ways. This section provides the background for how these different classification
problems are viewed in this paper.
In Equations 1, 2 and 3, only the a priori feature information depends on the track target class, k. The (true) feature
states in Equations 1 and 2 are deterministic and the elements of that state are known for each class. The nuisance
variable, y, represents variables, such as range and aspect angle, on which the feature measurements might depend but
the feature state does not. Equation 3 is formulated so that the feature state mathematical models can depend on the
sensors and the (true) feature state can vary from target to target of the same class.
For the other problem, the feature filter mathematical models are not the same for any two classes and the a priori
feature information may or may not be the same for any two classes. The structure of the filter design or the values for
the filter parameters will not be the same for any two classes. Hence for this classification problem, a unique filter must
be implemented for each class. Using Bayesian probability methods, this feature estimation problem can be processed
using composite estimation methods that use multiple models, based on a hybrid estimation method that combines
probability density functions and discrete probabilities. That is, this can be treated as a non-switching multiple-model
filtering problem [11,12,13].
                          Single Mathematical    Unique Mathematical
                          Model                  Models
Single Target or
Isolated Targets          Type 1                 Type 2
Multiple Targets in
Classification Region     Type 3                 Type 4
Processing methods for each type of classification problem are addressed in the following sections. Two aspects of the
classification processing are addressed for each type of classification problem. One aspect is how to compute the a
posteriori class probabilities and the other is how to compute hard decisions. Where practical, updated a posteriori
target class probabilities should be computed and provided to the user as the threat evolves. However, the user may not
be in a position to use all that information for some targets. Therefore, the classifier should normally provide the user
with one or more hard decisions. By hard decision is meant the selection of a single (best in some sense) target class for
a target track. More than one hard decision may be desired because different decision criteria might lead to different
hard decisions.
3.4. Target Classes for Corrupted Tracks, Missed Targets, and New Target Classes
The target classes should be augmented to account for a variety of target tracking problems. One problem is that
misassociations can degrade tracks to a minor or major extent. A few, occasional misassociations may not degrade a
track very much and might be reflected in the error covariance matrix, depending on the tracking methods used. On the
other hand, a number of misassociations during a short period of time could lead to a track being lost, i.e., no longer
following the original target. This and other conditions can lead to what will be called corrupted tracks for which the
measured features do not closely relate to the true feature state of the original target or, maybe, were never based on the
feature state of a single target. There are a number of approaches to augment the track classes to compute the probability
that a target track is corrupted. For the purpose of the following analysis, it is assumed that there is an added single
member target class for each target track to compute the probability that the track is corrupted. While this corrupted-
track target class is set aside for a target track, in practice it may be necessary to add a number of corrupted-track target
classes for a target track depending on how the corrupted-track class probabilities are computed.
Another problem is that there may be one or more targets in a region that are major modifications of known target
designs or are new targets for which there is no available intelligence information about their characteristics. There are a
4.1. Type 1: Class Probabilities With Single Feature Filter Mathematical Model For All Classes
The classification tasks using the problems of Equations 1, 2, and 3 exhibit the property that, for each problem, the
feature filter mathematical models and parameter values are the same for all target classes. First consider the
classification task for which the feature state vector is the same for all targets of a class but different from class to class,
as displayed in Equations 1 and 2; with that condition the processing is greatly simplified and could proceed without a
feature filter using the following:
Initially, for each apparent target track and each target class, compute:
and the variables are as displayed by Equation 2. When a subsequent feature measurement is provided by the target
tracker for a target track at time n, then recursive processing can be invoked by replacing Equation 5a by:
P(k | Z^n) = p(z_n | k) P(k | Z^{n-1}) / p(z_n | Z^{n-1}),  with  p(z_n | Z^{n-1}) = Σ_k p(z_n | k) P(k | Z^{n-1})   (6)

where

Z^n = [ z_n ; Z^{n-1} ]   (7)

p(z_n | Z^{n-1}, k) = p(z_n | k)   (8)
because the feature state is the same for all targets of a target class and xk is known for all k.
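As a concrete sketch, one recursion of Equation 6 for known per-class feature states can be written as follows. This is a minimal illustration under an assumed Gaussian measurement model with covariance R; all function and variable names are illustrative, not from the paper.

```python
import numpy as np

def update_class_probs(prior, z, class_states, R):
    """One recursion of Eq. 6: P(k | Z^n) from P(k | Z^{n-1}).

    prior        : length-Nk array of P(k | Z^{n-1})
    z            : new feature measurement z_n
    class_states : (Nk, m) array of known feature states x_k, one row per class
    R            : (m, m) feature measurement-noise covariance
    """
    R_inv = np.linalg.inv(R)
    norm = 1.0 / np.sqrt(np.linalg.det(2.0 * np.pi * R))
    lik = np.empty(len(prior))
    for k, x_k in enumerate(class_states):
        d = z - x_k
        lik[k] = norm * np.exp(-0.5 * d @ R_inv @ d)  # p(z_n | k), per Eq. 8
    post = lik * prior                  # numerator of Eq. 6
    return post / post.sum()            # divide by p(z_n | Z^{n-1})
```

Because Equation 8 holds, the same likelihood function is reused at every update; the posterior from one call becomes the prior for the next.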
If, however, the feature state for each target of a class were not the same, but were random variables, then Equation 8
does not apply. A method of handling the condition for which Equation 8 does not apply, such as for Equation 3, is to
use a feature filter for each target class and each target track to estimate the feature state vectors. For generality, let the
(true) feature state vary with time according to a deterministic dynamic state equation and then use:
χ²_{k,n} = (z_n − H x̂_{k,n|n−1} − u)^T [ R_k + H P_{k,n|n−1} H^T ]^{−1} (z_n − H x̂_{k,n|n−1} − u)   (10)

p(z_n | Z^{n−1}, k) = exp(−χ²_{k,n} / 2) / | 2π (R_k + H P_{k,n|n−1} H^T) |^{1/2}   (11)

p(z_n | Z^{n−1}) = Σ_k p(z_n | Z^{n−1}, k) P(k | Z^{n−1})   (12)

where

P_{k,n|n−1} = E[ (x_{k,n} − x̂_{k,n|n−1})(x_{k,n} − x̂_{k,n|n−1})^T | Z^{n−1} ]   (14)

x̂_{k,n|n−1} = feature filter predicted feature state for class k
However, a bank of filters is needed for each target track, one for each target class. Hence full processing is needed for
each class and that could be processor intensive. Figure 1 illustrates the processing for the above classification methods.
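The per-class likelihood computation of Equations 10 and 11, given the predicted state and covariance from each filter in the bank, can be sketched as follows. All names are illustrative, and the measurement-noise covariance R is taken common to all classes for brevity.

```python
import numpy as np

def bank_likelihoods(z, H, R, x_preds, P_preds, u=None):
    """p(z_n | Z^{n-1}, k) for each class k in the filter bank (Eqs. 10-11).

    z       : feature measurement z_n
    H       : feature measurement matrix
    R       : measurement-noise covariance (shared across classes here)
    x_preds : list of predicted states x_{k,n|n-1}, one per class
    P_preds : list of predicted covariances P_{k,n|n-1}, one per class
    u       : known measurement offset (nuisance term); zero if omitted
    """
    if u is None:
        u = np.zeros_like(z)
    lik = np.empty(len(x_preds))
    for k, (x, P) in enumerate(zip(x_preds, P_preds)):
        S = R + H @ P @ H.T                 # innovation covariance
        nu = z - H @ x - u                  # innovation
        chi2 = nu @ np.linalg.solve(S, nu)  # Eq. 10
        lik[k] = np.exp(-0.5 * chi2) / np.sqrt(np.linalg.det(2.0 * np.pi * S))  # Eq. 11
    return lik
```

These likelihoods then feed Equations 12 and 6 to update the class probabilities.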
For the simpler classification problems of Equations 1, 2, and 3, an alternative processing approach can be used to
reduce the processor load. The computations for this alternative approach take advantage of the property that
the feature filter mathematical model is the same for all target classes, hence, only one filter is needed for each
target track rather than a bank of filters for each target track. This alternative approach is illustrated in Figure
2. For simplicity, assume that the feature state for a target for these problems is a vector of time invariant constants but
not necessarily the same for all targets of the same class. To permit that processing simplification, the a priori feature
state information is not used to initiate the feature filter for a target track. Instead, the first two or three feature
measurements are used to start a feature filter and the resulting estimated feature state does not benefit from the known a
priori feature state information. For this processing, let:
Then every time the probabilities of the target classes are desired for a target track, the estimated feature state plus the a
priori feature state information is used to compute those probabilities as follows:
χ²_{k,n} = (x_{k,0} − x̂_n)^T [ P_n + P_{k,0} ]^{−1} (x_{k,0} − x̂_n)   (16)

and hence the processing load is decreased by reducing the bank of filters to a single filter. Note that for classification of
Equation 1 and 2 problems, P_{k,0} = 0 and x_{k,0} = x_k, since the feature state is the same for all targets of the same
class. The P*(k|Z^n) used in Figure 2 is simply the left-hand side of Equation 18 without the normalization obtained by
dividing by p(Z^n), and is used to facilitate displaying the functions conveniently in the block diagram form of Figure 2. Note
that for feature state estimation, batch estimation or a sequential batch filter could be used rather than a recursive filter,
such as a Kalman filter or an information filter. Also, there is an additional savings if the user does not need the target
class probabilities after every feature measurement. That saving is because Equations 16, 17, and 18 only need to be
computed when the target class probabilities are desired for a target.
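The single-filter alternative built on Equation 16 can be sketched as follows. The Gaussian form assumed for the steps the paper numbers as Equations 17 and 18 is an assumption here, as are all names.

```python
import numpy as np

def single_filter_class_probs(x_hat, P_hat, class_means, class_covs, priors):
    """Class probabilities from one feature filter per track (Eq. 16).

    x_hat, P_hat : estimated feature state and covariance from the single
                   filter (started from measurements, not a priori data)
    class_means  : a priori feature-state mean x_{k,0} for each class
    class_covs   : a priori feature-state covariance P_{k,0} for each class
                   (zero for the Equation 1 and 2 problems)
    priors       : a priori class probabilities P(k)
    """
    post = np.empty(len(priors))
    for k in range(len(priors)):
        S = P_hat + class_covs[k]            # combined uncertainty
        d = class_means[k] - x_hat
        chi2 = d @ np.linalg.solve(S, d)     # Eq. 16
        post[k] = priors[k] * np.exp(-0.5 * chi2) / np.sqrt(np.linalg.det(2.0 * np.pi * S))
    return post / post.sum()   # normalized a posteriori class probabilities
```

Only this function, not a bank of filters, needs to run when the class probabilities are requested for a track.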
For most applications, it should be practical to further reduce the processor load by eliminating the processing of the low
probability target classes for a target track. As the processing evolves for a target track, the probability of some target
classes should decrease and others increase. A rule might be established to eliminate the processing (for a target track)
of all target classes whose target class probability is less than a threshold value. A reasonable threshold value might be
computed as a function of the number of target classes and the probability of the most probable class for the target track.
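One possible form of such a rule, with the threshold tied to the class count and the most probable class, is sketched below; the specific scale factor is a hypothetical choice, not from the paper.

```python
def prune_low_prob_classes(probs, rel_factor=1e-3):
    """Drop classes whose probability falls below a threshold computed from
    the number of classes and the probability of the most probable class,
    then renormalize the survivors.

    probs : dict mapping class index -> current class probability for a track
    """
    threshold = rel_factor * max(probs.values()) / len(probs)
    kept = {k: p for k, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {k: p / total for k, p in kept.items()}
```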
4.2. Type 2: Class Probabilities With A Unique Feature Filter Mathematical Model For Each Class
The classification tasks for the problems formulated using Equation 4 are more processor intensive. The additional
processing arises because a bank of feature filters is needed; the bank is needed because the filter mathematical model is
different for each target class. That difference might be in the structure of the mathematical models or in the values of
the parameters of the models or both. For example, the length of the feature state vector might not be the same for all
models and hence the structure of at least some of the models would be different. Alternatively, the length of feature
state vector might be adjusted to be the same for all classes and equal to the longest vector needed, but then some of the
filter matrices would be larger than is needed and the values of their parameters would have to be adjusted accordingly.
As the estimates of the feature state are obtained from the bank of filters, the target class probabilities would be
computed using Equations 9 through 14 with minor modifications. For these conditions, the processing scheme of
Figure 1 applies. In view of the complexity of this approach, it would be important to find methods for reducing
the processor load, especially if there are many target classes. For most applications, it should be practical to
reduce the processor load by eliminating the processing of the low probability target classes for a target track as
discussed at the end of Section 4.1.
hence, the solution is simply to decide on the most probable target class for a target track. This criterion is
sometimes referred to as the DMAP criterion [11] because the criterion is the maximum a posteriori probability (MAP)
for a decision among discrete alternatives (hypotheses), i.e., from a discrete sample space, rather than for an estimate
from a continuous sample space.
4.3.2. Decision based on the minimum Bayes risk
The traditional minimum Bayes risk approach for decisions is to establish the Bayesian cost matrix and minimize the
Bayes risk as follows.
Let

k = index of the (true) target classes
k̂ = decided class for a target track
N_k = number of target classes
Z = all measurements to date for a target track   (21)

c_{k̂,k} = Bayes cost for deciding the class is k̂ given it is k in truth
C = Bayes cost matrix   (22)
Given the values of the elements of the cost matrix, C, the Bayes risk and the optimal solution are

Risk(k̂ | Z) = Σ_k c_{k̂,k} P(k | Z)   (23a)

k̂_opt = Arg Min_{k̂} Risk(k̂ | Z)   (23b)

      = Arg Min_{k̂} Σ_k c_{k̂,k} P(k | Z)   (23c)
Thus, the solution depends on the values of the elements of the cost matrix, C.
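The decision rule of Equation 23 for one track can be sketched directly; the names are illustrative, and C[kd, k] is the cost of deciding class kd when the truth is k.

```python
import numpy as np

def bayes_risk_decision(post, C):
    """Minimum-Bayes-risk hard decision of Eq. 23 for a single target track.

    post : length-Nk array of a posteriori class probabilities P(k | Z)
    C    : Nk x Nk cost matrix, C[kd, k] = cost of deciding kd given truth k
    """
    risk = C @ post               # risk of each candidate decision
    return int(np.argmin(risk))   # optimal decided class
```

With the indifference cost of Section 4.3.3 this reduces to the DMAP choice (the argmax of post), while a lopsided cost matrix can move the decision away from the most probable class.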
4.3.3. Decision based on the Bayesian indifference cost matrix
Consider the special case of what is sometimes called the Bayesian indifference cost for which all errors are penalized
equally:
c_{k̂,k} = 1 − δ_{k̂,k},  i.e., zero cost for a correct decision and unit cost for any error   (24)
The solution using the indifference Bayes cost is the same as the DMAP solution because the minimum over k̂ of
1 − P(k̂|Z) is the same as the maximum over k̂ of P(k̂|Z). Note, however, that by varying the elements of the cost matrix,
C, in Equation 23, it is usually possible to obtain a number of different optimal decisions for the Bayes risk of Equation 23.
and no target class is assigned to more than one target in a hypothesis, and each hypothesis includes all tracks. That
completes the first step.
The second step is to compute the probabilities of each target class for each target track taking the multiple targets into
account. Let
Ω(i, k_i) = { h : hypothesis h assigns target class k_i to target track i }   (28)
then
P_i(k_i | Z^n_MT) = Σ_{h ∈ Ω(i, k_i)} P_h   (29)
and the computation defined by Equation 29 is performed for each target class for each target track. Equation 29
provides the target class a posteriori probabilities taking all targets in a classification region into account. That
concludes the processing of the second step. This two-step procedure could be very processor intensive if there are
many target tracks or many target classes. Processing of Type 1 target classification problems will be less
processor intensive than Type 2. Eliminating the low probability target classes as mentioned in Section 4.1 should
reduce the processing but there may be the need for additional methods to reduce the processing complexity.
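The two-step MTCP computation of Equations 28 and 29 can be illustrated by brute-force hypothesis enumeration for the case of at most one track per class. The hypothesis probabilities are formed here as products of single-track probabilities, an illustrative simplification; all names are assumptions.

```python
from itertools import permutations

def mtcp_marginals(single_probs):
    """Multiple-target class probabilities P_i(k_i | Z) via Eqs. 28-29.

    single_probs[i][k] : single-track probability that track i is class k.
    Each hypothesis assigns every track a distinct class; its unnormalized
    probability is the product of the single-track probabilities.
    """
    n_tracks, n_classes = len(single_probs), len(single_probs[0])
    marg = [[0.0] * n_classes for _ in range(n_tracks)]
    total = 0.0
    for assign in permutations(range(n_classes), n_tracks):
        p_h = 1.0
        for i, k in enumerate(assign):
            p_h *= single_probs[i][k]
        total += p_h
        for i, k in enumerate(assign):   # add P_h to every (i, k_i) it supports,
            marg[i][k] += p_h            # i.e., the sum over Omega(i, k_i) of Eq. 29
    return [[m / total for m in row] for row in marg]
```

The enumeration grows factorially with the number of tracks, which is exactly the processor-load concern raised above; pruning low-probability classes shrinks the set of hypotheses.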
This two-step procedure is similar to an optimal first-frame multiple-target tracking method and to JPDA tracking and
will be referred to as the multiple target class a posteriori probability processing (MTCP) method. When making the
single target computations referred to in Equation 25, the target class a priori probabilities need to be adjusted to take the
multiple target tracks into account. That can be done by applying the MTCP method to the initial single-target class a
priori probabilities. Note that after the above analysis was undertaken, a reportedly related analysis of multiple target
classification by M. Tsai was pointed out; it was published in a limited-distribution proceedings [14] that this author
has not yet had an opportunity to obtain or read.
5.2. Type 3 and 4: Hard Decisions with Multiple Targets in a Classification Region
The procedure for making hard decisions with multiple targets in a classification region is basically the same for Type 3
and Type 4 classification problems after the feature states are estimated and the single-target class probabilities are
computed. There are a variety of criteria that can be used to make hard classification decisions with multiple targets.
Some approaches suggested here employ the target class a posteriori probabilities that take the multiple targets into
account, which can be computed using the MTCP method outlined in the prior section. Other approaches use the
multiple-target class hypotheses of Equation 26 or 27 to advantage.
5.2.1 Hard decision based on a simple multiple target Bayes risk using MTCP.
There are no doubt a number of methods for defining the Bayes cost for decisions involving multiple targets. One
approach is briefly defined as follows. Consider defining the following based on the definitions in Section 4.3.1. Let
Given the values of the elements of the cost matrix, C_MT, the Bayes risk is

Optimal Bayes Risk = Min_κ [ d_MT(κ)^T C_MT p_MT(κ | Z^n_MT) ]   (32)

Let a(Z^n_MT) = C_MT p_MT(κ | Z^n_MT)   (33)

then the cost matrix is an N_c by N_c matrix. This cost matrix leads to M_T simultaneous and independent optimization
problems of the form

k̂_i = Arg Min_{k_i} [ M_T − P_i(k_i | Z^n_MT) ] = Arg Max_{k_i} P_i(k_i | Z^n_MT)   (36)
which is equivalent to finding the most probable target class for each target track. More specifically, find the largest
multiple-target class a posteriori probability for each target track. As in the prior section, this does require that the a
posteriori probabilities be computed using the MTCP method of Section 5.1. Note, however, that with this simple Bayes
risk there is no assurance that there will not be the same target class decision for two or more target tracks. That issue is
addressed in the next section.
5.2.3 Hard decision based on a constrained multiple target Bayes risk.
Consider a modification to the Bayes risk suggested in Equation 32 to ensure unique hard decisions. Let the Bayes risk
of Equation 32 apply except that there is the constraint that at most one target track be assigned to each target class and
each target track be assigned to exactly one target class. Then the optimal Bayes risk solution can be obtained by using a
2-dimensional unique assignment algorithm for an assignment cost matrix with elements equal to b_{i,k_i} as defined in
Equation 34. This assignment requires equality constraints of one for each track and inequality constraints of less than
or equal to one for each target class. An assignment algorithm such as the JVC unique assignment algorithm [15] could
be used for this purpose. In this way, no two or more targets will be assigned to any one target class, as was established
by the assumed a priori target class information.
b_{i,k_i} = M_T − P_i(k_i | Z^n_MT),  or simply  −P_i(k_i | Z^n_MT)   (37)
and the constraints are as described in the prior paragraph. Note that a solution is guaranteed because of the way that the
corrupted track classes are generated for the target tracks, i.e. one unique corrupted-target track class for each track, as
discussed in Section 3.4. The advantage is that the solution is feasible in that no more than one target track is assigned to
any target class as specified by the a priori target class information. The hard decisions of this section require that the
multiple target class a posteriori probabilities be computed using the MTCP method of Section 5.1 and that could be a
processor intensive process if there are many target tracks or many target classes. The hard decision approaches of the
next two sections address that issue.
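A sketch of the constrained decision of this section follows; exhaustive search over unique assignments stands in for the JVC algorithm the paper cites, and the cost uses the simpler form b_{i,k} = −P_i(k | Z) from Equation 37. All names are illustrative.

```python
from itertools import permutations

def constrained_decisions(mtcp_probs):
    """Unique hard decisions: each track gets exactly one class and each
    class at most one track, minimizing the summed cost -P_i(k_i | Z).

    mtcp_probs[i][k] : MTCP a posteriori probability that track i is class k
    """
    n_tracks, n_classes = len(mtcp_probs), len(mtcp_probs[0])
    best, best_cost = None, float("inf")
    for assign in permutations(range(n_classes), n_tracks):
        cost = -sum(mtcp_probs[i][k] for i, k in enumerate(assign))
        if cost < best_cost:
            best, best_cost = assign, cost
    return list(best)
```

In the test case below, both tracks favor class 0, but the uniqueness constraint forces the second-best pairing onto one of them.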
5.2.4 Bayes risk hard decision using multiple target hypotheses.
An appealing approach to hard decisions that take multiple targets into account is to use hypothesis probabilities in the
Bayes risk rather than multiple target class probabilities. With Nh the total number of hypotheses and Ch the Bayes cost
that is an Nh by Nh matrix, consider the following definitions:
P̄ = [ P_{h=1} ; P_{h=2} ; … ; P_{h=N_h} ] ,   d(h) = [ δ_{1,h} ; δ_{2,h} ; … ; δ_{N_h,h} ]   (38)

Optimal Bayes Risk = Min_h [ d(h)^T C_h P̄ ]   (39)

Let f(Z^n_MT) = C_h P̄   (40)

Then ĥ = Arg Min_h f_h(Z^n_MT)   (41)
thus, the solution depends on the values of the elements of the cost matrix, C_h. This is somewhat simpler than the hard
decisions of the prior section because the solution does not depend on the multiple-target class a posteriori probabilities
but does depend on the multiple-target hypothesis probabilities, which are typically processor intensive to compute. This
issue is addressed in the next paragraph.
If the Bayes cost is the indifference cost matrix, then the optimal solution is simply of the form:

ĥ = Arg Min_h [ −log P*_h ] = Arg Min_h [ −Σ_i log P_i(k_{h,i}) ]   (43)
where the quantities of Equations 25 and 26 are substituted into Equation 42 and the unnormalized probabilities of the
hypotheses can be used because they are all divided by the same quantity. This solution can be obtained using a unique
assignment algorithm such as the JVC [15]. The elements of the assignment cost matrix are the (negated) logs of the
single-target a posteriori class probabilities and the size of the cost matrix is M_T by N_k. This assignment requires
equality constraints of one for each target track and inequality constraints of less than or equal to one for each target
class in order to ensure that no target class is decided for more than one target track.
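The hypothesis-level decision of Equations 39 through 41 can be sketched directly; since d(h) selects row h of the cost matrix, the risk of each hypothesis decision is one row-by-vector product. Names are illustrative; with the indifference cost the rule reduces to choosing the most probable hypothesis, which Equation 43 expresses in log form.

```python
def hypothesis_bayes_decision(P_h, C_h):
    """Eqs. 39-41: pick the hypothesis h minimizing f_h = sum_j C_h[h][j] * P_h[j].

    P_h : list of hypothesis probabilities, length Nh
    C_h : Nh x Nh Bayes cost matrix over hypothesis decisions
    """
    f = [sum(C_h[h][j] * P_h[j] for j in range(len(P_h)))  # risk of deciding h
         for h in range(len(P_h))]
    return min(range(len(f)), key=f.__getitem__)
```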
6. Extensions
The multiple target classification problems addressed in Section 5 are restricted to problems with at most one true target
in each target class. The approaches of that section could be modified for use with the possibility of multiple targets
from some or all target classes. These modifications would require establishing the probability of the number of targets
in each target class, plus special methods; for example, for hard decisions a special assignment method would be
needed to accommodate multiple targets per class and their probabilities.
The classification methods discussed assumed that the target tracker used hard tracking decisions and single-frame data
association. Those classification methods could be modified to handle multiple frame approaches that carry forward
more than one track per apparent target, such as in multiple hypotheses tracking and multiple frame assignment using a
moving window. The modification would depend on the type of multiple frame data association used.
The discussions of the prior sections have not addressed the important issue of what data to distribute in a system of
distributed sensor platforms. Different types of data can be distributed depending on the type of classification problem. For
example, feature measurements, estimated feature states, likelihoods, or probabilities might be distributed. In some cases
the target classifications might not be needed after every feature measurement is obtained for a target track. In that case
the tracklet concept [16] might be applied to feature measurements, feature state estimates, feature likelihoods, or
probabilities depending on the specifics of the applications.
In Equations 3 and 4, the nuisance variables y and u were assumed to be known, such as the target aspect angle.
Typically some of the nuisance variables will not be known but are estimated. The classification methods discussed in
this paper would need to be modified to take the estimation errors of the nuisance variables into account; this would
introduce random errors not discussed in the paper because of the complexity that would be added to the discussion.
7. Conclusions
This paper has addressed the classification of small targets for which a sequence of feature measurements would be
provided by one or more sensors. Both single target and multiple target classification methods are described. The
classification methods applicable to tracking a single target or isolated targets, which are less complex than the
methods needed to take multiple targets into account, have been addressed. Classification methods that reduce the processing load
because the filter mathematical models are the same for a number of target classes have been described and apply to both
single and multiple target classification.
Methods for making hard decisions have been described that supplement the a posteriori probabilities of each target
class for each target track that are provided to each user. Furthermore, for multiple targets, if the a posteriori probabilities
of each target class are not needed for all targets, then methods for making hard decisions that are substantially less
processor intensive have been presented. The classification processing of multiple targets is similar to multiple target
tracking, which is also processor intensive. While optimal multiple target tracking is not practical, under certain
conditions optimal or near optimal multiple target classification might be practical for a few target tracks and a few
target classes.
References
1. Drummond, O. E., Multiple Target Tracking Lecture Notes, (a sequence of editions since 1985), Technical Book
Company, 2056 Westwood Blvd., Los Angeles, CA 90025, 2002.
2. Drummond, O. E., Target Tracking, Wiley Encyclopedia of Electrical and Electronics Engineering, Vol. 21, pp. 377-
391, 1999.
3. Blackman, S. S., Multiple Target Tracking With Radar Applications, Denham, MA: Artech House, 1986.
4. Bar-Shalom, Y. and X. R. Li, Estimation and Tracking: Principles, Techniques and Software, Boston: Artech House,
1993.
5. Bar-Shalom, Y. and X. R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, Los Angeles: OPAMP
Tech. Books, 1995
6. Blackman, S. S. and R. F. Popoli, Design and Analysis of Modern Tracking Systems, Norwood, MA: Artech House,
1999.
7. Drummond, O. E., (ed.), "Introduction, Signal and Data Processing of Small Targets 1997," Proc. SPIE, 3163, p. ix,
1997.
8. Drummond, O. E., "Target Tracking With Retrodicted Discrete Probabilities," Signal and Data Processing of Small
Targets 1997, Proc. SPIE Vol. 3163, pp. 249-268, 1997.
9. Drummond, O. E., "On Features and Attributes in Multisensor, Multitarget Tracking," Proceedings of The 2nd
International Conference on Information Fusion (FUSION’99), July 1999, Sunnyvale, CA, pp. 1045-1053.
10. Drummond, O. E., "Feature, Attribute, and Classification Aided Target Tracking," Signal and Data Processing of
Small Targets 2001, Proc. SPIE Vol. 4473, pp. 542-558, 2001.
11. Drummond, O. E., X. Rong Li, Chen He, "Comparison of Various Static Multiple-Model Estimation Algorithms,"
Signal and Data Processing of Small Targets 1998, SPIE Proc. Vol. 3373, 1998, pp. 510-527.
12. Magill, D. T., "Optimal adaptive estimation of sampled stochastic processes," IEEE Trans. Autom. Control, AC-10
(4), pp. 434-439, 1965.
13. Sims, F. L. and D. G. Lainiotis, "Recursive algorithm for the calculation of the adaptive Kalman filter weighting
coefficients," IEEE Trans. Autom. Control, AC-14 (2), pp. 215-217, 1969.
14. Tsai, M., and M. Zimmer, "An Architectural Approach to Achieving Robust Radar Classifier Performance," MSS MD-
SEA Conference, Monterey, CA, 2000.
15. Drummond, O. E., D. A. Castanon, and M. S. Bellovin, "Comparison of 2-D Assignment Algorithms for Sparse,
Rectangular, Floating Point, Cost Matrices," Journal of the SDI Panels on Tracking, Institute for Defense Analyses,
Alexandria, VA, Issue No. 4/1990, 15 Dec. 1990, pp. 4-81 to 4-97.
16. Drummond, O. E., "On Track and Tracklet Fusion Filtering," Signal and Data Processing of Small Targets 2002,
Proc. SPIE Vol. 4728, (to be published) 2002.
Figure 1. Classification Processing with Unique Feature Mathematical Model for Each Class

[Block diagrams: track feature measurements feed the feature filtering stage; the resulting feature state estimate x̂_n, P_n drives the computation of the unnormalized class probabilities p*(k = 1 | Z) through p*(k = N_k | Z), which feed decision processing to produce the decision information. Figure 1 uses a bank of filters, one per class; Figure 2 uses a single feature Kalman filter.]

Figure 2. Classification Processing with Single Feature Mathematical Model for All Classes