Application of Game Theory to Sensor Resource Management
This expository article shows how some concepts in game theory might be useful for applications that must account for adversarial thinking and discusses in detail game theory's application to sensor resource management.
HISTORICAL CONTEXT
Game theory, a mathematical approach to the analysis of situations in which the values of strategic choices depend on the choices made by others, has a long and rich history dating back to the early 1700s. It acquired a firm mathematical footing in 1928, when John von Neumann showed that every two-person zero-sum game has a maximin solution in either pure or mixed strategies.1 In other words, in games in which one player's winnings equal the other player's losses, von Neumann showed that it is rational for each player to choose the strategy that maximizes his minimum payoff. This insight gave rise to the notion of an equilibrium solution, i.e., a pair of strategies, one for each player, in which neither player can improve his result by unilaterally changing strategy.

This seminal work brought a new level of excitement to game theory, and some mathematicians dared to hope that von Neumann's work would do for the field of economics what Newtonian calculus did for physics. Most importantly, it ushered into the field a generation of young researchers who would significantly extend the outer limits of game theory. A few such young minds particularly stand out. One is John Nash, a Princeton mathematician who, in his 1951 Ph.D. thesis,2 extended von Neumann's theory to N-person noncooperative games and established the notion of what is now famously known as Nash equilibrium, which eventually brought him Nobel recognition in 1994. Another is Robert Aumann, who took a departure from this fast-developing field of finite static games in the tradition of von Neumann and Nash and instead studied dynamic games, in which the strategies of the players, and sometimes the games themselves, change along a time parameter t according to some set of dynamic equations that govern such a change. Aumann contributed much to what we know about the theory of dynamic games today, and his work eventually also gained him a Nobel Prize, in 2005. Rufus Isaacs deviated from the field of finite discrete-time games in the tradition of von Neumann, Nash, and Aumann (Fig. 1) and instead studied continuous-time games, in which the way the players' strategies determine the state (or trajectories) of the players depends continuously on the time parameter t according to some partial differential equation. Isaacs worked as an engineer on airplane propellers during World War II, and after the war he joined the mathematics department of the RAND Corporation. There, with warfare strategy as the foremost application in his mind, he developed the theory of such game-theoretic differential equations, a theory now called differential games. Around the same
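Von Neumann's maximin result can be made concrete with a small example. The sketch below is an illustration of the concept, not code from the article: it computes the row player's equalizing mixed strategy and the value of a 2 × 2 zero-sum game, assuming the game has no saddle point (so the optimal strategy is mixed).

```python
def maximin_2x2(a, b, c, d):
    """Maximin mixed strategy for the row player of the zero-sum game
    [[a, b], [c, d]] (entries are the row player's payoffs), assuming
    no saddle point. Playing row 1 with probability p equalizes the
    expected payoff against both columns:
        p*a + (1-p)*c = p*b + (1-p)*d.
    """
    p = (d - c) / (a - b - c + d)  # probability of playing row 1
    v = p * a + (1 - p) * c        # the game's value
    return p, v

# Matching pennies: the rational play is to randomize 50/50,
# and the value of the game is 0.
p, v = maximin_2x2(1, -1, -1, 1)  # p = 0.5, v = 0.0
```

Because the opponent's best response cannot exploit an equalizing strategy, this mix guarantees the row player at least the value v, which is exactly von Neumann's maximin guarantee.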
108 JOHNS HOPKINS APL TECHNICAL DIGEST, VOLUME 31, NUMBER 2 (2012)
Figure 3. Brouwer's fixed-point theorem: A continuous function from a ball (of any dimension) to itself must leave at least one point fixed. In other words, in the case of an N-dimensional ball, D, and a continuous function, f, there exists x such that f(x) = x.

shows how $\hat{s}_{t|t}$, the next estimate of the adversarial intent given by its next strategy, is a combination of the following three terms:

• $\Phi_{t-1}\hat{s}_{t-1|t-1}$: prediction of the next adversarial strategy given strategy transition model $\Phi_t$,

• $K_t(\tilde{s}_t - \Theta_{t-1}\Phi_{t-1}\hat{s}_{t-1|t-1})$: measurement of the current adversarial strategy given the model of how an intelligence analyst would process sensor measurement data, and

• $\gamma_t(P_{t|t-1})\,N(G_t)$: Nash equilibrium tempered by a discount factor, $\gamma_t(P_{t|t-1})$.

Furthermore, the next two equations describe how the uncertainty of the adversarial strategy given by the covariance term $P_{t|t}$ and the Kalman gain $K_t$ grow under this dynamic system. We have empirical verification of the effectiveness of these techniques through extensive Monte Carlo runs, which we reported in Ref. 4, and we plan to solidify this theory of our filtering techniques for dynamic games to present a practical and computationally feasible way to understand adversarial interactions for future applications.

Solving N-Person Games

Unfortunately, there is no known general approach for solving N-person games because Nash's famous result on equilibrium was an existence proof only and not a constructive one. We note that the crux of Nash's proof of the existence of equilibrium solution(s) lies in this fixed-point theorem (shown in Fig. 3) being applied to the following mapping $T: s \mapsto s'$:

$$ s'_i = \frac{s_i + \sum_{\alpha} \varphi_{i\alpha}(s)\,\pi_{i\alpha}}{1 + \sum_{\alpha} \varphi_{i\alpha}(s)} \,, \qquad (8) $$

where $\varphi_{i\alpha}(s) = \max(0,\, p_i(s;\pi_{i\alpha}) - p_i(s))$; $s$ is an n-tuple of mixed strategies; $\pi_{i\alpha}$ is player $i$'s $\alpha$th pure strategy; $p_i(s)$ is the payoff of the strategy $s$ to player $i$; and $p_i(s;\pi_{i\alpha})$ is the payoff of the strategy to player $i$ if he changes to his $\alpha$th pure strategy.

Figure 4. Flow chart for computing Nash equilibria for N-person games. Mc, continuous payoff map; Md, discretized payoff map.

By investigating the delicate arguments that Nash used to convert these fixed points into his famous Nash equilibrium solutions, we can use these arguments to construct an iterative method to find approximate solutions of N-person games. One possible approach we have in mind is to first start from a set of sample points in the space of strategies and compute

$$ C(s) = \frac{\lVert T(s) \rVert}{\lVert s \rVert} \,, \qquad (9) $$

where $C(s)$ measures the degree to which a strategy $s$ is changed by the map $T: s \mapsto s'$. We can then look for strategies $s$ where $C(s)$ is close to 1 as possible initial search points from which to start such an iterative search for equilibrium points. Furthermore, we have incorporated such techniques into a recent work
by Chen and Deng,5 which has to date the best computational complexity result on the number of players and strategies for computing fixed points of direction-preserving maps (direction-preserving maps are discrete analogues of continuous maps; see Fig. 4). We have developed a Java program that efficiently implements these new techniques (see Table 1 for the current computational time on a single dual-core PC running at 2 GHz given different numbers of players and strategies with various algorithmic complexities) and plan to further this technique for future applications in order to adjudicate resources among N sensors and systems of sensors.

Table 1. Game-solver computation times for different numbers of players and strategies.

Dimensions   Size                 Complexity   Previous solve time (s)   New solve time (s)
3            5 × 6 × 5            36           0.015                     0.000
5            6 × 6 × 6 × 6 × 6    1,296        2.532                     0.406
3            200 × 200 × 200      40,400       26.922                    1.109

The advantage of using the Nash equilibrium solution is that it is an inherently safe solution for each actor (sensors or players) because it gives a solution that maximizes the minimum utility of using each sensor given the current conditions and is robust against future changes in conditions. Moreover, whenever actors can trust and establish effective communication with each other, it allows us to use the theory of bargaining, which is well understood in the game theory community, to model the cooperation and coordination among sensors to achieve a Pareto-optimal solution. The optimal solution is not always achievable in noncooperative games, which we will use when the sensor nodes cannot trust each other as readily.

APPLICATIONS FOR SENSOR RESOURCE MANAGEMENT

Figure 5 shows how a possible approach could work. There are three levels at which game theory is being applied. First, there is a local two-person game that is defined by a set of sensing strategies for each sensor and the adversary who is aware of being sensed and thus tries to elude such sensing (which is being implemented
Figure 5. Game theory at three levels (global, tactical, and local). E/O, electro-optical; HRR, high-range resolution radar; MTI, moving
target indicator.
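The mapping T of Eq. 8 and the screening statistic C of Eq. 9 can be sketched concretely for a two-player game. The code below is a minimal illustration only: the payoff matrices are hypothetical, and the article's actual solver is the Java program described above.

```python
def nash_map(A, B, x, y):
    """One application of the map T: s -> s' of Eq. 8 (componentwise form)
    to the mixed-strategy pair s = (x, y) of a two-player game with payoff
    matrices A (row player) and B (column player)."""
    m, n = len(x), len(y)
    # Expected payoff of each pure strategy against the opponent's mix.
    row_pay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    pa = sum(x[i] * row_pay[i] for i in range(m))  # p_1(s)
    pb = sum(y[j] * col_pay[j] for j in range(n))  # p_2(s)
    # phi_{i,alpha}(s) = max(0, p_i(s; pi_{i,alpha}) - p_i(s)):
    # the gain from unilaterally switching to the alpha-th pure strategy.
    phi_x = [max(0.0, row_pay[i] - pa) for i in range(m)]
    phi_y = [max(0.0, col_pay[j] - pb) for j in range(n)]
    x2 = [(x[i] + phi_x[i]) / (1.0 + sum(phi_x)) for i in range(m)]
    y2 = [(y[j] + phi_y[j]) / (1.0 + sum(phi_y)) for j in range(n)]
    return x2, y2

def C(A, B, x, y):
    """C(s) = ||T(s)|| / ||s|| (Eq. 9), here with the Euclidean norm."""
    x2, y2 = nash_map(A, B, x, y)
    norm = lambda v: sum(t * t for t in v) ** 0.5
    return norm(x2 + y2) / norm(x + y)

# Matching pennies: the uniform mix is the Nash equilibrium, so it is a
# fixed point of T, and C(s) = 1 there.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
eq = [0.5, 0.5]
```

At a Nash equilibrium no unilateral switch is profitable, every phi term vanishes, and the map leaves the strategy pair unchanged; away from equilibrium, probability mass shifts toward the profitable pure strategies, which is what makes C(s) a plausible screen for initial search points.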
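The three-term filtering update described earlier (prediction through the strategy-transition model, an analyst-processed measurement, and a discounted Nash-equilibrium term) can be sketched as a scalar Kalman-style recursion. This is a minimal illustration under assumed linear-Gaussian forms: the scalar parameters Phi, Theta, Q, R, and gamma are hypothetical stand-ins for the article's transition model, analyst model, noise covariances, and discount factor, not values from the article.

```python
def strategy_filter_step(s_prev, P_prev, z, nash_term,
                         Phi=1.0, Theta=1.0, Q=0.1, R=0.5, gamma=0.2):
    """One predict/update step of a scalar Kalman-style filter on an
    adversarial-strategy estimate, augmented by a discounted Nash term.
    All parameter values here are illustrative assumptions."""
    # Predict: propagate the previous estimate through the transition model.
    s_pred = Phi * s_prev
    P_pred = Phi * P_prev * Phi + Q
    # Update: Kalman gain applied to the analyst-processed measurement z,
    # plus the Nash-equilibrium term tempered by the discount gamma.
    K = P_pred * Theta / (Theta * P_pred * Theta + R)
    s_new = s_pred + K * (z - Theta * s_pred) + gamma * nash_term
    P_new = (1.0 - K * Theta) * P_pred
    return s_new, P_new

# One step from an uninformative prior, with a hypothetical measurement
# and Nash term.
s_hat, P = strategy_filter_step(0.0, 1.0, z=1.0, nash_term=0.5)
```

The estimate blends where the transition model says the adversary is heading, what the sensors actually observed, and where a rational adversary should be, which mirrors the three-term combination in the text.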
each trial, we conduct 100 time steps where a dynamic sensor decision is made at each step. In the simulation, we try three scenarios, each consisting of a different composition of units (offensive, deceptive, and defensive), and we run 50 Monte Carlo trials for each scenario. In each scenario, the window size for sensor action is set at 10, and the decision threshold is set differently in each test. For example, Fig. 7 shows the results of a typical trial with threshold equal to 40%. Figure 7a shows that approximately 36% of the time the sensor is off (mode 0) and the rest of the time the sensor is on (mode 1 for ground moving target indicator, mode 2 for HRR on unit 1, and mode 3 for HRR on unit 2). Figure 7b shows that in this trial, the enemy's strategy changed from 1 (offensive) to 2 (defensive), then back to 1, then later to 3 (deceptive), and then back to 1. Figure 7b also shows that approximately 44% of the time, the most likely strategy of our assessment is the true one. [Note: in Figs. 7a and 7b, the y values (sensor mode and strategies, respectively) take on only integral values.]

In each test, we also compare the game solver with a heuristic algorithm in which the sensor action/no-action decision is assigned on the basis of a prespecified probability (the heuristic solver determines when to or when not to use the sensor stochastically with the probability at x%). We have found that, in general, the performance of the heuristic solver is significantly worse than the performance of the game solver. This is understandable because the enemy strategy adapts to our sensor actions and changes accordingly, and thus it is much more difficult to assess the enemy's strategy without an interactive game-theoretic strategy analyzer. Figure 8 shows the overall performance comparison between the heuristic approach and the game solver approach. In this test, we set the size of the time window for the enemy strategy policy at 10, and the decision threshold for change is 80% (i.e., our sensor will need to be on greater than 80% of
Figure 8. Effect of game theory on probability of correct classification (Pcc) and probability of correct decision (Pcd). (a) Offensive enemy; performance comparison, average of all cases. (b) Defensive enemy; performance comparison, average of all cases.
the time in the preceding time window of size 10 before the enemy will potentially change strategy). Note that in Fig. 8, we display both the mean and the 1-standard-deviation interval for both the Pcd and the Pcc. The x axis represents the probability of sensor action for the heuristic approach. For example, x% represents that for x% of the time, without the game-theoretic reasoning, the sensor is being used. The corresponding y values represent the corresponding Pcc and Pcd of such action. The horizontal lines represent the Pcc and Pcd that are achieved when the game-theoretic approach is incorporated. Figure 8a represents the case in which the adversary is of the offensive type; Figure 8b represents the case in which the adversary is of the defensive type. For the offensive adversary, the game-theoretic approach outperformed in most cases. For the defensive adversary, the game-theoretic approach seems to outperform any heuristic method.

CONCLUSIONS

We have shown that game-theoretic reasoning can be used in the sensor resource-management problem to help identify enemy intent as the Blue force interacts and reacts against the strategies of the Red force. We have laid down a solid mathematical framework for our approach and built a Monte Carlo simulation environment to verify its effectiveness. Our approach uses game theory (two-person, N-person, cooperative/noncooperative, and dynamic) at different hierarchies of sensor […] set of sensing capabilities (location, geometry, modality, time availabilities, etc.) whose respective effectivenesses are numerically captured in a payoff function. As these sensors (players) in the sensor network come into the vicinity of each other (spatially, temporally, or both), they form either a noncooperative game (when the communication between sensors is minimal or nonexistent) or a cooperative game (when sensors can communicate effectively enough to enter into a binding agreement). The Nash solution(s) of such games provides each sensor a strategy (either pure or mixed) that is most beneficial to itself and to its neighbor, and this strategy can then be translated into its most optimal sensing capability, providing timely information to the tactical users of the sensor network.

Beyond this proof of concept, we believe game theory has the potential to contribute to a myriad of current and future challenges. It is becoming increasingly important to estimate and understand the intents and the overall strategies of the adversary at every level in this post-September 11 era. Game theory, which proved to be quite useful during the Cold War era, now has even greater potential in a wide range of applications being explored by APL, from psychological operations in Baghdad to cyber security here at home. We believe our approach, which overcomes several stumbling blocks (the lack of an efficient game solver, the lack of online techniques for dynamic games, and the lack of a general approach to solve N-person games, to name a few), is a step in the right direction and will allow game theory to make a game-changing difference in various arenas of national security.

REFERENCES

1. Von Neumann, J., and Morgenstern, O., Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ (1944).
2. Nash, J. F., Non-cooperative Games, Ph.D. thesis, Princeton University […].
3. […]
4. […] Based Approach to Higher-Level Fusion with Application to Sensor Resource Management," in Proc. National Symp. on Sensor and Data Fusion, Washington, DC (2006).
5. Chen, X., and Deng, X., "Matching Algorithmic Bounds for Finding a Brouwer Fixed Point," J. ACM 55(3), 13:1–13:26 (2008).
6. Chin, S., Laprise, S., and Chang, K. C., "Filtering Techniques for Dynamic Games and their Application to Sensor Resource Management (Part I)," in Proc. National Symp. on Sensor and Data Fusion, Washington, DC (2008).
The Author
Sang "Peter" Chin is currently the Branch Chief Scientist of the Cyberspace Technologies Branch Office of the Asymmetric Operations Department, where he is conducting research in the areas of compressive sensing, data fusion, game theory, cyber security, cognitive radio, and graph theory. His e-mail address is sang.chin@jhuapl.edu.
The Johns Hopkins APL Technical Digest can be accessed electronically at www.jhuapl.edu/techdigest.