Defense Intelligence Journal; 13-1&2 (2005), 47-63

Data Mining and Predictive Analytics: Battlespace Awareness for the War on Terrorism
Colleen McCue, Ph.D.
"Information analysis is the brain of homeland security. Used
well, it can guide strategic, timely moves throughout our country
and around the world. Done poorly, even armies of guards and
analysts will be useless." Markle Foundation's Task Force on National Security in the Information Age.1
Steve A. Yetiv, a political scientist at Old Dominion University who specializes in U.S. policy and the Middle East, recently suggested that "the
military victory in Iraq offers significant opportunities for an intelligence
windfall."2 While that may be true, what are we going to do with all of
that information? Recent reports from the federal law enforcement and
intelligence community paint a gloomy portrait of our ability to reliably
manage, analyze, and exploit the vast amounts of information currently inundating the various agencies charged with acquisition and analysis. Failures in connecting the dots have been highlighted repeatedly, from the
popular press to congressional testimony. While emphasis has been placed
on more accurate and reliable methods for gathering information, on privacy concerns relating to the intelligence collected, and on more effective
methods for sharing critical intelligence in a timely fashion, as much if
not more attention must be paid to techniques for effective analysis and
exploitation of these information resources.
Many of these tools already exist. The business community has exploited data mining and predictive analytics (analysis) for several years.
The same tools and techniques that are used to determine credit risk, discover fraud, and identify which consumers are likely to switch cell-phone
providers also can be exploited in the fight against terrorism and the protection of homeland security. With data mining we can perform exhaustive
searches of very large databases using automated methods, searching well
beyond the capacity of human analysts or even a team of analysts. Using
predictive analytics, we can accurately model complex interactions, associations or relationships and then use these models to identify and characterize unknown relationships or make reliable predictions of future events.
Employed in military strategy development and planning, data mining and
predictive analytics can facilitate the attainment of dominant battlespace
awareness.

Data Mining and Predictive Analytics


Data mining and predictive analytics are gaining acceptance in criminal
investigations and public safety.3,4,5 The predictability of violent crime is
the foundation for the behavioral analysis of violent crime. In many ways,
terrorism is violence with a larger agenda. Terrorism and efforts to support
it also encompass other crimes including fraud, smuggling, money laundering, identity theft, and murder, which have been investigated successfully with the use of data mining and predictive analytics. Like many others, we have been exploring the use of data mining and predictive analytics
in crime and intelligence analysis with some very promising preliminary
successes.6
Described as knowledge discovery or sense making tools, data
mining and predictive analytics give us an opportunity to manage and
make sense of the information coming in, with the output being actionable
intelligence products. To paraphrase Sir Francis Bacon, battlespace awareness is power, particularly in the war on terrorism. Moreover, if knowledge
is power, then foreknowledge can be seen as battlespace dominance or
supremacy. Similar to the information overload experienced by combat
pilots, however, the volume challenge of information threatens to overwhelm military analysts and decision makers. The amount of information that must be assessed, described, integrated and managed is increasing exponentially almost daily and most certainly exceeds the capacity of
the human brain. Even a team of trained analysts finds the task daunting,
and might miss larger patterns and trends associated with analysis of the
whole, rather than the parts. These new approaches to analysis and associated knowledge management tools can aid tactical decision making in an
information-saturated environment.
Far from being reserved exclusively for academic think tanks or large
marketing firms, these tools are readily accessible and available in the
PC environment. Advanced training in statistics or artificial intelligence
is not required. Rather, domain expertise is the essential prerequisite.
Operationally defined, domain expertise means that you have a working
knowledge of your adversary,7,8 something that most military planners
and strategists already possess. Domain expertise allows you to review
the analytical products for reliability, accuracy and value. For example,
identifying a reliable association between suicide bombers and religious
extremism would add little value to our ability to combat terrorism. On the
other hand, the ability to accurately characterize, detect, anticipate, and
ultimately prevent subsequent attacks based on a thorough analysis of preincident behavior, planning and surveillance would have tremendous value
in the fight against terrorism and the protection of our homeland security.
Recent innovations in technology have allowed for the deployment
of analytical products, or scoring algorithms, to operational personnel
with no formal training in statistics (e.g., SPSS Clementine Data Mining
Suite).9,10 These models can be used in the field for a variety of functions
including risk assessment, as well as the prediction of future events or behaviors.11 One particular advantage of this approach is that it provides the equivalent of persistent analysis through the exploitation of centralized analytical
resources by operational personnel anywhere in the world with access to
a laptop or PDA and a secure Internet connection. In keeping with recent
praise for Network Centric Warfare (NCW), this approach can provide
timely and reliable analysis to end-users when and where it is needed the
most. While NCW has focused on technology, these recent innovations in
analysis can further enhance our operational capacity in the field by providing an overlay of analysis and intelligence, which ultimately enhances
the technological hardware recently deployed.12
Similar to criminals, terrorists do not respect, in fact they frequently
exploit, jurisdictional and national boundaries. Even within the United
States, information integration across levels of government is limited at
best. This so-called stovepiping has been criticized extensively, because it
significantly compromises our efforts by duplicating resources and efforts,
while limiting access to information resources across domains. Discussion of Fourth-Generation Warfare started well before 9/11;13 however, the
larger implications of this concept became a reality on that day. Fourth-Generation Warfare specifically addresses the likelihood of domestic, civilian targets and casualties: "[t]he distinction between war and peace will
be blurred to the vanishing point. It will be nonlinear, possibly to the point
of having no definable battlefields or fronts. The distinction between civilian and military may disappear." The when, where, and how of warfare is changing, with an increased likelihood that battlefields and fronts
will be domestic, a very powerful concept particularly when viewed from
an information collection and analysis perspective. As the frontline in the
war on terrorism moves into our own communities, the number of players, organizations and data collection methodologies has increased geometrically. Local law enforcement and citizens frequently will be the eyes
and ears in the Fourth-Generation intelligence apparatus, with intelligence
gathering being widely disseminated and the information being collected
by an increasing number of methods and stored in a variety of forms.
Since 9/11, the amount of information being collected, gathered and
compiled by the various local, state and federal agencies has
increased significantly. However, the information management and analysis capacity has not kept pace. An effective strategy in the war on terrorism
will depend on accurate and reliable collection and analysis of information
from a variety of sources and venues. This value-added analysis will be
essential to the effective characterization and anticipation of the next move
by the worldwide terrorist network that has emerged. What would happen, however, if the paradigm shift in information sharing occurred today?
What would we do if all of the agencies currently responding to terrorism
at the local, state, national, and international levels decided to share all of
their information right now? Without some capacity to effectively analyze,
correlate, and interpret all of this associated information, it would not enhance our ability to fight terrorism significantly. Therefore, we also need
to acquire and utilize the skills and tools necessary to effectively analyze
the information that we are trying to share so that we may exploit the fact
that it is being explored in a common analytical environment.

Information Collection
Louis Pasteur observed that "[i]n the field of observation, chance favors only the prepared mind." Multiple assessments of terrorist groups
have highlighted the fact that those intent on committing violent acts are
tenacious and extremely resourceful when it comes to information gathering. Examples of long-term surveillance, detailed operational planning,
and multiple attempts on a common target have been documented, including the attacks on the Khobar Towers,14 the 1993 and 2001 attacks on the
World Trade Center,15 as well as the recent disclosure of casing reports
demonstrating extensive research on and surveillance of major financial
institutions in the United States.16 This planning and patience also offers
multiple opportunities for detection. While connecting the dots has been
cited widely as the answer, connecting the dots and predicting the next step
holds even more value from a strategic perspective.
As the war on terrorism progresses, information gathered should
feed the process. Interviews and debriefings should not be ends in themselves; rather, they can drive the knowledge-acquisition process further. With
knowledge discovery tools, information can serve as a dynamic interface
between the analytical and operational personnel. By making operational
personnel part of the analytical process, the entire cycle of data collection, processing, analysis, and dissemination is greatly enhanced and fitted specifically to
the operational requirements. Again, the tools now exist that will allow
information collected in the United States to inform the process half a
world away at any time, day or night, requiring only a laptop computer and
a model developed by an analyst.
The information collected from detainees or human-intelligence
(HUMINT) assets can further enhance this developed knowledge base and
inform future interviews and actions. For example, information obtained
can further guide the collection process by identifying subtle or common
patterns of deception, or tipping points when information begins to flow
freely. Sophisticated data and text mining software is available in the desktop environment for analysis and for wide and rapid deployment
to the areas where it is needed most, especially the theater of operations.
Secure networking and information deployment associated with NCW
will allow analysts to share information and identify larger patterns and
trends, including those that transcend their operational purview.

Strategic Characterization: Forewarned is Forearmed


In many respects, it is not enough to connect the dots; we need to be able
to anticipate the next move. Human behavior, even extremely violent or
unusual behavior, frequently follows predictable patterns or trends. This
behavior can then be characterized, modeled, classified, and even predicted in some cases.17 In fact, the entire discipline of criminal-investigative analysis, or profiling, is based on this finding. While behavioral
analysis might not be able to identify a specific individual or suspect, it
frequently can provide investigators additional knowledge or insight regarding what type of person might be associated with a particular crime
or series of crimes. Perhaps more importantly, this type of analysis also
can provide some insight regarding what type of behavior might predict or
foreshadow violence.

In our experience, we have been able to use data mining and predictive analytics to identify likely motives, offender characteristics, and
victim-risk factors in violent crime.18,19,20 In many ways, terrorism can be
described as violence with a larger agenda. While the mechanism might
be different, the intended outcome is the same, that is, to achieve behavioral control through intimidation, violence or threats of violence. Suicide bombers might represent the frontline warriors of Fourth-Generation
Warfare. The fact that their surveillance and pre-attack behavior has been
characterized and described further highlights the predictability of human,
even criminal, behavior.21,22 In this array of associated indicators, we want
to be able to identify and give weight to the most valuable predictors so
that we can rapidly identify these individuals and prevent suicide attacks.
The ability to characterize and predict this behavior could afford tremendous tactical as well as strategic value to those fighting in the war on terrorism.
The ability to accurately and reliably predict risk also can be a tremendous asset in deployment decisions. Exploiting predictive analytics in
local policing, we have developed the concept of risk-based deployment.
Using data mining and predictive analytics to analyze historical data has
yielded models that predict when and where incidents are likely to occur.
By identifying the times and locations associated with an increased likelihood of risk for an incident, we can proactively place assets when and
where they are needed, thereby more efficiently utilizing our resources,
and increasing the likelihood of rapid identification and apprehension, or
even deterrence through enhanced presence.
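
As a minimal sketch of the idea behind risk-based deployment, the following Python example ranks combinations of place and time by how often incidents were recorded there in the past. It is an illustration only: the zone labels, hours, and the `rank_risk_cells` function are hypothetical, and a simple frequency count stands in for the predictive models described above, whose details are not given here.

```python
from collections import Counter

def rank_risk_cells(incidents, top_n=3):
    """Rank (location, hour) cells by historical incident count.

    Returns a list of (cell, count, share_of_all_incidents) tuples.
    """
    counts = Counter(incidents)
    total = sum(counts.values())
    # Sort by descending count; break ties by cell name for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(cell, n, n / total) for cell, n in ranked[:top_n]]

# Hypothetical historical incidents: (patrol zone, hour of day).
history = [
    ("zone_a", 22), ("zone_a", 22), ("zone_a", 23),
    ("zone_b", 2), ("zone_a", 22), ("zone_b", 2),
    ("zone_c", 14),
]

top_cells = rank_risk_cells(history, top_n=2)
for cell, count, share in top_cells:
    print(cell, count, f"{share:.0%}")
```

In practice the ranking would feed a deployment schedule or map; a real model would also need to account for reporting bias and changes in behavior over time, which this count ignores.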

The Volume Challenge23


Recent advancements in telecommunications have influenced the way that
many of us do business, and criminals are no exception. Meetings can now
be held with individuals from the United States, South America, and the
Middle East without any of the participants leaving their home or office.
Telephone and Internet conferencing techniques that save time and money
for businesses also afford a greater degree of anonymity to those wishing
to keep their relationships and activities hidden, somewhat limiting the
options for some kinds of physical surveillance. Telephone records, on the
other hand, represent an invaluable source of information, providing additional linking and timeline information regarding specific individuals and
groups. Analysis of telephone data can be extremely tedious; however, it is
one area where data mining and predictive analytics can make a meaningful difference in analytical capacity.

One particular case study24 encompassed a series of conference calls
that were held over a one-month period in 2002. The case originally came
to the attention of the law-enforcement community, because it was associated with a very large, delinquent account. Further examination of the
calls, however, revealed the inclusion of a number of locations throughout
the world associated with international terrorism. Complicating the recovery process, the account had been established with fraudulent information.
While no information pertained to specific individuals involved in these
teleconferences or the true content or nature of these calls, a tremendous
amount of information was gathered from the billing records alone.
By using data-mining techniques, it was possible to characterize the
teleconference call data based on the date, number of participants, and
some common telephone numbers that appeared frequently throughout
the billing invoice. Subsequent analysis using more sophisticated modeling techniques supported classifying the various teleconferences into
three groups based on the number of participants and the date of occurrence. This analysis uncovered teleconference groups of greatly differing
sizes, which possibly related to their member composition or underlying
function. For example, the smaller groups frequently included the same
telephone numbers, perhaps associated with key participants, while the
larger groups extended well beyond what would be considered suitable for
a teleconference and might have been associated with lectures or fundraising. Additional analysis also revealed individual telephone numbers that
frequently were associated with teleconference leader functions. While
many of the telephone numbers could not be traced readily, some could
be linked to individual subscribers, particularly those in the larger groups
that could provide additional insight into our understanding of the general
methods and larger objectives associated with these conferences. Clearly,
data mining was not able to determine the exact nature of the calls or identify the specific participants in this case study. However, by using these
techniques it was possible to reveal a tremendous amount of information
over what was available in the original invoice.
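
The grouping step in this case study can be illustrated with a short Python sketch. The modeling technique actually used is not named here, so plain k-means clustering on the number of participants is substituted as an assumption; the call records, the telephone numbers (drawn from the fictional 555-01xx range), and the `kmeans_1d` function are all hypothetical. The date of each call, which the original analysis also used, is carried in the records but left out of the clustering for brevity.

```python
from collections import Counter

def kmeans_1d(values, k=3, max_iter=50):
    """Group numbers into k clusters with plain k-means; k must be at least 2."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [float(ordered[i * (len(ordered) - 1) // (k - 1)]) for i in range(k)]
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers

# Hypothetical billing records: (day of month, participants, leader number).
calls = [
    (2, 4, "555-0101"), (5, 5, "555-0101"), (9, 3, "555-0102"),
    (11, 18, "555-0103"), (14, 20, "555-0101"), (18, 22, "555-0103"),
    (21, 75, "555-0104"), (25, 80, "555-0104"), (28, 4, "555-0101"),
]

labels, centers = kmeans_1d([participants for _, participants, _ in calls])
# Which number most often appears in the leader role?
top_leader = Counter(leader for _, _, leader in calls).most_common(1)
```

With this invented data the calls separate into small, medium, and very large conferences, and one number stands out as a frequent leader. These are the kinds of leads an analyst would then examine by hand.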

Identity Theft
Identity theft has been with us in various forms for a very long time. Many
unsuspecting consumers have had their financial lives ruined by thieves
who assumed their identities in an effort to commit fraud. After 9/11, it became painfully obvious that the hijackers had easily obtained the false
credentials necessary to move throughout the many systems that require
identification. Unfortunately, these same systems have limited capacity to
validate the accuracy of the credentials required. Highlighting the ease
of obtaining false credentials in some locations, seven of the 19 terrorists
involved in the 9/11 attacks had valid Virginia state ID cards, although
they lived in Maryland hotels. Subsequent investigation uncovered a ring
that had provided hundreds of false ID cards to individuals from Muslim
countries.
Unfortunately, detection of identity theft generally occurs after something bad has happened, either fraud or something far more sinister. Manual searches of these datasets in an effort to proactively identify cases of
identity theft or misuse, however, would be extremely difficult and inefficient given the extremely large amount of information involved. Imagine
searching driver's license records for duplicate social security
numbers or birth dates. Alternatively, automated searches of existing information with data-mining technology could flag invalid, suspicious, or
duplicate social security numbers, detecting possible identity theft before
serious consequences occur. Additional information including the use of
multiple birth dates or addresses, aliases, and fraudulent addresses also
could be identified with data-mining tools. While this approach would
not catch everyone, it might detect enough illegal use of credentials to
make this type of identity theft more difficult and deter criminal use of
valid credentials in the future. It also would limit terrorists' ability to move
throughout the various systems in our country that require a social security
number or other identification and force them further underground without
compromising the privacy of honest, law-abiding citizens.
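
The automated search for duplicate social security numbers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any deployed system: the record layout, the `flag_shared_identifiers` function, and the sample records are hypothetical, and the numbers shown begin with 000 so that they cannot correspond to real ones.

```python
from collections import defaultdict

def flag_shared_identifiers(records):
    """Return SSNs that appear with more than one distinct name or birth date."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["ssn"]].add((rec["name"], rec["birth_date"]))
    # An SSN tied to a single identity is normal, even if it appears many times.
    return {ssn: sorted(people) for ssn, people in seen.items() if len(people) > 1}

# Hypothetical license records with placeholder identifiers.
records = [
    {"ssn": "000-00-0001", "name": "A. Example", "birth_date": "1970-01-01"},
    {"ssn": "000-00-0002", "name": "B. Example", "birth_date": "1982-05-17"},
    {"ssn": "000-00-0001", "name": "C. Example", "birth_date": "1975-09-30"},
    {"ssn": "000-00-0002", "name": "B. Example", "birth_date": "1982-05-17"},
]

flags = flag_shared_identifiers(records)
```

Only the number that is attached to two different identities is flagged; a number that simply recurs for the same person is not. A flag of this kind is a prompt for human review rather than a finding, since clerical errors produce the same pattern.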

Force Protection
Lind and others in their discussion of Fourth-Generation Warfare have observed that "Terrorists use a free society's freedom and openness, its greatest strengths, against it." With the move to transparent government, vast
amounts of information are deployed over municipal Websites, including
information with significant tactical as well as strategic value. Information pertaining to public safety infrastructure, utilities, equipment, personnel strength, deployment, response times and protocols can either be
extrapolated or is even provided directly in some situations. Moreover,
seemingly innocent information alone or in combination with other open
source materials can hold value for operational preparation and terrorist
attacks. For example, the availability of detailed, high-resolution orthophotography images of communities, landmarks and other high-profile
targets has increased rapidly over the past several years. As early as the
spring of 2001, Israel reported increased use of precision air photos by
certain groups, which often are available through commercial outlets or
the Internet.25 In many cases these images are precise, with a high degree
of resolution, and hold great operational value for military, paramilitary,
and terrorist organizations alike. For example, these detailed images can
be used to identify locations appropriate for the placement of car bombs,
cover, concealment, and escape routes. More recently, these images and
detailed infrastructure information collected from open source internet
sites have appeared in casing reports of financial institutions within the
United States, which are believed to have been collected in support of possible terrorist operations.26
Additional information outlining military thinking, tactics and strategy also is available freely over the Internet. Recent reports leave no doubt
that this information is being used by friend as well as foe. For example,
Abu Ubeid Al-Qurashi, identified as one of Bin Laden's closest aides,
specifically cites the principles of Fourth-Generation Warfare, published in
the Marine Corps Gazette, when outlining the al Qaeda combat doctrine in
the al Qaeda biweekly Internet magazine, Al-Ansar.27 Other reports have
indicated that the Website, www.C4I.org, encountered vigorous activity
from Iraqi Internet addresses in the period immediately preceding the most
recent Gulf War.28 Of particular interest to the Iraqis were links about psychological tactics, information warfare and other military issues. In other
words, we know that they are looking at us; that they are learning from our
playbook.
They do not need to hack into our systems; we give it all away. Tom
Clancy noted in his recent keynote address at the Gartner IT Expo that
"[t]here are no secrets in the world. The only hard part is finding the right
person to ask. If you have a phone, you can find out anything you want in
under 60 minutes. With the Internet, it's even faster."29 Many have decried
the availability of information through open sources, and in response to
this developing threat the National Infrastructure Protection Center has
advised localities to survey the information currently available and remove
possibly dangerous information. This change is unlikely to occur in the
immediate future. In fact, it appears that even more information is being
made available. While we can hope that our adversaries will be swamped
by this same tidal wave of information that we are struggling with, tools
are available now that will allow us to use their interest in our information
to our advantage. Again, these data resources are far too large for analysis using purely human resources. Data mining and predictive analytics,
however, can automate the knowledge discovery process. The same tools
used in the E-commerce sector to identify shopping patterns, demographic
information, and geographic preferences also can be used to identify and
highlight interesting or suspicious pieces of information or activity for
an analyst to evaluate further. This type of electronic surveillance detection, in combination and integrated with traditional physical surveillance
detection and threat assessment, offers new opportunities for value-added
analysis that will significantly increase our force-protection capacity. Perhaps more important, however, these tools are available right now.

Anomaly Detection: Defining Normal


Anomaly detection can have significant utility in informing battlespace
awareness, whether it is the war on terrorism, the war on drugs or the
war on crime. In many ways, the investigative training process resembles
case-based reasoning whereby investigators or special agents come to understand a new experience or a new case based on prior accumulated experiences.30 By compiling a veritable internal database of previous cases
and associated outcomes, they can attempt to match each new experience
to their internal repository of known outcomes. If a new case matches a
previous one, then they have an internal scenario that can be used to structure the current investigation. For example: husband calls, reports wife
missing, wife found murdered with signs of overkill -> previous cases
indicate domestic homicide, interview husband. If something new does
not fit into their past experiences in any sort of logical fashion then they
have encountered an anomaly, which requires further inquiry to fit it into
an existing norm or create a new category. In many cases, listening closely
to these internal anomaly detectors can highlight individuals or situations
that bear further scrutiny.
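
The matching process described above can be sketched as a toy case-based reasoning routine in Python. It is a deliberately simple illustration: the case library, the feature phrases, the similarity threshold, and the `best_match` function are hypothetical, and Jaccard overlap between sets of features is assumed here as the similarity measure.

```python
def jaccard(a, b):
    """Share of features two cases have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(new_case, case_library, min_similarity=0.5):
    """Return (label, similarity) for the closest prior case.

    The label is None when nothing in the library is similar enough,
    which marks the new case as an anomaly needing further inquiry.
    """
    label, features = max(case_library.items(),
                          key=lambda item: jaccard(new_case, item[1]))
    score = jaccard(new_case, features)
    return (label if score >= min_similarity else None), score

# Hypothetical library of prior cases, keyed by outcome.
library = {
    "domestic homicide": {"spouse reported victim missing",
                          "victim found murdered", "signs of overkill"},
    "robbery homicide": {"victim found murdered", "property taken",
                         "forced entry"},
}

familiar = {"spouse reported victim missing", "victim found murdered",
            "signs of overkill", "no forced entry"}
unfamiliar = {"vehicle abandoned", "no signs of struggle"}

match_familiar = best_match(familiar, library)
match_unfamiliar = best_match(unfamiliar, library)
```

The first case resembles a known scenario closely enough to suggest an investigative structure, while the second matches nothing in the library and is returned without a label.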
Intentionally trying to identify unusual or suspicious behavior indicative of something far more sinister often resembles looking for the proverbial needle in the haystack. For example, suspicious actions or behavior
suggestive of preoperational planning or surveillance by their very nature
are both infrequent and subtle. Indications of these types of
activities almost always occur only when the potential suspect makes a
mistake, which further highlights their rarity. What would be helpful in
further revealing these activities, the needle in the haystack, therefore,
would be some sort of magnet. In many ways, the technique of anomaly
detection can serve that function.

Many localities have reported suspicious or unusual activity on their
Websites. In particular, IP addresses associated with locations in the Middle East have been noted searching pages related to local infrastructure and
public safety.31 While many agencies and organizations have focused on
intrusion detection, another potential vulnerability includes surveillance
or misuse of information available through the Internet and other open-source venues. The ability to identify, characterize, and monitor unusual
or suspicious Internet activity can provide additional insight regarding our
adversary's interest and possible intentions, thereby increasing our battlespace awareness. Again, it is possible to gather a tremendous amount
of information regarding potential surveillance activity on Websites of interest by simply using a good understanding of normal behavior and
anomaly detection.
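
A minimal sketch of this kind of anomaly detection, assuming that "normal" is summarized by how many requests a typical visitor makes to infrastructure-related pages in a week, might look as follows. The addresses (taken from ranges reserved for documentation), the counts, the threshold, and the `flag_unusual_visitors` function are all hypothetical, and a z-score against a baseline period is only one of many ways to define an anomaly.

```python
from statistics import mean, stdev

def flag_unusual_visitors(baseline, current, threshold=3.0):
    """Flag visitors whose request count is far above the baseline norm.

    baseline and current map a visitor address to a weekly request count.
    Returns {visitor: z_score} for visitors above the threshold.
    """
    counts = list(baseline.values())
    center = mean(counts)
    spread = stdev(counts)
    if spread == 0:
        raise ValueError("baseline has no variation; cannot compute z-scores")
    flagged = {}
    for visitor, count in current.items():
        z = (count - center) / spread
        if z > threshold:
            flagged[visitor] = round(z, 1)
    return flagged

# Hypothetical weekly requests to infrastructure and public-safety pages.
baseline = {"192.0.2.1": 2, "192.0.2.2": 3, "192.0.2.3": 1,
            "192.0.2.4": 2, "192.0.2.5": 2}
current = {"198.51.100.6": 3, "198.51.100.7": 40, "198.51.100.8": 1}

flagged = flag_unusual_visitors(baseline, current)
```

Defining the baseline from a separate, earlier period keeps an unusual visitor from inflating the very norm it is measured against.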

Surveillance Detection and Other Suspicious Situations


People often get caught when they try to behave normally or fly under the
radar. In many cases, however, they do not have a good sense of what normal truly looks like and get caught out of ignorance or because they stand
out even more in their attempts to be inconspicuous. It is often difficult to
completely understand what normal looks like until we characterize it and
then analyze it in some detail. Similarly, language or cultural differences
also can impair an individual's ability to melt into the background noise.
Ignorance of cultural subtleties, nuances, or norms can serve as a spotlight, highlighting unusual or suspicious behavior. Understanding normal
trends and patterns as well as "normal" abnormal trends and patterns can
be a valuable component of public-safety domain expertise, or battlespace
awareness.
While there are no crystal balls in law enforcement and intelligence
analysis, data mining and predictive analytics can help characterize suspicious or unusual behavior so that we can make accurate and reliable predictions regarding future behavior or actions, which is absolutely essential
to meaningful and effective prevention strategies. One area where this has
tremendous potential is surveillance detection. In many ways, preoperational surveillance can be described as a systematic review of a person,
route, facility, or some other item of interest. Data mining and predictive
analytics are uniquely suited to identify and characterize homogeneous
and coordinated behavior embodied in preoperational surveillance.
Preoperational surveillance is generally intended to appear relatively
innocuous to uninformed observers. Frequently, it is only when a larger
pattern of suspicious or presumptive preoperational surveillance activity
has been identified, compiled and characterized that the true nature of the
activity is revealed. At a minimum, the ability to characterize suspicious
behavior provides invaluable guidance for those interested in identifying
the presence of possible preoperational surveillance. Characterization and
analysis of suspicious activity and behavior can reveal or further define the
area of interest to the adversary, or red zone. Operational resources are
almost always in short supply and must be deployed as efficiently as possible. The ability to take a series of suspicious situation reports and identify both temporal and spatial trends and patterns gives us the opportunity
to deploy surveillance detection when and where it is most likely to gather
additional information.
Building on the concept of risk-based deployment developed for use
with police patrol and tactical units,32,33,34 similar data mining strategies
can be used to maximize surveillance detection resources.35,36,37 As with
patrol deployment, the use of data mining exploits the nonrandom or systematic nature of preoperational surveillance activity. The ability to characterize and predict when and where this activity is likely to occur can
guide proactive deployment of surveillance detection resources in a manner that maximizes the likelihood that they will be present when and where
the behavior of interest occurs. Moreover, this strategy also decreases the
likelihood that resources will be deployed when and where they are not
needed, which greatly facilitates judicious resource allocation.
In another case study,38 data mining and predictive analytics were used
to characterize possible surveillance activity associated with a facility of
interest. In this particular example, suspicious situation reports had been
investigated and then compiled, although never analyzed. A quick review
of the frequency of reports over time revealed increasing activity, which
was consistent with growing interest in the facility. Analysis of the activity
by day of week further highlighted the nonrandom nature of this activity; approximately 25 percent of the reported incidents occurred on one
specific day of the week, which was the same day of interest associated
with other, similar facilities. Further review indicated refinement of the
activity, suggesting an increased focus on this particular day of the week
over time, as compared to other days of the week. Additional analysis also
highlighted certain times of the day that were associated with increased
suspicious activity.
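
The day-of-week review in this case study can be approximated with a short Python sketch. The report dates below are invented for illustration and do not reproduce the case data; the `weekday_profile` function is hypothetical. The chi-square statistic is included as one assumed way to ask whether the reports are spread evenly across the week.

```python
from collections import Counter
from datetime import date, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def weekday_profile(report_dates):
    """Return each weekday's share of reports and a chi-square statistic
    comparing the observed counts with an even spread across seven days."""
    counts = Counter(DAYS[d.weekday()] for d in report_dates)
    total = len(report_dates)
    expected = total / 7
    shares = {day: counts.get(day, 0) / total for day in DAYS}
    chi_square = sum((counts.get(day, 0) - expected) ** 2 / expected
                     for day in DAYS)
    return shares, chi_square

# Hypothetical reports: number per weekday (0 = Monday), spread over weeks.
per_day = {0: 2, 1: 10, 2: 2, 3: 3, 4: 2, 5: 3, 6: 2}
start = date(2004, 3, 1)  # a Monday
reports = [start + timedelta(days=weekday + 7 * week)
           for weekday, n in per_day.items() for week in range(n)]

shares, chi_square = weekday_profile(reports)
peak_day = max(shares, key=shares.get)
# With 6 degrees of freedom, a statistic above about 12.59 suggests the
# spread is uneven at the 5 percent level; with counts this small the
# approximation is rough and should be read as a screening aid only.
```

The same tally, repeated by hour of day or by successive reporting periods, gives the kind of simple temporal profile that can guide when surveillance detection resources are deployed.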
Using relatively simple techniques, it was possible to generate operationally actionable output from the analysis. Preparation of a crude facility
map highlighted the relative spatial distribution of the incidents. Additional
value was added to the map by highlighting relative changes in activity
or the pattern of behavior across time. This simple technique also served
to underscore the emerging geographic specificity of the surveillance activity. Analytical products like these can be given directly to operational
personnel for use in the field, significantly increasing the operational value
of the analysis.
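One way to picture the "crude facility map" and its change over time is a coarse grid laid over the facility, with incident counts compared between an earlier and a later period. The sketch below is illustrative only: the coordinates, the two period labels, the 25-unit cell size, and the function names are all assumptions introduced here, not details from the case study.

```python
from collections import Counter

# Hypothetical incidents: (period, x, y) in arbitrary facility-map units.
incidents = [
    ("early", 12.0, 40.0), ("early", 55.0, 8.0),
    ("early", 80.0, 72.0), ("early", 31.0, 66.0),
    ("late", 14.0, 44.0), ("late", 11.0, 47.0),
    ("late", 18.0, 41.0), ("late", 58.0, 9.0),
]

CELL_SIZE = 25.0  # assumed grid-cell size, in the same arbitrary units


def cell_of(x, y, size=CELL_SIZE):
    """Map a coordinate to the (column, row) index of its grid cell."""
    return (int(x // size), int(y // size))


def counts_by_cell(records, period):
    """Count incidents per grid cell for one period."""
    return Counter(cell_of(x, y) for p, x, y in records if p == period)


def change_by_cell(records, before, after):
    """Later-period count minus earlier-period count for every active cell."""
    earlier = counts_by_cell(records, before)
    later = counts_by_cell(records, after)
    return {cell: later[cell] - earlier[cell] for cell in set(earlier) | set(later)}


change = change_by_cell(incidents, "early", "late")
for cell, delta in sorted(change.items(), key=lambda kv: -kv[1]):
    print(f"cell {cell}: {delta:+d}")
```

In this invented data the activity is scattered in the earlier period and then concentrates in a single cell, which is the kind of emerging geographic specificity a shaded map would make visible to personnel in the field.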
As outlined in this case study, the techniques do not need to be extremely sophisticated. Rather, the key is to convey analytical output and
information in a format that is operationally actionable. For example, the
use of risk-based deployment maps39 or schedules provides operationally actionable analytical products that can be deployed directly to personnel in the field. The ability to simultaneously integrate and analyze data
from multiple locations can further enhance our understanding, particularly regarding those groups and organizations with a historical preference
for multiple, simultaneous, geographically distinct attacks. In this case,
far from representing a "volume challenge," analysis of integrated data
resources can be exploited to reveal infrequent events and subtle trends or
patterns. Finally, "when" and "where" frequently can provide insight regarding "why." The identification and characterization of surveillance activity can not only refine surveillance detection planning and deployment,
but also highlight potential vulnerabilities and threats; forewarned is forearmed.
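A risk-based deployment schedule of the kind described above can be sketched as a simple ranking exercise. The example below pools reports from several locations, ranks day and time-block combinations by report count, and assigns a limited number of surveillance detection shifts to the top-ranked slots. Ranking by raw historical counts is a deliberately simple stand-in for a predictive model, and the site names, time blocks, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical reports pooled from several sites: (site, weekday, time block).
reports = [
    ("Site A", "Wed", "06-10"), ("Site A", "Wed", "06-10"),
    ("Site A", "Fri", "14-18"), ("Site B", "Wed", "06-10"),
    ("Site B", "Wed", "10-14"), ("Site B", "Mon", "06-10"),
    ("Site C", "Wed", "06-10"), ("Site C", "Sat", "18-22"),
]


def rank_slots(records):
    """Rank (weekday, time block) slots by pooled report count, highest first.

    Ties are broken alphabetically so the ordering is reproducible.
    """
    counts = Counter((day, block) for _site, day, block in records)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def build_schedule(records, shifts_available):
    """Assign the available shifts to the highest-ranked slots."""
    return [slot for slot, _count in rank_slots(records)[:shifts_available]]


for (day, block), count in rank_slots(reports):
    print(f"{day} {block}: {count} report(s)")

print("Deploy on:", build_schedule(reports, shifts_available=2))
```

In this invented data no single site has more than two reports in any slot, yet the pooled view shows four reports on the same day and time block, which illustrates why integrating data across locations can surface a pattern that each site would miss on its own.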

Conclusions and Future Trends


Given the geometrically increasing amounts of information, connecting
the dots will be possible only with automated systems. Analysts are faced
with a veritable tsunami of information that threatens to sweep them away.
The ability to bring analytical and predictive models directly to operational personnel and into the operational environment holds the promise
of allowing us to maneuver within the decision and execution cycle of our
adversary, thereby gaining dominant battlespace awareness in the war on
terrorism. Again, with the use of data mining and predictive analytics,
information can serve as an interface between analytical and operational
personnel. Tom Clancy advised the audience at a recent security expo to
"seek out the smart people," observing that "[t]he best guys are the ones
who can cross disciplines ... [t]he smartest ones look at other fields and
apply them to their own." This has worked well with NCW. Perhaps the
next step is to bring knowledge-discovery tools and the associated experts
to the frontlines of the war on terrorism. Give the smart people in their
respective fields an opportunity to interact and identify creative methods
for bringing the existing knowledge and technology to the war on terrorism. With analysts and operators working hand in glove, the analytical products can then be tailored specifically to operational needs and requirements. Moreover,
an additional benefit of working together would be enhanced
information collection, which will result in value-added intelligence and
analysis. These new tools would then give us the ability to more fully exploit NCW and afford collaboration and cooperation across a worldwide
venue, transcending traditional operational boundaries.
A complete discussion of dominant battlespace awareness is well
beyond the scope of this paper. Suffice it to say, however, that "forewarned
is forearmed; to be prepared is half the victory" (Miguel de Cervantes).
The importance of acute and well-informed situational awareness cannot
be overstated. In outlining the opportunities and challenges of dominant
battlespace knowledge (DBK), Dr. Stuart E. Johnson has written that "exploiting DBK means that it be applied across the entire cognitive hierarchy from data, to information, knowledge, and finally, understanding."40
In light of this statement, the value of data mining and predictive analytics,
so-called knowledge discovery tools, becomes immediately apparent.
We must do far more than connect the dots; to gain truly dominant battlespace awareness and supremacy in the war on terrorism, we must connect the dots and use them to anticipate the next image. We must exploit
the technology currently available and begin anticipating the next move
to achieve dominant battlespace awareness and victory in the war on
terrorism.

Notes
1. Markle Foundation Task Force on National Security in the Information Age,
Protecting America's Freedom in the Information Age, ISBN 0-9725440-0-3, 07
October 2002, 1.
2. Steve A. Yetiv, quoted in Gauging Iraq's Espionage Possibilities, 1 May 2003,
URL: <www.CBS.com>, 18 November 2004.
3. Colleen McCue, Emily S. Stone, and Teresa P. Gooch, Data Mining and Value-Added Analysis, FBI Law Enforcement Bulletin 72, (2003): 1-6.
4. Colleen McCue, and Colonel Andre Parker, Connecting the Dots: Data Mining and Predictive Analytics in Law Enforcement and Intelligence Analysis, Police Chief 70, (2003): 115-122.

5. Colleen McCue, and Colonel Andre Parker, Web-Based Data Mining and
Predictive Analytics: 24/7 Crime Analysis, Law Enforcement Technology 31,
(2004): 92-99.
6. Colleen McCue, Data Mining and Value-Added Analysis.
7. Colleen McCue, Connecting the Dots.
8. Colleen McCue, Data Mining and Value-Added Analysis.
9. Colleen McCue, Web-Based Data Mining.
10. Colleen McCue. Data Mining and Predictive Analytics: Enhancements to
Network Centric Warfare. Naval Proceedings, in press.
11. Colleen McCue, Web-Based Data Mining.
12. Colleen McCue. Data Mining and Predictive Analytics: Enhancements to
Network Centric Warfare. Naval Proceedings, in press.
13. William S. Lind, Colonel Keith Nightengale, Captain John F. Schmitt, Colonel
Joseph W. Sutton, and Lieutenant Colonel Gary I. Wilson, The Changing
Face of War: Into the Fourth Generation, Marine Corps Gazette, October 1989,
22-26.
14. Lieutenant Colonel Robert L. Creamer, USMC, and Lieutenant Colonel James
C. Seat, USAF, Khobar Towers: The Aftermath and Implications for Commanders, Report chaired by Colonel Richard L. Hamer (Maxwell Air Force Base, AL:
Air War College/Air University, April 1998).
15. The 9/11 Commission Report, ISBN 0-393-32671-3, 22 July 2004.
16. Joint DHS and FBI Advisory, Homeland Security System Increased to ORANGE for Financial Institutions in Specific Geographic Areas, 1 August 2004,
URL: <www.dhs.gov/interweb/assetlibrary/IAIP_AdvisoryOrangeFinancialInst_080104.pdf>, 19 November 2004.
17. John E. Douglas, Ann W. Burgess, Allen G. Burgess, and Robert K. Ressler,
Crime Classification Manual: A Standard System for Investigating and Classifying Violent Crimes (San Francisco: Jossey-Bass, 1997).
18. Colleen McCue, Data Mining and Value-Added Analysis.
19. Colleen McCue, Connecting the Dots.
20. Colleen McCue, and General Paul J. McNulty, Guns, Drugs and Violence:
Breaking the Nexus with Data Mining, Law and Order 51, (2004): 34-36.
21. Lieutenant Colonel Robert L. Creamer, Khobar Towers.
22. Billy Alfano, Terrorism Strikes Russia: Summary of the Attacks from August 24 to September 3, 2004, Overseas Security Advisory Council (OSAC), 13
September 2004.
23. Tabassum Zakaria, CIA Turns to Data Mining, 20 September 2002, URL:
<www.parallaxresearch.com/news/2001/0309/cia_turns_to.html>, 10 April
2003.

24. Colleen McCue, untitled lecture presented to Diplomatic Security Service personnel at US Department of State (ArmorGroup, International Training), Rosslyn,
VA, 14 May 2004, 25 June 2004.
25. Reuven Shapira, We are on the Palestinians Map, Maariv (Tel Aviv), 18
May 2001.
26. Joint DHS and FBI Advisory.
27. Papyrus News, Fourth-Generation Wars: Bin Laden Lieutenant Admits to
September 11 and Explains Al-Qaida's Combat Doctrine, 10 February 2002,
URL: <vstevens.tripod.com/papyrus/2002/pn020211a.htm>, 19 November
2004.
28. Brian McWilliams, Iraq's Crash Course in Cyberwar, Wired News, 22 May
2003, URL: <www.wired.com/news/conflict/0,2100,58901,00.html>, 19 November 2004.
29. Dennis Fisher, Clancy Urges CIOs: Seek Out the Smart People, eWeek.com,
2 June 2003, URL: <www.eweek.com/article2/0,3959,1114813,00.asp>, 19
November 2004.
30. Eoghan Casey, Using case-based reasoning and cognitive apprenticeship to
teach criminal profiling and internet crime investigation, Knowledge Solutions,
URL: <www.corpus-delicti.com/case_based.html>, 19 November 2004.
31. Barton Gellman, Cyber-Attacks by Al Qaeda Feared, Washington Post, 27
June 2002, URL: <http://www.washingtonpost.com/wp-dyn/articles/A50765202June26.html>, 10 April 2003.
32. Colleen McCue, Colonel Andre Parker, General Paul J. McNulty, and Major
David McCoy, Doing More with Less: Data Mining in Police Deployment Decisions, US Department of Justice Violent Crime Newsletter, Spring 2004, 1+.
33. Colleen McCue, Guns, Drugs and Violence.
34. Colleen McCue, and General Paul J. McNulty, Gazing into the Crystal Ball:
Data Mining and Risk-Based Deployment, US Department of Justice Violent
Crime Newsletter, September 2003, 1-2.
35. Colleen McCue, Gazing into the Crystal Ball.
36. US Department of State, International Training Incorporated, Rosslyn, VA
37. SPSS Directions conference
38. Colleen McCue, lecture.
39. Colleen McCue, Gazing into the Crystal Ball.
40. Stuart E. Johnson, DBK: Opportunities and Challenges, in Dominant Battlespace Knowledge, eds. S.E. Johnson and M.C. Libicki (Washington, DC: National Defense University, 1995).

Author Biography
Dr. Colleen McCue joined the Research Triangle Institute as a Senior
Research Scientist in July of 2004. Previously, Dr. McCue served as the
Program Manager of the Crime Analysis Unit at the Richmond, Virginia
Police Department, during which time she also maintained adjunct appointments at the Medical College of Virginia, Virginia Commonwealth
University. She earned her undergraduate degree in psychology from
the University of Illinois at Chicago, her Doctorate in Psychology from
Dartmouth College, and completed a five-year postdoctoral fellowship in
the Department of Pharmacology and Toxicology at the Medical College
of Virginia where she received additional training in pharmacology and
molecular biology. During her tenure with the Richmond Police Department, Dr. McCue pioneered the use of data mining and predictive analytics in crime analysis. Her experience in the applied setting resulted in
the development of risk-based deployment strategies and operationally
actionable analytical products, which have received international attention. Currently, her research involves the application of expert systems in
the analysis of crime and intelligence data, with particular emphasis on
deployment strategies, surveillance detection, threat and vulnerability assessment, automated motive determination, and the behavioral analysis of
violent crime. Dr. McCue publishes her research findings in journals and
book chapters, and has been an invited speaker at national conferences on
data mining, predictive analytics and violence.

This article is posted by permission of the Defense Intelligence Journal,
Charlotte A. M. Gallagher, Publisher/Editor.
To subscribe to the Journal, please contact dijed@jmicfoundation.org
Please consider joining the JMIC Foundation or Alumni Association and
receiving the Journal as one of many benefits of membership. For descriptions of
membership categories, please contact membership@jmicfoundation.org
