Computational Red Teaming: Risk Analytics of Big-Data-to-Decisions
Intelligent Systems
Hussein A. Abbass
Hussein A. Abbass
School of Engineering and Information Technology
University of New South Wales Australia
Canberra, ACT, Australia
Preface

This book is about elite types of thought processes and architectures for big data and
modeling that can enable smart and real-time decisions. Today’s world is abundant
with data and models; many new problems are formulated and solved every day;
many artificial-intelligence, mathematical, and statistical models exist, but there
is a lack of scholarly work to demonstrate how to bring these data, models, and
opportunities together to produce value for organizations. This book does exactly
that and is written in a style designed to bridge management and computational
scientists.
This is a book about Computational Red Teaming (CRT): a computational
machine that can shadow the operations of any system. The Shadow CRT Machine
can think together with, or on behalf of, the system by asking “what–if” questions,
assessing threats and risks, challenging the system, environment, and competitors,
and using its well-engineered predictive models and computational thinking tools to
make the right decision at the right time.
Red Teaming (RT) is traditionally a decision-aiding art used by the military to
role play an adversary, play the devil’s advocate against one’s own concepts, plans,
strategies, or systems to “test and evaluate” them to improve decision making. This
book has been written to distill general principles from RT, and generalize and
transform RT, the art, into CRT, the science. The discussion will depart from the
military context to demonstrate the utility and applicability of CRT to individuals
and organizations. CRT transforms the classical “test-and-evaluation” process to a
continuous and proactive “test-and-redesign” process.
CRT means systemic and scientific RT. The word “computational” emphasizes
the necessity for systemic and computable steps that can be executed by humans
and computers alike, and allows for an evidence-based decision-making process
that can be traced to causes. Many tools discussed in this book can be employed
by using pencil and paper, and can equally be scaled up to big data and big models
that exceed human cognitive processing and classical computer abilities. With the
advances that have been made in fields such as computational intelligence, data
analytics, optimization, simulation, systems thinking, and computational sciences,
today, we have the tools to implement CRT in silico.
Analytics is the science of transforming data into decisions. CRT uses risk
analytics, where risk is the focal point of the decision-making process, and challenge
analytics, where actions and counteractions are designed just across the system
performance boundary, to test and redesign the right decisions for an organization.
CRT creates opportunities for individuals, organizations, and governments by
grounding RT in system and decision sciences, and by identifying the architectures
required to transform data into decisions.
Risk analytics and challenge analytics, jointly, create the CRT world of this
book. The part of the world that treats risk analytics examines what risk is, and
demonstrates how evidence-based decisions must always be driven by risk thinking.
The part of the world treating challenge analytics structures the concept of what a
challenge is, discusses how to systematically and autonomously design and discover
challenges, and how to challenge an individual, organization, or even a computer
algorithm.
Over six chapters, CRT will be presented. Chapter 1 brings the reader inside
the classical world of RT. It explains the philosophy of this art, and presents a
story to demonstrate that the art of RT can benefit each individual, not only large
organizations. The steps for implementing an RT exercise are explained, and the
characteristics of a successful RT exercise and the ethics of RT are discussed.
The book then sweeps into the two building blocks of risk analytics and challenge
analytics that form the scientific principles for CRT, the science. Chapter 2 uses
a systems approach to establish the basis for risk thinking and challenge design.
Materials in the chapter cross the boundaries of uncertainty and risk, intentional
and deliberate actions, and deliberate challenges, linking the systems approach with
the skills and competencies needed to shape and influence performance.
Chapter 3 presents the big-data-to-decisions CRT. The chapter introduces and
brings together the architectures and building blocks used to design and develop
the computational environment that supports CRT. This chapter presents a gentle
introduction to experimentation, optimization, simulation, data mining, and big data
before presenting how these technologies need to blend to offer CRT architectures.
The CRT science relies on efficient tools to understand the future, and allows an
effective understanding of how to analyze “messy spaces,” as well as discover the
right methods to deconstruct complex organizations and the intermingled physical,
cyber, cognitive, and social domains (PC2SD). Beginning by offering scenarios
to prompt thoughts about the future and concluding with control mechanisms for
networks and generation of effects, Chap. 4 complements the computational tools
presented in Chap. 3 with the necessary system-thinking ingredients to transform
computational models into effective strategic tools. This chapter discusses planning
scenarios, and the complexity arising from the interaction of effects in the PC2SD.
It presents two models to manage this complexity: a model to transform complex
organizations into simple building blocks for analysis, and a model discussing the
operations required to analyze and generate effects in complex networked systems,
that form the basis for a thinking model suitable to design and form cyber-security
operations and complex social-engineering strategies.
Acronyms
IG Imitation game
IQ Intelligence quotient
ISAAC Irreducible semi-autonomous adaptive combat
ISO International Organization for Standardization
IT Information technology
JANUS Not an acronym
JDL Joint directors of laboratories
jSWAT Joint seminar wargame adjudication tool
KDD Knowledge discovery in databases
LMDT Linear machine decision trees
LSL Lanchester square law
M2T Model to think
MANA Map aware nonuniform automata
MAP Military Appreciation Process
ModSAF Modular semi-automated forces
NID Network-intrusion detection
NSGA Non-dominated sorting genetic algorithm
OneSAF One Semi-Automated Forces
OPFOR Opposing Force
OR Operations Research
PAX Plausible agents matrix
PAX3D PAX in three dimensions
PESTE Political, environmental, social, technological, and economic
QUEST Quick, Unbiased, Efficient, Statistical Tree
RAA Risk analytics architecture
R&D Research and development
RT Red teaming
RT-A RT auditor
RT-C RT communicator
RT-D RT designer
RT-Doc RT documenter
RT-LC RT legal councilor
RT-O RT observer
RT-S RT stakeholder
RT-T RT thinker
RT-Tech RT technician
SC2PD Social, cognitive, cyber and physical domains
SDA Sense, decide, act
SLA Sense, learn, act
SLIQ Supervised learning in quest
SOA Service-oriented architecture
SPRINT Scalable classifier for data mining
Chapter 1
The Art of Red Teaming
The commander must work in a medium which his eyes cannot see, which
his best deductive powers cannot always fathom; and with which, because of
constant changes, he can rarely become familiar.
Carl von Clausewitz (1780–1831) [49]
Abstract Red Teaming (RT) has been considered the art of ethical attacks. In RT,
an organization attempts to role play an attack on itself to evaluate the resilience of
its assets, concepts, plans, and even organizational culture. While historically, RT
has been considered a tool by the military to evaluate its own plans, this chapter will
remove RT from the military context and take steps to generalize it as an art before
discussing it in later chapters as a science. This chapter will first introduce the basic
concept of RT, discuss the characteristics of a successful red team, and present a
set of systemic steps to design a RT exercise. The topic necessitates
a detailed discussion on the ethics of RT, including the ethical issues to consider
when planning the budget and financial commitments of the exercise. To lay the
foundation for transforming RT into the computational world, this chapter concludes
with an explanation of why RT exercises cannot be fully automated, followed by a
discussion on how RT contributes to the field of artificial intelligence.
John has an interview for his dream job. He has spent his life dreaming of becoming
a branch manager in a bank. Finally, his dream is close to becoming a reality. He
does not want to risk making a mistake during the interview. However, this is his
first interview. He does not know the questions he will be asked, or whether he is
ready for a surprise question.
Martin is John’s best friend. John talked to Martin about his feelings. Martin
suggested a great idea: “How about we do a mockup interview together: I will act
as the interview panel you will face tomorrow; you try to think on your feet and
answer my questions.” John liked the idea.
John said to Martin, “If you really want this exercise to be effective, you need
to ask me difficult questions. Ask me questions you know I may not know how to
answer. Do not be nice to me: the harder you are on me today, the better prepared
I will be for tomorrow. Push me beyond my limit, and break me up today so that I
can stand strong tomorrow.”
Martin began asking John questions, some of them John knew would never be
asked in his interview. Martin knew very little about the job of a branch manager in
a bank. His knowledge and imagination of the questions he should ask John were
limited by his experience. He had never been in a bank environment, and so did not
know the issues a branch manager faces.
Martin suggested to John that they invite their friend Amy. She worked as a
branch manager in a bank. Therefore, she must know what it is like to be a branch
manager and the questions to ask. John liked the idea. He invited Amy to join them.
John was torn apart by Amy and Martin’s questions. The questions from Amy
were spot on, and John was surprised by the diversity of challenges a branch
manager in a bank faces. The questions from Martin focused on general personality
and management skills that are vital for the job. They provided John with a set of
challenges he did not anticipate. Amy’s questions triggered questions in Martin’s
mind, and vice versa. At the end of the exercise, John thanked Martin and Amy,
while sweating.
The following day, John went to the interview. At the end of the interview,
he was offered the job. Back at the coffee shop, while celebrating with Amy and
Martin, they both asked John in one voice, “How many questions did we ask that
the interview panel asked as well?” John smiled and replied calmly, “None.”
Amy and Martin were sad. Amy asked, “So we did not help you much! We have
pretty much wasted your time!” John smiled and said, “On the contrary, without
the mockup interview, I would not have secured this job.” Amy and Martin asked
simultaneously, “How so?” John replied, “After the interview, I realized that the
value of the mockup interview we did together was not in predicting the questions
the panel would ask me… It actually prepared me for life. It prepared me to
think on my feet, to manage surprise questions and focus on how to answer them,
rather than being shocked by them. It showed me how to link what I know and
my comfort zone with the unknown questions that I could not anticipate.” John
continued, “Thank you Amy and Martin for being a great red team!”
1.2 Red Teaming
Today’s world is a great deal more complex than the days of Sun Tzu. During those
old days, the black and white view of the universe was clear: who is the enemy
and who is not. Today, political, environmental, social, technological, and economic
(PESTE) issues are intermingled. Two countries can form a political coalition, while
they compete economically. The country that poses the greatest political threat
can also be the greatest economic supporter. Who is the enemy? Perhaps this is
a question that some people are still able to answer, but the world is not divided
between “us and them”: along the way lie many players who are a critical part of
the game. Therefore, we need to generalize Sun Tzu’s statement to one that is more
appropriate to today’s complex environment.
Know your enemy, your friends, yourself, and the environment, but in so
doing do not forget to know how to know.
The above statement emphasizes the need to “know how to know.” If we have the
tools in place to know how to know, and use these tools appropriately, we will create
knowledge. Knowledge is power, especially in the age in which a knowledge-based
society is dominant. This book is about an evidence-based approach to know how
to know.
In addition to knowing how to know, we need to decide what we need to know
and about whom. From a competition perspective, possibly the four categories
of entities we need to know well are ourselves, our enemies (better called
competitors, to leave the military sphere), our friends, and the environment. We
need to know ourselves to understand our own strengths and weaknesses; therefore,
we know when to use our strengths and when our weaknesses will expose us to
vulnerabilities. We need to know what we know and what we do not know. Without
such knowledge, we have no way to assess ourselves to be able to understand what
we are capable of and where our blind spots might be hiding.
1.2.1 Modeling
The word “modeling” reflects the process whereby a situation that might be too
complex is transformed into a representation (such as diagrams) that focuses
on important information and relationships in that situation, while ignoring less
important information and relationships. By emphasizing modeling, we emphasize
the thinking process that is required for a RT exercise, where information is
collected, filtered, and mapped into a form that is simple for people to comprehend.
A model need not be mathematical or built for a specialized use. A model can be
a simple diagram drawn on a wall that connects the players and summarizes their
relationships to each other, or it can be a script given to actors in a movie, or in a live
experiment, that describes their role within the artificial and synthetic world.
The concept of risk is central to any RT exercise. The red team is formed to
discover blind spots that can impact blue’s objectives. This is primarily the concept
of risk, which is defined by the International Standard Organization (ISO) as the
effect of uncertainty on objectives [27]. Therefore, regardless of the objective
of a RT exercise, the fundamental driver of the exercise is to understand how
unknowns impact objectives; this is risk. We will adopt the word “impact” rather
than “effect/affect” because the latter is reserved for a different purpose later on in
the book.
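The ISO-style definition above, risk as the impact of unknowns on objectives, can be given a minimal numerical sketch. The function name, event list, probabilities, and impacts below are invented for illustration; they are a sketch of the idea, not a method from this book.

```python
# Minimal sketch (invented example): risk as the effect of uncertainty on
# objectives, approximated here as the expected impact of uncertain events.

def risk_exposure(events):
    """events: list of (probability, impact_on_objective) pairs.

    A negative impact harms the objective; a positive one helps it,
    since uncertainty can also cut in blue's favour.
    """
    return sum(p * impact for p, impact in events)

# Blind spots blue did not anticipate, with red-team-estimated likelihoods
# and impacts (all numbers illustrative):
blue_unknowns = [
    (0.30, -5.0),   # competitor undercuts price
    (0.10, -20.0),  # key supplier fails
    (0.20, +3.0),   # favourable surprise
]

print(round(risk_exposure(blue_unknowns), 2))  # → -2.9
```

A red team working through such a list is doing what the text describes: tracing how unknowns impact objectives, rather than simply trying to win.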
Therefore, getting red to win per se is not and should not be the primary aim of
RT. Winning is merely an indication that the objective of discovering vulnerabilities
and negatively impacting blue’s objectives has been achieved. The primary aim for
RT is to understand the risk: the interplay between what blue did not anticipate and
blue’s objectives. This risk lens differentiates RT from other classical games. A RT
exercise needs to be equipped with a variety of tools for studying and identifying
the weaknesses of blue, tracing the causes, and identifying strategies to influence
and reshape the system. However, equally, the exercise needs to be equipped with
the suite of tools to analyze uncertainties and objectives, and therefore, risks.
Again, the primary aim of a RT exercise is not for red to beat blue. A possible aim
is simply to gain a greater understanding of what is possible. Any group will have
its own biases, the culture it operates within, and a frame of mind that fosters
groupthink. Red attempts to help blue overcome these biases by exposing blue
to possibilities that blue may not have thought of. Getting blue to appreciate this
space of possibilities can assist blue in designing more robust strategies and open
opportunities for blue to make use of this sphere of new possibilities that blue was
not aware of before. This point will be revisited in Sect. 1.4.
A company that thrives on technology will have many employees who are
technologically savvy. The beliefs and behaviors of members in such an environment are
centered on technology. People living and working in such an environment may not
be able to imagine how to survive and live without technology. Technology becomes
the thought-baseline (bias) for the people working in this company. Such a thought-
baseline will steer their perception, ability to understand what they sense, and the
reasoning process they will follow to make a decision.
A strategy [19, 37] denotes the set of methods that can transform the resources
an organization has access to, or can gain access to,
into the organization’s goals. Strategies must be tested all the time. During their
inception, it is not yet known whether the strategies are the correct strategies. It
is not known whether the competitor can design counter-strategies and what these
counter-strategies might be. Even as strategies are implemented, indicators need
to be monitored continuously to determine how successful these strategies will
be in achieving the designed goals. In a highly dynamic environment, indicators
for assessing the performance of strategies are vital because the environment may
change to states that were not considered when the strategies were designed.
RT transforms a strategy-design process from a one-off exercise to a lifelong
learning exercise. It sees the testing process as a strategy in its own right that works
hand-in-hand with any strategy. Through RT, a strategy is consistently scrutinized,
challenged, and tested against competing strategies and plausible counter-strategies.
The concept of strategy will be defined in Chap. 4 and will be discussed in more
detail throughout the book.
RT sees problems and situations through a risk lens. It is this risk lens that makes
the RT exercise continuously conscious and self-aware of risk. As a result of this
consciousness and self-awareness, risk-mitigation strategies emerge as the process
of RT discovers and challenges different risks within a situation.
Analysts in risk management understand that risk cannot be minimized per se.
Instead, it is mitigated: its negative impact is steered away from the system, or the
system is reshaped to transform a negative impact into a positive one. For example,
let us assume a person is working as a police officer, and is expecting that in a
couple of years, they will receive a promotion, which will mean they leave the space
of action and take a back seat in the office. For this person, such change might
constitute a negative risk. There are many different ways to mitigate this risk, each
with its own pros and cons. The person may decide to begin making mistakes or
behave angrily toward the boss so that they are not promoted. However, such a
reaction does not constitute RT. RT is about being smart and strategic in every move.
A possibility for this person that would reflect the aims of RT would be to determine
who is being considered for a promotion. Subsequently designing a risk-mitigation
strategy to ensure this next-in-line person is promoted first is RT in situ!
RT is different from other activities that evaluate plans because of its reliance on
a deliberate, proactive and challenging approach.1 It has been used widely by the
military, security organizations, and large organizations. The success of the RT
exercise relies on a number of factors, some of which are discussed below.
1 These concepts will be discussed in more detail in Chap. 3.
1.3 Success Factors of Red Teams
in conflict are the protection of a value system by the state and promotion of the
value system of the non-state entity. The objectives focused on economic and people
issues of the state are impacted if the value system of the state is impacted, and
therefore, these two objectives are dependent on the objective of protecting the value
system of the state.
The two objectives in conflict can be represented formally as follows:
1. State objective: minimize the damage to the value system of the state
2. Non-state objective: maximize the change to the value system of the state.
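The two formal objectives above can be read as a zero-sum game over a single quantity, the damage to the state’s value system. The following sketch is a hypothetical illustration: the move names, payoff numbers, and worst-case (minimax) reading are assumptions added here, not content from the book.

```python
# Hypothetical formalization (invented moves and payoffs) of the two
# conflicting objectives as a zero-sum game over "value-system damage".

STATE_MOVES = ["educate", "economic_pressure", "military_action"]
NONSTATE_MOVES = ["propaganda", "recruitment", "attack"]

# DAMAGE[s][n]: damage to the state's value system when the state plays s
# and the non-state entity plays n (illustrative numbers only).
DAMAGE = {
    "educate":           {"propaganda": 2, "recruitment": 4, "attack": 6},
    "economic_pressure": {"propaganda": 3, "recruitment": 2, "attack": 4},
    "military_action":   {"propaganda": 5, "recruitment": 3, "attack": 1},
}

def state_best_response():
    """State objective: minimize damage, assuming the non-state entity
    then maximizes it (a worst-case, minimax view of the conflict)."""
    worst = {s: max(DAMAGE[s][n] for n in NONSTATE_MOVES) for s in STATE_MOVES}
    move = min(worst, key=worst.get)
    return move, worst[move]

move, guaranteed = state_best_response()
print(move, guaranteed)  # → economic_pressure 4
```

Under this worst-case reading, the state picks the policy whose maximum achievable damage is smallest, which mirrors the text’s point that military action is only one of several policy tools for resolving the conflict.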
The military is a government policy tool designed to resolve the conflict. The
conflict can be resolved by means such as the state educating the non-state parties,
or exercising economic pressures on the non-state. Under certain circumstances, the
state may decide that the best policy tool to adopt is military action. Consequently,
a military conflict is initiated.
As such, the root of a conflict is the existence of at least two objectives that are
in conflict. An internal conflict exists when the two conflicting objectives are owned
by the same system. An external conflict exists when the two conflicting objectives
originate from two different systems.
An external conflict does not necessarily imply that one entity is the enemy of
the other. With an enemy, there is a declaration that an entity, labeled as an enemy
(red), will cause damage to a second entity (blue) to deny the latter the opportunity
to achieve its objectives which are in conflict with the former entity. This is not
necessarily because red has objectives designed for its own benefit that happen to
conflict with the objectives of blue that were designed for its own benefit. In some
cases, an enemy is a historical concept in people’s minds that arises from issues such
as a past conflict or religious views. RT used for enemies in a military context is a
very narrow view of RT.
The general concept of RT in this book is that it denotes reciprocal interaction,
causing reciprocal dynamics between two or more entities with conflicting objec-
tives. The conflict in objectives causes these two entities to be a threat to one another.
John’s need for a job can be considered in conflict with the company to which he
applied: the company’s managers would like to find the best person for
the job, or would prefer an internal applicant if possible. In this situation, John may
even create two objectives for himself that are in conflict: to balance his own life,
he could create an objective to relax and an objective to be very competitive. Over
time, he can decide on the level of trade-off he needs to achieve the balance and his
goals, but the two conflicting objectives reside and survive inside him, continuously.
This example demonstrates the difference between internal and external conflicting
objectives.
The strategies or ways used to resolve conflicting objectives are the primary aim
of a RT exercise. A military use of RT to design these strategies is a very narrow
purpose of RT. In fact, between two state players, many conflicting objectives
can be in play at any point of time, and it becomes essential to understand
the interdependencies among these objectives, and how they should be resolved.
A consequence of this analysis might be that we discover that a military use is
unnecessary.
Similar to John’s experience, the composition of the red team is critical for the
success of the RT exercise. As the proverb says: “To imitate an ant, one has to
become an ant.” It is not sufficient to split a group into a blue and red team at
random. The group needs to be socially engineered to create the right red team.
Membership of each team is critical. Members must have the ability to think in the
same way as the role they are playing.
Martin could not help John alone. While he could have thought of many questions
that John would not know how to answer—preparing John for thinking on his
feet and managing surprises—context matters, and that was where Amy became
a necessary addition to the team. Amy’s role was not just an additional member
who knows the context. Her role was to prepare John for the unknown within the
context of the job. She was able to make John think about branch management.
Equally importantly, her questions triggered Martin to ask more questions simply
by listening to her. In fact, she also trained Martin subconsciously to ask questions
within the context of bank-branch management.
Without Amy’s intimate knowledge of the activities of a branch manager in a
bank, she would not know how to ask the right questions, or how to ask the questions
in the right manner. Martin listening to Amy meant that Martin was able to see his
blind spots, and this triggered him to ask questions he would not otherwise have
asked. Similarly, Amy listening to Martin enabled her to realize the bias in her line
of questioning, and opened her eyes to ask questions that she would otherwise not
have asked. Different categories of members in the RT exercise are discussed in
Sect. 1.6.2.
Amy was ready for the exercise because she worked as a branch manager. Imagine if
Amy had not had this experience. For the red team to function properly, Amy would
have needed to learn what a branch-manager job involves. However, such a level of
basic knowledge is not sufficient. In this situation, the culture of branch managers
is not something only to know; it needs to be lived.
The concept of embodiment is critical for RT. Back to the proverb of the ant. One
can notice how the ant behaves. One can build many theories on the behavior of the
ant. These theories can even be validated to demonstrate that they truly reflect the
manner in which the ant behaves. However, if human nature could be revealed with
exactness, psychologists would have been replaced with pure engineers. Knowing
how ants behave, and even successfully predicting ants’ behaviors, should not be
mistaken for the conclusion that one can duplicate the behavior and thinking
process of ants.
To place this in context: Amy could have read and studied many books on the
role of a branch manager in a bank. However, theoretical knowledge would not have
been sufficient. There are lessons that are learned on the job. Amy’s mind has been
reshaped every time she has a new experience in her job. These evolutionary steps
of Amy’s thinking are what make Amy think as a “branch manager.”
Amy’s body posture, manner of looking at John, the pitch of her voice, and every
aspect of her physical appearance have somehow been influenced by the job of a
branch manager. Being embodied in the job has transformed Amy into a branch
manager. Being situated in the job, Amy knows how to think “branch manager” in
the same way that the ant knows how to move as an “ant,” and think “ant.”
The time for learning, embodiment and situatedness requires a great deal of
seriousness and commitment from the organization sponsoring the RT exercise, as
well as from all members of the exercise. The internal transformation of Amy as she
was before working as a branch manager to the new Amy who is a branch manager
requires a level of commitment from Amy, without which Amy would have only
become a bad imitation of a branch manager.
For example, consider the situation in which Amy did not have experience as a
branch manager. She wanted to know about this job so that she could help John.
She assumed that gaining theoretical knowledge, through education,
about a branch manager’s job would be sufficient to give her the knowledge required
to ask John relevant questions. Let us assume that the time required for learning is
not an issue here. The result is clear: Amy would have been a bad imitation of a
branch manager. In such a scenario, she may even build a level of false confidence,
which results in her biasing the questions in a manner that negatively biases John.
Consequently, the entire exercise with John could have negative consequences. It
could have had the reverse effect on John than it did in the original example. Imagine
you know you will meet someone in the morning, but you do not know whether it
will be a male or a female. Now, imagine someone positioned your frame of mind
to believe you will be meeting a female. The surprise you receive from meeting a
male is much smaller in the former case of not knowing than in the latter case, in
which you have been made to believe the opposite of what is true.
When Amy questioned John, the form and nature of her questioning subcon-
sciously placed John’s mind in a mental environment that was consistent with those
of branch managers. If Amy had done a bad job in her learning and commitment
to being a branch manager, John would have been positioned in the wrong frame
of mind. This could have made him more prone to being surprised by the interview
questions, or worse, it could have caused him to misinterpret the questions.
The seriousness and commitment of the red-team members to becoming red is a
double-edged sword. Members of the red team can be transformed psychologically
to be red. This transformation needs to sit below a line that should not be
crossed; otherwise, they will turn into real red (i.e. competitors to the organization’s
objectives or enemies to the state)! Members of the red team need to be as close as
possible to becoming red but should not cross the line of being red. If they cross the
line and become truly red, they will counteract and deceive the RT exercise.
In effect, members of the red team need to be socially engineered to have two
or more concurrent minds: the mind of a red and the mind of a blue. They need to
be trained and know when, where, and how to inhibit or excite each mind. Thus,
training members of the red team needs to be socially engineered, and continuously
monitored to create a safety net around these members. The mind of a blue needs to
be built first and must be stronger than the mind of a red.
Let us consider two hypothetical information-technology (IT) companies: we
will call them Minisoft and Manysoft. If in a RT exercise, Minisoft trains its red team
to such an extent that they believe truly in the Manysoft products over the Minisoft
products, the Minisoft RT exercise will fail. Members of the red team would see
Minisoft as a competitor to their ideology and desire the success of the Manysoft
products over that of the Minisoft products. We will return to this example when
discussing different concepts throughout the book.
The amount of time and level of seriousness and commitment required by the red
team to understand red implies that there is a high level of investment required to
train members of the red team. It is expensive and inefficient for members of the red
team to be used once and then disbanded.
The continuity of the members of the red team in playing red provides them with
a unique experience to innovate and use creative thinking to counteract blue’s plans.
It is through this continuity that members of the red team have the time to reflect,
experience, and reflect again to improve their skills in acting red, and situate and
embody themselves in the red team’s environment and manner of thinking.
can do, and therefore, what counter-strategies they need to develop themselves.
Regardless of whether de-skilling occurred or not, red and blue need to interact.
Through interaction, red and blue accumulate a unique experience for acting and
counteracting.
The mock interview involved John, Martin and Amy. From John's perspective, it was an opportunity
to prepare him for the job interview. For Martin and Amy, it was an opportunity to help a friend
and to gain experience themselves as interviewers.
1.4.3 Training
The example of John’s job interview demonstrates how RT was used to train John.
The nature of training that RT provided in this situation is very different from
classical non-RT training. In non-RT training, John would have watched a number
of videos of similar job interviews, and possibly would have been assigned a coach
to give him "to-do" and "not-to-do" tips. RT training has three conditions: (1)
reciprocal interaction, (2) deliberate challenge through active participation, and (3)
continuous assessment of risk. Through RT, John was trained in situ to be adaptive,
to think on the fly, and to manage surprise questions.
The mock-up interview enabled reciprocal interaction to take place. Amy and
Martin were actively listening to John to discover from his answers whether more
questions could be generated to challenge him further. They needed to stay focused
on the goal of the Branch Manager job, and they needed to continuously assess
John's answers within the scope of this goal. They needed to see if any of John's
answers threatened the goal of John getting the job, and they needed to actively
contribute questions to train John further in these areas of vulnerability.
RT is a very effective training technique. In a non-RT training exercise, training
scenarios are standardized for all participants, but in RT training, the training
evolves differently for each participant. Every time something about blue changes,
such as team membership, additional capabilities, or new knowledge, red must
evaluate the need to change its strategy. Through interaction with the trainee,
the trainer discovers areas that require more attention, and the training exercise
is tuned and tailored toward these areas. Equally, the trainee continues to learn
from their own mistakes, and from the designed and unintentional mistakes of the
trainer.
RT not only trains people to be efficient at the task; it also trains them to be
efficient in their ability to adapt when the task changes. In summary, RT trains
people "to think", not just "to do". This difference truly differentiates RT from
non-RT training.
2
By self-talking or self-rehearsal, we mean internal conversations that occur in a person's mind.
Imagine you are going to fire someone in the organization whom you know very well. Assume you are
Vulnerabilities in a plan and biases are not the same thing. It is known in manage-
ment that without a bias, one cannot make a decision. In fact, every decision being
made carries a bias of some sort. Bias is not necessarily bad. Bias becomes bad bias
when it has a negative impact on the decision.
For example, a selection panel may choose the male applicant out of two equally
qualified male and female applicants in a job interview for a kindergarten teacher.
Each of the applicants would have been successful in that job but the panel needed
to make a choice. Possibly, for a kindergarten position in which many of the teachers
are females, the selection panel was biased toward the male to balance the genders in the
working environment. This bias breaks symmetry, and without it, a decision cannot
be made.
In the same example, imagine that the female applicant was less qualified and
the panel consisted of females who selected the female applicant because they believed
that females do a better job in kindergartens than males. Here, there exists a
different form of bias, which is labeled “discrimination.” This type of bias relies
on stereotyping and unfairness. It is not the type of bias to which we are referring in
this section.
Let us revisit the first scenario in the kindergarten example in which the male
was selected. RT can help to understand that sort of bias. That is, the organization
a people person; that is, you care about people, so it is important for you to ensure the person is
hurt as little as possible. You start to rehearse in your own mind what you will say to this person.
You may even imagine what the person will reply to you and what you will reply back. This is
a form of rehearsal and internal RT within oneself. Through self-talking, the person reinforces
certain concepts and words, a process which helps the person to remember and counteract their
internal fears and negative thoughts.
may not have been aware of this type of bias in its decision-making process; the
decision-making process may have been subconscious. If RT reveals this bias, the
organization becomes more conscious of its existence in advance. The organization
may establish a policy to increase people’s awareness of the need for males in
kindergarten education. The organization may even go further and study the impact
of a female-dominated environment in kindergarten education, and its psychological
impact on the children. Revealing biases can open doors for opportunities. Either the
organization will discover that the sort of bias being used is healthy (but its impact
needs to be better understood), or the organization will discover that it is unhealthy
and it needs to be eliminated from the organization’s decision-making culture.
A RT exercise is realistic, but not "real" per se. If the RT exercise were real, it
would be no different from the daily experiences we accumulate. It adds to our
memory a type of experience that we may not be able to afford to live through in
reality. If lessons from the RT exercise are learnt properly by the organization,
these experiences can be engraved in our memory and retrieved when they are
needed.
Although RT in large organizations is often an expensive exercise, it can be
considered a cheaper alternative to certain experiences that might arise if RT
training is not employed. A RT exercise is cheaper than provoking a real war. Participants in the
RT exercise learn from the experiences to improve their knowledge, performance,
and decision-making abilities. The organization learns its weaknesses and strengths.
Equally importantly, the memories that come from the exercise can be retrieved
when similar situations are encountered.
The plans and responses developed during the RT exercise can be saved for future
use. For example, RT is used by emergency-response management, in areas such as
fire fighting and mass evacuations, to create scenarios for plausible futures. Lessons
learnt from these scenarios are stored within the city council. These scenarios and
their associated lessons can be retrieved when similar situations occur.
As described, RT can be used to learn about situations that may not yet have
been encountered. Similarly, RT can be used to “unlearn” situations that have been
encountered so that the individual is prepared for a new manner of thinking and
behavior in situations to be encountered.
For example, RT could be used to train emergency-response management
on using a new wireless-communication device. As the exercise unfolds, the
participants involved in the exercise accumulate experience using the new device.
1.5 Steps for Setting Up RT Exercises
The purpose of the RT exercise defines the objective of the RT exercise, and acts as a
reminder of why the exercise is being conducted. The scope of a RT exercise is a list
of the soft constraints3 defining the context of the exercise. The criteria of success
are measures of utility of the exercise and their values can be used to demonstrate
the value-add of the exercise.
While RT is exploratory in nature, it is vital to know the purpose, scope and
criteria of success for the exercise before moving forward, that is, the answers must
be found to the questions of “why,” “what,” and “so-what.”
The purpose of the RT exercise influences all the steps in designing the exercise.
For example, if the purpose is to improve blue’s ability to anticipate how red acts
and reacts, it becomes essential to design scenarios to create that effect. The RT
scenarios need to produce a large number of situations that sample red’s behavior.
The exact task used to conduct the scenario may not matter in this example, as the
focus is on red’s actions in a wide range of contexts.
3
Constraints can be hard or soft. Hard constraints cannot be broken; that is, the constraint must be
respected or the solution is not accepted. Soft constraints can be broken at a cost. The scope of
a RT exercise may need to be updated, or the interaction between red and blue may call for a change
in the original scope.
A scope is a set of soft constraints that bind the context of the RT exercise. As
these constraints are soft, they can be broken. They can be ambiguous in nature and
their only purpose is to ensure that the exercise is not unbounded. A scope defines
what is essential, what is useful to have and what is irrelevant in the context of a RT
exercise. However, by no means should this scope be fixed. The interactive nature
of the RT exercise may necessitate a change in scope. As new events unfold, one
may discover that the exercise was scoped incorrectly and a refined scope is needed.
The scope of the example above is training blue to anticipate red behavior. The
scope is defined with the two keywords “anticipate” and “behavior.” As such, the
exercise should not focus on the details of the situations, but on which behaviors are
likely to be generated in which situations. These situations need to be defined at a
level sufficient for the details of these behaviors to emerge, and in no more detail
than necessary. If more details are defined than necessary, the RT exercise can lose
flexibility.
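The hard/soft-constraint distinction underlying a scope lends itself to a small computational sketch. Everything below (the scenario fields, the predicates, the penalty values) is invented for illustration, not a prescription from the book:

```python
# Illustrative sketch only: a scope expressed as constraints.
# Hard constraints must hold or the scenario is rejected outright;
# soft constraints may be broken, each at a cost.

def evaluate_scenario(scenario, hard, soft):
    """Return (accepted, penalty) for a candidate scenario."""
    if not all(pred(scenario) for pred in hard):
        return False, float("inf")  # a hard violation: not accepted
    penalty = sum(cost for pred, cost in soft if not pred(scenario))
    return True, penalty

# Hypothetical scope: team size is a hard limit; duration is only
# a soft preference that costs 3 "scope points" when exceeded.
hard = [lambda s: s["team_size"] <= 10]
soft = [(lambda s: s["duration_days"] <= 5, 3.0)]

ok, cost = evaluate_scenario({"team_size": 8, "duration_days": 7}, hard, soft)
# ok is True, cost is 3.0: the scenario is in scope, at a price.
```

The penalty makes explicit that a soft constraint can be broken when the exercise needs it, at a known cost to the scope.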
The next important element to know in advance before designing the exercise is
how success of the RT exercise will be judged. The criteria of success define whether
the exercise was successful in fulfilling its purpose or not. If it is not known how
the success of the exercise will be judged, it will be difficult to define which data to
collect, which factors should be measured, how to measure them, and on what basis
the effectiveness of the exercise can be justified.
The purpose, scope and criteria of success establish a set of guidelines to measure
the appropriateness and cost benefit of each decision to be made in the remaining
steps of a RT exercise.
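The purpose ("why"), scope ("what") and criteria of success ("so-what") could be recorded explicitly before design begins. The sketch below is a minimal illustration; the class and field names are assumptions, not part of any established RT notation:

```python
from dataclasses import dataclass

@dataclass
class ExerciseCharter:
    purpose: str            # why the exercise is being conducted
    scope: list             # soft constraints bounding the context
    success_criteria: dict  # measurable criterion -> target value

    def is_ready(self) -> bool:
        # Do not proceed to design until all three are answered.
        return bool(self.purpose and self.scope and self.success_criteria)

charter = ExerciseCharter(
    purpose="Improve blue's ability to anticipate how red acts and reacts",
    scope=["focus on red behavior", "no live production systems"],
    success_criteria={"distinct red behaviors sampled": 50},
)
```

A charter like this also makes the criteria of success concrete enough to decide which data to collect during the exercise.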
A RT exercise is not fundamentally different from any other type of
experimentation exercise we conduct. Experimentation is a general concept with its
own guidelines and principles and a RT exercise is one type of experimentation. As
will be explained in Sect. 1.7.4, not every red-blue experiment is a RT experiment.
A RT experiment needs to focus on the design of, and interaction between, the red
and blue teams. More importantly, in RT, the experiment needs to focus on designing
the process of a deliberate challenge, that is, how each side will challenge the other
side. The objective here is not simply to win or play the game. The objective is to
learn how to stretch each side’s boundaries to the limit.
Imagine a simple RT military exercise whereby the blue and red teams were
deployed in a field. Soon after deployment, red began to fire and eliminated blue
very quickly. Blue discovered a weakness, and the exercise demonstrated some
benefit, but not its true value, because the true value of the RT exercise is to learn
about the thinking process red and blue experienced in creating this result. The
exercise needs to be designed around discovering this thinking process, not merely
around which team wins or loses.
A RT exercise begins the moment the need for a RT exercise is announced, that is,
in the moments before the purpose, scope and criteria of success are designed. This
is important because this moment dictates constraints on which information should
be communicated to whom. However, conducting the exercise is about the moment
the experiment and game begin. This is the moment in which both red and blue
prepare for engagement and interaction. It is also the moment in which the scenario
is executed.
The RT exercise would usually involve a number of teams. In addition to the red
and blue teams, there is the team of designers who design the exercise; the team
of observers who watch the exercise unfolding, and possibly share their perception
of the events taking place; the team of analysts who specialize in analyzing the
RT exercise qualitatively and quantitatively; the technical team that is responsible
for all technical and engineering elements of the exercise, including monitoring
the automated data-collection tools; and there may also be other groups such as
politicians who simply watch the exercise to get more familiar with the situation.
Sometimes, other colors are used to designate other teams. For example, the
designers, analysts and observers are grouped into a white team, while green
denotes a special team that supports and acts as a coalition of the blue team. More
colors can be introduced to define other groups with interest in the exercise.
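The color convention could be encoded as follows; the role descriptions paraphrase the text above, and real exercises may define additional colors:

```python
from enum import Enum

class Team(Enum):
    RED = "plays the adversary"
    BLUE = "the system, plan or organization being challenged"
    WHITE = "designers, analysts and observers"
    GREEN = "supports and acts as a coalition of the blue team"

# A quick lookup from color name to role description.
roles = {team.name: team.value for team in Team}
```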
The scenarios discussed above demonstrate that the scale of a RT exercise can
extend from three people (as in the case of John’s job interview) to hundreds, as in
the case of a national-security exercise. Each member should not interfere with the
tasks and purpose of other members. The technical team should be invisible to both
the red and blue teams so that they do not distract them when performing their tasks.
The observers should be separated from the analysts so that they are not influenced
by the discussions among the analysts. The politicians should be separated from the
entire exercise so that they do not push their own agenda, influencing the exercise
to change its original intent.
The majority of the analysis required to data mine the RT exercise to extract trends
and patterns (i.e. lessons) will be performed offline, after the RT exercise. This is
sometimes due to the need to have the complete data set of the exercise before a
pattern can be extracted. The analysis may need to propagate information forward
and backward in the data to establish the reasons and rationale for the extracted patterns.
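As a toy illustration of such offline analysis, the sketch below scans a hypothetical recorded event stream and counts which red action most often immediately precedes a given outcome; the event labels are invented:

```python
from collections import Counter

def preceding_action_counts(events, outcome):
    """Count which event immediately precedes each occurrence of `outcome`."""
    counts = Counter()
    for prev, curr in zip(events, events[1:]):
        if curr == outcome:
            counts[prev] += 1
    return counts

# Hypothetical event log recorded during an exercise.
log = ["red_probe", "blue_patch", "red_flank", "blue_loss",
       "red_probe", "red_flank", "blue_loss"]
counts = preceding_action_counts(log, "blue_loss")
# "red_flank" precedes both blue losses.
```

Real analysis would, as the text notes, also propagate information backward over longer windows to establish the rationale behind a pattern, not just its frequency.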
Sometimes, it is important to bring both red and blue teams back to the analysis
room after the exercise is completed. In this situation, the events can be played back,
while asking team members to reflect on why certain sequences of events occurred
in the manner in which they did during the exercise. This process of reflection may
be designed as part of the training process for both red and blue in preparation for
a subsequent exercise. It may also be necessary for understanding the results and
outcomes of the exercise.
RT is a capability that any organization or nation should treat as lifelong and
continuous. Every exercise teaches an organization how to perform the
next in a better manner. Therefore, lessons learned from RT should be captured,
documented, and stored as a source of knowledge for future exercises.
The first RT exercise to be conducted by an organization will be perfect only
in rare cases. Even if one exercise is perfect, there is no guarantee the following
exercise will be. RT exercises are complex and the likelihood that something will
go wrong is very high. Similarly, the likelihood that something that went right in a
previous exercise will go wrong in a future exercise is equally high. Overconfidence,
human bias, and the complex nature of the situations and decisions encountered
during a RT exercise are critical issues that will threaten the success of any RT
exercise. Lessons learnt from a RT exercise form part of the organization’s corporate
memory.
Players involved in a significant RT exercise, such as a national-security one, are elite
people. From the designer to the technicians, all are individuals highly qualified
for the roles they have been assigned within the RT exercise. Members of such
an exercise should be chosen very carefully, and they should fully understand the
consequences of being involved in RT. The different roles within a RT exercise will
be discussed in this section.
A RT exercise can take many different forms, and be of different scales.
Therefore, it is not expected that all the roles being discussed in this section must
be fulfilled by separate individuals in each exercise. All roles can be fulfilled by a
team as small as five people, and for larger RT exercises, some roles may have large
teams managing them.
The organization should categorize key players in a RT exercise, and discuss the
risk associated with each category. It is important to emphasize that a RT exercise
trains people to “think” in the first place, in addition to training them “to do”. The
risk level described below represents the risk a player in a specific category poses to
an organization; that is, if this team member becomes a bad citizen,4 how much
negative risk will the organization be exposed to? Equally, if the team member remains
a good citizen, how much positive risk (i.e. opportunities) will the organization gain
from having them as part of the staff?
This point of risk needs to be considered a natural step, not something to
hinder the exercise. In normal circumstances, every employee in an organization,
from the bottom level to the highest, is trained. The risk that an employee
turns into a bad citizen of the organization always exists. However, this does not
happen with great frequency, thanks to careful choices made in appointing people to
their positions.
Recruiting people to a RT exercise is similar to recruiting people to any other
position in the organization. Therefore, the risk cannot be ignored.
The subjective assessment of the risk level associated with each role in a RT
exercise should be viewed with caution. The risk level of each role may increase or
decrease based on the nature of the RT exercise.
RT stakeholders (RT-S) are the primary beneficiaries and problem owners of the RT
exercise; therefore, the RT-S should sit at the highest level in any organization.
In the private sector, the board should be the primary stakeholder in a RT exercise.
A RT activity is very likely to touch on multiple activities within the organization.
Members of the RT team need to be protected at the highest level given the
benefits of the RT exercise are usually organization-level benefits. The board needs
to establish a subcommittee that oversees RT within the organization, similar to
other board-level committees such as the auditing and risk committees. If it is not
desirable to make RT activities visible to the outside world, the board risk committee
can take responsibility for RT.
The organization carries the risk that comes with every role within it, and every
role in a RT exercise can pose a risk to the organization. As discussed
above, a red teamer can become red, and be transformed into a bad citizen.
The risk of the RT-S is low because the board has only two roles: to ask
questions of the RT teams, and to protect the teams. Members of the board should
not be involved as players in the RT teams because this would create confusion about
the responsibilities of each participant, and may create an undesirable position of
power in the red and blue teams. An exception to this point is when the scope of the
RT exercise is the board itself. In this case, some board contributions to the
exercise will fall under technical roles, not stakeholder roles.
4
For example, a person in a red teaming exercise learns the skills to penetrate a computer system,
then decides to do so in the real world to commit fraud.
The RT designer (RT-D) is the maestro who designs how the exercise is played and
how players and actions need to synchronize. A RT-D needs to understand, and be
immersed in, experimental design for in-situ experiments. The word "designer" is
used instead of "team leader" to avoid the implication that there is only one team
or only one leader, and to emphasize that besides being a leadership role, RT-D is
also a role requiring design skills and knowledge of the principles of RT.
The RT-D draws, and therefore is exposed to, the entire picture of the RT exercise.
The RT-D is the interface between the RT team and the RT-C, the RT-LC, and the
RT-S. The RT-D acts as the access control for information to all subteams of the
RT team. The RT-D should be a key person (or persons) in selecting the RT team members
because part of this role is identifying the skill set required to conduct the RT
exercise, as well as the personality types associated with the skill set.
The role of the RT-D comes with two aspects that make its associated risk level
very high. First, the RT-D's access to information and systems makes it a high-risk
role. Second, being the mastermind behind the design of the exercise gives the RT-D
a level of knowledge greater than that of any other member of the RT team, even
though the RT-D may not be highly skilled in some very technical tasks in the
exercise.
A second role with a very high level of risk is the RT thinker (RT-T). The RT-T is
an individual who thinks about risk and knows how to design strategies to penetrate
systems or challenge plans. A RT-T is a systems thinker of the highest caliber, who
combines a reasonable level of technical skills and understanding with strategic
and systems thinking.
This role is very high risk. Systems thinking alone is not sufficient to fulfil it.
Members in a RT-T role also need a variety of technical skills. Yet people who are
too narrowly technical are not suitable for the RT-T role: a technical person can be
too focused on the technical issues, may not have the risk-thinking skills, may carry
biases arising from their technical knowledge that hinder innovative thinking, and
may not have much understanding of the role of strategy in a RT exercise. Similarly,
systems thinkers
who have no experience with the technical side of RT can be counterproductive to
the goal of the RT exercise, as they can imagine what needs to be done without
necessarily having the ability to judge whether it is doable.
This combination of technical know-how and systems thinking is where the high
risk resides in this role.
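The role-by-role levels discussed above (stakeholders low; designer and thinker very high) could be kept in a simple register. The representation and the override mechanism below are illustrative assumptions, reflecting the earlier point that levels shift with the nature of the exercise:

```python
# Baseline risk levels for the roles discussed in this section.
BASELINE_RISK = {
    "RT-S": "low",        # stakeholders: ask questions, protect the teams
    "RT-D": "very high",  # designer: sees the entire picture
    "RT-T": "very high",  # thinker: strategy plus technical know-how
}

def risk_for(role, overrides=None):
    """Risk level for a role, with exercise-specific overrides
    (e.g. the RT-S level changes when the board itself is the
    subject of the exercise)."""
    overrides = overrides or {}
    return overrides.get(role, BASELINE_RISK[role])

level = risk_for("RT-S", overrides={"RT-S": "medium"})
```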
The expense of RT may not represent a large amount of money to the organization
when compared with its other expenditures. However, most large expenditures in
an organization are tied to production and core business. RT can be mistakenly
considered a "nice-to-have" activity, rather than a core activity for the
organization. This can lead to a perception that the expense of RT is
unjustifiably high.
If the RT-D is pressured and agrees to begin the exercise with an insufficient
budget to demonstrate that the benefits are greater than the cost, the following three
undesirable possibilities may arise:
1. The quality of the exercise will be compromised to ensure the assigned budget
is not exceeded. The consequence of such a situation can be expressed simply:
“what is built on ashes will end in ashes.” A RT-D should understand that the
RT-S have one primary objective: obtaining the right answer to the questions
that motivated them to approve the exercise with the minimum cost. Obtaining
the right answer is not controllable by the stakeholders who are not necessarily
experts in RT. They will entrust the RT-D to provide them with the right design
and answers. However, they can control the cost. Therefore, they will always
attempt to push down the cost. The designer should not accept a budget that will
not lead to the right answer. Therefore, once they accept a budget, the ethical
burden of ensuring that the design is right lies with the RT-D.
2. The designer takes the attitude that if the RT exercise begins with an inap-
propriate budget, the stakeholder will be forced to assign more money to the
exercise when it is needed. For example, an organization begins with a promise
that cannot be delivered with the limited budget it assigns; however, the benefits
The military has been leading efforts on RT for many decades, but only over
the last decade has the need to establish RT as a science been stressed. It is
important to explain why and how this book departs from RT in the military. To do
this, we will offer a personal reflection on the different views on RT within military
decision sciences.
Some computer-security consulting companies use the words of RT to sell a muscle-
based approach; that is, the company demonstrates that it is able to penetrate
any security system to satisfy a client's concern. Since no system is bulletproof,
there are always ways to penetrate a system. RT became the brand under which to
sell this approach.
This view of RT is detrimental. First, it has legal consequences that can generate
many negative risks for organizations and the government as a whole. In computer
security, the objective of RT is not to penetrate a system, but to map out the space of
vulnerabilities through a threat lens. Second, the military, like scientists, is accustomed
to disciplined approaches to conducting any study. This is important because the value
of any military study is in the lessons gained. The muscle-based approach used by
some consultancy companies focuses on selling the final result: the success
in penetrating a system. Proper RT studies should focus instead on the systematic
and disciplined design and approach followed in the study, so as to clearly articulate the
lessons learnt.
1.7 From Red Teaming to Computational Red Teaming
In Sect. 1.1, it was described that John went through the mockup interview exercise
to prepare himself for the interview. He began with Martin and discovered that the
team composition was not right. They invited Amy to join the team. John asked them
to play devil’s advocate with him, to ask him difficult questions, and indeed, they
did. This was a simple example of RT that many people will have encountered
in their lives. Unfortunately, Martin and Amy did not have a book to read on
how to execute a RT exercise properly, or on what is expected. That is, they relied
on their understanding of the exercise and their experience. They did a good job,
but the lack of scientific principles from which to derive this process means
that they cannot teach what they learned to others, cannot generalize it beyond
the limited experience they had, and cannot properly justify their choices or
thinking process. This is the value of transforming the art of RT into the science of
Computational Red Teaming (CRT).
So far, this chapter has distilled lessons learned from the military application
of RT, together with the author's own style and experience. CRT will generalize
RT into a wider context, so that it leaves the realm of the military
for applications in industry, technology, and government departments, supporting
effective decision making.
Before CRT is discussed, more light will be shed on John’s experience. Many
people have experienced a job interview. A possible way to describe the dynamics
of a job interview is to view it in three stages: first, the welcome and ice-breaking
stage. Initial questions are asked such as: why did you apply for this position? what
do you bring to this position? These are the sort of questions that the candidate may
have anticipated, or that any reasonable candidate should be able to answer without
being too stressed.
The second stage focuses on the job, with more targeted questions. For example:
you have only managed a budget of $1 million, but in this job you will need to manage a
budget of $100 million; can you convince us that you are capable of managing this larger
budget?
The third and final stage of an interview cools down the interview environment.
For example, if you are successful, when can you take up this position? Can we
contact your referees?
What is the objective of a job interview? In normal circumstances, the objective
is to select the best candidate for the position. The organization may have never
encountered this candidate before. Therefore, this candidate is a “black box;” the
selected candidate might be the right person for the job, or it may be that the
selection of this candidate was a big mistake that the organization has to deal with
for some time. From this perspective, a job interview is nothing more or less than a
risk-assessment exercise, whereby the organization assesses the risk of appointing
each candidate. In this context, the risk is simply how the uncertainty about the
candidate can be assessed to properly judge its impact on organizational objectives.
Challenging the candidate is the means to execute this assessment. The second
stage of the interview discussed above is normally where the candidate is
challenged; this points to one of the cornerstones of CRT: what is a challenge, and how
does one challenge? The selection committee evaluates the application and the referees'
reports, and may test the candidate before the interview using a number of psychological
and technical tests. During the ice-breaking stage of the interview, the selection
committee continues to evaluate the candidate. Sometimes this stage triggers more
questions later in the interview.
During the second stage, the candidate is challenged. The selection committee
attempts to estimate the boundaries of each candidate's abilities, skills, and behav-
ioral space. This can take many forms, including direct questions, or presenting
the candidate with a real-life situation and asking for an opinion. As these questions
challenge the candidate, the selection committee observes the responses, updates
its beliefs about the candidate's skills and abilities, and steers the challenge. This
process reduces the selection committee's space of uncertainty about the candidate,
or at least adds confidence to it. Every time a possible doubt exists, a new
challenge is formulated. The process of challenge is guided by how the uncertainty
about the candidate may impact the organization's and the job's objectives. However, this
is a weak form of a challenge: it is subjective and mostly ad hoc.
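The observe, update, steer loop just described can be caricatured in code. Everything below (the response model, the belief update, the doubt threshold) is an invented illustration of the loop's shape, not a method prescribed by the book:

```python
import random

def challenge_loop(respond, max_challenges=20, doubt_threshold=0.1):
    """Keep challenging while doubt remains: observe a response,
    update the belief about the candidate, and reduce uncertainty."""
    belief, uncertainty = 0.5, 1.0        # prior about the candidate
    for _ in range(max_challenges):
        if uncertainty <= doubt_threshold:
            break                          # no remaining doubt: stop
        answer = respond()                 # observe the response (in [0, 1])
        belief = 0.8 * belief + 0.2 * answer  # steer the estimate
        uncertainty *= 0.7                 # each challenge narrows the space
    return belief, uncertainty

rng = random.Random(0)
belief, uncertainty = challenge_loop(lambda: rng.random())
# The loop stops once the space of uncertainty is small enough.
```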
The science of RT, that is, CRT, has these two cornerstones, risk and
challenge, as the basis for designing and understanding the process of CRT. In
today's world, where data and models are abundant, CRT attempts to design an
architecture that brings together the elements of risk and challenge to achieve the
objective of the exercise.
As was explained in the preface, the word “computational” emphasizes the aim
to design systemic steps for RT. It does not necessarily mean “computer based.”
However, in complex RT situations, and assuming that an organization understands
the CRT science to be discussed in this book, computer support for the RT exercise
is vital.
Before the discussion on CRT progresses, two issues at the interface of computer science and CRT need to be explained. One is related to CRT, where computer scientists have been tempted to automate RT exercises completely. The other is related to computer science, where CRT offers an interesting perspective on the concept of "intelligence," as in artificial intelligence (AI).
RT is, at its core, a cognitive exercise: it relies on the ability of a human to think about a problem by being embodied and situated in the problem and its context. As such, it is perhaps more sound to consider implementing an augmented-reality version of RT than commanding a computer to perform RT on one's behalf.
Nevertheless, many components of the RT exercise can be automated if we can structure them in a systemic manner, into elements that are computable. Therefore, it is important first to focus on establishing the science so that we are able to automate RT where we can.
The automation of RT, which is a large, complex exercise, should not be confused with blue-red simulations, which are a special type of behavior-based simulation system. Two problems have plagued such systems.
The first problem was that agents could behave in ways that military personnel considered meaningless. In essence, they believed that no troop would behave in such a manner in a real-world situation. This generated suspicion about the validity of these systems. The second problem was the lack of any means to explain why certain complex behaviors arose in the simulation. Without knowing why, the military was unable to transform the results of these simulations into meaningful doctrines to adopt.
These two problems were acknowledged by researchers in the field. The first problem was considered an advantage in the simulations: these strange behaviors can generate risk. While a military culture may not allow these behaviors, an individual might behave in such a manner if they had lost their sanity. Therefore, these behaviors were not considered a disadvantage per se. However, if the designer wanted to enforce a military hierarchy, there was no way in these simulations to maintain the coherence of such a hierarchy over the course of the simulation.
The second problem was dealt with by researchers using two means: visualization and post-analysis using data-mining techniques. Visualization provided simple, but extremely effective, tools for gaining human-relevant insights. Research into data mining produced post-analysis tools that can collate the massive amount of information produced by these simulations into a coherent form. Both means were combined and referred to as "data farming."
In an attempt to address these problems, the author of this book, with his PhD student at the time, designed WISDOM [55–57]. WISDOM's internal architecture enabled a solution to both problems. First, every relationship between any two agents was represented as an explicit network. For example, vision (who sees whom in an environment) was represented as a vision network that is re-formed at every step of the simulation. Similarly, command structures, involving factors such as communication, were represented as networks. An influence diagram was then constructed to represent how these networks influence each other and in which context. As shown in Fig. 1.1, each relationship/network within the agent society is associated with a node in the influence diagram, which acts as prior knowledge to guide reasoning.
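As a toy illustration of this idea (a hypothetical sketch, not WISDOM's implementation; the agent positions and vision range are invented), one such explicit relationship network can be rebuilt at every simulation step from the current state:

```python
# Sketch: one "explicit relationship network" rebuilt each time step.
# Here, vision: a directed edge (i, j) means agent i sees agent j.
# Positions and the vision range are hypothetical.

def vision_network(positions, vision_range):
    """Return the set of directed vision edges for one simulation step."""
    edges = set()
    for i, (xi, yi) in positions.items():
        for j, (xj, yj) in positions.items():
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 <= vision_range ** 2:
                edges.add((i, j))
    return edges

# One step of a three-agent scenario: "c" is out of everyone's range.
step0 = {"a": (0, 0), "b": (1, 0), "c": (9, 9)}
net = vision_network(step0, vision_range=2)
print(sorted(net))  # [('a', 'b'), ('b', 'a')]
```

Recomputing such networks at every step, and doing the same for communication or command relationships, yields the interdependent networks to which the influence diagram attaches prior knowledge.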
Given that agents interact nonlinearly in a CAS, it is almost impossible to
understand how group-level behavior relates to the behavior of individual agents.
For example, who is responsible for producing an idea that resulted from a group
discussion: the person who uttered it, the people who were discussing it before that,
or someone who said something very early in the discussion and kept silent for the
rest of the time?
Reasoning in complex systems is difficult. To reason, WISDOM relies on the fact that each relationship is a network and that these networks are interdependent in a manner described by the influence diagram. Figure 1.2 offers an
approach that enables reasoning in these highly nonlinear dynamical systems. At
each simulation time step, each network is analyzed and many network measures
are extracted. Over time, these network measures form different time series. The
influence diagram represents the domain knowledge required to interpret these time
series. All that remained was to design data-mining tools to correlate these time
series to provide the confidence that one change in a network influenced a change in another network, sometimes with a long time lag in between.
Fig. 1.2 Reasoning in nonlinear dynamics using networks and time series analysis
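This kind of lagged-correlation analysis can be sketched minimally. The two time series below are synthetic (one echoes the other three steps later); this illustrates the idea only, not the data-mining tools actually used:

```python
# Sketch: test whether a change in one network measure (e.g., vision-network
# density) precedes a change in another (e.g., communication-network density).
# Synthetic data: series b echoes series a three steps later.

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    sx = sum((p - mx) ** 2 for p in xs) ** 0.5
    sy = sum((q - my) ** 2 for q in ys) ** 0.5
    return cov / (sx * sy)

def best_lag(series_a, series_b, max_lag):
    """Lag (in time steps) at which series_a best correlates with series_b."""
    scores = {lag: pearson(series_a[:-lag], series_b[lag:])
              for lag in range(1, max_lag + 1)}
    return max(scores, key=lambda lag: abs(scores[lag]))

a = [0, 1, 0, 2, 1, 3, 0, 2, 1, 4, 0, 3, 1, 2, 0, 5]
b = [0.5] * 3 + [x + 0.5 for x in a[:-3]]
print(best_lag(a, b, max_lag=5))  # 3
```

A strong correlation at lag 3 provides the kind of confidence described above: a change in the first network measure precedes, and plausibly influences, a change in the second.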
Some researchers have attempted to equate blue-red simulation with RT. This
can be considered a weak comparison because RT requires explicit understanding
and modeling of risk and challenge.
Researchers, including the author and his students, have attempted to automate
some aspects of the RT exercise. These attempts have created a rich literature
that can be leveraged for use in CRT, but clearly, as the remainder of this book
demonstrates, the gap is extremely large between the current state of automation of
RT and CRT.
The first attempt to claim an Automated RT (ART) is attributed to preliminary
discussions by Upton and McDonald [47, 48]. ART relies on the evolutionary-
computation search technique (discussed in Sect. 3.3) to uncover good red strategies
in predefined scenarios. The two cornerstones of CRT were not modeled or
discussed. Thus far, all ideas for automating RT can be described as wrapping an
optimization layer (mostly evolutionary or co-evolutionary computation) around
a blue-red simulation. The search technique uses a blue-red simulation environ-
ment to evaluate its proposed strategies and solutions. A more serious series of
computational studies was conducted simultaneously by the author and his PhD
student [54, 55].
This line of research was followed by a rich literature on the topic. Within the realm
of military-style blue-red simulations, ART [13] and Automated Co-Evolution
(ACE) [30] continued the traditions of EINSTein [25] and WISDOM [54, 55]
in adopting evolutionary algorithms to search the strategy space of blue and
red. More studies emerged on blue-red simulations under the banner of CRT,
including [17, 18, 23, 24, 38, 39], and one of the early studies using CRT for risk
assessment is reported in [8].
Outside the realm of the military, CRT began to be included in a wide range of
applications, including in cyber security [40, 42, 44] and air-traffic control [5, 6, 58].
Some early review papers include [1, 3, 4].
The above literature has provided a rich foundation for CRT. However, there
are many opportunities and research areas that remain unexplored. The remainder
of this book will discuss the science of CRT to draw a map of these unexplored
areas. The objective is to explain the foundations of CRT in an attempt to drive the
literature toward more meaningful studies in the RT domain.
1.8 Philosophical Reflection on Assessing Intelligence
Since the inception of computer science, and the dream to execute in silico what humans can do in their minds, one of the fundamental questions generating inquiries into the philosophy of AI has been what "intelligence" is in the first place. Researchers ask what this word means and how to judge whether an entity is intelligent.
The history of the philosophy of AI is replete with famous stories: from the
Chinese Room Argument that negatively impacted the work on Natural Language
Understanding, changing the name of the field to Natural Language Processing, to
the inability of the perceptron node in artificial neural networks to solve the XOR
problem.
One topic that created a great deal of discussion over the years is how to judge
whether a machine is intelligent.
Alan Turing [45] proposed an answer: the Imitation Game (IG). In this game, an
AI or a machine contestant is placed in one room, a human contestant is placed in
another room, and a human judge sits in a third room. The human judge does not
know who is in which room.
The human judge begins by asking questions of the agents in both rooms. At the end of the task, the judge must decide which room holds the human and which holds the machine. If the judge believes the room with the machine is the one with the human, the machine has passed the intelligence test.
Recently, a version of the IG was introduced for computer games in a competition termed Human-Bots [22]. Interestingly, no machine has ever passed the IG test to this day. If this had been a test given to humans in a school or a university environment, it would have been scrapped by management long ago. IG has been widely criticized by many, but no valid alternative has been proposed.
IG has many fundamental drawbacks. Some of these are discussed below from a technological, rather than a philosophical, perspective:
• IG assumes a binary definition of intelligence that does not help to establish a gradient to advance the science of AI. The test does not allow intelligence to be assessed on a score. While this is not a major drawback (it can easily be amended by asking the human judge to score each room, or to weight the likelihood that a room holds the human), designing such a score function would be sensitive to the subjective opinion of the human judge.
• IG advances research in AI in a backward direction! The fundamental concept underlying IG is for the machine to match human abilities and inabilities equally. For example, given that the judge would expect a human to make a mistake when asked to complete complex calculations in a short period, or at least to take a longer time to complete such calculations, AI designers attempted to mimic this human inability by slowing down the calculations or introducing deliberate errors into them. Such behavior is not useful from an engineering perspective: what is the point of producing human mistakes in a machine? If a society has a gifted child, should the child not be embraced, or should society ask the child to make more mistakes to seem like other children? Obviously, this is a matter of perspective.
• IG is logically inconsistent. Imagine both rooms have humans inside; what would be the meaning of the decision made by the judge? Equally, if both rooms have machines, does it follow that whichever room the judge believes holds the human contains an intelligent machine?
• IG makes a wrong assumption on a fundamental level: that intelligence is
context independent. Is a smart computer scientist or a mathematician necessarily
smart in a social context? Intelligence is a context-dependent phenomenon.
A mathematician specializing in the field of mathematical topology may appear
smarter than many other people when asked questions in this field. However,
our ability or inability to answer questions depends on many factors, including
our workload level, fatigue level, stress level, knowledge in the domain, level
of maturity, and our attitude toward self-reflection. Today, human intelligence
is assessed using multiple scales such as the intelligence quotient (IQ) and the emotional quotient (EQ). Intelligence should not be assessed independently of
context because there is no single type of intelligence, even within human society.
• IG assumes that the ultimate aim of AI is to imitate humans rather than complement humans. A social system is constructed with each human playing a role that serves the society as a whole (i.e., the division-of-labor principle) and allows the people in the society to live in harmony and learn to act intelligently. If all humans in a society attempted to imitate each other, creating the same copy of one another, the system would collapse and the concept of intelligence would be erased from the system. Therefore, it can be said that intelligence breeds differences and a differentiation process among the agents in the environment. Similarly, we should aim to cherish the differences that AI offers, not the similarities, when assessing intelligence.
CRT offers a perspective on how intelligence should be assessed. The basic tenet of CRT is its reliance on deliberate challenge. In its simplest form, a deliberate challenge may take the form of a debate or, as in John's story, a mock interview.
A reciprocal interaction between two entities, where each entity attempts to
deliberately challenge what the other is attempting to achieve, is a more objective
manner for each entity to evaluate the other.
The concept of a challenge does not need an external judge to make a decision.
Instead, the parties themselves can assess their own interaction. Every time one party
throws a challenge at the other, the recipient party can assess how far this challenge
truly expands its horizon. This is what occurs in a social system. People tend to
evaluate each other constantly based on feedback they receive through conversation
and interaction.
The concept of a deliberate challenge is different from a classical competition.
While in both cases, a context exists that bounds the scope of the interaction, the
primary aim in a competition is to win, regardless of whether a new lesson has been
learned.
In RT, the primary aim of the exercise is to learn new things. Blue attempts to
learn about the opponent, holes in their own thinking, and holes in their planning;
they also attempt to estimate their boundaries: where red’s abilities end and blue’s
inabilities begin. These are the boundaries between what blue can and cannot do,
what blue knows and does not know, and where the true challenges lie.
The process of estimating these boundaries, probing the other team with events that require them to act outside their skill boundaries, and designing mechanisms to counteract the other team's actions distinguishes RT from a classical competition.
RT offers unique mechanisms that provide objective ways not only to assess intelligence, but also to analyze the system being assessed. For example, the time taken by one side to sustain the interaction before it breaks down can be such an objective metric. One can rely on syntactic and semantic complexity to analyze the interaction, or on more advanced complexity metrics [43]. This analysis can assist in pushing the system up the intelligence scale by identifying root causes of the limited behavior expressed by the system during the RT exercise.
References
1. Abbass, H.: Computational red teaming and cyber challenges. In: Platform Technologies
Research Institute Annual Symposium, PTRI (2009)
2. Abbass, H.A., Barlow, M.: Computational red teaming for counter improvised explosive
devices with a focus on computer games. In: Gowlett, P. (ed.) Moving Forward with
Computational Red Teaming. DSTO, Australia (2011)
3. Abbass, H.A., Bender, A., Gaidow, S.: Evolutionary computation for risk assessment using
computational red teaming. In: Sobrevilla, P., Aranda, J., Xambo, S. (eds.) 2010 IEEE
World Congress on Computational Intelligence Plenary and Invited Lectures Proceedings,
pp. 207–230. IEEE, Barcelona (2010)
4. Abbass, H., Bender, A., Gaidow, S., Whitbread, P.: Computational red teaming: past, present
and future. IEEE Comput. Intell. Mag. 6(1), 30–42 (2011)
5. Alam, S., Zhao, W., Tang, J., Lokan, C., Ellejmi, M., Kirby, S., Abbass, H.: Discovering delay
patterns in arrival traffic with dynamic continuous descent approaches using co-evolutionary
red teaming. Air Traffic Control Q. 20(1), 47 (2012)
6. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Computational red teaming for
correction of traffic events in real time human performance studies. In: USA/Europe ATM
R&D Seminar, Chicago (2013)
7. Barlow, M., Easton, A.: Crocadile: an open, extensible agent-based distillation engine. Inf. Secur. 8(1), 17–51 (2002)
8. Barlow, M., Yang, A., Abbass, H.: A temporal risk assessment framework for planning a future
force structure. In: IEEE Symposium on Computational Intelligence in Security and Defense
Applications, (CISDA), pp. 100–107. IEEE, Honolulu (2007)
9. Bitinas, E.J., Henscheid, Z.A., Truong, L.V.: Pythagoras: a new agent-based simulation system.
Technol. Rev. J. 11(1), 45–58 (2003)
10. Calder, R., Smith, J., Courtemanche, A., Mar, J., Ceranowicz, A.Z.: ModSAF behavior simulation and control. In: Proceedings of the Conference on Computer Generated Forces and Behavioral Representation (1993)
11. Caldwell, W.J., Wood, R., Pate, M.C.: JLINK—Janus fast movers. In: Proceedings of the
27th Conference on Winter Simulation, pp. 1237–1243. IEEE Computer Society, Washington
(1995)
35. Millikan, J., Brennan, M., Gaertner, P.: Joint seminar wargame adjudication tool (jSWAT). In:
Proceedings of the Land Warfare Conference (2005)
36. NATO: Bi-strategic command alternative analysis concept. Tech. rep., Supreme Allied
Commander, Norfolk (2012)
37. Porter, M.E.: What is strategy? Harv. Bus. Rev. (November–December), 61–78 (1996)
38. Ranjeet, T.: Coevolutionary algorithms for the optimization of strategies for red teaming
applications. Ph.D. thesis, Edith Cowan University (2012)
39. Ranjeet, T.R., Hingston, P., Lam, C.P., Masek, M.: Analysis of key installation protection
using computerized red teaming. In: Proceedings of the Thirty-Fourth Australasian Computer
Science Conference, vol. 113, pp. 137–144. Australian Computer Society, Darlinghurst (2011)
40. Rastegari, S., Hingston, P., Lam, C.P., Brand, M.: Testing a distributed denial of service
defence mechanism using red teaming. In: IEEE Symposium on Computational Intelligence
for Security and Defense Applications (CISDA), pp. 23–29. IEEE, Ottawa (2013)
41. Schwarz, G.: Command and control in peace support operations model PAX: approaching new challenges in the modeling of C2. Tech. rep., DTIC Document (2004)
42. Shafi, K., Abbass, H.A.: Biologically-inspired complex adaptive systems approaches to
network intrusion detection. Inf. Secur. Tech. Rep. 12(4), 209–217 (2007)
43. Teo, J., Abbass, H.A.: Multiobjectivity and complexity in embodied cognition. IEEE Trans.
Evol. Comput. 9(4), 337–360 (2005)
44. Thornton, C., Cohen, O., Denzinger, J., Boyd, J.E.: Automated testing of physical security: red
teaming through machine learning. Comput. Intell. (2014)
45. Turing, A.M.: Computing machinery and intelligence. Mind, pp. 433–460 (1950)
46. Tzu, S.: The Art of War, p. 65. Translated by Samuel B. Griffith. Oxford University Press,
New York (1963)
47. Upton, S.C., McDonald, M.J.: Automated red teaming using evolutionary algorithms. WG31–
Computing Advances in Military OR (2003)
48. Upton, S.C., Johnson, S.K., McDonald, M.J.: Breaking blue: automated red teaming using
evolvable simulations. In: GECCO 2004 (2004)
49. Von Clausewitz, C.: On War. Digireads.com Publishing (2004)
50. Wheeler, S.: Moving forward with computational red teaming. Tech. rep., Defence Science and
Technology Organisation - DSTO, Australia (2012)
51. White, G.: The mathematical agent: a complex adaptive system representation in BactoWars. In: First Workshop on Complex Adaptive Systems for Defence (2004)
52. White, G., Perston, R., Bowden, F.: Force flexibility modelling in BactoWars. In: Proceedings of the International Congress on Modeling and Simulation (MODSIM), pp. 663–669 (2007)
53. Wittman Jr., R.L., Harrison, C.T.: OneSAF: a product line approach to simulation development. Tech. rep., DTIC Document (2001)
54. Yang, A., Abbass, H.A., Sarker, R.: Evolving agents for network centric warfare. In: Proceed-
ings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 193–195. ACM,
Washington (2005)
55. Yang, A., Abbass, H.A., Sarker, R.: Landscape dynamics in multi–agent simulation combat
systems. In: AI 2004: Advances in Artificial Intelligence, pp. 39–50. Springer, Berlin (2005)
56. Yang, A., Abbass, H.A., Sarker, R.: Characterizing warfare in red teaming. IEEE Trans. Syst.
Man Cybern. B 36(2), 268–285 (2006)
57. Yang, A., Abbass, H.A., Sarker, R.: How hard is it to red team? In: Abbass, H.A., Essam,
D. (eds.) Applications of Information Systems to Homeland Security and Defense, p. 46. IGI
Global, Hershey (2006)
58. Zhao, W., Alam, S., Abbass, H.A.: Evaluating ground–air network vulnerabilities in an inte-
grated terminal maneuvering area using co-evolutionary computational red teaming. Transp.
Res. C Emerg. Technol. 29, 32–54 (2013)
Chapter 2
Analytics of Risk and Challenge
2.1 Precautions
This chapter will revisit many basic concepts that may already seem familiar to many readers. Nevertheless, a formal definition of each of these concepts will be provided.
Some of the definitions will be obvious, some may deviate from daily uses of the
concept, and some may even contradict our present understanding of the concept.
This is why defining these basic concepts is essential.
The discussion of many concepts in this chapter intersects with other disciplines,
including those of the behavioral and educational sciences and organizational
psychology. In fact, the psychology literature deals richly with these concepts, with many articles published on each of the concepts that will be discussed here.
A CRT exercise may include a behavioral psychologist to perform a behavioral
assessment of the blue team. It may use an organizational psychologist to understand
the culture of the blue organization or it may include a cognitive psychologist
to advise on task designs with specific cognitive-load characteristics to overload
blue's thinking. The discussion in this chapter does not aim to cover these roles or the science needed to perform each of them. A psychologist in any of these roles is another member of the CRT exercise, bringing their own expertise to it. The psychology literature examines each of these roles, and the concepts underpinning them, in more depth than the discussion here.
The discussion in this chapter does not aim to reproduce the psychology
literature, nor does it aim to introduce a new psychological theory. The main aim is
to design a model of a challenge that we can use in a computational environment.
This model will be used to analyze an algorithm, a machine, a human, or an organization. The discussion will offer simple and structured behavioral models that
can be used by non-psychologists. These models are simple when compared to the
great amount of literature available on the concepts, and the complexity involved
in understanding human psychology. However, the models are reliable: whether we use pencil and paper or computers to red team, and whether we apply them to small or large-scale problems, they will produce results that can be traced to causes and evidence.
To bring the different pieces of a model of challenge together successfully, the
discussion will intersect with a number of fields, including psychology, education,
risk management, system theory, and computational sciences. Structuring these
concepts is a daunting task. First, science by nature offers a thesis and antithesis.
The reader may find scientific articles with different definitions that contradict
each other. In places, the treatment of the topic will certainly contradict some of
this science. Second, most of these concepts are also used in our daily language;
therefore, a first encounter with a definition for any of these concepts that does not
comply with one of our daily uses may create unease for the reader.
Nevertheless, given that one of the aims is to structure these concepts so that we
are able to compute them, we must define them clearly in an unambiguous manner.
Such unambiguous definitions will eliminate confusion in the reader’s mind while
reading this book, even if the definitions themselves are not universally accepted.
2.2 Risk Analytics
As data are collected, the organization continuously assesses the situation, the associated risks,
and the threats that may exist in the environment. Most of these terminologies, such
as risk and threats, will be explained in more detail in the rest of this chapter.
For the time being, we can rely on common knowledge in understanding these
terminologies to follow the current discussion on the risk analytics process.
When the organization identifies a specific type of threat or a possible negative or
positive impact on organizational objectives, a need arises to analyze this situation
and formulate alternatives. Response analysis is the process of formulating and
assessing responses. Consequence analysis then projects some of the selected
responses onto future states to assess the longer term impact of these responses
on organizational objectives.
A suitable response is then selected. The response design step transforms the
selected response into suitable actions that can be executed. For example, one
important aspect of response design is how the selected response will be framed
to others. The organization may decide to fire people. Will the organization present
this response as a direct consequence of a drop in sales, as a restructuring of operations to improve productivity, or as a step towards renewing the organization? Framing the
response is a very critical skill that can dramatically impact the effectiveness of the
response in achieving the intended impact.
When risk analytics relies on designing challenges as the tool to react to threats, the process becomes more targeted: the threat actor becomes the focal point of the analysis. In other words, intentional actions become paramount in both the analysis and the response.
CRT is designed to challenge an entity. The success of our ability to challenge this
entity must be reflected in its performance. In CRT, this entity can be anything from
a human to a machine, from a company to a country, and from a technology to ideas
and beliefs. Regardless of what the entity is, it needs to have an owner. We will call
the owner a person or an agent.
We will follow the legal definition of a "legal person," or a person for short. A person can be natural, such as a human, or juridical, such as a corporation. We will
reserve the word “agent” to mean both a person and software that performs some
tasks by producing actions. We will use the word “entity” to refer to agents that
think/compute and act or objects that do not think or act. If our discussion is limited
to a software agent, we will explicitly refer to it as a “software agent.”
While a person and an agent are systems by definition, as will be discussed in this
section, the word “system” will be used to emphasize the structure over the identity,
and the words “person” or “agent” will be used to emphasize identity over structure.
Whatever the type of agent, we will consider the agent a living organism: it continuously produces actions in the environment. Even if the agent stays still, staying still is an action. When a human or a computer program goes to sleep, this, too, is an action.
A properly designed intentional action needs to consider the outcomes the agent intends to achieve, the fulfilment of the agent's objectives and goals, and the uncertainty surrounding the achievement of these outcomes. This begs the question of what
these concepts mean.
Definition 2.3. An objective is an approximately measurable phenomenon with a
direction of increase or decrease.
The phenomenon can be the agent’s state. For example, when an affective state
such as happiness or a physical state such as monetary richness becomes the subject
of an objective, we would usually have a metric to measure this state. In the case
of the affective state of happiness, we may not have a direct manner by which to
measure the state itself, but we can use a set of indicators. These indicators are
blended (fused) to provide a measurement of the degree of happiness. We would
then either attempt to increase (maximize) or decrease (minimize) the degree of
happiness.
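Definition 2.3 can be made concrete with a small sketch. The indicators, weights, and linear fusion rule below are all hypothetical; the point is only that an approximately measurable degree, plus a direction of increase or decrease, constitutes an objective:

```python
# Sketch: "happiness" is not directly measurable, so it is fused from
# indirect indicators. Indicators, weights, and the fusion rule are
# hypothetical illustrations.

def happiness(indicators, weights):
    """Blend indirect indicators into one approximate degree of happiness."""
    return sum(weights[k] * indicators[k] for k in weights)

state = {"leisure_hours": 0.6, "social_contact": 0.8, "health": 0.7}
weights = {"leisure_hours": 0.2, "social_contact": 0.5, "health": 0.3}

objective = ("maximize", happiness(state, weights))  # direction + degree
print(objective[0], round(objective[1], 2))  # maximize 0.73
```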
In CRT, the objectives of both teams are somehow interdependent because the agents' states depend on each other. For example, the red team's
affective state of happiness may be negatively influenced by the blue team’s state
of richness (as in the simple case of human jealousy); thus, a decrease in blue’s
richness generates an increase in red’s happiness. In this case, the red team may
have an objective of minimizing the richness of the blue team to maximize its own
happiness. If the teams’ objectives are independent of each other, they should act
independently; therefore, there is no need for the CRT exercise in the first place.
If red and blue objectives are positively correlated,1 the teams can optimize them either by continuing to act independently or by seizing an opportunity that might arise to act cooperatively. In this case, the objective of the CRT exercise is to explore novel opportunities for collaboration.
However, in most cases, CRT exists for competitive situations.2 In this case, a
blue-red competition can only exist if blue and red have conflicting objectives. Con-
flicting objectives can take two forms. In the first form, the objectives themselves
are in direct conflict with each other. For example, in a situation of war, blue wishes
to win at the cost of red losing, and vice versa.
In the second form, the objectives may not be in obvious conflict, but limited
resources place them in conflict. For example, there are two departments in a
company, one is responsible for research and development (R&D) and the other
is responsible for the core-business production line (the production department).
1. Two objectives are said to be positively correlated if an improvement in one is accompanied by an improvement in the other, and vice versa.
2. Even when we discuss CRT for cooperative situations, we use competition as the way to achieve cooperation. For example, by challenging the student's mind with stimulating ideas, the student becomes more engaged, and pays more attention to and cooperates with the teacher.
Fig. 2.3 Blue and red objective spaces and their correlations. A solid arrow/line indicates positive
correlation; a dotted arrow/line indicates negative correlation
positive relationship with ob6 for blue. That is, it is beneficial for both blue and red
to cooperate to maximize these objectives.
However, this conclusion is superficial. We need to understand the complex
interteam and intrateam interactions in the objective space.
For blue, ob6 positively influences ob3, while an improvement in ob3 will improve ob2, which will negatively influence ob6. This generates a negative cycle within blue's objective space. For example, improving education intake and quality would improve health, but improving health would increase the age of retirement, degrading the job market, which then negatively influences education. Similarly, in a network-security scenario, creating a stronger security system through multiple biometric authentication protocols would increase system protection, but increasing system protection would reduce the usability of the system (customers need to spend more time to authenticate), which may increase customer dissatisfaction. These examples demonstrate the internal conflict that can exist within the intrateam objective space.
This creates an internal conflict within blue objectives. Blue would then need
to establish its own trade-offs. In the meantime, red does not have the same
internal conflict: or7 negatively influences or6, which positively influences or2,
which positively influences or4, which negatively influences or1, which positively
influences or3. That is, or7 positively influences or3 (if we multiply all signs on the
path, we obtain a positive sign). We notice that there is a conflict between or4 and
or1, but this conflict does not impact the interdependency between red’s external
objectives.
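The "multiply all signs on the path" rule used in this analysis is mechanical and easy to automate. A minimal sketch (the edge signs are transcribed from the example above; the function itself is generic):

```python
# Net effect of one objective on another along a path in a signed
# influence diagram: multiply the signs of the edges on the path.

def path_sign(signed_edges, path):
    """signed_edges maps (src, dst) -> +1 or -1; path is a node sequence."""
    sign = 1
    for src, dst in zip(path, path[1:]):
        sign *= signed_edges[(src, dst)]
    return sign

# Edge signs transcribed from the example in the text.
edges = {
    # Red's chain: or7 -(-)-> or6 -(+)-> or2 -(+)-> or4 -(-)-> or1 -(+)-> or3
    ("or7", "or6"): -1, ("or6", "or2"): +1, ("or2", "or4"): +1,
    ("or4", "or1"): -1, ("or1", "or3"): +1,
    # Blue's cycle: ob3 -(+)-> ob2 -(-)-> ob6 -(+)-> ob3
    ("ob3", "ob2"): +1, ("ob2", "ob6"): -1, ("ob6", "ob3"): +1,
}

print(path_sign(edges, ["or7", "or6", "or2", "or4", "or1", "or3"]))  # 1
print(path_sign(edges, ["ob3", "ob2", "ob6", "ob3"]))                # -1
```

The positive result for red's chain matches the conclusion that or7 positively influences or3, and the negative result for blue's cycle confirms the negative feedback loop.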
If we examine the intrateam interaction, we see that ob6 for blue positively
influences ob3 for blue, which negatively influences or3 for red. Therefore, blue
has the following two problems:
1. Blue has a negative feedback cycle internally: ob3 ob2 ob6 o3b. Red can
influence this negative feedback cycle as red’s or7 objective interacts positively
with blue’s ob6 objective. Thus, red can influence blue’s decision made on any
internal level of trade-off.
2. Red’s or3 and or7 objectives reinforce each other. In the meantime, red’s or3
objective is in conflict with blue’s ob3 objective. As red improves its own or3
objective, blue’s ob3 objective deteriorates.
Once these objectives become known, each team proceeds to design plans to
achieve their objectives. To monitor progress toward the objectives, goals are
defined.
Definition 2.4. A goal is a planned objective.
Based on the agent’s assessment of what is possible and what is not, the agent
can establish an “aspiration level” for each objective. This process of planning and
designing aspiration levels transforms each objective, where the agent wishes to
optimize the objective, to goals, where the agent wishes to reach the way-point
indicated by the aspiration level.
↓ f(x)
S.T. x ∈ Φ(x)
where f(x) is the objective the agent wishes to optimize (minimize in this case),
x is the decision variable(s), the alternatives or courses of action from which the
agent needs to choose, and Φ(x) is the feasible space of alternatives. Every solution
belonging to the feasible space Φ(x) satisfies all constraints in the problem. We use
↓ to denote minimization, ↑ to denote maximization, and "S.T." as a shorthand for
"subject to the following constraints or conditions."
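The minimization over a feasible set just described can be sketched in a few lines (the objective and feasible set below are illustrative assumptions, not from the book):

```python
# A minimal sketch of the formulation: choose the alternative x in the
# feasible set Phi that minimizes the objective f(x). Illustrative values.

def f(x):
    return (x - 3) ** 2  # objective to minimize

phi = [x for x in range(10) if x % 2 == 0]  # feasible alternatives Phi

best = min(phi, key=f)
print(best, f(best))  # 2 1
```

Every candidate in `phi` satisfies the constraints by construction; the agent simply selects the feasible alternative with the smallest objective value.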
For an agent to optimize one of its objectives, it needs to form a plan, or a
series of actions to make this optimization work. The agent’s plan is designed
after careful assessment of what is possible and what is not, or what we will term
“constraints.” Once planning is complete, the agent becomes more aware of the
environment, as well as what it can achieve and what it cannot. In this case, the
objective is transformed into a goal and the formulation above can be re-expressed
as is presented in the following equation.
↓ d⁻ + d⁺
S.T.
f(x) + d⁻ − d⁺ = T;
x ∈ Φ(x)
where T is the aspiration level (target) set for f(x), and d⁻ and d⁺ are deviation
variables measuring under- and over-achievement of the target, respectively.
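A minimal sketch, not from the book, of turning an objective into a goal: fix an aspiration level T for f(x) and minimize the total deviation from it (the objective, feasible set, and target below are illustrative assumptions):

```python
# Goal form: minimize d_minus + d_plus, where f(x) + d_minus - d_plus == T.

def f(x):
    return (x - 3) ** 2  # illustrative objective

phi = [x for x in range(10) if x % 2 == 0]  # illustrative feasible set
T = 4  # aspiration level (target) for f(x)

def deviations(x):
    gap = T - f(x)
    d_minus = max(gap, 0)   # under-achievement: f(x) below the target
    d_plus = max(-gap, 0)   # over-achievement: f(x) above the target
    return d_minus, d_plus

# The agent now reaches for the way-point T rather than the optimum of f.
best = min(phi, key=lambda x: sum(deviations(x)))
print(best, deviations(best))  # 2 (3, 0)
```

Note that the chosen alternative satisfies the goal constraint: f(2) + 3 − 0 = 4 = T.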
1. The first difference lies in who owns the trade-off. For the intrateam conflicting
objectives, each team owns their problems and therefore can decide on the level
of trade-off they wish to achieve. In the interteam conflicting objectives, the
trade-off is owned by both teams together. The issue of ownership is core when
selecting an appropriate technique to solve these problems because it defines
the level of control of a team on implementing a proposed solution. One would
expect that red and blue could exercise more control internally than externally.3
The implication here is that an internal decision made by one team will be easier to
implement than an external decision.
2. The second difference lies in the nature of the trade-off. In the interteam
conflicting objective space, the trade-off is not usually a one-off decision; it needs
to be negotiated and determined by both teams together. As blue makes a
decision, red responds, and as red makes a decision, blue responds. Therefore,
the trade-off in the interteam conflicting objective space is more dynamic than in
the intrateam conflicting objective space.
3. The third difference lies in the nature of uncertainty and information availability
in the intrateam and interteam conflicting objective spaces. In an intrateam
situation, external uncertainty is almost uncontrollable. The system attempts
to decide on its actions to manage the risk of these external uncertainties. In
the interteam situation, uncertainty is dynamic. As the two teams interact, their
actions can shape the uncertainty space. This discussion point will be revisited in
Sect. 4.1.
By now, we should ask whether the division between internal conflicting
objectives and external conflicting objectives is meaningful. In fact, this division
largely depends on where we draw “system boundaries.” In the following section,
3 How to deal with the situation when one of the teams has more control externally than internally is outside the scope of this book.
This argument is flawed in two aspects. First, it reflects the limited view that
CRT is a military or national-security exercise. Limiting the concept of CRT to these
domains will harm these domains because the constrained context, while important,
limits the possibilities for CRT to grow as a science.
The second reason the argument is flawed is that the concept of senior
management exists in every problem. Senior management is not an external counseling
service or a legal authority. Members of senior management come from different
portfolios in an organization. Even for matters related to military or national
security, different countries are members of a larger international organization such
as the United Nations. This does not eliminate the need for CRT on a country
level, a state level, a department level, an organization level, a technological level,
or even an algorithmic level. CRT is a nested exercise simply because conflict in
objectives is a nested concept. The fact that larger objectives are composed of
smaller objectives can create conflict itself, and as each person is responsible for
a different portfolio within an organization, CRT on one level is composed of CRT
exercises on sublevels.
2.2.3 Systems
As discussed, the primary reason that red and blue are in conflict is that the
objectives of the blue system are in conflict with the objectives of the red system.
In a CRT exercise, it is critical to consider each of the red and blue teams as a system.
For red, blue is a system that red attempts to dysfunction by counteracting its
objectives. The same is true for blue: red is a system that is attempting to dysfunction
blue because red's objectives are in conflict with blue's objectives. We use the
word “dysfunction” since interference with a system’s objectives with the aim of
acting against the benefits of the system is a possible cause for dysfunction. This
dysfunction can take the form of simply influencing the objectives of one team to
change, or in more dramatic situations, of damaging the components of the system.
Classically, a system is perceived as a group of components or entities interacting
for a purpose. This definition is too basic here, and does not adequately service our
analysis. Therefore, a system is defined here as follows.
Definition 2.5. A system is a set of entities: each has a capacity to receive inputs,
perform tasks, generate effects, and complement the other toward achieving goals
defined by a common purpose.
Definition 2.6. An effect is a measurable outcome generated by an action or caused
by a change in a system state.
The above definition of "system" can be considered an elaboration of the
classical definition of a system. However, this further level of detail is necessary.
It makes it clearer to an analyst that when they define a system (such as the red or
blue system), they must map out the entities; the inputs to each entity; the task each
entity is performing (reflecting the purpose of this entity or subsystem); the effects
that each entity generates; and how these entities and their objectives depend on
each other and come together to achieve the overall purpose of the system.
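Definition 2.5 suggests a natural data structure for this mapping exercise: entities with inputs, a task, and generated effects, grouped under a common purpose. The sketch below is one possible rendering; the entity names and tasks are illustrative assumptions, not from the book:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Entity:
    name: str
    task: Callable[[List[str]], str]  # transforms inputs into an effect
    inputs: List[str] = field(default_factory=list)

    def act(self) -> str:
        return self.task(self.inputs)

@dataclass
class System:
    purpose: str
    entities: List[Entity]

    def effects(self) -> List[str]:
        # Each entity contributes a measurable effect toward the common purpose.
        return [e.act() for e in self.entities]

# Hypothetical entities, purely for illustration.
sensor = Entity("sensor", lambda ins: "observation", ["environment"])
planner = Entity("planner", lambda ins: "plan", ["observation"])
aircraft = System(purpose="fly", entities=[sensor, planner])
print(aircraft.effects())  # ['observation', 'plan']
```

Filling in such a structure forces the analyst to state explicitly each entity's inputs, task, and effect, exactly as the definition requires.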
The definition for “effect” clarifies that given actions are produced continuously,
effects are also generated continuously. Every action produces many outcomes.
An effect is a measurable outcome within the context of CRT. If the outcome is
not measurable, it cannot be considered within a CRT exercise until it becomes
measurable (either directly or indirectly through a set of indicators); otherwise the
exercise will become an ad hoc activity.
If we want to discuss change in happiness as an effect, we need to know how
to measure happiness. Alternatively, we need to find indicators that collectively
indicate happiness so we can measure these indicators. If we cannot measure
directly or indirectly, we cannot manage, we cannot engineer, we cannot define a
reward or penalty, and we simply cannot influence or control.
The definition of “effect” also emphasizes that effects can be produced without
actions. For example, aging is an effect of time. Even if a person lies in bed
in a coma, the body will continue to age and decay.4 These changes in the state of
the system are naturally occurring without actions per se.
The definitions of system and effects used above are particularly useful for a
red teamer because they create knobs for engaging with the system to steer it and
influence it more clearly. Knowing how the entities interact and the
resultant effects provides us with an idea of which entities are more important than
others, and which are more controllable than others. Once we define the key entities
we wish to control, we can examine how to control them and the desired changes in
the effects. However, given that each of these entities is a system, we can continue
to deconstruct the problem and locate more control points.
The second group of knobs is the inputs, the tasks an entity is performing, and
the effects an entity generates. Chapter 4 will present a more elaborate discussion
on this issue. Understanding these knobs facilitates the task of the red teamers.
Components comprising a system are, in their own right, systems. An aircraft is
a system, as it consists of mechanical, software, fuel, and human components,
without which it cannot fulfil its purpose. The purpose of an aircraft is to fly. This is
actually an assumption we should pause and consider in depth.
Definition 2.7. The purpose of a system is the reason for being from the perspective
of an external observer.
While the components are internal to the system, the purpose is always in the eyes
of the beholder. The purpose of a system is an external judgment that is made by an
external stakeholder or observer. The purpose is defined by an external entity, which
can also be the owner of the system. Therefore, the same system can have multiple
4 One can consider this concept on a philosophical level as actions produced by the environment that cause decay to occur, but we will avoid this level of interpretation in this book because it can create unmanageable analysis.
purposes. For an airline, an aircraft’s purpose is to make money through flying. For
the post office, an aircraft’s purpose is to deliver the mail. For a business passenger,
an aircraft’s purpose is to provide transportation to attend business meetings. For a
world traveler, an aircraft’s purpose is to provide transportation to travel from place
to place for enjoyment.
The different views on the purpose of an aircraft by different external stakeholders
in the community may generate conflicting objectives. Making more profit from
an airline perspective can create conflict with a passenger who wishes to minimize
the cost of travel as much as possible. A longer route at an optimal altitude may
minimize fuel costs for the airline as compared to a shorter route at an inefficient
altitude, which burns more fuel. However, for the business passenger, a longer route
may entail late arrival at the destination.
For an airline company, the board will define the purpose of the company.
One can perceive the board as an external entity, which in reality it is because it
represents the interface between the stakeholders of the company and the company
itself. The chief executive officer (CEO) sits on the board as an ex-officio member and reports
to the board. Through the CEO, the purpose is translated into internal objectives,
which are then transformed into goals, key performance indicators, and plans.
While the aircraft's purpose for one person is for them to be able to fly, for
another, it might be a symbol of power and wealth: imagine having an aircraft in
your backyard that you do not intend to use. You only have it on display to show
your neighbors how wealthy you are.
In the latter case, it does not matter whether we run out of fuel since the purpose
of this aircraft is to symbolize power and wealth, not to fly. It does not even matter
if the crew does not arrive or the control software system is not working. These
elements are not critical for the purpose.
Therefore, there is a tight coupling between the purpose of a system, and which
elements of an aircraft are deemed important for that purpose. Elements contributing
to different purposes can overlap. However, all elements of an aircraft may exist, but
not all of them are critical elements for the aircraft (the system) to fulfil its purpose.
Therefore, what defines the “critical elements” in a system can be different from one
observer to another, and from one stakeholder to another.
Definition 2.8. An element or component in a system is termed “critical” if the
removal of, or cause of damage to, this element or component would significantly
degrade the ability of the system to achieve its objective, goal, or purpose.5
For example, the heart is a critical element in the human body because, if it is
attacked, the human body (the system in this context) will find it difficult
to achieve its objectives and its purpose of functioning efficiently and living,
respectively.
5 Most of the definitions used for critical elements, hazards, threats, and risks in this book are compatible with ISO 31000 [8], but are sometimes slightly changed to fit the context of this book.
In the example of the aircraft in the backyard as a symbol of power, the critical
element of the aircraft is that it has all its exterior body parts, including the wheels.
Scratches in the paintwork may not affect its ability to fly, but would certainly affect
its appearance as a symbol of power. The engine is no longer a critical component;
if it is not working, the appearance is not impacted.
It is clear that what makes a component in the system a critical element is its
contribution to the capacity of the system in achieving its purpose. However, neither
this capacity nor the objectives are deterministic; they are impacted by both internal
and external uncertainties.
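Definition 2.8 can be operationalized, per observer, as a table of how much each component contributes to a given purpose; a component is critical for that purpose if its removal degrades the system beyond some threshold. The numbers below are illustrative assumptions echoing the aircraft example (engine critical for flying, paintwork critical for display):

```python
# Contribution of each component to each purpose (0 = none, 1 = essential);
# illustrative numbers, not from the book.
contribution = {
    "fly":     {"engine": 0.5, "software": 0.3, "paintwork": 0.0, "body": 0.2},
    "display": {"engine": 0.0, "software": 0.0, "paintwork": 0.6, "body": 0.4},
}

def critical_elements(purpose, threshold=0.2):
    """Components whose removal degrades the purpose by at least `threshold`."""
    return {c for c, v in contribution[purpose].items() if v >= threshold}

print(sorted(critical_elements("fly")))      # ['body', 'engine', 'software']
print(sorted(critical_elements("display")))  # ['body', 'paintwork']
```

The same component set yields different critical elements for different purposes, which is precisely the observer-dependence the text describes.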
A properly designed action must consider the outcomes the agent intended to
achieve at the time the action was formed to fulfil the agent’s objectives or goals,
as well as the uncertainty surrounding the achievement of these outcomes. So far,
we have discussed objectives and goals. However, the perceived outcomes are the
agent’s expectation of an action’s impact on objectives given the uncertainty of
that impact. Many factors come into play in determining this uncertainty, from the
personality traits of the agent to the agent’s sensorial abilities, availability and access
to information for the agent, and the complexity of the situation the agent faces.
Every action must be evaluated through its effects and the impact of these effects
on both red’s and blue’s objectives. These effects need to be designed systematically
and consider the uncertainty in the environment. Therefore, in CRT, the concept of
risk is paramount.
From an agent’s perspective, Fig. 2.6 depicts a basic form of the decision-making
cycle an agent undergoes. The agent relies on its sensors to perceive uncertainty in
the environment. The agent has a set of feasible actions it wishes to evaluate for the
particular context in which it is attempting to make a decision. Together with the
agent’s objectives, the agent needs to make a judgment on how these uncertainties
impact the agent’s objectives for each possible action the agent needs to evaluate.
The agent selects a possible action to execute based on the agent’s assessment of
the impact of uncertainty on objectives if this action is executed. This assessment
is also influenced by the agent’s risk personality traits and experience. The agent’s
personality towards risk is influenced by the agent's perception of uncertainty and
the feedback received from the environment; together, they can reshape the agent's
attitude to risk.
For example, the manner in which a message is framed and presented to an agent
influences the agent's perception of the level of uncertainty in the environment.
Consider for example the difference between “this person is trustworthy” and “to
my knowledge, this person is trustworthy”. The second statement can be perceived
to carry more uncertainty than the first, even though we understand that whatever
statement someone makes is based on that person's level of knowledge.
When the action is executed, an effect is generated in the environment, which the
agent senses through its sensorial capabilities and feedback; this effect is then used
for further learning. We note that this effect carries uncertainty information as well.
The cycle continues, and the agent continues to perceive the uncertainty in the
environment, evaluating its impact on objectives, producing an action accordingly,
monitoring the effect, and generating appropriate feedback to update its experience
and learn.
The diagram shows that the agent's risk is a function of its objectives and
uncertainty.
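The cycle just described (perceive uncertainty, evaluate feasible actions against objectives, act, sense the effect, and let feedback reshape the attitude to risk) can be sketched as a loop. Everything numeric here (payoffs, uncertainties, the attitude-update rule) is an invented illustration:

```python
import random

random.seed(0)  # reproducible illustration

# Feasible actions with (expected payoff, uncertainty); invented numbers.
actions = {"cautious": (1.0, 0.1), "bold": (3.0, 2.0)}
risk_attitude = 1.0  # how strongly perceived uncertainty penalizes an action

def evaluate(action, attitude):
    payoff, uncertainty = actions[action]
    # Judgment of the impact of uncertainty on the objective for this action.
    return payoff - attitude * uncertainty

for step in range(3):
    # Perceive and evaluate, then select the best action under the current attitude.
    choice = max(actions, key=lambda a: evaluate(a, risk_attitude))
    payoff, uncertainty = actions[choice]
    # Execute: the realized effect carries uncertainty (noise).
    effect = payoff + random.gauss(0, uncertainty)
    # Feedback: the observed effect reshapes the agent's attitude to risk.
    risk_attitude *= 0.9 if effect >= payoff else 1.1
    print(step, choice, round(effect, 2))
```

The loop captures the closed cycle: selection depends on the current risk attitude, and the sensed effect in turn updates that attitude for the next decision.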
Definition 2.9. Risk is the impact of uncertainty on objectives.6
The definition of risk above includes both positive and negative impact; therefore,
it assumes that risk can be negative or positive. For example, the risk of investing in
the stock market can be positive (profit) or negative (loss). In both cases, we would
use the term risk because at the time the decision was made to invest, the decision
maker should have evaluated both possibilities: the possibility of making profit and
the possibility of making loss. An educated decision maker when making a decision
to invest accepts the negative risk as a possible outcome, and equally, the positive
risk as another possible outcome.
6 We have changed the definition of risk from the one introduced in ISO 31000 [8] by using the word "impact" instead of "effect". The reason is that the word "effect" has a more subtle meaning in this chapter.
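The stock-market example can be made concrete: under Definition 2.9, risk covers both the upside and the downside of the uncertain impact on the objective. The probabilities and impacts below are illustrative assumptions, not data:

```python
# Two-sided risk for a hypothetical investment: (probability, impact on the
# wealth objective). Invented numbers for illustration.
outcomes = [(0.6, +500.0), (0.4, -800.0)]

expected_impact = sum(p * impact for p, impact in outcomes)
upside = sum(p * impact for p, impact in outcomes if impact > 0)
downside = sum(p * impact for p, impact in outcomes if impact < 0)

print(round(upside, 6), round(downside, 6))  # 300.0 -320.0
print(round(expected_impact, 6))             # -20.0
```

An educated decision maker evaluates both terms before acting; collapsing risk to the downside alone discards half of the definition.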
The common goal of a CRT exercise is to manage risk. This claim is safe
because underlying every use of CRT discussed in Chap. 1 lie objectives and
uncertainties that drive the overall CRT exercise. The CRT exercise is established
to fulfil a purpose that takes the form of a function. One of the main functions of
CRT discussed in Chap. 1 is to discover vulnerabilities as a step towards designing a
risk-management strategy. By discovering vulnerabilities, we become aware of them
and we can take precautions to protect the system. However, what is a vulnerability?
ISO 31000 defines vulnerabilities as "a weakness of an asset or group of assets that
can be exploited by one or more threats”[8]. In this book, we will adopt a definition
from a system perspective [4] because words such as “assets” can be confusing
if they are not understood from an accounting perspective. As such, the following
definition of “vulnerability” is provided.
Definition 2.10. A vulnerability is the possibility evaluated through the level of
access or exposure a hazard or a threat has to a critical component of a system.
A hazard is an unintentional act that may harm the system, such as a fire. A threat
is an intentional act, such as a hired hacker who has the intention to hack into the
computer network and cause damage. For the network administrator, this hacker is
a threat.
Vulnerability exists through exposure to an authorized or unauthorized (even
accidental) access of a critical element to a hazard or a threat; we will refer to this
exposure as “events.” What creates risk is the level of uncertainty of this exposure,
and the magnitude of damage that can accompany the exposure if it occurs; thus, the
uncertainty surrounding the circumstances in which the event will occur will impact
the critical element, which will in turn impact the objectives.
Risk = Vulnerability ⊗ Effect
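One minimal reading of this composition, under the simplifying assumption that the combination operator is plain multiplication, treats vulnerability as the probability that a hazard or threat reaches the critical element and effect as the damage magnitude if it does:

```python
def risk(exposure_probability, effect_magnitude):
    # Vulnerability: uncertainty that the hazard/threat reaches the critical element.
    # Effect: magnitude of the damage if the exposure occurs.
    return exposure_probability * effect_magnitude

# Stolen-password scenario: moderate chance of exposure, large damage.
# Numbers are illustrative assumptions.
print(risk(0.25, 1000.0))  # 250.0
```

Other combination operators are possible; the point of the sketch is only that both factors, uncertainty of exposure and magnitude of effect, must be present for risk to exist.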
The building blocks for hazards and threats are shown in Fig. 2.7. These building
blocks provide knobs to control hazards and threats. An entity needs to be capable
of performing the act. Therefore, capability is one building block. We will revisit
the concept of capability and deconstruct it into components in Chap. 4. For the
time being, an entity has the capability if it has the ingredients to provide it with
the capacity to perform the act. For example, a computer hacker needs to have the
knowledge to hack into a computer. In Sect. 2.14, we will call this know-how the
skills to hack into a computer. The collective skills necessary to perform the act of
computer hacking represent one dimension of the capability of the entity. Similarly,
for a bushfire to ignite by nature, the ingredients of the capability need to be in
place. These can be the ability of the environment to have high temperature, dry
weather, etc. A thief who is denied the knowledge to hack a computer cannot become
a computer hacker because the thief has been denied the capability.
While we will expand more on the concept of a capability in Chap. 4, we will
approximate the ingredients of a capability in this chapter to physical ingredients
and know-how ingredients. Most of the analysis conducted in this book will focus on
the know-how. This is on purpose for two reasons. First, without the know-how, the
physical ingredients are insufficient. While it is true also that without the physical
capability, one can deny access as a means to prevent exposure and, therefore, the
opportunity to create an impact on critical elements, and one can shape and reshape
intent so that entities with the capabilities and opportunities do not become threats
in the system. This type of analysis can be used to assess the risk accompanying the
different roles of a red team that were discussed in Sect. 1.6.2.
Let us now take a more complex example that mixes hazards with threats.
Assume a system user leaves their password on their mobile telephone to
remember it; the telephone is stolen, and a criminal uses the password to
break into the system. In this case, the user did not have the intention to cause
damage, despite this possibly being considered an act of negligence. While the
password was the means to obtain unauthorized access to the system through
the intentional act of the criminal (a threat), the availability of the password to the
criminal was not intended by the user (a hazard).
A critical component such as the heart in a human becomes a vulnerability
when it is exposed to a hazard such as a car accident or a threat such as
someone intentionally attempting to dysfunction the heart through a stab wound.
The vulnerability here arises from the level of access that was granted to the hazard
or threat by the holder of the critical element. If a fence had been built that was
capable of stopping the car from hitting the human, access would have been denied,
and therefore, this particular vulnerability would have been eliminated.
Before this discussion ends, one final definition is necessary. This definition is
often ignored in the risk-management literature: the definition of a "trigger." It must be
understood that the event would normally require a trigger. A trigger is a different
type of event. Becoming angry with someone may trigger violence. The event of
violence would expose some critical elements of the system to a hazard or a threat;
thus, creating a situation of risk.
Here, the word “trigger” is preferred over the word “cause.” A strict definition
of a cause is that the effect would not materialize without the cause. If someone
is angry, many things (i.e. triggers) can happen to make this person produce an
undesirable action. More importantly, these things can happen still and the effect
may not occur. None of these things is a cause per se; the real cause is the cause
for the person’s anger, which could have been that the person failed an exam.
Therefore, a trigger can be considered an auxiliary cause or an enabler for the effect
to materialize [1].
For example, if throwing a stone at a window causes the glass to shatter, the effect
of the action is shattering. Before the action is produced, the effect of the action
must be evaluated while considering the possibility that the force of the stone is not
sufficient to cause the window to shatter. Thus, uncertainties should be considered
when evaluating expected effects.
We will avoid discussing causality in its philosophical form. Despite the fact that
some of these philosophical views are the basis for some of the tools used in this
book, they are not essential for understanding the materials in this book. Interested
readers can refer to [1].
the uncertainty by seeking more information. In this situation, the agent changed the
objective from maximizing the expected value for finding the apple to minimizing
uncertainty in the environment. When the agent manages to minimize uncertainty,
the agent becomes ready to shift its focus back to maximizing return.
Controlling uncertainty is a non-intuitive concept. In almost all types of classical
modeling presented in the literature, the emphasis is placed on how to represent
uncertainty and incorporate it in the model so that the solution produced by the
model is robust and resilient against the uncertainty. That is, classical modeling
approaches uncertainty from a passive perspective, seeing uncertainty as external
to the system, and the responsibility of a system’s designer is to find designs and
solutions that can survive the uncertainty.
CRT has a different perspective on the concept of uncertainty. Through CRT, we
can see uncertainty as a tool. Red must realize that through its own actions, it can
maximize blue’s uncertainty. Blue needs to realize the same. Red can confuse blue
and blue can confuse red. This form of a deliberately designed deceptive strategy
is not about deceiving the opponent team so that it believes one thing will be done
while the intention is to do another. Rather, deception here denotes deceiving the
opponent to the point at which they do not believe anything. The opponent becomes
overwhelmed with the uncertainty in the environment to the extent that it becomes
paralyzed. It does not move because every possible direction in which it can move
is full of unknowns. In such situations, the opponent will either not move at all or
will simply make a random move.
A CRT exercise takes an active approach toward the discovery of vulnerabilities.
In the majority of CRT exercises, even if the individual exercise is concerned
with the discovery of vulnerabilities caused by hazards, the issue of "intention", and
therefore "threats", demands a different type of analysis from that involved with
hazards. A criminal breaking into the system, after obtaining access to the
password through the mobile telephone is an intentional act. This act becomes
deliberate when it is planned. Studying the interaction between objectives and
uncertainties is the key difference between what we will term an “intentional action”
and a “deliberate action.” This difference may appear controversial from a language
perspective given the two concepts of intentional and deliberate are synonymous
in English, and are used synonymously in many textbooks. However, here, we
highlight differences between the two words.
Within the class of intentional actions, we will pay particular attention to the
subset of deliberate actions. We will distinguish “intentional” from “deliberate” to
differentiate between classical decision making in an environment in which risks are
not consciously evaluated by a red teamer (but in which the actions are consistent
with the intention of the person) and decision making that is always accomplished
after careful risk assessments.
Definition 2.11. A deliberate act is the production of an intentional act after careful
assessment of risk.
In classical AI, the term “deliberate action” implies an action that has been
decided on based on the construction of a plan. The definition we use above is
more accurate because the emphasis is placed on risk assessment; therefore, a plan
is being produced with risk as the focal point for evaluating different options and
decision paths.
Therefore, every deliberate act an agent generates should contribute to the
objectives. A series of effects is usually required for an agent to achieve one or
more objectives. These objectives in their totality should reflect and be aligned with
the purpose of the system.
In CRT, the impact of the uncertainty surrounding deliberate actions is evaluated
on both red and blue objectives (i.e. self and others). Because the actions are
deliberate, part of the CRT exercise is for each team to assess and analyze the
actions of the other team. By analyzing actions, one team can reveal intent, drivers,
objectives, and even the perception of the other team of the uncertainty surrounding
them.
The previous statement should be read with a great deal of caution because of two
problems. The first problem is that we can become so overwhelmed with analyzing
actions that we utilize almost all resources without reaching any end. The second
problem is that actions can be random and/or deceptive on purpose; therefore, a
naive analysis of actions can mislead and counteract the CRT exercise.
Let us revisit the first problem. Some extreme views may perceive that there is an
intent behind each action. This might even be misunderstood from our discussions
above. We need to remember here that we are not discussing human actions in
general; we are discussing actions within the context of the CRT environment.
Therefore, there is a level of truth in the expectation that actions are produced to
achieve intent. However, the true complexity here lies in the fact that to achieve one
intent, there might be a need to design a number of actions. Some of these actions
need to be generated in sequence, while others do not depend on any order. This
defines a critical problem in which the intent of the agent must be inferred from a
series of actions. This is a difficult problem requiring advanced techniques from the
field of data mining. An introduction to data mining will be given in Chap. 3.
The second problem mentioned above is that actions can be deceptive and/or
random. An agent may produce random actions to confuse the other agent. Here, the
concept of deception is paramount and greatly impacts the behavioral data-mining
methods. We may think this is becoming too complex. We may feel the need to ask
how we can discover intent when deception is used. It can be surprising to learn
that deception can actually help us to discover intent. If we consider the fact that
deception in its own right is a set of deliberate actions designed to lead to an intent
that is different from the original intent, we can see that the intent inferred from
deception can give us an idea of where the real intent of the agent is. Of course we
need to ask ourselves how we would know in the first place that these actions were
designed for deception and how we could categorize deceptive and non-deceptive
actions. This is when complex tools, algorithms, and humans' educated judgments
blend together to answer this question.
2.3 Performance
2.3.1 Behavior
For an agent to produce effects, it needs to act. The set of actions generated by an
agent define what we will term the agent’s “behavior”.
Definition 2.12. Behavior is the set of cognitive and physical, observable, and non-
observable actions produced by an agent in a given environment.
We could define behavior simply as the set of actions produced by an agent.
However, this definition lacks precision and essential details. It lacks precision
because an agent does not act in a vacuum; an agent acts within an environment.
First, let us define the environment.
Definition 2.13. An environment for an agent A consists of all entities that reside
outside A, their properties and actions.
Therefore, the environment represents the wider context within which an agent is
embedded. An agent is situated within its environment. The agent receives stimuli
from the environment, generates effects in response, and continues to monitor the
impact of these effects on those environmental states to which the agent has access.
Behavior is not limited to the physical actions produced by an agent’s set of
actuators. Most of the physical actions are expected to be observable from an
external entity. However, there is a group of actions that is generally unobservable;
2.3 Performance 71
these are the cognitive actions: the thinking process an agent experiences to reach a
decision. Cognitive actions represent a critical component in an agent’s behavior.
We cannot simply ignore them because they are hidden in the agent’s mind. In
fact, if we can learn how an agent thinks, or at least the drivers behind an agent’s
decisions, we can predict most intentional physical actions. However, achieving this
is extremely complex.
Meanwhile, one can see physical actions as the realization of cognitive actions.
Walking to the restaurant to propose to my partner is a set of physical actions.
These physical actions indicate that I have thought about the decision, and made a
commitment to execute the action of proposing, with the expectation that the effect
of marriage will become a reality.
The interplay between cognitive and physical actions is important in CRT. Once
more, it is important to remind the reader that we are not discussing actions in life
in general; this is all within the context of CRT, that is, an exercise with a purpose.
Let us consider two examples at the two ends of the spectrum of CRT: one in which
we are red teaming a strategic scenario on a country level and the other in which we
are red teaming a computer algorithm for encryption.
In the first example, analyzing the cognitive actions of blue is about understand-
ing factors such as how the blue team plans, evaluates options, and makes choices.
These cognitive actions can be inferred, with different degrees of difficulty, from the
physical actions of the blue team. For example, the division of the budget between
buying capabilities to conduct cyber operations and buying tanks would provide us
with an indication of how the blue team is thinking, where they see their future
operations, and what possible strategies they have to meet their future uncertainties.
These actions are not created for deception. It is less likely that blue will invest
billions of dollars in tanks simply to deceive red; the scarcity of resources as a
constraint reduces the space for this type of deceptive actions.
In the second example, the cognitive actions represent how the encryption
algorithm thinks internally, that is, how it performs encryption. If the algorithm
is an agent, we can observe its input and output. Breaking the algorithm here
means uncovering the computations it uses to transform this input to that output. We
are attempting to use the external physical actions to infer the internal cognitive
(problem solving) actions of the agent; by doing this, we can evaluate the robustness
of our system, which is using this algorithm for storing data against attacks.
2.3.2 Skills
A red teamer attempts to interfere with, influence, and shape the blue team's
behavior (action space). Therefore, for the blue team, the red team is part of
blue’s environment. Similarly, for the red team, the blue team is part of the red’s
environment. The red and blue environments share common elements: the shared
environmental components between blue and red, and the components forming the
interface between blue and red.
As red attempts to impact blue, it needs to rely on this interface, that is, the
shared subset of the environment to generate effects. The ability of either team to
act to generate effects on the other depends on their skill level.
Definition 2.14. A skill is the physical and/or cognitive know-how to produce
actions to achieve an effect.
A skill is about the know-how related to achieving an effect. Some may define
skills as the know-how to perform a task. However, here, the concept of a task is very
limiting. By focusing on know-how for achieving an effect, we have a more flexible
definition for “skill.” This definition links the outcomes (effects) to the processes
and cognitive means (know-how). More importantly, by defining skills from the
effects perspective, we emphasize that the agent’s choice of which know-how to use
is based on the effects the agent wishes to generate, not on what the task assigned
to the agent intends to achieve. This is a crucial distinction for designing
deliberate actions.
A skill cannot be defined in isolation; it always needs to be linked to a specific
effect. However, effects have different levels of complexity and are generally
nested. For example, the effect of producing a good essay on a computer, based
on recounting real events while adding some details from the author's imagination,
may be completed using different skills. Each of these skills links some level of
know-how to an effect. One effect might be turning the computer into an "on" state
(i.e. turning on the computer or ensuring that the computer is already turned on).
This effect requires the know-how for sensing whether the computer is on. If the
computer is not on, the know-how extends to checking whether the computer is
plugged in and whether electrical current is reaching the machine (as indicated
by the power light), then using motor skills to press the "on" button. Another set of
effects might be the production of a letter on a screen (this requires the know-how
for generating motor actions to turn on the computer and press buttons); the effect
of recounting the event (this requires the know-how for writing an account); and the
effect of deviating from the actual story to an imaginary set of events (this requires
the know-how to produce imaginary events in a coherent, interesting and engaging
manner).
Each example of know-how listed above is composed of hierarchical knowledge
divided into subsets of know-how. For example, the know-how to produce a letter
on a screen requires the know-how of the layout of the keyboard; the know-how
to translate the intent to write a letter (a cognitive event) to a series of muscle
movements to press the buttons; the know-how to synchronize the fingers such that
the correct finger is associated with the correct key on the keyboard.
The above level of deconstruction may seem as though it has too much detail.
However, in CRT, the right level of detail will always depend on the objective of the
exercise. If it is desirable to establish a writer profile to authenticate a person on a
computer network, this level of detail will be appropriate.
In this situation, we need to know which fingers the person usually uses, and
which keys are associated with which fingers. These two pieces of information
(fingers used and finger-key association), together with the layout of the keyboard,
will provide us with an estimate of the time spent between pressing different
buttons. For example, if a person uses only two fingers, one would expect a larger
delay between pressing letters “h” and “o” when typing “hooray” as opposed to
the delay between pressing letters “k” and “o” when typing “Hong Kong.” This
information can establish a different profile for different users, which is then used
as a background process for authentication and user identification.
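The timing profile described above can be sketched by aggregating the delay between each ordered pair of consecutive keys (a "digraph"); the function name, sample, and timings below are hypothetical, and a real keystroke-dynamics system would add variances, outlier handling, and a matching classifier:

```python
from collections import defaultdict

def digraph_latencies(keystrokes):
    """Build a typing profile from (key, timestamp) pairs.

    Returns the mean delay (in seconds) for each ordered pair of
    consecutive keys, e.g. ('h', 'o'). Two-finger typists tend to
    show larger delays on pairs typed with the same finger.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (k1, t1), (k2, t2) in zip(keystrokes, keystrokes[1:]):
        sums[(k1, k2)] += t2 - t1
        counts[(k1, k2)] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}

# Hypothetical capture of a user typing "hooray"
sample = [('h', 0.00), ('o', 0.35), ('o', 0.50), ('r', 0.80),
          ('a', 1.05), ('y', 1.30)]
profile = digraph_latencies(sample)
```

Comparing such profiles across sessions is what would let a background process distinguish, say, a two-finger typist from a touch typist.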
Therefore, sometimes a level of detail for one exercise is not required for another.
This is a decision that the CRT analysts must make.
A set of “know-how” forms a skill to achieve an effect. However, effects are
hierarchical. Synthesizing effects on one level of the hierarchy requires specific
skills (i.e. know-how to achieve a larger effect on an upper level of the hierarchy).
It is important to recognize that taking the union of the skills required to achieve
the low-level effects is not sufficient to achieve the higher-level effect. We
need to ensure that we also have the know-how to synthesize the low-level effects.
Therefore, the whole is not the sum of the parts.
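The point that the whole is not the sum of the parts can be made concrete with a small sketch; the effect and skill names here are hypothetical, chosen to echo the essay example above. Possessing every low-level skill is not enough without the separate know-how to synthesize them:

```python
def can_achieve(required_low_skills, synthesis_skill, possessed):
    """An upper-level effect needs every low-level skill AND the
    separate know-how to synthesize them into a whole."""
    return required_low_skills <= possessed and synthesis_skill in possessed

# Hypothetical decomposition of the essay-writing effect
essay = {'typing', 'recounting', 'imagining'}

# The union of the parts alone is not enough:
can_achieve(essay, 'composition', {'typing', 'recounting', 'imagining'})
# Only adding the synthesis know-how completes the skill set:
can_achieve(essay, 'composition',
            {'typing', 'recounting', 'imagining', 'composition'})
```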
This discussion indicates that skills are organized in a hierarchy, which is a
commonly accepted notion in information processing and behavioral sciences. The
challenge of a discussion such as this for CRT activities is that we can continue
deconstructing a high-level planning task (as in the case of planning the cultural
change required to accommodate next generation technologies in a society) into
smaller and smaller tasks down to an arbitrarily microscopic level. The main
question is whether this helps.
In a CRT exercise, we need to deconstruct down to a level after which further
deconstruction of skills is not needed. Therefore, the concept of a skill as defined
above offers the red teamer a critical dimension for analysis. By analyzing the
blue team’s skills, red can evaluate blue’s limitations, discover its vulnerabilities,
and can reshape its own environment to generate innovative effects far away from
the know-how of blue. Red can even help blue by designing training programs to
improve blue’s skills in specific areas so that blue generates effects that are useful
for them but are far away from those in which red is interested. As long as we
avoid deconstructing effects and skills beyond what is appropriate and useful for the
exercise, this type of deconstruction is vital for the success of the analysis.
2.3.3 Competency
An agent’s behavior is defined by the actions the agent produces; these actions are
the product of the agent’s skills. There is a direct relationship between skills and
behaviors. An agent uses its know-how to generate actions to achieve effects. The
totality of these actions represents the agent’s behavior. Thus, an agent’s behavior
is the product of the agent’s cognitive and physical skills. However, how can we
evaluate behavior or skills?
Definition 2.15. Competency is the degree, relative to some standards, of the level
of comfort and efficiency of an agent in adopting one or more skills to achieve an
effect.
Competency is the measure of performance we will use to assess an agent’s
behavior. It acts as an indicator for the nature of the know-how (skills) an agent
possesses.
The definition above requires further discussion related to two factors, the need
for a standard to measure competency, and the distinction that has been made
between comfort, which is a characteristic of the agent, and efficiency, which is
a characteristic of the task.
a specific policy tool to the situation, they can lower their own standards but
realistically, having a very high standard is not a problem in this situation. So, what
is the problem?
The main problem in this situation is that people in Y are extremely competent
in a group of skills that X does not have: the know-how to use simplicity
to respond to technology-savvy know-how. Therefore, for X's CRT exercise to be
effective, X needs to accept that they may have a blind spot in their understanding
of the behavioral space of Y . As such, how can red in X define the standard for
this blind spot? There is no single answer to this question. The obvious answer is
to study Y to the greatest extent possible. Providing the complex answers to this
question is beyond the scope of this book.
Let us take a more objective example. Assume a group of thieves would like
to rob a bank. The bank establishes a red team from their high-tech departments
to identify the vulnerabilities that the thieves may exploit. The red team properly
evaluates its own competency in terms of every skill required to break into the bank's
computer network. The red team uses their standards to break into the computer
network as the standard to evaluate the thieves’ competency level. Let us assume
that the thieves are not as skilled in cyber espionage, cyber strategies, computer
security, and network intrusions. In such a case, the standards used by the red team
remain appropriate, despite the fact that they are well above the level of capability
of the thieves.
However, the thieves’ objective is to rob the bank, not to break into the bank’s IT
network. Given that we are assuming that breaking into the IT network is a necessary
condition for the thieves to rob the bank, it is fair for the red team to evaluate
this as a vulnerability. However, the thieves do not have the know-how to break
into the network. Instead, the thieves know how to blackmail, exert inappropriate
pressures, and use violence and force. The thieves are not scared of breaking the
law. Their behavior is embedded in an entirely different behavioral space from the
highly educated IT team in the bank.
As such, the primary problem is that the skill space for the thieves cannot be
fully discovered by the red team in the bank. Given that skills are nonlinear, there
is no guarantee that the standards used by the bank are high enough to assess the
competency of the thieves. The thieves may simply cause damage to the electrical
power supply in the city, cause damage in the bank’s computer system, force the
bank to switch to manual operations, and steal the car carrying the money, as in the
old movies.
Setting a standard to define competency in CRT assumes that in a normal setting
behavior is symmetric. CRT addresses symmetric and asymmetric situations; it is in
asymmetric situations that setting standards relies on correctly mapping the behavior
and skill spaces, and having the know-how (required skills) to set the standards
properly.
How can we then establish standards in CRT? First, we need to change the
standard from a ceiling that defines a goal to a baseline that defines an objective.
By using the concept of the elite, we establish an upper boundary on what can
be achieved; we then attempt to measure how far the agents are from this upper
boundary (goal) based on the agents’ outputs. However, the red team may not have
the skills or knowledge to estimate this upper boundary properly. Overestimating the
upper bound is not necessarily a bad thing, but arbitrarily overestimating this upper
boundary in an ad-hoc, blind manner, or underestimating it, are real vulnerabilities
for the CRT exercise because the other team might be far more competent than
the red team thinks.
Moreover, a ceiling is established under the assumption that we know what the
effect space is. In the absence of complete knowledge of the effect space, we cannot
define this ceiling. Therefore, in CRT, we need instead to move away from this idea
of establishing the standard as a ceiling. Instead, the competency of one team will be
defined relative to an assessment of the performance of the other team. We term
this “comparative competency.”
Definition 2.16. Comparative competency is the degree of the level of comfort and
efficiency of an agent in adopting one or more skills to achieve an effect in one team
relative to the ability of the other team in achieving the same effect.
In comparative competency, a team expresses its competency relative to the
performance of the other team. Therefore, competencies are expressed as two
percentages, one related to the comfort of red relative to blue, and the other related
to the efficiency of red relative to blue when attempting to achieve a particular effect.
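As a minimal sketch of Definition 2.16 (the scale and scores below are assumptions, not the book's), the two percentages can be computed as simple ratios of red's scores to blue's:

```python
def comparative_competency(red, blue):
    """Express red's competency relative to blue for the same effect,
    as two percentages: comfort and efficiency (Definition 2.16)."""
    return {
        'comfort': 100.0 * red['comfort'] / blue['comfort'],
        'efficiency': 100.0 * red['efficiency'] / blue['efficiency'],
    }

# Hypothetical scores on a 0-10 scale: red is less comfortable than
# blue but more efficient at achieving the same effect.
rel = comparative_competency(
    red={'comfort': 6, 'efficiency': 9},
    blue={'comfort': 8, 'efficiency': 7.5},
)
# rel == {'comfort': 75.0, 'efficiency': 120.0}
```

Because the other team's best possible performance changes as learning occurs during the exercise, these percentages would need to be recomputed as the exercise unfolds.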
Comparative competency does not address the problem that one team may have
a blind spot in mapping the other team’s skill space. This problem requires multiple
treatments, especially with regards to team membership discussed in Sect. 1.3.
Remember that different skills can come together in different ways to achieve the
same effect. Therefore, when measuring competency, we are measuring against the
best possible performance that the other team can display in achieving the effect. Since this
best possible performance is dynamic within a CRT context, because of the learning
occurring within the CRT exercise, comparative competency is a dynamic concept.
Given that we will explicitly distinguish between the cognitive and physical
attributes of, and functions performed by, agents, it is also important to distinguish
between comfort, the level of ease in achieving an effect, and efficiency, the accuracy
and speed in achieving that effect.
Imagine you are at the checkout counter of a supermarket. The cashier behind the
counter is scanning your items, and placing them in a bag. One of the cashiers might
be the elite in that supermarket because every item they place in a bag is scanned
(100% accuracy) and they can scan and package 20 items per minute. This cashier
is defining the standard for the checkout counters in this supermarket.
Judging on throughput alone is not sufficient for us to understand the long-term
effect. The level of comfort, the cashier’s feelings and perceptions about the ease
with which they perform their job can provide us with a more informative picture
of performance, and the ability to predict long-term effects. If the cashier perceives
that the job is very easy and simple, we may assume that their performance would
degrade if they worked without rest for 1 h. If they perceive that the job requires a
great deal of effort and they need to concentrate to ensure the accuracy of scanning
and packing the items, we know that the cognitive load becomes an important factor
in this situation and the cashier’s performance may degrade in 30 min without a
break instead.
This discussion emphasizes that competency cannot rely on agents' physical
and observable actions alone; it must also consider the agents' cognitive actions.
Whether or not to assess these cognitive actions requires cost-benefit analysis. A
study needs to decide on the importance of this type of data to the particular
exercise. Cognitive data can be drawn from questions posed to the subjects or from
sophisticated data-collection mechanisms such as brain imaging. This is an exercise-
specific decision.
The model we will use in this chapter is inspired by the work of Gilbert [5], the
father of performance engineering or what he termed “teleonomics.” However, we
will deviate from Gilbert’s views in part to design views appropriate for the CRT
context of this book, and to ground his system-thinking views in computational
models.
Gilbert sees the world split into two components: the person (P ) and the
environment (E). We should recall that a person in this book can be a group or an
organization. When the person receives a stimulus, they need to be able to recognize
it. This recognition is fundamentally conditional on their ability to recognize the
stimulus. Gilbert termed this the "discriminative stimulus": S^D.
When a person receives a discriminative stimulus, they need to have the capacity
to respond. Gilbert termed this “response capacity:” R. A person may have the
recognition system to receive and comprehend the stimulus, and the capacity to
respond, but they choose not to respond simply because they do not have the
motivation to do so. Therefore, the response needs to be accompanied by "stimulus
reinforcement": S^r, which for the person represents the feedback to their motives.
The above can be summarized in Gilbert's notation as

S^D → R · S^r
A scientist who does not understand mathematics will not be able to interpret an
equation written on a whiteboard; thus, they cannot interpret the stimuli that may
trigger an idea in their mind. Thus, education and training represent the knowledge
repertoire required for S^D to function.
The capacity to respond for a scientist represents their thinking abilities and
skills. To create new contributions, the scientist needs to have the skills and
creativity to produce scientific outcomes from the stimuli. Their motivations are
assumed to be internal and to take the shape of scientific ambition.
The model above gives us the basis to analyze the person from a CRT perspective,
providing us with the knobs to influence performance and reshape it if needed.
The details of Gilbert’s work can be found in his seminal book [5]; a very
worthwhile read. His work is inspiring and well engineered. However, we need to
search deeper and be more concise to transform his system into a system suitable
for CRT. This is for several reasons.
First, Gilbert focused on a holistic view of performance, resulting in an efficient,
but high-level, model that can guide human managers to improve performance. The
objective in CRT is to challenge performance; therefore, we need to transform this
holistic view into a grounded control model that enables us to steer performance to
either positive or negative sides. Moreover, we need this model to be sufficiently
grounded that we can use it to compute, but not so grounded that we incur
unnecessary computational cost.
Second, Gilbert did not seem to differentiate between the physical, cognitive
and cyber spaces. By focusing on performance alone, it did not matter in his
work whether the capacity of the agent was cognitive or physical, or whether
the instruments used by the environment were psychological or physical. These
elements are left for the performance engineer to analyze based on the
context in which they are working. Here, however, we prefer to make these
distinctions clear, given that the tools and models to be used for CRT will be different.
In the example of a scientist, Gilbert’s model is possibly useful for us as humans
to see how we can manipulate performance from the outset. However, if red teamers
wish to challenge this scientist with ideas, or challenge their environment to steer
their scientific discovery one way or another, it is necessary to dig deeper. We need
to separate the physical (e.g. laboratory) from the cognitive (e.g. creative thinking)
and the cyber (e.g. access to information). Gilbert does this to some extent as we see
in the example in which data and knowledge represent the stimuli, instrumentation
represents to some extent the physical side, and motivation represents the cognitive.
However, we can see also in the scientist example that this is not sufficient. A
laboratory would have people such as post-doctorates and Ph.D. students who
provide ideas to the scientist. These ideas can act as stimuli, responses or even
motivations.
Once the information leaves the control point, it becomes a stimulus, S , to the
agent. The agent receives this stimulus in a different form, B, from what it is in the
environment. This form represents the fusion of different factors: the stimulus that
was generated, the agent’s physical resources, the agent’s cognitive resources, and
the agent’s cognitive skills.
For example, if the agent lost the ability to taste (i.e. had a malfunctioning
tongue), this limitation in an agent’s physical resources would impact the agent’s
perception of tasting information in a stimulus. Similarly, if the agent is autistic, the
lack of certain cognitive resources would impact the agent’s perception of a hug.
Finally, the agent’s cognitive skills (e.g. the agent’s knowledge of how to hug to
reflect compassion or affection) would impact the agent’s perception of a hug.
The perceived stimulus is then transformed into motives or goals. Sometimes,
the stimulus may generate a new goal, as in the case of a new task being assigned to
the agent and the agent needing to add to their repertoire of motives a new goal on
the need to complete this task. At other times, the stimulus provides the agent with
an update to one of its existing goals, as in the case of briefs from a subordinate that
update the decision maker’s knowledge of the rate at which existing performance
indicators are being met.
The states, and the corresponding changes, of an agent’s goals produce intentions
to act. Intentions in this model are a product, not a system-state. The intention
unit fuses the motives, the agent’s cognitive resources, and the agent’s physical
resources to produce a plan. Information during this fusion process moves back
and forth, where the cognitive and physical resources call and modify the cognitive
and physical skills, respectively. During this process, cognitive and physical skills
are updated and checked to produce the plan.
For example, assume an agent who used to be a professional swimmer had an
accident in which they lost their right arm. Assume that the goal of the agent remains
to be able to swim fast. Both the agent’s cognitive and physical skills need to be
updated. The agent needs to form a plan to move from the previous skills to a new
set of skills that consider and match the new physical constraint.
The agent’s internal plan can take the form of a series of actions that the agent
needs to produce. However, only a limited number of responses can be produced
at any point of time. Therefore, the intention unit also produces a schedule for the
generation of responses. The first group of mutually compatible responses (e.g. a
smile on one’s face, together with a handshake) form a “response:” R.
The agent’s internal response may be produced differently in the environment.
For example, as the agent moves their arm to shake a person's hand firmly,
the intended pressure on the other person's hand is not properly produced. Thus, the
handshake does not produce the intended effect.
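The early stages of this pipeline can be illustrated with a toy sketch; the channel names, resources, and weights below are hypothetical, not the book's model. A stimulus is filtered by the agent's physical resources and weighted by its cognitive skills to form the perceived stimulus B, and only then does it update the goal set:

```python
def perceive(stimulus, physical_resources, cognitive_skills):
    """Fuse a raw stimulus S into its perceived form B: channels the
    agent physically lacks are dropped; the rest are weighted by the
    relevant cognitive skill level."""
    return {ch: v * cognitive_skills.get(ch, 0.0)
            for ch, v in stimulus.items()
            if ch in physical_resources}

def update_goals(goals, percept):
    """Each perceived channel either updates an existing goal or
    creates a new one."""
    for ch, v in percept.items():
        goals[ch] = goals.get(ch, 0.0) + v
    return goals

# Hypothetical agent with no sense of taste (a malfunctioning tongue)
stimulus = {'taste': 0.9, 'touch': 0.5}
percept = perceive(stimulus, physical_resources={'touch'},
                   cognitive_skills={'touch': 0.8})
goals = update_goals({}, percept)
# 'taste' never reaches the goal set; 'touch' arrives weighted by skill
```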
Two reward systems operate as action production takes place. The first is the internal
feedback, self-reward or self-punishment system in which the agent internally
rewards itself. A person may attempt to reinforce their own goals to the extent
that the person perceives that their goals are satisfied when they are not. This
internal reward mechanism is very important because it is generally hidden and
inaccessible from the outside world. It can act as a negative feedback cycle that
The primary design mechanism a red teamer uses to alter performance or change
the effects of blue is by designing a challenge. This is discussed in detail in the
following section.
2.4 Challenge Analytics
It is very difficult to isolate the very few scientific articles discussing the concept of
a “challenge,” from the countless scientific articles using the concept to indicate a
“difficult” or “impossible” situation. Therefore, it is logical to devote time here to
explaining what a challenge is within the context of CRT. We need to go beyond
a dictionary-level, common explanation of a challenge, to a more formal definition
to ensure the possibility of designing models of challenge. We should see that a
“challenge” here is a state that constructively achieves an objective. It does not
denote the impossible, or a difficult situation.
Initially, we may see a challenge in simple terms: a challenge exposes an entity
to a difficult situation. However, the main question is at which level of difficulty we
are to employ the term “challenge.” It is very simple to ask other people difficult
questions and criticize them for not knowing the answer, making them feel
inferior or incapable.
2.4 Challenge Analytics 87
Take, for example, a situation in which parents tell a 6-year-old child that
they cannot earn money, that they are the ones who can buy the child what they want,
and therefore, the child should listen to them. The child is exposed to what we would
term in common English a “challenge.” They feel inferior in their ability to attract
and own money. The child would be wondering what their alternative is to listening
to their parents. The answer is obvious in this context; the child needs a manner
in which to obtain their own money. The parents, without intention, generated an
undesirable causal relationship in the child’s mind, that is, if the child was able to
obtain money, the child could buy whatever they wanted, and therefore, the child
could have an excuse for not listening to their parents. As presented below:
These types of challenges are like a loose cannon; they can fire randomly and even
hit their owners. Within the scope of this book, we will not consider this example to
constitute a challenge; we will simply consider it an unthoughtful exposure to a
state of hardship.
It is unthoughtful because the parents above would like to gain a position of
power over the child as rapidly as possible. As a result, they state that if the child
is unable to achieve something that they know or believe is far beyond the existing
capacity of the child, the child must comply with certain conditions imposed by the
parents. The parents fail to understand that this behavior may trigger a reaction of
hostility and impose a feeling of hardship for the child. The child may rapidly adopt
a hostile attitude toward their parents, or use their level of knowledge to find the
quickest way to find money, which is obviously from the parents’ own pockets!
This is not the type of challenge we will model and discuss in this book. Instead,
we will examine engineered, thoughtful, and constructive forms of challenges,
whereby the challenge is designed to achieve a desired outcome or effect.
Stimulating ⇏ Challenge
Challenge ⇏ Stimulating
The above notations emphasize that a stimulating situation does not necessarily
mean that the situation was stimulating because there was a challenge associated
with it. Similarly, a challenging situation does not necessarily stimulate the agent.
An agent may be exposed to a properly designed challenge, but the agent may lack
motivation or interest, which makes the situation less stimulating to them.
Criteria such as “stimulating” and “motivating” are more suitable for a human
agent, as they require human traits and judgment. To generalize the concept of a
challenge to a machine agent, we need to reduce these criteria to a set of indicators
that can be used to assess and/or judge the process objectively without the need to
rely on subjective judgment.
We use a criterion that is simple to understand, but more complex to implement.
Definition 2.18. A task is challenging when the distance between the aggregate
skills required to do the task and the aggregate skills that the agent possesses is positive
and small.
That is: Aggregate required skills − Aggregate possessed skills = ε constitutes a
challenge iff ε is small and ε > 0.
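The criterion can be sketched as a deliberately naive scalar check; this is a simplification, since skill distance is not a simple quantitative metric and the threshold `eps_max` here is an assumed parameter:

```python
def is_challenging(required, possessed, eps_max=0.2):
    """Definition 2.18 as a naive scalar check: the skill gap must be
    positive (the task demands more than the agent has) and small
    (within eps_max); otherwise the task is trivial or overwhelming."""
    eps = required - possessed
    return 0.0 < eps <= eps_max

# Hypothetical aggregate skill levels on a 0-1 scale:
is_challenging(0.85, 0.70)   # small positive gap: a challenge
is_challenging(0.85, 0.90)   # agent already exceeds the task: trivial
is_challenging(0.85, 0.20)   # gap far too large: stress, not challenge
```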
We need to offer two words of caution here:
1. The concept of “distance” in the discussion above is not a simple quantitative
metric.
2. The aggregate of skills is not the sum of skills.
Several sets of skills can be united in different ways to create different high-
order skills. For example, let us assume that Jack is a creative person with excellent
writing skills, and cryptographic skills. The skills of creativity and writing when put
together may make Jack a creative writer. The skills of creativity and cryptography
when put together may make Jack a good computer hacker. Practice plays the role of
increasing the competency level of the agent. As the agent becomes competent, new
skills emerge. As Jack practices his creative writing and computer hacking, he may
develop skills in script writing for science fiction movies on quantum computations.
A good computer hacker is not created simply by adding creativity to
cryptographic skills. If we try, we simply obtain two different people, one who
is creative but has no understanding of computers, and the other who is a well-
educated cryptographer but is not creative. When we put these two people together,
it is unlikely that a good computer hacker will emerge for a long time, that is, the
time required for each person to transfer some of their core skills to the other.
The above raises the important question of how to create a good computer-
hacking team. The creative thinker needs to have some understanding of cryptog-
raphy and the cryptographer should have a degree of creative-thinking ability or
should be “open-minded.” There must be overlap of skills to establish a common
ground for the members of the team to speak to each other in a language they can
both understand, while not necessarily being an expert in the other’s field.
To recap the above discussion from a mathematical perspective: defining a distance
metric on a skill space is not a trivial task, and the aggregation of skills is usually a
nonlinear coupled dynamic system.
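Definition 2.18 can be sketched in code. The aggregation function, skill names, and threshold below are illustrative assumptions; the text only requires that aggregation be nonlinear and that the gap be positive and small:

```python
def aggregate(skills, synergy=0.5):
    """Toy nonlinear aggregation: skills reinforce each other pairwise,
    so the aggregate exceeds the plain sum (an illustrative assumption)."""
    levels = list(skills.values())
    total = sum(levels)
    pairwise = sum(a * b for i, a in enumerate(levels) for b in levels[i + 1:])
    return total + synergy * pairwise

def is_challenging(required, possessed, threshold=2.0):
    """Definition 2.18: a task challenges an agent iff the aggregate-skill
    gap epsilon is positive but small (here, below a hypothetical threshold)."""
    eps = aggregate(required) - aggregate(possessed)
    return 0 < eps <= threshold

# Jack from the running example; the numeric skill levels are made up.
jack = {"creativity": 2.0, "writing": 2.0}
script_writing = {"creativity": 2.0, "writing": 2.0, "film": 0.5}
mathematics = {"creativity": 2.0, "writing": 2.0, "math": 4.0}
```

Here `is_challenging(script_writing, jack)` holds (a small positive gap), while `is_challenging(mathematics, jack)` does not: the gap is too large, echoing the point that a challenge must stay close to the agent's current skills.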
2.4 Challenge Analytics
There is not a great deal of literature on the concept of a challenge, but a small
amount exists in the fields of education and psychology. Here, we will build on the work
that does exist, but we must first take a short detour. As we will see, most of the literature treats
the concept of a challenge in a holistic manner: a challenge is defined, then the
concept is left to a designer, such as an educator, to interpret what it means within
their context.
The online Free Dictionary [7] defines a challenge in several ways.
One definition that is particularly relevant to this book is the following: “A test
of one’s abilities or resources in a demanding but stimulating undertaking.” This
definition highlights the delicate balance that needs to exist between the two words
“demanding” and “stimulating.” The need to strike this balance is supported by
theories in educational psychology. Sanford’s theory of challenge is key in this area
[10]. In his work, he explains the subtle difference between a challenge and a stress.
He emphasizes the need to strike the right balance so that a challenge to a student
does not turn into a stressful situation. This work was followed in the education
domain by some scientific research on the topic [3, 9]. The pattern of a challenge
was recently mentioned in a study on immersion, although there was no analysis of
the specific pattern [6].
The above definition links the concept of a challenge with the concept of
"stimulating". However, we separated these two concepts in the previous section.
The word "demanding" is interpreted in our definition as exceeding the boundary.
The word "stimulating" is interpreted to mean that the task is not so demanding
that the agent may give up. However, the concept of "stimulating" has a second
dimension related to the agent's motives. A challenge becomes stimulating if
it has elements that trigger the agent's motives. This dimension is agent-specific.
As we discussed before, we separate the two concepts of a “challenge” and this
dimension of the concept of “stimulating” in this book.
The concept of a challenge is traditionally found in the literature on “dialectics.”
Naturally, a challenge can take the form of questions. Can we design a
counter-plan for the opponent's plan? Can we design an example to teach the students
an extra skill they currently do not possess? Can we design an anomalous dataset
that is sufficiently similar to normal behavior to be able to penetrate the anomaly-
detection system for our testing purposes? Therefore, questioning is a natural means
of communicating a challenge. However, we should be cautious, since not every type of
questioning is a challenge. Questioning can be a means of examination, interrogation
and extraction of truth, sarcasm, or even a dry joke.
Mendoza [9] thinks of a challenge as “forcing myself to learn always to think
at the limits.” Admiring the work of Ellul on dialectics, Mendoza cites Ellul’s four
notions of a theory of dialectics:
1. Contradiction and flux are two characteristics in life that must be reflected in
the way we theorize. Through the holistic approach of dialectic, meaning can be
grasped.
90 2 Analytics of Risk and Challenge
2. The coexistence of a thesis and antithesis should not lead to confusion, or to one
suppressing the other. Nor should the synthesis be a simple addition of the
two; instead, it emerges through "transformative moments" with "explosions and
acts of destruction."
3. The negative prong of the dialectic challenges the spectrum between the positive
and negative prongs, creating change; or what Ellul called “the positivity
of negativity.” Ellul sees change as a driver for exploration. Mendoza offers
examples of the positives, including: “an uncontested society, a force without
counterforce, a [person] without dialogue, an unchallenged teacher, a church with
no heretics, a single party with no rivals will be shut up in the indefinite repetition
of its own image.” [9]. These positives will create a society that resists change.
4. Automaticity of the operation of dialectic is not possible, because the many
contradictory elements in society are not necessarily going to create those unique
dialectic moments. Ellul cautions that "Dialectic is not a machine producing
automatic results. It implies the certitude of human responsibility and therefore
a freedom of choice and decision." [9].
Generalizing from the above four notions of dialectics in a manner relevant to
this book, we can identify four factors for a challenge:
1. Coexistence of thesis and antithesis
2. Change and negatives drive challenges
3. Synthesis is an emerging phenomenon
4. Noise.
While noise was not an explicit topic in the above discussion, it needs to be
introduced here. Many contradictions exist in the world that have no potential to
trigger a challenge; when attempting to automate the process, they can inhibit the
emergence of challenges. Therefore, they should be filtered out. Given the nature of
this noise, humans are better suited to filtering it out than automation.
The above does not necessarily offer a solution to how we can model, design,
and create a challenge, but it certainly offers cautionary features that we need
to account for when discussing automation. As a principle, this book does not claim
that we can automate the concept of a challenge; in fact, this is precisely why we
dismiss the concept of automating the CRT process.
Let us return to the intuition of a challenge. The totality of the skills an agent possesses represents the
set of tasks the agent is capable of performing. To challenge this agent is to find
a task that the agent cannot perform because the agent lacks certain skills, while,
at the same time, the task is very close to what the agent can currently do.
For example, we ask a child to multiply three by three knowing that the child
has learned how to add and knows the basics of the concept of multiplication, for
example, knowing how to multiply two by two. However, the child is unable to
multiply three by three because the child has not been exposed to sufficient examples
to generalize the concept of multiplication to arbitrary multiplication of any two
numbers. Nevertheless, the child was able to generalize the concept of addition
to arbitrary numbers, and understands the basics of multiplication in the simple
example of multiplying two by two. The child has all the skills to multiply three by
three, except one skill: the know-how to generalize that multiplication is a recursive
addition. Whether or not this extra step is simple enough or too hard for the child
will depend on the child’s cognitive resources and skills.
Likewise, we can teach a person how linear regression works then challenge them
by giving them a simple nonlinear example that requires a simple transformation to
make it linear. The person needs to synthesize their knowledge to solve the example
in a manner in which they have no experience. Even if they fail, once the solution
is explained to them, they see no problem in understanding it. This is the point at
which we hear exclamations such as “Ah, I see, this now sounds so obvious, it just
did not cross my mind the first time I attempted to solve this problem.”
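The linearization challenge above can be made concrete. A minimal sketch, assuming exponential data of the form y = a*exp(b*x); the data and the log transformation are illustrative:

```python
import math

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Nonlinear data: y = 2 * exp(0.5 * x). A straight line fits it poorly, but
# taking logs gives ln(y) = ln(2) + 0.5*x, which is linear in x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

slope, intercept = fit_line(xs, [math.log(y) for y in ys])
a, b = math.exp(intercept), slope   # recovers a = 2.0, b = 0.5
```

The synthesis step is exactly the one the learner must discover: the regression machinery is unchanged; only a transformation of the data is new.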
The above example demonstrates an important point that many people may find
counterintuitive: a challenge can only synthesize existing knowledge; it
cannot introduce new axiomatic knowledge. Now is a good time to differentiate
between these two types of knowledge.
We will argue that there are two broad categories of knowledge an agent can
have: axiomatic and derivable (learned) knowledge. Axiomatic knowledge can only
be gained through direct exposure to certain facts, processes, and tasks. Similar
to mathematics, once we believe in the axioms, theorems can be derived from the
axioms, either deductively or inductively. To develop a new type of calculus, it is not
sufficient to study and practice calculus; we need different types of knowledge to
understand what it means to develop a new calculus in the first place.
Similarly, people who studied humanities may be very creative when writing a
story or analyzing a conversation. However, if they have never studied mathematics,
no challenge can synthesize their existing knowledge into a new type of knowledge
that enables them to understand mathematics. The distance between the two spaces
of knowledge is large. The same result will ensue by asking a mathematician to
understand Shakespeare if they have not been exposed to literature before, or by
asking a person who has recently begun to study a language to understand complex
jokes in that language. We know this is difficult because a joke does not just play
with words in a language; it also relies on cultural elements of which the person may
not have gained axiomatic knowledge in that particular culture.
This is not to say that a challenge does not produce new knowledge; on the
contrary, if it does not, then it is not a challenge. Instead, a challenge can only move
us from one place to a place close by; thus, the knowledge the challenge produces
may impress us but it must come from within a space that is sufficiently close
to the space of the old knowledge. This knowledge can be “transformative”—as
Ellul indicated with “transformative moments”—in the sense that it is a non-linear
synthesis of existing knowledge. Because of non-linearity, it is hard to explain
it deductively from existing knowledge. The agent may perceive it as new
axiomatic knowledge, but the agent would also feel that it is not too difficult and
that it can vaguely be associated with what they already know.
Recasting the previous conceptual diagram of a challenge in a different form,
the skills of an agent would influence the agent’s behavior. Figure 2.15 depicts this
process by conceptualizing the space of possible behaviors an agent can express.
The model assumes that we wish to challenge a thinking entity, let us refer to
this entity as an agent. Similar to the theory of challenge in the field of education,
our aim is to push this agent further to acquire skills and knowledge beyond those it
currently possesses.
Figure 2.15 offers a complementary perspective on a challenge when the aim
is to challenge the behavior of an agent or a system; the aim is to encourage the
system to express a behavior that is outside the boundary of its normal behavior. For
example, challenging a passive person to take a more proactive attitude should not
be considered a process that will magically transform this person into a proactive
person overnight. It is extremely unlikely that such a transformation will occur
so rapidly simply because being proactive requires many skills to be acquired,
including thinking and communication skills.
Within this space resides a subspace of the behaviors the agent currently
expresses, which we assume in this example to represent the space of passive
behaviors. To expand the behavior subspace of the agent to include proactive
behaviors, we target the small dark circle, which represents the closest subspace
that features proactive behaviors while not being too far from the agent's current
subspace of behaviors.
To achieve the intended effect from a challenge, the agent must be engaged
during this process, that is, the agent should not find the challenge process too
boring or too difficult, but instead, stimulating and motivating. The challenge needs
to stimulate the agent so that a new behavior is expressed by the agent. Therefore,
to ensure that the challenge is effective in achieving the desired effect, its design
needs to be agent-centric, connecting the agent's skills with its motives. To this end,
and before progressing any further, we must pause to explain what we mean by
behavior in this context.
Figure 2.16 expands this discussion beyond the use of the concept of a challenge to
expand the skill set or behavioral subspace of an agent, to testing and evaluating
algorithms and systems. This new example will allow us to dig deeper in an
easy-to-understand context. In Fig. 2.16, we assume a computer network. In this
environment, A represents the space of all behaviors or all possible traffic that goes
through this network. Some of this traffic will constitute anomalies and is depicted
by the subspaces B and D. The difference is that we know of the existence of B but
we do not know of the existence of D because of our bounded rationality, limited
knowledge or any other reason that would prohibit our thinking from knowing about
the types of anomalies hidden in D.
We can assume an algorithm that is able to detect anomalies. This algorithm may
be able to detect anomalies in subspace C, which is a subset of B. A classical test-
and-evaluation method, such as stress testing, used to evaluate this algorithm will very
likely end up with the subspace B \ C. This is because the bias that exists in our
design of these stress-testing methods is (subconsciously) based on our knowledge
of B.
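This bias can be illustrated with a toy one-dimensional traffic space; the numeric bands chosen for B, C, and D below are arbitrary assumptions:

```python
# A: all traffic, modeled here as integers. B: anomalies known to the tester.
# C: what the detector actually catches (a subset of B). D: a blind spot.
def in_B(v): return 80 <= v <= 100
def in_D(v): return -20 <= v <= -10

def detector(v):
    """Detects only subspace C = [80, 90], a strict subset of B."""
    return 80 <= v <= 90

# Stress testing designed from knowledge of B probes only B, exposing B \ C ...
missed_in_B = [v for v in range(80, 101) if in_B(v) and not detector(v)]
# ... while every anomaly in the unknown subspace D goes undetected and untested.
missed_in_D = [v for v in range(-20, -9) if in_D(v) and not detector(v)]
```

No amount of testing driven by knowledge of B will ever touch D; only a challenge that pushes outside the tester's own boundary can reveal it.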
The corresponding figure depicts the spaces of a challenge for both teams. In the top space, blue searches the space of red
attacks. While there is a space of red attacks that is known to blue, there is a subspace
within this space where blue knows that if red attacks come from this subspace, blue
cannot detect them. This is the subspace where blue is aware and conscious of its own
vulnerability.
There is also a subspace of the space of all possible attacks by red of which blue
is unaware. This subspace represents blue's blind spot. The same
analysis can be done on the red side in the bottom diagram.
So far, the discussion has introduced many concepts that underpin the areas of risk
and challenge analytics. It is time to synthesize these introductory materials into
computational forms. The discussion will start with the first formal perspective on CRT and
how it relates to the analytics of risk and challenge. This will be followed by a more
focused discussion that synthesizes the introductory materials into a coherent form
for each of the cornerstones of CRT.
Fig. 2.18 The cycle of transforming sensorial information, from sensors, to effects, through effectors
2.5 From the Analytics of Risk and Challenge to Computational Red Teaming
The technical details of implementing an SOA are beyond the scope of this
book, but the concept of SOA is simple enough to be understood at this level
of abstraction. An SOA can be implemented using web services, which rely on the
Internet as the backbone for the service bus.
Figure 2.20 shows one view of the SOA for CRT, which connects sensors to
effectors. It also emphasizes that the system internally measures indicators of
success in achieving the objectives, thus providing an evidence-based decision-making
approach. The risk analytics component has a number of services, including
optimization and simulation services. The role of these technologies will be
discussed in the following chapter.
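At this level of abstraction, the service bus can be sketched as a simple registry of named services; the service names and stand-in implementations below are illustrative, not a real web-service stack:

```python
class ServiceBus:
    """Minimal SOA-style sketch: named services, loosely coupled through a bus."""
    def __init__(self):
        self.services = {}

    def register(self, name, fn):
        self.services[name] = fn

    def call(self, name, payload):
        return self.services[name](payload)

bus = ServiceBus()
bus.register("risk.optimize", lambda xs: min(xs))           # stand-in optimizer
bus.register("risk.simulate", lambda x: {"effect": x * 2})  # stand-in simulator

best = bus.call("risk.optimize", [4, 2, 7])
effect = bus.call("risk.simulate", best)
```

The point of the pattern is that the optimization and simulation services can be swapped or redeployed without the callers changing; in a real deployment, the registry would sit behind web-service endpoints.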
Both challenge analytics and risk analytics rely on three technologies: simulation,
optimization, and data mining. These technologies
will be discussed in more detail in Chap. 3, and an example to illustrate how they
need to work together for CRT purposes is given in Sect. 3.1.2.
Risk analytics and challenge analytics are the two cornerstones of a CRT system.
Figure 2.21 depicts this relationship by factoring risk into its two components:
uncertainty and objectives. Together, the objectives of the organization and the
uncertainty surrounding the decision making process constitute risk. The challenge
analytics component aims at designing challenges for uncertainty, constraints and
objectives. This point will be elaborated on in Sect. 2.5.4.
The prices of shares in the stock market can rise very quickly, but we can estimate
a boundary on how far they can rise. It is possible to estimate multiple boundaries
with different levels of confidence. If blue can estimate the bounds on red’s
uncertainty, blue can design strategies to challenge red by creating uncertainties
outside these bounds.
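Estimating multiple boundaries with different levels of confidence can be sketched with empirical quantiles; the sample price rises below are invented for illustration:

```python
import math

def upper_bounds(samples, confidences=(0.9, 0.99)):
    """Empirical upper boundary per confidence level: the smallest observed
    value that at least that fraction of the samples falls at or below."""
    ordered = sorted(samples)
    n = len(ordered)
    return {c: ordered[min(n - 1, math.ceil(c * n) - 1)] for c in confidences}

# Hypothetical observed price rises (in percent).
rises = [0.5, 1.0, 1.2, 2.0, 2.5, 3.0, 3.5, 4.0, 8.0, 12.0]
b = upper_bounds(rises)   # a tighter bound at 0.9, a wider one at 0.99
```

If blue can place such bounds on red's uncertainty, actions designed to fall outside the higher-confidence boundary are precisely the ones red's analysis is unlikely to anticipate.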
Similarly, challenge analytics need to challenge objectives. We discussed that
classical decision making assumes that objectives are mostly defined and fixed.
However, objectives in CRT are controllable elements that can be reshaped. One
way for blue to influence red is to reshape red’s objectives. Challenge analytics can
help blue to estimate the boundary conditions on red’s objectives so that blue can
challenge red by aiming to reshape red’s objectives. This reshaping process can be
done by changing these boundaries, moving them in different directions.
To illustrate a simple example using classical linear programming, assume that
red aims to maximize profit, where the profit objective function is formulated as
follows:
Profit = 2x + 3y
with x and y representing two different types of effects that red wishes to
generate. For blue to challenge these objectives, blue needs to analyze two different
boundaries: the boundaries on the coefficients, and the boundaries on the structure.
The boundaries on the coefficients concern estimating how far the two coefficients of 2
and 3 for x and y, respectively, can change. However, some gains achieved by red are
influenced by blue. These coefficients represent red's gain from each type of effect.
In essence, they represent how red values these effects. As such, to challenge these
coefficients is to understand the boundary constraints on them; that is, for example,
the coefficient of x may change between 1 and 5 based on a number of factors. Blue
can then design a strategy to influence these factors so that this coefficient changes
in the direction desired by Blue.
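This coefficient analysis can be sketched by brute force over a toy version of the program above. The constraint x + y <= budget is a hypothetical addition (the text gives only the objective), used to keep the toy problem bounded:

```python
def red_best_plan(cx, cy=3.0, budget=10):
    """Brute-force the toy program: maximize cx*x + cy*y over x + y <= budget."""
    plans = ((x, y) for x in range(budget + 1) for y in range(budget + 1 - x))
    return max(plans, key=lambda p: cx * p[0] + cy * p[1])

# Blue estimates that the coefficient of x can be pushed anywhere in [1, 5].
plan_when_high = red_best_plan(cx=5.0)   # red pours everything into effect x
plan_when_low = red_best_plan(cx=1.0)    # red abandons x in favor of y
```

Driving the coefficient of x from 5 down to 1 flips red's optimal mix from (10, 0) to (0, 10): by influencing the factors behind a coefficient, blue reshapes red's behavior without touching the structure of red's objective.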
The boundaries on the structure aim at estimating the constraints on the effect
space for red. In other words, can we introduce a third variable z to this equation
that is more beneficial for us? These structural boundaries are very effective tools.
Fields such as Mechanism Design and Game Theory can assist in discovering this
third dimension, although we will avoid discussing Mechanism Design in this book
because most work in this domain falls into the same classical trap as game theory,
which assumes (1) rational agents, and (2) agents that are self-aware of the value of any
alternative (i.e., when an agent is faced with an alternative, the agent has an internal
value representing the maximum it would be willing to pay for that
alternative). The advantage of CRT is that it does not have such restrictive and
unrealistic assumptions. For example, what would be the maximum price you would
be willing to pay to save your life? In essence, the question also means: how far can
you go to save your own life? Mechanism design assumes that each agent knows
the answer to this question precisely!
The third element that CRT can challenge is the constraints on the other team.
Constraints normally exist for two reasons: either the structure and properties of
the system are inhibiting the system from expressing certain behaviors, or the
environment is doing so. Constraints from the environment are forces impacting
the system in a similar way to uncertainties. The primary difference between an
environmental constraint and an uncertainty is that the former is a certain force,
while the latter is uncertain. For example, weather conditions are environmental
conditions impacting a flight. When weather conditions are known, we can take
them as constraints when designing an optimal flight path. When weather conditions
are not known, they become uncertainties, and the flight path needs to be evaluated
against a range of possible weather conditions. In classical optimization, the two
concepts can be combined in the form of a stochastic constraint.
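The distinction can be sketched for the flight-path example; the cost model and weather numbers are invented for illustration:

```python
def path_cost(length, weather_severity):
    """Toy cost model: flight cost grows with a weather penalty."""
    return length * (1.0 + weather_severity)

def evaluate(length, weather=None, scenarios=None):
    """Known weather acts as a constraint (a certain force); unknown weather
    is an uncertainty, so the path is evaluated against a range of scenarios."""
    if weather is not None:
        return path_cost(length, weather)
    return max(path_cost(length, w) for w in scenarios)   # worst case

certain = evaluate(100.0, weather=0.2)                  # constraint case
uncertain = evaluate(100.0, scenarios=[0.0, 0.2, 0.5])  # uncertainty case
```

The worst-case maximum used here is one choice; averaging over scenarios, or weighting them by probability, would give the stochastic-constraint flavor mentioned above.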
Most of the time, challenge analytics is concerned with designing counteractions
to challenge the other team. This design process for blue (red) will require
mechanisms to estimate boundary conditions for red's (blue's) constraints, uncertainties,
and objectives; designing actions outside these boundaries; projecting the impact
of these actions into the future; and selecting the most appropriate counteraction for
blue (red) in response to red's (blue's) actions.
As will be illustrated in Sect. 3.1.2, challenge analytics rely on three technolo-
gies: simulation, optimization and data mining, similar to risk analytics.
Computationally, challenge analytics requires a proactive architecture that can
support the proactive generation of counteractions. One possible realization of this
architecture is the following Observe-Project-Counteract agent architecture, which
has three components:
Observe: In the first stage, each team needs to observe the other team by contin-
uously sensing information, extracting behavioral patterns, and assessing their
skills (assessing boundary constraints).
Project: In the second stage, each team creates a model of how the other team acts,
so that it can use this model to estimate the other team's future actions and to
evaluate the impact of one team's actions on the other. In a debate, if we can
estimate through observations what the other team knows, we can equally
estimate their response to our future questions.
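The architecture can be sketched as follows. The text above describes the first two stages; the Counteract stage is inferred here from the architecture's name and the earlier discussion of counteraction design, so its details are an assumption:

```python
class OPCAgent:
    """Illustrative Observe-Project-Counteract agent (names are assumptions)."""
    def __init__(self, opponent_model):
        self.history = []
        self.model = opponent_model   # callable: history -> predicted action

    def observe(self, opponent_action):
        # Stage 1: continuously sense and record the other team's behavior.
        self.history.append(opponent_action)

    def project(self):
        # Stage 2: use the model of the other team to estimate its next action.
        return self.model(self.history)

    def counteract(self, candidates, impact):
        # Stage 3 (assumed): pick the candidate action whose projected impact
        # against the predicted opponent move is best.
        predicted = self.project()
        return max(candidates, key=lambda a: impact(a, predicted))

agent = OPCAgent(lambda h: h[-1])        # toy "repeat last move" model
agent.observe("attack-left")
best = agent.counteract(
    ["defend-left", "defend-right"],
    lambda a, p: 1.0 if a == "defend-" + p.split("-")[1] else 0.0)
```

The three methods deliberately mirror the three stages: observation feeds the model, projection queries it, and counteraction selects against the projection.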
References
References
1. Abbass, H.A., Petraki, E.: The causes for no causation: a computational perspective. Inf.
Knowl. Syst. Manag. 10(1), 51–74 (2011)
2. Einstein, A., Infeld, L.: The Evolution of Physics. Simon and Shuster, New York (1938)
3. Ellestad, M.H.: Stress testing: principles and practice. J. Occup. Environ. Med. 28(11),
1142–1144 (1986)
4. Gaidow, S., Boey, S., Egudo, R.: A review of the capability options development and analysis
system and the role of risk management. Technical Report DSTO-GD-0473, DSTO (2006)
5. Gilbert, T.F.: Human Competence: Engineering Worthy Performance. Wiley, Chichester (2007)
6. Grimshaw, M., Lindley, C.A., Nacke, L.: Sound and immersion in the first-person shooter:
mixed measurement of the player’s sonic experience. In: Proceedings of Audio Mostly
Conference (2008)
7. http://www.thefreedictionary.com/. Accessed 1 Feb 2014
8. ISO: ISO 31000:2009, Risk Management - Principles and Guidelines (2009)
9. Mendoza, S.: From a theory of certainty to a theory of challenge: ethnography of an
intercultural communication class. Intercult. Commun. Stud. 14, 82–99 (2005)
10. Sanford, N.: Self and society: social change and individual development. Transaction
Publishers, Brunswick (2006)
11. Sawah, S.E., Abbass, H.A., Sarker, R.: Risk in interdependent systems: a framework for anal-
ysis and mitigation through orchestrated adaptation. Technical Report TR-ALAR-200611013,
University of New South Wales (2006)
Chapter 3
Big-Data-to-Decisions Red Teaming Systems
The general who loses a battle makes but few calculations beforehand.
Thus do many calculations lead to victory, and few calculations to defeat:
how much more no calculation at all! It is by attention to this point that I can
foresee who is likely to win or lose.
Sun Tzu (544 BC - 496 BC) [33]
Before more technical discussions on CRT, one may need to understand the
differences between a classical problem solving approach and CRT. Figure 3.1
depicts a categorization that attempts to separate the two classical schools of
thinking in problem solving.
The “think-to-model” (T2M) school represents classical AI and quantitative
Operations Research (OR). Within the military, it represents what is known as the
military appreciation process (MAP), the process in which officers are trained
to solve problems. It starts with defining what the problem is; after all, without
knowing what the problem is, the activity can be counterproductive. Once the
problem is defined, it gets formulated either mathematically or qualitatively but
Let us recall that the two key concepts in CRT are risk and challenge. Decisions
are evaluated using a risk lens to challenge the system under investigation. In this
section, we will present a synthetic scenario for a CRT exercise to demonstrate how
the different bits and pieces of modeling come together to present a coherent CRT
environment.
Ramada (red) and Bagaga (blue) are two nations: Ramada is developing and
Bagaga is developed. Ramada relies on foreign aid from Bagaga to provide financial
support to its senior citizens. Bagaga provides this financial aid to increase the
loyalty of Ramada’s citizens to Bagaga.
Bagaga established a CRT exercise to understand the implication of the different
levels of financial aid it can provide to Ramada. Given that Bagaga established
this CRT exercise, the blue team represents Bagaga, and the red team represents
Ramada.1
Over the years, Bagaga has developed technologies to conduct CRT exercises
of this type. Given the complexity of the situation, Bagaga decided to use its CRT
capabilities to implement the CRT exercise.
Bagaga formed a highly qualified red team consisting of five experts: an anthro-
pologist; a social scientist; and a psychologist (all of whom specialize in research
on, and have a working knowledge of, Ramada); a strategist (who is familiar with
the machinations of the political policies of Ramada); and a computer scientist (who
specializes in running CRT models). In addition, a number of technicians have been
enlisted to support the red team.
Bagaga has constructed a blue team consisting of experts in economics and
international relations, and a computer scientist specialized in running CRT models.
The purpose of the exercise was explained to both teams as follows: “the purpose
of this exercise is to design a strategy to maximize the value gained by Bagaga
from the financial aid given to Ramada (benefit), while minimizing the amount of
financial aid (cost).”
Each team was assigned their roles as follows: “the blue team needs to decide on
a level of financial aid that Bagaga can afford, while the red team needs to discover
vulnerabilities in the blue team’s decision that can cause the value for money to be
less than expected.”
In this exercise, value for money is defined as

Value for money for the blue team = Benefit / Cost = Positive Effects / Negative Effects
1 Notice that the first letter of the country name corresponds to the first letter of the color, to help
remember which team is which.
The exercise will continue as a cycle. The objective of the blue team is to make
a decision on a level of financial aid that Bagaga can afford. The outcome of this
decision will be communicated to the red team, whose objective is to analyze the
vulnerabilities of Bagaga’s decision. The red team then sends the blue team its
findings in the form of the level of loyalty the financial aid achieved in Ramada.
The financial aid’s vulnerability cycle will continue until Bagaga is comfortable
that the analysis has covered the space well.
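The cycle can be sketched as a simple loop; the blue and red strategies below are toy stand-ins for the teams' actual models:

```python
def crt_cycle(blue_decide, red_probe, rounds=3):
    """One vulnerability cycle: blue picks an aid level, red reports the
    loyalty its analysis finds, and blue adapts on the next round."""
    feedback, log = None, []
    for _ in range(rounds):
        aid = blue_decide(feedback)
        feedback = red_probe(aid)
        log.append((aid, feedback))
    return log

# Toy stand-ins: blue raises aid while loyalty is below a target of 0.8;
# red's "analysis" is a hypothetical loyalty response curve.
blue = lambda fb: 10.0 if fb is None or fb >= 0.8 else 15.0
red = lambda aid: min(1.0, aid / 20.0)

history = crt_cycle(blue, red)
```

In the actual exercise, `blue_decide` would stand for the economic models and `red_probe` for the red team's vulnerability analysis; the loop structure is the point.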
What computer models would the exercise use for this activity?
The blue team decided to use economic models to understand what level of
financial support Bagaga could commit given its tight budget constraints. They will
rely on the international-relations experts to forecast the expected impact of the
assigned financial support on Ramada’s loyalty.
The red team decided to augment their expertise with the advanced computa-
tional capabilities available to them.
In the first cycle, the blue team ran their economic models and decided that an
appropriate level of financial aid would be B1. The decision was communicated to
the red team.
While the blue team was working on finding an appropriate level of financial aid,
the red team was attempting to understand the relationship between financial aid
from Bagaga and the loyalty of citizens in Ramada. To achieve this, the red team
needed to have a model of Ramada.
The model needed to capture the behavior of Ramada’s government in response
to different levels of financial aid. The red team decided to use a model they
named ScioShadow-Ramada. This was an advanced two-layer simulation model of
Ramada: one layer was a behavioral model of the government, while the second
layer was a behavioral model of the citizens.
The model used different variables about the lifestyle of a typical citizen in
Ramada, and mapped them to the feelings and emotions that are translated to loyalty
toward Bagaga. The model worked on different levels of detail and could be detailed
to the extent of mimicking the dynamics of how feelings and emotions are created
for different types of citizens in Ramada.
By varying the parameters of ScioShadow-Ramada, the red team can study the
impact of different levels of financial aid on the level of loyalty Ramada has
toward Bagaga. We will term the model that mimics this behavior the "simulator."
A simulator within CRT is the oracle that represents the phenomenon under
investigation. We can question this oracle with any factor we want, provided the
models inside the oracle cover this factor. Let us ask the oracle one of the main
questions arising from the situation at hand: “If the level of financial aid from
Bagaga is B1, what is the level of loyalty expected in Ramada?”
The red team can run a great deal of simulations (i.e., many calls to the simulator
with different parameter initializations) using ScioShadow-Ramada. However, this
is very time consuming and computationally expensive. Instead, the red team
decides to use optimization technologies to find the points representing the best
mappings (optimal solutions) between the level of financial aid from Bagaga and the
corresponding level of loyalty in Ramada. They execute a number of optimization
runs to find all optimal solutions.
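The simulator-plus-optimizer pattern can be sketched with a stand-in oracle and random-restart hill climbing; the three-peak loyalty response below is an invented stand-in for ScioShadow-Ramada, not the book's model:

```python
import math
import random

def simulator(aid):
    """Stand-in oracle: loyalty response with three local peaks (invented)."""
    peaks = [(2.0, 0.6), (5.0, 0.9), (8.0, 0.7)]   # (aid level, peak loyalty)
    return max(h * math.exp(-4.0 * (aid - c) ** 2) for c, h in peaks)

def hill_climb(start, step=0.05, iters=200):
    """Greedy local search that archives every solution it evaluates."""
    x, visited = start, []
    for _ in range(iters):
        x = max((x, x - step, x + step), key=simulator)
        visited.append((x, simulator(x)))
    return x, visited

random.seed(0)
archive, optima = [], set()
for _ in range(20):                      # random restarts reach separate peaks
    best, visited = hill_climb(random.uniform(0.0, 10.0))
    archive.extend(visited)
    optima.add(round(best, 1))           # three optima, cf. M1, M3, M5
```

Each restart archives every solution it evaluates, which is exactly the raw material needed to reconstruct the fitness landscape afterwards.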
Fig. 3.2 Fitness landscape of Ramada’s loyalty in response to Bagaga’s financial aid
In this exercise, the red team finds three optimal solutions. In Fig. 3.2, these
solutions are labeled M1, M3, and M5.
Figure 3.2 presents the relationship between the conditioning of the parameter
of interest, level of financial aid, and the possible response in the effect of interest,
which in this example, is measured by the loyalty level of citizens in Ramada toward
Bagaga. This diagram is sometimes termed the “response surface” or the “fitness
landscape.”
A response surface presents the effect (response) as a function of the cause
(parameter under investigation). A fitness landscape is a concept from biology that
presents the fitness of different species in the population; we can simply assume that
each solution in this diagram is a configuration for a policy or a “species”. In both
cases, understanding this surface is important because this is the diagram that blue
must consider when making a judgment.
Recalling that the level of financial support that the blue team chose is B1, we
can immediately deduce from Fig. 3.2 that B1 does not lead to the highest level of
loyalty. In fact, decreasing the financial aid can lead to a higher level of loyalty.
However, why is this the case?
To answer this question, the red team needs to dig underneath the fitness land-
scape using their knowledge and expertise on how Ramada works. The optimization
process that discovered the three optimal solutions has undergone a search process
to find these optimal solutions. The search process usually moves from one solution
to another (or from a sample of solutions to another) in the search space, evaluates
encountered solutions, and then decides either to generate new ones or to cease the
search.
If the red team saves all the solutions encountered during the optimization
process, it can visualize them as demonstrated in Fig. 3.3.
Let us remember that each solution in Fig. 3.3 arose from running the simulation
system in ScioShadow-Ramada. Therefore, the environment we used to initialize
Fig. 3.3 Solutions encountered during the optimization process to construct the fitness landscape
each simulation that generated each of these solutions can be saved. Based on the
analysis executed by the red team, two key variables in this environment (apart
from the level of financial aid) can be considered the cause for the variations in
these solutions. These two variables are the corruption level in the government of
Ramada, and the level of government control the Ramada government exercises
within the country.
Since it is established that the government of Ramada receives the financial aid
from Bagaga, the level of corruption in the government of Ramada means that the
financial aid does not all go to the citizens. However, this also depends on the level
of control of the government. If the level of control is very high, and the corruption is
very high, one would expect this to correspond to a very low portion of the financial
aid being passed to Ramada’s citizens.
In Fig. 3.3, the red team has used Z1, Z2, and Z3 to denote solutions encountered
during the optimization process that fall within the local area of each of the
three optimal solutions: M1, M3, and M5, respectively. It is important to note that
it is not necessarily true that all Z1 solutions were encountered during the search for M1.
Therefore, the red team can visualize the relationship between these two variables
and the corresponding loyalty level as presented in Fig. 3.4. Fortunately, the red
team can see that each optimal solution and its surrounding neighborhood occupies
a distinct subspace of the government of Ramada’s corruption and control space.
Given the large amount of data that has been collected, the red team can apply
classification, which is a type of data-mining technique that can automatically
discover the boundaries between different areas in the diagram. The output of the
classification method is presented in Fig. 3.5.
This output is interesting because it divides the corruption-control space into
three distinct regions that impact the relationship between the amount of financial
aid from Bagaga and the level of loyalty of Ramada’s citizens to Bagaga.
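The boundary-discovery step can be sketched with a nearest-centroid classifier, a deliberately minimal choice rather than any specific method from the exercise. The labeled (corruption, control) points below are hypothetical stand-ins for the solutions saved during the optimization runs.

```python
# Nearest-centroid classification: a minimal sketch of how a classifier can
# carve the (corruption, control) space into regions Z1, Z2, and Z3.
training = {
    "Z1": [(0.8, 0.7), (0.9, 0.8), (0.7, 0.9)],   # high corruption, high control
    "Z2": [(0.8, 0.2), (0.9, 0.3), (0.7, 0.1)],   # high corruption, low control
    "Z3": [(0.2, 0.3), (0.1, 0.5), (0.3, 0.4)],   # low corruption
}

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

centroids = {label: centroid(pts) for label, pts in training.items()}

def classify(corruption, control):
    """Assign a point to the region with the nearest centroid."""
    dist2 = lambda c: (c[0] - corruption) ** 2 + (c[1] - control) ** 2
    return min(centroids, key=lambda lbl: dist2(centroids[lbl]))

print(classify(0.85, 0.75))  # → Z1
```

The implied boundaries between the three regions are the perpendicular bisectors between centroids, which is the kind of partition Fig. 3.5 depicts.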
Fig. 3.4 The relationship between fitness landscape and causal space
The red team sent their findings to the blue team, demonstrating that the blue
team's original decision was vulnerable.
The blue team decided to challenge the red team. They needed to find a way
to push the boundaries that separate the three areas of optimal solutions to their
advantage. To design this challenge, the blue team began to analyze the skills (know-
how) the two concepts of government control and government corruption require.
The international-relations expert identified three skills for government corruption
and two skills for government control. These skills are presented in Fig. 3.6.
The skills presented in Fig. 3.6 are worth noting. A corrupt government
requires a strong understanding of the social and political context
of the country. Moreover, it may also have excellent social-intelligence skills.
These two skills are interesting because they are also the skills needed for a healthy
government. The third skill for corruption is that the person (government) needs to
know how to suppress their conscience, be selfish, and avoid altruism.
This situation is similar to that of an excellent thief and an excellent detective: the two
have the same skills, but their attitudes toward society are driven by two opposing value
systems.
For the government to have control, they need to know how to design laws to give
them this control, and how to establish strong law-enforcement agencies to protect
and uphold the laws.
The analysis conducted by the blue team revealed the skills that can be influenced
to challenge and reshape the relationships presented in Fig. 3.5. The blue team used
simulation, optimization and data-mining techniques again, in conjunction with
ScioShadow-Ramada, to find ways to challenge the red team. Figure 3.7 presents
the outcome of this exercise, where the blue team identified that the line separating
Z3 from Z1 and Z2 can be pushed by a distance: c1, and the line separating Z1
from Z2 can be pushed further a distance: c2.
To explain the findings of the blue team, let us begin with c1. If the line
separating Z3 from Z1 and Z2 is pushed a distance c1, the area of Z3 becomes
larger. The fitness landscape did not change; what changed was that the blue team
found ways to influence Ramada
such that lower government control could generate higher benefits. For example, if
the blue team directed some of the financial aid into media programs to promote a
sense of community in Ramada, this would counteract the selfish behavior displayed
by some government officials in Ramada and convert them to exhibiting altruistic
behaviors. Similarly, by enhancing the laws in Ramada, the line separating Z1
from Z2 can be pushed in the direction of c2; thus, higher levels of corruption
can be counteracted with better laws.
In discussing the concept of a challenge, the use of data-mining techniques to
estimate the boundaries between the different sets of solutions was crucial. These
are the boundaries that needed to be discovered before establishing the impact on
Ramada. The blue team can now use a portion of the B1 financial aid as a budget to
influence and reshape the government of Ramada to achieve a higher level of loyalty
in the citizens of Ramada.
This example presents a number of CRT technologies that were essential in the
exercise. First, we can see from the beginning of the exercise that the purpose and
roles of the teams were assigned unambiguously. This is part of the concept of
“experimentation,” which we will discuss briefly in the next section.
The second set of skills relates to optimization, data mining, and simulation.
Optimization provides the tools to search for and discover optimal solutions or
promising areas of the search space. Data mining provides the tools to group
information, find ways to discriminate between information, and find possible causal
relationships in the data. Simulation is the oracle that represents the system under
investigation in a computer program, of which we can simply ask questions instead
of asking the real system itself.
3.2 Experimentation
and the RT-S must be clear. If the cost exceeds the benefits, the CRT exercise
should not take place. There is no point in asking a big fancy question when the
budget is limited: whatever answer we get, it creates vulnerabilities in the
decision-making cycles by exposing the credibility of the answer to threats of
doubt.
The CRT question(s) will usually lead to hypotheses. There are two definitions
of a hypothesis.
Definition 3.1. A hypothesis is a statement of a prior belief formed about the
outcome of an experiment.
Definition 3.2. A hypothesis is a prior belief of the existence of a cause–effect
relationship.
If the first definition is used, a hypothesis may sound strange to some CRT
designers, primarily because in complex CRT exercises, the formulation of a belief
about the outcome of the exercise is either a trivial deduction from the question,
or conveys the image of an academic experiment rather than a classical in situ
CRT exercise.
For example, suppose the question that triggered a CRT exercise was whether a security
layer can be penetrated. The hypothesis would appear trivial if we simply stated that we
believe this security layer can be penetrated. The word "hypothesis" itself, with
the first definition in mind, may also appear overly academic in a CRT context.
If the second definition of a hypothesis is used, we can see the importance of
defining a hypothesis more clearly; in fact, it becomes a necessity. In this definition,
a hypothesis is a belief about the cause of the effect. If the effect is penetration
of the security layer, a hypothesis can be formulated to state that lack of physical
protection of key computer access points makes it possible to penetrate the security
layer. Here, the cause is our key to open the door toward reaching and achieving the
effect. By stating the hypothesis of the CRT exercise, we are stating our initial belief
about the first key we will use to generate the effect.
Formulating the right hypothesis substantiated with systematic and logical
thinking is a valuable step toward obtaining rigorous results. If it eventuates that the
hypothesis is invalid, this becomes a finding in its own right. In the example above,
if a lack of physical protection of key computer access points does not lead to a
penetration of the security layer, this finding would convey that the security layer is
robust against this cause. More importantly, it will prompt us to ask why the lack of a
physical protection layer of computer access points is not a door toward the security
layer: Is it because there is an internal firewall between internal subnetworks and the
core network within the organization? Is it because there are strong cryptographic
protocols or is there another reason?
These follow-up questions will be the basis for the evolution of the study, the
formulation of updated and new hypotheses, and the exploration of more means to
achieve the effect.
Sometimes the CRT exercise is executed using a human-in-the-loop simulation.
The red team needs to make decisions on what analysis tools they will adopt to
evaluate the plans of the blue team, then the simulation environment provides
feedback to both teams on their plan and counter-plan. A hypothesis is still required;
otherwise, the problem is open ended for the red team. The red team needs to use
the hypothesis as a mechanism to begin the exercise and to avoid confusion about
where to start.
While we emphasize the need for a hypothesis, the hypothesis should not be
considered a bias that stops the red team from innovating. The questions and
hypotheses frame the CRT exercise's reason for being, but should not constrain
the thinking process, ideas, or innovations that should emerge from the
exercise. A hypothesis is merely an initial belief. Once the exercise begins, the team
members may dismiss it completely. However, even when this occurs, the hypothesis
acts like a seed for the discussion and investigation. Even if the red team is not
persuaded that the lack of physical protection of key computer access points makes
it possible to penetrate the security layer, they will begin to debate and dismiss this
claim. This debate will encourage the analysis.
3.2.2 Experiments
The designer of an experiment has the daunting task of ensuring that the experimental
environment is conditioned to exclude many unwanted factors. These unwanted
factors can be elements that do not impact the experiment; including them
may introduce noise, confuse analysts, and needlessly complicate the experimental
environment. Unwanted factors may also
be elements that impact the experiment, but the designer wishes to exclude them to
be able to isolate cause and effect.
For example, imagine we want to examine the effect the exposure of a manager to
depressive situations will have on the quality of the manager’s critical decisions. Let
us assume the hypothesis of this experiment is that as the degree of depressive events
to which the manager is exposed increases, the quality of the manager’s decisions
decreases.
To evaluate the degree of depressive events, RT-D may rely on changes in the
physiological responses of the manager such as changes in skin temperature and
heart rate once the manager is exposed to depressive events.
Therefore, we need to be able to detect when a depressive event occurs and
measure the physiological responses before and after the occurrence of the event.
However, the manager may become depressed at home before going to work. The
designer of the experiment may opt to run the experiments only when the manager
is in a pleasant psychological state; this may require the manager to sleep on site
or in a nearby hotel, or may require preventing routine problems from arriving on the
manager's desk during the experiments so that the experiment can focus on critical
decisions.
This type of conditioning ensures that the experiments are conducted in the right
circumstances to establish the cause–effect relationship, if it exists. Unnecessary
factors are eliminated; those that are key to the experiment are included, and the
environment is conditioned such that no external factors influence or bias the results.
An experiment is designed to test the hypothesis, that is, to test whether we can
establish a relationship between the cause and the effect. Assume we want to test the
robustness of an algorithm that routes the vehicles in a furniture-delivery company.
The objective of the CRT exercise is to discover when this algorithm fails, that is,
when robustness is not achieved. A possible question is the following: When does the
algorithm fail? A possible hypothesis might be that disturbances in the environment
would cause the algorithm to deviate significantly from the best possible solution in
these situations.
The CRT team begins the exercise by brainstorming the types of disturbances that
can happen, designing novel methods to synchronize these disturbances, and creating
chains of different disturbances so that the overall uncertainty in the environment
cascades to a very high level. The CRT team will write code to automatically
generate scenarios with the designed disturbance characteristics and stress-test the
routing algorithm with these scenarios. They will scrutinize the performance of the
algorithm on different scenarios, learn patterns of failures, and redesign scenarios
that are likely to cause these failure patterns to compound.
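The generate-and-stress-test loop can be sketched on a toy routing instance. The travel times, congestion model, and greedy router below are all hypothetical; the point is the loop structure: generate a disturbed scenario, run the algorithm, compare it against the best achievable result, and record the worst gap.

```python
import itertools
import random

# Hypothetical base travel times for a depot and two delivery stops.
base_times = {("depot", "A"): 2, ("depot", "B"): 4, ("A", "B"): 1,
              ("A", "depot"): 2, ("B", "depot"): 4, ("B", "A"): 1}

def tour_cost(tour, disturbance):
    """Cost of a depot-to-depot tour under leg-by-leg congestion multipliers."""
    cost, here = 0.0, "depot"
    for stop in list(tour) + ["depot"]:
        cost += base_times[(here, stop)] * disturbance.get((here, stop), 1.0)
        here = stop
    return cost

def greedy_tour(disturbance):
    """A naive router: always drive the currently cheapest leg first."""
    remaining, here, tour = {"A", "B"}, "depot", []
    while remaining:
        nxt = min(remaining,
                  key=lambda s: base_times[(here, s)] * disturbance.get((here, s), 1.0))
        tour.append(nxt)
        here = nxt
        remaining.discard(nxt)
    return tour

def best_tour_cost(disturbance):
    """Brute-force optimum, feasible only on tiny instances."""
    return min(tour_cost(p, disturbance)
               for p in itertools.permutations(["A", "B"]))

random.seed(1)
worst = 0.0
for _ in range(200):  # generated scenarios: random congestion on each leg
    scenario = {leg: random.choice([1.0, 1.0, 3.0]) for leg in base_times}
    gap = tour_cost(greedy_tour(scenario), scenario) - best_tour_cost(scenario)
    worst = max(worst, gap)
print(f"worst observed deviation from the best tour: {worst:.1f}")
```

Scenarios with a large gap are exactly the failure patterns the team would study and deliberately regenerate.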
The process described above attempts to search for scenarios in which factors
come together to cause maximum disturbances. To conduct this search and opti-
mization process, we can either rely on human thinking to search and optimize
explicitly, or rely on human thinking to design automatic search and optimization
methods. The next section introduces the basic ideas behind search and optimiza-
tion. As the scenario discussed in Sect. 3.1.2 and this example demonstrate, the
purpose of these computational tools is to discover the cause–effect relationships.
In breadth first, the 1s will begin appearing from the top level, moving from left to
right, then returning to the leftmost unvisited node one level down, with the
last 1 corresponding to the bottom-right node. In depth first, the 1s
will begin appearing from the top all the way down to the bottom of the tree, then back up
to the highest level with a remaining 0 node, then back down again from that node.
Search in this case is guided by this binary variable, and a simple rule for visiting
a node that has not been visited before. In fact, this simple rule is all we need to
perform a complete search of the tree. The reason we follow depth or breadth first
is that they correspond to the minimum cost of tracking which nodes have been
visited. If we follow a random order, we need to have a list of nodes that have not
been visited, and every time we visit a node, we remove the node from the list. This
involves extra storage that we do not need if we follow breadth first or depth first.
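The two visiting orders can be sketched on a small complete binary tree, stored here as nested tuples; the node names are illustrative.

```python
from collections import deque

# (value, left_subtree, right_subtree); None marks a missing child.
tree = ("r", ("a", ("c", None, None), ("d", None, None)),
             ("b", ("e", None, None), ("f", None, None)))

def breadth_first(root):
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()          # the 1s appear level by level, left to right
        if node is None:
            continue
        value, left, right = node
        order.append(value)
        queue.append(left)
        queue.append(right)
    return order

def depth_first(root):
    order, stack = [], [root]
    while stack:
        node = stack.pop()              # dives to the bottom, then backtracks
        if node is None:
            continue
        value, left, right = node
        order.append(value)
        stack.append(right)             # push right first so left is explored first
        stack.append(left)
    return order

print(breadth_first(tree))  # → ['r', 'a', 'b', 'c', 'd', 'e', 'f']
print(depth_first(tree))    # → ['r', 'a', 'c', 'd', 'b', 'e', 'f']
```

Note that neither function keeps a separate visited list: the queue or stack discipline itself is the cheap bookkeeping the text describes.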
Assume that the objective is not to visit all nodes in the tree. Instead, assume
each node is a nest with a number of ants. Our objective is to find the smallest nest
in the tree. This is an optimization problem. We can still use breadth-first or depth-
first strategies and search the entire tree. This is perfectly fine if the size of the tree
is relatively small and there is no domain knowledge on how nests are distributed in
the tree.
However, if we know that nests tend to get smaller as we go down the tree, we
can search only the leaves at the bottom. This characteristic defines the advantage of
optimization techniques. They exploit domain knowledge about how solutions are
distributed in the search space and utilize this domain knowledge to find the optimal
solution fast.
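Exploiting the domain knowledge that nests shrink toward the bottom of the tree means only the leaves need to be evaluated as candidates; the nest sizes below are illustrative.

```python
# Nested-tuple tree: (nest_size, left_subtree, right_subtree).
# Sizes shrink toward the bottom, so only leaves can hold the optimum.
nests = (9, (7, (2, None, None), (3, None, None)),
            (8, (1, None, None), (4, None, None)))

def smallest_leaf(node):
    """Descend to the leaves and compare only their sizes;
    interior nests are never candidates."""
    size, left, right = node
    if left is None and right is None:
        return size
    return min(smallest_leaf(left), smallest_leaf(right))

print(smallest_leaf(nests))  # → 1
```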
Optimization is a core technology in any type of problem-solving activity,
manual or computer based. When a need arises to solve a problem, we need to
be able to evaluate what is an allowed (feasible) solution and what is not an allowed
(infeasible) solution. This is achieved by designing the set of constraints. If a solu-
tion satisfies all constraints, it is evident that this solution is acceptable and feasible.
If many solutions or alternatives exist that satisfy the constraints, we need a criterion
or a set of criteria to decide which of these solutions is more appropriate. A criterion
can take the form of an objective function, where one solution is better than another
if it has a better value on this criterion. A criterion can also take the form of a
goal, where one solution is better than another if it better satisfies the goal.
Optimization is usually considered the process of finding one or more solutions
that have the minimum cost according to some cost function. Any search problem
can be modeled as an optimization problem. In the example above where we wanted
simply to visit all nodes in the tree, the optimization problem is to maximize the
number of 1s in the tree. The optimal solution exists only after we have visited all
nodes.
Let us take another example that is messier than the structured example
above. In Manysoft, Mario is the technical person who knows the secret for
the company’s next revolutionary product. Therefore, Mario is a critical element
in the organization. Marcus works for Minisoft and knows how to entice someone
like Mario to speak and reveal the secret. Manysoft knows this fact. The objective
of Manysoft is to find a strategy to minimize the probability that Mario encounters
Min F(X)
subject to
D(X) and C(X)

Solving an optimization problem has two parts. First, we need to find the set
of feasible solutions V(X) ⊆ D(X), where (v_1j(x_1), …, v_ij(x_i), …, v_nj(x_n)) ∈ V(X),
i = 1, …, n, j = 1, …, k, with v_ij(x_i) the value assigned to variable
x_i ∈ X; n and k represent the cardinality of the set of variables and of the feasible
set, respectively, and (·) represents the ordered solution vector that satisfies all
constraints. In other words, we need to find an assignment of a value, v_ij, to each
variable, x_i ∈ X, such that the value is within the domain, v_ij ∈ D_i, and it satisfies
all constraints, C(X).

Second, we need to find the optimal solution, V* ∈ V(X), within the set of
feasible solutions V(X).
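The two parts can be sketched by brute force on a tiny instance: enumerate candidate assignments, keep those satisfying C(X) to form the feasible set V(X), then minimize F over it. The domains, constraints, and objective below are illustrative.

```python
import itertools

# Illustrative domains D_i for three variables.
domains = {"x1": [0, 1, 2], "x2": [0, 1, 2], "x3": [0, 1, 2]}

def constraints(a):               # C(X)
    return a["x1"] + a["x2"] >= 2 and a["x3"] != a["x1"]

def objective(a):                 # F(X)
    return 3 * a["x1"] + 2 * a["x2"] + a["x3"]

names = list(domains)
# Part one: build the feasible set V(X) by filtering D(X) through C(X).
feasible = [dict(zip(names, values))
            for values in itertools.product(*(domains[n] for n in names))
            if constraints(dict(zip(names, values)))]
# Part two: find the optimal solution within the feasible set.
best = min(feasible, key=objective)
print(best, objective(best))  # → {'x1': 0, 'x2': 2, 'x3': 1} 5
```

Exhaustive enumeration only works for toy domains; the search and heuristic methods discussed in this section exist precisely because real feasible sets are too large to enumerate.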
Define I(X) as a function representing the amount of constraint violation. The
amount of constraint violation can be measured in many ways including by the
number of constraints violated, a distance metric between the value of the current
solution and the closest point in the feasible region, or the amount of effort needed to
restore feasibility. The problem can be reduced to an unconstrained version of
the following form:
Min F(X) + I(X)

² The word "algorithm" in GA (genetic algorithm) is not used in the same manner in which an algorithm was defined.
A GA is a heuristic; the use of the word algorithm here refers to the corresponding computer code,
in which the heuristic is written as a computer algorithm.
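This unconstrained penalty form can be sketched numerically. The objective, the constraint x ≤ 1, and the explicit penalty weight lam (the plain form above uses an implicit weight of one) are all illustrative; as lam grows, the minimizer of F(x) + lam·I(x) is pushed back into the feasible region.

```python
def F(x):
    return (x - 3) ** 2          # objective: unconstrained minimum at x = 3

def I(x):
    return max(0.0, x - 1.0)     # violation of the constraint x <= 1

def penalized_min(lam, lo=-5.0, hi=5.0, steps=10001):
    """Grid search for the minimizer of the penalized objective."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda x: F(x) + lam * I(x))

for lam in (1.0, 10.0, 100.0):
    x = penalized_min(lam)
    print(f"lam = {lam:>5}: x* = {x:.2f}")  # approaches the constrained optimum x = 1
```

With a weak penalty (lam = 1) the minimizer still violates the constraint; from lam = 10 onward it sits on the constraint boundary, which is the constrained optimum here.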
our initial bias is incorrect, the Simplex method is guaranteed to find the optimal
solution for the linear-programming problem. Therefore, our bias may cost us more
time if we get it wrong, but it will not cost us quality.
We may ask why we should rely on heuristics. The reason is that in any realistic
large-scale real-world problem, heuristics are more efficient than algorithms. This
may shock some mathematicians. However, this discussion is important for CRT
because of its impact on the choices we will make in optimization. As such, we
need to explain this point further.
The Simplex method and some of its variations constitute efficient algorithms
that can realistically solve large-scale real-world problems and are guaranteed
to find the optimal solution. In fact, the method will often be faster than most
heuristics, and it can solve problems with millions of variables. How, then, can we
claim that heuristics are more efficient?
To use the Simplex method, the objective function and constraints need to be
linear and the variables need to be continuous. Real-world problems do not satisfy
these requirements easily. Even if the constraints are linear, most likely the cost
function will be nonlinear. More often, we will find some variables need to be
integer, and in most CRT problems, we will have many conflicting objectives rather
than a single linear objective.
Therefore, if we wish to use the Simplex method, we have to make the problem
fit the model. This is a trick that is not accepted in CRT. In the CRT exercise, we
attempt to discover vulnerabilities, or evaluate the system we wish to study. By
approximating the problem too much, we might be hiding vulnerabilities, and more
dramatically, we might be biasing the findings away from an area of high risk toward
areas of lower risk.
intrateam optimization. Consider, for example, the problem of minimizing the fleet size of a
furniture-delivery company: this problem consists of a number of interdependent
subproblems. One such subproblem is routing. If a vehicle chooses the
shortest path, it might also be the busiest path; therefore, delays will occur. Taking
a longer path would mean fewer trips were made by a vehicle and an increase in the
fleet size.
A second interdependent subproblem is bin-packing. Optimal packing of a
vehicle to maximize the load it carries may result in an increase in loading and
unloading time and longer routes for the vehicle. A bad packing of a vehicle would
mean unutilized space, shorter than necessary trips, and a need for a larger fleet.
A third interdependent subproblem is timetabling or scheduling. Increasing on-
road time would reduce the fleet size, but may also increase breakdowns and
maintenance rate. Minimizing the idle time of a vehicle would mean that disturbance
in the route or delay in executing a delivery would delay all subsequent tasks. In the
latter case, some deliveries may even need to be postponed to the following day.
Moreover, it may cause an increase in the rate of fatigue for drivers.
A fourth interdependent subproblem is fleet mix. Having a single type and size
of a vehicle in the company would provide the best maintenance services because it
would mean lower maintenance costs, as the company would require a single type
of expertise in its workshop; having a single type and size of vehicle could also lead
to more efficient material-handling processes. However, having all vehicles of the
same size is likely to increase the underutilization of vehicles, and may increase the
number of trips necessary to execute all deliveries.
To decide on the optimal fleet size of this delivery company, at least the above
four interdependent subproblems need to be solved. Routing would depend on the
type of vehicles, timetabling, and bin-packing. Similarly, the optimal decision on the
fleet mix would be influenced by decisions made in the other three subproblems.
Formulating a single optimization problem to solve all subproblems is not
desirable: the interdependency of the four subproblems translates in the
mathematical model into a high level of coupling and nonlinearity. The model will be
too complicated, and it will be very difficult to design efficient optimization search
strategies to solve it. Moreover, a disturbance in one subproblem would impact the
entire model, making it difficult for the model to adapt in a changing environment.
In these problems, it is easier to imagine that each subproblem is handled by a
computer/software agent. Each software agent attempts to optimize its own model.
When one agent proposes its optimal solution to the other agents, the other agents
may reject the solution because it imposes a constraint on them that deteriorates
their own optimum outcome. Therefore, each agent must negotiate the amount it is
willing to lose to achieve its own optimum outcome. A schematic diagram of this
multi-agent system is presented in Fig. 3.8, while methods to solve this problem are
discussed in [1, 3].
Fig. 3.8 Multi-agent system for negotiating the optimization of interdependent problems
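The negotiation loop can be sketched as coordinate descent between two agents with coupled cost functions. The routing and packing costs below are illustrative quadratics; the coupling term 0.5·x·y is what makes one agent's optimum constrain the other's.

```python
def routing_cost(x, y):
    return (x - 2) ** 2 + 0.5 * x * y     # prefers short routes; heavy loads penalize it

def packing_cost(x, y):
    return (y - 4) ** 2 + 0.5 * x * y     # prefers full loads; long routes penalize it

def argmin_1d(f, lo=0.0, hi=5.0, steps=501):
    """Grid search for the minimizer of a one-dimensional cost."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=f)

x, y = 5.0, 0.0                           # initial, conflicting proposals
for _ in range(20):                       # rounds of negotiation
    x = argmin_1d(lambda x: routing_cost(x, y))   # routing agent responds to y
    y = argmin_1d(lambda y: packing_cost(x, y))   # packing agent responds to x
print(f"negotiated x = {x:.2f}, y = {y:.2f}")
```

Neither agent ends at its individual optimum (x = 2 alone, y = 4 alone); each gives up a little, which is exactly the negotiated loss the text describes.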
3.4 Simulation
where B_s(t) and R_s(t) represent the blue and red force sizes, respectively, and α_b and α_r
represent the single-shot kill probabilities for blue and red, respectively. The solutions
to the Lanchester equations, B_s(t) and R_s(t), satisfy the following relationship:
when α_r/α_b = B_s²(t)/R_s²(t), the force ratio during combat becomes a constant, with the
attrition ratio dB_s(t)/dR_s(t) becoming proportional to the force ratio B_s(t)/R_s(t).
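The aimed-fire dynamics behind these equations, dB_s/dt = −α_r·R_s and dR_s/dt = −α_b·B_s, can be integrated numerically; along the trajectory, the square-law quantity α_b·B_s² − α_r·R_s² should stay (approximately) constant. The parameter values below are illustrative.

```python
# Euler integration of the aimed-fire Lanchester equations.
alpha_b, alpha_r = 0.02, 0.01          # single-shot kill probabilities (illustrative)
B, R = 100.0, 120.0                    # initial force sizes
dt = 0.001

invariant0 = alpha_b * B**2 - alpha_r * R**2
for _ in range(200_000):               # fight until one side is nearly gone
    if B <= 1.0 or R <= 1.0:
        break
    B, R = B - alpha_r * R * dt, R - alpha_b * B * dt
invariant = alpha_b * B**2 - alpha_r * R**2

print(f"survivors: B = {B:.1f}, R = {R:.1f}")
print(f"square-law drift: {abs(invariant - invariant0):.3f}")
```

Here the invariant is positive at the start, so blue wins despite being outnumbered, illustrating why the square law rewards effectiveness quadratically.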
Lanchester equations constitute a simple model. They have been widely criti-
cized because of their assumptions. As such, it may serve us to discuss this issue.
Since every model is a representation of something in a form that is not the form
of the something itself, equivalence between the original system and the model can
only be a matter for theoreticians to debate. In fact, it is inconceivable to think of
a model of any real-world phenomenon that is equivalent in a strict mathematical
sense to the phenomenon itself. As such, we need to understand the following four
related concepts: model assumptions, resolution, level of abstraction and fidelity.
Definition 3.5. Model assumptions represent the conditions under which the model
is valid.
That is, model assumptions represent the conditions under which the model
represents what it is supposed to represent correctly.
In the Lanchester equations presented above, one of the assumptions is that the
two forces rely on direct fire only. If we wish to model a situation with indirect fire,
we should not use the model above as it is. The model is a tool in our hands; it is
our choice how and when to deploy it. If we deploy it incorrectly, we are to blame,
not the model.
The starting point of any computational modeling is to build a model. We can
then transform this model into a piece of code.
Definition 3.6. A simulator is an encoding of the model in a suitable software
system (i.e. program).
For example, an aircraft simulator may look like a physical aircraft with all the
gear, except that this aircraft flies in a virtual world using a model.
Definition 3.7. Simulation is the ability to reproduce the behavior of a system
through a model.
Simulation is the process whereby we sample different inputs, use the simulator
to generate the corresponding outputs, group the outputs and attempt to understand
how the aircraft responds to different inputs. In essence, a model represents the
system, a simulator encodes the model, and simulation is the overall process with
the objective of mimicking the original system, sampling the input space, and
reproducing the behavior of that system. The relationships between the original
system and the produced mimicked behavior are captured in three concepts:
resolution, abstraction and fidelity.
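The three terms can be separated in code: a hypothetical braking-distance formula plays the model, a program wrapping it (with input checks reflecting the model's assumptions) plays the simulator, and sampling the input space is the simulation.

```python
import random

def braking_distance_model(speed, friction):
    """Model: distance = v^2 / (2 * mu * g), an idealized physical relationship."""
    g = 9.81
    return speed ** 2 / (2 * friction * g)

def simulator(speed, friction):
    """Simulator: the model encoded as executable code, with input checks
    that enforce the model's assumptions."""
    if speed < 0 or not 0 < friction <= 1:
        raise ValueError("inputs outside the model's assumptions")
    return braking_distance_model(speed, friction)

# Simulation: sample the input space and study the responses.
random.seed(0)
samples = [(random.uniform(10, 30), random.uniform(0.3, 0.9)) for _ in range(1000)]
distances = [simulator(v, mu) for v, mu in samples]
print(f"mean braking distance over samples: {sum(distances) / len(distances):.1f} m")
```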
The three concepts of resolution, abstraction, and fidelity are sometimes used
interchangeably. We will provide a different view of these three concepts to ensure
that they are understood as distinct, albeit interdependent, concepts. Figure 3.9
illustrates the interdependency among them: if we imagine that we see a system
through a telescope, resolution is what we see through the telescope lens. We then
need to decide how to represent what we see, which is where abstraction comes into
play. What we see contains many pieces of information; fidelity is how much of this
information we bring inside the model or the simulation environment. What we
bring in will be reflected in what the simulation environment is able to generate.
Thus, fidelity is both an input and an output.
Formally, resolution is a function of the system while abstraction is a function of
the model.
Definition 3.8. Resolution is what the modeler intends to model about the problem.
Definition 3.9. Abstraction is what the modeler decides to include or exclude in
the model.
Let us take the Lanchester equations discussed above as an example. Here, one
objective could be to understand force-level attrition. This is the level of resolution
the modeler decides to consider. The modeler may zoom into the system and,
instead of examining a force-level question, may decide to examine a company or even
a single soldier.
Thus far the discussions have revealed a need for data-analysis techniques to support
the CRT exercise. For experimentation, classical hypothesis testing is required to
establish confidence about whether the trends discovered in the data are reliable
trends or happened by chance.
In this section, we will introduce slightly more advanced data-analysis techniques.
One of the main tasks in CRT is to discover the boundaries of behavior to create
a challenge. In the scenario presented in Sect. 3.1.2 on Ramada and Bagaga, the
simulations generated a large amount of data labeled with one of three
labels: Z1, Z2, and Z3. As presented in Fig. 3.5, we needed to discover the boundaries
that separate these three labels in the space. This task is traditionally known
as "classification," one of the main problems in the wider field of data mining.
In the remainder of this section, we will first introduce data mining and machine
learning. We will then discuss different approaches to classification. The discussion
will then continue on how we can adopt these approaches to approximate the
boundary of behavior that will enable us to decide on how to design challenges
within the CRT exercise.
Historically, the field of knowledge discovery in databases (KDD) [14] is
concerned with the overall process of transforming data into knowledge. This
process has many steps. Data can exist in any form, including folders in file cabinets,
files on computers, web pages on the internet, audio and video files in mobile
telephones, and data that resides in our head. To process these data on a computer,
we need to digitize them, that is, they need to be stored on a computer in 0s and 1s.
This may involve hiring people to type the data, or using automatic methods such as
scanners, optical-character-recognition software, and speech-to-text software.
The process of transforming the data into a digital format can involve mistakes.
To ensure data quality, we need to fix these mistakes by deciding what to do
when we encounter a missing value (e.g. the data entry did not include the age of the
client), an inconsistent value (e.g. a 6-year-old customer recorded as working as a CEO),
or many other data-cleaning issues. Once the integrity of the data is established, we
can then transform these data into a form suitable for the specific algorithm we are
using to discover “knowledge.” This step is traditionally termed “data mining.”
Data mining is a step within KDD in which the data are in a state ready to be
processed, and the data-mining technique takes these data and discovers knowledge
in the form of patterns and relationships. Within CRT, data mining offers extremely
powerful tools that one team can use to learn about the other team. But first, let us
focus on the word “knowledge.”
One way of thinking about knowledge is to see it as a set of rules. For example, if
John is not at his desk, the security system is vulnerable. Obviously, we can discuss
130 3 Big-Data-to-Decisions Red Teaming Systems
many issues about this rule, from its validity to its causality and generalization.
However, this is not the point at present. The two main points we need to discuss
about this rule are the following: representation (the form) and inference (how we
discovered it).
The representation takes the form “IF … THEN ….” It is a very
powerful representation despite its classical form having critical assumptions such
as linearity. It is powerful because it has expressive power, that is, a human can
understand it easily. Symbolic representations such as this are consistent with the
manner in which we reason about entities in the world.
However, on what basis have we discovered this rule? Why is the security
system vulnerable when John is not at his desk? These are two different questions.
We have observed the system behavior over time. Assume we are discussing a
computer network. We have noticed that many times when John leaves his desk,
a denial-of-service attack on the network occurs. Such a rule can be discovered
through data-mining techniques that attempt to correlate events across different
databases. These types of correlations can be misleading because there might not
be any relation between John leaving the desk and the denial-of-service attack.
Nevertheless, whether John is a cause for the denial-of-service attack is not the
issue. We first need to discover the rule/pattern. Before we can dismiss the pattern,
we need to consider it a hypothesis that warrants further investigation. That is, this
rule is simply a hypothesis that is yet to be validated. We can then ask why.
Asking “why” may trigger a data-collection exercise for data that we have not
been collecting. For example, we may collect data on where John goes when he
leaves his desk if one of the hypotheses is that John is generating the attack. We
may collect data on John’s experience and attitude on the network if we believe that
John is an excellent network administrator who can quickly sense espionage activity
that occurs before the attack, and diverts the attack to a dummy network. Therefore,
leaving his desk is a window for an intruder to penetrate the network.
The above discussion illustrates a point that is critical for CRT. The data-mining
process can help us to generate hypotheses that are supported by evidence from the
data we have. This can be an entirely blind process without a bias of any specific
presupposition. One hypothesis raises questions that trigger more analysis and more
hypotheses can be discovered during the process. Therefore, one is able to see the
overall CRT exercise as a data-mining exercise; it begins with hypotheses, conducts
experiments and/or discussions to collect evidence, either confirms or refutes the
hypotheses, and the cycle continues.
The previous representation can be extended to “IF … THEN … ELSE …”
and can contain a series of nested “IF … THEN …” rules. For example, see the
following rule, which assumes that the first step to authentication in the system is
based on a fingerprint.
If the subject’s finger is oily, authorization is not granted; otherwise, authorization
is granted subject to identification.
This rule is a compound rule; we can split it into three basic rules that we can
easily map to each path from the root node to a leaf node in a tree-like form. The
three rules are the following:
3.5 Data Analysis and Mining 131
This type of decision tree is termed a “univariate decision tree.” In the oblique
hyperplane case, a split can involve a weighted sum of multiple variables (e.g.
3 × Age + 5 × Loyalty Points < 120). This type of decision tree is termed a
“multivariate decision tree.”
The leaves of the decision trees above represent a categorical variable referred to
as the “class.” This type of decision tree is referred to as a “classification decision
tree.” If the leaf is a predictive function (point value), the tree is referred to as a
“regression decision tree.”
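The distinction between the two split types can be sketched in a few lines. The thresholds and class labels below are illustrative assumptions; only the oblique hyperplane 3 × Age + 5 × Loyalty Points < 120 comes from the text.

```python
# Sketch: univariate vs. multivariate (oblique) decision-tree splits.
# Threshold values and class labels are invented for illustration.

def univariate_split(age):
    # A univariate split tests a single variable against a threshold.
    return "high-risk" if age < 30 else "low-risk"

def multivariate_split(age, loyalty_points):
    # An oblique (multivariate) split tests a weighted sum of variables,
    # here the hyperplane 3 * Age + 5 * Loyalty Points < 120.
    return "high-risk" if 3 * age + 5 * loyalty_points < 120 else "low-risk"

print(univariate_split(25))        # tests age alone
print(multivariate_split(25, 20))  # 3*25 + 5*20 = 175, not < 120
```

A univariate tree draws axis-parallel boundaries, while an oblique split can draw a tilted hyperplane, which often separates the classes with fewer nodes.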
In this book, we will limit the discussion to classification problems, since
classification is the main technique discussed so far for discovering the boundaries
needed to design a challenge. However, many other data-mining technologies, such
as clustering analysis, association rules, and point prediction, are useful across the
CRT exercise.
Similar to Fig. 3.5, let us assume that we collected data, through simulation or
real-world sensors, that indicate three categories of risk: high, represented with “x”;
medium, represented with “+”; and low, represented with a third marker, as in Fig. 3.12. In
this figure, we present two approaches to classification, termed “inner” (diagram on
the top) and “outer” (diagram on the bottom) classification. An example of inner
classification is discussed in [21], while an example of outer classification is
discussed in the famous C4.5 [29].
In inner classification, we attempt to provide an exemplar (also termed a
“prototype”) in each category. In the top diagram of Fig. 3.12, the exemplar is
presented with a large bold label in the middle of each group. When a new
observation arrives, we measure the similarity between the observation and the
three prototypes. We then assume that this observation belongs to the group with
maximum similarity. For example, if a customer has loyalty points of 20 and an age
of 25, this customer is closest to the prototype labeled “X;” thus, we will say that
this customer is a high-risk customer.
In outer classification, we attempt to find the boundaries between the classes as
demonstrated in the bottom diagram of Fig. 3.12. When a new observation arrives,
we will see within which area it falls and assign it the category of this area. This
matching process is performed using a decision tree, but other models exist, for
example, rule sets and artificial neural networks.
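The inner (nearest-prototype) approach can be sketched as follows. The prototype coordinates below are invented for illustration; only the query point (age 25, loyalty points 20) comes from the text.

```python
import math

# Sketch of "inner" classification: assign a new observation to the class
# of its nearest prototype. Prototype coordinates are assumptions made up
# for this example, not values from the book's figure.

prototypes = {
    "high":   (25.0, 22.0),   # (age, loyalty points) exemplar for "x"
    "medium": (45.0, 60.0),   # exemplar for "+"
    "low":    (60.0, 95.0),   # exemplar for the low-risk group
}

def classify(age, loyalty):
    def dist(proto):
        return math.hypot(age - proto[0], loyalty - proto[1])
    # Maximum similarity here means minimum Euclidean distance.
    return min(prototypes, key=lambda c: dist(prototypes[c]))

print(classify(25, 20))   # nearest prototype is the high-risk exemplar
```

An outer classifier would instead store the boundary itself (e.g. a decision tree) rather than one exemplar per class.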
In CRT, we attempt to estimate the boundary between classes to define the
boundary of challenges. Therefore, it is more appropriate to use an outer classifier.
We will limit the discussion to classification trees as an efficient means of
approximating the boundary between classes. In the remainder of this section, we
present a simple introduction to building a classification tree. Although many
algorithms exist on this topic, including CHAID [20], CART [8], ID3 [28],
C4.5 [29], LMDT [9], SPRINT [31], SLIQ [26], and QUEST [25], our discussion
will focus on C4.5 [29], a commonly used algorithm in industry.
3.5.1 C4.5
To represent the three labels (let us also refer to them as “classes” or “categories”) in the computer
in 0s and 1s, we need two bits. In this case, we can represent Z1 as 00, Z2 as 10,
and Z3 as 01. We are able to discover the number of bits required using a simple
mathematical concept known as a “logarithm,” or “log” for short. We will write the
log of a number x to the base y as z = log_y(x). In essence, the relationship between
the value of the log, x, and y is that y^z = x. Therefore, log2(1) = 0 because 2^0 = 1.
Given that we wish to represent these three classes in binary format using bits,
and each bit can take two values (0 or 1), the log needs to be of base 2. Given
that we have three classes, log2(3) = 1.584963. In our representation above, we
used two bits. However, the average number of bits we needed was 1.584963. This
quickly indicates that we used more than we needed; thus, the space used
to store the three classes is underutilized. This is obvious because we do not have
a class corresponding to both bits having the value of 1, that is, there is no class
encoded as 11.
The log gives us the average number of bits we need to use to encode data for
storage or transmission. Equally, we can use this idea to measure the information
content of a message or dataset. Assume we have a random sequence of the three
labels: Z1, Z2, and Z3. Given that the sequence is completely random, the probability
that we select a label from this sequence and correctly predict it
before identifying it is 1/3. Let us calculate the information content (entropy) of this
sequence; we will then explain what it means.
Entropy(S) = − Σ_{i=1}^{c} p_i log2(p_i)   (3.4)

where p_i is the probability that class i will appear in the sequence,
and c is the number of classes.
Therefore, in our example, the entropy is

−(1/3) log2(1/3) − (1/3) log2(1/3) − (1/3) log2(1/3) = 1.584
Now assume instead that the sequence is regular, with Z1 appearing 90 % of the
time. Because the data have more regularity, they become more predictable. In fact,
if we always predict Z1, we will achieve 90 % accuracy. In this case,
we do not need to transmit our prediction every time.
Entropy is a measure of the impurity of the data. If the data are impure and
completely random, entropy is at its maximum. We say that the information
content is very high and that we cannot find regularity in these data to reduce the
storage or the length of the message required to transmit them.
When the data are pure, containing the same information everywhere, we can
optimize the space required to store these data and we can easily predict their
contents.
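Eq. (3.4) and the two cases above can be checked with a minimal sketch; the 90/5/5 distribution stands in for the regular sequence discussed in the text.

```python
import math

# Minimal sketch of Eq. (3.4): entropy of a class distribution in bits.
def entropy(probs):
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Completely random sequence of three labels: maximum impurity.
print(entropy([1/3, 1/3, 1/3]))   # = log2(3), approximately 1.585 bits

# A more regular sequence (Z1 appears 90 % of the time) is more
# predictable, so its entropy is lower.
print(entropy([0.9, 0.05, 0.05]))  # approximately 0.569 bits
```

Lower entropy means fewer bits, on average, are needed to store or transmit the sequence.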
Using the same principles, we can design criteria to automatically detect
whether a condition used to split the data into two groups will succeed in improving
predictability.
Let us revisit Fig. 3.5. Before we discover the three hyperplanes that separate
the three classes from each other, the data are mixed, regularity is low; therefore,
entropy is high.
If we correctly discover the two hyperplanes that separate the three classes from
each other, we will end up with three groups of data. In each group, there is only
a single class, regularity is at its peak value, predictability in each group is perfect;
therefore, entropy is very low.
That is, suppose we are offered a discriminatory condition to split the data into two
groups. We can check the discriminative power of this condition by using
entropy to test the change in the impurity of the data. If the entropy before
the split is higher than the entropy after the split, we know that this discriminatory
condition split the data into more regular subsets.
C4.5 follows the same principle. It compares the entropy of the data before and
after a discriminatory condition is applied, and defines the difference as the
information gain (or, in normalized form, the gain ratio).
C4.5 defines Info(S) as the information embedded in a dataset S, with |S|
records and k classes (C1, …, Ci, …, Ck), as follows:

Info(S) = − Σ_{i=1}^{k} (freq(Ci, S) / |S|) log2(freq(Ci, S) / |S|)   (3.5)

where freq(Ci, S) represents the number of records in S belonging to class
Ci.
Given a criterion x that splits this dataset into n subsets {S1, S2, …, Sn}, we can
calculate the information content of the data after this split as follows:

Info_x(S) = Σ_{j=1}^{n} (|Sj| / |S|) Info(Sj)   (3.6)
Information gain is the difference in the entropy before and after the split:

Gain(x) = Info(S) − Info_x(S)   (3.7)

Because the gain favors splits into many small subsets, C4.5 normalizes it by the
split information:

SplitInfo(x) = − Σ_{j=1}^{n} (|Sj| / |S|) log2(|Sj| / |S|)   (3.8)

GainRatio(x) = Gain(x) / SplitInfo(x)   (3.9)
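Eqs. (3.5)–(3.9) can be exercised on a toy example. The six labels and the three-way split below are invented for illustration; a perfect split of three balanced classes yields a gain of log2(3) and a gain ratio of 1.

```python
import math
from collections import Counter

# Sketch of Eqs. (3.5)-(3.9) on an invented toy dataset.

def info(labels):
    # Eq. (3.5): entropy of the class frequencies in S.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_x(subsets):
    # Eq. (3.6): weighted information after the split.
    n = sum(len(s) for s in subsets)
    return sum(len(s) / n * info(s) for s in subsets)

def gain(labels, subsets):
    # Eq. (3.7): entropy before minus entropy after the split.
    return info(labels) - info_x(subsets)

def split_info(labels, subsets):
    # Eq. (3.8): penalizes splits into many small subsets.
    n = len(labels)
    return -sum(len(s) / n * math.log2(len(s) / n) for s in subsets)

def gain_ratio(labels, subsets):
    # Eq. (3.9): normalized information gain.
    return gain(labels, subsets) / split_info(labels, subsets)

data = ["z1", "z1", "z2", "z2", "z3", "z3"]
perfect = [["z1", "z1"], ["z2", "z2"], ["z3", "z3"]]
print(gain(data, perfect))        # log2(3), a maximally informative split
print(gain_ratio(data, perfect))  # 1.0
```

C4.5 evaluates candidate conditions this way and grows the tree using the one with the best gain ratio at each node.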
3.6 Big Data
“Big data” is a “buzz word” that will survive for many years. It is considered a buzz
word because it has no clear definition. However, it comes with a clear message:
there is an urgency to rethink how data are mined and analyzed. This urgency is
driven by technological, economic, and social factors [15, 30, 40].
On the technological level, different computer architectures exist that facilitate
the collection and storage of massive amounts of data. Examples of these architec-
tures include:
• Service-oriented architecture (SOA), a current industry standard that offers the
conceptual basis for the development of web services (see Sect. 2.5.1).
• Grid computing, which is very likely to be subsumed by the architecture
discussed below.
• Cloud computing, which provides the flexibility to store and process data
remotely in a distributed fashion across computers around the world.
On the economic side, the field of data mining offers the tools to process the
data. While organizations were striving to collect data as far back as the nineteenth
century, the data bloom occurred in the current century. The need to collect and store
massive amounts of data has long been recognized by many organizations, including
Walmart, Google, and governments (for reasons of national security, among others).
Rather than defining big data, the literature takes the approach of defining the
characteristics of big data. The letter “V” was selected to represent these charac-
teristics [19, 30]. The journey began with the 3 Vs (volume, variety, and velocity),
extended to the 5 Vs by adding veracity and value, and finally extended to the 6
Vs by adding variability.
Volume is about size, number of features, and number of instances. From
terabytes, to petabytes, and beyond, this massive size of data needs to be stored,
processed, and managed appropriately.
Velocity in data-mining language is about the changes occurring in the envi-
ronment and concept drift. The implication of dealing with streams of data is that
relationships and concepts underlying these data can, and will, change over time.
The analysis must be able to detect these changes, and adapt the knowledge learned
as the environment changes.
Variety reflects the heterogeneous nature of the data. As the transactions of
each customer are recorded, information about the customers needs to be linked to
understand the relationship between the type of customer and the type of transaction.
Simultaneously, information about the products and the supply chain need to be
extracted to understand whether the higher than usual expected demand can be
fulfilled, or the supply can be delayed in the presence of low demand. To compound
the complexity of the situation, unstructured text from newspapers must also be
analyzed to extract economic indicators and trends.
Veracity is about the truthfulness of the data. It can be encapsulated in two main
problems that need to be managed: trustworthiness of the source and noise. A great
deal of research is currently focusing on the area of estimating the trustworthiness
of a data source. Unfortunately, this is a self-defeating course of action. Once a
model for estimating trustworthiness is built, it can be used to deceive the system.
Noise, in contrast, is a very long-standing topic in data analysis and mining. It can
come in many different forms, including noise in the communication channel, noise
from approximations and rounding decisions, noise from ambiguity in
the representation language, and noise in perception.
Value (for money) is about the worthiness of a big-data decision: the cost-benefit
trade-off that an organization must research and evaluate before committing
its resources to decisions on big data. This is a long-standing problem in data mining
in general.
in general. For example, the board of a company wants to see a good business case
on why large investments need to go into the data-mining department, or what
is known in business terms as return-on-investment. This constitutes a catch-22
situation: it is not possible for analysts to evaluate the knowledge hidden in the data
to demonstrate value without having data-mining capabilities in place. However, to
have data-mining capabilities in place, the analysts need to demonstrate the value.
In such situations, organizations usually turn to the following principle: “start
small first; walk before you run.” Unfortunately, this often turns out to be a very
bad and unwise decision for data mining in many organizations. As the analysis
team starts small, with few capabilities, it discovers little information. Over time,
the team becomes occupied with this small amount of information and routine
reporting, while being pressured to continue demonstrating value with no investment
in place. It does not take long for the data-mining department to grow in size by
several more people, yet diminish in value as it becomes overloaded with classical
analysis tasks that could mostly be automated if investments became available.
Business cases for big data need to take a different form. It is not wise to invest
blindly, but it is also not the type of investment that can be executed incrementally,
from the bottom of the stairs up to the roof. The initial investment is indeed
significant; therefore, the decision must be very well researched. However, the initial
step should not be small under any circumstances; otherwise, the organization will
fall into the trap described above.
Variability is about changes in the format and data structure of the incoming data.
Many big-data problems rely on third parties for streaming the data. For example,
when mining online news, the news company may change the format of its website
to make it more accessible, attractive, or even more difficult for non-subscribers to
access. The big-data infrastructure needs to be able to accommodate these changes
rapidly and seamlessly. Some of this complexity can be reduced in the presence of
proper contracts between different parties, and meta-data standards.
The above discussion has not defined what a big-data problem is but has dis-
cussed how to recognize a big-data problem. The meaning of the word “big” is not
completely clear: How big is really big? However, this haziness is necessary because
as organizations accumulate more data, complexity will continue to increase and
the big-data problem will remain. Perhaps today the focus is about problems in
managing terabytes, but tomorrow the problem will be about exabytes, and in several
years, it will be about yottabytes.
Classical databases such as Oracle and DB2 are not directly suitable for handling big-
data problems. The primary reason is that these databases assume that the data
structure is known in advance. This assumption is very limiting in big data: we
may know the structure of the data we currently have, but we do not necessarily
know the structure of the data we are collecting now, or of the data we will
collect in the future. Moreover, most data in the big-data domain are unstructured.
This assumption is therefore one of the main reasons for the high costs that used to
accompany big-data applications in some large companies.
To manage big data, the storage of the data needs to be structure-free. The
structure needs to be left to the processing time; thus, the same data can be structured
differently according to the need of the application. This was the basic idea behind
Apache Hadoop [32, 34, 35], an open-source project to facilitate cloud computing.
At the core of Hadoop, the Hadoop Distributed File System (HDFS) provides
the software level to distribute data across the cluster. HDFS allows for redundancy
such that a node failure does not impact calculations. The primary question is how
to distribute the data and how to process such distributed data. This is the task of
the MapReduce model developed by Google. In 2008, Google claimed that it could
process 20 petabytes of data each day using MapReduce.
The idea of MapReduce is to provide the facilities to split (Map) and aggregate
(Reduce) data. That is, one component is responsible for taking the incoming data
and splitting it; HDFS then saves these data across the cluster. The other component
processes the data locally on different nodes in the cluster, then aggregates the
results to provide an answer. HDFS takes this answer and stores it back on the cluster.
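The map-then-reduce pattern can be illustrated with a toy, single-process word count. This is only a sketch of the programming model, not of Hadoop's actual API; real MapReduce distributes both steps across a cluster via HDFS.

```python
from collections import defaultdict

# Toy sketch of the MapReduce model: a map step emits (key, value)
# pairs from each input record; a reduce step aggregates all values
# that share a key. Everything here runs in one process.

def map_step(record):
    # Emit one (word, 1) pair per word in the record.
    for word in record.split():
        yield word, 1

def reduce_step(pairs):
    # Sum the values for each key.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

records = ["red team", "blue team", "red cell"]
pairs = [pair for record in records for pair in map_step(record)]
print(reduce_step(pairs))   # {'red': 2, 'team': 2, 'blue': 1, 'cell': 1}
```

Because each map call depends only on its own record, the map work can be spread across nodes; the shuffle then routes each key's pairs to a single reducer.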
The Hadoop project has many other tools, including Pig, a high-level parallel-
computation programming language and execution framework; Hive, an environ-
ment that transforms Hadoop into a data warehouse with a predesigned structure;
HBase, a column-oriented database that can hold billions of rows; and Mahout, a
library of data-mining algorithms, among others.
Hadoop provided the architecture to store and process big data. However, in
any real-world application where the processing of the data goes beyond simple
queries and correlation analyses, another type of architecture is needed. This is
the architecture that needs to support big-data mining. Discussions on related
architectures are covered in Sect. 3.6.4.
The term “real time” is another buzz word commonly used in today’s jargon.
Some see real time as fast computation, but the word “fast” is relative: Does it mean
completing calculations in a few minutes, seconds, or milliseconds? How fast is fast?
A real-time system is usually characterized with temporal and logical correct-
ness [16]. Temporal correctness sets a time constraint on the system to provide an
output, while logical correctness sets a constraint that the output must be correct
according to certain specifications.
In simple terms, a real-time system provides the user with the right answer
(logical correctness) at the right time (temporal correctness). Therefore, real-time
systems are not about being incredibly fast. They need to be sufficiently fast to meet
the time constraint without violating the correctness of the answer/response. In one
application, the response might be needed in milliseconds, while in another, the
response might be needed in several months.
The response time of a decision module or a system is the time needed for every
operation, from the moment the need for a response is established (request
time) to the moment the response is generated (delivery time).
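The two correctness conditions can be sketched as a check wrapped around a computation. The deadline value and the correctness predicate below are illustrative assumptions.

```python
import time

# Sketch: a real-time response is acceptable only if it is both logically
# correct and delivered within its deadline. Deadline and correctness
# check are invented for illustration.

def respond(request_time, compute, deadline_seconds, is_correct):
    answer = compute()
    delivery_time = time.monotonic()
    # Temporal correctness: the response arrived within the deadline.
    temporal_ok = (delivery_time - request_time) <= deadline_seconds
    # Logical correctness: the response satisfies its specification.
    logical_ok = is_correct(answer)
    return answer, temporal_ok and logical_ok

request = time.monotonic()
answer, ok = respond(request, lambda: 2 + 2, 0.5, lambda a: a == 4)
print(answer, ok)
```

A correct answer that misses its deadline fails just as surely as a fast wrong one; both flags must hold.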
suggested by JDL. For example, object assessment would generate data that are
added to the database; this can trigger another round of preprocessing. In addition,
some of these functions on one level of abstraction in the system would play a
different role on a different level of abstraction in the system.
The remainder of this chapter will be devoted to discussing the architectures
proposed in this book for the type of modeling required to provide the necessary
components and functions to produce CRT decisions.
Fig. 3.14 CRT1: level one preliminary computational red teaming system
Fig. 3.15 CRT2: level two preliminary computational red teaming system
Fig. 3.16 CRT3: level three preliminary computational red teaming system
In CRT2, learning normally happens on one team’s side, while the other team is
fixed. This is not the case in CRT3, as shown in Fig. 3.16, where both teams learn
together through reciprocal interaction. Both teams go through
the evolutionary learning cycle of evaluation, selection, and recombination. As each
team learns, the landscape of the evaluation process changes. Therefore, a strategy
adopted successfully in one step of the coevolution may fail in a later step as it
faces a better opponent.
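The CRT3 cycle can be sketched as a minimal coevolution loop. The one-dimensional strategies and the payoff function below are invented stand-ins for a blue-red engagement, not a model from the book.

```python
import random

random.seed(0)

# Minimal sketch of the CRT3 coevolutionary cycle: each team is evaluated
# against the other, the best half survives, and new strategies are
# recombined from the survivors. Payoff is an invented stand-in.

def payoff(blue, red):
    # Blue wants to be close to red's strategy; red wants distance.
    return -abs(blue - red)

def step(team, opponents, maximise):
    sign = 1 if maximise else -1
    # Evaluation: total payoff against the current opposing team.
    scored = sorted(team,
                    key=lambda s: sign * sum(payoff(s, o) for o in opponents),
                    reverse=True)
    survivors = scored[: len(team) // 2]          # selection
    children = [(random.choice(survivors) + random.choice(survivors)) / 2
                + random.gauss(0, 0.1)            # recombination + mutation
                for _ in survivors]
    return survivors + children

blue = [random.uniform(-5, 5) for _ in range(10)]
red = [random.uniform(-5, 5) for _ in range(10)]
for _ in range(20):
    blue = step(blue, red, maximise=True)   # blue learns against red
    red = step(red, blue, maximise=False)   # red learns against blue
print(len(blue), len(red))
```

Because each team's fitness landscape is defined by the other team's current population, a strategy that scores well in one generation can fail in the next, exactly the effect described above.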
3.7 Big-Data-to-Decisions Computational-Red-Teaming-Systems 145
Fig. 3.17 CRT4: level four preliminary computational red teaming system
CRT4 (Fig. 3.17) is the most advanced version of these preliminary models,
whereby social and individual learning are combined. As the society evolves,
individual agents also exhibit individual lifelong learning abilities.
CRT0 to CRT4 are typical forms of blue–red simulations. These systems have
certain limitations from a CRT perspective. They assume that the overall CRT
exercise can be captured in a simple simulation. Unfortunately, any practical CRT
exercise relies, as we have discussed so far, on the analytics of risk and challenge.
Even in these blue–red simulations, these two cornerstones exist, but they are
external to the simulation. Evolution is used to find a strategy with which one team
will win; thus, it can be seen as a weak form of a challenge, but the true meaning of a
challenge is not considered. As for risk analytics, this is left to the scientist who,
in the simple case, runs these simulations many times to estimate the risk of the
decision.
CRT is a very complex exercise. These simulations have been written in the
context of wargaming, but they are all far too simplistic to capture the real and
complex dynamics of a war. While they are useful in modelling some aspects of a
war at a certain level of abstraction, their results are confined to those aspects
and need to be incorporated into a wider context. As such, these preliminary systems
are only one tool that can be used within a real and wider red teaming exercise.
The scenario presented in Sect. 3.1.2 demonstrated the basic technical tools for a
CRT environment. The rest of this chapter will elaborate on these tools. However,
it is important to mention that CRT implementations can vary widely in the level of
sophistication and the type of science that needs to go into the development of such
CRT systems.
Table 3.2 shows the progressive stages of sophistication that the implementation
of a CRT system can go through, and the corresponding response capabilities of
each level. As the system progressively moves away from mere dependence on
subjective opinion to autonomous generation of scenarios combined with integrated
simulation, data-mining, and optimization capabilities, it improves its ability
to prepare the organization for the unknowns that may cause shocks and surprises.
This software program will output the information described above for any set of
inputs. The simulator is almost like a black box. The model resides inside this black
box. The pure job of the black box is to transform the inputs into the appropriate
outputs using the model residing within.
If every input above is known with certainty, all we need is the simulator: we
feed in the inputs and we will be able to determine the exact output. However, this
deterministic behavior is not useful in the real world. Usually, we can only estimate
the inputs. Many of these inputs are external to the aircraft, and we are not sure how
the aircraft will behave for different values of these external inputs.
For example, wind information is external to the aircraft. We may need to test
the behavior of the aircraft for different values of wind. This process is known
as “sampling,” whereby different values of all uncertain elements are generated
according to certain rules or distribution.
As was discussed before in Sect. 3.4, simulation is the ability to reproduce the
behavior of a system through a model. Simulation is the process whereby we sample
different inputs, use the simulator to generate the corresponding outputs, group the
outputs and attempt to understand how the aircraft responds to different inputs.
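The sampling loop can be sketched as follows. The fuel-burn model and the wind distribution below are invented placeholders standing in for the black-box simulator, not an aerodynamic model.

```python
import random
import statistics

random.seed(1)

# Sketch: sample an uncertain external input (wind) and push each sample
# through a black-box simulator, then group and summarize the outputs.
# The simulator below is an invented placeholder.

def simulator(wind_speed):
    # Placeholder model: fuel burn grows with headwind.
    return 1000.0 + 8.0 * wind_speed

# Sample wind from an assumed distribution (mean 20, std 5).
samples = [random.gauss(20.0, 5.0) for _ in range(10_000)]
fuel = [simulator(w) for w in samples]

# Aggregate the outputs to understand the system's response.
print(statistics.mean(fuel))    # close to 1000 + 8 * 20 = 1160
print(statistics.stdev(fuel))   # spread induced by wind uncertainty
```

Replacing the placeholder with a real flight model leaves the sampling loop unchanged: the simulator stays a black box that maps inputs to outputs.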
A number of questions arise when we examine Fig. 3.18. One is about
validation: How do we know that the chosen model and the behavior it exhibits
reflect the behavior of the real system? Another is about verification: Is the model
implemented properly (i.e. is the simulator a correct implementation of the model)?
The validation question is addressed in Fig. 3.19. In this figure, the behavior of interest of
the real system is observed and recorded. Similarly, the behavior of the simulation
is recorded. Patterns can be extracted from each set of observations, using
clustering analysis or other data-mining technologies, to deconstruct the output
from the simulation into similar groups.
The behavior of the aircraft on a simple level can be captured through recordings
of the state vector of the aircraft in the simulation and in the real world. State vector
information includes longitude, latitude, altitude, time and/or speed information.
Comparing these state vectors can tell us whether the simulated aircraft is following
an identical path to the real aircraft.
The previous comparison denotes one level of validation. However, it is too
simplistic: the aircraft in the simulation may simply be playing back the data it
received from the real world. We need to dig into the concept of behavior to validate
this simulation. One way to achieve this is to calculate fuel burn and the
change of flight weight after each transition in space (i.e. flight segment). We can
then compare the fuel burn and changes in flight weight against those observed
in the real world. We can conduct further research by comparing other aerodynamic
details.
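This per-segment comparison can be sketched as a simple tolerance check. All numbers and the 2 % tolerance below are invented for illustration.

```python
# Sketch: validate simulated behavior against the real system by comparing
# fuel burn per flight segment within a tolerance. All figures invented.

real_burn = [510.0, 495.0, 520.0, 505.0]   # kg per segment, recorded
sim_burn = [505.0, 498.0, 522.0, 500.0]    # kg per segment, simulated

def validates(real, sim, tolerance=0.02):
    # Accept the model if every segment agrees within the relative tolerance.
    return all(abs(r - s) / r <= tolerance for r, s in zip(real, sim))

print(validates(real_burn, sim_burn))   # True: within 2 % on every segment
```

Checking derived quantities such as fuel burn, rather than raw positions, guards against the simulation merely replaying recorded trajectories.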
Figure 3.19 addressed issues of validation, but not verification. The verification
process is internal, one the modeling team must undertake. It usually relies on
comparing the specifications of the model to the implementation in the simulator,
comparing inputs and outputs of the model through certain hand calculations or
other means with the outputs of the simulator, and other software-verification
techniques. The issue of verification is outside the scope of this discussion.
Figure 3.19 stops at the level of simulating an aircraft. This is similar to blue–red
simulation, where we stop at the level of having a simulation of the blue and red
teams to sample the response of blue and/or red to the actions of the other team (see
Sect. 1.7.4).
A natural question from decision makers, once we have the system implemented
in Fig. 3.19, is the following: if we are successful in imitating the real system, can we use this
imitation to test the system under conditions that we cannot test in the real world?
Further, can we use this imitation to reveal how we can optimize the real system?
Figure 3.20 adds an optimization loop, whereby the simulated system is used
to find the optimal solution for an objective function. For example, in the aircraft
example above, one can ask for the optimal trajectory that minimizes fuel burn
between a specific origin and a specific destination, given specific wind conditions.
In the question posed above, the optimization needs to take as input an origin,
a destination, and a wind profile, and provide as output, on a second-by-second basis,
the longitude, latitude, and altitude of the aircraft. Another possible output would
include factors such as the thrust level and flap settings, representing the settings
of the flight-management system used to fly the aircraft. We will continue with the first
type of output for ease of understanding. We will call this type of optimization
“behavior optimization” because the primary focus is placed on reproducing the
behavior of the system without necessarily constraining the model to be plausible
relative to the internal working mechanisms of the system itself. In simple terms,
we would like to optimize the behavior of the ants without necessarily paying
attention to whether or not the model used is a biologically plausible model of how
the ants make decisions.
There are many different methods by which this optimization can work.
One is for the optimization method to first generate a trajectory at random,
or use a previously flown trajectory from a real situation. The method then
systematically changes the output (i.e. changes the positions on the trajectory by
changing the speed of the aircraft and the flying angle) and measures the impact of
each change on the objective function. These changes need to be implemented in
small steps. The changes are accepted if they improve the objective function and
discarded if they do not.
This systematic search procedure is conducted by an “optimization algorithm” or
a “search procedure” as we discussed in Sect. 3.3. When no changes can be found
that improve the objective function, the search ceases.
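The generate-perturb-accept loop described above can be sketched as a simple local search. The altitude profile and the fuel-burn objective below are hypothetical stand-ins for the outputs of a validated flight simulation; only the accept-if-improved logic mirrors the text.

```python
import random

def fuel_burn(trajectory):
    # Hypothetical objective: penalize large altitude jumps between
    # consecutive waypoints (smoother profiles burn less fuel).
    return sum((b - a) ** 2 for a, b in zip(trajectory, trajectory[1:]))

def optimize_trajectory(trajectory, steps=5000, delta=0.1, seed=0):
    rng = random.Random(seed)
    best = list(trajectory)
    best_cost = fuel_burn(best)
    for _ in range(steps):
        candidate = list(best)
        # Systematically change the output in small steps: perturb one
        # interior waypoint, keeping origin and destination fixed.
        i = rng.randrange(1, len(candidate) - 1)
        candidate[i] += rng.uniform(-delta, delta)
        cost = fuel_burn(candidate)
        if cost < best_cost:  # accept only improving changes
            best, best_cost = candidate, cost
    return best, best_cost

# Start from a previously flown (bumpy) altitude profile.
initial = [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
optimized, cost = optimize_trajectory(initial)
```

When no perturbation within the step budget improves the objective, the profile has settled near a local optimum, which corresponds to the stopping condition described above.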
So far, we have been successful in optimizing the system indirectly through the
simulation. In fact, it is a great deal cheaper and safer to use the simulation for this
type of optimization. It does not make sense to perform this optimization on the real aircraft. Even if we did, we would spend many years validating the system, not to
mention the high risk associated with this. The simulation environment enables us to
run thousands of scenarios in a computer environment in a much smaller timeframe.
Figure 3.20 opened an opportunity for optimization. Figure 3.21 opens even more opportunities. If we were to validate the simulation environment as discussed (Fig. 3.19), could we then ask the simulation questions that we cannot ask of the real environment, to reveal new patterns?
For example, we may ask what type of behavior an aircraft would exhibit if we were to apply certain flight-management-system settings that are believed to be dangerous. These dangerous settings cannot be tested on a real aircraft in the real world.
Simulation mining is about applying data-mining techniques to data obtained from simulations, to indirectly extract patterns and generalizations about the real system. We are now moving into the space of fantasy in which we can execute
experiments in the simulated environment that we cannot do in the real environment.
We can extract information about the system of interest without touching the actual
system. The simulation environment acts like an oracle that can tell us what will
happen if we change the system in certain ways: it becomes the crystal ball that we
can use to query the system from a distance.
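As a toy illustration of simulation mining under these assumptions, the sketch below queries a hypothetical simulator over a grid of flight-management settings and then mines the resulting runs for a simple threshold pattern. The simulator, the two settings, and the 0.8 cut-off are all invented for illustration; a real study would apply richer data-mining methods to a validated simulation.

```python
import itertools

def simulate(thrust, flap):
    # Hypothetical toy simulator standing in for the validated aircraft
    # simulation: a run is "unsafe" whenever thrust exceeds 0.8.
    return thrust > 0.8

# Query the simulation over a grid of settings we would never dare to
# try on a real aircraft.
grid = [i / 10 for i in range(11)]
runs = [
    {"thrust": t, "flap": f, "unsafe": simulate(t, f)}
    for t, f in itertools.product(grid, repeat=2)
]

def best_threshold(runs, feature):
    # A minimal "mining" step: a one-dimensional decision stump that
    # finds the cut point best separating safe from unsafe runs.
    def errors(th):
        return sum((r[feature] > th) != r["unsafe"] for r in runs)
    return min(sorted({r[feature] for r in runs}), key=errors)

pattern = best_threshold(runs, "thrust")
```

The mined pattern (the thrust threshold) is knowledge about the real system obtained entirely from the simulation, without ever touching the actual aircraft.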
Despite all the levels of sophistication we have introduced thus far, we made one
dangerous and undesirable assumption: the model is fixed. What would happen if
through the validation process we discussed for Fig. 3.19, the behavior we obtained
from the simulation was consistent with the real-world behavior and then after some
time, the simulation began to drift away from the real world? We must remember
that the only constant in the real world is flux.
It is natural in many systems that the above phenomenon would happen. Not all
aircraft behave in the same manner and their performance degrades over time. If
we are simulating a manufacturing environment, it is likely that over time, workers
become more efficient in the work they do and the simulation underestimates this
efficiency.
We cannot rebuild the model every time a change in the real system occurs.
Equally, we cannot afford to continue using the simulation if it no longer represents
the real system. The solution to this dilemma is to use the data-mining loop for validation to reveal why the drift is occurring. The pattern that explains the drift can then be fed back to the model to change it adaptively and autonomously.
Figure 3.22 adds an error term that allows the model to change its parameters
to adapt to new situations. These adaptive search capabilities are very interesting.
Imagine now that we began with a rough model. In many complex real-world applications, such as modeling the dynamics of a government, to conceive of a good model is a non-trivial task. We may have a rough idea about the system that we use
to build a rough model. As the adaptive search capability enables us to recover when
the real environment changes, it can also be used to recover when the model is not
entirely accurate.
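The error-term loop of Fig. 3.22 can be sketched minimally as follows, assuming a one-parameter linear model and a least-mean-squares-style update. The drifting real system, the adaptation rate, and all names here are hypothetical.

```python
def real_system(x, t):
    # The real world is in flux: the true gain drifts at time step 500
    # (think of an aircraft whose performance degrades over time).
    return (2.0 if t < 500 else 3.0) * x

class AdaptiveModel:
    """A one-parameter model corrected by an error term, as in Fig. 3.22."""

    def __init__(self, gain=1.0, rate=0.05):
        self.gain = gain  # model parameter
        self.rate = rate  # adaptation rate

    def predict(self, x):
        return self.gain * x

    def update(self, x, observed):
        error = observed - self.predict(x)  # discrepancy with the real world
        self.gain += self.rate * error * x  # nudge the parameter toward reality

model = AdaptiveModel()
for t in range(1000):
    model.update(1.0, real_system(1.0, t))
```

The model starts rough (gain 1.0), converges to the real behavior, and then recovers on its own when the real system drifts, without being rebuilt.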
Our last level of sophistication is presented in Fig. 3.23. The adaptive search
capability is augmented with an optimization module to search for new models and
speed up the refitting of the model. Adaptation takes a long time and it may not
provide solutions that are close enough to the optimal behavior required. Moreover,
if we rely only on an error term to update the parameters of a model, and if the
change in the environment requires a structural change of the model (e.g. the original
model was linear, while the model needed after the change is quadratic), simple
adaptive mechanisms will fail. The adaptive search mechanism merely represents
incremental modifications of the model by augmenting it with the patterns extracted
from the data-mining module. In its simplest form, it can be considered a set of
exceptions that are added every time an exception occurs.
Relying on a simple adaptive search mechanism forever (sometimes referred to as "lifelong learning" for a learning problem, or "lifelong optimization" for an optimization problem) is not optimal. The exception list will grow very fast and get out of control, or simple parameter changes will not be able to approximate the changes in the nonlinearity of the relationship. Instead, it is preferable to use an optimization module to auto-generate new models, as in the case of using genetic programming or rule discovery to auto-generate a controller.
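The structural change described above (a linear model that must become quadratic) can be illustrated without full genetic programming: generate candidate model structures, refit each on the observed data, and keep the one with the lowest error. Everything below (the data, the two candidate structures, the closed-form fit) is a hypothetical sketch of that idea, not the book's implementation.

```python
def fit_simple(zs, ys):
    # Ordinary least squares for y = a*z + b, in closed form.
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    a = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) \
        / sum((z - mz) ** 2 for z in zs)
    return a, my - a * mz

def fit_structure(xs, ys, transform):
    # A candidate model structure = a choice of feature transform.
    a, b = fit_simple([transform(x) for x in xs], ys)
    return lambda x: a * transform(x) + b

def sse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))

# The environment has changed: the data now follows a quadratic law that
# no amount of linear parameter tweaking can approximate.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x * x for x in xs]

candidates = {
    "linear": fit_structure(xs, ys, lambda x: x),
    "quadratic": fit_structure(xs, ys, lambda x: x * x),
}
best = min(candidates, key=lambda name: sse(candidates[name], xs, ys))
```

Here the structure search correctly discards the linear model; a genetic-programming module generalizes this by evolving the candidate structures themselves rather than choosing from a fixed menu.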
By now, the loop is complete and the system is ready for one side of CRT.
We have an intelligent system that can correct the models it is using. It can use
simulation as its brain to think about the real system and play “what-if” scenarios. It
can discover information about the real system without touching the real system. It
can even discover how to influence and optimize the real system without revealing
itself to the real system.
It is important to emphasize that the system under investigation mentioned in the
example so far can be a physical, socio-technical, cognitive or human system.
In classical CRT, we are possibly more interested in socio-technical systems. Red
needs to challenge the thinking of blue. The RAA can be implemented regardless of
whether the system to be modeled is physical or socio-technical.
For example, if the simulation is about reproducing the behavior of a group of people in a social context, the details of the calculations performed for the aircraft do not fit this problem. In this case, we may rely on more sophisticated
forms of data mining. We may capture the networks of communication among
group members, identify the characteristics of this network, and compare these
characteristics in the simulated group and the real group. As we move into social and
cognitive simulations, the comparison between the simulated behavior and the real
behavior necessitates the use of more sophisticated forms of data-mining methods.
The risk analytics system is well suited to modeling human behavior. The adaptive search and optimization modules enable us to begin with rough models (i.e. initial hypotheses). The system will then refine these models over time through collection
of more intelligence and observations about the human system. The simulation
mining loop enables us to question the simulation instead of questioning the
humans; thus, we can reveal information without contacting the human system
we are observing. The optimization algorithm loop enables us to discover how to
influence the human system to achieve certain goals and objectives. Consequently,
hypotheses about the system become better descriptors of the behavior of the system in question.
The risk analytics system is the computational thinking machine to think red,
think about red, think for red, think blue, think about blue, and think for blue.
However, designing and developing the risk analytics components in practice requires in-depth technical know-how of these components (skills) and a high level of competency in synthesizing them.
As many agents as desired can be created for both the red and blue sides. While the overall
architecture will be the same, the data and models used by each agent might differ.
CRT does not stop at the level of building a smart computational environment of
a system. Recall that CRT is about designing deliberate challenges. Risk analytics
can design deliberate challenges with the architecture discussed thus far. However, the challenges will not be complete; to complete them, CRT offers the Shadow CRT Machine.
Definition 3.11. A shadow CRT machine is a computer environment that works in
parallel with an actual system, shadowing and monitoring its operations, projecting
ahead to create the space of possible future states of the system, and challenging any
negative risk that arises in that space by proposing appropriate responses.
Classical system models are categorized in two generic types: closed systems
and open systems. In closed systems, we draw a strict distinction between the
system and its environment. We assume that everything in the environment is
uncontrollable (exogenous variables). We can only control the variables within the
system (endogenous variables); clearly, our control of endogenous variables is not
unlimited. Endogenous variables would have constraints; we can only change these
variables while respecting these constraints.
Closed systems arise from the reductionist school in which a problem needs to
be decomposed into smaller subproblems. When modeling each subproblem, we
assume that the subsystem associated with the subproblem is a closed system.
In open systems, we bring some variables from the environment inside the
system. The modeler understands that a system is not situated in a vacuum.
In CRT, we will go beyond open systems. Perhaps we can refer to it as a
“wide-open” system approach. We consider the fact that every action from the
system influences the environment. Therefore, there is a degree of control (which
may be limited but is certainly not insignificant) that a system can exercise on its
environment.
Figure 3.24 presents the risk analytics for CRT. We can assume in this figure that
the environment is the red team, while the system is the blue team. However, in CRT
the overall architecture presented in Fig. 3.24 would be the architecture used within
a team. As every team attempts to represent itself and the opponent in its thinking
process, each team would need to have a representation of both red and blue and
mechanisms to evaluate decisions and reciprocal interactions.
The environment is modeled and represented explicitly, almost in the same way
the system is modeled. Both the system and the environment interact. However,
clearly, one objective of modeling the environment is to identify the environmental
forces that can be reshaped for the benefit of the system.
In Fig. 3.24, the human agent sits at the interface between the shadow
CRT machine and the external world for the blue and red teams. A real-world
CRT exercise will involve multiple pieces of analysis of nested nature. There can
be CRT activities within the CRT activities. It is also very likely that each team will
contain a number of humans.
This mix of humans and shadow CRT machines is depicted in Fig. 3.25 using
the CoCyS system. For this environment to operate symbiotically, each human and each machine is a thinking entity. They have different, but complementary, skills. The seamless blending of humans and machines as a single living organism in a CoCyS
system is a sophisticated goal. A demonstration of it will be covered in the third
case study in Sect. 5.3.
While this chapter covered the sophisticated roles different models can play within the shadow CRT machine, the human plays other forms of sophisticated roles.
The human has the daunting task of deconstructing messy and complex problems
into structured problems in a principled manner. The human needs to understand
how to deconstruct a complex organization like a socio-technical system or a cyber
system into building blocks that can be analyzed properly. The following chapter
describes some of these thinking tools that a human can use and rely on for CRT.
Chapter 4
Thinking Tools for Computational Red Teaming
4.1 Scenarios
The word scenario is commonly used across scientific fields. In experimental design, a scenario is an engineered situation to which the experimenter exposes the experimental units. For example, in psychology, if we want to examine
human behavior under stress, the designer may engineer situations in which a human
subject will feel stressed. These situations may be imaginary and deceptive, but they
are engineered to condition the subject to ensure that the phenomenon to be tested
is expressed.
In strategic planning, a scenario is usually an imaginary future. Each future represents a situation that a country or an organization may face. In
decision sciences, a scenario classically captures a possible set of changes that
might occur for the model. For example, a decrease in the assigned budget is a
scenario, and the analysts would like to understand the impact on performance if
the decrease were to occur. In engineering, a scenario is usually referred to as a
“test case.” These test cases represent the possible situations that a machine may
encounter in the future. They test the machine’s ability to perform under a wide
range of circumstances. In finance, a scenario is usually perceived as variations of
the financial position of a company or variations of the budget.
In all of the above, despite the fact that a scenario may take different forms
in different fields of science, the fundamental concept of a scenario remains
unchanged.
Definition 4.1. A scenario is a picture of how uncertainties may come together to
form a plausible set of forces (a context) that impact the performance of a system.
We avoided using the word "future" in the above definition, primarily because it confuses the concept of a scenario by making people assume that a scenario can only be about the future. In some analyses, we design scenarios to understand the past. If we can find the set of forces that shaped the context of a situation that occurred 20 years ago, we can explain the phenomenon that happened then. One simple reason to do this is that we may not have enough data to explain the phenomenon, as in the case of attempting to understand a market decision that occurred in the past.
The shape and form a scenario takes will differ across fields. It can be a story in strategic planning; an Excel sheet with the company budget in finance; a series of events to which to expose a subject in psychology; a range of values a parameter may take in an optimization model; or an idea that flies through one's mind about a question that may be asked in a job interview.
Scenarios are the language used to represent uncertainties. A scenario is a
plausible set of forces that the system might face. It is important here to emphasize
that we use the word “plausible” rather than the commonly used, inaccurate, word
“possible.”
A scenario is not a “possible” set of forces. Emphasizing what is possible and
what is not can cause a scenario designer to focus on the likelihood that something
will occur. Plausibility deals with the manner in which the internal events of a
situation can come together to create the entire situation logically, consistently, and
coherently.
The distinction between plausibility and possibility is important, as it shifts the
focus of the designer from thinking whether an event is possible to focusing on
the internal dynamics of the event and asking questions concerning which elements
need to come together to make this context plausible. This shifts the focus from
possibilities and probabilities, to reasoning, holistic coherence and overall logical
consistency.
Nevertheless, plausibility implies a non-zero probability and a possibility. Plausibility centers the analysis on inferring root causes rather than merely on occurrences. The difference between the two approaches of analysis lies in the angle taken for the analysis and the corresponding bias that can be generated.
Plausibility is akin to a “bottom-up” approach. We begin with the building blocks,
the basic forces that define and shape the dynamics of a situation. We then examine
how these forces can interact to condition a context. Some of these contexts may not be sufficiently plausible (weakly plausible) and are therefore excluded, while others are strongly plausible and are included. Through this approach, we conceive a
strongly plausible situation that we may have once considered less likely. However,
possibility alone may limit a designer’s focus to a local context, without considering
the overall logic of the scenario in the wider context. That is, possibility focuses
time because they are either not skilled enough to find a new job, they are scared of
making a change in their life, or they are not self-motivated. For me, I did not want
to fall into this trap. Instead, I wanted to demonstrate that I can move from one job
to another and that my skills are needed in different positions. I am now satisfied
that I have demonstrated my skills sufficiently in many types of positions, and am
now seeking a position where I can achieve more stability in my life and spend the
rest of my career giving to one company. I feel I have demonstrated sufficiently to
myself that I can move between jobs, and I now need to demonstrate to myself that
I can stay in one place for a long time.”
In this answer, John not only answered the question that Amy and Martin saw as a possible problem for him, but he also answered it in a manner that will make Amy and Martin question their hypothesis: the hypothesis that people who stay in one place for a long time are better than people who move from one
job to another. John played with Amy and Martin’s uncertainties and injected new
uncertainties in their minds about other applicants that they may have favored.
Overgeneralization can be a problem in CRT, and this example is no exception.
John may have opened a can of worms by answering in the manner described
above. The assumption here is that John did not merely imagine what Amy and Martin think and respond to his imagination; rather, he had carefully studied Amy and Martin (perhaps through his experience with them), and only then could he make an informed decision on how to play with their uncertainties.
John has learned how to execute CRT correctly. In the scenario he designed in his mind, he coupled two uncertainties: the uncertainty that he may be facing and does not control, and the uncertainty of the other team that he can design, influence,
shape, and even control. Effective representation of scenarios in CRT requires
mechanisms to capture the interaction of blue and red uncertainties simultaneously.
However, how can we design and capture these scenarios? The rest of this section
provides insight into this question.
Traditionally, scenarios are designed through many different methods that range
from being completely ad hoc to very systematic. We will use one systematic
method in particular to explain one of the classical scenario-design methods, which
is termed Field Anomaly Relaxation (FAR) [3].
Let us first explain the FAR technique for generating scenarios. In the 1970s,
Rhyne developed FAR as a method that does not restrict the analysis to quantifiable
factors; it draws on insight and judgment, provides an audit trail, produces a range
of explicable scenarios, and begins and ends with short essays.
FAR relies on morphological analysis [4, 5], which in simple terms means identifying the basic independent building blocks defining a structure and the non-overlapping values each building block can take, and then generating all plausible combinations of these values. For example, if we say that a face is
comprised of building blocks such as a nose, two eyes, two ears, and a mouth, we
can then create categories of all possible shapes a nose may take. The same can be
executed for the shape of an eye, mouth, and ear. In FAR, the building blocks are
termed “sectors,” and the values each building block can take are termed “factors.”
Hence, an eye is a sector and an almond-shaped eye is a factor.
If we generate all possible combinations of shapes for eyes, ears, mouths, and
noses, we can enumerate all possible shapes of a face that we would ever encounter.
Similarly, we can enumerate plausible futures that we have not experienced.
FAR begins with a story that describes a strategic future. From this story, and through brainstorming, sectors are extracted. For each sector, the possible factors are defined. The superset of all possible combinations of these factors is generated.
Some combinations may not be plausible; these are eliminated from the set. The
superset now only contains the plausible combinations of factors. These can then
be grouped together, ordered in an evolutionary path, and a tree describing how
possible futures may unfold is constructed. Each path in this tree is a possible
sequence of situations that may unfold to uncover different futures. This unfolding
process is written in the form of a story. As such, FAR begins with one story and
ends with another.
A simple example is the following. Let us assume that Manysoft faces a challenge from Minisoft. Manysoft does not know whether it should expand its workforce. A brainstorming session on this situation identified two sectors:
market stability and resource pressures. The market can either be stable or unstable.
Resource pressures can either be low or high. Therefore, in FAR language, each
sector has two factors.
The superset of factor combinations can then be defined as follows:
S1: stable market, low resource pressures
S2: unstable market, low resource pressures
S3: stable market, high resource pressures
S4: unstable market, high resource pressures
It is clear in this example that all four combinations are plausible futures that
the company may face. If we assume that S1 is the current situation, we can order
these scenarios. For example, from S1 we can move to either S2 or S3, and then to S4. It may not make sense in this domain that suddenly resource pressures
become high, and the market becomes unstable. It may be more logical to assume
that stability of the market changes and consequently resource pressures change.
Alternatively, resource pressures change, then the stability of the market changes.
We can then write two ways in which the future may evolve:
S1 → S2 → S4
S1 → S3 → S4
FAR then suggests that a story is written for each future path.
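Using the Manysoft example's sectors and factors, the FAR steps (generate the superset, filter for plausibility, order the survivors into evolutionary paths) can be sketched as follows. The `is_plausible` filter is a trivial placeholder for the domain judgments a real study would encode.

```python
import itertools

# Sectors and the factors each can take, from the Manysoft example.
sectors = {
    "resources": ["low pressure", "high pressure"],
    "market": ["stable", "unstable"],
}

# Step 1: generate the superset of all factor combinations.
superset = [dict(zip(sectors, combo))
            for combo in itertools.product(*sectors.values())]

def is_plausible(combo):
    # Step 2: eliminate implausible combinations. In this small example
    # every combination is coherent, so nothing is removed; a real FAR
    # study would encode domain judgments here.
    return True

scenarios = dict(zip(["S1", "S2", "S3", "S4"],
                     [c for c in superset if is_plausible(c)]))

# Step 3: order the plausible situations into evolutionary paths, each
# of which is then written up as a story.
paths = [["S1", "S2", "S4"], ["S1", "S3", "S4"]]
```

The combinatorial generation is mechanical; the judgment in FAR lives entirely in defining the sectors and factors, the plausibility filter, and the ordering of situations into paths.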
FAR is based on morphological analysis and as such, would fall into the trap of
any method that attempts to capture everything. For example, the above describes
the basis of FAR using an example of enumerating all possible shapes of faces that
one can encounter. What this example did not consider is that, through interaction with the environment, the morphology of the face can change through mutations in the genes: a face may contain only one eye, or even three eyes. This may not be imaginable even through a well-designed brainstorming session. FAR assumes that the structure of the face as we know it is not subject to change.
However, the real problem in these methods is that they stop at the morphological
level. It is important to understand that the real building blocks defining these
morphologies are not just phenotypic building blocks, but also genotypic building
blocks.
For example, when designing a strategic scenario, economic and political stability are two examples of typical sectors considered with FAR. They are what we term "phenotypic building blocks." What drives these building blocks at the genotypic level might be culture, education, and natural resources. Culture may shape how
disagreement is resolved within a particular country, and whether violence is
pertinent in the manner in which conflict is resolved. Education may shape how
strategy is formed and whether the country advocates lateral thinking or a classical memory-based and obedience-based education system. Natural resources
are fundamental enablers to any economic growth.
On one level, phenotypic building blocks are essential because they hide more
complex details and interactions; thus, they make discussion of the scenarios
manageable. On another level, precisely because they hide essential interactions,
we need genotypic building blocks to understand a scenario more clearly despite
the fact that they will come with a complex space of possibilities. Therefore, both
levels are needed, and in fact, more levels can be needed.
A phenotypic building block relies on the deconstruction of how a system
appears to an observer. A genotypic building block relies on a deeper analysis and
understanding of the true forces that make the system appear as it does. A gene
can be mutated and this can either mutate the structure of a face or a morphological
building block of a face.
Morphological methods ignore the fact that there can be many other layers on top
of a face. They ignore that a mask can be worn to hide facial components, or that
make-up can be used to reshape an eye into a shape that is not natural or
biologically possible, but is nonetheless plausible.
FAR demonstrates a simple, efficient, and very effective methodology for developing
scenarios. However, it misses one critical element in the context of a CRT
scenario: the evolution of blue’s future depends on the evolution of red’s future,
and vice versa. Blue should not define sectors and factors that are all independent
of red’s objectives. Doing so can waste considerable resources discussing forces
that sound right but are not plausible for red.
4.1 Scenarios 165
In our previous example, the scenarios were designed for the wrong aim.
Manysoft does not know whether it should expand its workforce. However, what
went wrong in the design of the scenarios above is that Manysoft did not consider
that Minisoft is its main competitor and may be asking the same question as
Manysoft, or a question whose answer conflicts with Manysoft’s. Designing
the scenarios in the manner described above assumes that Manysoft responds
passively to the uncertainty in the environment. However, in CRT, a scenario needs
to capture the essence of CRT, that is, a continuous reciprocal interaction between
two entities.
A blue scenario in CRT needs to capture blue uncertainties as well as red
uncertainties. Blue’s uncertainties can be uncontrollable for blue. However, blue
can control some aspects of red uncertainty. Figure 4.1 presents the basic unit of a
building block that defines a scenario within CRT. The sign on an arrow reflects the
nature of correlation or the behavior of the interdependency relationship. A positive
sign means a positive correlation (an increase in one factor would increase or
positively enforce the other), while a negative sign means negative correlation (an
increase in one factor would decrease or negatively enforce the other).
While traditional definitions of scenarios focused on factors representing sources
of uncertainty, in CRT, a scenario is comprised of building blocks. For example, the
fundamental unit of each building block for blue is comprised of four components:
outcome/effect, objective, blue uncertainty, and the portion of red uncertainty that
blue can influence.
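The fundamental unit described above can be sketched as a small signed influence graph. The following Python sketch is our own illustration, not the book’s formalism: the node names and the one-step propagation rule are assumptions, while the edge signs follow the conventions discussed in the surrounding text (outcomes positively influence objectives; blue’s uncertainty negatively impacts blue’s outcomes; blue’s outcomes negatively impact the portion of red’s uncertainty that blue can influence).

```python
# Sketch (not the book's notation) of a CRT scenario building block as a
# signed influence graph. Node names and propagation rule are illustrative.

class InfluenceGraph:
    def __init__(self):
        self.edges = {}  # (source, target) -> +1 or -1

    def link(self, source, target, sign):
        self.edges[(source, target)] = sign

    def propagate(self, node, delta):
        """Push a change at `node` one step along its outgoing signed edges."""
        return {target: sign * delta
                for (source, target), sign in self.edges.items()
                if source == node}

# The four components of blue's building block:
g = InfluenceGraph()
g.link("blue_outcome", "blue_objective", +1)    # outcomes support objectives
g.link("blue_uncertainty", "blue_outcome", -1)  # uncertainty degrades outcomes
g.link("blue_outcome", "red_uncertainty", -1)   # outcomes press red's uncertainty

print(g.propagate("blue_outcome", 1))
```

Propagating a positive change in blue’s outcome one step shows the intended asymmetry: blue’s objective is reinforced while red’s uncertainty is pushed down.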
Before we progress, we should discuss what may be the most controversial of the
four components: the inclusion of objectives in scenario design. Traditionally,
scenarios focus on uncertainties. These uncertainties exist in the environment and
are uncontrollable, and therefore the objectives of the system
166 4 Thinking Tools for Computational Red Teaming
are not considered when designing the scenarios. Remember that in CRT the two
teams have conflicting objectives. Therefore, one team’s objectives are a threat to
the other team. Blue needs to consider its objectives in its scenario design simply
because blue’s objectives are the source of uncertainty for red, and will impact how
red’s uncertainties are developed. Blue wishes to generate an outcome to achieve its
objectives. To use another example, John’s objective is to earn money. Getting the
job is the outcome John wishes to achieve from the interview, in pursuit of his objective.
As such, blue’s outcomes should always positively influence blue’s objectives. If
a blue outcome is designed to negatively influence one of blue’s objectives because
it may have a more profoundly negative effect on red’s objectives, blue needs to
redefine its objective to make this intentional negative influence positive. That is,
if blue will accept a little damage to itself to generate greater damage to red (as
in the case of a company losing some market share to influence the market
and force the competitor to lose more), blue needs to redefine its objective
in terms of red’s loss (i.e., a positive objective for blue). Therefore, we will always
assume that blue’s outcome is designed to influence blue’s objectives
positively.
The uncertainty facing blue will always negatively impact blue’s outcomes.
Meanwhile, blue’s outcomes need to negatively impact red’s uncertainty. Before the
above statements generate skepticism, we need to note that these sentences are only
valid within the context of designing the basic building block of a scenario. With
scenario design, it is natural to focus on the negative impact of blue’s outcomes on
red’s uncertainties. The execution of a scenario may produce the opposite effect.
Moreover, blue may design strategies with the opposite effect. Still, there is no logic
that would justify why blue would generate an outcome to help red, unless at the
end of the process of helping red, there is a trap for red or a large gain for blue.
The building block of a scenario in CRT consists of the elements presented in
Fig. 4.1.
Interestingly, this figure demonstrates that the central point for a scenario remains
the space of uncertainty for blue. However, it emphasizes that this space depends on
blue’s objectives and red’s uncertainties. In fact, blue’s uncertainty can be great.
Blue’s objectives scope blue’s uncertainties. Only the uncertainties that impact
blue’s objectives are relevant here. This bounds the uncertainty space for blue.
Conversely, red’s uncertainties may expand the uncertainty space of blue.
Red may inject more uncertainty into blue’s uncertainties than necessary. For
example, red may embark on deceptive operations that appear to be real solely to
increase blue’s uncertainties or shift blue’s attention to other desired uncertainties.
If blue does not consider the interdependencies between its objectives and uncer-
tainties with red’s uncertainties, blue may fall into the trap of underestimating its
own uncertainties or prioritizing its uncertainties and objectives incorrectly.
Figure 4.2 does not present red’s objectives. However, Fig. 4.3 does present
these to emphasize the symmetric nature of scenario building blocks in CRT.
Red’s uncertainties would be impacted by blue’s uncertainties, as would red’s own
objectives. In essence, red and blue do not necessarily have complete access to each
other’s objectives. Even if they do, while the objectives are in conflict, the main
“direct” interaction between red and blue scenarios is through the uncertainty space,
not the objective space.
In the second form, we can maintain two nodes only: blue’s outcomes and red’s
uncertainties. Risk for blue is defined as the impact of blue’s uncertainties on blue’s
objectives. As such, blue’s outcomes are blue’s risk, which can be both negative
and positive. Consequently, this second form emphasizes that a blue scenario can be
defined in terms of building blocks, where each building block takes the form of a
risk for blue and how red’s uncertainties impact that risk.
It is important to explain why red’s uncertainties are not considered part of blue’s
risk. The main reason is that red’s uncertainties may be certain for blue, blue may
inject its own designed uncertainties in red, blue may shape red’s uncertainties, and
it might even be the case that red’s uncertainties do not impact blue’s uncertainties
at all.
For example, John’s risk in getting the job consists of John’s uncertainty of
not knowing how he will perform in the interview, and what questions he will be
asked, and whether he will achieve his objective of getting the job. The company to
which John is applying has different uncertainties: whether the right candidate for
the job will apply, whether the selection committee will make the right choice and
detect the right applicant, and whether the right applicant, if selected, will accept
the job. John can play with the company’s uncertainties even if he is not the right
applicant. John may demonstrate that he is keen to get the job, or that he is loyal
and it would be cheaper to hire him and train him a bit more to become the right
applicant than hunting for the right applicant for whom there is high demand in the
market. John needs to consider the company’s own uncertainties when designing
his own scenario space.
1. Sometimes this sentence is misinterpreted to mean that bad means can be forgiven if the ends are good.
4.2 A Model to Deconstruct Complex Systems 169
Definition 4.2. A strategy is the “ways” in which we use the “means” (resources
and capabilities) to reach and achieve the “ends” (objectives and goals).
Definition 4.3. Strategic thinking is the creative process used to design and connect
the means, ways and ends.
The above definition of strategy stems from other works such as [1, 2]. In CRT,
this thinking needs to be about both red and blue. From a blue team’s perspective, it
is important to break the box and think creatively about how to achieve its objectives.
In doing so, the blue team needs to consider how to force the red team into a box
that is strategically important for blue. It is beneficial for blue that red’s thinking
is within a box; it is detrimental for blue to have a box of its own during strategic
thinking. From the red team’s perspective, it is beneficial for red that blue’s thinking
is within a box; it is detrimental for red to have a box of its own during strategic
thinking.
This race to shape a box for the opponent and break a team’s own box is what we
will term here a “thinking strategy.” When this thinking is guided with appropriate
risk analysis, we will term it “thinking risk.”
4.2.2 Resources
This first level presents opportunities for controlling an organization. Any organiza-
tion, even an entire country for the sake of the argument, has limited resources. In
most situations, one of these four categories is more limiting for an organization
than the others when compared with a competing organization.
For example, let us assume that for Minisoft, “people” is the resource that is most
scarce of the four basic resources. Promoting certain activities in Minisoft would
mean the organization would be forced to shift people from one area to another.
For example, Manysoft may simply leave a portion of the market focusing on client
support untouched. Minisoft sees this portion of the market and attempts to profit
from it. However, the limited people resources available in Minisoft would force the
company to shift some software programmers to customer support. This reduces
the people available for software development.
Fig. 4.4 Schematic diagram displaying how the blue and red teams are connected strategically as
a system
4.2.3 Fundamental Inputs to Capabilities
The second level comprises fundamental inputs to capabilities [1]. These are the
building blocks that require synthesizing resources within an organization to
establish a capability.
A capability is the capacity to achieve an objective [1]. Within an airline, flying is
a capability. It requires a number of fundamental inputs to capabilities such as
supplies, collective training, and personnel. Each fundamental input to
capability requires different mixes of all resources. For example, collective training
as a fundamental input to capabilities is concerned with enabling groups such
as air crew to train together so that they can understand how to communicate
effectively with each other; how to overcome misunderstandings that arise because
there are people with different specialties such as pilots and air stewards; and how
to synchronize actions and roles in emergency situations. Collective training is a
means to achieve interoperability in human interactions.
Collective training requires the four resources. It requires a place for the training
to be conducted: land; it requires capital and investments; it requires people to
conduct the training (trainers); and it requires knowledge in the subject matter to
make the training meaningful and effective.
Collective training alone does not create a capability for an airline. However,
it is an essential building block for flying capability. Nevertheless, one may find
a company specializing in collective training where collective training in that
company is a capability in its own right. As such, a fundamental input to capabilities
in one system may be a capability in a different system; in the same way, a
component in a system is a system in its own right.
The aircraft itself is a fundamental input to capability known as the platform. An
aircraft on its own will not fly. It needs supplies of fuel, crew (air or ground), and
many other elements before it delivers a flying capability.
Now, imagine someone donating five different types of aircraft to an airline. At
first, this seems great. However, scrutinizing its impact on fundamental inputs to
capability can show that it is a damaging occurrence. The airline needs five
experts in different subjects to manage the five different types of aircraft. It needs to
establish different maintenance regimes and technical skills to cover its fleet
portfolio. In fact, in this scenario, the airline will stretch itself thin to the point of
possible collapse.
Fundamental inputs to capabilities are the second knob to control or influence
a system. While the system may have resources, shifting the resources from one
fundamental input to capabilities to another would create a gap and the capability
will not materialize.
4.2.4 Capabilities
The third layer contains capabilities. Each capability is designed to deliver functions
to achieve effects or outcomes for the organization. If one cannot influence resources
or fundamental inputs to capabilities, the capabilities will come into existence. What
one then needs to focus on is whether the functions that these capabilities will
perform are controllable.
For example, assume a flying capability. The organization establishes the
capacity to fly aircraft. The functions can be flying domestically or internationally,
carrying passengers or cargo. A competing airline can shape the market so that it
focuses on international flights, leaving the domestic market for a different airline.
For that second company, while the fleet has the capacity to fly internationally, the
market is reshaped such that performing this function is not a wise move.
As we approach the final two layers, effects/outcomes and strategies, an external
entity will need to exercise a different type of control. We may not be able to stop
an outcome if the resources, fundamental inputs to capabilities, capabilities, and
functions are uncontrollable. Instead, we need to manage the effects in one way or
another.
To manage effects, we need to introduce the concept of a network, then discuss
the operations that one can achieve on networks. Understanding these operations
will demonstrate how an effect can be managed. How to control or influence this
network of effects is the basis for the following section. Before we progress in this
topic, we need to continue explaining the remaining layers in our high-level model.
Operations on networks and their use for influencing effect spaces are discussed in
Sect. 4.3.
4.2.5 Vision, Mission and Values
The layer at the far right-hand side of the diagram captures the vision, mission
and values of an organization. Vision denotes the long-term goals an organization
needs to achieve. Mission is what the organization is about: spelling out in concrete
terms the intermediate goals, way-points, and performance indicators that the
organization needs to achieve to reach its vision, and the set of functions that the
organization needs to perform to be able to achieve these goals.
Values represent the boundaries of behavior for both the organization and its
employees that should be maintained while making decisions and searching for
solutions.
Vision, mission and values are designed by the board to provide the organization
with a coherent sense of direction, focus, and culture.
4.2.6 Strategy
Between the effects layer and the vision, mission and values layer, the strategies
layer designs the “ways” to connect the “means” (i.e., all layers on the left-hand
side of the strategy layer) to the “ends” (i.e., the layer on the right-hand side of the
strategy layer) [1]. The role of a strategy is to understand the “hows” of, and risks
in, transforming and translating:
• the goals into outcomes and effects that need to be met to demonstrate that the
goals have been achieved
• the outcomes and effects into functions that need to exist to enable the successful
achievement of these effects
• the functions into capabilities or integrated functioning systems with the capacity
to perform these functions to deliver the required effects
• the capabilities into fundamental inputs to capabilities or the building blocks
required to have an integrated functioning system
• the fundamental inputs to capabilities into required resources that need to be
synthesized to produce each fundamental input to capability
• an overall framework to ensure that all layers on the left-hand side of the
strategies layer and all strategies are linked in an efficient, coherent, cost
effective, and meaningful manner to achieve the mission, vision and values on
the right-hand side.
While a strategy on a strategic level would stop at translating the vision, mission
and values into effects, in each level of the organization, a strategy needs to be in
place to understand the ways a layer on the left-hand side would deliver the goals
and objectives of the adjacent layer on the right-hand side.
While objectives and goals move from right to left (descending in the
organization from the strategic level to the tactical level), constraints on achieving
these goals usually move from left to right (ascending in the organization from
tactical levels to the strategic level). A strategy between two layers ensures that the
objectives will be achieved despite the constraints, or that the objectives will be
reshaped to account for the constraints.
Each layer in this model for the deconstruction of a complex organization
provides opportunities for both blue and red to influence the organization. If this
is a red organization, red can use this model to identify problems and any force that
is interfering with the goals of the organization. Meanwhile, blue will focus on
designing forces to influence and shape red’s organization. Red in this case has a
much more complex problem than blue: it is sufficient for blue to focus on one layer
to influence the overall red organization, while red has the daunting task of ensuring
that all layers are functioning well to achieve its objectives and goals.
contexts than in pure technical contexts, where the concept of social engineering
would be more relevant and applicable.
Before we proceed with a discussion on network operations and how they can
contribute to the fields of social engineering and Cyber Security, we need first to
define what we mean with Cyber. This is important because it will make it clear
why “networks” are at the center of any Cyber Security operation.
Traditionally, one would discuss the information domain rather than the cyber
domain. Recently, the cyber domain has been emerging as a more encompassing
concept. However, the definition of “Cyber” is somewhat confusing. For computer
scientists, it has been seen as another form of computer security, despite the fact
that the Cyber space extends well beyond classical computer security issues, such
as cryptography and network security, to issues where an understanding of complex
systems, network theory, and psychology is paramount.
In the military, the word “Cyber” has also been confusing. The military has
been conducting operations in the electromagnetic spectrum, such as electronic
warfare operations, for decades. Therefore, the Cyber space is not as new a concept
to the military as it is a buzzword in the civilian world. This begs the question of
whether the word Cyber should be differentiated from the electromagnetic
spectrum.
Within science, the field of Cybernetics is perhaps the oldest scientific field that
uses the word Cyber in its title and roots. One of the main journals in this field is
the IEEE Transactions on Cybernetics, which used to be called IEEE Transactions
on Systems, Man and Cybernetics, Part B: Cybernetics. The scope of this journal
emphasizes papers on “communication and control across machines or machine,
human, and organizations” (Source: IEEE Transactions on Cybernetics, Aims and
Scope Statement).
The above is a simple demonstration of how confusing the word Cyber is.
Many people would claim to know what it means, but it is far more difficult to
define it precisely in a way that truly distinguishes its meaning and use from those
of other disciplines.
Figure 4.6 shows a conceptual diagram to explain the Cyber space. The starting
point is the physical infrastructure that supports the Cyber space, which contains
elements such as the physical backbone of the internet, servers, routers, signal
receivers, signal transmitters, and satellites. These elements are necessary for the
existence of the Cyber space. Together, they form different physical networks.
Within an organization, a local area network connects the organization’s internal
computers together, but does not necessarily connect the organization with the
external world. To protect the physical infrastructure, it is important to understand
physical security. For example, access control to the server room in an organization
is a type of physical security.
4.3 Network-Based Strategies for Social and Cyber-Security Operations 177
Fig. 4.6 An outline of the building blocks for the Cyber space
The physical layer provides the physical infrastructure that allows the generation
and propagation of electromagnetic signals. Signals can be seen as the water flowing
in a river, carrying things such as fish and boats. Information is the equivalent of the
fish and boats; that is, signals carry information.
The electromagnetic flows form another type of network, known as the logical
network. Nodes in the logical network do not necessarily correspond (and mostly
do not correspond) to nodes in the physical network. This network requires a
different type of protection from physical protection. Securing communication
channels is one type of communication security required to protect the logical
network.
Signals carry, or encode, information. This information can be text carried in
an SMS message, voice in telephone calls, email messages in a computer network,
or videos on the internet. This information forms another type of network, which
we will call the information network. An information network connects pieces of
information together. For example, a database of customers in a supermarket,
connected to a database of financial transactions of credit cards, can provide the
network required to understand loyal customers or the behavior of customers across
a sector. Information security is the classical focus of computer security.
The Cyber space is made up of the components shown in Fig. 4.6. Therefore,
Cyber is the space spanning any flow in the electromagnetic spectrum. This flow
can be a flow of information in the form of bits, regulated with electrical signals,
as in the case of pieces of data moving from one computer to another. This can
be observed in the logistics company that was transmitting orders from customers
to the decision-support unit installed in vehicles so that drivers would act on these
orders. The flow can be a flow of signals as is used in a Global Positioning System
(GPS) in transmitting vehicle positions in real time so that the logistics company can
monitor and adequately optimize the use of its fleet. The cyber domain is currently
evolving to represent the space in which complex flows occur in the electromagnetic
spectrum. Injecting a virus by a rival company into the computer system of
the logistics company, intercepting the GPS signals produced by the vehicles of
the logistics company by rival companies, or jamming the communication lines
between the logistics company and its fleet are examples of offensive operations
in the cyber domain to generate a cyber effect. Examples of operations to generate
positive effects for an organization using the cyber domain include marketing the
company in online social networks, establishing a space for the company in a virtual
game such as Second Life, and using emails to announce discounts and special offers.
Definition 4.4. Cyber space is formed from all flows regulated by the electromag-
netic spectrum.
Definition 4.5. Cyber security is the business processes and tools needed to protect
any flow in the electromagnetic spectrum.
Definition 4.6. Cyber operations are any sequence of activities conducted in the
electromagnetic spectrum with the intent to achieve one or more effects.
As demonstrated in Fig. 4.6, networks are the basis for the Cyber space.
In fact, security cannot be claimed in any Cyber subspace unless the three types of
networks shown in Fig. 4.6 are secured. Therefore, it is paramount to categorize
network operations, since the Cyber space is likely to be part of most red teaming
exercises in any large organization.
There are many network operations that need to be discussed and understood by the
teams conducting a CRT exercise. The technical details on how these operations are
conducted are context and exercise specific, but an understanding of these categories
is essential for members of the CRT exercise (Fig. 4.7).
The main challenge in managing a network is that connections between nodes
create interdependencies that make it difficult to manage consequences. A change
in one node may propagate undesirable effects through the overall network, or even
cascade and generate a massive blackout in the overall system; for example,
cascading failures in power networks or cascading anger in a social network.
This challenge creates many opportunities to manage, shape, or break down a
network. Each of these opportunities will be explained below as an operation that
can be conducted on a network.
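The cascade dynamic described above can be illustrated with a short simulation. This sketch is our own illustration, not a model from the book; the threshold rule (a node fails once enough of its neighbours have failed) and all names are assumptions.

```python
# Illustrative sketch of how a single node change can cascade through an
# interdependent network, as in power-grid blackouts. Threshold rule assumed.

from collections import deque

def cascade(adjacency, thresholds, start):
    """Fail `start`; a node fails once its count of failed neighbours
    reaches its threshold. Returns the set of failed nodes."""
    failed = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour in failed:
                continue
            failed_neighbours = sum(1 for n in adjacency[neighbour] if n in failed)
            if failed_neighbours >= thresholds[neighbour]:
                failed.add(neighbour)
                queue.append(neighbour)
    return failed

# A small chain network: fragile nodes (threshold 1) cascade end to end.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(cascade(adj, {n: 1 for n in adj}, "a"))  # all four nodes fail
```

Raising the thresholds to 2 on the same chain stops the cascade at the first node, which is the intuition behind hardening interdependencies.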
Possibly the first operation on a network is detecting that the network exists. In
some cases, it might be simple to expect that an organization would have a network
of effects. In other cases, this network may not be detectable by an observer.
The simplest way to explain this is to think of a network of criminals within
Manysoft who are trying to commit fraud. Manysoft cannot do anything about this
network until it is able to detect it. Through the detection of several nodes,
Manysoft may be able to establish with confidence that a larger network exists.
The detection problem can be extremely difficult. However, one characteristic of
a network that makes the detection problem easier (though not easy) is the
existence of many nodes. By definition, the network cannot survive as a network
without links. Therefore, while detecting a single node can be a very challenging
problem, the larger the network, the more likely it is that the existence of the
network becomes detectable. Nevertheless, a hidden inactive (i.e., sleeping)
network is more difficult to detect than an active network.
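The intuition that larger networks are more detectable can be made concrete with a back-of-the-envelope calculation. This is our own assumption, not the book’s model: if each link is independently observed with some small probability, the chance that at least one link betrays the network grows quickly with the number of links.

```python
# Sketch (our assumption): each link is independently observed with
# probability p, so detection probability = 1 - (1 - p)^L for L links.

def detection_probability(num_links, p_observe_link):
    return 1 - (1 - p_observe_link) ** num_links

for links in (1, 10, 100):
    print(links, round(detection_probability(links, 0.05), 3))
```

Even at a 5% chance per link, a 100-link network is almost certain to expose itself, while a single node with no observable links stays hidden, which matches the sleeping-network observation above.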
Identification comes after detection. Once a network has been detected, the question
that must be posed is what the network is about. To identify a network is to associate
an identity (purpose or intent) with it, which is a form of contextual information.
For example, is this network for fraud, or is it a gossip network? Identification can
also involve many more features of the network, including an estimate of its size,
its topological characteristics, and a characterization of the dynamics and types of
flows on the network. All these features can help to zoom in and clearly identify
and distinguish the network.
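A few of the topological features mentioned above can be computed directly from an adjacency list. The feature set below (size, link count, density, most-connected node) is an illustrative sketch of our own; real identification would draw on far richer features.

```python
# Sketch of simple topological features that could help characterize a
# detected network. The chosen feature set is an illustrative assumption.

def network_features(adjacency):
    n = len(adjacency)
    degrees = {node: len(neigh) for node, neigh in adjacency.items()}
    num_links = sum(degrees.values()) // 2                 # undirected graph
    density = num_links / (n * (n - 1) / 2) if n > 1 else 0.0
    hub = max(degrees, key=degrees.get)                    # most-connected node
    return {"size": n, "links": num_links, "density": density, "hub": hub}

# A star-shaped network: one central node connected to three others.
star = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h"]}
print(network_features(star))
```

A star topology with one dominant hub, for instance, already hints at a centrally coordinated network rather than a diffuse gossip network.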
In the first type, a barrier is placed between the network and its objectives. Here,
capabilities are not necessarily being denied, because the network may still have
the capability to perform its function. For example, a fraud network may have
access to the system where fraud can be committed. However, every time the
network attempts to commit fraud, a barrier is in operation; for example, the nodes
are called for jobs in different locations, or the system is shut down for maintenance.
This first type of prevention is fundamentally different from denying capabilities
for two reasons: first, the network still has the capacity to commit fraud; and
second, to prevent the network from committing fraud, there is a need for an
extremely efficient monitoring process to establish perfect situation awareness of
the network’s intent and the expected time for an action to be taken. Only then can
prevention through these barriers be successful.
The second type of prevention operates to prevent the network from growing
or increasing its connectivity. This is also a type of a barrier, but it is a barrier
surrounding a network topology, rather than a barrier surrounding a network
function.
Prevention may seem a difficult operation. However, it is an indication of
a healthily functioning organization. If prevention cannot be achieved, it is likely
that the organization does not have sufficient situation awareness of the networks
within it and their intended actions. As such, it is likely that this organization has
networks that have not been detected. Prevention is better than cure; it immunizes
the organization against a potential attack from the networks that may intend to
harm it.
Assume a situation in which a network has the capability to perform its intended
function, and it has not been possible to prevent the network from achieving its
goals or to isolate it: how can we manage the outcomes? Assume that a network of
hackers manages to penetrate the IT system of an organization, steals information,
and is now holding the information to threaten the system. The question now is
whether we can neutralize the effect.
It might be simpler to think of someone who blackmails you with a photo of
you naked: what should you do? If you feel extremely embarrassed, they will be
successful in blackmailing you and getting what they want, and you will open
opportunities for more people to blackmail you. Another strategy would be to go
public naked. Yes, indeed! Regardless of how much fear and embarrassment you
may feel in normal circumstances about appearing naked in public, if you are
blackmailed, you may need to overcome this fear, as the cost of being blackmailed
exceeds the cost of your fear of being seen naked. Although this example may
sound like an exaggeration, it illustrates the point we wish to make: blackmail rests
on fear. One possible strategy for managing blackmail is to face the fear.
Similarly, if information has leaked about an organization and the situation
is difficult to contain, it may be easier to make the information public yourself.
Preempting the effect of the attacker may generate opportunities for you. You can
go public and demonstrate that the organization is moving toward sharing more
information with the public, regardless of how damaging it may be to the
organization. The damage is likely to be much less if you go public than if the
hacker goes public with the information, because you will have the opportunity
to frame the information as you wish and preempt the hacker’s possibly more
damaging framing of it.
If the hacker goes public, the damage is not only in revealing the information.
The damage extends to the security system of the entire organization, the image
of the organization, and its ability to protect its own information. In addition, one
successful hacker may also become a hero for other hackers to follow. Proper risk
assessment can involve the principle that controlled damage is a strategy preferable
to aspiring for a damage-free situation.
For example, moving the hub of a fraud network to an
overseas branch and to a different area of the organization would destroy the fraud
network. A hub in this situation can mean many things, for example, the node that
is most connected socially to all other nodes; the node that is most influential on all
other nodes; the node that is doing the core thinking on behalf of all other nodes; or
the node with the technical competency to execute the act. By destroying the links,
the network collapses.
Given that we are discussing CRT, it is natural that for each network operation
we discuss, we also discuss a counter-operation. In CRT, hiding a network is an
operation in which one of the two teams does not want certain networks to be
detectable by the other team.
Hiding a network can be a large area of research in its own right. Here, we will
discuss this operation at a surface level. Regardless of the level of depth required to
discuss this operation, there are fundamental properties that the operation of hiding
a network requires. These properties are node autonomy and link invisibility. The
two properties may seem dependent on each other: as the degree of autonomy of a
node increases, the node detaches itself from other nodes; thus, there is no need to
establish or reinforce links with the original network (i.e., no need to
communicate), which eliminates the links (dependencies) between the nodes.
However, links exist for many reasons. Two individuals can be autonomous
in their actions, but they live together, work together, talk to each other on the
telephone, and perhaps even have a personal relationship.
The opposite is also true. If no link exists between two individuals, it does not
mean that autonomy is high. Broken links can be an indicator of a dysfunctional or
a sleeping network.
Hiding a network is not about eliminating links or nodes, or even making the
network invisible. A network cannot function properly if no links exist and nodes
are fully autonomous. The need to synchronize actions, share information, and manage
resources, to name a few, means that the network can never be fully hidden.
Instead, effective network-hiding operations are about balancing the signal-
to-noise ratio; embedding one network within many different networks is
one way to hide the main network. Imagine an individual who is extremely sociable.
The larger the number of people this individual meets, interacts with, works with,
collaborates with, and even walks with, the more difficult it will be to detect the real
network of interest through this particular individual. Every interaction between
this individual and another individual will be considered an observation, and every
observation is either important or noise. As the number of noise observations
increases relative to the important ones, the probability of detecting a real
observation decreases.
The story does not end at creating many noisy links to hide the real links.
What is important here is not the word “many,” but the nature of the few.
If an individual meets with the gang of fraudulent employees twice each month,
this individual needs to make “twice-a-month” noise encounters more frequent than
single encounters. The encounters need to overlap and synchronize such that the
signal cannot be isolated from the large noise by which it is surrounded.
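The signal-to-noise argument can be sketched numerically. Under a simple observer model (our assumption, not from the text) in which each observed encounter is equally likely to be inspected, the chance that any single observation is a real network interaction falls as the noise encounters grow:

```python
# Probability that a randomly inspected encounter is a "signal"
# (a real network interaction) rather than noise, under a simple
# uniform-sampling observer model.
def detection_probability(signal_encounters: int, noise_encounters: int) -> float:
    total = signal_encounters + noise_encounters
    return signal_encounters / total if total else 0.0

# Two real meetings per month, hidden among increasing social noise.
for noise in (0, 8, 48, 198):
    p = detection_probability(2, noise)
    print(f"noise={noise:3d}  P(detect per observation)={p:.3f}")
```

With 198 noise encounters around 2 real ones, any single observation has only a 1% chance of being the signal, which is the effect the sociable-individual example describes.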
Such an operation for hiding networks is normally very sophisticated and
requires very advanced understanding of network dynamics and intelligence
operations.
Chapter 5
Case Studies on Computational Red Teaming
Abstract This chapter focuses on the utility of CRT. Three case studies of
varying scale and complexity are presented. The first case study is technology
focused: CRT is used to evaluate three conflict-detection algorithms in the
domain of air-traffic management (ATM). The second case study presents a lab-
based experiment using a game between humans and computers. The objective
of the experiment is to test the impact of noise in the information presented to
the humans and the impact of deceptive strategies. The final case study presents
a significant exercise with air-traffic controllers (ATCOs). The exercise uses
electroencephalographic (EEG) brain data to perform CRT in real time. This case
study connects different enablers for CRT in an integrated manner.
The main case study in this section focuses on the domain of ATM. The
description is self-contained, avoiding technical jargon and keeping the
problem at as high a level of abstraction as possible. A more technical description
of the case study can be found in [5].
ATCOs have the primary responsibility for ensuring that aircraft are separated
safely as they fly. Aircraft should not come too close to each other, as this
increases the risk of collision and fatal accidents. The airspace is divided into flying
zones, named "sectors." We can assume for simplicity that each sector is managed
by one ATCO.
The continuous growth in traffic demand is increasing the volume of traffic.
These increases require more tools to support ATCOs in maintaining
smooth and safe operations. Moreover, the increase in traffic demand has required
changes in the cockpit, with more advanced technologies being added to
give pilots better situation awareness of the air traffic.
For new air-traffic-management technology to be implemented and adopted
by industry, the technology needs to undergo significant testing and evolutionary
cycles. It can take many years for a technology discovered in
a university laboratory to be used in the real air-traffic-management
environment. One reason is the need for rigorous testing. The second reason is the
expense involved in changing the infrastructure to support the new technology and
the legacy systems needed to approve these changes.
In this case study, we need to red team a technology known as conflict detection.
In its basic form, a mathematical algorithm is used to decide on whether two aircraft
will come into conflict (i.e. fly closer to each other than they should) within a given
timeframe. The ATM industry performs such evaluations. Classical methods by
which to evaluate new algorithms such as these include mathematical analysis of the
behavior of the proposed algorithms, and running human-in-the-loop simulations
and fast-mode simulations. In the latter case, data on real traffic are collected, and
then replayed in a simulation environment. These simulations are conducted with
and without the technology enabled to reveal possible complications from using
the technology. These real-traffic data can also be manipulated to increase traffic
volume or to change specific traffic characteristics that the test-and-evaluation
engineer believes are important for testing the technology.
This approach to testing has always been perceived as effective because it
relies on real data and, while the testing is simulated, it is as close as any simulation
can be to the real-world situation.
However, there is a vital drawback in this approach. The technology that we
are testing today is not the only change that will occur in the following 10–20 years,
which is when it will be implemented. In fact, many other concepts and technologies
are being tested today, but are being tested independently. In the case of air-traffic-
conflict-detection algorithms, the air traffic of today will not continue to be the air
traffic of 10–20 years’ time. More importantly, the characteristics of the sectors and
routes can change because of other concepts that focus on redesigning the airspace,
for example, dynamic sectorization [19] and the free flight concept [16, 17].
The implication of the discussion above is that using today’s data to test tomor-
row’s technologies can be a misleading approach. If conflict-detection algorithms
are tested with today’s air routes, even if we double the traffic on these routes, our
testing is biased. Tomorrow’s routes can have completely different structures and in
fact, as in the case of free flight, may not exist at all.
Such information necessitates a thinking approach that is not bound by the box
of today. CRT offers this opportunity.
5.1 Breaking Up Air Traffic Conflict Detection Algorithms
The purpose of the CRT exercise in this case study was to identify the vulnerabilities
in three conflict-detection algorithms that, at the time of the case study, had passed
advanced prototyping stages. The three algorithms were a nominal algorithm [11],
a probabilistic algorithm [14], and a worst-case algorithm [15]. These were param-
eterized in the following manner:
• A conflict was defined in the standard manner for en-route air traffic. Two aircraft
are said to be in conflict if their horizontal separation is less than 5 nm and their
vertical separation is 1,000 ft or less.
• Each algorithm needs to project the position of each aircraft ahead of time to
predict future conflict. The look-ahead time window was set to 8 min because the
quality of prediction beyond this time limit degrades rapidly.
• An algorithm will raise an alarm if a conflict will occur within 5 min. This
is known as the time to the closest point of approach (CPA), which is the time
needed for two aircraft to reach the closest point on their routes.
• Each algorithm will scan 60 nm of the environment every 5 s to detect whether
there is a conflict. These are known as the probe range and probe frequency,
respectively.
Given the above parameterization, the three algorithms were ready to be tested
through CRT.
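The parameterization above can be summarized in a small sketch. The thresholds (5 nm, 1,000 ft, 8-min look-ahead, 5-min alarm window) are taken from the text; the function and constant names are illustrative:

```python
# Minimal sketch of the conflict test used to parameterize the algorithms.
# Thresholds come from the case-study text; names are illustrative.
HORIZONTAL_SEP_NM = 5.0   # standard en-route horizontal separation
VERTICAL_SEP_FT = 1000.0  # standard en-route vertical separation
LOOKAHEAD_MIN = 8.0       # prediction window for projecting positions
ALARM_CPA_MIN = 5.0       # alarm when time to CPA is within this window

def in_conflict(horizontal_nm: float, vertical_ft: float) -> bool:
    """Two aircraft are in conflict if both separations are violated."""
    return horizontal_nm < HORIZONTAL_SEP_NM and vertical_ft <= VERTICAL_SEP_FT

def raise_alarm(time_to_cpa_min: float, horizontal_at_cpa_nm: float,
                vertical_at_cpa_ft: float) -> bool:
    """Alarm only for conflicts predicted inside the alarm window."""
    return (time_to_cpa_min <= ALARM_CPA_MIN
            and in_conflict(horizontal_at_cpa_nm, vertical_at_cpa_ft))

print(raise_alarm(4.0, 3.2, 800.0))  # conflict predicted inside the window
print(raise_alarm(7.0, 3.2, 800.0))  # conflict, but beyond the alarm window
```

The 8-min look-ahead bounds how far ahead positions are projected; the alarm itself fires only within the 5-min CPA window.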
It is also important not to be too ambitious and allow all of these data
to change in a scenario, because the more variables we allow to change, the more
time and resources we need to run simulations and optimize.
To strike the right balance between what goes into the simulator for initialization
and what goes into the scenario as parameterization of context is a problem-
dependent task. The success of this balance will rely on the expertise of the analysis
team.
In this case study, our focus was on conflicts. The primary purpose of the CRT
exercise can be restated as the following: “how to condition conflicts such that these
conflict-detection algorithms fail." It is the keyword "condition" that focuses us
on what we need to optimize, by asking: What is the minimum amount of
information we need to know to condition a conflict?
To condition a conflict, a mathematical analysis needs to be performed to
understand the mathematical building blocks of a conflict. From this mathematical
analysis, three groups of parameters define the formulation of a conflict: distance, angle,
and phase of flight. A conflict is defined when two aircraft violate the minimum
separation distance at the CPA. This distance can be violated in the vertical and/or
horizontal dimensions. Thus, we have two key parameters: horizontal separation at
CPA and vertical separation at CPA.
The conflict angle at CPA is a third parameter that defines the angle at which the
two aircraft are approaching each other. This angle is crucial because it contributes
to the level of fatality of a conflict when comparing, for example, head-to-head and
head-to-tail conflicts.
The final two parameters represent the phase of the aircraft at the time of conflict.
One aircraft may be climbing to a higher altitude, descending to a lower altitude, or
maintaining its cruise phase. Thus, we have two parameters, one for each aircraft.
Each of these parameters can take one of three categorical values (climb, cruise, and
descent).
These five parameters (horizontal separation at CPA, vertical separation at CPA,
conflict angle, phase of first aircraft, and phase of second aircraft) directly define the
characteristics of a conflict. Therefore, performing the optimization directly using
these five parameters would be the most compact representation we can create to
define a conflict. These five parameters exist for one pair of aircraft. In a scenario
with 100 aircraft separated into 50 pairs, the total number of parameters that we
would need to optimize would be 250. This is a large-scale optimization problem,
especially because of the black-box nature and the high level of nonlinear interaction
that occurs within the simulation environment. However, it is the most compact
representation of the problem. Any approximations beyond this level can create
hidden holes in the analysis that pop up later as vulnerabilities.
Thus far, the conflict parameters have been defined directly, but a traffic scenario
is not defined with conflict parameters. A classical air-traffic scenario is defined in
its simplest form by flight plans that describe for each aircraft its origin, destination,
take-off time, and route as a minimum set of information. Therefore, the conflict
parameters defined above need to be transformed to flight-plan information. This
process requires a mathematical transformation known as “backward integration.”
The idea is to take the position where the conflict will occur, along with the above
five parameters, then propagate back the position of each aircraft to construct a route
that generates this conflict from a particular origin.
Through backward integration, the 250 parameters are used to generate 100
flight plans that guarantee a minimum of 50 conflicts in the scenario. These 50
conflicts are generated with the characteristics defined by the parameters. Naturally,
it is expected that more unplanned conflicts will arise from the interaction of these
aircraft. However, if no other conflict arises, there are at least 50 conflicts in each
scenario. Evolutionary computation is used to optimize this vector. A chromosome
contains 250 variables. Each chromosome represents the parameterization of an air-
traffic scenario with 100 aircraft.
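A minimal sketch of this encoding follows, assuming illustrative value ranges for the continuous parameters (the text specifies the five parameter types, not their bounds):

```python
import random

# One conflict pair = (horizontal sep at CPA, vertical sep at CPA,
# conflict angle, phase of aircraft 1, phase of aircraft 2).
PHASES = ("climb", "cruise", "descent")
PAIRS = 50          # 100 aircraft -> 50 conflict pairs
GENES_PER_PAIR = 5  # 50 * 5 = 250 genes per chromosome

def random_pair(rng: random.Random):
    return (
        rng.uniform(0.0, 5.0),     # horizontal separation at CPA (nm)
        rng.uniform(0.0, 1000.0),  # vertical separation at CPA (ft)
        rng.uniform(0.0, 180.0),   # conflict angle (degrees)
        rng.choice(PHASES),        # phase of first aircraft
        rng.choice(PHASES),        # phase of second aircraft
    )

def random_chromosome(rng: random.Random):
    """One chromosome parameterizes a whole 100-aircraft scenario."""
    return [random_pair(rng) for _ in range(PAIRS)]

chromosome = random_chromosome(random.Random(0))
print(len(chromosome), len(chromosome) * GENES_PER_PAIR)  # 50 pairs, 250 genes
```

Backward integration then turns each decoded pair into two flight plans whose routes produce a conflict with exactly these characteristics.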
To discover the vulnerabilities in the three algorithms, the problem is to search
for those scenarios for which the algorithms will fail to detect conflicts. The failure
of a conflict-detection algorithm is measured with two metrics: missed detects
and false alarms. These two metrics are generally in conflict. As the algorithm
attempts to detect every conflict, it generally needs to lower its threshold for the
classification of a conflict. This reduces missed detects but increases false alarms
because the lower threshold will categorize the non-conflict situations that are
close to the boundary that defines a conflict as a conflict. This situation suggests a
multi-objective formulation with two objectives: maximization of false alarms and
maximization of missed detects.
The second version of the non-dominated sorting genetic algorithm
(NSGA-II) [10] was used. NSGA-II is a good evolutionary multi-objective optimization
algorithm. It is not guaranteed to converge every time to the globally optimal
non-dominated set, but in our case this is an advantage: by running the algorithm
many times, we can generate different paths in the search space, some of which focus
on local non-dominated sets, while others focus on the global non-dominated set.
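The non-dominated sorting at the heart of NSGA-II rests on Pareto dominance over the two maximization objectives. A minimal sketch of that dominance test (not the full NSGA-II algorithm) follows; the example scores are invented:

```python
# Pareto dominance for two maximization objectives scored per scenario:
# (false alarms, missed detects).
def dominates(a, b):
    """True if a is at least as good as b on every objective and
    strictly better on at least one (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated(scores):
    """Scenarios not dominated by any other scenario form the front."""
    return [s for s in scores if not any(dominates(t, s) for t in scores if t != s)]

scores = [(10, 2), (4, 8), (7, 7), (3, 3)]
print(non_dominated(scores))  # (3, 3) is dominated by (7, 7); the rest survive
```

Scenarios on the resulting front are the most damaging trade-offs found between triggering false alarms and slipping conflicts past the detector.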
A population size of 50 scenarios, evolving over 100 generations, with the process
repeated 10 times with different seeds, generated 50,000 scenarios. Each
scenario has at least 50 conflicts; thus, each algorithm was probed with more than
2,500,000 conflicts. This massive amount of data generated by the evolutionary-
optimization algorithm was then fed into the data-mining algorithm to discover
patterns of failure.
Given the large amount of data available, and the need to understand the per-
formance boundaries of each of the three conflict-detection algorithms to estimate
their performance envelopes, we needed to rely on an efficient data-mining technique
that was transparent. Transparency here means that the output of the data-mining
technique needs to be simple and in human-understandable language so that the
output can be verified by the designers of the three algorithms, as well as any air-
traffic organization interested in evaluating these algorithms.
5.2 Human Behaviors and Strategies in Blue–Red Simulations
Section 1.7.4 presented blue–red simulations. Next-generation blue–red simu-
lation systems allow human behavior to be represented within the simulation
environment using behavioral models. Examples of these systems include ModSAF [9]
and OneSAF [20]. Most of these behavior-based models rely on studies that
identified how humans react in certain situations. For example, a behavioral model
can represent human walking speed as a function of the load the human is carrying,
the terrain, fitness level of the human, and how tired the human is. More difficult
areas in human modeling are the considerations of the type of planning strategies
a human uses when faced with certain situations, even simple situations such as
chasing one another.
In this case study, we needed to red team against a human to uncover human
strategies in simple goal-following tasks. These types of strategies may sound
overly simple. However, as soon as we add dimensions of deception and noisy
information, we are faced with many options that can lead to different behavioral
models. A need arises to provide evidence to support these different models, and
to provide information about the context in which to apply them for blue–red
simulation environments. Moreover, once it becomes possible to design human-based
experiments to study the pool of strategies that humans follow in certain contexts,
a second need arises: automatic methods to conduct this type of analysis
autonomously, given the large volume of data.
The task was very simple, and was inspired by Tom and Jerry (the cat and mouse).
Tom (blue agent) attempts to catch Jerry (red agent), while Jerry attempts to escape
Tom. In this exercise, Tom is fully autonomous, following simple fixed strategies,
while Jerry is controlled by a human.
The situation is an abstraction of a thief being followed, a software agent in
cyberspace attempting to automatically follow a human intruder, or a group
of autonomous aircraft following another group with ground-based or airborne
pilots. In many of these problems, the physical space where maneuvers
occur can be isolated (to some extent) from the strategy followed by humans.
Modeling the environment with this level of simplicity has the advantage of
eliminating the unnecessary complexities that exist in the real world and do not
necessarily contribute to the real phenomenon that we wish to study. Moreover, it
allows the analyst to focus on the question, and develop the necessary tools without
worrying about irrelevant details.
Within the decision-support and information-fusion literature, the above
approach is valid. The Tom and Jerry game is a high-level representation of the
context or situation under investigation.
To explain with an example, assume that the two teams are two companies.
The behavior of these two companies is defined by two factors: investment in
production and investment in marketing. We can see these two factors forming a
two-dimensional space, with Tom and Jerry representing the current distribution of
the budget between production and marketing. Tom is attempting to copy Jerry by
following him in the environment, while Jerry is trying to maneuver to escape from
Tom. Both would attempt to deceive each other, while their information about each
other also contains a level of noise. This level of abstraction allows us to focus on
the key phenomenon we wish to investigate without becoming overwhelmed with
the situation.
The main purpose of this CRT exercise was to establish a CRT methodology to
identify vulnerabilities in human strategies within a simple reflex task.
The team size was reduced to one; thus, we had only a single agent on each side.
The blue agent was autonomously controlled through predefined heuristics written
in a scripting language. The red agent was controlled by humans in one set-up, by
a machine using scripts in a second set-up, and by a machine in a third set-up
using a machine-learning model (an artificial neural network) that was trained on the
human data produced in the first set-up.
The human-based experiments relied on 34 human subjects (19 females and
15 males). The scenario space involved 20 scenarios. Each scenario was played twice
by each human player; thus, each human played 40 games. The sequence was
shuffled to guarantee that each human player received a random sequence that was
not played by any other player.
Each scenario was defined by two factors: sensorial capabilities (input to blue)
and behavioral capabilities (output of blue). These two dimensions represented the
situation awareness of blue and the deceptive range of blue, respectively. More on
these parameters will be discussed in the simulation section.
The simulation environment was a very simple two-dimensional grid. This simulator
acted as a game engine for the human experiments, whereby the human interacted
with the simulator, and as a plain simulator for the machine experiments.
The grid was a bounded 640 × 640 cells, with a wall of eight cells on
each side. This left an environment of 624 × 624 cells within which the agents could
move. While the space was modeled as a grid to match the resolution of the
computer screen on which the players played, movements were described
in a continuous domain using a fixed step speed and an angle in
[−180°, 180°]. The agents had the same speed so that neither could take
advantage of speed differentials; they controlled only the travel angle, which was
θb for the blue agent and θr for the red agent. The simulation environment used
discrete-time simulation. Nevertheless, the time taken by the human between seeing
the blue agent move and responding with a move was recorded and analyzed as
reaction time.
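The movement model described above can be sketched as follows; the step size is an assumed value, since the text fixes only that both agents share the same speed:

```python
import math

STEP = 5.0              # fixed speed shared by both agents (assumed value)
LOW, HIGH = 8.0, 632.0  # 8-cell wall around a 640 x 640 grid -> 624 x 624 playable

def move(x: float, y: float, angle_deg: float):
    """Advance one agent a fixed step along its chosen travel angle,
    clamped to the walled playable area."""
    nx = x + STEP * math.cos(math.radians(angle_deg))
    ny = y + STEP * math.sin(math.radians(angle_deg))
    return (min(max(nx, LOW), HIGH), min(max(ny, LOW), HIGH))

# Each discrete tick, blue and red each choose only an angle; position
# updates continuously within the bounded environment.
x, y = move(320.0, 320.0, 0.0)
print(round(x, 1), round(y, 1))  # 325.0 320.0
```

Because speed is fixed, the entire strategy of either agent is expressed through the sequence of chosen angles, which is exactly what the later behavioral mining analyzes.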
The true challenger in this scenario is the red agent, because the blue agent acts
using a strategy that is predefined at the beginning of each simulation, independently
of the strategy the red agent is following. Meanwhile, the red agent is expected to
approximate blue's strategy in order to escape from blue.
The blue agent attempts to capture the red agent by traveling straight toward where
the red agent is moving. The blue agent also applies deceptive strategies, whereby it
may deviate from this straight line, coaxing the red agent into thinking that it is
traveling away. This deceptive strategy allows the blue agent to deviate for some
time and then move head-on toward the red agent.
The blue agent produces an action based on the most recent position it receives on
red. The sensorial information for blue is modeled using two parameters: frequency
of receiving information and noise in received information. As such, when there
is a large time lag in the information that blue receives, the blue agent acts during
this time on a historical position for the red agent. Similarly, if the information the
blue agent receives has a high level of noise, the blue agent will act on a perturbed
position for red.
The red agent is controlled by a human or a machine. Information is commu-
nicated to the red agent in the form of an information display (in the case of human
control) or in a digitized manner (in the case of machine control). In the former, there
are two classes of scenario to which the human is exposed: one is termed the
"known–known" and the other the "known–unknown." In the known–known class,
the red agent sees where the blue agent is, where the blue agent thinks/perceives the
red agent to be, and where the red agent truly is. In the known–unknown class, the
red agent sees only where blue and itself are in the environment, without knowing
what blue knows. It is hypothesized that in the known–known class, allowing the red
agent to know where the blue agent thinks it is will aid the red agent in escaping
from the blue agent.
At the end of all experiments, every trajectory generated by the blue and red agents
in every simulation was recorded. These trajectories express the red agent's strategy
in actions. While a trajectory here is simply the sequence of movement angles
that the red agent produced to escape from the blue agent, this sequence contains
a significant amount of information.
To mine this information, the sequence of angles needed to be converted into
information on how a player employs a specific strategy while playing. A strategy
here was the sequence of actions followed by a player with the intention to achieve
a goal. Since the angles on their own do not reveal a great deal of information about
the intention of the player, we needed to calculate the first and second derivative
of the angle. The change in an angle and the acceleration of the change revealed
information on player’s intentions.
We could also have calculated all moments of the sequence of angles, but this would
be a computationally expensive exercise. Therefore, the behavioral miner analyzed
three pieces of information: change information calculated from the sequence of
angles, reaction-time information, and the scores of the different games.
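In discrete time, the first and second derivatives of the angle sequence reduce to first- and second-order finite differences, as in this sketch (the trajectory values are invented):

```python
def diffs(seq):
    """First-order finite differences of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]

# A red-agent trajectory as a sequence of travel angles (degrees).
angles = [0, 10, 30, 60, 60, 30]
change = diffs(angles)        # how fast the heading is turning
acceleration = diffs(change)  # how that turning itself changes
print(change)        # [10, 20, 30, 0, -30]
print(acceleration)  # [10, 10, -30, -30]
```

A sustained positive change with near-zero acceleration suggests a steady arc, while large swings in acceleration suggest evasive or deceptive maneuvering, which is the kind of signal the behavioral miner looks for.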
The final case study presents a CRT exercise that brought together many of the
computational tools and methods discussed in this book. The initial concept and
results are discussed in [1–3].
Operators in safety-critical domains hold a great deal of responsibility for
maintaining the safety of the environment. In the air-traffic-control domain, an
ATCO may be managing a dozen or two dozen aircraft simultaneously, and
be responsible for the thousands of lives on board these aircraft. As traffic
demand increases, and changes are introduced to air routes and structures, the
complexity of the environment increases as well.
5.3 Cognitive-Cyber Symbiosis (CoCyS): Dancing with Air Traffic Complexity
How can we link changes in the task environment to brain-data indicators? We need
some tasks that are concrete and simple enough to give us confidence that the
changes we notice in brain activities are due to the task and nothing else.
Let us now assume we have two tasks; we will call them baselines. The second
task is a copy of the first task with the addition of one subtask, such as counting.
Differences in brain activities between the two tasks, assuming everything else is
constant and controlled for, can be attributed to counting. That is, if we somehow
subtract the two tasks from each other, and we somehow also subtract the brain
signals from each other, then, in a strict sense, the first result is taken to cause the
second result: the first result occurred before the second result, and under the closed-
world assumption of this test, there is no alternative explanation for the
second result except the first result.
In this hypothetical example, one non-trivial assumption is that everything else is
constant and controlled for. In a real-world environment, this assumption is very
unrealistic. However, let us now imagine we can repeat this process many times,
and on a continuous basis, we will obtain many differences between tasks and many
corresponding differences between the brain data that we can correlate. Given the
continuity of the data collection, differences in tasks are repeated, allowing for
multiple measurements for similar phenomena that we can average or for which
we can perform some other statistical tricks to eliminate the extra factors for
which we did not account. These differences become the probes that we use to
approximate the performance envelope of the complexity in the environment.
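The repeated-differences idea can be illustrated in a few lines; the numbers are invented for illustration:

```python
from statistics import mean

# Repeated task-vs-baseline differences in a brain-data indicator for the
# same subtask (e.g., counting), collected continuously over many probes.
# Averaging over repetitions suppresses the uncontrolled factors that vary
# from one repetition to the next.
brain_diffs = [1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7]
print(round(mean(brain_diffs), 2))  # 1.0
```

The averaged difference is one probe point; accumulating many such probes across different task differences traces out the complexity envelope described above.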
We are then left with the task of designing actions that can maintain the environ-
ment inside the envelope every time it attempts to cross the complexity boundary.
Given the dynamic nature of the environment, these actions need to stabilize the
environment over time. That is, this can be considered a classical control system that
attempts to stabilize this environment. Unfortunately, the task is too complex to use
classical control theory or perform system identification in advance to understand
fully the states of this system or the constraints on these states.
CRT performs this control function using tools from simulation, optimization
and data mining. By having a simulation running in the background shadowing the
traffic in the foreground, we can use this simulation to perform impact analysis.
The optimization search engine can propose solutions and request the simulation to
project the impact of these solutions in the future. Only solutions/actions that are
more likely to stabilize the environment are selected as potential actions to steer the
environment back inside the complexity envelope.
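The select-by-projection loop can be sketched as follows; the complexity limit and the linear projection model are purely illustrative assumptions, standing in for the shadow simulation:

```python
# Sketch of the CRT control loop: candidate actions proposed by the
# optimizer are scored by a "shadow simulation" that projects future
# complexity; only actions predicted to keep complexity inside the
# envelope are retained as potential interventions.
COMPLEXITY_LIMIT = 100.0  # illustrative envelope boundary

def shadow_project(current_complexity: float, action_effect: float) -> float:
    """Project complexity after applying an action (toy additive model)."""
    return current_complexity + action_effect

def select_actions(current_complexity: float, candidate_effects):
    return [e for e in candidate_effects
            if shadow_project(current_complexity, e) <= COMPLEXITY_LIMIT]

# Traffic is near the boundary; the optimizer proposes four candidate actions.
print(select_actions(95.0, [-10.0, 2.0, 8.0, 4.0]))  # [-10.0, 2.0, 4.0]
```

In the real system the projection is a fast-time simulation run rather than a formula, but the filtering logic is the same: discard any action whose projected future crosses the envelope.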
We are now left with one last issue to resolve: how to implement this action.
Within an air-traffic environment, we cannot accept the risk that the system can
produce an action that (because of the approximations in the simulated environment)
can expose the safety-critical system to negative risk. There are a number of
practical solutions for this, but these are too technical and outside the scope of
this summary of the case study. For simplicity, we can assume that the action can
be communicated to the ATCO through a human intermediary. This human expert
can transfer the action in a manner that manages this risk. The implications of this
protocol are also outside the scope of this summary.
Fig. 5.2 A pictorial representation of the cognitive balance required for users in safety-critical jobs
As such, the purpose of this CRT exercise is to estimate the boundary constraints
of this interaction between the environment and the controller’s cognitive state,
with the aim of dynamically designing strategies to maintain the controller in the
“engaged” zone.
The players in this exercise were predefined because of resource constraints. It
was not possible to plan for having many ATCOs. Four ATCOs and two pilots were
available for the study at different times over the period of 1 week.
As discussed in the previous section, contrasting indicators from the traffic with
indicators from the controller’s brain provided a means to identify the boundary
constraints for this problem. Figure 5.3 demonstrates the characteristics of the brain-
traffic interface. On the controller side, the controller’s environment represents the
wider environment within which the controller is embedded. In this environment,
the controller is performing many tasks. Some tasks are not work related, such as thinking about their partner; some are work related but not traffic related, such as thinking about their boss; and others are both work and traffic related. These latter tasks represent the core job of the controller.
As the controller attempts to manage safety, the controller performs three main functions: monitoring, planning, and action [12, 13]. This level of explanation is sufficient to understand holistically what the system is doing; the actual complexity of making this system a reality rests within technical details that are left out to keep the presentation at a level appropriate for a wider audience.
While real-life ATCOs and pilots were used in this exercise, the experimental nature
of CRT means that the traffic itself was simulated using a complex high-fidelity
simulation environment developed by Eurocontrol. For the sake of our discussion, we will term this simulator the "real traffic"; the word "real" here should be interpreted as "realistic." This distinguishes it from the simulator used within the CRT environment, which we will term the "shadow simulator."
The shadow simulator was primarily used in the background to "shadow," or run in parallel with, the real traffic. Its purpose is as before: it is the CRT tool that mimics the real environment, enabling the CRT exercise to project ahead and undertake consequence and impact analyses as required.
The extra tasks that the shadow simulator must conduct mean that it needs to be very fast. To achieve this speed, the shadow simulator does not duplicate everything in the real traffic. Many functions and pieces of information computed in the real operational environment are essential for the controller's job but not for the CRT exercise; the shadow simulator performs only the calculations the exercise requires. This is achieved by selecting an appropriate level of abstraction, allowing the shadow simulator to run much faster than the real environment and conduct look-ahead analysis.
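To make the idea of abstraction concrete, here is a minimal sketch of a look-ahead step. All names (`Aircraft`, `project_ahead`) and the straight-line kinematics are hypothetical illustrations, not the Eurocontrol simulator's actual interface; the point is that a shadow simulator can drop vertical profiles, winds, and flight-management logic and still project positions far faster than real time.

```python
from dataclasses import dataclass

@dataclass
class Aircraft:
    # Planar position (NM) and velocity (knots): a deliberately coarse
    # abstraction with no vertical profile, winds, or flight-management logic.
    x: float
    y: float
    vx: float
    vy: float

def project_ahead(aircraft, horizon_min=10.0, step_min=1.0):
    """Look ahead by straight-line extrapolation at a coarse time step.

    Returns one snapshot per step; each snapshot is a list of (x, y)
    positions, one per aircraft.
    """
    snapshots = []
    t = 0.0
    while t < horizon_min:
        t += step_min
        snapshots.append([(a.x + a.vx * t / 60.0, a.y + a.vy * t / 60.0)
                          for a in aircraft])
    return snapshots

# Two aircraft closing head-on along the x-axis.
traffic = [Aircraft(0.0, 0.0, 480.0, 0.0), Aircraft(40.0, 5.0, -420.0, 0.0)]
future = project_ahead(traffic, horizon_min=10.0, step_min=1.0)
```

Because nothing here touches displays, radar feeds, or controller tooling, thousands of such projections fit in the time budget of a single real-time update.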
Even with many tricks for speeding up the shadow simulator, it is not easy to reach the speed required for the CRT environment. In this experiment, the exercise ran in real time: within a minute or two, thousands of simulations may need to be conducted by the shadow simulator. To enable this number of simulations, the shadow simulator can run on multiple cores, or even on a computer cluster. In this exercise, a cluster was not needed, thanks to the team's experience in designing efficient simulators and to the prior analysis that ensured the simulator included all the information needed for the exercise and nothing more.
Multiple pieces of data analysis are needed in an exercise such as this. First, brain
data are signals that are measured in real time. Second, the traffic itself needs to be
assessed in real time.
The brain data were measured from 21 sites on the scalp, including two references used to transform the measurements into voltage information, with each site sampled at 2,048 Hz, that is, 2,048 readings every second for each site. This is a large amount of data, approximately 2.6 million real numbers per minute. Using 128 sites would increase this volume by roughly an order of magnitude.
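The sampling arithmetic is easy to verify with a back-of-envelope check (the site counts and sampling rate are those stated above; everything else follows from multiplication):

```python
sites = 21            # scalp sites, including the two reference electrodes
rate_hz = 2048        # readings per second per site
per_minute = sites * rate_hz * 60          # 21 * 2048 * 60 = 2,580,480
per_minute_128 = 128 * rate_hz * 60        # high-density montage for comparison
```

At 21 sites the raw stream is about 2.6 million readings per minute, and a 128-site montage multiplies that by roughly six, which is the "order of magnitude" increase mentioned above.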
The raw signals needed to be processed and transformed into high-level cognitive
indicators in real time. Designing the cognitive model to support this analysis
required many experiments before the actual exercise to ensure that the model was
adequate for the analysis. The details of the cognitive model are too technical for
the purposes of this chapter, but it is very important to note here that a great deal
of work is involved with this modeling, and it requires significant interdisciplinary
experience. This cognitive model cannot rely simply on an understanding of human cognition or neuroscience; it requires experience and knowledge of air traffic, optimization, simulation, and the overall CRT tools and concepts.
The amount of traffic data is much smaller than the amount of brain data. A similar data-analysis architecture is used to analyze the traffic, extracting indicators such as the number of aircraft within the sector, crossing angles between pairs of aircraft, and closest points of approach (CPAs) for pairs of aircraft.
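For illustration, the CPA of a pair of aircraft flying constant-velocity, straight-line tracks has a closed form. This is a simplified sketch (planar, no altitude, no turns, no uncertainty), not the operational computation used in the exercise:

```python
import math

def closest_point_of_approach(p1, v1, p2, v2):
    """Time (>= 0) and distance of closest approach for two aircraft
    moving with constant velocity. Positions in NM, velocities in NM/min."""
    px, py = p2[0] - p1[0], p2[1] - p1[1]   # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]   # relative velocity
    vv = vx * vx + vy * vy
    # Minimize |p + v*t|: t* = -(p . v) / |v|^2, clamped to the future.
    t = 0.0 if vv == 0.0 else max(0.0, -(px * vx + py * vy) / vv)
    dx, dy = px + vx * t, py + vy * t
    return t, math.hypot(dx, dy)

# Head-on geometry: 30 NM apart, closing at 14 NM/min combined.
t, d = closest_point_of_approach((0, 0), (8, 0), (30, 0), (-6, 0))
```

Crossing angles, by contrast, come directly from the two velocity headings, so the per-pair traffic indicators are cheap relative to the brain-signal processing.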
The role of optimization within this exercise was to find actions that can steer
complexity back to the engaging region. This meant that when the complexity was
too high, it needed to be decreased, and when the complexity was too low, it needed
to be increased to keep the controller engaged.
The list of actions was predetermined and designed to be realistic. The optimizer was called every 2 min, and a decision was needed within 30 s. After an in-depth analysis of the problem, it was decided that an exhaustive search strategy was feasible. Thus, a complete search was conducted by evaluating the impact of every action on every aircraft using the simulator, allowing the optimizer to select the optimal actions to generate the desired impact.
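The exhaustive strategy amounts to a double loop over actions and aircraft, with each pair scored by one shadow-simulator run. The sketch below makes that structure explicit; the action names and the toy scoring table are hypothetical stand-ins for the real action list and simulator:

```python
def exhaustive_search(actions, aircraft_ids, simulate, target_complexity):
    """Evaluate every (action, aircraft) pair with the shadow simulator and
    return the pair whose predicted complexity is closest to the target."""
    best, best_gap = None, float("inf")
    for action in actions:
        for ac in aircraft_ids:
            predicted = simulate(action, ac)        # one shadow-simulator run
            gap = abs(predicted - target_complexity)
            if gap < best_gap:
                best, best_gap = (action, ac), gap
    return best

# Toy stand-in for the shadow simulator: each pair maps to a fixed score.
scores = {("slow_down", "AC1"): 0.9, ("slow_down", "AC2"): 0.55,
          ("reroute", "AC1"): 0.2, ("reroute", "AC2"): 0.7}
choice = exhaustive_search(["slow_down", "reroute"], ["AC1", "AC2"],
                           lambda a, ac: scores[(a, ac)],
                           target_complexity=0.6)
```

With a small, predetermined action list, the number of simulator runs stays within the thousands-per-minute budget discussed earlier, which is why complete enumeration was preferable to a heuristic search here.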
The second component estimates the complexity boundary of the traffic itself and continues to monitor this boundary. The complexity boundary of the traffic was pre-estimated before the actual CRT exercise: because we were using a scenario within a simulation environment, we were able to analyze this scenario before the exercise.
In the case of the human-engagement level, pre-estimation of the complexity boundary was not possible because of the variation among human subjects. Therefore, the engagement-level estimation needed to be executed dynamically within the exercise itself.
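One plausible scheme for such a dynamic, per-subject estimate is to derive a band from that session's own baseline recording. This is a hypothetical sketch, not the method used in the study; the choice of mean-plus-or-minus-two-standard-deviations is an illustrative assumption:

```python
import statistics

def engagement_band(baseline_samples, k=2.0):
    """Estimate a per-subject engagement band from a baseline recording.

    Because engagement indices vary between subjects (and for the same
    subject on different days), the band is derived from the session's own
    baseline: mean +/- k standard deviations of the baseline index.
    """
    mu = statistics.fmean(baseline_samples)
    sigma = statistics.pstdev(baseline_samples)
    return mu - k * sigma, mu + k * sigma

# Baseline engagement-index samples collected at the start of a session.
low, high = engagement_band([0.40, 0.45, 0.50, 0.55, 0.60])
```

The band can then be re-estimated as the session progresses, which is what "dynamic" estimation means in practice.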
The third component of the challenger defines the rules and actions of challenging. The rules subcomponent is built on top of the first two components to decide whether complexity is crossing the boundaries. If it decides that complexity is crossing from a desirable subspace to an undesirable one, the second subcomponent manufactures the required response through a negotiation process with the optimizer. This negotiation process relies not only on the best solution found by the optimizer, but also on the history of previous actions taken and on integrity constraints ensuring that actions are realistic and consistent with the traffic environment and operating procedures.
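The rule-and-response logic can be sketched as follows. The band values, action names, and three-action history window are illustrative assumptions, not the protocol used in the exercise:

```python
def challenge(complexity, band, candidate_actions, history, is_consistent):
    """Rules subcomponent: fire only when complexity leaves the desirable band.

    Response subcomponent: walk the optimizer's ranked candidates for the
    needed direction and return the first one that was not among the last
    few actions taken and that passes the integrity constraints.
    """
    low, high = band
    if low <= complexity <= high:
        return None                          # still in the desirable subspace
    direction = "reduce" if complexity > high else "increase"
    for action in candidate_actions[direction]:
        if action not in history[-3:] and is_consistent(action):
            return action
    return None

action = challenge(
    complexity=0.92, band=(0.3, 0.8),
    candidate_actions={"reduce": ["hand_off_AC7", "reroute_AC3"],
                       "increase": ["inject_AC9"]},
    history=["hand_off_AC7"],
    is_consistent=lambda a: True)
```

Note how the history check rejects the optimizer's top candidate because it was just used: this is the "negotiation" between the rule layer and the optimizer in miniature.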
The primary set of experiments was conducted over a working week with four
ATCOs and two pilots. Figure 5.5 summarizes the experimental protocol for each
subject. Each subject underwent a briefing and initial training or re-familiarization with the environment, because they all had experience from previous experiments with
similar tasks. A demographic survey was conducted followed by two cognitive tests,
mainly for the purpose of studying differences and similarities of the subjects. Each
subject conducted four air-traffic-control sessions.
Figure 5.6 summarizes the protocol followed in each of the four sessions. Each
session lasted for 75 min, and included one ATCO and the two pilots. The ATCO
will be referred to as the "measured player," while all other team members within the experiment, including analysts and pilots, will be referred to as the "unmeasured players." The position where the ATCO sits will be referred to as the "measured position," while the other positions are the "unmeasured" positions.
The “measured” refers to the process of measuring data from the human and
the traffic. The purpose of this CRT was to red team complexity with the ATCOs;
therefore, the ATCOs were the only subjects to be analyzed in this experiment.
The traffic scenario needed to be tested under the following four conditions: situations in which CRT was not used; situations in which CRT was used and relied on traffic complexity alone; situations in which CRT was used and relied on cognitive complexity alone; and situations in which CRT was used and relied on both traffic and cognitive complexity. Each ATCO underwent all four conditions over four sessions, and the sequence of conditions was shuffled for each ATCO.
Fig. 5.5 Protocol for each ATCO/subject tested during the exercise
Fig. 5.6 Protocol for each of the 16 sessions conducted during the exercise
Subjective assessments were conducted at the beginning of each session in the form of a survey, and at the end of each session in the form of a set of rating questions assessing the perceived complexity of the scenario.
Human-brain data vary between different humans, and even for the same human at different times of the day and in different situations. As such, at the beginning of each session, each ATCO was monitored for 6 min to collect baseline information about the human at that particular point in time. This monitoring was divided into 2 min with eyes closed and the person attempting to relax; 2 min with eyes open and the person attempting to relax; and 2 min with eyes open and the person attempting to solve a computational task. The same 6 min were repeated at the conclusion of the traffic scenario.
The exercises were a great success. Some of the initial technical results were published in [2]. The CRT environment, along with the cognitive models, was successful in adapting the environment to the traffic and the ATCOs' states.
The most significant and domain-specific finding was that mathematical models that assess the complexity of a situation, such as air-traffic states, were insufficient to approximate the cognitive complexity experienced by the ATCO. There are legitimate and logical reasons for this, beyond the evidence extracted from these experiments. First, complexity within an air-traffic environment builds up over time.
Models that rely on a snapshot of the traffic therefore examine only what is happening now, rather than how the situation has been evolving since the beginning of the shift. One may not notice an increase in complexity over time, yet continuous exposure of a human to a situation can create cognitive-complexity implications for that human. Complexity measures therefore need to integrate their findings over time.
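The point about integration can be illustrated with a simple exponentially weighted measure. This is a hypothetical sketch, not the metric used in the study: two traffic histories that end in the same snapshot produce very different integrated loads.

```python
def integrated_complexity(snapshots, decay=0.9):
    """Exponentially weighted integration of snapshot complexity over a shift.

    A plain snapshot measure sees only the latest value; the integrated
    measure retains the build-up (or absence) of load over time.
    """
    level = 0.0
    for c in snapshots:
        level = decay * level + (1.0 - decay) * c
    return level

# Same final snapshot (0.5), very different histories:
calm_then_busy = [0.1] * 10 + [0.5]
busy_then_calm = [0.9] * 10 + [0.5]
a = integrated_complexity(calm_then_busy)
b = integrated_complexity(busy_then_calm)
```

A snapshot metric would rate both situations identically at the final instant, whereas the integrated measure correctly reports a higher accumulated load after the sustained-busy history.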
Second, each human is different. For example, humans differ in their skills,
problem-solving strategies, and perceptual abilities, all of which impact human
perception of complexity and the real impact of the air traffic on the human
cognitive processes. Therefore, a single mathematical measure that only considers
the traffic/context without the human managing the traffic/context can be very
misleading in judging a situation.
The findings of this exercise indicated the need to design complexity metrics that
are dynamic; interested readers can visit [3] for more details on this issue.
References
1. Abbass, H., Deborah, T., Kirby, S., Ellejmi, M.: Brain traffic integration. Air Traffic Technol.
Int., 34–39 (2013)
2. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: Augmented cognition using real-time
EEG-based adaptive strategies for air traffic control. In: International Annual Meeting of the
Human Factors and Ergonomics Society. HFES, SAGE (2014)
3. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: The computational air traffic control
brain: computational red teaming and big data for real-time seamless brain-traffic integration.
J. Air Traffic Control 56(2), 10–17 (2014)
4. Alam, S., Abbass, H.A., Barlow, M.: Atoms: air traffic operations and management simulator.
IEEE Trans. Intell. Transp. Syst. 9(2), 209–225 (2008)
5. Alam, S., Shafi, K., Abbass, H.A., Barlow, M.: An ensemble approach for conflict detection in
free flight by data mining. Transp. Res. Part C: Emerg. Technol. 17(3), 298–317 (2009)
6. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Computational red teaming for
correction of traffic events in real time human performance studies. In: USA/Europe ATM
R&D Seminar, Chicago (2013)
7. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: An evolutionary goal-programming
approach towards scenario design for air-traffic human-performance experiments. In: IEEE
Symposium on Computational Intelligence in Vehicles and Transportation Systems (CIVTS),
pp. 64–71. IEEE, Singapore (2013)
8. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Trading-off simulation fidelity and
optimization accuracy in air-traffic experiments using differential evolution. In: IEEE Congress
on Evolutionary Computation (CEC). IEEE, Beijing, China (2014)
9. Calder, R., Smith, J., Courtemanche, A., Mar, J., Ceranowicz, A.Z.: ModSAF behavior simulation and control. In: Proceedings of the Conference on Computer Generated Forces and Behavioral Representation (1993)
10. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
11. Dowek, G., Geser, A., Munoz, C.: Tactical conflict detection and resolution in a 3D airspace.
In: Proceedings of the 4th USA/Europe ATM R&D Seminar, Santa Fe (2001)
12. Endsley, M.R.: Measurement of situational awareness in dynamic systems. J. Hum. Factors
Ergon. Soc. 37(1), 65–84 (1995)
13. Endsley, M.R.: Toward a theory of situational awareness in dynamic systems. Hum. Factors: J. Hum. Factors Ergon. Soc. 37(1), 32–64 (1995)
14. Erzberger, H., Paielli, R.: Conflict probability estimation for free flight. AIAA J. Guid. Control
Dynam. 20(3), 588–596 (1997)
15. Gazit, R.: Aircraft surveillance and collision avoidance using GPS. Ph.D. thesis, Stanford
University, Stanford (1996)
16. Hoekstra, J.: Designing for safety: the free flight air traffic management concept. Ph.D. thesis,
Delft University of Technology, Delft (2001)
17. Hoekstra, J., Gent, R., Ruigrok, R.: Conceptual design of free flight with airborne separa-
tion assurance. In: Proceedings of AIAA: Guidance, Navigation, and Control Conference,
vol. 4239, pp. 807–817 (1998)
18. Shafi, K., Abbass, H., Zhu, W.: Real time signature extraction during adaptive rule discovery
using UCS. In: Proceedings of the IEEE Congress on Evolutionary Computation, Singapore
(2007)
19. Tang, J., Alam, S., Lokan, C., Abbass, H.A.: A multi-objective approach for dynamic airspace
sectorization using agent based and geometric models. Transp. Res. Part C Emerg. Technol.
21(1), 89–121 (2012)
20. Wittman Jr, R.L., Harrison, C.T.: OneSAF: a product line approach to simulation development. Technical report, DTIC Document (2001)
Chapter 6
The Way Forward
Abstract This book presented the first steps that transform RT, the art, into CRT, the science. Thomas Gilbert, the father of human performance technology, based his work on three principles that describe a science: simplicity, coherence, and utility. In writing
this book, these three principles were closely followed. The language has been
simplified to bridge the gap between management and computational scientists;
architectures were used to both connect the concepts in a coherent manner and
provide the basic structure to design and implement the computational models; and
examples have been given to demonstrate that the true utility of CRT extends beyond the military to individuals, organizations, and whole of government. The aim of planting the seed of CRT has been fulfilled, but the way ahead is long. This chapter attempts to draw patterns in the sand and lay pheromone trails for ideas that can inspire CRT researchers.
Many areas start as an art, evolve into a science, mature into engineering solutions, and blend into our daily life as a technology. RT is the art. As this book establishes the science of CRT and the first steps toward engineering solutions, it simultaneously reveals many gaps in that science. By nature, when scientists solve a problem, they define many new ones. Nevertheless, the architectures presented in this book provide designs with which engineers can plant the seeds of many technologies that can transform CRT into benefits for society.
The way forward for CRT from a technological perspective is to simply put
aside the philosophy of CRT and focus on the opportunities that the architectures
presented in this book offer. Some of these opportunities are discussed below.
Humans evolved from a single family to isolated groups. As the size of each
group started to increase, connections started to emerge. Travel from one group to
another was the means to transfer knowledge, while the human brain was the only
computational device to create this knowledge on earth.
Academic publishers emerged and became the service bus in the SOA designed
by the social system. Publishers established a database of science services, through
collecting scientific papers with authors and their addresses. Publishers maintained
a subscribers’ database of scientists, libraries, etc., to disseminate knowledge.
The invention of electro-magnetic transmitters, such as telephones and telegraphs, made it possible to share information faster, while ships and aircraft provided the means for publishers to sell whole books across hundreds and thousands of miles.
Up to this point in time, the thinking world was controlled by human minds. The invention of computers, followed by the internet, created tipping points in this thinking sphere. Today, there are machines that can solve complex problems that humans cannot, and an environment of a different nature that connects the overall electro-magnetic spectrum; an environment we today call Cyber space.
The Cyber space has redefined cognition. Today, this Cyber space hosts our minds and brains. We discovered that our brain is itself an electro-magnetic spectrum: signals that carry information are transmitted between neurons that store information. We have also discovered that we need to shift our focus from the physical space to the cognitive and Cyber spaces. Classical physics can offer only limited progress in these new spaces. As the world evolves, our brains will need to blend with the wider Cyber space.
This blending process has seen many research activities over the years; but our
imagination in the past underestimated the reality of today. Classically, researchers
talk about adaptive automation, augmented cognition, human machine integration,
human machine interfaces, brain machine integration, brain machine interfaces,
human machine symbiosis, brain computer interfaces, and the list goes on. But the
words “machine”, “integration” and “interface” do not capture the true complexity
that is emerging around us. Hence, we adopted the terminology CoCyS in this
book to represent this blending process and the fluidity that exists in CoCyS. We
emphasize that CoCyS is not limited to a single human, a single machine, or a single
computer. CoCyS is the new space that blends together information from machines, signals from the environment, brain waves, and behavioral and social attributes of the human [2, 3].
CoCyS is a transformation of the social system. It redefines the social system,
as it reconnects the minds across classical physical boundaries. As the minds
disconnect from some and connect to other minds, hearts follow. CoCyS will
reshape hearts and minds. We may fear science fiction, but if science fiction is imagination, we should not forget that, as humans, our imagination is only the starting point for what we call "innovation". We, as humans, have been very successful throughout the history of mankind in transforming imagination into prototypes, and then into fully functional systems and reality! If we can imagine it, we can design it; if we can design it, we can build it.
CRT is at the heart of CoCyS. Decisions in this new space need to be challenged, not as we have classically done using the human mind alone, but by relying more and more on the Cyber space. We need to test trust by probing for and gathering information, estimating the reliability of information by deploying powerful data-mining techniques, and acting fast and right.
6.1 Where Can We Go from Here?
Many research opportunities exist in CoCyS. For example, we need to understand what new forms of thinking exist in this new space, the relationship between information and decision, how human brains immerse into the Cyber space, how cognition and cyber blend together, and how to analyze the fluidity of CoCyS. This is a random sample of ideas; ideas more relevant to the topic of this book are discussed in the next section.
CoCyS will evolve and grow with or without our will. In this complex environment, we need a computer agent that can watch our back and augment our limited cognitive capacity, so that we can manage a level of complexity that our brain alone cannot comprehend. The Shadow CRT Machine does this. Today, all our communications live in the Cyber space, from emails and telephones to GPS trails. We have our iPads, smart phones, smart glasses, smart watches, and smart houses, to name a few. While CoCyS is the space where cognition and Cyber blend, the Shadow CRT Machine is our dedicated computer agent that focuses only on "us". It knows our objectives, monitors the environment, challenges the data it receives and the context that comes with these data, assesses risk, and shares its opinion with us.
To support the development of the Shadow CRT Machine, we need data-mining tools that can analyze big data seamlessly, discover noise in the information, reveal deceptive information, and go beyond simple models of reliability to complex models of trust.
The Shadow CRT Machine calls for more research into simulation, as it needs the simulation environment to be the platform that transforms and communicates our understanding of how systems work to the computational, and then the Cyber, environment. Simulation starts as the computational form of human understanding, but it can evolve autonomously by refining this understanding with the abundant information available in the Cyber space.
The Shadow CRT Machine demands more research into optimization theory, though not into simple optimization models that assume mathematical optimality is more important than the assumptions of a problem. The Shadow CRT Machine requires optimization solvers that can handle many objectives, noise, changing environments, and high levels of nonlinearity, and it necessitates the design of solvers that are efficient and fast and that provide lifelong optimization capabilities.
CRT is a rich area for the application of behavioral mining techniques. New data
mining techniques are needed to understand the dynamics of the interaction between
blue and red in the simulator, to autonomously extract patterns from agents in the
simulation to better understand how the agents act, and to autonomously infer intent
information from group behavior. This last point opens many possibilities in the use
of network mining for CRT.
Departing from the computer-based environment and turning to the real CRT exercise, the humans in the exercise generate big data of their own. Automated tools to analyze the exercise offer many different types of problems for data mining. For example, discussions in the exercise can be captured through voice recordings, drawings on the wall, and exchanges of unstructured text. Each of these data-capture mechanisms generates a different type of data, which needs to be mined using methods such as speech analysis, conversation analysis, graph mining, process mining, and text mining.
In summary, CRT offers many optimization and data mining opportunities for
the computational intelligence literature. It generates all sorts of data, and requires
very efficient tools to transform these data into meaningful information to describe,
explain and understand lessons learnt during the exercise.
This book has demonstrated many examples showing that CRT can be applied to every situation where a decision is needed. CRT is a rational approach to decision making, whereby alternative courses of action are challenged, risks are assessed, and, in light of the analysis, decisions are made. Therefore, CRT can be applied everywhere.
However, it is important to weigh the cost against the benefits. The implementation
of CRT systems requires very skilled individuals to blend elements of the system
together.
One can say that the cost of any decision is high, but in CRT a big chunk of this cost must be paid upfront. As an example, take a simple decision like eating: if we choose randomly what to eat, the cost of each individual decision is small. However, over the many times we make this random decision, the cost can accumulate to the point that our lives become the cost. Alternatively, we can pay a large cost upfront by analyzing our body's needs and developing a healthy dietary program. This large upfront cost can reduce the larger cost that we might otherwise pay over time without noticing. CRT systems save cost by bringing some of the cost and risk forward in time, thus providing a hedging strategy against high uncertainty in the future.
Many application domains can benefit from CRT. For example, CRT has been used extensively in Cyber security, including evaluations of web services, authentication protocols, and computer-network security. However, the Cyber space extends beyond computer networks, as we have argued in previous sections. There is an urgent need for CRT systems for Cyber security, and given the computational nature of CRT, it is more feasible and efficient than relying on humans in the Cyber space.
Many large organizations can rely on CRT to connect day-to-day decisions all the way up to the formation of their strategic plans. By ensuring that every tiny decision is linked to the organizational vision, and that negative risks are preempted, a Shadow CRT Machine becomes a proactive firewall for all decisions made in an organization.
Individuals and small businesses can apply the science with pencil and paper. Individuals can internalize the science of CRT and rely on their brains, the best computational machine yet created, to do CRT.
The identification of novel applications for CRT is not a difficult task. However,
once a domain of application has been chosen, research and effort need to be
directed to the design of such systems and the identification of proper models to
support a CRT system.
References
1. Abbass, H., Bender, A., Gaidow, S., Whitbread, P.: Computational red teaming: past, present
and future. IEEE Comput. Intell. Mag. 6(1), 30–42 (2011)
2. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: Augmented cognition using real-time
EEG-based adaptive strategies for air traffic control. In: International Annual Meeting of the
Human Factors and Ergonomics Society. HFES, SAGE (2014)
3. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: The computational air traffic control
brain: computational red teaming and big data for real-time seamless brain-traffic integration.
J. Air Traffic Control 56(2), 10–17 (2014)
4. Ilachinski, A.: Enhanced ISAAC neural simulation toolkit (EINSTein): an artificial-life labo-
ratory for exploring self-organized emergence in land combat (U). Center for Naval Analyses,
Beta-Test Users Guide 1101, no. 610.10 (1999)
5. Yang, A., Abbass, H.A., Sarker, R.: Evolving agents for network centric warfare. In: Proceedings
of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 193–195. ACM,
New York (2005)
6. Yang, A., Abbass, H.A., Sarker, R.: Landscape dynamics in multi–agent simulation combat
systems. In: AI 2004: Advances in Artificial Intelligence, pp. 39–50. Springer, New York (2005)
Index
A
Action
  Deliberate Action, 68
  Intentional Action, 50

B
Behavior, 70, 77
Bias, 8, 17
Big Data, 137
Blue-red Simulation, 37

C
Capability, 172
Challenge, 6, 89
  Challenge Analytics, 86, 100
  Deliberate Challenge, 6
Competency, 73
  Comparative Competency, 76
Conflict, 10
  Military, 10
Cyber Operations, 178
Cyber Security, 15, 176, 178
Cyber Space, 178

D
Data Mining, 129
  C4.5, 134
Deliberate Action, 69

E
Effect, 58, 172
Embodiment, 12
Environment, 70
Experiment, 116
Experimentation, 114

F
Fundamental Inputs to Capabilities, 171

G
Goals, 54

H
Hypothesis, 115

I
Imitation Game, 40

M
Mission, 172
Motivation, 87

N
Network, 174
  Effects, 172
Network Operations, 178
  Deny Operation, 181
  Destroy Operation, 183
  Detect Operation, 179
  Hide Operation, 184
  Identify Operation, 180