Computational Red Teaming: Hussein A. Abbass

Hussein A.
Abbass
Computational
Red Teaming
Risk Analytics of Big-Data-to-Decisions
Intelligent Systems
Computational Red Teaming
Hussein A. Abbass
Computational Red Teaming

Risk Analytics of Big-Data-to-Decisions
Intelligent Systems
123
Hussein A. Abbass
School of Engineering and Information Technology
University of New South Wales Australia
Canberra, ACT, Australia
ISBN 978-3-319-08280-6 ISBN 978-3-319-08281-3 (eBook)

DOI 10.1007/978-3-319-08281-3
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2014951509
© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

To God, for creating our minds as the best
computational devices to red team
To my two sons, life partner, dad and mom,
brothers and sisters, teachers, students,
colleagues, and those angels who believe in
me, thank you for being the best social reality
to shape my mind
Preface
This book is about elite types of thought processes and architectures for big data and
modeling that can enable smart and real-time decisions. Today’s world is abundant
with data and models; many new problems are formulated and solved everyday;
many artificial-intelligence, mathematical, and statistical models exist, but there
is a lack of scholarly work to demonstrate how to bring these data, models, and
opportunities together to produce value for organizations. This book does exactly
that and is written in a style designed to bridge management and computational
scientists.
This is a book about Computational Red Teaming (CRT): a computational
machine that can shadow the operations of any system. The Shadow CRT Machine
can think together with, or on behalf of, the system by asking “what–if” questions,
assessing threats and risks, challenging the system, environment, and competitors,
and using its well-engineered predictive models and computational thinking tools to
make the right decision at the right time.
Red Teaming (RT) is traditionally a decision-aiding art used by the military to
role play an adversary, play the devil’s advocate against one’s own concepts, plans,
strategies, or systems to “test and evaluate” them to improve decision making. This
book has been written to distill general principles from RT, and generalize and
transform RT, the art, into CRT, the science. The discussion will depart from the
military context to demonstrate the utility and applicability of CRT to individuals
and organizations. CRT transforms the classical “test-and-evaluation” process to a
continuous and proactive “test-and-redesign” process.
CRT means systemic and scientific RT. The word “computational” emphasizes
the necessity for systemic and computable steps that can be executed by humans
and computers alike, and allows for an evidence-based decision-making process
that can be traced to causes. Many tools discussed in this book can be employed
by using pencil and paper, and can equally be scaled up to big data and big models
that exceed human cognitive processing and classical computer abilities. With the
advances that have been made in fields such as computational intelligence, data
analytics, optimization, simulation, systems thinking, and computational sciences,
today, we have the tools to implement CRT in silico.
vii
viii Preface
Analytics is the science for transforming data to decisions. CRT uses risk
analytics, where risk is the focal point of the decision-making process, and challenge
analytics, where actions and counteractions are designed just across the system
performance boundary, to test and redesign the right decisions for an organization.
CRT creates opportunities for individuals, organizations, and governments by
grounding RT in system and decision sciences, and by identifying the architectures
required to transform data into decisions.
Risk analytics and challenge analytics, jointly, create the CRT world of this
book. The part of the world that treats risk analytics examines what risk is, and
demonstrates how evidence-based decisions must always be driven by risk thinking.
The part of the world treating challenge analytics structures the concept of what a
challenge is, discusses how to systematically and autonomously design and discover
challenges, and how to challenge an individual, organization, or even a computer
algorithm.
Over six chapters, CRT will be presented. Chapter 1 brings the reader inside
the classical world of RT. It explains the philosophy of this art, and presents a
story to demonstrate that the art of RT can benefit each individual, not only large
organizations. The steps for implementing an RT exercise are explained, and the
characteristics of a successful RT exercise and the ethics of RT are discussed.
The book then sweeps into the two building blocks of risk analytics and challenge
analytics that form the scientific principles for CRT, the science. Chapter 2 uses
a systems approach to establish the basis for risk thinking and challenge design.
Materials in the chapter cross the boundaries of uncertainty and risk, intentional
and deliberate actions, and deliberate challenges to the systems approach, skills and
competency to shape and influence performance.
Chapter 3 presents the big-data-to-decisions CRT. The chapter introduces and
brings together the architectures and building blocks used to design and develop
the computational environment that supports CRT. This chapter presents a gentle
introduction to experimentation, optimization, simulation, data mining, and big data
before presenting how these technologies need to blend to offer CRT architectures.
The CRT science relies on efficient tools to understand the future, and allows an
effective understanding of how to analyze “messy spaces,” as well as discover the
right methods to deconstruct complex organizations and the intermingled physical,
cyber, cognitive, and social domains (PC2SD). Beginning by offering scenarios
to prompt thoughts about the future and concluding with control mechanisms for
networks and generation of effects, Chap. 4 complements the computational tools
presented in Chap. 3 with the necessary system-thinking ingredients to transform
computational models into effective strategic tools. This chapter discusses planning
scenarios, and the complexity arising from the interaction of effects in the PC2SD.
It presents two models to manage this complexity: a model to transform complex
organizations into simple building blocks for analysis, and a model discussing the
operations required to analyze and generate effects in complex networked systems,
that form the basis for a thinking model suitable to design and form cyber-security
operations and complex social-engineering strategies.
Preface ix
Chapter 5 complements the materials by presenting three case studies of

increasing complexity. These are adopted from the author’s research. The purpose of
these case studies is to provide examples to guide the reader in adopting the lessons
gleaned from this book. The cases are discussed in plain language and provide an
overview of the logic beneath each case: what was done and how. The first case study
demonstrates the use of CRT to challenge and evaluate well-designed algorithms for
aircraft-conflict detection. The second case study presents a game used to challenge
a human. The third case study presents a large-scale experiment that combines many
elements of this book. This experiment is aimed at designing a Cognitive-Cyber
Symbiosis (CoCyS; pronounced “cookies”) environment for air-traffic controllers
in safety-critical domains. CoCyS is introduced, in addition to the CRT exercise.
The last chapter (Chap. 6) concludes the book and offers some reflections and
ideas for future work on CRT.
The reader is encouraged to read the book in chapter order. Regardless of whether
the readers find the information too easy or too difficult, they are advised to keep
reading to the end. When the last chapter is reached, the ideas will be connected
despite some concepts may still seem to be confusing. However, by the end of the
book, the reader is expected to know what CRT is all about to the extent that the
book can be read again (and again) to digest some of the most difficult concepts
encountered.
Last, but not least, the author is eager to hear comments, good or otherwise,
from the readers. Please forward any comments to the author’s email address at
hussein.abbass@gmail.com. The author asks the reader to red team this book; only
then will the reader’s own journey toward CRT begin.
Canberra, ACT, Australia and Singapore, Hussein A. Abbass

August 2014
Acknowledgments
We receive limited education through schools and universities. The unlimited

education we obtain in the rest of our lives is through personal reading and
experiences. My personal library at work and home is full of books, only several
of whose titles I recall. These are written by the authors who truly influenced me in
writing this book. I may not refer to their individual work in the text, but I hereby
acknowledge their contribution in shaping my thinking to write this book. These
authors are Robert M. Axelrod, Paul K. Davis, Richard Dawkins, Jacques Ellul, John
M. Flach, David B. Fogel, Thomas F. Gilbert, David E. Goldberg, John H. Holland,
Andrew Ilachinski, John R. Koza, Christopher G. Langton, S. Lily Mendoza, Martin
A. Nowak, William B. Rouse, and Xin Yao.
My experience is constructed through my interaction with many people who
shaped my mind and motivated me to think and reflect. I thank the following
people. My dad and elder sister, whose discussions in front of me when I was a
kid about their own experiences taught me my first lessons on strategy. My life
partner, Dr Eleni Petraki, who taught me about sociolinguistics and intercultural
communication. My first mentor, Prof. Mohamed Rasmy, who remained ahead of
his time until the last moment of his life. The one and only, Dr Axel Bender, who
challenged my thinking and listened to me talking when no one else understood
what I was talking about. The risk guru, Dr Svetoslav Gaidow, who taught me the
importance of risk standards.
Mr Mohamed Ellejmi and Mr Stephen Kirby, who were the best project managers
with whom anyone could hope to work during my project on CRT to evaluate future
concepts of air-traffic control with Eurocontrol. Prof. Akira Namatame, who took
the risk and invited me to teach the first course on CRT at the National Defense
Academy of Japan. I wish I had this book written before teaching this course; but
then, the experience from teaching the course made me realize how I should have
taught it, and how I should organize this book.
My colleagues, Dr Michael G. Barlow, Dr Chris J. Lokan, Dr Robert I. McKay,
Dr Kathryn E. Merrick, and Dr Ruhul A. Sarker, who shared with me the journey of
supervising many students on topics related to this book, and who were supportive
of my style of supervision while teaching red teaming to my students. The following
xi
xii Acknowledgments
scientists worked with me on some aspects of CRT: Dr Sameer Alam, Dr Sondoss

El-Sawah, Dr Lam Bui, Dr George Leu, Dr Kamran Shafi, Dr Jiangjun Tang, Dr
Kun Wang, Dr Shir Li Wang, Dr Ang Yang, and Dr Wenjing Zhao.
Writing this book has been a goal of mine for many years. It was only possible
due to two well-synchronized events: being awarded an Australian Research
Council (ARC) Discovery Grant on CRT, and the generosity of the University of
New South Wales—Australia in allowing me to have a complete sabbatical year in
2014. I would like to thank the National University of Singapore, my host A/Prof.
Dipti Srinivasan, and my friend A/Prof. Kay Chen Tan for hosting me during my
sabbatical and offering me my dream office with a view of the ocean.
I also acknowledge the organizations that have supported me financially to
conduct the pure research on topics that culminated in this book. I am obliged to
them for entrusting me with the fund. These are the ARC, the Defence Science and
Technology Organisation (DSTO), and Eurocontrol.
Mr. Charles Glaser and Ms. Jessica Lauffer from Springer, the publisher of this
book, have been very patient and understanding during the journey of this book. I
acknowledge their professionalism in supporting me to complete this book project.
The views expressed in this book are all mine and do not reflect the views,
opinions, or position of any agency that funded my research.
Contents
1 The Art of Red Teaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1

1.1 A Little Story .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1
1.2 Red Teaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 3
1.2.1 Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6
1.2.2 Executing Exercises . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6
1.2.3 Deliberately Challenging .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6
1.2.4 Risk Lens .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8
1.2.5 Understanding the Space of Possibilities . .. . . . . . . . . . . . . . . . . . . . 8
1.2.6 Exploring Non-conventional Behaviors . . .. . . . . . . . . . . . . . . . . . . . 8
1.2.7 Testing Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9
1.2.8 Mitigating Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10
1.3 Success Factors of Red Teams. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10
1.3.1 Understanding and Analyzing the Concept of a Conflict . . . . . 10
1.3.2 Team Membership . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12
1.3.3 Time for Learning, Embodiment and Situatedness.. . . . . . . . . . . 12
1.3.4 Seriousness and Commitment .. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13
1.3.5 Role Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 14
1.3.6 Reciprocal Interaction . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 14
1.4 Functions of Red Teaming .. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 15
1.4.1 Discovering Vulnerabilities . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 15
1.4.2 Discovering Opportunities . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 15
1.4.3 Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 16
1.4.4 Thinking Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 16
1.4.5 Bias Discovery .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 17
1.4.6 Creating Future Memories and Contingency Plans . . . . . . . . . . . 18
1.4.7 Memory Washing .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 18
1.5 Steps for Setting Up RT Exercises . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 19
1.5.1 Setting the Purpose, Scope and Criteria of Success . . . . . . . . . . . 19
1.5.2 Designing the Exercise . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 20
1.5.3 Conducting the Exercise .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 23
1.5.4 Monitoring and Real-Time Analysis of the Exercise . . . . . . . . . 24
xiii
xiv Contents
1.5.5 Post Analysis of the Exercise . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 24

1.5.6 Documenting the Exercise . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 25
1.5.7 Documenting Lessons Learnt on Red Teaming . . . . . . . . . . . . . . . 25
1.6 Ethics and Legal Dimensions of RT. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 26
1.6.1 The RT Business Case . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 26
1.6.2 Responsible Accountability . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 26
1.6.3 The Ethics of Budget Estimation . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 31
1.7 From Red Teaming to Computational Red Teaming . . . . . . . . . . . . . . . . . . 32
1.7.1 Military Decision Sciences and Red Teaming .. . . . . . . . . . . . . . . . 32
1.7.2 Smoothing the Way Toward Computational
Red Teaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 33
1.7.3 Automating the Red-Teaming Exercise.. . .. . . . . . . . . . . . . . . . . . . . 36
1.7.4 Blue-Red Simulation . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 37
1.8 Philosophical Reflection on Assessing Intelligence . . . . . . . . . . . . . . . . . . . 40
1.8.1 The Imitation Game (Turing Test) for Assessing
Intelligence .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 40
1.8.2 Computational Red Teaming for Assessing Intelligence . . . . . 42
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 43
2 Analytics of Risk and Challenge . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 47
2.1 Precautions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 47
2.2 Risk Analytics .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 49
2.2.1 Intentional Actions.. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 50
2.2.2 Objectives and Goals . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 52
2.2.3 Systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 58
2.2.4 Uncertainty and Risk . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 61
2.2.5 Deliberate Actions . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 68
2.3 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 70
2.3.1 Behavior .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 70
2.3.2 Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 71
2.3.3 Competency .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 73
2.3.4 From Gilbert’s Model of Performance to a
General Theory of Performance.. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 79
2.4 Challenge Analytics.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 86
2.4.1 A Challenge is Not a Challenge .. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 86
2.4.2 Motivation and Stimulation . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 87
2.4.3 Towards Simple Understanding of a Challenge . . . . . . . . . . . . . . . 89
2.4.4 Challenging Technologies, Concepts and Plans . . . . . . . . . . . . . . . 94
2.5 From the Analytics of Risk and Challenge
to Computational Red Teaming . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 96
2.5.1 From Sensors to Effectors .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 96
2.5.2 The Cornerstones of Computational-Red-Teaming . . . . . . . . . . . 99
2.5.3 Risk Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 100
2.5.4 Challenge Analytics Using the Observe-
Project-Counteract Architecture . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 100
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 103
Contents xv
3 Big-Data-to-Decisions Red Teaming Systems . . . . . . . .. . . . . . . . . . . . . . . . . . . . 105

3.1 Basic Ingredients of Computations in Red Teaming . . . . . . . . . . . . . . . . . . 105
3.1.1 From Classical Problem Solving
to Computational-Red-Teaming.. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 105
3.1.2 Run Through a CRT Example.. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 107
3.2 Experimentation .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 114
3.2.1 Purpose, Questions, and Hypotheses . . . . . .. . . . . . . . . . . . . . . . . . . . 114
3.2.2 Experiments .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 116
3.3 Search and Optimization.. . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 118
3.3.1 Blind vs Knowledge-Based Optimization .. . . . . . . . . . . . . . . . . . . . 122
3.3.2 System vs Negotiation-Based Optimization .. . . . . . . . . . . . . . . . . . 123
3.4 Simulation .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 125
3.4.1 Resolution, Abstraction and Fidelity . . . . . .. . . . . . . . . . . . . . . . . . . . 127
3.5 Data Analysis and Mining . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 129
3.5.1 C4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 134
3.6 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 137
3.6.1 The 6 V’s Big Data Characteristics . . . . . . . .. . . . . . . . . . . . . . . . . . . . 138
3.6.2 Architectures for Big Data Storage . . . . . . . .. . . . . . . . . . . . . . . . . . . . 140
3.6.3 Real-Time Operations: What It Is All About .. . . . . . . . . . . . . . . . . 141
3.6.4 GDL Data Fusion Architecture . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 141
3.7 Big-Data-to-Decisions Computational-Red-Teaming-Systems .. . . . . . 142
3.7.1 Preliminary Forms of Computational-Red-
Teaming-Systems . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 142
3.7.2 Progressive Development of Sophisticated
Computational-Red-Teaming-Systems . . . .. . . . . . . . . . . . . . . . . . . . 146
3.7.3 Advanced Forms of Computational-Red-
Teaming-Systems . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 147
3.7.4 The Shadow CRT Machine .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 154
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 157
4 Thinking Tools for Computational Red Teaming . . . .. . . . . . . . . . . . . . . . . . . . 159
4.1 Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 159
4.1.1 Possibility vs Plausibility . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 159
4.1.2 Classical Scenario Design .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 162
4.1.3 Scenario Design in CRT . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 164
4.2 A Model to Deconstruct Complex Systems. . . . . . . .. . . . . . . . . . . . . . . . . . . . 168
4.2.1 Connecting the Organization .. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 169
4.2.2 Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 169
4.2.3 Fundamental Inputs to Capabilities . . . . . . . .. . . . . . . . . . . . . . . . . . . . 171
4.2.4 Capabilities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 172
4.2.5 Vision, Mission and Values. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 172
4.2.6 Strategy .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 173
4.3 Network-Based Strategies for Social and Cyber-Security
Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 174
4.3.1 Socio-Cognitive-Cyber-Physical Effect Space . . . . . . . . . . . . . . . . 174
xvi Contents
4.3.2 Cyber Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 176

4.3.3 Operations on Networks .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 178
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 185
5 Case Studies on Computational Red Teaming . . . . . . .. . . . . . . . . . . . . . . . . . . . 187
5.1 Breaking Up Air Traffic Conflict Detection Algorithms . . . . . . . . . . . . . . 187
5.1.1 Motivation and Problem Definition .. . . . . . .. . . . . . . . . . . . . . . . . . . . 187
5.1.2 The Purpose .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 188
5.1.3 The Simulator .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 189
5.1.4 The Challenger .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 189
5.1.5 Context Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 190
5.1.6 Context Miner .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 192
5.1.7 The Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 194
5.2 Human Behaviors and Strategies in Blue–Red Simulations . . . . . . . . . . 194
5.2.2 The Purpose .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 196
5.2.3 The Simulator .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 196
5.2.4 The Challenger .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 196
5.2.5 Behavioral Miner . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 197
5.2.6 The Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 198
5.3 Cognitive-Cyber Symbiosis (CoCyS): Dancing
with Air Traffic Complexity . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 198
5.3.2 The Purpose .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 201
5.3.3 Experimental Logic .. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 202
5.3.4 The Simulator .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 204
5.3.5 The Miner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 204
5.3.6 The Optimizer.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 205
5.3.7 The Challenger .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 205
5.3.8 Experimental Protocol .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 206
5.3.9 The Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 208
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 209
6 The Way Forward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 211
6.1 Where Can We Go from Here? . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 211
6.1.1 Future Work on Cognitive-Cyber-Symbiosis .. . . . . . . . . . . . . . . . . 211
6.1.2 Future Work on the Shadow CRT Machine . . . . . . . . . . . . . . . . . . . 213
6.1.3 Computational Intelligence Techniques
for Computational Red Teaming .. . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 214
6.1.4 Applications of Computational Red Teaming . . . . . . . . . . . . . . . . . 215
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 216
Index . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 217
Acronyms
ACE Automated co-evolution

AI Artificial intelligence
ARC Australian Research Council
ART Automated RT
ATC Air-traffic control
ATCO Air-traffic controller
ATM Air-traffic management
ATOMS Air traffic operations and management simulator
CART Classification and regression tree
CAS Complex adaptive systems
CASTFOREM Combined arms and support task force evaluation model
CEO Chief executive officer
CHAID Chi-square automatic interaction detection
CoCyS Cognitive-cyber symbiosis
CPA Closest point of approach
CROCADILE Comprehensive research-oriented combat agent distillation
implemented in the littoral environment
CRT Computational red teaming
DoD Department of defense
DSA Distributed simulation architecture
DSTO Defence Science and Technology Organisation
EEG Electroencephalographic
EINSTein Enhanced ISAAC Neural Simulation Toolkit
EQ Emotional quotient
FAR Field anomaly relaxation
GA Genetic algorithm
GPS Global positioning system
HDFS Hadoop distributed file system
HLA High-level architecture
ID3 Iterative Dichotomiser 3
xvii
xviii Acronyms
IG Imitation game
IQ Intelligence quotient
ISAAC Irreducible semi-autonomous adaptive combat
ISO International standard organization
IT Information technology
JANUS Not an acronym
JDL Joint directors of laboratories
jSWAT Joint seminar wargame adjudication tool
KDD Knowledge discovery in databases
LMDT Linear machine decision trees
LSL Lanchester square law
M2T Model to think
MANA Map aware nonuniform automata
MAP Military Appreciation Process
ModSAF Modular semi-automated forces
NID Network-intrusion detection
NSGA Non-dominated sorting genetic algorithm
OneSAF One Semi-Automated Forces
OPFOR Opposing Force
OR Operations Research
PAX Plausible agents matrix
PAX3D PAX in three dimensions
PESTE Political, environmental, social, technological, and economic
QUEST Quick, Unbiased, Efficient, Statistical Tree
RAA Risk analytics architecture
R&D Research and development
RT Red teaming
RT-A RT auditor
RT-C RT communicator
RT-D RT designer
RT-Doc RT documenter
RT-LC RT legal councilor
RT-O RT observer
RT-S RT stakeholder
RT-T RT thinker
RT-Tech RT technician
SC2PD Social, cognitive, cyber and physical domains
SDA Sense, decide, act
SLA Sense, learn, act
SLIQ Supervised learning in quest
SOA Service-oriented architecture
SPRINT Scalable classifier for data mining
Acronyms xix
SWOT Strength, Weaknesses, Opportunities, and Threats

T2M Think to model
UAV Unmanned aerial vehicles
USA United States of America
WISDOM Warfare intelligent system for dynamic optimization of missions
XOR Exclusive OR
List of Figures
Fig. 1.1 Connecting relationships among agents to high level reasoning . . . 39

Fig. 1.2 Reasoning in nonlinear dynamics using networks and
time series analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 39
Fig. 2.1 Risk analytics steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 49
Fig. 2.2 Understanding agents’ actions and their relationships
to the environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 51
Fig. 2.3 Blue and red objective spaces and their correlations. A
solid arrow/line indicates positive correlation; a dotted
arrow/line indicates negative correlation . . . . . . . .. . . . . . . . . . . . . . . . . . . . 53
Fig. 2.4 Differentiating interteam and intrateam conflicting objectives . . . . . 56
Fig. 2.5 The nested nature of red teaming . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 57
Fig. 2.6 The role of uncertainty in an agent’s decision making cycle . . . . . . . 62
Fig. 2.7 Building blocks of hazards and threats . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 64
Fig. 2.8 A diagram connecting different concepts related to risk .. . . . . . . . . . . 66
Fig. 2.9 Role of uncertainties and objectives in different environments.. . . . 67
Fig. 2.10 Deconstruction of behavior.. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 78
Fig. 2.11 An architecture of agent-environment interaction model . . . . . . . . . . . 83
Fig. 2.12 Abstract categorization of the role of computational red teaming .. 86
Fig. 2.13 Synthesizing Ellul and Mendoza opinions on a challenge .. . . . . . . . . 91
Fig. 2.14 A conceptual diagram of the concept of challenge .. . . . . . . . . . . . . . . . . 92
Fig. 2.18 Transforming sensorial information, from sensors, to
effects, through effectors, cycle . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 96
Fig. 2.19 Cognitive–cyber–symbiosis of the CRT-based
sensors-to-effectors architecture . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 97
Fig. 2.20 Service-oriented architecture for computational red teaming . . . . . . 98
Fig. 2.21 The cornerstones of computational red teaming . . . . . . . . . . . . . . . . . . . . 99
Fig. 3.1 Problem solving schools of thinking .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 106
xxi
xxii List of Figures
Fig. 3.2 Fitness landscape of Ramada’s loyalty in response to

Bagaga’s financial aid . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 109
Fig. 3.3 Solutions encountered during the optimization process
to construct the fitness landscape . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 110
Fig. 3.4 The relationship between fitness landscape and causal space . . . . . . 111
Fig. 3.5 Classification of causal space . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 111
Fig. 3.6 Challenging the causal space . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 112
Fig. 3.7 Second layer causal space . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 113
Fig. 3.8 Multi-agent system for negotiating the optimization of
interdependent problems . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 125
Fig. 3.9 Relationship between resolution, abstraction and fidelity . . . . . . . . . . 127
Fig. 3.10 The IF : : : Then : : : rule in a tree form.. . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 131
Fig. 3.11 A decision tree for continuous variables .. . . . . . . .. . . . . . . . . . . . . . . . . . . . 132
Fig. 3.12 Outer and inner classification . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 134
Fig. 3.13 CRT0: baseline preliminary computational red reaming level.. . . . . 143
Fig. 3.14 CRT1: level one preliminary computational red reaming system . . 143
Fig. 3.15 CRT2: level two preliminary computational red reaming system . . 144
Fig. 3.16 CRT3: level three preliminary computational red
reaming system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 144
Fig. 3.17 CRT4: level four preliminary computational red
reaming system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 145
Fig. 3.18 Stage one of the risk analytics architecture .. . . . .. . . . . . . . . . . . . . . . . . . . 147
Fig. 3.19 Stage two of the risk analytics architecture.. . . . .. . . . . . . . . . . . . . . . . . . . 149
Fig. 3.20 Stage three of the risk analytics architecture . . . .. . . . . . . . . . . . . . . . . . . . 150
Fig. 3.21 Stage four of the risk analytics architecture . . . . .. . . . . . . . . . . . . . . . . . . . 151
Fig. 3.22 Stage five of the risk analytics architecture .. . . . .. . . . . . . . . . . . . . . . . . . . 152
Fig. 3.23 Stage six of the risk analytics architecture.. . . . . .. . . . . . . . . . . . . . . . . . . . 153
Fig. 3.24 The shadow computational red teaming machine .. . . . . . . . . . . . . . . . . . 156
Fig. 3.25 Cognitive-Cyber-Symbiosis for Computational Red Teaming . . . . . 156
Fig. 4.1 Building blocks of a scenario in CRT. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 165
Fig. 4.2 Building block form 1 of a scenario in CRT. . . . .. . . . . . . . . . . . . . . . . . . . 167
Fig. 4.3 Building block form 2 of a scenario in CRT. . . . .. . . . . . . . . . . . . . . . . . . . 167
Fig. 4.4 Schematic diagram displaying how the blue and red
teams are connected strategically as a system . . .. . . . . . . . . . . . . . . . . . . . 170
Fig. 4.5 The interdependent nature of the effect space . . .. . . . . . . . . . . . . . . . . . . . 175
Fig. 4.6 An outline of the building blocks for the Cyber space .. . . . . . . . . . . . . 177
Fig. 4.7 Different operations on networks to achieve an effect . . . . . . . . . . . . . . 179
Fig. 5.1 Signature extraction classifier . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 193
Fig. 5.2 A pictorial representation of the cognitive balance
required for users in safety critical jobs . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 201
Fig. 5.3 Brain traffic interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 202
Fig. 5.4 Brain traffic interface loop . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 203
Fig. 5.5 Protocol for each ATCO/subject tested during the exercise . . . . . . . . 207
Fig. 5.6 Protocol for each of the 16 sessions conducted during
the exercise .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 207
List of Tables
Table 1.1 Categorization of alternative analysis methods as

suggested by Matherly [12] .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 34
Table 2.1 An example for mapping Gilbert’s model to a scientist job . . . . . . . . 80
Table 3.1 Data fusion functions hierarchy.. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 142
Table 3.2 Evolution of computational red teaming capabilities .. . . . . . . . . . . . . . 146
xxiii
Chapter 1
The Art of Red Teaming
The commander must work in a medium which his eyes cannot see, which
his best deductive powers cannot always fathom; and with which, because of
constant changes, he can rarely become familiar.
Carl von Clausewitz (1780–1831) [49]
Abstract Red Teaming (RT) has been considered the art of ethical attacks. In RT,
an organization attempts to role play an attack on itself to evaluate the resilience of
its assets, concepts, plans, and even organizational culture. While historically, RT
has been considered a tool by the military to evaluate its own plans, this chapter will
remove RT from the military context and take steps to generalize it as an art before
discussing it in later chapters as a science. This chapter will first introduce the basic
concept of RT, will discuss the characteristics of what makes a successful red team,
and present a set of systemic steps to design a RT exercise. The topic necessitates
a detailed discussion on the ethics of RT, including the ethical issues to consider
when planning the budget and financial commitments of the exercise. To lay the
foundation for transforming RT to the computational world, this chapter concludes
with an explanation of why RT exercises cannot be fully automated, followed by a
discussion on how RT contributes to the field of artificial intelligence.
1.1 A Little Story
John has an interview for his dream job. He has spent his life dreaming of becoming
a branch manager in a bank. Finally, his dream is close to becoming a reality. He
does not want to risk making a mistake during the interview. However, this is his
first interview. He does not know the questions he will be asked, or whether he is
ready for a surprise question.
Martin is John’s best friend. John talked to Martin about his feelings. Martin
suggested a great idea: “How about we do a mockup interview together: I will act
© Springer International Publishing Switzerland 2015 1

H.A. Abbass, Computational Red Teaming, DOI 10.1007/978-3-319-08281-3__1
2 1 The Art of Red Teaming
as the interview panel you will face tomorrow; you try to think on your feet and
answer my questions.” John liked the idea.
John said to Martin, “If you really want this exercise to be effective, you need
to ask me difficult questions. Ask me questions you know I may not know how to
answer. Do not be nice to me: the harder you are on me today, the better prepared
I will be for tomorrow. Push me beyond my limit, and break me up today so that I
can stand strong tomorrow.”
Martin began asking John questions, some of them John knew would never be
asked in his interview. Martin knew very little about the job of a branch manager in
a bank. His knowledge and imagination of the questions he should ask John were
limited by his experience. He had never been in a bank environment, and so did not
know the issues a branch manager faces.
Martin suggested to John that they invite their friend Amy. She worked as a
branch manager in a bank. Therefore, she must know what it is like to be a branch
manager and the questions to ask. John liked the idea. He invited Amy to join them.
John was torn apart by Amy and Martin’s questions. The questions from Amy
were spot on, and John was surprised by the diversity of challenges a branch
manager in a bank faces. The questions from Martin focused on general personality
and management skills that are vital for the job. They provided John with a set of
challenges he did not anticipate. Amy’s questions triggered questions in Martin’s
mind, and vice versa. At the end of the exercise, John thanked Martin and Amy,
while sweating.
The following day, John went to the interview. At the end of the interview,
he was offered the job. Back at the coffee shop, while celebrating with Amy and
Martin, they both asked John in one voice, “How many questions did we ask that
the interview panel asked as well?” John smiled and replied calmly, “None.”
Amy and Martin were sad. Amy asked, “So we did not help you much! We have
pretty much wasted your time!” John smiled and said, “On the contrary, without
the mockup interview, I would not have secured this job.” Amy and Martin asked
simultaneously, “How so?” John replied, “After the interview, I realized that the
value of the mockup interview we did together was not in predicting the questions
the panel would ask me : : : . It actually prepared me for life. It prepared me to
think on my feet, to manage surprise questions and focus on how to answer them,
rather than being shocked by them. It showed me how to link what I know and
my comfort zone with the unknown questions that I could not anticipate.” John
continued, “Thank you Amy and Martin for being a great red team!”
1.2 Red Teaming 3
1.2 Red Teaming
Know your enemy, but in so doing do not forget to know yourself.

Sun Tzu (544 BC–496 BC) [46]
Today’s world is a great deal more complex than the days of Sun Tzu. During those
old days, the black and white view of the universe was clear: who is the enemy
and who is not. Today, political, environmental, social, technological, and economic
(PESTE) issues are intermingled. Two countries can form a political coalition, while
they compete economically. The country that poses the greatest political threat
can also be the greatest economic supporter. Who is the enemy? Perhaps this is
a question that some people are still able to answer, but the world is not divided
between “us and them:” along the way lie many players who are a critical part of
the game. Therefore, we need to generalize Sun Tzu’s statement to one that is more
appropriate to today’s complex environment.
Know your enemy, your friends, yourself and the environment but in so
doing do not forget to know how to know.
The above statement emphasizes the need to “know how to know.” If we have the
tools in place to know how to know, and use these tools appropriately, we will create
knowledge. Knowledge is power, especially in the age in which a knowledge-based
society is dominant. This book is about an evidence-based approach to know how
to know.
In addition to knowing how to know, we need to decide what we need to know
and about whom. From a competition perspective, possibly the four categories
of entities we need to know well are ourselves, our enemies (or better call them
competitors to leave the sphere of the military), our friends, and the environment. We
need to know ourselves to understand our own strengths and weaknesses; therefore,
we know when to use our strengths and when our weaknesses will expose us to
vulnerabilities. We need to know what we know and what we do not know. Without
such knowledge, we have no way to assess ourselves to be able to understand what
we are capable of and where our blind spots might be hiding.
We need to know our competitors because a hidden enemy is more threatening

than one that is known. We need to know the capabilities of our competitors, their
strengths and weaknesses, their objectives and priorities.
While Sun Tzu emphasized the need to know about the enemy and oneself,
in today’s environment, it is also very important to know about our friends. In a
complex environment, there is a high level of interdependencies in the objectives
of different players (see Sect. 2.2.2). If we only focus on our competitors and
ourselves, we may develop strategies that play against the objectives of our friends,
turning them into competitors. To avoid such a situation, we need to know about
our friends, their capabilities, their weaknesses and strengths, and most importantly,
their objectives so that we do not work against them.
For some, the environment constitutes the space where uncertainties and the
unknowns are residing. In this book, the environment refers to the space where
opportunities are hiding. Learning how to look at, and where to look within, the
environment is a skill that can transform the environment from being a source of
uncertainty into a set of tools to achieve our objectives. The old ancient wars used
winds, rains, terrains, temperature, and the like to turn defeat into victory. Through
proper strategic analysis, organizations use uncertainties in the environment to work
for their benefit.
When the environment is volatile, and uncertainty is high, clever managers know
that the same uncertainty facing them is facing their competitors. Influencing and
shaping the uncertainty in the environment can provide them with the tools needed
to deceive their competitors: this is strategy in action.
In a complex environment, the definition of a problem is in a constant state
of flux. A problem is defined, reshaped, and redefined through intentional and
unintentional negotiation processes among the players. The four dimensions of
analysis (oneself, enemies, friends, and environment) are the dimensions that need
to be analyzed to understand how a problem definition evolves, what aspects
are needed to influence the shaping process of problem definition, and in what
direction we should steer the problem definition so that we do not create negative
consequences for ourselves or our friends.
Red Teaming (RT) is a very effective skill, methodology, tool and exercise for
coming to know how to know. It is an ancient exercise that has been used by
the military for centuries. In its simplest form, it involves two teams: a blue team
representing self, and a red team representing the opponent. When war is planned,
the blue team forms a red team from its own people. The objective of this red team
is to proactively challenge blue’s plan deliberately by discovering vulnerabilities
to exploit it. In its simplest form, RT is the thought process we all go through
to evaluate consequences by asking “what-if” questions and thinking from the
perspective of others. In its complex form, RT is a large-scale experiment, whereby
some actors play the role of “us,” while some play the role of “them,” while others
play the role of the remaining players in the situation.
In 2003, the United States of America (USA) Office of the Under Secretary
of Defense for Acquisition, Technology, and Logistics published an unclassified
document entitled “The Role and Status of DoD Red Teaming Activities” [20].
1.2 Red Teaming 5
In this document, they recommended the instilling of effective RT in the Department

of Defense (DoD). They continued to advocate that the subject of RT should be
made an “intellectual endeavor to be researched and taught.”
In 2008, David Longbine published an unclassified monograph entitled “Red
Teaming: Past and Present” [31]. In that manuscript, he referred to RT as a new
buzzword without a common definition. Nevertheless, putting aside the diverse
definitions of RT, he distilled a common goal for all of these definitions, stating
that they all share a common goal: “improving decision making.”
In 2009, the Canadians published an unclassified manuscript entitled “Red Dawn:
The Emergence of a Red Teaming Capability in the Canadian Forces” [29]. They
defined RT as “an organizational process support activity undertaken by a flexible,
adaptable, independent, and expert team that aims to create a collaborative learning
relationship by challenging assumptions, concepts, plans, operations, organizations,
and capabilities through the eyes of adversaries in the context of a complex security
environment.”
In 2011 and 2012, the Defence Science and Technology Organisation (DSTO) of
Australia published two unclassified technical reports with the same title (“Moving
Forward with Computational Red Teaming”) but with a different author in each
year [21, 50]. In the 2012 report, they defined RT as “the practice of critical
analysis, through means of challenge and contest of argument, as conducted by an
independent party, in the study of reasoning, and for the purposes of improving
decision-making processes.”
The author of this book has published on the topic for a decade, also providing
different definitions of RT. While the definitions presented above may seem to
be different to each other at first, it is important to know that they have more
commonalities than differences. They vary primarily because of how RT is framed
in different countries and situations. While most people working with RT know deep
down what RT is, the socio-political environment shapes the definitions differently.
In some places, we will find RT defined in a narrow and strict manner, with
the red team being asked to be autonomous and even aggressive. In other places,
this attitude may not be appropriate because of legal or cultural challenges. Instead,
softer views of RT exist to the extent that RT may be confused with other forms of
analysis.
In this book, we will present a definition of RT that has evolved over the years to
become a comprehensive definition. In this definition, RT is defined without being
restricted to a military or government use.
Definition 1.1. RT is a structured approach for modeling and executing exercises
by deliberately challenging the competitive reciprocal interaction among two or
more players and monitoring how concepts, plans, technologies, human behavior,
or any events related to the system or situation under study is unfolding from a
risk lens, with the objective of understanding the space of possibilities, sometimes
exploring nonconventional behaviors, testing strategies, and mitigating risk.
This definition emphasizes the following words and terms: “modeling,” “exe-
cuting exercises,” “deliberately challenging,” “risk lens,” “understanding the space
of possibilities,” “exploring nonconventional behaviors,” “testing strategies,” and

“mitigating risk.” Each of these warrants further explanation.
1.2.1 Modelling
The word “modeling” reflects the process whereby a situation that might be too
complex is transformed into a representation (such as diagrams) that focuses
on important information and relationships in that situation, while ignoring less
important information and relationships. By emphasizing modeling, we emphasize
the thinking process that is required for a RT exercise, where information is
collected, filtered, and mapped into a form that is simple for people to comprehend.
A model does not need to be a mathematical one for a special use. A model can be
a simple diagram drawn on a wall that connects the players and summarizes their
relationships to each other, or it can be a script given to actors in a movie, or in a live
experimentation, that describes their role within the artificial and synthetic world.
1.2.2 Executing Exercises
RT does not operate in vacuum, or by a simple answer to a what-if question. RT

evaluates a dynamic context, whereby events can only unfold based on a series of
interactions among different entities.
Consequently, RT cannot stop at the level of understanding and modeling a
situation. A RT exercise needs to be executed. The hypotheses generated while
building a model needs to be transformed into actionable steps. Each team needs
to test their own hypotheses, plan how these will be executed, transform them from
being ideas in one’s mind to being actions, monitoring the process of executing
them, monitoring and challenging the actions of the other teams, adapting as
necessary, and documenting the lessons learned.
1.2.3 Deliberately Challenging
A main differentiator between RT and any other exercise or experimental form

is that RT is about deliberately challenging the reciprocal competitive interaction
among two or more players. A challenge can be viewed as follows.
Definition 1.2. A challenge is an exposure of a system to a situation that requires
the system to perform close to, but outside, the boundary of its abilities.
1.2 Red Teaming 7
The concept of a deliberate challenge is key in RT. RT is an exercise that has

deliberately been established to achieve a purpose. A deliberate process requires
intention to achieve its objectives. In the classical military use of RT, the objective
is sometimes to push the situation into an area outside the opponent’s ability to gain
an advantage. This challenging process requires planning, thinking, and studies to
be performed to understand the opponent, their abilities, and how to influence them
if possible.
When the objective of RT is to challenge a plan, red attempts to understand
the assumptions of the plan and identify situations in which these assumptions are
violated such that the plan fails. When RT challenges a technology, such as during
the evaluation of a new communication device, red studies the specifications of the
device, its coverage range, and the frequency at which it is operating to identify
when this device will become obsolete, and in which situations one can deny the
device data to be transmitted, the opportunity to transmit data, or the ability to
transmit data efficiently through natural and manufactured environmental and non-
environmental factors.
In the definition of the word “challenge,” it was emphasized that a challenge
requires the system to operate “close by” but “outside” the boundary of the system’s
abilities. One may wonder why a close-by condition is required. Is it not sufficient
to know how to push the system to operate outside its operating envelope (i.e. space
of normal operations) to understand when it would fail to operate?
The primary objective of RT is not normally to be in an easy-win situation. The
primary objective is mostly to understand how a win can be achieved unexpectedly.
For example, it is very easy to present someone with a situation that is impossible,
such as asking a person to carry with their hands (and without any external
assistance) two tons of materials. It is easy to design the impossible. RT is not
about that: RT aims to understand the boundaries of limits. Knowing that a human
weighing 105 kg or more can carry at the absolute most 263 kg with their bare hands
provides the mean for designing a strategy to add 500 g more so that these hands fail
to carry the load. Repeating this exercise may strengthen the hands; thus, improving
the lifter’s ability over time. However, if the weight increases to 300 kg, the human
will detect that it far exceeds their abilities, and will give up the task; thus, defeating
the purpose. RT is about discovering the boundary of the human’s ability (i.e. the
263 kg in this example) and designing the strategy to push this boundary further
(i.e. adding 500 g will challenge the human to continue trying carrying the weight).
Despite that the human cannot achieve the task with the extra 500 g of weight, the
human is not discouraged by a weight that slightly exceeds their abilities.
Understanding this boundary becomes the means to designing deceptive, as well
as learning, strategies in competitive situations. The further a system is pushed
beyond its operating envelope, the easier it is for a deceptive strategy to be detected.
Operating at the boundaries means the signal-to-noise ratio is minimal; thus, the
extra 500 g will be lost in 263 kg boundary.
The concept of challenge is paramount in RT. Different tools and strategies may
be used to design a challenge, but they all have in common the ultimate objective of
pushing the system just beyond its usual operating envelope.
1.2.4 Risk Lens
The concept of risk is central to any RT exercise. The red team is formed to
discover blind spots that can impact blue’s objectives. This is primarily the concept
of risk, which is defined by the International Standard Organization (ISO) as the
effect of uncertainty on objectives [27]. Therefore, regardless of the objective
of a RT exercise, the fundamental driver of the exercise is to understand how
unknowns impact objectives; this is risk. We will adopt the word “impact” rather
than “effect/affect” because the latter is reserved for a different purpose later on in
the book.
Therefore, getting red to win per se is not and should not be the primary aim of
RT. Winning is merely an indication that the objective of discovering vulnerabilities
and negatively impacting blue’s objectives has been achieved. The primary aim for
RT is to understand the risk: the interplay between what blue did not anticipate and
blue’s objectives. This risk lens differentiates RT from other classical games. A RT
exercise needs to be equipped with a variety of tools for studying and identifying
the weaknesses of blue, tracing the causes, and identifying strategies to influence
and reshape the system. However, equally, the exercise needs to be equipped with
the suite of tools to analyze uncertainties and objectives, and therefore, risks.
1.2.5 Understanding the Space of Possibilities
Again, the primary aim of a RT exercise is not for red to beat blue. A possible aim
is simply to gain a greater understanding of what is possible. Any group will have
its own biases, the culture that operates within, and frame of mind that constrains
groupthink. Red attempts to help blue overcome these biases by exposing blue
to possibilities that blue may not have thought of. Getting blue to appreciate this
space of possibilities can assist blue in designing more robust strategies and open
opportunities for blue to make use of this sphere of new possibilities that blue was
not aware off before. This point will be revisited in Sect. 1.4.
1.2.6 Exploring Non-conventional Behaviors
A company that thrives on technology will have many employees that are techno-
logically savvy. The beliefs and behaviors of members in such an environment are
centered on technology. People living and working in such an environment may not
be able to imagine how to survive and live without technology. Technology becomes
the thought-baseline (bias) for the people working in this company. Such a thought-
baseline will steer their perception, ability to understand what they sense, and the
reasoning process they will follow to make a decision.
1.2 Red Teaming 9
Let us imagine a competing company to this technology savvy company; this

competing company is operating at the other end of the spectrum, that is, it is not
technology-centric. Instead, this company relies on classical relationship building
by meeting clients and talking to people face-to-face, and by establishing individual
trust with customers and engendering concepts of loyalty in every conversation.
Behavior such as insisting on meeting with a client in person might be considered
by the technologically savvy company as nonconventional. Some members of the
technologically savvy company would consider this behavior to be completely
irrational, and some would believe that such behavior no longer exists because “no
one would bother to do such thing anymore.” These employees would believe in
communicating with clients through emails and online chats.
RT can change such thinking. RT can help the technologically savvy company
to discover those behaviors that are perceived as nonconventional to their culture
and environment. Equally, RT can help the people-centric company to discover
the perceived nonconventional behaviors of the technologically savvy company
such as emailing clients instead of meeting with them in person, and relying on
objective competitive metrics in pricing and quality of services to sustain customer
relationships instead of relying on a concept such as personal loyalty. RT can
help both companies to discover the advantages of the perceived nonconventional
behavior of the competitor.
1.2.7 Testing Strategies
A strategy [19, 37] denotes the set of methods that are able to transform the
resources to which an organization has access to, or has the ability to access,
into the organization’s goals. Strategies must be tested all the time. During their
inception, it is not yet known whether the strategies are the correct strategies. It
is not known whether the competitor can design counter-strategies and what these
counter-strategies might be. Even as strategies are implemented, indicators need
to be monitored continuously to determine how successful these strategies will
be in achieving the designed goals. In a highly dynamic environment, indicators
for assessing the performance of strategies are vital because the environment may
change to states that were not considered when the strategies were designed.
RT transforms a strategy-design process from a one-off exercise to a lifelong
learning exercise. It sees the testing process as a strategy in its own right that works
hand-in-hand with any strategy. Through RT, a strategy is consistently scrutinized,
challenged, and tested against competing strategies and plausible counter-strategies.
The concept of strategy will be defined in Chap. 4 and will be discussed in more
detail throughout the book.
1.2.8 Mitigating Risk
RT sees problems and situations through a risk lens. It is this risk lens that makes
the RT exercise continuously conscious and self-aware of risk. As a result of this
consciousness and self-awareness, risk-mitigation strategies emerge as the process
of RT discovers and challenges different risks within a situation.
Analysts in risk management understand that risk cannot be minimized per se.
Instead, it is mitigated: its negative impact is steered away from the system, or the
system is reshaped to transform a negative impact into a positive one. For example,
let us assume a person is working as a police officer, and is expecting that in a
couple of years, they will receive a promotion, which will mean they leave the space
of action and take a back seat in the office. For this person, such change might
constitute a negative risk. There are many different ways to mitigate this risk, each
with its own pros and cons. The person may decide to begin making mistakes or
behave angrily toward the boss so that they are not promoted. However, such a
reaction does not constitute RT. RT is about being smart and strategic in every move.
A possibility for this person that would reflect the aims of RT would be to determine
who is being considered for a promotion. Subsequently designing a risk-mitigation
strategy to ensure this next-in-line person is promoted first is RT in situ!
1.3 Success Factors of Red Teams
RT is different from other activities that evaluate plans because of its reliance on
a deliberate, proactive and challenging approach.1 It has been used widely by the
military, security organizations, and large organizations. The success of the RT
exercise relies on a number of factors, some of which are discussed below.
1.3.1 Understanding and Analyzing the Concept of a Conflict
A conflict in RT is a situation in which two entities have conflicting objectives (see

Chap. 2 for more details). It is important to understand the definition of conflict in
this book as distinct from the classical military conflict.
A military conflict is a situation in which a military force is put into use.
Usually, the military conflict arises from a state-state, or state-non-state conflict.
For example, religion-driven terrorism is a conflict between a state and a non-state.
The former has the objectives of protecting its economy, value system, and people.
The latter has the objective of promoting its value system. Here, the two objectives
1
These concepts will be discussed in more details in Chap. 3
1.3 Success Factors of Red Teams 11
in conflict are the protection of a value system by the state and promotion of the
value system of the non-state entity. The objectives focused on economic and people
issues of the state are impacted if the value system of the state is impacted, and
therefore, these two objectives are dependent on the objective of protecting the value
system of the state.
The two objectives in conflict can be represented formally as follows:
1. State objective: minimize the damage to the value system of the state
2. Non-state objective: maximize the change to the value system of the state.
The military is a government policy tool designed to resolve the conflict. The
conflict can be resolved by means such as the state educating the non-state parties,
or exercising economic pressures on the non-state. Under certain circumstances, the
state may decide that the best policy tool to adopt is military action. Consequently,
a military conflict is initiated.
As such, the root of a conflict is the existence of at least two objectives that are
in conflict. An internal conflict exists when the two conflicting objectives are owned
by the same system. An external conflict exists when the two conflicting objectives
originate from two different systems.
An external conflict does not necessarily imply that one entity is the enemy of
the other. With an enemy, there is a declaration that an entity, labeled as an enemy
(red), will cause damage to a second entity (blue) to deny the latter the opportunity
to achieve its objectives which are in conflict with the former entity. This is not
necessarily because red has objectives designed for its own benefit that happen to
conflict with the objectives of blue that were designed for its own benefit. In some
cases, an enemy is a historical concept in people’s minds that arises from issues such
as a past conflict or religious views. RT used for enemies in a military context is a
very narrow view of RT.
The general concept of RT in this book is that it denotes reciprocal interaction,
causing reciprocal dynamics between two or more entities with conflicting objec-
tives. The conflict in objectives causes these two entities to be a threat to one another.
John’s need for a job can be considered in conflict with the company to which he
applied; the company in which the managers would like to find the best person for
the job or would prefer an internal applicant if possible. In this situation, John may
even create two objectives for himself that are in conflict: to balance his own life;
he could create an objective to relax and an objective to be very competitive. Over
time, he can decide on the level of trade-off he needs to achieve the balance and his
goals, but the two conflicting objectives reside and survive inside him, continuously.
This example demonstrates the difference between internal and external conflicting
objectives.
The strategies or ways used to resolve conflicting objectives are the primary aim
of a RT exercise. A military use of RT to design these strategies is a very narrow
purpose of RT. In fact, between two state players, many conflicting objectives
can be in play at any point of time, and it becomes essential to understand
the interdependencies among these objectives, and how they should be resolved.
A consequence of this analysis might be that we discover that a military use is
unnecessary.
In many cases, conflicting objectives are resolved through cooperation. As such,

RT is not always about competition. RT exercises can be designed to understand
how to cooperate, or how to design an effective coalition. The resolution of a
conflict through cooperation requires detailed modelling of the interdependency
among different objectives and a proper analysis of how trade-offs can be made
(see Chap. 2).
1.3.2 Team Membership
Similar to John’s experience, the composition of the red team is critical for the
success of the RT exercise. As the proverb says: “To imitate an ant, one has to
become an ant.” It is not sufficient to split a group into a blue and red team at
random. The group needs to be socially engineered to create the right red team.
Membership of each team is critical. Members must have the ability to think in the
same way as the role they are playing.
Martin could not help John alone. While he could have thought of many questions
that John would not know how to answer—preparing John for thinking on his
feet and managing surprises—context matters, and that was where Amy became
a necessary addition to the team. Amy’s role was not just an additional member
who knows the context. Her role was to prepare John for the unknown within the
context of the job. She was able to make John think about branch management.
Equally importantly, her questions triggered Martin to ask more questions simply
by listening to her. In fact, she also trained Martin subconsciously to ask questions
within the context of bank-branch management.
Without Amy’s intimate knowledge of the activities of a branch manager in a
bank, she would not know how to ask the right questions, or how to ask the questions
in the right manner. Martin listening to Amy meant that Martin was able to see his
blind spots, and this triggered him to ask questions he would not otherwise have
asked. Similarly, Amy listening to Martin enabled her to realize the bias in her line
of questioning, and opened her eyes to ask questions that she would otherwise not
have asked. Different categories of members in the RT exercise are discussed in
Sect. 1.6.2.
1.3.3 Time for Learning, Embodiment and Situatedness
Amy was ready for the exercise because she worked as a branch manager. Imagine if
Amy had not had this experience. For the red team to function properly, Amy would
have needed to learn what a branch-manager job involves. However, such a level of
basic knowledge is not sufficient. In this situation, the culture of branch managers
is not something only to know; it needs to be lived.
The concept of embodiment is critical for RT. Back to the proverb of the ant. One
can notice how the ant behaves. One can build many theories on the behavior of the
1.3 Success Factors of Red Teams 13
ant. These theories can even be validated to demonstrate that they truly reflect the
manner in which the ant behaves. However, if human nature can be revealed with
exactness, psychologists would have been replaced with pure engineers. Knowing
how ants behave, and even successfully predicting ants’ behaviors should not be
mistaken with the conclusion that one can duplicate the behavior and thinking
process of ants.
To place this in context: Amy could have read and studied many books on the
role of a branch manager in a bank. However, theoretical knowledge would not have
been sufficient. There are lessons that are learned on the job. Amy’s mind has been
reshaped every time she has a new experience in her job. These evolutionary steps
of Amy’s thinking are what make Amy think as a “branch manager.”
Amy’s body posture, manner of looking at John, the pitch of her voice, and every
aspect of her physical appearance have somehow been influenced by the job of a
branch manager. Being embodied in the job has transformed Amy into a branch
manager. Being situated in the job, Amy knows how to think “branch manager” in
the same way that the ant knows how to move as an “ant,” and think “ant.”
1.3.4 Seriousness and Commitment
The time for learning, embodiment and situatedness requires a great deal of
seriousness and commitment from the organization sponsoring the RT exercise, as
well as from all members of the exercise. The internal transformation of Amy as she
was before working as a branch manager to the new Amy who is a branch manager
requires a level of commitment from Amy, without which Amy would have only
become a bad imitation of a branch manager.
For example, consider the situation in which Amy did not have experience as a
branch manager. She wanted to know about this job so that she could help John.
She made the assumption that by gaining theoretical knowledge, through education,
about a branch manager job would be sufficient to give her the knowledge required
to ask John relevant questions. Let us assume that the time required for learning is
not an issue here. The result is clear, Amy would have been a bad imitation of a
branch manager. In such a scenario, she may even build a level of false confidence,
which results in her biasing the questions in a manner that negatively biases John.
Consequently, the entire exercise with John could have negative consequences. It
could have had the reverse effect on John than it did in the original example. Imagine
you know you will meet someone in the morning, but you do not know whether it
will be a male or a female. Now, imagine someone positioned your frame of mind
to believe you will be meeting a female. The surprise you receive from meeting a
male is much less in the former case of not knowing than in the latter case in which
you have been made to believe the opposite to what is true.
When Amy questioned John, the form and nature of her questioning subcon-
sciously placed John’s mind in a mental environment that was consistent with those
of branch managers. If Amy had done a bad job in her learning and commitment
to being a branch manager, John would have been positioned in the wrong frame
of mind. This could have made him more prone to being surprised by the interview
questions, or worse, it could have caused him to misinterpret the questions.
The seriousness and commitment of the red-team members to becoming red is a
double-edged sword. Members of the red team can be transformed psychologically
to be red. This transformation needs to sit below a line that should not be
crossed; otherwise, they will turn into real red (i.e. competitors to the organization’s
objectives or enemies to the state)! Members of the red team need to be as close as
possible to becoming red but should not cross the line of being red. If they cross the
line and become truly red, they will counteract and deceive the RT exercise.
In effect, members of the red team need to be socially engineered to have two
or more concurrent minds: the mind of a red and the mind of a blue. They need to
be trained and know when, where, and how to inhibit or excite each mind. Thus,
training members of the red team needs to be socially engineered, and continuously
monitored to create a safety net around these members. The mind of a blue needs to
be built first and must be stronger than the mind of a red.
Let us consider two hypothetical information-technology (IT) companies: we
will call them Minisoft and Manysoft. If in a RT exercise, Minisoft trains its red team
to such an extent that they believe truly in the Manysoft products over the Minisoft
products, the Minisoft RT exercise will fail. Members of the red team would see
Minisoft as a competitor to their ideology and desire the success of the Manysoft
products over that of the Minisoft products. We will continue with this example
during discussing different concepts throughout the book.
1.3.5 Role Continuity
The amount of time and level of seriousness and commitment required by the red
team to understand red implies that there is a high level of investment required to
train members of the red team. It is expensive and inefficient for members of the red
team to be used once and then dispatched.
The continuity of the members of the red team in playing red provides them with
a unique experience to innovate and use creative thinking to counteract blue’s plans.
It is through this continuity that members of the red team have the time to reflect,
experience, and reflect again to improve their skills in acting red, and situate and
embody themselves in the red team’s environment and manner of thinking.
1.3.6 Reciprocal Interaction
RT is about deliberate challenges. Red cannot think in isolation. It needs to interact

with blue. If members of the red team are truly transformed to think red, this
training can de-skill them from thinking blue. The consequence of that is, they
will lose their understanding of blue and will lack the ability to imagine what blue
1.4 Functions of Red Teaming 15
can do, and therefore, what counter-strategies they need to develop themselves.
Regardless of whether de-skilling occurred or not, red and blue need to interact.
Through interaction, red and blue accumulate a unique experience for acting and
counteracting.
1.4 Functions of Red Teaming
As an exercise, RT focuses on forming deliberate challenges to perform a number of

functions within an organization. The following is a list of some of the key functions
RT can play.
1.4.1 Discovering Vulnerabilities
A vulnerability is an exposure of a critical component in a system to a hazard or

a threat. RT has been used extensively in the area of Cyber security to test the
reliability of systems such as a computer network. In such a scenario, the red team is
commissioned to find effective strategies to penetrate the computer network. Every
time the red team succeeds, areas of vulnerabilities within the network are revealed
and the blue team has the opportunity to amend these.
In its role of discovering vulnerabilities, RT is a “must-have” tool for risk
assessment. It is difficult to imagine a reliable risk-assessment exercise that does
not employ some type of RT. If RT is not used within a risk-assessment exercise, the
exercise becomes vulnerable to the mere use of imagination, check-list and biased
historical experience.
1.4.2 Discovering Opportunities
Discovering opportunities is complementary to discovering vulnerabilities, but is

not identical. A vulnerability exposed in the red team creates a potential opportunity
for the blue team. It is a “potential” opportunity because there is no guarantee that
if the blue team attempts to exploit a red-team vulnerability, it will not create a
vulnerability of its own.
Creating an opportunity does not mean creating damage in red through exploiting
red’s vulnerabilities alone. On the contrary, an opportunity may exist through
helping red to close its own vulnerability. This is where we emphasize again the
importance of modelling the interdependency in the objective space. By closing
red’s vulnerability, blue can be closing a much larger vulnerability of its own.
This concept of “helping” is important. John leveraged RT so that he could
receive help. There was no damage in the exercise. It was a win-win situation for
John, Martin and Amy. From John’s perspective, it was an opportunity for preparing
him for the job interview. For Martin and Amy, it was an opportunity to help a friend
and gain experience themselves as interviewers.
1.4.3 Training
The example of John’s job interview demonstrates how RT was used to train John.
The nature of training that RT provided in this situation is very different from
classical non-RT training. In Non-RT training, John would have watched a number
of videos of similar job interviews, and possibly would have been assigned a coach
to give him “to-do” and “not-to-do” tips. RT-training has three conditions: (1)
reciprocal interaction, (2) deliberate challenge through active participation, and (3)
continuous assessment of risk. Through RT, John was trained in situ to be adaptive,
to think on the fly, and to manage surprise questions.
The mock-up interview enabled reciprocal interaction to take place. Amy and
Martin were actively listening to John to discover from his answers if more questions
can be generated to challenge him more. They needed to be goal focus on the job
of Branch Manager, and they needed to continuously assess John’s answers within
the scope of this goal. They needed to see if any of John’s answer threaten the goal
that John gets the job, and they needed to actively contribute questions to train John
more in these areas of vulnerabilities.
RT is a very effective training technique. In a non-RT training exercise, training
scenarios are standardized for all participants, but in RT training, the training
evolves differently for each different participant. Every time something about blue
changes, such as team membership, additional capabilities, and new knowledge,
it becomes a necessity that red evaluates the need to change its strategy. Through
interaction with the trainee, the trainer discovers areas that require more attention
and the training exercise is tuned and tailored toward these areas. Equally, the trainee
continues to learn from its own mistakes, and from designed and unintentional
mistakes of the trainer.
RT does not only train people to be efficient on the task, but it also trains people
to be efficient in their ability to adapt when the task changes. In summary, RT trains
people “to think”, not just “to do”. This difference truly differentiates RT from non-
RT training.
1.4.4 Thinking Tools
A RT exercise assists players in gaining an understanding and an appreciation of the

complexity of a situation. It mimics in concept a person performing self-talking.2
2
By self-talking or self-rehearsal, we mean internal conversations that occur in a person’s mind.
Imagine you are going to fire someone in the organization that you know very well. Assume you are
1.4 Functions of Red Teaming 17
The dynamics of back-and-forth interaction teaches participants and observers about

the richness and depth of the situation and opens their minds to thinking more
strategically.
The reciprocal interaction between red and blue in a RT exercise creates an
environment of self-consciousness of the existence of an opponent. RT transforms
the thinking process of each member of both teams. It teaches the members to be
always aware off the existence of the other team, its intention, and its readiness to
engage.
John was unintentionally taught a lesson during the RT exercise, that is, there was
a panel member trying to surprise him. A panel member that actively searched for
questions to discover whether John has indeed the qualities needed for the position.
This creates a level of alertness in John, as his mind is being trained that a surprise
is coming, what to make of this surprise, how to think about it, and how to control
his personal anxiety in these situations.
1.4.5 Bias Discovery
Vulnerabilities in a plan and biases are not the same thing. It is known in manage-
ment that without a bias, one cannot make a decision. In fact, every decision being
made carries a bias of some sort. Bias is not necessarily bad. Bias becomes bad bias
when it has a negative impact on the decision.
For example, a selection panel may choose the male applicant out of two equally
qualified male and female applicants in a job interview for a kindergarten teacher.
Each of the applicants would have been successful in that job but the panel needed
to make a choice. Possibly, for a kindergarten position in which many of the teachers
are females, the selection panel was biased to the male to balance the genders in the
working environment. This bias breaks symmetry, and without it, a decision cannot
be made.
In the same example, imagine that the female applicant was less qualified and
the panel consists of females who select the female applicant because they believe
that females do a better job in kindergartens than males. Here, there exists a
different form of bias, which is labeled “discrimination.” This type of bias relies
on stereotyping and unfairness. It is not the type of bias to which we are referring in
this section.
Let us revisit the first scenario in the kindergarten example in which the male
was selected. RT can help to understand that sort of bias. That is, the organization
a people person; that is, you care about people so it is important for you to ensure the person gets
hurt as less as possible. You start to rehearse in your own mind what you will say to this person.
You may even imagine what the person will reply to you and what you will reply back. This is
a form of rehearsal and internal RT within one’s self. Through self-talking, the person reinforces
certain concepts and words, a process which helps the person to remember and counteract their
internal fears and negative thoughts.
may not have been aware of this type of bias in its decision-making process; the
decision-making process may have been subconscious. If RT reveals this bias, the
organization becomes more conscious of its existence in advance. The organization
may establish a policy to increase people’s awareness of the need for males in
kindergarten education. The organization may even go further and study the impact
of a female-dominated environment in kindergarten education, and its psychological
impact on the children. Revealing biases can open doors for opportunities. Either the
organization will discover that the sort of bias being used is healthy (but its impact
needs to be better understood), or the organization will discover that it is unhealthy
and it needs to be eliminated from the organization’s decision-making culture.
1.4.6 Creating Future Memories and Contingency Plans
A RT exercise is realistic, but not “real” per se. If the RT exercise is genuine, it
is no different from the daily experiences we accumulate. It adds to our memory
a different type of experiences that we may not be able to afford to live truly in
reality. If lessons from the RT exercise were learnt properly by the organization,
these experiences can be engraved in our memory and be retrieved when they are
needed.
Although RT in large organizations is often an expensive exercise, it can be
considered a cheaper option to certain experiences that might arise if RT training is
not employed. A RT exercise is cheaper than provoking a real war. Participants in the
RT exercise learn from the experiences to improve their knowledge, performance,
and decision-making abilities. The organization learns its weaknesses and strengths.
Equally importantly, the memories that come from the exercise can be retrieved
when similar situations are encountered.
The plans and responses developed during the RT exercise can be saved for future
use. For example, RT is used by emergency-response management, in areas such as
fire fighting and mass evacuations, to create scenarios for plausible futures. Lessons
learnt from these scenarios are stored within the city council. These scenarios and
their associated lessons can be retrieved when similar situations occur.
1.4.7 Memory Washing
As described, RT can be used to learn about situations that may not yet have
been encountered. Similarly, RT can be used to “unlearn” situations that have been
encountered so that the individual is prepared for a new manner of thinking and
behavior in situations to be encountered.
For example, RT could be used to train emergency-response management
on using a new wireless-communication device. As the exercise unfolds, the
participants involved in the exercise accumulate experience using the new
1.5 Steps for Setting Up RT Exercises 19
wireless-communication device. As the participants become more embodied and

situated in the exercise, this type of wireless communication becomes an integral
part of their working environment.
What is often missed in the scenario above is that becoming accustomed to one
form of communication may de-skill people in another form of communication. For
example, to train air traffic controllers to trust and use a new tool within the air
traffic control interface which automatically calculates distances, they may need to
unlearn the behavior of calculating distances themselves. If they combine their own
calculations and the tool, they will take longer to calculate distances and delay their
ability to make decisions on possible conflicts.
1.5 Steps for Setting Up RT Exercises
The design and implementation of a RT exercise undergo a structured process. This

involves the following steps, which should be viewed as a generic guideline to
ensure the effectiveness of a RT exercise. Different RT exercises may require slight
variations to the steps discussed below.
1.5.1 Setting the Purpose, Scope and Criteria of Success
The purpose of the RT exercise defines the objective of the RT exercise, and acts as a
reminder of why the exercise is being conducted. The scope of a RT exercise is a list
of the soft constraints3 defining the context of the exercise. The criteria of success
are measures of utility of the exercise and their values can be used to demonstrate
the value-add of the exercise.
While RT is exploratory in nature, it is vital to know the purpose, scope and
criteria of success for the exercise before moving forward, that is, the answers must
be found to the questions of “why,” “what,” and “so-what.”
The purpose of the RT exercise influences all the steps in designing the exercise.
For example, if the purpose is to improve blue’s ability to anticipate how red acts
and reacts, it becomes essential to design scenarios to create that effect. The RT
scenarios need to produce a large number of situations that sample red’s behavior.
The exact task used to conduct the scenario may not matter in this example, as the
focus is on red’s actions in a wide range of contexts.
3
Constraints can be hard or soft. Hard constraints can’t be broken, that is, the constraint must be
respected or the solution is not accepted. Soft constraints can be broken with a cost. The scope of
a RT exercise may need to be updated, or the interaction between red and blue may beg a change
in the original scope.
A scope is a set of soft constraints that bind the context of the RT exercise. As
these constraints are soft, they can be broken. They can be ambiguous in nature and
their only purpose is to ensure that the exercise is not unbound. A scope defines
what is essential, what is useful to have and what is irrelevant in the context of a RT
exercise. However, by no means should this scope be fixed. The interactive nature
of the RT exercise may necessitate a change in scope. As new events unfold, one
may discover that the exercise was scoped incorrectly and a refined scope is needed.
The scope of the example above is training blue to anticipate red behavior. The
scope is defined with the two keywords “anticipate” and “behavior.” As such, the
exercise should not focus on the details of the situations, but on which behaviors are
likely to be generated in which situations. These situations need to be defined on a
sufficient level for the details of these behaviors to emerge, and in no more detail
than necessary. If more details are defined than necessary, the RT exercise can lose
flexibility.
The next important element to know in advance before designing the exercise is
how success of the RT exercise will be judged. The criteria of success define whether
the exercise was successful in fulfilling its purpose or not. If it is not known how
the success of the exercise will be judged, it will be difficult to define which data to
collect, which factors should be measured, how to measure them, and on what basis
the effectiveness of the exercise can be justified.
The purpose, scope and criteria of success establish a set of guidelines to measure
the appropriateness and cost benefit of each decision to be made in the remaining
steps of a RT exercise.
1.5.2 Designing the Exercise
A RT exercise is not different on any fundamental level from any other type of
experimentation exercises we conduct. Experimentation is a general concept with its
own guidelines and principles and a RT exercise is one type of experimentation. As
will be explained in Sect. 1.7.4, not every red-blue experiment is a RT experiment.
A RT experiment needs to focus on the design of, and interaction between, the red
and blue teams. More importantly, in RT, the experiment needs to focus on designing
the process of a deliberate challenge, that is, how each side will challenge the other
side? The objective here is not simply to win or play the game. The objective is to
learn how to stretch each side’s boundaries to the limit.
Imagine a simple RT military exercise whereby the blue and red teams were
deployed in a field. Soon after deployment, red began to fire and eliminated blue
very quickly. Blue discovered a weakness, and the exercise demonstrated some
benefits, but not true benefits because the true value of the RT exercise is to learn
about the thinking process red and blue experienced that created this result. The
exercise needs to be designed around discovering this thinking process, not around
which team wins or loses alone.
Designing the RT exercise is similar to using an experimental design. The focus

needs to shift to the following four key issues that are pertinent to the success of a
RT exercise.
• Designing the Scenario: As we discussed so far, a RT exercise needs a clear
purpose. This purpose dictates the scope, that is, the constraints on the context
within which the exercise will be conducted. Scenarios will be discussed in detail
in Sect. 4.1. At this early stage in the discussion, a scenario for a RT exercise can
be considered to be like the storyline in a movie. It should not dictate exactly what
every actor will do, but the role and character each actor will play. A scenario
takes the purpose and context of the exercise, and transforms them into a life-like
story to begin the exercise. It must be remembered that a RT scenario should be
considered a storyline and not as a script. A story will define the context, players,
and their roles. However, the dynamics of the interaction will generate the script.
The RT exercise as a whole represents the script: from how the story begins to
how it ends. A scenario defines the contextual information for the starting point
of the story, and sometimes for how the story unfolds, but in RT, it should never
define how the story will be played out and what the conclusion will be. These
are the responsibilities of the RT exercise.
• Designing Data Collection Plan: The value of a RT exercise is not in who won
or who lost but in the thinking process that each side experienced to make their
decisions, and the deliberate challenges that each side imposed on the other side.
Therefore, an effective data-collection plan is greatly important to a RT exercise.
A data-collection plan is a document that details the data to be collected
by asking a number of questions: which data will be collected? what are the
objectives of collecting these data? what is the intended analysis? who is the
owner of each piece of data? where will the data be collected? how will they be
collected? by whom? what access control each data user will have? and how will
the raw and processed data be stored and where?
A RT exercise can get expensive. It is very disappointing if a complete RT
exercise is conducted and the analysts discover that some data are missing, or
that the data being collected are not suitable for the analysis being conducted.
Some analysts would see a data-collection plan as an expensive exercise in its
own right. However, it is most likely that a RT exercise without a data-collection
plan will fail. It is counterproductive to a RT exercise if the analysis team does
not know the answer to the questions listed above on the data-collection plan
before conducting the exercise.
• Selecting the Players: Selecting the appropriate players for each team is a
critical element for the success of a RT exercise. Decisions on membership of
the red team are more difficult than those for the blue team. Let us imagine
a Minisoft-versus-Manysoft RT exercise. Who would Minisoft have on the red
team? A good option could be a careful choice of Manysoft ex-employees.
However, Minisoft may think the best players on the red team are its own strategy
people who are specialized in analyzing Manysoft strategies. Scrutinizing this
decision would reveal problems for several reasons outlined below.
First, there is no guarantee that what Minisoft strategists believe about

the manner in which Manysoft employees think is correct and representative
of Manysoft. Minisoft strategists may have been very successful in predicting
Manysoft strategy in the past, but this does not mean that they know how
Manysoft thinks.
Take another example, one may predict successfully what their child will
choose between ice cream and a healthy meal, but it does not mean that the same
person can conclude from the successful prediction that they understand how the
child thinks. The child may have chosen the ice cream because their body urges
them for fluid in a hot summer, the taste is good, or simply because they see
the other children eating ice cream. If the objective is to influence the child to
change their choice from ice cream to the healthy meal, the successful prediction
alone is not sufficient. Understanding the root causes must go beyond successful
prediction to understanding factors such as how the child makes choices, their
body’s needs, and their cognitive abilities. This may sound complex, but this
understanding might be achieved simply by using a group of children within an
experimental environment that is shadowing the first: to interpret their actions,
be in their shoes, and if situated and embodied correctly, think and act like the
first child.
Second, the RT exercise is a source of validation for the analysis performed
by the Minisoft strategists. If the members of the red team are ex-employees
of Manysoft, and if the thinking process they experience is captured, Minisoft
strategists can compare notes to validate their prediction capability.
Third, red team members in a RT exercise need to be skilled in red technolo-
gies, not in blue technologies. As such, if they are skilled in blue’s technologies,
it may be necessary to de-skill some red players.
One cannot be an ant by simply watching an ant. One needs to be embodied
in the ant colony as an ant to know how an ant thinks. Therefore, if the red team
is about ants, when possible, it is preferable to obtain real ants than to imitate
the ants. For example, in law enforcement, it is possible to use thieves from
jail to play the role of thieves in a RT exercise, and this is preferable to having
police officers imitate thieves. The thieves in the jail may attempt to deceive the
RT exercise. However, proper social engineering can ensure that we can rely on
other personality drivers for their behavior to conduct the exercise properly. For
example, the ego-centric personality of some thieves may equally drive them to
see the RT exercise as an opportunity to demonstrate they are superior to the
cops. Similarly, in war movies, ex-military personnel are often employed as the
actors. They have been embodied and situated in wars; their act and acting are
then natural.
• Briefs and Training: Briefs given to both red and blue teams need to be
designed, scrutinized and analyzed carefully. These briefs contain information of
the “known-known,” or what each team knows about the other team. They impact
the level of creativity that each team will exhibit in attempting to deliberately
challenge the other team.
A decision needs to be made on whether joint briefs providing information

for both teams to know about each other will be allowed. Sometimes, members
of the blue and red teams need to be concealed from the other team. Does the
police officer need to know what the thief looks like? In some exercises, some
information may be useful, but in other exercises, the same information may
invalidate the entire exercise.
Concealment of information from red and blue teams is an important issue
that should be considered when creating the briefs, as is the issue of whether
team membership should be disclosed.
An issue related to providing briefs is training. Sometimes, it is not sufficient
to include members who can act as red without training before conducting the
exercise. Amy may have needed some training in job interviews before she was
able to ask John the right questions despite her experience as a branch manager.
This is because she may never have been on the interviewer side of the job-
interview table herself.
A thief may act in a policing exercise as a thief. If the context of the exercise
is to steal information from computers, a computer-ignorant thief may have the
mind of a thief but not the skills to steal information from computers. Training
can provide the extra skills necessary to create the red team.
Training is emphasized over education here. This is not because education
is less important, but because education is more expensive and more time
consuming. In preparing a RT exercise, it is not always possible to wait for all
members to complete formal education. The example of the thief demonstrates
this because it may not be possible to wait until the thief obtains a PhD in
information security to conduct the RT exercise.
1.5.3 Conducting the Exercise
A RT exercise begins the moment the need for a RT exercise is announced, that is,
in the moments before the purpose, scope and criteria of success are designed. This
is important because this moment dictates constraints on which information should
be communicated to whom. However, conducting the exercise is about the moment
the experiment and game begin. This is the moment in which both red and blue
prepare for engagement and interaction. It is also the moment in which the scenario
is executed.
The RT exercise would usually involve a number of teams. In addition to the red
and blue teams, there is the team of designers who design the exercise; the team
of observers who watch the exercise unfolding, and possibly share their perception
of the events taking place; the team of analysts who specialize in analyzing the
RT exercise qualitatively and quantitatively; the technical team that is responsible
for all technical and engineering elements of the exercise, including monitoring
the automated data-collection tools; and there may also be other groups such as
politicians who simply watch the exercise to get more familiar with the situation.
Sometimes, other colors are used to designate other teams. For example, the
designers, analysts and observers are grouped into a white team, while a green color
denotes a special team who supports and acts as a coalition of the blue team. More
colors can be introduced to define other groups with interest in the exercise.
The scenarios discussed above demonstrate that the scale of a RT exercise can
extend from three people (as in the case of John’s job interview) to hundreds, as in
the case of a national-security exercise. Each member should not interfere with the
tasks and purpose of other members. The technical team should be invisible to both
the red and blue teams so that they do not distract them when performing their tasks.
The observers should be separated from the analysts so that they are not influenced
by the discussions among the analysts. The politicians should be separated from the
entire exercise so that they do not push their own agenda, influencing the exercise
to change its original intent.
1.5.4 Monitoring and Real-Time Analysis of the Exercise
Real-time monitoring and analysis of a RT exercise is a proactive approach to ensure

that the exercise as it is unfolding is meeting its intended objectives. A RT exercise
is usually expensive. As such, it is not wise to conduct the exercise, collect the data
for analysis, and subsequently discover there was a problem in the data. Usually,
a large-scale exercise will require from several days to months to complete. This
is why it is very important that the tools are in place to monitor continuously the
exercise in real time and analyze the data as they get generated.
Most of this real-time analysis will focus on indicators related to whether the RT
exercise is meeting its objectives. The analysis will be diagnostic to ascertain and
analyze problems in the experiment.
Sometimes, the analysis can be part of the exercise itself. For example, the
analysis team may monitor certain events and reveal information about red to blue
or vice versa. In this situation, the analysis team acts in an intelligence function
that can be outsourced to both teams if they do not have the analysis capability
themselves. If such an outsourcing decision is made, it should be made carefully as
an intentional decision for a beneficial purpose within the experimental design. The
flow and access to information should be clearly designed and articulated.
1.5.5 Post Analysis of the Exercise
The majority of the analysis required to data mine the RT exercise to extract trends
and patterns (i.e. lessons) will be offline and occur after the RT exercise. This is
sometimes due to the need to have the complete data set of the exercise before a
pattern can be extracted. The analysis may need to propagate information forward
and backward in the data to establish reasons and rational for the extracted pattern.
Sometimes, it is important to bring both red and blue teams back to the analysis
room after the exercise is completed. In this situation, the events can be played back,
while asking team members to reflect on why certain sequences of events occurred
in the manner in which they did during the exercise. This process of reflection may
be designed as part of the training process for both red and blue in preparation for
a subsequent exercise. It may also be necessary for understanding the results and
outcomes of the exercise.
1.5.6 Documenting the Exercise
As discussed, a RT exercise is an experience similar to any other we live. This

experience needs to be documented thoroughly. For example, the documentation of
a RT exercise for emergency-response management to combat an epidemic breakout
in a city can become a case study in the drawers of the city council for future
reference if a similar situation occurs. Documenting the exercise would allow the
city council to consult the documents if a real epidemic breaks out in the city.
The documentation of the RT exercise needs to go beyond the basic level of
documentation of the experiment to include the decisions made, their rationale,
their expected consequences, and the findings of the post-exercise analysis. The
documentation in the example above should include how the decisions could have
been perceived by the public, what went right, what went wrong, and how they could
have been amended.
1.5.7 Documenting Lessons Learnt on Red Teaming
RT exercises are a capability that any organization or nation should see as a lifelong
continuous capability. Every exercise teaches an organization how to perform the
next in a better manner. Therefore, lessons learned on RT should be captured,
documented, and stored as a source of knowledge for future exercises.
The first RT exercise to be conducted by an organization will be perfect only
in rare cases. Even if one exercise is perfect, there is no guarantee the following
exercise will be. RT exercises are complex and the likelihood that something will
go wrong is very high. Similarly, the likelihood that something that went right in a
previous exercise will go wrong in a future exercise is equally high. Overconfidence,
human bias, and the complex nature of the situations and decisions encountered
during a RT exercise are critical issues that will threaten the success of any RT
exercise. Lessons learnt from a RT exercise form part of the organization’s corporate
memory.
1.6 Ethics and Legal Dimensions of RT
1.6.1 The RT Business Case
An organization should understand the consequences of “not” getting involved in

RT. By avoiding RT because of the costs of the exercises and some of the issues to
be discussed in this section, the vulnerabilities of the organization will be hidden.
They can be exploited by entities both intentionally and unintentionally, and when
such vulnerabilities are exposed or exploited, the damage and element of surprise
can disrupt the operations and image of the organization, the trust of stakeholders
and clients, and increase other costs such as that of insurance.
An individual that enters into a competition without understanding and assessing
their individual skills and the skills of competitors is simply relying on luck. While
many would see luck as providing a 50–50 chance of winning, blind actions have
a much lower chance of success, particularly in today’s competitive environment in
which many other people are well prepared to manage uncertainty, and threats are
getting more and more complex.
The decision on whether to conduct RT requires a logical, objective, and
evidence-based business case; this is in itself is a RT exercise. RT provides a layer
of protection for the organization; a layer of training for individuals to the highest
level; a layer of preparedness for undesirable events; and a layer of innovation
through which scrutinizing processes, techniques, technologies and ideas within an
organization, new creative ones emerge.
1.6.2 Responsible Accountability
Players involved in a significant RT exercise, such as a national security one, are elite
people. From the designer to the technicians, they are all highly qualified individuals
to perform the role they have been assigned within the RT exercise. Members of such
an exercise should be chosen very carefully, and they should fully understand the
consequences of being involved in RT. The different roles within a RT exercise will
be discussed in this section.
A RT exercise can take many different forms, and be of different scales.
Therefore, it is not expected that all the roles being discussed in this section must
be fulfilled by separate individuals in each exercise. All roles can be fulfilled by a
team as small as five people, and for larger RT exercises, some roles may have large
teams managing them.
The organization should categorize key players in a RT exercise, and discuss the
risk associated with each category. It is important to emphasize that a RT exercise
trains people to “think” in the first place, in addition to training them “to do”. The
risk level described below represents the risk a player in a specific category poses to
1.6 Ethics and Legal Dimensions of RT 27
an organization, that is, if this team member becomes a bad citizen,4 to how much
negative risk will the organization be exposed. Equally, if the team member remains
a good citizen, how much positive risk (i.e. opportunities) will the organization gain
from having them as part of the staff.
This point of risk needs to be considered a natural step, not as a something to
hinder the exercise. In normal circumstances, every employee in an organization,
from the bottom level to the highest level, is trained. The risk that the employee
switches to a bad citizen of the organization always exists. However, this does not
happen with great frequency thanks to careful choices made in appointing people to
their positions.
Recruiting people to a RT exercise is similar to recruiting people to any other
position in the organization. Therefore, the risk cannot be ignored.
The subjective assessment of the risk level associated with each role in a RT
exercise should be viewed with caution. Some risk level may increase or decrease
for each role based on the nature of the RT exercise.
1.6.2.1 Red-Teaming Stakeholder (Risk Level—Low)
RT stakeholders (RT-S) are the primary beneficiaries and problem owners of the RT
exercise; therefore, RT-S should be setting at the highest level in any organization.
In the private sector, the board should be the primary stakeholder in a RT exercise.
A RT activity is very likely to touch on multiple activities within the organization.
Members of the RT team need to be protected at the highest level given the
benefits of the RT exercise are usually organization-level benefits. The board needs
to establish a subcommittee that oversees RT within the organization, similar to
other board-level committees such as the auditing and risk committees. If it is not
desirable to make RT activities visible to the outside world, the board risk committee
can take responsibility for RT.
The organization carries the risk that comes with every role in the organization.
Every role in a RT exercise can pose a risk on the organization. As we discussed
above, a red teamer can become red, and get transformed into a bad citizen.
The risk of the RT-S is low because the board will only have two roles: to ask
questions to the RT teams, and to protect the teams. Members of the board should
not be involved as players in the RT teams because this will create confusion about
the responsibilities of each participant, and may create an undesirable position of
power in the red and blue teams. An exception of this point is when the scope of the
RT exercise is about the board itself. In this case, some board contributions to the
exercise will fall under technical roles, not stakeholders role.
4
For example, a person in a red teaming exercise learns the skills to penetrate a computer system
then decides to do so in the real world to commit fraud.
1.6.2.2 Red-Teaming Communicator (Risk Level—Low)
RT is a very technical exercise regardless of whether the exercise is designed to

red team against a piece of technology or against a plan or an idea. As a technical
exercise, it is not always possible for the team leader of the exercise to be a good
public communicator. In many situations, the entire team involved in a RT exercise
should be hidden from the rest of the organization. In general, it is advisable to have
a RT communicator (RT-C) who acts as the interface between the organization and
the RT team.
To maintain the integrity of a RT exercise, it is wise to have a dedicated RT-C who
speaks on behalf of the RT teams. This communicator acts as the interface between
the RT teams and the rest of the organization. The RT-C needs to have a great deal of
understanding of the culture of the organization and its management. The risk of RT-
C is low, because they normally communicate high-level non-technical information.
1.6.2.3 Red-Teaming Legal Councilor (Risk Level—Low)
RT is an activity that is initiated and conducted by an organization to evaluate

something the organization believes is important. The nature of the exercise can
discover problems and issues that may involve the image of the organization. The
RT team may need to access certain data, challenge legacy systems, or hack into
a computer system to test its security level. These activities can raise occupational
health and safety, and insurance-related consequences.
It is necessary to have a legal councilor on a RT team because while individuals
on the RT team may be highly qualified in their own area, they may not understand
the legal consequences of some of their actions.
For example, assume a red team is formed to evaluate the security system in an
airport. The red team attempts to cross the security gate with prohibited materials.
They succeed and their success is written in a newspaper article. There can be many
legal consequences here. Were these materials safe for the people in the airport or
did it impose a risk hazard? Have the newspaper article created a threat for the
airport since a number of bad guys would get to know that there are security holes
in that airport that can be exploited? These consequences need to be evaluated as
part of the risk assessment for the RT exercise, and the legal councilor can provide
advises to protect the RT exercise as well as the organization as a whole.
Having a RT legal councilor (RT-LC) on the RT exercise serves two purposes.
First, the team members can use the RT-LC to ensure that all their actions and
behaviors are executed without any legal consequences to the organization or the
individuals. Second, the RT-LC can ensure that the organization is protected as the
RT team uncovers issues.
The negative risk of the RT-LC is low. In effect, a legal councilor provides a
protective layer for both the organization and the individuals of the RT team. The
positive risk of having a RT-LC can be high in some situations because the RT-LC
protects the organization and the RT team.
1.6.2.4 Red-Teaming Designer (Risk Level—Very High)
The RT designer (RT-D) is the maestro who designs how the exercise is played and
how players and actions need to synchronize. a RT-D is the person who needs to
understand and be immersed in experimental design for in-situ experiments. The
word “designer” instead of a “team leader” is used to avoid the implication that
there is only one team or only one leader. There is also an attempt to emphasize the
fact that besides RT-D being a leadership role, it is also a role in which design skills
and knowledge of the principles of RT are required.
The RT-D draws, and therefore is exposed to, the entire picture of the RT exercise.
The RT-D is the interface between the RT team with the RT-C, the RT-LC, and the
RT-S. The RT-D acts as the access control for information to all subteams of the
RT team. The RT-D should be a key person(s) in selecting the RT team members
because part of this role is identifying the skill set required to conduct the RT
exercise, as well as the personality types associated with the skill set.
The role of the RT-D comes with two aspects that make the risk level associated
with this role very high. First, the RT-D’s access to information and systems
make the role of the RT-D a high-risk role. Second, being the mastermind behind
designing the exercise gives the RT-D a level of knowledge greater than that of all
the members of the RT team, even though RT-D may not be very skilled in some
very technical tasks in the exercise.
1.6.2.5 Red-Teaming Thinker (Risk Level—Very High)
A second role with a very high level of risk is the RT thinker (RT-T). The RT-T will
be an individual who thinks about risk, knows how to design strategies to penetrate
systems or challenge plans. a RT-T is a system thinker of the highest caliber, who
combines a reasonable level of technical skills and understanding with strategies
and systems thinking.
This role is very high risk. Systems thinking alone is not sufficient to fulfil this
role. Members in a RT-T role need also to have a variety of technical skills. People
who are too technically qualified are not suitable for the RT-T role. The reason is
that a technical person can be too narrowly focused on the technical issues, may not
have the thinking risk skills, may have many biases arising from their technical
knowledge which would hinder innovative thinking, and may not have much
understanding of the role of strategy in a RT exercise. Similarly, system thinkers
who have no experience with the technical side of RT can be counterproductive to
the goal of the RT exercise, as they can imagine what needs to be done without
necessarily having the ability to judge whether it is doable.
This combination of technical know-how and systems thinking is where the high
risk resides in this role.
1.6.2.6 Red-Teaming Technician (Risk Level—Medium)
A RT technician (RT-Tech) is a role that will depend on the RT exercise. There

may be many RT-Tech personnel. If the RT exercise were about computer security,
one would expect to have involved on the RT team technologically savvy specialists
involved in networking, crypto, penetration testing, and low-level programming. If
the RT exercise were about evaluating a plan in a war-gaming exercise, one would
expect commanders with experience in a battle field, military strategists, people
specializing in fields such as culture, anthropology and psychology.
Each RT-Tech person would have a role limited to their specialty area and the
specific role assigned to them within the RT exercise. The risk associated with the
RT-Tech is no different to the risk associated with their area of specialty. Therefore,
their role on the RT exercise does not change their natural risk level. However, we
assign a medium risk level from their involvement in the exercise because a RT
exercise teaches people “to think”. This skill, in addition to their technical skills,
raise their overall risk to medium.
1.6.2.7 Red-Teaming Documenter (Risk Level—Low)
A RT documenter (RT-Doc) is a role that normally requires multiple people. These

are people with general knowledge of RT and RT language. Their role is to
document the RT exercise as it unfolds, writing technical reports at the conclusion
of the exercise, summarizing the outcomes, and developing knowledge based on the
lessons learned from the RT exercise.
A RT-Doc role is associated with low risk. While RT-Docs have access to
information to be able to document the exercise, they are not necessarily trained
in RT to the extent that their role carries a high risk.
1.6.2.8 Red-Teaming Auditor (Risk Level—Medium)
A RT auditor (RT-A) is a role responsible for auditing the RT process for

accountability. It is recommended that the RT auditor be an ex-RT-D, with the skills
to assess and audit the process.
The risk associated with this role is medium despite the fact that an auditor
is more experienced than a RT-D. The assumption here is that the auditor has
conducted a number of RT exercises before acting as an auditor. An auditor risk
before joining the RT exercise as an auditor is in any case very high already since
they have accumulated huge experience from their previous RT exercises. Therefore,
the extra risk arising from their additional involvement in a new RT exercise in an
auditing role comes with a medium level risk. This is because they get access to new
information and possibly new contexts.
1.6.2.9 Red-Teaming Observer (Risk Level—Medium)
RT Observer (RT-O) is a common role within a RT exercise. One would expect

several people to be observing the RT exercise. The RT-Os may be members of the
board who gain peace of mind that the exercise is working properly by observing
the exercise. RT-Os may also be trainees who are observing to learn.
The risk associated with observation is medium because of the learning effect
that can come with observations. Moreover, observers may have access to data or
may be in a position to infer data from the actions of red and blue that expose them
to data they are not supposed to access. This is sometimes hard to be avoided, simply
because it is not easy to always understand what a person can learn from watching
another performing some tasks. However, one way to manage this risk is not to
allow observers to observe a long chain of events in the exercise. The way observers
need to move around a RT exercise needs to be designed properly to minimize the
possibility of exposing them to a sequence of events that can expose data to them.
1.6.3 The Ethics of Budget Estimation
The expense of RT may not represent a large amount of money to the organization
when compared to other expenditures of that organization. However, most large
expenditures in an organization are involved with production and core business.
RT can be mistakenly considered a “nice” activity to have, rather than as a core
activity for the organization. This can lead to a perception that the expense of RT is
unjustifiably high.
If the RT-D is pressured and agrees to begin the exercise with an insufficient
budget to demonstrate that the benefits are greater than the cost, the following three
undesirable possibilities may arise:
1. The quality of the exercise will be compromised to ensure the assigned budget
is not exceeded. The consequence of such a situation can be expressed simply:
“what is built on ashes will end in ashes.” a RT-D should understand that the
RT-S have one primary objective: obtaining the right answer to the questions
that motivated them to approve the exercise with the minimum cost. Obtaining
the right answer is not controllable by the stakeholders who are not necessarily
experts in RT. They will entrust the RT-D to provide them with the right design
and answers. However, they can control the cost. Therefore, they will always
attempt to push down the cost. The designer should not accept a budget that will
not lead to the right answer. Therefore, the ethical hurdle to ensure that the design
is right lies with the RT-D once they accept a budget.
2. The designer takes the attitude that if the RT exercise begins with an inap-
propriate budget, the stakeholder will be forced to assign more money to the
exercise when it is needed. For example, an organization begins with a promise
that cannot be delivered with the limited budget it assigns; however, the benefits
to cost ratio sounds so good in a newspaper headline as a proposal with great

value for money. The project is funded. Once the project is in place, it becomes
much more difficult to abandon it. It is easier to request an increase in the budget
once the project has begun because no one would want to admit that the initial
decision was incorrect. While some may claim that this is a business strategy, it is
asserted here that this represents an unethical attitude. a RT-D should not trap the
organization into such a situation. The designer needs to be clear about the costs
from the beginning. If an increase in budget is necessary because of unanticipated
and unintentional circumstances, this is natural because not all elements of the
exercise will be known with absolute certainty in advance. However, a RT-D
who intentionally hides costs to trap the organization into the situation described
above is committing an unethical act.
3. The RT-D changes the scope of the exercise to ensure that the available budget is
sufficient to provide the right answer for the new scope. There is nothing wrong
with this behavior per se, providing the RT-D understands that changing the scope
of the exercise would mean changing the question that was originally raised to
initiate the RT exercise. The new question should be formulated, and clearly
discussed and agreed with the RT-S.
1.7 From Red Teaming to Computational Red Teaming
1.7.1 Military Decision Sciences and Red Teaming
The military has been leading the efforts on RT for many decades, but only over
the last decade, the need for establishing RT as a science has been stressed. It is
important to explain why and how this book departs from RT in the military. To do
this, we will offer a personal reflection on the different views on RT within military
decision sciences.
Some computer security consulting companies use the words of RT as a muscle-
based approach, that is, the company demonstrates that they are able to penetrate
into any security system to satisfy a client concern. Since no system is bulletproof,
there are always ways to penetrate a system. RT became the brand to sell this
approach.
This view to RT is detrimental. First, it has legal consequences that can generate
many negative risks to organizations and the government as a whole. In computer
security, the objective of RT is not to penetrate a system, but to map out the space of
vulnerabilities from a threat lens. Second, the military, similar to scientists, is used
to disciplinary approaches to conduct any study. This is important because the value
of any military study is in the lessons gained. The muscle-based approach used by
some consultancy companies focuses on selling the final results and the success
in penetrating a system. Proper RT studies should focus instead on the systemic
and disciplinary design and approach followed in the study to clearly articulate the
lessons learnt.
1.7 From Red Teaming to Computational Red Teaming 33
Military decision sciences have been attempting to take a disciplinary approach

towards RT to avoid the misconception of the approach being a muscle-based
approach. In some studies, RT is seen as a tool “to identify and reduce risks” [21].
In other studies, RT has been placed into a natural root within military decision
sciences in the wider context of alternative analysis techniques, currently a NATO
standard for concept development. “Alterative analysis is the deliberate application
of independent critical thought and alternative perspectives to improve decision
making” ([36] from [12]).
Given the previous definition, it is natural to see RT as one of the alternative
analysis techniques. Matherly [12] developed a methodology to categorize 29
different alternative analysis techniques into four groups: Diagnostic Analysis,
Contrarian Analysis, Imaginative Analysis, and Adversarial Analysis. The name of
each of these techniques is mostly sufficient in many cases to understand how they
operate. Readers interested in a quick overview of these techniques can refer to [33]
(from [12]). Table 1.1 summarizes this categorization.
It is important to pause and reflect on this part of the literature. Military
commands need to be clear and unambiguous, because the consequences are
negatives if the commander’s intent is not understood properly. Military decision
scientists follow similar steps and attempt to explain concepts unambiguously
to avoid confusion. Because the military has been using many system thinking
tools, including alternative analysis tools, for a very long time, it is important to
differentiate between existing tools and concepts and any new ones.
This sometimes lead to a black-and-white approach towards defining any new
terminology that will be added to a military handbook or doctrine. Consequently,
definitions of concepts and techniques can turn out to be too narrow and over-
precise. Unfortunately, this is the cost RT has paid in some places.
As the discussion in this book has demonstrated up to now, RT is an exercise
that an organization conducts to evaluate something. Seeing a RT as an exercise
is important, because within this exercise, the red and blue teams can use any of
the techniques listed in Table 1.1 and beyond. The red team may need to conduct
a SWOT analysis, it may need to use scenarios, it may start the exercise with a
brainstorming session, or rely on sophisticated computer models and architectures
as we will discuss in Chap. 2. RT needs to be seen as an encompassing exercise
instead of narrowing the concept too much into a corner, where the concept becomes
obsolete.
In this book, RT is discussed as a special type of exercises with two cornerstones:
risk analytics and challenge analytics. It is the combination of, and emphasis on,
these two cornerstones that uniquely distinguishes RT from other forms of exercises.
1.7.2 Smoothing the Way Toward Computational

Red Teaming
In Sect. 1.1, it was described that John went through the mockup interview exercise
to prepare himself for the interview. He began with Martin and discovered that the
Table 1.1 Categorization of alternative analysis methods as suggested by

Matherly [12]
Category of analysis Method
Diagnostic analysis Experimentation
Key assumptions check
Analysis of competing hypothesis
Quality of information check
Indicators of change
Deception detection
Logic mapping
Gap analysis
Contrarian analysis High impact/low probability
Team-A vs. Team-B
“What if?” analysis
Wargaming
Intentional failure
Imaginative analysis Alternative futures
Brainstorming
Outside-in thinking
Argument deconstruction
Problem restatement
Strength, weaknesses, opportunities, and threats
OPFOR
“What if?” analysis
Scenario development
Alternative analysis
Adversarial analysis Devil’s advocate
Stakeholder mapping
Red team analysis
Surrogate adversary
Red cell exercises
Cultural capability
team composition was not right. They invited Amy to join the team. John asked them
to play devil’s advocate with him, to ask him difficult questions, and indeed, they
did. This was a simple example of RT that many people would have encountered
it in their life. Unfortunately, Martin and Amy did not have a book to read on
how to execute a RT exercise properly, or what is expected. That is, they relied
on their understanding of the exercise and their experience. They did a good job
but the lack of science principles from which to derive this process would mean
that they cannot teach what they learned to others, they cannot generalize it beyond
the limited experience they had, and they cannot properly justify their choices or
thinking process. This is the value of transforming the art of RT, to the science of
Computational Red Teaming (CRT).
So far, this chapter has distilled lessons learned from the military application
of RT, and the author’s own style and experience. CRT will contextualize this
into a wider context to generalize RT so that it leaves the realm of the military
to applications in industry, technology, and government departments to support
effective decision making.
Before CRT is discussed, more light will be shed on John’s experience. Many
people have experienced a job interview. A possible way to describe the dynamics
of a job interview is to view it in three stages: first, the welcome and ice-breaking
stage. Initial questions are asked such as: why did you apply for this position? what
do you bring to this position? These are the sort of questions that the candidate may
have anticipated or should be answered by any reasonable candidate without being
too stressed.
The second stage focuses on the job with more targeted questions. For example,
you only managed a budget of $1 million, but in this job you will need to manage a
budget of $100 million, can you convince us you are capable of managing this large
budget?
The third and final stage of an interview cools down the interview environment.
For example, if you are successful, when can you take up this position? Can we
contact your referees?
What is the objective of a job interview? In normal circumstances, the objective
is to select the best candidate for the position. The organization may have never
encountered this candidate before. Therefore, this candidate is a “black box;” the
selected candidate might be the right person for the job, or it may be that the
selection of this candidate was a big mistake that the organization has to deal with
for some time. From this perspective, a job interview is nothing more or less than a
risk-assessment exercise, whereby the organization assesses the risk of appointing
each candidate. In this context, this risk is simply how the uncertainty about the
candidate can be assessed to judge properly its impact on organizational objectives.
Challenging the candidate is the means to execute this assessment. The sec-
ond stage of the interview discussed above is normally where the candidate is
challenged-this is one of the cornerstones of CRT: what is a challenge, and how
to challenge? The selection committee evaluates the application, referees reports,
and may test the candidate before the interview using a number of psychological
and technical tests. During the ice-breaking stage of the interview, the selection
committee continues to evaluate the candidates. Sometimes this stage triggers more
questions later in the interview.
During the second stage, the candidate is challenged. The selection committee
attempts to estimate the boundaries of each candidate’s abilities, skills, and behav-
ioral space. This can take many forms, including direct questions, or by presenting
the candidate with a real-life situation and asking for an opinion. For these questions
to challenge the candidate, the selection committee observes the responses, updates
its beliefs about the candidate’s skills and abilities, and steers the challenge. This
process reduces, or at least adds more confidence in, the selection committee’s
space of uncertainty about the candidate. Every time a possible doubt exists, a new
challenge is formulated. The process of challenge is guided by how the uncertainty
about the candidate may impact the organization and job’s objective. However, this
is a weak form of a challenge. It is subjective and mostly ad-hoc.
The science of RT, that is, CRT, will have these two cornerstones: risk and
challenge as the basis for designing and understanding the process of CRT. In
today’s world, where data and models are abundant, CRT attempts to design an
architecture to bring together the elements of risk and challenge to achieve the
objective of the exercise.
As was explained in the preface, the word “computational” emphasizes the aim
to design systemic steps for RT. It does not necessarily mean “computer based.”
However, in complex RT situations, and assuming that an organization understands
the CRT science to be discussed in this book, computer support for the RT exercise
is vital.
Before the discussion on CRT progresses, two issues at the interface of computer
science and CRT need to be explained. One is related to CRT, where computer
scientists have been tempted to automate RT exercises completely. The other is
related to computer science, where CRT offers an interesting perspective on the
concept of “intelligence” such as in artificial intelligence (AI).
1.7.3 Automating the Red-Teaming Exercise
In 2011, the DSTO in Australia published a technical report entitled “Moving

Forward with Computational Red Teaming” [21]. In 2012, DSTO published a
second report with the same title but by a different author. Reference [21] selected
two definitions of CRT from the literature: these two definitions presented below
derived from [2] and [4], respectively.
Definition 1.3. CRT is a set of methodologies and computational models that
augment a human-based RT exercise or perform a computer-based, more abstract
RT exercise.
Definition 1.4. CRT is a framework built on a set of computational models that can
assist a human-based RT exercise smartly and responsibly.
The word “computational” means systemic calculations. Computational or sys-
temic RT is a suite of methodologies for implementing or supporting a RT exercise.
CRT [1, 3, 4, 48, 57] opens the doors for designing computer models for and of
the RT exercise. Some may take this statement to its extreme, believing that we can
automate the entire RT exercise. Some may believe it means we should automate the
red side of the RT exercise. Why not? If we can structure every step of the process,
what would stop us from automating it?
Possibly, one day the AI science will exist to automate the entire RT exercise.
However, even when this day comes, it must be remembered that at its core RT is a
human-centric exercise. It is not simply a search technique for finding mistakes. It
is not simply a computer tool for training. It is an exercise that primarily supports
a human to think about a problem by being embodied and situated in the problem
and its context. As such, perhaps it is more sound to consider implementing an
augmented-reality version of RT than commanding a computer to perform RT on
one’s behalf.
Nevertheless, many of the components of the RT exercise can be automated if
we can structure them in a systemic manner and into computational elements that
are computable. Therefore, it is important first to focus on establishing the science
so that we are able to automate RT, if we can.
The automation of RT, which is a large complex exercise, should not be confused
with blue-red simulations, which is a special type of behavioral-based simulation
systems.
1.7.4 Blue-Red Simulation
Any form of interaction between two parties can be modeled as a blue-red

simulation. If two people attempt to persuade each other with their points of view,
one person can be considered blue, and the other red. The blue-red simulation
attempts to imitate this situation of exchange of opinions, and possibly to counteract
the other’s logic. In essence, blue-red simulations are generic simulations to model
and explore the dynamics of a competitive interaction within a given context.
Blue-red simulation technologies have been used in the military for decades.
This has led to many previous attempts that can be considered a form of CRT such
as those discussed in [14–16]. Both forms of blue-red simulations: human based and
computer based are common in military planning. The computer-based simulations
include the following: ModSAF [10], Enhanced ISAAC Neural Simulation Toolkit
(EINSTein) [25], Irreducible Semi-Autonomous Adaptive Combat (ISAAC) [26],
OneSAF [53], Comprehensive Research-Oriented Combat Agent Distillation Imple-
mented in the Littoral Environment (CROCADILE) [7], Map Aware Nonuniform
Automata (MANA) [30], JANUS [11], PAX [41], Warfare Intelligent System
for Dynamic Optimization of Missions (WISDOM) [55–57], BactoWars [51, 52],
CASTFOREM [32], Pythagoras [9], jSWAT [34, 35], and PAX3D [28].
Blue-red simulation has traditionally been used in the military to simulate combat
situations. This area of research flourished, taking a new direction in the 1990s
when researchers found interest in concepts from the field of complex adaptive
systems (CAS) within blue-red simulations. CAS emphasize the need for simple
representations of the agents, simple behaviors of these agents, and complexity as
an emerging phenomenon that arises from agents’ interactions with each other as a
group, rather than from the complexity of each individual alone.
This concept of simple agents generating complex group behaviors became
very appealing to researchers. Systems were developed including EINSTein [25]
ISAAC [26], and CROCADILE [7]. Unfortunately, it did not take long to realize
that while it was true that the simple agents produced behaviors on the screen that
appeared complex, there were two major problems in these systems.
The first problem was that agents could behave in manners that military personnel
considered meaningless. In essence, they believed that no troop would behave
in such a manner in a real-world situation. This generated suspicion about the
validity of these systems. The second problem was the lack of any means to explain
why certain complex behaviors arose in the simulation. Without knowing why, the
military was unable to transform the results of these simulations into meaningful
doctrines to adopt.
These two problems were acknowledged by researchers in the field. The first
problem was considered an advantage in the simulations-these strange behaviors
can generate risk. While a military culture may not allow these behaviors, an
individual might behave in such a manner if they had lost their sanity. Therefore,
these behaviors were not considered a disadvantage per se. However, if the designer
wanted to enforce a military hierarchy, there was no way in these simulations to
maintain the coherence of such a hierarchy over the course of the simulation.
The second problem was dealt with by researchers using two means: visualiza-
tion and post-analysis using data-mining techniques. Visualization provided easy,
but extremely effective, tools for gaining insight relevant to humans. The research
into data mining has resulted in post-analysis tools that can collate the massive
amount of information produced by these simulations in a coherent form. Both
means were combined and referred to as “data farming.”
In an attempt to address these problems, the author of this book with his
PhD student at the time designed WISDOM [55–57] as a system to solve both
problems. WISDOM designed the internal architecture that enabled a solution for
both problems. First, every relationship between any two agents was represented as
an explicit network. For example, vision (who sees whom in an environment) was
represented as a vision network that is reformed in every step of the simulation.
Similarly, the command structures involving factors such as communication were
represented as networks. An influence diagram was then constructed to represent
how these networks influence each other and in which context. As shown in Fig. 1.1,
each relationship/network within the agent society is associated with a node in the
influence diagram, which acts as a prior knowledge to guide reasoning.
Given that agents interact nonlinearly in a CAS, it is almost impossible to
understand how group-level behavior relates to the behavior of individual agents.
For example, who is responsible for producing an idea that resulted from a group
discussion: the person who uttered it, the people who were discussing it before that,
or someone who said something very early in the discussion and kept silent for the
rest of the time?
Reasoning in complex systems is difficult. WISDOM relies on the fact that
each relationship is a network and that these networks are interdependent in a
manner described by the influence diagram to do reasoning. Figure 1.2 offers an
approach that enables reasoning in these highly nonlinear dynamical systems. At
each simulation time step, each network is analyzed and many network measures
are extracted. Over time, these network measures form different time series. The
influence diagram represents the domain knowledge required to interpret these time
series. All that remained was to design data-mining tools to correlate these time
Fig. 1.1 Connecting relationships among agents to high level reasoning
Fig. 1.2 Reasoning in nonlinear dynamics using networks and time series analysis
series to provide the confidence that one change in a network influenced a change
in another network, sometimes with a long time lag in between.
Some researchers have attempted to equate blue-red simulation with RT. This
can be considered a weak comparison because RT requires explicit understanding
and modeling of risk and challenge.
Researchers, including the author and his students, have attempted to automate
some aspects of the RT exercise. These attempts have created a rich literature
that can be leveraged for use in CRT, but clearly, as the remainder of this book
demonstrates, the gap is extremely large between the current state of automation of
RT and CRT.
The first attempt to claim an Automated RT (ART) is attributed to preliminary
discussions by Upton and McDonald [47, 48]. ART relies on the evolutionary-
computation search technique (discussed in Sect. 3.3) to uncover good red strategies
in predefined scenarios. The two cornerstones of CRT were not modeled or
discussed. Thus far, all ideas for automating RT can be described as wrapping an
optimization layer (mostly evolutionary or co-evolutionary computation) around
a blue-red simulation. The search technique uses a blue-red simulation environ-
ment to evaluate its proposed strategies and solutions. A more serious series of
computational studies was conducted simultaneously by the author and his PhD
student [54, 55].
This line of research was followed by rich literature on the topic. Within the realm
of military-style blue-red simulations, ART [13] and Automated Co-Evolution
(ACE) [30] continued the traditions of EINSTein [25] and WISDOM [54, 55]
in adopting evolutionary algorithms to search the strategy space of blue and
red. More studies emerged on blue-red simulations under the banner of CRT,
including [17, 18, 23, 24, 38, 39], and one of the early studies using CRT for risk
assessment is reported in [8].
Outside the realm of the military, CRT began to be included in a wide range of
applications, including in cyber security [40, 42, 44] and air-traffic control [5, 6, 58].
Some early review papers include [1, 3, 4].
The above literature has provided a rich foundation for CRT. However, there
are many opportunities and research areas that remain unexplored. The remainder
of this book will discuss the science of CRT to draw a map of these unexplored
areas. The objective is to explain the foundations of CRT in an attempt to drive the
literature toward more meaningful studies on the RT domain.
1.8 Philosophical Reflection on Assessing Intelligence
1.8.1 The Imitation Game (Turing Test) for Assessing

Intelligence
Since the inception of computer science, and the dream to execute in silico what
humans can do in their minds, one of the fundamental questions that has generated
1.8 Philosophical Reflection on Assessing Intelligence 41
many inquiries into the philosophy of AI is what is “intelligence” in the first place.
Researchers ask what this word means and how it can be judged whether an entity
is intelligent.
The history of the philosophy of AI is replete with famous stories: from the
Chinese Room Argument that negatively impacted the work on Natural Language
Understanding, changing the name of the field to Natural Language Processing, to
the inability of the perceptron node in artificial neural networks to solve the XOR
problem.
One topic that created a great deal of discussion over the years is how to judge
whether a machine is intelligent.
Alan Turing [45] proposed an answer: the Imitation Game (IG). In this game, an
AI or a machine contestant is placed in one room, a human contestant is placed in
another room, and a human judge sits in a third room. The human judge does not
know who is in which room.
The human judge begins by asking questions to the agents in both rooms. At the
end of the task, the human judge needs to judge which room has the human and
which room has the machine. If the human judge believes the room that has the
machine is the one that has the human, the machine has passed the intelligence test.
Recently, a version of the IG was introduced for computer games in a competition
termed Human-Bots [22]. Interestingly, no machine has ever passed the IG test to
this day. If this had been a test provided to humans in a school or a university
environment, it would have been scrambled by university management long ago.
IG has been widely criticized by many, but no valid alternative has been proposed.
IG has many fundamental drawbacks. Some of these are discussed below from a
technological, rather than a philosophical perspective:
• IG assumes a binary definition of intelligence that does not help to establish a
gradient to advance the science of AI. The test does not allow intelligence to
be assessed on a score. While this is not a major drawback (as it can be easily
amended by asking the human judge to score each room or provide a weight that
a room has a human), designing such a score function would be sensitive to the
subjective opinion of a human judge.
• IG advances research in AI in a backward direction! The fundamental concept
underneath IG is for the machine to match human abilities and inabilities,
equally! For example, given that the judge would expect a human to make a
mistake when asked to complete complex calculations in a short period, or at
least to take longer time to complete such calculations, AI designers attempted
to mimic this human inability by slowing down the calculations or introducing
deliberate errors in the calculations. Such a behavior is not useful from an
engineering perspective: what is the point of producing human mistakes in a
machine from a technological perspective? If a society has a gifted child, should
the child not be embraced, or should society ask the child to make more mistakes
to seem like other children? Obviously, this is a matter of perspective.
• IG is logically inconsistent. Imagine both rooms have humans inside, what would
be the meaning of the decision made by the judge? Equally, if both rooms have
a machine, does it mean that whichever room the judge believes has the human
has an intelligent machine?
• IG makes a wrong assumption on a fundamental level: that intelligence is
context independent. Is a smart computer scientist or a mathematician necessarily
smart in a social context? Intelligence is a context-dependent phenomenon.
A mathematician specializing in the field of mathematical topology may appear
smarter than many other people when asked questions in this field. However,
our ability or inability to answer questions depends on many factors, including
our workload level, fatigue level, stress level, knowledge in the domain, level
of maturity, and our attitude toward self-reflection. Today, human intelligence
is assessed using multiple scales such as the intelligence quotient (IQ), and the
emotional quotient (EQ). Intelligence should not be assessed independently of
context because there is no single type of intelligence, even within human society.
• IG assumes that the ultimate aim of AI is to imitate humans rather than
complement humans. A social system is constructed with each human playing
a role that serves the society as a whole (i.e. the division of labor principle) and
allows the people in the society to live in harmony and learn to act intelligently.
If all humans in a society attempt to imitate each other, creating the same copy of
one another, the system collapses and the concept of intelligence will be erased
from the system. Therefore, it can be said that intelligence breeds differences
and a differentiation process among the agents in the environment. Similarly, we
should aim to cherish the differences that AI offers, not the similarities, when
assessing intelligence.
1.8.2 Computational Red Teaming for Assessing Intelligence
CRT offers perspective on how intelligence should be assessed. The basic tenet
of CRT is its reliance on deliberate challenge. In its simplest form, a deliberate
challenge may take the form of a debate, or as in John’s story, a mockup interview.
A reciprocal interaction between two entities, where each entity attempts to
deliberately challenge what the other is attempting to achieve, is a more objective
manner for each entity to evaluate the other.
The concept of a challenge does not need an external judge to make a decision.
Instead, the parties themselves can assess their own interaction. Every time one party
throws a challenge at the other, the recipient party can assess how far this challenge
truly expands its horizon. This is what occurs in a social system. People tend to
evaluate each other constantly based on feedback they receive through conversation
and interaction.
The concept of a deliberate challenge is different from a classical competition.
While in both cases, a context exists that bounds the scope of the interaction, the
primary aim in a competition is to win, regardless of whether a new lesson has been
learned.
References 43
In RT, the primary aim of the exercise is to learn new things. Blue attempts to
learn about the opponent, holes in their own thinking, and holes in their planning;
they also attempt to estimate their boundaries: where red’s abilities end and blue’s
inabilities begin. These are the boundaries between what blue can and cannot do,
what blue knows and does not know, and where the true challenges lie.
The process of estimating these boundaries, probing the other team with events
that require them to act outside their skills boundaries, and designing mechanisms
to counteract this team action distinguishes RT from a classical competition.
RT offers this unique mechanisms that provide objective ways to not only assess
intelligence, but to analyze the system to be assessed. For example, the time taken by
one side to sustain the interaction before it breaks down can be such objective metric.
One can rely on syntactical and semantic complexity to analyze the interaction or
more advanced complexity metrics [43]. This analysis can assist in pushing this
system up in the intelligence scale by identifying root causes for the limited behavior
expressed by the system during the RT Exercise.
References
1. Abbass, H.: Computational red teaming and cyber challenges. In: Platform Technologies
Research Institute Annual Symposium, PTRI (2009)
2. Abbass, H.A., Barlow, M.: Computational red teaming for counter improvised explosive
devices with a focus on computer games. In: Gowlett, P. (ed.) Moving Forward with
Computational Red Teaming. DSTO, Australia (2011)
3. Abbass, H.A., Bender, A., Gaidow, S.: Evolutionary computation for risk assessment using
computational red teaming. In: Sobrevilla, P., Aranda, J., Xambo, S. (eds.) 2010 IEEE
World Congress on Computational Intelligence Plenary and Invited Lectures Proceedings,
pp. 207–230. IEEE, Barcelona (2010)
4. Abbass, H., Bender, A., Gaidow, S., Whitbread, P.: Computational red teaming: past, present
and future. IEEE Comput. Intell. Mag. 6(1), 30–42 (2011)
5. Alam, S., Zhao, W., Tang, J., Lokan, C., Ellejmi, M., Kirby, S., Abbass, H.: Discovering delay
patterns in arrival traffic with dynamic continuous descent approaches using co-evolutionary
red teaming. Air Traffic Control Q. 20(1), 47 (2012)
6. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Computational red teaming for
correction of traffic events in real time human performance studies. In: USA/Europe ATM
R&D Seminar, Chicago (2013)
7. Barlow, M., Easton, A.: Crocadile-an open, extensible agent-based distillation engine. Inf.
Secur. 8(1), 17–51 (2002)
8. Barlow, M., Yang, A., Abbass, H.: A temporal risk assessment framework for planning a future
force structure. In: IEEE Symposium on Computational Intelligence in Security and Defense
Applications, (CISDA), pp. 100–107. IEEE, Honolulu (2007)
9. Bitinas, E.J., Henscheid, Z.A., Truong, L.V.: Pythagoras: a new agent-based simulation system.
Technol. Rev. J. 11(1), 45–58 (2003)
10. Calder, R., Smith, J., Courtemanche, A., Mar, J., Ceranowicz, A.Z.: Modsaf behavior sim-
ulation and control. In: Proceedings of the Conference on Computer Generated Forces and
Behavioral Representation (1993)
11. Caldwell, W.J., Wood, R., Pate, M.C.: JLINK—Janus fast movers. In: Proceedings of the
27th Conference on Winter Simulation, pp. 1237–1243. IEEE Computer Society, Washington
(1995)
12. Carter Matherly: The Red Teaming Essential. Selectedworks (2013)

13. Choo, C.S., Chua, C.L., Tay, S.H.V.: Automated red teaming: a proposed framework for
military application. In: Proceedings of the 9th Annual Conference on Genetic and Evolu-
tionary Computation, pp. 1936–1942. ACM, New York (2007)
14. Davis, P.K.: Applying artificial intelligence techniques to strategic-level gaming and simula-
tion. Tech. rep., Rand Corporation (1988)
15. Davis, P.K.: Dealing with complexity: exploratory analysis enabled by multiresolultion,
multiperspective modeling. In: Proceedings of the 32nd Conference on Winter simulation,
pp. 293–302. Society for Computer Simulation International, San Diego (2000)
16. Davis, P.K., Bankes, S.C., Egner, M.: Enhancing Strategic Planning with Massive Scenario
Generation: Theory and Experiments, vol. 392. Rand Corporation Press, Santa Monica, CA
90407-2138 (2007)
17. Decraene, J., Chandramohan, M., Low, M.Y.H., Choo, C.S.: Evolvable simulations applied
to automated red teaming: a preliminary study. In: Proceedings of the Winter Simulation
Conference, pp. 1444–1455. Winter Simulation Conference (2010)
18. Decraene, J., Zeng, F., Low, M.Y.H., Zhou, S., Cai, W.: Research advances in automated red
teaming. In: Proceedings of the 2010 Spring Simulation Multiconference, p. 47. Society for
Computer Simulation International (2010)
19. Director, C.O.: Plans. defence capability development manual. Tech. rep., Technical report,
Australian Department of Defence (2006)
20. Force, T.: The role and status of dod red teaming activities. Tech. rep., Office of the Under
Secretary of Defense for Acquisition, Technology, and Logistics (2003)
21. Gowlett, P.: Moving forward with computational red teaming. Tech. rep., Defence Science and
Technology Organisation - DSTO, Australia (2011)
22. Hingston, P.: A turing test for computer game bots. IEEE Trans. Comput. Intell. AI Games
1(3), 169–186 (2009)
23. Hingston, P., Preuss, M.: Red teaming with coevolution. In: IEEE Congress on Evolutionary
Computation (CEC), pp. 1155–1163. IEEE, New Orleans (2011)
24. Hingston, P., Preuss, M., Spierling, D.: Redtnet: a network model for strategy games. In: IEEE
Congress on Evolutionary Computation (CEC), pp. 1–9. IEEE, Barcelona (2010)
25. Ilachinski, A.: Enhanced ISAAC neural simulation toolkit (EINSTein): an artificial-life
laboratory for exploring self-organized emergence in land combat (U). Center for Naval
Analyses, Beta-Test Users Guide 1101, no. 610.10 (1999)
26. Ilachinski, A.: Irreducible semi-autonomous adaptive combat (ISAAC): an artificial-life
approach to land combat. Mil. Oper. Res. 5(3), 29–46 (2000)
27. ISO: ISO 31000:2009, Risk Management - Principles and Guidelines (2009)
28. Lampe, T., Seichter, G.: Pax3d refugee camp scenario–calibration of the adapted pax model.
In: Scythe: Proceedings and Bulletin of the International Data Farming Community (Scythe 6),
IDFW18, Monterey, CA, USA (2009)
29. Lauder, M.: Red dawn: the emergence of a red teaming capability in the Canadian forces. Can.
Army J. 12(2), 25–36 (2009)
30. Lauren, M., Silwood, N., Chong, N., Low, S., McDonald, M., Rayburg, C., Yildiz, B., Pickl, S.,
Sanchez, R.: Maritime force protection study using mana and automatic co-evolution (ACE).
In: Scythe: Proceedings and Bulletin of the International Data Farming Community, vol. 6,
pp. 2–6 (2009)
31. Longbine, D.F.: Red teaming: past and present. Tech. rep., DTIC Document (2008)
32. Mackey, D., Dixon, D., Loncarish, T.: Combined arms and support task force evaluation model
(castforem) update: methodologies. Tech. rep., Technical Document TRAC-WSMR-TD-01-
012, US Army TRADOC Analysis Center, White Sands Missile Range, NM (2001)
33. MD (Ministry of Defense): Red teaming guide. Tech. rep., Wiltshure: The Development,
Concepts and Doctrine Centre (2013)
34. Menadue, I., Lohmeyer, D., James, S., Holden, L.: jSWAT2-the application of simulation to
support seminar wargaming. In: SimTecT (2009)
References 45
35. Millikan, J., Brennan, M., Gaertner, P.: Joint seminar wargame adjudication tool (jSWAT). In:
Proceedings of the Land Warfare Conference (2005)
36. NATO: Bi-strategic command alternative analysis concept. Tech. rep., Supreme Allied
Commander, Norfolk (2012)
37. Porter, M.E.: What is strategy? Harv. Bus. Rev. (November–December), 61–78 (1996)
38. Ranjeet, T.: Coevolutionary algorithms for the optimization of strategies for red teaming
applications. Ph.D. thesis, Edith Cowan University (2012)
39. Ranjeet, T.R., Hingston, P., Lam, C.P., Masek, M.: Analysis of key installation protection
using computerized red teaming. In: Proceedings of the Thirty-Fourth Australasian Computer
Science Conference, vol. 113, pp. 137–144. Australian Computer Society, Darlinghurst (2011)
40. Rastegari, S., Hingston, P., Lam, C.P., Brand, M.: Testing a distributed denial of service
defence mechanism using red teaming. In: IEEE Symposium on Computational Intelligence
for Security and Defense Applications (CISDA), pp. 23–29. IEEE, Ottawa (2013)
41. Schwarz, G.: Command and control in peace support operations model pax-approaching new
challenges in the modeling of c2. Tech. rep., DTIC Document (2004)
42. Shafi, K., Abbass, H.A.: Biologically-inspired complex adaptive systems approaches to
network intrusion detection. Inf. Secur. Tech. Rep. 12(4), 209–217 (2007)
43. Teo, J., Abbass, H.A.: Multiobjectivity and complexity in embodied cognition. IEEE Trans.
Evol. Comput. 9(4), 337–360 (2005)
44. Thornton, C., Cohen, O., Denzinger, J., Boyd, J.E.: Automated testing of physical security: red
teaming through machine learning. Comput. Intell. (2014)
45. Turing, A.M.: Computing machinery and intelligence. Mind, pp. 433–460 (1950)
46. Tzu, S.: The Art of War, p. 65. Translated by Samuel B. Griffith. Oxford University Press,
New York (1963)
47. Upton, S.C., McDonald, M.J.: Automated red teaming using evolutionary algorithms. WG31–
Computing Advances in Military OR (2003)
48. Upton, S.C., Johnson, S.K., McDonald, M.J.: Breaking blue: automated red teaming using
evolvable simulations. In: GECCO 2004 (2004)
49. Von Clausewitz, C.: On War. Digireads. com Publishing (2004)
50. Wheeler, S.: Moving forward with computational red teaming. Tech. rep., Defence Science and
Technology Organisation - DSTO, Australia (2012)
51. White, G.: The mathematical agent-a complex adaptive system representation in bactowars. In:
First Workshop on Complex Adaptive Systems for Defence (2004)
52. White, G., Perston, R., Bowden, F.: Force flexibility modelling in bactowars. In: Proceedings
of the International Congress on Modeling and Simulation (MODSIM), pp. 663–669 (2007)
53. Wittman Jr, R.L., Harrison, C.T.: Onesaf: A product line approach to simulation development.
Tech. rep., DTIC Document (2001)
54. Yang, A., Abbass, H.A., Sarker, R.: Evolving agents for network centric warfare. In: Proceed-
ings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 193–195. ACM,
Washington (2005)
55. Yang, A., Abbass, H.A., Sarker, R.: Landscape dynamics in multi–agent simulation combat
systems. In: AI 2004: Advances in Artificial Intelligence, pp. 39–50. Springer, Berlin (2005)
56. Yang, A., Abbass, H.A., Sarker, R.: Characterizing warfare in red teaming. IEEE Trans. Syst.
Man Cybern. B 36(2), 268–285 (2006)
57. Yang, A., Abbass, H.A., Sarker, R.: How hard is it to red team? In: Abbass, H.A., Essam,
D. (eds.) Applications of Information Systems to Homeland Security and Defense, p. 46. IGI
Global, Hershey (2006)
58. Zhao, W., Alam, S., Abbass, H.A.: Evaluating ground–air network vulnerabilities in an inte-
grated terminal maneuvering area using co-evolutionary computational red teaming. Transp.
Res. C Emerg. Technol. 29, 32–54 (2013)
Chapter 2
Analytics of Risk and Challenge
The formulation of a problem is often more essential than its solution,

which may be merely a matter of mathematical or experimental skill. To raise
new questions, new possibilities, to regard old problems from a new angle,
requires creative imagination and marks real advance in science
Albert Einstein and Leopold Infeld (1938) [2]
Abstract As emphasized several times in the previous chapter, CRT is about

analyzing risk and designing deliberate challenges. Whether we are deliberately
challenging the effectiveness of a strategic plan or the scalability of an optimization
or big-data mining algorithm, the concept of a challenge has the same fundamental
characteristics. The purpose of this chapter is to develop a disciplinary approach
to structure and model the analysis of risk and the concept of a challenge. This
structure can assist an automated system to risk assess and challenge a human or
a computer autonomously, and to teach the concept of challenge in a disciplinary
manner to humans. What is risk? How to analyze risk and how to “think” risk?
What is a challenge? What do we mean by deliberate? How do we design and
model the concept of a challenge deliberately? How do we systematically design
a challenge on which both humans and computers to operate? This chapter will
address these questions by establishing a unifying theory that defines and models
systems, uncertainty, ability, skill, capacity, competency, performance, capability,
and our ultimate aim, risk and challenge.
2.1 Precautions
This chapter will revisit many basic concepts that may seem already known to many
readers. Nevertheless, a formal definition of each of these concepts will be provided.
Some of the definitions will be obvious, some may deviate from daily uses of the
concept, and some may even contradict our present understanding of the concept.
This is why defining these basic concepts is essential.

48 2 Analytics of Risk and Challenge
The discussion of many concepts in this chapter intersects with other disciplines,
including those of the behavioral and educational sciences and organizational
psychology. In fact, psychology literature is rich in dealing with these concepts,
with many articles published on each of the many concepts that will be discussed
here.
A CRT exercise may include a behavioral psychologist to perform a behavioral
assessment of the blue team. It may use an organizational psychologist to understand
the culture of the blue organization or it may include a cognitive psychologist
to advise on task designs with specific cognitive-load characteristics to overload
blue’s thinking. Our discussion in this chapter does not aim to discuss these roles
and the science needed to perform each of them. A psychologist in any of these
roles is another team member of the CRT exercise, bringing their own expertise to
the CRT exercise. Psychology literature examines each of these roles and concepts
underpinning them with more depth than the discussion here.
The discussion in this chapter does not aim to reproduce the psychology
literature, nor does it aim to introduce a new psychological theory. The main aim is
to design a model of a challenge that we can use in a computational environment.
This model will be used to analyze an algorithm, a machine, a human or an
organization. The discussion will offer simple and structured behavioral models that
can be used by non-psychologists. These models are simple when compared to the
great amount of literature available on the concepts, and the complexity involved
in understanding human psychology. However, the models are reliable because
whether we use pencil and paper or computers to red team, and whether we use
them for small or large-scale problems, they will produce results that can be traced
to causes and evidence.
To bring the different pieces of a model of challenge together successfully, the
discussion will intersect with a number of fields, including psychology, education,
risk management, system theory, and computational sciences. Structuring these
concepts is a daunting task. First, science by nature offers a thesis and antithesis.
The reader may find scientific articles with different definitions that contradict
each other. In places, the treatment of the topic will certainly contradict some of
this science. Second, most of these concepts are also used in our daily language;
therefore, a first encounter with a definition for any of these concepts that does not
comply with one of our daily uses may create unease for the reader.
Nevertheless, given that one of the aims is to structure these concepts so that we
are able to compute them, we must define them clearly in an unambiguous manner.
Such unambiguous definitions will eliminate confusion in the reader’s mind while
reading this book, even if the definitions themselves are not universally accepted.
2.2 Risk Analytics 49
2.2 Risk Analytics
We define risk analytics as follows:

Definition 2.1. Risk analytics is the process of transforming data and requirements
into actions using risk thinking and a disciplined risk methodology to understand
historical situations, anticipate and predict futures, select appropriate courses of
actions for an organization to implement, and/or determining novel ways for an
organization to operate.
The above encompassing definition covers the roles and benefits of risk analytics
within an organization. To illustrate risk analytics as a process, Fig. 2.1 presents
six standard steps. These steps are very similar to those followed in any type
of decision making situation. However, risk analytics emphasizes that the overall
decision making process is guided with, and centered on, the concept of risk.
The first step is related to intelligence gathering and reconnaissance operations,
that is, the process of collecting data and targeted evidences to support the
decision making process. In the military and security domains, intelligence and
reconnaissance are two classic functions that drive any operation. Similarly, in
businesses, the field of business intelligence has witnessed large interest to provide
the data required to steer the decision making process. In government, evidence-
based policy is normally the terminology used to stress the need for having the right
data to shape policy development.
Intelligence does not only react to the needs of the organization, but also
provides a proactive capability to shape and drive organizational needs. As data gets
Fig. 2.1 Risk analytics steps

collected, the organization continuously assesses the situation, the associated risks,
and the threats that may exist in the environment. Most of these terminologies, such
as risk and threats, will be explained in more details in the rest of this chapter.
For the time being, we can rely on common knowledge in understanding these
terminologies to follow the current discussion on the risk analytics process.
When the organization identifies a specific type of threat or a possible negative or
positive impact on organizational objectives, a need arises to analyze this situation
and formulate alternatives. Response analysis is the process of formulating and
assessing responses. Consequence analysis then projects some of the selected
responses onto future states to assess the longer term impact of these responses
on organizational objectives.
A suitable response is then selected. The response design step transforms the
selected response into suitable actions that can be executed. For example, one
important aspect of response design is how the selected response will be framed
to others. The organization may decide to fire people. Will the organization present
this response as a direct consequence of drop in sales, as a restructure of operations
to improve productivity, or as a step towards renewing the organization. Framing the
response is a very critical skill that can dramatically impact the effectiveness of the
response in achieving the intended impact.
When risk analytics relies on designing challenges as the tool to react to threats,
the process gets more targeted, where the threat actor becomes the focal point of the
analysis. In other words, intentional actions become more paramount in the analysis,
as well as the response.
2.2.1 Intentional Actions
CRT is designed to challenge an entity. The success of our ability to challenge this
entity must be reflected in its performance. In CRT, this entity can be anything from
a human to a machine, from a company to a country, and from a technology to ideas
and beliefs. Regardless of what the entity is, it needs to have an owner. We will call
the owner a person or an agent.
We will follow a legal definition of a “legal person,” or a person for short.
A person can be natural, as a human, or a juridical, as a corporation. We will
reserve the word “agent” to mean both a person and software that performs some
tasks by producing actions. We will use the word “entity” to refer to agents that
think/compute and act or objects that do not think or act. If our discussion is limited
to a software agent, we will explicitly refer to it as a “software agent.”
While a person and an agent are systems by definition, as will be discussed in this
section, the word “system” will be used to emphasize the structure over the identity,
and the words “person” or “agent” will be used to emphasize identity over structure.
Whatever the type of an agent, we will consider an agent as a living organism:
it continuously produces actions in the environment. Even if the agent stays still,
staying still is an action. When a human or a computer program goes to a sleep, this
Fig. 2.2 Understanding

agents’ actions and their
relationships to the
environment
is an action in its own right. Therefore, an agent produces actions continuously in

the environment; each action will produce outcomes.
An agent (see Fig. 2.2) lives within four different generic environments that
we will call them the four domains. These are social, cognitive, cyber, and
physical domains (SC2PD). These environments are surrounded with many different
contexts, including the PESTE contexts. An agent lives within these contexts and
environments, and impact them by generating actions which create events that
influence the context, environment and the agent.
CRT is not a context that concerns reflex, unintentional or ad-hoc actions. A red
team is established for a purpose, and with an understanding of who the blue team
is. Therefore, in the context of CRT, we will focus on intentional actions.
Definition 2.2. An intentional act is the production of an action by an agent to fulfil
the agent’s goals.
Therefore, these intentional actions are not produced in vacuum; they are
produced to achieve an agent’s goal. This does not necessarily mean that an action
successfully achieves the goal of the agent. At the time the action was produced,
the agent’s intent was to achieve an agent’s goal, irrespective of whether this action
was actually successful in achieving this goal. This produces deviations between
the actual outcome of actions and the intended outcomes by the agent. When these
deviations are sensed by the agent, they act as a feedback signal for the agent to
adjust its set of actions accordingly.
2.2.2 Objectives and Goals
A properly designed intentional action needs to consider the outcomes the agent
intended to achieve the fulfilment of the agent’s objectives, goals and the uncertainty
surrounding the achievement of these outcomes. This begs the question of what
these concepts mean.
Definition 2.3. An objective is an approximately measurable phenomenon with a
direction of increase or decrease.
The phenomenon can be the agent’s state. For example, when an affective state
such as happiness or a physical state such as monetary richness become the subject
of an objective, we would usually have a metric to measure this state. In the case
of the affective state of happiness, we may not have a direct manner by which to
measure the state itself, but we can use a set of indicators. These indicators are
blended (fused) to provide a measurement of the degree of happiness. We would
then either attempt to increase (maximize) or decrease (minimize) the degree of
happiness.
In CRT, the objectives of both teams are somehow interdependent because
the agent’s states are interdependent on each other. For example, the red team’s
affective state of happiness may be negatively influenced by the blue team’s state
of richness (as in the simple case of human jealousy); thus, a decrease in blue’s
richness generates an increase in red’s happiness. In this case, the red team may
have an objective of minimizing the richness of the blue team to maximize its own
happiness. If the teams’ objectives are independent of each other, they should act
independently; therefore, there is no need for the CRT exercise in the first place.
If red and blue objectives are positively correlated,1 they can optimize their
objectives either by continuing to act independently, or by taking an opportunity
that might arise to optimize their objectives by acting cooperatively. In this case, the
objective of the CRT exercise is to explore novel opportunities for collaboration.
However, in most cases, CRT exists for competitive situations.2 In this case, a
blue-red competition can only exist if blue and red have conflicting objectives. Con-
flicting objectives can take two forms. In the first form, the objectives themselves
are in direct conflict with each other. For example, in a situation of war, blue wishes
to win at the cost of red losing, and vice versa.
In the second form, the objectives may not be in obvious conflict, but limited
resources place them in conflict. For example, there are two departments in a
company, one is responsible for research and development (R&D) and the other
is responsible for the core-business production line (the production department).
1
Two objectives are said to be positively correlated if an improvement in one is accompanied with
an improvement in the other and vice versa.
2
Even when we discuss CRT for cooperative situations, we use competition as the way to achieve
cooperation. For example, by challenging the student’s mind with stimulating ideas, the student
becomes more engaged, and pays more attention to and cooperates with the teacher.
The R&D department’s objective is to maximize the innovation of its design of

the next generation of products. However, the production department has objectives
such as maximizing production efficiency and product quality. While the objectives
of both departments are almost independent because the output of each department
aims at different time-scales, the fact that their budget comes from a common pool
can put them in direct conflict. Here, the conflict is that as one department attempts
to draw resources to achieve its objectives, it is depleting and competing with the
resources available to the other department for achieving the other department’s
objectives.
The same forms of conflicting objectives occurs in CRT. For example, a CRT
exercise to evaluate the security system of a company would place both teams
in direct conflict. The blue team’s objective is to maximize the protection of the
security system, while the red team’s objective is to maximize the probability of
breaking into the security system. This arm race is a core characteristic of the CRT
exercises.
This discussion demonstrates the importance of mapping out and understanding
the objective space for both red and blue in a CRT exercise. Figure 2.3 presents a
conceptual objective space for both red and blue. A solid arrow/line between two
objectives indicates positive influence. For example, if or2 in the figure represents
richness and or4 represents happiness, the solid arrow from or2 to or4 indicates
that as richness increases, happiness also increases. A line, instead of an arrow,
indicates influence in both directions.
It is critical to analyze this objective space in a CRT exercise because of the
interdependency between objectives. For example, we can see that or7 for red has a
Fig. 2.3 Blue and red objective spaces and their correlations. A solid arrow/line indicates positive
correlation; a dotted arrow/line indicates negative correlation
positive relationship with ob6 for blue. That is, it is beneficial for both blue and red
to cooperate to maximize these objectives.
However, this conclusion is superficial. We need to understand the complex
interteam and intrateam interactions in the objective space.
For blue, ob6 positively influences ob3, while an improvement in ob3 will
improve ob2, which will negatively influence ob6. This generates a negative cycle
with blue’s objective space. For example, improving education intake and quality
would improve health, but improving health would increase the age of retirement,
degrading job market, which then negatively influences education. Similarly, in
a network-security scenario, creating a stronger security system through multiple
biometric authentication protocols would increase system protection, but increasing
system protection would reduce the usability of the system (customers need to
spend more time to authenticate), which may increase customer dissatisfaction.
These examples demonstrate the internal conflict that can exist within the interteam
objective space.
This creates an internal conflict within blue objectives. Blue would then need
to establish its own trade-offs. In the meantime, red does not have the same
internal conflict. or7 negatively influences or6, which positively influences or2,
which positively influences or4, which negatively influences or1, which positively
influences or3. That is, or7 positively influences or3 (if we multiply all signs on the
path, we obtain a positive sign). We notice that there is a conflict between or4 and
or1, but this conflict does not impact the interdependency between red’s external
objectives.
If we examine the intrateam interaction, we see that ob6 for blue positively
influences ob3 for blue, which negatively influences or3 for red. Therefore, blue
has the following two problems:
1. Blue has a negative feedback cycle internally: ob3 ob2 ob6 o3b. Red can
influence this negative feedback cycle as red’s or7 objective interacts positively
with blue’s ob6 objective. Thus, red can influence blue’s decision made on any
internal level of trade-off.
2. Red’s or3 and or7 objectives reinforce each other. In the meantime, red’s or3
objective is in conflict with blue’s ob3 objective. As red improves its own or3
objective, blue’s ob3 objective deteriorates.
Once these objectives become known, each team attends to design plans to
achieve their objectives. To monitor progress toward the objectives, goals are
defined.
Definition 2.4. A goal is a planned objective.
Based on the agent’s assessment of what is possible and what is not, the agent
can establish an “aspiration level” for each objective. This process of planning and
designing aspiration levels transforms each objective, where the agent wishes to
optimize the objective, to goals, where the agent wishes to reach the way-point
indicated by the aspiration level.
In classical optimization, the problem the agent wishes to optimize can be

formulated as follows:
# f .x/
S.T. x 2 ˚.x/
where, f .x/ is the objective the agent wishes to optimize (minimize in this case),
x is the decision variable(s), the alternatives or courses of action from which the
agent needs to choose, and ˚.x/ is the feasible space of alternatives. Every solution
belonging to the feasible space ˚.x/ satisfies all constraints in the problem. We use
# to denote minimization, " to denote maximization, and “S.T.” as a shorthand for
“subject to the following constraints or conditions.”
For an agent to optimize one of its objectives, it needs to form a plan, or a
series of actions to make this optimization work. The agent’s plan is designed
after careful assessment of what is possible and what is not, or what we will term
“constraints.” Once planning is complete, the agent becomes more aware of the
environment, as well as what it can achieve and what it cannot. In this case, the
objective is transformed into a goal and the formulation above can be re-expressed
as is presented in the following equation.
# d C dC
S.T.
f .x/ C d C d C D T I
x 2 ˚.x/
where T is the target or aspiration level of the goal, d is the underachievement of

a goal, and d C is the overachievement of a goal. In this formulation, f .x/ C d C
d C D T is termed a “soft constraint”, while x 2 ˚.x/ is termed a “hard constraint”.
A feasible solution can violate a soft constraint with a cost, but it can’t violate a hard
constraint. The objective function can take many forms including the minimization
of underachievement alone, overachievement alone, or a weighted sum of both.
Figure 2.4 presents a pictorial diagram to emphasize the difference between
interteam and intrateam conflicting objectives. As we discussed above, each team
has its own internal conflicting objectives. Each team needs to decide on the level
of trade-off to compromise the optimization of these internal conflicting objectives.
In the meantime, blue-red interaction has its own conflicting objectives. A level of
trade-off is still necessary, as both red and blue need to compromise. Therefore,
Both interteam and intrateam conflicting objectives generate two different decision-
science problems that need to be solved. However, the tools used to solve interteam
conflicting objectives significantly differ from those used to solve intrateam con-
flicting objectives because of the following three reasons:
Fig. 2.4 Differentiating interteam and intrateam conflicting objectives
1. The first difference lies in who owns the trade-off. For the interteam conflicting
objectives, each team owns their problems and therefore can decide on the level
of trade-off they wish to achieve. In the intrateam conflicting objectives, the
trade-off is owned by both teams together. The issue of ownership is core when
selecting an appropriate technique to solve these problems because it defines
the level of control of a team on implementing a proposed solution. One would
expect that red and blue could exercise more control internally than externally.3
The implication here is an internal decision made by one team will be easier to
implement than an external decision.
2. The second difference lies in the nature of the trade-off. In the intrateam
conflicting objective space, the trade-off is not usually a one-off decision; it needs
to be negotiated and be determined by both teams together. As blue makes a
decision, red responds, and as red makes a decision, blue responds. Therefore,
the trade-off in the intrateam conflicting objective space is more dynamic than in
the interteam conflicting objective space.
3. The third difference lies in the nature of uncertainty and information availability
in the intrateam and interteam conflicting objective space. In an interteam
situation, external uncertainty is almost uncontrollable. The system attempts
to decide on its actions to manage the risk of these external uncertainties. In
the intrateam situation, uncertainty is dynamic. As the two teams interact, their
actions can shape the uncertainty space. This discussion point will be revisited in
Sect. 4.1.
By now, we should ask whether the division between internal conflicting
objectives and external conflicting objectives is meaningful. In fact, this division
largely depends on where we draw “system boundaries.” In the following section,
3
How to deal with the situation when one of the teams has more control externally than internally
is outside the scope of this book.
Fig. 2.5 The nested nature of red teaming
a “system” is defined. However, to illustrate that the division drawn between

intrateam and interteam is artificial, and that CRT is not simply an exercise between
“us and them,” Fig. 2.5 conceptually depicts the nested nature of CRT.
Figure 2.5 demonstrates that for whatever system we red team, within this
system, we can have another CRT exercise. For example, organization A may red
team its strategy against organization B. According to the previous discussion,
A owns its objectives and decisions. However, within A, there are departments
with conflicting objectives. Each department can conduct its own CRT exercise
and perceive the rest of the departments as external teams. Within the same
department, different sections may have conflicting objectives and they may apply
CRT to evaluate their own strategies. Within a section, people may have conflicting
objectives and a person may use CRT to evaluate their own plans. In short, CRT is
not an exercise between one country and another alone; as discussed in this book,
it is an exercise that can be used for individuals, organizations, countries, and even
algorithms and machines.
This demonstrates that the CRT exercise is also nested by definition. When red
and blue are two organizations, each team can be divided into smaller teams. There
may be a red team for the internal budget, another for the internal market strategy,
then the two teams may form a red team for the organization given that they are in a
better position to understand the internal and external objectives and uncertainties.
Some people argue that this nested view of CRT is not desirable because CRT
is perceived as an exercise with the enemy; so how can we red team inside
the organization? Reasoning demonstrates that within the organization, senior
management can resolve conflict, but if two countries fight, there is no equivalent
concept to senior management in organizations. Therefore, there is a fundamental
difference between CRT exercises conducted between organizations, and those
conducted within an organization.
This argument is flawed in two aspects. First, it reflects the limited view that
CRT is a military or national-security exercise. Limiting the concept of CRT to these
domains will harm these domains because the constrained context, while important,
limits the possibilities for CRT to grow as a science.
The second reason the argument is flawed is that the concept of senior man-
agement exists in every problem. Senior management is not an external counseling
service or a legal authority. Members of senior management come from different
portfolios in an organization. Even for matters related to military or national
security, different countries are members of a larger international organization such
as the United Nations. This does not eliminate the need for CRT on a country
level, a state level, a department level, an organization level, a technological level,
or even an algorithmic level. CRT is a nested exercise simply because conflict in
objectives is a nested concept. The fact that larger objectives are comprised of
smaller objectives can create conflict itself, and as each person is responsible for
a different portfolio within an organization, CRT on one level is comprised of CRT
exercises on sublevels.
2.2.3 Systems
As discussed, the primary reason that red and blue are in conflict is that the
objectives of the blue system are in conflict with the objectives of the red system.
In a CRT exercise, it is critical to consider both the red and blue teams as a system.
For red, blue is a system for which red attempts to dysfunction by counteracting its
objectives. The same is true for blue, red is a system that is attempting to dysfunction
blue because red’s objectives are in conflict with blue’s objectives. We use the
word “dysfunction” since interference with a system’s objectives with the aim of
acting against the benefits of the system is a possible cause for dysfunction. This
dysfunction can take the form of simply influencing the objectives of one team to
change, or in more dramatic situations, of damaging the components of the system.
Classically, a system is perceived as a group of components or entities interacting
for a purpose. This definition is too basic here, and does not adequately service our
analysis. Therefore, a system is defined here as follows.
Definition 2.5. A system is a set of entities: each has a capacity to receive inputs,
perform tasks, generate effects, and complement the other toward achieving goals
defined by a common purpose.
Definition 2.6. An effect is a measurable outcome generated by an action or caused
by a change in a system state.
The above definition of “system” can be considered an elaborate definition of the
classical definition of a system. However, this further level of detail is necessary.
It makes it clearer to an analyst that when they define a system (such as the red or
blue system), they must map out the entities; the inputs to each entity; the task each
entity is performing (reflecting the purpose of this entity or subsystem); the effects
that each entity generates; and how these entities and their objectives depend on
each other and come together to achieve the overall purpose of the system.
The definition for “effect” clarifies that given actions are produced continuously,
effects are also generated continuously. Every action produces many outcomes.
An effect is a measurable outcome within the context of CRT. If the outcome is
not measurable, it cannot be considered within a CRT exercise before it becomes
measurable (either directly or indirectly through a set of indicators); otherwise the
exercise will become an ad-hoc activity.
If we want to discuss change in happiness as an effect, we need to know how
to measure happiness. Alternatively, we need to find indicators that collectively
indicate happiness so we can measure these indicators. If we cannot measure
directly or indirectly, we cannot manage, we cannot engineer, we cannot define a
reward or penalty, and we simply cannot influence or control.
The definition of “effect” also emphasizes that effects can be produced without
actions. For example, aging is an effect of time. Even if we put the human on a bed
in a coma, the body will continue to age and decay.4 These changes in the state of
the system are naturally occurring without actions per se.
The definitions of system and effects used above are particularly useful for a
red teamer because they create knobs for engaging with the system to steer it and
influence it in a more clear manner. Knowing how the entities interact and the
resultant effects provides us with an idea of which entities are more important than
others, and which are more controllable than others. Once we define the key entities
we wish to control, we can examine how to control them and the desired changes in
the effects. However, given that each of these entities is a system, we can continue
to deconstruct the problem and locate more control points.
The second group of knobs is the inputs, the tasks an entity is performing, and
the effects an entity generates. Chapter 4 will present a more elaborate discussion
on this issue. Understanding these knobs facilitates the task of the red teamers.
Components comprising a system are in their own right, a system. An aircraft is
a system, as it consists of the mechanical, software, fuel and human components,
without which it cannot fulfil its purpose. The purpose of an aircraft is to fly. This is
actually an assumption for which we should pause and consider in depth.
Definition 2.7. The purpose of a system is the reason for being from the perspective
of an external observer.
While the components are internal to the system, the purpose is always in the eyes
of the beholder. The purpose of a system is an external judgment that is made by an
external stakeholder or observer. The purpose is defined by an external entity, which
can also be the owner of the system. Therefore, the same system can have multiple
4
One can consider this concept on a philosophical level as actions produced by the environment
that cause decay to occur, but we will avoid this level of interpretation in this book because it can
create unmanageable analysis.
purposes. For an airline, an aircraft’s purpose is to make money through flying. For
the post office, an aircraft’s purpose is to deliver the mail. For a business passenger,
an aircraft’s purpose is to provide transportation to attend business meetings. For a
world traveler, an aircraft’s purpose is to provide transportation to travel from place
to place for enjoyment.
The different views on the purpose of an aircraft by different external stakehold-
ers in the community may generate conflicting objectives. Making more profit from
an airline perspective can create conflict with a passenger who wishes to minimize
the cost of travel as much as possible. A longer route at an optimal altitude may
minimize fuel costs for the airline as compared to a shorter route at an inefficient
altitude, which burns more fuel. However, for the business passenger, a longer route
may entail late arrival at the destination.
For an airline company, the board will define the purpose of the company.
One can perceive the board as an external entity, which in reality it is because it
represents the interface between the stakeholders of the company and the company
itself. The chief executive officer (CEO) sits on the board as an ex-officio and reports
to the board. Through the CEO, the purpose is translated into internal objectives,
which are then transformed into goals, key performance indicators, and plans.
While the aircraft’s purpose for one person is for them to be able to fly, for
another, it might be a symbol of power and wealth-imagine having an aircraft in
your backyard that you do not intend to use. You only have it on display to show
your neighbors how wealthy you are.
In the latter case, it does not matter whether we run out of fuel since the purpose
of this aircraft is to symbolize power and wealth, not to fly. It does not even matter
if the crew does not arrive or the control software system is not working. These
elements are not critical for the purpose.
Therefore, there is a tight coupling between the purpose of a system, and which
elements of an aircraft are deemed important for that purpose. Elements contributing
to different purposes can overlap. However, all elements of an aircraft may exist, but
not all of them are critical elements for the aircraft (the system) to fulfil its purpose.
Therefore, what defines the “critical elements” in a system can be different from one
observer to another, and from one stakeholder to another.
Definition 2.8. An element or component in a system is termed “critical” if the
removal of, or cause of damage to, this element or component would significantly
degrade the ability of the system to achieve its objective, goal, or purpose.5
For example, the heart is a critical element in the human body because if it is
attacked, the human body defining the system in this context will find it difficult
to achieve its objectives and its purpose of functioning efficiently and living,
respectively.
5
Most of the definitions used for critical elements, hazards, threats, and risks in this book are
compatible with ISO3100 [8], but sometimes get slightly changed to fit the context of this book.
In the example of the aircraft in the backyard as a symbol of power, the critical
element of the aircraft is that it has all its exterior body parts, including the wheels.
Scratches in the paintwork may not affect its ability to fly, but would certainly affect
its appearance as a symbol of power. The engine is no longer a critical component;
if it is not working, the appearance is not impacted.
It is clear that what makes a component in the system a critical element is its
contribution to the capacity of the system in achieving its purpose. However, neither
this capacity nor the objectives are deterministic; they are impacted by both internal
and external uncertainties.
2.2.4 Uncertainty and Risk
A properly designed action must consider the outcomes the agent intended to
achieve at the time the action was formed to fulfil the agent’s objectives or goals,
as well as the uncertainty surrounding the achievement of these outcomes. So far,
we have discussed objectives and goals. However, the perceived outcomes are the
agent’s expectation of an action’s impact on objectives given the uncertainty of
that impact. Many factors come into play in determining this uncertainty, from the
personality traits of the agent to the agent’s sensorial abilities, availability and access
to information for the agent, and the complexity of the situation the agent faces.
Every action must be evaluated through its effects and the impact of these effects
on both red’s and blue’s objectives. These effects need to be designed systematically
and consider the uncertainty in the environment. Therefore, in CRT, the concept of
risk is paramount.
From an agent’s perspective, Fig. 2.6 depicts a basic form of the decision-making
cycle an agent undergoes. The agent relies on its sensors to perceive uncertainty in
the environment. The agent has a set of feasible actions it wishes to evaluate for the
particular context in which it is attempting to make a decision. Together with the
agent’s objectives, the agent needs to make a judgment on how these uncertainties
impact the agent’s objectives for each possible action the agent needs to evaluate.
The agent selects a possible action to execute based on the agent’s assessment of
the impact of uncertainty on objectives if this action is executed. This assessment
is also influenced by the agent’s risk personality traits and experience. The agent’s
personality towards risk gets influenced by the agent’s perception of uncertainty and
the feedback received from the environment; together, they can reshape the agent’s
attitude to risk.
For example, the manner a message gets framed and presented to an agent
influences the agent’s perception of the level of uncertainty in the environment.
Consider for example the difference between “this person is trustworthy” and “to
my knowledge, this person is trustworthy”. The second statement can be perceived
to carry more uncertainty than the first, despite that we understand that whatever
statement someone is making, it is based on the person’s level of knowledge.
Fig. 2.6 The role of uncertainty in an agent’s decision making cycle
When the action is executed, an effect is generated in the environment, which the
agent senses through its sensorial capabilities and feedback; this effect is then used
for further learning. We note that this effect carries uncertainty information as well.
The cycle continues, and the agent continues to perceive the uncertainty in the
environment, evaluating its impact on objectives, producing an action accordingly,
monitoring the effect, and generating appropriate feedback to update its experience
and learn.
The diagram shows that the agent’s risk was a function of its objectives and
uncertainty.
Definition 2.9. Risk is the impact of uncertainty on objectives.6
The definition of risk above includes both positive and negative impact; therefore,
it assumes that risk can be negative or positive. For example, the risk of investing in
the stock market can be positive (profit) or negative (loss). In both cases, we would
use the term risk because at the time the decision was made to invest, the decision
maker should have evaluated both possibilities: the possibility of making profit and
the possibility of making loss. An educated decision maker when making a decision
to invest accepts the negative risk as a possible outcome, and equally, the positive
risk as another possible outcome.
6
We have changed the definition of risk from the one introduced in ISO3100 [8] by using the word
“impact” instead of “effect”. The reason is that the word “effect” has a more subtle meaning in this
chapter.
The common goal of a CRT exercise is to manage risk. This claim is safe
because underlying every use of CRT discussed in Chap. 1 lies in objectives and
uncertainties that derive the overall CRT exercise. The CRT exercise is established
to fulfil a purpose that takes the form of a function. One of the main functions of
CRT discussed in Chap. 1 is to discover vulnerabilities as a step towards designing a
risk-management strategy. By discovering vulnerabilities, we become aware of them
and we can take precautions to protect the system. However, what is a vulnerability?
ISO3100 defines vulnerabilities as “a weakness of an asset or group of assets that
can be exploited by one or more threats”[8]. In this book, we will adopt a definition
from a system perspective [4] because words such as “assets” can be confusing
if they are not understood from an accounting perspective. As such, the following
definition of “vulnerability” is provided.
Definition 2.10. A vulnerability is the possibility evaluated through the level of
access or exposure a hazard or a threat has to a critical component of a system.
A hazard is an unintentional act that may harm the system such as a fire. A threat
is an intentional act such as a hired hacker who has the intention to hack into the
computer network and cause damage. For the network administrator, this hacker is
a threat.
Vulnerability exists through exposure to an authorized or unauthorized (even
accidental) access of a critical element to a hazard or a threat; we will refer to this
exposure as “events.” What creates risk is the level of uncertainty of this exposure,
and the magnitude of damage that can accompany the exposure if it occurs; thus, the
uncertainty surrounding the circumstances in which the event will occur will impact
the critical element, which will in turn impact the objectives.
O
Risk D Vulnerability Effect
The building blocks for hazards and threats are shown in Fig. 2.7. These building
blocks provide knobs to control hazards and threats. An entity needs to be capable
of performing the act. Therefore, capability is one building block. We will revisit
the concept of capability and deconstruct it into components in Chap. 4. For the
timebeing, an entity has the capability if it has the ingredients to provide it with
the capacity to perform the act. For example, a computer hacker needs to have the
knowledge to hack into a computer. In Sect. 2.14, we will call this know-how the
skills to hack into a computer. The collective skills necessary to perform the act of
computer hacking represent one dimension of the capability of the entity. Similarly,
for a bushfire to ignite by nature, the ingredients of the capability need to be in
place. These can be the ability of the environment to have high temperature, dry
weather, etc. A thief who is denied the knowledge to hack a computer can’t become
a computer hacker because the thief was denied the capability.
While we will expand more on the concept of a capability in Chap. 4, we will
approximate the ingredients of a capability in this chapter to physical ingredients
and know-how ingredients. Most of the analysis conducted in this book will focus on
the know-how. This is on purpose for two reasons. First, without the know-how, the
physical ingredients are insufficient. While it is true also that without the physical
Fig. 2.7 Building blocks of hazards and threats
ingredients, the know-how is insufficient, but the know-how is more important

because it can identifies different ways of designing the physical ingredients.
Second, since CRT is mostly about threats and threat-actors, the know-how shapes
up the behavior of the threat actor, and the characteristics of the threat.
The opportunity is about the exposure component in the definition of a vul-
nerability. A computer hacker who is denied access to the computer network has
been denied the opportunity to hack into the network, despite that the hacker has
the capability to hack. The black box in an airplane is robust against high impact
collision and fire so that in aircraft accident investigations, the recording can be
replayed to shed light on the accident. By placing the recoding device inside the
black box, fire and collision as hazards or threats have been denied access, therefore,
have been denied the opportunity, to cause damage to the recording.
Therefore, regardless of whether we are talking about hazards or threats, both
the capability of the entity and the opportunity need to exist. Moreover, in the case
of a threat, intent is needed. A computer hacker who has the capability to hack into
a network, and has the opportunity by being left alone in the network room without
any surveillance can hack into the network if the hacker wishes to. At this point, the
intent of the hacker is the only thing between hacking the network and not hacking it.
The three building blocks: capabilities, opportunities and intents, are key in
any risk assessment analysis because they offer tools to analyze complex systems,
while also offering structured ways to think of the remedies. As the example
above illustrated, to eliminate a threat, one can deny knowledge as a mean to deny
capability, one can deny access as a mean to prevent exposure and, therefore, the
opportunity to create an impact on critical elements, and one can shape and reshape
intent so that entities with the capabilities and opportunities do not become threats
in the system. This type of analysis can be used to assess the risk accompanying the
different roles of a red team that were discussed in Sect. 1.6.2.
Let us now take a more complex example that mixes hazards with threats.
Assume a system user who leaves their password on their mobile telephone to
remember it, the mobile telephone is stolen and a criminal uses the password to
break into the system. In this case, the user did not have the intention to cause
damage, despite this possibly being considered an act of negligence. While the
password was the means to obtain unauthorized access to the system through
the intentional act of the criminal (a threat), the availability of the password to the
criminal was not intended by the user (a hazard).
A critical component such as the heart in a human becomes a vulnerability
when it is exposed to a hazard such as a car accident or a threat such as
someone intentionally attempting to dysfunction the heart through a stab wound.
The vulnerability here arises from the level of access that was granted to the hazard
or threat by the holder of the critical element. If a fence was built that was capable
of stopping the car from crashing with the human, access has been denied, and
therefore, this particular vulnerability has been eliminated.
Before this discussion ends, one final definition is necessary. This definition is
often ignored in risk-management literature-the definition of a “trigger.” It must be
understood that the event would normally require a trigger. A trigger is a different
type of event. Becoming angry with someone may trigger violence. The event of
violence would expose some critical elements of the system to a hazard or a threat;
thus, creating a situation of risk.
Here, the word “trigger” is preferred over the word “cause.” A strict definition
of a cause is that the effect would not materialize without the cause. If someone
is angry, many things (i.e. triggers) can happen to make this person produce an
undesirable action. More importantly, these things can happen still and the effect
may not occur. None of these things is a cause per se; the real cause is the cause
for the person’s anger, which could have been that the person failed an exam.
Therefore, a trigger can be considered an auxiliary cause or an enabler for the effect
to materialize [1].
For example, if throwing a stone at a window causes the glass to shatter, the effect
of the action is shattering. Before the action is produced, the effect of the action
must be evaluated while considering the possibility that the force of the stone is not
sufficient to cause the window to shatter. Thus, uncertainties should be considered
when evaluating expected effects.
We will avoid discussing causality in its philosophical form. Despite the fact that
some of these philosophical views are the basis for some of the tools used in this
book, they are not essential for understanding the materials in this book. Interested
readers can refer to [1].
Fig. 2.8 A diagram connecting different concepts related to risk
It is important to understand the difference between a trigger event and a hazard

or threat event because trigger events are those events that we need to control to
prevent a hazard or threat event from occurring. Figure 2.8 captures these concepts
in a schematic diagram [11].
In Fig. 2.8, the rectangle on the right-hand side represents the elements of risk
that lie in the environment, while the rectangle on the left-hand side represents the
elements of risk that lie within the system. This distinction is critical when analyzing
a system.
As discussed at the beginning of this chapter, blue sees red as a system and vice
versa. The blue system sees red as one of the systems in the blue’s environment.
It represents sources of uncertainty to blue. There can be internal sources of
uncertainty within the blue system. However, blue would believe that internal
sources of uncertainty such as questions about whether their own internal team
is ready to face much larger jobs than what they currently perform are more
controllable through something such as training than external uncertainties.
The fact that blue sees red as one of the systems in the environment is only
half of the story. In CRT, blue needs to see itself as an external system to red in
red’s environment. Consequently, as red shapes blue’s space of unknowns, blue also
shapes red’s space of unknowns. These mutual interdependencies are the essence
of the interaction between red and blue. They define a space of uncertainties, where
Fig. 2.9 Role of

uncertainties and objectives
in different environments
uncertainties are interdependent. Uncertainties in CRT are no longer external factors

to the system, but tools and knobs that a system can control and use for its own
advantages.
Blue should not act in a passive manner, accepting what red does as sources
of uncertainty. It should take active steps in shaping its own actions so that it
increases red’s uncertainty. As uncertainties increase, red is likely to ignore its
original objectives and make uncertainty management one of its objectives. Red will
be overwhelmed with uncertainty, making the likelihood of an outcome a great deal
smaller so that differences because of magnitude of outcomes become irrelevant.
To illustrate this, consider the case in which you wish to invest your money. You
have a number of options with a reasonable level of uncertainty and a possibility
of a high return. The decision you would make is simple: select the combination of
options that maximizes your expected return. The expected return is a combination
of uncertainties that impact each option, and the ability of each option to maximize
your objective of achieving maximum return.
Now, consider the case in which all options with which you are faced have very
high uncertainty. As uncertainty increases, differences between the options based
on return decrease in the eyes of a human. Uncertainty takes over. In this case, you
must not focus on return.
Instead, you should identify for which option you can control or influence
uncertainty for your own benefit. You will focus on managing uncertainty, making
controlling uncertainty your objective.
Figure 2.9 illustrate this counterintuitive point. Imagine an agent is attempting
to find an apple in the two environments shown in the figure. In Environment 1,
the agent has four roads to follow, and each road has a probability of finding an
apple associated with it. The three directions of North, East and South have low
uncertainty (high or low probability), while the west direction has high uncertainty
where it is a 50–50 chance to encounter an apple. In this environment, the agent
needs to focus on the goal of finding the apple. A rationale agent will start with the
north direction as it offers the highest probability for encountering the apple.
In Environment 2, the situation is different. All four directions have high
uncertainty. Classic decision making would suggest to start with any, since the
expected value is equal for all four directions. In RT, however, we understand
that uncertainty should not be a fact that we must obey. Instead, we can challenge
the uncertainty by seeking more information. In this situation, the agent changed the
objective from maximizing the expect value for finding the apple to minimizing the
uncertainty in the environment. When the agent manages to minimize uncertainty,
the agent becomes ready to shift its focus back to maximizing return.
Controlling uncertainty is a non-intuitive concept. In almost all types of classical
modeling presented in the literature, the emphasis is placed on how to represent
uncertainty and incorporate it in the model so that the solution produced by the
model is robust and resilient against the uncertainty. That is, classical modeling
approaches uncertainty from a passive perspective, seeing uncertainty as external
to the system, and the responsibility of a system’s designer is to find designs and
solutions that can survive the uncertainty.
CRT has a different perspective on the concept of uncertainty. Through CRT, we
can see uncertainty as a tool. Red must realize that through its own actions, it can
maximize blue’s uncertainty. Blue needs to realize the same. Red can confuse blue
and blue can confuse red. This form of a deliberately designed deceptive strategy
is not about deceiving the opponent team so that it believes one thing will be done
while the intention is to do another. Rather, deception here denotes deceiving the
opponent to the point at which they do not believe anything. The opponent becomes
overwhelmed with the uncertainty in the environment to the extent that it becomes
paralyzed. It does not move because every possible direction in which it can move
is full of unknowns. In such situations, the opponent will either not move at all or
will simply make a random move.
A CRT exercise takes an active approach toward the discovery of vulnerabilities.
In the majority of the CRT exercises, even if the individual exercise is concerned
with the discovery of vulnerabilities caused by hazards, the issue of “intention”,
therefore “threats”, demands a different type of analysis from that which involved
with hazards. A criminal breaking into the system, after obtaining access to the
password through the mobile telephone is an intentional act. This act becomes
deliberate when it is planned. Studying the interaction between objectives and
uncertainties is the key difference between what we will term an “intentional action”
and a “deliberate action.” This difference may appear controversial from a language
perspective given the two concepts of intentional and deliberate are synonymous
in English, and are used synonymously in many textbooks. However, here, we
highlight differences between the two words.
2.2.5 Deliberate Actions
Within the class of intentional actions, we will pay particular attention to the
subset of deliberate actions. We will distinguish “intentional” from “deliberate” to
differentiate between classical decision making in an environment in which risks are
not consciously evaluated by a red teamer (but in which the actions are consistent
with the intention of the person) and decision making that is always accomplished
after careful risk assessments.
Definition 2.11. A deliberate act is the production of an intentional act after careful
assessment of risk.
In classical AI, the term “deliberate action” implies an action that has been
decided on based on the construction of a plan. The definition we use above is
more accurate because the emphasis is placed on risk assessment; therefore, a plan
is being produced with risk as the focal point for evaluating different options and
decision paths.
Therefore, every deliberate act an agent generates should contribute to the
objectives. A series of effects is usually required for an agent to achieve one or
more objectives. These objectives in their totality should reflect and be aligned with
the purpose of the system.
In CRT, the impact of the uncertainty surrounding deliberate actions is evaluated
on both red and blue objectives (i.e. self and others). Because the actions are
deliberate, part of the CRT exercise is for each team to assess and analyze the
actions of the other team. By analyzing actions, one team can reveal intent, drivers,
objectives, and even the perception of the other team of the uncertainty surrounding
them.
The previous statement should be read with a great deal of caution because of two
problems. The first problem is that we can become so overwhelmed with analyzing
actions that we utilize almost all resources without reaching any end. The second
problem is that actions can be random and/or deceptive on purpose; therefore, a
naive analysis of actions can mislead and counteract the CRT exercise.
Let us revisit the first problem. Some extreme views may perceive that there is an
intent behind each action. This might even be misunderstood from our discussions
above. We need to remember here that we are not discussing human actions in
general; we are discussing actions within the context of the CRT environment.
Therefore, there is a level of truth that we should expect that actions are produced to
achieve intent. However, the true complexity here lies in the fact that to achieve one
intent, there might be a need to design a number of actions. Some of these actions
need to be generated in sequence, while others do not depend on any order. This
defines a critical problem where the intent of the agent from a series of actions need
to be inferred. This is a difficult problem requiring advanced techniques from the
field of data mining. An introduction to data mining will be given in Chap. 3.
The second problem mentioned above is that actions can be deceptive and/or
random. An agent may produce random actions to confuse the other agent. Here, the
concept of deception is paramount and greatly impacts the behavioral data-mining
methods. We may think this is becoming too complex. We may feel the need to ask
how we can discover intent when deception is used. It can be surprising to learn
that deception can actually help us to discover intent. If we consider the fact that
deception in its own right is a set of deliberate actions designed to lead to an intent
that is different from the original intent, we can see that the intent inferred from
deception can give us an idea of where the real intent of the agent is. Of course we
need to ask ourselves how we would know in the first place that these actions were
designed for deception and how we could categorize deceptive and non-deceptive
actions. This is when complex tools, algorithms, and human’s educated judgements
blend together to answer this question.
2.3 Performance
Ultimately, in CRT the aim is to challenge the performance of a system. This

task itself can take many forms, from a grand strategic vision on improving the
economic performance of a country, to an efficient optimization algorithm of the
bus system in a city, to a controller navigating a robot in an urban environment
or a big-data mining algorithm to detect zero attacks in a computer network.
Performance is a fundamental issue that we need to understand before discussing
a theory of challenge. In effect, we challenge performance; therefore, we need to
understand what performance means, what are the ingredients of performance, how
to measure performance, how to analyze performance, and how to shape and reshape
performance when we challenge performance.
To understand performance, we need to delineate the building blocks and
concepts underpinning performance. A good starting point for this discussion is
the meaning of the word “behavior”.
2.3.1 Behavior
For an agent to produce effects, it needs to act. The set of actions generated by an
agent define what we will term the agent’s “behavior”.
Definition 2.12. Behavior is the set of cognitive and physical, observable, and non-
observable actions produced by an agent in a given environment.
We could define behavior simply as the set of actions produced by an agent.
However, this definition lacks precision and essential details. It lacks precision
because an agent does not act in vacuum; an agent acts within an environment.
First, let us define the environment.
Definition 2.13. An environment for an agent A consists of all entities that reside
outside A, their properties and actions.
Therefore, the environment represents the wider context within which an agent is
embedded. An agent is situated within its environment. The agent receives stimuli
from the environment, generates effects in response, and continues to monitor the
impact of these effects on those environmental states to which the agent has access.
Behavior is not limited to the physical actions produced by an agent’s set of
actuators. Most of the physical actions are expected to be observable from an
external entity. However, there is a group of actions that is generally unobservable;
2.3 Performance 71
these are the cognitive actions: the thinking process an agent experiences to reach a
decision. Cognitive actions represent a critical component in an agent’s behavior.
We cannot simply ignore them because they are hidden in the agent’s mind. In
fact, if we can learn how an agent thinks, or at least the drivers behind an agent’s
decisions, we can predict most intentional physical actions. However, achieving this
is extremely complex.
Meanwhile, one can see physical actions as the realization of cognitive actions.
Walking to the restaurant to propose to my partner is a set of physical actions.
These physical actions indicate that I have thought about the decision, and made a
commitment to execute the action of proposing, with the expectation that the effect
of marriage will become a reality.
The interplay between cognitive and physical actions is important in CRT. Once
more, it is important to remind the reader that we are not discussing actions in life
in general; this is all within the context of CRT, that is, an exercise with a purpose.
Let us consider two examples at the two ends of the spectrum of CRT: one in which
we are red teaming a strategic scenario on a country level and the other in which we
are red teaming a computer algorithm for encryption.
In the first example, analyzing the cognitive actions of blue is about understand-
ing factors such as how the blue team plans, evaluates options, and makes choices.
These cognitive actions can be inferred, with different degrees of difficulty, from the
physical actions of the blue team. For example, the division of the budget between
buying capabilities to conduct cyber operations and buying tanks would provide us
with an indication of how the blue team is thinking, where they see their future
operations, and what possible strategies they have to meet their future uncertainties.
These actions are not created for deception. It is less likely that blue will invest
billions of dollars in tanks simply to deceive red; the scarcity of resources as a
constraint reduces the space for this type of deceptive actions.
In the second example, the cognitive actions represent how the encryption
algorithm thinks internally, that is, how it performs encryption. If the algorithm
is an agent, we can notice its input and output. Breaking up the algorithm here
is to uncover the computations it uses to transform this input to that output. We
are attempting to use the external physical actions to infer the internal cognitive
(problem solving) actions of the agent; by doing this, we can evaluate the robustness
of our system, which is using this algorithm for storing data against attacks.
2.3.2 Skills
A red teamer attempts to interfere with, influence and shape the blue team
behavior (action space). Therefore, for the blue team, the red team is part of
blue’s environment. Similarly, for the red team, the blue team is part of the red’s
environment. The red and blue environments share common elements: the shared
environmental components between blue and red, and the components forming the
interface between blue and red.
As red attempts to impact blue, it needs to rely on this interface, that is, the
shared subset of the environment to generate effects. The ability of either team to
act to generate effects on the other depends on their skill level.
Definition 2.14. A skill is the physical and/or cognitive know-how to produce
actions to achieve an effect.
A skill is about the know-how related to achieving an effect. Some may define
skills as the know-how to perform a task. However, here, the concept of a task is very
limiting. By focusing on know-how for achieving an effect, we have a more flexible
definition for “skill.” This definition links the outcomes (effects) to the processes
and cognitive means (know-how). More importantly, by defining skills from the
effects perspective, we emphasize that the agent’s choice of which know-how to use
is based on the effects the agent wishes to generate, not on what the task that is being
assigned to the agent intends to achieve. This is a crucial distinction for designing
deliberate actions.
A skill cannot be defined in isolation; it always needs to be linked to a specific
effect. However, effects have different levels of complexity and are generally
nested. For example, the effect of producing on a computer a good essay based
on recounting real events, while adding some details from the authors’ imagination,
may be completed using different skills. Each of these skills link some level of
know-how to an effect. One effect might be turning the computer into an “on” state
(i.e. turning on the computer or ensuring that the computer is already turned on).
This effect requires the know-how for sensing whether the computer is on. If the
computer is not on, the know-how must be for sensing whether the computer is
plugged in and that there is an electrical current reaching the machine as indicated
with the power light, then using motor skills to press the “on” button. Another set of
effects might be the production of a letter on a screen (this requires the know-how
for generating motor actions to turn on the computer and press buttons); the effect
of recounting the event (this requires the know-how for writing an account); and the
effect of deviating from the actual story to an imaginary set of events (this requires
the know-how to produce imaginary events in a coherent, interesting and engaging
manner).
Each example of know-how listed above is composed of hierarchical knowledge
divided into subsets of know-how. For example, the know-how to produce a letter
on a screen requires the know-how of the layout of the keyboard; the know-how
to translate the intent to write a letter (a cognitive event) to a series of muscle
movements to press the buttons; the know-how to synchronize the fingers such that
the correct finger is associated with the correct key on the keyboard.
The above level of deconstruction may seem as though it has too much detail.
However, in CRT, the right level of detail will always depend on the objective of the
exercise. If it is desirable to establish a writer profile to authenticate a person on a
computer network, this level of detail will be appropriate.
In this situation, we need to know which fingers the person usually uses, and
which keys are associated with which fingers. These two pieces of information
(fingers used and finger-key association), together with the layout of the keyboard,
2.3 Performance 73
will provide us with an estimate of the time spent between pressing different
buttons. For example, if a person uses only two fingers, one would expect a larger
delay between pressing letters “h” and “o” when typing “hooray” as opposed to
the delay between pressing letters “k” and “o” when typing “Hong Kong.” This
information can establish a different profile for different users, which is then used
as a background process for authentication and user identification.
Therefore, sometimes a level of detail for one exercise is not required for another.
This is a decision that the CRT analysts must make.
A set of “know-how” forms a skill to achieve an effect. However, effects are
hierarchical. Synthesizing effects on one level of the hierarchy requires specific
skills (i.e. know-how to achieve a larger effect on an upper level of the hierarchy).
It is important to recognize that it is not sufficient to take the union of the skills
required to achieve the low-level effects to achieve the higher level effect. We
need to ensure that we also have the know-how to synthesize the low-level effects.
Therefore, the whole is not the sum of the parts.
This discussion indicates that skills are organized in a hierarchy, which is a
commonly accepted notion in information processing and behavioral sciences. The
challenge of a discussion such as this for CRT activities is that we can continue
deconstructing a high-level planning task (as in the case of planning the cultural
change required to accommodate next generation technologies in a society) into
smaller and smaller tasks down to an arbitrarily microscopic level. The main
question is whether this helps?.
In a CRT exercise, we need to deconstruct down to a level after which further
deconstruction of skills is not needed. Therefore, the concept of a skill as defined
above offers the red teamer a critical dimension for analysis. By analyzing the
blue team’s skills, red can evaluate blue’s limitations, discover its vulnerabilities,
and can reshape its own environment to generate innovative effects far away from
the know-how of blue. Red can even help blue by designing training programs to
improve blue’s skills in specific areas so that blue generates effects that are useful
for them but are far away from those in which red is interested. As long as we
avoid deconstructing effects and skills beyond what is appropriate and useful for the
exercise, this type of deconstruction is vital for the success of the analysis.
2.3.3 Competency
An agent’s behavior is defined by the actions the agent produces; these actions are
the product of the agent’s skills. There is a direct relationship between skills and
behaviors. An agent uses its know-how to generate actions to achieve effects. The
totality of these actions represents the agent’s behavior. Thus, an agent’s behavior
is the product of the agent’s cognitive and physical skills. However, how can we
evaluate behavior or skills?
Definition 2.15. Competency is the degree, relative to some standards, of the level
of comfort and efficiency of an agent in adopting one or more skills to achieve an
effect.
Competency is the measure of performance we will use to assess an agent’s
behavior. It acts as an indicator for the nature of the know-how (skills) an agent
possesses.
The definition above requires further discussion related to two factors, the need
for a standard to measure competency, and the distinction that has been made
between comfort, which is a characteristic of the agent, and efficiency, which is
a characteristic of the task.
2.3.3.1 Need for a Standard
Competency is measured relative to a standard, which is a reference system against

which one can compare agents. Gilbert [5] uses the elite or the fittest in a population
as the reference point. (Section 2.3.4 will discuss Gilbert’s views further.) Here, the
main point to emphasize is how to set a standard in a CRT exercise.
It must be remembered that the red team comes from the blue team’s culture. As
a result, the setting of a standard can be misleading if the standard is set relative to
blue without knowing the standard for red. Let us consider two previous examples:
one related to a strategic decision, while the other related to computer security.
Assume two countries: X and Y . Country X is technologically savvy, developed,
has a democratic government, and the population is highly educated. Country Y
is undeveloped, relies on very old and outdated technologies, suffers from internal
political instability, and the education level is poor.
It is clear in this example that if Y wishes to red team X ’s plans, Y is incapable of
evaluating X ’s competencies. A red team drawn from Y for the purpose of thinking
like X does not have the knowledge to do so, neither does it have the intuition
and thinking abilities to imagine what are the right standards to use to evaluate
X ’s competency. Most likely, in this case, the exercise will fail because it is too
imaginary or the exercise will simply be counterproductive.
However, does this mean that X can red team Y given that they possess the
knowledge and technology? Let us assume X is attempting to conduct a CRT
exercise to evaluate how Y will respond to an economic pressure that X will enforce
on Y to steer Y toward becoming a democratic country.
In this example, we would need to ask what standards the red team should be
using within X to assess the competency of Y in responding to this situation. It
would not be surprising to expect that the standards used by X to evaluate Y are
likely to overestimate what Y can do. The red team in X is highly educated. They
can run scenarios on large computer clusters to evaluate every possible response that
Y can produce. They have sufficient knowledge about Y that they can assume that
they are able to think like Y . When assessing the competency of Y in applying
2.3 Performance 75
a specific policy tool to the situation, they can lower their own standards but
realistically, having a very high standard is not a problem in this situation. So, what
is the problem?
The main problem in this situation is that people in Y are extremely competent
in a group of skills that X does not have. It is the know-how to use simplicity
to respond to technology savvy know-how. Therefore, for X ’s CRT exercise to be
effective, X needs to accept that they may have a blind spot in their understanding
of the behavioral space of Y . As such, how can red in X define the standard for
this blind spot? There is no single answer to this question. The obvious answer is
to study Y to the greatest extent possible. Providing the complex answers to this
question is beyond the scope of this book.
Let us take a more objective example. Assume a group of thieves would like
to rob a bank. The bank establishes a red team from their high-tech departments
to identify the vulnerabilities that the thieves may exploit. The red team properly
evaluates the competency of red in terms of every skill required to break into their
computer network. The red team uses their standards to break into the computer
network as the standard to evaluate the thieves’ competency level. Let us assume
that the thieves are not as skilled in cyber espionage, cyber strategies, computer
security, and network intrusions. In such a case, the standards used by the red team
remain appropriate, despite the fact that they are well above the level of capability
of the thieves.
However, the thieves’ objective is to rob the bank, not to break into the bank’s IT
network. Given that we are assuming that breaking into the IT network is a necessary
condition for the thieves to rob the bank, it is fair for the red team to evaluate
this as a vulnerability. However, the thieves do not have the know-how to break
into the network. Instead, the thieves know how to blackmail, exert inappropriate
pressures, and use violence and force. The thieves are not scared of breaking the
law. Their behavior is embedded in an entirely different behavioral space from the
highly educated IT team in the bank.
As such, the primary problem is that the skill space for the thieves cannot be
fully discovered by the red team in the bank. Given that skills are nonlinear, there
is no guarantee that the standards used by the bank are high enough to assess the
competency of the thieves. The thieves may simply cause damage to the electrical
power supply in the city, cause damage in the bank’s computer system, force the
bank to switch to manual operations, and steal the car with the money like in the old
movies.
Setting a standard to define competency in CRT assumes that in a normal setting
behavior is symmetric. CRT addresses symmetric and asymmetric situations; it is in
asymmetric situations that setting standards relies on correctly mapping the behavior
and skill spaces, and having the know-how (required skills) to set the standards
properly.
How can we then establish standards in CRT? First, we need to change the
standard from a ceil that defines a goal to a baseline that defines an objective.
By using the concept of the elite, we establish an upper boundary on what can
be achieved, we then attempt to measure how far the agents are from this upper
boundary (goal) based on the agents’ outputs. However, the red team may not have
the skills or knowledge to estimate this upper boundary properly. Overestimating the
upper bound is not necessarily a bad thing, but arbitrary overestimating this upper
boundary in an ad-hoc, blind manner or underestimating it are real vulnerabilities
for the CRT exercise because the other team might be greatly more competent than
what the red team think.
Moreover, a ceil is established under the assumption that we know what the
effect space is. In the absence of complete knowledge of the effect space, we cannot
define this ceil. Therefore, in CRT, we need instead to move away from this idea
of establishing the standard as a ceil. Instead, competency of one team will be
defined relative to an assessment of the performance of the other team. We term
this “comparative competency.”
Definition 2.16. Comparative competency is the degree of the level of comfort and
efficiency of an agent in adopting one or more skills to achieve an effect in one team
relative to the ability of the other team in achieving the same effect.
In comparative competency, a team expresses its competency relative to the
performance of the other team. Therefore, competencies are expressed as two
percentages, one related to the comfort of red relative to blue, and the other related
to the efficiency of red relative to blue when attempting to achieve a particular effect.
Comparative competency does not address the problem that one team may have
a blind spot in mapping the other team’s skill space. This problem requires multiple
treatments, especially with regards to team membership discussed in Sect. 1.3.
Remember that different skills can come together in different ways to achieve the
same effect. Therefore, when measuring competency, we are measuring to the best
possible performance that the other can display in achieving the effect. Since this
best possible performance is dynamic within a CRT context, because of the learning
occurring within the CRT exercise, comparative competency is a dynamic concept.
2.3.3.2 Comfort vs Efficiency
Given that we will explicitly distinguish between the cognitive and physical
attributes of, and functions performed by, agents, it is also important to distinguish
between comfort, the level of ease in achieving an effect, and efficiency, the accuracy
and speed in achieving that effect.
Imagine you are at the checkout counter of a supermarket. The cashier behind the
counter is scanning your items, and placing them in a bag. One of the cashiers might
be the elite in that supermarket because every item they place in a bag is scanned
(100 % accuracy) and they can scan and package 20 items per minute. This cashier
is defining the standard for the checkout counters in this supermarket.
Judging on throughput alone is not sufficient for us to understand the long-term
effect. The level of comfort, the cashier’s feelings and perceptions about the ease
with which they perform their job can provide us with a more informative picture
of performance, and the ability to predict long-term effects. If the cashier perceives
2.3 Performance 77
that the job is very easy and simple, we may assume that their performance would
degrade if they worked without rest for 1 h. If they perceive that the job requires a
great deal of effort and they need to concentrate to ensure the accuracy of scanning
and packing the items, we know that the cognitive load becomes an important factor
in this situation and the cashier’s performance may degrade in 30 min without a
break instead.
This discussion emphasizes that competency cannot rely on agents’ physical
and observable actions alone, it should also consider the agents’ cognitive actions.
Whether or not to assess these cognitive actions requires cost-benefit analysis. A
study needs to decide on the importance of this type of data to the particular
exercise. Cognitive data can be drawn from questions posed to the subjects or from
sophisticated data-collection mechanisms such as brain imaging. This is an exercise-
specific decision.
2.3.3.3 Revisiting Behavior
We can now redefine behavior, or offer a second definition of behavior.

Definition 2.17. A behavior is the expression of an agent’s competency level and
acquired skills in actions.
In this definition, we emphasize competency (comfort and efficiency) and skills
(know-how) when observing an agent’s set of actions to discuss the agent’s behavior.
This definition, illustrated in Fig. 2.10, moves us a great deal closer to a useful
definition of behavior that provides us with the tools for analysis. Competency
provides the measures and indicators, while skills guide the data-collection exercise
to focus on the how. By understanding the how, we can diagnose the behavior, and
through competency, we can observe the impact of different treatments on behavior.
Figure 2.10 connects the concepts discussed so far in a coherent form, and
introduce additional points for discussion. It differentiates between two types of
knowledge. Axiomatic knowledge need to be acquired by an agent through transfer
from another agent. We will reserve the function of education to the transfer of
axiomatic knowledge.
Learned knowledge is acquired through mechanisms such as training, practising
and challenges. Training assists the agent to improve efficiency on a task. Practising
provides the agent with the confidence and comfort in performing a task. A
challenge through any form of interaction, including training and practice, provide
another mean to extend the agent’s learned knowledge. These learned knowledge
become an input to the agent’s know-how knowledge base.
The agents’ skills and competency come together to form the agent’s behavior,
which is expressed in the form of actions. Through self-reflection on these actions,
as well as training, practising and challenges, the agent learns new knowledge.
In CRT, we will assume that everything is measurable, whether directly or
indirectly, through a set of indicators that approximate the phenomenon we wish to
measure. We also assume that everything is for a reason; therefore, there is a cause
Fig. 2.10 Deconstruction of behavior
for everything (be it physical, cognitive, social, or environmental) underlying every

piece of information in a CRT exercise, even if the purpose is to deceive the other
team, or if the information was mistaken demonstrating a level of inefficiency in the
other team. Without these assumptions, we cannot perform computations; thus, we
cannot systematically analyze a situation. CRT evolves through measurements.
In CRT, we need to measure to compute, and we need to compute to

measure. We need to understand causes to control and influence, we need to
influence to create effects, and we need to create effects to achieve the purpose
of the exercise.
The question we need to ask is how to measure behavior. If we wish to challenge

behavior, we need first to be able to measure it. Otherwise, we will have no means
2.3 Performance 79
by which to establish with confidence whether we were able to challenge anything.

Understanding behavior through skills and competency gives us a powerful tool
to measure behavior; in fact, it is so powerful that through these factors, we can
automate the processes of production of behavior and the analysis of behavior.
Any action producible by an agent in a CRT exercise does not occur in a vacuum.
Both the agent’s skills and competency level of these skills shape, influence, and
even determine the production of an action. If an agent lies, the agent needs to have
the skills of lying. If the agent lies well, the agent needs to lie well relative to the
other team, ensuring that the other team believe the lie. If the agent falls because
their feet are weak, there is a physical constraint limiting the agent from reaching
maximum competency of the skill of walking. If the agent mumbles or produces
a grammatical error, it might be caused by the fact that the agent’s cognitive
resources have been depleted; thus, the agent is making mistakes, resulting in lower
competency with certain skills.
Thus far, we have discussed skills, competency and behavior. We have discussed
competency as a means by which to measure performance. It is now time to discuss
performance.
2.3.4 From Gilbert’s Model of Performance to a General

Theory of Performance
The model we will use in this chapter is inspired by the work of Gilbert [5], the
father of performance engineering or what he termed “teleonomics.” However, we
will deviate from Gilbert’s views in part to design views appropriate for the CRT
context of this book, and to ground his system-thinking views in computational
models.
Gilbert sees the world split into two components: the person (P ) and the
environment (E). We should recall that a person in this book can be a group or an
organization. When the person receives a stimulus, they need to be able to recognize
it. This recognition is fundamentally conditional on their ability to recognize the
stimulus. Gilbert termed this “discriminative stimuli:” S D .
When a person receives a discriminative stimulus, they need to have the capacity
to respond. Gilbert termed this “response capacity:” R. A person may have the
recognition system to receive and comprehend the stimulus, and the capacity to
respond, but they choose not to respond simply because they do not have the
motivation to do so. Therefore, the response needs to be accompanied with “stimuli
reinforcement:” Sr , which for the person represents the feedback to their motives.
The above can be summarized in Gilbert’s notations as
S D ! R:Sr
Table 2.1 An example for mapping Gilbert’s model to a scientist job

SD R Sr
Information Instrumentation Motivation
Data Instrument Incentives
Environment
Literature Functional Laboratories Funding
Knowledge Response capacity Motives
Behavioral repertory Education and training Thinking and skills Ambition
(know to recognize) (know how)
The ! represents implication in his notational system. Gilbert then divided

the environment into three components that correspond to the three components
associated with a person: data represent the information delivered to the person
through the stimuli; instruments represent the instrumentation component to deliver
the response; and incentives represent the reward system to reinforce and/or trigger
motivation.
We will present our own example below to explain Gilbert’s model, and to use
it as the basis to explain other concepts in the remainder of this chapter. Let us
take a scientist as the person we wish to model. Following Gilbert’s model, we can
construct the matrix presented in Table 2.1.
The simple example presented in Table 2.1 demonstrates the strength of Gilbert’s
model for CRT. First, the environment provides to the scientist the three enablers
that allow the scientist to perform their job. The literature represents access to
knowledge. For a scientist to innovate, access to the most recent knowledge that
has been developed in the field is essential. If the environment denies the scientist
such knowledge-for red teamers, this is a knob to achieve an effect of stopping the
scientist from achieving their goal-the scientist might end up reinventing the wheel.
The instrumentation that the environment needs to make available to the scientist
is represented here as the scientific laboratory and encompasses all the tools and
instruments required for the scientist to do their job. Once more, if these facilities
are not available, the scientist cannot produce the desired outcome and cannot
materialize their ideas.
Incentive is a tricky concept in science. We would expect that a scientist requires
some sort of incentive to perform their work. Here, we assume that incentives take
the form of scientific funding and grants. These grants do not necessarily provide
monetary incentive to the scientist, but a moral incentive, reflecting a recognition of
the importance of the work. The monetary value can be used to improve the facilities
and instrumentation; thus, speeding up scientific discovery.
The behavioral repertoire for the person captures the model of the stimulus-
response discussed above. Here, we assume that the level of education and training
provide the scientist with the ability to recognize stimuli in the environment. The
author often says to his students,
2.3 Performance 81
You see what you know.
A scientist who does not understand mathematics will not be able to interpret an
equation written on a whiteboard; thus, they cannot interpret the stimuli that may
trigger an idea in their mind. Thus, education and training represent the knowledge
repertoire required for S D to function.
The capacity to respond for a scientist represents their thinking abilities and
skills. To create new contributions, the scientist needs to have the skills and
creativity to produce scientific outcomes from the stimuli. Their motivations are
assumed to be internal and to take the shape of scientific ambition.
The model above gives us the basis to analyze the person from a CRT perspective,
providing us with the knobs to influence performance and reshape it if needed.
The details of Gilbert’s work can be found in his seminal book [5]; a very
worthwhile read. His work is inspiring and well engineered. However, we need to
search deeper and be more concise to transform his system into a system suitable
for CRT. This is for several reasons.
First, Gilbert focused on a holistic view of performance, resulting in an efficient,
but high-level, model that can guide human managers to improve performance. The
objective in CRT is to challenge performance; therefore, we need to transform this
holistic view into a grounded control model that enables us to steer performance to
either positive or negative sides. Moreover, we need this model to be sufficiently
grounded so that we can use it to compute, but not too grounded to avoid
unnecessary computational cost.
Second, Gilbert did not seem to differentiate between the physical, cognitive
and cyber spaces. By focusing on performance alone, it did not matter in his
work whether the capacity of the agent was cognitive or physical, or whether
the instruments used by the environment were psychological or physical. These
elements are not included for the performance engineer to analyze based on the
context in which they are working with. However, here, we prefer to make these
distinctions clear given the tools and models to be used for CRT will be different.
In the example of a scientist, Gilbert’s model is possibly useful for us as humans
to see how we can manipulate performance from the outset. However, if red teamers
wish to challenge this scientist with ideas, or challenge their environment to steer
their scientific discovery one way or another, it is necessary to dig deeper. We need
to separate the physical (e.g. laboratory) from the cognitive (e.g. creative thinking)
and the cyber (e.g. access to information). Gilbert does this to some extent as we see
in the example in which data and knowledge represent the stimuli, instrumentation
represents to some extent the physical side, and motivation represents the cognitive.
However, we can see also in the scientist example that this is not sufficient. A
laboratory would have people such as post-doctorates and Ph.D. students who
provide ideas to the scientist. These ideas can act as stimuli, responses or even
motivations.
Third, Gilbert spent considerable time in his book as an anti-behaviorist. In fact,

when one attempts to understand Gilbert’s views properly, it is clear that he was not
an anti-behaviorist because his own model, when analyzed properly, is a behaviorist
model. However, it seems that the behaviorist laboratory in which he was raised, and
the environment in which he was living were taking extreme views of the behaviorist
approach. This caused Gilbert to take a strong stand in his book against behaviorism,
while clearly his model demonstrated that behaviorism is embedded in the roots of
his mind.
In our model, we will combine the cognitive, social and behavioral schools to
provide multiple interception points a red teamer can use to understand, and if
required, influence and reshape, a system. This will create a larger model, but one
should zoom in and zoom out as needed based on the context, and the available
data and resources. For example, Chap. 4 will present holistic models suitable for
a strategist to use when designing strategies for a whole of government or for
designing a strategy for policing and security. This will be in contrast to the type
of data and the model used in Sect. 5.3 where we reach a level of detail on the level
of human-brain signals.
Zooming in and zooming out in the model presented in this chapter provide
the level of flexibility that a red teamer should have when analyzing tactical,
operational, strategic, or grand strategic levels. Data, details, and even the models
to be used are different, but the guiding principles and the thinking model are the
same.
Figure 2.11 presents a complete rework of Gilbert’s model, grounding it in the
cognitive-science literature, or more specifically, information-processing research,
while maintaining the underlying features to measure performance. We will first
explain the nomenclatures below:
• U c and U p are the agent’s cognitive and physical resources, respectively;
• Lc and Lp are the agent’s cognitive and physical skills, respectively;
• E c and E p are the environment’s cognitive and physical resources, respectively;
• Ar and E r are the internal (agent’s self-) and external (environment) rewards to
an agent, respectively;
• M and I are the motives and intent of an agent, respectively;
• B represents the ability of the agent to perceive and recognize the stimuli; it is a
fusion function of the stimuli, the agents’ cognitive and physical resources, and
the agent’s cognitive skills;
• S and R are the stimuli in the environment, and the action/response produced by
the agent, respectively;
• f is a fusion function that integrates the agent’s response overtime and transform
it to changes in the physical and cognitive resources in the environment, and into
an environmental reward to the agent;
• O is defined as the positive or negative opportunity offered by the environment
to the agent;
• a ! in this diagram represents a flow of data, where the word “data” denotes
elements such as information, knowledge, experience, and wisdom.
2.3 Performance 83
Fig. 2.11 An architecture of agent-environment interaction model
• C alone on an arrow represents a positive influence of change in flow and should

be understood as the element at the tail of the arrow positively influences the flow
to the element at the head. That is, if x1 positively influences the flow to x2 , when
the flow to x1 increases, the flow to x2 is likely to increase and vice versa;
• C= on an arrow represents a possible positive or negative influence of change
in flow, that is, when the element at the tail of the arrow detects a change in flow,
it may decide to increase or decrease the flow to the element at the end of the
arrow;
• a circle shape represents a memory/storage + processing unit;
• a rectangular shape represents a storage unit alone;
• a triangularN shape represents a decision unit;
• a shape of represents a fusion process, where one or more inputs need to be
mixed and integrated over time to generate an output.
Let us consider the model to understand what it does and how it does this. The
starting point is the circle labeled “control.” This circle represents the decision
making that occurs within the environment, outside the agent. The environment
may decide to generate a stimulus to the agent, either unintentionally (e.g. weather
events) or intentionally (e.g. through the actions of other agents). A stimulus takes
the form of a cognitive (e.g. data, ideas, experiences) or physical (e.g. information
about a money transfer, or the information associated with giving a child a candy)
flow. In this model, a flow is always a flow of data/information, although it may have
a cognitive or physical source.
Once the information leaves the control point, it becomes a stimulus, S , to the
agent. The agent receives this stimulus in a different form, B, from what it is in the
environment. This form represents the fusion of different factors: the stimulus that
was generated, the agent’s physical resources, the agent’s cognitive resources, and
the agent’s cognitive skills.
For example, if the agent lost the ability to taste (i.e. had a malfunctioning
tongue), this limitation in an agent’s physical resources would impact the agent’s
perception of tasting information in a stimulus. Similarly, if the agent is autistic, the
lack of certain cognitive resources would impact the agent’s perception of a hug.
Finally, the agent’s cognitive skills (e.g. the agent’s knowledge of how to hug to
reflect compassion or affection) would impact the agent’s perception of a hug.
The perceived stimulus is then transformed into motives or goals. Sometimes,
the stimulus may generate a new goal, as in the case of a new task being assigned to
the agent and the agent needing to add to their repertoire of motives a new goal on
the need to complete this task. At other times, the stimulus provides the agent with
an update to one of its existing goals, as in the case of briefs from a subordinate that
update the decision maker’s knowledge of the rate at which existing performance
indicators are being met.
The states, and the corresponding changes, of an agent’s goals produce intentions
to act. Intentions in this model are a product, not a system-state. The intention
unit fuses the motives, the agent’s cognitive resources, and the agent’s physical
resources to produce a plan. Information during this fusion process moves back
and forth, where the cognitive and physical resources call and modify the cognitive
and physical skills, respectively. During this process, cognitive and physical skills
are updated and checked to produce the plan.
For example, assume an agent who used to be a professional swimmer had an
accident in which they lost their right arm. Assume that the goal of the agent remains
to be able to swim fast. Both the agent’s cognitive and physical skills need to be
updated. The agent needs to form a plan to move from the previous skills to a new
set of skills that consider and match the new physical constraint.
The agent’s internal plan can take the form of a series of actions that the agent
needs to produce. However, only a limited number of responses can be produced
at any point of time. Therefore, the intention unit also produces a schedule for the
generation of responses. The first group of mutually compatible responses (e.g. a
smile on one’s face, together with a handshake) form a “response:” R.
The agent’s internal response may be produced differently in the environment.
For example, as the agent is moving their arm to shake a person hands tightly,
the intended pressure on the other person hand is not properly produced. Thus, the
handshake does not produce the intended effect.
Two rewards systems operate as action-production works. The first is the internal
feedback, self-reward or self-punishment system in which the agent internally
rewards itself. A person may attempt to reinforce their own goals to the extent
that the person perceives that their goals are satisfied when they are not. This
internal reward mechanism is very important because it is generally hidden and
inaccessible from the outside world. It can act as a negative feedback cycle that
2.3 Performance 85
balances an individual’s motives or a positive feedback cycle that cascades an

individual’s motives. When the internal reward mechanism gets damaged, it can
create a personality disorder.
The second reward function originates from the environment, where other agents
in the environment need to decide on the form and impact of such an environmental
reward. We refer to this as the “opportunity:” O. The environment may offer or deny
opportunities to the agent.
In our example of the scientist, the environment may decide to open the flow
of physical resources-in this situation, the agent receives more funding; close the
flow of physical resources-here, the funding stops; open the flow of cognitive
resources-here, the ideas and knowledge produced elsewhere are communicated
to the scientist (agent) to increase their knowledge repertoire; or close the flow of
cognitive resources-here the scientist (agent) is denied such knowledge or cannot
find people with the appropriate cognitive skills to extend their cognitive capacity
to process information.
A red teamer’s task is to understand how to design this control function, or at
least influence it, so that the agent’s actions and goals are consistent with the red
teamer’s goals.
A red teamer would aim to achieve one of the following three generic categories
of goals:
• alter an agent’s performance in a negative or positive direction;
• alter the effect of an agent’s actions; or
• both of the above.
Figure 2.12 shows an abstract categorization of the role of CRT. In this
categorization, CRT has two roles. One is to shape the actions of competitors,
thus, the effectiveness of CRT is evaluated by measuring the distance between the
intended effect of the competitor and the actual effect.
The second role of CRT is after the competitor’s action, where CRT attempts to
influence the effect generated by the opponent after it has been produced. Here, the
effectiveness of CRT is evaluated by measuring the distance between the aspired
effect of CRT and the net effect of the competitor. The competitor’s net effect is
the effect originally generated by the competitor minus the effect of the interference
generated by CRT.
An example of the former role is when a red teamer wishes to alter the
performance of a student to improve their performance in solving arithmetic
problems. An example of the latter role is when CRT attempts to reshape the
environment such that the advertisements a tobacco company is using have no effect
on people.
When a red teamer aims at discovering vulnerabilities in a person, the primary
reason that these are considered vulnerabilities is that the exposure of some critical
elements to a threat will either impact the performance of the person or will simply
impact the outcomes, and therefore, the objectives of the person. Each function of
CRT discussed in Chap. 1 can be considered to achieve one of the three goals noted
above.
Fig. 2.12 Abstract categorization of the role of computational red teaming
The primary design mechanism a red teamer uses to alter performance or change
the effects of blue is by designing a challenge. This is discussed in details in the
following section.
2.4 Challenge Analytics
2.4.1 A Challenge is Not a Challenge
It is very difficult to isolate the very few scientific articles discussing the concept of
a “challenge,” from the countless scientific articles using the concept to indicate a
“difficult” or “impossible” situation. Therefore, it is logical to devote time here to
explaining what a challenge is within the context of CRT. We need to go beyond
a dictionary-level, common explanation of a challenge, to a more formal definition
to ensure the possibility of designing models of challenge. We should see that a
“challenge” here is a state that constructively achieves an objective. It does not
denote the impossible, or a difficult situation.
Initially, we may see a challenge in simple terms: a challenge exposes an entity
to a difficult situation. However, the main question is at which level of difficulty we
are to employ the term “challenge.” It is very simple to ask other people difficult
questions and criticize them for not knowing the answer, making them to feel
inferior or incapable.
2.4 Challenge Analytics 87
Take for example a situation in where parents would tell a 6-year-old child that
they cannot earn money, that they are the ones who can buy the child what they want,
and therefore, the child should listen to them. The child is exposed to what we would
term in common English a “challenge.” They feel inferior in their ability to attract
and own money. The child would be wondering what their alternative is to listening
to their parents. The answer is obvious in this context; the child needs a manner
in which to obtain their own money. The parents, without intention, generated an
undesirable causal relationship in the child’s mind, that is, if the child was able to
obtain money, the child could buy whatever they wanted, and therefore, the child
could have an excuse for not listening to their parents. As presented below:
Obtain Money ! Freedom to Buy Anything Desired

Obtain Money ! No Need to Listen to Parents
These types of challenges are like a lose canon, they can fire randomly and even
hit their owners. Within the scope of this book, we will not consider this example to
constitute a challenge; we will simply consider it as an unthoughtful exposition to a
state of hardship.
It is unthoughtful because the parents above would like to gain a position of
power over the child as rapidly as possible. As a result, they state that if the child
is unable to achieve something that they know or believe is far beyond the existing
capacity of the child, the child must comply with certain conditions imposed by the
parents. The parents fail to understand that this behavior may trigger a reaction of
hostility and impose a feeling of hardship for the child. The child may rapidly adopt
a hostile attitude toward their parents, or use their level of knowledge to find the
quickest way to find money, which is obviously from the parents’ own pockets!
This is not the type of challenge we will model and discuss in this book. Instead,
we will examine engineered, thoughtful and constructive forms of challenges
whereby, the challenge is designed to achieve a desired outcome or effect.
2.4.2 Motivation and Stimulation
In computational sciences, the idea of how to create a stimulating environment has

been studied a great deal in the areas of computational creativity, computer games,
and simulation. Most studies in computational sciences on these issues are still in
their infancy, sometimes offering conceptual models and ideas that are difficult to
ground in a working system. However, and more importantly, “stimulating” should
not be equated to “challenging” in a strict scientific sense.
Stimulating ¹ Challenge
Challenge ¹ Stimulating
The above notations emphasize that a stimulating situation does not necessarily
mean that the situation was stimulating because there was a challenge associated
with it. Similarly, a challenging situation does not necessarily stimulate the agent.
An agent may be exposed to a properly designed challenge, but the agent may lack
motivation or interest, which makes the situation less stimulating to them.
Criteria such as “stimulating” and “motivating” are more suitable for a human
agent, as they require human traits and judgment. To generalize the concept of a
challenge to a machine agent, we need to reduce these criteria to a set of indicators
that can be used to assess and/or judge the process objectively without the need to
rely on subjective judgment.
We use a simple to understand, but more complex to implement, criterion.
Definition 2.18. A task is challenging when the distance between the aggregate
skills required to do the task and the aggregate skills that agent possesses is positive
and small.
That is: Aggregate required skills—Aggregate possessed skills > " ! a
challenge iff " is small and > 0.
We need to offer two words of caution here:
1. The concept of “distance” in the discussion above is not a simple quantitative
metric.
2. The aggregate of skills is not the sum of skills.
Several sets of skills can be united in different ways to create different high-
order skills. For example, let us assume that Jack is a creative person with excellent
writing skills, and cryptographic skills. The skills of creativity and writing when put
together may make Jack a creative writer. The skills of creativity and cryptography
when put together may make Jack a good computer hacker. Practice plays the role of
increasing the competency level of the agent. As the agent becomes competent, new
skills emerge. As Jack practices his creative writing and computer hacking, he may
develop skills in script writing for science fiction movies on quantum computations.
A good computer hacker is not created through simply by adding creativity and
cryptographic skills. If it does, then we simply obtain two different people, one who
is creative but has no understanding of computers, and the other who is a well-
educated cryptographer but is not creative. When we put these two people together,
it is unlikely that a good computer hacker will emerge for a long time, that is, the
time required for each person to transfer some of their core skills to the other.
The above raises the important question of how to create a good computer-
hacking team. The creative thinker needs to have some understanding of cryptog-
raphy and the cryptographer should have a degree of creative-thinking ability or
should be “open-minded.” There must be overlap of skills to establish a common
ground for the members of the team to speak to each other in a language they can
both understand, while not necessarily being an expert in the other’s field.
To recap the above discussion from a mathematical perspective, a distance metric
on a skill space is not a trivial task, and the aggregation of skills is usually a
nonlinear coupled dynamic system.
2.4.3 Towards Simple Understanding of a Challenge
There is not a great deal of literature on the concept of a challenge but there is small
amount in the fields of education and psychology. Here, we will build on the work
that does exist, but we must first deviate. As we will see, most of the literature treats
the concept of a challenge in a holistic manner. A challenge is defined, then the
concept is left to a designer such as an educator to interpret what it means within
their context.
The online free dictionary [7] defines a challenge in many different manners.
One definition that is particularly relevant to this book is the following: “A test
of one’s abilities or resources in a demanding but stimulating undertaking.” This
definition highlights the delicate balance that needs to exist between the two words
“demanding” and “stimulating.” The need to strike this balance is supported by
theories in educational psychology. Sanford’s theory of challenge is key in this area
[10]. In his work, he explains the subtle difference between a challenge and a stress.
He emphasizes the need to strike the right balance so that a challenge to a student
does not turn into a stressful situation. This work was followed in the education
domain by some scientific research on the topic [3, 9]. The pattern of a challenge
was recently mentioned in a study on immersion, although there was no analysis of
the specific pattern [6].
The above definition linked the concept of a challenge with the concept of
“stimulating”. However, we separated these two concepts in the previous section.
The word “demanding” is interpreted in our definition as exceeding the boundary.
The word “stimulating” is interpreted that it is not too demanding to the extent
that the agent may give up. However, the concept of “stimulating” has a second
dimension related to the agent’s motives. A challenge would become stimulating if
it has elements that triggers the agent’s motives. This dimension is agent-specific.
As we discussed before, we separate the two concepts of a “challenge” and this
dimension of the concept of “stimulating” in this book.
The concept of a challenge is traditionally found in the literature on “dialectics.”
Naturally, considering a challenge can take the form of questions. Can we design a
counter-plan for the opponents plan? Can we design an example to teach the students
an extra skill they currently do not possess? Can we design an anomalous dataset
that is sufficiently similar to normal behavior to be able to penetrate the anomaly-
detection system for our testing purposes? Therefore, questioning is a natural mean
to communicate a challenge. However, we should be cautious since not every type of
questioning is a challenge. Questioning can be a mean for examination, interrogation
and extraction of truth, sarcasm, or even a dry joke.
Mendoza [9] thinks of a challenge as “forcing myself to learn always to think
at the limits.” Admiring the work of Ellul on dialectics, Mendoza cites Ellul’s four
notions of a theory of dialectics:
1. Contradiction and flux are two characteristics in life that must be reflected in
the way we theorize. Through the holistic approach of dialectic, meaning can be
grasped.
2. The coexistence of a thesis and antithesis should not lead to confusion or one
suppressing the other. The synthesis should not also be a simple addition of the
two; instead, it emerges through “transformative moments” with “explosions and
acts of destruction.”
3. The negative prong of the dialectic challenges the spectrum between the positive
and negative prongs, creating change; or what Ellul called “the positivity
of negativity.” Ellul sees change as a driver for exploration. Mendoza offers
examples of the positives, including: “an uncontested society, a force without
counterforce, a [person] without dialogue, an unchallenged teacher, a church with
no heretics, a single party with no rivals will be shut up in the indefinite repetition
of its own image.” [9]. These positives will create a society that resists change.
4. Automaticity of the operation of dialectic is not possible because many of the
contradictory elements in the society are necessarily going to create those unique
dialectic moments. Ellul cautions that “Dialectic is not a machine producing
automatic results. It implies the certitude of human responsibility and therefore
a freedom of choice and decision.” [9].
Generalizing from the above four notions of dialectics in a manner relevant to
this book, we can identify four factors for a challenge:
1. Coexistence of thesis and antithesis
2. Change and negatives derive challenges
3. Synthesis is an emerging phenomenon
4. Noise.
While noise was not an explicit topic in the above discussions, it needs to be
induced. Given the many contradictions that exist in the world with no potential to
influence a challenge, they can inhibit the emergence of challenges when attempting
to automate the process. Therefore, they should be filtered out. Given the nature of
this noise, it is best suited for humans to filter them out than automation.
The above does not necessarily offer a solution to how we can model, design,
and create a challenge, but it certainly offers cautionary features for which we need
to account for when discussing automation. As a principle, this book does not claim
that we can automate the concept of a challenge; in fact, this is precisely why we
will dismiss the concept of automating the CRT process.
CRT as a creative process, needs the “transformative moments” through

which synthesis is formed in the mind of a human, where it is not planned,
nor can it be explained deductively in terms of the data or the logic used within
a piece of software. These transformative moments, where ideas emerge and
only a vague argument or a correlate of logic can be given as a justification.
Fig. 2.13 Synthesizing Ellul and Mendoza opinions on a challenge
In this book, automation is discussed as a means to deriving a process of

challenging. The advanced computational models we will discuss in this book are
merely tools that can be used in a computer-in-the-loop CRT exercise.
Figure 2.13 synthesizes the concepts discussed by Ellul to form a basic structure
for a challenge. This structure can be seen as an information processing lend on
the concept of a challenge. Nevertheless, the views of Ellul and followers do not
structure the concept of a challenge sufficiently for our computational purposes. We
will attempt to do this. This will not to be an attempt to claim (erroneously) that there
is one and only one model through which to structure the concept of a challenge, but
to lead to the creation of more models on how to structure the concept of a challenge
in future research.
This model must surpass the classical conceptual models offered in fields such as
dialectics and the indirect and unstructured manipulation of the topic in areas such
as systems thinking; mainly because our model should also allow for automation
to occur. Therefore, while we need to seed it with these areas of the academic
literature, we need to ground it in concepts that are computable; processes that can
be automated; and methodologies that can be objectively assessed.
We will not present the model directly, rather, we will evolve it so that the reader
can subscribe to the logic behind the model. We will begin with the simplest ones,
then proceed steadily to a more meaningful, realistic and functional model.
We will begin with two simplistic representations that capture the concept of
a challenge from different perspectives. Figure 2.14 depicts the first conceptual
representation of a challenge. It is an intuitive model that subscribes to a common
Fig. 2.14 A conceptual diagram of the concept of challenge
intuition of a challenge. The totality of the skills an agent possesses represents the
set of tasks the agent is capable of performing. To challenge this agent is to find
such task that the agent cannot perform because the agent lacks certain skills, while
at the same time, this task is very close to what the agent can currently do.
For example, we ask a child to multiply three by three knowing that the child
has learned how to add and knows the basics of the concept of multiplication, for
example, knowing how to multiply two by two. However, the child is unable to
multiply three by three because the child has not been exposed to sufficient examples
to generalize the concept of multiplication to arbitrary multiplication of any two
numbers. Nevertheless, the child was able to generalize the concept of addition
to arbitrary numbers, and understands the basics of multiplication in the simple
example of multiplying two by two. The child has all the skills to multiply three by
three, except one skill: the know-how to generalize that multiplication is a recursive
addition. Whether or not this extra step is simple enough or too hard for the child
will depend on the child’s cognitive resources and skills.
Likewise, we can teach a person how linear regression works then challenge them
by giving them a simple nonlinear example that requires a simple transformation to
make it linear. The person needs to synthesize their knowledge to solve the example
in a manner in which they have no experience. Even if they fail, once the solution
is explained to them, they see no problem in understanding it. This is the point at
which we hear exclamations such as “Ah, I see, this now sounds so obvious, it just
did not cross my mind the first time I attempted to solve this problem.”
The above example demonstrates an important point that many people may find
counterintuitive, that is, a challenge can only synthesize existing knowledge, it
cannot introduce new axiomatic knowledge. Now is a good time to differentiate
between these two types of knowledge.
We will argue that there are two broad categories of knowledge an agent can
have: axiomatic and derivable (learned) knowledge. Axiomatic knowledge can only
be gained through direct exposition to certain facts, processes, and tasks. Similar
to mathematics, once we believe in the axioms, theorems can be derived from the
axioms, both deductively or inductively. To develop a new type of calculus, it is not
sufficient to study and practice calculus, we need different types of knowledge to
understand what it means to develop a new calculus in the first place.
Similarly, people who studied humanities may be very creative when writing a
story or analyzing a conversation. However, if they have never studied mathematics,
no challenge can synthesize their existing knowledge into a new type of knowledge
that enables them to understand mathematics. The distance between the two spaces
of knowledge is large. The same result will ensue by asking a mathematician to
understand Shakespeare if they have not been exposed to literature before or by
asking a person who has recently begun to study a language to understand complex
jokes in that language; we know this is difficult because a joke does not just play
with words in a language, it also relies on cultural elements that the person may not
have gained this type of axiomatic knowledge of this particular culture.
This is not to say that a challenge does not produce new knowledge; on the
contrary, if it does not, then it is not a challenge. Instead, a challenge can only move
us from one place to a place close by; thus, the knowledge the challenge produces
may impress us but it must come from within a space that is sufficiently close
to the space of the old knowledge. This knowledge can be “transformative”—as
Ellul indicated with “transformative moments”—in the sense that it is a non-linear
synthesis of existing knowledge. Because of non-linearity, it is hard to explain
it deductively from existing knowledge. The agent may perceive that it is new
axiomatic knowledge, but the agent would feel also that it is not too difficult and
that it can vaguely be associated with what they already know.
Recasting the previous conceptual diagram of a challenge in a different form,
the skills of an agent would influence the agent’s behavior. Figure 2.15 depicts this
process by conceptualizing the space of possible behaviors an agent can express.
The model assumes that we wish to challenge a thinking entity, let us refer to
this entity as an agent. Similar to the theory of challenge in the field of education,
our aim is to push further this agent to acquire skills and knowledge beyond those it
currently possesses.
Figure 2.15 offers a complementary perspective on a challenge when the aim
is to challenge the behavior of an agent or a system; the aim is to encourage the
system to express a behavior that is outside the boundary of its normal behavior. For
example, challenging a passive person to take a more proactive attitude should not
be considered a process that will magically transform this person into a proactive
person overnight. It is extremely unlikely that such a transformation will occur
so rapidly simply because being proactive requires many skills to be acquired,
including thinking and communication skills.
Within this space resides a subspace of the behaviors the agent currently
expresses, which we assume in this example to represent the space of passive
behaviors. To expand the behavior subspace of the agent to include proactive
behaviors, the small dark circle represents the closest subspace that features
proactive behaviors but is not too far away from the agent’s current subspace of
behaviors.
To achieve the intended effect from a challenge, the agent must be engaged
during this process, that is, the agent should not find the challenge process too
boring or too difficult, but instead, stimulating and motivating. The challenge needs
to stimulate the agent so that a new behavior is expressed by the agent. Therefore,
to ensure that the challenge is effective in achieving the desired effect, its design
needs to be agent-centric to connect agent’s skills with agent’s motives. To this end,
and before progressing any further, we must pause to explain what we mean with
behavior in this context.
2.4.4 Challenging Technologies, Concepts and Plans
Figure 2.16 expands this discussion beyond the use of the concept of a challenge to
expand the skill set or behavioral subspace of an agent, to testing and evaluating
algorithms and systems. This new example will allow us to dig deeper in an
easy to understand context. In Fig. 2.16, we assume a computer network. In this
environment, A represents the space of all behaviors or all possible traffic that goes
through this network. Some of this traffic will constitute anomalies and is depicted
by the subspaces B and D. The difference is that we know of the existence of B but
we do not know of the existence of D because of our bounded rationality, limited
knowledge or any other reason that would prohibit our thinking from knowing about
the types of anomalies hidden in D.
We can assume an algorithm that is able to detect anomalies. This algorithm may
be able to detect anomalies in subspace C , which is a subset of B. A classical test
and evaluation method such as stress testing to evaluate this algorithm will very
likely end up with the subspace B C . This is because the bias that exists in our
design of these stress-testing methods is (subconsciously) based on our knowledge
of B.
Methods designed based on the concept of a deliberate challenge should discover

that in addition to B C exists the subspace D. Notice that D is sufficiently close
enough to B, but resides outside our sphere of bounded rationality. It is reachable
through CRT.
Thus far, we have ignored the fact that a challenge requires at least two agents:
one challenges the other. Figure 2.17 demonstrates this by depicting the two design
spaces of a challenge for both teams. In the top space, blue searches the space of red
attacks. While there is a space of red attacks that is known to blue, there is a subspace
within this space where blue knows that if red attacks come from this subspace, blue
can’t detect them. This is the subspace where blue is aware and conscious of its own
vulnerability.
There is also a subspace in the space of all possible attacks by red, where blue
is unaware of it. Thus, this subspace represents the blind spot for blue. The same
analysis can be done on the red side in the bottom diagram.
2.5 From the Analytics of Risk and Challenge

to Computational Red Teaming
Sofar, the discussion introduced many concepts that underpin the risk and challenge
analytics areas. It is time to synthesis these introductory materials into computa-
tional forms. The discussion will start with the first formal perspective on CRT and
how it relates to the analytics of risk and challenge. This will be followed by a more
focused discussion that synthesizes the introductory materials into a coherent form
for each of the cornerstones of CRT.
2.5.1 From Sensors to Effectors
The grand challenge in computational red teaming is to seamlessly transform

sensorial information to effectors that create the right set of effects. This grand
challenge is depicted in its simplest form in Fig. 2.18.
This figure is too generic and goes beyond the realm of CRT. One can say that
it is a picture that captures the generic objective of autonomous systems: how to
transform sensed information into an agent’s desired effects through the design of
effectors that influence and shape the environment.
Fig. 2.18 Transforming sensorial information, from sensors, to effects, through effectors, cycle
2.5 From the Analytics of Risk and Challenge to Computational Red Teaming 97
Fig. 2.19 Cognitive–cyber–symbiosis of the CRT-based sensors-to-effectors architecture
Zooming in on this pictorial representation of an autonomous system from a

CRT perspective, Fig. 2.19 expands the picture with the cornerstones of a CRT
system. A CRT system should not be understood as a simple computer program, but
a system of systems to conduct a CRT exercise; some components can be software-
based, while others are human-based. These components interact together to form
a CRT system. We will call this process Cognitive-Cyber-Symbiosis (CoCyS)—
pronounced as “Cookies”—to emphasize that this system is not based on the mere
integration of different components, but on blending and seamlessly fusing these
components to form a single computational CRT machine in a fluid manner.
Definition 2.19. Cognitive-Cyber-Symbiosis (CoCyS) is an environment whereby
human thinking, mental processes and indicators, and the Cyber space are blended
to improve the effectiveness of decision making.
The first cornerstone of a CRT system, as shown in Fig. 2.19, is risk analytics.
This agent aims at analyzing how uncertainty impacts the system’s objectives.
More details on risk analytics are offered in Sect. 2.5.3. The second cornerstone of
CRT, Challenge Analytics agent, is linked to the risk analytics agent. Challenges
are discovered and formed using the Observe-Project-Counteract (OPC) agent
architecture discussed in Sect. 2.5.4. In essence, risk analytics in CRT analyzes risk
by challenging its own decision making and thinking process as well as the external
environment including competitors and other agents.
The four agents: risk analytics, challenge analytics, effects estimation and effects
design, represent the four primary thinking processes that negotiate with each other
continuously to red team. These agents should be seen as an attempt to structure

CRT, not an attempt to define four independent systems that needs to come together
to form CRT.
The four agents described in Fig. 2.19 will share computational needs, thus,
it is not advisable to duplicate the software infrastructure. Duplicating software
components in a system can create many problems. First, it increases the acquisition
cost of the system, simply because the same functionalities get bought from different
vendors. Second, it increases the maintenance cost of the system. Third, and
most importantly, over time, the CRT system becomes a can of worm: unplanned
redundant functionalities and data that no one can understand the assumptions
spread across a system that is supposedly designed to challenge assumptions, etc, in
other systems.
To overcome the problems mentioned above, Fig. 2.20 shows the service-
oriented architecture (SOA) to generically define—in a structured and not too
decentralized manner—the high-level services required for CRT. An SOA is a
computer architecture, whereby functionalities are defined as services without any
central control mechanism. These services communicate to each other through a
platform, known as the service bus, that enables services to define, discover, and
use other services in the system.
Fig. 2.20 Service-oriented architecture for computational red teaming

The technical details for implementing an SOA is beyond the scope of this
book, but the concept of SOA is simple enough to be understood on this level
of abstraction. SOA can be implemented using web-services; which relies on the
internet as the backbone for the service bus.
Figure 2.20 shows one view of the SOA for CRT, which connects sensors to
effectors. It also emphasizes that the system internally measures indicators for
the success of achieving the objectives; thus, providing evidence-based decision
making approach. The risk analytics component has a number of services, including
optimization and simulation services. The role of these technologies will be
discussed in the following chapter.
Both challenge analytics and risk analytics rely on three technologies: simula-
tion, optimization and data mining, similar to risk analytics. These technologies
will be discussed in more details in Chap. 2, and an example to illustrate how they
need to work together for CRT purposes is given in Sect. 3.1.2.
2.5.2 The Cornerstones of Computational-Red-Teaming
Risk analytics and Challenge analytics are the two cornerstones of a CRT system.
Figure 2.21 depicts this relationship by factoring risk into its two components:
uncertainty and objectives. Together, the objectives of the organization and the
uncertainty surrounding the decision making process constitute risk. The challenge
analytics component aims at designing challenges for uncertainty, constraints and
objectives. This point will be elaborated on in Sect. 2.5.4.
Fig. 2.21 The cornerstones of computational red teaming

2.5.3 Risk Analytics
Risk analytics is the process of deconstructing system-level risk in an organization to

its constituent parts, assess the system-level risk, and evaluate the impact of possible
courses of actions on organizational risk spatially and temporally.
The difference between risk analytics and risk management is that, the former
emphasizes the thinking process and tools, while the latter emphasizes the process.
Risk analytics (see Fig. 2.20) begins with transforming input data into effects’
indicators. The CRT system is designed for a purpose, which is defined with a set
of effects that need to be generated, and therefore monitored. Before acting on any
data that has been generated, either from the environment or internally within the
system, the system needs to be able to measure some indicators of the effects it
aims to generate. This measuring process acts as a reinforcement mechanism for
the system, enabling the system to measure deviations from intended effects and
correcting its actions accordingly. The Effects Estimation agent is responsible for
continuous monitoring of the environment to estimate effects’ indicators.
The output of risk analytics is a set of effects. The risk analytics agent needs
to take into account the environment, including the properties of the agents in that
environment and how effects should be shaped and framed to influence these agents.
This is the role of the Effects Design agent.
From the definition of risk as the impact of uncertainty on objectives, the two
cornerstones of risk analytics is to analyze uncertainty and objectives to synthesize
a risk picture. This risk picture needs to be challenged to manage risk. The challenge
analytics component is responsible for finding actions that can have a positive and/or
negative impact on organizational objectives.
2.5.4 Challenge Analytics Using the Observe-

Project-Counteract Architecture
The process for challenge analytics consists of two steps:

Estimate Boundaries: In this step, the boundary constraints are discovered. Sec-
tion 3.1.2 will provide an example on how a challenge can be computationally
discovered. In its basic form, the process for blue (red) works by estimating the
boundary constraints of red (blue) capabilities.
Probes: Once boundaries become known, the probing step attempts to design
search mechanisms to generate points just across the boundaries. When a probe
crosses the boundaries with a small distance, it becomes a challenge.
As we discussed before, risk is made up of uncertainty and objectives. Challenge
analytics, as shown in Fig. 2.21, attempts to design ways to challenge uncertainty,
constraints, and objectives so that CRT can challenge risk.
The challenge analytics of uncertainty aims at estimating the boundary con-

straints of uncertainty; that is, instead of enumerating what may happen, it discovers
the boundary on what may happen. Uncertainty may be misunderstood as an
unbounded space of possibilities. However, life is evolutionary, not revolutionary.
What bounds uncertainty is plausibility. Anything can happens, but nothing happens
in vacuum.
Plausibility bounds uncertainty and plausibility depends on know-how

(skills).
The prices of shares in the stock market can rise very quickly, but we can estimate
a boundary on how far they can rise. It is possible to estimate multiple boundaries
with different levels of confidence. If blue can estimate the bounds on red’s
uncertainty, blue can design strategies to challenge red by creating uncertainties
outside these bounds.
Similarly, challenge analytics need to challenge objectives. We discussed that
classical decision making assumes that objectives are mostly defined and fixed.
However, objectives in CRT are controllable elements that can be reshaped. One
way for blue to influence red is to reshape red’s objectives. Challenge analytics can
help blue to estimate the boundary conditions on red’s objectives so that blue can
challenge red by aiming to reshape red’s objectives. This reshaping process can be
done by changing these boundaries, moving them in different directions.
To illustrate a simple example using classical linear programming, assume that
red aims to maximize profit, where the profit objective function is formulated as
follows:
" 2xC3y
with x and y representing two different types of effects that red wishes to
generate. For blue to challenge these objectives, blue needs to analyze two different
boundaries: the boundaries on the coefficients, and the boundaries on the structure.
The boundaries on the coefficients is to estimate how far the two coefficients of 2
and 3 for x and y can change, respectively. However, some gains achieved by red are
influenced by blue. These coefficients represent red’s gain from each type of effects.
In essence, they represent how red values these effects. As such, to challenge these
coefficients is to understand the boundary constraints on them; that is, for example,
the coefficient of x may change between 1 and 5 based on a number of factors. Blue
can then design a strategy to influence these factors so that this coefficient changes
in the direction desired by Blue.
The boundaries on the structure aims at estimating the constraints on the effect
space for red. In other words, can we introduce a third variable z to this equation
that is more beneficial for us? These structural boundaries are very effective tools.
Fields such as Mechanism Design and Game Theory can assist in discovering this
third dimension, although we will avoid discussing Mechanism Designs in this book
because most work in this domain falls in the same classical trap of game theory,
which assumes (1) rational agents and (2) agents are self-aware of the value of any
alternative (i.e. when an agent is faced with an alternative, the agent has an internal
value representing the maximum value the agent would be willing to pay for that
alternative). The advantages of CRT is that, it does not have such restrictive and
unrealistic assumptions. For example, what would be the maximum price you would
be willing to pay to save your life? In essence, the question also means, how far can
you go to save your own life? Mechanism design assumes that each agent knows
the answer to this question precisely!
The third element that CRT can challenge is the constraints on the other team.
Constraints normally exist for two reasons; either the structure and properties of
the system are inhibiting the system from expressing certain behaviors, or the
environment is doing so. Constraints from the environment are forces impacting
the system in a similar way to uncertainties. The primary difference between an
environmental constraint and uncertainties is that the former are certain forces,
while the latter are uncertain. For example, weather conditions are environmental
conditions impacting a flight. When weather conditions are known, we can take
them as constraints when designing an optimal flight path. When weather conditions
are not known, they become uncertainties that a flight path needs to be evaluated
against a range of possible weather conditions. In classical optimization, the two
concepts can be combined in the form of a stochastic constraint.
Most of the time challenge analytics is concerned with designing counterac-
tions to challenge the other team. This design process for blue(red) will require
mechanisms to estimate boundary conditions for red(blue) constraints, uncertainties
and objectives, designing actions outside these boundaries, projecting the impact
of these actions in the future, and selecting the most appropriate counteraction for
blue(red) in response to red(blue) actions.
As will be illustrated in Sect. 3.1.2, challenge analytics rely on three technolo-
gies: simulation, optimization and data mining, similar to risk analytics.
Computationally, challenge analytics requires a proactive architecture that can
support proactive generation of counteractions. One possible realizations of this
architecture is the following Observe-Project-Counteract agent architecture. This
architecture has three components as follows:
Observe: In the first stage, each team needs to observe the other team by contin-
uously sensing information, extracting behavioral patterns, and assessing their
skills (assessing boundary constraints).
Project: In the second stage, the creation of a model of how the other team acts is
required, so that each team can use this model to estimate their actions in the
future, and evaluate the impact of one team’s actions on the other team. In the
debate, if we can estimate through observations what the other team knows, we
can equally estimate their response to our future questions.
References 103
Counteract: In the third stage, counter-strategies are designed to counteract what

the other team intends to do. The ability to observe and project the other team’s
behavior into the future provides a team with the means to evaluate its counter-
strategies.
Many variations can be created from this architecture by replacing the word
“observe” with “sense,” and replacing the word “project” with “anticipate,” “pre-
dict,” or “estimate.” Clearly, each of these words has a slightly different meaning,
and appropriate use will depend on the context.
For example, the difference between observe and sense is that “observe”
reaches beyond a basic level of sensing. Observing requires intentional sensing of
information and making sense of this information.
Similarly, microscopic differences can be defined between the words project,
estimate, predict and anticipate. A random guess of where blue is going in the
absence of any information is a basic type of prediction. Therefore, blue can predict
based on its own beliefs, without the need for information on red. Establishing
proper confidence intervals around this prediction will move us from the realm of
prediction to the realm of estimation. Anticipation increases the complexity even
further by using future state information to define the current state of the agent.
Projection is the wider concept, whereby any form of prediction, estimation or
anticipation is considered a form of mapping between existing states to future ones.
The word “counteract” is emphasized instead of the word “act” because the
emphasis of red is not to produce an action independent of blue. One can act out
of one’s own interest or even subconsciously. However, counteraction is a function
of an opponent action; it is a deliberate response that requires proper logic to be in
place to undo deliberately the effects of the opponent’s action.
The critical distinction between a counteraction and a response resides within the
clause “to undo deliberately the effects of the opponent’s action.” A counteraction
is not a simple or reactive response, but a response that is designed with the effect
of the competitor’s action in mind. It is a response designed to ensure that the effect
of the competitor’s action is not materialized.
Deliberate actions are centered on the objective of an agent. When the level of
this objective relies also on the actions of the opponents, the agent’s action becomes
a counteraction.
References
1. Abbass, H.A., Petraki, E.: The causes for no causation: a computational perspective. Inf.
Knowl. Syst. Manag. 10(1), 51–74 (2011)
2. Einstein, A., Infeld, L.: The Evolution of Physics. Simon and Shuster, New York (1938)
3. Ellestad, M.H.: Stress testing: principles and practice. J. Occup. Environ. Med. 28(11),
1142–1144 (1986)
4. Gaidow, S., Boey, S., Egudo, R.: A review of the capability options development and analysis
system and the role of risk management. Technical Report DSTO-GD-0473, DSTO (2006)
5. Gilbert, T.F.: Human Competence: Engineering Worthy Performance. Wiley, Chichester (2007)
6. Grimshaw, M., Lindley, C.A., Nacke, L.: Sound and immersion in the first-person shooter:
mixed measurement of the player’s sonic experience. In: Proceedings of Audio Mostly
Conference (2008)
7. http://www.thefreedictionary.com/. Accessed 1 Feb 2014
8. ISO: ISO 31000:2009, Risk Management - Principles and Guidelines (2009)
9. Mendoza, S.: From a theory of certainty to a theory of challenge: ethnography of an
intercultural communication class. Intercult. Commun. Stud. 14, 82–99 (2005)
10. Sanford, N.: Self and society: social change and individual development. Transaction
Publishers, Brunswick (2006)
11. Sawah, S.E., Abbass, H.A., Sarker, R.: Risk in interdependent systems: a framework for anal-
ysis and mitigation through orchestrated adaptation. Technical Report TR-ALAR-200611013,
University of New South Wales (2006)
Chapter 3
Big-Data-to-Decisions Red Teaming Systems
The general who loses a battle makes but few calculations beforehand.
Thus do many calculations lead to victory, and few calculations to defeat:
how much more no calculation at all! It is by attention to this point that I can
foresee who is likely to win or lose.
Sun Tzu (544 BC - 496 BC) [33]
Abstract This chapter is about computations in CRT. Plain language is used to

explain the concepts of experimentation, optimization, simulation, data mining, and
big data before presenting the intelligent architectures that can transform data to
decisions in CRT systems. Most of these architectures can be used outside CRT in
any situation. However, augmenting these architectures with CRT capabilities offers
unprecedented computational capabilities for offline and real-time decision-making
situations equally.
3.1 Basic Ingredients of Computations in Red Teaming
3.1.1 From Classical Problem Solving

to Computational-Red-Teaming
Before more technical discussions on CRT, one may need to understand the
differences between a classical problem solving approach and CRT. Figure 3.1
depicts a categorization that attempts to separate the two classical schools of
thinking in problem solving.
The “think-to-model” (T2M) school represents classical AI and quantitative
Operations Research (OR). Within the military, it represents what is known as the
military appreciation process (MAP), which is the process officers get trained on
to solve problems. It starts with defining what the problem is, after all, without
knowing what the problem is, the activity can be counter-productive. Once the
problem is defined, it gets formulated either mathematically or qualitatively but

106 3 Big-Data-to-Decisions Red Teaming Systems
Fig. 3.1 Problem solving schools of thinking
in a structured manner, alternative courses of actions are designed by solving

this formulation, the most appropriate course of action is selected, and gets
executed. It is important to emphasize that uncertainties are defined during the
problem formulation stage and when alternative courses of actions get evaluated,
uncertainties can be accommodated for.
The “model-to-think” (M2T) school represents a more adaptive form of the
classical scientific method or the experimental approach. Here, models are like
experiments, they are not used to solve the problem alone, but they are also used
to define what the problem is in the first place. In the M2T school, we will use the
word “strategy” instead of “solution” because the objective here is to find the ways
to transform the means to goals [13, 27]. Models are used to define the appropriate
means and goals; thus, defining a problem.
CRT is an M2T approach. In CRT, problem definition is not imposed on the
exercise. While the exercise starts with a scoping, the reciprocal interactions and
events during the exercise can change the scope and define new problems.
It is important to emphasize this aspect of CRT. Usually, CRT is discussed as a
way of defining problems. As one team challenges the other, the challenge defines
a new goal that needs to be reached. Every time a new goal gets defined, a new
problem is defined. The teams can rely on the T2M to solve this new problem, but
to create a new challenge, they have to rely on the M2T approach.
The next section will present a scenario to illustrate the computational ingredients
and tools required for a CRT exercise.
3.1 Basic Ingredients of Computations in Red Teaming 107
3.1.2 Run Through a CRT Example
Let us recall that the two key concepts in CRT are risk and challenge. Decisions
are evaluated using a risk lens to challenge the system under investigation. In this
section, we will present a synthetic scenario for a CRT exercise to demonstrate how
the different bits and pieces of modeling come together to present a coherent CRT
environment.
Ramada (red) and Bagaga (blue) are two nations: Ramada is developing and
Bagaga is developed. Ramada relies on foreign aid from Bagaga to provide financial
support to its senior citizens. Bagaga provides this financial aid to increase the
loyalty of Ramada’s citizens to Bagaga.
Bagaga established a CRT exercise to understand the implication of the different
levels of financial aid it can provide to Ramada. Given that Bagaga established
this CRT exercise, the blue team represents Bagaga, and the red team represents
Ramada.1
Over the years, Bagaga has developed technologies to conduct CRT exercises
of this type. Given the complexity of the situation, Bagaga decided to use its CRT
capabilities to implement the CRT exercise.
Bagaga formed a highly qualified red team consisting of five experts: an anthro-
pologist; a social scientist; and a psychologist (all of whom specialize in research
on, and have a working knowledge of, Ramada); a strategist (who is familiar with
the machinations of the political policies of Ramada); and a computer scientist (who
specializes in running CRT models). In addition, a number of technicians have been
enlisted to support the red team.
Bagaga has constructed a blue team consisting of experts in economics, interna-
tional relations, and a computer scientist specialized in running CRT models.
The purpose of the exercise was explained to both teams as follows: “the purpose
of this exercise is to design a strategy to maximize the value gained by Bagaga
from the financial aid given to Ramada (benefit), while minimizing the amount of
financial aid (cost).”
Each team was assigned their roles as follows: “the blue team needs to decide on
a level of financial aid that Bagaga can afford, while the red team needs to discover
vulnerabilities in the blue team’s decision that can cause the value for money to be
less than expected.”
In this exercise, value for money is defined as
Benefit
Value for money for the blue team D
Cost
Positive Effects
Value for money for the blue team D
Negative Effects
1
Notice that the first letter of the country name corresponds to the first letter of the color to help
remember which team is which.
The exercise will continue as a cycle. The objective of the blue team is to make
a decision on a level of financial aid that Bagaga can afford. The outcome of this
decision will be communicated to the red team, whose objective is to analyze the
vulnerabilities of Bagaga’s decision. The red team then sends the blue team its
findings in the form of the level of loyalty the financial aid achieved in Ramada.
The financial aid’s vulnerability cycle will continue until Bagaga is comfortable
that the analysis has covered the space well.
What computer models would the exercise use for this activity?
The blue team decided to use economic models to understand what level of
financial support Bagaga could commit given its tight budget constraints. They will
rely on the international-relations experts to forecast the expected impact of the
assigned financial support on Ramada’s loyalty.
The red team decided to augment their expertise with the advanced computa-
tional capabilities available to them.
In the first cycle, the blue team ran their economic models and decided that an
appropriate level of financial aid would be B1. The decision was communicated to
the red team.
While the blue team was working on finding an appropriate level of financial aid,
the red team was attempting to understand the relationship between financial aid
from Bagaga and the loyalty of citizens in Ramada. To achieve this, the red team
needed to have a model of Ramada.
The model needed to capture the behavior of Ramada’s government in response
to different levels of financial aid. The red team decided to use a model they
named ScioShadow-Ramada. This was an advanced two-layer simulation model of
Ramada: one layer was a behavioral model of the government, while the second
layer was a behavioral model of the citizens.
The model used different variables about the lifestyle of a typical citizen in
Ramada, and mapped them to the feelings and emotions that are translated to loyalty
toward Bagaga. The model worked on different levels of detail and could be detailed
to the extent of mimicking the dynamics of how feelings and emotions are created
for different types of citizens in Ramada.
By varying the parameters of ScioShadow-Ramada, the red team can study the
impact of different levels of financial aid on the level of loyalty Ramada has to
Bagaga. We will term the mimicking behavior of the model “simulator.”
A simulator within CRT is the oracle that represents the phenomenon under
investigation. We can question this oracle with any factor we want, providing the
models inside the oracle cover this factor. Let us ask the oracle one of the main
questions arising from the situation at hand: “If the level of financial aid from
Bagaga is B1, what is the level of loyalty expected in Ramada?”
The red team can run a great deal of simulation (i.e. many calls to the simulator
with different parameters initializations) using ScioShadow-Ramada. However, this
is very time consuming and computationally very expensive. Instead, the red team
decides to use optimization technologies to find the points representing the best
mappings (optimal solutions) between the level of financial aid from Bagaga and the
corresponding level of loyalty in Ramada. They execute a number of optimization
runs to find all optimal solutions.
Fig. 3.2 Fitness landscape of Ramada’s loyalty in response to Bagaga’s financial aid
In this exercise, the red team finds three optimal solutions. In Fig. 3.2, these
solutions are labeled M1, M 3, and M 5.
Figure 3.2 presents the relationship between the conditioning of the parameter
of interest, level of financial aid, and the possible response in the effect of interest,
which in this example, is measured by the loyalty level of citizens in Ramada toward
Bagaga. This diagram is sometimes termed the “response surface” or the “fitness
landscape.”
A response surface presents the effect (response) as a function of the cause
(parameter under investigation). A fitness landscape is a concept from biology that
presents the fitness of different species in the population; we can simply assume that
each solution in this diagram is a configuration for a policy or a “species”. In both
cases, understanding this surface is important because this is the diagram that blue
must consider when making a judgment.
Recalling that the level of financial support that the blue team chose is B1, we
can immediately deduce from Fig. 3.2 that B1 does not lead to the highest level of
loyalty. In fact, decreasing the financial aid can lead to a higher level of loyalty.
However, why is this the case?
To answer this question, the red team needs to dig underneath the fitness land-
scape using their knowledge and expertise on how Ramada works. The optimization
process that discovered the three optimal solutions has undergone a search process
to find these optimal solutions. The search process usually moves from one solution
to another (or from a sample of solutions to another) in the search space, evaluates
encountered solutions, and then decides either to generate new ones or to cease the
search.
If the red team saves all the solutions encountered during the optimization
process, it can visualize them as demonstrated in Fig. 3.3.
Let us remember that each solution in Fig. 3.3 arose from running the simulation
system in ScioShadow-Ramada. Therefore, the environment we used to initialize
Fig. 3.3 Solutions encountered during the optimization process to construct the fitness landscape
each simulation that generated each of these solutions can be saved. Based on the
analysis executed by the red team, two key variables in this environment (apart
from the level of financial aid) can be considered the cause for the variations in
these solutions. These two variables are the corruption level in the government of
Ramada, and the level of government control the Ramada government exercises
within the country.
Since it is established that the government of Ramada receives the financial aid
from Bagaga, the level of corruption in the government of Ramada means that the
financial aid does not all go to the citizens. However, this also depends on the level
of control of the government. If the level of control is very high, and the corruption is
very high, one would expect this to correspond to a very low portion of the financial
aid being passed to Ramada’s citizens.
In Fig. 3.3, the red team has used Z1, Z2, and Z3 to denote solutions encoun-
tered during the optimization process and fall within the local area of each of the
three optimal solutions: M1, M 2, and M 3, respectively. It is important to note that
it is not necessarily true that all Z1 were encountered during the search for M1.
Therefore, the red team can visualize the relationship between these two variables
and the corresponding loyalty level as presented in Fig. 3.4. Fortunately, the red
team can see that each optimal solution and its surrounding neighborhood occupies
a distinct subspace of the government of Ramada’s corruption and control space.
Given the large amount of data that has been collected, the red team can apply
classification, which is a type of data-mining technique that can automatically
discover the boundaries between different areas in the diagram. The output of the
classification method is presented in Fig. 3.5.
This output is interesting because it divides the corruption-control space into
three distinct regions that impact the relationship between the amount of financial
aid from Bagaga and the level of loyalty of Ramada’s citizens to Bagaga.
Fig. 3.4 The relationship between fitness landscape and causal space
Fig. 3.5 Classification of causal space
The red team sent their findings to the blue team. They have demonstrated to the
blue team that the original decision of the blue team was vulnerable.
The blue team decided to challenge the red team. They needed to find a way
to push the boundaries that separate the three areas of optimal solutions to their
advantage. To design this challenge, the blue team began to analyze the skills (know-
how) the two concepts of government control and government corruption require.
The international-relations expert identified three skills for government corruption
and two skills for government control. These skills are presented in Fig. 3.6.
The skills presented in Fig. 3.6 are interesting to note. A corrupt government
requires the skill of having a strong understanding of the social and political context
of the country. Moreover, they may also have excellent social-intelligence skills.
These two skills are interesting because they are also the skills needed for a healthy
Fig. 3.6 Challenging the causal space
government. The third skill for corruption is that the person (government) needs to
know how to suppress their conscience, be selfish, and avoid altruism.
This situation is similar to an excellent thief and an excellent detective: the two
have the same skills but their attitude toward society is driven by two opposing value
systems.
For the government to have control, they need to know how to design laws to give
them this control, and how to establish strong law-enforcement agencies to protect
and uphold the laws.
The analysis conducted by the blue team revealed the skills that can be influenced
to challenge and reshape the relationships presented in Fig. 3.5. The blue team used
simulation, optimization and data-mining techniques again, in conjunction with
ScioShadow-Ramada, to find ways to challenge the red team. Figure 3.7 presents
the outcome of this exercise, where the blue team identified that the line separating
Z3 from Z1 and Z2 can be pushed by a distance: c1, and the line separating Z1
from Z2 can be pushed further a distance: c2.
To explain the findings of the blue team, let us begin with c1. If line Z3 .Z1
Z2/ is pushed toward c1, the area of Z3 will be larger. The fitness landscape did not
change, but what changed was that the blue team found ways to influence Ramada
Fig. 3.7 Second layer causal space
such that lower government control could generate higher benefits. For example, if
the blue team directed some of the financial aid into media programs to promote a
sense of community in Ramada, this would counteract the selfish behavior displayed
by some government officials in Ramada and convert them to exhibiting altruistic
behaviors. Similarly, by enhancing the laws in Ramada, Z1 Z2 can be pushed in
the direction of c2; thus, higher levels of corruption can be counteracted with better
laws.
In discussing the concept of a challenge, the use of data-mining techniques to
estimate the boundaries between the different sets of solutions was crucial. These
are the boundaries that needed to be discovered before establishing the impact on
Ramada. The blue team can now use a portion of the B1 financial aid as a budget to
influence and reshape the government of Ramada to achieve a higher level of loyalty
in the citizens of Ramada.
This example presents a number of CRT technologies that were essential in the
exercise. First, we can see from the beginning of the exercise that the purpose and
roles of the teams were assigned unambiguously. This is part of the concept of
“experimentation,” which we will discuss briefly in the next section.
The second set of skills relates to optimization, data mining, and simulation.
Optimization provides the tools to search for and discover optimal solutions or
promising areas of the search space. Data mining provides the tools to group
information, find ways to discriminate between information, and find possible causal
relationships in the data. Simulation is the oracle that represents the system under
investigation in a computer program to which we can simply ask questions instead
of asking the real system itself.
3.2 Experimentation
CRT exercises are experiments designed for a purpose. More details on

experimentation in defence can be found in [6]. Be it in the form of a human-
based exercise in which a group of experts assumes the role of red to red team a
plan, or a group of software systems put together to test the robustness of a cyber
environment, the scientific method must be followed. The reason for this will be
discussed in the next section. In CRT, we will always begin with a purpose, then
formulate one or more hypotheses, then design experiments to test these hypotheses.
3.2.1 Purpose, Questions, and Hypotheses
A CRT exercise is not performed in a vacuum. It is established for a purpose with

a specific aim or specific aims. The purpose defines the reason of being: why the
exercise exists. For example, the exercise may exist to test the robustness of a
security system. The aim of the exercise indicates what it attempts to achieve. In
the previous example, the exercise aimed at identifying situations that may exploit
or harm the security system.
The scientific method ensures that the exercise is conducted in a systematic
manner to achieve the stated aim. The questions of the exercise stem from the
purpose and aim, the hypothesis is derived from the question, and the experiments
are designed to prove or dismiss the hypothesis. The aggregate findings of these
experiments should answer the question that was initially posed.
Examples of some questions that would be asked in a CRT exercise include the
following: Will our strategy survive the competitive nature of the environment? Is
our plan of attack sufficiently robust to be successful even if our opponent relies on
their adaptive capacities to change their strategies? Can someone who has access to
the right technology penetrate our security system? Is the physical protection around
our major airports sufficient to stand an organized attack? Are the technologies
adopted for our future air-traffic system the right technologies for how we expect
the demand in the environment to evolve? Will the human operator makes mistakes
during stressful scenarios?
The importance to begin the CRT exercise by posing questions stems from two
reasons:
1. The primary question defines the scope of the exercise. A CRT exercise can be as
large as we want it to be. However, the larger it is, the more complex it will be.
Without an equivalent level of complexity in the design and resources allocated to
the exercise, the exercise will turn from being a systematic well-planned exercise
to being a bunch of ad-hoc activities that are neither logically connected, nor
justifiable.
2. Every question carries a cost. This is the cost of finding a proper and correct
answer. The benefits that the answer to a question can offer to an organization
3.2 Experimentation 115
and the RT-S must be clear. If the cost exceeds the benefits, the CRT exercise
should not take place. There is no point in asking a big fancy question when the
budget is limited. Whatever the answer we get, it creates vulnerabilities in the
decision making cycles by exposing the credibility of the answer to threats of
doubts.
The CRT question(s) will usually lead to hypotheses. There are two definitions
of a hypothesis.
Definition 3.1. A hypothesis is a statement of a prior belief formed about the
outcome of an experiment.
Definition 3.2. A hypothesis is a prior belief of the existence of a cause–effect
relationship.
If the first definition is used, a hypothesis may sound strange to some CRT
designers; primarily because in complex CRT exercises, the formulation of a belief
about the outcome of a CRT exercise is either a trivial deductive exercise from
the question, or conveys the image of an academic experiment, rather than being
a classical in situ CRT experiment.
For example, if the question that triggered a CRT exercise was whether a security
layer be penetrated. The hypothesis would appear trivial if we simply stated that we
believe that this security layer can be penetrated. The word “hypothesis” itself, with
the first definition in mind, may also appear overly academic in a CRT context.
If the second definition of a hypothesis is used, we can see the importance of
defining a hypothesis more clearly; in fact, it becomes a necessity. In this definition,
a hypothesis is a belief about the cause of the effect. If the effect is penetration
of the security layer, a hypothesis can be formulated to state that lack of physical
protection of key computer access points makes it possible to penetrate the security
layer. Here, the cause is our key to open the door toward reaching and achieving the
effect. By stating the hypothesis of the CRT exercise, we are stating our initial belief
about the first key we will use to generate the effect.
Formulating the right hypothesis substantiated with systematic and logical
thinking is a valuable step toward obtaining rigorous results. If it eventuates that the
hypothesis is invalid, this becomes a finding in its own right. In the example above,
if a lack of physical protection of key computer access points does not lead to a
penetration of the security layer, this finding would convey that the security layer is
robust against this cause. More importantly, it will prompt us to ask why the lack of a
physical protection layer of computer access points is not a door toward the security
layer: Is it because there is an internal firewall between internal subnetworks and the
core network within the organization? Is it because there are strong cryptographic
protocols or is there another reason?
These follow-up questions will be the basis for the evolution of the study, the
formulation of updated and new hypotheses, and the exploration of more means to
achieve the effect.
Sometimes the CRT exercise is executed using a human-in-the-loop simulation.
The red team needs to make decisions on what analysis tools they will adopt to
evaluate the plans of the blue team, then the simulation environment provides
feedback to both teams on their plan and counter-plan. A hypothesis is still required;
otherwise, the problem is open ended for the red team. The red team needs to use
the hypothesis as a mechanism to begin the exercise and initiate the activity to avoid
confusion about where to begin.
While we emphasize the need for a hypothesis, the hypothesis should not be
considered a bias for the red team that would stop them from innovating. The
questions and hypotheses frame the CRT exercise in its reason of being, but should
not constrain the thinking process, ideas or innovations that should emerge from the
exercise. A hypothesis is merely an initial belief. Once the exercise begins, the team
members may dismiss it completely. However, even when this occurs, the hypothesis
acts like a seed for the discussion and investigation. Even if the red team is not
persuaded that the lack of physical protection of key computer access points makes
it possible to penetrate the security layer, they will begin to debate and dismiss this
claim. This debate will encourage the analysis.
3.2.2 Experiments
Once the question(s) and hypothesis/hypotheses are formulated, the experiment

must be designed. To this end, we need first to define an experiment.
Definition 3.3. An experiment is a situation that is conditioned by the designer to
exclude unwanted factors, while carefully including wanted factors, to examine a
cause–effect relationship.
The definition above does not restrict the concept of an experiment to a laboratory
environment. Experiments can be conducted in situ, in a real operating environment.
Three keywords that were used in the definition require further discussion: unwanted
factors, wanted factors, and cause and effect.
3.2.2.1 Unwanted and Wanted Factors
The designer of an experiment has the daunting task to ensure that the experimental
environment is conditioned to exclude many unwanted factors. These unwanted fac-
tors can be elements that do not impact the experiment and therefore, including them
may introduce noise and confusion in the mind of analysts, constituting unnecessary
factors that complicate the experimental environment. Unwanted factors may also
be elements that impact the experiment, but the designer wishes to exclude them to
be able to isolate cause and effect.
For example, imagine we want to examine the effect the exposure of a manager to
depressive situations will have on the quality of the manager’s critical decisions. Let
us assume the hypothesis of this experiment is that as the degree of depressive events
to which the manager is exposed increases, the quality of the manager’s decisions
decreases.
3.2 Experimentation 117
To evaluate the degree of depressive events, RT-D may rely on changes in the
physiological responses of the manager such as changes in skin temperature and
heart rate once the manager is exposed to depressive events.
Therefore, we need to be able to detect when a depressive event occurs and
measure the physiological responses before and after the occurrence of the event.
However, the manager may get depressed at home before going to work. The
designer of the experiment may opt only to run the experiments if the manager
is in a pleasant psychological state; this may require the manager to sleep on site
or in a near-by hotel, or may require preventing routine problems arriving on the
manager’s desk during the experiments so that the experiment can focus on critical
decisions.
This type of conditioning ensures that the experiments are conducted in the right
circumstances to establish the cause–effect relationship, if it exists. Unnecessary
factors are eliminated; those that are key to the experiment are included, and the
environment is conditioned such that no external factors influence or bias the results.
3.2.2.2 Cause–Effect Relationship
An experiment is designed to test the hypothesis, that is, to test whether we can
establish a relationship between the cause and the effect. Assume we want to test the
robustness of an algorithm that routes the vehicles in a furniture-delivery company.
The objective of the CRT exercise is to discover when this algorithm fails; and
robustness is not achieved. A possible question can be the following: When does the
algorithm fail? A possible hypothesis might be that disturbance in the environment
would cause the algorithm to deviate significantly from the best possible solution in
these situations.
The CRT team begins the exercise by brain storming the type of disturbances that
can happen, designing novel methods to synchronize these disturbances, and create
chains of different disturbances so that the overall uncertainty in the environment
cascades to a very high level. The CRT team will write code to automatically
generate scenarios with the designed disturbance characteristics, and stress test the
routing algorithm with these scenarios. They will scrutinize the performance of the
algorithm on different scenarios, learn patterns of failures, and redesign scenarios
that are likely to cause these failure patterns to cascade the level of failures.
The process described above attempts to search for scenarios in which factors
come together to cause maximum disturbances. To conduct this search and opti-
mization process, we can either rely on human thinking to search and optimize
explicitly, or rely on human thinking to design automatic search and optimization
methods. The next section introduces the basic ideas behind search and optimiza-
tion. As the scenario discussed in Sect. 3.1.2 and this example demonstrate, the
purpose of these computational tools is to discover the cause–effect relationships.
3.3 Search and Optimization
An experiment is designed to solve a problem. Meanwhile, each experiment can

contain one or more problems. A problem exists when a goal is set and we do not
know how to achieve it. We solve the problem by identifying the steps to reach the
goal. Therefore, problem solving is to design the steps needed to move a system
from its current state to the desired goal state. Without the creation of a problem
in the first instance, the system would not advance and move forward. Without
problems in organizations, we do not need management, and problem solving,
decision sciences, and analytics become obsolete fields. As such, creating problems
is different from creating troubles. Creating problems is required to advance the
organization.
One way to create problems is to design new targets; hence a gap is created
between the existing state of the system and these new target states. This is
an effective strategy organizations adopt to get employees to improve their own
performance. Another way is to discover the hidden wholes that already exist in
the systems that can either stop the organization from reaching its goals or trap
the organization in areas away from its desired targets. The gaps hide risks, while
pushing the targets further away just outside the boundaries of a system’s ability
creates challenges.
To take a positivist attitude towards problem creation, it needs to follow a
disciplined and deliberate approach to ensure that those newly created problems
indeed move the organization towards a positive direction outside the organization’s
existing comfort zone.
Modelling transforms a problem into a suitable representation, such as a set of
equations, where many formal methods can be used to solve the problem in this
new representation. Optimization is a discipline that attempts to model problems
mathematically and solve these problems using mathematical techniques.
Any CRT exercise will involve an optimization model of some sort and a search
strategy. We may need to search for a vulnerability in a system, a chain of low
signal-to-noise threats that can cascade to a large threat, or an action to counteract a
goal, intent or another action. Some classical AI and OR literature, while not always
explicitly intended, may differentiate between search and optimization.
Search is usually considered the strategy used to traverse a confirguation/state
space. A search algorithm is a series of steps that is followed systematically; the
algorithm will find what it is searching for. For example, to find a human face in a
group of pictures, an image-processing algorithm can be used to search and detect a
human face automatically.
Let us take a second example. Searching a computer-science tree (remember a
computer-science tree is upside down with roots at the top, branching down to the
bottom, and leaves at the lowest level of the tree) is about designing a strategy to
traverse the tree. Imagine each node in that tree has a binary value: 1 if it has been
visited and 0 if it has not. The strategy can be a breadth-first visit within the level
before changing a level, a depth-first visit to the next level before visiting a second
node within the same level, or some other strategy guided with a cost function.
We can visualize this as having a tree with all 0s at the beginning of the search.
3.3 Search and Optimization 119
In breadth first, the 1s will begin appearing from the top level, moving from left to
the most right, then going back all the way to the left, but one level down, with the
last one appearing corresponding to the bottom right node. In the depth first, the 1s
will begin appearing from the top all the way to the bottom of the tree, then back up
to the highest level with a remaining 0 node, then back down again from that node.
Search in this case is guided by this binary variable, and a simple rule for visiting
a node that has not been visited before. In fact, this simple rule is all we need to
perform a complete search of the tree. The reason we follow depth or breadth first
is that they correspond to the minimum cost of tracking which nodes have been
visited. If we follow a random order, we need to have a list of nodes that have not
been visited, and every time we visit a node, we remove the node from the list. This
involves extra storage that we do not need if we follow breadth first or depth first.
Assume that the objective is not to visit all nodes in the tree. Instead, assume
each node is a nest with a number of ants. Our objective is to find the smallest nest
in the tree. This is an optimization problem. We can still use breadth-first or depth-
first strategies and search the entire tree. This is perfectly fine if the size of the tree
is relatively small and there is no domain knowledge on how nests are distributed in
the tree.
However, if we know that nests tend to get smaller as we go down the tree, we
can search only the leaves at the bottom. This characteristic defines the advantage of
optimization techniques. They exploit domain knowledge about how solutions are
distributed in the search space and utilize this domain knowledge to find the optimal
solution fast.
Optimization is a core technology in any type of problem-solving activity,
manual or computer based. When a need arises to solve a problem, we need to
be able to evaluate what is an allowed (feasible) solution and what is not an allowed
(infeasible) solution. This is achieved by designing the set of constraints. If a solu-
tion satisfies all constraints, it is evident that this solution is acceptable and feasible.
If many solutions or alternatives exist that satisfy the constraints, we need a criterion
or a set of criteria to decide which of these solutions is more appropriate. A criterion
can take the form of an objective function, where one solution is better than another
if it has a better value on this criterion. Another form a criterion can take is of a
goal, where one solution is better than another if it betters satisfies the goal.
Optimization is usually considered the process of finding one or more solutions
that have the minimum cost according to some cost function. Any search problem
can be modeled as an optimization problem. In the example above where we wanted
simply to visit all nodes in the tree, the optimization problem is to maximize the
number of 1s in the tree. The optimal solution exists only after we have visited all
nodes.
Let us take another example that is more messy than the structured exam-
ple above. In Manysoft, Mario is the technical person who knows the secret for
the company’s next revolutionary product. Therefore, Mario is a critical element
in the organization. Marcus works for Minisoft and knows how to entice someone
like Mario to speak and reveal the secret. Manysoft knows this fact. The objective
of Manysoft is to find a strategy to minimize the probability that Mario encounters
Marcus. This is an optimization problem. Once the problem is formulated mathe-

matically and/or logically, we can design search methods to generate one strategy
after another until we encounter a strategy that creates the minimal probability that
Mario will encounter Marcus. The solution might be to relocate Mario to a different
branch or city.
Optimization exists in every situation in which a decision needs to be made. In
classical mathematics, to find the solution of a system of equations is to minimize
the infeasibility of the system. In data mining, to find the model that generalizes
well on the data to predict futures we have not yet seen is to find the minimum
risk-cost function of some sort. In an organization, to design a strategy to improve
the profit of the organization is to find a strategy that maximizes the distance that
increases the current profit level to the maximum possible level given the resources
and capabilities of the organization.
Therefore, optimization is a field that is used in every problem-solving exercise.
However, it has been erroneously associated to a narrow subfield of mathematics
where the aim is to find a solution with the minimum cost over all possibilities:
the “global minimum solution”. In many real-life situations, the ambitious aim of
guaranteeing that the solution is the best of an entire set of possible solutions is
not possible. The search space grows very rapidly, and even with the most powerful
computational resources we have, we cannot guarantee this.
Formally, an optimization problem is defined with a tuple < D.X /; C.X /;
F .X / >, where X is a set of variables, with each variable that belongs to a domain
X 2 D, C being a set of constraints over X , and F being a set of objectives over
X to be maximized and/or minimized. A maximization problem can be transformed
into a minimization one by multiplying the objective function by .1/.
In its general form, an optimization problem is defined as follows:
Min. F .X /
Subject to
D.X / and C.X /
Solving an optimization problem has two parts. First, we need to find the set
of feasible solutions V .X / 2 D.X /, where .v1j .x1 /; : : : ; vij .xi /; : : : ; vnj .xn // 2
V .X /; i D 1; : : : ; n; j D 1; : : : ; k, with vij .xi / is the value assigned to variable
xi 2 X , n and m represent the cardinality of the set of variables and feasible
set respectively, and .:/ represents the ordered solution vector that satisfies all
constraints; in other words, we need to find an assignment of a value, vij , to each
variable, xi 2 X , such that, the value is within the domain, vij 2 Di , and it satisfies
all constraints, C.X /.
Second, we need to find the optimal solution, V V .X /, within the set of
feasible solutions V .X /.
Define I.X / as a function representing the amount of constraint violation. The
amount of constraint violation can be measured in many ways including by the
number of constraints violated, a distance metric between the value of the current
solution and the closest point in the feasible region, or the amount of effort needed to
retrieve feasibility. The problem can be reduced to an unconstrained version taking
the following form:
Min. F .X / C I.X /
with being a penalty value for constraint violation.

The classical use of the generic optimization model defined above is to solve
problems such as finding vehicle routes at a minimum cost, creating computer
packet routes with a minimum delay, the smallest fleet to conduct an operation,
and the best match between jobs and machines to complete all jobs on time.
Another classic, but closer to CRT, use of optimization is design optimization. We
may have the shape of an aircraft wings with the values of some design parameters
missing. Optimization theory is used to find the best assignment of values to these
parameters. This is usually a form of optimization that is very expensive from a
computational perspective because to evaluate any assignment of values to these
parameters, computationally intensive simulations need to be executed to evaluate
the performance (i.e. objective function) of these parameters.
CRT aims to design deliberate challenges for the other team. Optimization is the
tool by which to achieve this aim, actively searching for challenges, vulnerabilities
and events associated with high risk. In its basic form, a CRT problem can be
considered a form of design optimization. More complex CRT problems have a
much higher level of complexity than design optimization.
The primary characteristic of the use of optimization in CRT is that most of the
time, we are optimizing on the problem space, attempting to find a problem that
challenges the other team. This imposes significant difficulties for the optimization
community.
In effect, we attempt to optimize everything in our life, even though the method
we use may not generate the best overall possible action. Instead, most of the time
we follow an incrementalist approach by which we attempt to find a solution that
is better than what we know or what we have done in the past. When we wish
to maximize our income, we usually think of getting a promotion in our work or
doing a second job. We think less of changing our career completely, and we do not
necessarily know how to define the problem mathematically and completely to be
able to find the best mathematically proven action to maximize our income.
When we use the word “optimize,” we implicitly mean that a “search strategy”
is required to find the optimal solution of the problem at hand. We will use “opti-
mization process,” “optimization strategy,” and “search strategy” interchangeably
in this book. We will categorize optimization strategies into two main groups:
blind (general-purpose) versus knowledge-based (special-purpose) optimization,
and system versus negotiation-based optimization.
3.3.1 Blind vs Knowledge-Based Optimization
A blind (general-purpose) search process is used in many classical optimization

techniques. The search process is general, such as the Simplex method [12], which
is an algorithm to find the optimal solution for a linear-programming problem. The
algorithm does not attempt to use any knowledge extracted from the specific context
of the problem to decide on the search strategy to find the optimal solution. Instead,
the context is transformed into a mathematical model. The algorithm is designed
to solve any mathematical model that satisfies a specific well-defined mathematical
assumption. For example, in the case of the Simplex method for linear-programming
problems, the assumptions are that there is a single objective function, the objective
function is linear, all constraints are linear, and the domains of all variables are R.
A knowledge-based (special-purpose) optimization algorithm will be customized
to the problem at hand. For example, we may know from a domain knowledge that
the optimal solution is located in a specific region of the search space. Consequently,
we can bias the initialization process toward this region. If we change the Simplex
method to begin at a point in the space that we know is close to the optimal,
the method is no longer general purpose. Because knowledge-based optimization
algorithms are not general-purpose algorithms, if the problem changes or the
knowledge used to bias the algorithm changes, the old algorithm is no longer
suitable, or it may become equivalent to a blind-search algorithm.
Generally, we will find a properly designed knowledge-based optimization
algorithm more efficient (in memory and/or speed) in solving the specific problem it
attempts to solve. However, it cannot be applied outside that specific narrow context.
A general-purpose blind algorithm can solve a larger set of problems, but it may not
be efficient for some instances in this class.
To avoid confusion, we may switch between the word algorithm and the word
heuristic. In classical optimization, an algorithm provides some guarantees in its
ability to converge. Imagine I have a single node in the tree discussed above with a
nest of wasps. Depth first becomes an algorithm, because it will guarantee to find
the node with the wasp. The only problem here is that it may need to visit all nodes
if the node with the wasp is the one in the most bottom right node.
A heuristic is a rule of thumb; it is a piece of knowledge that we have about the
problem that if it works; it will lead us to the optimal solution very rapidly. This
is different from knowledge-based optimization. We can have blind optimization
using a heuristic, such as a genetic algorithm2 (GA). We can use a GA to solve any
optimization problem. In its basic form, it is a blind-optimization heuristic.
In the example above, we biased the Simplex method to begin from a specific
solution; this is an example of using domain knowledge with an algorithm. Even if
2
The word “algorithm” in GA is not used in the same manner in which an algorithm was defined.
A GA is a heuristic. The use of the word algorithm here refers to the corresponding computer code
so the heuristic is written as a computer algorithm.
our initial bias is incorrect, the Simplex method is guaranteed to find the optimal
solution for the linear-programming problem. Therefore, our bias may cost us more
time if we get it wrong, but it will not cost us quality.
We may ask why we should rely on heuristics. The reason is that in any realistic
large-scale real-world problem, heuristics are more efficient than algorithms. This
may shock some mathematicians. However, this discussion is important for CRT
because of its impact on the choices we will make in optimization. As such, we
need to explain this point further.
The Simplex method, and some variations of the method, constitutes an efficient
algorithm that can realistically solve large-scale real-world problems and guarantees
to find the optimal solution. In fact, it will be faster than most heuristics, and we can
solve problems with millions of variables. As such, how is it that we are making a
claim that heuristics are more efficient?
To use the Simplex method, the objective function and constraints need to be
linear and the variables need to be continuous. Real-world problems do not satisfy
these requirements easily. Even if the constraints are linear, most likely the cost
function will be nonlinear. More often, we will find some variables need to be
integer, and in most CRT problems, we will have many conflicting objectives rather
than a single linear objective.
Therefore, if we wish to use the Simplex method, we have to make the problem
fit the model. This is a trick that is not accepted in CRT. In the CRT exercise, we
attempt to discover vulnerabilities, or evaluate the system we wish to study. By
approximating the problem too much, we might be hiding vulnerabilities, and more
dramatically, we might be biasing the findings away from an area of high risk toward
areas of lower risk.
3.3.2 System vs Negotiation-Based Optimization
Most of the mathematical-based optimization literature focuses on what we will

term “system optimization.” A mathematical model is formulated to capture all
aspects of a problem in a system. Examples in this category include formulating a
nonlinear programming model to optimize a heat-exchanger network, or an integer
programming model to optimize the timetable of a university. This type of system
optimization is suitable for interteam problems as the case of one of the circles
presented in Fig. 2.5.
Negotiation-based optimization occurs when a problem is naturally decon-
structed among different stakeholders or players. A single optimization model
would be very complex for such a problem, and would sometimes be impossible
to formulate. Moreover, in some cases, information is not centralized for a single
model to be formulated adequately. Negotiation-based optimization is suitable for
intrateam optimization. For example, the problem of minimizing the fleet size of a
furniture-delivery company: this problem consists of a number of interdependent
problems. One interdependent subproblem is routing. If a vehicle chooses the
shortest path, it might also be the busiest path; therefore, delays will occur. Taking
a longer path would mean fewer trips were made by a vehicle and an increase in the
fleet size.
A second interdependent subproblem is bin-packing. Optimal packing of a
vehicle to maximize the load it carries may result in an increase in loading and
unloading time and longer routes for the vehicle. A bad packing of a vehicle would
mean unutilized space, shorter than necessary trips, and a need for a larger fleet.
A third interdependent subproblem is timetabling or scheduling. Increasing on-
road time would reduce the fleet size, but may also increase breakdowns and
maintenance rate. Minimizing the idle time of a vehicle would mean that disturbance
in the route or delay in executing a delivery would delay all subsequent tasks. In the
latter case, some deliveries may even need to be postponed to the following day.
Moreover, it may cause an increase in the rate of fatigue for drivers.
A fourth interdependent subproblem is fleet mix. Having a single type and size
of a vehicle in the company would provide the best maintenance services because it
would mean lower maintenance costs, as the company would require a single type
of expertise in its workshop; having a single type and size of vehicle could also lead
to more efficient material-handling processes. However, having all vehicles of the
same size is likely to increase the underutilization of vehicles, and may increase the
number of trips necessary to execute all deliveries.
To decide on the optimal fleet size of this delivery company, at least the above
four interdependent subproblems need to be solved. Routing would depend on the
type of vehicles, timetabling, and bin-packing. Similarly, the optimal decision on a
fleet mix would be influenced by decisions made in the other four subproblems.
To formulate a single optimization problem to solve all subproblems is not
desirable. The interdependency of the four subproblems is translated in the math-
ematical model to a high level of coupling and nonlinearity. The model will be
too complicated, and it will be very difficult to design efficient optimization search
strategies to solve it. Moreover, a disturbance in one subproblem would impact the
entire model, making it difficult for the model to adapt in a changing environment.
In these problems, it is easier to imagine that each subproblem is handled by a
computer/software agent. Each software agent attempts to optimize its own model.
When one agent proposes its optimal solution to the other agents, the other agents
may reject the solution because it imposes a constraint on them that deteriorates
their own optimum outcome. Therefore, each agent must negotiate the amount it is
willing to lose to achieve its own optimum outcome. A schematic diagram of this
multi-agent system is presented in Fig. 3.8, while methods to solve this problem are
discussed in [1, 3].
3.4 Simulation 125
Fig. 3.8 Multi-agent system for negotiating the optimization of interdependent problems
3.4 Simulation
Simulation is the brain of CRT. Before we progress in this philosophical function

of simulation within CRT, we need to define and differentiate between a number of
concepts, beginning by providing a definition of a model.
Definition 3.4. A model is a representation of a system in some form.
The form of this representation can vary widely. It can be mathematical, or be in
the form of a linguistic description, or a diagram. Every model is a representation
of something in a form that is not the form of the something itself. For example, a
model of human behavior is a representation of human behavior that is not in the
form of human behavior itself.
Words describe our thoughts, and as such, one can think of words as a
representation of our thoughts, as a model of our thoughts. Therefore, a model
is not limited in its form to a representation of a function, as in the case of a
cognitive model for thinking that represents “thinking” as a function. Instead, a
model can represent the expression of thinking into words, gestures, or actions. This
representation in its own right is a model. It can be considered a model of the output
of the “thinking” model, and it can equally be considered a model of communication
as a function in the system. We will begin with an example of a model from the
military, which is a classic model for combat. The model takes a mathematical form
and is known as Lanchester equations.
The classical Lanchester equations constitute a model that assumes direct
weapon. Attrition is described using the Lanchester Square Law (LSL) as coupled
differential equations.
dBs .dt/ D ˛b Rs .t/ (3.1)

dRs .dt/ D ˛r Bs .t/ (3.2)
where Bs .t/ and Rs .t/ represent blue and red force size, respectively, and ˛b and ˛r
represent the single-shot kill probability for blue and red, respectively. The solution
to the Lanchester equation, Bs .t/ and Rs .t/ should satisfy the following equation:
˛r ..Rs .0//2 .Rs .t//2 / D ˛b ..Bs .0//2 .Bs .t//2 / (3.3)
Bs2 .t /
when ˛r
˛b D Rs2 .t /
, the force ratio during combat becomes a constant, with the rate of
dBs .t / Bs .t /
attrition ratio dRs .t / becoming proportional to the force ratio R s .t /
.
Lanchester equations constitute a simple model. They have been widely criti-
cized because of their assumptions. As such, it may serve us to discuss this issue.
Since every model is a representation of something in a form that is not the form
of the something itself, equivalence between the original system and the model can
only be a matter for theoreticians to debate. In fact, it is inconceivable to think of
a model of any real-world phenomenon that is equivalent in a strict mathematical
sense to the phenomenon itself. As such, we need to understand the following four
related concepts: model assumptions, resolution, level of abstraction and fidelity.
Definition 3.5. Model assumptions represent the conditions under which the model
is valid.
That is, model assumptions represent the conditions under which the model
represents what it is supposed to represent correctly.
In the Lanchester equations presented above, one of the assumptions is that the
two forces rely on direct fire only. If we wish to model a situation with indirect fire,
we should not use the model above as it is. The model is a tool in our hands; it is
our choice how and when to deploy it. If we deploy it incorrectly, we are to blame,
not the model.
The starting point of any computational modeling is to build a model. We can
then transform this model into a piece of code.
Definition 3.6. A simulator is an encoding of the model in a suitable software
system (i.e. program).
For example, an aircraft simulator may look like a physical aircraft with all the
gears except that this aircraft is flying in the virtual world using a model.
Definition 3.7. Simulation is the ability to reproduce the behavior of a system
through a model.
Simulation is the process whereby we sample different inputs, use the simulator
to generate the corresponding outputs, group the outputs and attempt to understand
how the aircraft responds to different inputs. In essence, a model represents the
system, a simulator encodes the model, and simulation is the overall process with
the objective of mimicking the original system, sampling the input space, and
reproducing the behavior of that system. The relationships between the original
system and the produced mimicked behavior are captured in three concepts:
resolution, abstraction and fidelity.
3.4 Simulation 127
3.4.1 Resolution, Abstraction and Fidelity
The three concepts of resolution, abstraction and fidelity are sometimes used
interchangeably. We will provide a different view to these three concepts to ensure
that they are understood as distinct, albeit interdependent concepts. Figure 3.9
illustrates the interdependency among these concepts, where if we imagine we
see a system through a telescope, resolution is what we see through the telescope
lens. We then need to make a decision on how to represent what we see, where
abstraction comes into play. In what we see lies many pieces of information. Fidelity
is how much of these information we will bring inside the model or the simulation
environment. What we will bring in, will reflect on what the simulation environment
is able to generate. Thus, fidelity is both an input and an output.
Formally, resolution is a function of the system while abstraction is a function of
the model.
Definition 3.8. Resolution is what the modeler intends to model about the problem.
Definition 3.9. Abstraction is what the modeler decides to include or exclude in
the model.
Let us take the Lanchester equations discussed above as an example. Here, one
objective could be to understand force-level attrition. This is the level of resolution
the modeler decides to consider. The modeler may zoom in the system and instead
of examining a force-level question, they may decide to examine a company or even
a single soldier.
Fig. 3.9 Relationship between resolution, abstraction and fidelity

When Manysoft perceives Minisoft as a competitor, Manysoft may decide to

model Minisoft as a “single whole” as a competitor or may decide to model
every department within Minisoft to describe Minisoft as a competitor. How much
zooming in Manysoft will perform when it models Minisoft denotes the level of
resolution?
Similarly, when modeling human decision making, we may decide to examine a
human as a simple agent with inputs, processes and outputs; what is usually termed
as a behavioral model. Alternatively, we may include the cognitive processes and
examine the details of modeling neural activities.
Deciding on the level of resolution appropriate to model a system is a key
decision every modeler makes when building a model for a system.
The second key decision is level of abstraction. Abstraction here is considered
mapping from the system at the level of resolution decided to a suitable repre-
sentation (i.e. model). Abstraction in modeling can be considered the process of
identifying the causes and principles that truly govern the behavior of the system
while ignoring system-level details that do not contribute to this behavior.
While Lanchester equations model force-level attrition, it was the modeler who
decided that a system of coupled differential equations was an appropriate level of
abstraction. A question such as whether we should have considered rate of change
only in that model, or whether we should have also considered acceleration/de-
acceleration (second derivative) is an example of considering how much we wish
to abstract the system.
In the example of behavioral modeling of human decision making, the repre-
sentation or modeling language we use to model a behavior represents the level of
abstraction a modeler chose for that problem. Possibly a set of propositional rules
is sufficient, or possibly a complicated recurrent artificial neural network would be
more appropriate. A modeler decides on this level of abstraction.
In simple terms, let us imagine we wish to translate the Arabic language to
machine form for further processing. The analyst may examine only formal Arabic
given there are many dialects and informal variations of the Arabic language.
This is the level of resolution used for this problem. The analyst then needs to
decide on the appropriate level of abstraction: Should formal Arabic be represented
using propositional logic, predicate logic, some sort of grammar such as tree-
adjunct grammar; what representation is more suitable for the specific application
in context?
Definition 3.10. Fidelity is a measure of model coverage of the space defined by
the level of resolution.
That is, fidelity would beg the following question: Does the system cover all
sentences in formal Arabic? As such, fidelity is about faithfulness and completeness.
However, abstraction is about correctness. Using propositional logic to model
formal Arabic would create incorrect parsing of the language. As such, propositional
logic does not provide an appropriate level of abstraction to the level of resolution
we wish to model. Therefore, it becomes incorrect in reproducing the phenomenon
of interest. It also does not cover the whole space of formal Arabic; therefore, the
level of fidelity is low.
3.5 Data Analysis and Mining 129
The above examples of how to distinguish among resolution, abstraction and

fidelity define our use of these concepts in this book.
3.5 Data Analysis and Mining
Thus far the discussions have revealed a need for data-analysis techniques to support
the CRT exercise. For experimentation, classical hypothesis testing is required to
establish confidence about whether the trends discovered in the data are reliable
trends or happened by chance.
In this section, we will introduce a bit more advanced data-analysis techniques.
One of the main tasks in CRT is to discover the boundaries of behavior to create
a challenge. In the scenario presented in Sect. 3.1.2 on Ramada and Bagaga, the
simulations generated large amount of data that were labeled with one of three
labels: z1, z2, and z3. As presented in Fig. 3.5, we needed to discover the boundaries
that separated each of these three labels in the space. This task is traditionally known
as “classification:” one of the main problems in the wider field of data mining.
In the remainder of this section, we will first introduce data mining and machine
learning. We will then discuss different approaches to classification. The discussion
will then continue on how we can adopt these approaches to approximate the
boundary of behavior that will enable us to decide on how to design challenges
within the CRT exercise.
Historically, the field of knowledge discovery in databases (KDD) [14] is
concerned with the overall process of transforming data into knowledge. This
process has many steps. Data can exist in any form, including folders in file cabinets,
files on computers, web pages on the internet, audio and video files in mobile
telephones, and data that resides in our head. To process these data on a computer,
we need to digitize them, that is, they need to be stored on a computer in 0s and 1s.
This may involve hiring people to type the data, or using automatic methods such as
scanners, optical-character-recognition software, and speech-to-text software.
The process of transforming the data into a digital format can involve mistakes.
To ensure data quality, we need to fix these mistakes by deciding on what to do
when we encounter a missing value (e.g. the data entry did not include the age of the
client); an inconsistent value (e.g. the customer is 6-years-old working as a CEO);
or many other data-cleaning issues. Once the integrity of the data is established, we
can then transform these data into a form suitable for the specific algorithm we are
using to discover “knowledge.” This step is traditionally termed “data mining.”
Data mining is a step within KDD in which the data are in a state ready to be
processed, and the data-mining technique takes these data and discovers knowledge
in the form of patterns and relationships. Within CRT, data mining offers extremely
powerful tools that one team can use to learn about the other team. But first, let us
focus on the word “knowledge.”
One way of thinking about knowledge is to see it as a set of rules. For example, if
John is not at his desk, the security system is vulnerable. Obviously, we can discuss
many issues about this rule, from its validity to its causality and generalization.
However, this is not the point at present. The two main points we need to discuss
about this rule are the following: representation (the form) and inference (how we
discovered it).
Representation can be perceived as “IF : : : THEN : : :” representation. It is a very
powerful representation despite its classical form having critical assumptions such
as linearity. It is powerful because it has an expressive power, that is, a human can
understand it easily. Symbolic representations such as this are consistent with the
manner in which we reason about entities in the world.
However, on what basis have we discovered this rule? Why is the security
system vulnerable when John is not at his desk? These are two different questions.
We have observed the system behavior over time. Assume we are discussing a
computer network. We have noticed that many times when John leaves his desk,
a denial-of-service attack on the network occurs. Such a rule can be discovered
through data-mining techniques that attempt to correlate events across different
databases. These types of correlations can be misleading because there might not
be any relation between John leaving the desk and the denial-of-service attack.
Nevertheless, whether John is a cause for the denial-of-service attack is not the
issue. We first need to discover the rule/pattern. Before we can dismiss the pattern,
we need to consider it a hypothesis that warrants further investigation. That is, this
rule is simply a hypothesis that is yet to be validated. We can then ask why.
Asking “why” may trigger a data-collection exercise for data that we have not
been collecting. For example, we may collect data on where John goes when he
leaves his desk if one of the hypotheses is that John is generating the attack. We
may collect data on John’s experience and attitude on the network if we believe that
John is an excellent network administrator who can quickly sense espionage activity
that occurs before the attack, and diverts the attack to a dummy network. Therefore,
leaving his desk is a window for an intruder to penetrate the network.
The above discussion illustrates a point that is critical for CRT. The data-mining
process can help us to generate hypotheses that are supported by evidence from the
data we have. This can be an entirely blind process without a bias of any specific
presupposition. One hypothesis raises questions that trigger more analysis and more
hypotheses can be discovered during the process. Therefore, one is able to see the
overall CRT exercise as a data-mining exercise; it begins with hypotheses, conducts
experiments and/or discussions to collect evidence, either confirms or refutes the
hypotheses, and the cycle continues.
The previous representation can be extended to “IF : : : THEN : : : ELSE : : :”
and can contain a series of nested “IF : : : THEN : : :” rules. For example, see the
following rule, which assumes that the first step to authentication in the system is
based on a fingerprint.
If subject’s finger is oily, authorization is not granted; authorization is granted
subject to identification.
This rule is a compound rule; we can split it into three basic rules that we can
easily map to each path from the root node to a leaf node in a tree-like form. The
three rules are the following:
Fig. 3.10 The IF : : : Then : : : rule in a tree form
• If subject’s finger is oily, authorization is not granted.

• If subject’s finger is not oily and the subject has been identified as an employee,
authorization granted.
• If subject’s finger is not oily, and subject has not been identified as an employee,
authorization is not granted.
We notice that the order of applying the three rules above does not impact the
behavior of the system. This representation is traditionally known as a rule set. We
can make the representation more compact by ordering the rules, in which case, we
will have a decision table as follows:
1. If the subject’s finger is oily, authorization is not granted.
2. If the subject has been identified as an employee, authorization is granted.
3. Authorization is not granted.
The rules in both previous representations can be visualized as presented in
Fig. 3.10. This visualization represents another type of a representation that is
commonly used to represent knowledge: a decision tree.
A tree in computer science is visualized upside down. At the top node (root), we
have the entry point to the tree. At each node, we have a question to answer. Based
on the answer, we need to follow the corresponding branch to the next node until
we reach a leaf.
The split at each node in the computer can be conditional on a binary question as
in the example in Fig. 3.10, or can be based on categorical or continuous variables.
An example with continuous variables is presented in Fig. 3.11. In the top figure,
we can see the decision tree and the corresponding partitioning of the space in the
bottom figure.
Fig. 3.11 A decision tree for continuous variables
Interestingly, we can see in Fig. 3.11 a similar diagram to that presented in

Fig. 3.5 in the sense that both diagrams have lines that partition the space. Therefore,
one way to represent the boundaries of a challenge can rely on using a decision-
tree representation. The main difference between the nature of the boundaries in
Fig. 3.11 and those in Fig. 3.5 is that the former are parallel to the horizontal and
vertical axes (which are referred to as “axis-parallel hyperplanes”), while those
represented in the latter diagram have a slope (which are referred to as “oblique
hyperplanes”).
We notice that in axis-parallel-hyperplane decision trees each time we have a
condition, there is a single variable that is used in this condition (e.g. Age < 35).
This type of decision tree is termed a “univariate decision tree.” In the oblique
hyperplane case, a split can involve a weighted sum of multiple variables (e.g.
3 x Ages C 5 x Loyalty Points <120). This type of decision tree is termed a
“multivariate decision tree.”
The leaves of the decision trees above represent a categorical variable referred to
as the “class.” This type of decision tree is referred to as a “classification decision
tree.” If the leaf is a predictive function (point value), the tree is referred to as a
“regression decision tree.”
In this book, we will limit the discussion to classification problems as the main
technique discussed so far to discover boundaries to design a challenge. However,
many other data mining technologies such as clustering analysis, association rules,
and point prediction are useful across the CRT exercise.
Similar to Fig. 3.5, let us assume that we collected data through simulation or
real-world sensors that indicated three categories of risk: high represented with “x,”
medium represented with “C,” and low represented with “” as in Fig. 3.12. In
this figure, we present two approaches to classification termed “inner” (diagram on
the top) and “outer” (diagram on the bottom) classification. An example of inner
classification is discussed in [21], while an example of an outer classification is
discussed in the famous C4.5 [29].
In inner classification, we attempt to provide an exemplar (also termed a
“prototype”) in each category. In the top diagram of Fig. 3.12, the exemplar is
presented with a large bold label in the middle of each group. When a new
observation arrives, we measure the similarity between the observation and the
three prototypes. We then assume that this observation belongs to the group with
maximum similarity. For example, if a customer has loyalty points of 20 and an age
of 25, this customer is closest to the prototype labeled “X;” thus, we will say that
this customer is a high-risk customer.
In outer classification, we attempt to find the boundaries between the classes as
demonstrated in the bottom diagram of Fig. 3.12. When a new observation arrives,
we will see within which area it falls and assign it the category of this area. This
matching process is performed using a decision tree, but other models exist, for
example, rule sets and artificial neural networks.
In CRT, we attempt to estimate the boundary between classes to define the
boundary of challenges. Therefore, it is more appropriate to use an outer classifier.
We will limit the discussion to classification trees as an efficient manner in which to
approximate the boundary between classes. In the remainder of this section, we will
present a simple introduction on how we can build a classification tree. Despite that
many algorithms exist on this topic, including CHAID [20], CART [8], ID3 [28],
C4.5 [29], LMDT [9], SPRINT [31], SLIQ [26], and QUEST [25], our discussion
will focus on C4.5 [29] as a commonly used algorithm in industry.
Fig. 3.12 Outer and inner classification
3.5.1 C4.5
C4.5 is a very efficient algorithm by which to learn univariate classification trees

from data. It is important to emphasize that C4.5 is not suitable for Big Data, because
it forming a single tree in Big Data is not possible. Nevertheless, C4.5 is based on
a number of concepts, such as information gain ratio, that can be very useful in Big
Data problems. Moreover, it has been used successfully by industry and academia,
and it has demonstrated balanced behavior when compared to other classification
methods [24].
C4.5 relies on the information theoretic concept of entropy. Therefore, it is
important to understand this concept before we progress on how it is used to build a
decision tree in C4.5.
In the scenario presented in Sect. 3.1.2, three areas in the fitness landscape were
identified as interesting. We labeled them Z1, Z2, and Z3. If we wish to store these
three labels (let us also refer to them as “classes” or “categories”) in the computer
in 0 and 1s, we need two bits. In this case, we can represent Z1 as 00, Z2 as 10,
and Z3 as 01. We are able to discover the number of bits required using a simple
mathematical concept known as a “logarithm” or “log” for short. We will write the
log of a number x to the base y as z D logy .x/. In essence, the relationship between
the value of the log, x and y is that y z D x. Therefore, log2 .1/ D 0 because 20 D 1.
Given that we wish to represent these three classes in binary formats using bits,
and each bit can take two values (0 or 1), the log needs to be of base 2. Given
that we have three classes, log2 .3/ D 1:584963. In our representation above, we
used two bits. However, the average number of bits we needed was 1.584963. This
can quickly indicate to us that we used more than we needed; thus, the space used
to store the three classes is underutilized. This is obvious because we do not have
a class corresponding to both bits having the value of 1, that is, there is no class
encoded as 11.
The log gives us the average number of bits we need to use to encode data for
storage or transmission. Equally, we can use this idea to measure the information
content of a message or dataset. Assume we have a random sequence of the three
labels: Z1, Z2, and Z3. Given the sequence is completely random, the probability
that we select a label from this sequence and we correctly predict this label
before identifying it is 31 . Let us calculate the information content (entropy) of this
sequence, we will then explain what it means.
X
c
Entropy.S / D pi log2 pi (3.4)
i D1
where, pi is the probability that one of the three classes will appear in the sequence,
and c is the number of classes.
Therefore, in our example, the entropy is
1 1 1 1 1 1
log2 . / log2 . / log2 . / D 1:584
3 3 3 3 3 3
We need to pause a while to consider the value we obtained above.

Imagine you are sitting in front of a computer screen, one of these three classes
will be displayed on the screen. You need to send a message to predict which class
you will see on the screen before it appears. What is the average number of bits you
need to use in each message you need to send?
It is logical to say that this is exactly what we calculated before. Given that the
classes will appear completely at random, each time your prediction will be any of
the three classes. Therefore, you will need 1.584963 bits on average to transmit this
prediction.
Consider a situation in which the sequence has many more Z1 than Z2, and Z3.
For example, assume that 90 % sequence is Z1. We can recalculate the entropy as
follows:
0:9 log2 .0:9/ 0:1 log2 .0:1/ 0:1 log2 .0:1/ D 0:568
Given that the data has more regularity, it is becoming more predictable. In fact,
if we make our prediction to be Z1 always, we will have 90 % accuracy. In this case,
we do not need to transmit our prediction every time.
An entropy is a measure of the impurity of the data. If the data are impure and
completely random, entropy is at the maximum. We will say that the information
content is very high and that we cannot find regularity in these data to reduce the
storage or the length of the message required to transmit them.
When the data are pure, containing the same information everywhere, we can
optimize the space required to store these data and we can easily predict the contents
of these data.
Using the same principles above, we can design criteria to automatically detect
whether the criteria we will be using to split the data into two groups to improve
predictability will be successful.
Let us revisit Fig. 3.5. Before we discover the three hyperplanes that separate
the three classes from each other, the data are mixed, regularity is low; therefore,
entropy is high.
If we correctly discover the two hyperplanes that separate the three classes from
each other, we will end up with three groups of data. In each group, there is only
a single class, regularity is at its peak value, predictability in each group is perfect;
therefore, entropy is very low.
That is, we are offered a discriminatory condition to split the data into two
groups. We can check the discrimination power of this condition by using the
concept of the entropy to test the change of impurity of the data. If entropy before
the split is higher than the entropy after the split, we know that this discriminatory
condition split the data into more regular subsets.
C4.5 follows the same principle. Comparing the entropy of the data before and
after a discriminatory condition is applied defines the difference as information gain
or gain ratio.
C4.5 defines I nf o.S / as the information embedded in a dataset S , with jS j
records, and k number of classes .C1 ; : : : ; Ci ; : : : ; Ck / as follows:
X
k
freq.Ci ; S / freq.Ci ; S //
Info.S / D log2 (3.5)
i D1
jS j jS j
where Freq.Ci ; S / represents the number of records in the data S belonging to class
Ci .
Given a criterion x that splits this data set into n subsets fS1 ; S2 ; ; Sn g, we can
calculate the information content of the data after this split as follows:
n ˇˇ ˇ
X Sj ˇ
Infox .T / D Info.Sj / (3.6)
j D1
jS j
3.6 Big Data 137
Information gain is the difference in the entropy before and after the split
Gain.x/ D Info.S / Infox .S / (3.7)
measures the information that is gained by partitioning S according to test x.

This measure of information–gain, Gain.x/, is biased when there are many
classes. To overcome this bias, C4.5 adopts the concept of information–gain ratio.
It first calculates SplitI nfo.x/ as a normalization factor to overcome this bias:
n ˇˇ ˇ ˇ ˇ!
X Sj ˇ ˇS j ˇ
SplitI nfo.x/ D log2 (3.8)
j D1
jS j jS j
The information gain measure is then normalized as follows:
Gain.x/
GainR atio.x/ D (3.9)
SplitI nfo.x/
Finding x that maximizes GainR atio.x/ is then formulated as an optimiza-

tion problem. C4.5 applies a simple and efficient iterative loop to perform this
optimization.
3.6 Big Data
“Big data” is a “buzz word” that will survive for many years. It is considered a buzz
word because it has no clear definition. However, it comes with a clear message:
there is an urgency to rethink how data are mined and analyzed. This urgency is
driven by technological, economic, and social factors [15, 30, 40].
On the technological level, different computer architectures exist that facilitate
the collection and storage of massive amounts of data. Examples of these architec-
tures include:
• Service-oriented architecture (SOA), a current industry standard that offers the
conceptual basis for the development of web services (see Sect. 2.5.1).
• Grid computing, which is very likely to be subsumed with the architecture
discussed below.
• Cloud computing, which provides the flexibility to store and process data
remotely in a distributed fashion across computers around the world.
On the economic side, the field of data mining offers the tools to process the
data. However, in the nineteenth century, organizations were striving to collect data,
then the data bloom occurred in the current century. The need to collect and store
massive amounts of data has long been recognized by many organizations, including
Walmart, Google, and government (for reasons of national security and others).
Many of these organizations entered the business of collecting data, analyzing

them, and extracting value for the organization from the knowledge discovered
through the application of the data-mining techniques. However, the business case
for other organizations to enter into this game was very weak. The cost of storing
data was high relative to the value to be extracted from these data.
The computer architectures mentioned above, particularly cloud computing, offer
very cheap storage and processing infrastructure for many organizations. As the
cost decreased, demand began to bloom and the buzz word of big data became
increasingly common.
On the social level, the internet has established itself as the backbone for
modern social systems. It acts as the glue that connects people and organizations.
The existence of a single backbone-like, information-exchange infrastructure upon
which almost everyone in the world can rely is a very attractive idea for many orga-
nizations. Email, online chat, voice-over-IP, video conferencing, social networking,
and many other social-media technologies have allowed for a single pot to exist for
massive data collection across the globe.
Data are the primary asset of this century. The great rise in activity on the internet
has opened opportunities for companies to collect these data to understand the
market, customers, competitors, and even to understand the company itself.
The 360-degree personal-evaluation test in which an individual evaluates them-
selves, and subsequently their co-workers, and their subordinates evaluate them
is now possible for companies, entities, and even whole of government by just
using the available big data and without the need to directly interact with anyone.
A government can mine social data to see how people perceive it; based on these
data, it can evaluate its weak areas and the challenges that exist that may need to be
managed early.
The social benefits of big data are huge, but the social challenges are also
increasing. Social benefits include applications in health monitoring, education,
training, and finance. Under the umbrella of big data, issues such as privacy and
data security are among today’s hot topics.
3.6.1 The 6 V’s Big Data Characteristics
Rather than defining big data, the literature takes the approach of defining the
characteristics of big data. The letter “V” was selected to represent these charac-
teristics [19, 30]. The journey began with the 3 Vs (volume, variety, and velocity),
extended to the 5 Vs by adding (veracity and value), and finally extended to the 6
Vs by adding (variability).
Volume is about size, number of features, and number of instances. From
terabytes, to petabytes, and beyond, this massive size of data needs to be stored,
processed, and managed appropriately.
Velocity in data-mining language is about the changes occurring in the envi-
ronment and concept drift. The implication of dealing with streams of data is that
3.6 Big Data 139
relationships and concepts underlying these data can, and will, change over time.
The analysis must be able to detect these changes, and adapt the knowledge learned
as the environment changes.
Variety reflects the heterogeneous nature of the data. As the transactions of
each customer are recorded, information about the customers needs to be linked to
understand the relationship between the type of customer and the type of transaction.
Simultaneously, information about the products and the supply chain need to be
extracted to understand whether the higher than usual expected demand can be
fulfilled, or the supply can be delayed in the presence of low demand. To compound
the complexity of the situation, unstructured text from newspapers must also be
analyzed to extract economic indicators and trends.
Veracity is about the truthfulness of the data. It can be encapsulated in two main
problems that need to be managed: trustworthiness of the source and noise. A great
deal of research is currently focusing on the area of estimating the trustworthiness
of a data source. Unfortunately, this is a self-defeating course of action. Once a
model for estimating trustworthiness is built, it can be used to deceive the system.
Conversely, noise is a very long-standing topic in data analysis and mining. It can
come in many different forms, including noise in the communication channel, noise
because of approximations and rounding decisions, noise because of ambiguity in
the representation language, or noise in perception.
Value (for money) is about the worthiness of a big-data decision-the cost and
benefit trade-off that an organization must research and evaluate before committing
its resources to decisions on big data. This is a long-standing problem in data mining
in general. For example, the board of a company wants to see a good business case
on why large investments need to go into the data-mining department, or what
is known in business terms as return-on-investment. This constitutes a catch-22
situation: it is not possible for analysts to evaluate the knowledge hidden in the data
to demonstrate value without having data-mining capabilities in place. However, to
have data-mining capabilities in place, the analysts need to demonstrate the value.
In such situations, organizations usually turn to the following principle: “start
small first; walk before you run.” Unfortunately, this often turns out to be a very
bad and unwise decision for data mining in many organizations. As the analysis
team starts small, with few capabilities, they discover little information. Over time,
the analysis team becomes occupied with the little amount of information, routine
reporting, and being pressured to continue to demonstrate value with no investment
in place. It does not take long for the data-mining department to grow in size with
several more people, and diminish in value as it becomes overloaded with classical
analysis tasks that can mostly be automated if investments become available.
Business cases for big data need to take a different form. It is not wise to invest
blindly, but it is also not the type of investment that can be executed incrementally,
from the bottom of the stairs up to the roof. The initial investment is indeed
significant; therefore, the decision must be very well researched. However, the initial
step should not be small under any circumstances; otherwise, the organization will
enter into a loophole.
Variability is about changes in the format and data structure of the incoming data.
Many big-data problems rely on third parties for streaming the data. For example,
when mining online news, the news company may change the format of its website
to make it more accessible, attractive, or even more difficult for non-subscribers to
access. The big-data infrastructure needs to be able to accommodate these changes
rapidly and seamlessly. Some of this complexity can be reduced in the presence of
proper contracts between different parties, and meta-data standards.
The above discussion has not defined what a big-data problem is but has dis-
cussed how to recognize a big-data problem. The meaning of the word “big” is not
completely clear: How big is really big? However, this haziness is necessary because
as organizations accumulate more data, complexity will continue to increase and
the big-data problem will remain. Perhaps today the focus is about problems in
managing terabytes, but tomorrow the problem will be about exabytes, and in several
years, it will be about yottabytes.
3.6.2 Architectures for Big Data Storage
Classical databases such as Oracle and DB2 are not directly suitable to handle big-
data problems. The primary reason is that these databases assume that the data
structure is known in advance. This assumption is very limited in big data, since we
may know the data structure of the data we currently have, but we do not necessarily
know the data structure for the data we are collecting now, or the data we will
collect in the future. Moreover, most data in the big-data domain are unstructured.
Therefore, this assumption is one of the main reasons for the high costs that used to
accompany big-data applications in some large companies.
To manage big data, the storage of the data needs to be structure-free. The
structure needs to be left to the processing time; thus, the same data can be structured
differently according to the need of the application. This was the basic idea behind
Apache Hadoop [32, 34, 35], an open-source project to facilitate cloud computing.
At the core of Hadoop, the Hadoop Distributed File System (HDFS) provides
the software level to distribute data across the cluster. HDFS allows for redundancy
such that a node failure does not impact calculations. The primary question is how
to distribute the data and how to process such distributed data. This is the task of
the MapReduce model developed by Google. In 2008, Google claimed that it could
process 20 petabytes of data each day using MapReduce.
The idea of MapReduce is to provide the facilities to split (Map) and aggregate
(Reduce) data. That is, one component is responsible of taking the incoming data,
splitting it, then HDFS save these data across the cluster. The other component
processes the data locally on different nodes in the cluster then aggregates these
data to provide an answer. HDFS takes this answer and stores it back to the cluster.
The Hadoop project has many other tools including Pig, a high-level parallel
computation programming language and execution framework, Hive, an environ-
ment that transforms Hadoop into a data warehouse with a predesigned structure;
3.6 Big Data 141
HBase, a column oriented database that can hold billions of rows; or Mahout, a
library of data-mining algorithms, among others.
Hadoop provided the architecture to store and process big data. However, in
any real-world application where the processing of the data goes beyond simple
queries and correlation analyses, another type of architecture is needed. This is
the architecture that needs to support big-data mining. Discussions on related
architectures are covered in Sect. 3.6.4.
3.6.3 Real-Time Operations: What It Is All About
The term “real time” has also been a buzz word commonly used in today’s jargon.
Some see real time as fast computations, but the word fast is relative: Does it mean
completing calculations in a few minutes, seconds, milliseconds? How fast is fast?
A real-time system is usually characterized with temporal and logical correct-
ness [16]. Temporal correctness sets a time constraint on the system to provide an
output, while logical correctness sets a constraint that the output must be correct
according to certain specifications.
In simple terms, a real-time system provides the user with the right answer
(logical correctness) at the right time (temporal correctness). Therefore, real-time
systems are not about being incredibly fast. They need to be sufficiently fast to meet
the time constraint without violating the correctness of the answer/response. In one
application, the response might be needed in milliseconds, while in another, the
response might be needed in several months.
The response time of a decision module or a system, is the time needed for every
operation from the time the needed/requirement of a response is established (request
time) to the time the response is generated (delivery time).
3.6.4 GDL Data Fusion Architecture
Research on the model architectures needed to support data-mining problems is

not a new topic. Applications in all significant domains are usually based on such
architectures in one way or another. For example, High-Level Architecture (HLA)
and Distributed Simulation Architecture (DSA) are among the famous architectures
in the field of simulation [11, 18, 22].
A type of architecture that is relevant to big-data mining is the classical Joint
Directors of Laboratories (JDL) data-fusion architecture, originally published in
1992, and refined in 1998 [7]. Subsequent refinement of this model have been
suggested by adding a user-refinement level [4,5]. Table 3.1 lists the classical levels
in JDL.
The JDL information-fusion architecture remains conceptually valid. However,
big-data problems do not necessarily follow a linear procession as the progression
Table 3.1 Data fusion Level Description

functions hierarchy
0 Pre-processing
1 Object assessment
2 Situation assessment
3 Impact assessment
4 Process refinement
suggested by JDL. For example, object assessment would generate data that are
added to the database, this can trigger another process of preprocessing. In addition,
some of these functions on one level of abstraction in the system would play a
different role on a different level of abstraction in the system.
The remainder of this chapter will be devoted to discussing the architectures
proposed in this book for the type of modeling required to provide the necessary
components and functions to produce CRT decisions.
3.7 Big-Data-to-Decisions Computational-Red-

Teaming-Systems
3.7.1 Preliminary Forms of Computational-Red-

Teaming-Systems
In Sect. 1.7.4, Blue–Red simulation is discussed. A number of agent-based blue–red

simulation systems for combat were mentioned including EINSTein [17], MANA
[23], and WISDOM [37–39]. Some of the work in the existing literature on combin-
ing evolutionary/coevolutionary algorithms and these simulation environments was
also mentioned, including the work on ART [10], ACE [23], and the earlier work of
the author [36, 37] was discussed.
The above literature provides only preliminary steps towards CRT. This section
will summarize the attempt in [2] to present a systematic evolution and categoriza-
tion of these preliminary CRT systems. This attempt defined five levels of CRT,
starting from level zero (CRT0), up to level five (CRT4).
Figure 3.13 depicts the very basic level, where the interaction between blue and
red is captured in a multi-agent simulation system. Similar to all simulation systems
reviewed in Sect. 1.7.4, agents act using a set of rules scripted either symbolically
using a knowledge base system, or functionally using a neural network, Bayesian
network or a hidden Markov model. The decision making of each agent in many
of these systems can be seen as a form of a Sense-Decide-Act (SDA) agent-based
model, where an agent senses the environment, then formulate a series of possible
courses of actions, and finally choose one of them to execute.
While in CRT0 agents have fixed behavior with no learning abilities, in CRT1
shown in Fig. 3.14, individual agents can learn through some basic learning model.
An agent in CRT1 follows the Sense-Learn-Act (SLA) cycle. The agents sense the
3.7 Big-Data-to-Decisions Computational-Red-Teaming-Systems 143
Fig. 3.13 CRT0: baseline preliminary computational red reaming level
Fig. 3.14 CRT1: level one preliminary computational red reaming system
environment to estimate information about environmental states as well as receive

reinforcement signals, update their decision model using a learning algorithm, then
act. This learning cycle continues as the simulation runs.
Evolutionary and co-evolutionary algorithms can be seen as a form of social
learning, where individuals can share and exchange their knowledge. This is the
basis for CRT2, whereby agents’s behaviors are governed by a decision making
model similar to CRT0. Evolutionary computation is used to evolve either the
properties of these agents, such as evolving their personalities or capabilities, or
evolve their decision making models, such as evolving the set of rules, neural
networks or finite state machine used by the agents to produce actions (Fig. 3.15).
Fig. 3.15 CRT2: level two preliminary computational red reaming system
Fig. 3.16 CRT3: level three preliminary computational red reaming system
In CRT2, learning normally happens on one team side, while the other team is
fixed. This is not the case in CRT3, as shown in Fig. 3.16, where both teams learn
together through reciprocal interaction with the other team. Both teams go through
the evolutionary learning cycle of evaluation, selection, recombination. As each
team learns, the landscape of the evaluation process changes. Therefore, a strategy
adopted successfully in one step of the coevolution may fail in a different step as it
gets to face a better opponent.
Fig. 3.17 CRT4: level four preliminary computational red reaming system
CRT4 (Fig. 3.17) is the most advanced version of these preliminary models,
whereby social and individual learning are combined. As the society evolves,
individual agents also exhibit individual lifelong learning abilities.
CRT0 to CRT4 are typical forms of blue–red simulations. These systems have
certain limitations from CRT perspective. They assume that the overall CRT
exercise can be captured in a simple simulation. Unfortunately, any practical CRT
exercise relies as we have discussed so far on the analytics of risk and challenge.
Even in these blue–red simulations, these two cornerstones exist but they are
external to the simulation. Evolution is used to find a strategy where one team will
win, thus, it can be seen as a weak form of a challenge. But the true meaning of a
challenge is not considered. As for risk analytics, this is left to the scientist who,
in the simple case, runs these simulations many time to estimate the risk of the
decision.
CRT is a very complex exercise. These simulations have been written in the
context of wargaming, but they are all by far too simplistic to capture the real and
complex dynamics of a war. While they are useful in modelling some aspects of a
war on a certain level of abstraction, their results are only confound to these aspects
and need to be incorporated into a wider context. As such, these preliminary systems
are only one tool that can be used within a real and wider red teaming exercise.
3.7.2 Progressive Development of Sophisticated

Computational-Red-Teaming-Systems
The scenario presented in Sect. 3.1.2 demonstrated the basic technical tools for a
CRT environment. The rest of this chapter will elaborate on these tools. However,
it is important to mention that CRT implementations can vary widely in the level of
sophistication and the type of science that needs to go into the development of such
CRT systems.
Table 3.2 shows the progressive stages of sophistication that the implementation
of a CRT system can go through and the corresponding response capabilities of
each level. As the system progressively moves away from mere dependence on
subjective opinion to autonomous generation of scenarios combined with integrated
simulation, data mining and optimization capabilities, the system improves its
abilities to prepare the organization for the unknown that may face the organization
causing shocks and surprises.
Table 3.2 Evolution of computational red teaming capabilities

Stage Description Benefits
1 Subjective and expert judgement Fast answers
No objective evidences
Does not work in complex systems
(bounded rationality)
High risk because of high bias
2 Measurements and data mining Objective evaluation
Can’t answer what to do to improve
Can’t answer why
Can’t manage surprises and shocks
3 Simulation Simple ad-hoc what if
Limited answers to what to do to improve
Can’t answer why
Can’t manage surprises and shocks
4 Optimization Asking the best what-if
Can answer what to do to improve
Can manage some surprises but not shocks
Can’t answer why
5 Simulation optimization-mining Asking the best what-if
Can manage some surprises & shocks
Can answer why
6 Autonomous scenario generation Asking the best what-if
Can manage surprises and shocks
Can answer why
3.7.3 Advanced Forms of Computational-Red-

Teaming-Systems
In the previous section, preliminary implementations of simple blue–red simulations

are discussed. In this section, more realistic models and architectures for CRT are
presented. The three core services: data mining, optimization and simulation, to sup-
port the CRT SOA architecture presented in Sect. 2.5.1 and Fig. 2.20 are discussed.
We provide a generic introduction to risk analytics, the computer technology upon
which CRT is based. Each of the three concepts: simulation, optimization, and data
mining are discussed in the previous sections. In this section, we will discuss risk
analytics as an architecture. We will use an incremental approach to describe the
risk analytics architecture (RAA) to allow for organizations to start with a core
architecture then grow it over time.
Our discussion will begin with Fig. 3.18. This figure represents the first basic
steps in building a simulation system. First, the designer wishes to model a reference
system in the real world. Let us assume that this system is an aircraft.
Let us further assume that the problem we are attempting to solve is short-term
conflict detection, where we need to identify whether two flight paths will intersect
in approximately the next 4 min from any point of flight.
The system (i.e. aircraft) is abstracted by identifying the key important elements,
and isolating them from the elements that add detail but are not essential to the
modeling exercise. For short-term conflict detection, it is essential to model the
aerodynamics of both flights. It is not sufficient to examine flight-plan information,
and it is inappropriate to use low fidelity models, because high fidelity calculations
are required for accurate estimation of time.
From here, it becomes clear that the model needs to capture many details about
the aircraft and the environment, including the type of aircraft; type of engine; the
appropriate aerodynamic parameters associated with such an aircraft and engine
type; as well as weather and wind information. This model will take as inputs the
current position of the two aircraft, their weight, expected wind information over
the next 10 min, and information of both aircraft about where they intend to fly. The
model will output whether there is a conflict between the two flight paths, and if
there is a conflict, the estimated location and time of the conflict.
The model described above is a mathematical model. We need to translate it into
a simulator.
Fig. 3.18 Stage one of the risk analytics architecture

This software program will output the information described above for any set of
inputs. The simulator is almost like a black box. The model resides inside this black
box. The pure job of the black box is to transform the inputs into the appropriate
outputs using the model residing within.
If every input above is known with certainty, all what we need is the simulator
to feed the inputs and we will be able to determine the exact output. However, this
deterministic behavior is not useful in the real world. Usually, we can only estimate
the inputs. Many of these inputs are external to the aircraft and we are not sure how
the aircraft will behave for different values of these external inputs.
For example, wind information is external to the aircraft. We may need to test
the behavior of the aircraft for different values of wind. This process is known
as “sampling,” whereby different values of all uncertain elements are generated
according to certain rules or distribution.
As was discussed before in Sect. 3.4, simulation is the ability to reproduce the
behavior of a system through a model. Simulation is the process whereby we sample
different inputs, use the simulator to generate the corresponding outputs, group the
outputs and attempt to understand how the aircraft responds to different inputs.
A number of questions arise when we examine Fig. 3.18. One question is about
validation, asking the question of how one knows that the chosen model and
the behavior it exhibits reflect the behavior of the real system. Another is about
verification, asking the question of whether the model is implemented properly (i.e.
Is the simulator a correct implementation of the model?).
This question is answered in Fig. 3.19. In this figure, the behavior of interest of
the real system is observed and recorded. Similarly, the behavior of the simulation
is recorded. Patterns can be extracted from each set of observations, by using
clustering analysis or some other data mining technologies, to deconstruct the output
from the simulation into similar groups.
The behavior of the aircraft on a simple level can be captured through recordings
of the state vector of the aircraft in the simulation and in the real world. State vector
information includes longitude, latitude, altitude, time and/or speed information.
Comparing these state vectors can tell us whether the simulated aircraft is following
identical path to the real aircraft.
The previous comparison denotes one level of validation. However, it is too
simplistic. It may result in the aircraft in the simulation just playing back the data it
received from the real world. We need to dig into the concept of behavior to validate
this simulation. One manner in which to achieve this is to calculate fuel burn and
change of flight weight after each transition in space (i.e. flight segment). We can
then compare the behavior of the fuel burn and changes in flight weight to the factors
in the real world. We can conduct further research by comparing other aerodynamic
details.
Figure 3.19 addressed issues of validation, but not verification. The verification
process is internal, one the modeling team must undertake. It usually relies on
comparing the specifications of the model to the implementation in the simulator,
comparing inputs and outputs of the model through certain hand calculations or
other means with the outputs of the simulator, and other software-verification
techniques. The issue of verification is outside the scope of this discussion.
Fig. 3.19 Stage two of the risk analytics architecture
Figure 3.19 stops at the level of simulating an aircraft. This is similar to blue–red
simulation, where we stop at the level of having a simulation of the blue and red
teams to sample the response of blue and/or red to the actions of the other team (see
Sect. 1.7.4).
A natural question from decision makers once we have the system implemented
in Fig. 3.19 is that if we are successful in imitating the real system, can we use this
imitation to test the system for conditions that we cannot test in the real world?
Further, can we use this imitation to reveal how can we optimize the real system?
Figure 3.20 adds an optimization loop, whereby the simulated system is used
to find the optimal solution for an objective function. For example, in the above
example of an aircraft, one can ask what is the optimal trajectory of an aircraft that
minimizes fuel burn between a specific origin and a specific destination, and given
specific wind conditions.
In the question posed above, the optimization needs to take as input an origin,
destination, and a wind profile, and provide an output on a second-by-second basis
the longitude, latitude, and altitude of the aircraft. Another possible output would
include factors such as the thrust level and flap settings, representing the settings
of the flight-management system to fly the aircraft. We will continue with the first
type of output for ease of understanding. We will call this type of optimization
as “behavior optimization” because the primary focus is placed on reproducing the
Fig. 3.20 Stage three of the risk analytics architecture
behavior of the system without necessarily constraining the system to ensure that the
model is plausible in its working mechanisms with the internal working mechanisms
of the system itself. In simple terms, we would like to optimize the behavior of the
ants without necessarily paying attention to whether or not the model used is a
biologically plausible model of how the ants make decisions.
There are many different methods by which this optimization method can work.
One is that the optimization method will first generate a trajectory at random,
or use a previously flown trajectory from a real situation. The method will then
systematically change the output (i.e. change the positions on the trajectories by
changing the speed of the aircraft and flying angle) and measure the impact of that
change on the objective function. These changes need to be implemented in small
steps. The changes are accepted if they improve the objective function and discarded
if they do not.
This systematic search procedure is conducted by an “optimization algorithm” or
a “search procedure” as we discussed in Sect. 3.3. When no changes can be found
that improve the objective function, the search ceases.
So far, we have been successful in optimizing the system indirectly through the
simulation. In fact, it is a great deal cheaper and safer to use the simulation for this
type of optimization. It does not make sense that we perform this optimization on the
real aircraft. Even if we do, we will spend many years validating the system, not to
Fig. 3.21 Stage four of the risk analytics architecture
mention the high risk associated with this. The simulation environment enables us to
run thousands of scenarios in a computer environment in a much smaller timeframe.
Figure 3.20 opened opportunity for optimization. Figure 3.21 opens even more
opportunities. If we were to validate the simulation environment as discussed
(Fig. 3.19), can we actually ask the simulation questions to reveal patterns that we
cannot ask the real environment?
For example, we may ask what type of behavior an aircraft would exhibit
if we were to apply certain flight-management system settings that are believed
dangerous? These dangerous settings cannot be tested in the real world on a real
aircraft.
Simulation mining is about applying data-mining techniques on data obtained
from simulations to extract indirectly patterns and generalizations on the real
system. We are now moving into the space of fantasy in which we can execute
experiments in the simulated environment that we cannot do in the real environment.
We can extract information about the system of interest without touching the actual
system. The simulation environment acts like an oracle that can tell us what will
happen if we change the system in certain ways: it becomes the crystal ball that we
can use to query the system from a distance.
Despite all the levels of sophistication we have introduced thus far, we made one
dangerous and undesirable assumption: the model is fixed. What would happen if
through the validation process we discussed for Fig. 3.19, the behavior we obtained
from the simulation was consistent with the real-world behavior and then after some
time, the simulation began to drift away from the real world? We must remember
that the only constant in the real world is flux.
It is natural in many systems that the above phenomenon would happen. Not all
aircraft behave in the same manner and their performance degrades over time. If
we are simulating a manufacturing environment, it is likely that over time, workers
become more efficient in the work they do and the simulation underestimates this
efficiency.
We cannot rebuild the model every time a change in the real system occurs.
Equally, we cannot afford to continue using the simulation if it no longer represents
the real system. The solution for this dilemma is to utilize the data-mining loop for
validation to reveal why drift is occurring. The pattern that explains drift can then
be sent back to the model to change the model adaptively and autonomously.
Figure 3.22 adds an error term that allows the model to change its parameters
to adapt to new situations. These adaptive search capabilities are very interesting.
Imagine now that we began with a rough model. In many complex real-world
applications, such as modeling the dynamic of a government, to conceive of a good
model is a non-trivial task. We may have a rough idea about the system that we use
Fig. 3.22 Stage five of the risk analytics architecture

Fig. 3.23 Stage six of the risk analytics architecture
to build a rough model. As the adaptive search capability enables us to recover when
the real environment changes, it can also be used to recover when the model is not
entirely accurate.
Our last level of sophistication is presented in Fig. 3.23. The adaptive search
capability is augmented with an optimization module to search for new models and
speed up the refitting of the model. Adaptation takes a long time and it may not
provide solutions that are close enough to the optimal behavior required. Moreover,
if we rely only on an error term to update the parameters of a model, and if the
change in the environment requires a structural change of the model (e.g. the original
model was linear, while the model needed after the change is quadratic), simple
adaptive mechanisms will fail. The adaptive search mechanism merely represents
incremental modifications of the model by augmenting it with the patterns extracted
from the data-mining module. In its simplest form, it can be considered a set of
exceptions that are added every time an exception occurs.
The situation of relying on a simple adaptive search mechanism forever (some-
times referred to as “lifelong learning for a learning problem” or “lifelong opti-
mization for an optimization problem”) is not optimal. The exception list will grow
very fast and get out of control, or simple changes of parameters will not be able
to approximate the changes in the nonlinearity in the relationship. Instead, it is
preferable to use an optimization module to auto-generate new models, as in the
case of using a genetic programming or rule-discovery to auto generate a controller.
By now, the loop is complete and the system is ready for one side of CRT.
We have an intelligent system that can correct the models it is using. It can use
simulation as its brain to think about the real system and play “what-if” scenarios. It
can discover information about the real system without touching the real system. It
can even discover how to influence and optimize the real system without revealing
itself to the real system.
It is important to emphasize that the system under investigation mentioned in the
example so far can be a physical, socio-technical, cognitive or human system.
In classical CRT, we are possibly more interested in socio-technical systems. Red
needs to challenge the thinking of blue. The RAA can be implemented regardless of
whether the system to be modeled is physical or socio-technical.
For example, if the simulation is about reproducing the behavior of a group of
people in a social context, the details of the calculations performed for the aircraft
are not a fit for this problem. In this case, we may rely on more sophisticated
forms of data mining. We may capture the networks of communication among
group members, identify the characteristics of this network, and compare these
characteristics in the simulated group and the real group. As we move into social and
cognitive simulations, the comparison between the simulated behavior and the real
behavior necessitates the use of more sophisticated forms of data-mining methods.
The risk analytics system is perfect for modeling human behavior. The adaptive
search and optimization module enable us to begin with rough models (i.e. initial
hypotheses). The system will then refine these models over time through collection
of more intelligence and observations about the human system. The simulation
mining loop enables us to question the simulation instead of questioning the
humans; thus, we can reveal information without contacting the human system
we are observing. The optimization algorithm loop enables us to discover how to
influence the human system to achieve certain goals and objectives. Consequently,
hypotheses about the system become better descriptors of the behavior of the system
in questioning.
The risk analytics system is the computational thinking machine to think red,
think about red, think for red, think blue, think about blue, and think for blue.
However, designing and developing the risk analytics components in practice
requires indepth technical know-how of these components (skills) and a high level
of competency in synthesizing them.
3.7.4 The Shadow CRT Machine
Thus far, we have discussed sophisticated approaches to design an intelligent

computational environment using risk analytics. A computational environment
that can continuously receive data, extract patterns, play what-if scenarios, find
responses, and even correct its own models.
For a CRT system, we can duplicate risk analytics for both red and blue. In
essence, risk analytics can be the architecture used by an agent, and as many
agents as desired can be created for both the red and blue sides. While the overall
architecture will be the same, the data and models used by each agent might differ.
CRT does not stop at the level of building a smart computational environment of
a system. Recall that CRT is about designing deliberate challenges. Risk analytics
can design deliberate challenges with the architecture discussed thus far. However,
the challenges will not be complete. CRT can offer the Shadow CRT Machine.
Definition 3.11. A shadow CRT machine is a computer environment that works in
parallel with an actual system, shadowing and monitoring its operations, projecting
ahead to create the space of possible future states of the system, and challenging any
negative risk that arises in that space by proposing appropriate responses.
Classical system models are categorized in two generic types: closed systems
and open systems. In closed systems, we draw a strict distinction between the
system and its environment. We assume that everything in the environment is
uncontrollable (exogenous variables). We can only control the variables within the
system (endogenous variables); clearly, our control of endogenous variables is not
unlimited. Endogenous variables would have constraints; we can only change these
variables while respecting these constraints.
Closed systems arise from the reductionist school in which a problem needs to
be decomposed into smaller subproblems. When modeling each subproblem, we
assume that the subsystem associated with the subproblem is a closed system.
In open systems, we bring some variables from the environment inside the
system. The modeler understands that a system is not situated in a vacuum.
In CRT, we will go beyond open systems. Perhaps we can refer to it as a
“wide-open” system approach. We consider the fact that every action from the
system influences the environment. Therefore, there is a degree of control (which
may be limited but is certainly not insignificant) that a system can exercise on its
environment.
Figure 3.24 presents the risk analytics for CRT. We can assume in this figure that
the environment is the red team, while the system is the blue team. However, in CRT
the overall architecture presented in Fig. 3.24 would be the architecture used within
a team. As every team attempts to represent itself and the opponent in its thinking
process, each team would need to have a representation of both red and blue and
mechanisms to evaluate decisions and reciprocal interactions.
The environment is modeled and represented explicitly, almost in the same way
the system is modeled. Both the system and the environment interact. However,
clearly, one objective of modeling the environment is to identify the environmental
forces that can be reshaped for the benefit of the system.
In Fig. 3.24, the human agent sits at the interface between the shadow
CRT machine and the external world for the blue and red teams. A real-world
CRT exercise will involve multiple pieces of analysis of nested nature. There can
be CRT activities within the CRT activities. It is also very likely that each team will
contain a number of humans.
Fig. 3.24 The shadow computational red teaming machine
Fig. 3.25 Cognitive-Cyber-Symbiosis for Computational Red Teaming
This mix of humans and shadow CRT machines is depicted in Fig. 3.25 using
the CoCyS system. For this environment to operate symbiotically, each human and
machine are thinking entities. They have different, but complementary skills. The
seamless blending of humans and machines as a single living organism in a CoCyS
system is a sophisticated goal. A demonstration of it will be covered in the third
case study in Sect. 5.3.
References 157
While this chapter covered the sophisticated roles different models can play
within the shadow CRT machine, the human plays other form of sophisticated roles.
The human has the daunting task of deconstructing messy and complex problems
into structured problems in a principled manner. The human needs to understand
how to deconstruct a complex organization like a socio-technical system or a cyber
system into building blocks that can be analyzed properly. The following chapter
describes some of these thinking tools that a human can use and rely on for CRT.
References
1. Abbass, H.A., Baker, S., Bender, A., Sarker, R.: Identifying the fleet mix in a military setting.
In: The Second International Intelligent Logistics Systems Conference, pp. 22–23 (2006)
3. Baker, S., Bender, A., Abbass, H., Sarker, R.: A scenario-based evolutionary scheduling
approach for assessing future supply chain fleet capabilities. In: Evolutionary Scheduling,
pp. 485–511. Springer, Berlin (2007)
4. Blasch, E.P., Plano, S.: JDL level 5 fusion model: user refinement issues and applications
in group tracking. In: AeroSense 2002, pp. 270–279. International Society for Optics and
Photonics (2002)
5. Blasch, E., Plano, S.: DFIG level 5 (user refinement) issues supporting situational assessment
reasoning. In: 2005 8th International Conference on Information Fusion, vol. 1, pp. xxxv–xliii.
IEEE (2005)
6. Bowley, D., Comeau, P., Edwards, R., Hiniker, P.J., Howes, G., Kass, R.A., Labbé, P., Morris,
C., Nunes-Vaz, R., Vaughan, J., et al.: Guide for understanding and implementing defense
experimentation (GUIDEx)-version 1.1. The Technical Cooperation Program (TTCP) (2006)
7. Bowman, C., Steinberg, A., White, F.: Revisions to the jdl model. In: Joint NATO/IRIS
Conference Proceedings, Quebec (1998)
8. Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A.: Classification and Regression Trees.
1 edn. Chapman and Hall/CRC, New York (1984)
9. Brodley, C.E., Utgoff, P.E.: Multivariate decision trees. Mach. Learn. 19(1), 45–77 (1995)
10. Choo, C.S., Chua, C.L., Tay, S.H.V.: Automated red teaming: a proposed framework for mil-
itary application. In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary
Computation, pp. 1936–1942. ACM, New York (2007)
11. Dahmann, J.S., Fujimoto, R.M., Weatherly, R.M.: The department of defense high level
architecture. In: Proceedings of the 29th Conference on Winter Simulation, pp. 142–149. IEEE
Computer Society, Washington, DC, USA (1997)
12. Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton
(1998)
13. Director, C.O.: Plans defence capability development manual. Tech. rep., Technical report,
14. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: Knowledge discovery and data mining: towards
a unifying framework. In: Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. (eds.)
Advances in Knolwedge Discovery and Data Mining, pp. 1–36. AAI/MIT Press, Cambridge
(1996)
15. Frankel, F., Reid, R.: Big data: distilling meaning from data. Nature 455(7209), 30–30 (2008)
16. Greenwood, G.W., Tyrrell, A.M.: Introduction to Evolvable Hardware: A Practical Guide for
Designing Self-adaptive Systems, vol. 5. Wiley, New York (2006)
17. Ilachinski, A.: Enhanced ISAAC neural simulation toolkit (EINSTein): an artificial-life
laboratory for exploring self-organized emergence in land combat (U). Center for Naval
Analyses, Beta-Test Users Guide 1101, no. 610.10 (1999)
18. Jacobs, P.H., Lang, N.A., Verbraeck, A.: Web-based simulation 1: D-sol; a distributed java
based discrete event simulation architecture. In: Proceedings of the 34th Conference on Winter
Simulation: Exploring New Frontiers, pp. 793–800. Winter Simulation Conference (2002)
19. Jin, Y., Hammer, B.: Computational intelligence in big data [guest editorial]. IEEE Comput.
Intell. Mag. 9(3), 12–13 (2014)
20. Kass, G.V.: An exploratory technique for investigating large quantities of categorical data.
J. Roy. Stat. Soc. Ser. C (Appl. Stat.) 29(2), 119–127 (1980)
21. Kirley, M., Abbass, H., McKay, R.: Diversity mechanisms in pitt-style evolutionary classifier
systems. In: Triantaphyllou, E., Felici, G. (eds.) Data Mining and Knowledge Discovery
Approaches Based on Rule Induction Techniques, Massive Computing, vol. 6, pp. 433–457.
Springer, New York (2006)
22. Kuhl, F., Dahmann, J., Weatherly, R.: Creating Computer Simulation Systems: An Introduction
to the High Level Architecture. Prentice Hall, PTR Upper Saddle River (2000)
23. Lauren, M., Silwood, N., Chong, N., Low, S., McDonald, M., Rayburg, C., Yildiz, B., Pickl, S.,
Sanchez, R.: Maritime force protection study using mana and automatic co-evolution (ACE).
In: Scythe: Proceedings and Bulletin of the International Data Farming Community, vol. 6,
pp. 2–6 (2009)
24. Lim, T.S., Loh, W.Y., Shih, Y.S.: A comparison of prediction accuracy, complexity, and training
time of thirty-three old and new classification algorithms. Mach. Learn. 40(3), 203–228 (2000)
25. Loh, W.Y., Shih, Y.S.: Split selection methods for classification trees. Stat. Sin. 7(4), 815–840
(1997)
26. Mehta, M., Agrawal, R., Rissanen, J.: Sliq: A fast scalable classifier for data mining. In: Apers,
P., Bouzeghoub, M., Gardarin, G. (eds.) Advances in Database Technology–EDBT ’96. Lecture
Notes in Computer Science, vol. 1057, pp. 18–32. Springer, Berlin/Heidelberg (1996)
28. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
29. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
30. Russom, P., et al.: Big data analytics. TDWI Best Practices Report, Fourth Quarter (2011)
31. Shafer, J.C., Agrawal, R., Mehta, M.: Sprint: A scalable parallel classifier for data mining.
In: Proceedings of the 22th International Conference on Very Large Data Bases, VLDB ’96,
pp. 544–555. Morgan Kaufmann Publishers, San Francisco (1996)
32. Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The hadoop distributed file system. In: IEEE
26th Symposium on Mass Storage Systems and Technologies (MSST), 2010, pp. 1–10. IEEE,
Nevada, USA (2010)
33. Tzu, S.: The art of war (translated by Samuel b. Griffith), p. 65. Oxford University, New York
(1963)
34. Venner, J., Cyrus, S.: Pro Hadoop, vol. 1. Springer, Berlin (2009)
35. White, T.: Hadoop: The Definitive Guide. O’Reilly Media, Sebastopol (2009)
36. Yang, A., Abbass, H.A., Sarker, R.: Evolving agents for network centric warfare. In: Proceed-
ings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 193–195. ACM,
New York (2005)
37. Yang, A., Abbass, H.A., Sarker, R.: Landscape dynamics in multi-agent simulation combat
systems. In: AI 2004: Advances in Artificial Intelligence, pp. 39–50. Springer, Berlin (2005)
38. Yang, A., Abbass, H.A., Sarker, R.: Characterizing warfare in red teaming. IEEE Trans. Syst.
Man Cybern. Part B: Cybern. 36(2), 268–285 (2006)
39. Yang, A., Abbass, H.A., Sarker, R.: How hard is it to red team? In: Abbass, H.A., Essam
D. (eds.) Applications of Information Systems to Homeland Security and Defense, p. 46. IGI
Global, Hershey, PA (2006)
40. Zhai, Y., Ong, Y., Tsang, I.: The emerging? Big dimensionality? IEEE Comput. Intell. Mag.
9(3), 14–26 (2014)
Chapter 4
Thinking Tools for Computational Red Teaming
Abstract To discuss conducting a realistic CRT exercise in large organizations, this

chapter presents models to analyze very large complex systems in a systematic man-
ner that is easy to manage. Scenarios as a mean to capture complex plausible futures
are discussed, followed by the presentation of guidelines on designing scenarios for
CRT exercises. Organizations are complex entities. As such, one section is dedicated
to discussing the complexity resulting from the interaction and entanglement of the
four domains: physical, cyber, cognitive and social. This discussion is followed by
the presentation of two models offered to manage this level of complexity. One
of the models presented is designed to disentangle this complexity into building
blocks where effects can be generated in large organizations. This chapter presents
the different operations that can be conducted on networks in which it is possible to
capture a complex system such as a social system in a network form.
4.1 Scenarios
4.1.1 Possibility vs Plausibility
The word scenario is very commonly used in all scientific fields. In experimental
design, a scenario is an engineered situation to which the experimenter exposes
the units of the experiments. For example, in psychology, if we want to examine
human behavior under stress, the designer may engineer situations in which a human
subject will feel stressed. These situations may be imaginary and deceptive, but they
are engineered to condition the subject to ensure that the phenomenon to be tested
is expressed.
In strategic planning, a scenario is usually an imaginary future. Each future
represents a situation that a country or an organization may face in the future. In
decision sciences, a scenario classically captures a possible set of changes that
might occur for the model. For example, a decrease in the assigned budget is a
scenario, and the analysts would like to understand the impact on performance if
the decrease were to occur. In engineering, a scenario is usually referred to as a
“test case.” These test cases represent the possible situations that a machine may
encounter in the future. They test the machine’s ability to perform under a wide
range of circumstances. In finance, a scenario is usually perceived as variations of
the financial position of a company or variations of the budget.

160 4 Thinking Tools for Computational Red Teaming
In all of the above, despite the fact that a scenario may take different forms
in different fields of science, the fundamental concept of a scenario remains
unchanged.
Definition 4.1. A scenario is a picture of how uncertainties may come together to
form a plausible set of forces (a context) that impact the performance of a system.
We avoided to use the word “future” in the above definition, primarily because it
confuses the concept of a scenario by making people assume that a scenario can only
be about the future. In some analysis, we design scenarios to understand the past. If
we can find the set of forces that shaped up the context in a situation that occurred 20
years ago, we can explain the phenomenon that happened then. One simple reason
to do this is that we may not have enough data to explain the phenomenon as in the
case to attempt understanding a market decision that occurred in the past.
The shape and form a scenario will take will differ in different fields. It can be
a story in strategic planning; an excel sheet with the company budget in finance,
a series of events to which to expose a subject in psychology; a range of values a
parameter may take in an optimization model; or an idea that flies through one’s
mind about a question that may be asked in a job interview.
Scenarios are the language used to represent uncertainties. A scenario is a
plausible set of forces that the system might face. It is important here to emphasize
that we use the word “plausible” rather than the commonly used, inaccurate, word
“possible.”
A scenario is not a “possible” set of forces. Emphasizing what is possible and
what is not can cause a scenario designer to focus on the likelihood that something
will occur. Plausibility deals with the manner in which the internal events of a
situation can come together to create the entire situation logically, consistently, and
coherently.
The distinction between plausibility and possibility is important, as it shifts the
focus of the designer from thinking whether an event is possible to focusing on
the internal dynamics of the event and asking questions concerning which elements
need to come together to make this context plausible. This shifts the focus from
possibilities and probabilities, to reasoning, holistic coherence and overall logical
consistency.
Nevertheless, plausibility implies a non-zero probability and a possibility. Plau-
sibility centers the analysis on inferencing root causes rather than merely on
occurrences. The difference in the two approaches of analysis lies in the angle
taken for the analysis and the corresponding possible bias that can be generated.
Plausibility is akin to a “bottom-up” approach. We begin with the building blocks,
the basic forces that define and shape the dynamics of a situation. We then examine
how these forces can interact to condition a context. Some of these context may
not be sufficiently plausible (weakly plausible) and are therefore, excluded, while
others are strongly plausible and are included. Through this approach, we conceive a
strongly plausible situation that we may have once considered less likely. However,
possibility alone may limit a designer’s focus to a local context, without considering
the overall logic of the scenario in the wider context. That is, possibility focuses
4.1 Scenarios 161
on syntax, while plausibility focuses on semantics. The literature contains many

approaches for designing scenarios: whether a scenario is a simple combination of
the uncontrollable variables in a mathematical model or a space defined with a set
of factors that may influence the manner in which a context is shaped.
In traditional mathematical modeling, scenario analysis can be achieved through
sensitivity or parametric analysis, or through incorporating probabilities into the
model. Here, the definition of a scenario is simple and the focus of the analysis is
placed on understanding the impact. In strategic analysis, scenarios are perceived as
stories, or sometimes a more structured definition is that a scenario is perceived as
the set of factors shaping up a plausible space of possibilities.
In this chapter, we will discuss scenarios from a CRT perspective. Scenarios are
no longer a mere representation of uncertainties. In CRT, scenarios are options:
when blue designs scenarios, blue attempts to capture the scenario space of red in
its own scenario space. An example would be useful before we progress.
Let us revisit the example of John and his job-interview scenario. John can simply
rely on the interaction with Martin and Amy as a CRT exercise to prepare for the job
interview. However, as we stated previously, CRT is not a simple training exercise.
What is important in CRT is the thinking process. In the CRT exercise (i.e. mockup
job interview), John needed to do more than simply going through the exercise. John
needed to be more proactive and think about the exercise. He needed to formulate
scenarios in his mind on what plausible questions he may receive. However, equally
importantly, John needed to think how he could inject uncertainties for Amy and
Martin. This sounds too complicated: Why would John inject uncertainties into Amy
and Martin’s thinking?
If John controls the space of uncertainty for Amy and Martin, John has already
taken control of the questions they will ask him. John can indirectly influence and
reshape the question space of Amy and Martin.
For instance, assume that John has moved between many jobs in a short period,
and that he is afraid that this will be perceived negatively in his application. He has
the option to wait, plays on his luck, and see if he will be questioned about this.
Whether he is posed the question or not, it is likely that the impact will be negative
in both cases. If the question is on the table but Amy and Martin are embarrassed
to ask it, it does not mean that this question is not going to influence their decision
about John’s suitability. If they ask the question, it is likely that John’s answer will
be perceived as defensive or they may not be convinced with the argument.
Now let us consider another scenario in which John takes control of the situation.
When John is asked at the beginning of the interview to tell the interview panel about
himself, John sees this question as an opportunity to manage the uncertainty of his
fear, and generate uncertainties for Amy and Martin to create his own advantage.
John may begin by saying, “I have a wide experience. When I began my career,
I talked to many people and realized that in many jobs, there exists some sort of a
preference toward applicants who have been in one place for a long time. However,
many of the cases I heard about from my colleagues demonstrated to me that people
stay in one place for a long time for many reasons. Sometimes, it is not because they
are good at what they are doing. Sometimes, people stay in the same job for a long
time because they are either not skilled enough to find a new job, they are scared of
making a change in their life, or they are not self-motivated. For me, I did not want
to fall into this trap. Instead, I wanted to demonstrate that I can move from one job
to another and that my skills are needed in different positions. I am now satisfied
that I have demonstrated my skills sufficiently in many types of positions, and am
now seeking a position where I can achieve more stability in my life and spend the
rest of my career giving to one company. I feel I have demonstrated sufficiently to
myself that I can move between jobs, and I now need to demonstrate to myself that
I can stay in one place for a long time.”
In this answer, John did not only answer the question that Amy and Martin saw
as a possible problem for John, but he also answered the question in such a manner
that will make Amy and Martin question their hypothesis-the hypothesis that people
who stay in one place for a long time are better than people who move from one
job to another. John played with Amy and Martin’s uncertainties and injected new
uncertainties in their minds about other applicants that they may have favored.
Overgeneralization can be a problem in CRT, and this example is no exception.
John may have opened a can of worms by answering in the manner described
above. The assumption here is that John did not only imagine what Amy and Martin
think, and respond to his imagination, but that he had carefully studied Amy and
Martin (maybe through his experience with them) and only then, could he make an
informed decision on how to play with their uncertainties.
John has learned how to execute CRT correctly. He coupled in the scenario he
designed in his mind two uncertainties: the uncertainty that he may be facing and he
does not control, and the uncertainty of the other team that he can design, influence,
shape, and even control. Effective representation of scenarios in CRT requires
mechanisms to capture the interaction of blue and red uncertainties simultaneously.
However, how can we design and capture these scenarios? The rest of this section
provides insight into this question.
4.1.2 Classical Scenario Design
Traditionally, scenarios are designed through many different methods that range
from being completely ad hoc to very systematic. We will use one systematic
method in particular to explain one of the classical scenario-design methods, which
is termed Field Anomaly Relaxation (FAR) [3].
Let us first explain the FAR technique for generating scenarios. In the 1970s,
Rhyne developed FAR as a method that does not restrict the analysis to quantifiable
factors; it draws on insight and judgment, provides an audit trail, produces a range
of explicable scenarios, and begins and ends with short essays.
FAR relies on morphological analysis [4, 5], which in simple terms means
the identification of the basic independent blocks defining a structure, the
non-overlapping values each building block can take, then the generation of all
plausible combinations of these values. For example, if we say that a face is
4.1 Scenarios 163
comprised of building blocks such as a nose, two eyes, two ears, and a mouth, we
can then create categories of all possible shapes a nose may take. The same can be
executed for the shape of an eye, mouth, and ear. In FAR, the building blocks are
termed “sectors,” and the values each building block can take are termed “factors.”
Hence, an eye is a sector and an almond-shaped eye is a factor.
If we generate all possible combinations of shapes for eyes, ears, mouths, and
noses, we can enumerate all possible shapes of a face that we would ever encounter.
Similarly, we can enumerate plausible futures that we have not experienced.
FAR begins with a story that describes a strategic future. From this story, and
through brain storming, sectors are extracted. For each sector, the possible factors
are defined. The superset of all possible combinations of these factors is generated.
Some combinations may not be plausible; these are eliminated from the set. The
superset now only contains the plausible combinations of factors. These can then
be grouped together, ordered in an evolutionary path, and a tree describing how
possible futures may unfold is constructed. Each path in this tree is a possible
sequence of situations that may unfold to uncover different futures. This unfolding
process is written in the form of a story. As such, FAR begins with one story and
ends with another.
A simple example is the following. Let us assume that Manysoft is faced with
a challenge against Minisoft. Manysoft does not know whether it should expand
its workforce. A brain-storming session of this situation identified two sectors:
market stability and resource pressures. The market can either be stable or unstable.
Resource pressures can either be low or high. Therefore, in FAR language, each
sector has two factors.
The superset of factors can then be defined as follows:
S1 ! stable market, low resource pressures
S 2 ! unstable market, low resource pressures
S 3 ! stable market, high resource pressures
S 4 ! unstable market, high resource pressures
It is clear in this example that all four combinations are plausible futures that
the company may face. If we assume that S1 is the current situation, we can order
these scenarios. For example, from S1 we can move to either S 2 or S 3, then we can
move to S 4. It may not make sense in this domain that suddenly resource pressures
become high, and the market becomes unstable. It may be more logical to assume
that stability of the market changes and consequently resource pressures change.
Alternatively, resource pressures change, then the stability of the market changes.
We can then write two ways in which the future may evolve:
S1 ! S 2 ! S 4
S1 ! S 3 ! S 4
FAR then suggests that a story is written for each future path.
4.1.3 Scenario Design in CRT
FAR is based on morphological analysis and as such, would fall into the trap of
any method that attempts to capture everything. For example, the above describes
the basis of FAR using an example of enumerating all possible shapes of faces that
one can encounter. What this example did not consider is that through interaction
with the environment, the morphology of the face can be mutated through mutations
in the gene: a face may contain only one eye, or even three eyes. This may not be
imaginable even through a well-designed brain-storming session. FAR assumes that
the structure of the face as we know it is not subject to change.
However, the real problem in these methods is that they stop at the morphological
level. It is important to understand that the real building blocks defining these
morphologies are not just phenotypic building blocks, but also genotypic building
blocks.
For example, when designing a strategic scenario, economic and political stabil-
ity are two examples of typical sectors considered with FAR. They are what we term
“phenotypic building blocks.” What derive these building blocks on the genotypic
level might be culture, education and natural resources. Culture may shape how
disagreement is resolved within a particular country, and whether violence is
pertinent in the manner in which conflict is resolved. Education may shape how
strategy is formed and whether the country advocates for lateral thinking or a
classical memory-based and obeyance-based education system. Natural resources
are fundamental enablers to any economic growth.
On one level, phenotypic building blocks are essential because they hide more
complex details and interactions; thus, they make discussion of the scenarios
manageable. On another level, precisely because they hide essential interactions,
we need genotypic building blocks to understand a scenario more clearly despite
the fact that they will come with a complex space of possibilities. Therefore, both
levels are needed, and in fact, more levels can be needed.
A phenotypic building block relies on the deconstruction of how a system
appears to an observer. A genotypic building block relies on a deeper analysis and
understanding of the true forces that make the system appears as it does. A gene
can be mutated and this can either mutate the structure of a face or a morphological
building block of a face.
Morphological methods ignore the fact that there can be many other layers on top
of a face. They ignore that a mask can be worn to hide facial components or make-
up can be used to reshape an eye into a shape that is not natural or biologically
possible, but is indeed a plausible shape.
FAR demonstrates a simple, efficient and very effective methodology to develop
scenarios. However, it is missing one critical element when in the context of a CRT
scenario: the evolution of blue’s future is dependent on the evolution of red’s future
and vice versa. Blue should not define sectors and factors that are all independent
of red’s objectives. This can waste a lot of resources in discussing forces that sound
right, but are not plausible for red.
4.1 Scenarios 165
Fig. 4.1 Building blocks of a scenario in CRT
In our previous example, the scenarios were designed for the wrong aim.
Manysoft does not know whether it should expand its workforce. However, what
went wrong in the design of the scenarios above is that Manysoft did not consider
that Minisoft is its main competitor and may be asking the same question as
Manysoft, or a question that has a conflicting answer to that of Manysoft. Designing
the scenarios in the manner in which they were above assumes that Manysoft acts
in a passive manner to the uncertainty in the environment. However, in CRT, a
scenario needs to capture the essence of CRT, that is, there is a continuous reciprocal
interaction between two entities.
A blue scenario in CRT needs to capture blue uncertainties as well as red
uncertainties. Blue’s uncertainties can be uncontrollable for blue. However, blue
can control some aspects of red uncertainty. Figure 4.1 presents the basic unit of a
building block that defines a scenario within CRT. The sign on an arrow reflects the
nature of correlation or the behavior of the interdependency relationship. A positive
sign means a positive correlation (an increase in one factor would increase or
positively enforce the other), while a negative sign means negative correlation (an
increase in one factor would decrease or negatively enforce the other).
While traditional definitions of scenarios focused on factors representing sources
of uncertainty, in CRT, a scenario is comprised of building blocks. For example, the
fundamental unit of each building block for blue is comprised of four components:
outcome/effect, objective, blue uncertainty, and the portion of red uncertainty that
blue can influence.
Before we progress, we should discuss what may be the most controversial
component to have of the four components: to include objectives in scenario design.
Traditionally, scenarios focus on uncertainties. Traditionally, uncertainties exist in
the environment, they are uncontrollable, and therefore, the objectives of the system
are not considered when designing the scenarios. Remember that in CRT the two
teams have conflicting objectives. Therefore, one team’s objectives are a threat to
the other team. Blue needs to consider its objectives in its scenario design simply
because blue’s objectives are the source of uncertainty for red, and will impact how
red’s uncertainties are developed. Blue wishes to generate an outcome to achieve its
objectives. To use another example, John’s objective is to earn money. Getting a job
is the outcome of the interview that John wishes to undergo to achieve his objective.
As such, blue’s outcomes should always positively influence blue’s objectives. If
a blue outcome is designed to negatively influence one of blue’s objectives because
it may have a more profoundly negative effect on red’s objectives, blue needs to
redefine its objective to make this intentional negative influence positive. That is,
if blue will sacrifice a little damage in itself to generate greater damage in red (as
in the case of a company losing some shares in the market to influence the market
and force the competitor to lose more shares), blue needs to redefine its objective
in terms of red’s loss (i.e. positive objective for blue). Therefore, we will always
assume that blue’s outcome will always be designed to influence blue’s objectives
positively.
The uncertainty facing blue will always negatively impact blue’s outcomes.
Meanwhile, blue’s outcomes need to negatively impact red’s uncertainty. Before the
above statements generate skepticism, we need to note that these sentences are only
valid within the context of designing the basic building block of a scenario. With
scenario design, it is natural to focus on the negative impact of blue’s outcomes on
red’s uncertainties. The execution of a scenario may produce the opposite effect.
Moreover, blue may design strategies with the opposite effect. Still, there is no logic
that would justify why blue would generate an outcome to help red, unless at the
end of the process of helping red, there is a trap for red or a large gain for blue.
The building block of a scenario in CRT consists of the elements presented in
Fig. 4.1.
Scenario_building_block D f .Blue_objective; Blue_uncertainty; Red_uncertainty/
This definition of a scenario emphasizes that each building block is centered

on an objective. With the large space of possibilities that red and blue face, and the
possible larger space of uncertainties, it is not beneficial to use a traditional scenario-
design approach. A technique such as FAR would ignore the manner in which the
objectives of red interact with blue’s own objectives, and how this interaction can
create other factors that cannot be considered when thinking passively.
While a scenario does not need to describe the details of the interaction among
players and only needs to describe the situation and context within which the
simulation runs, the objectives of both teams and how the two teams interdepend
on each other should be part of the scenario design.
Therefore, to explain a methodology to design scenarios for CRT, we will use
the above building block of a CRT scenario in two forms. In the first form, we
will remove the intermediate node of outcomes and connect the independent nodes
directly to each other as in Fig. 4.2.
4.1 Scenarios 167
Fig. 4.2 Building block form 1 of a scenario in CRT
Fig. 4.3 Building block form 2 of a scenario in CRT
Interestingly, this figure demonstrates that the central point for a scenario remains
the space of uncertainty for blue. However, it emphasizes that this space depends on
blue’s objectives and red’s uncertainties. In fact, blue’s uncertainty can be great.
Blue’s objectives scope blue’s uncertainties. Only the uncertainties that impact
blue’s objectives are relevant here. This bounds the uncertainty space for blue.
Conversely, red’s uncertainties may expand the uncertainty space of blue.
Red may inject more uncertainty into blue’s uncertainties than necessary. For
example, red may embark on deceptive operations that appear to be real solely to
increase blue’s uncertainties or shift blue’s attention to other desired uncertainties.
If blue does not consider the interdependencies between its objectives and uncer-
tainties with red’s uncertainties, blue may fall into the trap of underestimating its
own uncertainties or prioritizing its uncertainties and objectives incorrectly.
Figure 4.2 does not present red’s objectives. However, Fig. 4.3 does present
these to emphasize the symmetric nature of scenario building blocks in CRT.
Red’s uncertainties would be impacted by blue’s uncertainties, as would red’s own
objectives. In essence, red and blue do not necessarily have complete access to each
other’s objectives. Even if they do, while the objectives are in conflict, the main
“direct” interaction between red and blue scenarios is through the uncertainty space,
not the objective space.
In the second form, we can maintain two nodes only: blue’s outcomes and red’s
uncertainties. Risk for blue is defined as the impact of blue’s uncertainties on blue’s
objectives. As such, blue’s outcomes are blue’s risk, which can be both negative
and positive. Consequently, this second form emphasizes that a blue scenario can be
defined in terms of building blocks, where each building block takes the form of a
risk for blue and how red’s uncertainties impact that risk.
It is important to explain why red’s uncertainties are not considered part of blue’s
risk. The main reason is that red’s uncertainties may be certain for blue, blue may
inject its own designed uncertainties in red, blue may shape red’s uncertainties, and
it might even be the case that red’s uncertainties do not impact blue’s uncertainties
at all.
For example, John’s risk in getting the job consists of John’s uncertainty of
not knowing how he will perform in the interview, and what questions he will be
asked, and whether he will achieve his objective of getting the job. The company to
which John is applying has different uncertainties: whether the right candidate for
the job will apply, whether the selection committee will make the right choice and
detect the right applicant, and whether the right applicant, if selected, will accept
the job. John can play with the company’s uncertainties even if he is not the right
applicant. John may demonstrate that he is keen to get the job, or that he is loyal
and it would be cheaper to hire him and train him a bit more to become the right
applicant than hunting for the right applicant for whom there is high demand in the
market. John needs to consider the company’s own uncertainties when designing
his own scenario space.
4.2 A Model to Deconstruct Complex Systems
To design a CRT exercise, designers need to understand how to control or influence

a system. Here, we need a systematic manner in which to analyze blue and red
as systems. We need a framework in which we know what to change to achieve
our objectives without eliminating the possibility that our objectives can simply be
uncertainty management.
To establish the knobs and design the controls, we need to rely on strategic
thinking. We may have heard statements such as “we need to think outside the box.”
The “box” here refers to the set of constraints and assumptions that we take them
granted; these can bias our thinking in one direction, while leaving out a significant
space of possible actions that are untouched. The box represents our bias, limited
cognition, stereotyping, system traditions, and many other factors that influence our
thinking in one manner, while leaving out many innovative manners in which to
achieve the goals. Even the concept of thinking outside the box is a limited view of
strategic thinking.
If the objective is to be creative-truly creative-we need to learn to avoid being
limited in our thinking by thinking of any box in the first place. We need to forget
about the box and think freely about how to achieve our objectives (ends). In being
goal focused, we should not apply the notion that “the ends justify the means” in a
bad manner.1 Rather, this say should be rephrased as “the ends explain which means
are more useful than others.”
1
Sometimes this sentence is misinterpreted to mean that bad means can be forgiven if the ends are
good.
4.2 A Model to Deconstruct Complex Systems 169
Definition 4.2. A strategy is the “ways” in which we use the “means” (resources
and capabilities) to reach and achieve the “ends” (objectives and goals).
Definition 4.3. Strategic thinking is the creative process used to design and connect
the means, ways and ends.
The above definition of strategy stems from other works such as [1, 2]. In CRT,
this thinking needs to be about both red and blue. From a blue team’s perspective, it
is important to break the box and think creatively about how to achieve its objectives.
In doing so, the blue team needs to consider how to force the red team into a box
that is strategically important for blue. It is beneficial for blue that red’s thinking
is within a box; it is detrimental for blue to have a box of its own during strategic
thinking. From the red team’s perspective, it is beneficial for red that blue’s thinking
is within a box; it is detrimental for red to have a box of its own during strategic
thinking.
This race to shape a box for the opponent and break a team’s own box is what we
will term here a “thinking strategy.” When this thinking is guided with appropriate
risk analysis, we will term it “thinking risk.”
4.2.1 Connecting the Organization
Figure 4.4 depicts a manner in which to analyze an organization strategically so

that thinking strategies to influence and control that organization can be designed
and engineered. The entry points in this high-level thinking strategy model are
resources. The basic resources that all organizations rely on can be categorized as
“land,” “people,” “capital,” and “knowledge.” The first three represent the classic
view of resources in economics. Here, we add knowledge as an additional resource
that is becoming increasingly recognized in modern societies.
4.2.2 Resources
This first level presents opportunities for controlling an organization. Any organiza-
tion, even an entire country for the sake of the argument, has limited resources. In
most situations, in one organization one of these four categories is a factor that is
more limiting than the other factors when compared to another organization.
For example, let us assume that for Minisoft, “people” is the resource that is most
scarce of the four basic resources. Promoting certain activities in Minisoft would
mean the organization would be forced to shift people from one area to another.
For example, Manysoft may simply leave a portion of the market focusing on client
support untouched. Minisoft sees this portion of the market and attempts to profit
from it. However, the limited people resources available in Minisoft would force the
company to shift some software programmers to customer support. This reduces
Fig. 4.4 Schematic diagram displaying how the blue and red teams are connected strategically as
a system
Minisoft’s ability to offer software programming, and Manysoft would like to

focus all of its activities on the software-programming market. Here, Manysoft is
indirectly controlling Minisoft and shaping its activities. However, it is not harming
Minisoft’s profit. This example represents a strategy by Manysoft to focus on the
market that attracts a larger profit: software development, rather than on the market
that attracts less profit: client support. In this example, Manysoft is shaping the
environment for its own benefits.
Thinking risk through understanding uncertainty management is to understand
the environment, influence it, and shape it so that other players are controlled
without communication, explicit control, or even knowing in what direction they
are being controlled. This is thinking risk in action.
Manipulation of resources offers control or influence points for one organization

over another. Increasing salaries for people with software-development skills is a
strategy by Manysoft to attract Minisoft software programmers; this makes software
programmers a more scarce resource for Minisoft. Such legal and legitimate actions
can hide deceptive intentions that have damaging effects on the opponent.
One may argue that the Cold War is a good example in which the resource of
“capital” was the focus. Traditional wars focused on the resource of “land.” In the
age of the knowledge economy, “knowledge” as a resource is becoming the knob to
control or influence organizations. Denying certain organizations access to certain
types of knowledge is a strategy to achieve such control or influence. The pros
and cons of such a strategy will not be discussed here. In all eras, the resource
of “people” has been the enabler to manage all other resources.
4.2.3 Fundamental Inputs to Capabilities
The second level is fundamental inputs to capabilities [1]. These are the building
blocks that require the synthesizing of resources within an organization to establish
a capability.
A capability is the capacity to achieve an objective [1]. Within an airline, an
aircraft is a capability. It requires a number of fundamental inputs to capabilities
such as supplies, collective training, and personnel. Each fundamental input to
capability requires different mixes of all resources. For example, collective training
as a fundamental input to capabilities is concerned with enabling groups such
as air crew to train together so that they can understand how to communicate
effectively with each other; how to overcome misunderstandings that arise because
there are people with different specialties such as pilots and air stewards; and how
to synchronize actions and roles in emergency situations. Collective training is a
means to achieve interoperability in human interactions.
Collective training requires the four resources. It requires a place for the training
to be conducted: land; it requires capital and investments; it requires people to
conduct the training (trainers); and it requires knowledge in the subject matter to
make the training meaningful and effective.
Collective training alone does not create a capability for an airline. However,
it is an essential building block for flying capability. Nevertheless, one may find
a company specializing in collective training where collective training in that
company is a capability in its own right. As such, a fundamental input to capabilities
in one system may be a capability in a different system; in the same way, a
component in a system is a system in its own right.
The aircraft itself is a fundamental input to capability known as the platform. An
aircraft on its own will not fly. It needs supplies of fuel, crew (air or ground), and
many other elements before it delivers a flying capability.
Now, imagine someone donating five different types of aircraft to an airline. At
first, this seems great. However, scrutinizing its impact on fundamental inputs to
capability can prove that it is a damaging occurrence. The airline needs to have five
experts in different subjects to manage the five different types of aircraft. It needs to
establish different maintenance regimes and technical skills to cover the portfolio of
its fleet. In fact, in this scenario, the airline will stretch itself thinly until the point of
possible collapse.
Fundamental inputs to capabilities are the second knob to control or influence
a system. While the system may have resources, shifting the resources from one
fundamental input to capabilities to another would create a gap and the capability
will not materialize.
4.2.4 Capabilities
The third layer contains capabilities. Each capability is designed to deliver functions
to achieve effects or outcomes for the organization. If one cannot influence resources
and fundamental inputs to capabilities, the capabilities will exist. What one needs
to focus on is whether the functions that these capabilities will perform are
controllable.
For example, assume a flying capability. The organization establishes the
capacity to fly aircraft. The functions can be flying domestically or internationally,
carrying passengers or cargo. A competing airline can shape the market so that it
focuses on international flights, leaving the domestic market for a different airline.
For that second company, while the fleet has the capacity to fly internationally, the
market is reshaped such that performing this function is not a wise move.
As we approach the final two layers, effects/outcomes and strategies, an external
entity will need to exercise a different type of control. We may not be able to stop
an outcome if the resources, fundamental inputs to capabilities, capabilities, and
functions are uncontrollable. Instead, we need to manage the effects in one way or
another.
To manage effects, we need to introduce the concept of a network, then discuss
the operations that one can achieve on networks. Understanding these operations
will demonstrate how an effect can be managed. How to control or influence this
network of effects is the basis for the following section. Before we progress in this
topic, we need to continue explaining the remaining layers in our high-level model.
Operations on networks and their use for influencing effect spaces are discussed in
Sect. 4.3.
4.2.5 Vision, Mission and Values
The layer at the most right-hand side of the diagram captures the vision, mission
and values of an organization. Vision denotes the long-term goals an organization
needs to achieve. Mission is what the organization is about: spelling out in concrete
terms the intermediate goals, way-points, and performance indicators that the
organization needs to achieve to reach its vision, and the set of functions that the
organization needs to perform to be able to achieve these goals.
Values represent the boundaries of behavior for both the organization and its
employees that should be maintained while making decisions and searching for
solutions.
Vision, mission and values are designed by the board to provide the organization
with a coherent sense of direction, focus, and culture.
4.2.6 Strategy
Between the effects layer and the vision, mission and values layer, the strategies
layer design the “ways” to connect the “means” (i.e. all layers on the left-hand
side of the strategy layer) to the “ends” (i.e. the layer on the right-hand side of the
strategy layer) [1]. The role of a strategy is to understand the “hows” of, and risks
in, transforming and translating:
• the goals into outcomes and effects that need to be met to demonstrate that the
goals have been achieved
• the outcomes and goals into functions that need to exist to enable the successful
achievement of these effects
• the functions into capabilities or integrated functioning systems with the capacity
to perform these functions to deliver the required effects
• the capabilities into fundamental inputs to capabilities or the building blocks
required to have an integrated functioning system
• the fundamental inputs to capabilities into required resources that need to be
synthesized to produce each fundamental input to capability
• an overall framework to ensure that all layers on the left-hand side of the
strategies layer and all strategies are linked in an efficient, coherent, cost
effective, and meaningful manner to achieve the mission, vision and values on
the right-hand side.
While a strategy on a strategic level would stop at translating the vision, mission
and values into effects, in each level of the organization, a strategy needs to be in
place to understand the ways a layer on the left-hand side would deliver the goals
and objectives of the adjacent layer on the right-hand side.
While objectives and goals move from right to left (descending down in the
organization from a strategic level to a tactical level), constraints on achieving
these goals usually move from left to right (ascending up in the organization from
tactical levels to the strategic level). A strategy between two layers ensures that the
objectives will be achieved despite the constraints, or the objectives will be reshaped
to account for the constraints.
Each layer in this model for the deconstruction of a complex organization
provides opportunities for both blue and red to influence the organization. If this
is a red organization, red can use this model to identify problems and any force that
is interfering with the goals of the organization. In the meantime, blue will focus on
designing forces to influence and shape red’s organization. Red in this case has a
much more complex problem than blue. It is sufficient for blue to focus on one layer
to influence the overall red organization, while red has the daunting task to ensure
that all layers are well-functioning to achieve its objectives and goals.
4.3 Network-Based Strategies for Social

and Cyber-Security Operations
4.3.1 Socio-Cognitive-Cyber-Physical Effect Space
In any organization, multiple outcomes need to be balanced, factors such as

maximizing profit, maximizing employee satisfaction, maximizing stakeholder
satisfaction, and maximizing opportunities in the market constitute but a few
objectives that may simultaneously co-exist in a profit-based organization. The word
“maximizing” in this context implies attempting to obtain the highest possible value
of the corresponding objective while respecting the constraints in the system, which
normally prevent the value of the objective from increasing indefinitely.
The interaction between the outcomes and the humans requires the analysis of
four domains in which an effect or outcome is residing: social, cognitive, cyber and
physical Domains (SC2PD) as shown in Fig. 4.5.
An outcome may need to reside within a physical space. For example, creating
the capability to perform self-maintenance and self-repair within a logistical
transport company if done correctly would generate an outcome of maximizing
the on-road time of the fleet. This is an outcome in the physical domain because
it impacts the operation of the vehicles and platforms in the system.
The same previous capability can have an outcome in other domains such as the
cognitive domain. In this logistics transport company, customers would build an
image of trust in the operations of this company. This trust would be expected to
maximize the customers’ loyalty, as the image of the company in these customers’
mind improves.
The same capability may generate an outcome in the social domain, whereby
loyal customers can be connected through loyalty programs to build a social network
of loyal customers. Such a network would make customers aware of other customers
who are using the company; thus, synchronizing movements in a supply chain can
occur more efficiently. For example, imagine the raw materials M1 will move from
customer C1 to customer C 2. C 2 will process M1 and transform it to M 2, which
then needs to move to customer C 3. If C1, C 2, and C 3 are connected through the
social network of the logistics company, they would understand that they can rely
on the company to achieve on-time delivery of the raw materials and products when
needed; thus, cutting down the cost of storing materials.
4.3 Network-Based Strategies for Social and Cyber-Security Operations 175
Fig. 4.5 The interdependent nature of the effect space
The discussion of the cyber domain is postponed to Sect. 4.3.2.

The level of complexity emerging from the interconnectedness of the effects in
the effect space, the need to understand the operations on the networks mentioned
above, and the interaction of effects in the SC2PD, demonstrate the need for
effective and efficient methodologies to manage the effect space.
This level of complexity cannot be managed by humans alone. CRT provides
the computational environment in which hypotheses can be tested, but more
importantly, in which hypotheses can be formulated and generated.
What is interesting in a network of outcomes is that the outcomes share
constraints and causes. As such, they are not free or independent. Their interde-
pendencies are reflected in the links between them, but their true independencies are
rooted in the structure setting underneath them, that is, in having common functions,
capabilities, fundamental input to capabilities, and resources.
The complexity caused by the interdependency of different domains can be lever-
aged to provide multiple interception points whereby effects can be generated on
one level to isolate or deny effects on a different level. To manage effects, one needs
simple tools to deconstruct the organization and the system into subcomponents
where effects can be intercepted and manipulated accordingly.
Controlling a network of effects must consider the human side, as it is the human
who will judge whether the maximization is satisfactory. The human moves the
problem from pure maximization or minimization to a satisficing problem.
To understand the knobs, we need to frame these as knobs for controlling or
influencing networks of things or people. These knobs are universal for any type
of network, although some of them would be only applicable in socio-technical
contexts than in pure technical contexts, where the concept of social engineering
would be more relevant and applicable.
4.3.2 Cyber Security
Before we proceed with a discussion on network operations and how they can
contribute to the fields of social engineering and Cyber Security, we need first to
define what we mean with Cyber. This is important because it will make it clear
why “networks” are at the center of any Cyber Security operation.
Traditionally, one would discuss the information domain rather than the cyber
domain. Recently, the cyber domain is emerging as a more encompassing concept.
However, the definition of “Cyber” is somehow confusing. For computer scientists,
it has been seen as another form of computer security, despite that the Cyber space
extends well beyond classical computer security issues such as cryptography and
network security to issues where an understanding of complex systems, network
theory and psychology are paramount.
In the military, the word “Cyber” has also been confusing. The military has
been conducting operations in the electromagnetic spectrum such as electronic
warfare operations for decades. Therefore, the Cyber space is not a new concept
to the military as it is almost a new buzz word in the civilian world. This
begged the question of whether the word Cyber should be differentiated from the
electromagnetic spectrum.
Within science, the field of Cybernetics is maybe the oldest scientific field that
uses the word Cyber in its title and roots. One of the main journals in this field is
the IEEE Transactions on Cybernetics, which used to be called, IEEE Transactions
on Systems, Man and Cybernetics Part B: Cybernetics. The scope of this journal
emphasizes papers on “communication and control across machines or machine,
human, and organizations” (Source: IEEE Transactions on Cybernetics - Aims and
Scope Statement).
The above is a very simple demonstration on how confusing the word Cyber is.
Despite that many people would claim to know what it is, but it will be more difficult
to ask them to define it precisely in such a way that truly distinguishes its meaning
and use from other disciplines.
Figure 4.6 shows a conceptual diagram to explain the Cyber space. The starting
point is the physical infrastructure that supports the Cyber space, which contains
elements such as the physical backbone of the internet, servers, routers, signal
receivers, signal transmitters, and satellites. These elements are necessary for the
existence of the Cyber space. Together, they form different physical networks.
Within an organization, a local area network connects the organization’s internal
computers together, but does not necessarily connect the organization with the
external world. To protect the physical infrastructure, it is important to understand
physical security. For example, access control to the server’s room in an organization
is a type of physical security.
Fig. 4.6 An outline of the building blocks for the Cyber space
The physical layer provides the physical infrastructure that allows the generation
and propagation of electromagnetic signals. Signals can be seen as the water flow in
a river, they carry things such as fishes and boats. Information are the equivalent of
fishes and boats; that is, signals carry information.
The electromagnetic flows form another type of networks known as the logical
network. Nodes in the logical networks do not necessarily correspond, and mostly
they do not correspond, to nodes in the physical network. This network requires
protection of a different type from the physical protection. Securing communication
channels are one type of communication security required to protect the logical
network.
Signals carry, or encode, information. These information can be texts carried in
an SMS message, voices in telephone calls, email messages in a computer network,
or videos on the internet. These information forms other type of networks, that
we will call information networks. An information network connects information
together. For example, a database of customers in a supermarket, connected to a
database of financial transactions of credit cards can provide the network required
to understand loyal customers or behavior of customers across a sector. Information
security is the classical focus of computer security.
The Cyber space is made-up of the components shown in Fig. 4.6. Therefore,
Cyber is the space spanning any flow in an electromagnetic spectrum. This flow
can be a flow of information in the form of bits, regulated with electrical signals,
as in the case of pieces of data moving from one computer to another. This can
be observed in the logistics company that was transmitting orders from customers
to the decision-support unit installed in vehicles so that drivers would act on these
orders. The flow can be a flow of signals as is used in a Global Positioning System
(GPS) in transmitting vehicle positions in real time so that the logistics company can
monitor and adequately optimize the use of its fleet. The cyber domain is currently
evolving to represent the space in which complex flows occur in the electromagnetic
spectrum. Injecting a virus by a rival company into the computer system of
the logistics company, intercepting the GPS signals produced by the vehicles of
the logistics company by rival companies, or jamming the communication lines
between the logistics company and its fleet are examples of offensive operations
in the cyber domain to generate a cyber effect. Examples of operations to generate
positive effects for an organization using the cyber domain include marketing the
company in online social networks, establishing a space for the company in a virtual
game such as second life, and using emails to announce discounts and special offers.
Definition 4.4. Cyber space is formed from all flows regulated by the electromag-
netic spectrum
Definition 4.5. Cyber security is the business processes and tools needed to protect
any flow in the electromagnetic spectrum.
Definition 4.6. Cyber operations are any sequence of activities conducted in the
electromagnetic spectrum with the intent to achieve one or more effects.
As being demonstrated in Fig. 4.6, networks are the basis for the Cyber space.
In fact, security can’t be claimed in any Cyber subspace unless the three types of
networks shown in Fig. 4.6 are secured. Therefore, it is paramount to categorize
network operations since the Cyber space is likely to be part of most red teaming
exercises in any large organization.
4.3.3 Operations on Networks
There are many network operations that need to be discussed and understood by the
teams conducting a CRT exercise. The technical details on how these operations are
conducted are context and exercise specific, but an understanding of these categories
is essential for members of the CRT exercise (Fig. 4.7).
The main challenge in managing a network is that connections between nodes
create interdependencies that make it difficult to manage consequences. A change
in one node may propagate undesirable effect(s) in the overall network, or even
cascade and generate a massive blackout in the overall system, for example,
cascading failures in power networks or cascading anger in a social network.
This challenge creates many opportunities to manage, shape, or break down a
network. Each of these opportunities will be explained below as an operation that
can be conducted on a network.
Fig. 4.7 Different operations on networks to achieve an effect
4.3.3.1 Detect: Detection of Hidden Networks
Possibly the first operation on a network is being able first to detect that a network
exists. In some cases, it might be simple to expect that an organization would have a
network of effects. In other cases, this network may not be detectable by an observer.
The simplest manner in which to explain this is to think of a network of criminals
within Manysoft who are trying to commit fraud. Manysoft cannot do anything
about this network until it is able to detect it. Through the detection of several nodes
Manysoft may be able to establish with confidence that a larger network exists.
The detection problem can be extremely difficult. However, one characteristic of
a network that makes this detection problem more easy (as opposed to easy) is the
existence of many nodes. By definition, the network cannot survive as a network
without links. Therefore, while detecting a single node can be a very challenging
problem, the larger the network, the more likely that the existence of the network
becomes detectable. Nevertheless, a hidden inactive (i.e. sleeping) network is more
difficult to detect than an active network.
4.3.3.2 Identify: Identification of Detected Networks
Identification comes after detection. Once a network has been detected, the question
that must be posed is what the network is about. To identify a network is to associate
an identity (purpose or intent) with it, which is a form of contextual information.
For example, is this network for fraud or is it a gossip network? Identification can
also involve many more features about the network including an estimate of its size,
topological characteristics, and a characterization of the dynamics and types of flows
on the network. All these features can help to zoom in more to clearly identify the
network and distinguish it.
4.3.3.3 Track: Tracking Networks as They Maneuver in the Environment
Tracking a network can be defined formally as the problem of using minimum

resources to maximize situation awareness of the activities and moves of a network.
Once a network has been detected, it becomes natural to ask how to track it. Do
we need to track each node? Is it sufficient to track the hubs? Should we sample
nodes at random to track the network?
The answers to the above questions depend on context. The common indicator
that can assist in answering these questions is to understand the dynamics of the
network. By network dynamics, we intend to convey the concept of how the network
changes over time, how the nodes move in different spaces, how the links are
initiated, and the characteristics of the flows on the network.
Changes in the network can take forms of adding or removing new nodes and/or
links. If the network expands low-degree nodes (nodes with few connections) into
chain-link expansion, tracking the hubs may not yield satisfactory results in tracking
the network.
Nodes can live in different spaces, defining the different modes of links. For
example, if a node represents ideas, one can argue that ideas live in a cognitive
space. An individual may live in a social network, while simultaneously living in an
infrastructure network if they are a guard or a train operator. Understanding how a
node moves in different spaces would help the tracking process.
Take for example a network of individuals. Assume we are building a network
based on who is close to whom at a cocktail party. Proximity in this context can
be measured and defined as people located within half a meter of each other. If
one individual moves a great deal faster from one location to another than all other
nodes, this fast-moving node may require more tracking than the other.
The dynamics determining how one link is initiated in a network or is eliminated
has a profound impact on the ability to track a network. In the cocktail party example
above, assume a link would be initiated when two people shake hands. Therefore,
the fact that one node is moving faster than the others does not mean that this node
is able to establish links. Whereas, carefully moving small distances between large
groups and shaking people’s hands would have a more profound impact on link
formation. Here, shaking a hand can become the trigger for tracking.
4.3.3.4 Deny: Denying Access and Capabilities to a Network
This operation is concerned with denying a network access to capabilities. For

example, let us assume that a network of employees wanting to commit fraud in
a bank has been detected. One response is simply to fire all of these employees.
However, this response can be simplistic and ineffective. First, there is no guarantee
that the new set of employees will be better. Even worse, hiring a new set of
employees is almost like reinitializing detection to zero; thus, requiring more time
to detect whether the new set will commit fraud.
Second, fraud cannot be committed properly unless members of the networks are
sufficiently experienced to know how to bypass the procedures and rules to commit
fraud. We may all agree that these are bad people with bad intentions. However, we
cannot deny that these people have significant corporate knowledge that may require
many years to build in a new set of employees.
Third, if a company has real problems in finding people with the required skill set
and expertise, the company does not necessarily have the luxury of firing employees
with significant expertise.
As risky as it may sound, one might suggest that maintaining the network
within the company is the best solution. However, this network needs to be denied
capabilities to commit fraud. For example, the nodes (people) might be reallocated
to different branches; they could be moved to different sections of the bank in which
there is no access to transactions (e.g. performing a passive auditing function instead
of having access to the transaction systems though which they can create fraudulent
transactions); or the IT monitoring system could improve its scrutiny in validating
transactions entered by people in these networks, and deny access when transactions
are suspicious.
In a world with limited resources, denying capabilities is an effective strategy
for preventing a network from operating improperly and achieving its objectives.
This is especially the case when dealing with threats where intents are very difficult
issues.
As we are discussing CRT, we should note that this is a double-edged sword.
The red team may detect an effective network within blue such as a communication
network. Red can inject sufficient information to blue to raise blue’s suspicions
about the nodes of that network, causing blue to think that these nodes form a bad
network such as a network of fraud. Blue would then apply their risk-averse strategy
and decide to deny those nodes capabilities to perform their functions. In doing so,
red would achieve its objective, as blue would be forced to replace the nodes that
are too efficient with less efficient nodes; thus, opening a hole in the blue system to
be exploited by red.
4.3.3.5 Prevent: Preventing a Network from Achieving an Effect
In generation, the prevention operation operates to prevent a network from perform-

ing an action. We distinguish between two types of prevention strategies. In the first
type, a barrier is placed between the network and its objectives. Here, capabilities
are not necessarily being denied because the network may still have the capability
to perform its function. For example, a fraud network may have access to the system
where fraud can be committed. However, every time the network attempts to commit
fraud, a barrier is in operation. For example, the nodes are called for jobs in different
locations, or the system is shut down for maintenance.
This first type of prevention is fundamentally different to denying capabilities for
two reasons: the first is that the network still has the capacity to commit fraud; and
the second is that to prevent the network from committing fraud, there is a need for
an extremely efficient monitoring processes to establish perfect situation awareness
about the network intent and the expected time for an action to be taken. Only at
this time can prevention through these barriers can be successful.
The second type of prevention operates to prevent the network from growing
or increasing its connectivity. This is also a type of a barrier, but it is a barrier
surrounding a network topology, rather than a barrier surrounding a network
function.
Prevention may seem a difficult operation. However, it is an indication of
a healthily functioning organization. If prevention cannot be achieved, it is likely
that the organization does not have sufficient situation awareness of the networks
within the organization and intended actions. As such, it is likely that this organi-
zation has networks that have not been detected. Prevention is better than cure; it is
immunizing the organization against a potential attack from the network of viruses
that may intend to harm the organization.
4.3.3.6 Isolate: Isolating a Network from Other Networks
Isolation can be a form of denying access of one network to other networks.

For example, isolating the network of people intending to commit fraud from
all other social networks in the organization is a form of denying capabilities
through isolation. A spy within a pharmaceutical company who attempts to steal
the company secret to leak to its rivals can be isolated in many ways. For
example, simple ways to isolate the spy from the rest of the company could involve
intercepting emails addressed to the spy, delaying the arrivals of the emails so that
the spy misses a party, or deleting them so that the spy does not have access to a
conversation.
In cyber security, honey pots are virtual environments in which a hacker can
reside, erroneously believing that they are within the real IT environment of an
organization. As such, the hacker has been isolated from the main computer
environment within the organization, despite the hacker maintaining their capability
to hack the system and the fact that they are indeed exploiting the system: the only
difference is that they are exploiting a made-up system and not the real system.
4.3.3.7 Neutralize: Neutralizing a Network’s Effect
Assuming a situation in which a network has the capability to perform its intended
function, and it has not been possible to prevent the network from achieving its
goals or isolating the network: How can we manage the outcomes? Assume that
a network of hackers manages to penetrate the IT system of an organization, steals
information and is now holding the information to threaten the system. The question
now is whether we can neutralize the effect.
It might be simpler to think of someone who blackmails you with a photo of
you naked: What should you do? If you feel extremely embarrassed, they will
be successful in blackmailing you and getting what they want. You will open
opportunities for more people to blackmail you. Another strategy would to go public
naked. Yes indeed! Regardless of how much fear and embarrassment you may feel in
normal circumstances of appearing naked in public, if you are blackmailed, you may
need to overcome this fear, as the cost of being blackmailed exceeds the cost of your
fear of being seen naked. Despite that this example may sound like an exaggeration,
it sufficiently illustrates the point we wish to make. Blackmailing rests on fears. One
possible strategy to manage blackmailing is to face the fears.
Similarly, if information has leaked about an organization and the situation
is difficult to contain, it may be easier to make the information public yourself.
Preempting the effect of the attacker may generate opportunities for you. You can
go public and demonstrate that the organization is moving toward sharing more
information with the public, regardless of how much it may be damaging to the
organization. It is likely that the damage will be much less if you go public than if
the hacker goes public with the information because you will have the opportunity
to frame these information as you wish and pre-empt the hacker’s, possibly more
damaging, framing of the information.
If the hacker goes public, the damage is not only in revealing the information.
The damage extends to the security system of the entire organization, the image
of the organization, and its ability to protect its own information. In addition, one
successful hacker may also become a hero for other hackers to follow. Proper risk
assessment can involve the principle that controlled damage is a preferable strategy
to follow than aspiring for a damage-free situation.
4.3.3.8 Destroy: Destroying an Existing Network
Sometimes the only solution available is to destroy the network. A network

of Unmanned Aerial Vehicles (UAVs) with very advanced sensorial capabilities
performing surveillance may not be managed using the operations discussed above.
This may lead to a difficult but necessary decision: to destroy the network. For
example, a network of serial killers may constitute a network that needs to be
destroyed.
The destruction of a network is not necessarily a destruction of the nodes. It is
more effective to destroy the links. For an organization, moving the hub node to an
overseas branch and to a different area in the organization would destroy the fraud
network. A hub in this situation can mean many things, for example, the node that
is most connected socially to all other nodes; the node that is most influential on all
other nodes; the node that is doing the core thinking on behalf of all other nodes; or
the node with the technical competency to execute the act. By destroying the links,
the network collapses.
4.3.3.9 Reshape: Reshaping the Network’s Environment
In many socio-technical systems, it is more effective for an organization to reshape a

network than to follow any of the operations mentioned above. However, reshaping
can be expensive and usually requires more time than do the operations.
Take for example the network of people wanting to commit fraud in a bank.
Imagine that we were able to reshape the intent of these people to protect the bank
against fraud: Who is more skilled than the fraudsters themselves to do such a
job? They can become the red team! In fact, every set of skills a network creates
is valuable within the organization if the organization finds methods by which to
reshape the intent of these networks.
Reshaping a network intent or function means turning a thief into a detective;
thus, creating a breed of detectors who know how a thief thinks, functions, and
performs. Network reshaping is almost the best operation that can be performed on
a network. However, it takes time and resources to reshape intent and get a network
to redesign its objectives and goals.
4.3.3.10 Hide: Hiding a Network Within Other Networks
Given we are discussing CRT, it is natural that for each network operation we
discuss, we also discuss a counter-operation. In CRT, hiding a network is an
operation in which one of the two teams does not want certain networks to be
detectable by the other team.
Hiding a network can be a large area of research in its own right. Here, we will
discuss this operation at a surface level. Regardless of the level of depth required to
discuss this operation, there are fundamental properties that the operation of hiding
a network requires. These properties are node autonomy, and link invisibility. These
two properties may seem dependent on each other. As the degree of autonomy
increases in a node, this node will detach itself from other nodes; thus, there will
be no need to establish or reinforce links with the original network (i.e. no need to
communicate), which will eliminate the links (dependency) between the nodes.
However, links exist for many reasons. Two individuals can be autonomous
in their actions, but they live together, work together, talk to each other on the
telephone, and perhaps even have a personal relationship.
References 185
The opposite is also true. If no link exists between two individuals, it does not
mean that autonomy is high. Broken links can be an indicator of a dysfunctional or
a sleeping network.
Hiding a network is not about eliminating links or nodes, or even making the
network hidden. A network cannot function properly if no links exist and nodes
are fully autonomous. The need to synchronize action, share information, manage
resources, to name a few, always means that the network cannot be fully hidden.
Instead, effective hiding operations of a network are about balancing the signal-
to-noise ratio, that is, embedding one network within many different networks is
one way to hide the main network. Imagine an individual who is extremely sociable.
The larger the number of people this individual meets, interacts with, works with,
collaborates with, and even walks with, the more difficult it will be to detect the real
network of interest through this particular individual. Every interaction between
this individual and another individual will be considered an observation, and every
observation is either important or noise. As the number of noise observations
increases relative to the important ones, the probability to detect a real observation
decreases.
The story does not end at creating many noisy links to hide the real links.
What is important here is not the word “many,” but the nature of the few.
If an individual meets with the gang of fraudulent employees twice each month,
this individual needs to make “twice-a-month” noise encounters more frequent than
single encounters. The encounters need to overlap and synchronize such that the
signal cannot be isolated from the large noise by which it is surrounded.
Such an operation for hiding networks is normally very sophisticated and
requires very advanced understanding of network dynamics and intelligence
operations.
References
1. Director, C.O.: Plans. Defence capability development manual. Tech. rep., Technical report,
3. Rhyne, R.: Field anomaly relaxation: the arts of usage. Futures 27(6), 657–674 (1995)
4. Zwicky, F.: Morphological astronomy. Observatory 68, 121–143 (1948)
5. Zwicky, F.: Discovery, Invention, Research Through Morphological Analysis. McMillan,
New York (1969)
Chapter 5
Case Studies on Computational Red Teaming
Abstract This chapter focuses on the utility of CRT. Three case studies are
presented in varying scale and complexity. The first case study is technology
focused, where CRT is used to evaluate three conflict-detection algorithms in the
domain of air-traffic management (ATM). The second case study presents a lab-
based experiment using a game between humans and computers. The objective
of the experiment is to test the impact of noise in the information presented to
the humans and the impact of deceptive strategies. The final case study presents
a significant exercise with air-traffic controllers (ATCOs). The exercise uses elec-
troencephalographic (EEG) brain data to perform CRT in real time. This case study
connects different enablers for CRT in an integrated manner.
5.1 Breaking Up Air Traffic Conflict Detection Algorithms
5.1.1 Motivation and Problem Definition
The main case study in this section will focus on the domain of ATM. The
description will be self-contained, avoiding technical jargon and focusing on the
problem in as much level of abstraction as possible. A more technical description of
the case study can be found in [5].
ATCOs have the primary responsibility for ensuring that aircraft are separated
safely as they fly. Aircraft should do not come too close to each other, as this
increases the risk of collision and fatal accidents. The airspace is divided into flying
zones, named “sectors.” We can assume for simplicity that each sector is managed
by an ATCO.
The continuous increase in traffic demands is causing an increase in the amount
of traffic. These increases require more tools to support ATCOs in maintaining
smooth and safe operations. Moreover, the increase in traffic demands has required
changes to occur in the cockpit, with more advanced technologies being added in
cockpits to provide pilots with better situation awareness of the air traffic.
To allow new air-traffic-management technology to be implemented and adopted
by industry, the technology needs to undergo significant testing and evolutionary
cycles. It can take many years before a technology that has been discovered in
a university-laboratory environment to be used in the real air-traffic-management

188 5 Case Studies on Computational Red Teaming
environment. One reason is the need for rigorous testing. The second reason is the
expense involved in changing the infrastructure to support the new technology and
the legacy systems needed to approve these changes.
In this case study, we need to red team a technology known as conflict detection.
In its basic form, a mathematical algorithm is used to decide on whether two aircraft
will come into conflict (i.e. fly closer to each other than they should) within a given
timeframe. The ATM industry performs such evaluations. Classical methods by
which to evaluate new algorithms such as these include mathematical analysis of the
behavior of the proposed algorithms, and running human-in-the-loop simulations
and fast-mode simulations. In the latter case, data on real traffic are collected, and
then replayed in a simulation environment. These simulations are conducted with
and without the technology enabled to reveal possible complications from using
the technology. These real-traffic data can also be manipulated to increase traffic
volume or change specific traffic characteristics that the test-and-evaluate engineer
believes that they are important for testing the technology.
This approach of testing has always been perceived to be effective because it
relies on real data and while the testing is simulated, it is as close as any simulation
can be to the real-world situation.
However, there is a vital drawback in this approach. The technology that we
are testing today is not the only change that will occur in the following 10–20 years,
which is when it will be implemented. In fact, many other concepts and technologies
are being tested today, but are being tested independently. In the case of air-traffic-
conflict-detection algorithms, the air traffic of today will not continue to be the air
traffic of 10–20 years’ time. More importantly, the characteristics of the sectors and
routes can change because of other concepts that focus on redesigning the airspace,
for example, dynamic sectorization [19] and the free flight concept [16, 17].
The implication of the discussion above is that using today’s data to test tomor-
row’s technologies can be a misleading approach. If conflict-detection algorithms
are tested with today’s air routes, even if we double the traffic on these routes, our
testing is biased. Tomorrow’s routes can have completely different structures and in
fact, as in the case of free flight, may not exist at all.
Such information necessitates a thinking approach that is not bound by the box
of today. CRT offers this opportunity.
5.1.2 The Purpose
The purpose of the CRT exercise in this case study was to identify the vulnerabilities
in three conflict-detection algorithms that at the time of the case study, had passed
advanced prototyping stages. The three algorithms were a nominal algorithm [11],
a probabilistic algorithm [14], and a worst-case algorithm [15]. These were param-
eterized in the following manner:
5.1 Breaking Up Air Traffic Conflict Detection Algorithms 189
• A conflict was defined in the standard manner for en-route air traffic. Two aircraft
are said to be in conflict if their horizontal separation is less than 5 nm and their
vertical separation is 1,000 ft or less.
• Each algorithm needs to project the position of each aircraft ahead of time to
predict future conflict. The look-ahead time window was set to 8 min because the
quality of prediction beyond this time limit degrades rapidly.
• An algorithm will raise an alarm that there is a conflict if the conflict will occur
in 5 min. This is known as the time to closest point of approach (CPA), which is
the time needed for two aircraft to be as close as they can get on their routes .
• Each algorithm will scan 60 nm of the environment every 5 s to detect whether
there is a conflict. These are known as the probe range and frequency,
respectively.
Given the above parameterization, the three algorithms were ready to be tested
through CRT.
5.1.3 The Simulator
To evaluate these algorithms, we needed a simulation environment that could mimic

reality as much as feasible. Fortunately, we have been working in the air-traffic
domain for some time and we had our in-house high-fidelity simulation environment
known as ATOMS [4]. This simulation environment was designed to simulate traffic
in any airspace, with any sector and traffic characteristics that a user can select.
This provided us with the flexibility required to evaluate the three conflict-detection
algorithms.
ATOMS also has multiple models inside it that allowed us to run the simulation
on different levels of fidelity. For example, we could choose more abstract algo-
rithms to calculate fuel burn from aircraft very rapidly, or we could switch to high-
fidelity aerodynamic algorithms to have more accurate fuel calculations but more
slowly. This ability to switch between different fidelities was not investigated in this
case study, but we have investigated its advantages in other publications [6–8].
5.1.4 The Challenger
To challenge three advanced conflict-detection algorithms, we had two options. The

first option was to estimate the boundary of their performance and expose each of
them to scenarios that fall just outside their behavioral boundaries. However, we had
no means by which to estimate these boundaries directly. Moreover, the behavior of
these algorithms was defined with many parameters under many different possible
contexts (scenarios) that we needed to understand as part of the CRT exercise.
Therefore, we had to resort on the second option of blind probing with optimization
to discover the boundaries of these algorithms’ behaviors.
The idea of blind probing is to form an optimization problem in which the

parameters that we need to optimize are the parameters that define the scenario
space. The objective of the optimization is to find scenarios where the algorithms
fail most. In a classical optimization, this idea of blind probing can be misleading
because one may end up with a worst-case analysis, that is, the scenario that can
exist where the algorithm will fail extremely badly. However, this is not the objective
of this exercise. Our objective is to estimate the boundary of behavior, that is, the
performance envelope for each of these algorithms. Therefore, the optimization
objective is not to find the most disastrous scenarios alone, but those scenarios
encountered along the way, including scenarios in which the algorithms did not fail.
Evolutionary-computation techniques provide such an advantage. The first pop-
ulation of solutions is generated at random, where it is very likely that the scenarios
are easy for each algorithm. These easy scenarios fall nicely inside the performance
envelope. As evolution moves from one population to another, the scenarios become
more difficult, and slowly we move from inside the performance envelope to outside
the envelope until we reach scenarios that increasingly cause the algorithms to fail.
If the initialization of the evolutionary-computation method generates scenarios that
are too difficult, we can easily change the objective function so that the search
algorithm moves from outside the behavior space to inside.
By collecting all solutions encountered through the evolution, we can then apply
data-mining techniques such as decision trees or decision rules to discover the
boundaries where the algorithm switches from being effective in detecting conflicts
to being ineffective in detecting conflicts.
When the boundaries have been discovered, we have solved the challenge prob-
lem. We can easily probe these algorithms with scenarios that can challenge their
behavior. These challenging scenarios are those that the designers of the algorithms
would have thought were solvable by the algorithms, but were not because they were
shown to exist slightly outside the true performance envelope.
We will start with the optimization component then describe the data-mining
component.
5.1.5 Context Optimizer
The first step in this optimization is to decide how to represent a scenario. To

establish and run a single simulation in ATM, we usually require very large amount
of data, including data on atmospheric conditions; aircraft aerodynamic parameters
and aircraft data (those are the key parameters associated with a specific aircraft
to provide accurate estimations of the performance of this particular aircraft);
air routes; airspace constraints; special use (restricted) airspace; and operating
procedures and rules.
Most of these massive amounts of information become an integral part of the
simulator itself. Nevertheless, as designers and red teamers, we need to be conscious
of these data because they can hold detrimental assumptions that need to be
validated. It is also important not to be too ambitious and allow all of these data
to change in a scenario because the more variables we allow to change, the more
time and resources we need to run simulations and optimize.
To strike the right balance between what goes into the simulator for initialization
and what goes into the scenario as parameterization of context is a problem-
dependent task. The success of this balance will rely on the expertise of the analysis
team.
In this case study, our focus was on conflicts. The primary purpose of the CRT
exercise can be restated as the following: “how to condition conflicts such that these
conflict-detection algorithms fail.” It is the keyword “condition” that will make us
focus on what we need to optimize by asking what is the minimum amount of
information we need to know to condition a conflict.
To condition a conflict, a mathematical analysis needs to be performed to
understand the mathematical building blocks for a conflict. From this mathematical
analysis, three key parameters derive the formulation of a conflict: distance, angle,
and phase of flight. A conflict is defined when two aircraft violate the minimum
separation distance at the CPA. This distance can be violated in the vertical and/or
horizontal dimensions. Thus, we have two key parameters: horizontal separation at
CPA and vertical separation at CPA.
The conflict angle at CPA is a third parameter that defines the angle at which the
two aircraft are approaching each other. This angle is crucial because it contributes
to the level of fatality of a conflict when comparing, for example, head-to-head and
head-to-tail conflicts.
The final two parameters represent the phase of the aircraft at the time of conflict.
One aircraft may be climbing to a higher altitude, descending to a lower altitude, or
maintaining its cruise phase. Thus, we have two parameters, one for each aircraft.
Each of these parameters can take one of three categorical values (climb, cruise, and
descent).
These five parameters (horizontal separation at CPA, vertical separation at CPA,
conflict angle, phase of first aircraft, and phase of second aircraft) directly define the
characteristics of a conflict. Therefore, performing the optimization directly using
these five parameters would be the most compact representation we can create to
define a conflict. These five parameters exist for one pair of aircraft. In a scenario
with 100 aircraft separated into 50 pairs, the total number of parameters that we
would need to optimize would be 250. This is a large-scale optimization problem,
especially because of the black-box nature and the high level of nonlinear interaction
that occurs within the simulation environment. However, it is the most compact
representation of the problem. Any approximations beyond this level can create
hidden holes in the analysis that pop up later as vulnerabilities.
Thus far, the conflict parameters have been defined directly, but a traffic scenario
is not defined with conflict parameters. A classical air-traffic scenario is defined in
its simplest form by flight plans that describe for each aircraft its origin, destination,
take-off time, and route as a minimum set of information. Therefore, the conflict
parameters defined above need to be transformed to flight-plan information. This
process requires a mathematical transformation known as “backward integration.”
The idea is to take the position where the conflict will occur, along with the above
five parameters, then propagate back the position of each aircraft to construct a route
that generates this conflict from a particular origin.
Through backward integration, the 250 parameters are used to generate 100
flight plans that guarantee a minimum of 50 conflicts in the scenario. These 50
conflicts are generated with the characteristics defined by the parameters. Naturally,
it is expected that more unplanned conflicts will arise from the interaction of these
aircraft. However, if no other conflict arises, there are at least 50 conflicts in each
scenario. Evolutionary computation is used to optimize this vector. A chromosome
contains 250 variables. Each chromosome represents the parameterization of an air-
traffic scenario with 100 aircraft.
To discover the vulnerabilities in the three algorithms, the problem is to search
for those scenarios for which the algorithms will fail to detect conflicts. The failure
of a conflict-detection algorithm is measured with two metrics: missed detects
and false alarms. These two metrics are generally in conflict. As the algorithm
attempts to detect every conflict, it generally needs to lower its threshold for the
classification of a conflict. This reduces missed detects but increases false alarms
because the lower threshold will categorize the non-conflict situations that are
close to the boundary that defines a conflict as a conflict. This situation suggests a
multi-objective formulation with two objectives: maximization of false alarms and
maximization of missed detects.
The second version of the non-dominated sorting genetic algorithm
(NSGA- II) [10]. NSGA-II is a good evolutionary multi-objective optimization
algorithm. It may not be guaranteed to converge every time to the global optimal
non-dominated set, but in our case, this is an advantage. By running the algorithm
many times, we can generate paths in the search space, some of them focus on local
non-dominated sets, while others focus on the global non-dominated sets.
A population size of 50 scenarios, evolving over 100 generations and the process
repeated 10 times with different seeds would generate 50,000 scenarios. Each
scenario has at least 50 conflicts; thus, each algorithm was probed with more than
2,500,000 conflicts. This massive amount of data generated by the evolutionary-
optimization algorithm was then fed into the data-mining algorithm to discover
patterns of failure.
5.1.6 Context Miner
Given of the large amount of data available, and the need to understand the per-
formance boundaries of each of the three conflict-detection algorithms to estimate
their performance envelope, we needed to rely on an efficient data-mining technique
that was transparent. Transparency here means that the output of the data-mining
technique needs to be simple and in human-understandable language so that the
output can be verified by the designers of the three algorithms, as well as any air-
traffic organization interested in evaluating these algorithms.
Fortunately, in this case, the author was simultaneously heading a different

project on designing efficient rule-mining algorithms for network-intrusion detec-
tion (NID). While the domain of application was different, the requirements were
the same: the algorithm needed to be able to handle a large amount of data and
generate transparent rules that could be judged by a domain expert. Within NID,
there was an extra requirement that the algorithm needed to scale to be able to work
in a real-time environment. The data-mining algorithm relied on classifier systems,
a group of methods in evolutionary computation where a population is a group of
chromosomes, each chromosome is an “IF : : : THEN : : :” rule. The population
attempts to adapt the rules to match the data; thus, over time, the rules correctly
classify the data. While classifier systems are intuitive in their design principles,
they have been known to be very slow with very large number of parameters. These
were the two problems we needed to solve before a classifier system could work for
NID.
The resultant data-mining algorithm was inspired by how virus protection
software rely on a virus-definition file that contains the signatures (rules/patterns)
of different viruses. As new viruses become available, they are detected manually
by domain experts, and their signatures are added to the virus-definition file. The
resultant data-mining algorithm worked in the same way, except that new patterns
were automatically detected by the classifier system. The technical details of the
algorithm can be found in [18].
Figure 5.1 demonstrates the architecture of the data-mining algorithm. At first,
the signature store (rule set) is empty. The data stream is directed to the classifier. As
the classifier learns IF : : : THEN : : : rules from the data, and after these rules become
reliable in their ability to predict their class, they are added to the signature store.
When the data pass through the signature stores, if a rule matches the incoming data,
it is used to predict the class; otherwise, the data are sent to the classifier. Over time,
increasingly fewer data will go to the classifier because the signature set becomes
better at detecting the class of incoming data.
Fig. 5.1 Signature extraction classifier

When using this classifier system in a real-time environment, the relationships

in the incoming data can change; therefore, old learned rules become obsolete. The
signature store will detect this when the accuracy of a rule goes down. In this case,
the rule is deleted, allowing the data that used to match it to pass through to the
classifier system. This mechanism allows the classifier system to learn new rules
that adapt to the change in the environment.
5.1.7 The Response
The data-mining algorithm allowed us to learn when each conflict-detection algo-

rithm has failed, that is, the performance envelope for each conflict-detection
algorithm. The response of the CRT exercise will be a synthesis of these results
in the form of an ensemble algorithm.
In an ensemble algorithm, a group of algorithms is used instead of any of them
in isolation. This allows flexibility in the ability of the system to respond. Since the
data-mining algorithm identified the rules of when each algorithm will fail, these
rules are in essence the environmental states that when occurring will mean an
algorithm will fail. By collecting these environmental states in a repertoire, we can
predict which algorithm will fail; therefore, we can predict simultaneously which
algorithm we should use.
This idea allowed the use of the three algorithms because each of them fail
differently. The response of the red team, therefore, is the creation of a new
algorithm that is comprised of all three and is more robust than any of the three
algorithms in isolation.
This exercise was successful in two aspects. First, the CRT exercise identified
the vulnerabilities of the algorithms in a transparent representation and language that
could be validated by others who are not specialized in the field. Second, the exercise
did not stop at the level of discovering vulnerabilities, but was completed with a
proposed fix of these vulnerabilities by allowing the algorithms to work together as
an ensemble.
5.2 Human Behaviors and Strategies in Blue–Red

Simulations
Section 1.7.4 presented blue–red simulations. The next generation blue–red simu-
lation systems allow for representation of human behavior using behavioral models
within the simulation environment. Examples of these systems include ModSAF [9]
and OneSAF [20]. Most of these behavioral-based models rely on studies that
5.2 Human Behaviors and Strategies in Blue–Red Simulations 195
identified how humans react in certain situations. For example, a behavioral model
can represent human walking speed as a function of the load the human is carrying,
the terrain, fitness level of the human, and how tired the human is. More difficult
areas in human modeling are the considerations of the type of planning strategies
a human uses when faced with certain situations, even simple situations such as
chasing one another.
In this case study, we needed to red team against a human to uncover human
strategies in simple goal-following tasks. These types of strategies may sound
overly simple. However, as soon as we add dimensions of deception and noisy
information, we are faced with many options that can lead to different behavioral
models. A need arises to provide evidence to support these different models, and
to provide information about the context in which to apply them for blue–red
simulation environments. Moreover, once it is possible to design a methodology
to design human-based experiments to study a pool of strategies that the human
follows in certain contexts, a second need arises for automatic methods to conduct
this type of analysis autonomously given the large volume of data.
The task was very simple, and was inspired by Tom and Jerry (the cat and mouse).
Tom (blue agent) attempts to catch Jerry (red agent), while Jerry attempts to escape
Tom. In this exercise, Tom is fully autonomous, following simple fixed strategies,
while Jerry is controlled by a human.
The situation is an abstraction of a thief being followed, a software agent in
a cyber space attempting to follow automatically a human intruder, or a group
of autonomous aircraft following another group with ground-controlled or air-
controlled pilots. In many of these problems, the physical space where maneuvers
occur can be isolated (to some extent) away from the strategy followed by humans.
Modeling the environment with this level of simplicity has the advantage of
eliminating the unnecessary complexities that exist in the real world and do not
necessarily contribute to the real phenomenon that we wish to study. Moreover, it
allows the analyst to focus on the question, and develop the necessary tools without
worrying about irrelevant details.
Within the decision-support and information-fusion literature, the above
approach is valid. The Tom and Jerry game is a high-level representation of the
context or situation under investigation.
To explain with an example, assume that the two teams are two companies.
The behavior of these two companies is defined by two factors: investment in
production and investment in marketing. We can see these two factors forming a
two-dimensional space, with Tom and Jerry representing the current distribution of
the budget between production and marketing. Tom is attempting to copy Jerry by
following him in the environment, while Jerry is trying to maneuver to escape from
Tom. Both would attempt to deceive each other, while their information about each
other also contains a level of noise. This level of abstraction allows us to focus on
the key phenomenon we wish to investigate without becoming overwhelmed with
the situation.
5.2.2 The Purpose
The main purpose of this CRT exercise was to establish a CRT methodology to
identify vulnerabilities in human strategies within a simple reflex task.
The team size was reduced to one; thus, we only had a single agent on each side.
The blue agent was autonomously controlled through predefined heuristics written
in a scripting language. The red agent was controlled by humans in one set-up, by
a machine in a second set-up using scripts, and by a machine in a third set-up but
using a machine-learning model (artificial neural network) that was trained in the
human data produced in the first set-up.
The human-based experiments relied on 34 (19 females and 15 males) human
subjects. The scenario space involved 20 scenarios. Each scenario was played twice
by each human player; thus, each human played 40 games. The sequence was
shuffled to guarantee that each human player received a random sequence that was
not played by any other players.
Each scenario was defined by two factors: sensorial capabilities (input to blue)
and behavioral capabilities (output of blue). These two dimensions represented the
situation awareness of blue and the deceptive range of blue, respectively. More on
these parameters will be discussed in the simulation section.
5.2.3 The Simulator
The simulation environment was very simple two-dimension grid. This simulator
acted as a game engine for the human experiments, whereby the human interacts
with the simulator, and as a simulator for the machine experiments.
The grid was a bounded 640 640 cells, involving a wall of eight cells at
each side. This left an environment of 624 624 within which the agents could
move. While the space was modeled as a grid to match the resolution of the
space on the computer screen for the players to play, movements were described
in a continuous domain using a fixed speed of step with an angle between
Œ180ı ; 180ı . Therefore, agents had the same speed so that they did not take
advantage of speed differentials. They only controlled the travel angle, which was
b for the blue agent, and r for the red agent. The simulation environment used
discrete time simulation. Nevertheless, the time taken by the human between seeing
the blue agent moving and responding with a move was recorded and analyzed as
reaction time.
The true challenger in this scenario is the red agent because the blue agent acts
using a predefined strategy at the beginning of each simulation independently of
5.2 Human Behaviors and Strategies in Blue–Red Simulations 197
the strategy the red agent is following. Meanwhile, the red agent is expected to
approximate blue’s strategy to escape from blue.
The blue agent attempts to capture the red agent by traveling straight to where
the red agent is moving. The blue agent also applies deceptive strategies, whereby it
may deviate from this straight line, coaxing the red agent to think that it is traveling
away from them. This deceptive strategy allows the blue agent to deviate for some
time then move head on to the red agent.
The blue agent produces an action based on the most recent position it receives on
red. The sensorial information for blue is modeled using two parameters: frequency
of receiving information and noise in received information. As such, when there
is a large time lag in the information that blue receives, the blue agent acts during
this time on a historical position for the red agent. Similarly, if the information the
blue agent receives has a high level of noise, the blue agent will act on a perturbed
position for red.
The red agent is controlled by a human or a machine. Information is commu-
nicated to the red agent in the form of information display-in the case of human
control-or in a digitized manner-in case of machine control. In the former, there
are two classes of scenario to which the human is exposed. One class is termed the
“known–known” and the other is termed the “known–unknown.” In the former class,
the red agent sees where the blue agent is, where the blue agent thinks/perceives the
red agent is, and where the red agent truly is. In the latter case, the red agent only
sees where blue and itself are in the environment, without knowing what the blue
knows. It is hypothesized that in the former class, allowing the red agent to know
where the blue agent thinks it is, will aid the red agent in escaping from the blue
agent.
5.2.5 Behavioral Miner
At the end of all experiments, every trajectory generated by the blue and red agents
in every simulation was recorded. These trajectories are the expression of the red
agent’s strategy in actions. While a trajectory here is simply a sequence of movement
angles that the red agent decided to produce to escape from the blue agent, this
sequence contained a significant amount of information.
To mine this information, the sequence of angles needed to be converted into
information on how a player employs a specific strategy while playing. A strategy
here was the sequence of actions followed by a player with the intention to achieve
a goal. Since the angles on their own do not reveal a great deal of information about
the intention of the player, we needed to calculate the first and second derivative
of the angle. The change in an angle and the acceleration of the change revealed
information on player’s intentions.
We were also able to calculate all moments in the sequence of angles, but this will
be a computationally expensive exercise. Therefore, the behavioral miner analyzed
three pieces of information: change information calculated from the sequence of
angles, reaction-time information, and the scores of different games.
5.2.6 The Response
These sets of experiments produced some fascinating analysis. In particular,

the strategies employed by the human red agent were very limited compared to the
strategies employed by the machine red agent. The machine tended to produce more
creative strategies in response to the blue agent’s actions in different set-ups. These
results demonstrated a vulnerability in human behavior, whereby the humans tended
to behave in the same way, by reacting to the blue agent without differentiating in
their behavior between the known–known and known–unknown scenarios.
In essence, the extra information provided to the human on where the blue agent
thinks the red agent is located seemed to be ignored by the human. The human
was more focused on the task of escaping the blue agent, while ignoring the extra
available information.
The previous result was supported by the similarity in reaction time in all
scenarios. We would have expected if the human considered the extra information,
the reaction time would be slower in the known–known scenarios. However, this
was not the case, supporting the conclusions above that the human ignored the extra
information and focused on the escape behavior.
Despite the simplicity of the experiments, this case study provided a set of data-
mining tools to analyze behavior information in non-kinetic spaces.
5.3 Cognitive-Cyber Symbiosis (CoCyS): Dancing

with Air Traffic Complexity
The final case study presents a CRT exercise that brought together many of the
computational tools and methods discussed in this book. The initial concept and
results are discussed in [1–3].
Operators in safety-critical domains hold a great deal of responsibility for
maintaining the safety of the environment. In the air-traffic-control domain, an
ATCO would be managing one dozen or two dozen aircraft simultaneously, and
be responsible for the thousands of lives on board of these aircraft. As the traffic
demands increase, and changes are introduced in air routes and structures, the
complexity of the environment simultaneously increases.
5.3 Cognitive-Cyber Symbiosis (CoCyS): Dancingwith Air Traffic Complexity 199
The primary negative risk in a brain-traffic-interface task within a safety-critical

domain is the escalation of air traffic complexity to the point that the human ATCO
cannot accommodate.
Managing complexity using CRT is a non-trivial task. This project needed to
bring together different concepts and systems from different fields to design a CRT
system able to red team complexity in real time.
By monitoring the environment, estimating how the complexity of traffic is
changing, and monitoring the controllers, as well as estimating how their cognitive
complexities are changing, CRT can generate actions to challenge and steer any of
these complexities to keep them within their operating envelope. That is, when the
humans are less engaged or over-engaged (moving outside their normal operating
envelope), the CRT system needs first to approximate where this boundary is, and
identify which actions can steer the state of the controller to cross back the boundary
to the normal operating envelope.
Approximating the boundary of the cognitive complexity of a human is a task
that is far from trivial. Every human is different, and every day can come with its
own issues that makes the humans perform differently. The in-situ nature of the
environment makes it even more difficult. It is not possible to stop the traffic and
probe the human with questions or put the human through lab experiments to extract
data on their cognitive performance.
Fortunately, the above difficulties define an opportunity. Within this context,
the ATCO is assumed to be focusing on the air-traffic environment. Under normal
circumstances, this assumption is fair, and can be considered a default assumption
in this domain. Even if the ATCO arrives to a shift in a state of high stress because
of a fight they had at home on that day, the assumption is valid: that no other factor
outside the ATCO’s working conditions will be added once they begin their shift.
Therefore, there is no assumption on the state of the ATCO at the beginning of their
work shift; the assumption is that once the ATCO begins their shift, the only thing
to which they are exposed is the working task they are being assigned. If during a
shift the ATCO receives a telephone call with a major issue that is not work related,
it can impact their performance, but there are protocols in place to compensate that
in ATC.
This discussion leads to a question: Can we see the traffic with all its complexity
as the cause for a change in an ATCO’s cognitive complexity? If we can, instead
of approximating the performance envelope of the ATCO through direct cognitive
indicators alone, we can approximate this performance envelope through the task
itself.
To proceed with this idea, the following simplified scenario can be considered.
Brain data can be measured easily from an ATCO with today’s technology. In
this project, we focused on EEG data, which measure the electrical potential of
the neurons in the human brain. We used harmless non-intrusive sensors that were
placed on the scalp. Indicators were extracted from the brain data on brain activities
in real time and without any interference with the ATCO’s job.
Simultaneously, the traffic can be analyzed in real time, with indicators extracted
from the traffic. However, the question is how to link the air-traffic indicators and the
brain-data indicators? We need some tasks that are concrete and simple to provide
us with the confidence that the changes we notice in brain activities are due to the
task and nothing else.
Let us now assume we have two tasks; we will call them baselines. The second
task is a copy of the first task with the addition of one subtask such as counting.
Differences in brain activities between the two tasks, assuming everything else is
constant and controlled for, can be contributed to counting. That is, if we somehow
subtract the two tasks from each other, and we somehow also subtract the brain
signals from each other, in a strict sense, the first result is taken to cause the second
result because the first result occurred before the second result, and under the closed-
world assumption of this test, there are no other alternative explanations for the
second result except the first result.
In this hypothetical example, one non-trivial assumption is that everything else is
constant and controlled for. In a real-world environment, this assumption is very
unrealistic. However, let us now imagine we can repeat this process many times,
and on a continuous basis, we will obtain many differences between tasks and many
corresponding differences between the brain data that we can correlate. Given the
continuity of the data collection, differences in tasks are repeated, allowing for
multiple measurements for similar phenomena that we can average or for which
we can perform some other statistical tricks to eliminate the extra factors for
which we did not account. These differences become the probes that we use to
approximate the performance envelope of the complexity in the environment.
We are then left with the task of designing actions that can maintain the environ-
ment inside the envelope every time it attempts to cross the complexity boundary.
Given the dynamic nature of the environment, these actions need to stabilize the
environment over time. That is, this can be considered a classical control system that
attempts to stabilize this environment. Unfortunately, the task is too complex to use
classical control theory or perform system identification in advance to understand
fully the states of this system or the constraints on these states.
CRT performs this control function using tools from simulation, optimization
and data mining. By having a simulation running in the background shadowing the
traffic in the foreground, we can use this simulation to perform impact analysis.
The optimization search engine can propose solutions and request the simulation to
project the impact of these solutions in the future. Only solutions/actions that are
more likely to stabilize the environment are selected as potential actions to steer the
environment back inside the complexity envelope.
We are now left with one last issue to resolve: how to implement this action.
Within an air-traffic environment, we cannot accept the risk that the system can
produce an action that (because of the approximations in the simulated environment)
can expose the safety-critical system to negative risk. There are a number of
practical solutions for this, but these are too technical and outside the scope of
this summary of the case study. For simplicity, we can assume that the action can
be communicated to the ATCO through a human intermediary. This human expert
can transfer the action in a manner that manages this risk. The implications of this
protocol are also outside the scope of this summary.
5.3.2 The Purpose
The main purpose of this exercise is to anticipate by forward projection and

challenge the task and cognitive complexities as they arise in a safety-critical
domain. The concept of a challenge here is more important than ever. This exercise
needs an explicit model of a challenge that approximates the performance envelope
of the ATCO and the complexity in the environment.
Figure 5.2 depicts the primary concept underlying this CRT exercise. By
simplifying the demands on the mental resources into three categories of low,
engaging and high; a human is said to be facing cognitive-load problems when
these demands are either low (causing a state of boredom), or high (causing a state
of mental exhaustion). Both boredom and mental exhaustion will be referred to as
“risky cognitive states.”
For example, when the risk in the environment of a possible collision between
two aircraft is relatively small, the human cognitive state can escalate this risk to
a high level if the human is in a risky cognitive state. For example, a conflict may
exist 10 min into the future, but the controller is overloaded with communication
with pilots on a different conflict. While the conflict that is 10 min into the future is
usually solved with some maneuver strategies, the cognitively overloaded controller
is distracted, which results in a delayed response that can make the low-risk conflict
a high-risk conflict.
Fig. 5.2 A pictorial representation of the cognitive balance required for users in safety critical
jobs
As such, the purpose of this CRT exercise is to estimate the boundary constraints
of this interaction between the environment and the controller’s cognitive state,
with the aim of dynamically designing strategies to maintain the controller in the
“engaged” zone.
The players in this exercise were predefined because of resource constraints. It
was not possible to plan for having many ATCOs. Four ATCOs and two pilots were
available for the study at different times over the period of 1 week.
5.3.3 Experimental Logic
As discussed in the previous section, contrasting indicators from the traffic with
indicators from the controller’s brain provided a means to identify the boundary
constraints for this problem. Figure 5.3 demonstrates the characteristics of the brain-
traffic interface. On the controller side, the controller’s environment represents the
wider environment within which the controller is embedded. In this environment,
the controller is performing many tasks. Some tasks are non-work related such as
thinking about their partner; some tasks are work related but not traffic related such
as thinking about their boss; while other tasks are work and traffic related. These
latter tasks represent the core job of the controller.
As the controller attempts to manage safety, the controller performs three main
functions of monitoring, planning, and action [12, 13]. These functions require
Fig. 5.3 Brain traffic interface

cognitive processes such as use of the working memory or shifting attention.

The constrained cognitive capacity of the human is usually abstracted in the form of
limited cognitive resources. As the human employs cognitive processes to perform
the controller’s functions, cognitive resources are depleted.
EEG data can monitor the cognitive processes; this allows us to estimate the rate
of depletion in cognitive resources.
Concurrently, and on the task side, there are many factors impacting the air
traffic: from the mix of aircraft and the structure of the airspace to the time-
variant factors such as traffic flow and weather. The literature of air-traffic control
has metrics to evaluate the complexity of the traffic, as well as the complexity of
the controller’s tasks. These complexity metrics provide indicators of the changes
occurring within the air-traffic environment.
The user interface (visual and auditory cues) that the controller uses to receive
information about the airspace and send commands back to the air-traffic envi-
ronment captures all uncertainties in the situation. There are uncertainties that are
characteristics of the traffic itself such as the uncertainty in estimating the position
of an aircraft over time. However, there are uncertainties that stem from the design
of the user interface such as decisions made by the designer on how to represent
cues, and which information to display or hide at which point in time. Therefore,
the user interface can be considered the interface of uncertainty where the Cyber
and physical worlds interact with the controller’s cognitive world.
Figures 5.3 and 5.4 bring these complex dynamics together in a pictorially to
illustrate the level of complexity required to manage the brain-traffic-integration
problem. As stated, “the devil is in the details,” meaning that this level of conceptual
Fig. 5.4 Brain traffic interface loop

explanation is sufficient to understand holistically what the system is doing, but the
actual complexity in making this system a reality rests within the technical details
that are left out to maintain the presentation at an appropriate level for a wider
audience.
5.3.4 The Simulator
While real-life ATCOs and pilots were used in this exercise, the experimental nature
of CRT means that the traffic itself was simulated using a complex high-fidelity
simulation environment developed by Eurocontrol. For the sake of our discussion,
we will term this simulator the “real traffic:” the word “real” here needs to be
interpreted as “realistic.” This is to differentiate this simulator from the simulator
used within the CRT environment, which we will term the “shadow simulator.”
The shadow simulator was primarily used in the background to “shadow” or “run
in parallel with” the real traffic. The purpose of the shadow simulator is as before:
it becomes the CRT tool to mimic the real environment. It enables the CRT exercise
to project ahead, and undertake consequence and impact analyzes as required.
The extra tasks that the shadow simulator needs to conduct means it needs to be
very fast. To achieve this speed, the shadow simulator does not need to duplicate
everything in the real traffic. There are many functions and information that are
calculated in the real operational environment that are essential for the controller to
conduct their job. The shadow simulator needs to calculate only those calculations
necessary for the CRT exercise. This is achieved by selecting an appropriate level
of abstraction with which the shadow simulator can work; thus, allowing it to run
in fast mode and at a much faster speed than the real environment to conduct look-
ahead analysis.
Even with many tricks in speeding up the shadow simulator, it is not very
easy to bring the speed to the level required for the CRT environment. Within
this experiment, the exercise is running in a real-time mode. Within a minute or
two, thousands of simulations may need to be conducted by the shadow simulator.
To enable this number of simulations to occur, the shadow simulator can run on
multiple cores, or even on a computer cluster. In this exercise, we did not need to
use a computer cluster because of the team’s vast expertise in designing efficient
simulators and the analysis that was conducted to ensure that the simulator included
all information needed for the exercise and nothing more.
5.3.5 The Miner
Multiple pieces of data analysis are needed in an exercise such as this. First, brain
data are signals that are measured in real time. Second, the traffic itself needs to be
assessed in real time.
The brain data were measured from 21 sites on the scalp, including two
references to transform the measurements into voltage information, with each site
sampled at 2,048 Hz, that is, 2,048 readings every second for each site. This is a
large amount of data, amounting to approximately an array of 258,000 real numbers
per min. Using 128 sites would increase these data with an order of magnitude.
The raw signals needed to be processed and transformed into high-level cognitive
indicators in real time. Designing the cognitive model to support this analysis
required many experiments before the actual exercise to ensure that the model was
adequate for the analysis. The details of the cognitive model are too technical for
the purposes of this chapter, but it is very important to note here that a great deal
of work is involved with this modeling, and it requires significant interdisciplinary
experience. This cognitive model cannot simply rely on an understanding of human
cognition or neuroscience, it requires experience and knowledge of air traffic,
optimization, simulation, and the overall CRT tools and concepts.
The amount of traffic data is much less than the amount of brain data. Similar
data-analysis architecture is used to analyze the traffic, extracting indicators such as
the number of aircraft within the sector, crossing angles between pairs of aircraft,
and CPAs for pairs of aircraft.
5.3.6 The Optimizer
The role of optimization within this exercise was to find actions that can steer
complexity back to the engaging region. This meant that when the complexity was
too high, it needed to be decreased, and when the complexity was too low, it needed
to be increased to keep the controller engaged.
The list of actions was predetermined and designed to be realistic. The optimizer
was called every 2 min and a decision was needed within 30 s. After an in-depth
analysis of the problem, it was decided that an exhaustive search strategy was
possible. Thus, a complete search was conducted by evaluating the impact of every
action on every aircraft using the simulator. This allowed the optimizer to select the
optimal actions to generate the desired impact.
The challenger in this exercise is composed of multiple components. One com-

ponent is concerned with the estimation of cognitive boundaries for the user’s
engagement in a task. This component approximates this boundary and continues
to monitor the state of the user’s engagement level.
The second component estimates the complexity boundary of the traffic itself,
and continues to monitor this boundary. The complexity boundary of the traffic was
pre-estimated before the actual CRT exercise. The reason was that we were using a
scenario within a simulation environment. Therefore, we were able to analyze this
scenario before the exercise.
In the case of the human-engagement level, pre-estimation of the complexity
boundary was not possible because of the variations of human subjects. Therefore,
the estimation for engagement level needed to be executed dynamically within the
exercise itself.
The third component of the challenger defines the rules and actions of challeng-
ing. The rules’ subcomponent is built on top of the first two components to decide
whether complexity is crossing the boundaries. If it decides that complexity is
crossing from one subspace that is desirable to an undesirable subspace, the second
subcomponent manufactures the required response through a negotiation process
with the optimizer. This negotiation process does not only rely on the best solution
found by the optimizer, but also on the history of previous actions taken, and the
integrity constraints to ensure that actions are realistic and consistent with the traffic
environment and operating procedures.
5.3.8 Experimental Protocol
The primary set of experiments was conducted over a working week with four
ATCOs and two pilots. Figure 5.5 summarizes the experimental protocol for each
subject. Each subject underwent a debrief and initial training or re-familiarization
with the environment because they all had experience in previous experiments with
similar tasks. A demographic survey was conducted followed by two cognitive tests,
mainly for the purpose of studying differences and similarities of the subjects. Each
subject conducted four air-traffic-control sessions.
Figure 5.6 summarizes the protocol followed in each of the four sessions. Each
session lasted for 75 min, and included one ATCO and the two pilots. The ATCO
will be referred to as the “measured player,” while all other team members within
the experiment, including analysts and pilots, will be referred to as the “unmeasured
players.” The position where the ATCO is sitting will be referred to as the “measured
position,” while other positions are the “measured” positions.
The “measured” refers to the process of measuring data from the human and
the traffic. The purpose of this CRT was to red team complexity with the ATCOs;
therefore, the ATCOs were the only subjects to be analyzed in this experiment.
The traffic scenario needed to be tested under the four following conditions: situ-
ations in which CRT was not used; situations in which CRT was used and relied on
traffic complexity alone; situations in which CRT was used and relied on cognitive
Fig. 5.5 Protocol for each ATCO/subject tested during the exercise
Fig. 5.6 Protocol for each of the 16 sessions conducted during the exercise
complexity alone; and situations in which CRT was used and relied on both traffic
and cognitive complexity. Each ATCO underwent all four conditions over four
sessions. The sequence of conditions was shuffled for each ATCO.
Subjective assessments where conducted at the beginning of each session in the
form of a survey, and at the end of each session in the form of a set of rating
questions to assess the perceived complexity of the scenario.
Human-brain data vary between different humans and even for the same human
at different times of the data and in different situations. As such, at the beginning of
each session, each ATCO was monitored for 6 min to collect baseline information
about the human at this particular point of time. This information was divided into
monitoring for 2 min with eyes closed and the person attempting to relax; 2 min
with eyes open and the person attempting to relax; and 2 min with eyes open and the
person attempting to solve a computational task. The same 6 min where repeated at
the conclusion of the traffic scenario.
5.3.9 The Response
The exercises were a great success. Some of the initial technical results were
published in [2]. The CRT environment, along with the cognitive models, were
successful in adapting the environment to the traffic and ATCOs’ states.
The most significant and domain-specific finding was that mathematical models
that test the complexity of a situation, such as air-traffic states, were insufficient
in approximating the cognitive complexity of the ATCO. There are legitimate and
logical reasons for this, besides the evidence extracted from these experiments. First,
the complexity within an air-traffic environment is built up over time.
Therefore, models that rely on a snapshot of the traffic are only examining what is
happening now, rather than how the situation has been evolving since the beginning
of the shift. Therefore, one may not notice an increase in complexity over time, but
continuous exposition of a human to a situation over time can create implications
of cognitive complexity for that human. Therefore, complexity measures need to
integrate their findings over time.
Second, each human is different. For example, humans differ in their skills,
problem-solving strategies, and perceptual abilities, all of which impact human
perception of complexity and the real impact of the air traffic on the human
cognitive processes. Therefore, a single mathematical measure that only considers
the traffic/context without the human managing the traffic/context can be very
misleading in judging a situation.
The findings of this exercise indicated the need to design complexity metrics that
are dynamic; interested readers can visit [3] for more details on this issue.
References 209
References
1. Abbass, H., Deborah, T., Kirby, S., Ellejmi, M.: Brain traffic integration. Air Traffic Technol.
Int., 34–39 (2013)
2. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: Augmented cognition using real-time
EEG-based adaptive strategies for air traffic control. In: International Annual Meeting of the
Human Factors and Ergonomic Society. HFES, SAGE (2014)
3. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: The computational air traffic control
brain: computational red teaming and big data for real-time seamless brain-traffic integration.
J. Air Traffic Control 56(2), 10–17 (2014)
4. Alam, S., Abbass, H.A., Barlow, M.: Atoms: air traffic operations and management simulator.
IEEE Trans. Intell. Transp. Syst. 9(2), 209–225 (2008)
5. Alam, S., Shafi, K., Abbass, H.A., Barlow, M.: An ensemble approach for conflict detection in
free flight by data mining. Transp. Res. Part C: Emerg. Technol. 17(3), 298–317 (2009)
6. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Computational red teaming for
correction of traffic events in real time human performance studies. In: USA/Europe ATM
R&D Seminar, Chicago (2013)
7. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: An evolutionary goal-programming
approach towards scenario design for air-traffic human-performance experiments. In: IEEE
Symposium on Computational Intelligence in Vehicles and Transportation Systems (CIVTS),
pp. 64–71. IEEE, Singapore (2013)
8. Amin, R., Tang, J., Ellejmi, M., Kirby, S., Abbass, H.A.: Trading-off simulation fidelity and
optimization accuracy in air-traffic experiments using differential evolution. In: IEEE Congress
on Evolutionary Computation (CEC). IEEE, Beijing, China (2014)
9. Calder, R., Smith, J., Courtemanche, A., Mar, J., Ceranowicz, A.Z.: Modsaf behavior sim-
ulation and control. In: Proceedings of the Conference on Computer Generated Forces and
Behavioral Representation (1993)
10. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic
algorithm: Nsga-ii. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
11. Dowek, G., Geser, A., Munoz, C.: Tactical conflict detection and resolution in a 3D airspace.
In: Proceedings of the 4th USA/Europe ATM R&D Seminar, Santa Fe (2001)
12. Endsley, M.R.: Measurement of situational awareness in dynamic systems. J. Hum. Factors
Ergon. Soc. 37(1), 65–84 (1995)
13. Endsley, M.R.: Toward a theory of situational awareness in dynamic systems. Hum. Fact.:
J. Hum. Fact. Ergon. Soc. 37(1), 32–64 (1995)
14. Erzberger, H., Paielli, R.: Conflict probability estimation for free flight. AIAA J. Guid. Control
Dynam. 20(3), 588–596 (1997)
15. Gazit, R.: Aircraft surveillance and collision avoidance using GPS. Ph.D. thesis, Stanford
University, Stanford (1996)
16. Hoekstra, J.: Designing for safety: the free flight air traffic management concept. Ph.D. thesis,
Delft University of Technology, Delft (2001)
17. Hoekstra, J., Gent, R., Ruigrok, R.: Conceptual design of free flight with airborne separa-
tion assurance. In: Proceedings of AIAA: Guidance, Navigation, and Control Conference,
vol. 4239, pp. 807–817 (1998)
18. Shafi, K., Abbass, H., Zhu, W.: Real time signature extraction during adaptive rule discovery
using UCS. In: Proceedings of the IEEE Congress on Evolutionary Computation, Singapore
(2007)
19. Tang, J., Alam, S., Lokan, C., Abbass, H.A.: A multi-objective approach for dynamic airspace
sectorization using agent based and geometric models. Transp. Res. Part C Emerg. Technol.
21(1), 89–121 (2012)
20. Wittman Jr, R.L., Harrison, C.T.: Onesaf: a product line approach to simulation development.
Technical report, DTIC Document (2001)
Chapter 6
The Way Forward
Abstract This book presented the first steps which transform RT, the art, into CRT,
the science. Thomas Gilbert, the father of human performance, based his work on
three principles to describe a science: simplicity, coherence, and utility. In writing
this book, these three principles were closely followed. The language has been
simplified to bridge the gap between management and computational scientists;
architectures were used to both connect the concepts in a coherent manner and
provide the basic structure to design and implement the computational models; and
examples have been given to demonstrate that the true utility of CRT extends beyond
the military to individuals, organizations and a whole of government. The aim to
plant the seed of CRT has been fulfilled, but the way ahead is long. This chapter
attempts to draw patterns in the sand and lay pheromones of trails for ideas that can
inspire CRT researchers.
6.1 Where Can We Go from Here?
Many areas start as an art, evolve into a science, mature into engineering solutions,
and blend into our daily life as a technology. RT is the art. As this book
establishes the science for CRT and the first steps toward engineering solutions,
it simultaneously establishes many gaps in that science. By nature, when scientists
solve a problem, they define many new ones. However, the architectures presented
in this book provide designs for engineers to plant the seed for many technologies
that can transform CRT into benefits for the society.
The way forward for CRT from a technological perspective is to simply put
aside the philosophy of CRT and focus on the opportunities that the architectures
presented in this book offer. Some of these opportunities are discussed below.
6.1.1 Future Work on Cognitive-Cyber-Symbiosis
Humans evolved from a single family to isolated groups. As the size of each
group started to increase, connections started to emerge. Travel from one group to
another was the means to transfer knowledge, while the human brain was the only
computational device to create this knowledge on earth.

212 6 The Way Forward
Academic publishers emerged and became the service bus in the SOA designed
by the social system. Publishers established a database of science services, through
collecting scientific papers with authors and their addresses. Publishers maintained
a subscribers’ database of scientists, libraries, etc., to disseminate knowledge.
Invention of the electro-magnetic transmitters, such as telephones and telegraphs,
made it possible to share information faster. But then ships and aircrafts provided
the means for publishers to sell whole books over hundreds and thousands of miles.
Up to this point of time, the thinking world was controlled by human minds.
The invention of computers followed by the internet created tipping points in this
thinking sphere. Today, there are machines that can solve complex problems that
humans cannot, and an environment of different nature that connected the overall
Electro-Magnetic Spectrum; an environment we call today: the Cyber space.
The Cyber space redefined cognition. Today, this Cyber space is the host of our
minds and brains. We discovered that our brain is nothing but an electro-magnetic
spectrum in itself. Signals that carry information get transmitted between neurons
that store information. Today, we discovered that we need to shift our focus from the
physical space to the cognitive and Cyber spaces. Classical physics can only offer
limited progress in these new spaces. As the world evolves, our brains will need to
blend with the wider Cyber space.
This blending process has seen many research activities over the years; but our
imagination in the past underestimated the reality of today. Classically, researchers
talk about adaptive automation, augmented cognition, human machine integration,
human machine interfaces, brain machine integration, brain machine interfaces,
human machine symbiosis, brain computer interfaces, and the list goes on. But the
words “machine”, “integration” and “interface” do not capture the true complexity
that is emerging around us. Hence, we adopted the terminology CoCyS in this
book to represent this blending process and the fluidity that exists in CoCyS. We
emphasize that CoCyS is not limited to a single human, a single machine, or a single
computer. CoCyS is the new space that blends information together from machines,
signals from the environment, brain waves, behavioral and social attributes from the
human [2, 3].
CoCyS is a transformation of the social system. It redefines the social system,
as it reconnects the minds across classical physical boundaries. As the minds
disconnect from some and connect to other minds, hearts follow. CoCyS will
reshape hearts and minds. We may fear the science fiction, but if science fiction
is an imagination, we should not forget as humans that our imagination is only
the starting point for what we have called “innovation”. We, as humans, have been
very successful all over the history of mankind in transforming imagination into
prototypes then into fully functional systems and reality! If we can imagine it, we
can design it; if we can design it, we can build it.
CRT is at the heart of CoCyS. Decisions in this new space need to be challenged,
not as we have classically have been doing using our human mind alone, but by
relying more and more on the Cyber space. We need to test trust through probing
for and gathering of information, estimating reliability of information by deploying
powerful data mining techniques, and acting fast and right.
6.1 Where Can We Go from Here? 213
There are many research opportunities that exist in CoCyS. For example, we need
to understand what new forms of thinking exist in this new space, the relationship
between information and decision, how human brains immerse into the Cyber space,
how cognition and cyber blend together, and how to analyze the fluidity of CoCyS.
This is a random sample of ideas, but ideas more relevant to the topic of this book
are discussed in the next section.
6.1.2 Future Work on the Shadow CRT Machine
CoCyS will evolve and grow with and without our will. In this complex environ-
ment, we need the computer agent that can watch our back, augment our limited
cognitive capacity so that we are able to manage the level of complexity that our
brain is unable to comprehend. The Shadow CRT Machine does this. Today, we
have all our communications in the Cyber space, from emails, telephones, and even
GPS trails. We have our IPads, smart phones, smart glasses, smart watches, smart
houses, to name a few. While CoCyS is about the space where cognition and Cyber
blends, the Shadow CRT Machine is our slave computer agent that only focuses
on “us”. It knows our objective, monitors the environment, challenges the data it
receives and the context that comes with these data, assesses risk, and shares with
us its opinion.
To support the development of the Shadow CRT Machine, we need the data
mining tools that can analyze big data seamlessly, can discover noise in the
information, can reveal deceptive information, and can go beyond simple models
of reliability to complex models of trust.
The Shadow CRT Machine suggests more research into simulation as it needs
the simulation environment to be the platform to transform and communicate
our understanding of how systems work to the computational, then the Cyber,
environment. Simulation starts as the computational form of human understanding,
but can evolve autonomously by refining this understanding based on the abundant
information available in the Cyber space.
The Shadow CRT Machine demands more research into optimization theory, but
not in simple optimization models that assume that mathematical optimality is more
important than the assumptions in a problem. The Shadow CRT Machine requires
optimization solvers that can handle many objectives, noise, changing environment,
and high level of nonlinearity. The Shadow CRT Machine necessitates the design of
optimization solvers that are efficient and fast, and that provide lifelong optimization
capabilities.
6.1.3 Computational Intelligence Techniques

for Computational Red Teaming
Designing automated methods to support a CRT exercise requires sophisticated pre-

dictive analytics, optimization, and simulation techniques as discussed in Chap. 3.
Computational intelligence techniques have a lot to offer in this literature as has
been demonstrated by many publications such as [1, 4–6].
In the meantime, CRT offers many challenges for the computational intelligence
literature. Some of these challenges are briefly described below.
There has been an extensive literature on evolutionary multi-objective optimiza-
tion. This literature has been driven by real-world requirements, where decisions
need to be made in many applications given the existence of conflicting objectives.
We have discussed that conflicting objectives require different treatments based on
who owns the objectives (see Sect. 2.2.2).
If all conflicting objectives are owned by the same organization, evolutionary
multi-objective optimization methods are suitable to generate the trade-off curve. In
this case, CRT poses certain challenges that encourages more work in this direction.
For example, CRT always relies on simulators of some sort. The computational cost
of using these simulators to evaluate solutions is usually very high. Evolutionary
multi-objective optimization methods rely on surrogate methods to approximate
the evaluations so that the number of objective function calls to the simulators is
minimum and without sacrificing the quality of the final solution.
There is a need for more studies to understand this trade-off between the level of
approximation made by surrogate methods and the quality of generated solutions.
Moreover, the stochastic nature of these simulators increase the cost even more.
There is an urge for new multi-objective optimization methods that can work with
uncertain simulation-based problems.
Another characteristic of CRT is the existence of multiple competing teams
including the red and blue teams. This generates a co-evolutionary optimization
problem. The nature of the co evolutionary dynamics of this problem is critical in
CRT, because by understanding the co-evolutionary dynamics, we can understand
how decisions made in one team impact the decisions of the other team. New tools
and methods for analyzing uncertain multi-objective co-evolution can provide both
the analysis tool set to understand the nature of the interdependency between the
two teams and efficient optimization methods to solve this multi-stage dynamic
optimization problem.
As the discussion on data mining has demonstrated, CRT heavily relies on
efficient data mining tools in its computational environment. This includes classical
tools to infer patterns from data, but there are many new opportunities that CRT
identifies for data mining. For example, the coupling of the optimization and
simulator as discussed above generates a large amount of data. Efficient real-time
data mining techniques can extract patterns from these data to approximate the
fitness landscape and to design efficient surrogate models for evolution to use. CRT
is the science that blends risk analytics using challenge analytics to create value for
organizations.
6.1 Where Can We Go from Here? 215
CRT is a rich area for the application of behavioral mining techniques. New data
mining techniques are needed to understand the dynamics of the interaction between
blue and red in the simulator, to autonomously extract patterns from agents in the
simulation to better understand how the agents act, and to autonomously infer intent
information from group behavior. This last point opens many possibilities in the use
of network mining for CRT.
Departing from the computer-based environment and coming to the real CRT
exercise, there are big data that the humans in the exercise generate. Automated
tools to analyze the exercise offers many different types of problems for data mining.
For example, discussions in the exercise can be captured through voice recording,
lots of drawings on the wall, and exchange of unstructured text. Each type of these
data-capture mechanisms generates different types of data which need to be mined
using methods such as speech analysis, conversation analysis, graph mining, process
mining, and text mining.
In summary, CRT offers many optimization and data mining opportunities for
the computational intelligence literature. It generates all sorts of data, and requires
very efficient tools to transform these data into meaningful information to describe,
explain and understand lessons learnt during the exercise.
6.1.4 Applications of Computational Red Teaming
This book has demonstrated many examples where CRT can be applied to every
situation where a decision is needed. CRT is a rational approach to decision making,
whereby alternative courses of action are challenged, risk are assessed, and, in light
of the analysis, decisions are made. Therefore, CRT can be applied everywhere.
However, it is important to weigh the cost against the benefits. The implementation
of CRT systems requires very skilled individuals to blend elements of the system
together.
One can say that the cost of any decision is high, but in CRT, a big chunk of this
cost needs to be paid upfront. As an example, for a simple decision like eating, if
we randomly choose what things to eat, the cost each time we make the decision is
small. However, over the many times we will make this random decision, the cost
can be accumulated to the extent that our lives become the cost. Alternatively, we
can pay a large cost upfront by analyzing our body needs, and develop a healthy
dietary program. This large cost that is being paid upfront can reduce the larger cost
that we may pay over time without noticing. CRT systems save cost by bringing
some of the cost and risk forward in time, thus, providing a hedging strategy against
high uncertainty in the future.
Many domains of application can benefit from CRT. For example, CRT has
been used extensively in Cyber-security, including evaluations of web services,
authentication protocols, and computer network security. However, the Cyber space
extends beyond computer networks as we have argued in previous sections. There is
an urge for CRT systems for Cyber Security. Given the computational nature of
CRT, it is more feasible and efficient than relying on humans in the Cyber space.
Many large organizations can rely on CRT to connect day-to-day decisions all the
way up to the formation of their strategic plans. By ensuring that every tiny decision
made is linked to organizational vision, and that negative risks are preempted in
advance, a Shadow CRT Machine will become a proactive firewall for all decisions
made in an organization.
Individuals and small businesses can apply the science with pencils and papers.
Individuals can transform the science of CRT and rely on their brains, the best
computational machine that has been created, to do CRT.
The identification of novel applications for CRT is not a difficult task. However,
once a domain of application has been chosen, research and effort need to be
directed to the design of such systems and the identification of proper models to
support a CRT system.
References
2. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: Augmented cognition using real-time
EEG-based adaptive strategies for air traffic control. In: International Annual Meeting of the
Human Factors and Ergonomic Society. HFES, SAGE (2014)
3. Abbass, H., Tang, J., Amin, R., Ellejmi, M., Kirby, S.: The computational air traffic control
brain: computational red teaming and big data for real-time seamless brain-traffic integration.
J. Air Traffic Control 56(2), 10–17 (2014)
4. Ilachinski, A.: Enhanced ISAAC neural simulation toolkit (EINSTein): an artificial-life labo-
ratory for exploring self-organized emergence in land combat (U). Center for Naval Analyses,
Beta-Test Users Guide 1101, no. 610.10 (1999)
5. Yang, A., Abbass, H.A., Sarker, R.: Evolving agents for network centric warfare. In: Proceedings
of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 193–195. ACM,
New York (2005)
6. Yang, A., Abbass, H.A., Sarker, R.: Landscape dynamics in multi–agent simulation combat
systems. In: AI 2004: Advances in Artificial Intelligence, pp. 39–50. Springer, New York (2005)
Index
A Embodiment, 12
Action Environment, 70
Deliberate Action, 68 Experiment, 116
Intentional Action, 50 Experimentation, 114
B F
Behavior, 70, 77 Fundamental Inputs to Capabilities, 171
Bias, 8, 17
Big Data, 137
Blue-red Simulation, 37 G
Goals, 54
C
Capability, 172 H
Challenge, 6, 89 Hypothesis, 115
Challenge Analytics, 86, 100
Deliberate Challenge, 6
Competency, 73 I
Comparative Competency, 76 Imitation Game, 40
Conflict, 10
Military, 10
Cyber Operations, 178 M
Cyber Security, 15, 178 Mission, 172
Cyber security, 176 Motivation, 87
Cyber Space, 178
N
D Network, 174
Data Mining, 129 Effects, 172
C4.5, 134 Network Operations, 178
Deliberate Action, 69 Deny Operation, 181
Destroy Operation, 183
Detect Operation, 179
E Hide Operation, 184
Effect, 58, 172 Identify Operation, 180

H.A. Abbass, Computational Red Teaming, DOI 10.1007/978-3-319-08281-3
218 Index
Network Operations (cont.) Risk Lens, 8

Isolate Operation, 182 Threat, 63
Neutralize Operation, 183 Vulnerability, 15, 63
Prevent Operation, 181 Risk Analytics, 100
Reshape Operation, 184
Track Operation, 180
S
Scenario, 159, 160
O Design, 162
Optimization, 118 Design in CRT, 164
Blind, 122 Plausibility, 159
Knowledge-based, 122 Possibility, 159
Negotiation-based, 123 Shadow CRT Machine, 154
System, 123 Simulation, 125
Abstraction, 127
Fidelity, 128
P Resolution, 127
Performance, 70 Situatedness, 12
Gilbert Model, 79 Skills, 71, 72
Problem Sources of uncertainty, 4
Problem Definition, 118 Stimulation, 87
Problem Solving, 118 Strategic Thinking, 169
ProblemSolvingSchools, 105 Strategy, 168, 173
strategy, 4
System, 58
R Critical Component, 60
Red Teaming, 1, 3, 5 Model, 125
Accountability, 26 Objective, 52
Artificial Intelligence, 42 Purpose, 59
Automated RT, 36
Budget Estimation, 31
Data-to-Decisions, 142 T
Ethics, 26 Turing Test, 40
Functions, 15
Membership, 12
Success Factors, 10 U
Resources, 169 Uncertainty, 4, 61
Risk, 8, 61, 62
Critical component, 15
Hazard, 63 V
Mitigation, 10 Values, 172
Opportunity, 15 Vision, 172

Computational Red Teaming: Hussein A. Abbass

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Computational Red Teaming: Hussein A. Abbass

Uploaded by

Copyright:

Available Formats

Hussein A.

Computational Red Teaming

ISBN 978-3-319-08280-6 ISBN 978-3-319-08281-3 (eBook)

© Springer International Publishing Switzerland 2015

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Chapter 5 complements the materials by presenting three case studies of

Canberra, ACT, Australia and Singapore, Hussein A. Abbass

We receive limited education through schools and universities. The unlimited

scientists worked with me on some aspects of CRT: Dr Sameer Alam, Dr Sondoss

1 The Art of Red Teaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1

1.5.5 Post Analysis of the Exercise . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 24

3 Big-Data-to-Decisions Red Teaming Systems . . . . . . . .. . . . . . . . . . . . . . . . . . . . 105

4.3.2 Cyber Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 176

ACE Automated co-evolution

SWOT Strength, Weaknesses, Opportunities, and Threats

Fig. 1.1 Connecting relationships among agents to high level reasoning . . . 39

Fig. 3.2 Fitness landscape of Ramada’s loyalty in response to

Table 1.1 Categorization of alternative analysis methods as

1.1 A Little Story

© Springer International Publishing Switzerland 2015 1

1.2 Red Teaming

Know your enemy, but in so doing do not forget to know yourself.

We need to know our competitors because a hidden enemy is more threatening

In this document, they recommended the instilling of effective RT in the Department

of possibilities,” “exploring nonconventional behaviors,” “testing strategies,” and

1.2.2 Executing Exercises

RT does not operate in vacuum, or by a simple answer to a what-if question. RT

1.2.3 Deliberately Challenging

A main differentiator between RT and any other exercise or experimental form

The concept of a deliberate challenge is key in RT. RT is an exercise that has

1.2.4 Risk Lens

1.2.5 Understanding the Space of Possibilities

1.2.6 Exploring Non-conventional Behaviors

Let us imagine a competing company to this technology savvy company; this

1.2.7 Testing Strategies

1.2.8 Mitigating Risk

1.3 Success Factors of Red Teams

1.3.1 Understanding and Analyzing the Concept of a Conflict

A conflict in RT is a situation in which two entities have conflicting objectives (see

In many cases, conflicting objectives are resolved through cooperation. As such,

1.3.2 Team Membership

1.3.3 Time for Learning, Embodiment and Situatedness

1.3.4 Seriousness and Commitment

1.3.5 Role Continuity

1.3.6 Reciprocal Interaction

RT is about deliberate challenges. Red cannot think in isolation. It needs to interact

1.4 Functions of Red Teaming

As an exercise, RT focuses on forming deliberate challenges to perform a number of

1.4.1 Discovering Vulnerabilities

A vulnerability is an exposure of a critical component in a system to a hazard or

1.4.2 Discovering Opportunities

Discovering opportunities is complementary to discovering vulnerabilities, but is

1.4.4 Thinking Tools

A RT exercise assists players in gaining an understanding and an appreciation of the

The dynamics of back-and-forth interaction teaches participants and observers about

1.4.5 Bias Discovery

1.4.6 Creating Future Memories and Contingency Plans

1.4.7 Memory Washing

wireless-communication device. As the participants become more embodied and

1.5 Steps for Setting Up RT Exercises

The design and implementation of a RT exercise undergo a structured process. This

1.5.1 Setting the Purpose, Scope and Criteria of Success