
© 2023 ASTD DBA the Association for Talent Development (ATD)

All rights reserved. Printed in the United States of America.

26 25 24 23 12345

No part of this publication may be reproduced, distributed, or transmitted in any form or by any
means, including photocopying, recording, information storage and retrieval systems, or other
electronic or mechanical methods, without the prior written permission of the publisher, except in the
case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted
by copyright law. For permission requests, please go to copyright.com, or contact Copyright
Clearance Center (CCC), 222 Rosewood Drive, Danvers, MA 01923 (telephone: 978.750.8400; fax:
978.646.8600).

ATD Press is an internationally renowned source of insightful and practical information on talent
development, training, and professional development.

ATD Press
1640 King Street
Alexandria, VA 22314 USA

Ordering information: Books published by ATD Press can be purchased by visiting ATD’s website at
td.org/books or by calling 800.628.2783 or 703.683.8100.

Library of Congress Control Number: 2022949753

ISBN-10: 1953946445
ISBN-13: 9781953946447
e-ISBN: 9781953946454

ATD Press Editorial Staff


Director: Sarah Halgas
Manager: Melissa Jones
Content Manager, L&D: Jes Thompson
Developmental Editor: Jack Harlow
Production Editor: Hannah Sternberg
Text and Cover Design: Shirley E.M. Raybuck
Printed by BR Printer, San Jose, CA
Contents
Introduction
Part 1. The Foundations
1. Why Should Instructional Designers Care?
2. Getting Started With Definitions
3. Data Specifications in Workplace Learning
4. Unique Learning Metrics
5. A Little Bit of Statistics
Part 2. Designing for Data
6. A Framework for Using Workplace Learning Data
7. Make a Plan for Gathering and Using Data
8. Form Your Hypothesis and Specify Data Needs
9. Identify Data Sources
10. Build in Data Capture
11. Store the Data
12. Iterate on the Data and the Analysis
13. Communicate and Visualize the Data
14. Build Scale and Maturity
Acknowledgments
Further Reading
References
Index
About the Author
About ATD
Introduction
I’ve been working in L&D since 2002, when I helped a large healthcare
organization implement their first learning management system (LMS). It
was the software company’s first LMS, and mine too. These were the early
days of SCORM (sharable content object reference model). The e-learning
industry was about to take off as the advent of rapid authoring tools and
LMSs began to democratize access to scale.
In 2012, I learned about Project Tin Can, which would create the
Experience API, or xAPI. Our team at TorranceLearning had been seeking a
learning and performance environment that offered a richer and more varied
learning experience, and a correspondingly interesting data set. We were
stretching our technical muscles. Our instructional designers were grappling
with the new grammar of reporting data in a repeatable but not-yet-
standardized environment. We were asking our LMS team: Shouldn’t this
do xAPI, too? (The answer: Why, yes, yes it should.)
In 2014, we launched the Ann Arbor Hands-On Museum’s Digitally
Enhanced Exhibit Program (DEEP). Student groups on field trips would use
beacons to identify themselves to the networked tablets placed around the
museum. They engaged seamlessly with interactions and questions that
would be recorded by exhibit and curriculum standard. At the end of the
visit, teachers would receive a stack of reports about their students’
activities and engagement, and each student received a personalized one-
page report detailing their field trip to the museum. This was exciting stuff.
We often kept coming back to the question: What can we do with all this
data? How can we take advantage of it? What insights might lie in there?
In 2015, our team took on the duty of hosting the Advanced Distributed
Learning (ADL) group’s cohort model for introducing innovators to xAPI.
We started our first 12-week xAPI Learning Cohort with a group of 35
invited designers and developers in the fall of 2015, and ran two a year for
the next seven years. Cohorts routinely exceeded 600 members each
semester and the projects that teams took on ranged from e-learning to
gaming to chat bots to virtual classroom to Alexa and everything in
between. As of this writing, more than 5,000 L&D professionals have
participated in the xAPI Learning Cohort.
As cohort members and organizations adopted xAPI and other data-rich
learning experiences, we kept coming back to the question, “What can we
do with all of this data?” That’s what this book sets out to help you answer.

How L&D Can Use Data


In the early days of the COVID-19 pandemic, the L&D team at the UK arm of PricewaterhouseCoopers used their analysis of search requests on the
organization’s learning experience platform to identify the needs being
faced by their staff and managers. This allowed the L&D team to respond
very quickly, almost at a week-by-week pace, to these emerging needs.
At LeMoyne Institute in upstate New York, the learning design team
used detailed data from a learning experience to fine-tune the screen design
for an adaptive e-learning curriculum. By simply changing the layout of the
screen, they could improve relevant performance in measurable ways.
QuantHub, a learning experience platform focused on data science, uses
data to personalize learning across a competency map used by major
organizations to upskill their professionals.
At Trane Technologies, learner and manager feedback are combined with
employee engagement survey results to prove the positive impact of their
leadership courses. This data is used to obtain additional budget to continue
running the program, as well as to attract new learners to the experience.
And as interesting as these quick case studies are, there are countless
organizations using data and analytics in similar ways to identify learning
needs, hone the design of their learning experiences, personalize learning in
new ways, support decisions, and evaluate the impact of learning. We’ll
hear from several of them in this book, at the end of each chapter.
Why an L&D Book on Data and Analytics?
This book tackles an unaddressed need in the market for workplace learning
and talent development.
First, there are lots of books, articles, courses, and academic degrees in
data and analytics. However, I find they tend to be focused on the
marketing, sales, or operational aspects of a business, where the data is rich
and the metrics are commonplace. It’s not very often that I see an analytics
case study that addresses the kinds of data we are using in L&D.
Second, with K–12 and higher education maturing in their use of
learning management systems, and the popularity of the MOOC (massive
open online course), the academic field is investing in student analytics.
There is much to be learned from our academic colleagues for sure, but
their analytics work doesn’t fully account for the workplace setting.
Third, there are handfuls of books about learning measurement in the
corporate space, going into the familiar evaluation levels developed by
Donald Kirkpatrick, Raymond Katzell, and Jack Phillips, and beyond them
to address the culture and practice of regular data gathering, analysis, and
reporting. In fact, if this is your interest, I strongly recommend
Measurement Demystified by David Vance and Peggy Parskey (2021) and
Learning Analytics by John Mattox, Peggy Parskey, and Cristina Hall
(2020).
Missing across these resources is a focus on the unique data that is
attainable in the corporate learning space at a granular level, as well as
specific direction for instructional design teams about how to generate this
data to feed the downstream uses.
Why? I’m sure there are several reasons for this. Chief among them is
that in L&D we tend not to have as much data at our fingertips as other
functions in the business, and therefore tend not to use data to drive our
decisions. In most organizations, finance, sales, and operations all have very
granular data available within a few clicks to drive their decision making. In
L&D we have training completion data: Did learners complete the training?
When? How long did it take? What are the test scores? Did they like it? Are
they motivated to apply it?
We tend not to have good insight into the learning experience itself. For
example, what did they click on? What did they do in class? How many
times did they practice? Who gave them feedback along the way? Nor do
we have good insight into what happens after the learning event: What
outside sources did they use to fill in any remaining gaps in their
knowledge? Did they use the job aids we gave them, and did that make any
difference? How did they perform on the job after training? Did their
manager support them?
As an industry, what we gained as we adopted SCORM and LMSs was a
globally standardized, interoperable, interchangeable way of managing the
learning function. This allowed for the rapid rise of formalized and yet
distributed training delivery and the growth of this industry. With the
technologies available at the turn of the 21st century, the institutionalization
and globalization of business, and the interoperability offered by SCORM
and LMSs came a shallow data set focused on the completion of event-
based training. That was fine for its time, but didn’t evolve as fast as other
organizational functions, creating a sort of vicious cycle: We can’t make
data-driven decisions because we don’t have rich data.

Don’t Be Afraid of the Math!


Before we get deep into the weeds of data and analytics, I want to bring
your attention to the fact that we’re about to encounter something that looks
a bit like math. I have found that the L&D profession is not rife with
mathematicians, so this might start to trigger what’s commonly known as
“math anxiety” for you. Let’s pause for a moment and see if we can
alleviate some of that.
Sarah Sparks, senior research and data reporter for EdWeek, wrote:

Emerging cognitive and neuroscience research finds that math anxiety is not
just a response to poor math performance—in fact, four out of five students
with math anxiety are average-to-high math performers. Rather, math anxiety is
linked to higher activity in areas of the brain that relate to fear of failure before
a math task, not during it. This fear takes up mental bandwidth during a math
task…. In turn, that discomfort tends to make those with math anxiety more
reluctant to practice math, which then erodes confidence and skill. In part for
that reason, anxiety has been linked to worse long-term performance in math
than in other academic subjects like reading.

Here most of us are, having accumulated years and perhaps decades of avoidance of math. And as we saw in the prior section, L&D’s historical
tools, platforms, and analysis do not require or even afford us the
opportunity to do much math beyond the occasional averaging of some
course evaluation data. It’s OK if you’re feeling anxious.
And don’t worry, we’re not going to spend a lot of time doing a lot of
math. L&D data analytics isn’t about trying to multiply three-digit numbers
without a calculator. In fact, a lot of the actual calculations are automated,
and, even when they’re not, you have a computer to assist.
What we are going to do is provide the tools and some awareness of the
concepts that you’ll be working with, perhaps in partnership with fellow
professionals who are more experienced in these spaces.
Do you need to become a statistician to do data and analytics? No, I
don’t believe so. However, I do believe that having a working knowledge of
the concepts will help you get started on your own, make you a better
partner to team members who have these skill sets, and tip you off when
you would be better served to consult someone else with this expertise. This
is very similar to the conversation our industry had a decade ago about
whether or not instructional designers needed to be able to code. In my
opinion, they do not; however, they do need to have a functional
appreciation for computer science to collaborate effectively.
So, in the first part of the book, we’ll cover the basics of why you should
care (chapter 1), level setting with definitions (chapter 2), data
specifications (chapter 3), L&D-specific data metrics (chapter 4), and a
little bit of statistics terminology (chapter 5).
And if, as Sparks points out, this anxiety stems from a fear of failure that
occurs before you even get started, I’m going to ask you to live with that
discomfort just long enough to learn through the experience and perhaps get
over a little bit of that trepidation about using analytics.

What Does It Mean to Design for Data?


We all know that data is knowledge, and knowledge is power, but once we have
access to it and realize that it is, indeed, oceans of data, how do we not
“drown” in it, and, perhaps more importantly, how do we make sense of it?

—Marina Fox, GSA’s DotGov Domain Services, Office of Government-Wide Policy (OGP)

After we lay down the foundations of learning data and analytics, we will
start to take a look at the process for actually getting and using data in part
2. First, we’ll talk about making a plan for what kinds of data you will
gather, including aligning with organizational metrics and many of the
common learning and development frameworks that we use for analysis
(chapters 6 and 7).
Next, we’ll dive into forming your hypothesis from the questions you
need to answer with data (chapter 8). We’ll then take a look at actually
identifying the data needs that will serve those purposes (chapter 9),
building the data capture into our learning experiences so we actually get
the data we need (chapter 10), and collecting and storing it (chapter 11).
At this point many people will arrive at what I have in my own projects
referred to as the moment of “Oh my gosh my baby is ugly!” This is where
you have collected some data, made some analysis, and realized that what
you really wanted to answer was something other than what you just did.
Here’s where the fun begins as you iterate on the learning and data
experience by looking at what you have gathered, and then fine-tuning it
(chapter 12).
Many people conflate the visualization of data with the analysis of data.
And up until this point, we haven’t talked about visualizing data at all!
We’ll spend a little bit of time talking about how we communicate and
visualize data in chapter 13. This is another one of those places for which
dozens upon dozens of wonderful resources exist, so this book will cover it
at just a very high level.
Finally, we’ll take a look at what it means to scale up your analytics
efforts, moving from one or two pilot projects to a future enterprise-wide
state. There are few organizations at this stage as of the writing of this book,
so the future is full of opportunity for you to define it.
How This Book Will Help You
This book takes an “If I can see it, I can be it” approach to learning data and
analytics. In my work helping organizations adopt xAPI, I am frequently
asked for case studies. The questions sound like “Who’s really doing it?”
“How does that actually work?” and “That sounds great, but can you share
an example so I really know I’m getting it?”
My emphasis in this book will be not only on practical what-is and how-to content, but also on real-world examples and longer case studies from
practitioners. In some cases, I’m telling the story. In other cases, the people
who have built it and lived with it share their story in their own words.
Each chapter will conclude with opportunities for you to put these
techniques to work right away, whether you are in a data-rich environment
already, or are just getting started and working on hypotheticals. These
opportunities to give the concepts a try are a valuable part of extending your
learning beyond the pages of this book and into the real world all around
you. If you are learning with your team, these activities can be done in pairs
or in small groups for shared learning impact.
As much as I would love to offer you a book with immediate practical
value to your own work, it’s entirely possible that you don’t yet have the
data necessary to apply these concepts right away. As such, the “give it a
try” ideas at the end of most chapters include reflections and hypotheticals
to let you dig in right away, even though they might not reach your loftiest
aspirations just yet.
And while I aim to be definitive whenever possible, remember there are
very few hard and fast rules. Simply put, a lot of it depends. So, at the risk of
sounding like I’m unable to make a firm decision in offering advice, I find
that the very interesting questions in life often have multiple right answers.
The “rightness” depends on your situation, your needs and capabilities,
what you have access to right now, and what your leaders deem makes
sense. And, some of the most complex “right” answers change over time.
Let me also note that this isn’t a book about xAPI. While I believe that
the widespread adoption of a rich and interoperable data specification is
good for the industry, individual professionals, and organizations buying
products and services, I also realize that xAPI is not the only way to work
with learning and performance data.
Whether you’re using xAPI, which far extends the capabilities of SCORM to track learning, practice, and performance activity, or another data model,
we have the ability to get our hands on far more plentiful, granular, and
interesting types of data. That’s what this book is about: what data to get,
how to get it, and what to do with it once you have it.
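If you have never seen xAPI data, here is a minimal sketch of a single statement and of how it might be posted to a learning record store (LRS) using Python. The learner, activity, LRS endpoint, and credentials are all placeholders; your own tools will dictate the real details.

import requests  # third-party HTTP library

# One xAPI statement: actor (who), verb (did what), object (to what), result.
statement = {
    "actor": {"name": "Beverly", "mbox": "mailto:beverly@example.com"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "id": "https://example.com/activities/intro-to-statistical-analysis",
        "definition": {"name": {"en-US": "Introduction to Statistical Analysis"}},
    },
    "result": {"score": {"scaled": 0.94}, "completion": True, "duration": "PT25M"},
}

# Send the statement to a hypothetical LRS (placeholder endpoint and credentials).
response = requests.post(
    "https://lrs.example.com/xapi/statements",
    json=statement,
    headers={"X-Experience-API-Version": "1.0.3"},
    auth=("lrs_key", "lrs_secret"),
)
print(response.status_code)  # 200 means the LRS accepted the statement

The point is not the code but the shape of the data: who did what, to which activity, with what result. That is considerably more than a completion flag.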
So, let’s get some data!
PART 1
THE FOUNDATIONS
Chapter 1

Why Should Instructional Designers Care?

It’s hard to pick up a business or organizational effectiveness publication these days without seeing multiple articles that refer to the use of data in the
process of improving results. You may have heard these and other
variations of these phrases:
• What gets measured gets done.
• What gets measured improves.
• What gets measured gets managed.
In other words, the ways that we prove that we have accomplished our
goals is by measuring them. In turn, in the data-driven organization, the
departments that are able to gather, process, analyze, and use data will be
the ones that will gain influence.
At the same time, I can argue that a singular focus on data-driven results
may lead to gaps in understanding and result in errors in judgment.
Consider Chris Anderson’s words, “There is no guarantee that a decision
based on data will be a good decision.” And this from Simon Caulkin,
summarizing a 1956 V.F. Ridgway article in Administrative Science
Quarterly: “What gets measured gets managed—even when it’s pointless to
measure and manage it, and even if it harms the purpose of the organization
to do so.”
The goal of this book is not to encourage you to focus solely on data and
analytics as a source of insight for decision making, but rather to use it as
one of several sources of insight.
But this gets us to the question of why instructional designers and L&D
professionals should care about data and analytics in the first place. We
should care because our organizations care. We should care because …

Digital Transformation Is Everywhere


CIO magazine offers a commonsense definition of digital transformation:
“A catchall term for describing the implementation of new technologies,
talent, and processes to improve business operations and satisfy customers”
(Boulton 2021).
And while you may be thinking, “We’ve been digitizing work for half a
century” (and you would be right), organizations are still reimagining the
use of technology, the skills of their people, and their business processes.
Digital transformation includes both delivering value to customers and
channel partners through new software and processes (such as e-commerce,
mobile, social, and cloud-based solutions) and gathering data from those
interactions to form insights and improve results.
L&D is ripe for digital transformation. Gathering and analyzing the incoming data will be a key component of this shift.

Consumer-Focused Software Is So Good at What It Does


One of the results of digital transformation is that as consumers we all use
digital tools that have far more sophistication than ever before. Our
learners, or employees, are used to interacting with software that is highly
personalized to their needs, makes useful recommendations, and adapts
over time. All of this is possible because of the data that is gathered at every
interaction.
In their personal lives, our learners interact with very smart apps that are
designed and informed by software designers using analytics. But when
people get to work, the learning and performance tools they interact with
seem far less mature.

We Will Need to Improve in How We Use Data


As a result, data literacy is an emerging skill. In ATD’s Talent Development
Capability Model, under the Impacting Organizational Capability domain,
lies the data and analytics capability. According to the model, “Discerning
meaningful insights from data and analytics about talent, including
performance, retention, engagement, and learning, enables the talent
development function to be leveraged as a strategic partner in achieving
organizational goals” (ATD 2019). Skills for this capability include
developing a plan for data analysis, gathering and organizing data,
analyzing and interpreting analysis results, and selecting data visualization
techniques.
Gartner Group defines data literacy as “the ability to read, write, and
communicate data in context, including an understanding of data sources
and constructs, analytical methods, and techniques applied; and the ability
to describe the use case, application, and resulting value. Further, data
literacy is an underlying component of digital dexterity—an employee’s
ability and desire to use existing and emerging technology to drive better
business outcomes” (Panetta 2021).
Whereas in the past instructional designers have focused on performance
outcomes and the delivery experience, we will now need to also attend to
the data and how that informs our work. We have the opportunity to explore
how well the learning experiences we design perform in the field, how
people interact and engage with them, and whether or not they lead to
improved performance on the job. The insights we gather from this
exploration will absolutely inform our design work.

We Need the Insights for Continuous Improvement of the Learning Function
In Measurement Demystified, David Vance and Peggy Parskey (2021) offer
this list of reasons why we measure in L&D:
• Improve programs.
• Establish benchmarks.
• Communicate findings.
• Monitor results.
• Manage operations.
• Discover insights.
• Analyze results.
• Ensure goal accomplishment.
• Demonstrate process.
• Inform stakeholders.
• Evaluate programs.
• Assess value.
• Plot trends.
• Identify success rates.
• Assess gaps.
This is quite a list! As professionals, we are constantly curious about
how we are performing in our work, how we compare to the past and to our
peers, how we can make changes in the future to better serve our learners
and our organizations, and how to communicate our value to the rest of the
organization. All of this can be done with data.
It’s no longer sufficient to simply create a learning experience and put it
out into the world for people to consume without gathering data about its
effectiveness, efficiency, or outcomes. In small L&D teams or “team of
one” situations, instructional designers must include some aspect of
learning data and analysis in their work. On larger teams, a dedicated
measurement and analytics team may complete this task. Without some
degree of rigor in our data collection and analysis, we risk having only
anecdotal evidence of impact, missing out on opportunities to improve and
becoming irrelevant.

We Will Need to Design Experiences to Collect Data


While we will be using data to inform our work, we will also need to design
our learning experiences in ways that enable the effective and efficient
capture of that data. It is one thing to want to know how a learner interacts
with a particular piece of content, and an entirely different thing to be able
to design for the capture of that interaction with the content. We will
explore this aspect of designing for data in chapter 10, “Build in Data
Capture.”

Where Do We Stand in L&D?


Other functions in most organizations operate with far more data than the
L&D function is accustomed to using. This is an interesting phenomenon.
Whereas the L&D industry has had a common interoperable data
specification for more than 20 years (SCORM), and most other functions do
not have such a vendor-agnostic specification, we tend not to have the
richness of data that these other functions have. SCORM has enabled the e-
learning and LMS segments of our field to grow in volume and number of
vendors. It has enabled both great fragmentation and ease of
consolidation, creating an environment in which the market is very fluid for
buyers and has very few barriers to entry for suppliers.
However, the shallowness of the data available from SCORM, and the
fact that it is only available for e-learning content that is launched from an
LMS, has left our industry with a lot of surface-level data, leading to our
current predicament. Three solutions are emerging in this space:
• Elegant all-in-one learning platform solutions offer rich learning
experiences and deep analytics on the data they generate. These
software platforms expose us to the power and promise of data and
analytics in our space. However, since most organizations have
multiple platforms and environments in which learning takes place,
the full picture of learning and performance data may be invisible to
these systems.
• A new free, shared, global specification for learning and
performance data interoperability, the Experience API (xAPI) offers
the richness that SCORM lacks. With xAPI, we can track nearly any
learning or performance experience at very granular levels of detail
and still retain the data’s marketplace fluidity and vendor-agnostic
nature. xAPI Learning Record Stores (LRSs) provide the analysis
and visualization tools to pull this data together.
• Some organizations are pulling all their data from multiple learning
experiences regardless of format or data standard and exporting it to
business intelligence tools for analysis.
There is no one right approach, and the three listed here are an
oversimplification of the current and future possibilities. What is clear is
that as a function and an industry, we are relatively new to the data and
analytics space. As a group, we’re a bit behind, but we can catch up quickly
using the tools and skills that have been developed in other functional areas.

What Kinds of Data Drive Decision Making?


Data is used in nearly every industry and every function to improve
performance. These lists are examples of the kinds of measures and metrics
used. (The difference between measures and metrics is defined in chapter
2.)

Sales Metrics
• Annual recurring revenue
• Average revenue per user
• Quota attainment
• Win rate
• Market penetration
• Percentage of revenue from new versus existing customers
• Lifetime value (LTV) of a customer
• Average profit margin
• Conversion rate
• Sales cycle length
• Average deal size
• Year-over-year growth
• Deal slippage

Finance Metrics
• Earnings before interest and taxes (EBIT)
• Economic value added (EVA)
• Current ratio
• Working capital
• Debt-to-equity ratio
• Contribution margin
• Customer satisfaction
• Liquidity ratio
• Return on equity
• Days in accounts receivables
• Net cash flow
• Gross profit margin
• Transactions error rate

Human Resources Metrics


• Time to hire
• Cost per hire
• Employee turnover
• Revenue per employee
• Billable hours per employee
• Absenteeism
• Cost of HR per employee
• Employee engagement
• Cost of training per employee
• Diversity and EEOC numbers

Customer Service Metrics


• Average issue count (daily/weekly/monthly)
• First response time
• Average resolution time
• Number of interactions per case
• Issue resolution rate
• Preferred communication channel
• Average handle time
• Self-service usage
• Backlog
• Average reply time
• Customer satisfaction score

Healthcare Metrics
• Number of medication errors
• Complication rate percentage
• Leaving against medical advice
• Post-procedure death rate
• Readmission rate: hospital acquired conditions (HACs)
• Average minutes per surgery
• Average length of stay
• Patient wait times by process step
• Doctor–patient communication frequency
• Overall patient satisfaction
• Patient-to-staff ratio
• Occupancy rate

Hospitality Metrics
• Energy management
• Labor costs as percent of sales
• Employee performance
• Gross operating profit
• Occupancy rate
• Average daily rate (ADR)
• Average room rate (ARR)
• Revenue per available room (RevPAR)
• Net revenue per available room (NRevPAR)
• Revenue per occupied room (RevPOR)
• Gross operating profit per available room
• Marketing ROI
• Online rating
• Customer satisfaction
• Loyalty programs

Manufacturing Metrics
• On-time delivery
• Production schedule attainment
• Total cycle time
• Throughput
• Capacity utilization
• Changeover time
• Yield
• Scrap
• Planned maintenance percentage (PMP)
• Uptime / (uptime + downtime)
• Customer return rate
• Overall equipment effectiveness (OEE)

Childcare Metrics
• Child abuse indicator
• Immunization indicator
• Staff–child ratio and group size indicator
• Staff training indicator
• Supervision or discipline indicator
• Fire drills indicator
• Medication indicator
• Emergency plan or contact indicator
• Outdoor playground indicator
• Toxic substances indicator
• Handwashing or diapering indicator

Technology Metrics
• Defect density
• First response time
• Function points
• Incidents
• Information security scores
• IT overhead
• IT risk score
• IT security training rates
• Noncompliance events
• Patch rate
• Planned downtime rate
• Project cycle time
• Security overhead

What Could Possibly Go Wrong?


In the chapters to come, I’ll highlight some of the common missteps in each
area (or some glaring ones with big consequences, even if they’re not
common). I’ll also offer two or three opportunities to try out the concepts,
with both hypothetical and real-world challenges. This way you’ll have a
chance to practice or reflect, even if your current job role doesn’t offer
much opportunity to work with data (yet!).
What could possibly go wrong? You skip over the activities and don’t
give yourself the benefit of these experiences to support your learning with
this book. It’s your choice!

Give It a Try
Here’s a Hypothetical
Leverage the consumerization of data-driven apps to examine the use of
data in your daily life. Choose a piece of software or an application that
you’re familiar with and see how many of these questions you can answer:
• What data is being gathered?
• How is the data gathered and stored?
• What kinds of insights might the company be drawing from the
data?
• What kind of data is not being gathered but might also be useful?
• What do I, as a user, get from this data?
Do It for Real
It’s entirely likely that if you are reading this book, you are also on the
lookout for experts, articles, podcasts, and conference sessions on data
analytics in L&D. As you do so, take a structured approach to your
learning. Here are some questions you can ask about the case studies you
discover on your hunt (including the ones in this book):
• What data is being gathered?
• How is the data gathered and stored?
• What kinds of insights were derived from the data?
• What kind of data was not gathered but might also have informed
this decision?
• What do the learners get out of this?
• What is the next iteration of this work?
Bonus Points
Reach out to someone in your organization (outside the L&D function) and
ask them how data is used in their work. Use as many of the questions from
the two lists here as are relevant to your conversation.
Using Analytics to Improve Video Design
By Josh Cavalier, Owner, JoshCavalier.com

Some years ago, I released a series of training videos on YouTube. The learning objective was to transfer knowledge about specific functions
within e-learning authoring tools. This was the first time I had used
YouTube, so I had no idea if the format would be successful, the content
was appropriate, or the audience was even interested in watching the entire
video. I also wanted to leverage YouTube’s analytics to make format
adjustments and other improvements to future videos.
YouTube’s platform provides rich data points, including audience
retention over time, the geographic distribution of views, and total view
count. All of this is available when viewing your video in YouTube Studio
and navigating to the Analytics tab. You can immediately see the number of
views and audience retention.
For me, the most insightful data point from YouTube is audience
retention over time, which measures if a viewer stays or leaves the video at
that moment in time. Interestingly, this specific piece of data will also track
when an audience member scrubs to a different part of the video. Retention
will drop as they scrub the playback head, and it will climb when they
stop at a point in the video that piques their interest. You can also check
relative retention, which will compare your video to all the other videos on
YouTube with a similar length.
I use this data to design better videos. For example, I learned not to list
learning objectives as animated bullet points at the beginning of a training
video. Our audience retention would drop 45–65 percent during the first 30
seconds when we displayed the learning objectives in this format. Viewers
would constantly scrub to see the result of the steps we were describing, so
it was easy to adjust to visually show the results while describing the goal
of the video at the beginning.
Using Analytics to Evaluate Program
Performance
By Becky Goldberg, Learning Analyst, Travelers Insurance

Travelers hires experienced underwriters, but our focus on underwriting excellence and our extensive client base make it challenging to source all
our talent needs through the open market. We also value tenure and
employee development, and providing upskilling opportunities is a priority
for our organization. The Underwriting Professional Development Program
(UPDP) is designed to teach professionals with little to no insurance
background how to be independent underwriters in the business insurance
segment. UPDP is a two-year course of study that learners can complete
while working in their business units in supporting roles. Part of our
evaluation and program modification analysis is to compare participants’
performance with their peers who did not go through the program.
All underwriters’ performance is evaluated, among other metrics and
feedback, based on the performance of their book of business. This is
reflected on a dashboard our managers use to provide coaching and
performance feedback.
UPDP graduates are identified through an HR indicator, allowing us to
compare their performance with their non-program peers. Tapping into the
dashboard that’s already being used to evaluate underwriter performance
allows us to evaluate the program’s overall success, and correlating individual UPDP graduates’ performance with their training participation helps us identify which learning experiences were the most impactful.
We track the following data for this program:
• Completion of learning activities
• Participation in graduate advanced learning offerings (such as a
three-year graduate conference)
• Performance data: How well the individual’s book of business does
in comparison with their peers’ books of business
• Retention data: How long do UPDP graduates stay with Travelers as
compared to peers
After collecting the data for the UPDP program, we determined that
graduates have a faster time to proficiency than non-program new hires.
UPDP graduates also stay with Travelers longer than their non-program
new hire peers. Those who are still with Travelers three years after
graduating from the program are invited to a VIP graduates’ conference,
which is attended by senior leaders and offers advanced development
opportunities. This conference has also boosted the length of time UPDP
graduates stay with Travelers.
Even pre-pandemic, we had been transitioning more of the program to a
virtual, self-study format. We’re now evaluating if the move to on-demand
training has an impact on the quality of the results. Currently, we’re using
extensive xAPI data to determine the length of time people spend in the
self-study modules and the average length of time to complete segments of
the program. This data will help us understand if there’s time-saving
potential over instructor-led training. We also expect to see cost savings in
the reduction of travel expenses of bringing the cohorts together from all
over the country.
UPDP provides an avenue toward career development for individuals
who would otherwise have limited opportunity to establish an underwriting
career. Many of our participants start with Travelers in a service partner role
at an hourly rate. Others enter the program directly from school, and in two
years’ time are set up with a full book of business. UPDP is one of our
strongest responses to the competitive hiring landscape and the need to
upskill, grow, and retain top talent. We’re able to demonstrate the value of
the program by contemplating all aspects of our performance: If we only
looked at assessments and consumption data, we’d miss the big picture that
our development program produces better quality books, reduces turnover,
and positions Travelers as a place people want to stay their whole career.
Chapter 2

Getting Started With Definitions

To support our journey to using learning data, let’s start by establishing some shared definitions. What follows is a set of common terms used in
data, statistics, and analytics that will help you navigate this space in the
L&D industry. These terms are not mutually exclusive, nor is this an
exhaustive list.
We’ll start with the biggest of them all: data itself.

What Is Data?
The terms data, information, and sometimes even knowledge are frequently
interchanged and misunderstood, but in most casual conversation it doesn’t
really matter. For the purposes of this book, however, we’ll want to be a bit
more specific.

Data
Data consists of symbols and characters that are an abstract representation
of something that exists(ed) or happens(ed) in the world. Without context,
data is meaningless—it is simply a representation of facts, without context,
meaning, or judgment. You could think of a list in a spreadsheet as data.
Here are some examples:
• It rained.
• I am 68 inches.
• Beverly completed a course.
• 94 percent
• 8,287 pounds
When you collect and measure observations, you create data. Since you
can’t collect and measure everything about everything, data is an
incomplete and imperfect representation of what you’re interested in. As
such, data can be remarkably disappointing. There are also plenty of
opportunities for people to poke holes in your analysis if your first step—
your data collection—doesn’t adequately measure what you’re interested in.
(See chapter 9 for more information on identifying data sources.)

Information
Information adds context and meaning to data. It’s the who, what, where, or
when about a particular data point. (Take note: It is not the why.)
Information generally offers some sort of relationship, which may or may
not be true. It can require many data points to create the context that’s
needed. If data is a list in a spreadsheet, you could think of information as a
table with two or more columns. Here are some examples of information:
• It rained on September 20, 2021, in St. Thomas.
• I am 68 inches tall. My best friend is 61 inches tall.
• Beverly completed a course called Introduction to Statistical
Analysis with a score of 94 percent.
• The average dumpster contains 8,287 pounds of garbage.
Information and data aren’t necessarily meaningful. Some information is
meaningful to different people at different times, and some of it never turns
out to be meaningful at all. Other times, you won’t know if it’s meaningful
until you look at it. Here are some pieces of information with different
levels of meaningfulness:
• Beverly completed a course called Introduction to Statistical
Analysis for Marketing with a score of 94 percent.
• Beverly completed a course while wearing a blue t-shirt.
• Beverly completed a course online.
• Beverly completed a course from home.
• Beverly completed a course while drinking a glass of wine.
• Beverly completed a course using the Chrome browser.
• Beverly completed a course and referenced the book The Art of
Statistics, by David Spiegelhalter, while doing so.
• Beverly was wearing pajamas while they completed the course.
Some of this information is very easy to observe, collect, and record.
Some is very difficult, if not outright unethical, to collect. Some of it may
have immediate utility. Some of it may be immediately discardable as
irrelevant. Some of it may be tempting to have on hand just in case you need
it in the future. This is part of the challenge you will face on your data and
analytics journey.

Knowledge
Knowledge is the useful and applicable result of information and the
patterns we can discern from it. With knowledge we can make decisions,
solve problems, and generate predictions. Knowledge influences action.
Here are some examples of knowledge:
• When people reference outside textbooks while taking online
courses, they tend to score higher on the exit tests than people who
only used the course material.
• People who take the online version of the course are more likely to
use outside references and textbooks than people taking the course
with an instructor.
• People who take the online version of the course are more likely to
also be drinking alcohol and wearing pajamas while participating
than people who take the class in person.

Wisdom
Wisdom is generally considered the application of knowledge (along with
other internal resources such as experience, morality, and emotion) to make
good decisions and judgments. It requires use of the principles that underlie
the situation at a systemic level to draw novel conclusions. It is an essential
ingredient to intentional innovation.
To sum it up, data can be gathered. Information can be memorized.
Knowledge acquisition (and application) is generally what we consider to
be learning. Wisdom is what we expect from our leaders and experts. Figure
2-1 shows the progression from data to information to knowledge to
wisdom as a function of increased connectedness between the data and our
understanding of it.

Figure 2-1. The Progression From Data to Wisdom

What Is Big Data?


The concept of big data encompasses how we gather, store, analyze, and
extract information from extremely large sets of data. How large does a data
set have to be before it is considered “extremely large”? For all practical
purposes, big data is larger than what your desktop computer and
spreadsheet software can handle. It’s also more than what the average
professional-in-something-other-than-data-science has the skills to manage.
Big data typically has multiple sources of structured and unstructured
data. The data is fine-grained enough to convey relevant detail about the
activity you’re interested in. Big data is often most useful when it is
relational across sources; for example, when you have employee data in a
learning system and a coaching system and you can connect the two via an
employee ID number. Big data often requires statistics to analyze (more
than simple math like counts and averages) and it lends itself well to
machine learning and artificial intelligence because there’s enough of it to
discern patterns. Big data sets are often extensible, meaning you can add
new data types to them without breaking the original model.
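To make the relational idea concrete, here is a minimal Python sketch using the pandas library; the two systems, their columns, and the numbers are invented for illustration.

import pandas as pd

# Hypothetical exports from two systems, keyed by the same employee ID.
learning = pd.DataFrame({
    "employee_id": [101, 102, 103],
    "course": ["Underwriting Basics", "Underwriting Basics", "Pricing 201"],
    "score": [0.94, 0.81, 0.88],
})
coaching = pd.DataFrame({
    "employee_id": [101, 102, 104],
    "coaching_sessions": [3, 1, 5],
})

# Join the two sources on employee_id; "left" keeps every learning record,
# even when no coaching record exists for that employee.
combined = pd.merge(learning, coaching, on="employee_id", how="left")
print(combined)

Once two sources share a key, the questions you ask can draw on both at the same time.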
Most instructional designers won’t be working with truly big data. We
usually have “lots of data” or “enough data,” but we can still leverage some
of the techniques of big data.

What Is Data Analytics?


Rather than reinvent the wheel, I’ll lean on some very current definitions
from within our field. In Measurement Demystified, David Vance and Peggy
Parskey (2021) define analytics this way:

While many in the profession consider any operation involving numbers to be analytics, we reserve the term to mean higher-level analysis, often involving
statistical tools. For example, we will not refer to determining the number of
participants or courses as analytics. The same is true for reporting the average
participant reaction for a program. In both cases the value of the measure is
simply the total or average of the measured values—no analysis required. In
contrast, a detailed examination of any of the values of these measures, perhaps
using their frequency distribution, the use of regression to forecast the value
of a measure, or the use of correlation to discover relationships among
measures will be referred to as data analytics or analysis.

In short: Analytics is the means by which we use data to create information and knowledge.

What Is Learning (Data) Analytics?


In Learning Analytics, Mattox, Parskey, and Hall (2020), define learning
analytics this way:

Learning analytics is the science and art of gathering, processing, and interpreting data and communicating results including recommended decisions
and actions related to the efficiency, effectiveness and business impact of
development programs designed to improve individual and organizational
performance and inform stakeholders.

We could think of learning analytics as simply data analytics on the topic of learning and performance. And while it might seem that any data
professional could come in and provide insights, I find that some of the
types of data we use in learning and performance are unique (as we’ll
discuss in chapter 4 on unique learning metrics) and require some
explanation to our non-L&D colleagues.

What Kinds of Analytics Are There?


Analytics is often thought of as having four levels of sophistication or
maturity. While I could argue that a stepwise progression wrongly implies
that future stages are inherently more useful than early stages, this is a
commonly used framework across analytics, and is thus a useful one for
you.
In their TD at Work issue, “Fuel Business Strategies With L&D
Analytics,” Gene Pease and Caroline Brant (2018) describe the four levels
this way:

Descriptive analytics create a summary of historical data to yield useful (but limited) information and answer the question “What happened?” These are
commonly referred to as activity or department metrics. They include
information on the number of learners, how many learners completed the
course, and how much time was spent on learning and then offer results and
evaluation by way of a survey, a test, and assessment scores.

Diagnostic analytics examine data or information to answer the question “Why did it happen?” It is characterized by techniques such as drill-down, data
discovery, and data mining that create correlations. With diagnostic analytics,
you can provide an understanding of the mechanisms underlying behavior.

Predictive analytics give you the power to use what happened yesterday—that
is, our old ways of measuring and looking at training effectiveness: the
descriptive and diagnostic analytics—to accurately predict what will happen
tomorrow. These data answer the question “What’s going to happen?”

Prescriptive analytics suggest the best action to take to influence a different outcome and answer “How can we make it happen?”

Learning analytics can be further parsed into learner analytics and learning program analytics. With learner analytics, we’re looking at data,
patterns, and performance of individuals and groups. With learning program
analytics, the focus is on the learning experience and the subsequent
changes (or lack thereof) in performance (Watershed 2022).
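As a rough illustration of how the levels differ in practice, consider this small Python sketch. The practice-time and test-score numbers are invented, and a real analysis would need far more data and far more care.

import numpy as np
import pandas as pd

# Invented course data: minutes of practice and post-test score per learner.
df = pd.DataFrame({
    "practice_minutes": [10, 25, 40, 5, 60, 30, 15, 45],
    "post_test_score":  [62, 74, 85, 55, 92, 78, 66, 88],
})

# Descriptive: what happened?
print("Average score:", df["post_test_score"].mean())

# Diagnostic: why might it have happened? Check how practice time and score
# move together (correlation is suggestive, not proof of cause).
print("Correlation:", df["practice_minutes"].corr(df["post_test_score"]))

# Predictive: what is likely to happen? Fit a least-squares line and estimate
# the score for a learner who practices for 50 minutes.
slope, intercept = np.polyfit(df["practice_minutes"], df["post_test_score"], 1)
print("Predicted score at 50 minutes:", slope * 50 + intercept)

# Prescriptive: what should we do about it? Here, only a naive rule of thumb.
if slope > 0:
    print("Consider building more practice time into the course design.")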

What Is Data Science?


Analytics is a component of the larger field of data science. Data science is
a broad term that includes data gathering, preparation, analysis, and
presentation—topics we’ll cover in this book. Data scientists have a broad
range of skills, such as statistics and applied mathematics, programming,
artificial intelligence, and business acumen.
If your organization has data scientists on staff, or people with data
science skills, now would be a fantastic time to get to know them. And they
want to know you—you bring the L&D acumen and insights to
the partnership.

What Is the Difference Between Metrics and Measures?


Metric and measure are somewhat overlapping terms. In some industries
and organizations, the difference between the two is relevant and I
recommend you follow the local lead there. In common parlance, however,
they are often used interchangeably.
A 2021 National Institute of Standards and Technology article suggests
that measure be used for concrete and objective attributes, and metric for
more abstract, higher-level, or calculated (such as miles per hour) values.
This is the approach that I’ll take in this book.
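For example (my own made-up numbers, not from the NIST article), course completions and active learners are measures you can count directly, while a completion rate is a metric calculated from them, much like miles per hour:

# Measures: concrete, directly counted values (hypothetical numbers).
completions = 412        # course completions recorded this month
active_learners = 1300   # employees with at least one learning activity

# Metric: a higher-level, calculated value derived from the measures.
completion_rate = completions / active_learners
print(f"Completion rate: {completion_rate:.1%}")  # prints roughly 31.7%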
Organizations are all about measures and metrics as guides to improving
results. Key performance indicators, or KPIs, are the metrics that an
organization selects to assess overall effectiveness. Some organizations use
an objectives and key results (OKRs) method, with nested and cascading
goals that are measurable. Measures, metrics, and analytics are at the core
of all of this.
And, in fact, measurement has been a component of the L&D industry
for decades. Our L&D measurement methodologies (such as Kirkpatrick
and the ROI Methodology) are the processes, frameworks, and tools by
which we gather data.

What Are Data Reporting, Scorecards, Data Visualization, and Dashboards?
If analytics is the gathering, processing, and statistical analysis of enough
data to draw conclusions and insights, how is that related to other terms that
we already know and use? Here are rough descriptions of several terms that
may come up in this part of the process. Keep in mind that since there are
no universally accepted definitions for these things, there is a fair amount of
flexibility.
• Reporting is how we display and structure the results of our
measurement work, and share that with stakeholders in the
organization. In common usage, it includes the realms of data and (to
some extent) information; generally involves basic math (count, sum,
average) and not statistical analysis; is descriptive in nature; and
tends to be static. Reports are often run weekly, monthly, or
quarterly; are distributed to a group of stakeholders; and can be
accessed online or printed.
• Scorecards often summarize data from multiple reports into a single,
static snapshot across a variety of related dimensions.
• Data visualization is the graphical display of data and information
using charts, graphs, maps, and so on. Following the axiom “a
picture is worth a thousand words,” data visualization makes the
patterns and insights from reams of data more accessible to humans.
It’s often much easier to see patterns, trends, and outliers in a
graphical format than in a spreadsheet or list.
• A data dashboard usually comprises several related data
visualizations (typically organized by topic), pulling from near-real-
time data. Dashboards can provide dynamic interactions with the
data, such as drill-downs and filtering by different dimensions (like
date ranges or organizational units). Dashboards are usually
considered online tools that are not printed like a report can be.
As you approach reporting, scorecarding, visualizing, and creating
dashboards for your data, you’ll find that there is considerable effort and
opportunity for creativity in this work. At the same time, there is
considerable room for bias, either conscious or subconscious, in what gets
chosen to be highlighted, how, and why.
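To show how little separates a flat report from a first visualization, here is a minimal Python sketch using the matplotlib library; the departments and counts are invented.

import matplotlib.pyplot as plt

# Invented report data: course completions by department this quarter.
departments = ["Sales", "Operations", "Finance", "HR"]
completions = [182, 143, 97, 54]

plt.bar(departments, completions)
plt.title("Course Completions by Department")
plt.ylabel("Completions")
plt.tight_layout()
plt.savefig("completions_by_department.png")  # or plt.show() for on-screen use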

What Is Business Intelligence?


Business intelligence is the term applied to both the practice and the
organizational function responsible for collecting, storing, and analyzing
data for the purpose of improving organizational performance.
To a certain extent, the processes that will be discussed in this book are
very similar, if not identical, to business intelligence, just focused on the
learning function. L&D professionals can learn much from their BI
colleagues, and it’s likely that the BI function will be interested in what
you’re doing with L&D data.

What Are Artificial Intelligence and Machine Learning?


Artificial intelligence (AI) is a broad and emerging field in which
computing is used to simulate human thinking. AI includes functionality
such as voice recognition, machine vision, image processing, and natural
language processing (NLP).
Machine learning is a subset of AI that enhances data analysis by
automating the process of identifying patterns and outliers. Machine
learning applications process data at great speed, following numerous paths
of discovery and rejecting unproductive ones quickly, supporting the analysis of
large quantities of data more quickly than humans can.
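As a toy example of what automated pattern finding can look like (invented numbers, using the scikit-learn library), a clustering algorithm can group learners by behavior without being told in advance what the groups are:

import numpy as np
from sklearn.cluster import KMeans

# Invented learner data: [minutes spent in the course, quiz score].
learners = np.array([
    [12, 55], [15, 60], [14, 58],   # shorter visits, lower scores
    [45, 88], [50, 92], [48, 85],   # longer visits, higher scores
])

# Ask for two groups; the algorithm discovers the split on its own.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(learners)
print(model.labels_)  # e.g., [0 0 0 1 1 1], two behavioral clusters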

These Terms May Evolve


As I noted in the beginning of this chapter, this is not an exhaustive list, nor
are most of these terms mutually exclusive. Perhaps even more exciting,
this is an evolving field. As the maturity around data and analytics in L&D
develops, advances in computing and the business application of data will
evolve into the faster, easier, and more ubiquitous use of learning analytics.
If you, like me, are fascinated with this topic, we can be encouraged that the
opportunity to learn and do more will evolve with and ahead of us in the
years to come. There is plenty of work to be done, and careers to be had in
this emerging field.

What Could Possibly Go Wrong?


All of this is a lot! And it’s very nuanced. I think one of the biggest things
that people could get wrong at this stage is to either become completely
overwhelmed with the number of new terms, or, at the other extreme, to get
overly pedantic about their specific use. As you continue learning about
data and analytics, be aware that you may be ahead of some of your
colleagues, but behind some others. Perhaps the best way of preventing
miscommunication is to avoid throwing around buzzwords while assuming
that everyone else will know what you mean. Use the terms, but follow
them with a description of what you really mean.
In the next chapter, we’ll move into data specifications … and more new
vocabulary words!

Give It a Try
In each of this chapter’s exercises, you’ll examine a learning program using
most of the definitions we’ve just reviewed. Much of the process of how to
do this will be covered later in this book, so don’t be alarmed if you don’t
have sophisticated answers yet. This is just a starting point!
Here’s a Hypothetical
Explore a learning program for which you may or may not be gathering
data. Answer the following questions about that program to practice using
the terminology. (Note that you are answering questions about the program
delivery, structure, results, and so on, not about the topic of the learning
itself.)
• What data could be gathered from or about this program?
• What information would be relevant about this program?
• What kinds of knowledge about the program would it be insightful
for the L&D team to have?
• What sorts of wisdom would the L&D leadership offer in
this context?
• What kinds of simple math can be performed on the data?
• What kinds of descriptive questions do you have about this program?
• What kinds of diagnoses could you perform based on the data about
this program, if you had it?
• What kinds of things would be helpful to predict based on data about
this program, if you had it?
• What kinds of actions might you prescribe based on data about this
program, if you had it?
• What kinds of information would you put on a report or scorecard
about this program?
• What kinds of visualizations might be useful for this program?
• What kinds of things might a dashboard display? Who would want to
use it?
• Does your organization have a business intelligence function? Do
they use any of this data already? Do they use it for AI or machine
learning?
Do It for Real
Explore a learning program for which you or your organization are already
gathering and using data. Answer the following questions about that
program to practice using the terminology in this chapter. (Note that you are
answering questions about the program delivery, structure, results, and so on, not
about the topic of the learning itself.)
If at any point in this list you run into an “Oh, we’re not doing this”
response, feel free to skip back to the previous set of hypothetical questions
to continue on.
• What data is being gathered from or about this program?
• What information is being generated about this program?
• What kinds of knowledge about the program are being created?
• What sort of wisdom is being applied (or perhaps challenged) here?
• What kinds of simple math are being performed on the data?
• What kinds of descriptive analysis are being created about
this program?
• What kinds of diagnostic analysis are being done with this data? Or,
if none is being done now, what could be done with the data?
• What kinds of predictive analysis are being done with this data? Or,
if none is being done now, what could be done with the data?
• What kinds of prescriptive analysis or action are being done with this
data? Or, if none is being done now, what could be done with the
data?
• What kinds of information are being put on a report or scorecard
about this program?
• What kinds of visualizations are being used for this data?
• What is being displayed on a dashboard? Who uses it?
• How is your business intelligence team using this data?
• Is any of this data being analyzed using AI or machine learning?
Bonus Points
Find a stakeholder or expert outside the L&D function in your organization
who is interested in these types of questions. Let them know you’re reading
this book and would like to start a series of conversations about how you
can all better leverage workplace learning analytics. This interested
stakeholder may become a central figure in some of your early analytics
projects.
Using Data to Make Predictions About Future
Performance and Learning Transfer
By Emma Weber, CEO, Lever–Transfer of Learning

Used in the area of learning transfer, predictive analytics can help predict
engagement in learning activities, subsequent behavior change, and the
business impact of the learning. For example, a global technology company
was investing in a leadership program. The client set a 2030 goal that a
nominated percentage of employees would believe their leader is inspiring.
The learning team needed to ensure the leadership initiatives they created
would generate behavioral change toward that goal.
The leadership program had a three-day virtual learning component
covering key elements of teamwork, diversity and inclusion,
communication, and understanding people. At the end of each module the
learners captured insights and then created an overall action plan with three
commitments concerning how they would apply what they’d learned in
their day-to-day roles. During the next eight to 12 weeks, they had three
conversations with Coach M, a chat bot tool. In the conversations, they
would slow down, reflect on their commitments, share with the chat bot the
progress made, and then create a plan for what they would do to move the
commitment forward over the next two weeks.
In the past, the organization had run similar learning programs, but the
challenge had always been threefold, reflected in three key questions from the
learning team:
• How do we identify risk during the 12-week program? What
visibility do we have into what is happening in the workplace after a
program?
• How do we ensure people apply the learning in their roles after the
virtual facilitation phase to produce the desired business outcomes?
• How do we measure and capture the impact of the program?
The learning transfer element of the program was delivered with
participants having reflective coaching conversations with the chat bot.
Each conversation was specific to the participant and tailored based on the
action plan inputs and the participant’s responses.
A mass of data can be collected by this type of tool, so deciding which
data would be useful for analytics was key. Our process
included reviewing:
• What do we want to know at different points during the program?
• What hypothesis are we testing using the data?
• How easy is it to access and manage the data?
Predictive analytics played a big part in rolling out the program and
keeping it on track. We knew that if we could determine early on whether
people did or did not complete a plan, we could get ahead of any significant
problems. (Fortunately, more than 95 percent of people who attended the
program completed an action plan.) For the predictive analytics, we used
our experience over the previous five years to create a model based on
three factors:
• Average number of action plans
• Number of goals created by an individual
• When the action plans were created
This enabled us to determine how many people would likely complete
the eight- to 12-week program. Using predictive analytics was an easy way
to identify, at an early stage in the program, whether learners would be on
track at the end of the 12 weeks, and to provide an opportunity for
corrective action, if necessary.
Actual progress was tracked both for people who were on track and for those
who were at risk. Participants who dropped out of the follow-up process, by
not replying or not rebooking a conversation time, were removed from the bar
chart. These people, called escalations, were captured in more detail in a
pie chart at the bottom of the dashboard and escalated at that stage.
We also used predictive analytics to identify where people were
maximizing the tool and where they were getting stuck in the process. For
example, we could see that when people engaged with the chat bot at
learning break one, they stuck with the tool a predictable percentage of the
time and moved forward over the 12 weeks.
We were also able to determine what level of behavioral data would be
collected to illustrate outcomes. Knowing what percentage of people would
have at least a 30 percent uplift against their goals gave us early
visibility into the impact and outcomes.
We reviewed the alignment of the commitments set across the group and
identified participants whose goals were not aligned with the program
outcomes, allowing us to assess the risk issues mentioned previously.
Finally, we were able to identify key themes from the action plans to
highlight to the learning department the content that learners were most
committed to applying and which program areas were not being given due
focus by learners in terms of application. For example, in an organization
with strategic imperatives around diversity, equity, and inclusion (DEI), this
process helped identify early on whether people were selecting goals from
the program’s DEI area. Looking at the specific goal score and number of
goals by area could illustrate where people were making progress and
compare the relative uplift between areas.
As always with any data and analytics generated by a project, the key is
not only to create the analysis and review the data, but to use the data to
make decisions that benefit your initiative. Decisions were both summative
(such as program improvements that will create better outcomes for the
initiative for future programs or cohorts), and formative (such as optimizing
the reflection and goal-setting process during the program).
What Is the Learning Analytics Maturity Model?
By Ben Betts, CEO, Learning Pool

The Learning Analytics Maturity Model (LAMM) is a simple diagnostic that aims to benchmark your organization’s current level of maturity when
it comes to the adoption of learning analytics. It’s a five-point scale, with
level one being a catch-all for folks who haven’t formalized their plans yet.
The schema then has four more levels: describe, analyze, predict, and
prescribe.
When we say your organization is capable of describing (level two)
using learning analytics, that tells us that you are consistently using learning
data to show you what’s happening in your systems. This is typical of
digital learning reporting dashboards.
The next step up is analyzing (level three). This is where you are
consistently seeking to understand why things are happening.
And if you’ve developed the capability to consistently (in a systemized
way) know why things are happening, you can open up to predicting (level
four) what is likely to happen next.
With that sort of data in hand, you can automate and prescribe (level
five) what should happen next to obtain the best outcome.
This fifth level is a bit of a holy grail, but it’s what folks are being sold
on a daily basis when it comes to recommendations and machine learning.
Using LAMM can give you an idea of how close you are to really using
these sorts of tools, given the data maturity you have at your disposal.
(Note: See chapter 14 for more info about LAMM.)
Chapter 3

Data Specifications in
Workplace Learning

It’s worth taking a moment to discuss the power and value of data
specifications and standards in our learning analytics work. Don’t worry,
we’re not going to spend a lot of time here or get too deep into the weeds.
But having a comfort level with the basics can give you greater insight into
the nature of your data and how easy it will be (or won’t be) to combine it
with other data for analysis. Refer to the appendix for additional reading if
this is an area you’d like to explore even more deeply.

SCORM, xAPI, and More


Our industry, as mentioned in chapter 1, has been both the beneficiary and a
captive of a globally shared data specification for more than 20 years. The
Sharable Content Object Reference Model (SCORM) is not the only game
in town, however, and there are a variety of data schemas in our work.
SCORM does a solid job of keeping track of relatively basic data for
individual uses of e-learning content that are launched from an LMS. It
allows for code-free plug and play from a variety of authoring tools to a
variety of LMSs, and most instructional designers using SCORM don’t
need to know how it actually works.
The downside of SCORM is that it only tracks e-learning, which means
it is missing a whole lot of the learning activity that goes on in most
organizations (such as instructor-led training, video, virtual classroom, and
performance support). SCORM also only tracks learners one at a time, so
the manager who leads their team through an e-learning course during a
meeting, pausing to discuss and explore topics as they go, cannot get credit
for that shared learning experience, even if it did (partly) happen in the
LMS. Since SCORM only accommodates e-learning experiences, it
excludes data from digital learning experiences that don’t “look like” or act
like e-learning. Consider all the polling, microlearning, behavior change,
role play, software walkthrough, mentoring, and other tools that have their
own data schema independent of your LMS. As a result, we end up with
very siloed reporting that is driven by tool and not by topic.
xAPI was created to solve the problems left open by SCORM, notably in
that it can be used to record data, with more variety and greater depth, about
any learning or performance activity, including that which occurs outside an
LMS. (Of course, it also can be used in e-learning contexts, particularly
with the cmi5 data profile.) xAPI is still maturing, and adoption is not yet as
widespread as SCORM, but with popular authoring tools now supporting
more robust xAPI and the years of industry support from the US
government’s Advanced Distributed Learning (ADL) Initiative, software
vendors, and the xAPI Learning Cohort, this is starting to change.
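To make that more concrete, here is a minimal sketch of the structure xAPI uses, written as a Python dictionary. The actor-verb-object shape comes from the specification; the learner, email address, course name, and activity ID below are invented for illustration.

# A minimal, illustrative xAPI statement as a Python dictionary. The
# actor/verb/object structure is defined by the specification; the specific
# learner, activity ID, and scores here are made up.
statement = {
    "actor": {"name": "Example Learner", "mbox": "mailto:learner@example.com"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "id": "https://example.com/activities/safety-101",
        "definition": {"name": {"en-US": "Safety 101"}},
    },
    "result": {"score": {"scaled": 0.85}, "success": True, "completion": True},
}

# Because every statement shares this shape, records from e-learning,
# instructor-led sessions, or performance support can be pooled and analyzed
# together instead of living in tool-specific silos.
print(statement["actor"]["name"],
      statement["verb"]["display"]["en-US"],
      statement["object"]["definition"]["name"]["en-US"])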
Alongside SCORM and xAPI are two related data models: HR Open
Standards for human resources and 1EdTech (formerly IMS) Caliper
Analytics for higher education contexts. HR Open Standards is an initiative
that has been in action for more than two decades, developing and
promoting a set of specifications for the exchange of human resources–
related data across systems and platforms. The Caliper Analytics
specification is designed for collecting and analyzing learning data in
educational contexts.
You may also have heard of the AICC specification. While still relevant
for some very specific use cases, it has largely been superseded by the adoption of
the cmi5 profile for xAPI. In most cases, LMS providers are handling their
own road maps for the AICC- and SCORM-to-cmi5 migration.
Each of these voluntary consensus specifications and standards is free
to use, and they all share some common principles and features in the spirit
of interoperability.
While you can absolutely use learning data and analytics without
following any shared data standard at all—for example, you can make up
your own—the standards are useful in saving hassle and time. In the L&D
profession, we are constantly creating new learning objects, leveraging new
learning modalities, and crafting new learning media. By using a common
data specification, we do not need to invest new effort each time to line up
the data models before we can deploy them. What’s more, employing the
common specifications means that the effort required to merge data from
two or more sources is significantly reduced, if not eliminated altogether.

How Do You Choose a Data Specification?


The good news, and the bad news, is that you aren’t going to be choosing a
specification very often. Generally, you will be working within an existing
learning ecosystem that relies on one or more of the standards just
described, and you won’t have to choose a new one for each project you take on.
If you are building or expanding your ecosystem, you may have the
opportunity to select a data specification, and you’ll want to do so with an
eye toward balancing the maturity and ease of SCORM with the richness
and flexibility of xAPI. We are going to cover this more in chapter 14, when
we discuss scaling up the ecosystem. My general preference in my own
work is to use xAPI wherever I can because it provides the opportunity
for the richest data across the learning ecosystem, but the concepts in this
book are not dependent upon any one data specification.

What Could Possibly Go Wrong?


After seven years of leading the xAPI Learning Cohort, I am likely biased
toward the use of xAPI in learning data and analytics. One problem I see is
when people try to create their own standards and specifications for their
organizational learning. Given that several data specifications in L&D
already exist, this seems like an unnecessary effort. It is also not a very
future-proof or extensible type of solution. However, if an organization has
an existing proprietary data model, continuing to use that approach might
make sense, at least in the short run.
In the next chapter, we’ll tackle unique metrics for L&D.

Give It a Try
Here’s a Hypothetical
Consider your organization’s data culture with the following questions:
• Is there a centralized shared team that is responsible for data (often
called the business intelligence function)?
• How easy is it to get data about the organization’s operations?
• How easy is it to get data about learning in the organization?
• How often is data used for decision making?
Do It for Real
Explore the various data specifications in use in your organization’s current
learning ecosystem with the following questions:
• What specifications are supported by your LMSs?
• What specifications do your e-learning authoring tools support?
• Do you have a learning record store (LRS) for xAPI?
• What other learning modalities does your organization use (such as
ILT or performance support)? How is data about usage and efficacy
recorded?
• Do you have learning tools in your ecosystem that do not share their
data with other platforms?
• Is there a place where all the data for learning is stored?
Bonus Points
Add questions to your next learning technology RFP to explore the
vendor’s use of data specifications and whether their data is interoperable
with the rest of your learning ecosystem.
Why Do Data Specifications Matter for Learning
Analytics?
By Tammy Rutherford, Managing Director, Rustici Software

When it comes to analyzing data—any data—the first step is having a data set that has common sets of information in a standardized format that you
can group together. If you’ve ever tried to sort an Excel spreadsheet that has
freeform text fields or mismatched numbering formats, you know what I’m
talking about! Unless you have a well-defined and consistent expression of
your data, you’ll end up with lines and lines of information that are hard to
organize.
It’s no different when it comes to learning analytics. Without data
specifications, you are most likely going to end up with a lot of information
that’s hard to gather and organize. We at Rustici Software use the example
of video to illustrate how things can go awry without common language.
Some people might say they played a video, while others describe having
watched a video. Both have the same meaning, but when it comes time to
run a report to see how many learners accessed the video, you can end up
with two data sets rather than one (those who watched and those who
played). Specifications also come in handy when it comes to the
communication or exchange of data between the learning activity and the
tool responsible for your reporting or analytics.
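To make the played-versus-watched problem concrete, here is a small, purely hypothetical Python sketch: the records and verb names are invented, but they show how one agreed-upon verb turns two fragmented counts into a single comparable measure.

from collections import Counter

# Hypothetical records from two tools describing the same behavior differently.
records = [
    {"learner": "A", "verb": "played", "object": "intro-video"},
    {"learner": "B", "verb": "watched", "object": "intro-video"},
    {"learner": "C", "verb": "played", "object": "intro-video"},
]

# Without a shared specification, a naive report splits one behavior in two.
print(Counter(r["verb"] for r in records))  # Counter({'played': 2, 'watched': 1})

# With an agreed mapping (the kind of consistency a specification provides),
# both verbs roll up into a single, comparable count.
normalize = {"played": "watched", "watched": "watched"}
print(Counter(normalize[r["verb"]] for r in records))  # Counter({'watched': 3})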
There are a lot of options for data specifications, and it can be
overwhelming! It’s always a good idea to start with the end in mind. Ask
yourself, “What is the intent of this training?” “What do I need to know
about what the learner did within the activity?” Or, “Am I checking the
box for required compliance training, or looking to capture more
meaningful analytics that correlate content to broader learner trends?”
Knowing how much detail you want to see will help inform the best spec to
use.
It’s not just about the level of fidelity you want to capture, but also what
systems are in play when it comes to delivering and reporting on your
learning and training activities. You could build a beautiful course that
tracks a ton of details about how the learner interacts with it, but if your
target system can’t play it or the reporting tool doesn’t have the capability
to manage and uncover all that data, you now have a black hole and no way
to get that data out. Or worse, you may have no way for your learners to
access the content at all.
A question I’m often asked is, “If all you have is SCORM, can you do
learning analytics? Or, do I have to use xAPI?” My answer is that learning
analytics aren’t dependent on one specific standard. It really depends on
what you want to measure. SCORM allows you to collect a defined set of
data—score, duration, completion, satisfaction (pass/fail), and question-level
details (known as interactions)—for learning activities delivered
through an LMS or similar platform. If you want to analyze learner data by
those measures, then SCORM can do the trick. Note, though, that SCORM limits
both the data you can collect and the systems that can deliver it. So, if
you want to go deeper or analyze learning activities outside the
LMS, then you’ll want to look at xAPI.
xAPI is much more flexible in what data you can collect, and it allows
you to overlay context. It isn’t limited to where the learning happens (you
can capture information about learning outside an LMS). There are no
limits to the amount or type of data you can collect using xAPI, which can
be helpful when it comes to learning analytics, depending on the level of
fidelity you’re after. xAPI, by design, also supports sharing data across
systems and removes the risk of having data siloed in one system, as is
often the case with platforms that only support SCORM. If you do plan to
leverage xAPI but must work within the parameters of your LMS, the cmi5
specification should also be on your radar. Incorporating cmi5 (an xAPI
profile for launched activities) into your overall approach to learning
analytics will help bridge your transition from a legacy SCORM-based
model to xAPI.
If you’re interested in analytics and want to choose a specification, there
are lots of resources available across all the standards to help you get
started. SCORM.com is a great resource to dig into the details of how it
works; it also provides recommendations and best practices for creating
SCORM content.
xAPI and cmi5 also have a ton of resources. The xAPI Learning Cohort,
created by TorranceLearning and now operated by the Learning Guild,
offers a great opportunity to get first-hand experience with xAPI. And cmi5
is gaining traction, with a weekly cmi5 working group and recently released
(2022) resources from Project CATAPULT.
If you’re planning to implement learning analytics within your
organization, it’s really important to loop in key stakeholders early in the
process. Find out what data is important to various teams and understand
the tools and options available. This probably means getting to know your
IT team and LMS or other system administrators well. They will play a
huge role in making sure what you have envisioned is possible.
Chapter 4

Unique Learning Metrics

I’ve seen this scenario happen several times—enough to suggest that this is
a phenomenon that others will face as they take on learning analytics: A
team of data experts and L&D professionals comes together and realizes
that there is a lack of overlap in experience and terminology that needs to
be addressed before they can move forward together. The data folks really
know their statistics; the L&D people understand learning. But the data
team may not appreciate all the nuances that come along with a metric like
completion of learning. Many of the terms in L&D have very specific uses,
even though they tend to be described with very commonplace words. And
the L&D people may not appreciate the statistical concepts that will be used
to assess whether the learning is effective. We can’t just throw our data over
the wall to the data scientists and expect to get back something meaningful.
In this chapter we will focus on the kinds of metrics that are relatively
unique to learning and development; these are the concepts that may be new
to your colleagues in data science and analytics. Then in chapter 5 we will
spend time on the statistical concepts commonly used in the analytics and
data science field that L&D professionals may need to brush up on.

What Does It Mean to “Complete” Training?


Let’s start looking at learning data by examining the e-learning model.
That’s not to say that e-learning is the best use case or the most important
way in which people learn; however, it is where our industry has a great
depth of shared experience in basic data. It also is an excellent example of
the nuance that is required in learning data. Many of the types of challenges
you will face in analyzing learning data can be seen in e-learning.
To begin, what does it mean to complete a course? On its face, this
seems like a very simple question, yet when you dig into the details you
will find that there are a number of seemingly similar concepts that cover
the idea of completing something. Here’s a list of distinct terms from the
cmi5 specification, which is an xAPI profile for recording learning
transactions typically launched from a learning management system
(GitHub 2021):
• Completed
• Passed
• Failed
• Abandoned
• Waived
• Terminated
• Satisfied
If you’re familiar with SCORM or have managed an LMS, these will be
familiar concepts to you. As you can see, each of these terms seems like a
perfectly logical reason to believe somebody is “done” with a particular
course. And yet there are nuances here that will be useful for you when you
want to see whether completing training has an impact on performance.
Let’s review the official definitions of each term from the specification:
• Completed: This indicates the learner viewed or did all of the
relevant activities in an AU (assignable unit, such as a learning
object with a unique launch path) presentation. The use of completed
indicates progress of 100 percent.
• Passed: This indicates the learner attempted and succeeded in a judged
activity in the AU.
• Failed: This indicates the learner attempted and failed a judged
activity in the AU.
• Abandoned: This indicates that the AU session was abnormally
terminated by a learner’s action (or due to a system failure).
• Waived: This indicates that the LMS determined that the learning
requirements were met by means other than the moveOn criteria
being met.
• Terminated: This indicates that the AU was ended by the learner
and that the AU will not be sending any more statements for the
launch session.
• Satisfied: This indicates that the LMS has determined that the
learner met the moveOn criteria for all AUs in a block or all AUs in
the course.
As you can see, it is possible to “fail” a learning activity and still be
marked “complete” for it based on the rules provided by the instructional
design and development team, if that is their intent. My goal here is not to
parse out the distinction between these fine details, but rather to call out the
importance of providing clear direction when using this sort of industry-
specific data.
There are other parameters around completion that may be interesting for
data analysis, and time is one of them. For example, the time duration
between enrollment and completion, data that can often be gathered from
the LMS or LXP, may offer insights into course rollout and change
management. The time a learner spends within a course, which is generally
calculated and sent by the course itself, can be an indication of something
about the learning experience. However, many of us have seen examples in
which an individual left a single course open in a browser for an extended
period of time while they went off to check email, get a cup of coffee, or
finish their work for the day only to come back the following day to finish.
This is an example of a seemingly straightforward metric collecting “dirty
data” that would skew analysis.
Another interesting piece of data on completion includes the number of
attempts required for completion. This can provide insights about the
learning experience, the length of time participants spent in any one sitting,
and the difficulty of the material. Note that depending on the platform and
the data specification you use, you may not have data on the number of
attempts on a single course.
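Here is a hedged sketch of both ideas in Python: counting which statuses you treat as “done,” and screening out implausible durations before reporting time spent. The records, field names, and four-hour cutoff are all assumptions for the example.

from statistics import median

# Made-up course records; status uses cmi5-style terms, duration is in minutes.
records = [
    {"learner": "A", "status": "completed", "duration_min": 42},
    {"learner": "B", "status": "passed", "duration_min": 35},
    {"learner": "C", "status": "failed", "duration_min": 55},
    {"learner": "D", "status": "abandoned", "duration_min": 3},
    {"learner": "E", "status": "completed", "duration_min": 1120},  # tab left open overnight
]

# Decide explicitly which statuses count as "done" for this analysis.
done_statuses = {"completed", "passed", "waived"}
done = [r for r in records if r["status"] in done_statuses]
print(f"Completion rate: {len(done) / len(records):.0%}")

# Screen out implausible durations (an assumed four-hour cutoff) so one open
# browser tab doesn't skew the reported time spent.
plausible = [r["duration_min"] for r in records if r["duration_min"] <= 240]
print(f"Median plausible duration: {median(plausible)} minutes")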

Tests and Assessments


Test and assessment data offer a ready source of quantitative information
that is often a quick win for early efforts at learning data analytics, although
it is not the only game in town.
For every multiple choice or multiple answer question, there are clear
choices made by learners, clear correct and incorrect answers, overall test
scores and passing rates, and lots of relatively easy-to-use data that can be
generated as a result.
Other question types also lend themselves to easily quantifiable data,
such as true/false, numerical answers, and ranked or ordered questions. Free
text entry and hotspot questions can be scored as correct or incorrect,
although that greatly flattens the richness of the data that could be gathered
from these more complex question types. (For example, where did the
learners click? What did they actually type for their answer?)
You could also gather the amount of time spent on each question, which
can be easily done as long as each question exists on its own page or section
of the screen.
Confidence-based testing asks the individual how sure they are of their
response, in addition to their answer itself. This information can be used for
scoring the test, but can also be a useful data point for analytics.
Once you’ve gathered data from your assessment tests, you may be
interested in two measures commonly used in psychometrics (a field within
psychology and education that focuses on design and measurement of
testing and assessment instruments): item difficulty and item
discrimination. Item difficulty measures how hard a particular question is,
and it's calculated as the proportion of all test takers who answer that
question correctly. Item
discrimination is the ability of any one question to differentiate high-
scoring test takers from low-scoring ones. It answers the question, “Does
doing well on this test question correlate to doing well overall on the test?”
These two metrics can help you create more effective and reliable
assessments.
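As a rough sketch of what these two measures look like in practice, here is some illustrative Python over a tiny made-up response matrix. Real psychometric work typically uses point-biserial correlations and much larger samples; this simplified version just computes the proportion answering each item correctly and compares top scorers with bottom scorers.

# Rows are test takers, columns are items; 1 = correct, 0 = incorrect (made-up data).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

n_items = len(responses[0])
totals = [sum(row) for row in responses]

# Item difficulty: proportion of test takers answering the item correctly.
difficulty = [sum(row[i] for row in responses) / len(responses) for i in range(n_items)]
print("Item difficulty:", [round(d, 2) for d in difficulty])

# A simplified discrimination index: difference in proportion correct between
# the top half and bottom half of overall scorers (not a full point-biserial).
ranked = [row for _, row in sorted(zip(totals, responses), key=lambda pair: pair[0])]
half = len(ranked) // 2
bottom, top = ranked[:half], ranked[-half:]
discrimination = [
    sum(r[i] for r in top) / len(top) - sum(r[i] for r in bottom) / len(bottom)
    for i in range(n_items)
]
print("Item discrimination:", [round(d, 2) for d in discrimination])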

Engagement With the Learning Experience


Another interesting avenue for data and analytics is achieving a better
understanding of the learning experience itself. And while collecting
completion and test data is solidly achievable with SCORM, engagement
analysis generally requires something more fine-grained, such as xAPI.
Understanding the learning experience within a single course or e-
learning module can be incredibly helpful for the design team. For example,
years ago one of my clients wanted to measure a multi-part learning
experience that included a self-diagnostic, e-learning, a virtual classroom,
job aids, and manager support. I asked if they were interested in
engagement metrics from within the e-learning course. At first, the response
was something to the effect that “We will know they’re engaged in it
because we can see that they completed it.”
That is partly true. But is completion really the same as engagement? We
can also measure whether people are engaged by looking at how many
things they clicked on a screen. If they got a wrong answer, did they go
back and try to correct it? Did they review content multiple times or just
push through once to course completion? Did people even see the light grey
hint boxes that we put on several of the screens to give them more
guidance? And if they clicked on the hint, did that help them perform better
in the rest of the course?
We can also see what kinds of preferences and support people choose
when using a learning experience. If content is offered in both text and
video formats on the same screen, do they choose to watch the video? If a
particular type of hint is offered to a learner, does that help them perform
better? Or does it teach them to rely on the supports and discourage
independent thinking and risk taking?
Stepping back, we can use data and analytics to examine a learning
experience across multiple engagements. How do learners find a particular
experience? In what order do they consume resources? What is the last
thing that people do before abandoning the learning experience? Do
different cohorts tend to approach things differently and influence one
another? For learning experiences that persist over time, such as chat bots
and performance support, for how long do learners engage with them?
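As a hedged sketch of one of those questions (did learners who clicked the hint perform better?), here is some illustrative Python over made-up interaction records; the field names are invented, and a real analysis would need the statistical care covered in chapter 5.

from statistics import mean

# Made-up per-learner engagement records captured during an e-learning module.
interactions = [
    {"learner": "A", "clicked_hint": True, "score": 0.90},
    {"learner": "B", "clicked_hint": False, "score": 0.70},
    {"learner": "C", "clicked_hint": True, "score": 0.85},
    {"learner": "D", "clicked_hint": False, "score": 0.80},
    {"learner": "E", "clicked_hint": True, "score": 0.95},
]

hint_users = [r["score"] for r in interactions if r["clicked_hint"]]
non_users = [r["score"] for r in interactions if not r["clicked_hint"]]

# A simple descriptive comparison; deciding whether the difference is
# meaningful is a job for the statistics covered in chapter 5.
print(f"Mean score, used the hint: {mean(hint_users):.2f}")
print(f"Mean score, did not use the hint: {mean(non_users):.2f}")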

Effects of Learning on Performance


Most of us in workplace learning are keenly interested in the impact of
learning on on-the-job behavior and performance. In some cases, an
individual’s performance data is captured (and maybe trapped) in the
systems of work. In other cases, we can use coach or manager observation
data to assess on-the-job performance, skill acquisition, and ultimately skill
degradation over time. Observation checklists and rubrics allow for
instructors, coaches, peers, and managers to evaluate on-the-job
performance in a survey-like format. This not only quantifies something
that may otherwise be difficult to quantify, but also digitizes it and gives us
access to that data for analysis. We can analyze the performance of
individuals and groups doing the job, as well as the performance of the
observers themselves, a concept called inter-rater reliability.

What Could Possibly Go Wrong?


A couple of pitfalls exist in this space. One is that you focus simply on
course completion and test scores and don’t look at some of the broader
areas in which we can collect and analyze data. The other is assuming that
people outside the L&D function have an inherent understanding of the
types of data that we are interested in, and not engaging in a rigorous look
at the data and what it really means prior to diving into analysis.
As you can see, learning data and analytics is not like dry cleaning. You
cannot simply drop your data off with the data team (if one even exists in
your organization) and come back a bit later to pick up a clear set of
actionable answers. As with any function in the organization, our data has
some nuances that require our functional expertise to guide the exploration
and analysis. And at the same time, we have unique questions that are
informed by our profession and our experience that can guide us to
meaningful analysis of the data we have, as well as targeted efforts to gather
data that we don’t currently have for future use.
In the next chapter, we’ll turn to an overview of the key statistics terms
you’ll need to round out your foundation in data and analytics.

Give It a Try
Here’s a Hypothetical
Let’s take testing as an example because there is lots of data available here.
Imagine a test for a topic that you might create a course for:
• What kinds of questions do you have about the results of the test?
• What questions do you have about the experience of taking that test?
• What kinds of data would you want to collect to learn more
about that?
Do It for Real
Again, take testing as an example. Locate a course for which you have
ready access to question-level test data. Download that data in a way that
obscures the names of the individual test takers, because that is both
irrelevant here and a violation of those individuals’ privacy. We haven’t
gotten into the details of analytics yet, so for now just scan, sort, and search
through that data and see if any patterns emerge as you go. What kinds of
questions do you have about the results and the experience of that test
itself? What kinds of data would you want to collect to learn more about
that?
Bonus Points
If your industry has an educational association or credentialing and testing
body, consider contacting their testing team to see if you
can learn more about the types of test analytics they are using to fine tune
their products and credentials.
Using Learning Data to Discover Insights Into
Student Performance
By Andrew Corbett, Senior Instructional Designer, University of
California–Davis

At University of California–Davis, we were exploring clinical reasoning in physician assistant students. Specifically, how do students approach the
diagnostic process? And, where do they tend to make errors in clinical
reasoning? We’ve only scratched the surface, but we did discover that some
clinical students have difficulty identifying pertinent negative findings
during the process of collecting a patient history and the exam.
To address these concerns, we collected data as students worked through
a computer-based simulation of a clinical encounter. We gathered data on
the sequence of steps students made, the questions they asked, and which
findings they considered pertinent. There were also pieces of data we didn’t
collect due to limitations in the design of the virtual patient experience. This
highlights the importance of knowing early on what behaviors and
decisions you want to examine.
To do this work, we needed to tap into our technology ecosystem. We
created our virtual patient simulations in Lectora to take advantage of its
built-in xAPI support. We were not in a position to deploy our own LRS in-
house, so we adopted Yet Analytics as our LRS (largely because the project
occurred early in the growth of xAPI and Yet Analytics was the only
affordable LRS solution we could find).
Our findings led to some surprises. In particular, I was surprised by the
variability in performance between simulated encounters by the same
students—there was more “noise” than I expected. The other surprise was
that I was unable to find any meaningful correlations between student
performance and any external factors or variables until we used cluster
analysis. To do this, we organized our data into groups or “clusters” (in this
case, students) based on how closely related they were. We could then
examine how different the clusters of students were from one another, which
showed more clearly that some groups performed consistently well while
others consistently did not.
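The UC Davis analysis itself isn't reproduced here, but as a rough sketch, cluster analysis in Python often looks something like the following: a handful of made-up per-student features grouped with k-means from scikit-learn.

from sklearn.cluster import KMeans

# Made-up per-student features: [pertinent findings identified, diagnostic errors].
features = [
    [9, 1], [8, 2], [9, 0],   # consistently strong performers
    [4, 6], [3, 7], [5, 5],   # consistently weaker performers
    [7, 3], [6, 4],           # somewhere in between
]

# Group students into three clusters based on how similar their features are.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

for student, label in zip(features, labels):
    print(f"features={student} -> cluster {label}")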
The biggest challenge we faced was processing the data once it was in
the LRS. Even for our relatively small project, the data was voluminous.
And there was no easy way to extract the cross-sections we needed to
answer the questions we were interested in. So, there was a lot of manual
processing on my part. I would say that even now, with LRS dashboards
becoming more powerful, it still is incredibly important to construct xAPI
statements with an eye toward how you will extract the information you
need from the data on the back end.
Using Data to Proactively Identify Skills Needs
and Map Content Accordingly
By Derek Mitchell, Global Analytics Lead–People Development,
Novo Nordisk

Within Novo Nordisk, we realize the importance of understanding our colleagues’ skills, abilities, and needs. We also realize that their time is
important, and we strive for data collection not to be onerous.
By simply asking colleagues two questions (“Tell us what skills you are
interested in” and “Tell us how you rate yourself on those skills”), we can
map individuals to relevant content that matches both their interests and
experience level. This is a benefit to our colleagues, but this data also offers
Novo Nordisk a tremendous competitive advantage.
We joined this data to other data sources we have permission to use, such
as HR hierarchy data. Where the learning function used to be a reactive
order-taking function, we can now (very quickly) see learning and
performance data for any cohort of age, location, seniority, or job function.
This allows us to be proactive in bringing solutions into the business.
Beyond using this very simple data set to radically transform the
relationship with senior stakeholders across the business, we can very easily
understand which skills are shared across groups.
For example, in Figure 4-1’s illustrative view, we can see that quality
and research functions share many skills, but research has four skills not
held by those working in quality and quality has two skills not held by those
working within research.
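A comparison like the one in Figure 4-1 boils down to set operations. Here is an illustrative Python sketch; the skill names are invented stand-ins, but the counts mirror the pattern described above (a large shared core, four skills unique to research, two unique to quality).

# Invented skill sets for two functions; only the overlap pattern matters.
quality = {"GMP", "auditing", "risk assessment", "data analysis",
           "documentation", "root cause analysis", "statistics",
           "regulatory submissions"}
research = {"data analysis", "documentation", "root cause analysis",
            "statistics", "GMP", "auditing",
            "assay development", "study design", "lab automation", "modeling"}

shared = quality & research          # skills held by both functions
only_research = research - quality   # skills research has that quality lacks
only_quality = quality - research    # skills quality has that research lacks

print(f"Shared skills: {len(shared)}")
print(f"Unique to research: {len(only_research)}")  # 4 in this example
print(f"Unique to quality: {len(only_quality)}")    # 2 in this example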
These two simple questions allow us to map out strategic journeys for
our colleagues based on where we see the business evolving over the next
five years, understanding where we may want certain functions to develop,
and allowing us to route people through the most efficient development
frameworks.

Figure 4-1. Skill Map for Quality and Research Functions


Chapter 5

A Little Bit of Statistics

While you can absolutely get a start in learning analytics using some simple
math that many of us learned in school, the truth is that to answer more
interesting and complex questions you need more interesting and complex
calculations.
It’s for this reason that more mature learning analytics teams include
someone with a statistics or data-science background. Not everyone on the
team needs to have these skills, but having an appreciation and awareness
for them will help you work more effectively.
Basic math and simple statistics will become useful competencies for
instructional designers as we continue to leverage analytics. Perhaps few
have said this as eloquently as Janet Effron of Notre Dame University
(whom you’ll meet later in a sidebar). She frequently says in her
presentations that “Las Vegas is a glowing monument to the fact that people
are ignorant about statistics.”
Fortunately, the analysis of data and statistics is currently a popular topic
in the press. There are several approachable—if not downright entertaining
—books on the subject that I’ve listed in the further reading section.
In this chapter we will take a look at measures of central tendency,
measures of spread, and the sometimes mysterious concepts of statistical
significance and confidence. But first we will explore three types of data.

Three Types of Data


Data typically comes in one of three forms. Knowing which form you are
working with is essential to knowing what you can do with it next.
Quantitative data encompasses things you can count. Assessing
quantitative data gives you a first glimpse of hard numbers that indicate
how successful a learning program is at achieving its goals. Quantitative
data can be added, averaged, subtracted, and so on.
Examples of quantitative data include how many learners attended the
course, how many completed the course, scores learners received on post-
tests, the number of hours learners spent practicing a goal-oriented
behavior, and sales or customer satisfaction numbers that increased or
decreased as a result of the training.
Qualitative data is information gathered for analysis that isn’t measured
numerically, at least not without some additional analysis. It can capture
data about the thoughts, opinions, and feelings the learner experienced.
Qualitative data can provide a narrative around quantitative data—what
caused it, how it happened, and why it happened. However, it can be harder
to analyze than quantitative data because it is context-specific and doesn’t
always provide insight beyond each individual or group of learners.
Qualitative data is usually gathered with smaller sample sizes than
quantitative and uses open-ended questions that are subjective. It can help
L&D professionals develop ideas for future instances of the same course or
similar courses. Examples of qualitative data include free text entry on
learner surveys, post-training interviews, observation notes, focus groups,
and so on.
Qualitative data cannot be added, averaged, or calculated upon without
additional processing. This additional processing might include reading and
scoring text, video, or images. Sentiment analysis processing, which uses
machine learning, can parse through large blocks of text information and
provide quantitative scores or categorical data that indicates the nature of
the text, such as positive, negative, emotion, suggestion, or action item.
Sentiment analysis is increasingly being used on the free text entry
responses on post-course evaluations, but can also be used within the course
experience itself.
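Production sentiment analysis relies on trained machine learning models. Purely as a toy illustration of the idea (turning free text into positive, negative, or neutral categories you can count), here is a tiny lexicon-based sketch in Python; the word lists and comments are made up.

from collections import Counter

# Toy word lists; real sentiment models are trained on large labeled datasets.
POSITIVE = {"great", "helpful", "clear", "love", "useful"}
NEGATIVE = {"broken", "confusing", "frustrating", "error", "stuck"}

def toy_sentiment(comment: str) -> str:
    words = comment.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

comments = [
    "Really helpful activity, the examples were clear",
    "The link in activity 3 is broken and I am stuck",
    "Completed the module",
]

# Counting categories per activity is what lets a design team spot a
# concentration of negative comments without reading every response.
print(Counter(toy_sentiment(c) for c in comments))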
Here’s one of my favorite examples of the use of sentiment analysis.
Learning Locker, an LRS and learning analytics platform provider (since
acquired by Learning Pool), offered a free course on how to use xAPI in
their Curatr (now called Stream) product, which tracked its data using
xAPI. Throughout the course, learners consumed some learning content
(video, article, interactive activity) and then answered open-ended
discussion questions about that content. The course collected the
quantitative data on the number of points earned by consuming content and
the scores of quiz questions along the way. And the course also collected
the free text discussion board data and processed it using sentiment analysis
to determine if it was a positive comment, a negative comment, or a neutral
comment. The thinking was that a mass of negative sentiment on particular
aspects could help them quickly pinpoint areas of the course that needed
attention from instructors.
And sure enough, it did! A concentration of negative comments was
highlighted for one of the activities. It turned out this activity had a bug or
incorrect link, and people were responding to that in the discussion board.
This allowed the design team to quickly make an update without having to
personally read all the course comments.
Categorical data is the type that tends to throw people off. It represents the data we
use to sort other data. Some examples of categorical data are often found in
our employee demographics, such as job role, job title, office location, and
new hire status. Categorical data can be a very important way of tagging
and slicing other data for comparison.
It’s important for you to be aware of the limitations involved with
performing mathematical computations on categorical data. Categorical
data can only take on a limited number of values. For example, the team at
HubSpot created a dashboard that displays course completion by manager.
One’s manager is a piece of categorical data: You cannot average one
manager with another manager to get something meaningful. However, it is
very meaningful to be able to slice the data by manager to show them each
how their team is performing.
The remaining discussion in this chapter focuses primarily on
quantitative data and the operations we can use to explore it.

Measures of Central Tendency


We use measures of central tendency when we are looking to discover the
usual outcome of a particular activity. We use these all the time, so they
may seem very familiar to you.
• Mean: The value of all numbers added up and then divided by the
quantity of numbers that were added. For example, the mean
percentage of scores on a post-test gives you an indication of how
well the participants did, which can help you evaluate the training
material or the questions the post-test contained. The mean is the
most common calculation when referring to “averages.”
• Median: The middle value in a list of numbers when sorted in
numerical order from smallest to largest. The median is useful in
analyzing data that has significant outliers (unusually high
or low values). For example, if you have a group of 20 learners and
19 of them rate the training somewhere between 7 and 10 out of 10,
but one of them gives it a 1, that single 1 will bring down the average
in terms of the mean rating significantly. Using the median instead of
the mean for analysis may help you get a better big picture of how
the training went.
• Mode: The value in a list of numbers that occurs most often. If no
value occurs more often than any other, there is no mode.
Otherwise, it’s the value that appears most frequently. Identifying the
mode is useful for seeing how often something occurred. For
example, if you ask learners how often they felt like they needed a
break during a training session and gave them a choice of one to
four, the mode might give you greater depth of guidance on how
many breaks to design into future sessions.
Each of these measures of central tendency offers a slightly different
perspective about what’s going on in your data. Selecting which you use for
a particular analysis requires you to understand the underlying data itself.
These simple formulas are available in Microsoft Excel and Google Sheets,
so you won’t have to manually calculate them.
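For a quick hands-on feel, Python's built-in statistics module handles these calculations too. A sketch with made-up ratings, echoing the 20-learner example above:

from statistics import mean, median, mode

# Made-up ratings: 19 learners rate the course between 7 and 10, one gives a 1.
ratings = [10, 9, 10, 9, 10, 10, 9, 10, 9, 10, 9, 10, 10, 9, 10, 9, 10, 10, 9, 1]

print(f"Mean:   {mean(ratings):.2f}")   # pulled down by the single outlier
print(f"Median: {median(ratings)}")     # more robust to that outlier
print(f"Mode:   {mode(ratings)}")       # the most common rating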
When we think of a mean, we commonly think of data that is distributed
with a peak in the middle, tapering to lower values at the upper and lower
limits (Figure 5-1). Think of these as test scores on a course.

Figure 5-1. Measures of Central Tendency Example 1


However, what if your data is actually shaped like two mountains with a
valley in between (Figure 5-2)? In this case the mean is the same as in the
previous graph, but it is also meaningless, and perhaps even misleading as a
description of what’s going on in your data.

Figure 5-2. Measures of Central Tendency Example 2

While that is an extreme situation, consider the impact that a wide distribution of data has on the mean in a case where there's a peak in the high end of values, but a wide distribution over the rest (Figure 5-3).

Figure 5-3. Measures of Central Tendency Example 3

In Figure 5-3, the mean is skewed lower than you might think because of
the presence of some data in the low end of the range. In such a case, the
median or mode may be a more appropriate measure of central tendency
(Figure 5-4).

Figure 5-4. Measures of Central Tendency Example 4


Another option would be to calculate the mean without the outlier data.
If you choose to do so, you should seek to understand what is going on with
those outliers so that you are not missing something important. You should
also convey to the recipients of your data the fact that you excluded outliers
from your analysis.

Measures of Spread
Measures of spread help us understand the degree of variation—and
therefore risk—in our data. Measures of spread give us a sense for the range
of possible outcomes and can help us make smarter decisions based on the
data. For example, if you found out that your learners rated a course
experience three out of five stars, you might interpret that as a mediocre
rating, and take efforts to improve the course. However, you might find
that one group of people responded very poorly to the course (rating it a
one) and a separate group responded very positively (rating it a five),
bringing the average of the two groups to about a three-star rating.
Changing the course overall might
actually make things worse for those people who really enjoyed it and rated
it a five. This is why a measure of spread provides more depth than simple
averages. Two measures of spread are range and variance:
• Range: The difference between the highest and lowest value in a
data set. As you can imagine, in situations of high spread, your
ability to rely on the mean, median, or mode may be lower. I like to
think of range as the fast-and-easy look at spread in a data set. For
example, in a batch of test scores, if the lowest scoring individual
earned a 40 percent and the highest scoring individual earned a 100
percent, the range is 60 percentage points. This may indicate that you
should go digging more deeply into why the test had such a large
range of success rates. On the other hand, if your lowest scoring
person earned a 90 percent on the test, a quick look at a very narrow
range on the very high end of test scores indicates that the test is very
easy to pass.
• Variance: The average of the squared differences of each value from
the mean. A high variance indicates that the data in the set are far
from one another and far from the mean. A small variance indicates a
tighter grouping around the mean.
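For a quick look at both measures in code, here is a small Python sketch with made-up test scores; pvariance computes the variance of a complete data set (as opposed to a sample).

from statistics import pvariance

# Made-up test scores (percentages) for one cohort.
scores = [40, 62, 75, 78, 81, 85, 90, 100]

value_range = max(scores) - min(scores)   # fast-and-easy look at spread
variance = pvariance(scores)              # average squared distance from the mean

print(f"Range: {value_range} percentage points")
print(f"Variance: {variance:.1f}")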
Figure 5-5 shows a data set that demonstrates a lot of range and variance.
The data points are widely scattered from each other and across a wide
range of values. While you could calculate the mean for this data set, it
would not be an accurate representation of the nature of the results.

Figure 5-5. Measures of Spread Example 1

The data set in Figure 5-6 shows less variance but a similar range. There
are two tight groupings of data points, one at the high end and one at the
low end. In this data set the mean would also be a useless number.

Figure 5-6. Measures of Spread Example 2

And Figure 5-7 shows low variance and a small range. As demonstrated
by a tight grouping around the mean, for this data set the mean would be a
useful number.

Figure 5-7. Measures of Spread Example 3

A third measure of spread is standard deviation. Standard deviation—the square root of the variance—allows you to examine each data point's
relationship to the mean. Is it within one standard deviation? Two? More?
You can think of the standard deviation as the average distance from the
average: on average, how far is each of my individual observations away
from the mean? The larger the standard deviation, the more spread there is
in your data. If the standard deviation is bigger than the average value itself,
that indicates there is a lot of variation in the data. If the variance and the
standard deviation are small, that indicates your mean is a good
representation of the data.
For example, let’s say that a team has a mean quality performance score
of 75 percent before training. Upon analysis of quality numbers one month
after the training takes place, you’re initially disappointed to see that
average quality scores only increase to 78 percent. Was the training not
valuable? However, upon deeper investigation, you see that the standard
deviation in quality scores was 5 before the training, and is now 2. From
this, you may be able to tell that the low-performing individuals improved
their results significantly.
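Here is a small sketch of that scenario with made-up score lists. The exact numbers differ from the ones above, but the pattern is the same: the mean barely moves while the spread collapses.

from statistics import mean, pstdev

# Made-up quality scores for the same team before and after training.
before = [65, 70, 75, 75, 80, 85]
after = [76, 77, 78, 78, 79, 80]

print(f"Before: mean {mean(before):.0f}, standard deviation {pstdev(before):.1f}")
print(f"After:  mean {mean(after):.0f}, standard deviation {pstdev(after):.1f}")
# The modest rise in the mean hides the real story: the lowest performers
# improved, so the scores are now tightly grouped around the average.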
There’s a very practical application to the use of the standard deviation.
If you were to assume a smooth and orderly, also known as normal,
distribution of data such as in Figure 5-8, the standard deviation can inform
your confidence in your assertions about your data.

Figure 5-8. Standard Deviation Example

What this says is that if I declare that my range of results is one standard
deviation on either side of my mean, about 68 percent of my results fall in that
range. I can be roughly 68 percent confident that the true answer is somewhere
within that range. If I go out two standard deviations, I can be roughly 95 percent
sure that the true answer is within that range. And so on.
Of course, most data in the real world does not follow a perfectly smooth
normal distribution—and this is where it’s helpful to engage statistical
analysis.
Side note: You may have heard the term Six Sigma in an organization’s
quality improvement efforts. Sigma is the Greek symbol used for standard
deviation. Six Sigma implies a target quality level in which six standard
deviations fit between the process average and the nearest acceptable limit
on production output, or about 99.99966 percent defect-free.

Sampling
Sampling is when you only use a subset of the data instead of all of it. In
L&D we may have access to all the learning data for all the learners in our
population, and in these cases sampling may not be relevant. However,
sampling can be useful if our data set is very large, if we are just getting
started, or if we’re conducting an experiment and comparing one group to
another. What we’re doing with sampling is using a subset of the data to
make an inference about the entire population.
It’s absolutely critical when you are sampling to make sure that your
sample is actually representative of your entire population. As you can
imagine, if your sample includes a subset of the population that is different
from others in terms of performance, geographic region, tenure with the
company, and so on, your data and decision making will be skewed.
There are several decisions and techniques that you can use to more
reliably create a sample that is truly representative. Randomization is a
commonly used technique in which participation in one test group or another
is assigned at random. This tends to be the gold standard. At the opposite end is
convenience sampling, where you have a sample that is made up of the
portion of the population that is easiest to reach; email surveys are a
common example. One of my favorite examples of convenience sampling is
the Level 1 course evaluation as it is typically implemented in
organizations. The course evaluation is voluntary, so we only hear from the
people who chose to fill it out for one reason or another. It is convenient,
but not necessarily representative.
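As a minimal sketch of randomization in practice, here is how you might draw a random sample from a learner population in Python; the population is just a list of made-up IDs.

import random

# Made-up population of learner IDs.
population = [f"learner_{i:04d}" for i in range(1, 2501)]

# Randomly draw a sample rather than taking whoever is easiest to reach;
# random.sample selects without replacement.
random.seed(42)  # fixed seed only so the example is repeatable
sample = random.sample(population, k=250)  # 10 percent of the population

print(len(sample), sample[:3])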
When we do not have a representative sample, we can say that the
sample may be biased. A biased sample gives us results that are not
generalizable across the entire population. A classic example of this is
medical and social science research that, throughout the 20th century, was
often conducted on white male college students because they were
conveniently available to researchers. However, young white males are not
representative of the entire human population, and thus the results of these
studies may not be applicable to other groups.
To draw on an example a little closer to home, consider the number of
times in which instructional designers reach out to their co-workers in L&D
to help them pilot-test a course. The thinking here is that their colleagues do
not know the content that is being taught, therefore they make good test
learners. On the other hand, the instructional designers are not in the job
function of the target audience and will never need to use this content to
improve their performance at work. They are not a truly representative
sample of the population who will use this course, and any results that are
gleaned from a pilot test will need to take that factor into account.
At its very core, sampling bias reduces our ability to draw meaningful
conclusions from data.

Relationships Between Variables


In your analytics work you will have different data points that you are
looking at, such as independent and dependent variables.
Independent variables are the features you are changing in different
trials to test their efficacy. Think about this in terms of an experimental drug
trial, where receiving the drug or not receiving it is the independent
variable. The experimenters are controlling who and maybe how much of
the drug is received. In this case, receiving or not receiving the drug is a
categorical data point.
The dependent variable is the outcome that you are measuring. It is the
effect, to follow our example, of having received that drug or not.
In an L&D context, Table 5-1 outlines some examples of independent and
dependent variables that you may work with. The dependent variables here
are quantitative data points.

Table 5-1. Independent and Dependent Variables


Independent Variable | Dependent Variable
Course completion status | Employee engagement score
Final test score | On-the-job quality output
Coaching provided throughout the course (or not) | Final test score
Course medium | On-the-job quality output
Pre-training quality output | Final test score

As you can see, depending on the nature of the question that you are
analyzing, different data points can serve as the independent or the
dependent variable. This is something we can have a little bit of fun with. In
chapter 8, we will discuss generating the questions that will form the basis
of your analysis.

Correlation and Causation


When an independent and dependent variable tend to move together, we can
say that they are correlated. You may have heard the phrase “Correlation
does not equal causation.” Causation is when the change in one variable
actually results in a change in the other variable.
There are all sorts of logical and statistical tests required to
prove causation. In the L&D field, we are frequently called upon to “prove
our value,” and showing that training leads to better on-the-job performance
would be a fantastic result of our analysis. However, it’s entirely possible
that training might only be correlated with better results if, for example, a
simultaneous new product introduction is what actually caused the increase
in sales that we measured.
We won’t go into depth on specific statistical analyses required to prove
causation (it is considerably difficult to prove mathematically); however, I
suggest that you apply a healthy dose of common sense and logic if you are
tempted to claim causation in your data. For an amusing look at what
happens when we draw too close a connection between correlation and
causality, check out Tyler Vigen’s book Spurious Correlations, in which he
leads with the shocking correlation between the number of movies the actor
Nicholas Cage stars in each year and the utterly unreleated—but correlated
—number of deaths by pool drowning.
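If you are curious what a correlation looks like as an actual calculation, here is a minimal sketch in Python using the standard Pearson formula. The test scores and sales figures are invented for illustration; a coefficient near +1 or -1 indicates a strong relationship, a value near 0 indicates little or none, and, as discussed above, none of this proves causation.

  # Hypothetical paired data: final test scores and later monthly sales
  test_scores = [72, 85, 90, 65, 78, 88, 95, 70]
  monthly_sales = [30, 42, 45, 28, 35, 44, 50, 33]  # in thousands of dollars

  def pearson_r(x, y):
      # Standard Pearson correlation coefficient
      n = len(x)
      mean_x, mean_y = sum(x) / n, sum(y) / n
      cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
      sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
      sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
      return cov / (sd_x * sd_y)

  print(round(pearson_r(test_scores, monthly_sales), 2))  # close to 1.0 here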
Statistical Significance
Statistical significance is a measure of how likely it is that the results that
you’re seeing in your sample data are not due to pure error or random
chance. Basically, how unusual is it to observe these results if this was
purely by random chance? Or conversely, how do you know that your
variables actually do relate to each other? Without statistical significance,
we cannot say that our results have any meaning.
There are online calculators that can help you determine whether you
have a statistically significant sample size. If you have a statistician or data
scientist on your team, they can help as well. Rather than go into all that
math here, let’s look at some general guidelines for helping ensure that your
results are more statistically significant:
• The minimum sample size to get any kind of meaningful result is
100. That means that if your population is less than 100, just include
all the data.
• A reasonable maximum sample size is 10 percent of the population,
or 1,000, whichever is less.
• If the effect you’re seeing in your data is small, you need a larger
sample size.
• If the decisions that will be made based on the data have significant
consequences, use a larger sample size.
• If you are subdividing your data (for example, by demographics),
you will need a larger sample size.
• If you think your data will be highly variable, you will need a larger
sample size. If you think most people will have similar results, you
can use a smaller sample.
• If you only need a rough estimate of the results, you can get by with
a smaller sample size.
• If you are measuring 100 percent of your population, you do not
need to worry about statistical significance because you are not
sampling; you are describing the population itself rather than making
an inference from a sample.
That said, even with statistical significance, our results may not have any
meaning. Take, for example, a perfectly statistically significant finding that
completing an exercise within a course leads to a better outcome on the job.
But that outcome on the job is of marginal impact. In this case, we have a
statistically significant finding, but we do not have a practically significant
one.
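For the curious, here is a rough sketch of the kind of math those online calculators do behind the scenes, using Cochran’s formula for a proportion with a finite population correction. The population size, confidence level, and margin of error shown here are hypothetical, and a statistician or data scientist can help you choose values that fit your situation.

  import math

  def sample_size(population, z=1.96, margin_of_error=0.05, p=0.5):
      # Cochran's formula for a proportion, then a finite population correction
      n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
      return math.ceil(n0 / (1 + (n0 - 1) / population))

  # Hypothetical: 2,500 learners, 95% confidence (z = 1.96), +/- 5% margin
  print(sample_size(2500))  # about 334 learners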

What Could Possibly Go Wrong?


Don’t be overwhelmed! Many of the analysis techniques in this chapter can
be done using out-of-the-box formulas from Microsoft Excel and Google
Sheets. Others are very simple for data and statistics professionals to
calculate. The intent here is to provide just enough information for you to
be a little bit dangerous … but also to nudge you in the direction of
collaborating with others as you set off on your analytics journey. You don’t
need to know it all, but you do need to be a good consumer of it all.
This wraps up part 1 on the foundations. So far, we’ve covered why you
should care about data and analytics as an instructional designer and what
key definitions, specifications, metrics, and statistical terms you should be
aware of. Next, in part 2, we’ll move on to actually designing for data.

Give It a Try
Here’s a Hypothetical
Think about a digital learning experience that you’ve either created or
engaged with. What kinds of quantitative, qualitative, and categorical data
could you collect? Which measures of central tendency might be relevant
here? Do you think the spread is large or small? Would you need to use
sampling to gather data or could you use the whole population? If sampling,
what would be a statistically significant sample size to work with?
Do It for Real
Select a learning experience with which you’re familiar that is currently
collecting data. What kinds of quantitative, qualitative, and categorical data
are being collected? Which measures of central tendency might be relevant
here? Is the spread large or small? Does the analysis of this experience rely
on sampling or are you able to use data from the whole population? If
sampling, what is a statistically significant sample size to work with?
Bonus Points
Connect with someone in your organization who analyzes data and discuss
these concepts with them and how they apply in their work. Consider
meeting up with someone in marketing, manufacturing, or research.
Building a Continuous Program Measurement
Instrument: The Learner Adoption Index (LAI)
By Tiffany Jarvis, Edward Jones

We don’t learn in a vacuum. Like many in our industry, we at Edward Jones


wanted to be able to measure the business impact of our learning
experiences. But because we don’t learn in a vacuum, and we don’t succeed
in a vacuum, it was difficult to say with certainty what was truly coming
from the learning experience and what impact came from other sources. On
one hand, you don’t want to claim the learning was successful if something
else is what’s really moving the needle; on the other hand, you don’t want
to assume the learning isn’t designed well if there’s another factor at play
that’s preventing adoption. When Josh Bersin and others talk about
becoming a high-impact learning organization, they describe shifting the
intelligence capability from programmatic to contextualized to continuous.
At first, we were just aiming for a measurement tool that could provide
context.
We started by studying our learners and getting to know what influenced
their likelihood to adopt new behaviors. We identified factors that you’ve
heard about in any major adult learning or change management theory—the
involvement and attention of leaders, the relation of the learning to client
experience, how readily the learner could implement the new practices in
their daily work, and so on.
We used statistical analysis to ensure the validity and reliability of the
instrument and started using it with all our major learning experiences. It
wasn’t long before we had gathered enough data to start comparing survey
responses to later business results to determine that the index was reliable
enough to approximate likeliness to adopt.
Here’s how it works: Learners receive a customized version of the
Learner Adoption Index (LAI) upon completion of a learning experience.
They answer a quick series of scaled questions about the experience and
their work situation to help us paint a picture of their adoption profile. Were
they able to practice and receive meaningful feedback to improve? Is their
leader invested in following up with them about what they learned? Do they
personally believe that what they’ve learned will lead to a better experience
for their clients? Will they be able to immediately implement on the job?
Their responses are converted into a composite score that tells us how likely
they are to adopt what they learned. It also pinpoints the factors present that
may be putting adoption at risk. This is incredibly important to our team,
because we’ve grown to understand that measuring impact is not enough.
The real magic in an instrument like this is the capability to create a
formative assessment, one that tells you not just what has been done, but what
we should do next. Prediction starts as course correction. Just as a chef
tastes their dishes as they cook, or a pilot monitors a GPS, we needed to
use the LAI to help us course correct in situations where the learner’s
responses indicated that adoption was at risk.
For each of our measured learning experiences, we partner with
stakeholders and business leaders to anticipate the most likely needed
solutions. We take those solutions and pre-program them into the tool, so
that when learners complete the survey, they immediately receive the
recommendations, programs, resources, and support that can mitigate those
adoption risk factors. This protects the investment in learning that we’ve
already made and yields a significantly improved outcome for learners. It
also creates trust in the survey itself; our learners know the LAI provides
them with meaningful data, so they are more fully engaged in its use.
The data helps the entire ecosystem realize the investment in learning by
increasing adoption rates and driving us to make better decisions. We can
look at the data from an organizational lens, divide it by any number of
properties (such as region, level, tenure), or conduct longitudinal studies
across programs or individuals. Here are some ways Edward Jones uses the
LAI:
• Our business partners use these dashboards to inform both current
and future state operations. (Which teams are most successful and
why? How can we ensure the success of this business decision in a
human-centered way?)
• Leaders can make real-time decisions to support change
management. (Do I need to be more intentional in my support for
this new behavior? When is the organization ready to move from one
phase of a change to another?)
• Performance support functions can reinforce the learning in a
calculated and scaled way. (How relevant does this new information
feel? Are associates getting the social collaboration and support they
need to stay engaged?)
• We in L&D can make better decisions about experience design and
facilitation. (Who facilitates this well, so we can share their best
practices with other facilitators? Are learners getting enough practice
in the flow of work to achieve mastery? Is this program ready to be
retired or redesigned?)
Getting clear on our shared goal of adoption drove us to use a common
tool to make decisions about learning and performance. And by putting the
learner, their experiences, and their context front and center, we’re able to
go beyond measuring our impact to actually increasing it.
Evaluating Program Effectiveness in the US Navy
By Kimberly Crayton, Operations Research Analyst, NAWCTSD
AIR GT535 and Rodney Myers, Branch Head, NAWCTSD AIR
GT535

The US Navy’s goal of virtual recruiting was to gain efficiencies in
recruitment of enlisted sailors while at the same time reducing the physical
recruiting footprint through fewer recruiting stations, decreased recruiter
physical presence in some areas, and less physical contact with future
sailors while in DEP (Delayed Entry Program), along with an inherent goal
to reduce DEP and recruit training attrition. The performance objectives
were future sailor motivation and mentoring, prerequisite training, learning
assessments, physical preparedness, reduced attrition, and maintaining
contact.
Analyzing the program data is important to support mentoring, reduce
attrition from DEP and RTC, and improve upon the design and
implementation of the mobile application in support of virtual recruiting.
Navy recruiters and onboarders, enterprise-level Navy Recruiting
Command staff, and mobile environment and model practitioners all use
data to assess the DEP and the effectiveness of recruiting and retention.
The data elements that proved to be most insightful to us included
mobile app daily downloads, frequency of single choice (the single training
activity a future sailor tried before they did not attempt any further training
within the mobile application), training activity usage (which helped to
prioritize updates), delayed entry program and Recruit Training Command
attrition and graduation numbers, and delayed entry program (DEP) Test
and Initial Fitness Assessment results.
As an example, it was important to measure the impact of the mobile
app. To that end, Recruit Training Command performance metrics for future
sailors who used the Virtual Recruit Tracker (VRT) were compared with those
for future sailors who did not have an opportunity to use it before shipping
to Recruit Training Command. The two comparison groups are represented in
Figure 5-9, comparing Prime-VRT users (PU) and Non-VRT users (NU).
The DEP Test has a maximum score of 5 and a qualifying score of 4 for
early advancement (coupled with other requirements). The figure represents
a comparison of the three tests for the PU versus the NU cohort. The PU
cohort’s average score was 3.66 versus 3.10 for the NU cohort (on average,
the NU cohort did not attain the minimum passing score of 3.2). For Test 1,
the PU cohort attained an average score a little higher than 4 versus 3.75
for the NU cohort. Similarly, the PU cohort outperformed the NU cohort on
the final Test 2, with 3.86 versus 3.66.

Figure 5-9. Recruit Training Command DEP Test Comparison

Engagement with and use of the tool are most important in any mobile learning
environment. We had to develop a metric to represent engagement and use
of the tool. Our stakeholders guided the choice of metrics from DEP and
RTC. What is critical to them is critical to us. From there we conducted the
analysis to show whether it supported our hypothesis that using the tool
would prepare future sailors for RTC and lead to less attrition.
PART 2
DESIGNING FOR DATA
Chapter 6

A Framework for Using Workplace Learning Data

Before we get too much further into the use of analytics in the L&D space,
let’s take some time to consider that analytics is just one of several uses of
workplace learning data. In this chapter I will suggest a framework that is
less about providing a strict set of rules and buckets, and more about
looking at the variety of things we can be doing with data to gather some
perspective. This is a starting place in an emerging and evolving field, and I
fully expect that as an industry we will improve upon it over time.

The Three General Categories of Data and Analytics


At the most general level, there are three categories of things we can do with data:
• Archive it: Collect it, store it, classify it, and manage it.
• Analyze it: This is typically done with sets of data from a number of
individuals (and is the subject of this book).
• Act on it: Use individual or aggregated data as part of a set of
workflow triggers or actions that flow from each data point.
Figure 6-1 illustrates these three categories.

Figure 6-1. The First Framework Layer


The Five Uses of Data
These three very general categories are then broken down into five uses of
data: store and secure, report and visualize, analyze, incorporate into the
learner experience (LX), and initiate workflows (Figure 6-2).

Figure 6-2. The Second Framework Layer

Storing and Securing


The first bucket is essential to all the rest: storing and securing the data. We
need to gather it, put it in a place, and keep data safe. When I talk about
keeping it safe, I am including the broad activities of backup and data
recovery, data privacy, and data security from outside and inside intruders.
This is also the space in which we grant and verify credentials, with varying
degrees of rigor. Without this first step of storing and securing data, none of
the rest are feasible or reliable.
Storing and securing data can be considered at multiple levels of
complexity. In simple learning ecosystems this may be entirely provided by
an LMS, which is the system of record that holds all the learning activity
that is monitored in the organization. In more complex learning ecosystems,
a variety of data sources, several LMSs, and several systems of work all
contribute data into a central repository, or sometimes several repositories,
within the organization. The Total Learning Architecture (TLA) effort is an
R&D project sponsored by the US government’s Advanced Distributed
Learning Initiative, collaborating with military, professional standards
organizations, the private sector, and academia to “define a uniform
approach for integrating current and emerging learning technologies into a
learning services ecosystem” (ADL 2022).
Also included in “storing and securing” is the storing of the credentials
earned by an individual—everything from internal certificates to micro-
credentials to assertions of competency and even post-academic
certifications. Often (in 2022) these credentials are siloed and not portable
across systems, although work is underway through the TLA, efforts by the
US Chamber of Commerce Foundation, various commercial tools, and
others to allow for the secure and verifiable recordkeeping of
these credentials.

Reporting and Visualizing


Next comes reporting and visualization. This bucket covers a whole body of
work explored in Measurement Demystified. Authors David Vance and
Peggy Parskey (2020) present a framework by which they explore the
processes, technologies, and people associated with a culture of
measurement. Reporting and visualizing are necessary first steps for most
analytics efforts.
Nearly every learning management system and learning experience
platform offers a reporting function, and some of them have the ability to
visualize data that is generated in those reports. Some organizations with
learning record stores (LRS) or learning analytics platforms (LAP) have
more advanced capabilities to report and visualize data across one or more
data sources using xAPI. Some organizations move their data out of the
learning domain and over to tools such as Tableau or Microsoft PowerBI for
more advanced reporting.
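As a simple illustration of pulling data for a report, here is a minimal Python sketch that retrieves completion statements from an LRS through the standard xAPI statements endpoint (using the third-party requests library) and tallies them by activity. The endpoint URL and credentials are hypothetical placeholders for whatever your own LRS provides.

  import requests

  LRS_STATEMENTS_URL = "https://lrs.example.com/xapi/statements"  # hypothetical
  AUTH = ("report_user", "report_password")                       # hypothetical

  response = requests.get(
      LRS_STATEMENTS_URL,
      params={"verb": "http://adlnet.gov/expapi/verbs/completed", "limit": 100},
      headers={"X-Experience-API-Version": "1.0.3"},
      auth=AUTH,
  )
  statements = response.json()["statements"]

  # Tally completions by activity ID as the raw material for a report
  completions = {}
  for statement in statements:
      activity_id = statement["object"]["id"]
      completions[activity_id] = completions.get(activity_id, 0) + 1

  print(completions)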

Performing Analytics
With data safely gathered and stored, and the basics of reporting and
visualization covered, we now have the ability to perform analytics on the
data. This is obviously the topic of this book, and there is much to come in
the pages that follow. Analytics are typically used to support decision
making within the organization, so they get a lot of attention.

Incorporating Into the Learning Experience


A fourth type of activity that we can undertake with data has little to do
with the organization’s reporting, visualization, or analytics. We can use
data to influence and extend the learning experience. This is the realm of
both personalization and adaptive learning. We can also provide data as
feedback to the learners about their progress and offer insights,
comparisons, and recommendations to improve their learning.
There are many examples in this bucket. In-classroom polling tools that
display collective data on screen as participants respond are an example of
using data to extend the learning experience. Personalized learning gives
the individual some agency in the learning process by allowing them to
tailor the experience, either by explicit choice or by nature of an adaptive
experience that adjusts to their performance. Perhaps my favorite example
of using data to improve the individual experience on a personal level is my
smartphone’s fitness tracker app, which shows my weight, daily calorie
consumption, exercise mileage, hours of sleep, and other data on a single
screen to influence my behavior in real time. (Yes, I am one of those people
who will walk an extra 0.02 miles to achieve a whole-number target! And I
suspect I am not alone.)
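To make the adaptive idea concrete, here is a minimal sketch of a branching rule that routes a learner to their next activity based on quiz performance. The module names and score thresholds are hypothetical; real adaptive engines are far more sophisticated, but the underlying logic is often this simple.

  def next_module(quiz_score):
      # Route the learner based on performance (score from 0.0 to 1.0)
      if quiz_score < 0.6:
          return "remediation_module"   # re-teach the fundamentals
      if quiz_score < 0.85:
          return "practice_scenarios"   # more applied practice
      return "advanced_challenge"       # stretch content for strong performers

  print(next_module(0.72))  # -> practice_scenarios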

Integrating Workflows
A fifth use for data in this framework is as a marker, milestone, or trigger
within an automated workflow across systems. Using data in this way
doesn’t require visualization, analytics, or even sharing that information
with the individual. This includes and extends the concept known as
“learning in the flow of work” and incorporates what I might call “working
in the flow of learning.”
A simple example is rules-based audience mapping for learning content,
in which data about an individual’s role, activity in systems of work, and
prior learning are all used to assign required learning. It is likely that your
current LMS has the capability to do this. More complex workflows would
involve combining actual performance data and learning data to provide
access, limits, and learning opportunities that are exquisitely useful in the
moment. For example, if I am a salesperson who records an opportunity for
a product that I have never sold before, I could be offered a learning
experience about that product. (Presumably I do not know much about it.)
However, if I had already completed training about that product, and
performed well on it, my capacity constraints for that particular product
may be expanded.
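Here is a minimal sketch of what that sales-opportunity trigger might look like as a rule. The data structures, names, and thresholds are hypothetical stand-ins for whatever your CRM and learning systems actually expose.

  # Hypothetical record of one rep's completed product training and scores
  completed_training = {"product_a": 0.92, "product_b": 0.74}

  def on_opportunity_created(rep_id, product):
      score = completed_training.get(product)
      if score is None:
          return f"Assign '{product}' training to {rep_id}"        # never trained on it
      if score >= 0.85:
          return f"Expand {product} capacity limits for {rep_id}"  # trained and performed well
      return f"No workflow action for {rep_id}"

  print(on_opportunity_created("rep_117", "product_c"))  # -> assign training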
The edges of these categories are left intentionally blurry because each
use of data may blend into the next, and the systems used for them often
accomplish several adjacent uses of data (Figure 6-3). While storing and
securing is a prerequisite to the rest, and reporting and visualizing are
precursors to analyzing data, there is no implied prerequisite, maturity, or
value progression in the other three uses of data. They are simply different
from one another.
Real-world applications in L&D will often include several of these uses
of data, not just one. For example, we at TorranceLearning have designed a
microlearning-based program for one of our manufacturing clients that
includes a significant amount of post-learning action to ingrain the new
skills into the daily workflow. The post-learning interactions use a Mobile
Coach chat bot to continually engage with learners over a 12-week period.
In this case, we first store and secure the data. Data from course interactions
and surveys are analyzed to assess the learner experience and program
engagement. The client can use the chat bot to answer questions about
program usage, engagement over time, and other specific topics. Data that
the learner provides during the chat bot interaction is used to further
respond to the learner in a relevant and personalized way.

Figure 6-3. The Third Framework Layer


Five Foundational Elements for Using Data
To enable these activities, an organization needs a foundation that includes
strategy, people’s skill sets, systems and data supply, statistics and data
science, and relationships among people (Figure 6-4).

Figure 6-4. The Framework for Using Learning Data

Successfully scaling a learning analytics program and culture in your
organization starts with a strategy for growth over time, one that is aligned
with business needs and built around tightly aligned key objectives and metrics. In
chapter 14 we discuss this overall strategy. Systems and data supply include
the sources of data and the places where it is stored and used within the
organization’s technology ecosystem. This will be covered in chapters 9,
10, and 11. The L&D team’s skill set will need to expand to include a
capability for using data in their work to design effective learning
experiences. A specific capacity for statistics and data science will either
need to be hired, partnered for, or built within the team. Some organizations
will embed this capability within the L&D team, and others will seat it in
the business intelligence and data function.
Bringing all these pieces together in support of a learning analytics
journey may require L&D professionals to forge new relationships within
their organization. In years past, I frequently heard from clients and
colleagues a consistent refrain of, “If we involve IT in this, we’ll never get
it done!” Those days are over. Now, if you don’t involve IT (and others) in
this, you’ll struggle to ever get it done!

What Could Possibly Go Wrong?


In the chapters that follow, we’ll cover all these topics, and share case
studies from organizations that are at a variety of phases in their learning
analytics journey. But let me leave you first with two points of caution:
• Don’t get hung up on analytics! Yes, this book is all about analytics,
but don’t forget that your data can also be used for influencing the
learning experience through personalization, recommendation, and
adaptation. Data can also be used as part of an overall workflow
where learning data and business data create a handoff from learning
to working and back.
• Don’t go it alone! Building relationships throughout the business and
among your colleagues in behind-the-scenes roles like IT, legal, data
security, and business intelligence will offer you opportunities to
learn from what they’re doing and build coalitions of support for
your work.

Give It a Try
Here’s a Hypothetical
Explore a learning program for which you may or may not be gathering any
data or have access to it yourself. Answer the following questions about that
program to practice using the framework offered in this chapter:
• How could you store and secure the data that could be gathered by
this program? (Assume that it does gather data for the sake of this
exercise.) Where could data be stored? Who will need to access it to
verify credentials?
• What reports could you create with the data? What visualizations
might be useful? Could a dashboard be used?
• What kinds of questions could you answer if you analyzed the data?
• In what ways can the data be used to improve the learning
experience? For personalization? For progress tracking? For
recommendation? How about adaptive learning?
• How could this data be incorporated into a meaningful workflow? Or
conversely, how could performance data be used to influence or
trigger learning?
Do It for Real
Explore a learning program for which you or your organization are already
gathering and using data. Answer the following questions about that
program using the framework offered in this chapter.
• How is the data gathered by this program being stored and secured?
Who has to access it to verify credentials? (I recommend that if you
don’t know the answer to these questions, this is a great opportunity
to connect with someone in your organization to find out.)
• What reports use this data? What visualizations are offered? Is this
data on a dashboard? If yes, what else is on that dashboard?
• What kinds of analytics are being performed on this data?
• How is the data being used to direct the learning experience? (Or,
if it is not, how could it be?) For personalization? For progress
tracking? For recommendation? For adaptive learning?
• How is this data used in the flow of work? (Or, if it is not, how could
it be?)
Bonus Points
What would you add to this framework?
The Total Learning Architecture (TLA) Strategy
By Brent Smith, RD&E Principal (SETA), ADL Initiative

The Total Learning Architecture (TLA) defines a set of policies,
specifications, business rules, and standards for enabling an enterprise-level
learning ecosystem. The TLA is not a tangible tool you can work with;
rather, it is a data strategy that governs how different tools are connected
across the human capital supply chain. It is designed to benefit from
modern computing technologies, such as cloud-based deployments,
microservices, and high quality-of-service messaging services. TLA
standards help organize the data required to support lifelong learning and
enable organization-wide interoperability across Department of Defense
(DoD) learning tools, products, and data. Business rules and governance
strategies enable the management of this data across connected systems.
The TLA relies on common data standards and exposed data interfaces to
enable a wide range of functions. This abstracts away any dependencies on
a single component and enables these functions to be performed by any
connected component.
An important part of the US National Defense Strategy is the
modernization of the DoD’s education and training systems through the
development of new technologies, advances in learning science, and the
implementation of new policies and business practices. To lay the strategic
foundation for the DoD’s interoperable future learning ecosystem, the
department formalized the Enterprise Digital Learning Modernization
(EDLM) initiative. EDLM traces back to an overarching federal reform that
was initiated by a mandate directing executive branch agencies to identify
ways to improve their efficiency, effectiveness, and accountability.
Responding on behalf of the DoD, the secretary of defense identified nine
areas for reform, including IT and business systems.
One of the biggest challenges to making data usable across DoD services
was a lack of interoperability, necessitating the development of common
data models and standards. This was a critical requirement on the path to
creating and implementing a data fabric where common data elements
could be captured across different, highly nuanced domains and
interpreted for use by different stakeholders and purposes. The ADL
initiative is leading the charge to address this through the creation of an
IEEE (Institute of Electrical and Electronics Engineers) DoD Learner
Record standard, which will ultimately inform the individual DoD Service
Learner Record models and support interoperable and transparent data
flows.
The TLA creates a common data foundation that can be used to optimize
training and education for individuals and teams within an organization.
The increased granularity of the data allows organizations to better support
the career field management, workforce planning, upskilling, and cross
training of existing staff. The underlying data also describes the context of
how the learning takes place (such as on the job, in formal courses, and
through informal browser searches) and enables artificial intelligence
and machine learning solutions to improve the efficiency, scope, and scale
of an organization.
What’s cool is that the TLA can be used right now. The TLA is based on
IEEE standards, which are in varying stages of development. Although
many are currently in draft format, they are mature enough to use across
organizations (such as xAPI and sharable competency definitions). Others
are earlier in their development, but still provide value when designing or
developing solutions. The ADL initiative has also created a Capability
Maturity Model (CMM), which allows organizations to evaluate their own
learning stack to determine where they are in relation to the TLA (ADL
2020). The CMM is aligned to the adoption of different TLA standards, so
as organizations start to understand their level of maturity, they can also
determine what areas they want to improve upon based on their own
organizational goals.
Using Data Across Blended Scenarios
By Wendy M. Morgan, Learning & Development Senior
Strategist, Frank Porter Graham Child Development Institute,
University of North Carolina, Chapel Hill

In my work at the Frank Porter Graham Child Development Institute,
learning data is essential. Without learner activity data, you are limited in
not only your evaluation of the effectiveness of your instructional design
strategy—you are also limited in terms of your ability to provide effective
instruction in the first place. Instructional strategy is essentially a theory of
change, and data is necessary to test your theory (Morgan 2020). Any
change measured post-training can only be attributed to the experiences
designed if there is data to show the connection.
Beyond that, learning data is also necessary while providing instruction.
At the heart of the most effective e-learning technologies lies learner
agency, and more specifically, interactivity. In many instances, learners are
asked to apply their knowledge within realistic scenarios, including
interactive video, and they in turn receive feedback relevant to the choices
they have made. Some of our e-learning unfolds as a function of the way
the learners behave. In other words, our e-learning is a version of simulated
reality that is personalized to match learner needs.
We also have a lot of projects that provide enhanced blended learning
using learning activity data from the asynchronous component of our
blended learning program to strategically tailor the synchronous
instructional component to each individual’s unique needs. This is made
possible by accounting for the learner’s performance within the e-learning
content completed prior to on-site engagement. In other words, synchronous
support is strategically designed to use the data from e-learning. Instructors
can see each learner’s existing strengths and weaknesses before delivering
on-site support; they therefore can modularize and personalize their
approaches to review, extend, enhance, or skip areas depending on existing
learner knowledge.
Most of the data we capture at the FPG Child Development Institute is
related to learner behaviors indicating skill or ability levels, but the
behaviors that are relevant or necessary to capture vary, as do the analyses
and visualizations of data for interpretation and use. Some popular ways of
using learner activity data include:
• Providing micro-credentialing or continuing education unit credits
• Providing tailored learning experiences (as with scenario-based
virtual simulations of skills applications)
• Tailoring follow-up synchronous support to learner needs apparent
from asynchronous pre-work (this ranges from directing a general
focus of support to providing pre-populated agendas and
presentations according to individual learner needs)
• Providing personalized skill and behavior reports and follow-up
recommendations directly to learners
• Designing data visualizations such as green/yellow/red (typically for
project staff, these visualizations help direct next steps in tailored
support according to learner needs)
The bottom line is that we all want training that is effective and efficient.
We need a strategy and data to support that goal. For so long, learner
activity data has been limited to training completion and multiple-choice
knowledge checks. This is in part because the most common learner activity
data protocols (like SCORM) are limited. Going deeper, though, what is often
at play is a lack of understanding that the relationship between design and
data is what defines good strategy. Through our use of learning data, we’re hoping to
address this gap at FPG Child Development Institute.
Chapter 7

Make a Plan for Gathering and Using Data

In this chapter, we’ll start the planning process by identifying the challenges
we are trying to solve and the kinds of data that will help us do it. We’ll
offer more ways to start thinking about what data to collect than are
reasonable or feasible for any one program. Your challenge is to find what
resonates with your organization and start there.
All good chapters on planning start with some recognition that if you
don’t have a plan, you won’t get where you’re going. Of course, few things
go exactly according to plan—but if you’re exploring with data, that’s
probably a sign that you’re doing at least some things right! So, with both a
need for a plan and a recognition that it will evolve along the way, let’s get
started.

Align With Business Goals and Metrics


It can be said of most projects in L&D that the goal is not to create training.
Unless your organization is actually in the business of creating training (for
example, you’re a training company or an off-the-shelf course provider), the
purpose of our work is to improve some organizational outcome: sales,
production, procurement, distribution, recruitment, risk management,
administration, and so on. Providing learning experiences and delivering
training are the means by which the L&D team contributes to those ends.
Similarly, the employees of an organization are striving toward their
individual purpose in doing their sales, production, procurement,
distribution, recruitment, risk management, administration, and other jobs
well. Completing the training we create for them is one of the ways that
they can improve their ability to perform that job.
Or, as Mark Britz, author of Social by Design, says, “People don’t go to
work to make friends, to be entertained, or even to learn. People go to work
to work.”
It’s for this reason that I like to start my planning with the organization’s
goals, and not the learning team’s goals. Ideally, when learning experiences
are designed to meet specific organizational goals, the metrics by which we
evaluate the two can be aligned quite closely.
Now is a good time to gather your stakeholders together or interview
them one at a time to learn more about their strategies, goals, and metrics.
At project scale, you’re looking to find the specific goal for the training
project you’re about to undertake. At enterprise scale, you’re looking at the
set of strategies and goals they are responsible for. Often you will find a set
of primary goals, some subordinate goals, and some criteria for success (the
ways in which we achieve the goals must be legal, for example). There are
many ways of defining and stacking goals, and the means by which the
organization does this is less important than that everyone knows what the
goals are and how they are measured.
Table 7-1 presents some examples of strategies, goals, and the metrics
that are used to measure progress against them. These metrics are
considered lagging metrics because they measure results after some
performance has been done. As you read the tables that follow, consider
what other metrics could be used to measure progress against these
strategies and goals. (Note that Table 7-1 contains a hypothetical list from
five different types of organizations to show a variety of examples.)
Table 7-2 shows another format for how you can look at goals, the
activities performed to meet those goals, and the leading metrics that
indicate whether the right activities are being undertaken to meet the goals.
It offers just one activity per goal, even though there could be a myriad of
different activities for each one. As you review this list, consider ways in
which these activities could be a part of a larger set of activities to reach
each goal, and the ways in which meeting these activities might not actually
lead to the goal’s attainment.

Table 7-1. Strategies, Goals, and Lagging Metrics


Strategies | Goals | Metrics
1. Generate balanced organic sales growth year over year | Increase per-rep sales by 10% | Sales per rep; average sale volume per rep
2. Improve customer satisfaction in our products | Improve product performance | Customer rating of product performance
3. Improve production efficiencies | Reduce defect rate by 1% | Defects per 1,000 units
4. Launch services in new geographic regions | Open six new locations | Number of locations opened
5. Reduce billing errors to recapture lost revenue | Reduce lost revenue by 14% | Revenue write-offs

Table 7-2. Goals, Activities, and Leading Metrics


Goals | Activities | Metrics
1. Increase per-rep sales by 10% | Train sales reps on cross-selling techniques | Number of sales reps trained
2. Improve product performance | Replace materials used to make the product with ones that are more reliable | Strength and durability of the product with new materials
3. Reduce defect rate by 1% | Work with suppliers to improve the quality of parts used | Parts defect rate
4. Open six new locations | Hire and train new staff ahead of opening | Number of weeks to competency of the new staff
5. Reduce lost revenue by 14% | Audit last year’s billing to find top sources of errors | Quantity per error

Generally, your business sponsors will have a set of leading and lagging
metrics that they rely on. If needed, a large list of metrics can be narrowed
and prioritized by asking a few questions:
• Which of these are the most salient to you?
• Which of these are easiest and most timely to obtain?
• Which have the clearest and most agreed-upon definitions?
• Which is most likely to continue being used in the near future?
• Which metrics are you tracking at an individual level? Which do you
share with employees?
• Which of these are you willing to share with the L&D team?
• Where would you like to start?
In addition to strategies and goals, your business sponsors may have
some questions that you can help them answer with learning data. This is a
good time to collect them, too. It might look like Table 7-3.

Table 7-3. Questions, Answer Sources, and Metrics for the Business
Questions | Answer Sources | Metrics
1. Does training help improve sales results? | Learning data, sales data | Courses completed by rep, sales by rep
2. Is our engineering team up to date with their materials skills? | Competency assessment, expert analysis | Assessment or certification testing
3. Do our suppliers offer adequate training to their employees? | Supplier learning data | Courses offered, course access, completion dates
4. How can we speed up the time to competency when opening a new location? | New location learning data, benchmark data from similar organizations | Courses completed, job performance, employee retention
5. What support do our billing specialists need to address complex situations in the moment? | Common errors, performance support access rates | Usage rate of performance support tool

These business questions help to align our efforts with organizational
outcomes and communicate to our business sponsors that we’re focused on
their needs. This connection to the business provides useful context to our
other more learning-related metrics.

Brainstorm Metrics From Instructional Design Frameworks
Let’s turn now to the learning data side of things. We can look to a variety
of frameworks in the training and learning space when we’re looking for
metrics. Often the answer to “what should I measure?” has been right in
front of us for all these years, but we didn’t have the tools or the ability to
gather the data.
In the paragraphs that follow, I will do none of these instructional
methods justice. If something sparks your interest, I recommend you
explore more with the authors (see the resources at the end of the book).
Our focus here is on the inspiration for data gathering that we can draw
from each.

Cathy Moore’s Action Mapping


The Action Mapping approach helps focus learning interactions on the most
salient aspects of the most important goals an individual has to meet, and
matches learning practice to the on-the-job behaviors that one needs to
perform to meet a goal. The result is ensuring the knowledge or content
provided in the course is that which is needed to complete the practice
activities.
A measurement plan derived from a learning experience designed using
Action Mapping could look like Table 7-4.

Table 7-4. Measurement Plan With Action Mapping


70-20-10
The 70-20-10 framework is an instructional approach that acknowledges
that a significant part of our learning doesn’t happen in formal instruction
events (whether classroom or e-learning), but rather in learning from others
(20 percent) and learning through experience (70 percent). Our
measurement efforts have been historically focused on the 10 percent of
learning said to be achieved during formal instruction, in large part because
that’s controllable and convenient to measure. However, we can take some
inspiration in our data gathering from the various sources of informal
learning. A learning data plan inspired by the 70-20-10 model might look
like Table 7-5.

Table 7-5. Measurement Plan With 70-20-10


Learning From | Metrics
Formal Learning | Completion; attendance; test scores; observation rubrics
Others | Discussion group comments; social group followers; mentoring activity; peer observation scores; manager ratings
Experience | Usage of performance support tools; knowledge base access hits; help desk calls; individual performance metrics; scrap percentages; write-offs or cost of errors

Brainstorm Metrics From Learning Measurement Frameworks
Another source of inspiration for learning metrics comes from a rich variety
of learning measurement frameworks. These are measures of the
effectiveness of the strategies we took to improve upon business outcomes.
In the paragraphs that follow, I explore two common frameworks. If
something sparks your interest—or if your organization is already using a
different framework—I recommend you explore more with their authors
and sources.

Kirkpatrick Levels of Evaluation


The Kirkpatrick Model is perhaps the most well-known evaluation model in
corporate learning and development. It consists of four levels to examine
different aspects of a program’s impact.
Level 1: Reaction
The first level of evaluation involves measuring the reaction of participants,
described by Kirkpatrick Partners (2022) as “the degree to which
participants find the training favorable, engaging and relevant to their jobs.”
This level measures participant favorability to the training as a whole, the
degree of engagement participants experience during the program, and the
degree of perceived relevance of the learning to their jobs.
Level 2: Learning
The second level of evaluation is described by Kirkpatrick Partners as “the
degree to which participants acquire the intended knowledge, skills,
attitude, confidence and commitment based on their participation in the
training.” The five components of this level are often measured via tests and
surveys:
• Knowledge: “I know it.”
• Skill: “I can do it right now.”
• Attitude: “I believe this will be worthwhile to do on the job.”
• Confidence: “I think I can do it on the job.”
• Commitment: “I intend to do it on the job.”
Level 3: Behavior
The third level of evaluation involves measuring participants’ behavior and
application of the training on the job. This level includes the measurement
of required drivers—“the processes and systems that reinforce, encourage
and reward performance of critical behaviors on the job”—not just the
measurement of the behavior itself. It can be measured directly in systems
of work or by surveying people or their managers.
Level 4: Results
The fourth level of evaluation involves determining the results of the
training on the job as measured by business outcomes. This level includes
the measurement of leading indicators—in essence, the measurement of the
impact that the newly trained behaviors have on the business, not just
measuring if participants are engaging in the trained behaviors on the job.
Learning-Transfer Evaluation Model (LTEM)
Will Thalheimer’s LTEM includes and expands upon the Kirkpatrick
model. The LTEM offers a great deal more opportunity for gathering and
analyzing data to help you understand if and how your learning programs
were successful.
Level 1: Attendance
While attendance is not a measure of learning and effectiveness, it is still
useful at a basic level—and without attendance, the rest doesn’t follow!
Questions include:
• Who’s enrolling?
• When?
• What referred them here?
• Are they qualified to participate?
• What else are they enrolled in (what’s the market basket)?
• Do they sign up for post-learning coaching and reinforcement?
• What devices are they using?
• Where are they learning?
Level 2: Activity
At the activity level, you can measure three dimensions: attention, interest,
and participation. It is still not sufficient for measuring learning, but it is
useful and necessary. (People can be objectively active in a program, yet
still fail to learn, learn poorly, or learn the wrong things.)
Questions include:
• Do they participate?
• Do they click on all the things?
• How do they answer questions?
• Do they turn in assignments?
• Do they engage in the discussion?
• Do they continue to engage in post-learning reinforcement
and practice?
Level 3: Learner Perceptions
This level involves asking for feedback, either live in the classroom or via a
survey (informal or formal). Just asking about satisfaction is ineffective in
proving learning, but it’s still important. Asking the right questions can
help, such as whether the learning supported comprehension, remembering,
and motivation to apply what was learned, and whether support is needed
for after-training follow-through.
Questions include:
• Did the practice activities help you understand and apply?
• Are you motivated to implement this?
• Was the instructor or medium effective?
• What supports do you need to implement this?
Bonus: Ask the managers of people in the learning program a similar set
of questions.
Level 4: Knowledge
The knowledge level is an area that is well-supported with SCORM 2004
and traditional testing approaches. Many teams look at testing within a
course, but there are multiple dimensions that can be useful here:
• Testing learners right after (post-tests) is essentially knowledge
recitation. Recitation does not guarantee the ability to retrieve the
information later. Learners are just recalling facts and terminology.
• Testing after substantial delay (follow-up tests) indicates knowledge
retention. Retention gives you a more authentic grasp of what
worked, but it still focuses on facts and terminology so it can’t be
used on its own. We should focus on concepts, principles, and
wisdom that relate most directly to the learners and their future use
of the info.
While we still need more data, both short- and long-term remembering
measures are useful in their own way. Testing for short-term remembering
should involve measuring within the same day in which learning took place.
Long-term remembering can be operationalized to involve delays of three
days or more after the learning program—to be practical—although delays
of a week or more might be more realistic.
Ways to measure at this level include:
• Knowledge, skills, application assessment directly
• Simple scenarios or with support (open book)
• Complex scenarios or without support
Bonus: Use confidence-level testing, which involves asking learners to
rate their confidence that they answered a question correctly.
Level 5: Decision-Making Competence
This level expands on the Kirkpatrick model’s Level 2, learning. We want
learners to take the knowledge they receive from training and use it to make
better decisions.
Training that is aimed only at creating awareness—without any
expectation that behavior should change—doesn’t really apply here (and
depending on your perspective, may not be “training” so much as
“information” or “influence”).
Decision-making competence and task competence (the next level) may
be combined here. Decision-making competence involves learners choosing
something, but not taking action. Task competence involves taking action
after a decision has already been made. Assessing both moves us to level 6.
Level 6: Task Competence
Giving learners evaluations both shortly after the training and several days
or weeks later is helpful, but the later evaluation proves more effective and
fruitful. This does not yet represent actual transfer, though.
Task competence is often measured in business metrics, not training
metrics, although the practice work can be completed in a safe, sandbox, or
role-play environment.
Other ways to measure include:
• Observation checklists
• Manager evaluations
• Secret shopping
Level 7: Transfer
People have to use what they’ve learned in a training experience on the job.
There are several ways to observe this:
• Assisted transfer connotes situations where a person transfers their
learning to the job, but does so with significant assistance, support,
or prompting. (In some settings this is OK and necessary.)
• Full transfer occurs when a person takes what they’ve learned and
successfully puts it into practice in their work—without the need for
significant help or prodding.
At this level, we may no longer be asking questions! We could be
measuring on-the-job behaviors, the ones that stack up and lead to the
desired outcomes (known as leading metrics). We can ask the learners 30,
60, or 90 days out what they’re doing on the job if we can’t get the actual
on-the-job performance data itself. This runs all the risks you’d imagine
with self-reported data, but it may be the best you can get.
Bonus: You can use a similar version of this survey to ask managers 30,
60, or 90 days out what people are doing on the job, and what additional
supports need to be provided to get them to full performance. This may or
may not be a more objective measure than asking learners themselves.
Level 8: Effects of Transfer
Learning transfer can affect learners; co-workers, family, and friends; the
organization; the community; society; and the environment. This includes
all the outcomes and results that you’d consider with Kirkpatrick Level 4
and the Phillips ROI Methodology Level 5: the outcomes that the
organization wanted to see when the learning program was commissioned.
It goes way beyond this, however, to the effects of the learning on the
learners’ perceptions, job satisfaction, career growth and mobility, retention,
and more.
Considerations include:
• How are work teams affected if one person attends? If everyone
attends? If only the leader attends?
• How are managers affected by their direct reports improving their
skills and abilities? How is the organization influenced?

Brainstorm Metrics Related to Learning Organization Outcomes
Our business sponsors have goals they want achieved. Our learners have
activities to be measured. The instructional design and learning
measurement frameworks in our industry also offer ideas for what to
measure. And, as a learning professional, you have questions too! You may
be trying new learning software, experimenting with new instructional
strategies, or curious about new learning modalities. You may be hoping to
reduce training costs or improve speed-to-proficiency. You may wonder
whether all the effort you put into making a particular learning experience
actually makes a difference, or you may be deciding which course titles need updating.
So, while you’re at it, make a list of your questions, where the answers may
lie, and the metrics that will provide the insights you’re looking for (Table
7-6).

Table 7-6. Questions, Answer Sources, and Metrics for the Learning
Organization
Questions | Where Are the Answers? | Metrics
1. How quickly will employees adopt the new LXP? | User access rates; content access in the LXP | Percent of user log-ins by time period; number of content objects accessed per user
2. Can we reduce the cost of delivering annual refresher training? | Vendor and internal costs for development; completion rates; time spent in learning | Cost to build the refresher training; cost (hours x pay rate) of learners to take training
3. Does teaching the use of the performance support during training reduce the number of help desk calls? | Help desk ticket system; performance support usage | Help desk tickets related to training topics; performance support access rates

Prioritize Your List of Goals, Questions, and Metrics


It’s quite possible that, after reviewing all of the possible metrics we’ve
discussed so far, you now have identified dozens and dozens of possible
metrics: far more metrics than you can feasibly gather, manage, and use.
You’ll need to prioritize. In fact, over time you may find that your analytics
work will start to indicate which types of metrics are most useful for you.
But until you have that meta-analysis to guide your way, you’ll need to use
your judgment to whittle down the list of possible metrics into a
manageable set to get you started. Here are a few perspectives on narrowing
the list.
To help prioritize, we can borrow a concept from the field of big data,
even if our data set isn’t actually that massive. The “Vs of data” (depending
on your source, there are four to 10 Vs) are words that start with V that
concern some aspect of data quality and utility. As you review your list of
data, evaluate them with this lens to help weed out the low-quality metrics:
• Volume: The quantity of data available to you. (Do you have enough
to be useful?)
• Velocity: The speed at which the data is being created, cleaned, and
made available. (Can you get the data fast enough to make
meaningful decisions?)
• Veracity: The reliability and relevance of the data. (Can the data and
its source be trusted?)
• Validity: The accuracy of the data for its intended use. (Is the
data correct?)
• Variety: Multiple sources of data that come together for a richer
picture. (Together, do the data sources paint a complete picture?)
• Vulnerability: The degree to which the data is protected from
breaches. (Can I protect privacy while using this data?)
• Volatility: The rate of decay of the data’s utility. (How soon will it
be too out of date to be useful?)
• Value: The extent to which the data is meaningful. (Is this data of
any value to the organization and the decisions you’re making?)
Note: You may need to collect and analyze some of your metrics to
determine whether they do indeed have significant “V” in some respects.
For example, you may need some statistical analysis to determine if a
particular metric actually has validity. When you’re just getting started, you
may decide that the need to analyze whether a particular metric actually
meets the criteria puts it outside your scope on that basis alone. Or inside. It
depends on what you’re trying to accomplish.
Often with my clients, after looking at the Vs of data, we then can
narrow our list of questions by prioritizing what is impactful and what is
easy—and when something is both impactful and easy, we start there. On a
recent client project, we brainstormed two ways:
• What questions about this learning environment do we have? And
who cares about them? We used a Venn diagram to identify whether
the business, the learner, or the learning team cared about each
question.
• What questions might our favorite learning measurement
frameworks suggest here? We looked at Kirkpatrick levels, the
Thalheimer LTEM levels, and an internal evaluation framework used
by the client’s IT team.
We summarized the two lists of questions, then identified the data source
for each one. We found that several questions could be answered with data
from a single, very accessible source. This was where we decided to start.

Take an Agile Approach to Your Work With Data


Data and analytics projects lend themselves very well to an Agile approach.
In this approach, iterative and incremental cycles of ideation, design,
development, and testing are used to quickly develop, test, and refine
deliverables.
When we take an Agile approach to software and learning experience
projects, we ask: What is the simplest thing that could possibly work?
And when we take an Agile approach to our work with data, we ask:
What is the simplest thing that could possibly tell us if we’re on the right
track?
What does this mean for prioritizing the list of possible metrics? An
Agile approach would suggest that we ask questions such as:
• What data is easy and inexpensive to gather?
• What can be a quick win?
• What builds capability for other metrics?
• What can tell us if we’re on the right or wrong track?
• What can be done at a small scale?
• Which organizational units are willing to move fast?
• Which sponsors and stakeholders are willing to forego perfection to
get a good enough product out to learners?
We’ll tackle an iterative approach to data and analytics more in chapter
12.

Use Your Business Acumen and Your Intuition


It might seem a bit odd, in a book about using data to drive decision
making, that I would suggest using your intuition, but a bit of common
sense goes a long way here. Working with your business sponsor, you may
be able to reflect on the list of goals, questions, and metrics and with
nothing more than your own insight and that of your sponsor, narrow the
list to a meaningful starting place. I strongly recommend that you do this in
collaboration with the business sponsor (or their designee): They represent
the business, you represent the learning side, and you meet in the middle.
Questions you can ask include:
• What is the sponsor’s intuition about what is most meaningful? What
about yours?
• What data is unethical to gather, store, and report on?
• What does everyone agree to?
• What will have a lot of impact if you can show it?
• What data can you reliably gather into the future?
• Which stakeholders control or have an interest in this data?
• Where can you gather qualitative data to add insight to your
quantitative data?
• What can be done quietly (and ethically) to provide a small-scale
proof of concept?

What Could Possibly Go Wrong?


So many questions! Doing all of this might generate a list of dozens and
dozens of questions for possible exploration—more than you can tackle at
once. I find this a useful exercise for three reasons:
1. It provides a mechanism for aligning with the organizational needs
and your project sponsor
2. It helps uncover the most meaningful and available sets of questions
and data for your initial exploration
3. It helps you and the rest of the L&D team begin thinking about all the
possibilities for analyzing data. When the right team is pulled
together, this is often an exciting exercise full of “Oh! What about
this question?” kinds of exclamations.
Here are two additional considerations:
• Don’t start with the learning metrics! Don’t worry, you will get there.
I am sure you will. Starting with the business metrics helps us align
with the organization’s needs, without which our learning metrics are
less meaningful. When we ask the business sponsor what they’re
trying to accomplish and how it is measured, we help the entire
learning and development team start from that mindset.
• Don’t think you’re done after finishing this piece! What you will
have at the end of this exercise is a list of questions that are often too
vague and require further definition before you can dive in and start
analyzing data. This gets you started, but you’ll need to hone your
questions more specifically to have a meaningful starting point.
That’s what we’ll explore in the next chapter—forming a specific and
testable hypothesis.

Give It a Try
Here’s a Hypothetical
Take an existing recent project that you’re familiar with. Challenge yourself
to complete the tables in all or most of the sections of this chapter. What is
your list of metrics? Then work through the hypothetical, whittling down
the questions. Which ones would you prioritize?
Bonus Points
Do this activity with a colleague who is also familiar with the program.
What is their list of metrics? What can you learn about the differences
between the two?
Do It for Real
Work with the business sponsor for a project you’re kicking off now. Work
through their goals and metrics. Work through your instructional strategy
and measurement approach. Work through your own questions. Now
whittle the list down.
Using a Business-Focused Learning Data and
Analytics Plan to Drive Results
By Derek Mitchell, former Head of Insight and Analytics, Sky

At Sky, we were launching a new product, and our use of learning and
performance data enabled us to meet our business goals for it.
Here’s what happened: All 10,000 contact centre staff at Sky needed to
be briefed and trained on a subscription product that was launching in a
regulated market. Training was to commence three weeks prior to the
launch with the training of all staff to take place over a 10-week period
(continuing post launch), meaning that most agents would not be trained in
time for go-live. The initially proposed approach was to take groups of agents
through the training program in their existing teams. Delivering training this
way risked a significant opportunity cost: many customers would reach a contact
centre agent who had not yet been trained. We needed to figure out how to
minimize this cost to the organization and the possible friction with customers.
To do this, the learning analytics team at Sky stepped in to question the
order in which agents were being trained and quickly identified that the
historical distribution of sales conversion rates was very broad across the
agent population. We built the argument that simply by changing the
scheduling of training, we could deliver greater revenue for the business
while using the same material, over the same 10 weeks, with the same
group of learners.
We were able to use sales data on existing products to model the sales
rates over the 10-week roll-out period based on both the random selection
of teams attending training (each team consisting of higher and lower
performers), and on front loading the training with the best performers
historically.
By front loading our best performing agents, we were able to route
potential customers of the new product to that group and thus benefit from a
higher conversion rate than otherwise would have been the case. We sold
more products faster because our best agents had additional weeks to sell
than if their training had been distributed across the 10 weeks.
This simple use of analytics generated an additional £300,000 over the 10-week
period, which otherwise would not have been realized.
Chapter 8

Form Your Hypothesis and Specify Data Needs

In the previous chapter we identified multiple frameworks by which you can generate dozens and dozens of questions and ideas of things to measure.
We also suggested some ways in which you can filter and prioritize that list
into a manageable set of indicators that you’re interested in analyzing. In
this chapter we will start looking at the ways in which we narrow down our
questions into an analyzable hypothesis (or set of hypotheses). This will
help you identify the specific data needs that are necessary to answer any
one of your questions.
This is generally going to follow the scientific method, which you may
recall from your middle school science class. There are several ways to
show this, but we’ll use a version that illustrates the method as a circular
cycle (Figure 8-1).
The scientific method is very helpful in distilling big questions based on
observations about the world into discrete, testable, and analyzable
questions that can then be used to draw conclusions and make decisions. In
fact, I’ll go so far as to add decision making into our process (Figure 8-2).

Figure 8-1. The Scientific Method


Figure 8-2. The Scientific Method Applied to Learning Analytics

Exploratory Analysis
While hypothesis testing is a rigorous approach to analytics, I should point
out that not all analysis needs to be this hard-edged and focused. In fact,
transforming the questions posed by the business and your L&D team into
discrete, testable hypotheses may require some initial exploratory analysis.
In exploratory data analysis, you will collect a lot of data and dig
through it looking for patterns or a lack of perceivable patterns. This is like
walking into a retail clothing store and not looking for anything in particular;
you’re just hoping that a sweater catches your eye and you’ll pick it up, try
it on, and see if it works for you. In this metaphor, the clothes are the data
sets and your casual search is the analysis.
Of course, it’s entirely likely that you had a general idea of the kind of
thing you were looking for when you went into the store (otherwise you
probably wouldn’t have gone into the store). This metaphor continues to
hold true for that type of exploratory data analysis. For example, when I go
into a clothing store, I am only looking for women’s clothing that would fit
me and generally for the season that I happen to be in. I know what clothes
I already have and therefore don’t need to buy, so I kind of have a sense for
what I’m looking for. At the same time, I’m open to new things I might
stumble upon as I shop. The exploratory data analysis is an opportunity to
help find additional questions and create different hypotheses.
One of our nonprofit clients was open to us pursuing an analytics
endeavor for their learning offerings, but didn’t really know what to ask.
They were open to whatever insights we might find and how they could
help them create a strategy for future course development, fundraise more
effectively with donors, or improve the overall experience for the learners.
While we didn’t have any concrete questions, we did have a good sense for
their data because we support their LMS and their course offerings. We had
enrollment dates, user group information, some very limited user data,
SCORM data from their e-learning programs, and very basic video
consumption data from a series of videos they’d released last year.
Knowing what data was available to us helped our team come up with an
initial list of questions that we thought would be useful.
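If it helps to make this concrete, here is a minimal sketch of a first exploratory pass in Python with pandas, assuming an LMS enrollment export saved as a CSV. The file and column names (enrollments.csv, user_id, course_id, enrolled_on, completed_on) are hypothetical stand-ins for whatever your own systems produce.

import pandas as pd

# A first exploratory pass over a (hypothetical) LMS enrollment export
df = pd.read_csv("enrollments.csv", parse_dates=["enrolled_on", "completed_on"])

# How much data do we have, and where are the gaps?
print(df.shape)
print(df.isna().mean())  # share of missing values per column

# Which courses draw the most enrollments?
print(df["course_id"].value_counts().head(10))

# Completion rate by course (completed_on is blank for incomplete enrollments)
completion_rate = (
    df.assign(completed=df["completed_on"].notna())
      .groupby("course_id")["completed"]
      .mean()
      .sort_values()
)
print(completion_rate)

# Enrollment volume by month: is interest growing, steady, or fading?
print(df["enrolled_on"].dt.to_period("M").value_counts().sort_index())

Even a casual pass like this often surfaces the patterns (or the gaps) that turn into your first real hypotheses.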

Hypothesis Testing
That gets us to the hypothesis-driven approach to analytics. A hypothesis is
an educated guess or a suggested solution for a particular challenge, where
we don’t know the actual answer. Here are some examples:
• When the weather is rainy or cold, attendance at indoor museums
increases.
• Adults who get at least eight hours of sleep are more alert than those
who get less than eight hours.
• Employees who complete their annual compliance training are less
likely to commit ethics violations.
• When e-learning is designed for mobile access instead of computer-
only, learners will complete the course on their mobile devices more
often than their laptops.
• Technicians who access performance support tools more frequently
make higher-quality repairs than those who do not use performance
support tools.
You may notice that each hypothesis is stated as though it is or could be
true. This is very important. Your hypothesis is simply a testable claim.
What’s more, while your analysis can provide support to confirm your
hypothesis, you cannot claim that your hypothesis is true 100 percent of the
time.
For this reason, we create both a hypothesis and a null hypothesis. The
null hypothesis states that there is no relationship between the variables in
your hypothesis. Table 8-1 offers an example of null hypotheses to the
hypotheses already stated.

Table 8-1. Hypotheses and Null Hypotheses


Hypothesis: When the weather is rainy or cold, attendance at indoor museums increases.
Null hypothesis: There is no significant difference in museum attendance on sunny days versus rainy, cold days.

Hypothesis: Adults who get at least eight hours of sleep are more alert than those who get less than eight hours.
Null hypothesis: There is no significant difference in alertness whether adults get more or less than eight hours of sleep.

Hypothesis: Employees who complete their annual compliance training are less likely to commit reportable ethics violations.
Null hypothesis: Employees who complete their annual compliance training are no more or less likely to commit reportable ethics violations.

Hypothesis: When e-learning is designed for mobile access instead of computer-only, learners will complete the course on their mobile devices more often than their laptops.
Null hypothesis: There is no significant difference in the devices used to complete training regardless of whether a course is designed for mobile access.

Hypothesis: Technicians who access performance support tools more frequently make higher-quality repairs than those who do not use performance support tools.
Null hypothesis: Technicians make the same rate of high-quality repairs whether or not they access performance support tools frequently.

And, while you cannot prove that your hypothesis is true 100 percent of
the time, you can (if your data supports it) be confident in rejecting the null
hypothesis.
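To make the mechanics of rejecting a null hypothesis concrete, here is a minimal sketch in Python using scipy, with made-up repair-quality scores standing in for real data from the performance support example above. It illustrates how the test works, not which statistical test is right for your data.

from scipy import stats

# Hypothetical repair-quality scores for technicians who do and do not
# use performance support tools frequently
frequent_users = [92, 88, 95, 90, 87, 93, 91, 89]
non_users = [84, 86, 82, 88, 85, 80, 83, 87]

# Two-sample t-test comparing the group means
result = stats.ttest_ind(frequent_users, non_users, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

# If p falls below the significance level you chose in advance (commonly
# 0.05), you can reject the null hypothesis that the two groups perform the
# same. If it does not, you fail to reject the null, which is not the same
# thing as proving the tool or the training had no effect.

Whether a t-test (or any particular test) is appropriate depends on your data; the point is that the analysis evaluates the null hypothesis rather than "proving" the hypothesis itself.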

Identifying Specific Data Needs


Once you have defined your hypothesis, you can identify the specific data
that you’ll need to gather for your analysis. This can be a surprisingly
challenging exercise, because you’ll need to be quite precise in order to
have solid enough data to rely on for your analysis.
The best way to approach this is with an example. A common question is
whether or not training improved performance in the learner population that
participated in it. So, our hypothesis could be stated this way:

Employees who took the training showed a measurable improvement in performance afterward.

Now we may want to improve upon this a bit, because maybe everybody
showed a measurable improvement in performance during the period in
which we were testing. In fact, in a “what gets measured gets done”
environment, simply focusing on something and measuring it often
improves performance. So, let’s hone this hypothesis:
Employees who took the training showed more improvement than employees
who did not take the training.

Of course we would like to believe that the training was the cause of the
performance improvement, but let’s leave that aside for right now and just
go for a relationship that we can show. While we cannot prove that training
caused the improvement, what we can do is identify and test a null
hypothesis. (You’ll find with experience that the null hypothesis is actually
the easier of the two to test and verify or disprove.) A null hypothesis, or
the evidence that there is no relationship, would be expressed like this:

Employees who took the training did not have any difference in improvement
relative to those who did not take the training.

So we now have our hypothesis and our null hypothesis and we can
identify the data that we will collect to test them. Let’s pull these two apart
a little bit more:

Hypothesis: Employees who took the training showed more improvement in quality scores than employees who did not take the training.

Null hypothesis: Employees who took the training showed no difference in improvement in quality scores relative to those who did not take the training.

Identifying the data needs to test your hypothesis can involve deconstructing the hypothesis almost word-by-word.
• “Employees”: In this case we are looking at employees who took the
training, and not any other sorts of people who took the training, so
we want to make sure that our data set includes employees only. That
is likely something that your organization is already able to track and
measure. But as you can imagine, there may be different types of
employees, whether they were contractors or new hires or employees
from a particular team. There might be instances in which you would
want to refine this definition in your data, or at least pay attention to
whose data is in your set.
• “Took the training”: Let’s assume that you can identify the training
as a discrete data element and pull the results from that particular
course. You’ll then want to define what “taking the training” actually
entails. Is it launching the course? Is it completing the course? Is it
completing and passing the quiz at the end of the course? What if
someone only made it 80 percent of the way through the course? All
these are things that you will want to define. If you are using a data
standard like SCORM, you have the ability to home in on the
completion record. There is an equivalent completion record in xAPI
as well, but the “completed” verb may be used for subordinate or
superordinate and related activities, so pull your data carefully. If the
training isn’t in a SCORM or xAPI course, you may need to find
other ways of defining “took the training.” You will also need to
define what it means to have not taken the training. What if someone
consumed 80 percent of the course in question? Is that close enough
to “not taking the training” for you? Or do you want to separate your
audience into people who have never come into contact with the
training at all?
• “Improvement”: You need to define what you mean by improvement.
You will want to select one metric at a time that is representative of
job performance and be able to measure that accurately. You will
want to define a time range for how long after training, and during
what period you are measuring, to define improvement. And
improvement in what? You’ll also want to have very specific metrics
for how quality performance and improvement are being measured.
All of this for a very simple hypothesis!
At every step of the way in our little example here, we made some
simple choices. Of course, at any step we could have made a different set of
choices, and it’s likely that as you read this, you were thinking of what
those choices could have been.
The choices that we make for the hypothesis we define and the data we
collect to evaluate it become incredibly important to the analysis that we
get. These often are invisible or unspoken choices, and your job is to make
them explicit, and ideally documented, so that when you are evaluating
your data you know more about what you are looking at.
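One way to make those choices explicit and documented is to write them down as filters in your analysis code. Here is a sketch in Python with pandas; the file names, column names, course ID, and date cutoffs are all hypothetical placeholders for the definitions you and your sponsor agree on.

import pandas as pd

people = pd.read_csv("people.csv")            # hypothetical: user_id, employee_type
completions = pd.read_csv("completions.csv")  # hypothetical: user_id, course_id, status
quality = pd.read_csv("quality_scores.csv")   # hypothetical: user_id, observed_on, quality_score

# "Employees": regular employees only; contractors and interns are excluded
employees = people[people["employee_type"] == "regular"].copy()

# "Took the training": completed and passed this specific course
passed = completions[
    (completions["course_id"] == "QUAL-101")
    & (completions["status"] == "passed")
]["user_id"].unique()
employees["took_training"] = employees["user_id"].isin(passed)

# "Improvement": change in average quality score from the window before the
# training to the window after it (the dates here are placeholders)
quality["observed_on"] = pd.to_datetime(quality["observed_on"])
before = quality[quality["observed_on"] < "2024-03-01"].groupby("user_id")["quality_score"].mean()
after = quality[quality["observed_on"] >= "2024-06-01"].groupby("user_id")["quality_score"].mean()
improvement = (after - before).rename("improvement").reset_index()

# Compare improvement for the two groups defined above
analysis = employees.merge(improvement, on="user_id", how="inner")
print(analysis.groupby("took_training")["improvement"].describe())

Every filter in a sketch like this is one of those "invisible" choices made visible, which makes the analysis much easier to defend and to repeat later.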
You may want to lay out your data needs similar to how we’ve shown
them in Table 8-2.

Table 8-2. How to Go From Core Question to Data Source


What Could Possibly Go Wrong?
My caution here is to not go into this exercise with a leading question. I
often hear that people are interested in learning analytics because they want
to “prove their value to the organization” or “show the effectiveness of
training.” The assumption in these statements is that they do offer value to
the organization and training is effective, and they would like data to back
up those assertions. However, the risk is that they then only seek out the
data that proves their point, ignoring questions and data that might suggest
otherwise. Instead, I would much rather see us ask open-ended questions
like “Is this training effective?” and “How does training affect
performance?”

Give It a Try
Here’s a Hypothetical
Consider one of the questions that you identified in the previous chapter
and form a discrete and testable hypothesis about it. Create the
corresponding null hypothesis. Now parse the two into specific data points
and definitions that you’ll need to be able to have testable data. What are
some of your observations? Would this be easy to implement in your
environment? What would need to change to gather this data (including
your hypothesis)?
Do It for Real
As you work through the analytics process with a current project, consider
the questions that you identified in the previous chapter and form a discrete
and testable hypothesis about one of them. Create the corresponding null
hypothesis. Now parse the two into specific data points and definitions that
you’ll need to be able to have testable data. What are some of your
observations? Would this be easy to implement in your environment? What
would need to change to gather this data (including your hypothesis)?
Bonus Points
Start with an analysis already performed on data in your organization
(learning data or not). Here we’ll work backward through it to help you be a
critical consumer of data and analytics. Based on the analysis (typically
shown as a chart or graph), what might the hypothesis and null hypothesis
have been? What sorts of definitional assumptions or decisions would have
to be made about each of them to arrive at meaningful data?
Identifying Data Needs Requires Deep
Stakeholder Engagement
By Janet Laane Effron, Data Scientist and Instructional
Technology Specialist

A company in a highly regulated industry wanted to get some metrics around the efficacy of some key compliance training. Performance after
training had support in the form of resources and reference guides, as well
as direct feedback and coaching in the context of performance. On-the-job
performance was critical to the organization, so there was great interest in
understanding what aspects of training were effective, what was not, and
the role of ongoing performance feedback and support in improving
compliance.
To identify the data we needed, we began with a two-day workshop with
the key stakeholders during which we defined the training goals and what
successful outcomes looked like. We then discussed the information
ecosystem to delineate the data available—not just regarding course activity
and results, but also on-the-job performance data, error logs, and access to
resources and real-time coaching, as well as the impact of resource use and
coaching on subsequent performance.
We evaluated and prioritized data sources based on how essential they
were in the analysis of impacts on performance; the quality, relevance, and
level of detail of the data in describing activities and performance; and how
accessible the data was in terms of data ownership, legal, regulatory, and
privacy constraints, and data integration and interoperability concerns. This
allowed the initial project to focus on data sources that were of highest
value, and those that were most easily accessible.
This was an exciting project because the customer was willing to put in
the time and effort to build a data content strategy that was well-grounded
in project goals, in the process defining data needs and thoroughly
understanding the practical constraints regarding data access and analysis.
Testing Program Rollouts With Control Groups
and Partnering With the Business
By Ulduz Berenjforoush Azar, People Analytics Operations,
Critical Equity Consulting

People analytics is a new field that captures analytics around people, programs, and processes. We look at data as a source of feedback to
understand if programs with an HR, recruiting, and learning focus are
working the way we want them to work. The data sources that I look at are
from ATS (applicant tracking systems), HRIS (the HR system), learning
systems, and surveys.
I recommend running pre- and post-surveys when launching a new
program. When we launch, we use a control group that doesn’t have the
new training in place, and then we have a test cohort that does have that
new learning and development program. Running the pre- and post-surveys
for these two cohorts helps us understand the impacts of that new program
or initiative.
When talking about the data, you must be very clear about data
definitions. It’s very easy to talk about, say, the word employee, but what
does an employee mean to you, the instructional designer and L&D
professional, versus an HR business partner? Does it mean a regular full-
time employee? Does it mean anyone who works for the company? Does it
include contractors and the contingent workforce? Be super clear about
what data is being included in an analysis. Are there certain data fields
we’re filtering out? Are there timeframes we’re looking at?
Throughout this process, the best relationships and the best results from
programs come in through that collaborative relationship—that is, when I
am able to partner with different stakeholders to really understand their
business questions and their whys and whats. That way I’m able to really
look at the different data points that I can bring in and the rich data that we
already have but maybe are not tapping into.
This is work done over time, starting with building the databases,
constructing the data sources, and making sure data is there and it’s clean.
And over that time, the trending data you collect is super helpful. While the
external benchmarks are great at providing general trends and general
movements, it’s always best to benchmark yourself against your own trend
data by asking, “How did we do quarter-over-quarter or year-over-year?”
That’s one of the data points that I always highlight and recommend to my
stakeholders to keep ourselves accountable for our own results over time.
Chapter 9

Identify Data Sources

We are awash in data. (Whether or not we have ready access to it all in one
place in a usable format is another thing entirely.) The opportunity to collect
and use this data is rich. It is up to us to leverage this opportunity, be
responsible with it, and use it to help drive positive change in our
organizations. In this chapter, let’s take a look at where this data comes
from along what we might refer to as a “learning data supply chain.”

Data From HR Systems


In many organizational learning ecosystems, data about individuals is fed in
from the human resources information systems (HRIS) or the payroll
system. This is a good place to start, as it is where many of the dataflows
we will be working with in this book also start.
These systems store, and often send to the learning ecosystem, data such
as the following for everyone in the organization:
• Dates of hire, leave, promotion, job change, layoff, termination
• Salary, salary band, dates of salary changes
• Managerial status (whether anyone reports to this person) and the
reporting hierarchy this implies
• Office location
• Home location
• Gender
• Race
• Ethnicity
• Citizenship
• Languages spoken
• Hours worked, and in some cases, on what tasks
• Job function
• Job title
• Organization unit
• Academic and other credentials
Note that the definition of “everyone in the organization” varies, and it’ll
be important to know who is included in “everyone” when you’re analyzing
data. Possible people who could be included are:
• Employees
• Employees based in a particular country (or excluding a particular
country), often because of inconsistent deployment of systems across
the organization
• Partners, executives, or owners
• Contractors
• Seasonal and temporary employees
• Interns
• People on leave of absence
• Employees of companies that subcontract with your organization
• Employees without email addresses
• Employees of franchises to your organization
• Employees who have left the organization
HR performance management systems additionally gather and store
information about employees (though often not about other types of people),
such as:
• Performance ratings
• Goals
• Determinations such as “high potential”
Sometimes the demographic data for your learning analytics is supplied
by a customer relationship management (CRM) or sales tool, particularly
(but not exclusively) when your learners are customers. In this case, you
may have a different set of demographic data on them, often less than you
probably have about your organization’s employees.
Since this book is primarily focused on workplace learning analytics, we
won’t go too far here, but suffice to say that when learners are students (K–
12, higher education, vocational) or when learners are customers who are
buying your educational product, the data you will have about them will be
somewhat different. In some cases, these differences and the ways in which
you manage and protect this data will be governed by additional privacy
laws (particularly for younger learners).
Regardless of what demographics and performance data you have about
the individuals in your ecosystem, you have at your disposal a fair amount
of data that can be used to differentiate individuals as well as to put them in
groups for comparison. As we’ll see later on in this book, you may be
interested in finding out what drives differences in learning and
performance across groups. To do that, you will need to know which groups
people are in. Sometimes these groups are based on location or function or
role, and sometimes they’re groups based on organizational hierarchy. Your
detailed transactional data from learning experiences is not likely to include
this demographic data. Therefore, you will need to join the demographic
data with your transactional data to make these comparisons.
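As a simple illustration, here is a minimal sketch in Python with pandas of joining an HRIS demographic extract onto LMS completion records so you can compare groups; the file and column names are hypothetical.

import pandas as pd

demographics = pd.read_csv("hris_export.csv")     # hypothetical: user_id, department, location, job_function
completions = pd.read_csv("lms_completions.csv")  # hypothetical: user_id, course_id, completed_on, score

# Attach each completion record to the learner's demographic attributes
joined = completions.merge(demographics, on="user_id", how="left")

# Compare completion counts and average scores across departments
print(joined.groupby("department")["score"].agg(["count", "mean"]))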

Data From Learning Experiences


We have data about our people. Now let’s look at what it is they’re doing
when they’re learning and the kinds of information available.

Self-Directed E-Learning
Thanks to SCORM, we’re already tracking some data from our e-learning
experiences. We know who has been assigned training, who has completed
training, and who is currently incomplete in their training. We know how
long they have spent with the course open (which is different from how
long it takes to complete the course!), and which screens they have viewed.
We know the score achieved on one of the tests offered in the course,
usually the most important one. And we may also know their answers to the
individual questions on that test.
If you are using xAPI, you can also track data on every screen, every
interaction, every quiz, every download, every video, and more. With xAPI,
you can see how many attempts it took for an individual to get a question or
an entire test correct.

Video
Video interactions offer another rich source of data. You can think of each
click on the video player as an opportunity to gather data: start, pause, scrub
back to watch something again, skip all the way to the end to get done as
fast as possible, as well as where participants stop watching (abandon) the
video. All of this can be tracked and marked by the timestamp within a
video launched by an LMS or video streaming player.

Assessment and Testing


Assessments and testing can be the source of a lot of data. This includes
overall test scores, individual question answers, the time taken to answer
each question, and in some cases the number of guess-clicks that are made
prior to submitting a final answer choice. The psychometrics and testing
industry has a long history of superior statistical analysis on this data.
Instructional designers and L&D professionals can learn much from our
colleagues in the testing field. However, if we focus exclusively on test
results, we will miss an opportunity to understand how and why people
perform on tests the way they do.

Observations
In some respects, observations can be thought of as very similar to
assessments. Of note, however, is the fact that observations are conducted
by someone else (an instructor, subject matter expert, auditor, manager, or
peer). What this means is that not only do we have data about the individual
being observed and their performance, we are also able to gather data about
the individual performing the observation, such as a facilitator observing
performance during a training program. This data can include everything
from their usage of the observation tool, to the tendencies of individual
observers to rate others higher or lower than their peers.
Live Classroom
In comparison to e-learning, video, and testing, we are often not gathering
nearly as much data about what happens in a live classroom. In corporate
learning we are often recording manually whether or not somebody has
attended and thus completed a class, generally by the instructor marking an
online roster afterward. In some settings, a hand-signed roster is gathered,
and this can be digitized and stored along with the completion record for
added (and auditable) proof. In-class engagement and response technologies
allow us to capture individuals’ answers to questions posed throughout the
class.
Any class work that is performed digitally can be tracked and stored.
Several years ago, I experimented with capturing the images of flipcharts
and activities performed during a live class, recording them using xAPI and
then attaching them to the individuals’ completion record in the learning
management system.

Virtual Classroom
Holding a class in a digital environment offers an entirely different
opportunity to gather data. Who is in the class and when do they leave,
when do they comment and what do they say? How do they answer
questions? Do they raise their hands? All of these events are recorded
digitally, and could be a source of data for you. Since activities in a virtual
classroom can be done on a digital whiteboard, this, too, can be captured as
part of the data stream.

Intranets, Knowledge Bases, Performance Support


Intranets, knowledge bases, and performance support tools collect several
similar kinds of data. We know who is accessing them,
where they go, and how long they stay there. This allows for some
interesting data, such as unique versus repeat visits and the pathway by
which people interface with these systems. The types of data available here
are very similar to what you may be used to seeing from Google Analytics,
although within the organizational ecosystem, you may have the added
benefit of knowing who did what as they used the systems.

Social Learning
Social media, social platforms, collaboration platforms, and the like gather
and store data each time an individual contributes, reads, reacts, or responds
to content. Many of these platforms allow for the storage of a variety of
types of content (such as text, images, videos, links, and polls) and thus we
are able to gather data not only about who does what, but also which types
of media they consume. We also have the ability to see and record data
about who follows whom, and which content generates the most
engagement. Basically, all the types of data that marketers use on social
platforms are also available to us as we create communities of learning
within our organizations.

Chat Bots
Interactions with chat bots generate learning data at every step. The receipt
of each message, the response to messages, the time it takes to respond to
the messages, and the interaction drop-off rate, in addition to the actual
content of responses and searches, provide a wealth of insight into this type
of learning experience.

Simulations, Virtual Reality, Augmented Reality, and Immersive Games
While I hesitate to lump so many different learning modalities into one
heading, the reality is that these advanced technologies are all very data-
driven and offer the opportunity to capture a data stream that can be
gigantic. Every turn of the head, every click, and every flip of a switch or
turn of a handle, as well as the time interval between each of these
activities, can be recorded and reported. Perhaps the biggest challenge in
this space relative to the others is the sheer volume of minute detail
available, and deciding what is actually meaningful.

Data From Learning Delivery Platforms


While it may be easy to see the learning experience as a source of data and
the platforms they reside on as simply where we store that data, our
learning platforms themselves also generate a lot of data that may inform
our work.
Learning Management Systems (LMS)
Learning management systems are a source of data above and beyond the
learning experiences they serve up. The LMS stores information about user
login and activity, enrollment (which means that we are gathering
information even before the learning experience starts), and useful
administrative information such as requests and approvals for training,
expiration dates, and continuing education credit counts.

Learning Experience Platforms (LXP)


Learning experience platforms share some, but not all, of the types of data
that are available in learning management systems and in social learning
environments. Many LXPs pride themselves on their data- driven
approaches and the insights that can be gleaned from their analytics
dashboards.

Search
The search box is a feature offered on most LMSs, LXPs, and several of the
learning modalities listed earlier, as well as browsers, intranets,
collaboration tools, and more. We don’t often think of search as a data
source, but data we can gather from it includes answers to questions like:
• What are people searching for?
• What results are fed back from the system?
• How many searches does it take before a person finds something to
click on?
• Which results do people pursue?
• Which search terms have no good responses to offer?
All of this can provide insight to the learning team.

Devices and Browsers


Many of the learning platforms we use also provide insight into the device
and browser learners are using. This can be very helpful information to
support an individual who may be having difficulty accessing training. It
can also provide data to help us answer questions such as “What devices are
being used?” and “Does the device or context affect how people engage
with and perform in their learning?”

Data From Surveys


Surveys are a popular source of data for many organizational purposes, not
the least of which is learning and development. Our field has been using
surveys for quite some time, including our Kirkpatrick Level 1 course
evaluation surveys, needs analysis surveys, and net promoter score surveys.
Our colleagues in human resources use a variety of surveys that may have
interesting data for us, too, such as employee engagement surveys and pulse
surveys.
Surveys are a very easy way to get data from individuals. They can be
implemented quickly using code-free, off-the-shelf software or even a
simple email or pen and paper. They’re also inexpensive and accessible
ways of gathering data. No wonder they’re popular.
Note, however, that surveys rely on self-reported data in which people
are generally incentivized to report the very best case about themselves (and
sometimes the very worst case about others), and are usually voluntary on
the part of the respondents. This means that the data that we get from
surveys may be less reliable than other kinds of data.

Data About Job Performance


Let’s not forget about the fact that the purpose of learning in most
organizations is to support and improve work performance. Thus, to analyze
learning’s impact on performance, we will need the data about
that performance.
Each of the organizational metrics provided in chapter 1 is composed of
data that is gathered from systems of work as individuals perform their job
tasks. In some organizations, the single largest hurdle to higher-order
learning analytics is getting access to this data at the proper levels of detail.
This task is far easier when the work is performed in or immediately
recorded into systems of work (such as a point-of-sale system, a CRM
system, an order tracking system, or medical records) or where the
equipment involved in the work is capturing this data (such as
manufacturing equipment, vehicles, or medical equipment). In some
situations, you will have insight as to who is responsible for a particular
outcome recorded by a particular piece of data (the user logged into the
system), and in other cases you will simply know some circumstantial data
about the event (the time it was recorded or the team assigned to that piece
of equipment). In the latter situation, in order to analyze learning relative to
performance, you will be working at the group level, not the individual
level, and you will want your learner demographics to include group
designations so you can tie them together.

Data About Things That Don’t Happen Online


Of course, let’s not forget that a lot of learning and performance happens
nowhere near a computer that can easily record transactional details for
future data extraction. In other words, if we want to analyze the experience
in an empirical way, we will need to find a way to digitize its salient
aspects. This can be done by surveying or asking individuals to record their
interactions, in much the same way that a salesperson may record notes and
data about a sales call with a prospective client.
For example, we could ask individuals about a particular learning
experience. Or, we could ask them to reflect on their learning and their
resulting behavior change. We could also encourage them to keep a journal
about their learning journey.
Just remember that any time we ask individuals to report something that
happened after the fact, we open our data set up to the risk that we will not
capture all the data or that an individual’s memory or efforts to present the
best impression of themselves will affect the integrity of the data. That does
not mean it is bad data, but it is important for us to acknowledge as we
analyze it.

What Could Possibly Go Wrong?


In this chapter we listed a lot of data sources, each of them with many
different pieces of data. It can be overwhelming! The goal isn’t necessarily
to get all the data from all the sources, but rather to explore the landscape of
possibilities as you begin your analysis. Some of these data sources may be
very easy for you to work with, while others may be quite difficult.
Do not underestimate the effort it may take to join data from two
different sources, particularly if those two sources don’t share index values,
such as how employees are identified. For example, if one system lists
learners by employee number and another system lists them by email
address, you will have to reconcile these two data sets before you can join
them and use them.
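Here is a minimal sketch, in Python with pandas, of what that reconciliation might look like when one system uses email addresses and another uses employee numbers, using an HRIS extract as the crosswalk between the two. All file and column names are hypothetical.

import pandas as pd

lms = pd.read_csv("lms_activity.csv")     # hypothetical: identifies learners by "email"
sales = pd.read_csv("sales_results.csv")  # hypothetical: identifies sellers by "employee_id"
hris = pd.read_csv("hris_crosswalk.csv")  # hypothetical: contains both "employee_id" and "email"

# Normalize the shared key first; stray capitals and whitespace are a common
# reason joins silently drop rows
lms["email"] = lms["email"].str.strip().str.lower()
hris["email"] = hris["email"].str.strip().str.lower()

# Use the HRIS extract as a crosswalk from email to employee_id, then join
# the learning and performance data on employee_id
lms = lms.merge(hris[["email", "employee_id"]], on="email", how="left")
combined = lms.merge(sales, on="employee_id", how="inner")

# Report how many LMS records never matched an employee_id; these need to be
# investigated or excluded, and that choice should be documented
print(lms["employee_id"].isna().sum(), "unmatched LMS records")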

Give It a Try
Here’s a Hypothetical
Think about a digital learning experience that you’ve either created or
engaged with, and all the platforms and systems that are involved in
delivering it. What kinds of data could have been recorded about that
experience? And, if you are privy to this, what kinds of data actually were
recorded?
Do It for Real
Select a learning experience with which you’re familiar. Considering all the
platforms and systems involved in storing, delivering, and consuming
learning content, what kinds of data are being recorded about that
experience? What sources and types of data are being missed in this
experience?
Bonus Points
Is data being stored in a single place for all of the points along that learning
delivery supply chain? If not, in how many different places is data being
stored?
Data Ethics
By Stella Lee, PhD, Director, Paradox Learning

Data ethics is a broad topic. It is a branch of ethics that builds on the


foundation of computer and information ethics. It examines moral issues in
every aspect of data usage—creation, mining, processing, interpreting,
communicating, and sharing. It also addresses the moral nature of
formulating algorithms and corresponding practices, such as hacking,
programming, and professional codes of conduct. Personally, I like to look
at data ethics through the lens of ongoing awareness, reflection, discussions,
and solutions, rather than a compliance-driven effort.
Any time we make use of data, we need to consider the following
questions:
• Where do we source our data?
• What data are various learning and performance support platforms
collecting?
• Is the data representational of what we are trying to do?
• What important metrics are missing? How complete is the picture
with the data we have?
• Is the data skewing us toward a certain bias?
• How do we interpret the data?
• What assumptions are we making about our learners or staff?
• How do we communicate the data?
To put it simply, we always need to think about data in a holistic, big-
picture way, and to understand that there is no such thing as unbiased data
and metrics.
This often involves putting human interests in front of organizational and
commercial interests, which means that when we provide products and
services, our design decisions need to be deliberate and consider people’s
privacy, consent, and rights from the very beginning rather than as an
afterthought. For example, by using the Privacy by Design Framework,
developed by Ann Cavoukian, we can ensure that privacy is incorporated
into what we create by default. This framework proposes that personal data
needs to be automatically protected in any system or business practice, and
that the data life cycle is secure from creation to retention and destruction.
At the same time, we need to be mindful not to let rules and regulations
become stringent to the point of stifling innovation. Striking such a balance
is not an easy task, but it is the path we must take to create a responsible
design culture.
In general, there has been very little discussion about organizational data
ownership. Historically, data collected on employees and customers belongs
to the organizations. The more pertinent questions should be: who, or more
commonly, which group of people should be held accountable for making
use of the data, and what kind of data governance and policy needs to be in
place to ensure the proper management, access, security, and retention of
data. It is important that organizations set aside adequate time and
appropriate resources for establishing and sustaining data governance. This
is not something we or our organizations should take on as a side project;
rather, it needs to be an intentional, strategic, enterprise-endorsed effort.
Data ethics is not new. Organizations have always collected data about
their employees from training enrollment and completion records to
engagement survey results. However, what is significantly different now is
that there is more readily-available data from a variety of technology
platforms, including learning management systems, learning experience
platforms, content libraries, intranets, collaborative and communication
platforms (such as Slack and MS Teams), and mobile and wearable
technologies. It is not only that we have this massive increase in the volume
of data, but also the unprecedented ability to triangulate the data from many
sources to detect patterns, make predictions on people’s behavior, and
influence decision making that has the potential to be fraught with bias and
harmful outcomes.
Data Protection and Privacy
By Ulduz Berenjforoush Azar, People Analytics Operations,
Critical Equity Consulting

Besides making sure that we follow the best data analysis practices, with
people data we have a special responsibility around privacy. And for people
data at Critical Equity, it’s a practice that we’ve had in collaboration with a
legal team to make sure that the personally identifiable information is
always protected. We have a data privacy policy in place, and we also
maintain different levels of data permissions for different team members,
which dictate what they can access and at what level.
And this is where aggregate data can be very helpful. Aggregate data means
looking at data summarized at a high level rather than at the level of
individuals: we look at a group of people's data rather than any one person's.
So, for learning analytics, similar to other people-analytics teams, we
have to make sure we follow all the same sort of data privacy and policy
guidelines. We keep things as much as possible at the aggregate level. After
running an audit of the data and making sure we have the right data sets,
there’s a lot of partnership and investment in relationships before we start
even publishing analysis. This includes cross-functionally building and
enhancing relationships to make sure that we have trust in the data, and that
the business and other stakeholders within the people team trust what comes
out of those conclusions and trends.
All this takes collaboration, consideration, patience, and time to get it
right.
Chapter 10

Build in Data Capture

It probably goes without saying that in order to analyze data, you will need
to collect it and store it, which is a little bit trickier than we might like. For
example, I remember the early days of xAPI when people would tell me
that they had published an e-learning course for xAPI, connected it up with
a free trial of a learning record store, and were very disappointed to find
that the only data they collected looked like SCORM. Of course it did.
That’s all the course was sending! If you want to get more interesting
information, you need to collect more interesting information, and many of
those early tools just could not do that (at least not without additional
programming effort).
As you define your data questions and make a plan for collecting data,
you’ll also need to make a plan for capturing and sending the data.
In this chapter, we will take a look at capturing data from the platforms
you use, building data capture into the learning experience that you
develop, gathering data from non-digital learning experiences, and
improvising when all of the above doesn’t quite work the way you want it
to.

Capturing Data From Pre-Built Systems


In some cases, we will be capturing data from pre-built platforms that you
will not be able to modify yourself. Think of these as off-the-shelf systems
like your LMS, your LXP, some vendor software that you acquire, and
many of the systems of work that employees interact with every day. What
these systems all have in common is that you probably cannot modify them
to extract any more data than they are currently collecting (at least not
without custom development work by the vendor, which is often not an
option). So, if you are interested in the amount of time that somebody
spends on a particular page in the platform, but that data point is not
captured by the platform, you are going to be hard pressed to get that
information.
There are two steps to capturing the data you need with pre-built
systems. First is to figure out what the system does collect, and second is to
access it in a usable format either within that tool or outside it. Generally,
your platform vendor should be able to tell you the kinds of data that it
collects. This may include data that is available in the reporting function of
that tool, and it may include additional data elements that might be of
interest to you.
Once you know what kinds of data the platform is collecting, you can
decide whether to use that data within the tool’s reporting function (if one
exists), or whether you will want to extract that data to be analyzed with
other data sets outside the platform. Some platforms make this easy, and
some do not. This will require a little bit of investigation on your part.
Table 10-1 outlines some examples of the types of data different
platforms collect.
Some platforms will allow you to add JavaScript widgets or other means
of extending the data capture capabilities beyond what’s available on the
screen. For example, you could put a link to a video or embed a poll on a
course description page within your LMS. Engagement with that video or
poll would give you some insight about who is reading the course
description page, even those people who did not actually sign up for the
course itself. Or, you could put a JavaScript add-on in a Microsoft
SharePoint page that sends an xAPI statement or a Zapier automation
trigger whenever it is interacted with. This would allow you to capture more
data from that page than the page itself would be telling you.
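To give a flavor of what such an add-on sends, here is a sketch of an xAPI "interacted" statement posted to a learning record store. In a SharePoint or LMS page the call would typically be made from JavaScript; the Python version below shows the same statement structure and request. The LRS endpoint, credentials, and activity IDs are hypothetical.

import requests

# A minimal xAPI statement describing an interaction with an embedded widget
statement = {
    "actor": {"mbox": "mailto:learner@example.com", "name": "Example Learner"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/interacted",
        "display": {"en-US": "interacted"},
    },
    "object": {
        "id": "https://intranet.example.com/widgets-101/description-poll",
        "definition": {"name": {"en-US": "Widgets 101 course description poll"}},
    },
}

response = requests.post(
    "https://lrs.example.com/xapi/statements",  # hypothetical LRS endpoint
    json=statement,
    headers={"X-Experience-API-Version": "1.0.3"},
    auth=("lrs_key", "lrs_secret"),             # hypothetical credentials
)
print(response.status_code)  # 200 with the stored statement's ID on success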

Table 10-1. Platforms and Data Types


Web conferencing and virtual classroom:
• Name of attendee
• Time logged in/out
• Time on camera/not
• Time on mute/not
• Transcript of event
• Comments entered into chat
• Participants in breakouts
• Answers to polls
• Hands raised
• Device and browser used to access

Video streaming:
• User ID when accessed
• Timestamp of play, pause, stop, abandon
• Scrub or seek
• Time spent on pause
• Use of transcript or closed caption function
• Device and browser used to access

Microlearning LXP:
• User ID when accessed
• Content accessed
• Amount of time spent on each content item
• Quiz questions answered
• Content shared
• Comments made
• Comments liked or reacted to
• Comments reported

Online conference app:
• Name of attendee
• Time logged in/out
• Comments made
• Comments liked or reacted to
• Sessions attended
• Session evaluation surveys
• Points earned
• Connections made with other attendees
• Sessions added to schedule
• Impressions on sponsored content

I like to think of data captured from learning systems in the same way as
data captured from sensors such as motion detectors, luxometers,
speedometers, security badges, and so on. These sensors offer a wealth of
information that may be valuable to you in your learning analytics. Devices
such as smartphones and watches, digital assistants, and computers, as well
as the browsers and apps on them, also can provide a lot of information
including location and device information itself. You likely won’t be able to
modify what data is captured, and it may even be difficult—or unethical—
to extract information in a format that you can use, but these may hold
unique opportunities for you and your learning experience.
For a very simple example, at the TorranceLearning headquarters office
we have a sensor on a door to one of our conference rooms. It records an
xAPI statement every time the door is opened or closed. This particular
sensor hardware also collects data about the temperature, altitude, and
barometric pressure at the time the door is opened or closed. Temperature is
sometimes interesting, but altitude and barometric pressure are useless
information to me. Furthermore, the door does not know who opened it, so I
am hard pressed to link that data to any other employee or visitor records.
Similarly, manufacturing systems may be capturing data but without a
connection back to who was operating it at the time the data was captured.
These gaps will provide some challenges to you and your work, but they are
not insurmountable.

Building Data Capture Into the Learning Experience


There are times when you will have much more control over the data
capture than was just described, and this is where you will need to get
intentional about your design for data capture.
Data can be generated any time the user clicks, enters information,
downloads something, or uses voice commands. Conversely, if something is
available without needing to click, enter information, download, or use
voice commands, there may not be a moment of capture for you to grab
onto.
For example, imagine you have an FAQ list on your webpage or in your
course. If all the questions and answers are immediately visible to the
user, your options for data capture are relatively limited. You might be able
to capture the amount of time spent on the page, but you won’t have fine-
tuned information on what kinds of questions people were most interested
in finding the answers to. However, if you collapse the answers and require
a click on the question to reveal its answer, you have then created a moment
of interaction with the screen that you can capture and record. That would
enable you to have some insight about the kinds of questions people are
interested in.
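As a sketch of what that might look like on a webpage you control, the following assumes each question is a button carrying a data-question-id attribute and reuses a statement-sending helper like the sendStatement sketch shown earlier; the markup, identifiers, and learner lookup are illustrative only.

// Minimal sketch: reveal an answer on click and record which question was opened.
document.querySelectorAll("button.faq-question").forEach((button) => {
  button.addEventListener("click", () => {
    // Toggle the collapsed answer (markup and styling will vary by site).
    const answer = document.getElementById(button.dataset.questionId + "-answer");
    if (answer) {
      answer.hidden = !answer.hidden;
    }
    // The click is the moment of capture: which question, by whom, and when.
    sendStatement(
      "pat@example.com", // placeholder; use the signed-in user's identity
      "https://example.com/faq/" + button.dataset.questionId,
      button.textContent.trim() // the question text becomes the activity name
    );
  });
});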
Here’s another example from an e-learning course our team built for a
client a few years ago (Figure 10-1). Imagine you have a hint that you want
to provide to learners as they move through an activity. If the hint is readily
displayed on the screen, you cannot know if they used it. But if the hint is
tucked underneath a button that needs to be clicked, you can keep track of
the number of times people click the button. (Did you notice the light-gray
hint button at the upper right corner of the screen? Many users didn’t. And
we know because we collected that data.)

Figure 10-1. Example of Instrumenting E-Learning to Enable Data Capture

To take this one level deeper, if you want to keep track of the number of
times someone attempts a particular interaction on a screen, you will want
to capture data with every click on the screen, rather than design the screen
so that the only data that gets sent back is the learner’s final response. For
example, imagine you want to know how many times people clicked on the
answer options on the screen in Figure 10-1 as an indication of their
uncertainty about their response. The best way to capture this data would be
to place data capture on each click, not just the final submitted answer.
Beyond tracking screen clicks, a commonly used instructional strategy in
e-learning is to ask learners to answer a free-text question. The data that
learners enter often cannot be used outside the course because it is not
captured using SCORM. With xAPI, we can capture the answers to that
question, read them, and analyze them. This offers an entirely different
depth of insight into the learners’ experience (Figure 10-2).

Figure 10-2. Example of Instrumenting E-Learning to Capture Text Entry Data

Some e-learning authoring programs will do much of this data capture
for you, while others will capture very little information on a screen without
you building in the JavaScript triggers. The tools that do the work for you
make capturing data very easy, although there tends to be less control over
what and how data gets sent.
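To make that concrete, here is a sketch of what a captured free-text answer might look like as an xAPI statement. The learner, activity ID, question, and response shown are purely illustrative; what matters is that the learner's own words travel in the statement's result.response, where they can later be read and analyzed.

{
  "actor": { "objectType": "Agent", "mbox": "mailto:pat@example.com" },
  "verb": {
    "id": "http://adlnet.gov/expapi/verbs/answered",
    "display": { "en-US": "answered" }
  },
  "object": {
    "objectType": "Activity",
    "id": "https://example.com/courses/safety-101/reflection-1",
    "definition": {
      "name": { "en-US": "Reflection question 1" },
      "description": { "en-US": "What did you notice in that video?" },
      "type": "http://adlnet.gov/expapi/activities/cmi.interaction",
      "interactionType": "fill-in"
    }
  },
  "result": {
    "response": "The operator skipped two steps in the lockout procedure."
  },
  "timestamp": "2023-03-14T15:09:26Z"
}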

Capturing Data From Non-Digital Experiences


All of the previous sections assumed that an individual is interacting with a
computer or device at the time of data capture, but we all know that
learning and performance often happens away from the presence of a
computer or device. After all, learning and performance happens when
managers coach their employees, when we take people off-site for a team
building and skill building activity, when staff are interacting with people in
the community, when an instructor facilitates an in-person class, and myriad
other times.
To capture what happened, you’ll need to figure out a way to digitize the
experience. This can be done via surveys, observations that are documented
in online tools, or testing after the experience, or by finding ways to
incorporate technology into the experience through apps, digital
journals, or other tools.

Know When You Have to Improvise


You won’t always be able to get the data that you want. That’s when you
improvise. Say, for example, you are not able to capture every click on a
screen as an indication of learner confidence when they answer a question.
Instead, you could ask learners how confident they feel in their answer and
record that.
This is an opportunity for you to get creative, but it’s also an opportunity
to ensure that when you report your results and insights you are very clear
about the improvisations you needed to make to gather the data that you
did.

Consider Data Privacy, Confidentiality, and Security as You Collect Data
As you are designing for data collection and gathering data from a variety
of sources, this is an excellent time to reconnect with the issues of data
privacy, confidentiality, and security. While they’re very similar concepts in
lay terms, when we’re talking about workplace learning data, we need to
consider each one as its own distinct concept.
Privacy is the individual’s right to keep their information to themselves,
shared only with those to whom they give permission. In the workplace, we
have historically considered data that is created on the job using employer-
provided hardware, software, and training experiences to be the property of
the company. Most employees sign an agreement to that effect when they start
their employment, and it may be included in the conditions of daily sign-on
to the organization's systems of work. And most employees can assume that
their personal data and activity (such as social media use during breaks on
their own devices) should remain private to them. This gets a bit blurry
when individuals sign into workplace-provided learning on personal
devices such as their smartphones, or when learning experiences are offered
by third parties where a credential is portable and applicable across
employers (such as training provided by an outside provider or an industry
association). Privacy issues come into play whenever you are recording or
monitoring individuals, and may also include the individual’s right to have
their data “forgotten” by systems that hold it.
While a deep dive into data privacy protections is beyond the scope of
this book, suffice it to say that if you are considering collecting or
connecting to data that may come from individuals’ personal accounts and
devices, you would be well served to consult with a data privacy expert and
your organization’s legal counsel first.
While privacy deals with what information you collect, issues of
confidentiality have to do with what you do with the data and how you
safeguard it once you’ve got it. When you are maintaining the
confidentiality of the data, you’re taking steps to ensure that it is accessible
only to qualified personnel and that the data is not intentionally or
accidentally released to those who should not have it. We often think of
confidentiality as issues of sharing, releasing, or selling data. It also applies
to who can see an employee’s performance data, comments, and assessment
scores.
Data security is how we safeguard the organization’s data from
intrusion, corruption, or theft throughout the entire life cycle of data use.
Security often includes the concepts of privacy and confidentiality, although
it does not have to.
The organization—and by extension, you—has responsibility for data
privacy, data confidentiality, and data security, both within its internal
systems and in any vendor-provided software and services. The laws and
regulations regarding these concepts vary from country to country and, in
some cases, by organization (for example, if you are doing work for a
government entity). Your organization’s IT, HR, and legal teams likely have
guidance that you can follow as well.
What Could Possibly Go Wrong?
Planning and designing for data capture requires a lot of upfront thought
and thus presents many pitfalls. For example, you might fail to properly
protect the confidentiality of learner data and carelessly or inadvertently
release data that should not have been released. This includes joining data
from sensitive sources (such as HRIS) and less sensitive sources
(performance support tools) in ways that reveal certain information more
widely than intended. This is illegal in some jurisdictions (Germany, the
EU, and Canada, among others) and, even where it is not illegal, it can erode
trust in you and your team and call your ethics into question.
Or you might design or acquire a new learning tool without making a
plan for collecting meaningful data about the experience and its impact,
forcing you to retrofit the experience later on to fit your new learning
ecosystem.
In the next chapter, we’ll dive into the places where your data is stored.

Give It a Try
Here’s a Hypothetical
Consider a learning experience that you’re familiar with. What kinds of
data are collected about that experience? What are the triggers or actions
that generate that data? What additional types of data could be collected? If
multiple sources of data are involved, how are they indexed or matched up
so they can be used together? How is personal data kept private? How is the
data kept confidential? Who has access to see it and why? How is the data
kept secure?
Do It for Real
As you’re designing your learning experience, and you’ve identified the
data that you would like to capture from it, what are the triggers or actions
that generate the data you need? If multiple sources of data are involved,
how are they indexed or matched up so they can be used together? How
will personal data be kept private? How will the data be kept confidential?
Who will have access to it and why? How will the data be kept secure?
Bonus Points
Discuss the data privacy, confidentiality, and security protections your
organization has in place with a data expert or the organization’s
legal counsel.
Instrumenting Learning Experiences to Capture
Meaningful Data
By Matt Kliewer, Learning Engineering Team Lead,
TorranceLearning

I’ve instrumented data collection (mostly using xAPI) in a variety of
learning experiences, from e-learning courses built in authoring tools
like Articulate Storyline, Adobe Captivate, and dominKnow Flow to
custom software that we’ve developed for our clients.
In an e-learning context, it’s important to be aware of the strengths and
limitations of the software’s built-in capabilities for capturing data—some
have more flexibility than others for customizing data collection. Examine
what you get out of the box, and then what features are available to extend
that if needed (such as executing a JavaScript code).
Data collection is event-driven—an event occurs (a course is launched, a
page is navigated, a question is answered) and information about that event
is captured. But a lot of events aren’t truly relevant to the questions you’re
trying to answer with your data, so sometimes working backward from
what you (or your stakeholders) want to eventually glean from your data
reports can clarify where and how to capture these events, what to ignore,
and what additional retrofitting steps might be necessary to make it happen.
It’s normal to iterate on this process, and to refine the details of the data
capture (to the degree your authoring tool allows).
For instance, you probably don’t need to capture every “next” button
click—it may not be insightful data, particularly if the whole course is
required. Also, most authoring tools are already automatically capturing this
data on every page, and some may include an xAPI statement that a user
has “left” a slide that contains the duration spent there; this could be
valuable information, but you don’t have to collect it manually. But you
might capture data for certain branching pathways, resources available to
view, and other optional materials such as hints. Higher-level business
requirements may be “completion” or final scores; you and your department
may be more interested in how users engage with the material and its
effectiveness, which can be refined in future revisions or new initiatives.
You might find that an e-learning authoring tool’s built-in quiz data
capture works just fine—but it’s good to be aware of its capabilities and
when it might be more useful to create custom interactions to send more
specific data. For example, where the standard quiz might just send the last
result, you may want to capture multiple attempts at a particular interaction.
You could also embed data from prior interactions into your data capture, so
if you ask a question or gather input from the learner elsewhere in the
course (such as, “What did you notice in that video?” or “Which avatar do
you want to guide you through this course?”), you can include that in
subsequent data for further analysis.
When developing for custom software, the basic rules are the same—we
decide what events are relevant to capture meaningful data and construct
new events as needed. However, since you aren’t limited by what you can
accomplish with a single tool, the options broaden considerably. Of course,
you are still constrained by time, budget, and skill, so layering complexity
over time is a good approach. Start with the basics and add on additional
depth and breadth of data as constraints allow.
For both authoring tools and custom software development, it is
beneficial to document what data is being captured. For xAPI, this is often
as simple as a spreadsheet with verbs, activity names, and additional
embedded data (score, duration, context, and so on). This helps with data
design but is also an invaluable reference for the whole process, from
design to implementation to analysis.
For example, I built a progressive learning experience that took place in
the collaboration tool Slack. Because we knew there would be very specific
requirements for how the xAPI data was going to be captured (within a
custom app added to Slack but using Slack’s event infrastructure), we
started by documenting everything we thought would be useful to gather
along a user’s journey (which in this case was almost everything) and
investigating what other metadata from Slack we could gather and include
(such as date of the interaction). We built out a basic profile for how all the
data would be grouped and arranged hierarchically and what data to embed
to be able to create reports that could be sliced and diced.
This documentation was referenced and modified repeatedly during the
development of the app; it was useful both for me as the developer
constructing the code and for the instructional designers who were writing
content and user flow. During the pilot, we could then use that to build
dashboards more easily, knowing what information was contained in that
collected data.
Chapter 11

Store the Data

Of course, the unsung heroes of all that we have been discussing so far in this
book are the places that store our data so we can use it. Up until now we
have been discussing sources of data, data providers, and the kinds of
information that we can gather from them. Data storage is often a space in
which instructional designers will need to collaborate with their IT
departments to establish secure and interconnected data stores.
As we proceed through this chapter, you may think, “Wait, we just talked
about xyz tool as a source of data, and now we’re talking about it as a
storage point, too?!” Yes! And while this may seem confusing at points,
some systems will be both generators of data and places to store it.

Data Storage Tools and Services


Let’s take a look at some of the types of data storage tools and services that
are common in a learning ecosystem. Keep in mind that the lines between
data providers and data stores are sometimes blurred, and the multiple
locations in which data can be stored within an organization can be layered.

Upstream Data Sources


Systems such as the human resources information system (HRIS), payroll
systems, talent acquisition systems, competency models, and sometimes
even external course catalogs are often data providers into other
components within the learning ecosystem. For example, your HRIS and
payroll system are often the source of employee demographic and job data.
It is entirely possible that these systems won’t provide all their data to the
downstream learning tools, so accessing them may be of value in a learning
analytics situation. For example, the HRIS may house employee
demographic data that would offer insights into learner audience
segmentation across gender, age, race, and other factors that are generally
not passed along to the LMS. (And per our previous discussions, this use of
demographic data brings with it quite sensitive privacy issues. Consider
aggregating it.) And yet, this data can be very valuable for identifying
whether or not all employees are achieving the same levels of success in
their learning.

Learning Management Systems and Learning Experience Platforms
LMSs and LXPs are both interfaces for learning as well as data stores for
the learning activity that takes place within them. An LMS will contain
enrollment, completion, test scores, and time spent in learning programs,
often stored as SCORM data, along with activity from instructor-led
training that is often stored in LMS-specific data schemes. An LXP will
contain similar data for the learning objects it houses, although this
information is often stored using proprietary data models and may not
include instructor-led activity. You can also perform analytics on the data
stored within them.
Traditionally, neither an LMS nor an LXP accepts data from outside its own
platform in a rich or easy-to-use format. They are really designed to look
deeply at the data that they have created within them. Typically, to do an
analysis across platforms you will need to extract your data from these data
stores. This can be done via a custom flat file (like a .csv) or via a generic
API or xAPI, if supported by the vendor.

Learning Experiences
It is not unusual for a learning experience that is delivered outside the LMS
to have its own internal data storage and analytics capabilities. This
includes performance support tools, memory aids, role-play applications,
virtual reality, augmented reality, and many types of gamification. They are
both data providers and data stores all in one component. For many of these
learning experiences, SCORM was never really a relevant concept, so their
adoption of xAPI and other data exchanges may have lagged. More and
more of them are beginning to adopt xAPI, allowing for deep analytics of
the learning experience within the tool, as well as portability of summary-
level data outside to a separate LRS.

Learning Record Stores


The learning record store (LRS) is the official database of xAPI. An LRS
allows for the storage of data from a wide variety of data providers,
including LMSs and LXPs, making them all available for analysis together.
An LRS can also house data from outside business systems that has been
expressed as xAPI so that performance data can be analyzed alongside
learning experience data. In many cases, the LRS is the first downstream
cross-platform data store available to the learning function and thus
becomes the analytics workhorse for the L&D team. Many of the
commercial LRSs have analytics and visualization capabilities that will
meet your needs.
What’s more, an LRS is also designed to provide outbound data back to
learning experiences, thus enabling us to use that data for the
personalization and adaptation of the experience itself. (When we get to the
discussion of data warehouses and data lakes later in this chapter, note that
these tools are not designed to support this level of dynamic querying.)
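As a rough illustration of that outbound path, here is a minimal sketch of a query against an LRS's xAPI statements endpoint. The endpoint, credentials, verb, and activity ID are placeholders, and in practice you would keep credentials out of the learning content itself; treat this as a pattern, not a finished implementation.

// Minimal sketch: ask the LRS what a learner has already done with an activity,
// so the experience can adapt (skip content, offer a refresher, and so on).
async function getPriorStatements(userEmail, activityId) {
  const params = new URLSearchParams({
    agent: JSON.stringify({ objectType: "Agent", mbox: "mailto:" + userEmail }),
    activity: activityId,
    verb: "http://adlnet.gov/expapi/verbs/completed",
    limit: "25"
  });
  const response = await fetch("https://lrs.example.com/xapi/statements?" + params, {
    headers: {
      "X-Experience-API-Version": "1.0.3",
      Authorization: "Basic " + btoa("username:password") // placeholder credentials
    }
  });
  const data = await response.json();
  return data.statements; // e.g., branch the experience if this list is not empty
}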

This is a good opportunity to take a quick pause and discuss the kinds of
data and analytics at the varying levels we’ve discussed. Within any one
learning experience or learning experience type, the data gathered can be
incredibly, even excruciatingly, detailed. Consider the opportunities for
analytics if you had data about every single button click, when a click is
made, how long it takes to get there, whether or not it’s the one that you
wanted to happen, and how you can influence that. That is exciting stuff
and essential to the analysis and improvement of the learning experience.
Most of the analytics at this level of granularity are situation- and tool-
specific and not very easy or useful to generalize across learning
experiences. By this I mean, a click of a particular button on a particular
page in a particular learning experience may not have much relevance to
another click on a different part of a page in a different learning experience
about a different topic. This layer of data, storage, and analysis is what we
refer to as the “noisy data layer.” In the noisy data layer, there’s a lot of data
about a lot of very little things that allow us to make very specific changes
and adjustments to the learning experience.
As we start to compile data across learning and performance
experiences, the data that we report and the expectations we have for our
analysis need to be a bit more normalized and at a higher level. It is at
this more transactional layer that we use summary-level statistics data for
our analysis, while still having access to the noisy layer of very granular
data in case we need to dive in deeper. Data standards such as xAPI allow
us to gather data in one place and analyze it across a wide variety of data
providers. Without unifying data standards, we are left with only the noisy
layer, which may be siloed from learning experience to learning experience,
limiting our ability to make conclusions about the learning experiences
across platforms.

Data Warehouses and Data Lakes


Many organizations are building their capabilities in terms of cross-
functional data storage and analysis for business intelligence purposes. To
do so they’re turning to two different data repository types: warehouses and
lakes.
Data warehouses contain structured data from across the organization;
we can contribute our learning data to this set. Data lakes contain less
structured or completely unstructured data and may also contain learning
data.
This is a space that, as the movement toward big data and data-driven
business decisions continues, will evolve and mature. If your organization
has these tools, the teams that support them can guide you in their usage
and help you decide if sending your learning and performance data to these
stores would be a useful activity.
Note: Some L&D teams use their own xAPI LRS and other learning data
analytics platforms for their analysis, and they send learning data to their
corporate data lake for others to use as well.

Learning Analytics Platforms and Other Tools for Data Analysis and Visualization
A variety of business tools exist for data analysis and visualization, drawing
from the data warehouses and data lakes available to them. In some cases,
these tools are integrated into the data stores; in other cases they are simply
add-on software that leverages the data, but doesn’t handle the storage
component. Typically the learning team at this level will adopt whatever the
rest of the business is leveraging.

A Multitude of Options and Approaches


Wow! Are you feeling like that’s a lot of different types of systems? Yes, it
is, but each one serves its own purpose within the ecosystem. What is very
effective about a modular and layered system approach that leverages
several platforms (such as learning data stores, organizational data stores, or
data visualization tools) is that any one component can be swapped out with
relatively little effect on the rest. This enables your organization to grow
and mature in its learning experience design, its data gathering, and its
analytics capabilities while being less constrained by the existing ecosystem.
For example, many of the xAPI LRSs allow for the easy forwarding of
data from one LRS to another, enabling organizations to pool data from
multiple sources or to leverage the right tool for the right job at every step,
or simply to change vendors without a major impact to end users.
I once worked with a large organization that was trying out three or more
LRSs simultaneously, one in each business unit. Upon completion of their
trial, their goal was to consolidate data into a single vendor—or not—
depending on the results they found. A fluid and interchangeable set of
systems makes this possible.
It is not uncommon for organizations that are just starting their learning
data journey to connect tools with data streams that are locally optimized
but, as the ecosystem grows, become confusing, inefficient, and difficult to
manage. Having a plan in place before things get out of hand will prevent
having to deconstruct and rework the ecosystem later on.
As you’re getting started with your learning analytics journey, the map in
Figure 11-1 may help you outline and organize the kinds of data sources
and stores in your ecosystem. Using this tool, you can map out the
platforms and systems, including sources of data (activity), stores of data,
analysis, tools, and perhaps even aggregation platforms.

Figure 11-1. Mapping the Learning Data Ecosystem


As you’ll see in Figure 11-2, data activity, storage, and analysis may all
be combined into a single tool, or some tools may be completely cut off
from the data ecosystem. There is no one right answer here. The goal with
using this type of chart is to discover the opportunities within the learning
function and in the business for data gathering, analysis, and insight.

Figure 11-2. Example Learning Data Ecosystem

In the example in Figure 11-2, we can see some patterns and insights
emerging, such as:
• The HRIS and performance system are simultaneously a source of
data and a storage point for data, while also offering analytics
capabilities.
• The virtual classroom platform is where a lot of activity may be
happening (it is a potential source of data), but it’s not sending that
data anywhere—so we may be missing out on some value there.
• The competency map and the LRS are sending data to the learning
and employment record (LER), but these systems are not connected
to the HRIS.
• Business systems that house the actual performance of work by
employees are not connected to learning systems for analysis, and
thus they present an opportunity for additional connection.
The other purpose for this tool is to recognize the levels at which we can
expect to use and analyze our data. The closer we are to the activity layer,
the more granular and noisy our data becomes about a particular learning
experience. The further we are from the activity layer, the more that our
data needs to be transactional and interoperable, in order to draw broader
conclusions across learning data types.

The Role of IT
In my experience as a learning designer for more than two decades, I have
often heard the following sentiment: “If we get IT involved, we will never
get this project finished.” However, when we move into a learning analytics
space, and start looking at data outside the granular activity layer, we will
absolutely need to involve IT if we want to achieve our goals. In most
organizations, IT is playing a leading role in data capture, storage, security,
privacy, and transport, all of which you need to do your work in learning
analytics. This is generally not a space in which you will want to work
independently of the infrastructure and support your organization provides.
As you begin your learning analytics journey, be sure to connect with your
IT and BI teams so you can leverage their expertise and the ecosystem they
are already building.

What Could Possibly Go Wrong?


While data and analytics—and using them to inform learning design and
business decisions—are top of mind for L&D, data storage frequently is
not. Thus, you need to be mindful of a few stumbling points. For example,
you might underappreciate the value of some of the tools and platforms in
your ecosystem as both data sources and data stores.
Or you might fail to move data from siloed systems to aggregated
storage tools, and thus miss out on the opportunity to include their data in
your analyses.
And, at the risk of sounding like a broken record, you might fail to
include your organization’s IT, BI, and HR teams in your work and create
another siloed set of data.
In the next chapter, we merge our data and analytics discussion with
another passion of mine—Agile methods—and how to iterate during the
process of using data in learning design.

Give It a Try
Here’s a Hypothetical
Refer back to the learning experience that you have been following along
for each chapter, and use the tool provided in Figure 11-1 to map out the
sources of activity level data, where data is stored, where it is analyzed, and
any connections to performance data or other learning experiences that may
be relevant. What patterns do you notice? What gaps do you notice? What
might be an opportunity that warrants further exploration?
Do It for Real
Use the tool provided in Figure 11-1 to map out your organization’s
learning data ecosystem: the sources of activity level data, where data is
stored, where it is analyzed, and connections to business performance data.
It is quite likely that a real-life ecosystem will need a far larger map than
the one provided here, so feel free to expand as wide and as deep as you
need to go. What patterns do you notice? What gaps do you notice? Is there
an opportunity that warrants further exploration?
Bonus Points
Reach out to the IT and business intelligence teams in your organization
and discuss your map and findings with them. What additions would they
make to the map? What insights can they draw from the learning data that
they might not have already included in their data sets?
Working Within a Unique Learning Data
Ecosystem
By Wendy M. Morgan, Learning & Development Senior
Strategist, Frank Porter Graham Child Development Institute,
the University of North Carolina, Chapel Hill

The learning data ecosystem within which the Frank Porter Graham (FPG)
Institute operates is quite unique. Although we’re situated within UNC–
Chapel Hill, the LMS used for student coursework would not work for us.
In fact, we could not find any LMS that would work for us! First, FPG is
home to many independently funded projects with different audiences,
websites, and data. Second, the common threads among all FPG projects
are that all training is external and community-focused, and research
requires statistical analysis with meaningful variables.
For the FPG Institute, learner activity data is critical not only for
evaluation but for providing effective instruction and support. Each choice
our adult professional learners make within our web-based learning
products is reported through JavaScript as xAPI protocol data and captured
within a learning record store (LRS). This way, our e-learning is truly
portable; it can be hosted on any website, and it will still report the data to
the same database.
Use of the xAPI protocol also allows us to customize the data we collect.
In other words, each learner choice and interaction can be recorded and
operationalized into conceptually meaningful variables that are aligned with
instructional design strategy. We can also query data into custom PDFs
housed within the digital lessons, enhancing the instructional design
strategy. Learners can leave each module with a custom summary
conveying their individual progress, including the areas where they may
need to improve.
Trained as an academic researcher, I’ve always valued making decisions
based on data. Before I accepted my current position, I learned about the
FPG’s data capabilities and raised the subject of the likely need for an xAPI
or LRS learning ecosystem during interviews for the role. I knew that to
provide the most effective instructional strategy and design, I would need
an infrastructure in place to provide custom learner activity data. Once I
accepted the role, I championed the effort to install it. We had a fully
functional LRS within about four years. We started with an open-source LRS
system but eventually found that we were bumping against the ceiling of our
data plan. As we explored our options for a larger system, we were
working with these considerations:
• We needed separate LRSs for each project.
• As a research institute, our needs were very specific as we’re
collecting data to support the unique research objectives of each
principal investigator on the team.
• We would be running our own statistical analyses and creating our
own custom visualizations, so we didn’t want to pay for those add-
ons.
• We needed the power to handle a lot of data.
Eventually, we decided to invest in an on-premises set-up with an LRS
provider, which has worked out very successfully for us.
Our learning ecosystem has become a critical part of our work achieving
FPG’s strategic initiatives. Our ability to collect custom learner activity data
has made us an attractive partner for collaborative work with research
projects in other departments and universities. It is also attractive to
funders.
Considerations for Operating Within Your
Learning Data Ecosystem
By Brent Smith, RD&E Principal (SETA), ADL Initiative

As an organization’s maturity level deepens and it’s building out a learning
analytics platform ecosystem, there are many things that the L&D team
needs to be aware of. They first need to understand how other functional
areas within an organization might use this data. The C-suite team is going
to look at data differently from supervisors, instructors, course developers,
or business developers. The L&D team needs to know how this data can be
parsed to support the different user communities. They also need to know
what data they have available to support those communities.
Some organizations already have an existing learning analytics
ecosystem, often a proprietary, purpose-built environment developed either
in-house or based on a relationship they have with a vendor. These data
structures may predate adoption of xAPI and thus not be interoperable with
outside systems and tools. As these organizations move into a more fluid
and interoperable data environment, this might present problems. One is
when organizations jump into xAPI adoption without a clear strategy for
how they’re going to use the data. The most successful organizations start
by understanding the end-users of the data. Once you know how the data is
being used, an xAPI data strategy can be developed. Part of building an
xAPI data strategy is looking at the different xAPI profiles that are in use
and the data they’re able to generate, and identifying gaps in the data
required to support the different user communities. At ADL, we’re spending
a lot of time looking at the different total learning architecture (TLA) data
types to see how those insights can better inform decision making, support
artificial intelligence and machine learning, and provide traceability across
the life cycle of learning for an entire organization. Any migration strategy
needs to be deliberate and should include a process for data governance.
Other organizations have been focused on the analytics available to them
in their LMS and LXP up to now. As these organizations embark on some
of their first cross-platform learning analytics projects, they will need to be
mindful of data ownership from different systems. The TLA work aligns
with the overall Department of Defense Data Strategy, which looks at data
as a strategic asset for any organization. There are many systems out there
that only minimally share their data. In these systems, their analytics are
only reflective of what that system knows about the learner. To stay
competitive and increase efficiency, we need to share data about learners
between connected systems. The insights gleaned from the collective data
enable data-driven decision making across the organization. Business
developers might look at existing workforce skills to find new markets,
leadership teams might use this data to cross-train or upskill employees to
retain and grow good talent, and adaptive systems will be able to optimize
learning, development, and career growth.
As they procure new systems to add to their learning ecosystem, learning
professionals should be asking themselves and their vendors a few
questions. For example, how are you sharing your data? With the rapid pace
of technology development, how are vendors sharing their data so that it
becomes visible, accessible, understandable, linked, trustworthy,
interoperable, and secure (VAULTIS)? What standards are they following to
stay in tune with best practices for sharing this data with customer
organizations?
Privacy, security, and ownership are also questions I would have for any
vendor. What are their data sharing policies? Who owns the learner data?
How much control does the learner have to protect that data? When learner
data is coupled with other organizational data, it becomes even more critical
to protect it.
Chapter 12

Iterate on the Data and the Analysis

This chapter is a slight detour to the intersection of my work with Agile
project management for instructional projects and data and analytics (see
also my book Agile for Instructional Designers, ATD Press, 2019).
Instructional design and development projects have historically followed
the ADDIE approach, which is often drawn like Figure 12-1.

Figure 12-1. ADDIE

This approach was designed in an effort to mitigate risk by doing all of
our thinking up front during the analysis and design phases; this way,
change would not negatively impact the project as it progressed. However,
as it is often said, the pace at which we must accommodate change is
accelerating, and a model such as ADDIE does not support that as well as
Agile models do.
What’s more, we typically save evaluation for the end of a project and do
not do it as frequently or as richly as we would hope to. Of course, this
book focuses on the data and the analytics that we use to support the
evaluation part of the model.
In my work with LLAMA, the Lot Like Agile Management Approach,
we take a different look at the ADDIE process. In this approach—and other
iterative design models—the steps are the same as with traditional ADDIE,
but more frequent testing and evaluation is done throughout the project
(Figure 12-2).

Figure 12-2. LLAMA’s Iterative Approach to ADDIE

This approach is both iterative and incremental as a program progresses.
The approach is iterative in that during each round we are improving upon
the gaps identified in the prior rounds of design and development work. The
approach is incremental in that we are often advancing the degree of finish
and delivery of a particular piece of learning material as it moves through
this process. Using data and analytics can be a key part of the
implementation and evaluation boxes in this diagram, providing the project
team with the insights they need to continue their work.
I think of iterating on the data as a process that happens at two different
levels: the data collection level and the analysis level. Each level affects
the other because they are not entirely independent.

Iterating on the Data Collection


It’s not uncommon for raw data sources to contain junk data. The more
sources of data you bring into one pool, the more likely it is you will face
duplications, inconsistencies, and things that just don’t match up in the
ways you want them to.
Sometimes the junk data comes from the initial test runs of something
before you released it to pilot testers. Other times you may find that two
data sets don’t fully match. For example, you may be looking at learner data
from your LMS and matching it with data from your HRIS, only to find that
your contractors who are LMS users are not in your HRIS. Therefore you
will not have the same types of data about that learner subset. You’ll then
have a decision to make about how to go forward.
The process of “cleaning” your data includes such things as
standardizing references to learning objects, eliminating duplicate data, and
reconciling any mismatches. This is an opportunity to find and fix any
structural errors, such as inconsistent labeling or classifications.
I like to take a step back at this point and look at a random sampling of
the actual lines of data. Do they make sense? Am I seeing what I expect to
see? This isn’t a process of using data to confirm my assumptions, but
rather getting a sense of the data before proceeding with analysis. For
example, if I were looking to see how much time was spent on a particular
task, I may realize that my data notes the start and end times on two
different rows. This means that in my analysis I will need to calculate the
duration of time between the start and end. That leads to a cumbersome and
intensive process that could be much easier if I were able to send a single
line of data that included the duration spent on the task. It then becomes a
straightforward software development task to capture the data this way.
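Here is a minimal sketch of that cumbersome path: pairing separate start and end rows to compute time on task. The field names are assumptions about a hypothetical export, not any particular tool's format.

// Minimal sketch: derive time on task from data that arrives as separate
// "started" and "completed" rows for each user and task.
function durationsByTask(rows) {
  const starts = new Map();
  const durations = [];
  for (const row of rows) {
    const key = row.user + "|" + row.task;
    if (row.event === "started") {
      starts.set(key, new Date(row.timestamp));
    } else if (row.event === "completed" && starts.has(key)) {
      const seconds = (new Date(row.timestamp) - starts.get(key)) / 1000;
      durations.push({ user: row.user, task: row.task, seconds });
      starts.delete(key);
    }
  }
  return durations;
}
// The cleaner fix is upstream: send one record per task that already includes
// the duration (in xAPI, result.duration as an ISO 8601 value such as "PT4M30S").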
You can save yourself a lot of time in analysis by having clean data
going into it. This often takes several iterations. Each time you identify and
fix data inconsistencies you have an opportunity to think about whether you
can prevent this muddiness in your data at the source when it is created.
Note that as you iterate on the data you are collecting and make
adjustments to your data senders and sources, you may find that you also
discard large quantities of data created early on in a program’s design
because it is no longer consistent with the new data you are collecting.
One way to mitigate the impact of this type of iteration on your data is to
do your first few rounds in a test environment. That way you are not filling
up your data storage with junk. It is also important to discuss this process
with business sponsors of your work, because they need to understand the
impact of collecting and then possibly discarding or not using entire sets of
actual learner data.
As an example, the TorranceLearning team was working with a client on
a first xAPI proof-of-concept project. The goal was to release an e-learning
course that collected interaction engagement data as a proof of concept for
the organization and a learning experience for the design team. We went
into this knowing that the data we collected might not follow the
organization’s future data profiles for that type of learning (because they
were still being developed). We also knew that the data we were collecting
might be used to change the learning experience and the data source itself.
We had that conversation up front with the business sponsor and were able
to reassure them that the basic completion data would still be relevant even
in future iterations of the learning program.
Of course, remember to point your data sources to your official data
stores when you go live! At this point, you may need to communicate with
pilot participants and other stakeholders about the results of the early
iterations and what that means for their data as you move into full
production mode.

Iterating on the Data Analysis and Visualizations


One thing I’ve seen over and over again is that once you’ve answered the
first set of questions that you had designed at the beginning of the project, you
then uncover a whole new set of more interesting questions as you spend
more time analyzing that program. Sometimes the first questions that we
are challenged to provide answers to rely on basic information. Perhaps the
business sponsor has sunk their teeth into something basic like course
completion, while you may want to go off into more granular and
interesting (to you) aspects of the design of that learning experience.
Just as often, however, answering one set of questions leads to another
set of deeper—and more informed—questions for analysis.
Please don’t discount those early analyses and their significance in the
process. It is important to satisfy the early questions that arise in a project.
These often need to be settled before you and your stakeholders can see or
attend to what needs to be analyzed next. In some cases, those deeper
questions that you ask on that second or third iteration require you to collect
additional data, which may require some redesign at the data source. In the
meantime, your first iteration data and analysis can continue.
I recall putting together my first data dashboard more than two decades
ago for an HR transformation project at a major healthcare organization that
included the organization’s first LMS. This was the early 2000s, and that
first dashboard was created by making a dozen or so individual charts in
Microsoft Excel, screenshotting them, and pasting them into PowerPoint,
which I then exported to a PDF and emailed to the human capital leadership
team. My dashboard had lots of graphs, was very colorful, and showed a
number of key metrics about a variety of HR processes—all on about two
pages. It took a lot of manual effort to create, so it was a monthly activity—
and thus already somewhat out of date by the time I distributed it. Still, it was
powerful in that it was the first time the organization could look across the
entire function and get a sense for what was going on. It was a very popular
report.
Simply having the data to create this report represented the
accomplishment of an entire team in centralizing and automating the
processes that generated it. That was no small undertaking! After a few
months, however, I realized that it was not very actionable or insightful
data. The first iteration of that dashboard simply reported descriptive data
on the volumes of activity happening in various places, but it didn’t analyze
or compare this activity to targets or prior data to understand trends or
progress.
The next iteration of the dashboard included historical trends for several
of the metrics so that we could see whether things were moving in the right
direction. A third iteration included qualitative data along with quantitative.
For example, we reported the types of LMS help desk issues we were
seeing and quotes from course evaluation surveys.
You’ll see many of the same kinds of evolutionary processes at work in
your own data analysis whether you create them with Excel and a screen
capture tool like TechSmith SnagIt, or they are dynamic data dashboards in
a learning analytics or business intelligence platform.
As you begin providing your organization with better and deeper
analysis of the learning experience, you are also creating more informed
consumers of your work. You can then plan on being asked to continue to
do more of this work and iterate on your previous analysis.
Iterating for Continuous Improvement
Iterative approaches such as this rely on a mindset of continuous evolution
and exploration, as well as a set of tools and project-planning approaches
that support you, the team, and your business sponsor. As each new round
of analysis comes to a presentation point, an effective team might ask
themselves, “Are we done here? What else do we need to know?” You and
your business sponsor will work together to identify when an analysis is
“done enough,” and when to move on to the next project.
This is also a fantastic opportunity for conducting a retrospective on the
work that the team has done and the work processes the team used to get
there. You can learn more about conducting retrospectives in my book Agile
for Instructional Designers, or from any number of Agile software
development resources. Retrospectives are a commonly used tool to
incorporate learning into work at each iteration.

What Could Possibly Go Wrong?


When a team is not expecting to take an iterative approach, it can be
deflating to present the results of your analysis and all of your work, only
to find out that there are new questions and new analyses to be performed as
a result, just when you thought you were done! Without the expectation of
iteration, heading into a new round of analysis work with a new set of
questions can feel demoralizing. Having a plan from the start to take an
iterative approach can help to inoculate against this experience.
In the next chapter, we’re going to step into the world of data
visualization and how to plan for communicating your data and analytics
findings to anyone who needs it.

Give It a Try
Here’s a Hypothetical
Using any of the learning programs that you have been working with as
your example throughout this book so far, make a list of the potential
sources of dirtiness in the data that you collect. What kinds of mitigating
factors could you put in place to prevent them?
Next, sketch out some of the questions and the analysis that you may
perform on the data that you collect. And by sketch, I absolutely do mean
pencil-and-paper drawings! I often hand draw graphs and charts during my
design process. In such simple and quick sketches I can find the errors in
my logic before I invest a whole lot of time and effort to collect the data.
Since I don’t know which way my data will turn out at this point, I sketch
both a direct and an inverse relationship and see if either of those actually
has any meaning to the work that I’m doing.
Once you have these sketches, you can step back and take a look at what
new questions you might have about your hypothetical learning program.
Do It for Real
Take a look at the raw data being collected by a new or in-development
learning program. You may need to work with a member of your learning
analytics or data team to get this expressed in a raw format you can use,
such as Excel. (You may also need to request a subset of the data so you
don’t overwhelm your personal computing capacity, as some programs may
collect an astonishing amount of data.) Use sorting and filters on your data
to see if you can find any omissions, outliers, or inconsistencies that may
point to dirty data. (Keep in mind that simply because something is an
outlier does not necessarily mean it is dirty or inaccurate, but it might be.)
Next, take a look at some of the existing visualizations and analysis for
the data. Now that you know this, what additional questions do you have
that would take the analysis even deeper?
Bonus Points
There are so many ways to earn bonus points in this chapter!
• Identify ways in which you can mitigate the impact of dirty data or
pre-clean it before it comes into your analysis space.
• Talk to business sponsors about the analysis that you have for the
learning program and find out what new questions they have that
would help you take the analysis even deeper.
• If you are just getting started, be sure to build time and resources into
your project plan to account for any iterative data and analysis work
that you will be doing.
The Ad Hoc Nature of Learning Data Projects
By Janet Laane Effron, Data Scientist and Instructional
Technology Specialist

Earlier in this book, I shared a project in which we had a very orderly


process of identifying our questions and our data sources. However, that’s a
rare occurrence. Most of our data projects seem to happen in a more ad hoc
way, especially early ones when an organization has perhaps not yet learned
the value of learning data, when there is not a mature data process, when time
is short, or when a serendipitous opportunity arises.
I’ve found the most interesting results in less formal projects, ones where
we saw the potential for interesting insights in the data beyond what the
customer was looking for. In those cases, we’d start sharing those insights,
and that would ignite enough interest that we would sometimes get the
latitude to extend our data work. Those situations were ones where a lot of
innovation happened, and where the customer would get more valuable
information than what they were originally seeking.
While more informal, adaptive projects allow you to be flexible and
innovative, and can help clients gain a greater understanding of the breadth
of opportunities to be found in data, they can be harder to align (or keep
aligned) with the larger strategic goals and objectives. I think the ideal
situation is one where a formal structure to the project provides a
framework and some meaningful objectives, yet there is still latitude for
exploration of the data when you find opportunities to discover unexpected
insights.
Chapter 13

Communicate and Visualize the Data

Visualizations of our data—such as charts, graphs, or pictographs—allow
us to see patterns and communicate the results of our work to others. These
results are often presented in reports, dashboards, and presentations, which
share some common thoughtfulness and design considerations but serve
different purposes on different timelines.
John Mattox, Peggy Parskey, and Cristina Hall (2020) define reports and
reporting strategy as “a means to consistently deliver the right data to the
right people at the right time to inform their decisions to drive continuous
improvement and communicate value.” Reports help us create a standard
cadence for information delivery, a common language about which
performance is discussed, and visibility to the key metrics in a timely
fashion. Generally, reports provide a consistent set of data produced in a
cadence with the decision making for the organization, whether that is
annually, quarterly, monthly, or more often.
A dashboard, typically created and accessed through an online interface,
provides a dynamic and often a nearly real-time view of the data. This
allows anyone with access to get an instant look at what’s going on without
waiting for someone to create a report for them. Dashboards often allow for
filtering and drill downs that enable the user to home in on exactly the kind
of information they need in the moment. Dashboards do not need to be
created on a periodic basis, as they are always available on demand.
In contrast, a data presentation is generally created and delivered at a
particular point in time, and for a specific purpose, and may never be created
or delivered again. This fundamental difference with presentations will
require that you approach them somewhat differently. For example, many
reports and dashboards have multiple graphs and charts all visible at once,
whereas a presentation generally calls for just one, maybe two, graphs per
slide. Many presentations are designed to persuade, and thus often have
additional messaging around each piece of data.
As Nancy Duarte says in DataStory (2019), “Communicating data
effectively isn’t about creating sexy charts and showcasing your smarts. No,
it is about knowing the right amount of information to share, in what way,
and to whom.” So, regardless of whether you are creating a report, a
dashboard, or a presentation, you will need to address several key elements:
the purpose, the audience, the framing of your message, and finally the
design of charts and graphs to support your message.

Purpose
What is your purpose or goal for communicating this data? What do you
want the intended recipient to take away from your message? Being clear
about your purpose helps you decide what to include and what not to
include in your communication. Here are some questions to ask:
• Are you trying to inform the audience about something new or
something they are used to seeing on a regular basis, such as a
quarterly update on the L&D team’s activity?
• Are you evaluating a program’s impact?
• Are you trying to persuade someone to make a decision based on
your data?
• Are you using data to support a needs analysis?
This clarity of purpose can keep you on track and on message.

Audience
Consider what you want the audience to do, then think of what your
audience needs from you in order to do it. What information do they need to
see, and over what time periods? What do they need in order to feel confident in
your analysis and in your tool set?
As you think about your audience’s needs, consider how familiar they
are with this topic and your terminology. Are they used to working with
data, or will you also need to provide some education as you go? How
much time are they willing and able to spend getting oriented to the
conversation and the way you’re sharing your data? Do they want all the
details and variations in your data, or do they just want you to get to the
point?
In 2020, I worked with the Learning Guild on a research report about the
anticipated stickiness of changes in learning technology use as a result of
the COVID-19 pandemic. We asked a complex grid question of the
respondents: How had their training delivery methods changed as a result of
the pandemic, and how did they anticipate them changing in the future? We
compared their 2019 baseline with March–May 2020 (when the survey was
open), the second half of 2020 (when many people at the time expected to
be back to “normal”), and then two future time periods: 2021, and 2022 and
beyond. We asked about six different training delivery modalities (Torrance
2020).
When it came time to report our data, I was very excited about the visualization I chose, which showed both the mode (using a heat map) and the mean on the same graphic for each modality (Figure 13-1).

Figure 13-1. Example of a Complex Data Graphic

I thought this was quite clever, and I appreciated the visual designer’s
work in bringing this data to life. The trouble was that every time I showed
this chart to someone (and we have several of these, one for each learning
modality) I had to stop and explain how to read it. In fact, the final version
of the analysis report included an entire page that served as a key to these
charts (Figure 13-2)!

Figure 13-2. Detailed Explanation of Complex Data Graphic

I’d like to think of this particular compound visualization as a really effective way to show the data, as long as the audience has time to spend learning how to read it. However, in a short conference presentation where
this set of slides might only get a few minutes of attention, it falls quite
short of my goals for audience understanding.
If you are preparing a data report or visualization for someone else to
present and carry the message forward, it’s quite important for you to make
sure that they understand how to explain any complex visualizations, since
you may not be around to help out. (I can tell you that very few people have
chosen to cite my results from these visualizations or use them in their own
presentations. Lesson learned!)

Framing Your Message


Once you’ve identified your purpose and considered your audience’s needs and ability to connect with your report, you can turn to framing that message. You’ll want to be very clear with your audience about what you
are presenting and what your goals are at both the macro level (the
presentation overall) and the micro level (each visualization or statistic
you’re sharing). As you plan your message, consider where you will place
the following components:
• The call to action, bottom line, or punchline: Whatever you call
your big message, consider where you will put it in the presentation.
Many communicators advise delivering the big message early on,
then using the rest of your time to back up that message with
additional supporting data. This is referred to as the “BLUF,” or
Bottom Line Up Front, approach. It can be very effective with senior
executives or anyone who feels like they are very pressed for time.
However, if your findings are counterintuitive, you may want to hold your big message until later so you can use the beginning of your presentation to build up to it.
• Directly supporting data: Consider the placement of drill downs or
alternative looks at your big message that help establish it.
• The process: In some cases, telling the story of how you arrived at
this analysis will help build credibility. In other cases, it’s not only
unnecessary, it’s tiresome and distracting from the big message. It’s
quite possible that the person most interested in all the work that
you’ve done to get to this point is you! This is a time when knowing
your audience is key.
• Appendices: I love having all the detail and backup data at the ready
in case I’m asked to produce it, but not everyone wants to look at all
of that. Having the key components of your analysis in an appendix
allows you to quickly go there if you need to, but doesn’t bog down
the conversation with chart after chart.

Design of Charts and Graphs


There are so many good books and resources about selecting and designing
charts and graphs that I will simply steer you in their direction for your
reading pleasure. (See the resources at the end of the book for a few of my
favorites.)
I will share a few of my favorite tips here, though, to get you started.
• Don’t overdo it. As you saw in the story of my Learning Guild
research report, complex and fancy charts may look impressive and
get you very excited, but run the risk of obscuring your message in
the process. Often a bar chart, line chart, or pie graph can convey
what you need.
• Draw attention where you want it. Use color or callouts with intentionality to highlight your message. Is a particular part of a learning experience performing unusually well? Was there a blip in time when performance suffered? Was a particular cohort of learners unusual for some reason? Each of these could warrant a contrasting color in a visualization. You can use an overlay to highlight important events like the beginning of the pandemic period, a product release, a system-wide outage, or some other factor that might have an effect on your data. (For one way to do this in code, see the short sketch after this list.)
• Use your very best instructional design thinking as you balance
completeness with cognitive load. Just as we don’t want to overload
a slide with bullet points, we don’t want to overload a presentation
with slides of data with no break for conversation or reflection.
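To make the “draw attention” tip concrete, here is a minimal Python (matplotlib) sketch of one way to do it: every bar is muted except the one carrying the message. The module names and completion rates are hypothetical, not from any project in this book.

```python
# A minimal sketch: mute the context bars, highlight the one that carries the message.
import matplotlib.pyplot as plt

modules = ["Intro", "Safety", "Sales", "Service", "Refresher"]
completion_rates = [0.82, 0.79, 0.41, 0.85, 0.80]  # hypothetical data

# Gray for context, one contrasting color for the point you want discussed
colors = ["#b0b0b0"] * len(modules)
colors[modules.index("Sales")] = "#d62728"

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(modules, completion_rates, color=colors)
ax.annotate("Completion drops here",
            xy=(2, 0.41), xytext=(2.6, 0.65),
            arrowprops=dict(arrowstyle="->"))
ax.set_ylabel("Completion rate")
ax.set_ylim(0, 1)
for side in ("top", "right"):   # remove chart "noise"
    ax.spines[side].set_visible(False)
plt.tight_layout()
plt.show()
```

The same idea applies in Excel, Power BI, or Tableau: pick one highlight color and let everything else recede.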
The good news is that, in many respects, the considerations that you will
make in communicating the results of your analysis, whether it’s in a report,
a dashboard, or a presentation, are similar to the types of considerations that
instructional designers make all the time in designing learning experiences:
the objective, chunking content, cognitive load, and visual design
principles.

What Could Possibly Go Wrong?


This chapter has been a quick view into the power of visualization and the considerations you should be aware of when communicating your data and analytics. I’ll leave you with two points of caution:
• In all the excitement over an excellent data visualization, don’t forget
the power of qualitative data to help tell the story in memorable and
contextualized ways.
• I would be remiss if I did not point out that we have a responsibility
when communicating with data to be sure that we are not misleading
or misinforming by the very nature of the visualizations we choose.
Tricks such as altering the axes on a chart, mislabeling data, omitting
data sets, combining disparate data sets, or even just using an
inappropriate chart type can be misleading, either intentionally or
unintentionally.
In the next chapter, we’ll wrap up the book by looking at how to scale your data and analytics efforts and gradually mature your L&D function as a trusted resource in this ever-important capability.

Give It a Try
Here’s a Hypothetical
Become a student of data visualizations in your everyday life. Harvard
Business Publishing, the Federal Reserve Economic Data site (FRED), and
the New York Times have libraries of visualizations you can explore, but
you can also find charts and graphs in all sorts of places. Ask yourself:
What message is being conveyed? Who might the intended audience have
been for this data? See if you can identify any data distortions at work.
Do It for Real
Make a plan for communicating about the data you have collected. Begin
with your purpose and audience, then consider your message and what data
visualizations you will use. Will you use spreadsheet software like
Microsoft Excel or Google Sheets, the visualization functions of a learning
analytics platform, or tools like Microsoft PowerBI, Tableau, Looker, or
other visualization software? Give your report, dashboard, or presentation a
pilot test with people who understand or share the needs of your intended
audience and see how they respond to your work. Adjust accordingly.
Dashboarding Data for an Equity, Diversity, and
Inclusion Program
By Emma Weber, CEO, Lever–Transfer of Learning

One of our clients is a global professional membership body, representing more than 50,000 members dedicated to driving excellence in architecture.
They serve their members and society to deliver better buildings and places,
stronger communities, and a sustainable environment.
The global protests against police brutality in 2020 sparked a global
conversation addressing long-overlooked systemic racial inequality. For our
client, this ignited a period of self-reflection, with the organization realizing
it had not given adequate attention or resources to tackling injustice,
systemic racism, and discrimination. Not only did the organization need and
want to change, but as the custodians of the architecture industry, they
recognized it was their responsibility to address the industry’s lack of
diversity as a result of inaccessibility and discrimination.
Our client identified some ambitious goals and behavioral outcomes:
creating an inclusive culture at the organization where all staff and members
could feel they belonged, holding staff and members accountable for
inclusive action, modeling how to drive an inclusive culture in the
architecture profession, delivering actual behavioral change in regard to
cultural awareness and sensitivity, and driving a perception change among
staff, members, and the public.
A suite of programs across different levels within the organization
included equity, diversity, and inclusion training workshops (instruction),
action planning, and execution supported by learning transfer coaching. The
coaching was delivered by a combination of a chat bot and a human
coaching team. The action planning and execution phase used the Turning
Learning into Action methodology to deliver the behavioral outcomes. The
methodology is a rigorous approach to behavioral change, with a focus on
change and data to capture outcomes and progress.
It was important to use data to analyze program outcomes and progress
toward the organization’s goals, especially given the nature of change in
DEI. As the director of inclusion at the organization shared, “The nature of
EDI (equity, diversity, and inclusion) learning programs makes achieving
behavioral change that much more difficult. To change our behavior, we
have to acknowledge that we’re inherently biased and discriminatory,
participate in some uncomfortable introspection, and identify what about us
needs to change. This is much harder to achieve than behavioral change in
other contexts.”
As we thought about what data and analyses to present on the dashboard,
it was important to consider what data was being generated and at which
stage in the project. We then considered what was the most meaningful data
that would inform and create insights. We used this as the basis of our data
collection strategy, knowing that we also had the need to provide
stakeholders with the up-to-the-minute information they needed to make
decisions at key points in the learning journey.
We were primarily using four data sets to manage alignment,
engagement, risk, and impact with learning transfer to achieve behavior
change—and, ultimately, culture change—over time:
• Action plan data
• Chat bot usage data
• Chat bot conversational data (generated as a byproduct of
participants chatting with the bot)
• Cultural intelligence (CQ) dimensions
Data was easy to collect using the chat bot, Coach M, which the organization used to embed the behavior change based on the individual action plans participants had created. In reflective conversations with Coach M, users generated data about their progress toward goals and the barriers they anticipated coming up against.
Visually, it was important to make the dashboards appealing and easy to understand for those inside and outside our direct stakeholder group. They also had to be valuable in terms of the decisions that people could
make using them. We carefully selected a combination of donut charts, bar
charts, column charts, and bubble charts on the dashboard to represent the
data in the most compelling and effective way (Figure 13-3). Often this is a
work in progress, and we are continually reviewing and improving the
dashboards. No data made it onto the dashboard if it didn’t meet three
criteria:
• Does it share a meaningful point for the client with reference to the
program?
• Is it easy to understand the key point?
• Will the data lead to actionable insight?
The designs were iterated on over two years of working with the chat bot and developing our analytics suite.
The design of the dashboard and the process of the data collection were
both iterative. Initially data was exported from the Coach M platform into a
CSV file and uploaded into Google Sheets. In the future we will use an API
where the data will upload directly from the platform to Google Sheets.
It’s not unusual to first use a manual download and upload process to
ensure you have the data you need for the analysis you want to create
before creating an API. Many times, one analysis will generate a question
that leads to another analysis, so, especially in the early days, creating a
dashboard format will be part of an iterative process.
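As an illustration of that manual stage, here is a minimal Python (pandas) sketch of the kind of cleanup that might happen between a CSV export and a spreadsheet upload. The file and column names are hypothetical; they are not the actual Coach M export format.

```python
# A minimal sketch of the manual export-clean-upload stage, using hypothetical columns.
import pandas as pd

# Read the raw export from the coaching platform
raw = pd.read_csv("coach_m_export.csv", parse_dates=["conversation_date"])

# Light cleanup before uploading to Google Sheets: drop test accounts,
# normalize the progress field, and aggregate per participant
clean = (
    raw[~raw["participant_email"].str.endswith("@example.com")]
    .assign(progress_pct=lambda df: df["progress"].str.rstrip("%").astype(float))
    .groupby("participant_email", as_index=False)
    .agg(conversations=("conversation_date", "count"),
         avg_progress=("progress_pct", "mean"))
)

# Save a tidy file to upload manually; a later iteration might push this
# directly to Google Sheets through an API instead.
clean.to_csv("dashboard_feed.csv", index=False)
```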
The most insightful analyses on the dashboard are the ones that use text analytics. Text analytics is a subset of natural language processing (NLP) that deals with the identification and extraction of themes from text. The theme extraction algorithm used by the chat bot analyzed anonymized and aggregated Coach M conversation data across all participants.
These insights were primarily used to determine whether the action plans and goal statements participants said they were working on aligned with the training program goals, and to identify the key themes emerging from the action plans. This is useful as an early-stage indication of alignment with what an organization is looking to achieve.
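The chat bot’s actual theme extraction algorithm is not shown here, but as a generic illustration of the technique, the following scikit-learn sketch extracts themes from a handful of invented, anonymized goal statements.

```python
# A generic theme-extraction sketch (not the chat bot's algorithm) on made-up statements.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

goal_statements = [
    "Run inclusive meetings and invite quieter voices to contribute",
    "Review our hiring shortlists for representation before interviews",
    "Seek feedback from colleagues on my use of inclusive language",
    "Mentor a junior colleague from an underrepresented background",
    # ... in practice, hundreds of anonymized statements
]

# Convert text to a word-count matrix, ignoring very common English words
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(goal_statements)

# Fit a small topic model and print the top words per theme
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_terms = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Theme {i + 1}: {', '.join(top_terms)}")
```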

Figure 13-3. Sample Coach M Dashboard


Communicating Your Analysis: Engage to
Influence, Influence to Drive Action
By John Polk, johnpolkandassociates.com

We analyze data to identify valuable, actionable insights for the business. But the path to realizing that value isn’t direct, and we have to communicate
the analysis in a way that engages our audience. In doing so we help them
pay attention, share our excitement for the topic, and buy into our ideas.
When our audience buys in, we can spur them to drive action. This path
becomes smoother when we tell a clear story, leverage graphics, and reduce
the noise.

1. Tell a Clear Story


The power of storytelling is in our DNA. Logical, compelling stories
designed with your audience in mind are critical to driving action.
To tell a clear story, you should:
• Assess your audience, purpose, and setting to determine the
design. Humans are lazy, distracted, and forgetful. However, if you
are clear on the critical takeaways, your audience has a better shot at
remembering them. Determine the three most important things you
want your audience to remember. If your decision maker is results
oriented, focus on the value of your recommendations and when
you’ll deliver that value. If your decision maker is people oriented,
focus on the impact for customers and employees.
• Make it compelling. Many analysts believe that data and logic can
drive all decisions. As an analyst myself, I wish that were true. But
humans are emotional creatures. So, use qualitative data, anecdotes,
and inspiring stories to bring your data to life.
• Put important things first. Imagine you are reading a mystery
novel. Part of the fun is guessing “whodunnit.” But your
recommendations shouldn’t be a mystery. If your audience is
spending mental energy guessing at your recommendation, they
aren’t paying attention to your story. So instead, lead with the
answer, then give the supporting data.

2. Leverage Graphics
Never, never, never use a plain bullet slide. This tedious, wall-of-text style
is what people mean when they say “death by PowerPoint.” Instead, use
graphs, frameworks, images, and icons to make your insights
understandable, memorable, and actionable.
To properly communicate your analysis, you should:
• Support your story with data and graphs. This goes without
saying in a book about analytics. The typical problem I see in data
visualization is the equivalent of the wall of text: too much data in
each graph without a clear insight, or too many graphs on one page
without a clear story. Instead, use color, size, enclosures, and arrows
to help your audience see your main point.
• Use frameworks, images, and icons to convey key points. Even if
your audience has the fortitude to read your wall-of-text slides, they
won’t be able to remember all 10 bullets on the page. Instead, chunk
up your ideas into related buckets and create a framework. They’ll
have a better shot at remembering your three categories. And the
three categories will help them retain more of the 10 bullets. Images
and icons help engage your audience and illustrate key points; just
make sure your images and icons are relevant. Adding a stock photo
of employees around a conference table isn’t helping.

3. Reduce Noise
In electrical and audio engineering, the signal-to-noise ratio measures the
relationship between the desired signal and the undesired noise. For
example, imagine listening to a baseball game on AM radio while driving out of town in a storm. (You remember AM radio—that’s the button you hit by accident when connecting your phone to your car’s Bluetooth.) The signal
is the announcer’s voice, the crack of the bat, and the crowd’s roar. The
noise is the static from the lightning and the fading signal. Anything you
put on your slides that is unnecessary to communicate your key points is
noise. To reduce noise, you should:
• Leverage a standard template with a simple, elegant design.
When collaborators use inconsistent templates, it causes the
“Frankenstein effect,” where differences in design, style, and tone
interfere with communication. Work with your brand team to
develop a clean template with crucial design elements built in, like
color palette, line spacing, page numbers, and corporate logos. Then
build a slide library of standard slides, including cover page, agenda,
graphs, frameworks, and section dividers, so authors don’t have to
reinvent the wheel. Figure 13-4 provides an example of a graph with
common noisy elements called out. Figure 13-5 shows the same
graph with the noise removed. Notice how your brain can focus on
interpreting the data vs. searching for the signal through all the noise.

Figure 13-4. Example of a Noisy Graphic

Figure 13-5. Example of a Clean Graphic


• Write professionally. In a workshop many years ago, I noticed that I
had misspelled a word on a flip chart. I asked the participants if they
saw it. Half the class said “yes.” When you have typos, grammar
issues, or misaligned elements, some portion of your audience will
think you are sloppy, or at the very least they will be distracted from
your message. No one’s perfect on the first try; seek a proofreader
for your presentations if possible.
• Use the fewest words required. We’ve all heard the phrase “less is
more.” I’m not a fan because it leads to presenters creating slides that
aren’t useful without you there to present them. Unless you’re doing
a TED talk, chances are your slides will be forwarded to someone
who wasn’t there to hear you speak. In a perfect world, you’d create
two versions of your presentation, one to present and one to forward.
Unfortunately, we rarely have that luxury. The key to having the best
of both worlds is to “word diet” your text to the fewest words
required to make your ideas clear.
Chapter 14

Build Scale and Maturity

We’ll finish out this book with an eye to the future: building scale and
maturity in your learning analytics work. To do that, we’ll focus on the five
elements of the foundation discussed in chapter 6—strategy, skill set,
systems and data supply, statistics and data science, and relationships—and
reflect on how they can evolve as you move from early prototype projects
to a full-scale implementation. This is a new space in our industry, and few
organizations are doing deep analytics across all areas of their learning
ecosystem. As the industry matures, you’ll see more and more organizations
taking this on and (hopefully) sharing their case studies and insights for
others to follow.

Figure 14-1. The Framework for Using Learning Data


These same five foundational elements apply to every stage in an
organization’s learning data and analytics maturity; however, specific
activities and emphasis vary as an organization moves through stages of
maturity. While some organizations will take on these steps in a different order, or tackle some tasks earlier or later than others, the considerations at each stage are consistent.

Prototyping
Very often, early analytics projects are taken on by one or two interested
people in the L&D team—such as you—and it is not unusual for these
projects to be considered side projects, done in “spare time” in a matter of
weeks. The goals of these projects are often to see what is possible, verify
that they actually can get data out of a learning experience, and demonstrate
a proof of concept. Many people who have participated in the xAPI
Learning Cohorts have taken on prototyping projects aiming to extract and
review data from e-learning, LMSs, virtual classroom platforms, apps,
business data, or even voice response systems. In this stage, the data
analysis is often secondary to simply proving that data can be extracted and
reviewed, although if enough data is gathered, some basic analysis can be
performed.
The goal of the prototyping stage is usually to get resources (both time
and budget) and permission to take on a pilot project.

Piloting
Pilot projects are generally one-off projects that receive focused attention from instructional designers and the business as a first foray into learning data analytics.
Since the purpose of these early experiments is often to gain support and
funding for using data and analytics in production projects, it is important to
ensure that the analysis you are conducting has real meaning to the
business. It may be a simple analysis, but it should be as accurate as your
data allows it to be. The goal of the pilot projects is to find out what the
organization needs to put in place to move to first production projects.
That said, pilot projects typically begin before an organization has fully
committed to an analytics strategy and before they have fully assembled an
infrastructure, so the data you gather and the analytics you conduct may
serve a short-lived purpose. In some cases, these pilots take place in part to
vet the vendors of their LMS, LRS, and other ecosystem components.
A client of mine, after gaining some basic familiarity with xAPI,
assembled a team to create their first pilot project. The L&D team had
created an e-learning course with a great deal of flexibility in how learners
progressed and how much content they consumed. They were curious to
know how learners actually used the course, and they knew that this
information was not available using SCORM. We shared some best
practices for using JavaScript triggers in their e-learning course to
instrument the data collection, and they set up a trial account with an LRS
provider. Two members of the data science team who were newly dedicated
to L&D participated in the project as well. Because the project was moving
faster than the organization’s overall learning data infrastructure work was
moving, we all went into it knowing that the data we collected and the
analysis we would do would likely not be compatible with the future data
model, and the business sponsor agreed that this would be an acceptable
potential outcome. In this project, the learning developers built skills in
designing and instrumenting for data, and the entire team learned a bit about
what was possible with learning data and analytics at the individual course
level.
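For readers who have not yet instrumented a course, here is a minimal sketch of what recording a single xAPI statement to an LRS can look like over the standard xAPI REST interface. This is not the client’s actual implementation (their statements were generated by JavaScript triggers inside the course), and the endpoint, credentials, and activity IDs are invented.

```python
# A minimal sketch of sending one xAPI statement to an LRS; all identifiers are hypothetical.
import requests

LRS_ENDPOINT = "https://lrs.example.com/xapi"   # hypothetical trial LRS
AUTH = ("lrs_key", "lrs_secret")                # hypothetical credentials

statement = {
    "actor": {"objectType": "Agent",
              "name": "Pat Learner",
              "mbox": "mailto:pat.learner@example.com"},
    "verb": {"id": "http://adlnet.gov/expapi/verbs/experienced",
             "display": {"en-US": "experienced"}},
    "object": {"objectType": "Activity",
               "id": "https://example.com/courses/widgets-101/branch-b",
               "definition": {"name": {"en-US": "Widgets 101: Branch B"}}},
}

# POST to the LRS statements resource; the response body is a list of statement IDs
response = requests.post(
    f"{LRS_ENDPOINT}/statements",
    json=statement,
    auth=AUTH,
    headers={"X-Experience-API-Version": "1.0.3"},
)
response.raise_for_status()
print("Statement stored with ID:", response.json()[0])
```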

First Production Projects


When your pilot project is successful enough that you get the go-ahead to
use data and analytics on a full production project (real project, real
learners, real data), your work is just getting started. These first production
projects—and you may need several of them before you are ready for the
next phase—generally require the organization to commit people, time, and
financial resources to the effort. Based on what’s learned in the piloting
phase, you need to spend time upfront planning for the analysis and
building data capture into the learning experience. This may also require the
team to learn new skills in the process.
I am often asked what kind of project makes a good first production
project, and a few criteria come to mind:
• Quick wins. A project that has a near-term release date and for
which you expect to see results relatively quickly means that you
will also get your data quickly so that you can evaluate results. Do
not confuse the term quick wins with emergency projects. You don’t
want to be so rushed that you do not have the time to do the work
properly, and thus jeopardize your opportunity to learn and build a
case for further implementation.
• Big-impact projects. The advantage of a big-impact project is that it
tends to attract the resources, investment, and attention of senior
leaders who are interested in understanding and measuring the results
of a specific learning initiative. The disadvantage is that it tends to
attract stress and the attention of senior leaders, and thus may not be
the best choice for a project on which everyone is expecting to learn
a lot and make mistakes in the process. This is the sort of thing that
you will decide based on your organization’s culture, your comfort
level, and your experience—it’s not for the faint of heart!
• A supportive sponsor. A supportive and engaged sponsor can work
with you on a number of levels. First, this person can help you
identify interesting questions for analysis and meaningful business
metrics. Second, they can also help you with prioritization, clearing
of roadblocks, and access to resources that you may need throughout
the project. A supportive sponsor also understands that you are
learning on this project, and allows space for making productive
mistakes along the way.
• Easy access to data. Your first production project will be more
successful if it provides easy access to lots of data. For this reason, I
see many first production projects that use e-learning tools and focus
on topics with clear measurable outcomes in near-term timeframes.
These first production projects also generally require that you purchase
new or additional licenses for data storage and analysis, such as an LRS.
Since this is production data with real employee information that you would
like to keep, and keep safe, I generally do not recommend using free trial
software. It is worth investing some resources at this stage to do it right, and
this often means a software selection process will need to take place.
In your early experiments, it’s entirely possible that you did the work by yourself; in the first production projects, however, your team may
include learning design, learning technology, IT, business intelligence, and
the business or organization unit being served by the learning content itself.
This is an opportunity for you to start building the relationships within the
organization, if you do not already have them, that you will need for full-
scale implementation.
During this first production project, you will be making decisions about
data, data standards (SCORM, xAPI, something else), data profiles, and
data governance that will lay the groundwork for the future. My
recommendation at this stage is that you rely on existing data schema
(profiles) rather than inventing your own. This not only keeps things
simple, it also allows you to be more future-proof by reducing the risk of
obsolescence of your data set as you implement more widely later on. As of
the writing of this book, the xAPI community has built several robust data
profiles for common types of learning experiences (such as e-learning,
assessments, video, performance support, and game-based learning) and I
expect that this work will continue, providing our industry with a common
method of expressing data and saving you the time and effort of creating
your own. If all or most of the data for this first production project exists
within a vendor’s tool or something you have created in house, this may
mean a lighter lift for you.
At this first project stage, some of your software vendors (LMS, LXP,
LRS, authoring tools) will work extra hard to support you to ensure your
success. They will want you to scale up and buy more! This is an
opportunity to learn and build your skill set while you continue your
evaluation of the vendor relationships you are building. If you find that the
tools or the support are not what you need, you can easily switch gears
before moving into more full-scale enterprise-wide implementation, at
which point it is more difficult to change vendors. I know of at least one large organization that experimented with multiple vendors at this stage, with each business unit using a different one to compare the experiences before settling on a single tool enterprise-wide.

Multi-Project Implementation
Between first production projects and full-scale implementation,
organizations will be running several sets of projects, perhaps not all
integrated with one another. This will be a time of experimentation,
learning, and growth as the skills required for working in a data-centric
environment are transmitted through the L&D team, business sponsors
become used to this as normal, and the organization’s learning technology
ecosystem is built out.

Full-Scale Implementation
Full-scale implementation is as much a business initiative as it is an L&D
one. It is a cultural shift in how we approach L&D as a data-driven activity
as well as a technical exercise.
Organizations beginning their journey at full-scale implementation may find that their L&D teams have skills gaps in data, analytics, visualization, and technology; these gaps can often be closed through training and by hiring onto the team. Building cross-functional connections with the business intelligence team can also help.
Full-scale implementation is generally accomplished in phases, working
either by business unit or by learning technology to limit the number of
moving parts in any given phase. Some considerations for full-scale
implementation include:
• Data governance. This includes setting up and communicating the
common ways in which all learning developers will collect data in
their work so that it can be easily analyzed. Data governance
describes the structure and processes by which the organization
keeps its data neat and tidy. It includes common ways to structure learning asset catalogs, the data standards that will be used, and who can add new content (and its data) and how.
• Data privacy, confidentiality, and security. It is likely that your
organization already has these protections in place, and you will want
to make sure that your work falls under this purview.
• Organization design. Some organizations create a learning analytics
team with designated data scientists as members. In other
organizations, the business intelligence team supports the learning
analytics work.
• Conversion strategy. You will need to consider whether to
backtrack and begin collecting data from already existing learning
experiences, or to only focus on future new design work. One
organization I have worked with decided to limit their conversion
effort to only the most-used resources in their current ecosystem, and
focus instead on gathering data from any newly created learning
items in their catalog.
Looking across the foundational elements for each phase of an analytics implementation gives insight into the work ahead (Table 14-1). Keep in mind that you may take a somewhat different path through these stages, moving faster or slower as your organization’s specific needs and capabilities allow.
As you and your organization chart a path to a full-scale implementation
of learning data and analytics, be sure to think across all these foundational
elements because they are linked together. In some cases, the linkages will
come naturally as, for example, your IT team gets involved while you seek
to integrate new software. In other cases, you may need to intentionally
seek out support from other teams to learn what they are doing with their
data and analytics, and perhaps borrow resources.

Table 14-1. The Foundational Elements and Stages of Maturity


At every step of the way, the buy-in from key business sponsors—the
ones who help you define your questions for analysis early in the planning
stage—is essential. Business sponsors can help ensure that your analysis
meets their goals (and is thus worthy of their attention and resources), as
well as help clear organizational hurdles to getting things done, if needed.
A final word of advice: It may be unrealistic to expect an orderly and
linear progression as outlined here. As you try, learn, make mistakes, and
start over, you may find yourself needing to go “back” a stage to reestablish
the foundation based on the experience that you’ve had. Taking this on with
an open mind toward learning and growth—for you personally as well as
for the organization—will help you navigate the years ahead.

What Could Possibly Go Wrong?


Over the course of this book, I’ve frequently harped on one piece of advice:
Don’t go it alone! Even if you’re an L&D team of one, you likely operate
within a company that employs data experts and enthusiasts or at least
spreadsheet magicians. As the ecosystem, the analysis, and the data
protections required for implementation get more complex beyond the
piloting phase, building relationships across the organization will be
essential to your success. The professionals in your IT, legal, BI, and
business teams have insight and experience that can help you move ahead
faster and with fewer mistakes.
We’ve covered a lot of data and analytics ground in this book. We started
by setting expectations around terminology, introducing you to the common
learning data specifications and learning metrics, and overviewing the
statistical concepts you’ll turn to when conducting analytics. We delved into
the core topics of data sources, data collection and capture, data storage,
and data visualization. It is my hope you’ve learned from my data and
analytics experiences as well as each chapter’s case studies—and you’ve
put some of the concepts to work either in hypothetical or real scenarios at
work.
I ended the introduction with a call to action, “Let’s get some data!”
Now that we’ve reached the end of this book, let me amend it a bit: “Let’s
get some data! And let’s use it to transform our organizations!”

Give It a Try
Here’s a Hypothetical
Consider the current state of your organization’s learning data and analytics,
its capabilities and needs, and the resources available to you. What could be
a possible path forward? What steps would you need to take to reach
the next phase in your implementation? Consider whether this is something
you can discuss with your learning leadership or business sponsors.
Do It for Real
Use the table in this chapter to identify where your organization is in its
implementation of learning analytics right now. What might be missing at
your current stage? What could be a possible path forward? What steps will
you need to take to reach the next phase in your implementation? Consider
whether this is something that you can discuss with your learning leadership
or your business sponsors.
Bonus Points
Submit your story as a case study for others to follow, either in the industry
press or as a conference presentation. This emerging space is hungry for
examples of learning data and analytics implementations and the lessons
you have learned along the way.
The Learning Impact Engine
By JD Dillon, Chief Learning Architect, Axonify

Correlation is not causation. Two events may happen one after the other, but
that doesn’t mean the first event caused the second. Sales may increase after
the implementation of a new sales training program, but this doesn’t verify
that the program caused the improved result (even if L&D desperately
wants it to be true). Lots of factors can influence business outcomes.
Marketing promotions, changes to employee compensation, a new CRM
platform, and increased customer footfall due to nice weather may have
also influenced sales numbers. Training likely played a role, but every other
department will claim their strategies caused the positive outcome too.
To establish true relationships between learning and performance, L&D
needs a learning impact engine.
L&D can feed the learning impact engine with a variety of workplace
data points, including:
• Engagement: How often employees participate in various learning
activities
• Consumption: Which content objects employees use
• Knowledge: How each person’s knowledge of important topics
changes over time
• Confidence: How their self-reported confidence shifts over time
• Behavior: How they are or are not applying their knowledge on the
job
• Results: Business outcomes associated with employee performance
The learning impact engine applies a machine learning model
(customized to an organization based on its unique data requirements) to
establish relationships between changing data points. Then, L&D can train
the model using historic learning and performance data. This allows the
technology to learn how changes in different data points relate to one
another over an extended period. As L&D gathers new data, we can test the
model’s predictive ability and accuracy.
The engine applies its relational understanding of the organization’s data
to isolate the impact of learning activities on business outcomes,
represented as a percentage. For example, the engine may determine that a
specific training program caused 15 percent of the company’s quarterly
sales result. L&D can then convert this percentage to a monetary value and
calculate the return on investment for the training initiative. If provided
with the right data, the engine may also be able to determine how other
factors, such as marketing, coaching, or seasonality, influenced the business
outcome. Otherwise, the engine will still isolate these influences but label
them as “unknown.”
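As a back-of-the-envelope illustration of that conversion (with invented numbers, and not the engine’s internal model), the arithmetic looks like this:

```python
# A toy illustration of turning an attributed impact percentage into ROI. Numbers are made up.
quarterly_sales = 2_000_000.00   # hypothetical quarterly sales result
attributed_share = 0.15          # the engine attributes 15% of sales to training
program_cost = 120_000.00        # hypothetical cost of the training program

attributed_value = quarterly_sales * attributed_share
roi = (attributed_value - program_cost) / program_cost

print(f"Value attributed to training: ${attributed_value:,.0f}")
print(f"ROI: {roi:.0%}")   # here, 150%: $1.50 returned for every $1.00 spent
```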
L&D can take this process a step further by applying the learning impact
engine to proactively adjust learning strategies. Consider an example of the
engine used to measure the impact of training within a warehouse
operation. It determined a training program meant to accelerate shipments
was not having the desired impact. Rather than waiting for the flawed
program to complete and making assumptions about its impact, L&D
evaluated the training, found that it was focused on the wrong employee
behaviors, and adjusted the content in-flight. The engine’s insight proved
critical to recording the desired improvement in business results upon
program completion.
A learning impact engine can also enable hyper-personalized learning
experiences. For example, it may recommend a topic, activity, or content
object to an individual based on its proven ability to affect their
performance goals. Rather than modeling our approach to personalized
learning on Netflix and relying on subjective peer ratings and content
consumption patterns, L&D can take an outcome-based approach and adapt
learning to focus on tactics that really work.
Learning impact engine technology is available right now. We can finally
measure the impact of learning! So why is L&D still struggling to prove
our value?
Technology isn’t the problem. It’s the data needed to power the system
that’s lacking. L&D doesn’t just need more data. We need a greater variety
of data to establish connections between our solutions and the results we’re
trying to achieve. To fix measurement, we must first evolve our learning
strategies. Traditional training is data-poor. There’s only so much insight
you can gather from course completions, test scores, and reaction surveys.
To understand the relationship between learning and performance, L&D
must assess how people change (or do not change) over time as a result of
our programs. This starts with adopting continuous learning tactics, such as:
• Microlearning: Targeted content provides more granular insight into
specific concepts
• Reinforcement: Practice activities provide insight into how people’s
knowledge and skills change over time
• Observation: Behavior data provides a critical connection between
what people know and if or how they apply that knowledge on
the job
Once we’ve expanded our data collection practices, L&D must partner
with stakeholders to access the business data needed to power a learning
impact engine. This may include data related to sales, safety, productivity,
customer satisfaction, or anything else the organization measures associated
with our learning programs. When partnering, L&D must explain how this
sensitive data will be used to improve support and drive business results.
This data is essential to establishing the relationship between learning
activities and performance outcomes, so we must ensure our stakeholders
trust us with it.
Measuring learning impact is difficult, but it’s not impossible.
Technology plays a critical role. Good data is essential. But fixing L&D
measurement begins with making it a priority. If we want to move L&D
forward and get that “seat at the table,” improving our data practices must
be our top priority. Otherwise, causation will remain out of reach, and
“butts in seats” will be the measure of our contribution to the organization.
Scaling the Learning Analytics Team
By Tiffany Jarvis, Director of Learning, Edward Jones

I was a middle school teacher for 10 years, as well as a department chair and team leader who was passionate about using formative assessment in
the classroom. I earned a doctorate from Lindenwood University, and the
two statistics classes I took to conduct my research were the only formal
math I had beyond my undergrad requirements. I’m not a mathematician;
I’m not a data scientist. At my heart, I’m a classroom teacher who believes
in data and assessment for learning.
Before we implemented learning analytics at Edward Jones, there was
curriculum management, which largely consisted of ensuring the accuracy
and maintenance of our programs and monitoring them through
measurement plans. The measurement plans, and their accompanying
surveys, were ad hoc in nature and weren’t delivering intelligence that
business leaders or L&D designers could use to take action. Our L&D
leaders were interested in maturing our organization’s capabilities—
especially in the area of data-guided decision-making—which required
streamlining our measurement at a time when our curriculum was also
rapidly growing.
We started by standardizing surveys and measurement plans, which
required a very well-developed business case. At first, business leaders
were uncomfortable; they were accustomed to extremely specific surveys
that described learning objectives in detail and recorded large amounts of
open-text commentary. When we pointed out the significant time and
resources it took to create those surveys and compared it to their actual use,
they were willing to try something new. And when that new, standardized
survey was able to capture relevant learner information and also
accommodate cross-program comparisons, they were sold. Our organization
relies heavily on branch teams who are always learning and developing in
service to clients, so it wasn’t long before business areas were lining up to
increase the rigor of their plans and get the data that could help them guide
their decisions.
Soon measurement became my full-time job, and I began exploring the
concept of learner experience research as another way data could guide our
decisions in L&D. We partnered with several design thinking consultants to
begin using tools like journey maps, user personas, and focus groups to
mature our strategy and design capabilities. Again, business leaders were
uncomfortable at first; LX research seemed to take time and resources, and
they needed their new designs yesterday. But, again, once we were able to
illustrate the power and payoff of using learner experience research, they
were sold. We could not conduct intakes fast enough, and the enthusiasm
only continued to spread.
At this point, we knew we needed a team. Our principal at the time
dedicated three full-time employees to learning analytics: a leader, a
learning analyst, and a data analyst. I’d never led a learning analytics
function—remember my background?—but I’d had a lot of success in
standing this function up based on my experiences with assessment,
education, design thinking, and stats. I was pretty sure that if I could find
others who had that mix of background, we could make a team that could
successfully champion both learner experience research and learning
analytics. Instead of recruiting data scientists, I hired a psychology
professor with a passion for mixed methods research and a human resources
specialist with a background in operational excellence and project
management. What they brought to the team was an intense passion for
learner-centricity, a well-rounded understanding of how research and data
should inform decisions, and an ability to make connections with both
business leaders and our learners. The learning analyst focused on business
partnerships and collaborated with learning and performance consultants to
design learner experience research and measurement plans, conduct
research, and provide recommendations. And the data analyst focused on
further scaling our data and measurement collection, dashboarding, and
identifying trends that could become intelligence.
The team has grown since then, and is now five strong. We’ve
reorganized into a center of excellence model that is similar to what you
might see in a traditional PhD lab: Senior members of the team lead the
research design, methodology, and business intelligence while junior
members conduct research, increase operational efficiency, and support
measurement, reporting, and analysis. They are part of a larger enablement
team that supports all of enterprise learning, ensuring we have the
infrastructure we need to make the high-impact learning content our
organization requires.
Building our department from the ground up allowed us to ensure the
learning analytics capability is fit for purpose within our L&D organization;
starting small allowed us to bring the entire department along and build
grassroots interest and participation in data, analytics, and research. This
has been a progression for us over the years of starting small with pilot
projects before building in the culture and processes for analytics
throughout all our work.
Establishing Learning Measurement and
Evaluation Processes
By Alfonso Riley, Digital Program Manager, Caterpillar

When I joined the learning analytics team at Caterpillar, the team had
already begun their data and analytics journey. My task when I started was to establish the learning measurement and evaluation processes to enable learning analytics across our dealer-facing programs. The first
milestone was to gather all the historical data across our different learning
systems and provide an overview of our dealers’ learning activity across the
learning ecosystem. With more than 160 independent dealers that account
for 150,000 learners worldwide, the ability to track and analyze learning
data at scale is critical to ensure the success of our learning program goals.
From a metrics standpoint, our leaders needed to identify our “true”
active user population and their adoption patterns. Therefore, the focus was
to identify what learning modality our dealer learners were using (formal
versus informal activities) and gauge the adoption of our key learning
programs (leadership, sales, and service career development).
Achieving this goal required a deep understanding of how the historical
data and the current data were generated, so data assumptions could be
developed to avoid misinterpreting it. Once completed, we used those
assumptions to develop our data import and normalization plan, which
consisted of importing the historical data from our LMS into our external
LRS, as well as integrating each of our learning systems (including third-
party content, videos, and assessments) so we could normalize the data
using the xAPI profile.
The use of ingestion templates helped us properly import the data to our
LRS and then use our learning analytics platform to analyze the data and
find patterns across it. In this way, we could check the validity of our
assumptions and act on which data was needed and which was not. The data
normalization through xAPI ensured we could evaluate the entire data lake despite its many different data sources.
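To illustrate what normalizing legacy data to xAPI can look like, here is a minimal Python sketch that maps one hypothetical LMS completion record onto an xAPI statement. The field names and system URLs are invented and are not Caterpillar’s actual ingestion templates.

```python
# A minimal sketch of normalization: one hypothetical legacy LMS row -> one xAPI statement.
legacy_record = {
    "user_id": "D123-0457",
    "course_code": "SERV-201",
    "course_title": "Service Fundamentals",
    "completion_date": "2021-03-14T09:30:00Z",
    "score_percent": 88,
}

def to_xapi(record: dict) -> dict:
    """Convert a legacy LMS completion record into an xAPI statement dictionary."""
    return {
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": "https://lms.example.com",
                        "name": record["user_id"]},
        },
        "verb": {"id": "http://adlnet.gov/expapi/verbs/completed",
                 "display": {"en-US": "completed"}},
        "object": {
            "objectType": "Activity",
            "id": f"https://lms.example.com/courses/{record['course_code']}",
            "definition": {"name": {"en-US": record["course_title"]}},
        },
        "result": {"completion": True,
                   "score": {"scaled": record["score_percent"] / 100}},
        "timestamp": record["completion_date"],
    }

print(to_xapi(legacy_record))
```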
This provided the foundation of our learning analytics strategy, which centered on three types of standard dashboards for our learning programs’ stakeholders: program readiness, program skills and knowledge transfer effectiveness, and learning experience effectiveness.
• The program readiness dashboards focused on using descriptive
analytics around target population adoption, providing insight to the
question “Is our target audience completing the critical learning
programs on time for product or services deployment?” This
question goes from a global view to a dealer-by-dealer view.
• The program skills and knowledge transfer effectiveness
dashboards focused on using diagnostic analytics, which combines
formative assessments (such as gap assessments or self-assessments)
with summative assessments (such as tests, exams, or performance
assessments) associated with the critical learning programs. The gap
assessments determine who should be going to what learning
program or competency development plan, and the performance
assessments validate the competency level. This type of dashboard
focused on a specific learning program in a given timeframe and
required a holistic approach rather than focusing just on specific
learning activities. The questions center on gauging the effectiveness
of a given learning program to improve the performance of a target
audience.
• The learning experience effectiveness dashboards focused on the
quality of the learning program, mostly looking at engagement
behavior displayed through multiple visualizations using trending
and correlation analysis to show patterns in digital learning activity (such as engagement with e-learning and videos). This one
helps our learning experience design team identify what modalities
have the highest reach based on the target audience and topics (such
as leadership, sales, and service programs).
This structure enables us to provide a common framework to share
insights into our learning programs, by showing adoption rates (total people
engaging with the learning activity versus total people completing the
activity or program).
My advice for others looking to get started and build a similar structure
is first to figure out what data you are already collecting. Sometimes we
may be eager to start from scratch, but if your organization is already collecting data from learning activities, start there and identify which data is valuable and which is not.
From a deployment planning perspective, the approach that worked for
me was:
• Normalize your data. For us, the key was having an external LRS
that could ingest both learning data and business data and consolidate
it into a single data lake. In this way, the data can be easily structured
to show the insights we were looking for.
• Experiment with your stakeholders. A big success for adoption
was including our stakeholders in the development phase of our
analytics dashboard. This helped us ensure alignment with our
business requirements as well as explore new ways to use the data.
And finally, discuss frequently with your stakeholders how they intend
to use the data. Ask questions about their business requirements, because
they may shift often. Also, encourage them to brainstorm on what success
and failure look like from a data perspective, so standard metrics can be
developed based on those scenarios.
Acknowledgments

“We are uncovering better ways of developing software by doing it and helping
others do it.”

The acknowledgments of my previous ATD Press book open with this quote from the Agile Manifesto, and it seems fitting for this book as well.
As practitioners, leaders, and members of the community of professionals,
we are uncovering better ways of accomplishing our work by doing it and
by helping others do it. In this case, we’re uncovering and sharing better
ways of using the data at our disposal to create better learning experiences,
more capable and confident people, and more successful organizations.
The work this industry is doing now to come up to speed with a more data-centric world means we’re all learning together, and the process of researching and writing this book has been an iterative and collaborative experience.
“If you want to go fast, go alone. If you want to go far, go together.”

This proverb means so much to me on so many levels. Here, we are going far and we’re doing it together. In my work with xAPI, I am often asked for
examples of who’s doing it “for real.” Having case studies from a variety of
sources, industries, and experiences was something that I knew was going
to be very important for this book. I so appreciate the time and care that the
contributors spent to share their stories so generously for this book: Josh
Cavalier, Becky Goldberg, Emma Weber, Ben Betts, Tammy Rutherford,
Andrew Corbett, Derek Mitchell, Tiffany Jarvis, Kimberly Crayton, Rodney
Myers, Brent Smith, Wendy Morgan, Janet Laane Effron, Ulduz
Berenjforoush Azar, Stella Lee, Matt Kliewer, John Polk, JD Dillon, and
Alfonso Riley.
When I think about “going together” I think about the xAPI Learning
Cohort. Picking up from the ADL’s xAPI Design Cohorts run by Craig
Wiggins, the TorranceLearning team led 14 cohorts over seven years,
supporting 4,500 professionals as they learned about xAPI, getting data
from their learning experiences, and ultimately doing something with it.
Alison Hass, Peter Guenther, Jessica Jackson, Erin Steller, Jami
Washington, Leanne Gee, and the utterly unflappable Matt Kliewer have
been a huge part in growing a community of professionals learning together
—and teaching one another—about data.
Jack Harlow and Hannah Sternberg, the editors I work with at ATD
Press, have once again helped guide this work with gentle nudges,
thoughtful research of their own, and their most delicate way of saying,
“Megan, this makes no sense. Try again.” Thank you.

“Skip to the good parts.”

About the time I kicked off the writing of this book in earnest (or should
have kicked it off), I got some fantastic advice that has helped me push
through some of the big blockers in life—including writers’ block: “Just
skip to the good parts.” So many people—Michelle, Matt, Maria, the entire
team at TorranceLearning—have helped me see the “good parts” of life
along the way. And a well-placed “you’ve got this” goes a long way (I’m
looking at you, Emmet). Thank you.

“You need a vacation.”

The daily demands of my role at TorranceLearning make carving out time for writing very difficult. My parents have provided support, space, and
places to write this book. And, at the same time, they’ve helped to manage
my burnout. And now … now, I’m ready for a real vacation. Thanks, Mom
and Dad. I love you.
Further Reading

Are you ready for more? I found these books interesting and useful along
my own journey, some of which provided insight and direction for this book.
Others were interesting beach reading.

The Data Detective: Ten Easy Rules to Make Sense of Statistics, by Tim Harford (Riverhead Books, 2020)
Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data into Profitable Insight, by Piyanka Jain and Puneet Sharma (American Management Association, 2015)
The Art of Statistics: How to Learn From Data, by David Spiegelhalter (Basic Books, 2019)
The Functional Art: An Introduction to Information Graphics, by Alberto Cairo (New Riders, 2013)
The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios, by Steve Wexler, Jeffrey Shaffer, and Andy Cotgreave (Wiley, 2017)
Show Me the Numbers: Designing Tables and Graphs to Enlighten, by Stephen Few (Analytics Press, 2012)
Learning Analytics: Using Talent Data to Improve Business Outcomes, by John R. Mattox II, Peggy Parskey, and Cristina Hall (Kogan Page, 2020)
Investigating Performance: Design and Outcomes With xAPI, by Janet Laane Effron and Sean Putman (MakingBetter, 2017)
Measurement Demystified: Creating Your L&D Measurement, Analytics, and Reporting Strategy, by David Vance and Peggy Parskey (ATD Press, 2021)
Making Sense of xAPI, by Megan Torrance and Rob Houck (ATD Press, 2017)
DataStory: Explain Data and Inspire Action Through Story, by Nancy Duarte (IdeaPress, 2019)
The Visual Display of Quantitative Information, by Edward R. Tufte (Graphics Press, 2001)
References

ADL (Advanced Distributed Learning). n.d. “ADL Initiative R&D
Projects.” ADL Initiative. adlnet.gov/research/projects.
ADL (Advanced Distributed Learning). n.d. “TLA CMM Level 1—
Instrumenting Learning Activities With xAPI and Connecting to an
LRS.” ADL Initiative. adlnet.gov/guides/tla/service-definitions/xAPI-
Adoption-TLA-CMM-Level-1.html.
Anderson, C. n.d. “Understanding and Visualizing Data.” University course
materials, Cornell University.
ATD (Association for Talent Development). 2019. The Talent Development
Body of Knowledge. td.org/tdbok.
Bellinger, G., D. Castro, and A. Mills. 2004. “Data, Information,
Knowledge, and Wisdom.” Systems Thinking. systems-
thinking.org/dikw/dikw.htm.
Boulton, C. 2021. “What Is Digital Transformation? A Necessary
Disruption.” CIO, June 24. cio.com/article/3211428/what-is-digital-
transformation-a-necessary-disruption.html.
Britz, M., and J. Tyler. 2021. Social by Design: How to Create and Scale a
Collaborative Company. New York: Sense & Respond Press.
Duarte, N. 2019. DataStory: Explain Data and Inspire Action Through
Story. Washington, DC: Ideapress Publishing.
Fox, M. 2015. “Using a Hypothesis-Driven Approach in Analyzing (and
Making Sense) of Your Website Traffic Data.” Digital.gov, April 16.
digital.gov/2015/04/16/using-a-hypothesis-driven-approach-in-
analyzing-and-making-sense-of-your-website-traffic-data.
Kirkpatrick Partners. 2022. “The Kirkpatrick Model.” Kirkpatrick Partners.
kirkpatrickpartners.com/the-kirkpatrick-model.
Malone, N., M. Hernandez, A. Reardon, and Y. Liu. 2020. “Advanced
Distributed Learning: Capability Maturity Model—Technical Report.”
ADL Initiative, July. adlnet.gov/publications/2020/07/ADL-Capability-
Maturity-Model—Technical-Report.
Mattox, J.R., P. Parskey, and C. Hall. 2020. Learning Analytics: Using
Talent Data to Improve Business Outcomes, 2nd ed. New York: Kogan
Page.
Morgan, W. 2020. “The Science of Instructional Strategy.” Learning
Solutions Magazine, June 17. learningsolutionsmag.com/articles/the-
science-of-instructional-strategy.
NIST (National Institute of Standards and Technology). 2021. “Metrics and
Measures.” NIST Information Technology Laboratory, May 17.
nist.gov/itl/ssd/software-quality-group/metrics-and-measures.
Panetta, K. 2021. “A Data and Analytics Leader’s Guide to Data Literacy.”
Gartner Insights, August 26. gartner.com/smarterwithgartner/a-data-
and-analytics-leaders-guide-to-data-literacy.
Pease, G., and C. Brant. 2018. “Fuel Business Strategies With L&D
Analytics.” TD at Work. Alexandria, VA: ATD Press.
Ridgway, V.F. 1956. “Dysfunctional Consequences of Performance
Measurements.” Administrative Science Quarterly 1(2): 240–247.
jstor.org/stable/2390989?seq=1#page_scan_tab_contents.
Sparks, S.D. 2020. “The Myth Fueling Math Anxiety.” EdWeek, January 7.
edweek.org/teaching-learning/the-myth-fueling-math-anxiety/2020/01.
Torrance, M. 2020. “COVID-19 and L&D: Present and Future.” The
Learning Guild Research Report, July 8.
learningguild.com/insights/252/covid-19-and-ld-present-and-future.
Vance, D., and P. Parskey. 2021. Measurement Demystified. Alexandria,
VA: ATD Press.
Vigen, T. 2015. Spurious Correlations. New York: Hachette.
Watershed. 2022. “What Is Learning Analytics?” Watershed.
watershedlrs.com/resources/definition/what-is-learning-analytics.
Index
Page numbers followed by f and t refer to figures and tables, respectively.

A
Action Mapping, 95, 95t–96t
adaptive learning, 80
ADDIE, 165–166, 165f, 166f
Advanced Distributed Learning (ADL), vi, 36, 86–87, 162–163
aggregate data, 136
Agile approach, 105–106, 165
AI (artificial intelligence), 26
AICC specification, 36
all-in-one learning platforms, 7
analytics, 21. See also data and analytics
categories of, 77, 78f
levels of, 22–23
performing, 80
Anderson, Chris, 3
Ann Arbor Hands-On Museum Digitally Enhanced Exhibit Program
(DEEP), v
answer sources, 103, 103t
artificial intelligence (AI), 26
assessments, 46–47, 126
audience, 176–178, 177f, 178f
augmented reality, 128–129
Axonify, 200–202
Azar, Ulduz Berenjforoush, 121–122, 136

B
Bersin, Josh, 70
Betts, Ben, 33
BI (business intelligence), 7, 25
big data, 20–21
blended scenarios, using data across, 88–89
Bottom Line Up Front (BLUF), 179
Boulton, C., 4
Brant, Caroline, 22–23
Britz, Mark, 92
browser data, 130
building scale and maturity, 189–208
case studies, 200–208
first production projects in, 191–194
foundational elements in, 195–198, 196t–197t
and framework for using learning data, 188–189, 189f
full-scale implementation in, 194–195
multi-project implementation in, 194
pilot projects in, 190–191
prototyping in, 190
business acumen, 106
business intelligence (BI), 7, 25

C
categorical data, 57
Caterpillar, 206–208
Caulkin, Simon, 3
causation, 66, 200
Cavalier, Josh, 14
Cavoukian, Ann, 134
chart design, 179–180
chat bot data, 128
childcare metrics, 11
“cleaning” data, 167
cmi5 specification, 36, 40
communicating data, 175–188. See also visualization(s)
audience for, 176–178, 177f, 178f
case studies, 182–188
with dashboards, 175
with data presentations, 176
design of charts and graphs for, 179–180
framing message in, 178–179
purpose in, 176
through reports, 175
completion of learning metric, 43–46
confidence-based testing, 46
consumer-focused software, 4
continuous improvement, 5–6, 170
control groups, 121–122
convenience sampling, 64
conversion strategy, 195
Corbett, Andrew, 50–51
correlation, 66, 200
COVID-19 pandemic, vi, 177
Crayton, Kimberly, 73–74
Critical Equity Consulting, 121–122, 136
customer service metrics, 9

D
dashboards. See data dashboards
data, 17–18, 20f. See also data and analytics; workplace learning data
categories of, 77, 78f
communicating (See communicating data)
meaningfulness of, 18–19
storing (See storing data)
types of, 56–58
uses of, 78–82, 78f, 82f
data analysis
in data science, 23
iterating on, 168–170
storing data from, 155
data and analytics, 3–13
case studies, 14
to drive decision making, 8–12
for L&D function, 7–8
reasons for caring about, 4–6
data capture, 137–149
case study, 147–149
designing experiences for, 6
improvising with, 143
from learning experiences, 140–142, 141f, 142f, 147–149
from non-digital experiences, 142–143
from pre-built systems, 137–140, 139t
privacy, confidentiality, and security with, 143–145
data collection, 6, 167–168
data confidentiality, 144, 145, 195
data dashboards, 25
communicating data with, 175
for equity, diversity, and inclusion program, 182–184, 184f
iterations of, 169–170
data ethics, 134–135
data gathering, 23. See also planning
data governance, 195
data lakes, 154–155
data literacy, 5
data needs, 115–118, 118t, 120, 122
data preparation, 23
data presentation, 23, 176
data privacy, 136, 143–145, 195
data projects, ad hoc nature of, 173
data protection and privacy, 136
data science, 23
data security, 78–79, 144–145, 195
data sources, 123–136
case studies, 134–136
HR systems, 123–125
learning delivery platforms, 129
learning experiences, 125–129
moving from core question to, 118f
for things that don’t happen online, 131–132
upstream, storing data from, 151–152
data specifications and standards, 35–41
AICC, 36
case study, 39–41
choosing, 37
HR Open Standards, 36
IEEE standards, 87
for learning analytics, 39–41
1EdTech Caliper Analytics, 36
SCORM, 35–36
usefulness of, 37
xAPI, 36
DataStory (Duarte), 176
data visualization(s), 25, 79–80
audience for, 177–178
cautions about, 180–181
dashboards, 175
design of charts and graphs for, 179–180
iterating on, 168–170
purpose of, 175
storing data from tools for, 155
data warehouses, 154–155
decision making, data for, 8–12
DEEP (Ann Arbor Hands-On Museum Digitally Enhanced Exhibit
Program), v
definitions, 17–29
case studies, 30–33
evolution of, 26
of terms, 17–26 (See also individual terms)
Department of Defense (DoD), 86–87
dependent variables, 65–66, 65f
descriptive analytics, 22
devices, data from, 130
diagnostic analytics, 23
digital dexterity, 5
digital transformation, 4
Dillon, JD, 200–202
DoD (Department of Defense), 86–87
Duarte, Nancy, 176
E
EDI (equity, diversity, and inclusion) dashboard, 182–184, 184f
EDLM (Enterprise Digital Learning Modernization), 86
Edward Jones, 70–72, 203–205
effects of learning on performance, 48
Effron, Janet Laane, 55, 120, 173
engagement
to influence, 185–188, 187f
with learning experience, 47–48
Enterprise Digital Learning Modernization (EDLM), 86
equity, diversity, and inclusion (EDI) dashboard, 182–184, 184f
Experience API (xAPI), v, vi, xii, 7
flexibility of, 40
learning record stores, vi, 80, 153, 155–156
xAPI Learning Cohorts, vi, 36, 40, 190
exploratory analysis, 112–113

F
finance metrics, 8–9
first production projects, 191–194
Fox, Marina, x
framing messages, 178–179
Frank Porter Graham Child Development Institute, 88–89, 160–161
“Fuel Business Strategies With L&D Analytics” (Pease and Brant), 22–23
full-scale implementation, 194–195

G
goals
aligning data gathering/use with, 91–94, 93t, 94t
prioritizing, 103–105
Goldberg, Becky, 15
graph design, 179–180

H
Hall, Cristina, vii, 22, 174
healthcare metrics, 10
hospitality metrics, 10
HR Open Standards, 36
HR systems data, 123–125
HubSpot, 57
human resources metrics, 9
hypothesis formation, 111–122
case studies, 122
exploratory analysis in, 112–113
and identification of specific needs, 115–118, 118t
scientific method in, 111, 112f
testing hypotheses, 113–115, 114t–115t

I
immersive games, 128–129
improvising data capture, 143
independent variables, 65–66, 65f
influence, 185–188, 187f
information, 18–19, 20f
instructional design frameworks metrics, 95–97, 95t–97t
integrating workflows, 81–82, 82f
inter-rater reliability, 48
intranet data, 127–128
intuition, 106
IT, data storage role of, 158
item difficulty, 46
item discrimination, 46–47
iterating, 165–173
and ADDIE approach, 165–166, 165f, 166f
case study, 173
for continuous improvement, 170
on data analysis and visualizations, 168–170
on data collection, 167–168
LLAMA approach to ADDIE, 166, 166f

J
Jarvis, Tiffany, 70–72, 203–205
job performance data, 130–131
johnpolkandassociates.com, 185–188
JoshCavalier.com, 14

K
Katzell, Raymond, vii
key performance indicators (KPIs), 24
Kirkpatrick, Donald, vii
Kirkpatrick Model, 97–98
Kirkpatrick Partners, 97–98
Kliewer, Matt, 147–149
knowledge, 19, 20f
knowledge base data, 127–128
KPIs (key performance indicators), 24

L
LAI (Learner Adoption Index), 70–72
LAMM (Learning Analytics Maturity Model), 33
LAPs (learning analytics platforms), 80, 155
Learner Adoption Index (LAI), 70–72
learner analytics, 23
learning analytics, 22
data specifications for, 39–41
to drive results, 109
levels of, 23
scientific method applied to, 111, 112f
Learning Analytics (Mattox, Parskey, and Hall), vii, 22
Learning Analytics Maturity Model (LAMM), 33
learning analytics platforms (LAPs), 80, 155
learning analytics team, scaling, 203–205
learning data
to drive results, 109
framework for using, 82–83, 83f, 189, 189f
lack of, vii–viii
from the workplace (See workplace learning data)
learning data ecosystem
considerations for operating within, 162–163
mapping, 156–158, 156f, 157f
working within, 160–161
learning delivery platform data, 129
learning experience(s)
data capture from, 140–142, 141f, 142f, 147–149
as data source, 125–129
incorporating data in, 80
storing data from, 152–153
learning experience platforms (LXPs), 129, 152
learning function, continuous improvement of, 5–6
Learning Guild, 177
learning impact engine, 200–202
Learning Locker, 57
learning management systems (LMSs), v, viii, 7, 129, 152
learning measurement framework metrics, 97–102
learning organization, 103, 103t
Learning Pool, 33
learning program analytics, 23
learning record stores (LRSs), vi, 7, 80, 153
learning-specific metrics, 43–53
case studies, 50–53
completion of training, 43–46
effects of learning on performance, 48
engagement with learning experience, 47–48
tests and assessments, 46–47
learning transfer analytics, 30–32
Learning-Transfer Evaluation Model (LTEM), 98–102
Lee, Stella, 134–135
LeMoyne Institute, vi
leveraging graphics, 186
Lever-Transfer of Learning, 30–32, 182–184
live classroom data, 127
LMSs. See learning management systems
Lot Like Agile Management Approach (LLAMA), 166, 166f
LRSs (learning record stores). See learning record stores
LTEM (Learning-Transfer Evaluation Model), 98–102
LXPs (learning experience platforms), 129, 152

M
machine learning, 26
manufacturing metrics, 11
math anxiety, ix
Mattox, John, vii, 22, 174
maturity. See building scale and maturity
mean, 68–70, 69f, 70f
measurement and evaluation processes, establishing, 206–208
Measurement Demystified (Vance and Parskey), vii, 5–6, 21, 79
measures, 23
of central tendency, 58–60, 59f, 60f
for decision making, 8–12
of spread, 60–63, 61f–63f
median, 68, 69, 69f
metrics, 23
aligning data gathering/use with, 91–94, 93t, 94t
for decision making, 8–12
from instructional design frameworks, 95–97, 95t–97t
from learning measurement frameworks, 97–102
for learning organizations, 103, 103t
learning-specific (See learning-specific metrics)
prioritizing, 103–105
related to learning organization outcomes, 103, 103t
Mitchell, Derek, 52, 53f, 109
mode, 68, 69f, 70f
Moore, Cathy, 95
Morgan, Wendy M., 88–89, 160–161
multi-project implementation, 194
Myers, Rodney, 73–74

N
National Institute of Standards and Technology, 24
noise reduction, 186–187, 187f
noisy data layer, 154
non-digital experiences
data capture from, 142–143
as data source, 131–132
Novo Nordisk, 52, 53f
null hypothesis, 114–116, 114f–115f

O
objectives and key results (OKRs), 24
observation data, 126–127
1EdTech Caliper Analytics, 36
organization team designs, 195
outcome metrics, 103, 103t

P
Panetta, K., 5
Paradox Learning, 134–135
Parskey, Peggy, vii, 5–6, 21, 22, 79, 174
Pease, Gene, 22–23
people analytics, 121
performance
data from support tools for, 127–128
effects of learning on, 48
predictive analytics for, 30–32
student, learning data for insights into, 50–51
using analytics to evaluate, 15–16
personalized learning, 80
Phillips, Jack, vii
pilot projects, 190–191
planning, 91–109
Agile approach in, 105–106
to align with business goals and metrics, 91–94, 93t, 94t
case study, 109
metrics from instructional design frameworks, 95–97, 95t–97t
metrics from learning measurement frameworks, 97–102
metrics related to learning organization outcomes, 103, 103t
prioritizing list of goals, questions, and metrics, 103–105
using business acumen and intuition in, 106
Polk, John, 185–188
pre-built system data capture, 137–140, 139t
predictive analytics, 23, 30–32
prescriptive analytics, 23
PricewaterhouseCoopers, vi
Privacy by Design Framework, 134–135
program rollouts, testing, 121–122
Project CATAPULT, 40
Project Tin Can, v
prototyping, 190
purpose, in communicating data, 176

Q
qualitative data, 56–57
QuantHub, vi
quantitative data, 56
questions, prioritizing, 103–105, 103t

R
range, 60–63, 61f–63f
reporting, 24–25, 79–80. See also communicating data
reporting strategy, 175
retrospectives, 170
Riley, Alfonso, 206–208
Rustici Software, 39–41
Rutherford, Tammy, 39–41

S
sales metrics, 8
sampling, 63–65
sampling bias, 65
scaling, 83. See also building scale and maturity
scientific method, 111, 112f. See also hypothesis formation
scorecards, 25
SCORM (Sharable Content Object Reference Model), 35–36, 40
SCORMs. See sharable content object reference models
search feature data, 129–130
self-directed e-learning data, 125–126
sentiment analysis, 56–57
70-20-10, 95, 96t
Sharable Content Object Reference Model (SCORM), 35–36, 40
sharable content object reference models (SCORMs), v, viii, 7
simulation data, 128–129
Six Sigma, 63
skills needs identification, 52, 53f
Sky, 109
Slack, 148
Smith, Brent, 86–87, 162–163
Social by Design (Britz), 92
social learning data, 128
Sparks, Sarah, ix
spread, measures of, 60–63, 61f–63f
Spurious Correlations (Vigen), 66
stakeholders, 92, 120
standard deviation, 62–63, 63f
standards. See data specifications and standards
statistical significance, 66–67
statistics, 55–72
case studies, 70–74
correlation and causation, 66
measures of central tendency, 58–60, 59f, 60f
measures of spread, 60–63, 61f–63f
relationships between variables, 65–66, 65t
sampling, 63–65
statistical significance, 66–67
types of data, 56–58
storing data, 78–79, 151–163
case studies, 160–163
from data analysis and visualization tools, 155
from data warehouses and data lakes, 154–155
from learning analytics platforms, 155
from learning experiences, 152–153
from learning management systems and learning experience platforms,
152
from learning record stores, 153
options for and approaches to, 155–158, 156f, 157f
role of IT in, 158
tools and services for, 151–155
from upstream data sources, 151–152
storytelling, 184
survey data, 130

T
Talent Development Capability Model, 5
technology metrics, 11–12
terminology. See definitions
tests and testing
data from, 126
of hypotheses, 113–115, 114t–115t
learning-specific metrics from, 46–47
of program rollouts, 121–122
when iterating on data, 168
Thalheimer, Will, 98
Total Learning Architecture (TLA), 79, 86–87, 162–163
Trane Technologies, vi
Travelers Insurance, 15–16
Turning Learning into Action, 182

U
Underwriting Professional Development Program (UPDP), 15–16
University of California–Davis, 50–51
UPDP (Underwriting Professional Development Program), 15–16
upstream data sources, storing data from, 151–152
US Navy program effectiveness, 73–74, 74f

V
Vance, David, vii, 5–6, 21, 79
variables, 65–66, 65t
variance, 61–62, 62f
video design, 14
video interaction data, 126
Vigen, Tyler, 66
virtual classroom data, 127
virtual reality data, 128–129
visualization. See data visualization(s)
“Vs of data,” 104–105

W
Weber, Emma, 30–32, 182–184
wisdom, 19, 20f
workflows, 81–82
workplace learning data, 77–89
case studies, 86–89
categories of data and analytics in, 77, 78f
foundational elements for using, 82–83, 82f
uses of, 78–82, 78f, 82f

X
xAPI. See Experience API
xAPI Learning Cohorts, vi, 36, 40, 190
xAPI Learning Record Stores, vi, 7, 80, 153

Y
YouTube, 14
About the Author

Megan Torrance is CEO and founder of TorranceLearning, which helps
organizations connect learning strategy to design, development, data, and
ultimately performance. She has more than 25 years of experience in
learning design, deployment, and consulting. Megan and the
TorranceLearning team are passionate about sharing what works in
learning, so they devote considerable time to teaching and sharing
techniques for Agile project management for instructional design and the
Experience API. TorranceLearning hosted the xAPI Learning Cohort, a
free, virtual 12-week learning-by-doing opportunity where teams formed on
the fly and created proof-of-concept xAPI projects.
Megan is the author of Agile for Instructional Designers, The Quick
Guide to LLAMA, and two TD at Work guides: “Agile and LLAMA for ISD
Project Management” and “Making Sense of xAPI.” She is a frequent
speaker at conferences nationwide. TorranceLearning projects have won
several Brandon Hall Group awards, the 2014 xAPI Hyperdrive contest at
DevLearn, and back-to-back eLearning Guild DemoFest Best-In-Show
awards in 2016 and 2017 with xAPI projects. TorranceLearning was named
one of the 2018 Michigan 50 Companies to Watch.
A graduate of Cornell University with a degree in communication and an
MBA, Megan lives and works in Ann Arbor, Michigan.
About ATD

The Association for Talent Development (ATD) is the world’s largest
association dedicated to those who develop talent in organizations. Serving
a global community of members, customers, and international business
partners in more than 100 countries, ATD champions the importance of
learning and training by setting standards for the talent development
profession.
Our customers and members work in public and private organizations in
every industry sector. Since ATD was founded in 1943, the talent
development field has expanded significantly to meet the needs of global
businesses and emerging industries. Through the Talent Development
Capability Model, education courses, certifications and credentials,
memberships, industry-leading events, research, and publications, we help
talent development professionals build their personal, professional, and
organizational capabilities to meet new business demands with maximum
impact and effectiveness.
One of the cornerstones of ATD’s intellectual foundation, ATD Press
offers insightful and practical information on talent development, training,
and professional growth. ATD Press publications are written by industry
thought leaders and offer anyone who works with adult learners the best
practices, academic theory, and guidance necessary to move the profession
forward.
We invite you to join our community. Learn more at td.org.
