
INNOVATE

JUNE 2021 | VMWARE CONFIDENTIAL


RE-THINKING THE NATURE OF WORK
Table Of
Contents

00
Perpetual Innovation: Tanzu Service Mesh
Greg Lavender, Emad Benjamin, Pere Monclus
The lesson from this story is that bold ideas are good, but bold actions are
better.

05
New Support Model for Largest Customers
Renu Raman, Chirag Patel, Michael Hein
A new support model architecture to make customers happier, while
reducing necessary headcount by 73%.

12
Environmental, Social & Governance
Natasha Tuck
In order to stay focused on the most pressing matters, we have to set a
vision for where we want to go. Our 2030 Agenda is that vision.

20
A Brief History of Differential Datalog
Mihai Budiu
A 1,800-line Java program could be replaced with a 30-line Differential
Datalog program that is faster, uses less memory, and has fewer bugs.
23
Patent Talk: Extensible Token-Based Authorization
Dale Olds, John DiRico, Dexter Arver
Dale gives us the perspective of the ideator, while John gives us the
behind-the-scenes reveal of how VMware's patent process works.

FUSING EMPATHY + URGENCY

30
Remote Field Work: Part 2
Bob Motanagh
Bob interviews Branden Lugabihl and Benoit Serratrice to find out how
The Field has adjusted to working remotely with customers.

36
A Data Analytics Platform on VMware Private Cloud
Rumen Barov
Super Collider is VMware's internal analytics service that has had an
annual data volume growth rate of 200% in the last 7-8 years.

45
Project Kepler: Origins of the Anywhere Workspace
Shawn Bass, Craig Connors, Brianna Blacet
One of VMware's responses to the pandemic was to help our customers
and their employees work from anywhere—securely and easily.

53
Objectives & Key Results: A Short How-To
Jen Handler
Get the real low-down on OKRs by learning from
Jen's practical experience in using them.
DEMOCRATIZING
DIGITAL ACCESS

58
The Backlog Standup
Erica Dohring
Learn and try the Backlog Standup for a faster,
more focused start to the day.

60
Aligning on a Shared Future: OKRs, OGSMs, NSMs
Andrew Zusman
Andrew breaks down three popular organization-alignment frameworks.
Which one should you use? Read to find out!

69
The ACE Team: Unblocking FedEx + vRA 8.3
Tom Scanlan, Luis Valerio Castillo
The ACE team accelerates adoption of new products by collaborating
with customers and VMware Business Units.

73
University Talent: Benefits, Adaptability, and Innovation
Kate Wilkinson
The UT team overcame the various challenges caused by COVID-19 to
better engage with interns and new college grads.
Editors: Joe Samagond, Dexter Arver, Austin Roth Eagle

Perpetual Innovation: The Tanzu Service Mesh Story
GREG LAVENDER, EMAD BENJAMIN, PERE MONCLUS



Crisis Breeds Innovation
Last year provided ample proof that crisis often breeds innovation. The relatively sudden onset of a global pandemic quickly led to remarkable achievements: from mapping the COVID-19 genome to releasing the first of several vaccines in just nine months.

There are important lessons for all of us to learn from this achievement about meeting a critical and urgent societal need. The pandemic response necessitated tremendous collaboration between the healthcare and biopharma industries and governments around the world. The success story of the COVID-19 vaccines is one of focused dedication and the application of the most modern technology combined with traditional scientific methods, all performed under extreme pressure in the public spotlight.

VMware has grown and thrived because we have focused from the start on delivering innovations that meet or anticipate our customers’ needs. But how do we maintain and build upon a culture of perpetual innovation, even as a large, global company?

A great example of such innovation at VMware is VMware Tanzu Service Mesh. This revolutionary product’s journey from idea through incubation, financing, testing, and final delivery was a huge achievement for VMware. The lessons from this story not only show us how to ensure our continued growth as a company; they have already helped perpetuate our culture of innovation.

This is a story of things done right.

Three Key Concepts
There are three key concepts that are critical to seeing any innovation through from idea to product introduction. All three are evident in the Tanzu Service Mesh story:
1. Co-innovation is the concept of working with other partners (internally and externally, including other business groups, customers, academia, institutions, and others) to produce ideas. It is critical in any organization, but particularly in a global enterprise technology company like VMware.
2. Co-investment is the grease that helps the process move along smoothly. Any innovation requires seed funding, resources, and buy-in from different sources to succeed, often both inside and outside the company.
3. Collaboration. Innovation is a collaborative process. Good ideas can come from anywhere and from multiple sources. It is almost always necessary to work across teams and silos within the organization to achieve the desired result. Getting critical analysis from others, while often uncomfortable, can lead to a better product in the end.

Seizing an Opportunity for Innovation
CIOs are hired to drive feature velocity, and some are later fired for resiliency issues with their systems. We have witnessed that driving feature velocity can complicate system resiliency.

The modern app is dynamic and distributed. It is made up of dozens, even hundreds, of microservices which can be spun up and scaled quickly to meet evolving user and market demands. Architecture flexibility in a multi-cloud world often results in a lack of visibility, with services spread out across multiple cloud
providers and on-premises infrastructure. As a result, cloud architects are finding it difficult to know whether modern applications are performing as intended. They can't rely on the old approach that expects every service in this distributed world to perform consistently as designed. Instead they should embrace a well-advertised “declaration of service intent” in order to tame complexity in such systems.

Over time, as it became clear that application owners were struggling to maintain desired Service Level Objectives (SLOs)¹ for their application users, we ventured to look at SLOs as a “declaration of service intent”. The providers who host these applications are then expected to deliver and operate against those SLOs. What is needed in this environment is a modern app connectivity solution to provide true multi-cloud, multi-cluster, and end-to-end secure connectivity.

Today, many SLOs are defined in spreadsheets, with various manual processes that stitch them together in an attempt to achieve the desired systems behavior. With application services spread out across multiple cloud providers and on-premises infrastructure, cloud architects find it difficult to know whether applications are performing as intended. Combine this with the low reliability of some systems, and we have a problem. Almost half of new applications fail to meet performance SLOs even though most enterprises overspend on cloud costs by a factor of two to three.

Another challenge facing application teams developing new applications using a microservices architecture is enabling connectivity between microservices deployed as containers and distributed across multiple clouds and hybrid environments. Teams also need to more effectively ensure that users are given the right access permissions to applications; that application components are properly ring-fenced; and that communications across hybrid infrastructures and workloads are secured. Pere Monclus recently wrote about this in detail on VMware's Network and Security Virtualization blog².

We realized that what CIOs really needed were more intelligent, resilient, and secure systems that can observe application performance and automatically address issues in real time, so that the applications behave in accordance with the prescribed SLOs. In essence, this creates a system of resilience trust via a set of cascaded SLOs across distributed application services.

Solving the end-to-end application resiliency problem required a new way of thinking about apps and app experience. Since the modern app is essentially a dynamic network of services, developers need to treat it as such. This requires a new, modern service mesh that acts like an interconnectivity superhighway to give application owners and cloud architects critical information across disparate silos, so they can make just-in-time decisions.

So, we asked ourselves, “If we had to build a product from scratch in this space, what would it look like?” This is where we came up with the concept of the Predictable Response Time Controller (PRTC). With the PRTC, the user specifies the performance metrics expected from their services and the controller does its best to deliver on these objectives based on the available resources. We built a prototype that needed a service mesh. That is when we discovered that Pere Monclus and his team were separately looking at the potential of Istio, an open-source service mesh technology. We combined efforts to chart the future of application resiliency.

What should this new, modern service mesh look like? We first aimed to understand and truly listen to our customers.

¹ SLOs define the quality of a service consumers can expect from the application experience.
² https://blogs.vmware.com/networkvirtualization/2021/05/addressing-multi-cloud-connectivity-and-security.html/
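
To make the PRTC idea more concrete, here is a minimal Python sketch of a single controller step. It is not VMware's implementation. It assumes, purely for illustration, that latency improves roughly in proportion to replica count, so it scales replicas by the ratio of observed to target latency (the same proportional rule documented for the Kubernetes Horizontal Pod Autoscaler) and caps the result at the resources available. The names `LatencySLO` and `desired_replicas`, the `checkout` service, and all numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySLO:
    """Declared intent for one service (hypothetical schema)."""
    service: str
    target_ms: float  # response time the user asks for


def desired_replicas(slo: LatencySLO, observed_ms: float,
                     current: int, max_available: int) -> int:
    """One 'best effort' step: scale by observed/target, capped by resources.

    Assumption (illustrative only): latency is inversely proportional
    to the number of replicas serving the traffic.
    """
    if current < 1 or max_available < 1:
        raise ValueError("replica counts must be at least 1")
    ratio = observed_ms / slo.target_ms
    wanted = math.ceil(current * ratio)
    # "Does its best ... based on the available resources": never exceed the cap.
    return max(1, min(wanted, max_available))


# Hypothetical SLO used in the usage note below.
CHECKOUT_SLO = LatencySLO(service="checkout", target_ms=200.0)
```

With a 200 ms target and 4 replicas, an observed latency of 300 ms makes the sketch ask for 6 replicas. At 800 ms it would want 16 but is capped at the 10 that are available, which is the "best effort within available resources" behavior the text describes.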



Afterwards, we determined these five critical capabilities:
• Declarative Approach: Rather than manually stitching together the right experience, organizations need to take a declarative approach to application experience where a desired SLO can be set and automatically delivered.
• Traceability: The service mesh needs to track the overall performance of user transactions in a specific geographic location, from the time a user clicks on an application to the time the transaction is completed, regardless of the number, location, and provider of the distributed cloud services it uses.
• Context: The new, modern service mesh then needs to be able to apply these insights to specific metrics that are constantly measured and put into the proper context across multi-cloud environments.
• Testing and Iteration: Testing allows the user to roll out new services in a mesh and compare predicted experiences to a set baseline.
• Align to Cloud Spend: Experience doesn’t occur in a bubble, so the new, modern service mesh should be able to apply desired experiences to actual cloud costs. This allows organizations to balance experience with cloud spend.

The result was Tanzu Service Mesh³, which combines connectivity, observability, control, and security across a set of microservices. Tanzu Service Mesh allows the customer to define application SLOs and track application health, while also allowing operators to centrally manage end-to-end application traffic routing, resiliency, and security policies. It was just what our customers needed.

The Innovation Process
So how did the Tanzu Service Mesh idea come together? And how did it achieve success in only a few months? Keep in mind the three key concepts of innovation that we mentioned earlier: co-innovation, co-investment, and collaboration. Let's go back and look in detail at what the Tanzu Service Mesh team did after they built their prototype.

The vision of the OCTO (VMware Office of the CTO) team was geared around application resiliency patterns that can be delivered with a service mesh, while Pere’s team, in the Networking and Security Business Unit (NSBU), were separately exploring using a service mesh approach to build a holistic product that would offer various connectivity and security patterns. After several brainstorming sessions, we realized that if we combined application-resiliency use cases with security into one product, it would revolutionize market perceptions of what a service mesh could do.

While collaboration sometimes feels like a slower, more difficult path, it usually produces more refined, sophisticated solutions. This is why some of the most remarkable innovations are the product of one key idea combined with a collaborative team effort to bring that idea to life. We certainly can see that in the Tanzu Service Mesh example.

We can also see how one idea grew from exposure to others in the organization. It’s important to remember that innovation is not the sole responsibility of a single business unit, or the CTO. Good ideas can arise in any part of the enterprise. They often come from those we work with outside our own walls, especially customers, partners, universities, and government agencies.

Pere Monclus, CTO of NSBU, had the original idea: NSX Service Mesh. This is the seed that grew into Tanzu Service Mesh, incorporating the power of a service mesh in solving the SLO problem. Emad Benjamin of the VMware Office of the CTO conceived the killer use case (Predictable Response Time Controller⁴) to bring the idea to life and helped pull all the internal parties together. Greg Lavender, VMware’s CTO, saw the opportunity to solve a significant customer problem and had the key to unlock the funding to overcome the shortfall in resources to resolve it. The final product is now being distributed through the Modern Applications Platform Business Unit, a totally different Business Unit! The entire process required a high level of cooperation and collaboration among these teams. No one team could have done it alone.

³ For more details on Tanzu Service Mesh, check out https://www.vmware.com/products/tanzu-service-mesh.html and https://tanzu.vmware.com/content/vmware-tanzu-service-mesh-resources/tanzu-service-mesh-datasheet.
⁴ https://via.vmw.com/PRTC

In the success of any innovative product or service, there are several common characteristics to keep in mind as you deal with the inevitable challenges and setbacks:
• Innovation happens both organically and inorganically, often in unplanned ways. It requires an open, innovative mindset, and an investment plan to bring off-roadmap innovation to life, and to empower people to come up with disruptive ideas and experiment with them.
• Innovation involves an inherent tension between opposing points of view. This process can be uncomfortable. But rather than seeing “pushback” as an obstacle, view it as an
element of creative tension, as a way to improve an idea or concept through a respectful back-and-forth process. In the Tanzu Service Mesh case, there were several junctures where the proposal could have gone off track, but the team’s culture kept us focused. As an innovator, sometimes you need to step back, reassess, and either validate or change the direction toward a more productive outcome. Critics can almost always be overcome with effective results.
• Learn from setbacks and failure. As we challenge each other’s ideas, we identify what will and will not work. This process helps us build the mental muscle and fortitude to go the distance and become a catalyst for others to join in. Even failures can be good motivators.
• Innovation involves collaboration and stimulates more innovation. Once the initial hurdles to develop an idea are overcome, such as funding, the forward motion increases in velocity, attracts more supporters, and becomes a self-reinforcing cycle. The speed with which we got Tanzu Service Mesh to market reflected this phenomenon.
• Listen, listen, and then listen some more. Listening to our customers and other stakeholders is critical. We believe listening has a distinct “look.” It is perhaps the biggest source of ideas for innovative solutions. True listening helps us arrive at a deeper understanding of our customers’ needs and helps us identify opportunities to address them. Along the way, we need to continuously re-validate our assumptions.
• Seek out diverse opinions and thought. Throughout our careers, we have absolutely found diversity to be a competitive advantage. Diverse thinking, skills, backgrounds, perspectives, and approaches are integral to collaborative problem-solving and co-innovation. This is how great advances are made, how lasting innovation thrives, and how material progress is made.

Conclusion
The successful incubation and launch of Tanzu Service Mesh occurred during a time of crisis. Although the project started before the pandemic hit, we had to adjust and work together while being in various locations around the world. That immediate and radical change in our work environment was unsettling and strange at first. We suspect our desire to achieve success with Tanzu Service Mesh, while everything around seemed to be in a state of flux or uncertainty, gave us an even stronger sense of mission. Crisis breeds innovation.

We want to leave you with one more important thought that this story embodies: Bold ideas are good, but bold actions are better.

Great ideas succeed only through persistent actions. Putting ideas into action sounds easy, but it is actually quite hard. Too often, teams fall into the trap of confusing activity with progress. Things can go sideways quickly, so there must always be a standard of progress against which we measure ourselves.

This is how we help set the disruptive technology agenda for ourselves, and how we create the potential to influence the direction of the industry.

Greg Lavender is the Chief Technology Officer (and SVP), who leads the Office of the CTO. You can reach him at lavenderg@vmware.com.

Emad Benjamin is the Chief Technologist for Cloud Application Platforms and works in the Office of the CTO. You can reach him at ebenjamin@vmware.com.

Pere Monclus is the Chief Technology Officer in the Network and Security BU. You can reach him at pmonclus@vmware.com.



A NEW SUPPORT MODEL FOR
OUR LARGEST CUSTOMERS:
VRE, SRE, AND ITOPS
Authors: Renu Raman, Chirag Patel, Michael Hein with inputs from Kit Colbert, Jeff Hu, Alex Rankov, and many more. Editor: Dexter Arver

Motivation
The classical IT-ops support model, where a customer is supported by VMware’s Engineering, Support, Professional Services Organization (PSO), and Customer Success organizations, is a multi-layer support structure resulting in:
• many exchanges of hands for the customer;
• mis-alignment of authority and responsibility;
• issues that are ping-ponged between Business Units;
• more people being needed, which means a higher Total Cost of Ownership (TCO);
• and lower availability at scale.

In many ways, this has resulted in our customers having to navigate a somewhat broken organization tree as Support Tickets and Escalations are handed off between responsible Business Units. For our top 100 customers, who are looking for a private cloud that mimics the capability of Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, we must transform the traditional IT-ops and VI-admin model from both a customer satisfaction and business model perspective. This is a proposal to take the best learnings from classical IT-ops and Site Reliability Engineering (SRE) and adapt them to VMware’s on-prem customer base. We call this VMware Reliability Engineering (VRE).

We have data to support the economic value of VRE to VMware. Currently, we deliver our Infrastructure as a Service (IaaS) solution in one of four ways:
• Individual VMware Product Components
• VMware Cloud Foundation (VCF)
• VMware Cloud on Dell EMC (Dimension)
• VMware Cloud on AWS (VMC)

As can be seen in the chart on the top of the next page, VCF (the red box) has the least influence in the total cost of operations. Dimension (the blue box) goes one step further in handling physical infra operations, but is limited to vCenter. As we evolve to have foundational IaaS services,



VMC (jointly with AWS) is covering all the operations, from data center to elements of Platform as a Service (PaaS).

There are two overarching reasons to ponder a different structural model of operating IaaS. One is related to the economic model and the other, availability.

Economic Reason
Let's dig into the TCO model by taking a look at two anonymized companies. Company A, a large international database vendor, has 25,000 physical machines, which are managed by 2,500 direct staff and 2,500 extended staff (lower cost). Company B, a large international bank, has an environment that is roughly twice the size of Company A. These companies have a physical-machine-to-IT-staff ratio between 10:1 and 30:1. The drive to keep that ratio as low as possible leads to a non-optimal decision to consolidate around bigger machines at the cost of a larger fault domain or blast radius (i.e., one 4-socket server vs. two 2-socket servers). A 150K VM landscape with an annual growth rate of 25% becomes a roughly 500K VM landscape over 5 years. It costs $2.5B to operate such a landscape, in which the VMware Enterprise License Agreement (ELA) is only 10% of the TCO, while 50% of the TCO is the people cost. What is the reasonable limit of physical machines, operations staff, and thus virtual machines? Controlling the cost of operations staff is difficult, but the desired outcome here is to be cost competitive relative to consuming a VM with a hyperscaler: at least 50:1 (i.e., 5 times the current ratio).

Availability Reason
There is a second reason for a different structural model of operating IaaS: availability. Company A’s failure data from 2018 to 2019 (provided under confidentiality agreements) calls out the top 4 non-availability scenarios:
1. lack of automation
2. software versioning of apps and IaaS
3. incorrect configuration setup due to human error
4. inability to triage a problem in real time during incidents

Equally relevant is the monitoring of all failure events and taking corrective action by triggering automated and/or manual processes.

Both the economic and availability reasons highlight the need to reduce "human-in-the-loop" to achieve lower cost, while increasing availability. For us to achieve a reduction in the number of staff that are assigned by our customers to support the cloud stack environments, we can re-use and build on the same methodologies that many of our cloud competitors are deploying. This becomes a model not only for how we release and support our software, but also for how we should be reorganizing to deliver on these goals. To achieve these optimizations, we need to explore and characterize the guiding principles of the VRE/SRE model.

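
The arithmetic in the Economic Reason discussion can be reproduced with a short Python sketch. The input figures are the ones quoted in the article; the function names are illustrative. Two things are assumptions rather than statements from the article: growth is modeled as simple annual compounding, and only Company A's 2,500 direct staff are counted in the machine-to-staff ratio (which is what makes the "5 times the current ratio" remark line up with 10:1).

```python
import math


def projected_vms(initial_vms: int, annual_growth: float, years: int) -> float:
    """Compound growth: initial * (1 + g) ** years (assumed growth model)."""
    return initial_vms * (1 + annual_growth) ** years


def tco_component(total_tco_usd: int, percent: int) -> float:
    """Dollar value of a TCO component given as a whole-number percentage."""
    return total_tco_usd * percent / 100


def machines_per_staff(machines: int, staff: int) -> float:
    """Physical machines managed per staff member."""
    return machines / staff


def staff_needed(machines: int, target_ratio: int) -> int:
    """Staff required to run `machines` at `target_ratio` machines per person."""
    return math.ceil(machines / target_ratio)


# Figures quoted in the article.
TOTAL_TCO_USD = 2_500_000_000
vms_after_5_years = projected_vms(150_000, 0.25, 5)
ela_cost = tco_component(TOTAL_TCO_USD, 10)     # license share of TCO
people_cost = tco_component(TOTAL_TCO_USD, 50)  # staffing share of TCO
current_ratio = machines_per_staff(25_000, 2_500)
staff_at_50_to_1 = staff_needed(25_000, 50)
```

Under these assumptions, compounding 150K VMs at 25% for 5 years gives about 458K, in the neighborhood of the article's 500K figure. The ELA share comes to $250M against $1.25B for people, and moving Company A from 10:1 to 50:1 would mean 500 direct staff instead of 2,500.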


Guiding Principles for VRE/SRE model
1. Discipline and Execution: Repaving “Build the plane as you fly the plane” Methodology. Industry data shows that, at scale (1K, 10K, 100K, 1M VMs), there are faults that have to be tested for at development time and then tested, validated, and integrated in situ.
2. Automation/Manpower: The #1 problem for availability is related to inadequate monitoring, automation, and human intervention in operations. The #1 cost component in the TCO model is operational staffing cost. What is needed is more highly skilled and trained operational staff who can automate workflows and be less dependent on vendors or 3rd parties by relying more on VMware and our customer operations team. Achieving 15 minutes of Mean Time to Recover (MTTR) will require revisiting architectural assumptions along with upskilling and cradle-to-grave ownership of the services engineered for 24/7/365.
3. Culture: A culture and methodology of fail-fast and fail-first with integrated testing. A culture of not shooting the messenger or passing blame, but taking every failure as a lesson learned to improve the system. A culture of developing and deploying only if there is measurable improvement over what exists. The teams will deliver to defined SLAs and interfaces.
4. Skills: Everything delivered as code (automation); turn manual processes into automation. Those automations should be designed for cloud scale, which is inherently distributed. A culture of connection between operations (50%) and development (50%).
5. Accountability + Responsibility: This is what successful support teams at AWS and other cloud organizations look like: flat and modular teams in development with associated business leadership that interacts directly with the sales account teams. Alignment of accountability (including revenue ownership, which trickles back from the general account manager to the services PM) and responsibility for incident response, feature development, and revenue targets. There is one throat to choke at each layer on the in-bound (revenue and incident) and outbound (feature, RCA, and operational automation) flows.
6. Streamline the # of layers: Currently, it’s a matrix structure that responds to customer needs. As you will see below, we
can streamline the model with a clear in-bound (e.g., revenue, incident, feature requests) and outbound (e.g., feature
capability, issues/fixes, Day N operational support) flow.
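
As a small illustration of principles 2 and 4 (an MTTR target, and manual checks turned into code), the following Python sketch computes Mean Time to Recover from incident records and compares it with the 15-minute target mentioned above. The record layout, the function names, and the sample incidents are all made up for this example; a real pipeline would pull these timestamps from a monitoring or ticketing system.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Tuple

# Target quoted in principle 2.
MTTR_TARGET = timedelta(minutes=15)

# One incident as (detected_at, recovered_at); hypothetical record layout.
Incident = Tuple[datetime, datetime]


def mean_time_to_recover(incidents: List[Incident]) -> timedelta:
    """Average of (recovered - detected) over all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    seconds = [(recovered - detected).total_seconds()
               for detected, recovered in incidents]
    return timedelta(seconds=mean(seconds))


def meets_target(incidents: List[Incident],
                 target: timedelta = MTTR_TARGET) -> bool:
    """True when the observed MTTR is at or below the target."""
    return mean_time_to_recover(incidents) <= target


# Made-up sample data: recoveries of 10, 20, and 15 minutes.
SAMPLE_INCIDENTS = [
    (datetime(2021, 6, 1, 9, 0), datetime(2021, 6, 1, 9, 10)),
    (datetime(2021, 6, 2, 14, 0), datetime(2021, 6, 2, 14, 20)),
    (datetime(2021, 6, 3, 23, 50), datetime(2021, 6, 4, 0, 5)),
]
```

For the sample data the average recovery time is exactly 15 minutes, so the target is just met. A single slow recovery would push the mean over the line, which is one reason to track the distribution and not only the mean.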

Having defined the core principles for the VRE/SRE model that we desire, let’s take a look at the golden rules that govern the
interaction with our customers and how this model will benefit the overall customer experience.



Details of VRE, SRE…Current engagement model with customers of private cloud
Key tenets of the VRE/SRE model are the following:
1. Minimize the # of hand-offs from development (features and capabilities) to deployment at the customer site (alignment of authority and responsibility).
2. Formulate an SRE team that extends engineering’s ability to create foundational services that are supported and maintained 24/7/365 around the globe.
3. Build a closely knit VRE team that takes the output from the SRE team and enables the customer to operationalize their private cloud, including Day 2 operations. The VRE team also acts as a “PM” for operational features that the customer requires.
4. This is not a separate delivery team or function. Backend engineering is still responsible for releases and no separate forks or releases are created. Patches are still delivered via the regular channels.

The visuals below show the current state of engagement and enablement with our customers.

Having defined the high-level principles and the customer interaction model, let’s look specifically at the profile of the reliability engineers and their roles and responsibilities.

VMware Reliability Engineer profile
VRE is the customer's primary interface. VRE works closely with the customer during Day 0, 1, and 2 operations on behalf of VMware’s SRE and engineering teams. Their primary objectives are to help reduce customer operational expense and improve availability and operational experience. To that end, a VRE has the following attributes.
1. Customer requirements: Translates the customer requirements into product/service requirements. Acts as a voice of the customer to the SRE and product/service teams.
2. Incident Management: Manages ticketing and routing of events (automated or manual) to the dev teams (SRE) and other 3rd parties. A visual of the incident flow and the role of the VRE is shown below. The incident management workflows are detailed here¹.
3. Private Cloud build (Day 0, 1 automations): Works with the customer to architect private cloud builds from standard blueprints and helps drive standardization across customers. In addition, the VRE (in conjunction with the customer) will validate every release, including sub-system releases, using the customer's test landscape and green-light releases. Lastly, the VRE will support the customer by automating workflows, closing gaps in VCF that are bespoke to the customer instance.
4. Day 2 Automation: Today, a significant gap that exists is Day 2 operations from the standpoint of TCO and availability. Many automations (failure recovery, performance, and capacity management) are bespoke to a customer, but some need to be brought back to the core platforms; the VRE needs to be skilled enough to make that determination. A VRE typically spends 50% of their time in software development, creating bespoke automations. This includes translation of runbooks into automation flows with the customer. They are co-located with the field organization and work across VMware's SRE teams to ensure the right services and automation workflows are delivered against the needs of the private cloud instance of their customers. To align accountability and responsibility, a VRE also acts as the customer's product manager.

¹ https://miro.com/app/board/o9J_lfQSnzs=/



The VRE (in light green) would orchestrate the flow of information
between the customer and all other parties.

VRE Role and Responsibility


• Is the customer's solution architect and develops the architecture in conjunction with the customer, while also validating the platform.
• Is the customer's product and service manager, thus collecting requirements and acting as a gatekeeper for feature releases.
• Has the responsibility and authority to address customer private cloud availability. This means 24/7/365 support: 2 or 3 shifts with pager responsibilities.
• Is responsible for routing alerts and events, and takes on L1 calls, jointly with the customer operations team, before SREs and product teams are alerted. The VMware GSS team will be co-participants.

It is critical to empower the VRE with the above to be effective in achieving the desired objective.
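
The routing responsibility in the last bullet (the VRE takes L1 calls before SREs and product teams are alerted) can be sketched as a tiered escalation chain. This Python example is only an illustration: the ordering of SREs ahead of product teams, the `escalate` function, and the example alerts and resolver rules are assumptions, and the joint role of customer operations and GSS is not modeled.

```python
from enum import Enum
from typing import Callable, Dict, List


class Tier(Enum):
    VRE = "VRE (L1, with customer operations)"
    SRE = "SRE"
    PRODUCT_TEAM = "product team"


# Assumed order: the VRE sees every alert first, then SREs, then product teams.
ESCALATION_ORDER = [Tier.VRE, Tier.SRE, Tier.PRODUCT_TEAM]


def escalate(alert: str,
             can_resolve: Dict[Tier, Callable[[str], bool]]) -> List[Tier]:
    """Return the tiers an alert passes through, ending at the one that resolves it."""
    path: List[Tier] = []
    for tier in ESCALATION_ORDER:
        path.append(tier)
        if can_resolve[tier](alert):
            return path
    raise RuntimeError(f"alert {alert!r} unresolved after all tiers")


# Hypothetical rules: which tier can close which kind of alert.
EXAMPLE_RESOLVERS: Dict[Tier, Callable[[str], bool]] = {
    Tier.VRE: lambda alert: alert == "disk-full",
    Tier.SRE: lambda alert: alert == "dns-outage",
    Tier.PRODUCT_TEAM: lambda alert: True,  # last line of defense
}
```

In this sketch a `disk-full` alert stops at the VRE, a `dns-outage` alert is passed on to the SREs, and anything else reaches the product team. The returned path doubles as a record of how many hand-offs an incident needed, which is the quantity the model aims to minimize.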



Services Reliability Engineering Profile
SREs are owners of the services that are delivered in conjunction with the engineering product teams (i.e., they create the services that run on the VCF infrastructure). With Anything as a Service (XaaS), SREs own out-of-the-box services that are customer visible and consumable. These services are the basis of the foundational IaaS layer; examples include Virtual Machine as a Service (VMaaS), Container as a Service (CaaS), and the foundation services (e.g., file, block, object, LB, DNS, Auth, register, etc.).

The SREs partner with the core engineering teams to develop these services that are optimized for at-scale clusters. There is significant overlap and movement between the core engineering team and the SREs. But SREs have one responsibility, which is to work closely with the VRE teams and make VREs successful in consuming these foundation services for the target customers. A typical service would require 8-16 engineers, who would be responsible for "cradle-to-grave" support, 24/7/365, for the said service.

SRE Role and Responsibility
• Creating Services (foundation and platform services).
• Collaborating with engineering: cross-functional and cross-BU.
• They are ephemeral teams that are formed to address VRE needs.
• They are an extension of platform teams.
• They are a part of the incident flow, working with VREs and product teams, or resolving issues with the services that they own.
• They act as a backstop for the VREs. Most field issues that are escalated are handled by SREs before they are sent over to the dev teams.
• 24/7/365 role.
• They generally possess unique expertise in one of the foundation or platform services.



In Summary
We have defined a new proposed structural model that sits between engineering and sales/customer-facing organizations. This model streamlines the operational needs of our large customers while lowering the overall cost and creating stronger alignment of accountability and responsibility. We have seen that the model has the potential for up to a 73% reduction in headcount between VMware and the customer. We have also observed a reduction in overall cost even with the addition of the VRE teams. Furthermore, we expect an MTTR gain by updating monitoring tooling and reducing the incident time (i.e., recovery automation and many more areas). In addition to the TCO and MTTR gains, a corresponding overall gain in Customer Satisfaction (CSAT) is also anticipated. We expect gains in Infrastructure Readiness (IR), with improved automation goals.

NEXT STEPS
We will work closely with 1-2 customers to pilot this new model. We hope to be able to give an update once we have demonstrable
results from this new model. If you would like additional information, please reach out to any of the authors of this paper.

Renu Raman is a Senior Cloud Platform Architect working in the Office of the CTO. You can reach him at renur@vmware.com.

Chirag Patel is a Principal Consulting Architect working in Americas Professional Services. You can reach him at cpatel@vmware.com.

Michael Hein is a Sr. Staff Engineer working in the Office of the CTO. You can reach him at heinm@vmware.com.



Environmental, Social & Governance
Author: Natasha Tuck | Editor: Dexter Arver
VMware has a legacy of positive impact on the environment, as well as responsible corporate
citizenship. We have enabled carbon emissions avoidance of over 1.2B metric tons through the use
of our products; that’s equivalent to a 2019 Tesla Model S Long Range electric car driving back
and forth to Mars more than 28,000 times!¹ We have a strong culture of people who live our EPIC2
values and care deeply about giving back as citizen philanthropists.

1 IDC White Paper, sponsored by VMware. “Enabling More Agile & Sustainable Business through
Carbon-Efficient Digital Transformations,” August 2020.
http://www.vmware.com/go/VMware-IDCWhitePaper2020

At VMware, we have always viewed sustainability with a holistic lens: we acknowledge the
interconnectedness of our environmental efforts, social initiatives, and strategic business
priorities. This is why, over the past 5 years, we have talked about sustainability in terms of
our People, our Planet, and our Products. By the way, this “3P” or “Triple P” framing is
commonplace in the sustainability field and is intended to emphasize that sustainability does not
stand alone. Recently, we have evolved our strategy further and developed an innovative approach
to our next set of goals, our 10-year commitment to driving positive business outcomes by 2030.
In this article, I will cover how we arrived at our 2030 Agenda and the shift toward
Environmental, Social & Governance (ESG) through the lens of innovation.

Our Innovative Approach to Getting to 2030


At VMware, we love to innovate in everything that we do. And so, when we went to develop the 2030
Agenda, we innovated on how to develop an ESG strategy. As we got started, we asked ourselves:
what are we really trying to achieve by 2030? And what does the world need in order to be better?
This is how we arrived at our outcome-driven approach, which keeps Trust, Equity, and
Sustainability front and center.

Additionally, we had a unique opportunity at VMware to build our strategy from the inside-out: we
started with our people, we reviewed our products and technology vision, and only then did we
align the ESG goals (according to impact). Unlike that of many companies, VMware’s core business
strategy (e.g., intrinsic security, distributed workforce technology, radical efficiency of
digital infrastructure, etc.) is uniquely aligned to ESG and can therefore contribute in powerful
ways, amplifying the impact that we can have as a company on material environmental and social
issues.

What do we mean by Trust, Equity, and Sustainability and how are they aligned to our
business? Here are some examples:
• Intrinsic security to drive trust in the IT sector.
• Digital workspace technology to drive equity by enabling people to work from anywhere and
however they want to work.
• Radically efficient workloads to drive sustainability and support our customers in
decarbonizing their IT infrastructure.



“Our 2030 Agenda is transformative because our 30x30 goals are owned by our
business leaders and built into everything we do—from the way we develop
software and bring our solutions to market, to the way we build a culture
of inclusion—and in this way we are aligning our core purpose with doing our
part to create a more sustainable, equitable, and resilient world.”
Nicola Acutt, Vice President, Environmental, Social and Governance

Additional examples of ESG goals within the VMware 2030 Agenda include:
• Collaborating with our public cloud partners to achieve zero carbon operations by
2030.
• Investing in transformative research that inspires the next generation of sustainable
digital infrastructure.
• Closing the digital skills gap and making digital transformation more accessible for
all.
• Hiring one woman for every one man and ensuring 50 percent of our managers are
women or from an underrepresented community.
• Achieving net-zero carbon emissions for our operations and supply chain and reducing
our emissions 50 percent by 2030 from our 2018 baseline.
• Engaging 75 percent of suppliers (by spend) to reduce their emissions by setting
science-based targets by 2024.
• Procuring 100 percent of our power from renewable energy sources.

After aligning our core products and business strategy with the material topics that we
identified after a formal materiality assessment process (you can read more about this process in
our 2020 Global Impact Report²), we began to work with key stakeholders, developing goals across
these topic areas and aligning them to our outcomes. We have built in accountability by
integrating these ESG goals into our business functions. These goals are owned by our business
leaders.

2 https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/sustainability/vmware-global-impact-report-2020.pdf

The most far-reaching impact (the ‘outest’ of our inside-out) is represented by the United
Nations’ (UN) Sustainable Development Goals (SDGs)³. These were created in 2015 by the UN to
address the world’s most pressing issues, with the hope that by achieving these 17 goals, the
world could be a prosperous planet for all by 2030. VMware is a member of the UN Global Compact
and we are aligned⁴ to several of the UN SDGs, as well as their time horizon. As a result of the
work, we arrived at 30 goals to achieve by 2030, also known as “30x30”, and these elements make
up our integrated ESG strategy: our 2030 Agenda.

3 https://sdgs.un.org/goals
4 https://www.unglobalcompact.org/what-is-gc/participants/137744

Innovation is a Mindset
As I reflected on our team’s development of the 2030 Agenda, I had an epiphany about
innovation. I believe that the way we can be more innovative is viewing our work through
the lens of our work’s greatest purpose. When we hold the belief that we are building
something larger than ourselves, we think differently—with more vigor, enthusiasm, and
urgency—and with this renewed sense of purpose, we see the possibility of what could be.

Another unique quality of our 2030 Agenda is that it was developed from the ground up.
These goals were not prescribed by our executives; rather, they were developed by our
BUs and functional teams across the company. This was a collaborative process which
assessed the alignment of existing work and any moonshot goals. When I worked with
these teams, I saw first-hand that teams were energized by this alignment and the
opportunity to embed environmental and social outcomes into business goals. I also saw that



VMware ESG Strategy = 2030 Agenda = 30x30 Goals

Columns: Goal | Outcome | Accountable Function(s)

Framework: Environmental | Theme: Sustainability
1. Workload Carbon Efficiency | Accelerate productivity and carbon efficiency of customers’ digital operations | OCTO/PCS
2. Zero-Carbon Clouds | Catalyze the transition to zero-carbon clouds | PCS/CSBU
3. Carbon Transparency | Enable transparency to the carbon reduction impact of VMware solutions | WWCO/Mktg.
4. Net-Zero Emissions | Achieve net-zero carbon emissions for our operations and supply chain | REW/Sourcing
5. E-Waste Responsibility | Drive e-waste responsibility throughout operations | REW/IT
6. Business Resilience | Ensure business resilience from our physical infrastructure to our distributed workforce | IT/InfoS/REW
7. Distributed Energy | Support the transition to distributed energy | REW/Sustain.
8. Water Resilience | Enable water resilience amongst our global communities | REW
9. Impact Investments | Invest in innovations at the intersection of social, environmental & financial impact | Finance
10. Sustainable IT Infrastructure Advocacy | Advocate for public policy that drives secure, resilient and sustainable IT infrastructure | Legal/GA

Framework: Social | Theme: Equity
11. Anywhere Workforce | Enable our customers’ distributed workforces to be productive and engaged wherever they are working | IT/EUC
12. Nonprofit Digital Transformation | Accelerate nonprofits' digital journeys | OCTO/SI
13. Dynamic Workforce | Build a diverse, innovative workforce by meeting talent where they are and how they want to work | HR
14. Technology Accessibility | Ensure the technology we develop is accessible for all | PCS
15. Equitable Pay | Drive equity through equitable pay | HR
16. Diversity & Inclusion | Drive equity through doubling down on diverse hiring and inclusive leadership | DEI
17. Engagement & Wellbeing | Empower our employees through accessible, inclusive and innovative engagement and wellbeing programs | HR
18. Culture of Service | Foster a culture of service among our global communities | Foundation
19. Supplier Diversity | Support diversity in our supply chain by increasing spend with diverse-owned and underrepresented suppliers | Sourcing
20. Digital Skills | Advance technical and digital skills acquisition around the world | OCTO/SI

Framework: Governance | Theme: Trust
21. Intrinsic Security | Enable a safer cyber world through Intrinsic Security | PCS/SBU
22. VMware on VMware | VMware’s internal infrastructure will leverage its own software and services with a focus on trust, security, experience and sustainability | IT
23. Privacy by Design | Inspire customer and employee trust by embedding Privacy by Design across our products, services and operations | Legal/IT/PCS
24. Digital Ethics | Advance our approach to digital ethics & stewardship | OCTO/PCS
25. Workforce Development | Enable our people to advance from every chair | HR
26. Fair & Ethical | Advance fair and ethical business practices | Legal
27. Integrated Reporting | Transition to integrated reporting, meeting or exceeding the environmental & social disclosure standards | Finance
28. Transparency for All Stakeholders | Accelerate accountability and transparency for the benefit of all stakeholders | Legal/Fin/ESG
29. Sustainable Finance | Integrate sustainable metrics into our financial decision-making process | Finance
30. Social Impact Advocacy | Support relevant public policy that drives social and environmental impact through IT | Legal/GA

Confidential – Internal only | ©2020 VMware, Inc.

each team is striving for innovation in its own way, whether
through implementing a new tool to do predictive analytics,
thinking differently about employee wellness, or re-imagining
responsible sourcing. This is a 10-year journey that we are just
getting started on and we will continue to work to bring along
each team with their part in driving the 2030 Agenda. Our
plan is to continue to embed our ESG strategy into business
decision-making across the company (e.g., Which companies
should we invest in? Are we building the latest feature with
‘green’ code?).

In some ways, it’s simple: as humans, we are raised to make
choices in life that are kind, considerate, and compassionate.
In business, we have steered away from the human aspect and
the understanding that our businesses should support society
in developing thriving communities. It’s not too late to prioritize
people and the environment as we make business decisions;
we call it integrated decision-making.

One of my favorite sayings at VMware is “individual actions
matter” (this caught my attention when I joined six years ago as
an ongoing call to action). I too believe that small things add up
to big and even bigger things. I especially believe that if we are
aligned, we can amplify our impact.

Photo caption: Natasha and her son, Nicco, at the Duomo in Milan. "We’re not laying bricks,
we’re building a cathedral!"



Making sense of ESG
People are still trying to get their heads around ESG and I think that’s because the ESG
space is changing so rapidly. Investors owned the term ESG until recently and now companies
are adopting this framework as a way to embed social and environmental risks and
opportunities into their business. This new evolution of ESG is about fully integrating the
risks and opportunities to drive business resilience and to future-proof our business.

You can think of it this way: traditionally, in-house Sustainability or Corporate Social
Responsibility teams have been somewhat siloed and, for the most part, a nice-to-have,
but ESG is a more strategic approach that considers the perspectives of a wider group of
stakeholders. As one example, ESG is assessing the impact that climate change could have
on VMware’s business and putting good governance in place to address social and
environmental risks. Imagine a climate change event that impacts our largest customers, or a
city where we have a concentrated number of employees; our preparedness for this is a
reflection of our strategy. Investors want to know that companies are considering
environmental and social issues that could negatively or positively impact their business. It’s a
holistic view on these global risks that companies now need to consider.

5 “The Rise of ESG Reporting” https://blog.nacdonline.org/posts/rise-of-esg-reporting
6 https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/microsites/vmware-global-impact-report.pdf

2020 was marked by social unrest, the COVID pandemic, and environmental disasters
across the world, all of which inadvertently raised public awareness around the burgeoning
field of ESG practices. Today, the notions of wiping out poverty, protecting human rights,
ensuring clean water, and mitigating climate change are no longer seen as feel-good
fantasies that are out of scope for businesses. They are viewed as moral and economic
imperatives for living gracefully on a planet inching its way toward 10 billion human
inhabitants. In fact, Julie Bell Lindsay, executive director of the Center for Audit Quality, said
that we are now experiencing a “watershed moment”⁵ with the increased investment in
public companies with strong ESG practices. I especially loved reading this after publishing
our 2020 Global Impact Report last August, which named this watershed moment on
our front cover⁶.



“ESG is no longer a nice-to-have, but it has become a business imperative. With
the changing political tide, increased consumer pressure and focus around
employee wellness, we see each environmental, social and governance issue
be top of mind of our CEOs, CFOs and audit committee chairs. With growing
internal and external momentum, the focus on ESG in North America has never
been more promising.”
Angela Jhanji, ESG Leader7

Does ESG matter to our stakeholders?
The proof is in: companies that care about ESG outperform their peers. As Jon Hale reports in
Morningstar, “after holding their own in the fourth quarter, sustainable equity funds finished
2020 with a clear performance advantage relative to traditional equity funds.”⁸ For this reason,
investors are continuing to sharpen their focus on ESG. It is the differentiator that is driving
value. BlackRock⁹, the world’s largest asset manager, has doubled down on the importance of ESG
and is leading the charge. Larry Fink (their CEO) has changed the landscape with his annual
letter¹⁰, which is written with more conviction each year in its request, and now mandate, for
companies to get on board and accurately disclose their risks and opportunities related to
climate change. BlackRock will dispose of companies that aren’t keeping up. As I read recently,
the conversation has moved from “Why” to “How.”

7 Grant Thornton (2021). ESG is a business imperative. https://www.grantthornton.com/-/media/content-page-files/realestate-construction/pdfs/2021/ESG-business-imperative-real-estate.ashx
8 Hale, Jon (2021). Sustainable Equity Funds Outperform Traditional Peers in 2020. https://www.morningstar.com/articles/1017056/sustainable-equity-funds-outperform-traditional-peers-in-2020
9 https://www.blackrock.com/us/individual
10 https://www.blackrock.com/corporate/investor-relations/larry-fink-ceo-letter

ESG is no longer a nice-to-have. Rather our focus on ESG will help VMware's business flourish.



Here is another proof point on how investors view ESG: the latest stats from Edelman’s Trust
Barometer. The 2020 Edelman Trust Barometer Special Report: Institutional Investors¹¹ identifies
pivotal issues shaping investment criteria and how companies can build trust with the investment
community.

11 https://www.edelman.com/sites/g/files/aatuss191/files/2020-11/Edelman%202020%20Institutional%20Investor%20Trust_FINAL.pdf

The proprietary research surveys 600 institutional investors in six countries representing firms
that collectively manage over $20 trillion in assets. The survey was fielded from September 3
through October 9, 2020. Some of the findings are listed below.

Those statistics are no longer esoteric, as VMware’s top shareholders have high expectations.
What they are saying is also quoted below.

Edelman's trust barometer
• 92% believe companies that perform well on ESG deserve a premium.
• 99% expect the Board of Directors to oversee at least one ESG topic.
• 88% believe companies are not prepared for regulation of ESG reporting.
• 93% believe that ESG activism by traditional financial activists will increase.
• Investors want to hear more from: CFO (41%), Head of ESG (35%), and CEO (33%).

VMware shareholders & ESG
Dodge & Cox: "As value-oriented investors, we weigh valuation against risks and opportunities for
each company and issuer, and we believe material ESG factors can have a meaningful impact on
current and future valuations."¹²

BlackRock: "We know that climate risk is investment risk. But we also believe the climate
transition presents a historic investment opportunity...we are asking companies to disclose a plan
for how their business model will be compatible with a net zero economy...[and] how this plan is
incorporated into your long-term strategy and reviewed by your board of directors."¹⁰

Vanguard: "Climate change presents a profound risk to companies and their long-term
investors...we expect company boards to be aware of their role in the changing climate."¹³

Closing
When we envision the future, we imagine a sustainable world that is secure, equitable,
and resilient. As breakthrough innovators, it is our responsibility to build dynamic and
efficient digital infrastructures for our customers. As global citizens, it is our responsibility
to be better stewards of our global resources. As a responsible company, it is our business
to build a secure, resilient, and sustainable digital foundation for a future in which our
technology will make a positive impact on all our stakeholders: employees, customers,
shareholders, citizens, communities, and our planet.

12 https://www.dodgeandcox.com/pdf/ESG_Policy_Statement_US.pdf
13 https://about.vanguard.com/investment-stewardship/perspectives-and-commentary/2020_investment_stewardship_annual_report.pdf

With each innovation our business has brought to market, we have seen how even a single
step forward can create a ripple effect that transforms an entire industry. We can and
should apply this thinking to the challenges that our world faces today: a global pandemic,
social injustice, financial instability, and climate change. As Greg likes to say, “innovation
happens everywhere” and this moment is an opportunity to create systemic and scalable
change by applying the lens of innovation to everything we do.

In order to continue our legacy of positive impact, we need to continue to lead on issues
that are most material to the world and to our business. The world is constantly changing
and in order to stay focused on the most pressing matters, we have to set a vision for
where we want to go. Our 2030 Agenda is that vision.

“We cannot solve our problems with the same thinking we used when we created them.”
Albert Einstein

Natasha Tuck is the Director of Sustainability and ESG working in the Office of the CTO. You
can reach her at ntuck@vmware.com.



Editors: Leonid Ryzhyk, Ben Pfaff, Mohsin Beg, Lori Blonn, Dexter Arver

A BRIEF HISTORY OF
DIFFERENTIAL DATALOG
Mihai Budiu
Senior Staff Researcher
VMware Research Group
mbudiu@vmware.com

Research on programmable networks

In 2017 Leonid Ryzhyk, a Senior Researcher from VMware Research Group (VRG), initiated a project
called NERPA: NEtwork programming with Relational and Procedural Abstractions. The goal of NERPA
was to make software-defined networks easier to create and manage.

Leonid’s background is in formal verification. Formal verification deals with the creation of
artifacts that can be automatically proved correct. Leonid’s prior project, called Cocoon [1]
(Cocoon is another acronym, which stands for Correct by Construction Networks), showed how formal
methods can be successfully applied in the context of network design and management. In the same
way type-systems and garbage-collection led to the creation of safe languages such as Java, where
certain classes of bugs are impossible, precise specification languages can be used in the
network realm to simplify and improve network construction.

By analyzing the code of VMware’s network management platform, NSX-T, Leonid became convinced
that relying on general-purpose languages (such as Java) for network construction and management
was not the best approach; rather, designing domain-specific languages for the context of
networks could lead to significant improvements in programmer productivity.

The NERPA project started with the goal of defining a unified programming language for
implementing both network data-planes and the network control-planes. Leonid convinced several
other employees to join this effort: Mihai Budiu, from VRG, as well as Ben Pfaff and Justin
Pettit from NSX. To prove NERPA’s viability, the team sought to rewrite a significant piece of
network management software using the proposed programming language. The NERPA project proposed
two languages joined at the hip: an imperative language for writing packet processing pipelines
and a declarative language for writing network control planes.

RADIO 2018

The team, together with a few other collaborators, wrote a paper about the NERPA design for
RADIO 2018 [2]. The paper was accepted as a long presentation at RADIO, and the presentation was
delivered by Leonid. Following RADIO, the team was contacted by several engineers from the NSX
team who were intrigued by the idea and wanted to give it a try. Wei Guo and Nikolay Semenov
thought that the proposal could benefit some tricky parts of the NSX controller. As a result of
this collaboration, the team focused all its efforts on the development of one of the two NERPA
languages, the one used for writing control-planes. Thus, DDlog, or Differential Datalog, was
born.

Differential Datalog

The team realized that one of the constant things in computer networks (especially in
virtualized networks, like NSX) is… change. Networks change continuously and frequently, due to
the various states of virtual infrastructures: VM start/stop, VM migration, network maintenance,
network faults, change in administrative policies, or just changes in traffic. If one could
automate the handling of changes, it could simplify the management of networks significantly.

The team realized that by combining some existing technologies, they could build the right tool:
an incremental programming language. By using an incremental programming language, programmers
only have to express what they want to achieve, and the language automatically computes how to
get there when external conditions change. This is DDlog in a nutshell.

The team obtained approval from the Open Source Program Office to start and evolve the DDlog
project as an open-source project with an MIT permissive license. The project is hosted on
GitHub¹.

1 https://github.com/vmware/differential-datalog

NSX-T Distributed Firewall

Together with the NSX engineers, the team rewrote several algorithms from the NSX centralized
controller, which dealt with the distributed firewall, with the hand-written, incremental version
of these algorithms being quite involved. The resulting paper was presented at RADIO 2019
(Figure 1). It showed how a 1,800-line Java program could be replaced with a 30-line DDlog
program (see Figure 2) that is faster, uses less memory, and has fewer bugs.

This was just an initial prototype. Several other engineers from the NSX team joined the effort
to productize the code, including Eric Kao and Harold Lim. The road to a product was a long and
winding one. DDlog had to be included in the build system, certified via security reviews,
tested, and hardened. The team also developed several other NSX controller applications. NSX-T
finally shipped in the fall of 2020 with several components written in DDlog.

Figure 1: Leonid Ryzhyk presents on the mainstage at RADIO 2019.

Figure 2: This is the complete implementation of the NSX-T Distributed Firewall span! (The final
program used is somewhat more involved due to some manual optimizations.)

Skyline

A serendipitous encounter with Mohsin Beg, a Senior Staff engineer from CMBU, brought the
incremental computation capabilities of DDlog to the attention of CMBU. Coincidentally, this
encounter occurred at about the same time Mohsin transitioned from a previous project (where he
led the Log Intelligence service) to managing the Skyline project.

Skyline is a managed service, offering proactive and preventative health analysis to VMware
customers. Customers can opt into reporting their SDDC configuration and topology data as part of
their support license. These configurations are then uploaded into VMware’s cloud and analyzed
(based on knowledge base articles) using a custom rule engine that detects misconfigurations,
known problems, etc. Lastly, the health results are reported to customers in a SaaS service;
customers have remediated more than 65% of their problems without engaging VMware Global Support.

While Skyline has been successful, the continuous increase in breadth, depth, and frequency of
health findings was consuming cloud resources linearly. Mohsin realized that an incremental
engine that would only analyze changed data, and not the whole configuration, would offer the
non-linear resource consumption scalability that was needed.

Mohsin made the risky bet of choosing the relatively new technology of DDlog to rewrite the rule
engine and rule base. Moreover, since DDlog was not designed as a cloud service, a whole new set
of internal services for a clustered data-plane (with load-balancing, fault-tolerance,
multi-tenant support, etc.) had to be developed to run DDlog as an analytics engine service. The
new Skyline Insights Platform, which powers Skyline, went live in June 2021. As proof of DDlog’s
capabilities, the old and new health analysis services are being run in parallel and customers
will be migrated periodically without any loss of accuracy. The new Insights Platform has
tremendously improved the system scalability, bringing down cumulative compute and memory
required to produce the same analysis, while decreasing the latency of health processing to
report health findings from a few hours to a few seconds.

OVN²

2 https://github.com/openvswitch/ovs/tree/master/ovn

In the meantime, Ben Pfaff, Leonid Ryzhyk, and Justin Pettit continued with their effort to show
that DDlog is not just a research prototype, but that it could be used to write very complex
applications. The three ported an existing open-source virtual network management system, OVN,
from the C programming language to DDlog. The effort was substantial, and it produced a
16,000-line DDlog program that has most of the features of an industrial-strength network control
system. This DDlog program is roughly the same size as the original C code, but it is also fully
incremental. This means that it will react very efficiently to any change in configuration. This
is very important when managing large scale networks (hundreds of thousands of virtual machines).
This DDlog implementation has been certified by tests to be fully equivalent with the C
implementation and has been merged into the main branch of OVN in January 2021.

xLabs funding

In the meantime, emboldened by the new use cases and demand, the DDlog VRG team applied for
funding from the xLabs program, an Office of the CTO program which funds the development of
promising new technologies. The xLabs team approved funding for three workstreams related to
DDlog, and the team started looking for qualified engineers. The team currently has 3 xLabs
employees, the first having started in January 2021. They are accelerating the development of the
language, runtime, and associated tools.

Several of the xLabs proposals were also converted into proposals for RADIO 2020. Two of these
were presented as a part of RADIO 2020: the development of a distributed DDlog implementation
[4], and the implementation of a compiler that converts SQL to DDlog [5]. The development of the
distributed version of DDlog was significantly bolstered by another VMware Engineer, Daniel
Müller, who chose to work on DDlog by utilizing the Take3 program.

Mimar, D3Log, and NERPA again

The Skyline team realized that there was high value in exposing the capabilities of DDlog as a
hosted service that many teams within VMware could potentially take advantage of. So, a new
project was born, called Mimar, which is attempting to provide a data transformation as a managed
pipeline. By the way, Mimar is also being funded by xLabs. The Mimar team, together with several
engineers from the Workspace ONE Intelligence team, built a new prototype showing how a Mimar
pipeline could replace a complex, big-data processing pipeline based on Apache Spark, Flink, and
Redis. This paper was published in RADIO 2021 [6].

In the meantime, the xLabs team is forging forward with developing a distributed runtime for
DDlog called D3Log (from Distributed DDlog). The hope is that this runtime can be leveraged by
projects such as Mimar, Skyline, and NSX for running more scalable versions of DDlog³. Another
interesting innovation in D3Log is that it uses the DDlog language both as a data-plane language
(for performing data transformations), but also as the control-plane language, to manage the
DDlog distributed system itself, as it was originally envisioned and used in NSX.

3 https://via.vmw.com/ETYy

Finally, the team has again revived the NERPA project, dormant since 2018, attempting to build a
new prototype that marries DDlog with industry-standard programming languages, such as P4.

Conclusions

The DDlog story has only begun. The open-source project has high visibility and significant
updates. Several VMware products have shipped with DDlog components. But perhaps the most
interesting lesson of this story is about the successful technology transfer: an idea born in
research, its adoption by engineers who are looking for the best tools for solving hard problems,
and the persistence and continuous effort required to convert a great idea into a useful tool
that can be used on an industrial scale.

Bibliography

[1] Leonid Ryzhyk, Nikolaj Bjørner, Marco Canini, Jean-Baptiste Jeannin, Cole Schlesinger,
Douglas B. Terry, and George Varghese, Correct by Construction Networks using Stepwise
Refinement, NSDI 2017.
[2] Leonid Ryzhyk, Mihai Budiu, Nina Narodytska, Mooly Sagiv, Frank McSherry, George Varghese,
Justin Pettit, Ben Pfaff, NERPA: Concise, Modular Programming of Industrial SDNs, RADIO 2018.
[3] Mihai Budiu, Wei Guo, Deepika Kalani, Yanjun Lin, Leonid Ryzhyk, Nikolay Semenov,
Accelerating Distributed Firewall Span Computation Using Differential Datalog, RADIO 2019.
[4] Daniel Müller, Mihai Budiu, Wei Guo, Eric Kao, Harold Lim, Leonid Ryzhyk, Building
Distributed Network Control Planes with D3log, RADIO 2020.
[5] Mihai Budiu, Lalith Suresh, Leonid Ryzhyk, Incremental computation in SQL by compilation to
Differential Datalog, RADIO 2020.
[6] Mohsin Beg, Pooja Khandelwal, Manish Roy, Leonid Ryzhyk, Yao Zhang, Luke Yue, Mihai Budiu,
Mimar: A Programmable Stream Processing Service, RADIO 2021.

Mihai Budiu is a Senior Staff Researcher working in the Office of the CTO. You can reach him at
mbudiu@vmware.com.



Extensible Token-Based Authorization

The figure above shows a high level block system for authenticating access to resources in accordance with the present disclosure.

Patent Details
Patent number: US10452328B2
Title: Extensible Token-Based Authorization
Inventors: Dale Olds, Fanny Strudel, Brad Neighbors
Application filed: 2016-08-31
Application granted: 2019-10-22

Dexter Arver caught up with Dale Olds (olds@vmware.com) and John DiRico (jdirico@vmware.com) on April 26, 2021. Dale Olds is a Principal Engineer working in End-User Computing. John DiRico is an Intellectual Property Counsel II working in our legal department.



Dexter: Hi Dale! Could you explain this patent to me?

Dale: This patent was about combining multiple areas together. It was building on the OAuth2 standard protocol, which has to do with delegating authorization to applications, but it doesn't specify how that's done. All it specifies is how the tokens are transferred and protected. This patent was about how we could do that in a way that used OAuth2, JSON Web Tokens, and various other standards, but combine them in a different way than I had ever seen done.

A lot of these protocols, such as OpenID Connect and OAuth2, were designed for consumer use cases. For example, on your phone, you will get an alert to consent to an app getting your data (e.g. Facebook wants to know this much information about you). All these protocols were about delegated authorization based on users' consent. But in the enterprise that's not really our use case. We install applications and our IT department configures what information that VMware owns about us can be delivered to ADP, for example.

So, what we saw here in this patent was a way that we could combine these protocols in a way that made sense for enterprise use cases, and even provided some security aspects that I had never seen before.

For example, we can now have a customer's application that can be installed into VMware services, but an administrator can say, I don't want that application to get authorization information for anything more than read-only access. And that can be configured in the authorization server. No matter what that application does, it cannot get authorization information that is beyond read-only capabilities. So it's a way of compartmentalizing the security characteristics.

Dexter: And how did you come to work on this?

Dale: I actually joined VMware to work on Cloud Foundry and worked on an authorization server on Cloud Foundry called the UAA. And then, when Cloud Foundry was spun out, I switched to work in EUC on what was then called Horizon Application Manager (it is now called Workspace ONE Identity Services). I've been in this space for a long time, on extensible applications and service-to-service communications, trying to make this stuff as secure as possible.

This particular patent is interesting because we thought this was a good idea from inception. We were interested in solving for questions like, "How do you add applications and constrain what they can do?" and "How do you provide for service-to-service authorization?" You know, these are issues that came up a lot.

We came up with this patent application that my co-inventor Fanny Strudel codenamed Hecuba. Brad contributed some really good ideas. Most of it was me and Fanny talking about how we could solve these things on a whiteboard. And that was... July 2016.

We thought this was novel, so we submitted the IDF. We were also trying to figure out how to get this into Workspace ONE. Then about six months later, January 2017, we had a hiccup in a project and it needed identity services, like the IDF described.

We had a design that we were working on for CSP (Cloud Services Platform) and it didn't work out. We had to scramble as we were way behind on our commitments. And when we looked back over this idea we had in July, we thought, oh this… this fits really well. The end result is that the vast majority of the contents of this patent are now the IAM (Identity and Access Management) system for CSP, used by VMC (VMware Cloud) and all VMware CSP-based services.

The patent, as it was written, didn't totally fit, and there were some significant changes when we implemented it. For instance, the patent doesn't specify a tenant model, and a lot of what we had to do was deal with tenancy issues for CSP.

Dexter: Did you have to file, like, another thing afterward to add the tenant model part, or was that not a part of the patent?

Dale: That wasn't part of the patent.

Dexter: How did you know that this idea was novel? Is it because this was something you hadn't seen and you are very familiar with this space?

Dale: Yes. I've dealt a lot with OpenID Connect and OAuth2 protocols; that's what I did at Cloud Foundry. That's what Workspace ONE operates on: these various identity federation protocols.

You know, I've tried to escape identity management two or three times in my career, but I can't ever do it. I always have to come back to working on identity stuff. So it's the area that I have the most experience in.

Dexter: You want to do other stuff besides identity stuff but you keep getting pulled in? The gravitational pull of identity management is too strong for you?

Dale: For instance, I did a startup for a while (this was decades ago), doing large file delivery over the internet based on internet cost. That didn't work out. When I joined Cloud Foundry, I thought, "Oh cool, I'm going to work on platform as a service." And then I ended up doing the identity system. And then I didn't want to get spun out. I wanted to stay with VMware and the logical place to go was EUC. At that time, it was called Horizon Application Manager and I thought, "Good, at least it's not an identity system." And then shortly after I started they changed the name to VMware Identity Manager.

Dexter: (laughing) Nice. This is your destiny.

Dale: (smiling) Yeah, I gave up trying to escape it.

Dexter: Are you known as the identity guy then? Like in your circles?

Dale: Well, there are various circles. There is a group of identity people in the industry that tend to all know each other because it's a very small group. We go through various companies, but we all know each other. There's an internet identity workshop I just attended last week that has been going on twice a year since 2006, and pretty much everything that happens in this space is discussed in those workshops. We all know each other pretty well. A lot of my best friends outside of work are people that I met in the identity space. Inside VMware, pretty much what I do is work on identity across VMware, across business units.

We've also had something similar to what happens in the industry within VMware, in that some of the senior technologists like Fanny, who was one of the primary authors of this invention, moved on to work in CPBU, where she's working on the workload control plane, Project Bedazzle, and those things. And so I have an identity contact that I can deal with in that area. Emily Xu moved from Workspace ONE Access into the Dimension project, and now I have a contact that I can talk to there. There are many identity people in the company that I can turn to when identity issues come up.

Dexter: (joking) Okay, I didn't know there was such a title for various people inside the company... that there is a group of people just called the identity people. That's pretty cool.

Dale: They probably would not self-identify that way.

Dexter: (laughing) Probably not.

Dale: Just to add on about what we're doing, a lot of what we're doing right now is Project Eiffel (a part of the cloud architecture forum, by the way), which is really a lot of the same things as this patent application. It is figuring out how we get consistent authorization and identity services across VMware products.

Dexter: So at first it was developing identity services for CSP, and now you're working on how to bring this identity management towards all the other VMware products that need it?

Dale: Yes, and another thing to consider is that VMware is transitioning to services. Which means that every service needs to be able to handle federated identity and common authorization, which is precisely what this patent is about.

Dexter: What was your title and position when you first came up with the idea for this patent in July 2016?

Dale: Yeah, so I had to look up all this stuff because it's a long time ago. I looked it up and I had just been promoted to Principal Engineer earlier that spring. And then, we worked on this in July or so. I remember very well that I was getting ready to leave for Burning Man, which means that this was around late August, and I got some urgent calls because some part of the paperwork for the patent application hadn't been done right or something. Fanny scrambled and got that done for me, so that I could continue packing.

Dexter: And so you mentioned Fanny already. Who are some of the other co-inventors and how did they help you?

Dale: The other co-inventor was Brad Neighbors. Brad was working out of the Cambridge office at the time. I believe he was working on what is now called UAG. And I had gone out there to visit that office and run over some of these ideas with them. And I remember Brad had a particular interest in it. And one of the things I was struggling with was how to identify resources. The issue is that if you want to attach an authorization policy to something, you need to specify what that something is.

I was struggling with various ways of doing that, and Brad's idea was to just have it be a bunch of name-value pairs, almost like an LDAP search filter. And so that really helped. That was Brad's contribution to the patent and it was very useful.

Dexter: How did you know to reach out to Brad? I'm sure he was a contact you knew before, but why did you bring that up to him and why did you seek advice?

Dale: Well, because I knew him through the UAG team. The UAG team worked within the same organization (EUC and Workspace ONE) and I was visiting their office.
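Two ideas from Dale's account lend themselves to a small illustration: identifying resources by name-value pairs (Brad's contribution), and letting an administrator cap an installed application at read-only access on the authorization server. The Python sketch below is only a toy model of those two ideas. It is not the patented design or the CSP implementation, it leaves out OAuth2 token issuance entirely, and every class, method, and attribute name in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy: grants actions on resources matched by name-value pairs."""
    selector: dict   # e.g. {"service": "billing"}; every pair must match the resource
    actions: set     # actions granted on matching resources


@dataclass
class AuthorizationServer:
    """Toy stand-in for an authorization server (not a real OAuth2 server)."""
    policies: dict = field(default_factory=dict)  # app_id -> list of Policy
    caps: dict = field(default_factory=dict)      # app_id -> admin-imposed ceiling

    def add_policy(self, app_id, selector, actions):
        self.policies.setdefault(app_id, []).append(Policy(dict(selector), set(actions)))

    def set_cap(self, app_id, allowed_actions):
        # The administrator's ceiling, e.g. {"read"} for read-only access.
        self.caps[app_id] = set(allowed_actions)

    def authorize(self, app_id, resource, requested):
        """Return the subset of `requested` actions the app may perform on `resource`."""
        granted = set()
        for policy in self.policies.get(app_id, []):
            # A policy applies when all of its name-value pairs appear on the resource.
            if all(resource.get(name) == value for name, value in policy.selector.items()):
                granted |= policy.actions
        cap = self.caps.get(app_id)
        if cap is not None:
            granted &= cap  # no matter what the app asks for, it cannot exceed the cap
        return granted & set(requested)


server = AuthorizationServer()
server.add_policy("reports-app", {"service": "billing"}, {"read", "write"})
server.set_cap("reports-app", {"read"})

billing_db = {"service": "billing", "env": "prod"}
allowed = server.authorize("reports-app", billing_db, {"read", "write"})
print(allowed)  # only the read action survives the cap
```

The point of the sketch is that the cap lives on the server rather than in the application: the application can request whatever it likes, and the intersection with the administrator's ceiling is applied on every decision.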



John DiRico Dale Olds
Dexter: And then with Fanny, I'm going to guess that she's somebody that you work with all the time or somebody you like to work with and so it just naturally came up.

Dale: She was the dev-ops manager in the Workspace ONE Access team at the time. Actually, when this project started up as a part of CSP, because she had been working with me on these concepts, she transitioned to become the tech lead for the GAZ implementation team, the team that actually implemented this stuff for CSP. She led that effort for a few years, then got promoted to senior staff. And then took off to CPBU. I work with Fanny a lot and I would say the vast majority of this was whiteboard discussions. I don't even remember who thought of what.

Dexter: And so that's why she was the second name listed.

Dale: Yup.

Dexter: Did you come up with a proof of concept in July or just the idea?

Dale: It was just the idea, and we filed it.

Dexter: I'm going to guess that you were very familiar with the patent process when you reached out to the legal team for this patent?

Dale: Yeah.

Dexter: And did you work with John before July or was this the first time?

Dale: An interesting part of this conversation is that I work with John a lot, just not on this patent.

Dexter: Oh, okay. So you guys have like a long history.

John: Several years. Yes. This one was actually more broadly applicable, so it was handled by Jim Kiryakoza, who's on my team. But like Dale was saying, most of the rest of Dale's filings have been under me and I've also worked extensively with Dale reviewing invention disclosures. Dale is part of our patent committee that helps us evaluate whether to patent an idea or not when it's a close call for us.

Dexter: So when did y'all start working together?

John: Probably 2017.

Dale: It's been years.

Dexter: Dale, is John your favorite legal person to work with?

Dale: (smiling) Absolutely.

Dexter: Why do you like working with John so much?

Dale: He's practical. In reviewing the invention disclosures, there can be all kinds of issues. I always found John to be very practical. He'll say, "Okay, we don't really have anything here, let's move on quickly," or "Okay, we'll follow up this way," or "There's no reason to have a meeting if we don't have enough material." I appreciate that.

Dexter: Dale, do you find that working with the legal team on a patent filing is like a partnership, in that both parties have to put in a lot to, like, get it through the process? What does this process feel like for you?



Dale: One of the reasons I brought up this particular patent is because I just remember being shocked. I had been listed on some patents before but hadn't really participated in the filing process much. This one was one that I pretty much drove.

At my previous employer, I had filed 15 or so patents and was on their patent review program. Compared to my previous experience, I was shocked how easily and how well things were done at VMware.

On this particular patent (I had never seen anything like this before), our Patent Team brought in an outside patent attorney to interview Fanny and me. We spent one session, lasting a couple of hours, on a whiteboard with him. The outside patent attorney then went off and wrote up the entire patent application! We, of course, reviewed and iterated to make a couple of adjustments. But that was it. Up to that point, I had never experienced such ease when filing for a patent. It was phenomenal.

Dexter: I didn't even know that was an option! Of having somebody interview the inventors and have the interviewer write up the patent. I would have imagined the writing and filing portion to be the most time-consuming part.

Dale: It was great.

John: Actually, that's our standard practice here at VMware.

At VMware, we use outside patent attorneys (who have a long relationship with VMware and are familiar with VMware technologies) who conduct what's called disclosure meetings with inventors after we've decided internally that we're going to file a patent application for an idea. There is usually just one session, and these patent attorneys get all the information necessary to draft the patent application without significant further involvement from the inventors.

Whereas, as I understand it, at other companies, it can require a lot more hands-on involvement from the inventors. Some of the trouble that others run into is that there can be attorneys who don't necessarily understand the technology or that the attorneys didn't sufficiently describe the invention in the patent application. We try to take the burden of drafting the patent application off of the inventor's plate as much as possible.

Dale: Yeah, that's definitely been my experience at VMware and I would encourage anyone that wants to file a patent application to do so, because the patent process at VMware is so well done.

I had a very painful experience at a previous employer with an idea that I thought was the most novel and patentable thing I have ever come up with. For three years, I went in circles between the patent attorneys who didn't understand the idea and the patent examiners who didn't understand the idea. I kept having to update the patent application over and over and at a certain point, I eventually gave up.

But all of the patent filing experiences I've had at VMware have been amazing. The patent attorneys take the time to understand it. They decide how best to support you and it's been very positive.

Patents at VMware: 4 in 25 current R&D employees have filed or have been granted a US patent.

The figure above is a high level block diagram of a computer system configured in accordance with some embodiments of the present disclosure.

Dexter: So VMware has patent review board meetings. Do we need them because we file a certain amount of patents per year and we just want to make sure that each patent we file is worthy of the investment required to file a patent?

John: Yes, we review more potentially patentable ideas, per year, than we have budget for filing them. The majority of the ideas that we patent are ideas that are going into our products. That said, people still do submit ideas that could possibly go into our products in the future or have nothing to do with our current product portfolio, but are good ideas and meet the technical requirements for patenting (being that the idea is novel and non-obvious).

The process for our Patent Team goes something like this. When we receive an invention disclosure, the VMware (internal) patent attorney takes a first pass reviewing it. We then narrow it down based on whether we think there is a chance that it would pass muster under the novel and non-obvious requirement. Then if the idea is clearly something important, we'll approve it and we'll file a patent application with the Patent Office. If we're unsure whether an idea is clearly important, then we bring it to the Patent Committee. The ideas that the Patent Committee goes over are ideas that could go either way. We look for the committee's feedback as they may have better perspective into our product roadmap or technologies in the invention disclosure.

Dexter: John, what is the patent process like at VMware from your point of view?

John: So, there are VMware (internal) patent attorneys, like me, assigned to various technical categories (TCs). So, if the IDF is being submitted from an inventor from the TCs I am assigned to, then I am responsible for reviewing it. In the review, if we decide that the invention's scope is broader or different than the TC under which the IDF was submitted, then we recategorize the IDF under the more appropriate TC and the IDF is reviewed by the appropriate patent attorney for such TC. For the patent we've been talking about with Dale, this exact thing may have happened where the EUC patent attorney reviewed it, found the idea to be more broadly applicable (to Security), and transferred the file to the Security patent attorney.

Dexter: Dale, do you feel pressure to come up with patents? Is there a quota for PEs?

Dale: I would say, I do feel pressure to come up with patents. I don't feel like I come up with enough. I have been asked by various people to encourage people to file patents or file my own, but I don't think that's where the pressure comes from. I think the pressure comes from myself. I believe there's business value to VMware in patents. And while I'm not fond of using software patents as weapons, one of the reasons why I wanted to file this particular patent was that I believed VMware would use it defensively. And if someone else had patented this idea, then we would have to figure out how to deal with that. So, I do feel pressure and I wish I could file more patents.

Dexter: How many patents do you have now?

Dale: The last time I counted I had something like 15.

Dexter: Is there a number of patent filings that you could do for this year where you wouldn't feel pressure to file more patents?

Dale: I think that's why I don't keep track of the filing count. There isn't really a number I'm trying to reach. I'm just trying to protect the territory of our business.

Dexter: John, do we use our patents defensively or offensively?

John: The primary reason that we have a patent program is for defensive purposes. To date, we have not used them as an offensive weapon except as countermeasures in litigation brought against VMware. And we have used them outside of litigation as counter value for licensing deals. So, in a minor way we have used them offensively, but for the most part, our trove of patents dissuades companies from bringing a patent suit against us.

Dexter: Have we had to deploy those tactics against patent trolls?

John: The trouble is that you can't. We can't use them defensively against patent trolls because they're non-practicing. In these situations, we attack the patents that they claim we are infringing.

Dexter: Dale, besides the patent cubes, do you have any other forms of recognition for the patents that you have filed?

Dale: We get a little bonus at some point along the way.

Dexter: For every patent, do the business leaders give you a hug or something when they see you?

Dale: (laughing) Not that I know of, but you know, as soon as I get back to the office, I'll bring it up to Kyle.

Dexter: Once the patent is granted, do you have a little party with the co-inventors of the patent?

Dale: I have not. I know some people do. I know some people display all their patent plaques in their office. I have mine in a box in the attic.

Dexter: You don't have a tattoo of the number of patents?

Dale: No. But going back to the party idea, I'm quite confident Fanny would use any possible excuse for celebration, so maybe we should, at long last, do that.

Dexter: How much work do the co-inventors have to do to be listed on the patent? Does that determine the order of the co-inventors?

Dale: My understanding is that anybody who was there when the discussion of the idea occurred, or contributed in any way at all, should be included in the patent application.

John: Yes, so inventorship is actually a legal definition. And so we have an obligation to the patent office to submit the appropriate names as inventors. The legal definition of an inventor is an individual who contributed to the conception of the claimed idea. It's generally the people who were whiteboarding the solution or design together. Someone who was merely involved with the idea in having helped implement the idea in software, without having contributed to the idea's conception, is not an inventor under the law.

Another common misunderstanding is the relative importance or weight of each inventor listed on the patent. In every patent, every inventor has equal weight, and the listed order has no bearing on the weight they are given under the law. This means there is no concept of "primary" inventors under the law.

Dale: And it has no weight on the bonus, as it is split equally.

John: Right. Yeah, the bonus we give is $1,000 per person at filing and $2,000 at issuance. There is a cap of five for both of those bonuses, which means the filing bonus is capped at $5,000 and the issuance bonus is capped at $10,000. The cap means that if there were more than five inventors, then all inventors must split the bonus equally.

Dale: You know, the lure of being listed first can be used for good. Generally, the person listed first is the one who has spent the most time writing up the IDF and working through the patent process. I've been involved in situations where we've had a junior developer involved with the idea write up the IDF and work with the patent process, as it's good for career development.

Dexter: John, why do you think Dale has such a positive experience working with you?

John: Well, we really try to take everything off of the inventor's plate as much as we can. I think we are just respectful of people's time and we get good results. Because we're selective with the patents we file, we have a pretty good success rate for our filed patents being issued. Another reason that engineers like working with us is that we're engineers. Patent attorneys need to have a hard science or engineering background in order to take the patent bar, so we're a little different than your standard attorneys.

Dexter: What do you find most rewarding about your work?

John: It's rewarding to recognize people's ingenuity. It's nice telling people that they have really good ideas, which is what our process is effectively doing. It's also rewarding to continue to work with inventors, because it keeps us engaged with state-of-the-art technologies.

Dexter: How did you come to specialize in the EUC space?

John: I was at AirWatch for two and a half years before it was acquired by VMware, and I've mostly focused on EUC since!

Dexter: Thank you Dale and John for your time!
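For readers who want to check the bonus arithmetic John describes, here is a minimal Python sketch of one reading of it: the pool grows by the per-person rate for each inventor up to five, and is then split equally. The dollar figures and the cap of five come from the interview. The function name and the exact splitting rule are assumptions made for illustration, not VMware policy text.

```python
FILING_RATE = 1_000    # dollars per inventor at filing (figure from the interview)
ISSUANCE_RATE = 2_000  # dollars per inventor at issuance (figure from the interview)
INVENTOR_CAP = 5       # the pool stops growing after five inventors


def bonus_per_inventor(num_inventors: int, rate: int) -> float:
    """Assumed rule: pool = rate * min(inventors, cap), split equally among all inventors."""
    if num_inventors < 1:
        raise ValueError("a patent needs at least one inventor")
    pool = rate * min(num_inventors, INVENTOR_CAP)
    return pool / num_inventors


# Three inventors, as on this patent: nobody is affected by the cap.
print(bonus_per_inventor(3, FILING_RATE))
# Eight inventors at issuance: the $10,000 pool is shared eight ways.
print(bonus_per_inventor(8, ISSUANCE_RATE))
```

Under this reading, the order in which inventors are listed never enters the calculation, which matches Dale's remark that the bonus is split equally.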



Remote Field Work: Part 2
This image is another dramatization of what the author, Bob Motanagh, looks like while working
in his home office. He's aged a bit since the last issue, probably due to the Buffalo Bills.

Oh no, it's 2AM again and I'm working on yet another project. Not only that, but I'm writing about working at 2AM again.

Talk about déjà vu.

Except this time, the reason I'm up so late isn't by choice. I just got my first COVID-19 vaccine shot (Pfizer) and I can't sleep, which also makes it the perfect time to write! My left arm (where I got the shot) is very sore, and I get a recurring sensation that there's this itch under the skin that I can't scratch, no matter what I do.

It's more of an annoyance, but also just enough of one that it makes it nearly impossible to sleep. I should have taken my friend's advice about having some melatonin before bed the night after I received my first shot.

On the bright side, parts of the world are finally getting vaccination shots, it's 2021, and things might finally (hopefully) be on the up and up. I'm back to share two new interviews I've had with former colleagues of mine who also happen to work in The Field. I kind of feel like the Cryptkeeper (from Tales from the Cryptkeeper) in setting the stage for these interviews...

Not that they're particularly horrifying interviews, I'm just making the comparison due to how I have Tales from the Cryptkeeper on in the background, like a comforting friend spinning weird tales to make the background noise in my house more interesting than just the general drone of the furnace (this is a run on sentence above and really needs to be edited) [Editor's note: Nah].

First is a good friend of mine named Branden Lugabihl. He's a Senior Consultant with VMware's Professional Services Organization (referred to as PSO from here on out) and specializes in network/system security work. He's really handy with NSX (both flavors, the en vogue NSX-T and the ancient forbidden art of NSX-V). We had a fun chat around how things have been going for him during 2020, what engagements are like these days in VMware PSO, and what he's done to stay sane and improve his workspace and general outlook on day-to-day work activities.

Second is another old friend of mine named Benoit Serratrice. He's a Staff Multi-Cloud Solutions Architect, but also used to work in VMware PSO with me. He's quite knowledgeable in several different disciplines across our various products, but I've always known him as one of the Masterminds of vRealize Orchestration. He's good with vRO. Like really good. He's also had quite the journey as far as where he chooses to live. Prior to his current living situation in Singapore, he lived in the Boston, Massachusetts area and before that in France. We had an interesting chat around how things have been going for him with working in Singapore during the pandemic, as well as what his current office situation is like and how he's juggling working at home with his newfound duties of being a relatively new dad.

Let us jump right into it with Branden, here we go!



Branden Lugabihl (@blugabihl)
Sr. Consultant - Networking and Security

This is the very beginning of your direct message history with @blugabihl

Bob Motanagh
HI BRANDEN!

Branden Lugabihl
HI BOB!

Bob Motanagh
RANDALL¹! MY DUDE!

Branden Lugabihl
Yup there he is, haha.
Randy.jpg

Bob Motanagh
Anyways, I did this with Mr. Meulemans a couple of months ago, so now it's your turn. I know you're in PSO (Professional Services Organization), and I'm still of the belief that a great resource to ask about the subject of our interview (working from home/remotely) are the folks who have been doing it for years now: the PSO consultants and architects. You're a Senior Consultant with PSO, correct Mr. Lugabihl?

Branden Lugabihl
That is correct!

Bob Motanagh
Let me ask you this first. Have you done anything new/different with your office space since things got bad with the pandemic? Anything that you're proud of?

Branden Lugabihl
I would say I completely redid my office. This used to be my bedroom, which is now a dedicated office, and my new bedroom is a room with no TVs, no distractions, or anything, just a bed. I've completely segregated the functions of those two rooms, which has been a big help. I've got a couch in my office to keep it somewhat relaxed, as I've got to have some warmth in my office; I can have 12–16 hour days at times depending on the customer or situation.

Bob Motanagh
I know how that is with PSO work, good sir. There's a lot of hard work that gets done by you and your colleagues. I also noticed a nice-looking cat tree in the background there, is that new?

Branden Lugabihl
Yeah! I wanted Randy to be able to hang out with me while I'm working so I got a nice cat tree so he can sit down and rest during the day.

Bob Motanagh
Any new physical equipment on your end?

Branden Lugabihl
Yeah, I moved to a wireless mouse, and I got a really nice Drop ALT keyboard. I also changed up my desk setup to something nicer with more space. I went through two EVO desks due to issues with the desktop itself. This one I've got here is just two inches of straight birch wood, purchased from Home Depot at a cheap price. It's a much nicer surface than what I had originally, and I saved a ton of money doing it myself.

Bob Motanagh
NICE. That's the good stuff right there. Next up, have you adopted any new applications or methods for task tracking or time tracking since things got bad with the pandemic?

Branden Lugabihl
I've been doing a lot of the bullet journal stuff. I'm not following the exact methodology for doing bullets per task tracking, but I've got a bunch of notebooks/journals for keeping track of tasks I want to accomplish. I draw every day too, which is something I'm pretty proud of.

¹ Randy is Branden's cat



Branden Lugabihl
In the pandemic it’s easy to put stuff off but it helps a ton to keep track of things in my journals. And the drawings help a ton as it helps me stay on task; it keeps my mind busy so that I don’t just get stuck in this anxiety loop of not getting things done.
drawings.jpg

Bob Motanagh
Oh yeah, I’m right there with you on the anxiety part.

Branden Lugabihl
Yeah, the journaling helps a lot as I get a list of accomplishments of things that I got done that day. And it’s been helping across several different aspects, from physical and mental health to time tracking the stuff I do for work.

Bob Motanagh
It helps so much. I’ve been using Things for iOS/Mac myself to do some of the same tracking, and on those days where it feels like you’ve been busy without getting anything done, seeing that list of things you have gotten done fights back against those feelings. Plus, time in general in situations like this pandemic just flies by. Days go by, and before you know it something that you thought happened a day ago was 2 weeks ago. During my time in PSO all the travel made time just collapse into this weird state where I would see old friends and what felt to me like just a few months was something like 2-3 years. I think a lot of folks are going to be experiencing that now as the social restrictions around the pandemic ease up.
So, anything else you’ve found that helps things at all?

Branden Lugabihl
I miss our Jackbox Games sessions.

Bob Motanagh
Me too!

Branden Lugabihl
Even if it’s one Friday a month, it really helped.

Bob Motanagh
Even though we talked about silly stuff, it was super nice to see and talk to folks again. I think it fell apart because folks had busy Fridays.

Branden Lugabihl
I’ve gotten into some small vegetable farming too, that’s helped me get up in the morning. I got a lemon tree, and I’m growing some spices. It’s a nice hobby that I’ve gotten into.

Bob Motanagh
That’s such a good idea, plus it gets you out of the house which is super nice. OK! Something new I’m trying here. I’ve noticed over the pandemic that I’ve gotten into a horrible habit of leaving browser windows open with a gratuitous number of tabs. On average, how many tabs do you think you leave open yourself?

Branden Lugabihl
Average browser window? Probably 10 tabs per window, but I’ll have 5-6 windows open at a time.

Bob Motanagh
I’m not looking to tab-shame here, as I currently have a window with… [counts] 32 tabs open. You know what’s bad for me? When I think that “Hey, I can still see text in the tab” is still a small number of tabs. I only identify it as being a bad situation when I can only see the site favicon in the tab. Anything else you want to share before we go?

Branden Lugabihl
Yeah dude, this pandemic stuff is lonely. I got Randy here, but gosh I can’t wait till this is over.
Randy.jpg

Bob Motanagh
Well thanks Branden and thank you Randy! Hopefully we see the both of you in the #cats channel soon.



Benoit Serratrice

Benoit Serratrice

benoit

Staff Cloud Solutions Architect

This is the very beginning of your direct message history with @benoit

Bob Motanagh
Alright bud, here we go!

Benoit Serratrice
Ok so I’ll be answering in French so you can practice your French?

Bob Motanagh
Eh… I’m not all that great with French. Care to share with me your official title, preferably in English?

Benoit Serratrice
Sure, yes, let’s get serious first. So, I think my official title is Staff Cloud Solution… wait… no… not that… it changed… hold on I need to check out Workday. Staff Multi-Cloud Solution... I mean Solutions (plural!) Architect. Staff Multi-Cloud Solutions Architect. It’s a global team and we’re spread across the world, supporting customers and partners with their journey into multi-cloud. Last year, I was primarily helping our VCPP (VMware Cloud Provider Program) customers but this year, it’s those same partners adapting to this new multi-cloud world.
glasses.jpg

Bob Motanagh
Hey, look at us, former PSO folks working in the Cloud field! We’re pretty much aligned in the same vertical. What a coincidence. Hey, are you working alone in your room?

Benoit Serratrice
Oh no, my wife works in the same room I do…

Bob Motanagh
Hello!

Benoit Serratrice
Yeah, she’s working while doing the treadmill. It’s a standing desk with a treadmill.

Bob Motanagh
Oh, super cool. That is awesome.

Benoit Serratrice
Yeah, she’ll log something like thirty thousand steps a day on that.

Bob Motanagh
Oh wow! That’s impressive. So, what about you? Are you also using a standing desk?

Benoit Serratrice
I am!

Bob Motanagh
Fantastic, so am I. Speaking of, according to my Apple Watch it’s time to stand, so let’s switch to standing mode here. Do you have a treadmill as well?

Benoit Serratrice
Oh no, I’ve got a bike trainer here I’ve been using. So, I’ll go from sitting down working, to standing up working, to going on the bike whenever I have to watch videos for recorded meetings or training or the like. Because I’m in Singapore, I must often watch recorded meetings, so I figure I might as well take that time and get something like a workout going on!
bike_trainer.jpg



Bob Motanagh
Besides getting some good exercise time, I see you’ve got your standing desk… mind if I ask if you ended up using your Wellness Benefit for either of those?

Benoit Serratrice
Actually, I ended up using that to get a (non-trainer) bike!

Bob Motanagh
Me too! I finally get to ride on the trails around South Denver! It’s been fantastic for my physical and mental health. So, I don’t know if you’re still doing vRealize Orchestrator-related work or not, but are you involved with anything else?

Benoit Serratrice
Yeah, I do still work with vRO! I work a lot with Field Enablement; getting labs together for SEs around the world so they can demo apps/environments in OneCloud. It’s not a lot of fun for SEs to spend their valuable time building these environments from scratch, so I’ve been writing a lot of code to spin up OneCloud labs from scratch. In addition to that, I’ve also been automating some other things in my life, like watching the availability of certain restaurants here in Singapore to see if there are openings for reservations.

Bob Motanagh
Wait… you’ve automated the task of watching for open reservation spots in restaurants?

Benoit Serratrice
Yeah, haha.

Bob Motanagh
That’s incredible. You’re a wizard, Benoit.

Benoit Serratrice
I’ve also been doing a lot of code for some of my photo management here as well, with some of my free time.

Bob Motanagh
Ok, I’ve got two other major questions here. I know you code a lot of your own tools, but are there any apps in specific that you don’t write yourself that you find yourself using?

Benoit Serratrice
Slack and email, really, those are my two biggest communication tools. For family, WhatsApp is a popular chat messaging app that’s used here in Singapore and elsewhere in Asia. I know you asked about other apps that others develop, and… I’ve also been working on something of a personal app that will allow me to enter my tasks completed for work so that they’re automatically posted in the various places that I must report my work.

Bob Motanagh
…Of course you would be doing that. Again, you’re a wizard, Benoit. Ok, I lied, I’ve got two more questions for you, but they’re easy. What are your favorite Slack channels, and on average, how many tabs do you tend to have open in each browser window?

Benoit Serratrice
My two favorite ones are the #weekendarchitects and the #cloud-gardening channels. The second one is a bunch of folks talking about gardening.

Bob Motanagh
I’ve never even heard of those, that’s cool.

Benoit Serratrice
As for tabs, on average, I probably have something like 20-30 at a time.

Bob Motanagh
My man! I think my record is something like 65. I’m really bad at cleaning up after myself when it comes to open tabs. Anything else you’d like to add about working during the pandemic?

Benoit Serratrice
I’ve been able to connect with people across the world a bit more easily due to how everyone is home, and no one is travelling. Besides that, I hope things get better soon, as I just find myself feeling more exhausted more often lately. I can’t explain why, but it’s something my wife and I both experience.

Bob Motanagh
Well, I hope you get some sleep, Mr. Serratrice, and thank you so much for your time!



Mark Meulemans speaks at the Mental Wellness at VMware Tech Talk (https://tech-talks.eng.vmware.com/talks/mental-wellness-at-vmware).

Talk about iterating and improving on the previous version of this article. There are two interviewees this time! Before I let you fine folks go, however, I’d like to share some more observations I have had since I wrote the last article.

I mentioned the Wellness Benefit in my interview with Benoit, and I just want to emphasize how amazing that benefit is. I used mine to get a nice hybrid road/trail bicycle that’s helped me get out of the house and see the great outdoors here in Colorado. That’s something that’s been great for my health, but I’ve seen colleagues use their benefits for everything from home-labs to gaming consoles. If it helps you, it’s something you should investigate.

In my last article I mentioned my reMarkable 2 tablet. I’m still using that, and it helped me plan this very article! I’ve got another tool I’ve been using, which I bought with my 2020 Wellness Benefit: the Freewrite Traveler. It’s a portable typewriter with an ePaper screen. It has insanely long battery life and it’s very portable. I bring it with me on bike rides when I find somewhere comfortable and secluded to get some good writing done. It can sync over Wi-Fi and doesn’t have a web browser or any other functionality. For someone with ADHD (such as myself), the intentionally limited functionality of devices like the Freewrite Traveler and the reMarkable tablet has been a big help.

I also wanted to give a quick shout out to my colleague (and former interview subject) Mark Meulemans for his participation in the Mental Health Tech Talk that occurred earlier in the year. He did a fantastic job with that. Mental health is something that everyone should pay attention to, especially given the conditions of the pandemic. I know it’s mentioned quite often, but the symptoms of mental health issues can really sneak up on a person, and they’re not exactly a lot of fun to deal with.

Lastly, I wanted to thank my boss and colleagues from Sofia: Vlad, Georgi, Atanas, Borisov, and Ivo. They’ve been a very supportive team to me during these weird times, and I’m appreciative of it. Oh and… go Bills, they were so close this year.

Bob Motanagh is a Senior Solutions Architect in the Cloud Services Business Unit working on Cloud Director Availability. He’s a die-hard Buffalo Bills fan and is (still) secretly wishing that this is the year they win it all. You can reach him at bmotanagh@vmware.com.



A Data Analytics Platform on VMware Private Cloud

Author: Rumen Barov | Editor: Ben Duong, Dexter Arver
Photo by panumas nikhomkhai from Pexels.com

Introduction
Digital innovation goes hand in hand with data.1 The chart below2 shows the ubiquitous data growth over the last decade and projections for the next few years. The numbers are mind-boggling.

1 https://www.gartner.com/smarterwithgartner/cio-agenda-2021-prepare-for-increased-digital-innovation/
2 From https://www.youtube.com/watch?v=eHTCR1BDhhA

Take the example of Super Collider, VMware’s internal analytics service. The system has had a steady 3x annual data volume growth for the last 7-8 years.
The more data you have, the more you can automate business value. However, more data means more cost. To achieve a positive return on investment on data, you need a data system whose value grows faster than its cost. Extracting value from data is possible when organizations have a data system that allows efficient services to be built on top of it.

What is Super Collider?
Super Collider is an elastic, opinionated, self-service analytics platform with data lake and data warehousing capabilities. The platform is built and operated within VMware and is intended for internal use; every VMware employee has access. To really understand what Super Collider is, let’s go through its history and architecture, and then drill down into implementation details, past and present challenges, and some use cases.

Super Collider History
It was clear from the beginning that VMware Cloud (VMC) would be unlike on-premises software, as the SaaS world is extremely dynamic, and navigating through it requires frequent and swift data-driven decisions. Back then, inside VMware, there were a few sufficiently mature data systems. One was a data warehouse built in-house containing back office, sales, customer, and support info. The other system was an in-house data lake containing primarily product telemetry from all VMware products, with a bi-directional data pipe used by on-premises products for cloud-assisted features.

For VMC to start working with the two systems would have required adapting multiple processes and different toolsets. Moreover, they needed data-driven decisions ASAP, so they decided to quickly set up a new Amazon Redshift-based system to work with the new SaaS data. This approach initially worked well. However, when data shows its value, the hunger for more data grows, and very soon the VMC team found themselves in a position where instead of 2 systems, they needed to work with 3, delaying the time-to-market of their analytics and worsening the company data segregation problem. At the same time, VMC’s need for a fresh and clear customer view was pressing. The needs for unified access to customer, product, and SaaS (and all other kinds of data) collided, hence the name Super Collider. Super Collider inherited the Amazon-based system with SaaS data, and the in-house system with product data. The increased adoption of Super Collider increased data variety and volume, and also increased the need for real-time analytics and AI. And with thousands of active users from all VMware departments (full-time data engineers, data scientists, software developers, managers, sales, executives, etc.) came a growing need for collaboration capabilities, data management, governance, and self-service capabilities. In Super Collider, we merged the data and capabilities of the Redshift-based system into the in-house lake-based system and added more self-service tooling.

So, we’ve built Super Collider to serve as a platform that enables every team at VMware to innovate and build data features. A few examples of features built on top of the Super Collider platform are the following:
• The VMware Cloud team relies on machine-learned algorithms to predict hardware capacity needed by VMware Cloud on Amazon, which allows VMware to purchase hardware capacity early, at a better price.
• Another example is vSphere Health, a powerful feature of vSphere 6.7 that works to identify and resolve potential issues before they have an impact on a customer’s environment. Resolutions
and recommendations are provided to guide the customer through remediation.

What is unique about Super Collider
In Buddhism, nirvana is the ultimate goal: it marks the end of the vicious cycle of doing things the wrong way over and over again. In Super Collider we define the nirvana of the data world to be data democratization. The goal of data democratization is to have anybody use data at any time to make decisions with no barriers to access or understanding; it is an easy way for people to understand the data so that they can use it to expedite decision-making and uncover opportunities for an organization; it is an enabler for x-analytics and decision intelligence.2 In short, Super Collider is an opinionated, multi-tenant, self-service data platform that promotes data mesh principles. Multi-tenancy encourages decentralized data ownership by domains. Data domains promote data quality and treating data as a product; when data is a product, owners do their best to make their customers (data users) happy. And being self-service allows for unobstructed interactions between users and owners. Last but not least, the platform facilitates (makes it easy for users but does not enforce) processes and tools aligned with our opinion on how to manage data efficiently, including all aspects of data management and policy compliance.

2 https://www.gartner.com/smarterwithgartner/gartner-top-10-trends-in-data-and-analytics-for-2020/

When the data platform is self-service, it is up to the business to decide how many resources to invest in implementing a given data feature. The platform gives more options for securing the needed data personnel: you can either wait for centralized VMware resources from the IT, data science, and data engineering teams, or leverage team members with data science skills for a faster approach, using the self-service platform to quickly meet your business need. This operating model affects the product, technical, and process decisions we take in Super Collider. Providing such self-service flexibility was a challenging but mandatory step.

Data flow
Let’s look at the history of data solutions. First was the data warehouse, a centrally managed monolithic store for structured data, where data is first cleaned and then organized into marts. Then, Massively Parallel Processing (MPP) architectures allowed for cost-efficient systems that were able to handle larger data sizes, and organizations started building data lakes containing all the raw data. And in recent years the data mesh shifted the focus towards unlocking value out of data by organizing data and processes into bounded data domains.

From an architectural perspective, Super Collider is a structured data lake, but it offers warehousing capabilities too, and the tooling and recommendations we provide facilitate data mesh-like practices: domain-driven pipelines and cross-team collaboration. A typical data flow with Super Collider is shown at the top of this page.

Super Collider users are organized into self-organized teams (so far we have 100+ teams onboarded with the platform). Teams ingest and process data with the end goal of providing usable data and accurate business insights. For regular ingestion and data processing, teams use an in-house built feature called Data Pipelines. Data Pipelines provide an abstraction layer over Super Collider APIs called by the workflow engine, with the goal of making data engineers more efficient (e.g., instead of opening an ODBC connection to Super Collider, dealing with credentials, retries, reconnects, etc., Data Pipelines allows you to just invoke a method called “execute_query”). Everything exposed by Data Pipelines includes built-in monitoring, troubleshooting, and smart notification capabilities. Using Data Pipelines also allows data engineers to experience higher SLAs than they’d get if they were using Super Collider directly. Often the gain is ½ to 1 additional “9” of availability (e.g., from 99.9% original, users get 99.97% actual availability, because Data Pipelines is capable of sophisticated error detection mechanisms and can retry on system-related failures such as network errors).

Screenshot from Super Collider self-service portal at https://supercollider.vmware.com

Teams also manage data visibility; by default all data, both raw and processed, is VMware confidential (visible to all VMware employees). Data is only restricted when policy compliance requires it to be. Every piece of data and every process has exactly one team who owns it. For higher developer productivity and to enable iterative development, Super Collider offers staging and production support where staging and production data is logically separated.

All data (staging, production, lake, and warehouse) can be accessed with a single SQL query, which allows for rapid ad-hoc analysis and hypothesis validation. Although this approach is more expensive to implement (e.g., traditionally, staging and prod are different deployments, accessible via different URLs), it enables extremely fast prototyping and speeds up the overall data management process.

How Super Collider is built
Looking at the big data technology landscape (see image below)3, it is clear that choosing a technology stack for data processing is far from trivial.

3 https://mattturck.com/data2020/

Few organizations have the skills and resources needed to do that, as the data engineering work required to build such applications is daunting. A recent paper by Google highlighted the huge engineering “debt” involved in building and maintaining the platform behind their Machine Learning (ML) services. The vast majority of the “debt” has nothing to do with ML, and it is the choice of technologies that determines the debt.
In Super Collider, we wanted both low cost and the flexibility to change our technical decisions in the future. We continuously challenge our architecture and make changes to it. For example, we moved off Hadoop, moved off Amazon, considered using Snowflake, etc. After a few iterations of that kind, Super Collider is built entirely on open-source and VMware technologies and is deployed in OneCloud, VMware’s internal IaaS Cloud built using VMware Cloud Director and operated by the IT Organization.

Using a proven and readily accessible open-source software (OSS) technology stack gives us flexibility and
ensures the best speed of innovation; this, combined with our expertise in operating open-source software (OSS), keeps cost down. Another significant factor for cost reduction is going to the cloud4 and using cloud-native technologies; we’ve tried public, private, and hybrid, and finally settled on private cloud.

4 https://www.gartner.com/smarterwithgartner/4-trends-impacting-cloud-adoption-in-2020/

Architecture
Speaking of the technology stack, we can define three layers in the Super Collider implementation:
• IaaS stack: the layer that provides all networks, as well as the storage and compute infrastructure.
• PaaS stack: the layer of OSS installed on the IaaS.
• SaaS stack: the layer that stitches everything together and exposes the data capabilities to users (and reduces complexity by hiding implementation details from them). The SaaS layer exposes functionalities like a SQL interface to all data, data pipelines, data lake, data warehousing capabilities, multitenancy, data discovery, governance, etc. For Super Collider users, this layer is the data platform that allows them to do all sorts of analytics.

Before going into details about each of the three layers, let’s highlight some architectural requirements.

A key requirement for the architecture was for the system to be able to handle large amounts of data at a low cost. This requires storage and compute to scale independently. Relational database management systems (RDBMS) at scale are expensive, so we focus on a combination of scalable storage and a query engine.

Another key requirement is for the architecture to be open for extensions. Technologies in the data domain change very rapidly and the migration of petabytes of data is costly, so we need a way to maintain the contracts with our users without moving the data. Also, acquisitions and system mergers are not uncommon. Throughout the history of the system that is now known as Super Collider, we have already merged with a few other systems, including the aforementioned Amazon system with SaaS data. To open Super Collider for other processing engines, the data format needed to be widely supported, so we chose Apache Parquet. Similarly, the interface through which data is exposed needed to be standardized, so we chose HDFS.
OK, let’s get technical. First, we’ll take a look at the high-level architecture of Super Collider and then we’ll get into the next level of detail.

Super Collider high level architecture

The data pipe is Kafka. Kafka consumers are Kubernetes (K8S)-based jobs, primarily Spark jobs. Users can plug in whatever jobs they implement, and those jobs get packaged into containers which Super Collider can run and operate. The results of these jobs can go both to the data lake (for batch analytics) and back to the sender (real-time analytics); data destinations are a matter of user-provided deployment configuration.

Access to the data lake is through Apache Impala, an MPP SQL query engine. The lake provides virtual data warehouse capabilities where users can create data pipelines, usually used for batch processing and ML data preparation, or to maintain high-quality data marts that feed downstream reporting and analytics.

Data Lake and Analytics Stack
For the query engine, we chose Apache Impala. Permission management is solved with Apache Sentry, where group membership (authZ) is provided by FreeIPA. This setup allows us to integrate a Super Collider-internal Kerberos instance (for Super Collider service users) and VMware Corporate LDAP. Another potential option was Apache Ranger; however, we discarded it in favor of simplicity and flexibility. We could have chosen a simpler architecture if our company’s organization was different (e.g., without FreeIPA, or without Kerberos, or without both).

Workflows are deployed to in-house K8S clusters and the technology of choice for workflow management is
Apache Airflow, which is exposed to users. Metadata management is delegated to Apache Atlas.
Speaking about technologies: a noteworthy recent optimization is the migration from Hadoop YARN. Because we had lots of very simple data pipelines, where each pipeline needs a fraction of the resources that YARN needs to run the job, migrating our data pipelines from YARN to K8S saved ~90% of the used CPU cores, and ~80% of the used memory. For example, here’s the memory and CPU consumption of the vSphere Health data jobs before and after the migration:

                  Streaming workflows   Streaming workflows   Reduction   Reduction %
                  v1 (YARN)             v2 (K8S)
# of CPU cores    538                   40                    498         92%
RAM (GB)          187                   30                    157         84%

Infrastructure: Administration model
For IaaS, we use OneCloud. The OneCloud team is responsible for provisioning hardware and for running (the operations of) the IaaS Cloud.
The Super Collider team is responsible for deploying the Super Collider analytics stack in VMs and to containers on the IaaS Cloud. Terraform is used to manage VM lifecycle, and Ansible for in-VM configurations.

Infrastructure: IaaS
A few words about OneCloud, the private VMware cloud that Super Collider relies on. If you have heard of Cloud Director, then you already know what OneCloud is. VMware provides datacenter design recommendations to help its partners build and operate their own VMware-powered cloud. To validate and test these designs, VMware actually operates large private clouds to run production workloads within the company. In other words, OneCloud is not something proprietary; it is just an IaaS technology stack built and operated by VMware for VMware. VMware customers can spawn their own OneCloud-like systems.

Super Collider benefited from the latest vSphere ecosystem version. For example, the optimizations in vSAN and NSX helped squeeze the best performance out of Impala, while vSphere offered native container support for the K8S workloads.

Now, let’s drill down into compute, storage, and network.

Infrastructure: Compute
The Super Collider team optimized VM CPU utilization by running several less-CPU-intensive OSS services in a single VM (e.g., Hue and Zookeeper). The team also combines CPU-intensive and non-CPU-intensive machines on the same physical host using VM affinity rules. To increase performance of the memory-hungry Impala, we disabled memory ballooning and memory overcommitment on the VMs that host Impala.
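
As a quick arithmetic check of the YARN-to-K8S migration table earlier in this section, the sketch below recomputes the reduction columns from the v1 and v2 figures. The input numbers are copied from the table; the function and variable names are illustrative only and are not part of Super Collider.

```python
def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute reduction, reduction as a percentage of `before`)."""
    saved = before - after
    return saved, 100.0 * saved / before


# Figures from the vSphere Health streaming-workflow table (v1 = YARN, v2 = K8S).
usage = {
    "CPU cores": (538, 40),
    "RAM (GB)": (187, 30),
}

for resource, (v1, v2) in usage.items():
    saved, pct = reduction(v1, v2)
    print(f"{resource}: {v1} -> {v2}, saved {saved} ({pct:.1f}%)")
```

The computed reductions are 498 cores (about 92.6%) and 157 GB (about 84.0%), in line with the roughly 92% and 84% reported in the table.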



These kinds of configurations allowed us to achieve the desired balance between performance, stability, and cost.

As expected, Impala with its metadata cache was the heaviest memory consumer, followed by Kafka, the backbone of the data pipe. Real-time data processing workflows are #3 in the list, and this was expected, as they work with data that is not yet on HDFS.

Infrastructure: Storage
The data lake storage file system is HDFS powered by Dell EMC Isilon (provided by OneCloud). The main reason for this choice is lower cost without sacrificing scalability. For example, the TCO of Amazon S3 (widely used for data lakes) is a sum of 5 separate components, one of them being storage per GB. This single cost component (leaving the other 4 aside) is comparable to OneCloud’s TCO per GB. And unlike Amazon, OneCloud does not charge us for requests and data retrievals, for management and analytics features, or for data transfer (except for the overall network cost).

The donut graph (above right) shows how Super Collider services use block storage. The largest consumer of block storage is the data pipe. Since we do not want to lose data in the pipe (Kafka), we rely on Kafka to handle redundancy, and we provide Kafka with storage configured by OneCloud for low latency. The second largest consumer is Impala, which does aggressive caching (ephemeral writes). To eliminate the replication load on the vSAN from those ephemeral writes, we use vSAN Direct Configuration™.

Infrastructure: Network
For inbound Internet traffic (e.g., VMware SaaS and products deployed at customer premises pushing telemetry data to Super Collider), we tried a few CDN providers and also experimented with going without a CDN. Finally, we chose Akamai services, mainly for DDoS protection, invalid request filtering, and rate control.

Every big data platform is extremely I/O- and network-intensive. The generic environments offered by cloud vendors are not optimized for such scenarios. Grouping workloads for network proximity at both the virtual and the physical level is critical for performance and cost optimization.

Using the latest version of NSX-T Data Center allows the segmentation to be configured across SDDCs, which allows the network to scale beyond the limits of a single SDDC. This option to span across SDDCs also allows for network maintenance without downtime.

Requesting a dedicated physical environment and hosts that are physically close to each other may be possible with some cloud vendors, but it comes at a higher cost. Here again we benefit from using a private cloud. With OneCloud, we got this feature almost for free: we only had to explain our needs early enough to allow the OneCloud team to plan accordingly.

Infrastructure: Operations
Super Collider offers decentralization of data management, which creates challenges in resource planning and capacity utilization. At the same time, Super Collider is a centrally funded system, and the efficient
resource and capacity management is a requirement. On an aggregate level, however, the growth curve looks smooth, and we have been able to predict its growth accurately enough to be able to manage our budget.
With more than a hundred teams using the platform, it is not uncommon for a team to double their load on the system within a few days. And especially during development, it is not uncommon for a query or a data pipeline to be implemented in a way that is very resource-intensive. To help users utilize the resources efficiently, Super Collider offers showback capabilities that enable data engineers to optimize their workloads for capacity, thus lowering the overall operational cost.

Still, Super Collider is a self-service platform that allows any VMware employee to implement any kind of data application. This opens the door to congestion points and resource starvation. To solve the noisy-neighbor problem, we rely heavily on Kubernetes namespaces. For example, the data pipelines of one team can be physically separated from the data pipelines of another team. On the SQL side, we rely on Impala resource pools, and for the network we use VMware NSX network segmentation when needed.

each panel is clickable and allows drill-down to ease troubleshooting. That main Grafana dashboard is projected on a large central display in the office.

Past and future considerations
In the above sections we explained how our system is built. In this section we want to shed some light on the reasoning behind some of our architectural decisions. Each decision below is a candidate for a separate article itself, as the number of variables evaluated in order to get to the final decision was significant. To mitigate that complexity, for every significant decision we try to do our best to convert everything to a dollar cost in 1, 2, and 5 year segments. For simplicity we’ll only outline key facts that steered our decision in one direction or another.

Past: Cloud-native vs hybrid vs private cloud
In 2017 it was time to decide the Super Collider deployment model for the next 5 years. With that we also had to choose a vendor for lake and warehousing capabilities. The three options for vendors were Amazon, Snowflake, and an in-house built system based on open-source software.
Amazon vs. Snowflake: From a product perspective,
Infrastructure: Support and availability Snowflake is an integrated solution with streamlined
Since Super Collider is used as a platform for in-prod- usage practices. Their vision is aligned with our needs
uct and business-critical features, high availability for integrated and democratized lake and warehouse
of the system and 24x7 support are a must. Level1 capabilities. On the other hand, Amazon, with its
support is provided by a central incident management multiple data services, is like LEGO®—the way you as-
team and Super Collider software and infrastructure semble blocks determines your solution and cost.
engineers across the globe take shifts to provide 24x7
Level2 support. Let’s start with Snowflake—the cost of Snowflake, at
the scale we were in 2017, was actually a bit lower than
In Super Collider, a “data loss” incident is considered to the on-prem variant. However, Snowflake did not offer
be P0, and we have designed our system architecture an on-prem deployment, so we had to add the cost of
and backup/restore strategy in a way that prevents migrating existing workloads (both VMware-internal
data loss even when there is a failure in a given part and workloads we had on Amazon) to the total cost.
of the system. For example, Kafka topics are persisted Also, given that Snowflake data format is proprietary,
on SSD, so that if a given broker fails, data is not lost. we had to calculate the cost of migrating off Snowflake.
In addition, a backup of Kafka is kept for a week. This This made the total cost of Snowflake higher than the
allows us to restore and replay data in case of a failure cost of the on-prem deployment of an OSS stack.
in downstream data processor, even when the prob- With Amazon, the calculation was not that simple
lem was found hours or days later. With regards to because we had to validate and estimate several com-
monitoring, we initially we started with Grafana and binations of technologies including Presto, Athena,
InfluxDB, but recently we find ourselves using the Pro- Kinesis, etc. For our use-cases, Redshift Spectrum +
metheus stack more, mostly because of its K8S integra- Glue was the winning combination, and still, TCO of
tion. Every component is being monitored in isolation Amazon-based stack was significantly higher than
and via end-to-end monitoring. A main single-page running OSS-based stack deployed and operated at
Grafana dashboard shows the overall system stability VMware.
by user-facing components (e.g., SQL endpoint, Data
Pipelines, ingestion endpoint, control plane, etc.), and
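The cost reasoning described above (recurring run cost, plus the one-time cost of migrating workloads onto a platform, plus the cost of migrating off a proprietary format, evaluated over 1, 2, and 5 years) can be sketched in a few lines of Python. This is only an illustration: the option names, the cost figures, and the helper functions are hypothetical and are not the numbers or the tooling the Super Collider team used.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One deployment candidate. All figures are made-up cost units."""
    name: str
    annual_run_cost: float          # recurring cost per year
    migrate_in_cost: float = 0.0    # one-time cost of moving workloads onto it
    migrate_off_cost: float = 0.0   # one-time exit cost (e.g. proprietary format)


def total_cost(option: Option, years: int) -> float:
    """Total cost of ownership over the given horizon, including migrations."""
    return (option.annual_run_cost * years
            + option.migrate_in_cost
            + option.migrate_off_cost)


def cheapest_by_horizon(options, horizons=(1, 2, 5)):
    """Map each horizon (in years) to the name of the cheapest option."""
    return {h: min(options, key=lambda o: total_cost(o, h)).name
            for h in horizons}


# Hypothetical inputs: the managed warehouse is cheaper to run per year,
# but carries migration-in and migration-off costs.
OPTIONS = [
    Option("managed warehouse", annual_run_cost=90, migrate_in_cost=40, migrate_off_cost=40),
    Option("on-prem OSS stack", annual_run_cost=100),
    Option("assembled cloud services", annual_run_cost=150, migrate_in_cost=20),
]

if __name__ == "__main__":
    for horizon, winner in cheapest_by_horizon(OPTIONS).items():
        print(f"{horizon}-year horizon: {winner}")
```

With these made-up inputs the lower yearly run cost of the managed option does not offset its migration costs within the 1, 2, or 5 year horizons, which mirrors the shape of the argument in the text; a longer horizon or different inputs could change the result.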



Top-left: Rumen Barov, the author of this article. Top-right: The Super Collider Team.

Past: Hadoop
We had 2 main challenges with Hadoop-based components: HDFS and YARN.

With HDFS, we used to run a version that was inefficient above a few million files. This challenge was addressed in newer HDFS versions, so we upgraded (back then it was Hadoop 2.6). The other challenge with HDFS was its cost. HDFS is cheap, but modern technologies can be even cheaper. For example, HDFS was originally designed to run on local drives, but running local drives in the cloud is expensive, so we initially started with EMC VNX (shared iSCSI). Later, with the release of Dell EMC Isilon, we got the same SLAs at 4.5x lower cost, because A) Isilon does not need a 3x replication factor, and B) Isilon is more cost-effective per gigabyte.

We had a challenge with the utilization of memory and CPU resources by YARN-managed workloads. Migrating to K8S resulted in savings of up to 90%, depending on the service being migrated, with an average of 50% overall, thus halving our IaaS resource utilization.

Present: Read uncommitted = no isolation
As mentioned above, we picked a query engine instead of an RDBMS. The problem is that this setup offers no transaction isolation capabilities (transaction isolation level “read uncommitted” as defined by the SQL-92 standard).

In Super Collider we have implemented multiple workarounds to this problem; however, with time and scale those workarounds become costly and less efficient. We are still looking to fill in the gap between what the market offers, file storage, and what we actually need, table storage. And there are solutions out there, usually contributing innovations in the contact area between SQL engine, metadata, and storage. Take a look at Apache Iceberg or Delta Lake, for example.

Present: Data Federation
Data attracts data, hence the term “data gravity.” With current data growth trends (data landscape changes and data sources proliferation) it is impossible for all the data to reside in a single data lake backed by a single file system. Trying to keep Super Collider as a single filesystem-backed data lake is a doomed mission.

There are attempts in different directions that are aimed at providing a federated view over datasets. While distributed query engines like Presto solve the “data read” problem quite well, the challenge with writes and transactions is still a work in progress.

Another approach is data orchestration (e.g., Alluxio). Cloudera has invested in a toolset that makes it easy to copy data with its metadata and permissions, etc. This trivializes use-cases that require a subset of all the data in the lake. Although data copying is a workaround, and not a solution, this workaround is good enough when data volume is in the GB ranges.

What’s next
Bridging the gap between object storage and table storage is a problem that the data world has yet to solve. Solutions like Delta Lake work in particular setups; however, there is no available open data platform yet. Ensuring data and metadata consistency between file system, query engine, and metadata store, without a vendor lock-in, is a billion-dollar challenge and VMware is betting on Project Taurus to solve it. Super Collider and Project Taurus are now “one team, two projects” and we cannot be more excited about the future. Stay tuned!

Rumen Barov is a Staff II engineer in the Cloud Services Business Unit. You can reach him at rburov@vmware.com.



Project Kepler: The Origins of the Anywhere Workspace
Before its external announcement, VMware’s Anywhere Workspace was code named “Project Kepler.” Brianna Blacet sat down with Project Kepler’s leaders, Shawn Bass, VP/CTO, End User Computing, and Craig Connors, VP/CTO Service Provider and Edge BU, to find out why and how this project came into existence.

Brianna: So, Shawn, will you give us the high-level “Project Kepler for Dummies”? How did this project get started?

Shawn: The pandemic was the genesis of how this project came to be. Pre-pandemic, most companies had a remote workforce of roughly 15-20%. But during the pandemic, out of necessity, that number rose to about 90%-100%. That immediate shift caused great disruption for a lot of organizations. How could they make this remote workforce as productive as possible as quickly as possible?

Some organizations struggled with getting devices in people’s hands. For example, if they weren’t laptop users, how could they quickly order everyone laptops or help employees repurpose their BYOD devices? How could they give them remote access to the corporate environment?

Separately, disconnected from the tools that companies used to manage devices, devices fell off visibility from VPNs. Suddenly, they’d have 10,000, 20,000, or 30,000 devices that they couldn’t patch or keep up to date. Patching, visibility, and security became big problems.

During that transition, most organizations either started to increase their VDI capabilities or started rapidly expanding their VPN appliance footprints to manage that change. For most companies, their VPN capacity was designed for a remote workforce of no more than 30%. They struggled with how to meet a 90% remote workforce overnight. Since most VPN technology is based on physical hardware, they struggled to expand capacity. They had issues sourcing more VPN appliances. We began to talk to customers about how to shift to a software-defined model. As long as they had enough compute capacity, that would give them the ability to scale to meet demand.

The other challenge was security: an increase in two attack vectors. Every major vendor of VPN appliances on the market has had a 9.0 or above CVSS vulnerability in their VPNs. Over the last 13 months, they’ve seen their VPN hardware targeted by nation-state attackers. Also, end users were onboarding personal devices, which usually don’t have adequate security software. So, attackers started using those as a pivot point into the internal network.

Brianna: So how did Project Kepler address these issues?

Shawn: I’d like to preface this by saying that the technologies coming together in Project Kepler are designed to become a future work style for organizations going forward. Even as the pandemic wanes, we still expect organizations to be in a heavy remote-work operating model with somewhere between 40% and 60% of the workforce being distributed at all times. We don't think it's ever going to go back to that 15%-20% mark. So it's a very big market opportunity that won’t disappear with COVID. Our former CEO Pat Gelsinger said that, but I’ve heard the same from my conversations with a number of different CIOs. We’re seeing very similar patterns all throughout the industry.

Project Kepler isn’t a product, per se. It’s made up of parts: pieces coming together from three different BUs and integrations between them. But it appears more seamless to the end user. So when they purchase the Anywhere Workspace solution, there will be a client component on their devices, which is the Workspace ONE tunnel technology. That’s what will get a device that is not already connected to the VeloCloud fabric onto the VeloCloud fabric.

Craig: If we think about Secure Access Service Edge (SASE; VMware’s software-defined, or SD-WAN, technology), it's really access technologies and then security technologies. The existing access technology we had for SASE was SD-WAN. So customers had software-defined technologies deployed in



their corporate offices. Although there have been some work-from-home deployments with SD-WAN, to reach a much larger audience of remote workers, we needed a software-based solution for accessing the same fabric. So, really, a lot of different VMware technologies have come together to help us deliver, as a cloud service, a scalable way for these remote-access clients to access that same fabric, as Shawn said, that the SD-WAN branch offices had accessed.

Brianna: So how is this different from other products on the market?

Shawn: I think if you compare it to other market competitors, there are many elements that align to other vendors’ technologies in bits and pieces, if we were to compare some of the device-management elements or zero-trust elements that Workspace ONE brings to the table. There are corollary components available from Microsoft, for example, that are very similar in nature. But the difference is that Microsoft doesn't really have a solution for SASE that brings customers to the cloud and routes them intelligently to the backend (where their applications reside) or offers a way to provide inline security as traffic goes through the SASE PoP and out to the cloud.

So, when you look at these technologies, a customer would have to go and acquire a technology from Microsoft, technology from a cloud-security provider, an endpoint security offering from a third vendor, and then piece together all these components that we're offering in the Anywhere Workspace. We're basically telling customers that you can replace an endpoint-security vendor, a Secure Web Gateway vendor, an SD-WAN vendor, and an endpoint-management vendor with our all-in-one solution offering.

Brianna: Is it one SKU or multiple SKUs?

Shawn: It is offered as a single SKU. But there are different editions or versions of that SKU that will be available. Customers can add more and more functionality.

Brianna: Do we have any customers using it now?

Shawn: There are some customers using an early-access version of secure access, which is just a component that takes the Workspace ONE tunnel piece and brings the customer to the VeloCloud SASE PoP. But that's not the architecture that will be generally available (GA). It's an early form of it that's running in the AWS cloud. It's not meant to be scalable or production-ready.

That early access version was not a full-featured product. We didn't have any of the cloud-security capabilities built into the early secure-access version. It’s just an improved mechanism for intelligent routing, not a set of security technologies delivered from the cloud.

Brianna: So, can this be used easily for BYOD?

Shawn: Yes and no. As I explained, we have the tunnel capabilities today. On Windows devices, there is a capability called “registered



mode.” It doesn’t mean that it’s MDM-controlled, but it does mean that there is a device registration in our console that we can link policy to. So we have that support today for Windows 10’s registered mode. We do not yet have it for Mac, but that is something that we're working on in the Q2-Q3 timeframe. We will be working on it for iOS and Android in the Q3-Q4+ timeframe. By the end of this year, you'll have this capability for almost all of the major operating systems. In the second half of the year, one of the most important things that we will be working on is rehabbing the compliance engine within Workspace ONE to be able to focus more on device posture. This basically refers to our ability to assess the risk of the device and to use that risk in decisions that we make in real time within the Workspace ONE tunnel and within the cloud web-security solution.

There's some machine-learning stuff happening on the backend as part of the Workspace ONE risk analytics. Today, we sample user behavior patterns, both in terms of how they authenticate to Workspace ONE and in terms of device posture. So we look at things like: what is the behavior the user is trying to accomplish? Are there too many failed login attempts? We use that as a mechanism to assess the risk.

We also look at certain things on the device, like the pattern of behavior of application downloads. Is this normal behavior or is it abnormal? We use those factors to compute a risk factor for that user. That risk factor can then be used when the user goes to authenticate the next time. We can use that as a mechanism to determine whether we should permit or deny it. We can also do things in between, such as doing a multi-factor authentication to make the user prove they are who they say they are.

Brianna: Do you guys want to talk more about the architecture?

Shawn: The one piece we haven't talked about is how all these different elements come together. The slide below is the architecture view that shows the endpoints, the SASE PoP, and the elements it’s composed of. It shows what would happen as an end user goes to use different types of websites or SaaS apps, as well as what ultimately happens, in terms of applying policy in the SASE PoP.

Shawn: When I talked earlier about there being two onramps to this SASE PoP architecture, that's represented in the diagram above by those two objects on the left: the SD-WAN gateway and the secure access.

So, functionally, if you've got a branch-office user or a home-office user that has one of the VeloCloud appliances, they're going to get to the SASE PoP through the existing SD-WAN fabric from the appliance. If it's a standalone home user that doesn't have an appliance, or if they're in a hotel or airport or whatever, then they would come through that secure-access offering that Craig talked about earlier. That’s the Workspace ONE tunnel technology. These are kind of the “onramps” into the SASE PoP.

Craig: Yeah. So if you think about it in terms of the VMware vision of “any app, any cloud, any device,” the slide below is our pictorial representation. We have three different types of users now. We have the work-from-anywhere user who's mobile and on the go in Starbucks, in the airport, or wherever they may be. We have the work-from-home user, which is now a much more prevalent part of the workplace, post-pandemic, than ever before. And then we still have work-from-office users. We will always have work-from-office users going forward. We want to make sure that we have a solution for these three different types of users that are accessing the solution.

As Shawn said, on the left side of the slide, we have our two different types of access technologies coming into the SASE PoP. We have secure access, which is the Workspace ONE tunnel service containerized in Kubernetes. It's horizontally scalable and multi-tenant, providing those work-from-anywhere and work-from-



home users that are using Workspace ONE with access from the network side. And then we have our SD-WAN gateways, which are bringing in the work-from-office users and any SD-WAN user who's working from home and has an SD-WAN device in their home, like me, for instance.

These two access technologies give you access to the applications on the right, either through the technology that SD-WAN already provides, in terms of connecting you to your enterprise datacenter, or by routing you through the security technologies on the right to reach those applications. So, with the launch of the secure access in the PoP in June, we will have the cloud web-security offering, as well, which is a secure web gateway functionality for web and SaaS applications.

So, how do I securely access Office 365? What about security techniques provided by cloud web security, such as URL filtering? How do I block gambling sites or adult websites for my work users? We use a cloud access security broker (CASB). How do I protect things like OneDrive and other business applications and provide data-loss prevention (DLP)? How do I make sure that sensitive information, such as social-security numbers, credit card numbers, and things like that, isn't being leaked out of my enterprise? There's also a play for securing enterprise applications, either in the datacenter or running in IaaS, such as AWS or Azure.

Next year, we will also add the NSX firewall-as-a-service, similar to what we did for secure access, in terms of making it a multi-tenant, cloud-scale solution. We have to do the same thing for the NSX firewall. We will add that on the right side of that architecture diagram, for securing the applications on the bottom.

Brianna: Have hybrid and multi-cloud added a layer of complexity?

Craig: One of the things that's kind of important here is this single orchestrator at the top of the diagram. So we're bringing all of these under a single-pane-of-glass configuration. There will always be some things, like mobile-device management, where you have to go to a different portal. But as much as possible, bringing everything under a single pane of glass allows us to simplify the management of this for the users. We already did this for the hybrid network with SD-WAN. But now, bringing the work-from-anywhere users, the cloud web security, and the next-gen firewall under that same simple single pane of glass is what makes it easier for users to operate their networks in a secure way.

Shawn: Historically, what organizations would do in order to secure these things was to route the users’ traffic back to the datacenter utilizing VPN and then leverage various capabilities (proxies and firewalls and things of that nature) to try to provide security. But the problem with that is, as more and more applications go to the cloud and SaaS, routing the users’ traffic to the datacenter and then back out to a public-cloud application adds latency, which impairs user experience. We want to make the user experience better. The only solution available for most VPN technologies is to enable split tunneling and to take that traffic out of the VPN and let it go directly to the cloud. But the downside of taking that traffic out and having it go direct to the cloud is, you lose all the visibility and security that you had previously in the datacenter.

So this notion of security moving out to the edge is that you can still apply all those capabilities: spear-phishing prevention, malicious URLs, DLP, and CASB. Now you're just employing them at the SASE edge and doing it in a way that doesn't impair the user experience, because you're not routing them to one central location to apply those security benefits.

Craig: To Shawn's point, when I had that stack of boxes in the datacenter, if I want to move that to the cloud, it doesn't work if they're all different vendors, because now my traffic is hopping cloud to cloud to cloud. That was before. But what we've done is to put them all in the same PoP. So, it's a true single pass.

Brianna: So now can you tell me a little bit about how this all evolved within the organization and who the players are? How did the project start? Who initiated it and what BUs are involved?

Craig: Great question, although I'm not sure I have a good answer. I mean, I was involved relatively early in the project, but it was kind of after the idea was already born. I think where it really kind of came from was Shankar Iyer, the GM of End-User Computing Business Unit and Sanjay Uppal, the GM of Service Provider & Edge Business Unit. But we're talking about the pandemic response and what needs to change with our technology structure within VMware as a whole, to make our solutions more consumable as a SaaS platform and that type of stuff. Both the VeloCloud business and the EUC business are pretty well-suited to this idea of cloud consumption and subscription consumption. Add that to the pandemic response of people shifting to a distributed workforce model and also our technologies that can help that scenario, it kind of made sense to bring these two together. So, I think it just kind of evolved out of something that naturally made sense for both BUs.
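As a side note on the data-loss prevention (DLP) question raised above (how to keep social-security numbers and credit card numbers from leaking out of the enterprise), the basic idea of content inspection can be sketched as follows. This is a minimal illustration and not a description of how VMware's cloud web security implements DLP: the regular expressions, the function names, and the sample strings are all assumptions made for the example. Candidate card numbers are confirmed with the standard Luhn checksum to cut down on false positives.

```python
import re

# Simplistic, hypothetical patterns: real DLP engines use far richer rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str):
    """Return (kind, matched_text) pairs for SSN-like and card-like strings."""
    findings = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            findings.append(("card", m.group().strip()))
    return findings


if __name__ == "__main__":
    sample = "card 4111 1111 1111 1111 and ssn 123-45-6789"
    for kind, value in find_sensitive(sample):
        print(kind, value)
```

In an edge deployment of the kind described in the interview, a check like this would run inline at the PoP rather than after backhauling traffic to a datacenter; the sketch only shows the matching step, not policy enforcement.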



I talked before about the challenging security dynamics that are happening right now. We had spoken to the Security Business Unit and said “Hey, we've already got existing integrations between Workspace ONE and Carbon Black.” Separately, the security business unit was working closely with NSBU on integration between Carbon Black and NSX and the Lastline technology. And so it kind of made sense that the security elements became a key part of this overall offering.

It was like August or September of 2019 when we first started talking between VeloCloud and NSX about how we do SASE. I remember sitting in a meeting with Tom Gillis (the GM for Network and Security Business Unit or NSBU) and talking about remote-access technologies. We were like, “Oh, you know Workspace ONE has tens of millions of endpoints. We should just use Workspace ONE.” My team’s initial focus was on how we could bring security into the PoP. I thought we’d get to the Workspace ONE piece later, which was being explored in parallel by Pere Monclus’s team under the name Project RAMBOS (Remote and Mobile Branch Office Services). And then the pandemic hit and suddenly, it really accelerated everything we were doing. And that's when those conversations between Sanjay and Shankar happened.

Shawn: That definitely accelerated the whole joint offering. I think there were elements of engineering collaboration before that, but I don't think we had really formulated a solution at that time. With the pandemic, it was like, “Okay, now's the right time. This has got to be produced.” I think it made a lot of people think about their technology offerings in a new light.

We have been extremely busy. I would say that Craig and I have each been doing a huge number of customer-related briefings trying to articulate the vision and strategy behind Project Kepler. It has been one of the most exciting things from a customer lens. More and more customers are asking us about this and it's really coming from two different angles.

There's the business impact of what we're trying to solve with Kepler of making a productive distributed workforce. But an adjunct relationship to this project is that we have a number of customers asking us right now about being able to articulate a pan-VMware zero-trust strategy.

NSBU had been heavily engaged with a large number of customers to be able to articulate what the future of networking and endpoint security could look like, as far as security and authentication. When you start changing the way that you access things from unmanaged devices and new IoT devices, versus the prior world, which was all driven based on corporate-managed devices, a number of things have to change about how you evaluate device posture and risk, and how you respond to changing conditions on an endpoint. And that’s why I would say that the zero-trust elements that factor into Kepler are another area of great acceleration in what we'll do over the next couple of years. There are still some missing technology elements in our stuff today that need further development. One of those is around continuous authentication and continuous device posture. That is something that we're working on for the second half of this year.

But I think that's another area that, from an innovation point of view, we have a lot more work to do between our BUs. So that when we tell a story about zero trust, holistically, for VMware, we're talking about zero trust from the point of view of all our business units, as far as how all those build on each other. Because right now, there are missing pieces of that picture.

Brianna: What are the plans for testing and rollout?

Shawn: That's a great question. There's a lot of work I think that has to happen on testing. The good news is that, at least from an EUC point of view, all the technologies that make up the Anywhere Workspace are tried-and-true existing technologies that tens of thousands of customers are already using today. So this will be an evolution of existing technologies that we already have. The use cases around Anywhere Workspace are some net-new use cases, but it doesn't represent a material change in how the technologies are tested today.

Craig: Earlier, Shawn mentioned our ability to exchange things like posture data in real time. That underscores one of the things we did with our SASE architecture. We use GENEVE tunnels as our service-chaining mechanism between components. One of the things that this allowed us to do was to build simulators for Workspace ONE as an ingress point and Cloud Web Security (CWS) as an egress point. This allows us to automate all of the SASE touchpoints, in terms of all the different ways traffic can come in and out, and make sure that works, independent of Workspace ONE testing and CWS testing, to make sure they are functioning properly. This use of GENEVE as an integration mechanism between our components, as well as making sure that we have a quality solution for testing the different components in isolation, reduces the amount of integration testing and issues. Beta goes live next week.

Brianna: How many customers will be in that beta?



Pictured: Shawn Bass, Craig Connors, Johannes Kepler.

Craig: Approximately 20.

Shawn: It's been a long time coming. Craig, Pere Monclus, and I have been on a large series of calls with J.P. Morgan Chase (JPMC) about their future of work, with regard to next-gen SASE architectures. JPMC represents an interesting opportunity, because they have come to us for a zero-trust approach, inclusive of SASE. They may potentially see us as a replacement for multiple vendors and their existing stack, from the EUC point of view.

They are a Microsoft endpoint-managed solution today. They use Citrix for VDI and app remoting. This project has the capability to come together and give us an opportunity to displace two other competitors in the space, a result of this effort around SASE and zero trust. So it's exciting to us, because it's a customer we’ve been pursuing for probably 10 years. The intersection of these things is finally breaking open the opportunity for us to go in and potentially displace two other competitors.

Brianna: So what else do you think that VMware people should know about what's happening?

Craig: This is a natural evolution of what we're doing. I think one of the things that's really important is that it represents one of the first close collaborations between different BUs, which is not something that historically we've done a great job of. To give an example, somebody from my team just committed code into the EUC repo, which seems like a really small thing, but it's almost unheard of.

So, this close collaboration between BUs, this real “better together” story of “if I add multiple VMware products, I get a better experience,” is something that we've been chasing for a long time and is now, I think, really coming to fruition.

Shawn: I think this SASE and zero-trust notion is starting to make people realize that there is a market for us to be focused on offering a set of things to devices that are managed, where we're simply just assessing posture and using that to inform risk, authentication, and access. It's giving us a new perspective on devices that are either BYOD or devices that are managed by a third-party tool. You know, one of the requests that we have is to be able to have a mechanism of this offering that can be sold into customers that are managing their devices with another competitive technology.

Historically, that would have been something that you would have approached by saying, “Okay, step one is to replace your existing incumbent device management with Workspace ONE. Step two, add SASE.” That is not going to work for a customer who wants to get immediate time to value by deploying the SASE technology. So we're reworking some of the unmanaged or third-party-managed capabilities in Workspace ONE to be able to say, “I want to unlock SASE capability, even if Workspace ONE is not the management agent.” The long-term destination, of course, is for us to capture that customer as a Workspace ONE managed device, but that can be sold at a later date, after we've given them value immediately through the SASE route to market. We can then work on trying to say, “Hey, now that you own this component of VMware technology, let us show you the full capability when all these things are working together.”

It's kind of like a seed strategy of “how do we quickly get them some value and then use that to layer on more of the capabilities of the full-platform approach?” That's where Craig talked earlier about how this integration of our two technologies is sort of that launching point to give customers value. But we want to be able to build on it with more capabilities as time goes on.

Brianna: So, anything else about the future of this project that we should know?

Shawn: Yeah, I would say one of the key things, and I kind of already touched on this earlier in the interview, is that this is really changing the way we do

VMWARE CONFIDENTIAL INNOVATE | JUNE 2021 | 51


endpoint device posture. This envision this just as delivering the Hungry for more information about the
is something we're targeting to four core sets of SASE features that Anywhere Workspace? Read this blog1
deliver in the second half of this we showed on that other slide. We by Shawn Bass and watch the recording
year. As I mentioned, what we're envision this as delivering a lot of the online event held on May 5th and
developing is the ability to do device- more features in the future. So, you 6th: “Leading Change: Build Trust with
posture checking on devices that know, there are additional security the Anywhere Workspace.”2
we're not managing, and, more functions that will come in SASE,
importantly, able to do continuous like the remote-browser isolation
verification of device posture and that Shawn mentioned. Something
use that continuous verification that may be more interesting is
as a checkpoint. So while the user that this same platform gives us a
is connected, if some other device way of bringing in users securely
deviates from what we consider from anywhere to a datacenter and
“known good” or an acceptable delivering services on top of this
posture, we then have the ability framework. You can imagine that
within the tunnel used by the SASE there's an endless array of services
PoP to be able to enforce a disconnect that we could deliver: DNS security,
of that session midstream. IP address management (IPAM), edge
computing…lots of other things that
Craig: The slide above may be will deliver out of the SASE PoPs and
interesting, because it shows all the continue to expand the value of this This interview was conducted by
different components. All of this runs offering going forward. Brianna Blacet. She is an Innovation
on top of the VMware infrastructure Storyteller working in the Office of the
stack in our PoPs—ESX, NSX, and Brianna: Well, it seems like we have CTO. You can reach her on Slack or at
Tanzu. Then you have SD WAN secure arrived at the end. Thank you guys so bblacet@vmware.com.
access from Workspace ONE. So much for taking this time to explain
almost every group within VMware it all. 1 https://blogs.vmware.com/euc/2021/04/
is represented here in some way—a anywhere-workspace-announcement-
overview.html
huge portion of the company. 2 https://www.vmware.com/anywhere-
workspace-event.html?src=so_606248b47ee86
In terms of the future, we don't &cid=7012H000001l67p



Objectives & Key Results

A Short How-to (or How I Learned to Stop Worrying and Love OKRs)

Jen Handler
What are OKRs?
OKR stands for Objectives and Key Results. This is a management framework developed by Andy
Grove at Intel1 in the 70s, and it’s been used by many other orgs since (like LinkedIn, Oracle,
GE, and Google) to focus teams of people on delivery of high value results. There are two other
people often associated with OKRs:
• John Doerr was one of Grove’s mentees, had a lot of success with OKRs, and wrote a book2
about them
• Christina Wodtke3 is another person who’s gotten a lot of practice with OKRs, and writes a lot
about them

1 https://www.whatmatters.com/articles/the-origin-story/
2 https://www.amazon.com/dp/B078FZ9SYB/
3 https://www.amazon.com/dp/0996006087/

Anatomy of an OKR
An Objective is 'what we want to achieve' in support of a mission, within a certain timeframe (3
months is common). Objectives are a bit vague, should be inspiring, and don’t contain numbers.
They require further definition.

A Key Result is a measurable way we’ll know that we’ve achieved our objective. It further defines
our objective, and for every objective, it is common to have 3 key results. Good key results are
leading indicators that we’ve delivered on our objective (what we can measure within the
period), versus lagging (measurements after the period).

A Mission is an important element of the OKR framework, too. Think of this as the culmination
of your OKRs. OKRs represent milestones along the way of achieving your mission.
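
As a rough illustration of this anatomy, here is a minimal Python sketch. The class names (`KeyResult`, `Objective`), the `warnings` helper, and the specific checks are assumptions made up for this example, not part of any OKR tool. They only encode the guidance above: objectives don't contain numbers, they need key results to be measurable, and about 3 key results per objective is common.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyResult:
    """A measurable way to know the objective was achieved (hypothetical model)."""
    description: str
    target: float
    unit: str


@dataclass
class Objective:
    """What we want to achieve within the period; inspiring and number-free."""
    statement: str
    key_results: List[KeyResult] = field(default_factory=list)

    def warnings(self) -> List[str]:
        """Return soft warnings based on the rules of thumb in the text."""
        issues = []
        if any(ch.isdigit() for ch in self.statement):
            issues.append("objective contains a number; move it into a key result")
        if not self.key_results:
            issues.append("objective has no key results, so it cannot be measured")
        elif len(self.key_results) > 3:
            issues.append("more than 3 key results; consider trimming for focus")
        return issues


okr = Objective(
    statement='Launch an awesome "farmer direct" buying experience',
    key_results=[
        KeyResult("Lettuce purchases as share of launch-store revenue", 20, "%"),
        KeyResult("Conversion in every launch store", 75, "%"),
    ],
)
print(okr.warnings())
```

The checks return warnings rather than raising errors, since these are conventions a team may choose to bend.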

Example
Imagine Sweetgreen4, a store that sells salads in the U.S., is launching a new service that enables
customers to buy additional produce, to use at home, from the store. This service will be called
“Farmer Direct,” and the first product sold will be lettuce. Assume Sweetgreen has validated that
customers want to buy more lettuce directly from the store in launch locations, and that they will
launch the service in the first half of this period. Here is one OKR for the quarter:

Mission: To inspire healthier communities by connecting people to real food.

• Objective: Launch an awesome “farmer direct” buying experience
• Key Result 1: Point-of-sale lettuce purchases >20% total revenue for launch stores
• Key Result 2: 75% conversion in every launch store (conversion = oz. lettuce sold/oz. lettuce
supplied)

4 https://www.sweetgreen.com/
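
To make the two key results concrete, here is a small Python sketch of how they could be measured. Everything in it is hypothetical: the `LaunchStore` class, the store names, and the numbers are invented for illustration. It also assumes that Key Result 1 is measured across all launch stores combined and that "75% conversion" means at least 75%; the example above does not spell out either point.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LaunchStore:
    """One launch store's numbers for the period (all values are made up)."""
    name: str
    lettuce_revenue: float  # point-of-sale lettuce purchases, in dollars
    total_revenue: float    # all revenue for the store, in dollars
    oz_sold: float
    oz_supplied: float


def lettuce_revenue_share(stores: List[LaunchStore]) -> float:
    """KR1 metric: lettuce revenue as a fraction of total launch-store revenue."""
    total = sum(s.total_revenue for s in stores)
    return sum(s.lettuce_revenue for s in stores) / total


def conversion(store: LaunchStore) -> float:
    """KR2 metric: oz. lettuce sold / oz. lettuce supplied."""
    return store.oz_sold / store.oz_supplied


def key_results_met(stores: List[LaunchStore],
                    share_target: float = 0.20,
                    conversion_target: float = 0.75) -> Tuple[bool, bool]:
    """Return (KR1 met, KR2 met) for the given launch stores."""
    kr1 = lettuce_revenue_share(stores) > share_target
    kr2 = all(conversion(s) >= conversion_target for s in stores)
    return kr1, kr2


stores = [
    LaunchStore("Store A", 3_000.0, 10_000.0, 800.0, 1_000.0),
    LaunchStore("Store B", 2_000.0, 10_000.0, 700.0, 1_000.0),
]
print(key_results_met(stores))
```

With these invented numbers, the revenue-share result is met but the conversion result is not, because one store falls short. That is the kind of signal a check-in can turn into new actions.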

How to get the most out of OKRs


OKRs can be incredibly focusing and motivating, but they can also be paralyzing and de-
motivating.

OKRs work best when they:


• Provide focus for a team on the results that really matter (because they are the results that
really matter)
• Drive alignment within teams
• Help other teams know what a team’s focused on, at a high level (vs the detail in a backlog)
• Inspire people on the team to reach farther than they might otherwise
• Help teams deliver some amazing results (versus a collection of outputs with unclear results)
• Are something leadership AND the team care about
• Are frequently checked in on, and new actions are generated to achieve the OKRs



OKRs won’t work when they:
• Are totally unachievable given known constraints/conditions. Aspirational is good; totally
unachievable because of [solid reasons] is not good.
• Don’t represent the things that matter most for the team to deliver
• Aren’t focused. Too many OKRs is a bad thing. A good target is 3 OKRs.
• Aren’t measurable (the key results aren’t measurable)
• Are not something leadership AND the team care about
• Are set at the top of the quarter, and forgotten

Key roles
Shepherd: The OKR shepherd looks after the OKR process and manages the team through it. The
OKR shepherd isn’t accountable for all of the OKRs, but the OKR shepherd might be a captain of
an OKR or two.

Captains: Every OKR needs an owner: someone to look after the OKR. This means:
• Measuring KRs
• Ensuring there’s work being done to deliver the OKR
• Influencing others to participate in the work to deliver on the OKR
• Holding planning meetings for the OKR, as needed
• Participating in OKR planning meetings
• Sharing feedback (what’s working, what isn’t working) about the OKR process

Captains are not responsible for doing all the work themselves. The objective is to influence and
enlist others in the work.

The OKR cycle: Generate


Toward the end of an OKR period, we’ll consider the next period’s OKRs. Some may carry over,
others may be new. There are three ways this can work:
• Leadership puts forth suggested OKRs; the team provides feedback; OKRs are refined and set.
    • Works well when the team is not very experienced with OKRs, and/or does not have the
      insight to be able to generate OKRs
• Leadership puts forth suggested objectives only. Team provides feedback, then generates Key
Results collaboratively with leadership. OKRs are set.
    • Works well when team is starting to become more comfortable with OKRs, and has a bit of
      insight to be able to generate OKRs (but not without leadership support)
• The full team collaborates on the generation of OKRs for the period.
    • Works well when the team is experienced with OKRs and has the insight needed to be able
      to generate OKRs

The OKR cycle: Deliver

Once OKRs are generated and set, we move into the Deliver phase. Within this phase, there’s a
repeating cycle through the period that looks like this:

[Figure: a repeating cycle of Act, then Measure & Learn]

The OKR cycle: Deliver: Act


Delivering a key result tends to involve doing a bunch of different things. These things are todos,
aka backlog items. Some may have a more immediate impact than others, and we’d prioritize
those.

The OKR cycle: Deliver: Measure & Learn

Throughout the period, we’ll measure KRs to know how well we’re doing, and to inform our
confidence level in our ability to deliver these results. For quarterly OKRs, it works well to
measure every 1-2 weeks. This is how we know whether new actions are needed. In check-ins, we
can enlist the team to help us figure out what actions to take, and which to take first.

Check-ins
Throughout the period, we need to check in on our OKRs and iterate—either on the OKRs
themselves, if priorities have changed, or on the things we’re doing to deliver OKRs.

How this has worked for us:


• Every two weeks, captains (mandatory) and anyone else interested in joining (optional) will
gather on Monday afternoon ET to talk about how it’s going with our OKRs
• The goal of this meeting is to help one another come up with ideas for how we can get some of
those low confidence scores up
• Captains should come to the meeting prepared with a “confidence score” for each of
their KRs, and be able to talk about the why behind each score (why it’s low, or why it’s high)

How to run a confidence check


Captains may want to take on confidence scoring with the people working on the OKR prior to the
check-in. Here’s how to run a confidence scoring session to generate actions.

Option 1: Good for small teams who are just starting with OKRs
Time: One hour (could shrink to 45 minutes as team gets into rhythm)

1. Set the meeting up with a quick refresher on why we’re here and what we’ll do today.
2. Facilitator prompts: “Any questions/concerns before we start?” Give people a minute to do a
silent read, and offer thoughts. Assuming they’re ready to go…
3. Facilitator prompts: Take a minute to read the first objective and key result, and think about
how confident you are in our ability to deliver on it.
4. Score: When everyone appears ready to score, the facilitator counts to 3 and, on 3, each
person shows their score using their fingers. Use a scale of 1–10, with 1 being “hardly any
confidence” and 10 meaning “we’re super confident we’ll crush this.” You can also use a 1-5
scale, which may help the team avoid questions like “what is the difference between a 7 and
an 8?”
5. Discuss: Facilitator prompts: “Let’s discuss!” People take turns talking about why they gave
their scores. The facilitator takes loads of notes. Why the high score? Why the low score? And
this might blend into the next part...
6. Generate actions: At some point during discussion, the facilitator prompts: “What could we
do today to boost our confidence?” The best ideas go into the backlog.

Option 2: Good for larger teams, and teams that are used to OKRs.
Time: 30 minutes
Pre-work: Add OKRs to a shared document that you and your team can collaborate in (my team
uses a Google sheet). On the day of the confidence scoring, invite captains to add their scores in
advance to the shared document (assume this won’t happen though, and take a few minutes at the
top of the session to score).

1. Set the meeting up with a quick refresher on why we’re here and what we’ll do today.
2. Score: Facilitator prompts: “Captains, please take 5 minutes and add scores.” Captains will
enter their confidence scores.
3. Discuss & generate actions: Facilitator prompts discussion on OKRs. Captains take turns
talking about the “why” behind their scores. Facilitator prompts all attendees to generate
ideas for how to increase confidence in low scores. Facilitator captures notes/actions in the
document, and captains take actions into their backlog.
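
If the scores live in a shared sheet, a few lines of Python can help the facilitator decide where to spend discussion time. This is only a sketch under assumptions: the function name `summarize_confidence`, the sample scores, and the cutoff of 5 on a 1–10 scale are all invented here. The process above does not define what counts as a "low" score, so pick a threshold that suits your team.

```python
from statistics import mean
from typing import Dict, List, Tuple


def summarize_confidence(scores_by_kr: Dict[str, List[int]],
                         low_threshold: float = 5.0
                         ) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Average the scores per key result and flag the low-confidence ones.

    Returns the (key result, average) pairs sorted lowest first, and the
    list of key results whose average is below low_threshold.
    """
    averages = {kr: mean(scores) for kr, scores in scores_by_kr.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1])
    needs_discussion = [kr for kr, avg in ranked if avg < low_threshold]
    return ranked, needs_discussion


# Hypothetical scores on a 1-10 scale, one entry per person who scored.
scores = {
    "KR1: lettuce >20% of revenue": [7, 8, 6],
    "KR2: 75% conversion in every store": [3, 4, 5],
}
ranked, needs_discussion = summarize_confidence(scores)
for kr, avg in ranked:
    print(f"{avg:.1f}  {kr}")
print("Discuss first:", needs_discussion)
```

Sorting lowest first puts the key results most in need of new actions at the top of the agenda. The averages are a conversation starter; the "why" behind each score still comes from the discussion.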



Score
OKR scoring is about scoring how well we did (versus confidence). Traditionally it’s done after the
period ends. I’ve found it to be more useful to do a preliminary score prior to the end of the period,
though, because it can motivate a team for a last minute burst of action, and could inform OKRs
for the next period.

OKR scoring looks a lot like confidence scoring, except now we’re scoring how well we did (or,
really, how well we are on track to deliver). Above image is John Doerr’s scale5.

5 https://www.whatmatters.com/faqs/how-to-grade-okrs/

Instead of discussing additional things we can do to deliver our OKRs, like we do in confidence
checks, we’ll now discuss what we might want to carry into the next quarter (and why), and what
we learned in this quarter that may have an impact on next quarter’s OKRs.

(By the time we score OKRs, we may be well into planning for the next quarter and have a draft
that can be adjusted based on discussion in our scoring session.)
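
For teams that want to compute grades rather than eyeball them, here is a hedged Python sketch. It assumes the commonly cited grading convention of a 0.0 to 1.0 score per key result, with 0.7 and above treated as green, 0.4 up to 0.7 as yellow, and anything lower as red. Check these bands against the scale on the linked page before relying on them. Grading each key result as the fraction of its target achieved, and averaging key result grades into an objective grade, are also assumptions of this example, as are the function names and sample numbers.

```python
from typing import List, Tuple


def grade_key_result(actual: float, target: float) -> float:
    """Fraction of the target achieved, clamped to the 0.0-1.0 range."""
    if target <= 0:
        raise ValueError("target must be positive")
    return max(0.0, min(actual / target, 1.0))


def band(score: float) -> str:
    """Map a 0.0-1.0 score to a color band (assumed thresholds)."""
    if score >= 0.7:
        return "green"
    if score >= 0.4:
        return "yellow"
    return "red"


def grade_objective(key_results: List[Tuple[float, float]]) -> float:
    """Average the grades of (actual, target) pairs for one objective."""
    grades = [grade_key_result(actual, target) for actual, target in key_results]
    return sum(grades) / len(grades)


# Hypothetical end-of-quarter numbers as (actual, target) pairs.
results = [(15, 20), (45, 75)]
for actual, target in results:
    score = grade_key_result(actual, target)
    print(f"{score:.2f} {band(score)}")
overall = grade_objective(results)
print(f"objective: {overall:.2f} {band(overall)}")
```

Running the same calculation before the period ends gives the preliminary score described above, which can prompt a last burst of action or feed into next quarter's draft.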

Other Stuff
Here are some other suggestions/thoughts I have related to OKRs. Some of these are departures
from the advice some books would offer on OKRs. These come from my own practical experience:
• If we need to change them, let’s change them. OKRs are proven to be great tools for focus,
but let’s agree to not use them as contracts. If priorities shift dramatically and an Objective or
KR doesn’t seem relevant anymore, say something! Let’s talk about it.
• Celebrate wins along the way: Delivering a KR is a big win, but delivering an outcome along
the way of delivering an aspirational OKR is also a big win. If you have a win, radiate it to the
team so we know and can celebrate.
• Let’s figure out how to account for/support “non OKR work that’s also important.”
This could be “business as usual” work that simply needs to continue to keep our business
humming. It could be occasional customer work. We want to balance this with OKR work, and
make sure we have the bandwidth to make a strong impact on our OKRs. We may also find
that some work that seems important and doesn’t seem to map to an OKR actually does.
• Make sure leadership and the team are aligned on how aspirational / realistic the OKRs
should be: Teams will really struggle if some of the crew feel that any of the KRs are more "set
in stone" than they are intended to be.

Further reading
Quick reads:
• Perdoo (OKR tool) crash course6
• The Fallacy of OKRs7
• To Write Better OKRs, Use Outcomes8

Longer reads:
• Measure What Matters2
• High Output Management9
• Radical Focus3

6 https://www.perdoo.com/resources/okr-crash-course/
7 https://tanzu.vmware.com/content/slides/the-fallacy-of-okrs-and-how-to-find-the-right-measurements-of-success
8 https://medium.com/@jseiden/to-write-better-okrs-use-outcomes-e82be6e7b460
9 https://www.amazon.com/dp/B015VACHOK/

Jen Handler is a Director of Product Management of Customer Experience and Success in the Modern
Applications Platform Business Unit. You can reach her on Slack or at jhandler@vmware.com.



THE BACKLOG STANDUP
Author: Erica Dohring

TLDR: try the Backlog Standup for a faster, more focused start to the day.

Background
Does your standup feel slow? The Backlog Standup followed by role-specific standups could be a solution. Different roles of the product development team have different needs around the daily standup meeting:
• Product managers need to know what’s going on with workflow items, especially if there are blockers.
• Engineers use it as a time to decide who is working on what for the day or ask for help.
• Designers use the time to update the team on major user learnings.

To address these various needs (and maybe somewhat by habit/tradition), most teams use a “circle” format. Each person goes around typically talking about what they did yesterday and today.

"Yesterday, I worked on [feature]. I ran into issues with [library] so I debugged that for a while. I also worked on [other feature] and that was tough because the test was so finicky. Today I'll move onto [yet another new feature] and follow-up about that blocker.

Tomorrow, I'm going to a doctor's appointment at noon, which means that I might be out for an hour or so."

This format has the following issues:
• Folks tend to feel the need to talk more as a representation of their productivity.
• The details raised are often not relevant to everyone.



[Figure: Backlog Standup (Today), followed by Engineering Standup (Planning Today) and Product/Design Standup (Planning Today and the Future)]

Format
The Backlog format is more work item focused. Here is how you run it:
• Pull up the backlog for everyone to see.
• Ask for a comment on each work item in progress. Engineers can report “going”, “blocked”, or “done.” Blocked sometimes needs a discussion for a minute or two but can be pushed to the end.
• PMs and design give any updates relevant to engineers. Key product direction changes or user learnings for example.
• Finish asking, “does anyone have any stucks, blocks or helps?”
• Clap it out.
• Head to product-design or engineering standup based on your role.

This standup version feels relevant and snappy. No more irrelevant details. In addition, this version also has the benefit of making sure your tracking software is up-to-date. Especially in larger teams, an even slightly out-of-date backlog can lead to duplicated work. Taking care of this at standup decreases that likelihood and makes it easier for the PMs to keep a pulse on workstreams.

But what about …
• Detailed Design <-> PM Communication: PM and design host a “product standup” immediately after this one to discuss those details. Engineers head to an engineering standup. This helps keep the shared Backlog Standup free of technical details irrelevant to product and design.
• Fun Time: we typically have a few minutes for a laugh or “interesting” at the start while we are waiting for folks to arrive, giving us that sense of connection. We also schedule team bonding time at a different point during the week.
• Vacation Announcements: we typically communicate those during “stucks, blocks or helps” in addition to keeping a wiki with team time off.
• Availability: we share our calendars.
• Product Roadmap Updates: we do this before our weekly planning meeting to not make standup take too long. Plus once weekly is generally sufficient.
• Detailed Design <-> Engineering Communication: Design and engineering typically run on 2 different time horizons1. Most of the time engineering is focused on the day-to-day and design is focused on a week or more out. This Backlog Standup hones in on the former. We encourage other tools to keep that link strong:
    • Rapport: attend and rotate hosting the Backlog Standup. You’ll hear a much shorter version of what you heard before from Engineers (no more implementation or calendar details) and you can still see how everyone is feeling.
    • User Updates that Impact the Backlog for the Next 1–2 Days: definitely share that in Backlog Standup
    • Fun, Interesting User Updates: share those in Backlog Standup! That is a great daily nugget and keeps the team focused on their mission to solve user problems.

Different roles have different needs. Our team has found this to be a huge improvement in making our morning. We feel more focused and have a clean, reliable backlog.

Erica Dohring is a Senior Developer in the Modern Applications Platform Business Unit. You can reach her on Slack or at edohring@vmware.com.

1 https://www.researchgate.net/publication/343280168_Dual-Track_Development



Aligning on a Shared Future: OKRs, OGSMs, and North Star Metrics

Photo by Dave Colman from Pexels.com


Andrew Zusman

Editor: Dexter Arver


Aligning on a shared vision is critical to success. Whether you are working in a small team, startup, or even a large organization, the path to success will depend on a shared understanding of what the future should look like.

Here at VMware we have heard the term "Fast to the Future", but how might we know we are all headed toward the same future, together?

Using frameworks like OKRs, OGSMs, or North Star Metrics can help us align on a common understanding of where we're headed. Then, we can just focus on the hard work of getting there!

In this article, we will discuss the differences between outputs and outcomes, give an overview of key frameworks for aligning on a shared future, and relate some of my personal experiences working with different frameworks in pursuit of enterprise-scale digital transformation for our clients.

I think many people who would choose to write an article on this topic would truly love it, but the truth is that I haven’t historically loved frameworks.

Instead, this article is generally focused on how even someone who has historically not loved frameworks, and has been widely skeptical of them, has found value in using them. I’m not going to lie. This is a long article. If reading is your jam, you’re in luck. There’s a lot here to read. If you don’t love reading, you can catch a recording where I talk about some of these frameworks here1. Enjoy!

1 https://tech-talks.eng.vmware.com/talks/aligning-on-a-shared-future-outcomes-okrs-ogsms-and-north-star-metrics

Mind Maps and Brain Dumps
My default way to prioritize information has been to just brain dump. I look at problems from a bunch of different vectors and angles, toss out a ton of ideas, and just see what sticks. To put it another way, I write lots of notes, blow up a lot of balloons, pop them, and see what’s left. To put it a third way, I talk things out, try to record things as I go, kinda bounce around, and then eventually over time I find some clarity. Sometimes, I’m able to clearly articulate the outcomes I find, but it’s much more difficult to articulate the process of information-gathering in a representational way. In other words, the way I think is great for me personally, but it’s not easy to share with others.

Mind Maps are an example of what this process might look like visually. They are relational, have hierarchy, connect nodes of information to other nodes of information, allow different groupings of information, and are a way to mirror my default way of prioritizing information. It’s the output of a brain dump.

If I’m by myself thinking through a problem, then this is a great system. It works for me, and that’s about all that’s required of a personal system. There are, however, many problems when I start collaborating with my teammates. Mind maps are messy. They are not ideal for finding alignment with others. Brainstorming is a very important skill. Cataloging information is an important skill. Neither brainstorming nor cataloging is, in my experience, helpful for collaboration. Mind mapping isn’t digestible any more than whiteboards filled with ideas are digestible. I have camera rolls filled with whiteboard pictures that lose their value because the artifact on the whiteboard was never as important as the process of brainstorming and cataloging information. And this is something I am working on improving: sharing back information to help guide longer term product strategy.

What’s the point of great ideas if they can’t be easily related to other teams and teammates?

I find that the larger the scale, the harder it is to communicate great ideas, and the more important it is to communicate them and align.

If you’re just working through a brainstorm for yourself or in a small group and you can work out the communication later, then maybe a Mind Map, whiteboard (Miro) or some kind of personal organization could be great, but for a team trying to scale and communicate with other teams, disorganized thought means lack of alignment.

The lack of high level communication on strategy is something I have experienced across a wide swath of clients I’ve worked with, and probably something we are all familiar with. It leads to duplicate effort, slower scaling, misalignment, and opaque knots that are difficult to unravel. For my clients, who are often working very hard to scale, inability to align and communicate is uncomfortable at best and a path toward failure at worst.

In order to avoid these issues, we must gain alignment, but what do we mean when we say alignment? And why does it matter so much?

Alignment
Alignment means a straight line: everyone on a team, multiple teams, or even an entire organization is headed in the same direction on the same road with the same idea of what they’re meant to accomplish. We can think of it like a school of fish. They may all swim separately but they understand what the school needs (protection, food) and can execute as a group. If one fish leaves the school, they’re weaker, and we can think of that as similar to the effect of being out of alignment.

Culture plays a big role in alignment. Culture can be defined as collective instinct. The instincts of a team, group of teams, or an organization are formed by having the same thoughts and desires for outcomes. The fish have a culture that compels them to swim in a similar way for specific reasons.

Alignment is what we do to facilitate having those same thoughts/desires and ultimately the same collective instinct. While I won’t go into too much depth on this, something to keep in mind as you read through the different frameworks below is that alignment is often a mirror of culture. Companies can align in many different ways and don’t all need frameworks, but misalignment and toxic cultures are, in my experience, correlated.

I also want to note that alignment and purpose often overlap and, not coincidentally, purpose overlaps with happiness. Knowing what’s expected of you, what you can do to contribute to the broader good, and what your role is in comparison to others is often helpful for productive work. So while I’m primarily talking about how organizations align across multiple teams from a top-down perspective, I also want to call out that having alignment and using frameworks can contribute toward overall happiness of employees.

When we talk about alignment, it is important that we are aligning on outcomes and not outputs. Outcomes are the way something turns out and outputs are the amount of something produced. For example, an outcome might be, “increase indoor air quality in my house” while an output might be, “spend three hours vacuuming each week and the amount of pollutants will drop by 10%.” Whatever framework we use to push for alignment, pushing for outcomes is ideal for alignment because it’s not prescriptive.

We’ll see how this plays out when we actually apply different frameworks to the same concept of alignment, but outputs, in my experience, are less conducive to alignment success. Outputs often push people to make decisions they might not otherwise make. Similarly, outputs often push teams toward metrics that may not be indicative of real success. We also give teams autonomy to solve problems in meaningful ways.

Now let’s take a look at three popular frameworks in enterprise software: OKRs, OGSMs, and North Star Metrics. Frameworks are communication tools that help us to align with other teams and across an organization.

To explain these popular frameworks and how organizations use them to push toward outcomes and gain alignment, let’s talk through a scenario!

Fruit Finders
Fruit hunters are people that travel the world sampling exotic fruits. They spend outrageous amounts of money every year on fruit and often order (or even import) fruits from outside their local area.

Fruit hunters want to eat rare or exotic fruits, rate what they’ve eaten, and then connect with others who are passionate about fruit.

Fruit Finders is a company of sixty employees aimed at building a marketplace for the fruit hunting community. It connects fruit vendors with fruit lovers and it aims to build a community for fruit hunters.

The company is split into three teams.

The first team is the business team. They are working to align the whole company around objectives that will help with a funding round and expand the team to 150 employees.

The second team is building the company’s vendor experience: the fruit farm or fruit vendor that wants to sell fruit.

The third team is the fruit hunter team that helps the fruit aficionados rate, track, connect with others, and most importantly to the business team, buy more fruit from the second team’s vendors.

So is there alignment between these three teams? Probably? Maybe?

Here are some questions to consider to try to figure out the answer:

• What kind of communication do the teams have? Are they aligned on a common vision?
• What's the connection between making the purchasing process easier and having a successful funding round?
• Does upgrading features help in attracting top talent?
• Does increasing the average purchase amount per fruit hunter go hand-in-hand with adding new fruit hunters?
    • Do we need more users or do we need existing users to buy more?
    • If we add more users, will our average purchase price drop?
    • If our purchase price per user drops but our overall revenue increases, that’s good for the first team, but will it help attract new vendors?

We can keep cross-referencing the different things the different teams are trying to accomplish, but this is a common struggle in many organizations.



Objective (5/10 confidence): Crush the purchasing experience!

Key Results:
• 500 fruit purchases are made daily.
• 95% of fruit hunters check out with multiple items in their cart.
• 90% of first-time purchasers make an additional purchase within 30 days of their first purchase.

The teams are all working together. They are all working towards completing their goals. They all care about the business value and the end-user value that they are building. But in spite of it all, it is hard to ensure alignment in everything.

So, let’s take a look at how Fruit Finders might apply different frameworks to gain that alignment.

OKRs
One way the teams might align is by using the popular OKR framework. OKR stands for “Objective and Key Results”.

The OKR framework tasks teams with creating a singular objective that the whole team will push toward. The objective is a “moonshot”: the team is meant to be only 50/50 in its confidence of achieving it. Objectives are chosen at longer intervals such as quarters, halves, or even years.

Key Results are metrics that explain what success would look like, with just 50% confidence that each key result can be completed. Teams are encouraged to pick just a few key results and to meet weekly to chart their confidence as they push toward those results.

OKRs can cascade, as one level in the hierarchy may have a Key Result that will feed into the objective for a team below it in the hierarchy.

Let’s take a look at how this might work for Fruit Finders with the OKR above.

No matter what team you’re on, these are your goals. The entire business is optimized around just one objective, and it needs to be inspiring.

Every day, employees wake up, go into work, and think “What can I do today to crush the purchasing experience?” Each team can focus on one of the key results or, maybe, all three at once.

Each week when the whole business meets to discuss their confidence score, they can say, “Hey, this week we had a ton of first time purchasers so we got to 500 fruit purchases every day this week, but we’re not sure if we’ll see the results next time because we gave them a discount code, so maybe we’re 6/10 confidence on that one.”

10/10 confidence isn’t considered a “good” thing in OKRs. It means that the team set their sights too low and weren’t ambitious enough. Conversely, 1/10 confidence isn’t considered a “good” thing in OKRs because it means that a team set their sights way too high, and generally that’s poor for morale.

Completing OKRs is seen as counterproductive: OKRs aren’t a task list. They are inspirational stretch goals that could result in better outcomes.

OGSMs
Another option for Fruit Finders is to use a different framework: OGSMs. OGSM stands for Objectives, Goals, Strategies, and Measures.

Unlike OKRs, there are multiple OGSMs for a company, with each team having their own. A common trait with OGSMs is that they can cascade to help form a chain of goals, which helps with alignment.

Objectives have both goals (metrics that define the success criteria of the objectives) and strategies (pathways to achieve the objectives). Measures are ways to define the success criteria of the strategy.

Contrary to OKRs, OGSMs are less in the realm of moonshots. We still want to push toward something ambitious, but the point of an OGSM is not to rally the entire business around one moonshot idea and, if we fall short, to feel okay having fallen short. The point of an OGSM is to meet the goals we set as a business, and align on a pathway for completion.

Because OGSMs cascade through a business, this also means that the entire organization has transparency into how their work will directly impact the work of those above and below them in the hierarchy. Strategies should cascade between levels, and each level should own the outcomes and prioritize appropriately. This means that work can be traced back to higher level goals, and hopefully that will show employees how their work impacts the overall business. That transparency can be beneficial, especially as organizations scale. It also means that teams can track and strive for metrics that will be impactful overall.



Company objective: Fruit Finders is the most successful online marketplace for fruit buyers and sellers.

Business team
Objective: Become the most profitable fruit marketplace online.
Goals: Monthly Recurring Revenue increases by at least 5% each month over the next 10 months.
Strategies: Fruit Hunters are buying more fruit.
Measures:
• Fruit Hunters that make their first purchase on Fruit Finders make a second purchase within 30 days.
• 20% of Fruit Hunters sign up for recurring subscriptions with their favorite vendors.

Vendor partnership team
Objective: Vendors have more repeat customers online than in their brick-and-mortar stores.
Goals: Vendors see a 10% average monthly increase of returning customers over brick-and-mortar sales using Fruit Finders subscription plans.
Strategies: Returning customers get a bonus when they sign up for a subscription.
Measures:
• Each subscription customer gets a referral link that can earn them a 30% discount on their next order.
• Each vendor sends their referral code to five prospective buyers each month.

Fruit hunter team
Objective: Fruit buyers pay less for exotic fruits on Fruit Finders than they would at their local store by signing up for a subscription.
Goals: An average of 100 net new subscriptions are signed up for each month for the next 10 months (1000 new subscriptions).
Strategies: Referrals to use Fruit Finders are streamlined using referral links from each Fruit Vendor and Fruit Hunter account.
Measures:
• 75% of new users join the platform using a referral code.
• 90% of website visitors come from a specific user’s referral link.

The above is a quick example of an OGSM I drew up for Fruit Finders.

There’s an overall company objective that filters to the Business Team. The Business Team is defining success by profitability. To increase profitability, they will need to increase Monthly Recurring Revenue (MRR). If fruit hunters are buying more fruit, then the MRR will go up, and one way to make that happen is for there to be an increase in subscriptions.

The Vendor team can see the need for more subscriptions and sets their objective based on the metric of the Business Team: 20% of Fruit Hunters sign up for subscriptions. To make that happen, the Vendor team sets having more repeat customers than the brick-and-mortar stores of their vendors as their objective. To get there, they will need to see more repeat customers via subscriptions, and to get more subscriptions, they try using discount codes.

The discount codes mean that the fruit hunters will pay less compared to buying at their local stores, so the Fruit Hunter team focuses their attention on new subscribers for the next 10 months.

Hopefully, in the above example, you can see the cascade effect that helps to build alignment across the organization and the transparency that helps teams to stay on the same page.

In the final framework we’ll look at today, you will also see a cascade effect.

North Star Metrics
North Star Metrics (NSMs) are a form of a single metric that matters.

The company has a singular metric that is the most critical, and then the metrics cascade from there. Each team then works on what will be most beneficial to achieving that metric. Sometimes those are called Sign Posts.

One of the hardest parts about North Star Metrics is actually choosing the right North Star Metric for an organization. In my experience though, the hardest part of having a North Star Metric is actually putting in the work once the Metric has been determined.

One of the key aspects of NSMs is that they galvanize support for a singular goal, much like OKRs, without requiring a moonshot or confidence score. Everyone at the organization has a view into which metric is the most critical for the overall company and how the metric they’re working on feeds into that overall metric.

I don’t mean to present NSMs as a combination of OKRs and OGSMs, but there is some overlap between the considerations of both in how NSMs are generated and how they work to improve alignment.

Amplitude, a product intelligence company, has championed the North Star Metric [2]. Here is their checklist of what makes a great NSM:
1. It expresses value. We can see why it matters to customers.
2. It represents vision and strategy. Our company’s product and business strategy are reflected in it.
3. It’s a leading indicator of success. It predicts future results, rather than reflecting past results (not lagging).
4. It’s actionable. We can take action to influence it.
5. It’s understandable. It’s framed in plain language that non-technical partners can understand.
6. It’s measurable. We can instrument our products to track it.
7. It’s not a vanity metric. When it changes we can be confident that the change is meaningful and valuable, rather than being something that doesn’t actually predict long-term success, even if it makes the team feel good about itself.

[2] https://amplitude.com/north-star

For example, Slack’s NSM was messages sent within an organization.

This is a solid NSM. It expresses the value of Slack (connecting organizational communication), it expresses the value of the product being sold, it’s actionable and understandable, and it could be a leading indicator as well. Overall, this is a well-written articulation of an NSM.

We can speculate about why this was the most critical to Slack. They chose messages sent within an organization rather than new organizations using Slack. It’s interesting to optimize usage over growth, but I think the following:
• Usage leads to loyalty, which will lead to long term growth.
• Usage justifies the price and potential price raises.
• Usage requires a focus around user needs and solving user problems.

Another example is Zoom’s North Star Metric: weekly hosted meetings.

This is a pretty straightforward NSM. It is actionable and understandable. Every team at Zoom can rally around trying to get more weekly hosted meetings. It’s a call to action. It also expresses the value proposition of the company in a succinct way, and it’s even trackable.

Zoom could have rallied around paid subscribers, time spent in each meeting, or participants in a meeting, but they chose to go with weekly hosted meetings. Maybe with COVID this has changed to daily hosted meetings or potentially even hourly, but the outcomes there would be the same: clear business value, clear user value, a leading indicator, and an understandable metric.

So how might Fruit Finders describe its NSM?

FRUIT FINDERS - Each order includes more than two fruit types.

Why might this be a great NSM for Fruit Finders?
1. It expresses value. We can see why it matters to customers. Fruit hunters want to sample new and exotic fruits. User motivation and product value are demonstrated here. Trying more types of fruit also means more reviews, so it bolsters support overall.
2. It represents vision and strategy. Our company’s product and business strategy are reflected in it. Fruit Finders wants to encourage fruit hunters to try new and exotic fruits, so the vision is there. This also gives us a way to measure or consider whether a vendor is well suited to the platform or not. If they are a vendor that just deals in one type of fruit, then in the long term, they may not be a good candidate for our push toward diversity of fruit on the platform.
3. It’s a leading indicator of success. It predicts future results, rather than reflecting past results. This is aspirational in that we want orders to have more types of fruit sold.
4. It’s actionable. We can take action to influence it.
5. It’s understandable. It’s framed in plain language that non-technical partners can understand. It’s actionable and understandable for each team. The business can rally around that metric as the one that drives growth (more things per order), the vendors can see more diversity in what is being sold and theoretically larger orders that make vendors happy, and fruit hunters can sample more types of fruit.
6. It’s measurable. We can track this metric.
7. It’s not a vanity metric. When it changes we can be confident that the change is meaningful and valuable, rather than being something that doesn’t actually predict long-term success, even if it makes the team feel good about itself. This will be a meaningful representation: the more people try different fruits, the more likely they are to continue to order them. Fruit hunters love fruit, so the more fruit we put in their hands, the more value the system will have.

Strictly My Opinion: OKRs
I think there is a tendency to use OKRs incorrectly or to use what I call “lower case” okrs rather than “capital” OKRs. It’s rather common to see too many OKRs, poorly written OKRs, or things that are checklist-y in nature rather than provocative moonshots that help drive and motivate teams to reach for more difficult outcomes.

We think about OKRs as being an “easy” framework because defining an OKR isn’t so difficult and people can understand the basic meaning of the words quickly. However, the concept of a “moonshot” is relatively foreign to major enterprise companies (especially if they are publicly traded). The desire for consistency is often fundamentally at odds with the low-grade confidence scores, which are a requirement for moonshots (and thus OKRs). Due to this, I don’t generally recommend OKRs for enterprise, and I have had to push back on their use in many instances.

I also find that instead of one great OKR for an enterprise, there are often half a dozen; and this, too, is not a galvanizing, inspirational method.



Instead it is just a collection of goals written in an OKR format. That’s not beneficial. Great OKRs work. However, it is very difficult to write great OKRs and to persevere through (potentially) years of learning the framework, evangelizing it, and executing at scale.

There has also been a lot of writing over the past few years on how organizations have tried to tie the success of OKRs to a bonus structure and why that’s a recipe for disaster. I won’t spend time in this article discussing it, but it’s worth diving into either Christina Wodtke’s book or John Doerr’s book for more info on that conversation.

Strictly My Opinion: OGSMs
One great thing about OGSMs is that they cascade and they offer a degree of transparency. I think OGSMs are helpful in an organization as each team has insight into how their work slots into other work being done in an organization. It is helpful to have a roadmap or strategy at that level, and certainly the act of putting together an OGSM can be helpful in gaining intra-team alignment. My biggest issue with OGSMs is that, unlike OKRs, they’re not inspiring. No one wakes up in the morning and tries to crush their OGSM. It is more of a way to write out what you’re working on and how it fits into the larger enterprise. It is not meant to rally a team.

Strictly My Opinion: NSMs
Finally, I really love NSMs. If I had to pick one for VMware to try, this is likely what I’d pick as a first experiment. It would be fascinating to think about what we might use as that NSM. The idea of galvanizing and inspiring teams while retaining transparency across an organization and cascading goals top-to-bottom is outstanding (especially in terms of alignment).

I think the biggest hurdle is in selecting a North Star Metric. Choosing one metric can exclude rather than include, and aligning an entire organization behind the wrong metric can be disastrous in a way that poorly done OGSMs or OKRs are unlikely to be. Even with that risk, I’d still love to see more organizations select NSMs as their framework.

Strictly My Opinion: Other
Certainly there are other frameworks out there, and many organizations may choose simply not to use specific frameworks. At Pivotal, “four imperatives” were passed down to each team. They were just stated goals that each team would autonomously work toward. I think that was one way to consider how a framework may not be needed while still aligning a company. It could be that all that is needed is a directive, but consider that the more crisp we are at formulating understandable directives, the closer we’ll be to alignment and, hopefully, if we do everything right, a better, scalable culture.

Resources: OKRs
There are two seminal works on OKRs. The first is Christina Wodtke’s book Radical Focus [3] and the other is John Doerr’s book Measure What Matters [4].

[3] https://www.goodreads.com/book/show/28951428-radical-focus
[4] https://www.goodreads.com/book/show/39286958-measure-what-matters



I prefer Wodtke’s book, but they’re both pretty great. Here is how Wodtke sums up OKRs:

“OKRs are about continuous improvement and learning cycles. They are not about making check marks in a list. So you didn’t hit any of your KRs. Ask yourself why, and fix it. So you hit them all? Set harder goals, and move on.” - Christina Wodtke

Resources: OGSMs
For OGSM my recommendation is just to Google it. There isn’t a great repository or book that I’ve found for OGSMs. If anyone finds one, please let me know as I’d love to read it. It seems like a technique that many companies have adapted and made their own, but not one that has a central body of knowledge or OGSM-specific experts. In some ways that could be considered a good thing. One of the drawbacks of using frameworks is that they often aren’t as malleable as needed for different organizations at different stages and with different personnel. In another way it’s probably not the best thing that there isn’t a specific body of knowledge to focus on.

I don’t think OGSMs are quite as difficult to master as OKRs, so there’s more wiggle room. Who knows, maybe someone here at VMware will be inspired and decide to write a book on OGSMs, and if you do, again, I’d love to read it.

Resources: NSMs
Amplitude - North Star Metrics [2]

Amplitude runs some specific workshops and webinars that are free. I’ve attended a few of them and got a lot of value out of them. Lots of places out there use North Star Metrics as their framework, but Amplitude is the primary thought leader.

Conclusion
If you’ve read this far, you’re probably thinking that this is a lot of writing for someone that proclaims to not like frameworks. You’d be right. I don’t like frameworks. What I do like, however, is clarity and the separation of signal from noise. I love to be inspired and I love to feel like the work I’m doing is meaningful. Frameworks are a great way to build the kind of culture that makes me excited to go into work every day. I’m not currently using any framework for my work at VMware, but I do feel like the work I’m doing is contributing to the overall success of the organization. It’s a thought I go back and forth on in my head. Do I need a framework to have the feeling I want? I’m not sure.

I’d love to hear from you. I’d love to hear about whether you find your work to be meaningful, how you know it’s helping the broader organization, and what your experience has been like with these frameworks or others. In short, we have an opportunity to drive thought leadership together, to align on our own shared future, and build the culture we want for VMware.

If you loved this article, tell your friends. If you didn’t (or you have any questions or compliments), tell me! Thank you.

Andrew Zusman is a Senior Product Designer in the Modern Applications Platform Business Unit. You can reach him at @andrew on Slack or azusman@vmware.com.



Unblocking FedEx + vRA 8.3: The ACE Team
Authors: Tom Scanlan and Luis Valerio Castillo | Editor: Dexter Arver



Accelerated Co-Innovation Engineering (ACE) is a new team in the Office of the CTO within the Advanced Technology Group. ACE is chartered to collaborate with customers and across VMware Business Units (BU) to infuse new technology and enhancements into VMware products that accelerate adoption.

ACE engages in strategic situations where it can make the most impact for the success of VMware and our customers. To do so, ACE listens to sales and other field teams, and looks into BU issue trackers. Generally, what we find is that there are improvements that can help our customers, but BU prioritization delays the related benefits to our customers. Sometimes improvements require fundamentally new technology, which may be outside the core competency or responsibility of a single BU.

A recent example of how ACE can help accelerate customer outcomes involves our customer, FedEx. FedEx’s desired outcomes required improvements in one product that had been delayed for months and was not going to be prioritized soon. FedEx would be stuck on an older (but working for them) version of this product which would, in turn, prevent upgrades to several other VMware products. Injecting expert effort into the right place could help them adopt the latest version and unblock a single deal worth $70 million. And the BU and FedEx were willing to make that happen with the help of ACE. [1]

To support their future desired outcomes, FedEx wanted to move from VMware Cloud Foundation 3.x to 4.x. As a prerequisite of that upgrade, they needed to move from vRealize Automation (vRA) 7.6 to vRA 8.x and from NSX-V to NSX-T. Versions vRA 7.6 and prior supported Puppet [2] via a vRealize Orchestrator (vRO) plug-in developed and maintained by Puppet Labs. vRA 8 was a complete rewrite, and the Puppet plug-in would not work with the change in design. Some support for Puppet was built into vRA directly, but it was not in parity with the vRO plug-in. Because of their extensive use of Puppet, FedEx would not update to VCF 4.x until vRA 8.x supported feature parity with the vRA 7.6 release for a few specific features of the Puppet plug-in.

ACE is Called into Action
The upgrade issue was raised by the field in early to mid 2020. They shared information with the Cloud Management Business Unit (CMBU) on what specific issues needed to be overcome for our customers. CMBU has the same problem of resource allocation as most engineering teams: there are finite resources and, due to other priorities, some work simply would not fit into the product roadmap until mid-2021. CMBU and the field raised the issue to ACE in late November, requesting that we take it on as an effort to accelerate the needed capabilities (as seen in our tracking system [3]).

After review, ACE agreed to develop the required capabilities. The team worked out an arrangement with CMBU to embed ACE’s software engineers into CMBU engineering. The goal was to get FedEx’s Puppet automation working with an early version of vRA 8. By doing so, FedEx could immediately advance their VCF upgrade and rollout. FedEx agreed that the outcome of overcoming the obstacle to their upgrades was worth the time. Based on that and the CMBU's agreement to embed our engineers, we began in early January 2021.

Luis Valerio Castillo and Tom Scanlan were the members of ACE embedded with members of the CMBU configuration management provider development team, Deepak Mettem and Mrinalini Anand. Initial on-boarding (coming up to speed on the code base and team practices) took a couple of weeks. If you want to know how providers in vRA work, and more specifically the Puppet provider, you can head to the “On-boarding working session recordings” Confluence page [4] for all the deep diving you might desire.

[1] The ACE team would like to make clear that the ACE team supports outcomes, not product release promises.
[2] https://puppet.com/
[3] https://gitlab.eng.vmware.com/octo-ace/ace-submissions/-/issues/26#note_139587d6b1f846ca477d25a9faf240d2bab87f3b
[4] https://confluence.eng.vmware.com/display/VCMR/On-boarding+working+session+recordings



Requested Outcomes
The outcomes requested by FedEx are covered by short descriptions and demos of the works as they came into being.

Requested Outcome 1:
Customer must be able to use vRA 8 blueprints to specify a Puppet primary server, and a geographically local Puppet compile server
for scaling.

FedEx has a global computing footprint, with tens of thousands of VMs to manage. To use Puppet at that scale, FedEx is using a Puppet primary server with many compile servers. In this architecture, there is a top-level server in a high-availability configuration that manages certificates and has a global perspective of all production VMs. At each datacenter there are one or more Puppet compile servers that do most of the work local to the VMs in that datacenter and synchronize with the primary Puppet server. Compile servers max out around 2500 nodes, but that limit can be raised by adding more compile servers and using a load balancing mechanism to spread the load across many compile servers.

See https://via.vmw.com/EQEL for a demo of this outcome.

Blueprint containing “installMaster” property
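As a rough illustration of the scaling arithmetic above, the following sketch estimates how many compile servers a datacenter would need if each one tops out at about 2500 nodes. The per-server figure comes from the article; the function name and the example node counts are hypothetical, and real sizing would depend on hardware and catalog complexity.

```python
import math

# Approximate per-compile-server capacity quoted in the article.
NODES_PER_COMPILE_SERVER = 2500


def compile_servers_needed(node_count: int, per_server: int = NODES_PER_COMPILE_SERVER) -> int:
    """Smallest number of compile servers that keeps each at or below capacity."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return math.ceil(node_count / per_server)


# Hypothetical datacenter sizes, for illustration only.
for nodes in (1_000, 2_500, 12_000):
    print(nodes, "nodes ->", compile_servers_needed(nodes), "compile server(s)")
```

The estimate assumes a load balancer spreads nodes evenly across the compile servers in a datacenter, as the article describes.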

Requested Outcome 2:
Customer must be able to expect the same kinds of Puppet Facts available from vRA 8 as vRA 7.6.

Puppet Facts are attributes about VMs that can be used to affect how the VM is configured. While some Facts are discovered from the VM characteristics, such as the size of disks or the amount of RAM or CPUs, others can be provided from an external system. For example, servers with a fact named “type” and a value of “webserver” could be configured differently than one with a “type” fact of “database.”

To support external Facts, properties of the vRA blueprint are captured and sent as a file in the VM that is available for Puppet to query at run time. That functionality existed in vRA 7 but was lost while re-architecting vRA 8. The limitation was due to using an “allow list” of Facts that could be passed to the VM that was not configurable and allowed only a small set of blueprint facts. The fix was to flip this logic around to a “disallow list” and allow everything not on that list. Now any customer-added properties could come across by default, while some private information could be prevented from being sent to the VM.

Facts file as delivered in v8.1, vs 8.4 (video: https://via.vmw.com/EQEJ)
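The switch from an allow list to a disallow list can be sketched in a few lines. This is only an illustration of the filtering logic described above, not the vRA implementation: the property names, list contents, and function names are all assumptions made for the example.

```python
from typing import Dict, Iterable

# Hypothetical property names; the real vRA lists are not given in the article.
ALLOW_LIST = {"hostname", "cpu_count"}
DISALLOW_LIST = {"admin_password", "api_token"}


def facts_with_allow_list(props: Dict[str, str], allowed: Iterable[str]) -> Dict[str, str]:
    """Old behavior: only properties named on the allow list are passed on."""
    allowed = set(allowed)
    return {k: v for k, v in props.items() if k in allowed}


def facts_with_disallow_list(props: Dict[str, str], blocked: Iterable[str]) -> Dict[str, str]:
    """New behavior: everything passes except properties on the disallow list."""
    blocked = set(blocked)
    return {k: v for k, v in props.items() if k not in blocked}


blueprint_props = {
    "hostname": "web01",
    "cpu_count": "4",
    "type": "webserver",          # customer-added property
    "admin_password": "s3cret",   # private value that must not reach the VM
}

print(facts_with_allow_list(blueprint_props, ALLOW_LIST))
print(facts_with_disallow_list(blueprint_props, DISALLOW_LIST))
```

With the allow list, the customer-added `type` property is silently dropped; with the disallow list it comes across by default while the private value is still held back.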

One interesting detail: ACE had to be surgically precise, because Windows and Linux shells support different maximum character lengths. This revealed a bug in the Windows implementation for handling large numbers of Facts, and fixing Fact handling for Windows would have required a larger redesign. Since FedEx only uses Puppet with Linux VMs, this was not a problem for them, and we pushed the Windows fix into the backlog for a future release. We also documented the issue and added testing that would reveal it, making it easier for the engineering team to fix in the future. We put in exactly what the customer needed while advancing the engineering team.

Properties attached to blueprint (video: https://via.vmw.com/EQEI)



Requested Outcome 3:
Customer must be able to purge a Puppet node, and already purged nodes should not be considered a failure.

Purging a Puppet node will cause the primary server to revoke a certificate for the node being purged. In an early vRA 8 release FedEx had tested in their lab, there was a scenario where a VM would fail to deploy and a purge step would be triggered to clean things up, but this purge step would fail because the VM did not have a certificate registered with the primary server. This caused problems where deployments could not be cleaned up in vRA 8. As development on vRA 8 progressed, this bug was eliminated. ACE didn’t make any changes to the code, but instead demonstrated that the current behavior matched FedEx’s expectations.
Successful purge during forced failure
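The expected behavior for Requested Outcome 3 is essentially an idempotent cleanup: purging a node that has no certificate should count as success. The sketch below shows that pattern with an in-memory stand-in; `FakePrimaryServer`, `NodeNotFound`, and `purge_node` are hypothetical names and do not reflect the vRA or Puppet code.

```python
class NodeNotFound(Exception):
    """Raised by the hypothetical registry when a node has no certificate."""


class FakePrimaryServer:
    """In-memory stand-in for a Puppet primary server's certificate registry."""

    def __init__(self, nodes):
        self._certs = set(nodes)

    def revoke_certificate(self, node: str) -> None:
        if node not in self._certs:
            raise NodeNotFound(node)
        self._certs.remove(node)


def purge_node(server: FakePrimaryServer, node: str) -> str:
    """Purge a node, treating 'already purged' as success rather than failure."""
    try:
        server.revoke_certificate(node)
        return "purged"
    except NodeNotFound:
        # Nothing left to clean up, so the desired end state already holds.
        return "already-purged"


server = FakePrimaryServer({"vm-01"})
print(purge_node(server, "vm-01"))                # first purge removes the certificate
print(purge_node(server, "vm-01"))                # repeat purge is not an error
print(purge_node(server, "vm-never-registered"))  # failed deployment, no certificate
```

Because both outcomes leave the node without a certificate, a cleanup step built this way can run safely after a failed deployment.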

The ACE Team loves it when a plan comes together.

ACE's Exit
The work is complete and is now generally available as part of the 8.4 vRA release. FedEx and other customers can now utilize these improvements when appropriate. Specifically, FedEx had access to test an early release of vRA 8.4. Through customer-facing demos and discussion, we learned that FedEx had a very positive experience with ACE. In fact, CMBU and the field also had a very positive experience with ACE.

ACE has officially stepped away from this project, having written the code in a BU-supported fashion, and leaves continued support to GSS. However, ACE will remain close by in case assistance is needed. ACE has already moved on to the next set of blockers, and we look forward to sharing more great outcome experiences in the future.

Tom Scanlan is an Application Platforms Architect working in the ACE team in the Office of the CTO. You can reach him at tscanlan@vmware.com.

Luis Valerio Castillo is a Member of Technical Staff working in the ACE team in the Office of the CTO. You can reach him at lvaleriocasti@vmware.com.



From poster
session to propel:

benefits, adaptability,
and innovation of
University Talent at VMware
KATE WILKINSON
At VMware, our employees are what set us apart from the rest of the technology industry. The innovative spirit and passion of our employees has made VMware the leading company it is today, including placing in the top 11% on Forbes’ 2021 America’s Best Large Employers [1].

[1] https://www.forbes.com/best-large-employers/
It’s no surprise then, that recruiting, hiring, and retaining our employees is crucial in
maintaining our innovative culture across the company. A great deal of time, effort, and
resources are put into developing strategies and programs to find the best talent at all
levels and experiences. This article examines talent management for those just starting
out in their careers at the university level.

VMware is committed to finding top global talent at the university-level for both
internships and New Graduate (NG) positions. Hiring at this level poses a different
set of challenges and processes than hiring for industry positions. Let’s explore those
differences and learn how University Talent innovates while recruiting incredible interns
and NGs.

I had the opportunity to talk with three University Talent (UT) team members (Maria Raimundo, Cherielynn Tsay, and Katherine Nguyen) about their roles, the benefits UT can have on a team, and their perspectives on how the UT team is innovating in bringing university-level talent into the company. We also talked with some folks from the Office of the CTO’s Research & Emerging Talent team who are also passionate about recruiting university talent.

Before we go any further, let’s introduce the UT team members who will be featured in this
article:

Maria Raimundo is a University Talent Engagement Manager, responsible for the global
NG and Intern hiring strategy for many of VMware’s Product Teams. Maria is part of the
UT team whose focus is to recruit, hire, and provide an excellent experience to all students
who work at VMware. Maria supports her BUs by partnering closely with their Chief-of-
Staff, Finance Director, Industry Recruiter, and HR Management Partner to devise a yearly
hiring plan that will meet the BU's early-in-career talent needs.

Cherielynn Tsay is a program manager on the University Talent Experience Team, overseeing the AMER Intern Program. Cherielynn’s scope includes curating program events and trainings; leading onboarding; supporting interns’ managers and mentors; partnering with global counterparts to ensure consistent intern programming across all regions; and being a liaison with UT recruiting teams for a smooth intern experience.

Katherine Nguyen is a University Talent Program Manager overseeing a variety of different programs that reach the R&D community. Her work focuses on employee experience by providing new graduate hires with the tools, resources, and opportunities to become a successful contributor at VMware. Katherine also supports Talent Acquisition’s overall strategy by planning and executing events that focus on VMware’s DEI goals. Some of these programs and events include:

• VMware CodeHouse [2], an exclusive, invite-only, 3-day coding event for women in technology.
• University Flight Program [3], VMware’s global NG resource center.
• University Propel Program [4], a U.S.-based rotation R&D program.
• Achieve Scholarship [5], which provides women a one-time award of $10,000 if they are pursuing a Computer Science or related major.
• High School Immersion [6], a growing program that engages students at the high school level interested in pursuing a degree in Computer Science.

[2] https://blogs.vmware.com/careers/2020/10/vmware-code-house-2020-the-virtual-experience.html
[3] https://source.vmware.com/portal/pages/HR/university-flight
[4] https://source.vmware.com/portal/pages/HR/university-propel
[5] https://blogs.vmware.com/careers/2021/01/meet-the-2020-vmware-achieve-and-vmware-rise-scholarship-recipients.html
[6] https://octo.vmware.com/high-school-virtual-hackathon/

University Talent timelines & hiring processes


Unlike industry-level talent, who can be hired at any time throughout the year, UT relies on
a seasonal cycle to recruit and hire based on academic calendars around the world. We
asked Maria for her perspective on the reasoning behind this structure, and some of the
benefits and challenges of this timeline.

As Maria explained to us, the best university plan is to interview once and hire twice,
meaning there is a strong focus on hiring the best interns, which allows VMware to get the
best NG full-time hires when they graduate. Getting the top student talent means having an
overall hiring plan up to 1-2 years in advance, as many students find their job or internship
far ahead of graduation.

The UT hiring season follows the major global school timelines. The 2022 UT season
runs from July 1, 2021 to June 30, 2022. Globally, this means UT works on the overall 2022
University hiring plan from April 2021 to June 2021 and opens all planned NG and intern
requisitions by July 2021, even though the earliest that student hires start at VMware is
February 2022 (the beginning of the new fiscal year, FY23).

While this timeline might seem very proactive, and a bit counterintuitive given that upcoming
UT plans are made before annual budgets are set, Maria emphasized that timing is a
huge factor in capturing optimal university talent, especially from a DEI perspective
(discussed later in this article). While hiring for industry positions happens
on a shorter cycle, usually month-to-month, university students are dependent on
when universities and/or local labor laws allow them to work outside of their studies over
the course of a school year, so it’s ideal to have UT focus solely on those optimal hiring
periods. We work carefully with hiring teams to ensure they get the full benefit of having
student talent as interns and NG hires, so a lot of thought and planning is put into the
entire UT experience.

Importance of university-level talent and what it brings to VMware


When we were talking with Maria, Katherine, and Cherielynn about their roles and
experiences, their enthusiasm for UT was immediately obvious. While many teams at
VMware are familiar with hiring interns and NGs, all three still wanted to emphasize the
benefit these hires bring to teams, as well as the general importance of university-level talent to
a company’s overall talent acquisition strategy.

As a UT Engagement Manager, Maria has a unique perspective on university talent, as she
sees it mostly from the BU perspective. When asked about the importance of
university-level talent, Maria said companies should take advantage of the fact that students are
likely in tune with the newest technologies, which allows the hiring companies to stay
competitive and relevant. As Maria puts it, “students have an edge in being unvarnished
problem-solvers who are natural early adapters of the latest tech.”

Cherielynn and Katherine see the benefits of university talent through their direct work
with interns and NGs. When asked about the best part of her job, Cherielynn said:
Without a doubt, the best part of my job is getting to know the interns and listening to their
experiences. There’s never a dull moment with the interns. It’s a plus when they come back
as new grads, and I get to see them further their careers at VMware!

From her work overseeing the AMER Intern Program, Cherielynn has taken note of the
innovations and new viewpoints interns can bring to the table. As Cherielynn says:
We always talk about innovation as a pillar of our growth and I believe our interns and
new grads play a role in that. This emerging talent brings fresh perspectives to the table
and is always willing to push the envelope with a passion for technology. The pipeline we
have quite literally builds the future workforce at VMware!

Like Cherielynn, Katherine sees the benefit NGs bring to VMware’s culture, technology, and
customers. According to Katherine,
New graduates bring a fresh set of ideas and energy to the company that can be quite
infectious! They’re creative, social, and willing to go above and beyond at the opportunity to
learn. For many new grads, VMware may be their first corporate job where they are working
on real-world projects, and they bring in fresh perspectives and a can-do attitude that is
beneficial to VMware’s culture and growth.

When asked about the best part of her job, Katherine answered,
The best part of my job is being able to work with students who are hungry to learn new
skills and ready to challenge themselves at VMware! I particularly enjoy working with
students because they bring a fresh mindset to VMware, and I believe we can learn from
them just as much as they learn from us.

Like Katherine, Maria believes in NGs’ high-impact energy and desire to create value for
their teams and customers. As she points out,
Teams benefit from hiring New College Grads because they are eager, bright employees who
are ready to learn the basics of the job. This also allows experienced workers the ability to
take on higher level projects—so everyone wins.

COVID-19 & University Talent


For over a year, COVID-19 has impacted every part of our lives, including the way we
recruit, hire, and build experiences for our new hires. Across the technology industry,
the COVID-19 pandemic and the subsequent all-virtual environments changed the way
internships had been run in the past, with many companies cancelling their 2020 and
2021 internship programs altogether. Despite these changes, the VMware University
Talent team was able to pivot quickly while still bringing in top-tier talent.

We asked Cherielynn how COVID-19 impacted her work and the internship
program. Overall, her team was surprised that the framework of their program was not
dramatically different when they decided late in the planning season to make the 2020
internship program completely virtual. In some ways, managing the logistics of all-virtual
internship events was actually easier than previous years’ in-person events, and costs
were cut down significantly. Also, more people were able to attend virtual events than ever
before; the annual Intern Poster Session used to be tied to office locations, but the 2020
virtual Poster Session allowed over 2,000 VMware employees from all over the world to
attend!

Understandably, the biggest difference from previous years was the lack of face-to-face
interaction the interns had with other interns, their managers, and their mentors, but,
regardless of that change, interns still had an overwhelmingly positive experience. The
VMware Global Intern Net Promoter Score for the 2020 UT Season was 86.5%—on par
with previous years.

Overall, the switch to a virtual internship encouraged the team to think outside of the box
and re-evaluate their processes. For 2021, the decision to go virtual was made far earlier in
the planning season, so Cherielynn and her team have had plenty of time to adjust the program
based on the feedback from last season.

From Katherine’s perspective, hiring of NGs has not been greatly impacted by COVID-19.
At the beginning of the pandemic, her team had to establish some new processes, but now
they continue to see many applicants and referrals. As Katherine works on many of the
UT programs, she notes that each of VMware’s partner universities has seamlessly
transitioned to virtual events and conferences.

Also, VMware’s flexibility to work remotely has allowed Katherine’s team to have a greater
reach among the new graduate community, as it increases opportunities for new
graduates who may not have considered VMware before because of office location or
relocation limitations.

Unfortunately, COVID-19 delayed 2020 UT hiring by almost two months, a very long time
in the University Talent realm! However, in terms of global recruitment methods, Maria
is confident that virtual engagement with students will be the norm from now on.
Before COVID-19, UT was already transitioning away from solely in-person career fairs to
virtual formats, as some countries saw attendance skyrocket when events moved to a
virtual platform well before the pandemic. While it’s still too early to see how lasting some
of the changes will be, Maria notes that “it’s reasonable to assume virtual event hiring and
brand engagement are worthy investments for UT moving forward.”

Beyond UT, other teams within VMware saw a shift in how they partnered with interns
in 2020 due to COVID-19. VMware’s Research & Emerging Technologies team within the
Office of the CTO is enthusiastic about hiring interns to work on their ground-breaking
research projects. Interns bring a different perspective and new technologies to the
project. Their curiosity and fresh eyes inject a new energy into the team. In addition to
having unique knowledge and skills, interns serve as a conduit to collaborating with their
respective academic research groups.

COVID-19 brought both challenges and opportunities to the research intern program.
Many interns and researchers missed the in-person aspects of working together that we
all miss, mostly lunch. In lieu of crystallizing ideas via whiteboard, teams utilized Slack
and GitHub. Researcher Lalith Suresh had a very productive summer with his remote
intern. As he explained, “We’ve been using GitHub heavily to brainstorm. He thinks of an
idea or next step, writes it down in detail in a GitHub issue. Then the rest of us pile on and
iterate from there.”

Researcher Radhika Niranjan Mysore found the lack of a shared physical location to be an
advantage. She said,
Being diligent about writing down ideas, thoughts and decisions made throughout the
project helps with remote collaboration, but beyond that it helps if you’re working on a
publication you move more quickly in terms of the paper content and getting it in shape
well in advance of deadlines.

The networking aspect was helped by a summer-long game, in which interns were
paired with 3 different researchers every two weeks to solve a “fun” problem. While not
quite as effective as impromptu office chats, it did give interns a chance to work with
folks who weren’t part of their project and do some informal networking while remote.

Innovations in University Talent


Innovation can be seen across VMware, from our Product organizations building new
technologies to our Operations teams supporting our employees through forward-
thinking programs, and the UT team is no exception. We asked Maria, Cherielynn, and
Katherine how they thought VMware’s UT team was innovative, especially compared to
other companies in the industry.

Cherielynn believes the UT team’s innovation comes down to the overall experiences and
opportunities VMware interns get to have. As Cherielynn points out, the interns get to:
Work on cutting edge technology and projects that have a business impact but also
experience our company culture firsthand. The way our managers, mentors, and teams
model our EPIC2 values day in and out truly makes a difference in their intern’s experience.

Katherine also sees innovations through the UT experience, specifically in the UT hiring
programs offered by VMware. Programs such as Academy, Propel, Launch, and Flight
ensure every new graduate, regardless of position or team, will be well-equipped with the
tools and resources needed to excel at VMware. As Katherine also points out, unlike at other
companies, every NG is onboarded into at least one, if not two, of the programs, which
guarantees that no new graduate is left behind.

Maria also attributes the experience offered to interns and UT’s hiring programs to the
innovation of the UT team. Interns work on projects that are both meaningful and tied to
VMware’s bottom line because UT views interns as future permanent hires.

Maria also notes the innovative strategy in how UT reaches students from diverse
backgrounds. Aligning the UT annual planning with academic seasonality is a big
benefit in this area, as 70% of the overall DEI UT strategy and budget is focused on DEI
conferences and branding events, most of which happen in the September to December
timeframe. Therefore, BUs having their intern and NG plans ready to go by the start of
the UT season on July 1 allows them to get embedded into the DEI recruiting community.
Outside of the US, VMware partners with many universities on their top-tier computer
science curriculum and they also support student populations with higher concentrations
of women and/or underrepresented minorities.

Get Involved with University Talent


Now that you’ve read all about the benefits and innovations happening within the
University Talent space—not to mention the adaptability of the UT teams when
something unprecedented like COVID-19 happens—you might be wondering how to get
involved!

One of the best ways is to check out the Get Involved with University Talent Source
Page7. From there, you can learn more about UT programs to volunteer with, and find UT
contacts to reach out to for more information. You can also reach out to Walter Christmas8
or Monisha Kothari9, two members of VMware’s University Talent team who can answer
questions and help get you involved.

7 https://source.vmware.com/portal/pages/HR/get-involved-with-university-talent
8 wchristmas@vmware.com
9 kmonisha@vmware.com

Another way to get involved is to encourage your leadership to hire interns and NGs.
Reach out to your University Talent Engagement Manager to find out more about what
your BU can do. As we’ve learned, interns and NGs can contribute in many more ways
than helping with a project. They offer new perspectives and help in finding creative
solutions, as well as positively impact VMware’s culture and direction. And while we
might be past the planning cut-offs for UT’s 2022 season, it’s never too early to start
thinking about 2023!

Thank you to Maria Raimundo, Cherielynn Tsay, Katherine Nguyen and Lori Blonn for
their help with this article and their positive impact on VMware through their hard work.

Kate Wilkinson is a Sr. Program Manager working in the Office of the CTO. She’s an expert in
diversity matters and leads that effort in OCTO. She also happens to be an amazing human
being. You can reach her at katherinew@vmware.com.

STAFF
DEXTER ARVER
LORI BLONN
BEN DUONG
SLOAN GRIFFIN
BOB MOTANAGH
ZACH SHEPHERD
KATE WILKINSON
ANDREW ZUSMAN

EDITORS
DEXTER ARVER
MOHSIN BEG
LORI BLONN
BEN DUONG
AUSTIN ROTH EAGLE
BEN PFAFF
LEONID RYZHYK
JOE SAMAGOND

CONTRIBUTORS
DEXTER ARVER
RUMEN BAROV
SHAWN BASS
EMAD BENJAMIN
BRIANNA BLACET
MIHAI BUDIU
LUIS VALERIO CASTILLO
CRAIG CONNORS
JOHN DIRICO
ERICA DOHRING
JEN HANDLER
MICHAEL HEIN
GREG LAVENDER
PERE MONCLUS
BOB MOTANAGH
DALE OLDS
CHIRAG PATEL
RENU RAMAN
TOM SCANLAN
NATASHA TUCK
KATE WILKINSON
ANDREW ZUSMAN

VMWARE INNOVATE
JUNE 2021
VMWARE CONFIDENTIAL

The journal cover is a digital art piece by
Hanna Friend called “Perpetual Innovation”.
She would like y'all to know that this is The
Super Duper Fun Issue.

Also, we’re always looking for interesting
stories. If you have a story to tell, reach out
to Dexter Arver on Slack.
