Datadog Ebook DeveloperEnablementPractices
The problem: A massively scaling dev team, but not enough ops talent
In 2015, I found myself in a tough but familiar situation (and one that many readers will recognize). Our team had a huge amount of development to do on an ambitious new web platform, and we were hiring developers by the dozen. But we couldn’t possibly scale the ops team to support the growth of the development team. To put it simply, we couldn’t find and hire good ops engineers at the same rate that we could find and hire developers.

How were we going to support our rapidly expanding development team? We needed to find creative ways to empower our developers to write high quality code, and then deploy, monitor, and remediate problems themselves.
NINE DEVELOPER ENABLEMENT PRACTICES TO ACHIEVE DEVOPS AT ENTERPRISE SCALE
A well-planned metrics and monitoring strategy yields higher quality code, lower support costs, and more self-sufficient development teams.
These practices help ensure that developers are focused on writing high-quality code that meets business requirements and not worrying about ops tasks. Below, I describe these principles in more detail and explain how we implemented them in our environment.
As part of the questionnaire, developers must also think about alerting. I firmly believe that if you’re going to monitor a metric, you should alert on it as well. A dashboard is fine, but if something isn’t right, somebody should get a notification. Therefore, this questionnaire also forces developers and product teams to go beyond metrics and think about key performance indicators (KPIs) and service level indicators (SLIs).
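The "if you monitor it, alert on it" rule can be checked mechanically. The following is only a minimal sketch under stated assumptions: the class name, the metric names, and the idea of representing alerts as a map from alert name to metric name are all hypothetical and are not taken from the questionnaire or tooling described here.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: names and data shapes are illustrative assumptions.
final class AlertCoverage {

    // Returns the monitored metrics that no alert refers to, in sorted order.
    static Set<String> metricsWithoutAlerts(Set<String> monitoredMetrics,
                                            Map<String, String> alertToMetric) {
        Set<String> uncovered = new TreeSet<>(monitoredMetrics);
        uncovered.removeAll(alertToMetric.values());
        return uncovered;
    }

    public static void main(String[] args) {
        Set<String> metrics = Set.of("records.processed", "records.failed", "queue.depth");
        Map<String, String> alerts = Map.of(
                "High failure rate", "records.failed",
                "Queue backing up", "queue.depth");
        // prints [records.processed]
        System.out.println(metricsWithoutAlerts(metrics, alerts));
    }
}
```

A non-empty result would flag metrics that appear on a dashboard but would never notify anyone.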
As a sidenote, KPIs and SLIs describe critical business or system functions in context, whereas metrics represent point-in-time data without the necessary context. The key difference is in defining what is “normal.” For example, the count of records processed over the last 60 minutes is a metric: it doesn’t provide any business context. The percentage of records processed successfully over the last 60 minutes is a more useful indicator.

Ideally, developers fill out this questionnaire along with their counterparts on the Product Team before they begin writing code. The understanding is that these metrics and alerts are going to be visible to everybody (including executives). This exercise should inform how developers write the code. Then, when developers are getting ready to deliver their code, the architecture team uses this questionnaire during our governance check. We verify (1) that the developers are going to be accountable for the things they said were important in the questionnaire; and (2) that dashboards and alerts have been checked into the repository.
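To make the metric-versus-indicator distinction above concrete, here is a small sketch. The class name, method names, the treatment of an idle window, and the 99.5% target are assumptions chosen for illustration; they do not come from the questionnaire itself.

```java
// Illustrative only: the names and the 99.5% target below are assumptions.
final class RecordProcessingSli {

    // Raw metric: how many records were processed in the window. No notion of "normal".
    static long processedCount(long succeeded, long failed) {
        return succeeded + failed;
    }

    // Indicator: percentage of records processed successfully in the window.
    static double successPercentage(long succeeded, long failed) {
        long total = succeeded + failed;
        if (total == 0) {
            return 100.0; // assumption: an idle window counts as healthy
        }
        return 100.0 * succeeded / total;
    }

    // An indicator can be compared against an agreed target; a bare count cannot.
    static boolean meetsTarget(long succeeded, long failed, double targetPercent) {
        return successPercentage(succeeded, failed) >= targetPercent;
    }

    public static void main(String[] args) {
        long succeeded = 9_900;
        long failed = 100;
        System.out.println(processedCount(succeeded, failed));    // 10000
        System.out.println(successPercentage(succeeded, failed)); // 99.0
        System.out.println(meetsTarget(succeeded, failed, 99.5)); // false
    }
}
```

The count alone (10000) says nothing about whether the service is healthy, while the percentage can be alerted on as soon as it drops below the agreed target.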
So when developers want to bootstrap a project, they simply click through the Spring Initializr, as most of the manual microservice setup steps are automated. We really want developers to be focused on writing high quality code, rather than worrying about ops tasks.
Example documentation providing context on how to use our custom @SLA annotation
But first, developers need training and enablement.

We require all developers to complete a microservice bootcamp. This bootcamp contains all the coding patterns, including how to create KPIs, SLIs, and custom metrics, that developers need to get started when building services. This bootcamp is “intra-sourced” and we encourage active contributions, such as lessons learned and new microservice patterns.

The bootcamp materials are currently written in AsciiDoc, an excellent vehicle for linking and embedding source files to provide additional context and clarity. Because the Java files are embedded directly, the examples and solutions in the docs have to compile and build successfully before new documentation can be deployed. This ensures that example code will not be stale or broken.

During the bootcamp, we also introduce our custom libraries and our AOP (aspect-oriented programming) annotations, which experience has shown us create the most useful and simple high-level KPIs and SLIs. However, we also strongly encourage individual team creativity when devising the metrics and visualizations that will work best for their services.
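The source does not show how the custom annotations are implemented, so the following is only a rough sketch of the general idea behind an annotation-driven SLI. It uses a hypothetical `Sla` annotation with an assumed `maxMillis` attribute, and it swaps in a plain JDK dynamic proxy in place of a real AOP framework so that the example stays self-contained. `RecordService` and `SlaTracker` are likewise invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a custom SLA annotation; the real one is not shown in the source.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Sla {
    long maxMillis(); // assumed attribute: latency budget per call
}

// Hypothetical service interface used only for this example.
interface RecordService {
    @Sla(maxMillis = 200)
    String process(String record);
}

final class SlaTracker {
    final AtomicLong total = new AtomicLong();
    final AtomicLong withinSla = new AtomicLong();

    // Wraps a target so that every call to an @Sla method is timed and counted.
    @SuppressWarnings("unchecked")
    <T> T instrument(Class<T> type, T target) {
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type},
                (proxy, method, args) -> intercept(target, method, args));
    }

    private Object intercept(Object target, Method method, Object[] args) throws Throwable {
        Sla sla = method.getAnnotation(Sla.class);
        long start = System.nanoTime();
        boolean succeeded = false;
        try {
            Object result = method.invoke(target, args);
            succeeded = true;
            return result;
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the service's own exception
        } finally {
            if (sla != null) {
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                total.incrementAndGet();
                if (succeeded && elapsedMillis <= sla.maxMillis()) {
                    withinSla.incrementAndGet();
                }
            }
        }
    }

    // SLI: percentage of annotated calls that succeeded within their latency budget.
    double compliancePercent() {
        long n = total.get();
        return n == 0 ? 100.0 : 100.0 * withinSla.get() / n;
    }

    public static void main(String[] args) {
        SlaTracker tracker = new SlaTracker();
        RecordService service = tracker.instrument(RecordService.class, r -> r.toUpperCase());
        service.process("abc");
        System.out.println(tracker.compliancePercent());
    }
}
```

The appeal of this style is that the developer only declares the budget on the method; the timing and counting live in one shared place. In a Spring codebase the same interception would more likely be written as an aspect, which is an assumption about the environment rather than something the source states.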
It's important to monitor what tools your devs are using before they deploy
With so many components and developers, it’s very easy to have configuration and versioning drift. To resolve this problem, we created an automated compliance checker that checks the status of roughly 20 framework dependencies, verifies the build quality, and ensures that devs are using up-to-date versions of the web framework, library, containers, and other components. This final, pre-deployment step helps developers avoid running stale or insecure legacy items in production.

The compliance checker generates a compliance score and a green or red result. If green, the developer is clear to push to production. If red, the developer understands that the microservice is no longer compliant with our framework standards. Updating everything to the latest versions is a click of a button, as shown below.
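The scoring step described above could be sketched roughly as follows. This is not the actual checker: the component names, version numbers, the dotted-numeric version format, and the rule that only a perfect score is green are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical sketch: component names, versions, and the scoring rule are assumptions.
final class ComplianceChecker {

    // Compares dotted numeric versions such as "2.7.10"; missing parts count as 0.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Score = percentage of required components that are present and at least the minimum version.
    static int score(Map<String, String> required, Map<String, String> actual) {
        if (required.isEmpty()) {
            return 100;
        }
        int compliant = 0;
        for (Map.Entry<String, String> e : required.entrySet()) {
            String have = actual.get(e.getKey());
            if (have != null && compareVersions(have, e.getValue()) >= 0) {
                compliant++;
            }
        }
        return 100 * compliant / required.size();
    }

    // Assumed rule: green only when every required component is compliant.
    static String result(int score) {
        return score == 100 ? "GREEN" : "RED";
    }

    public static void main(String[] args) {
        Map<String, String> required = Map.of(
                "web-framework", "2.7.0",
                "metrics-library", "1.4.2",
                "base-container", "11.0");
        Map<String, String> actual = Map.of(
                "web-framework", "2.7.3",
                "metrics-library", "1.3.9", // older than required
                "base-container", "11.0");
        int s = score(required, actual);
        System.out.println(s + " " + result(s)); // 66 RED
    }
}
```

Comparing version parts numerically rather than as strings matters here: a plain string comparison would rank "2.10.0" below "2.9.5".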
The compliance checker interface makes it easy to see what’s out of compliance and make the required updates.
Conclusion
The metrics-driven mindset has enabled the team to deliver higher quality software faster. Our very lean ops and architecture team has driven the practices described above, freeing developers to focus on writing good code and instrumenting key business metrics, rather than worrying about ops tasks. In addition, the metrics-driven mindset gives devs ownership over their microservices and encourages devs to identify and troubleshoot issues themselves.