
Observability with Grafana: Monitor, control, and visualize your Kubernetes and cloud platforms using the LGTM stack
Rob Chapman
The copyright of this book belongs to Packt Publishing
Observability with Grafana

Monitor, control, and visualize your Kubernetes and cloud platforms using the LGTM stack

Rob Chapman

Peter Holmes

BIRMINGHAM—MUMBAI
Observability with Grafana
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, without the prior written permission of the publisher, except in the case
of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information
presented. However, the information contained in this book is sold without warranty, either express
or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable
for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and
products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot
guarantee the accuracy of this information.

Group Product Manager: Preet Ahuja


Publishing Product Manager: Surbhi Suman
Book Project Manager: Ashwini Gowda
Senior Editor: Shruti Menon
Technical Editor: Nithik Cheruvakodan
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Ponraj Dhandapani
DevRel Marketing Coordinator: Rohan Dobhal
Senior DevRel Marketing Coordinator: Linda Pearlson

First published: December 2023

Production reference: 1141223

Published by Packt Publishing Ltd.


Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK

ISBN 978-1-80324-800-4

www.packtpub.com
To all my children, for making me want to be more. To Heather, for making that possible.

– Rob Chapman

For every little moment that brought me to this point.

– Peter Holmes
Contributors

About the authors


Rob Chapman is a creative IT engineer and founder at The Melt Cafe, with two decades of experience
in the full application life cycle. Working over the years for companies such as the Environment Agency,
BT Global Services, Microsoft, and Grafana, Rob has built a wealth of experience with large, complex
systems. More than anything, Rob loves saving energy, time, and money and has a track record of
bringing production-related concerns forward so that they are addressed earlier in the development
cycle, when they are cheaper and easier to solve. In his spare time, Rob is a Scout leader, and he enjoys
hiking, climbing, and, most of all, spending time with his family and six children.

A special thanks to Peter, my co-author/business partner, and to our two reviewers, Flick and Brad –
we did this!

Heather, my best friend – thank you for believing in me and giving me space to grow.

Thank you to my friend and coach, Sam, for guiding me to be all I can be and not being afraid to show
that to the world.

Last but not least, thanks to Phil for the sanctuary over the years – you kept me sane.

Peter Holmes is a senior engineer with a deep interest in digital systems and how to use them to solve
problems. With over 16 years of experience, he has worked in various roles in operations. Working at
organizations such as Boots UK, Fujitsu Services, Anaplan, Thomson Reuters, and the NHS, he has
experience in complex transformational projects, site reliability engineering, platform engineering,
and leadership. Peter has a history of taking time to understand the customer and ensuring Day-2+
operations are as smooth and cost-effective as possible.

A special thanks to Rob, my co-author, for bringing me along on this writing journey, and to our
reviewers.

Rania, my wife – thank you for helping me stay sane while writing this book.
About the reviewers
Felicity (Flick) Ratcliffe has almost 20 years of experience in IT operations. Starting her career in
technical support for an internet service provider, Flick has used her analytical and problem-solving
superpowers to grow her career. She has worked for various companies over the past two decades in
frontline operational positions, such as systems administrator, site reliability engineer, and now, for
Post Office Ltd in the UK, as a cloud platform engineer. Continually developing her strong interest
in emerging technologies, Flick has become a specialist in the area of observability over the past few
years and champions the cause for OpenTelemetry within the organizations she works at or comes
into contact with.

I have done so much with my career in IT, thanks to my colleagues, past and present. Thank you for
your friendship and mentorship. You encouraged me to develop my aptitude for technology, and I
cannot imagine myself in any other type of work now.

Bradley Pettit has over 10 years of experience in the technology industry. His expertise spans a range
of roles, including hands-on engineering and solution architecture. Bradley excels in addressing
complex technical challenges, thanks to his strong foundation in platform and systems engineering,
automation, and DevOps practices. Recently, Bradley has specialized in observability, working as a
senior solutions architect at Grafana Labs. He is a highly analytical, dedicated, and results-oriented
professional. Bradley’s customer-centric delivery approach empowers organizations and the O11y
community to achieve transformative outcomes.
Table of Contents
Preface

Part 1: Get Started with Grafana and Observability
Chapter 1, Introducing Observability and the Grafana Stack
• Observability in a nutshell
• Case study – A ship passing through the Panama Canal
• Telemetry types and technologies
• Metrics
• Logs
• Distributed traces
• Other telemetry types
• Introducing the user personas of observers
• Diego Developer
• Ophelia Operator
• Steven Service
• Pelé Product
• Masha Manager
• Introducing the Grafana stack
• The core Grafana stack
• Grafana Enterprise plugins
• Grafana incident response and management
• Other Grafana tools
• Alternatives to the Grafana stack
• Data collection
• Data storage, processing, and visualization
• Deploying the Grafana stack
• Summary

Chapter 2, Instrumenting Applications and Infrastructure
• Common log formats
• Structured, semi-structured, and unstructured logging
• Sample log formats
• Exploring metric types and best practices
• Metric types
• Comparing metric types
• Metric protocols
• Best practices for implementing metrics
• Tracing protocols and best practices
• Spans and traces
• Tracing protocols
• Best practices for setting up distributed tracing
• Using libraries to instrument efficiently
• Popular libraries for different programming languages
• Infrastructure data technologies
• Common infrastructure components
• Common standards for infrastructure components
• Summary

Chapter 3, Setting Up a Learning Environment with Demo Applications
• Technical requirements
• Introducing Grafana Cloud
• Setting up an account
• Exploring the Grafana Cloud Portal
• Exploring the Grafana instance
• Installing the prerequisite tools
• Installing WSL2
• Installing Homebrew
• Installing container orchestration tools
• Installing a single-node Kubernetes cluster
• Installing Helm
• Installing the OpenTelemetry Demo application
• Setting up access credentials
• Downloading the repository and adding credentials and endpoints
• Installing the OpenTelemetry Collector
• Installing the OpenTelemetry demo application
• Exploring telemetry from the demo application
• Logs in Loki
• Metrics in Prometheus/Mimir
• Traces in Tempo
• Adding your own applications
• Troubleshooting your OpenTelemetry Demo installation
• Checking Grafana credentials
• Reading logs from the OpenTelemetry Collector
• Debugging logs from the OpenTelemetry Collector
• Summary

Part 2: Implement Telemetry in Grafana


Chapter 4, Looking at Logs with Grafana Loki
• Technical requirements
• Updating the OpenTelemetry demo application
• Introducing Loki
• Understanding LogQL
• LogQL query builder
• An overview of LogQL features
• Log stream selector
• Log pipeline
• Exploring LogQL metric queries
• Exploring Loki’s architecture
• Tips, tricks, and best practices
• Summary

Chapter 5, Monitoring with Metrics Using Grafana Mimir and Prometheus
• Technical requirements
• Updating the OpenTelemetry demo application
• Introducing PromQL
• An overview of PromQL features
• Writing PromQL
• Exploring data collection and metric protocols
• StatsD and DogStatsD
• OTLP
• Prometheus
• SNMP
• Understanding data storage architectures
• Graphite architecture
• Prometheus architecture
• Mimir architecture
• Using exemplars in Grafana
• Summary

Chapter 6, Tracing Technicalities with Grafana Tempo
• Technical requirements
• Updating the OpenTelemetry Demo application
• Introducing Tempo and the TraceQL query language
• Exploring the Tempo features
• Exploring the Tempo Query language
• Pivoting between data types
• Exploring tracing protocols
• What are the main tracing protocols?
• Context propagation
• Understanding the Tempo architecture
• Summary

Chapter 7, Interrogating Infrastructure with Kubernetes, AWS, GCP, and Azure
• Technical requirements
• Monitoring Kubernetes using Grafana
• Kubernetes Attributes Processor
• Kubeletstats Receiver
• Filelog Receiver
• Kubernetes Cluster Receiver
• Kubernetes Object Receiver
• Prometheus Receiver
• Host Metrics Receiver
• Visualizing AWS telemetry with Grafana Cloud
• Amazon CloudWatch data source
• Exploring AWS integration
• Monitoring GCP using Grafana
• Configuring the data source
• Google Cloud Monitoring query editor
• Google Cloud Monitoring dashboards
• Monitoring Azure using Grafana
• Configuring the data source
• Using the Azure Monitor query editor
• Using Azure Monitor dashboards
• Best practices and approaches
• Summary

Part 3: Grafana in Practice


Chapter 8, Displaying Data with Dashboards
• Technical requirements
• Creating your first dashboard
• Developing your dashboard further
• Using visualizations in Grafana
• Developing a dashboard purpose
• Advanced dashboard techniques
• Managing and organizing dashboards
• Case study – an overall system view
• Summary

Chapter 9, Managing Incidents Using Alerts
• Technical requirements
• Being alerted versus being alarmed
• Before an incident
• During an incident
• After an incident
• Writing great alerts using SLIs and SLOs
• Grafana Alerting
• Alert rules
• Contact points, notification policies, and silences
• Groups and admin
• Grafana OnCall
• Alert groups
• Inbound integrations
• Templating
• Escalation chains
• Outbound integrations
• Schedules
• Grafana Incident
• Summary

Chapter 10, Automation with Infrastructure as Code
• Technical requirements
• Benefits of automating Grafana
• Introducing the components of observability systems
• Automating collection infrastructure with Helm or Ansible
• Automating the installation of the OpenTelemetry Collector
• Automating the installation of Grafana Agent
• Getting to grips with the Grafana API
• Exploring the Grafana Cloud API
• Using Terraform and Ansible for Grafana Cloud
• Exploring the Grafana API
• Managing dashboards and alerts with Terraform or Ansible
• Summary

Chapter 11, Architecting an Observability Platform
• Architecting your observability platform
• Defining a data architecture
• Establishing system architecture
• Management and automation
• Developing a proof of concept
• Containerization and virtualization
• Data production tools
• Setting the right access levels
• Sending telemetry to other consumers
• Summary

Part 4: Advanced Applications and Best Practices of Grafana
Chapter 12, Real User Monitoring with Grafana
• Introducing RUM
• Setting up Grafana Frontend Observability
• Exploring Web Vitals
• Pivoting from frontend to backend data
• Enhancements and custom configurations
• Summary

Chapter 13, Application Performance with Grafana Pyroscope and k6
• Using Pyroscope for continuous profiling
• A brief overview of Pyroscope
• Searching Pyroscope data
• Continuous profiling client configuration
• Understanding the Pyroscope architecture
• Using k6 for load testing
• A brief overview of k6
• Writing a test using checks
• Writing a test using thresholds
• Adding scenarios to a test to run at scale
• Test life cycle
• Installing and running k6
• Summary

Chapter 14, Supporting DevOps Processes with Observability
• Introducing the DevOps life cycle
• Using Grafana for fast feedback during the development life cycle
• Code
• Test
• Release
• Deploy
• Operate
• Monitor
• Plan
• Using Grafana to monitor infrastructure and platforms
• Observability platforms
• CI platforms
• CD platforms
• Resource platforms
• Security platforms
• Summary

Chapter 15, Troubleshooting, Implementing Best Practices, and More with Grafana
• Best practices and troubleshooting for data collection
• Preparing for data collection
• Data collection decisions
• Debugging collector
• Best practices and troubleshooting for the Grafana stack
• Preparing the Grafana stack
• Grafana stack decisions
• Debugging Grafana
• Avoiding pitfalls of observability
• Future trends in application monitoring
• Summary

Index

Other Books You May Enjoy


Preface
Hello and welcome! Observability with Grafana is a book about the tools offered by Grafana Labs for
observability and monitoring. Grafana Labs is an industry-leading provider of open source tools to
collect, store, and visualize data collected from IT systems. This book is primarily aimed toward IT
engineers who will interact with these systems, whatever discipline they work in.
We have written this book because we have seen some common problems across organizations:

• Systems that were designed without a strategy for scaling are being pushed to handle additional
data load or additional teams using the system
• Operational costs are not being attributed correctly in the organization, leading to poor cost
analysis and management
• Incident management processes treat the humans involved as robots without sleep schedules
or parasympathetic nervous systems

In this book, we will use the OpenTelemetry Demo application to simulate a real-world environment
and send the collected data to a free Grafana Cloud account that we will create. This will guide you
through the Grafana tools for collecting telemetry and also give you hands-on experience using the
administration and support tools offered by Grafana. This approach will teach you how to run the
Grafana tools in a way that lets anyone experiment and learn independently.
This is an exciting time for Grafana, identified as a visionary in the 2023 Gartner Magic Quadrant
for Observability (https://www.gartner.com/en/documents/4500499). They recently
delivered change in two trending areas:

• Cost reduction: Grafana has become the first vendor in the observability space to release
tools that not only help you understand your costs but also reduce them.
• Artificial intelligence (AI): Grafana has introduced generative AI tools that assist daily operations
in simple yet effective ways – for example, writing an incident summary automatically. Grafana
Labs also recently purchased Asserts.ai to simplify root cause analysis and accelerate problem
detection.

We hope you enjoy learning some new things with us and have fun doing it!

Who this book is for


IT engineers, support teams, and leaders can gain practical insights into bringing the huge power of
an observability platform to their organization. The book will focus on engineers in disciplines such
as the following:

• Software development: Learn how to quickly instrument applications and produce great
visualizations enabling applications to be easily supported
• Operational teams (DevOps, Operations, Site Reliability, Platform, or Infrastructure): Learn
to manage an observability platform or other key infrastructure platform and how to manage
such platforms in the same way as any other application
• Support teams: Learn how to work closely with development and operational teams to have
great visualizations and alerting in place to quickly respond to customers’ needs and IT incidents

This book will also clearly establish the role of leadership in incident management, cost management,
and establishing an accurate data model for this powerful dataset.

What this book covers


Chapter 1, Introducing Observability and the Grafana Stack, provides an introduction to the Grafana
product stack in relation to observability as a whole. You will learn about the target audiences and
how that impacts your design. We will take a look at the roadmap for observability tooling and how
Grafana compares to alternative solutions. We will explore architectural deployment models, from
self-hosted to cloud offerings. By the end, you will be able to answer the question “Why choose Grafana?”.
Chapter 2, Instrumenting Applications and Infrastructure, takes you through the common protocols and
best practices for each telemetry type at a high level. You will be introduced to widely used libraries for
multiple programming languages that make instrumenting an application simple. Common protocols
and strategies for collecting data from infrastructural components will also be discussed. This chapter
provides a high-level overview of the technology space and aims to be valuable for quick reference.
Chapter 3, Setting Up a Learning Environment with Demo Applications, explains how to install and set
up a learning environment that will support you through later sections of the book. You will also learn
how to explore the telemetry produced by the demo app and add monitoring for your own service.
Chapter 4, Looking at Logs with Grafana Loki, takes you through working examples to understand LogQL. You
will then be introduced to common log formats, and their benefits and drawbacks. Finally, you will be
taken through the important architectural designs of Loki, and best practices when working with it.
Chapter 5, Monitoring with Metrics Using Grafana Mimir and Prometheus, discusses working examples
to understand PromQL with real data. Detailed information about the different metric protocols
will be discussed. Finally, you will be taken through important architectural designs backing Mimir,
Prometheus, and Graphite that guide best practices when working with the tools.

Chapter 6, Tracing Technicalities with Grafana Tempo, shows you working examples to understand
TraceQL with real data. Detailed information about the different tracing protocols will be discussed.
Finally, you will be taken through the important architectural designs of Tempo, and best practices
when working with it.
Chapter 7, Interrogating Infrastructure with Kubernetes, AWS, GCP, and Azure, describes the setup and
configuration used to capture telemetry from infrastructure. You will learn about the different options
available for Kubernetes. Additionally, you will investigate the main plugins that allow Grafana to
query data from cloud vendors such as AWS, GCP, and Azure. You will look at solutions to handle large
volumes of telemetry where direct connections are not scalable. The chapter will also cover options
for filtering and selecting telemetry data before it gets to Grafana for security and cost optimization.
Chapter 8, Displaying Data with Dashboards, explains how you can set up your first dashboard in the
Grafana UI. You will also learn how to present your telemetry data in an effective and meaningful way.
The chapter will also teach you how to manage your Grafana dashboards to be organized and secure.
Chapter 9, Managing Incidents Using Alerts, describes how to set up your first Grafana alert with Alert
Manager. You will learn how to design an alert strategy that prioritizes business-critical alerts over
ordinary notifications. Additionally, you will learn about alert notification policies, different delivery
methods, and what to look for.
Chapter 10, Automation with Infrastructure as Code, gives you the tools and approaches to automate
parts of your Grafana stack deployments while introducing standards and quality checks. You will take
a deep dive into the Grafana API, working with Terraform, and how to protect changes with validation.
Chapter 11, Architecting an Observability Platform, will show those of you who are responsible for
offering an efficient and easy-to-use observability platform how you can structure your platform so
you can delight your internal customers. In the environment we operate in, it is vital to offer these
platform services as quickly and efficiently as possible, so more time can be dedicated to the production
of customer-facing products. This chapter aims to build on the ideas already covered to get you up
and running quickly.
Chapter 12, Real User Monitoring with Grafana, introduces you to frontend application observability,
using Grafana Faro and Grafana Cloud Frontend Observability for real user monitoring (RUM). This
chapter will discuss instrumenting your frontend browser applications. You will learn how to capture
frontend telemetry and link this with backend telemetry for full stack observability.
Chapter 13, Application Performance with Grafana Pyroscope and k6, introduces you to application
performance and profiling using Grafana Pyroscope and k6. You will obtain a high-level overview that
discusses the various aspects of k6 for smoke, spike, stress, and soak tests, as well as using Pyroscope
to continuously profile an application both in production and test environments.

Chapter 14, Supporting DevOps Processes with Observability, takes you through DevOps processes
and how they can be supercharged with observability with Grafana. You will learn how the Grafana
stack can be used in the development stages to speed up the feedback loop for engineers. You will
understand how to prepare engineers to operate the product in production. Finally, you will learn
when and how to implement CLI and automation tools to enhance the development workflow.
Chapter 15, Troubleshooting, Implementing Best Practices, and More with Grafana, closes the book by
taking you through best practices when working with Grafana in production. You will also learn some
valuable troubleshooting tips to support you with high-traffic systems in day-to-day operations. You will
also learn about additional considerations for your telemetry data with security and business intelligence.

To get the most out of this book


The following table presents the operating system requirements for the software that will be used in
this book:

Software/hardware covered in the book    Operating system requirements
Kubernetes v1.26                         Either Windows, macOS, or Linux with dual CPU and 4 GB RAM
Docker v23                               Either Windows, macOS, or Linux with dual CPU and 4 GB RAM

If you are using the digital version of this book, we advise you to type the code yourself or access
the code from the book’s GitHub repository (a link is available in the next section). Doing so will
help you avoid any potential errors related to the copying and pasting of code.

Download the example code files


You can download the example code files for this book from GitHub at
https://github.com/PacktPublishing/Observability-with-Grafana. If there’s an update to the code, it
will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at
https://github.com/PacktPublishing/. Check them out!

Code in Action
The Code in Action videos for this book can be viewed at https://packt.link/v59Jp.

Conventions used
There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file
extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The
function used to get this information is the rate() function.”
A block of code is set as follows:

histogram_quantile(
    0.95, sum(
        rate(
            http_server_duration_milliseconds_bucket{}[$__rate_interval])
        ) by (le)
    )

Any command-line input or output is written as follows:

$ helm upgrade owg open-telemetry/opentelemetry-collector -f OTEL-Collector.yaml

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words
in menus or dialog boxes appear in bold. Here is an example: “From the dashboard’s Settings screen,
you can add or remove tags from individual dashboards.”

Tips or important notes


Appear like this.

Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at
customercare@packtpub.com and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen.
If you have found a mistake in this book, we would be grateful if you would report this to us. Please
visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would
be grateful if you would provide us with the location address or website name. Please contact us at
copyright@packt.com with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you
are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts


Once you’ve read Observability with Grafana, we’d love to hear your thoughts! Please click here to go
straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering
excellent quality content.

Download a free PDF copy of this book


Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical
books directly into your application.
The perks don’t stop there. You can also get exclusive access to discounts, newsletters, and great free
content in your inbox daily.
Follow these simple steps to get the benefits:

1. Scan the QR code or visit the link below

https://packt.link/free-ebook/9781803248004

2. Submit your proof of purchase


3. That’s it! We’ll send your free PDF and other benefits to your email directly
Part 1: Get Started with
Grafana and Observability

In this part of the book, you will get an introduction to Grafana and observability. You will learn about
the producers and consumers of telemetry data. You will explore how to instrument applications and
infrastructure. Then, you will set up a learning environment that will be enhanced throughout the
chapters to provide comprehensive examples of all parts of observability with Grafana.
This part has the following chapters:

• Chapter 1, Introducing Observability and the Grafana Stack


• Chapter 2, Instrumenting Applications and Infrastructure
• Chapter 3, Setting Up a Learning Environment with Demo Applications
1
Introducing Observability and
the Grafana Stack
The modern computer systems we work with have moved from the realm of complicated into the
realm of complex, where the number of interacting variables makes them ultimately unknowable and
uncontrollable. We are using the terms complicated and complex as per systems theory. A complicated
system, like an engine, has clear causal relationships between components. A complex system, such
as the flow of traffic in a city, shows emergent behavior from the interactions of its components.
With the average cost of downtime estimated to be $9,000 per minute by the Ponemon Institute in 2016,
this complexity can cause significant financial loss if organizations do not take steps to manage this
risk. Observability offers a way to mitigate these risks, but making systems observable comes with
its own financial risks if implemented poorly or without a clear business goal.
In this book, we will give you a good understanding of what observability is and who the customers
who might use it are. We will explore how to use the tools available from Grafana Labs to gain visibility
of your organization. These tools include Loki, Prometheus, Mimir, Tempo, Frontend Observability,
Pyroscope, and k6. You will learn how to use Service Level Indicators (SLIs) and Service Level
Objectives (SLOs) to obtain clear transparent signals of when a service is operating correctly, and
how to use the Grafana incident response tools to handle incidents. Finally, you will learn about
managing your observability platform using automation tools such as Ansible, Terraform, and Helm.
This chapter aims to introduce observability to all audiences, using examples outside of the computing
world. We’ll introduce the types of telemetry used by observability tools, which will give you an
overview of how to use them to quickly understand the state of your services. The various personas
who might use observability systems will be outlined so that you can explore complex ideas later with
a clear grounding on who will benefit from their correct implementation. Finally, we’ll investigate
Grafana’s Loki, Grafana, Tempo, Mimir (LGTM) stack, how to deploy it, and what alternatives exist.

In this chapter, we’re going to cover the following main topics:

• Observability in a nutshell
• Telemetry types and technologies
• Understanding the customers of observability
• Introducing the Grafana stack
• Alternatives to the Grafana stack
• Deploying the Grafana stack

Observability in a nutshell
The term observability is borrowed from control theory. It’s common to use the term interchangeably
with the term monitoring in IT systems, as the concepts are closely related. Monitoring is the ability
to raise an alarm when something is wrong, while observability is the ability to understand a system
and determine whether something is wrong, and why.
Control theory was formalized in the late 1800s on the topic of centrifugal governors in steam engines.
This diagram shows a simplified view of such a system:

Figure 1.1 – James Watt’s steam engine flyweight governor (source: https://www.mpoweruk.com)

Steam engines use a boiler to boil water in a pressure vessel. The steam pushes a piston backward
and forward, which converts heat energy to reciprocating motion. In steam engines that use centrifugal
governors, this reciprocating motion is converted to rotational motion via a wheel connected to the piston.
The centrifugal governor provides a physical link backward through the system to the throttle. This
means that the speed of rotation controls the throttle, which, in turn, controls the speed of rotation.
Physically, this is observed by the balls on the governor flying outward and dropping inward until
the system reaches equilibrium.
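
The feedback loop described above can be sketched as a small simulation. This is a toy model, not a description of a real engine: the function name simulate_governor, the linear dynamics, and every parameter value (gain, power, drag, dt, steps) are assumptions chosen only to show how a throttle driven by shaft speed settles toward equilibrium.

```python
def simulate_governor(target_speed, initial_speed=0.0, gain=0.05,
                      power=10.0, drag=0.05, dt=0.1, steps=500):
    """Simulate a speed governor as a proportional negative-feedback loop.

    Every parameter value here is an illustrative assumption, not a
    measurement from a real steam engine.
    """
    speed = initial_speed
    history = []
    for _ in range(steps):
        # Feedback: the further below the target the engine runs,
        # the wider the throttle opens (clamped between closed and open).
        throttle = min(1.0, max(0.0, gain * (target_speed - speed)))
        # Toy dynamics: steam through the throttle speeds the shaft up,
        # while friction and load slow it down in proportion to speed.
        speed += (power * throttle - drag * speed) * dt
        history.append(speed)
    return history


if __name__ == "__main__":
    from_rest = simulate_governor(target_speed=100.0)
    from_overspeed = simulate_governor(target_speed=100.0, initial_speed=150.0)
    print(f"final speed starting from rest:      {from_rest[-1]:.1f}")
    print(f"final speed starting from overspeed: {from_overspeed[-1]:.1f}")
```

With these assumed values, both runs settle at the same speed whether the engine starts too slow or too fast, which is the equilibrium behavior the governor provides. The settled speed sits a little below the target, as is typical of purely proportional feedback.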
Monitoring defines the metrics or events that are of interest in advance. For instance, the governor
measures the pre-defined metric of drive shaft revolutions. The controllability of the throttle is then
provided by the pivot and actuator rod assembly. Assuming the actuator rod is adjusted correctly, the
governor should control the throttle from fully open to fully closed.
In contrast, observability is achieved by allowing the internal state of the system to be inferred from
its external outputs. If the operating point adjustment is incorrectly set, the governor may spin too
fast or too slowly, rendering the throttle control ineffective. A governor spinning too fast or too slowly
could also indicate that the sliding ring is stuck in place and needs oiling. Importantly, this insight can
be gained without defining in advance what too fast or too slow means. The insight that the governor
is spinning too fast or too slowly also needs very little knowledge of the full steam engine.
Fundamentally, both monitoring and observability are used to improve the reliability and performance
of the system in question.
Now that we have introduced the high-level concepts, let’s explore a practical example outside of the
world of software services.

Case study – A ship passing through the Panama Canal


Let’s imagine a ship traversing the Agua Clara locks on the Panama Canal. This can be illustrated
using the following figure:

Figure 1.2 – The Agua Clara locks on the Panama Canal



There are a few aspects of these locks that we might want to monitor:

• The successful opening and closing of each gate
• The water level inside each lock
• How long it takes for a ship to traverse the locks

Monitoring these aspects may highlight situations that we need to be alerted about:

• A gate is stuck open because of a mechanical failure
• The water level is rapidly descending because of a leak
• A ship is taking too long to exit the locks because it is stuck

There may be situations where the data we are monitoring are within acceptable limits, but we can still
observe a deviation from what is considered normal, which should prompt further action:

• A small leak has formed near the top of the lock wall:
    ◦ We would see the water level drop but only when it is above the leak
    ◦ This could prompt maintenance work on the lock wall
• A gate in one lock is opening more slowly because it needs maintenance:
    ◦ We would see the time between opening and closing the gate increase
    ◦ This could prompt maintenance on the lock gate
• Ships take longer to traverse the locks when the wind is coming from a particular direction:
    ◦ We could compare hourly average traversal rates
    ◦ This could prompt work to reduce the impact of wind from one direction
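
The difference between the two lists above can be sketched in code. This is a minimal illustration rather than part of any Grafana tooling: the function names, the 180-minute limit, and the sample traversal times are all hypothetical. A fixed threshold covers the alert-worthy cases, while a comparison against recent history surfaces values that are within limits but still unusual.

```python
from statistics import mean, stdev

# Hypothetical limit: alert if a traversal takes longer than this many minutes.
MAX_TRAVERSAL_MINUTES = 180.0


def breaches_threshold(minutes: float, limit: float = MAX_TRAVERSAL_MINUTES) -> bool:
    """Monitoring: a limit defined in advance decides whether to alert."""
    return minutes > limit


def deviates_from_baseline(minutes: float, history: list[float], sigmas: float = 3.0) -> bool:
    """Flag a value that is unusual relative to recent history,
    even if it is still inside the alerting limit."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(minutes - baseline) > sigmas * spread


# Invented sample data: recent traversal times in minutes.
recent = [118.0, 121.0, 119.5, 122.0, 120.0, 117.5, 121.5, 120.5]
todays_traversal = 140.0

print(breaches_threshold(todays_traversal))              # False: within the limit
print(deviates_from_baseline(todays_traversal, recent))  # True: far from the usual range
```

Here the 140-minute traversal would not raise an alert, yet it stands out clearly against a baseline of about 120 minutes, which is the kind of signal that could prompt further investigation.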

Now that we’ve seen an example of measuring a real-world system, we can group these types of
measurements into different data types to best suit the application. Let’s introduce those now.

Telemetry types and technologies


The boring but important part of observability tools is telemetry – capturing data that is useful,
shipping it from place to place, and producing visualizations, alerts, and reports that offer value to
the organization.
Three main types of telemetry are used to build monitoring and observability systems – metrics,
logs, and distributed traces. Other telemetry types may be used by some vendors and in particular
circumstances. We will touch on these here, but they will be explored in more detail in Chapters 12
and 13 of this book.
The Project Gutenberg eBook of Steel
Title: Steel
a manual for steel users

Author: William Metcalf

Release date: October 2, 2023 [eBook #71782]

Language: English

Original publication: New York: John Wiley & Sons, 1896

Credits: deaurider and the Online Distributed Proofreading Team at
https://www.pgdp.net (This file was produced from images
generously made available by The Internet Archive)

*** START OF THE PROJECT GUTENBERG EBOOK STEEL ***


STEEL:
A MANUAL FOR STEEL-USERS.

BY
WILLIAM METCALF.

FIRST EDITION.
FIRST THOUSAND.

NEW YORK:
JOHN WILEY & SONS.
London: CHAPMAN & HALL, Limited.
1896.

Copyright, 1896,
BY WILLIAM METCALF.

ROBERT DRUMMOND,
ELECTROTYPER AND PRINTER,
NEW YORK.
INTRODUCTION.
Twenty-seven years of active practice in the manufacture of steel
brought the author in daily contact with questions involving the
manipulation of steel, its properties, and the results of any operations
to which it was subjected.
Blacksmiths, edge-tool makers, die-makers, machine-builders,
and engineers were continually asking questions whose answers
involved study and experiment.
During these years the Bessemer and the open-hearth processes
were developed from infancy to their present enormous stature; and
the shadows of these young giants, ever menacing to the expensive
and fragile crucible, kept one in a constant state of watching, anxiety,
and more study.
The literature of steel has grown with the art; its books are no
longer to be counted on the fingers, they are to be weighed in tons.
Then why write another?
Because there seems to be one little gap. Metallurgists and
scientists have worked and are still working; they have given to the
world much information for which the world should be thankful.
Engineers have experimented and tested, as they never did
before, and thousands of tables and results are recorded, providing
coming engineers with a mine of invaluable wealth. Steel-workers
and temperers have written much that is of great practical value.
Still the questions come, and they are almost always those
involving an intimate acquaintance with the properties of steel, which
is only to be gained by contact with both manufacturers and users. In
this little manual the effort is made to fill this gap and to give to all
steel-users a systematic, condensed statement of facts that could
not be obtained otherwise, except by travelling through miles of
literature, and possibly not then. There are no tables, and no exact
data; such would be merely a re-compilation of work already done by
abler minds.
It is a record of experiences, and so it may seem to be dogmatic;
the author believes its statements to be true—they are true as far as
his knowledge goes; others can verify them by trial.
If the statements made prove to be of value to others, then the
author will feel that he has done well to record them; if not, there is
probably nothing said that is likely to result in any harm.
CONTENTS.

CHAPTER I.
General Description of Steel, and Methods of Manufacture.
Cemented or Converted Steel. Blister, German, Shear, Double-shear.
Crucible-steel, Bessemer, Open-hearth.

CHAPTER II.
Applications and Uses of the Different Kinds of Steel.
Crucible, Open-hearth, Bessemer.

CHAPTER III.
Alloy Steels and Their Uses.
Self-hardening, Manganese, Nickel, Silicon, Aluminum.

CHAPTER IV.
Carbon.
General Properties and Uses. Modes of Introducing It in Steel. Carbon
Tempers, How Determined. The Carbon-line. Effects of Carbon, in Low
Steel, in High Steel. Importance of Attention to Composition.

CHAPTER V.
General Properties of Steel.
Four Conditions: Solid, Plastic, Granular, Liquid. Effects of Heat.
Size of Grain. Recalescence, Magnetism. Effects of Cooling, Hardening,
Softening, Checking. Effects of Forging or Rolling, Hot or Cold.
Condensing, Hammer-refining, Bursting. Ranges of Tenacity, etc.
Natural Bar, Annealed Bar, Hardened Bar, etc.

CHAPTER VI.
Heating.
For Forging; Hardening; Overheating; Burning; Restoring; Welding.

CHAPTER VII.
Annealing.

CHAPTER VIII.
Hardening and Tempering.
Size of Grain; Refining at Recalescence; Specific-gravity Tests;
Temper Colors; How to Break Work; a Word for the Workman.

CHAPTER IX.
Effects of Grinding.
Glaze, Skin, Decarbonized Skin, Cracked Surfaces, Pickling.

CHAPTER X.
Impurities and Their Effects.
Cold-short. Red-short, Hot-short, Irregularities, Segregation, Oxides,
etc., Wild Heats, Porosity. Removing Last Fractions of Hurtful
Elements. Andrews Broken Rail and Propeller-shaft.

CHAPTER XI.
Theories of Hardening.
Combined, Graphitic, Dissolved, Cement, Hardening and Non-hardening
Carbon. Carbides. Allotropic Forms of Iron α, β, etc. Iron as an
Igneous Rock or as a Liquid.

CHAPTER XII.
Inspection.
Ingots, Bars, Finished Work. Tempers and Soundness of Ingots.
Seams, Pipes, Laps, Burns, Stars.

CHAPTER XIII.
Specifications.
Physical, Chemical, and of Soundness and Freedom from Scratches,
Sharp Re-entrant Angles, etc.

CHAPTER XIV.
Humbugs.

CHAPTER XV.
Conclusions.

GLOSSARY.
Definitions of Shop Terms Used.

STEEL:
A MANUAL FOR STEEL USERS.
I.
GENERAL DESCRIPTION OF STEEL AND
OF MODES OF ITS MANUFACTURE.

Steel may be grouped under four general heads, each receiving
its name from the mode of its manufacture; the general properties of
the different kinds are the same, modified to some extent by the
differences in the operations of making them; these differences are
so slight, however, that after having mentioned them the discussion
of various qualities and properties in the following pages will be
general, and the facts given will apply to all kinds of steel, exceptions
being pointed out when they occur.
The first general division of steel is cemented or converted steel,
known to the trade as blister-steel, German, shear, and double-shear
steel.
This is probably the oldest of all known kinds of steel, as there is
no record of the beginning of its manufacture. This steel is based
upon the fact that when iron not saturated with carbon is packed in
carbon, with all air excluded, and subjected to a high temperature,—
any temperature above a low red heat,—carbon will be absorbed by
the iron converting it into steel, the steel being harder or milder,
containing more or less carbon, determined by the temperature and
the time of contact.
Experience and careful experiment have shown that at a bright
orange heat carbon will penetrate iron at the rate of about one eighth
of an inch in twenty-four hours. This applies to complete saturation,
above 100 carbon; liquid steel will absorb carbon with great rapidity,
becoming saturated in a few minutes, if enough carbon be added to
cause saturation.
MANUFACTURE OF BLISTER-STEEL.
Bars of wrought iron are packed in layers, each bar surrounded
by charcoal, and the whole hermetically sealed in a fire-brick vessel
luted on top with clay; heat is then applied until the whole is brought
up to a bright orange color, and this heat is maintained as evenly as
possible until the whole mass of iron is penetrated by carbon; usually
bars about three quarters of an inch thick are used, and the heat is
required to be maintained for three days, the carbon, entering from
both sides, requiring three days to travel three eighths of an inch to
the centre of the bar. If the furnace be running hot, the conversion
may be complete in two days, or less. The furnace is then cooled
and the bars are removed; they are found to be covered with
numerous blisters, giving the steel its name.
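
The three-day figure follows from simple arithmetic, which may be sketched as below. This is an illustration only: it assumes that the rate of about one eighth of an inch in twenty-four hours holds constant through the whole bar, which is given above only as an approximation, and the function name is introduced here for convenience.

```python
from fractions import Fraction

# Rate quoted in the text: about one eighth of an inch in twenty-four hours
# at a bright orange heat (treated here as constant, which is an assumption).
PENETRATION_INCHES_PER_DAY = Fraction(1, 8)


def days_to_convert(bar_thickness_inches: Fraction) -> Fraction:
    """Days for carbon to reach the centre of a bar when it enters from both sides."""
    depth_to_centre = bar_thickness_inches / 2
    return depth_to_centre / PENETRATION_INCHES_PER_DAY


print(days_to_convert(Fraction(3, 4)))  # 3
```

A bar three quarters of an inch thick has three eighths of an inch to its centre, hence three days at this rate; a hotter furnace raises the rate and shortens the time accordingly.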
The bars of tough wrought iron are found to be converted into
highly crystalline, brittle steel. When blister-steel is heated and rolled
directly into finished bars, it is known commercially as German
steel.
When blister-steel is heated to a high heat, welded under a
hammer, and then finished under a hammer either at the same heat
or after a slight re-heating, it is known as shear-steel, or single-
shear.
When single-shear steel is broken into shorter lengths, piled,
heated to a welding-heat and hammered, and then hammered to a
finish either at that heat or after a slight re-heating, it is known as
double-shear steel.
Seebohm gives another definition of single-shear, and double-
shear; probably both are correct, being different shop designations.
Until within the last century the above steels were the only kinds
known in commerce. There was a little steel made in India by a
melting process, known as Wootz. It amounted to nothing in the
commerce of the world, and is mentioned because it is the oldest of
known melting processes.
Although converted steel is so old, and so few years ago was the
only available kind of steel in the world, nothing more need be said
of it here, as it has been almost superseded by cast steel, superior in
quality and cheaper in cost, except in crucible-steel.
Inquiring readers will find in Percy, and many other works, such
full and detailed accounts of the manufacture of these steels that it
would be a waste of space and time to reprint them here, as they are
of no more commercial importance.
In the last century Benjamin Huntsman, of England, a maker of
clocks, found great difficulty in getting reliable, durable, and uniform
springs to run his clocks. It occurred to him that he might produce a
better and more uniform article by fusing blister-steel in a crucible.
He tried the experiment, and after the usual troubles of a pioneer he
succeeded, and produced the article he required. This founded and
established the great Crucible-cast-steel industry, whose benefits to
the arts are almost incalculable; and none of the great inventions of
the latter half of this nineteenth century have produced anything
equal in quality to the finer grades of crucible-steel.
Crucible-cast steel is the second of the four general kinds of
steel mentioned in the beginning of this chapter.
Although Huntsman succeeded so well that he is clearly entitled
to the credit of having invented the crucible process, he met with
many difficulties, from porosity of his ingots mainly; this trouble was
corrected largely by Heath by the use of black oxide of manganese.
Heath attempted to keep his process secret, but it was stolen from
him, and he spent the rest of a troubled life in trying to get some
compensation from the pilferers of his process. An interesting and
pathetic account of his troubles will be found in Percy.
Heath’s invention was not complete, and it was finished by the
elder Mushet, who introduced in addition to the oxide of manganese
a small quantity of ferro-manganese, an alloy of iron and
manganese; and it was now possible, with care and skill, to make a
quality of steel which for uniformity, strength, and general utility has
never been equalled.
Crucible-steel was produced then by charging into a crucible
broken blister-steel, a small quantity of oxide of manganese, and of
ferro-manganese, or Spiegel-eisen, covering the crucible with a cap,
and melting the contents in a coke-furnace, a simple furnace where
the crucible was placed on a stand of refractory material, surrounded
by coke, and fired until melted thoroughly.
The first crucibles used, and those still used largely in Sheffield,
were made of fire-clay; a better, larger, and more durable crucible,
used in the United States exclusively, and in Europe to some extent,
is made of plumbago, cemented by enough of fire-clay to make it
strong and tough. As the demands for steel increased and varied it
was found that the carbon could be varied by mixing wrought iron
and blister-bar, and so a great variety of tempers was produced,
from steel containing not more than 0.10% of carbon up to steel
containing 1.50% to 2% of carbon, and even higher in special cases.
It was soon found that the amount of carbon in steel could be
determined by examining the fractures of cold ingots; the fracture
due to a certain quantity of carbon is so distinct and so unchanging
for that quantity that, once known, it cannot be mistaken for any
other. The ingot is so sensitive to the quantity of carbon present that
differences of .05% may be observed, and in everyday practice the
skilled inspector will select fifteen different tempers of ingots in steels
ranging from about 50 carbon to 150 carbon, the mean difference in
carbon from one temper to another being only .07%. And this is no
guess-work;—no chemical color determination will approach it in
accuracy, and such work can only be checked by careful analysis by
combustion.
This is the steel-maker’s greatest stronghold, as it is possible by
this means for a careful, skilful man to furnish to a consumer, year
after year, hundreds or thousands of tons of steel, not one piece of
which shall vary in carbon more than .05% above or below the mean
for that temper.
The word “temper” used here refers to the quantity of carbon
contained in the steel; it is the steel-maker’s word. The question,
“What temper is it?” answered, No. 3, No. 6, or any other
designation, means a fixed, definite quantity of carbon.
When a steel-user hardens a piece of steel, and then lets down
the temper by gentle heating, and he is asked, “What temper is it?”
he will answer straw, light brown, brown, pigeon-wing, light blue, or
blue, as the case may be, and he means a fixed, definite degree of
softening of the hardened steel.
It is an unfortunate multiple meaning of a very common word, yet
the uses have become so fixed that it seems to be impossible to
change them, although they sometimes cause serious confusion.
The quantity of carbon contained in steel, and indeed of all
ingredients, as a rule, is designated in one hundredths of one per
cent; thus ten (.10) carbon means ten one hundredths of one per
cent; nineteen (.19) carbon means nineteen one hundredths of one
per cent; one hundred and thirty-five (1.35) carbon means one
hundred and thirty-five one hundredths of one per cent, and so on.
So also for contents of silicon, sulphur, phosphorus, manganese and
other usual ingredients.
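
The enumeration may be made concrete by a short sketch. The function names are introduced here for illustration only and belong to no established practice; the last two lines merely check the earlier statement that fifteen tempers spread from 50 carbon to 150 carbon differ by a mean of about .07%.

```python
def points_to_percent(points: int) -> float:
    """Convert the shop enumeration (hundredths of one per cent) to per cent."""
    return points / 100


def percent_to_points(percent: float) -> int:
    """Convert per cent back to the shop enumeration, to the nearest point."""
    return round(percent * 100)


for points in (10, 19, 135):
    print(f"{points} carbon = {points_to_percent(points):.2f}%")

# Fifteen tempers between 50 and 150 carbon leave fourteen equal steps.
mean_step_points = (150 - 50) / (15 - 1)
print(f"mean step = {mean_step_points:.2f} points")  # about 7 points, or .07%
```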
This enumeration will be used in this work, and care will be taken
to use the word “temper” in such a way as not to cause confusion.
It has been stated that crucible-cast steel is made from ten
carbon up to two hundred carbon, and that its content of carbon can
be determined by the eye, from fifty carbon upwards, by examining
the fracture of the ingots. The limitation from fifty carbon upwards is
not intended to mean that ingots containing less than fifty carbon
have no distinctive structures due to the quantity of carbon; they
have such distinctive structures, and the difficulty in observing them
is merely physical.
Ingots containing fifty carbon are so tough that they can only be
fractured by being nicked with a set deeply, and then broken off;
below about fifty carbon the ingots are so tough that it is almost
impossible to break open a large enough fracture to enable the
inspector to determine accurately the quantity of carbon present;
therefore it is usual in these milder steels, when accuracy is
required, to resort to quick color analyses to determine the quantity
of carbon present. Color analyses below fifty carbon may be fairly
accurate, above fifty carbon they are worthless.
As the properties and reliability of crucible-steel became better
known the demand increased, and the requirements varied and were
met by skilful manufacturers, until, by the year 1860, ingots were
produced weighing many tons by pouring the contents of many
crucibles into one mould; in this way the more urgent demands were
met, but the material was very expensive and the risks in
manufacturing were great. About this time, stimulated by the desire
of enlightened governments to increase their powers of destruction
in war by the use of heavy guns of greater power than could be
obtained by the use of cast iron, and for heavier ship-armor to be
used in defence, Mr. Bessemer, of England, now Sir Henry
Bessemer, reasoned that if melted cast iron was reduced to wrought
iron by puddling, or boiling, by the mere oxidation, or burning out, of
the excess of carbon and silicon from the cast iron, the same
cast iron might be reduced to steel in large masses by blowing air
through a molten mass in a close vessel, retaining enough heat to
keep the mass molten so that the resulting steel could be poured into
ingots as large as might be desired. At about the same time, or a
little earlier, Mr. Kelly, of the United States, devised and patented the
same method. Both of these gentlemen demonstrated the potencies
of their invention, and neither brought it to a successful issue.
To persistent and intelligent iron-masters of Sweden must be
given the credit of bringing the process of Bessemer to a commercial
success, and so they gave to the world pneumatic or Bessemer
steel, the latter name holding, properly, as a just tribute to the
inventor, and this inaugurated the third general division:

BESSEMER STEEL.
Bessemer steel is made by pouring into a bottle-shaped vessel
lined with refractory material a mass of molten cast iron, and then
blowing air through the iron until the carbon and silicon are burned
out. The gases and flame resulting escape from the mouth of the
vessel.
The combustion of carbon and silicon produces a temperature
sufficient to keep the mass thoroughly melted, so that the steel may
be poured into moulds making ingots of any desired size.
In the beginning, and for many years, the lining of the vessel was
of silicious or acid material, and it was found that all of the
phosphorus and sulphur contained in the cast iron remained in the
resulting steel, so that it was necessary to have no more of these
elements in the cast iron than was allowable in the steel. The higher
limit for phosphorus was fixed at ten points (.10%), and that is the
recognized limit the world over. When Bessemer pig is quoted, or
sold and bought, it means always a cast iron containing not more
than ten phosphorus.
In regard to sulphur, it was found that if too much were present
the material would be red-short, so that it could not be worked
conveniently in the rolls or under the hammer, and that when the
amount of sulphur present was not enough to produce red-shortness
it was not sufficient to hurt the steel.
As red-short material is costly and troublesome to the
manufacturer, it was not found necessary to fix any limit for sulphur,
because the makers could be depended upon to keep it within
working limits.
Later investigations prove this to be a fallacy: as much as ten or
even more sulphur has been found in broken rails and shafts, the
steel having been made workable by a percentage of manganese. (See
the results of Andrews’s investigation given in Chap. X.)
During the operation of blowing Bessemer steel the flame issuing
from the vessel is indicative of the elimination of the elements, and it
is found that while the combustion is partially simultaneous the
silicon is all removed before the carbon, and the characteristic white
flame towards the end of the blow is known as the carbon flame;
when the carbon is burned out, this flame drops suddenly and the
operator knows that the blow is completed. Any subsequent blowing
would result in burning iron only. During the blow the steel is charged
heavily with oxygen, and if this were left in the steel it would be
rotten, red-short, and worthless. This oxygen is removed largely by
the addition of a predetermined quantity of ferro-manganese, usually
melted previously and then poured into the steel.
The manganese takes up the greater part of the oxygen, leaving
the steel free from red-shortness and easily worked.
The fact that the phosphorus of the iron remained in the steel
notwithstanding the active combustion and high temperature led to
the dictum that at high temperatures phosphorus could not be
eliminated from iron. This conclusion was credited because in some
of the so-called direct processes of making iron where the
temperature was never high enough to melt steel all, or nearly all, of
the phosphorus was removed from the iron.
For many years steel-makers the world over worked upon this
basis, and devoted themselves to procuring for their work iron
containing not more than ten (.10) phosphorus, now universally
known and quoted as Bessemer iron.
Two young English chemists, Sidney Gilchrist Thomas and Percy
C. Gilchrist, being careful thinkers, concluded that the question was
one of chemistry and not one of temperature; accordingly they set to
work to obtain a basic lining for the vessel and to produce a basic
slag from the blow which should retain in it the phosphorus of the
iron. After the usual routine of experiment, and against the doubtings
of the experienced, they succeeded, and produced a steel practically
free from phosphorus. For the practical working of their process it
was found better, or necessary, to use iron low in silicon and high in
phosphorus, using the phosphorus as a fuel to produce the high
temperature that is necessary instead of the silicon of the acid
process. In the acid process it is found necessary to have high
silicon—two percent or more—to produce the temperature
necessary to keep the steel liquid; in the Thomas-Gilchrist process
phosphorus takes the place of silicon for this purpose.
In this way the basic Bessemer process was worked out and
became prominent.
