Download as pdf or txt
Download as pdf or txt
You are on page 1of 68

Architecting Distributed Transactional

Applications: Data-intensive Distributed


Transactional Applications 1st Edition
Guy Harrison
Visit to download the full and correct content document:
https://ebookmeta.com/product/architecting-distributed-transactional-applications-data
-intensive-distributed-transactional-applications-1st-edition-guy-harrison/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

CockroachDB The Definitive Guide: Distributed Data at


Scale Guy Harrison

https://ebookmeta.com/product/cockroachdb-the-definitive-guide-
distributed-data-at-scale-guy-harrison/

Architecting Data Intensive Applications 1st Edition


Anuj Kumar

https://ebookmeta.com/product/architecting-data-intensive-
applications-1st-edition-anuj-kumar/

Introducing Blockchain Applications: Understand and


Develop Blockchain Applications Through Distributed
Systems George

https://ebookmeta.com/product/introducing-blockchain-
applications-understand-and-develop-blockchain-applications-
through-distributed-systems-george/

Distributed Systems Theory and Applications 1st Edition


Ratan K. Ghosh

https://ebookmeta.com/product/distributed-systems-theory-and-
applications-1st-edition-ratan-k-ghosh/
Distributed Sensor Networks Second Edition Sensor
Networking and Applications S. Sitharama Iyengar

https://ebookmeta.com/product/distributed-sensor-networks-second-
edition-sensor-networking-and-applications-s-sitharama-iyengar/

Architecting a Modern Data Warehouse for Large


Enterprises: Build Multi-cloud Modern Distributed Data
Warehouses with Azure and AWS 1st Edition Anjani Kumar

https://ebookmeta.com/product/architecting-a-modern-data-
warehouse-for-large-enterprises-build-multi-cloud-modern-
distributed-data-warehouses-with-azure-and-aws-1st-edition-
anjani-kumar/

Architecting a Modern Data Warehouse for Large


Enterprises: Build Multi-cloud Modern Distributed Data
Warehouses with Azure and AWS 1st Edition Anjani Kumar

https://ebookmeta.com/product/architecting-a-modern-data-
warehouse-for-large-enterprises-build-multi-cloud-modern-
distributed-data-warehouses-with-azure-and-aws-1st-edition-
anjani-kumar-2/

Real Time Systems Design Principles for Distributed


Embedded Applications 3rd Edition Hermann Kopetz

https://ebookmeta.com/product/real-time-systems-design-
principles-for-distributed-embedded-applications-3rd-edition-
hermann-kopetz/

Real Time Systems Design Principles for Distributed


Embedded Applications 3rd Edition Hermann Kopetz

https://ebookmeta.com/product/real-time-systems-design-
principles-for-distributed-embedded-applications-3rd-edition-
hermann-kopetz-2/
Co
m
pl
im
en
ts
of
Architecting
Distributed
Transactional
Applications
Data-Intensive Distributed
Transactional Applications

Guy Harrison,
Andrew Marshall
& Charles Custer

REPORT
Architecting Distributed
Transactional Applications
Data-Intensive Distributed
Transactional Applications

Guy Harrison, Andrew Marshall,


and Charles Custer

Beijing Boston Farnham Sebastopol Tokyo


Architecting Distributed Transactional Applications
by Guy Harrison, Andrew Marshall, and Charles Custer
Copyright © 2023 O’Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA
95472.
O’Reilly books may be purchased for educational, business, or sales promotional
use. Online editions are also available for most titles (http://oreilly.com). For more
information, contact our corporate/institutional sales department: 800-998-9938 or
corporate@oreilly.com.

Acquisitions Editor: Aaron Black Interior Designer: David Futato


Development Editor: Michele Cronin Cover Designer: Randy Comer
Production Editor: Gregory Hyman Illustrator: Kate Dullea
Copyeditor: nSight, Inc.

January 2023: First Edition

Revision History for the First Edition


2023-01-24: First Release

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Architecting


Distributed Transactional Applications, the cover image, and related trade dress are
trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the authors and do not represent the
publisher’s views. While the publisher and the authors have used good faith efforts
to ensure that the information and instructions contained in this work are accurate,
the publisher and the authors disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this
work is at your own risk. If any code samples or other technology this work contains
or describes is subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
This work is part of a collaboration between O’Reilly and Cockroach Labs. See our
statement of editorial independence.

978-1-098-14261-2
[LSI]
Table of Contents

1. Planning for a Distributed Transactional Application. . . . . . . . . . . . . . 1


Why Distributed Transactional Applications? 2
The Business Drivers for Distributed Systems 3
The Return of Transactional Consistency 4
The Increasingly Attractive Economics
of Distributed Computing 4
Understanding Your Requirements 5
A Modern Distributed and Transactional
Architectural Pattern 6
Summary 8

2. Distributing the Application Layer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9


Regions and Zones 9
Old-School Load Balancing 10
Microservices 11
Containers 12
Kubernetes, Pods, and Services 13
Multiregion Kubernetes 14
Event Management 15
Serverless Deployments 17
Multiregion Serverless 19
Summary 19

3. Distributing and Scaling the Storage Layer. . . . . . . . . . . . . . . . . . . . . 21


Transactional Versus Nontransactional
Distributed Databases 22
Hosting Strategies for Distributed Databases 23

iii
Serverless or Dedicated Deployment? 26
Kubernetes 28
Placement Policies 29
Multiregion Database Deployments 30
Distributed Database Consensus 30
Survival Goals 32
Locality Rules 34
Summary 35

iv | Table of Contents
CHAPTER 1
Planning for a Distributed
Transactional Application

This report is an architectural guide for software professionals


who are considering implementing a distributed transactional
application.
A distributed application is an application implemented on multiple
networked computers. A transactional application is an application
that can correctly process simultaneous updates from multiple users.
As we’ll see in subsequent sections, distributed applications—
and increasingly transactional distributed applications—are rapidly
becoming the new normal for modern software products. This is
because the combination of transactional and distributed technolo‐
gies allows for greater application resilience, scalability, and correct‐
ness—mandatory attributes for modern applications.
However, distributed applications pose unique challenges, and
implementing transactional behavior in a distributed context is
particularly tricky. The advantages of the distributed transactional
architecture are undeniable—but so are the problems and the risks!
This report is for software developers, architects, and operational
staff who want to understand the benefits and challenges of dis‐
tributed transactional software architecture. We try not to assume
any specific technology background, but some familiarity with data‐
bases, software development frameworks, and cloud services would
be advantageous.

1
After reading this report, we hope you’ll have a good handle on
the business and technology motivations for modern distributed
architectures and will be familiar with the architectural patterns
and software frameworks most widely deployed across the indus‐
try. In particular, you should be well equipped to understand the
role that technologies and patterns such as Docker, Kubernetes,
and distributed transactional databases play in modern distributed
architectures.

Why Distributed Transactional Applications?


As Marc Andreessen famously wrote over a decade ago, “software
is eating the world”. In today’s world, it’s distributed software
that has the greatest appetite. Increasingly, it’s distributed soft‐
ware that is powering the applications that run our society and
mediate our digital lives.
The events of the last few years have certainly emphasized the
importance of distributed software.
For instance, during the pandemic, “touchless” payments skyrock‐
eted, and cash-only businesses became increasingly rare. Payment
systems that can accept global credit card transactions are inherently
distributed, with points of presence across the globe. As a result,
almost all businesses depend on distributed payment solutions. If
these solutions fail, then business cannot proceed.
The pandemic also accelerated the adoption of online shopping—
customers are more likely than ever to purchase from an online
retailer. When you are selling online, your customers can be any‐
where, so your storefront must be available globally. You can only
offer good service in all locations by distributing your application
across the globe.
Finally, the pandemic emphasized the need to be able to react to
sudden changes in demand. Many businesses faced sudden increases
in demand as their online customer base surged during lockdowns.
Distributed applications can scale up or down quickly by adding or
removing services or instances. Those without a distributed solution
often could not react promptly.
Enterprises that have attempted to maintain monolithic solutions
have often not fared well. In one case we are aware of, a large com‐
pany had its entire system running on a single mainframe. A large

2 | Chapter 1: Planning for a Distributed Transactional Application


snowstorm hit the state where the mainframe was located, causing
a power outage. Diesel generators backed up the site, but the storm
also led to road closures, making it impossible to get fuel deliveries.
Faced with the possibility of going entirely offline if they couldn’t
get additional fuel, they decided to instead fail over to a backup
data center, resulting in a major, customer-impacting interruption of
service. Subsequently, they decided that a more robust distributed
transactional solution was required.

The Business Drivers for Distributed Systems


Distributed systems are increasingly prevalent in our modern soft‐
ware ecosystem due to the significant advantages that they provide:
Reliability
Distributed systems are inherently more reliable than mono‐
lithic systems since they have no single point of failure.
Scalability
Monolithic systems have difficulty coping with increases in
resource utilization. In contrast, well-designed distributed sys‐
tems can expand by adding new nodes or service instances.
Elasticity
Elasticity implies that the application can also scale down
by releasing resources when workload demands reduce. A
well-designed distributed application can release computing
resources by shutting down unneeded nodes or services.
Performance
A distributed application can also provide unique performance
benefits when compared to a monolithic application:
• A distributed application may be able to provide increased
throughput or concurrency by parallelizing tasks across
multiple nodes or services.
• A distributed application may be able to offer reduced
latency by processing requests from specific regions with
services located in the same region. A monolithic appli‐
cation, in contrast, will, by necessity, offer low-latency
requests only in the region in which it is physically located.

The Business Drivers for Distributed Systems | 3


The Return of Transactional Consistency
During the early years of the “modern” web 2.0 internet, applica‐
tion architects were forced to choose between availability, global
scalability, and consistency—and it was widely believed that you
had to choose two out of three of these obviously desirable traits.
Consequently, strong consistency was abandoned in favor of “even‐
tual consistency,” and the definition of “eventual” was stretched to
include “maybe never.”
Eventual consistency caused headaches for application developers,
many of which had no satisfactory solution. Luckily, distributed
software technology has largely moved past the need for eventual
consistency. As we’ll see, modern application frameworks allow
for strong consistency together with very high availability and
scalability.

The Increasingly Attractive Economics


of Distributed Computing
The complexity of distributed systems was an almost overwhelming
obstacle in the past. Until recently, the human resources needed to
maintain multiple software components in multiple locations and
to manage the performance and reliability of these multiple compo‐
nents were beyond all but the largest organizations.
The economics of distributed systems has been completely revolu‐
tionized by technological advances over the past 10–15 years:

• The advent of public cloud computing allows enterprises of any


size to deploy on infrastructures made available by the world’s
largest software providers.
• Containerization technologies, most notably Docker, together
with microservice design patterns, can be used to create easily
deployable and replicable units of application functionality.
• Container orchestration solutions, most notably Kubernetes,
allow the containers that combine to form an application to be
deployed and scaled in a relatively simple and reliable fashion.

The upshot of these advances has reduced the total cost of owner‐
ship for a distributed application as well as reduced complexity in
application design, implementation, and maintenance.

4 | Chapter 1: Planning for a Distributed Transactional Application


Understanding Your Requirements
It’s not possible to select an ideal architecture without firmly defined
business requirements. Here are some of the considerations you
should clarify before finalizing your architecture:
Total cost of ownership
The total cost of ownership of a deployment includes the cap‐
ital costs of hardware (for on-premises deployments) or hard‐
ware rental (for cloud deployments) together with the software
licensing costs and staffing costs for administrators. A fully
managed cloud deployment minimizes staffing costs and encap‐
sulates all other costs into a single subscription. An on-premises
deployment might have higher staffing costs and higher initial
hardware costs but lower software subscription costs, especially
if open source software is used. It’s particularly important to
factor in the higher staffing costs involved in an on-premises
solution as well as the cost savings that can be achieved in the
cloud from elastic scaling.
High availability requirements
Almost all modern systems aspire to continual availability with
minimal downtime. Modern software frameworks can provide
very high availability with very attractive economics. Never‐
theless, there are always cost–availability trade-offs to be con‐
sidered. For instance, a three-node software topology might
survive a single-node failure without issue but be unable to
survive multiple concurrent node failures. As the degree of
redundancy increases, the ability to survive failures increases,
but so do the total operational costs.
Throughput and latency requirements
Most applications have performance requirements that combine
both throughput (transactions per second) and latency (average
or percentile time to complete a transaction). The two require‐
ments are correlated, but one can often be increased at the
expense of the other. It’s important, therefore, to be clear on
what is expected of the application in both contexts.
Geographical considerations
A distributed application may have to be configured to opti‐
mize performance for multiple regions. In addition, a global

Understanding Your Requirements | 5


application might be subject to privacy and data domiciling
regulations that differ across regions.

A Modern Distributed and Transactional


Architectural Pattern
It’s our belief that there is an architectural sweet spot that provides
a particularly compelling combination of economics, elasticity, and
high availability. The essential components of this architecture are as
follows:

• The use of public cloud platforms as the primary substrate for


all application compute elements
• Using a microservices pattern for the top-level application
architecture
• Using Docker containers as packaging for the microservices
• Where possible, using Kubernetes to orchestrate the deploy‐
ment and maintenance of these containers
• Optionally deploying a message brokering layer like Kafka to
communicate between services
• Using a fully managed, cloud-based, distributed, transactionally
consistent database platform for the persistence layer

This architecture is not a one-size-fits-all solution for the entire


universe of modern applications. However, we think it encapsulates
the needs of a larger subset of all applications than any other single
pattern. It delivers scalability, availability, consistency, and elasticity
in an economical and maintainable fashion.
Let’s look briefly at each of the aforementioned elements of modern
distributed architecture:
Public cloud platforms
Highly available distributed applications require servers run‐
ning in multiple locales, each ideally with redundant network
communications and no single source of failure in any location
or between any two locations. Very large organizations may
be in possession of such an infrastructure, but for most of us,
only the large public clouds—such as those offered by Google,
Microsoft, and Amazon—can offer such a global infrastructure.

6 | Chapter 1: Planning for a Distributed Transactional Application


These public cloud providers also have fully managed versions
of most of the components needed for a modern application.
Microservices
In a microservices architecture, application functionality is
delivered through the interactions between multiple independ‐
ent software units. This contrasts with the monolithic applica‐
tion, in which all software functionality is delivered from a
single large software unit.
Microservices provide software-engineering efficiencies as well
as improving application scalability and resilience.
Docker containers
Containers are lightweight, virtualized environments that are
perfect for the deployment of microservices. The container iso‐
lates the microservice from any variation in operating system
type or configuration and is portable across any platform that
supports the container platform.
Container orchestration and Kubernetes
Container orchestration frameworks—such as Kubernetes—
arrange for the deployment and management of the containers
that comprise an application. Kubernetes creates and maintains
the containers that compose the application and manages the
redundancy, scalability, and resilience of these containers.
Distributed database services
The use of Dockerized microservices coordinated by Kuber‐
netes works well for the applications’ logic. However, the persis‐
tence layer—the database—has a different set of requirements.
In any nontrivial application, there is data that is scoped across
multiple microservices, and that must persist beyond the life‐
time of those services. It’s rarely possible to run a separate
database for each service—there must be a common data store
shared across the entire application.
However, a single monolithic database is undesirable both
from a performance and availability point of view. Even if a
single database could scale to meet all the needs of the entire
application, in a global deployment, some of the application’s
users would experience poor performance because of living on
a different continent from the database. From an availability

A Modern Distributed and Transactional Architectural Pattern | 7


perspective, the monolithic database represents a single point of
failure.
Therefore, we need a database that can scale as required by the
application and that has redundancy in its architecture to sur‐
vive failures in individual nodes. This database needs to meet
the requirements of the application for low-latency regional
performance as well as global consistency.
Such databases did not exist until relatively recently, but they
are available today. Some of the candidate databases—such
as CockroachDB and YugabyteDB—are available both as on-
premises deployments and as fully managed cloud systems.
Others—such as Google Spanner or Microsoft Azure Cosmos
DB—are available only on a specific cloud platform.

Summary
Modern enterprises require highly available, globally scoped, and
scalable software solutions. These requirements are best met by
distributed transactional application architectures.
Today, there exists a well-proven cloud-based architectural pattern
for distributed transactional applications. This pattern involves the
use of public cloud platforms, microservices, Docker containers,
Kubernetes, and a distributed transactional database.
In Chapter 2, we’ll take a deep dive into the architecture of the
application layer, and in Chapter 3 we will examine the distributed
database layer.

8 | Chapter 1: Planning for a Distributed Transactional Application


CHAPTER 2
Distributing the Application Layer

Almost all modern distributed applications can be divided into two


major layers:
Application (or compute) layer
Primarily responsible for application logic and end-user
experience
Persistence (or database) layer
Responsible for maintaining long-term application data and
providing a consistent view of the application state to the com‐
pute layer
These two layers have divergent properties that usually result in
different deployment architectures. In this chapter, we’ll discuss the
architecture for the application layer, and in Chapter 3, we’ll look at
the database layer.

Regions and Zones


The major public clouds all provide computing resources organized
around regions and zones. A region is a broad geographical region
that defines the physical location where you can deploy applications.
Regions are often given vague names (for instance, europe-west1).
However, in practice, a region will generally be located within a
specific country and almost always a specific city. For instance, the
Google Cloud region asia-northeast1 is in Tokyo, while the region
asia-northeast2 is in Osaka.

9
Each region will contain multiple zones—typically at least three
zones per region. These zones usually represent a specific data cen‐
ter within the region that has no common point of failure with other
zones. So, the zones represent regional redundancy that allows a
region to continue to function even if an individual data center fails.
Figure 2-1 illustrates three regions in a public cloud, each of which
has three zones. In a typical public cloud, there are many regions—
for instance, Google currently supports 35 regions.

Figure 2-1. Regions and zones in a typical public cloud

In the remainder of this report, we are going to assume that you are
deploying into a public cloud that implements such a region/zone
scheme. However, even if you are deploying on your own hardware,
you are likely to emulate this sort of regional separation and inter‐
regional redundancy.

Old-School Load Balancing


No one architectural pattern suits all applications. Container tech‐
nologies such as Docker and container orchestration technologies
such as Kubernetes have a significant learning curve, and if no
trained practitioners are available, then using a more traditional
architecture might involve less risk and delay.
In these scenarios, we might implement a fully distributed archi‐
tecture using a more traditional approach involving cloud-based
virtual machines (VMs) combined with load balancers and other
cloud facilities. As an example, we create a subnet for each region
and configure identically configured VMs in each region. Backend

10 | Chapter 2: Distributing the Application Layer


services are defined based on protocol and ports made available by
software running on the VMs. A global load balancer is defined that
routes traffic to appropriate services within regions, typically based
on geographical proximity.
Figure 2-2 illustrates the scheme.

Figure 2-2. Deploying an application using load balancers and virtual


machines

Although Kubernetes automates most of these mechanisms, we


will still sometimes use a global load balancer in multiregional
configurations.

Microservices
In a microservices architecture, units of application functionality are
developed and deployed independently. These microservices inter‐
act to deliver the overall application functionality. Microservices
development is based on some fundamental principles:

Microservices | 11
A microservice should have a single concern.
A microservice should aspire to satisfy just one item of func‐
tionality. The service should be an “expert” at that one function
and should not be concerned with any other functions.
The microservice should be independent of all other services.
The microservice should not be directly dependent on some
other service and should be testable and deployable independ‐
ently of all other services.
The microservice should be small enough to be developed by
a single team.
The “two-pizza rule” developed at Amazon stipulated that each
team should be small enough to be fed by two pizzas. The core
principle is that a microservice should be small enough that
interteam dependencies do not arise.
The microservice should be ephemeral.
A microservice should not hold state in such a way that pre‐
vents another microservice from taking over should it fail. In
practice, this simply means that each microservice interaction is
completely independent of past or future interactions.
The microservice implementation should be opaque.
The internal implementation of the microservice should be of
no concern to other services.

Containers
Microservices are often deployed in containers. Containers package
all the software dependencies necessary to execute a microservice
independently. This includes a minimal operating system image,
together with any libraries and other dependencies necessary to
support the microservice’s code.
By far, the most common mechanism for creating and deploying
containers is Docker. Docker containers provide significant advan‐
tages for the deployment of microservices. By encapsulating all
dependencies, developers can be confident that their service will
run on any Docker platform. Docker containers are less resource-
intensive than full-blown virtualization, and Docker containers iso‐
late the internals of the microservice and thereby improve security.

12 | Chapter 2: Distributing the Application Layer


Kubernetes, Pods, and Services
Kubernetes controls, or orchestrates, the containers that comprise a
distributed application.
A Kubernetes cluster consists of a number of nodes, each of which
hosts one or more containers.
Kubernetes provides a high-level abstraction called a Pod. A Pod
consists of one or more containers with shared storage, networking,
and namespaces. Although it might be possible to implement a
microservice entirely in a single container, modularity is improved
by allowing for containers that implement shared services across
multiple Pods.
Pods also often include sidecar containers. The sidecar typically pro‐
vides common services across a range of Pods, such as monitoring,
alerting, logging, and network abstraction.
Kubernetes services are composed of a logical set of Pods. Kuber‐
netes can load balance between the Pods, can restart failed Pods, and
can provide the service with a dedicated DNS access point. In this
way, Pods may fail, but the service survives. Figure 2-3 illustrates the
relationship between containers, Pods, and services.

Figure 2-3. Kubernetes services, Pods, and containers

Kubernetes, Pods, and Services | 13


Kubernetes also provides Pods and services with storage resources.
Persistent Volumes provide storage resources that will survive fail‐
ures or shutdowns. Ephemeral Volumes provide storage that will not
persist beyond the lifetime of a Pod.
Kubernetes provides all the other necessary configurations to assem‐
ble services into an application stack. This includes the distribution
of Pods across the multiple nodes in the Kubernetes cluster, man‐
agement of “secrets” such as API keys and passwords, resource
utilization limits, and scaling practices.
We don’t have space to dig into Kubernetes in detail, but there are
many resources available. O’Reilly has books (such as Production
Kubernetes by Josh Rosso et al.) and learning paths (such as a Certi‐
fied Kubernetes Application Developer prep course) that can help.

Multiregion Kubernetes
Kubernetes clusters are generally confined to a single region, though
a cluster can span multiple availability zones. While it is technically
possible to deploy a “stretch” Kubernetes cluster with nodes across
multiple regions, the latency penalty that results is likely to be
intolerable.
Consequently, each region will typically have its own Kubernetes
cluster and will use a global load balancer to route requests to
the appropriate cluster. The solution looks somewhat similar to the
“old-school” pattern shown in Figure 2-2, except that the global load
balancer routes requests not to services exposed within a VM but
to services exposed in a Kubernetes cluster. Figure 2-4 illustrates the
configuration.
The major public clouds all offer a global load balancing service
that is usually adequate for a single-cloud solution. However, if you
want to load balance between cloud platforms, or between noncloud
premises, third-party global load balancing services are offered by
the major content delivery network vendors, such as Cloudflare and
Akamai.

14 | Chapter 2: Distributing the Application Layer


Figure 2-4. Multiregion Kubernetes

Event Management
In a distributed application, there is often a need for a reliable
messaging or event management layer that allows microservices to
coordinate their work. Often, these take the form of work queues
that allow work requiring the interaction of multiple microservices
to progress reliably and asynchronously.
It’s possible to use the database as the one and only common com‐
munication layer between microservices, but this can increase the
load on what is often already a critical component in transactional
latency and is wasteful since these messages do not need to have
long-term persistence.
Consequently, many distributed applications employ a distributed
messaging service to support message requests between services.
Such a messaging service guarantees that messages will survive Pod
or node failures, without necessarily requiring long-term storage
once a work request has been fully processed.

Event Management | 15
The distributed messaging service most widely used in modern
applications is Apache Kafka, though native cloud messaging serv‐
ices such as Google Cloud Pub/Sub are also commonly used.
In many distributed designs, transient messages between services do
not pass between regions. Regions operate independently of each
other; if a region fails, messages within that region may be lost or
at least unavailable until the region is recovered. In this scenario,
the messaging system runs encapsulated within each region, which
keeps latency low and performance high.
However, in a regional failure scenario, if another region is expected
to pick up incomplete work from the failed region, the messaging
solution must span regions. In these scenarios, latency is increased
since messages must be replicated across regions before processing.
In some cases, a compromise solution can be implemented in which
messages are replicated asynchronously across regions. In this case,
some messages might be lost if a region fails, but hopefully the bulk
of work in progress is transferred to the new region. Figure 2-5
illustrates the scenario in which each region has its own Kafka
messaging service and the two are kept in sync via replication.

Figure 2-5. Replicating a Kafka messaging service across regions

16 | Chapter 2: Distributing the Application Layer


Serverless Deployments
Kubernetes has rapidly become the “OS of the cloud” because of
its ability to abstract the underlying cloud platform and allow con‐
tainers to be orchestrated in such a way as to deliver a distributed,
scalable application. However, Kubernetes configuration and man‐
agement are far from simple and, in some cases, may become more
complex than the application itself.
Serverless platforms can provide a simpler solution for deploying
individual services. A serverless platform attempts to abstract the
entirety of the underlying infrastructure and simply provides a plat‐
form that runs application code. Although serverless platforms lack
some of the capabilities of Kubernetes, they radically reduce the
management and deployment overheads.
Serverless platforms have reduced administrative overheads, which
can reduce staffing expenditure. They can also reduce hosting costs:
a Kubernetes deployment will typically exact some hosting costs
even when idle, but the billing for a serverless deployment will
typically be based on actual utilization, not peak utilization.
On the other hand, serverless platforms offer increased simplicity
at the expense of reduced flexibility. Serverless platforms work well
when the application is composed of independent containers that
are completely stateless and don’t interact with each other.
Furthermore, serverless platforms can place restrictions on con‐
tainer sizes or programming languages that are supported. And
when things go wrong in a serverless platform, you will be more
reliant on the platform vendor than is the case with a Kubernetes
environment.
In some circumstances, security policies might require that applica‐
tion code not run on shared cloud instances. Such a policy might
rule out a serverless deployment, where you have little control over
where code runs.
Nevertheless, we expect serverless platforms to evolve to address
some of these restrictions over time and become an increasingly
attractive option for a wider range of use cases.

Serverless Deployments | 17
The developer experience for a serverless platform is very
straightforward:

1. The developer implements a service using a supported language


for the platform.
2. The developer invokes a serverless platform command to
deploy the service. This command will typically package the
service as a Docker container and deploy that container to the
cloud infrastructure.
3. Scaling and management of the service are controlled from the
platform’s management interface.

Although serverless is often attractive from a billing point of view,


the automatic scaling can result in billing “surprises.” Most server‐
less platforms provide a mechanism for providing billing alerts and
billing limits to prevent runaway scaling and cost overruns.
A variety of serverless platforms are available:
Google Cloud Run
A Docker container–oriented serverless platform available in
the Google Cloud Platform.
Amazon AWS Lambda and AWS Fargate
Lambda is a function-as-a-service offering that is suitable for
very short-running function calls, while Fargate is a container‐
ized solution like Cloud Run.
Microsoft Azure Container Apps
A containerized serverless platform available in the Azure
cloud.
Knative
A Kubernetes-based platform offering a serverless experience
within a Kubernetes cluster. This allows enterprises to provide
a serverless experience within a private Kubernetes-based cloud,
and allows developers with access to Kubernetes clusters a sim‐
pler deployment experience.

18 | Chapter 2: Distributing the Application Layer


Multiregion Serverless
Most cloud platforms support serverless deployments into specific
regions only. To achieve a multiregion serverless solution, we will
need to deploy the serverless solution into each region, then employ
a global load balancer to route traffic to the appropriate region. The
global load balancer acts exactly like those illustrated in Figures 2-2
and 2-4, routing requests to the service within a region that can
satisfy the request.

Summary
Modern cloud platforms and containerization solutions provide
cost-effective and low-risk mechanisms for deploying a globally
distributed application. Legacy application architectures can be
deployed to cloud-based virtual machines with global load balancers
providing fault tolerance and workload distribution. Docker con‐
tainers orchestrated by Kubernetes provide a more robust and effec‐
tive means of deploying microservice-based applications. Serverless
platforms provide an even higher level of abstraction and simplifica‐
tion, albeit with some limitations on flexibility.
These platforms work well with the stateless and transient compute-
oriented components of an application. However, almost all dis‐
tributed applications must maintain a consistent and persistent data
store that coordinates and records the activities of the microservices
that deliver the application’s functionality. In Chapter 3, we’ll dig
into the configuration of such a distributed transactional database.

Multiregion Serverless | 19
CHAPTER 3
Distributing and Scaling
the Storage Layer

To some extent, the problem of distributing application logic—or


more generally, computational services—was solved decades ago.
Even in the client-server era prevalent in the 1990s, application logic
would typically be running on the client tier in multiple locations—
indeed, there would typically be one copy of application logic run‐
ning on each user’s client PC. Each had, in effect, their own compute
and memory, which do not depend on, or interfere with, another
user’s compute or memory.
As we transitioned from client-server to the early internet, web
application servers took on the role of hosting computational serv‐
ices, and it was generally possible to deploy as many of these as was
necessary to support demand.
However, the database layer has historically been a completely dif‐
ferent matter. Every application will have certain data elements that
must be consistent across all users or at least some subset of users.
Therefore, users must share database services. Maintaining a consis‐
tent and persistent database state across distributed nodes has been
challenging.
Database technology has come a long way since the monolithic
databases that powered web 1.0. Today, reliable distributed transac‐
tional databases exist, and these databases are best suited to meet the
demands of modern distributed applications.

21
Distributed Database Examples
In the following sections, we describe the capabilities
common to modern distributed transactional database
systems. Such systems include CockroachDB, Google
Spanner, YugabyteDB, Microsoft Cosmos DB, and
others.
However, not all distributed transactional databases
implement all the features outlined in this chapter, and
this report does not attempt to compare the feature
sets of various distributed transactional database sys‐
tems. In most cases, the terminology and capabilities
described are those of CockroachDB, with which the
authors are most familiar. Other databases may use
different terminology or have different capabilities.

Transactional Versus Nontransactional


Distributed Databases
This report is focused on transactional technologies. However, it’s
worth briefly reviewing the justification and use of nontransactional
alternatives.
CAP theorem, or Brewer’s theorem, states that you can only have at
most two of three desirable characteristics in a distributed system:
Consistency
Every user sees the same view of the database state.
Availability
The database remains available unless all elements of the dis‐
tributed system fail.
Partition tolerance
The system can continue to operate in the event that a network
partition divides the distributed system in two or if two nodes
in the network cannot communicate.
To use a simple example, if we have an application with users in the
US and Europe, and the network between the two continents fails,
we need to decide between the application shutting down in one
continent or continuing in both continents but with inconsistencies
arising between the two continents as local transactions change the
local view of the database.

22 | Chapter 3: Distributing and Scaling the Storage Layer


During the initial rollout of the first global internet applications,
network partitions were a frequent enough occurrence that many
organizations felt motivated to sacrifice consistency for availability.
However, the situation is different today. Network partitions are still
possible, but the major cloud vendors have built such redundancy
into their network infrastructure that the chance of such a network
outage is massively reduced. Figure 3-1 shows a typical set of net‐
work links between major regions in a public cloud.

Figure 3-1. Major public clouds have redundant network links between
each location, reducing the likelihood of network partitions

With the possibility of network partitions reduced, the trade-offs


between consistency and availability have changed. Realizing this,
Google developed Spanner, a distributed and transactionally con‐
sistent database service. Third-party equivalents to Spanner, such
as CockroachDB and YugabyteDB, have since become available,
and some have matured to the point at which they represent a
production-ready, cross-platform solution for distributed transac‐
tional applications.

Hosting Strategies for Distributed Databases


Application architects have a variety of choices for the deployment
of the database layer in a modern distributed transactional applica‐
tion. In brief, these are as follows:

Hosting Strategies for Distributed Databases | 23


Do-it-yourself on your own hardware
In this scenario, you provision the hardware on which the nodes
of the distributed database run and deploy the database to those
nodes.
Do-it-yourself on a cloud platform
In this scenario, you use cloud-based VMs to host the nodes
of the distributed database and deploy the database into these
VMs. You could even take this approach to distribute your
databases across multiple cloud platforms.
Fully managed cloud database service
Most database vendors offer a “fully managed” cloud-hosted
version of the database, made available as a service. You are
not responsible for the deployment or administration of these
nodes, though you still must determine the relative number and
capacity of the nodes.
Serverless cloud database
Serverless offerings are relatively new and provide elastic
capacity based on demand. The amount of computing resources
made available to your application adjusts automatically, and
you pay only for the resources you use.
Not all databases support all deployment patterns. For instance,
some systems support only fully managed deployments (Spanner,
for example). And while most vendors offer fully managed cloud
servers, at the time of writing very few (such as CockroachDB) offer
a serverless option.
Table 3-1 compares some of the advantages and disadvantages of
each deployment pattern.

Table 3-1. Comparison of deployment options


Advantages Disadvantages
Do-it-yourself • Maximum control over hardware • The highest cost in terms of skilled
on your own and OS configuration human resources
hardware • Lower ongoing hardware “rental” • High initial cost in terms of hardware
costs compared to a cloud acquisition
deployment • Paying for computing resources that
• Reduced latency for applications are unused during idle periods
running on premises

24 | Chapter 3: Distributing and Scaling the Storage Layer


Advantages Disadvantages
Do-it-yourself • Ability to reconfigure computing • Increased operational expenditure
on a cloud resources dynamically (hardware “rental”)
platform • Reduced capital hardware
expenditure
• Availability of additional services
such as Amazon S3 for backup
storage
• Ability to fine-tune placement of
nodes to minimize latency
Fully • Reduced operational costs • Some loss of fine-grained control over
managed • Rapid deployment software and hardware configuration
cloud • Rapid scaling and reconfiguration
database
service
Serverless • Minimal operational costs • Your deployment is cotenanted with
cloud • Pay only for the resources you use other serverless users. This may result
database • No need for predeployment capacity in less predictable performance when
planning compared to a dedicated deployment.
• Automatic and seamless scaling • This approach may conflict with some
with demand security policies regarding colocation
of data with other tenants.
• Billing may be unpredictable if
monthly budgets are set too high,
or performance may be throttled if
monthly budgets are set too low.

Of course, if your application layer is running in the cloud, then


you will almost certainly choose a cloud deployment option for
your database. Only if your application layer is running on premises
would you consider an on-premises database deployment.
There are two major reasons why we would normally be “all cloud”
or “all on premises”:
Security
The application layer typically communicates with the database
layer within a private network. If one component is on premises
while the other is in the cloud, then database traffic must transi‐
tion from one network to another, increasing vulnerability to a
cyberattack.
Latency
The latency between the application and the database is one
of the key components of end-user response time and overall
throughput. Making sure that the application code is located

Hosting Strategies for Distributed Databases | 25


“close” to a database node minimizes that latency. Such close‐
ness is not possible when the application layer and database
layer are not both on premises or both in the same cloud
platform.
We argued in Chapter 2 and elsewhere that for most modern dis‐
tributed applications, the advantages of public cloud deployment are
decisive. If you’ve accepted that argument, then most likely you will
be drawn toward a cloud deployment of some form.
Of the three cloud options for database deployment—do-it-yourself,
fully managed, and serverless—we believe that the self-managed
option is usually the least attractive, especially for a small team.
It involves greater administrative costs as well as a higher risk of
failure.

Serverless or Dedicated Deployment?


If you have decided upon a fully managed cloud database deploy‐
ment, you might also need to decide between a dedicated or server‐
less deployment.
In a dedicated deployment, you choose the number and size of the
database nodes. These nodes are dedicated to your cluster and are
under your control, and your billing will be largely determined by
the number and types of nodes you configure.
In a serverless deployment, you don’t have to worry about any of
that. You simply sign up for a serverless account—possibly within
a specific region—and provide a limit on the amount of monthly
spending you’re willing to commit to. Resources will be applied
to your service as your workload requires, and you’ll only ever be
charged for the resources you use.
There’s a lot to like in a serverless deployment:

• You are paying only for the resources you use, so if your appli‐
cation has peaks and troughs of activity, you will probably save
money.
• Resources applied to your workload will scale dynamically—as
the workload demands increase or decrease, CPU and memory
will be adjusted to suit. As a result, you may not need to
perform benchmarks or otherwise determine ahead of time

26 | Chapter 3: Distributing and Scaling the Storage Layer


the hardware resources needed to support the application
workload.
• You generally have a monthly “free” allowance of utilization, so
you can develop for free and then seamlessly transition to a paid
service when your application moves into production.

These advantages are compelling across a wide range of use cases.


However, there are some limitations:

• In a serverless deployment, your application shares some physi‐


cal resources with other serverless users. Of course, you can’t
see data from other users, but some organizations with hyper‐
sensitive security requirements might find this cotenanting
unacceptable.
• This cotenanting also allows for the possibility that a “noisy
neighbor” might disrupt your performance. Because some hard‐
ware resources are shared, it’s possible that a very high load on
another tenant causes a noticeable drop in throughput on your
service. Furthermore, during periods of low activity, your data
in cache memory may be replaced by data from other tenants.
When your application starts to ramp back up, it will experience
a “cold cache” scenario in which physical I/O rates are higher
than normal and, consequently, query latencies are increased.
• In dedicated mode, both your cost and resource utilization are
relatively fixed, so there’ll be no unpredictability in billing or
in performance. In serverless mode, you “cap” your bill at a
certain amount; if your resource utilization exceeds that cap,
then you’ll be throttled back to the performance limitations
provided by the free tier. There are, of course, ways to monitor
and manage your resource utilization. Nevertheless, if you’re
not paying attention to your application’s workload, you might
find that you are consuming resources at a greater than predic‐
ted rate, and consequently that your spend is exceeding expect‐
ation. When the spend limit is reached and throttling occurs,
the resources available to the serverless cluster will be reduced,
leading to increases in latency and reduced throughput. If the
resource limitations of the free tier are not sufficient to support
your application’s workload, you’ll probably need to increase
your spending limit. Serverless platforms offer thresholds and
alerts to help you keep track of your spend, but it is still the case
that a sudden increase in workload—maybe even from a single

Serverless or Dedicated Deployment? | 27


errant SQL request—might cause an unexpected increase in cost
or an undesirable reduction in database throughput.
• Currently available serverless database offerings are sometimes
tied to a specific region, which may cause performance issues if
the application is distributed globally.
• Not all database vendors offer a serverless deployment option.

To plan a serverless deployment, you need to establish an upper


limit on your monthly spending, a cloud provider, and the region or
regions that your serverless deployment will work within.

Kubernetes
We spoke at length about Kubernetes in Chapter 2. Kubernetes is
almost a no-brainer for a modern application, providing advantages
in deployment, portability, manageability, and scaling. It’s also an
attractive framework for running databases—most of the database
vendors use Kubernetes in their own cloud deployments.
However, providing you can arrange for your Kubernetes nodes to
be running in the same data center as your database nodes, we don’t
think it is mandatory to use Kubernetes as the database platform
for a Kubernetes-based application layer. The types of workloads
encountered by the database are very different from those in the
application layer, and it may be that the two workloads won’t coexist
all that well if colocated in the same Kubernetes cluster.
Default Kubernetes settings are rarely appropriate for database
deployments. Historically, Kubernetes has been used for applications
rather than databases. For applications, CPU and memory allocation
have been more influential than I/O management. Consequently,
many Kubernetes clusters—particularly those on cloud platforms—
are configured with economical “storage by the GB” disks. For
instance, when creating a Kubernetes cluster on Google Cloud Plat‐
form, the default disk type for the node pool is standard persistent
disk, whereas an SSD persistent disk is a much better option for a
database deployment.

28 | Chapter 3: Distributing and Scaling the Storage Layer


Finally, as we saw in Chapter 2, Kubernetes clusters cannot easily
span regions. If you are looking for a multiregion database deploy‐
ment, Kubernetes might not be the best fit.
Most database vendors offer a Kubernetes operator that encapsulates
the logic for deploying and managing their database engine in a
Kubernetes cluster. If you are deploying your database on Kuber‐
netes, these operators are recommended.

Placement Policies
We argued earlier that the advantages of fully managed or serverless
cloud-based deployments for your database layer are compelling.
A fully managed deployment reduces the human costs involved
in managing the distributed database and usually reduces the
operational risks involved in a complex, multiregion deployment.
However, there are a few considerations that might lead you to a
self-managed cloud deployment.
In a fully managed deployment, you don’t have access to all the
fine-grained configuration options that will be available in a self-
managed deployment. For instance, you won’t be able to modify the
Linux kernel configuration or all the database tuning parameters,
and access to logs and other diagnostic information will be reduced.
Furthermore, you won’t have completely fine-grained control over
the placement of your database nodes. Most fully managed options
allow you to determine the region in which each node will exist, but
not normally the default zone. It might be difficult or impossible to
ensure that an application node and database node are in the same
zone (in effect, within the same data center).
In some cases, you might want to control the placement of nodes
even within the data center. For instance, most clouds allow for
placement policies that can encourage two nodes to be located phys‐
ically close to one another—in the same rack, for instance. This
optimization is not available when using a fully managed service.

Placement Policies | 29
Multiregion Database Deployments
We discussed the concepts of zones and regions in Chapter 2. The
concept of regions and zones is common to the major cloud plat‐
forms, allowing the provisioning of services with no single point
of failure, either globally or locally. In almost all cases, we use the
same region and zone definitions for the database as for the applica‐
tion layer. Any mismatch between application regions and database
regions will increase latency and jeopardize availability should a
region fail.
As with the application layer, we define regions and zones such that
low-latency requests can be satisfied in each region and application
service can continue even if computing resources in one of the zones
fail.

Distributed Database Consensus


Modern distributed databases implement a distributed consensus
protocol that allows all the nodes that comprise the database to agree
on the current state of any data item.
Most commonly, the data in the database will be partitioned by
primary key range or some other mechanism and distributed across
the nodes of the cluster. Typically, every data partition will be
replicated at least three times. The majority of nodes must always
participate in any data change so that in the event of a single-node
failure, a copy of the most recent data is available. Higher numbers
of replicas are required if the database is to survive multiple failures.
The Raft protocol is often used to coordinate the consensus process.
In Raft, one of the nodes is elected as a “leader” for a specific data
partition.
Figure 3-2 illustrates some of these principles in a typical distributed
database (in this case, CockroachDB).
It’s not necessary to fully understand the internals of distributed
databases. But it is important to understand the need for multiple
replicas and the consensus mechanism since it will influence our
regional and zone design.

30 | Chapter 3: Distributing and Scaling the Storage Layer


Figure 3-2. A distributed database architecture

While the details of consensus mechanisms are complex and vary


across database platforms, the following considerations apply to
most distributed databases:

• A replication factor defines how many copies of each data item


are distributed across the cluster. The replication factor will be
less than or equal to the number of nodes.
• A majority of replicas—and hence a majority of nodes—must
be available for a data item to be available. For instance, if the
replication factor is three, then only a single-node failure can be
tolerated. If the replication factor is five, then a two-node failure
might be tolerated.
• In a distributed application, performance is improved by having
most of a region’s data located in that region since it allows
updates to complete with only local writes. However, high avail‐
ability is improved by distributing replicas such that a single
region failure or network disconnection does not cause a full
system outage.

Distributed Database Consensus | 31


Survival Goals
A distributed database may implement survival goals, which deter‐
mine trade-offs between latency and survivability. For instance, we
might choose between the following survival goals:
Zone failure survival goal
The database will remain available for reads and writes even if
a node in a zone fails. The number of zones that can fail will
depend on the replication factor.
Region failure survival goal
The database will be fully available even in the event of the fail‐
ure of an entire region. To achieve this, data must be replicated
to another region, which in most circumstances will reduce
write performance.
Figure 3-3 illustrates a zone failure within a database with a zone
survival goal. The data ranges in question are maintained in three
replicas within the US region (the default region), and a failure of
any single zone (in this case, the New York zone) in that region
allows the database to continue to function with two replicas. How‐
ever, should an entire region fail, then the entire database would
probably become unavailable because some data would lose more
than one replica.

Figure 3-3. Example of a zone failure in zone survival goal

32 | Chapter 3: Distributing and Scaling the Storage Layer


Figure 3-4 shows a region failure for a database with a regional sur‐
vival goal. Because of setting regional survival goals for a database,
the replication factor is automatically increased to five instead of the
default three, and the replicas are distributed across regions. When
the US region fails, there are still three copies of the data in other
regions, and database operations can continue.

Figure 3-4. Example of a region failure in regional survival goal

Zone survival results in the best performance, while regional sur‐


vival promotes greater survivability in case of large-scale outages or
network partitions. When deciding between the two, the following
considerations are relevant:

• In the big public cloud platforms, failures of entire regions are


rare but not unheard of. The zone survival configuration shown
in Figure 3-3 could not have remained completely available
during a region failure. You shouldn’t assume on any platform
that a regional outage is impossible.
• Regional survival implies zone survival. By default, a regional
survival goal protects against the loss of any single region or the
failure of any two zones.
• Just as we need at least three nodes to allow for survivability in a
single-region cluster, we need three regions to allow for regional
survival. With only two regions, the failure of any one region
would make most replicas for some data elements unavailable.

Survival Goals | 33
Locality Rules
Regardless of the survival goal, we may be able to fine-tune the
distribution of data within a table to optimize access from specific
regions. This is primarily done to optimize low-latency requests
from various regions and can also be used to comply with legal
requirements for data domiciling.
Tables in a distributed database may have locality rules that deter‐
mine how their data will be distributed across zones:
Global table
This table will be optimized for low-latency reads from any
region.
Regional table
This table will be optimized for low-latency reads and writes
from a single region.
Regional-by-row table
This table will have specific rows optimized for low-latency
reads and writes for a region. Different rows in the table can be
assigned to specific regions.
With a global table, replicas for all rows within the table will be
duplicated in each region. This ensures that read time is optimized
but creates the highest overhead for writes because all regions must
coordinate on a write request. Global tables are suitable for relatively
static lookup tables that are relevant across all regions. A product
table might be a relevant example—product information is often
shared across regions and not subject to frequent updates; therefore,
performance is optimized if each region has a complete copy of
the product table. The downside is that writes to the product table
will require participation from all regions and therefore be relatively
slow.
With a regional table, as much replica information as possible
(subject to failure goals) for all ranges in the table is in a single
region. This makes sense either if that region is more important
to the business than other regions or if the data is particularly
relevant to that region. For instance, if in an internationalized appli‐
cation error codes for each language were in separate tables, then it
might make sense to locate these in particular regions (though this,

34 | Chapter 3: Distributing and Scaling the Storage Layer


of course, would rekindle the age-old debate on where “English”
should reside).
A regional-by-row table locates the replicas for specific rows in spe‐
cific regions. For example, in a users table, we could assign rows to
the US region if their country code was USA, Canada, or any country
in North or South America. Regional by row is a very powerful way
of moving data close to the regions in which it is required.
If your primary goal is high performance in all geographies, then
you are probably going to be motivated toward zone survival. In
this scenario, carefully determining your table locality settings will
make a big difference. Tables that have global relevance and that
are read-intensive should probably be global. Tables that contain
regional-specific data should probably become regional by row.
Only if a table is specific to a region—such as a language-specific
table—should it be left as regional.
If your primary goal is high availability, then you’re probably going
to need regional survival. In this scenario, table localities are less
important for write performance because writes are distributed
across multiple regions by default. However, you will still see advan‐
tages in configuring global, regional, and regional-by-row settings
for selected tables to optimize read performance and distribute the
workload more evenly across your cluster.

Summary
A distributed transactional application will need a database platform
that can maintain consistency across widely separated geographi‐
cal regions while simultaneously supporting low-latency operations
from each of those regions.
Modern distributed transactional databases can span multiple geog‐
raphies and potentially survive failures of entire regions. Multi‐
region configurations also allow you to fine-tune the distribution
of data such that data resides where it is most likely to be used, thus
reducing latency for both reads and writes.
There are some trade-offs between latency and availability. Most
critically, where regional survival is required, some increase in write
latency will occur because multiple regions will have to participate
in transaction consensus.

Summary | 35
For most organizations, only public clouds will offer the global
scope and redundancy necessary for a successful distributed data‐
base deployment. Database vendors offer fully managed cloud plat‐
forms that reduce administrative overhead and operational risk.
Some vendors also offer serverless options that can further reduce
complexity and optimize billing. However, in some cases, a do-it-
yourself deployment on a public cloud can deliver the ultimate per‐
formance optimizations by fine-tuning the placement of application
and database nodes.
The requirements of modern applications often demand a dis‐
tributed, transactional solution. In the past, such solutions were
only available to the largest and most sophisticated organizations.
Today, the existence of public cloud platforms and container tech‐
nologies such as Docker and Kubernetes, together with distributed
transactional database platforms, allows virtually any team to imple‐
ment a distributed transactional application.
We hope this report is a useful starting point for those embarking on
the journey to a distributed transactional architecture.

36 | Chapter 3: Distributing and Scaling the Storage Layer


About the Authors
Guy Harrison has worked with databases for more than two
decades and authored books on Oracle, MySQL, MongoDB, and
other technologies, as well as CockroachDB: The Definitive Guide
(O’Reilly). Harrison is currently a CTO at ProvenDB and resides in
Australia.
Andrew Marshall is a Portland, OR–based product strategy leader
who loves to think, talk, and write about how to help teams build
better software.
Charles Custer is a former teacher, tech journalist, and filmmaker
who has combined those three professions into writing and making
videos about databases and application development (and occasion‐
ally messing with NLP and Python to create weird things in his
spare time).
Another random document with
no related content on Scribd:
Length to end of tail 18 1/2 inches, to end of wings 11 3/8; extent of
wings 22 1/2; wing from flexure 8; tail 10 1/2; bill along the ridge
1 4 1/2/12; tarsus 1 10 1/2/12; first toe 8/12, its claw 10/12; middle toe
1 2/12, its claw 6/12.

Adult Female. Plate CCCLVII.


The Female is similar to the male, and little inferior in size.
Five American specimens compared with several European, present
no appearances indicative of a specific difference. Some individuals
of both countries are larger than others, and the tail differs much in
length, according to age or the growth of the feathers. The largest
specimen in my possession, presented to me by Dr Richardson,
and marked as shot by Mr Drummond, measures as follows:—
Length to end of tail 20 1/2 inches; bill along the ridge 1 7/12; tail
11 3/4; wing from flexure 8 9/12; tarsus 2; middle toe 1 1/12, its claw
7 1/2/12.
In this individual the feathers on the fore neck are white for
more than half their length from the base. In the other specimens this
white part is fainter or light grey, and of less extent.
PINE GROSBEAK.

Pyrrhula Enucleator, Temm.


PLATE CCCLVIII. Male and Female.

In Wilson’s time, this beautiful bird was rare in Pennsylvania; but


since then it has occasionally been seen in considerable numbers,
and in the winter of 1836 my young friend J. Trudeau, M. D.
procured several in the vicinity of Philadelphia. That season also
they were abundant in the States of New York and Massachusetts.
Some have been procured near the mouth of the Big Guyandotte on
the Ohio; and Mr Nuttall has observed it on the lower parts of the
Missouri. I have ascertained it to be a constant resident in the State
of Maine, and have met with it on several islands in the Bay of
Fundy, as well as in Newfoundland and Labrador. Dr Richardson
mentions it as having been observed by the Expedition in the 50th
parallel, and as a constant resident at Hudson’s Bay. It is indeed the
hardiest bird of its tribe yet discovered in North America, where even
the Rose-breasted Grosbeak, though found during summer in
Newfoundland and Labrador, removes in autumn to countries farther
south than the Texas, where as late as the middle of May I saw
many in their richest plumage.
The Pine Grosbeak is a charming songster. Well do I remember how
delighted I felt, while lying on the moss-clad rocks of Newfoundland,
near St George’s Bay, I listened to its continuous lay, so late as the
middle of August, particularly about sunset. I was reminded of the
pleasure I had formerly enjoyed on the banks of the clear Mohawk,
under nearly similar circumstances, when lending an attentive ear to
the mellow notes of another Grosbeak. But, Reader, at
Newfoundland I was still farther removed from my beloved family;
the scenery around was thrice wilder and more magnificent. The
stupendous dark granite rocks, fronting the north, as if bidding
defiance to the wintry tempests, brought a chillness to my heart, as I
thought of the hardships endured by those intrepid travellers who, for
the advancement of science, had braved the horrors of the polar
winter. The glowing tints of the western sky, and the brightening stars
twinkling over the waters of the great Gulf, rivetted me to the spot,
and the longer I gazed, the more I wished to remain; but darkness
was suddenly produced by the advance of a mass of damp fog, the
bird ceased its song, and all around seemed transformed into chaos.
Silently I groped my way to the beach, and soon reached the Ripley.
The young gentlemen of my party, accompanied by my son John
Woodhouse, and a Newfoundland Indian, had gone into the interior
in search of Rein Deer, but returned the following afternoon, having
found the flies and musquitoes intolerable. My son brought a number
of Pine Grosbeaks, of different sexes, young and adult, but all the
latter in moult, and patched with dark red, ash, black and white. It
was curious to see how covered with sores the legs of the old birds
of both sexes were. These sores or excrescences are, I believe,
produced by the resinous matter of the fir-trees on which they obtain
their food. Some specimens had the hinder part of the tarsi more
than double the usual size, the excrescences could not be removed
by the hand, and I was surprised that the birds had not found means
of ridding themselves of such an inconvenience. One of the figures
in my plate represents the form of these sores.
I was assured that during mild winters, the Pine Grosbeak is found in
the forests of Newfoundland in considerable numbers, and that some
remain during the most severe cold. A lady who had resided there
many years, and who was fond of birds, assured me that she had
kept several males in cages; that they soon became familiar, would
sing during the night, and fed on all sorts of fruits and berries during
the summer, and on seeds of various kinds in winter; that they were
fond of bathing, but liable to cramps; and that they died of sores
produced around their eyes and the base of the upper mandible. I
have observed the same to happen to the Cardinal and Rose-
breasted Grosbeaks.
The flight of this bird is undulating and smooth, performed in a direct
line when it is migrating, at a considerable height above the forests,
and in groups of from five to ten individuals. They alight frequently
during the day, on such trees as are opening their buds or blossoms.
At such times they are extremely gentle, and easily approached, are
extremely fond of bathing, and whether on the ground or on
branches, move by short leaps. I have been much surprised to see,
on my having fired, those that were untouched, fly directly towards
me, until within a few feet, and then slide off and alight on the lower
branches of the nearest tree, where, standing as erect as little
Hawks, they gazed upon me as if I were an object quite new, and of
whose nature they were ignorant. They are easily caught under
snow-shoes put up with a figure of four, around the wood-cutters
camps, in the State of Maine, and are said to afford good eating.
Their food consists of the buds and seeds of almost all sorts of trees.
Occasionally also they seize a passing insect. I once knew one of
these sweet songsters, which, in the evening, as soon as the lamp
was lighted in the room where its cage was hung, would instantly
tune its voice anew.
My kind friend Thomas M’Culloch of Pictou in Nova Scotia, has
sent me the following notice, which I trust will prove as interesting to
you as it has been to me. “Last winter the snow was exceedingly
deep, and the storms so frequent and violent that many birds must
have perished in consequence of the scarcity of food. The Pine
Grosbeaks being driven from the woods, collected about the barns in
great numbers, and even in the streets of Pictou they frequently
alighted in search of food. A pair of these birds which had been
recently taken were brought me by a friend, but they were in such a
poor emaciated condition, that I almost despaired of being able to
preserve them alive. Being anxious, however, to note for you the
changes of their plumage, I determined to make the attempt; but
notwithstanding all my care, they died a few days after they came
into my possession. Shortly after, I received a male in splendid
plumage, but so emaciated that he seemed little else than a mass of
feathers. By more cautious feeding, however, he soon regained his
flesh, and became so tame as to eat from my hand without the least
appearance of fear. To reconcile him gradually to confinement, he
was permitted to fly about my bedroom, and upon rising in the
morning, the first thing I did was to give him a small quantity of seed.
But three mornings in succession I happened to lie rather later than
usual, and each morning I was aroused by the bird fluttering upon
my shoulder, and calling for his usual allowance. The third morning, I
allowed him to flutter about me some time before shewing any
symptom of being awake, but he no sooner observed that his object
was effected than he retired to the window and waited patiently until I
arose. As the spring approached, he used to whistle occasionally in
the morning, and his notes, like those of his relative the Rose-
breasted Grosbeak, were exceedingly rich and full. About the time,
however, when the species began to remove to the north, his former
familiarity entirely disappeared. During the day he never rested a
moment, but continued to run from one side of the window to the
other, seeking a way of escape, and frequently during the night,
when the moonlight would fall upon the window, I was awakened by
him dashing against the glass. The desire of liberty seemed at last to
absorb every other feeling, and during four days I could not detect
the least diminution in the quantity of his food, while at the same time
he filled the house with a piteous wailing cry, which no person could
hear without feeling for the poor captive. Unable to resist his
appeals, I gave him his release; but when this was attained he
seemed very careless of availing himself of it. Having perched upon
the top of a tree in front of the house, he arranged his feathers, and
looked about him for a short time. He then alighted by the door, and I
was at last obliged to drive him away, lest some accident should
befall him.
“These birds are subject to a curious disease, which I have never
seen in any other. Irregularly shaped whitish masses are formed
upon the legs and feet. To the eye these lumps appear not unlike
pieces of lime; but when broken, the interior presents a congeries of
minute cells, as regularly and beautifully formed as those of a honey-
comb. Sometimes, though rarely, I have seen the whole of the legs
and feet covered with this substance, and when the crust was
broken, the bone was bare, and the sinews seemed almost
altogether to have lost the power of moving the feet. An
acquaintance of mine kept one of these birds during the summer
months. It became quite tame, but at last it lost the power of its legs
and died. By this person I was informed that his Grosbeak usually
sang during a thunder-storm, or when rain was falling on the house.”
While in the State of Maine, I observed that these birds, when
travelling, fly in silence, and at a considerable height above the
trees. They alight on the topmost branches, so that it is difficult to
obtain them, unless one has a remarkably good gun. But, on waiting
a few minutes, you see the flock, usually composed of seven or eight
individuals, descend from branch to branch, and betake themselves
to the ground, where they pick up gravel, hop towards the nearest
pool or streamlet, and bathe by dipping their heads and scattering
the water over them, until they are quite wet; after which they fly to
the branches of low bushes, shake themselves with so much vigour
as to produce a smart rustling sound, and arrange their plumage.
They then search for food among the boughs of the taller trees.
Loxia Enucleator, Linn. Syst. Nat. vol. i. p. 299.—Lath. Ind. Ornith. vol. i. p.
372.
Pine Grosbeak, Loxia Enucleator, Wils. Amer. Ornith. vol. i. p. 80, pl. 5, fig.
2.
Pyrrhula Enucleator, Ch. Bonaparte, Synopsis of Birds of United States,
p. 119.
Pyrrhula (Corythus) Enucleator, Richards. and Swains. Fauna Bor.-
Amer. vol. ii. p. 262.
Pine Grosbeak or Bullfinch, Nuttall, Manual, vol. i. p. 535.

Adult Male. Plate CCCLVIII. Fig. 1.


Bill short, robust, bulging at the base, conical, acute; upper mandible
with its dorsal outline convex, the sides convex, the edges sharp and
overlapping; lower mandible with the angle short and very broad, the
dorsal line ascending and slightly convex, the sides rounded, the
edges inflected; the acute decurved tip of the upper mandible
extending considerably beyond that of the lower; the gape-line
deflected at the base.
Head rather large, ovate, flattened above; neck short; body full. Legs
short, of moderate strength; tarsus short, compressed, with six
anterior scutella, and two plates behind, forming a thin edge; toes
short, the first proportionally stout, the third much longer than the two
lateral, which are about equal; their scutella large, their lower surface
with large pads covered with prominent papillæ. Claws rather long,
arched, much compressed, laterally grooved, and acute.
Plumage soft, full, rather blended, the feathers oblong. At the base of
the upper mandible are strong bristly feathers directed forwards. The
wings of moderate length; the primaries rounded, the second and
third longest, and with the fourth and fifth having their outer webs
slightly cut out. Tail rather long, emarginate, of twelve strong, broad,
obliquely rounded feathers.
Bill reddish-brown. Iris hazel. Feet blackish-brown, claws black. The
general colour of the plumage is bright carmine tinged with vermilion;
the feathers of the fore part of the back and the scapulars greyish-
brown in the centre; the bristly feathers at the base of the bill
blackish-brown; the middle of the breast, abdomen, and lower tail-
coverts, light grey, the latter with a central dusky streak. Wings
blackish-brown; the primaries and their coverts narrowly edged with
reddish-white, the secondaries more broadly with white; the
secondary coverts and first row of small coverts tipped with reddish-
white, the smaller coverts edged with red.
Length to end of tail 8 1/2 inches, to end of wings 6 1/4, to end of
claws 6 3/4; extent of wings 14; wing from flexure 4 3/4; tail 4; bill
along the ridge 7 1/2/12, along the edge of lower mandible 7/12; tarsus
9 1/2/ ; first toe 4 1/2/12, its claw 5/12; middle toe 8/12, its claw 5/12.
12

Female. Plate CCCLVIII. Fig. 2.


The female is scarcely inferior to the male in size. The bill is dusky,
the feet as in the male. The upper part of the head and hind neck are
yellowish-brown, each feather with a central dusky streak; the rump
brownish-yellow; the rest of the upper parts light brownish-grey.
Wings and tail as in the male, the white edgings and the tips tinged
with grey; the cheeks and throat greyish-white or yellowish; the fore
part and sides of the neck, the breast, sides, and abdomen ash-grey,
as are the lower tail-coverts.

Length to end of tail 8 1/4 inches, to end of wings 6 1/4, to end of


claws 6 3/4; extent of wings 13 1/2; wing from flexure 4 1/2; tail 3 10/12;
tarsus 9 1/2/12; middle toe and claw 1 1/12.
Young fully fledged. Plate CCCLVIII. Fig. 3.
The young, when in full plumage, resemble the female, but are more
tinged with brown.

Fig. 1.

Fig. 2.

An adult male from Boston examined. The roof of the mouth is


moderately concave, its anterior horny part with five prominent
ridges; the lower mandible deeply concave. Tongue 4 1/2 twelfths
long, firm, deflected at the middle, deeper than broad, papillate at the
base, with a median groove; for the distal half of its length, it is cased
with a firm horny substance, and is then of an oblong shape, when
viewed from above, deeply concave, with two flattened prominences
at the base, the point rounded and thin, the back or lower surface
convex. This remarkable structure of the tongue appears to be
intended for the purpose of enabling the bird, when it has insinuated
its bill between the scales of a strobilus, to lay hold of the seed by
pressing it against the roof of the mandible. In the Crossbills, the
tongue is nearly of the same form, but more slender, and these birds
feed in the same manner, in so far as regards the prehension of the
food. In the present species, the tongue is much strengthened by the
peculiar form of the basi-hyoid bone, to which there is appended as it
were above a thin longitudinal crest, giving it great firmness in the
perpendicular movements of the organ. The œsophagus a b c d, Fig.
1. is two inches 11 twelfths long, dilated on the middle of the neck so
as to form a kind of elongated dimidiate crop, 4 twelfths of an inch in
diameter, projecting to the right side, and with the trachea passing
along that side of the vertebræ. The proventriculus c, is 8 twelfths
long, somewhat bulbiform, with numerous oblong glandules, its
greatest diameter 4 1/2 twelfths. A very curious peculiarity of the
stomach e, is, that in place of having its axis continuous with that of
the œsophagus or proventriculus, it bends to the right nearly at a
right angle. It is a very powerful gizzard, 8 1/2 twelfths long, 8 twelfths
broad, with its lateral muscles 1/4 inch thick, the lower very distinct,
the epithelium longitudinally rugous, of a light reddish colour. The
duodenum, f, g, first curves backward to the length of 1 1/4 inch, then
folds in the usual manner, passing behind the right lobe of the liver;
the intestine then passes upwards and to the left, curves along the
left side, crosses to the right, forms about ten circumvolutions, and
above the stomach terminates in the rectum, which is 11 twelfths
long. The cœca are 1 1/4 twelfth in length and 1/4 twelfth in diameter.
The entire length of the intestine from the pylorus to the anus is
31 1/2 inches (in another male 31); its greatest breadth in the
duodenum 2 1/2 twelfths, gradually contracting to 1 1/4 twelfth. Fig. 2.
represents the convoluted appearance of the intestine. The
œsophagus a b c; the gizzard d, turned forwards; the duodenum, e f;
the rest of the intestine, g h the cœca, i; the rectum, i j, which is
much dilated at the end.

The trachea is 2 inches 2 twelfths long, of uniform diameter. 1 1/2


twelfth broad, with about 60 rings; its muscles like those of all the
other species of the Passerinæ or Fringillidæ.
In a female, the œsophagus is 2 inches 10 twelfths long; the
intestine 31 inches long.
In all these individuals and several others, the stomach contained a
great quantity of particles of white quartz, with remains of seeds; and
in the œsophagus of one was an oat seed entire.
Although this bird is in its habits very similar to the Crossbills, and
feeds on the same sort of food, it differs from them in the form and
extent of its crop, in having the gizzard much larger, and the
intestines more than double the length, in proportion to the size of
the bird.
ARKANSAW FLYCATCHER.

Muscicapa verticalis, Bonap.


PLATE CCCLIX. Male and Female.

This species extends its range from the mouth of the Columbia
River, across our continent, to the shores of the Gulf of Mexico; but
how far north it may proceed is as yet unknown. On the 10th of April
1837, whilst on Cayo Island, in the Bay of Mexico, I found a
specimen of this bird dead at the door of a deserted house, which
had recently been occupied by some salt-makers. From its freshness
I supposed that it had sought refuge in the house on the preceding
evening, which had been very cold for the season. Birds of several
other species we also found dead on the beaches. The individual
thus met with was emaciated, probably in consequence of a long
journey and scanty fare; but I was not the less pleased with it, as it
afforded me the means of taking measurements of a species not
previously described in full. In my possession are some remarkably
fine skins, from Dr Townsend’s collection, which differ considerably
from the figure given by Bonaparte, who first described the species.
So nearly allied is it to the Green-crested Flycatcher, M. crinita, that
after finding the dead bird, my son and I, seeing many individuals of
that species on the trees about the house mentioned, shot several of
them, supposing them, to be the same. We are indebted to the
lamented Thomas Say for the introduction of the Arkansaw
Flycatcher into our Fauna. Mr Nuttall has supplied me with an
account of its manners.
“We first met with this bold and querulous species, early in July, in
the scanty woods which border the north-west branch of the Platte,
within the range of the Rocky Mountains; and from thence we saw
them to the forests of the Columbia and the Wahlamet, as well as in
all parts of Upper California, to latitude 32°. They are remarkably
noisy and quarrelsome with each other, and in the time of incubation,
like the King Bird, suffer nothing of the bird kind to approach them
without exhibiting their predilection for battle and dispute. About the
middle of June, in the dark swamped forests of the Wahlamet, we
every day heard the discordant clicking warble of this bird, somewhat
like tsh’k, tsh’k, tshivait, sounding almost like the creaking of a rusty
door-hinge, somewhat in the manner of the King Bird, with a
blending of the notes of the Blackbird or Common Grakle. Although I
saw these birds residing in the woods of the Columbia, and near the
St Diego in Upper California, I have not been able to find the nest,
which is probably made in low thickets, where it would be
consequently easily overlooked. In the Rocky Mountains they do not
probably breed before midsummer, as they are still together in noisy
quarrelsome bands until the middle of June.”
Dr Townsend’s notice respecting it is as follows: “This is the Chlow-
ish-pil of the Chinooks. It is numerous along the banks of the Platte,
particularly in the vicinity of trees and bushes. It is found also, though
not so abundantly, across the whole range of the Rocky Mountains;
and among the banks of the Columbia to the ocean, it is a very
common species. Its voice is much more musical than is usual with
birds of its genus, and its motions are remarkably quick and graceful.
Its flight is often long sustained, and like the Common King Bird, with
which it associates, it is frequently seen to rest in the air, maintaining
its position for a considerable time. The males are wonderfully
belligerent, fighting almost constantly, and with great fury, and their
loud notes of anger and defiance remind one strongly of the
discordant grating and creaking of a rusty door hinge. The Indians of
the Columbia accuse him of a propensity to destroy the young, and
eat the eggs of other birds.”

Tyrannus verticalis, Say, Long’s Exped. vol. ii. p. 60.


Musicapa verticalis, Ch. Bonaparte, Synopsis of Birds of United States, p.
67.
Arkansaw Flycatcher, Musicapa verticalis, Ch. Bonaparte, Amer. Ornith.
vol. i. p. 18, pl. 2, fig. 2.
Arkansaw Flycatcher, Nuttall, Manual, vol. ii. p. 273.

Adult Male. Plate CCCLIX. Fig. 1.


Bill rather long, stout, tapering, broader than high, unless toward the
end. Upper mandible with its dorsal outline straight and declinate,
until at the tip, where it is deflected, the ridge narrow, the sides
convex, the edges sharp, with a slight notch close to the very narrow
tip. Lower mandible with the angle short and broad, the dorsal line
ascending and very slightly convex, the ridge broad and flat at the
base, the sides convex, the edges sharp, the tip acute. The gape-
line almost straight. Nostrils basal, elliptical, partly covered by the
bristly feathers.
Head rather large; neck short; body slender. Feet very short; tarsus
slender, compressed, with six anterior scutella, which are so large
below as almost to meet behind; toes free, slender, of moderate
length. Claws moderately arched, much compressed, acute.
Plumage soft and blended. Strong bristles along the basal margin of
the upper mandible, and over the nostrils. Wings rather long, broad;
the first five primaries much attenuated toward the end, the first more
so, the fifth least; this attenuation being chiefly produced by an
incision on the first web; the first four are nearly equal, the third
longest, the fourth half a twelfth shorter, the third one-twelfth shorter
than the hind, and exceeding the first by nearly two-twelfths; the
other primaries gradually broader and more rounded; outer
secondaries abrupt and slightly emarginate. Tail rather long, almost
even, of twelve broad, abruptly rounded and acuminate feathers.
Bill black. Iris brown. Feet and claws black. The general colour of the
upper parts is ash-grey, the back tinged with yellow; the wing-coverts
and quills chocolate-brown, with brownish-white edges, those of the
inner secondaries broader. Upper tail-coverts and tail black,
excepting the outer web of the lateral feather on each side, and the
basal margin of the next. There is a patch of bright vermilion on the
top of the head, tinged with orange-yellow behind. Throat greyish-
white, the sides and fore part of the neck pale ash-grey, shaded on
the fore part of the breast into pure yellow, which is the prevalent
colour of the lower parts; lower wing-coverts yellow, the middle ones
tinged with grey.
Length to end of tail 9 inches, to end of wings 7, to end of claws 7,
extent of wings 15 1/4; tail 3 7/8; wing from flexure 5 1/2; bill along the
1/2
ridge 9/12, along the edge of lower mandible 1 1/12; tarsus 8 /12; first
toe 3 1/2/12, its claw 4 1/2/12; third toe 7/12, its claw 4/12.

Adult Female. Plate CCCLIX. Fig. 2.


The Female is rather smaller, but is similar to the male in colouring.
The young also is similar to the adult, but wants the red patch on the
head.
In the female mentioned above as having been found in Texas, the
mouth is half an inch wide, its roof anteriorly slightly concave, with
three median prominent lines, the palate flat, with its membrane or
skin diaphanous, as in Goatsuckers. The tongue is 7 twelfths long,
deeply emarginate and papillate at the base; triangular, extremely
depressed, tapering to a thin slit and bristly point. The posterior
aperture of the nares is 4 twelfths long, linear, papillate on the edges,
ending abruptly at its fore part, without a prolonged fissure.
Œsophagus, a, a, b, 2 inches 9 twelfths long, funnel-shaped for half
an inch, then cylindrical and nearly 4 twelfths in diameter, until it
enters the thorax. Proventriculus, c, 3 1/2 twelfths in diameter, and
with a belt of oblong glandules. Stomach c, d, elliptical, 7 1/2 twelfths
long, 6 twelfths broad, its lateral muscles of moderate strength, the
lower not distinct; the epithelium with broad longitudinal rugæ, and of
a dark reddish-brown colour. Intestine, e, f, g, 7 inches long, its
diameter at the anterior part 3 1/2 twelfths, gradually diminishing to
1 1/2 twelfth. Cœca extremely small, 1 twelfth long, 1/2 twelfth broad,
and 1 1/4 inch distant from the anus; cloaca i, globular.

Trachea 1 inch 10 twelfths long, tapering from a diameter of 2


twelfths to 1 twelfth; the rings ossified and firm, about 70 in number;
the lateral and sterno-tracheal muscles slender; the inferior laryngeal
muscles are strong but very short, forming a prominent knob, and
attached to the first bronchial ring. Bronchi wide, of about 20 half-
rings.
The digestive organs of this bird, and of the Flycatchers in general,
do not differ materially from those of the Thrushes and Warblers. The
pharynx and œsophagus, however, are much wider.
SWALLOW-TAILED FLYCATCHER.

Muscicapa forficata, Gmel.


PLATE CCCLIX. Male.

Not having seen this handsome bird alive, I am unable to give you
any account of its habits from my own observation; but I have
pleasure in supplying the deficiency by extracting the following notice
from the “Manual of the Ornithology of the United States and of
Canada,” by my excellent friend Thomas Nuttall.
“This very beautiful and singular species of Flycatcher is confined
wholly to the open plains and scanty forests of the remote south-
western regions beyond the Mississippi, where they, in all probability,
extend their residence to the high plains of Mexico. I found these
birds rather common near the banks of Red River, about the
confluence of the Kiamesha. I again saw them more abundant, near
the Great Salt River of the Arkansa in the month of August, when the
young and old appeared, like our King Birds, assembling together
previously to their departure for the south. They alighted repeatedly
on the tall plants of the prairie, and were probably preying upon the
grasshoppers, which were now abundant. At this time also, they
were wholly silent, and flitted before our path with suspicion and
timidity. A week or two after, we saw them no more, having retired
probably to tropical winter-quarters.
“In the month of May, a pair, which I daily saw for three or four
weeks, had made a nest on the horizontal branch of an elm,
probably twelve or more feet from the ground. I did not examine it
very near, but it appeared externally composed of coarse dry grass.
The female, when first seen, was engaged in sitting, and her mate
wildly attacked every bird which approached their residence. The
harsh chirping note of the male, kept up at intervals, as remarked by
Mr Say, almost resembled the barking of the Prairie Marmot, ’tsh,
’tsh, ’tsh. His flowing kite-like tail, spread or contracted at will while
flying, is a singular trait in his plumage, and rendered him
conspicuously beautiful to the most careless observer.”

Muscicapa forficata Gmel. Linn. Syst. Nat. vol. i. p. 931.—Lath. Ind. Ornith.
vol. ii. p. 485.—Ch. Bonaparte, Synopsis of Birds of United States, p. 275.
Swallow-tailed Flycatcher, Muscicapa forficata, Bonap. Amer. Ornith.
vol. i. p. 15, pl. 2, fig. 1.
Swallow-tailed Flycatcher, Nuttall, Manual, vol. i. p. 275.

Adult Male. Plate CCCLIX. Fig. 3.


Bill of moderate length, rather stout, subtrigonal, depressed at the
base, straight; upper mandible with its dorsal outline nearly straight,
and declinate, to near the tip, which is deflected, slender,
compressed, and acute, the edges sharp and overlapping, with a
slight notch close to the tip; lower mandible with the angle rather
long and wide, the back broad at the base, the dorsal line ascending
and very slightly convex, the edges sharp, the tip acute. Nostrils
basal, roundish, partly covered by the bristly feathers.
Head rather large; neck short; body ovate. Feet short; tarsus with six
anterior very broad scutella. Toes free, slender; the first stout, the
lateral equal; claws rather long, arched, slender, much compressed,
very acute.
Plumage soft and blended. Bristles at the base of the upper
mandible strong. Wings rather long, the first four quills longest, with
their inner webs emarginate and attenuate at the end. Tail very long,
deeply forked, of twelve broad, rounded feathers.
Bill and feet black. Iris hazel. Upper part of the head, the cheeks,
and the hind part and sides of the neck, ash-grey; scapulars and
back darker and tinged with reddish-brown; the rump darker, the
upper tail-coverts black. Wings brownish black, all the feathers
margined with greyish-white, the anterior wing-coverts scarlet; tail-
feathers deep black, with their terminal margins white, the three
outer on each side pale rose-coloured to near the end. The throat,
fore part of neck and breast, pure white; the sides, abdomen, and
lower tail-coverts, and lower wing-coverts, pale rose-colour; the
axillary feathers bright scarlet.
Length to end of tail 11 1/2 inches, to end of wings 7 1/2; tail to the
fork 2 2/12, to the end 5 1/2; wing from flexure 5 1/8; bill along the
ridge 5/8, along the edge of lower mandible 7/8; tarsus 3/4; hind toe
3/ , 1/ 1/
8 its claw 4/12; middle toe 5 /12, its claw 3
2 /12.
2

You might also like