
Architecture Design and Principles

When you start deploying Observability Pipelines Worker into your infrastructure, you
may run into questions such as:
• Where should the Observability Pipelines Worker be deployed within the network?
• How should the data be collected?
• Where should the data be processed?
Let's go through what to consider when designing your Observability Pipelines Worker
architecture, specifically these topics:
• Networking
• Collecting data
• Processing data
• Buffering data
• Routing data
Networking
The first step to architecting your Observability Pipelines Worker deployment is
understanding where Observability Pipelines Worker fits within your network and where
to deploy it.

Working with network boundaries
• When deploying the Observability Pipelines Worker as an aggregator, deploy it within
your network boundaries to minimize egress costs. Ingress into the Observability
Pipelines Worker should never travel over the public internet. To keep things simple,
Datadog recommends starting with one aggregator per region.
Using firewalls and proxies
• When using firewalls, restrict agent communication to your aggregators and restrict
aggregator communication to your configured sources and sinks.
• If you prefer to use an HTTP proxy, the Observability Pipelines Worker offers a global
proxy option to route all Observability Pipelines Worker HTTP traffic through the proxy.
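For illustration, the global proxy option might be set in the Worker's configuration roughly as
follows. This is a minimal sketch assuming a Vector-style YAML configuration file; the proxy
host is hypothetical, and exact option names can vary by Worker version.

    # Route all Observability Pipelines Worker HTTP(S) traffic through a proxy.
    proxy:
      enabled: true
      http: "http://proxy.internal.example.com:3128"    # hypothetical proxy endpoint
      https: "http://proxy.internal.example.com:3128"
      no_proxy:
        - "localhost"
        - "127.0.0.1"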
Using DNS and service discovery
Discovery of your Observability Pipelines Worker aggregators and services should resolve through DNS or
service discovery. This strategy facilitates routing and load balancing of your traffic, and is how your agents and
load balancers discover your aggregators. For proper separation of concerns, the Observability Pipelines Worker
does not resolve DNS queries and, instead, delegates this to a system-level resolver (for
example, the Linux system resolver).
Choosing protocols
• When sending data to the Observability Pipelines Worker, Datadog recommends
choosing a protocol that allows easy load balancing and application-level delivery
acknowledgment. HTTP and gRPC are preferred due to their ubiquity and the number of
available tools and amount of documentation for operating HTTP- and gRPC-based
services effectively and efficiently.
• Choose the source that aligns with your protocol. Each Observability Pipelines Worker
source implements different protocols. For example, Observability Pipelines Worker
sources and sinks use gRPC for inter-Observability Pipelines Worker communication,
and the HTTP source allows you to receive data over HTTP. See Sources for their
respective protocols.
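For example, a source that accepts events over HTTP could be declared roughly as in this
sketch. The component ID http_in, the listen address, and the codec are illustrative, and the
exact source type and options depend on your Worker version (see Sources).

    sources:
      http_in:                  # hypothetical component ID
        type: http_server       # receive events over plain HTTP
        address: 0.0.0.0:8080   # listen address your load balancer targets
        decoding:
          codec: json           # decode request bodies as JSON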
Collecting data
Your pipeline begins with data collection. Your services and systems generate logs,
metrics, and traces that can be collected and sent downstream to your destinations.
Data collection is achieved with agents, and understanding which agents to use ensures
you are collecting the data you want.
Choosing agents
• Following the guideline to start with one aggregator, choose the agent that optimizes
your engineering team's ability to monitor their systems. Integrate the Observability
Pipelines Worker with the best agent for the job and replace the other agents with the
Observability Pipelines Worker.
When Observability Pipelines Worker can replace agents
The Observability Pipelines Worker can replace agents performing generic data
forwarding functions, such as:
• Tailing and forwarding log files
• Collecting and forwarding service metrics without enrichment
• Collecting and forwarding service logs without enrichment
• Collecting and forwarding service traces without enrichment
These functions collect and forward existing data without modifying it. Since these
functions are not unique, these agents can be replaced with the Observability Pipelines
Worker, which provides more configuration options that may be needed as your
environment evolves.
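For instance, a Worker configuration that tails log files and forwards them unchanged might
look roughly like this sketch; the file path and component IDs are hypothetical.

    sources:
      app_logs:
        type: file                       # tail log files, replacing a generic forwarding agent
        include:
          - /var/log/my-service/*.log    # hypothetical log path

    sinks:
      forward:
        type: datadog_logs               # forward as-is, without enrichment
        inputs:
          - app_logs
        default_api_key: "${DD_API_KEY}" # read the API key from the environment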
When Observability Pipelines Worker should integrate with agents

• The Observability Pipelines Worker should integrate with agents that produce
vendor-specific data that the Observability Pipelines Worker cannot replicate.
• For example, Datadog Network Performance Monitoring integrates the
Datadog Agent with vendor-specific systems and produces vendor-specific
data.
• Therefore, the Datadog Agent should collect the data and send it directly to
Datadog, since the data is not a supported data type in the Observability
Pipelines Worker.
• Use source components such as datadog_agent or open_telemetry to receive data
from your agents.
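A rough sketch of receiving data from existing agents follows; the component IDs and listen
addresses are placeholders, and option names can differ between Worker versions.

    sources:
      from_datadog_agent:
        type: datadog_agent       # receive data sent by Datadog Agents
        address: 0.0.0.0:8282     # hypothetical listen address
      from_otel:
        type: open_telemetry      # receive OTLP data from OpenTelemetry collectors or SDKs
        grpc:
          address: 0.0.0.0:4317
        http:
          address: 0.0.0.0:4318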
Reducing agent risk
• When integrating with an agent, configure the agent to be a simple data forwarder
and route supported data types through the Observability Pipelines Worker. This
reduces the risk of data loss and service disruption by minimizing the agent's
responsibilities.
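With the Datadog Agent, for example, this usually amounts to pointing its log output at the
Worker in datadog.yaml, roughly as sketched below; the Worker URL is hypothetical, and key
names may differ by Agent version.

    # datadog.yaml (Datadog Agent): act as a simple forwarder and
    # route logs through the Observability Pipelines Worker.
    observability_pipelines_worker:
      logs:
        enabled: true
        url: "http://opw.internal.example.com:8282"   # hypothetical Worker endpoint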
Processing data

If you want to design an efficient pipeline between your Observability Pipelines Worker's
sources and sinks, it helps to understand which types of data to process and where to process them.
1. Choosing which data to process
• You can use Observability Pipelines Worker to process logs, metrics, and traces. However, real-
time, vendor-specific data, such as continuous profiling data, is not interoperable and typically
does not benefit from processing.
2. Choosing where to process data
• Observability Pipelines Worker can be deployed anywhere in your infrastructure. Deploy it
directly on your node as an agent for local processing, or on separate nodes as an aggregator
for remote processing. Where the processing happens depends largely on your use case and
environment.
Local processing
• With local processing, the Observability Pipelines Worker is deployed on each node
as an agent.
• Data is processed on the same node from which it originated. This provides
operational simplicity, since the Observability Pipelines Worker has direct access to
your data and scales along with your infrastructure.
• Local processing is recommended for:
• Simple environments that do not require high durability or high availability.
• Use cases, such as fast, stateless processing and streaming delivery, that do not
require holding onto data for long periods of time.
• Operators that can make node-level changes without a lot of friction.
Buffering data
Where and how you buffer your data can also affect the efficiency of your pipeline.
Choosing where to buffer data
• Buffering should happen close to your destinations, and each destination should have
its own isolated buffer, which offers the following benefits:
1. Each destination can configure its buffer to meet the sink's requirements.
2. Isolating buffers for each destination prevents one misbehaving destination from
halting the entire pipeline until the buffer reaches the configured capacity.
For these reasons, the Observability Pipelines Worker couples buffers with its sinks.
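Concretely, each sink declares its own buffer stanza. The following sketch shows two sinks
with isolated, differently tuned buffers; the component IDs, bucket name, region, and sizes are
illustrative.

    sinks:
      archive:
        type: aws_s3
        inputs: [processed]
        bucket: "org-log-archive"      # hypothetical bucket
        region: "us-east-1"            # hypothetical region
        buffer:
          type: disk                   # durable, on-disk buffer owned by this sink only
          max_size: 536870912          # bytes
          when_full: block             # apply back-pressure rather than drop data
      analytics:
        type: datadog_logs
        inputs: [processed]
        default_api_key: "${DD_API_KEY}"
        buffer:
          type: memory                 # fast in-memory buffer, isolated from the archive sink
          max_events: 10000
          when_full: drop_newest       # a stalled destination sheds load instead of halting the pipeline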
Routing data
• Routing data, so that your aggregators send it to the proper destination, is the final
piece in your pipeline design. Use aggregators to flexibly route data to the best system
for your team(s).

Separating systems of record and analysis
• Separate your system of record from your system of analysis to optimize cost without
making trade-offs that affect their purpose. For example, your system of record can
batch large amounts of data over time and compress it to minimize cost while ensuring
high durability for all data. Your system of analysis, on the other hand, can sample and
clean data to reduce cost while keeping latency low for real-time analysis.
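A rough sketch of this split, assuming an S3-backed system of record and Datadog as the
system of analysis; the bucket name, component IDs, and timings are illustrative.

    sinks:
      system_of_record:
        type: aws_s3
        inputs: [all_logs]
        bucket: "org-log-archive"    # hypothetical bucket
        region: "us-east-1"          # hypothetical region
        compression: gzip            # compress large batches to minimize storage cost
        batch:
          timeout_secs: 300          # accumulate data over time before writing
      system_of_analysis:
        type: datadog_logs
        inputs: [analysis_ready]     # a reduced, cleaned stream (see the next section)
        default_api_key: "${DD_API_KEY}"
        batch:
          timeout_secs: 5            # keep latency low for real-time analysis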
Routing to your system of analysis

Optimize your system of analysis for analysis while reducing costs by doing the
following:
• Front the sink with a memory buffer.
• Set batch.timeout_secs to ≤ 5 seconds (the default for analytical sinks, such as
datadog_logs).
• Use the remap transform to remove attributes not used for analysis.
• Filter out events not used for analysis.
• Consider sampling logs with a level of info or lower to reduce their volume.
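Putting these recommendations together, a hypothetical analysis path might look like the
following sketch; the component IDs, the attribute being removed, the filter condition, and the
sample rate are all illustrative.

    transforms:
      trim_attributes:
        type: remap                      # remove attributes not used for analysis
        inputs: [all_logs]
        source: |
          del(.debug_payload)            # hypothetical unused attribute

      drop_unused:
        type: filter                     # drop events not used for analysis
        inputs: [trim_attributes]
        condition: '.status != "debug"'

      sample_low_severity:
        type: sample                     # sample lower-severity logs to reduce volume
        inputs: [drop_unused]
        rate: 10                         # keep roughly 1 in 10 events
        exclude: '.status == "error"'    # never sample away errors

    sinks:
      analysis:
        type: datadog_logs
        inputs: [sample_low_severity]
        default_api_key: "${DD_API_KEY}"
        buffer:
          type: memory                   # front the sink with a memory buffer
        batch:
          timeout_secs: 5                # at or below 5 seconds, the default for datadog_logs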
