Architecture Design and Principles
When you start deploying Observability Pipelines Worker into your infrastructure, you
may run into questions such as:
• Where should the Observability Pipelines Worker be deployed within the network?
• How should the data be collected?
• Where should the data be processed?
Let's go through what to consider when designing your Observability Pipelines Worker
architecture, specifically these topics:
• Networking
• Collecting data
• Processing data
• Buffering data
• Routing data
Networking
The first step to architecting your Observability Pipelines Worker deployment is
understanding where Observability Pipelines Worker fits within your network and where
to deploy it.
• Agents that produce vendor-specific data the Observability Pipelines Worker cannot
process should continue to send that data directly to the vendor.
    • For example, Datadog Network Performance Monitoring integrates the
    Datadog Agent with vendor-specific systems and produces vendor-specific
    data.
    • Because this data is not a supported data type in the Observability
    Pipelines Worker, the Datadog Agent should collect it and send it directly
    to Datadog.
• Use source components such as datadog_agent or open_telemetry to
receive data from your agents.
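As a rough illustration, a pipeline can declare both source components side by side. This is a minimal sketch assuming a Vector-style YAML configuration; the component IDs, addresses, and ports are illustrative, not defaults.

```yaml
# Hypothetical snippet: receive data from Datadog Agents and from
# OpenTelemetry SDKs or collectors. IDs and addresses are examples.
sources:
  datadog_agents:
    type: datadog_agent
    address: 0.0.0.0:8282   # port the Agents are configured to send to
  otel:
    type: open_telemetry
    grpc:
      address: 0.0.0.0:4317 # conventional OTLP/gRPC port
    http:
      address: 0.0.0.0:4318 # conventional OTLP/HTTP port
```

Downstream transforms and sinks would then list these source IDs in their `inputs`.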
Processing data
If you want to design an efficient pipeline between your Observability Pipelines Worker’s
sources and sinks, it helps to understand which types of data to process and where to process it.
1. Choosing which data to process
• You can use Observability Pipelines Worker to process logs, metrics, and traces. However, real-
time, vendor-specific data, such as continuous profiling data, is not interoperable and typically
does not benefit from processing.
2. Choosing where to process data
• Observability Pipelines Worker can be deployed anywhere in your infrastructure. Deploy it
directly on your node as an agent for local processing, or on separate nodes as an aggregator
for remote processing. Where the processing happens depends largely on your use case and
environment.
Local processing
• With local processing, Observability Pipelines Worker is deployed on
each node as an agent.
• Data is processed on the same node from which the data originated. This
provides operational simplicity since the Observability Pipelines Worker has
direct access to your data and scales along with your infrastructure.
Optimize your data for analysis while reducing costs by doing the
following:
• Front the sink with a memory buffer.
• Set batch.timeout_secs to ≤ 5 seconds (the default for analytical
sinks, such as datadog_logs).
• Use the remap transform to remove attributes not used for
analysis.
• Filter out events not used for analysis.
• Consider sampling logs with level info or lower to reduce their
volume.
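The recommendations above can be sketched in one pipeline fragment. This is a hedged, Vector-style YAML example, not a drop-in configuration: the component IDs, the attribute being removed, and the sampling rate are all assumptions chosen for illustration.

```yaml
# Hypothetical sketch of the optimization steps above.
transforms:
  trim_attributes:
    type: remap
    inputs: ["datadog_agents"]          # assumed upstream source ID
    source: |
      # Remove attributes not used for analysis (example attribute)
      del(.unused_attribute)
  drop_noise:
    type: filter
    inputs: ["trim_attributes"]
    condition: '.level != "debug"'      # filter events not used for analysis
  sample_info:
    type: sample
    inputs: ["drop_noise"]
    rate: 10                            # keep ~1 in 10 info-and-lower events

sinks:
  datadog:
    type: datadog_logs
    inputs: ["sample_info"]
    default_api_key: "${DATADOG_API_KEY}"
    buffer:
      type: memory                      # front the sink with a memory buffer
    batch:
      timeout_secs: 5                   # flush at least every 5 seconds
```

The ordering matters for cost: trimming and filtering before sampling means the sampler, buffer, and sink all handle less data.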