
Apache Beam SDK-based pipelines

Dataflow pipelines are based on Apache Beam.

[Figure: pipeline diagram. Input, followed by three transforms that each produce a PCollection, then Output.]

Basic concepts of Apache Beam programming


The basic concepts of Apache Beam programming include:

● PCollections: The PCollection abstraction represents a potentially
distributed, multi-element data set that acts as the pipeline's data. Beam
transforms use PCollection objects as inputs and outputs.

● Transforms: These are the operations in your pipeline. A transform takes
a PCollection (or multiple PCollections) as input, performs an operation
that you specify on each element in that collection, and produces a new
output PCollection.

● Pipeline I/O: Beam provides read and write transforms for several
common data storage types and allows you to create your own.
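
The three concepts above can be sketched with the Beam Python SDK. This is a minimal illustration, not a reference implementation: the helper names (`parse_order`, `is_large`, `run_pipeline`), the "name,amount" line format, and the threshold are all assumptions made up for the example. It assumes the SDK is installed (`pip install apache-beam`) and runs on the default local runner.

```python
# Hypothetical example data format: each input line is "name,amount".

def parse_order(line):
    """Turn a 'name,amount' text line into a (name, amount) pair."""
    name, amount = line.split(",")
    return name.strip(), float(amount)


def is_large(order, threshold=10.0):
    """Return True for orders whose amount is at least the threshold."""
    return order[1] >= threshold


def run_pipeline(lines, output_prefix):
    """Build and run a small pipeline over in-memory lines."""
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam

    # Leaving the `with` block runs the pipeline and waits for it to finish.
    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Create builds the initial PCollection from an in-memory list.
            | "CreateInput" >> beam.Create(lines)
            # Each transform consumes a PCollection and produces a new one.
            | "Parse" >> beam.Map(parse_order)
            | "KeepLarge" >> beam.Filter(is_large)
            | "SumPerName" >> beam.CombinePerKey(sum)
            | "Format" >> beam.Map(lambda kv: f"{kv[0]},{kv[1]}")
            # Pipeline I/O: a built-in write transform for text files.
            | "Write" >> beam.io.WriteToText(output_prefix)
        )
```

Calling `run_pipeline(["alice, 12.5", "bob, 3.0"], "/tmp/orders")` would write the summed large orders to text files whose names start with the given prefix. In a real pipeline, the `beam.Create` step would usually be replaced by a read transform such as `beam.io.ReadFromText`.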

Go to Apache Beam SDK-based programming to learn more.
