
Talend Tutorial Task Aid >

Filtering Data using the tMap Component

This tutorial uses Talend Open Studio for Data Integration, version 6.

1. Create a new Job, add the movies metadata as an input source, and add a tMap component

a. Create a new Standard Job named tMapFilter.
b. Add the movies metadata file as an input delimited component.
c. Add a tMap component that can modify the schema and filter columns.
d. Create a flow of data from the movies component to the tMap_1 component by
linking the two components.

2. Configure the tMap_1 component to filter columns


a. Double-click the tMap_1 component.
The tMap_1 wizard window has four main sections.

Left Section displays the incoming data flows. Note that there can be multiple
inputs into the tMap component.
Middle Section displays the mapping links between the input and output data
flows. Here you can also create variables that are computed from input values and
then used to produce output.
Right Section displays the output data flows.
Bottom Section is the Schema editor that can be used to modify the schema of an
input or output flow. To edit a Schema, select the input/output flow whose schema
you want to change (the selected flow is highlighted in yellow) and edit the schema
in the Schema editor.
b. To create a new output component, in the output section of the tMap_1 wizard, click
the [+] button, type the name of the output as filteredOutput, and click OK.
An empty output is created.
c. To add columns to the output, in the Schema editor of the output, click the [+] icon.
d. Define a column for movie ID (Column: movieID, Type: Integer, and Length: 4).
Note: The output column name need not be the same as the input column name. To
change the column name, edit the entry in the Schema editor.

Talend takes the complexity out of integration


Based on open source Scalable Future-proof Predictable cost
Visit www.talend.com Follow us on Twitter @Talend

e. To send the data from the movieID column of the input file to the output column, click
and hold movieID, then drag it to the Expression column of filteredOutput.
A yellow arrow appears indicating the flow of data.
f. To add the title and releaseYear columns to the output component and link them,
select and drag the columns from the input component to the output component.
g. To change the order of the columns in the output component, click the up or down
arrow icons.
The column order and the corresponding links will be updated.
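
The mapping configured in step 2 can be pictured as a plain Java sketch. This is not the code Talend generates; it only illustrates that each input row yields one output row holding just the mapped columns. The class names, the types of title and releaseYear, the extra url column, and the sample values are all assumptions made for illustration. In the tMap editor, the Expression cell created by the drag would typically read something like row1.movieID, assuming the input flow has the default name row1 (the tutorial does not name it).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: plain Java mimicking the column mapping set up in tMap_1.
final class TMapColumnFilterSketch {

    // Hypothetical input row. movieID, title and releaseYear are named in the
    // tutorial; "url" is an assumed extra column that the mapping drops.
    static final class MovieRow {
        final Integer movieID;
        final String title;        // type assumed
        final Integer releaseYear; // type assumed
        final String url;          // assumed extra column

        MovieRow(Integer movieID, String title, Integer releaseYear, String url) {
            this.movieID = movieID;
            this.title = title;
            this.releaseYear = releaseYear;
            this.url = url;
        }
    }

    // Output row holding only the three mapped columns, in the final output order.
    static final class FilteredOutputRow {
        final Integer movieID;
        final Integer releaseYear;
        final String title;

        FilteredOutputRow(Integer movieID, Integer releaseYear, String title) {
            this.movieID = movieID;
            this.releaseYear = releaseYear;
            this.title = title;
        }
    }

    // One output row per input row; columns without a mapping are not copied.
    static FilteredOutputRow map(MovieRow row1) {
        return new FilteredOutputRow(row1.movieID, row1.releaseYear, row1.title);
    }

    static List<FilteredOutputRow> mapAll(List<MovieRow> rows) {
        List<FilteredOutputRow> out = new ArrayList<>();
        for (MovieRow row : rows) {
            out.add(map(row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<MovieRow> input = new ArrayList<>();
        // Invented sample values, not taken from the movies file.
        input.add(new MovieRow(7, "Example Movie", 2001, "http://example.invalid/7"));
        for (FilteredOutputRow r : mapAll(input)) {
            System.out.println(r.movieID + " " + r.releaseYear + " " + r.title);
        }
    }
}
```

Reordering columns in the output (step g) corresponds here to changing the field order of FilteredOutputRow; the values copied for each column stay the same.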

3. Use the configured tMap_1 component


a. To display the output processed by the tMap_1 component, add a tLogRow
component in the Job Designer and link the filteredOutput output of the tMap_1
component to the tLogRow_1 component.
b. To run the Job, in the Run view, click Run.
Only the filtered movie data (movieID, releaseYear, and title) is displayed.
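
As a rough picture of what the finished Job does end to end, the sketch below reads one delimited line, keeps the three mapped columns, and formats them as one line of text. It is a minimal sketch under stated assumptions, not Talend's generated code: the column positions, the semicolon separator, and the sample line are invented, and the pipe between output values follows the default field separator of tLogRow's basic mode.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: delimited line in, filtered log line out.
final class DelimitedToLogRowSketch {

    // Assumed column positions in the input file; the tutorial does not state them.
    static final int MOVIE_ID = 0;
    static final int TITLE = 1;
    static final int RELEASE_YEAR = 2;

    // Splits one input line and returns movieID|releaseYear|title.
    static String toLogLine(String line, String fieldSeparator) {
        // Pattern.quote treats the separator literally; -1 keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(fieldSeparator), -1);
        int movieID = Integer.parseInt(fields[MOVIE_ID].trim());
        int releaseYear = Integer.parseInt(fields[RELEASE_YEAR].trim());
        String title = fields[TITLE];
        return movieID + "|" + releaseYear + "|" + title;
    }

    public static void main(String[] args) {
        // Invented sample line with an extra fourth column that is dropped.
        String line = "7;Example Movie;2001;http://example.invalid/7";
        System.out.println(toLogLine(line, ";"));
    }
}
```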

