
1. What is Talend software? What is a project in Talend? Why is Talend called a Code Generator?

Talend is an ETL tool for data integration. It provides software solutions for data preparation, data
quality, data integration, application integration, data management and big data. Talend has a separate
product for each of these solutions. The data integration and big data products are the most widely used.

In Talend Studio, the highest physical structure for storing all different types of data integration Jobs,
metadata, routines, etc. is the "project".

Once a Job has been designed, Talend Studio automatically translates it into a Java class at the
back end. ... Every component of a Job is split into three parts of Java code (begin, main and end).
This is why Talend Studio is called a code generator.
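
The begin/main/end split can be pictured with a small hand-written sketch. This is not the code Talend Studio actually generates (real generated classes are much larger); the class name `GeneratedJobSketch` and the method `runComponent` are hypothetical and only mirror the three parts named above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: it mirrors the begin/main/end split,
// not the real layout or naming of Talend-generated classes.
class GeneratedJobSketch {

    // Runs one "component" over its input rows and returns a log of what ran.
    static List<String> runComponent(List<String> rows) {
        List<String> log = new ArrayList<>();

        // begin: executed once, before the first row (initialise counters, open resources)
        int rowCount = 0;
        log.add("begin");

        // main: executed once per incoming row
        for (String row : rows) {
            rowCount++;
            log.add("main:" + row.toUpperCase());
        }

        // end: executed once, after the last row (publish counters, close resources)
        log.add("end:rows=" + rowCount);
        return log;
    }

    public static void main(String[] args) {
        runComponent(List.of("alice", "bob")).forEach(System.out::println);
    }
}
```

The point of the sketch is the ordering: the begin part runs once, the main part runs for every row, and the end part runs once after the data has been processed.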

2. Describe a Job Design in Talend.

A Job design is the technical implementation and graphical representation of the business model.
In this design, one or more components are connected with each other to run a data integration
process. When you drag and drop components in the design pane and connect them with
connectors, the Job design is converted to code, which creates a complete runnable program
that forms the data flow.

3. What is a ‘Component’ in Talend? List down different component families available in Talend.

What makes something a Talend component? A component usually consists of the following files: an XML
descriptor file, a messages properties file, some Java template files, an icon and some JAR files that are
imported and used by the component.

Explain the various types of connections (row, iterate, trigger) available in Talend.

There are various types of connections which define either the data to be processed, the data
output, or the Job logical sequence.

Right-click a component on the design workspace to display a contextual menu that lists all
available connections for the selected component.

The sections below describe all available connection types.

Row connection

A Row connection handles the actual data. Row connections can be Main, Lookup, Reject, Output,
Uniques/Duplicates, or Combine according to the nature of the flow processed.
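
As a rough illustration of two of these flows, the sketch below routes each input row either to a Main output or to a Reject output. It is plain Java written for this explanation, not Talend-generated code; `RowFlowSketch`, `Flows` and `route` are hypothetical names, and the "valid row" rule (the row parses as an integer) is an assumption chosen for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one component with a Main output and a Reject output.
class RowFlowSketch {

    // Holds the two output flows of the component.
    static final class Flows {
        final List<Integer> main = new ArrayList<>();
        final List<String> reject = new ArrayList<>();
    }

    // Rows that parse as integers follow the Main flow; all others follow Reject.
    static Flows route(List<String> input) {
        Flows flows = new Flows();
        for (String row : input) {
            try {
                flows.main.add(Integer.parseInt(row.trim()));
            } catch (NumberFormatException e) {
                flows.reject.add(row); // keep the original row for inspection
            }
        }
        return flows;
    }

    public static void main(String[] args) {
        Flows flows = route(List.of("10", " 20 ", "abc"));
        System.out.println("main=" + flows.main + " reject=" + flows.reject);
    }
}
```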

Iterate connection

The Iterate connection can be used to loop on files contained in a directory, on rows contained
in a file or on DB entries.

A component can be the target of only one Iterate connection. The Iterate connection is mainly
connected to the start component of a flow (in a subJob).

Some components, such as the tFileList component, are meant to be connected through an Iterate
connection with the next component. For how to set an Iterate connection, see Iterate
connection settings.
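
The "loop on files contained in a directory" case can be sketched in plain Java as follows. This is only an analogy for what a tFileList component followed by an Iterate connection does, not its implementation; `IterateSketch`, `lineCounts` and `demo` are hypothetical names, and counting lines stands in for whatever flow the iteration would start.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical sketch: iterate over matching files and run one flow per file.
class IterateSketch {

    // For each file matching the glob, run the target flow once (here: count its lines).
    static Map<String, Long> lineCounts(Path dir, String glob) {
        Map<String, Long> counts = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, glob)) {
            for (Path file : files) {                          // one iteration per file
                try (Stream<String> lines = Files.lines(file)) {
                    counts.put(file.getFileName().toString(), lines.count());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    // Builds a throwaway directory so the sketch can run anywhere.
    static Map<String, Long> demo() {
        try {
            Path dir = Files.createTempDirectory("iterate-sketch");
            Files.writeString(dir.resolve("a.csv"), "1\n2\n");
            Files.writeString(dir.resolve("b.csv"), "1\n");
            Files.writeString(dir.resolve("notes.txt"), "ignored\n");
            return lineCounts(dir, "*.csv");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each pass of the loop corresponds to one iteration: the flow on the target side starts again from its first component with the current file as input.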

Trigger connections

Trigger connections define the processing sequence, so no data is handled through these
connections.

The connection in use creates a dependency between Jobs or subJobs, which are therefore
triggered one after the other according to the nature of the trigger.
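
A trigger carries no rows; it only decides what runs next. The sketch below models this loosely on the OnSubjobOk and OnSubjobError triggers found in Talend Studio: the first subJob runs, and then exactly one follow-up runs depending on whether it failed. `TriggerSketch` and `runWithTriggers` are hypothetical names, and treating a thrown `RuntimeException` as "subJob failed" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: sequencing subJobs with "ok" and "error" triggers.
class TriggerSketch {

    // Runs subJob, then onSubjobOk if it finished normally, or onSubjobError if it threw.
    static String runWithTriggers(Runnable subJob, Runnable onSubjobOk, Runnable onSubjobError) {
        try {
            subJob.run();
        } catch (RuntimeException e) {
            onSubjobError.run();   // no data is passed along, only control
            return "error";
        }
        onSubjobOk.run();
        return "ok";
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        String outcome = runWithTriggers(
                () -> log.add("load"),
                () -> log.add("archive"),
                () -> log.add("alert"));
        System.out.println(outcome + " " + log);
    }
}
```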


Differentiate between ‘Built-in’ and ‘Repository’.

• Built-in: all information is stored locally in the Job. You can enter and edit all
information manually.
• Repository: all information is stored in the repository.

You can import read-only information into the Job from the repository. If you want to modify the
information, you must take one of the following actions:

• Convert the information from Repository to Built-in and then edit the built-in
information.
• Modify the information in the Repository. Once you have made the changes, you are
prompted to update the changes into the Job.

When to use Built-in and when to use Repository:

• Use Built-in for information that you only use once or very rarely.
• Use Repository for information that you want to use repeatedly in multiple components
or Jobs, such as a database connection.

7. What are Context Variables and why are they used in Talend?

Context variables are variables that can have different values in different
environments. You can create a context group that holds multiple context
variables, so you need not add each context variable to a Job one by one: you can simply
add the context group to the Job.
These variables are used to make the code production-ready. This means that, by using
context variables, you can move the same code between development, test and production
environments and it will run in all of them.
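
The idea can be sketched outside Talend Studio in plain Java: the Job logic reads its settings from a context and never hard-codes them, so only the selected context changes between environments. Everything named here is hypothetical (`ContextSketch`, `contextFor`, `jdbcUrl`, the variables `db_host` and `db_port`, and the example host names); it is an illustration, not how Talend stores contexts.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: one Properties object per environment plays the role of a context.
class ContextSketch {

    // Returns the context variables for the requested environment.
    static Properties contextFor(String environment) {
        Map<String, String> hosts = Map.of(
                "dev", "localhost",
                "test", "test-db.example.com",
                "prod", "prod-db.example.com");
        if (!hosts.containsKey(environment)) {
            throw new IllegalArgumentException("Unknown environment: " + environment);
        }
        Properties context = new Properties();
        context.setProperty("db_host", hosts.get(environment));
        context.setProperty("db_port", "5432");
        return context;
    }

    // The "Job" only reads context variables, so it is identical in every environment.
    static String jdbcUrl(Properties context) {
        return "jdbc:postgresql://" + context.getProperty("db_host")
                + ":" + context.getProperty("db_port") + "/sales";
    }

    public static void main(String[] args) {
        for (String env : new String[] {"dev", "test", "prod"}) {
            System.out.println(env + " -> " + jdbcUrl(contextFor(env)));
        }
    }
}
```
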
In any Job, you can go to the Contexts tab and add context variables.
