Pentaho Kettle Pdi Eng

Pentaho Data Integration (Kettle)

PDI Overview (Kettle)

An entry-level tool for data manipulation (ETL)


PDI (Kettle) reads procedures stored in XML format
Spoon is the graphical tool used to develop those procedures
Procedures are designed by linking components
Many data sources can be used: JDBC, files, web services
JavaScript and Java are supported for complex routines
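Because procedures are stored as XML, they can be inspected with any XML parser. The sketch below parses a hand-written, heavily simplified fragment in the shape of a transformation file; the step names and the reduced element set are assumptions for illustration, and files saved by Spoon contain many more elements.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified stand-in for a transformation file.
# Step names are hypothetical; real files carry far more detail.
KTR_SAMPLE = """
<transformation>
  <info><name>users_dimension</name></info>
  <order>
    <hop><from>Query users</from><to>Write users dimension</to><enabled>Y</enabled></hop>
  </order>
  <step><name>Query users</name><type>TableInput</type></step>
  <step><name>Write users dimension</name><type>TableOutput</type></step>
</transformation>
"""


def summarize(ktr_xml):
    """Return the transformation name, its steps, and the hops linking them."""
    root = ET.fromstring(ktr_xml)
    name = root.findtext("info/name")
    steps = [(s.findtext("name"), s.findtext("type")) for s in root.findall("step")]
    hops = [(h.findtext("from"), h.findtext("to")) for h in root.findall("order/hop")]
    return name, steps, hops


name, steps, hops = summarize(KTR_SAMPLE)
print(name)
for step_name, step_type in steps:
    print(f"  step: {step_name} ({step_type})")
for source, target in hops:
    print(f"  hop:  {source} -> {target}")
```

The hops are what "linking components" produces in Spoon: each one connects the output of one step to the input of the next.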

www.robertomarchetto.com

Development environment


Example: source database


Example: destination database


Schema comparison


Procedure users_dimension

Query users:
SELECT u.id, CONCAT(u.first_name, ' ', u.last_name) AS fullname, u.title
FROM users u
WHERE u.first_name IS NOT NULL AND u.last_name IS NOT NULL
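The filter keeps only users that have both name parts, so fullname is never built from a missing value. A minimal sketch of that behaviour on an in-memory SQLite table: the sample rows are invented, and SQLite's `||` operator stands in for CONCAT so the example runs on SQLite versions that lack that function.

```python
import sqlite3

# In-memory stand-in for the source "users" table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT, first_name TEXT, last_name TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        ("1", "Ada", "Lovelace", "Analyst"),
        ("2", None, "Hopper", "Admiral"),  # no first name: filtered out
        ("3", "Alan", None, None),         # no last name: filtered out
    ],
)

# Same shape as the users query, with || replacing CONCAT for SQLite.
rows = conn.execute(
    """
    SELECT u.id, u.first_name || ' ' || u.last_name AS fullname, u.title
    FROM users u
    WHERE u.first_name IS NOT NULL AND u.last_name IS NOT NULL
    """
).fetchall()

print(rows)  # → [('1', 'Ada Lovelace', 'Analyst')]
```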

Testing


Procedure accounts_dimension

Query accounts:
SELECT a.id, a.name, a.industry, a.billing_address_postalcode,
       a.billing_address_city, a.billing_address_country
FROM accounts a

Procedure opportunities_fact

Query opportunities:
SELECT o.id, o.date_entered, o.date_closed, o.assigned_user_id,
       o.sales_stage, o.name, o.amount
FROM opportunities o
WHERE o.sales_stage IN ('Closed Won', 'Closed Lost')
ORDER BY o.id

Procedure dates_dimension


Collect procedures in a job


Using JNDI

Edit the JNDI configuration in /simple-jndi/jdbc.properties or
C:/Documents and Settings/<user>/.pentaho/simplejndi/default.properties
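Entries in these files follow the Simple-JNDI convention of one `name/key=value` line per setting. A minimal sketch that renders the entries for one data source, assuming a MySQL source; the data source name, driver class, URL, and credentials are hypothetical placeholders, not values from the example databases.

```python
def jndi_entries(name, driver, url, user, password):
    """Render the Simple-JNDI lines that describe one JDBC data source."""
    values = {
        "type": "javax.sql.DataSource",
        "driver": driver,
        "url": url,
        "user": user,
        "password": password,
    }
    return "\n".join(f"{name}/{key}={value}" for key, value in values.items())


# All argument values below are hypothetical placeholders.
entries = jndi_entries(
    name="sugarcrm_source",
    driver="com.mysql.jdbc.Driver",
    url="jdbc:mysql://localhost:3306/sugarcrm",
    user="etl_user",
    password="change_me",
)
print(entries)
```

A connection defined in Spoon with access type JNDI then refers to the data source by its name (here `sugarcrm_source`), which keeps credentials out of the procedure files.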


Running procedures

Directly from Spoon

From Pentaho BI Suite

Using the command line (Kitchen, Pan)


kitchen.bat /file:D:\Jobs\jobname.kjb /level:Basic

In a clustered environment

Using a web service (Carte)
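Kitchen runs jobs (.kjb files) and Pan runs transformations (.ktr files). For the command-line option, a wrapper script can build the Kitchen call and check its exit code. This is a minimal sketch: the installation path in the usage line is hypothetical, and the function names are illustrative rather than part of PDI.

```python
import subprocess


def kitchen_command(kitchen_path, job_file, level="Basic"):
    """Build the argument list for a Kitchen run.

    kitchen.bat takes /option:value arguments, as in the command above;
    kitchen.sh takes the -option=value form.
    """
    if kitchen_path.lower().endswith(".bat"):
        return [kitchen_path, f"/file:{job_file}", f"/level:{level}"]
    return [kitchen_path, f"-file={job_file}", f"-level={level}"]


def run_job(kitchen_path, job_file, level="Basic"):
    """Run the job and return Kitchen's exit code (0 means success)."""
    return subprocess.run(kitchen_command(kitchen_path, job_file, level)).returncode


# Hypothetical installation path; only builds the command, does not run it.
print(kitchen_command(r"C:\pdi\kitchen.bat", r"D:\Jobs\jobname.kjb"))
```

Returning the exit code lets a calling script or scheduler decide whether to alert or retry when a job fails.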


Publishing on Pentaho


Running from Pentaho


Scheduling

Using Pentaho's scheduler

Using an external scheduler (cron)
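With cron, scheduling comes down to one crontab line that calls Kitchen at the chosen time. A minimal sketch that assembles such a line for a daily run, assuming a Linux installation; every path below is a hypothetical placeholder.

```python
def cron_line(minute, hour, kitchen_path, job_file, log_file):
    """Build a crontab entry that runs a Kitchen job once a day."""
    command = (
        f"{kitchen_path} -file={job_file} -level=Basic "
        f">> {log_file} 2>&1"  # append stdout and stderr to the log
    )
    return f"{minute} {hour} * * * {command}"


# Hypothetical paths: run the job every day at 02:00.
entry = cron_line(
    0,
    2,
    "/opt/pentaho/data-integration/kitchen.sh",
    "/opt/etl/jobs/jobname.kjb",
    "/var/log/etl/jobname.log",
)
print(entry)
```

The resulting line can be added with `crontab -e`; redirecting output to a log file keeps a record of each scheduled run.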

