
Expt 2: Understand the concept of a repository in PDI (Kettle) and learn how to create, connect to, or disconnect from a repository.


Explanation: Pentaho Data Integration (PDI) is also referred to as "Kettle." It began as
an open source project called "Kettle." The name K.E.T.T.L.E. is a recursive acronym that
stands for Kettle Extraction Transformation Transport Load Environment.
When Pentaho acquired Kettle, the name was changed to Pentaho Data Integration.
Other PDI components, such as Spoon, Pan, and Kitchen, have names that were originally
meant to support the "culinary" metaphor of ETL offerings.
Concept of Repository: The Kettle repository is the workspace in which the data integrator
works.
This workspace is a physical region of the hard drive, designated exclusively for Kettle.
All information about transformations, jobs, schedules, etc. is stored in the repository.
The repository concept promotes reusability, which in turn saves time and effort.
A repository may be created in two ways:
1) Kettle database repository
2) Kettle file repository
When Kettle is started, the ‘Repository Connection’ dialog box appears, asking you to select a
repository from the list of existing repositories or to create a new one.
To create a file repository:
Step 1: In the ‘Repository Connection’ dialog box, click the + button. The ‘Select the
repository type’ dialog box will appear.
Step 2: Select ‘Kettle file repository’ and click ‘OK’.
Step 3: In the ‘File repository settings’ dialog box, click the Browse button and select a folder
that will serve exclusively as your file repository space; fill in ID and Name, and click the ‘OK’ button.
In the ‘Repository Connection’ dialog box, click the ‘OK’ button to select the newly created repository.
You are now ready to create transformations and jobs on this workspace.
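
Once transformations and jobs have been saved, it can be useful to see what the file repository folder actually contains. The sketch below is not part of PDI; it assumes that a Kettle file repository keeps transformations as .ktr files and jobs as .kjb files inside the chosen folder, with sub-folders acting as repository directories. The function name inventory_file_repository and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def inventory_file_repository(repo_dir):
    """List the transformations (.ktr) and jobs (.kjb) found under repo_dir."""
    root = Path(repo_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{repo_dir} is not a folder")
    # Search recursively, since sub-folders are assumed to map to
    # repository directories, and report paths relative to the root.
    transformations = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.ktr")
    )
    jobs = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.kjb"))
    return {"transformations": transformations, "jobs": jobs}


if __name__ == "__main__":
    # Demo on a throwaway folder with made-up file names; in practice,
    # pass the folder selected in Step 3.
    with tempfile.TemporaryDirectory() as demo:
        (Path(demo) / "load_customers.ktr").touch()
        (Path(demo) / "nightly_load.kjb").touch()
        print(inventory_file_repository(demo))
```

Because the folder is meant to be used exclusively by Kettle, any file that this listing does not report (for example, stray notes or exports) is a candidate for moving elsewhere.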
To disconnect from the current working repository, go to the Tools menu:
Tools -> Repository -> Disconnect repository
…or alternatively, press Ctrl+D.

NOTE: If you want to change repositories or create a new one, first disconnect from the
current working repository. Then open the ‘Repository Connection’ dialog box from:
Tools -> Repository -> Connect
…or alternatively, press Ctrl+R. The ‘Repository Connection’ dialog box appears.
Create a Connection in the PDI Client
If you want to access the repository items through the PDI client, perform the following steps to create a connection to a
Pentaho Repository: 

1. Verify the Pentaho Server is running, and start the PDI client.

2. Click the Connect link in the upper right corner of the PDI client toolbar. The Pentaho Repository welcome
dialog box appears. 

3. Click Get Started.

4. Enter or update the Display Name property.

5. Modify the URL associated with your repository, if necessary.

6. Click Finish to test the connection of your repository. If the test fails, make sure that the port number in the
URL is correct.  If you installed PDI using the Pentaho Installation Wizard, the correct port should appear in
the installation-summary.txt file.  The file is in the root directory where you installed PDI.

7. If the test is successful, you can either Connect Now, Manage Connections, or Finish to close the dialog
box. If you choose to finish, you can connect to the repository later through the menu next to the Connect link
in the upper right corner of the PDI client toolbar.
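
When the connection test in step 6 fails, a quick way to rule out a wrong port is to check whether anything is listening at the host and port named in the URL. The following is a minimal sketch using only the Python standard library, not a PDI tool; the helper name repository_port_is_open is hypothetical, and the URL in the demo is the default value listed under Connection Details. An open port only shows that some service is listening there; it does not prove that the service is the Pentaho Server.

```python
import socket
from urllib.parse import urlparse


def repository_port_is_open(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    if parsed.hostname is None:
        raise ValueError(f"no host found in URL: {url!r}")
    # Fall back to the scheme's standard port when the URL omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    url = "http://localhost:8080/pentaho"  # default URL from Connection Details
    state = "open" if repository_port_is_open(url) else "closed or unreachable"
    print(f"{url}: port is {state}")
```

If the port is reported as closed, compare the port in the URL with the one recorded in installation-summary.txt and confirm that the Pentaho Server is running.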

Connect to a Pentaho Repository


Once a repository is created, a menu appears next to the Connect link. You can use this menu to connect to the
repository.

If you are in the process of creating your first repository, selecting Connect Now will automatically take you to Step 2.

1. Select a repository in the Connect menu.

2. Log on to the repository by entering your User Name and Password credentials. For example, User Name =
admin, Password = password.

3. Click OK to exit the Repository Configuration dialog box. Your user name and repository display name will
appear in the upper right corner of the PDI client toolbar.

If you want the Repository Connection window to automatically appear when the PDI client starts, go to Tools >
Options and click Show repository dialog at startup.

Manage Repositories in the PDI Client


After a repository is created, a menu appears next to the Connect link. You can use the menu to connect to any
repository you created. If you connect to a repository, the Connect link in the PDI client toolbar is replaced by your user
name and the display name of the repository.
This menu can also be used to access the Repository Manager or disconnect from your current repository.

Repository Manager
You can Add, Edit, or Delete your repositories through the Repository Manager dialog box.

Connection Details
Use the Connection Details dialog box to specify the settings of your repository.

Setting                        Description

Display Name                   Identifies the repository within the PDI client.

URL                            Defines the web address of the repository. The default value
                               is http://localhost:8080/pentaho. You can change this setting
                               to any web address pertaining to your specific collaboration
                               project.

Description                    Describes the repository, such as its type and any other
                               useful information.

Launch connection on startup   Indicates the repository should open by default when
                               starting the PDI client.
