
Unauthorized reproduction or distribution prohibited. Copyright © 2020, Informatica and/or its affiliates.

Data Discovery & Advanced Profiling

Lab Guide

Version: DD10.1.1-ADV-202006


Data Discovery & Advanced Profiling

Version: DD10.1.1-ADV-202006
June 2020
Copyright (c) 1998–2020 Informatica LLC. All rights reserved.
This educational service, materials, documentation and related software contain proprietary
information of Informatica LLC and are provided under a license agreement containing
restrictions on use and disclosure and are also protected by copyright law. Reverse engineering
of the software is prohibited. No part of the materials and documentation may be reproduced or
transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without
prior consent of Informatica LLC. The related software is protected by U.S. and/or international
Patents and other Patents Pending.
Use, duplication, or disclosure of the related software by the U.S. Government is subject to the
restrictions set forth in the applicable software license agreement and as provided in DFARS
227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this educational service, materials, and documentation is subject to change
without notice. If you find any problems in this educational service, materials or documentation,
please report them to us in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT,
PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata
Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data
Transformation, Informatica B2B Data Exchange, Informatica On Demand, Informatica Identity
Resolution, Informatica Application Information Lifecycle Management, Informatica Complex
Event Processing, Ultra Messaging and Informatica Master Data Management are trademarks or
registered trademarks of Informatica LLC in the United States and in jurisdictions throughout the
world. All other company and product names may be trade names or trademarks of their
respective owners.
Portions of this educational service, materials and/or documentation are subject to copyright held
by third parties, including without limitation: Copyright © Adobe Systems Incorporated. All rights
reserved. Copyright © Microsoft. All rights reserved. Copyright © Oracle. All rights reserved.
Copyright © the CentOS Project.
This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178;
6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096;
6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,243,110; 7,254,590;
7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842; 7,721,270; and 7,774,791,
international Patents and other Patents Pending.
DISCLAIMER: Informatica LLC provides this educational service, materials, and documentation
“as is” without warranty of any kind, either express or implied, including, but not limited to, the
implied warranties of non-infringement, merchantability, or use for a particular purpose.
Informatica LLC does not warrant that this educational service, materials, documentation or
related software is error free. The information provided in this educational service, materials,
documentation and related software may include technical inaccuracies or typographical errors.
The information in this educational service, materials, documentation and related software is
subject to change at any time without notice.


Document Conventions
This guide uses the following formatting conventions:

If you see… | It means… | Example
> | Indicates a submenu to navigate to. | Click Repository > Connect. In this example, you should click the Repository menu or button and choose Connect.
boldfaced text | Indicates text you need to type or enter. | Click the Rename button and name the new source definition S_EMPLOYEE.
UPPERCASE | Database tables and column names are shown in all UPPERCASE. | T_ITEM_SUMMARY
italicized text | Indicates a variable you must replace with specific information. | Connect to the Repository using the assigned login_id.
Note: | The following paragraph provides additional facts. | Note: You can select multiple objects to import by using the Ctrl key.
Tip: | The following paragraph provides suggested uses or a Velocity best practice. | Tip: The m_ prefix for a mapping name is…


Other Informatica Resources


In addition to the student and lab guides, Informatica provides these other resources:
• Documentation and Knowledge Base
• Global Customer Support
• Professional Certification

Accessing Documentation and Knowledge Base


To get the latest documentation and Knowledge Base for your product, go to
https://network.informatica.com

Contacting Global Customer Support


You can contact a Customer Support Center by telephone or through Online Support. Online
Support requires a username and password. You can request a username and password at
https://www.informatica.com/services-and-training/support-services/contact-us.html

Obtaining Informatica Professional Certification


You can take, and pass, exams provided by Informatica to obtain Informatica Professional
Certification. For more information, go to
https://www.informatica.com/services-and-training/certification.html


Table of Contents
Module 0: Getting Started
Lab 0-1: Starting the Services and Logging into the Analyst Tool
Module 1: Analyst Advanced Profiling
Lab 1-1: Data Domain Profiling
Lab 1-2: Use an Enterprise Discovery Profile to Profile Multiple Objects
Module 2: Developer Profiling Overview
Lab 2-1: Create an Enterprise Discovery Profile in Developer
Lab 2-2: Create a Join Analysis Profile
Module 3: Functional Dependency & Primary Key Profiling
Lab 3-1: Reviewing the Titles Objects
Lab 3-2: Functional Dependency and Primary Key Inference
Module 4: Enterprise Discovery - Overlap Discovery and Foreign Key Profiling
Lab 4-1: Overlap Discovery Profiling
Lab 4-2: Foreign Key Profiling


Module 0: Getting Started


Lab 0-1: Starting the Services and Logging into the Analyst Tool
Overview:
Before we can start working, we need to ensure the Informatica Services are started.
Once they are started, we will log into Informatica Administrator to verify that the services are running.
Note: This is an administration task that users will not normally need to perform.

Note: If you are continuing from the Analyst course, you may not need to perform this step
because the services will already be up and running. Please ask your instructor if you are unsure.
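
As a rough illustration of the check this lab performs by hand in the Services window, the following Python sketch reads the state of a Windows service from the text that the `sc query` command prints. The service name passed to `query_service` is a placeholder, because this guide does not list the exact Oracle or Informatica service names; look them up in the Services window first. The sample text is illustrative only.

```python
import re
import subprocess


def parse_service_state(sc_output: str) -> str:
    """Extract the state name (for example RUNNING or STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def query_service(name: str) -> str:
    """Run `sc query <name>` on Windows and return the reported state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_service_state(result.stdout)


# Illustrative text in the layout `sc query` uses for a running service.
# "ExampleService" is a made-up name, not one taken from the training image.
SAMPLE_OUTPUT = """
SERVICE_NAME: ExampleService
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
"""

# On the training image you might call, with the real service name substituted:
#   query_service("<service name from the Services window>")
```

The parsing is kept separate from the `sc` call so that it can be tried on any machine, not only on the Windows image.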

Objectives:
• Connect to the training environment.
• Verify the Oracle Services are running.
• Start the Informatica Services.
• Log into Informatica Administrator and verify the Services are running.
• Log into Informatica Analyst.
Duration:
20 minutes

Tasks
1. Log into the image.
a. If you haven’t already done so, connect to the training environment using the link
provided.
b. If you have not connected to the Windows machine, in the Student Portal choose
CONNECT VM and connect again. You will be logged into the image as
Administrator with the password admin.
2. Verify the Oracle Services are running.
a. In the Quick Launch tray, click on the Services icon.

b. In the Services window, verify that the Oracle service and the Oracle TNS Listener
are running by checking the Status column.
Note: You can start each service by right-clicking on the service name and selecting
Start from the sub-menu.


3. Start the Informatica Services


a. Scroll to the Informatica service. If the Informatica service is not running, it will need
to be started.
Note: This will only need to be done once during the course. Once started the
Services will stay running for the duration unless the image is restarted.
b. You can start the Informatica service by selecting the service name and selecting
Start from the sub-menu, or by right-clicking the service and choosing Start.
4. Note: Please ensure you select the Informatica 10.1.1 services.


a. Please wait about 10 minutes for all of the sub-services to start.


b. Close the Services window.
5. Log into Informatica Administrator
a. Launch the Chrome browser.

b. Click the Informatica Administrator shortcut button to launch the Administrator Tool.


Note: If the login page isn’t presented, more time is needed for the services to start.
Please wait and retry in a few minutes.
c. Log in using the username Administrator with the password admin.

d. Verify that all of the Services are running. If they are not, more time is needed.

6. Log into Informatica Analyst.


a. In Chrome, either press the Informatica Analyst shortcut button on the toolbar or
enter the address:
• Infa-server.com:8085/analyst/ and press Enter.
b. Log in using the username ANALYST_01 with the password ANALYST_01.

Note: You will land on the Start Page or Task Inbox.

You are now ready to start.
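
The "wait and retry in a few minutes" advice in this lab can be pictured as a simple polling loop. The sketch below is only an illustration: it assumes the Analyst address given above is served over plain `http://`, which this guide does not state, and the attempt count and delay are arbitrary values chosen to cover roughly the ten-minute startup window.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def page_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_up(
    url: str,
    probe: Callable[[str], bool] = page_is_up,
    attempts: int = 20,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URL up to `attempts` times, pausing between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Assumed URL form; the http:// scheme is a guess, the host and path come from this lab.
ANALYST_URL = "http://Infa-server.com:8085/analyst/"
# Example call on the training image: wait_until_up(ANALYST_URL)
```

The probe and sleep functions are parameters so that the loop can be exercised without a running server.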


Module 1: Analyst Advanced Profiling


Lab 1-1: Data Domain Profiling
Overview:
A data domain is a predefined or user-defined Model repository object based on the semantics
of column data or a column name. For example, Social Security number, credit card number,
email ID, and phone number can be individual data domains.
A data domain helps you find important data that remains undiscovered in a data source. For
example, you may have legacy data systems that contain Social Security numbers in a
Comments field. You need to find this information and protect it before you move it to new data
systems.
A data domain glossary lists all the data domains and data domain groups. Use the Preferences
menu in the Developer tool to import and export data domains to and from the data domain
glossary.
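
To make the idea of a data domain more concrete, here is a minimal Python sketch of one domain with a data rule and a column-name rule. It is not how the Informatica tools implement discovery; both regular expressions, the function names and the sample values are assumptions made up for illustration.

```python
import re
from typing import Iterable, Optional

# Hypothetical rules for an "Email" data domain.
EMAIL_DATA_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # checks column values
EMAIL_NAME_RULE = re.compile(r"e-?mail", re.IGNORECASE)        # checks the column name


def column_name_matches(column_name: str) -> bool:
    """Column-name rule: does the name suggest the column holds emails?"""
    return EMAIL_NAME_RULE.search(column_name) is not None


def conformance_percent(values: Iterable[Optional[str]], exclude_nulls: bool = True) -> float:
    """Data rule: percentage of values that look like emails."""
    rows = [v for v in values if not (exclude_nulls and v is None)]
    if not rows:
        return 0.0
    conforming = sum(1 for v in rows if v is not None and EMAIL_DATA_RULE.match(v))
    return 100.0 * conforming / len(rows)


# A made-up Comments column that hides email addresses among free text.
comments = ["call back Monday", "jane.doe@example.com", None, "bob@example.org"]

print(column_name_matches("COMMENTS"))     # the name alone gives no hint
print(conformance_percent(comments))       # the data rule still finds the emails
```

The example mirrors the legacy Comments field scenario above: the column name does not reveal the content, so only a rule applied to the data can find it.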
Objectives:
• Log into Informatica Analyst
• Apply Data Domain Profiling
• Manage the Data Domain Glossary
Duration:
15 minutes

Tasks

Note: If you have just completed the Analyst course and are using the same image, you can
work with the files in the CUSTOMER_DATA Project that was created. If this is a fresh
copy of the image, please follow the instructions below.
1. Once you have logged into the Analyst, click Open to access the Library Tab.
a. From the Projects tab, navigate to CONTENT > Training_Materials > ANALYST
> CUSTOMER_DATA and select the project.
Note: Alternatively use the objects in the CUSTOMER_DATA project created in the Analyst
course.


2. Apply Data Domain Profiling:


a. In the CUSTOMER_DATA Project, open PROFILE_CUSTOMER_SHIPPING.
b. Click Edit.

c. On the Specify Settings tab, add a checkmark to select Run data domain
discovery.
d. In the Data Domain section, in the Group By drop-down list to the right, select
Data Domain Group.
e. Scroll down the list of Data Domains presented, collapsing each group as you go,
and select the groups PCI, PHI, and PII.


f. In the Run data domain discovery section, add a checkmark to Data and a
checkmark to Column name.
g. Choose to Exclude null values from data domain discovery.

i. Click Save > Save and Run.


3. Review Data Domain Discovery Results:


a. Once the profile has run, in the Profile results, scroll to the right to review the
Data Domain Column.

Note: The columns identified that have potential data domains are displayed, along
with the percentage of data that conforms to that domain.
b. Scroll down and review the COUNTRY, PHONE and EMAIL rows, hovering your
cursor over the Data Domain to review more information.

c. Click on the EMAIL hyperlink to view the column in Detail view.


d. Scroll down and review Data Domain toward the bottom left of the window.
e. Right-click Email and choose Drill down on Non-Conforming Rows.


Note: You will need to scroll across to review the EMAIL data. Many values appear to be invalid
emails containing spaces.
f. Return to the Overview View by clicking the Return to Overview hyperlink at the
top left-hand side of the screen.

g. Choose to Filter By: Inferred data domains using the Filter By tab to the left.

Note: The columns identified as containing data domain data or as having a
domain column name are displayed.


h. Review the results from here.


4. Manage the Data Domain Glossary.
a. Click Manage > Data Domain Glossary.

b. In the Navigator choose to view by Group.

c. Expand the PCI, PHI and PII groups and note the data domains that have been
defined within each one.


d. In the Navigator to the left, under PHI, select Email.

Note: The Properties and Rules are displayed, listing the Data and Column Name
Rules that can be used to identify Emails.


e. Review the contents of the other Data Domain Groups that exist.

5. Apply all of the Data Domain Rules to your data:


a. Once complete, close the Glossary and return to
PROFILE_CUSTOMER_SHIPPING by clicking on the Discovery tab.

b. Edit the profile once again and click on the 3 Specify Settings tab.
c. This time select to apply all of the Data Domains by selecting the checkbox
beside Name.


d. Save and run the profile.


e. Once again review the Data Domain Discovery results that have been identified
for the columns, noting the increase.

6. Approve or reject some domains:


a. In the profile scroll down and once again review the EMAIL field noting that
additional data domains have been identified.


b. Click the EMAIL hyperlink to review in Detail view.


c. Scroll down to the Data Domains and review the contents.
Note: It is possible to collapse the dialogs to bring the Data Domain viewer into
view.
d. We want to reject the other domains that have been identified, so right-click
Date_AllFormats and choose Reject.

Note: It is removed from the list.


e. Email is a valid data domain, so right-click Email and choose
Approve.


Note: Email is flagged as Accepted.

f. Reject the AlphaNumeric_Special… data domain as well.
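
The drill-down on non-conforming rows used earlier in this lab can be thought of as a filter that keeps only the rows whose value fails the domain's data rule. The Python sketch below is an illustration under assumptions: the email pattern, the CUSTOMER_ID column and the sample rows are invented, and null values are skipped to match the Exclude null values setting chosen in the lab.

```python
import re
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]

# Hypothetical data rule for the Email domain.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def non_conforming_rows(rows: List[Row], column: str) -> List[Row]:
    """Return rows whose non-null value in `column` fails the email pattern."""
    failures = []
    for row in rows:
        value = row.get(column)
        if value is not None and not EMAIL_PATTERN.match(value):
            failures.append(row)
    return failures


def has_whitespace(value: str) -> bool:
    """True if the value contains any space, tab or other whitespace character."""
    return any(ch.isspace() for ch in value)


# Made-up rows; the real CUSTOMER_SHIPPING data is not reproduced here.
SAMPLE_ROWS: List[Row] = [
    {"CUSTOMER_ID": "1", "EMAIL": "ann@example.com"},
    {"CUSTOMER_ID": "2", "EMAIL": "bob smith@example.com"},
    {"CUSTOMER_ID": "3", "EMAIL": None},
    {"CUSTOMER_ID": "4", "EMAIL": "carol@ example.org"},
]

for row in non_conforming_rows(SAMPLE_ROWS, "EMAIL"):
    print(row["CUSTOMER_ID"], repr(row["EMAIL"]), "contains whitespace:", has_whitespace(row["EMAIL"]))
```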


Lab 1-2: Use an Enterprise Discovery Profile to Profile Multiple Objects
Overview:
Enterprise Discovery profiles can be used to profile multiple objects at the same time in the
Analyst tool.
This lab will demonstrate how to create an Enterprise Discovery Profile, how to add objects,
define profile settings, run the profile and review the results.
Further Advanced Profiling can be performed in the Enterprise Profile in the Developer Tool.
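
Conceptually, profiling several objects in one run amounts to applying the same column profile to each object and collecting the results per object. The following Python sketch shows that idea only; it is not the Enterprise Discovery implementation, and the column names and rows are hypothetical (only the object names Orders and Order_Details come from this lab).

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def column_profile(rows: List[Row]) -> Dict[str, Dict[str, int]]:
    """Compute row, null and distinct counts for every column of one object."""
    columns = list(rows[0].keys()) if rows else []
    result: Dict[str, Dict[str, int]] = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        non_null = [v for v in values if v is not None]
        result[column] = {
            "rows": len(values),
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
        }
    return result


def profile_objects(objects: Dict[str, List[Row]]) -> Dict[str, Dict[str, Dict[str, int]]]:
    """Run the same column profile over several named objects at once."""
    return {name: column_profile(rows) for name, rows in objects.items()}


# Made-up sample data standing in for the two objects profiled in this lab.
OBJECTS: Dict[str, List[Row]] = {
    "Orders": [
        {"ORDER_ID": "1001", "CUSTOMER_ID": "C1"},
        {"ORDER_ID": "1002", "CUSTOMER_ID": None},
    ],
    "Order_Details": [
        {"ORDER_ID": "1001", "ITEM": "A"},
        {"ORDER_ID": "1001", "ITEM": "B"},
        {"ORDER_ID": "1002", "ITEM": "A"},
    ],
}

summary = profile_objects(OBJECTS)
for object_name, profile in summary.items():
    print(object_name, profile)
```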
Objectives:
• Create an Enterprise Discovery Profile.
• Add objects and define settings.
• Review Profile results.
Duration:
15 minutes

Tasks
1. Return to the Library > Projects view.
2. Select New > Profile.

3. Select the Enterprise Discovery option and click Next.


4. Set:
• Enterprise Discovery Profile name as Profile_Orders
• Description = Order Files
• Location = CONTENT/Training_Materials/ANALYST/CUSTOMER_DATA
Note: If you have just completed the Analyst course and have a project called
CUSTOMER_DATA then you can use this project.

a. Press Next.
5. Press Choose to select the required data objects.


a. From the CUSTOMER_DATA Project, select the Orders and Order_Details
objects.

b. Press Save.


c. Press Next and Next again.


6. In the Specify Settings tab:
a. Select the checkbox to Enable data domain discovery.
b. Press the Choose button to the right to select the Data Domains.

c. In the Choose Data Domains dialog, show the data domain groups in a hierarchy.
d. Select the Account_Bank, Contact and Address Data Domains.
e. Click OK.


f. Scroll down and set:
• Run data domain discovery on both Data and Column name.
• Exclude null values from data domain discovery.
• Exclude columns with approved data domains.

g. Scroll down and note the Column Profile Settings.


h. Ensure Enable column profile and All rows to profile are selected.


i. Choose Save > Save and Run from the menu.


Note: It may take a few minutes to run.

j. The profile opens on the Summary tab.


7. To view the profiles, select the Profiles tab. Both profiles are listed.
a. Select Profile_Orders and click Open Profile.


b. Review the profile of the data, checking to see the data domains that were
identified.
c. Once complete, return to the Enterprise Discovery Profile by pressing the Back to
Enterprise Discovery button at the top right of the screen.

d. Finally, review the profile for the Order_Details file.


e. Note that it is possible to run the profile, detect outliers and add to Scorecards
from here.


Note: We can see that the Order_Details file contains the information for each order held in the
Orders file. The mapping specification that we created shows this information.
8. Close any open objects by clicking the X beside the name and return to your Library >
Project tab, refreshing the view to ensure the Enterprise Discovery Profile has been
saved.
Note: This profile should be saved to the CUSTOMER_DATA folder in the CONTENT >
Training_Materials > ANALYST project or your main project if you just completed the
Analyst course.

This concludes the lab.


Module 2: Developer Profiling Overview


Lab 2-1: Create an Enterprise Discovery Profile in Developer
Overview:
This lab familiarizes you with working with Informatica Developer. You use the physical data
objects created in the earlier labs. You will create an Enterprise Discovery Profile and then
perform a Join Analysis profile.
Objectives:
• Create an Enterprise Discovery Profile in the Developer.
• Configure Data Domain Discovery.
• Review the Column and Domain Discovery Profiles.
• Navigate the Enterprise Discovery Profile.
• Run profiles.
Duration:
60 minutes

Tasks
1. Log in to Informatica Developer.
From the Start menu, choose All Programs > Informatica 10.1.1 > Client > Developer
Client > Launch Informatica Developer.
Right-click MRS_SVC_DEV_DQ and choose Connect.

Log in as ANALYST_01/ANALYST_01.


Expand the CONTENT > Training_Materials > ANALYST > CUSTOMER_DATA project
(or the CUSTOMER_DATA project you created in the Analyst course if it exists) and
review the contents.
2. Expand Physical Data Objects > TRAINING and open the CUSTOMER_SHIPPING table
to view.
On the Overview tab ensure that the following properties are defined for the CUST_ID
column, paying particular attention to the precision of the port:
i. Set the CUST_ID as the Primary Key and change the Precision to 20.
Note: Please ensure you complete this step and update the CUST_ID column.

Save and close the data object by clicking the X beside the name.


3. Create an Enterprise Discovery Profile.


Right-click on the CUSTOMER_SHIPPING object that was just updated and select
Profile.
Select Enterprise Discovery Profile and click Next.
Set:
• Name = edp_CUSTOMERORDERS
• Location = the CUSTOMER_DATA project or CONTENT > Training_Materials >
ANALYST > CUSTOMER_DATA, whichever you have been working with.
Click Next.
Click Choose.
In the Select Data Objects window, select Order_Details and Orders from the same
project and click OK.

Review the Data Objects.


Click Next and Next again.


4. Define Enterprise Discovery Profiling Options:
Click on Data Domain Selection and set:
• Show data domain group in hierarchy.
• Select Account_Bank and PII domain groups.
• Ensure that Enabled as part of “Run Enterprise Discovery Profile” action is
checked.


In the Data Domain Discovery Inference Options Tab set:
• Override the default inference options.
• Exclude null values from data domain discovery.


Under Sampling Options, ensure that the profile runs on All Rows.


Note: The Primary Key and Foreign Key Profiling options can also be set here.
Click Finish, then wait for the profile to be created. You can check its status in the progress
viewer in the bottom left window.

5. Working with edp_CUSTOMERORDERS results. Review Column Profiles.


Once complete, in your project, expand Profiles, right-click edp_CUSTOMERORDERS
and click Open.


A message box asks you to refresh the MRS (Model Repository Service) before you view the
Enterprise Discovery profile.

Click OK.
Right-click MRS_SVC_DEV_DQ.
Click Refresh.


6. Once the refresh completes, expand your CUSTOMER_DATA > Profiles folder and double-click
edp_CUSTOMERORDERS to open it.
Note: If the screen below does not appear, the profiles are still running. The profile is
displayed once it completes. If necessary, close and reopen the profile.
Note: This can take time. Let the profile complete before opening it. If any errors occur,
refresh the repository and then continue.

Note that it opens on the Results Tab.


Click Column Profile.


Review the profile results for each data object by selecting them in the Data Objects
Profiled table. The profile results are displayed in the table to the right.
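
Note: As a rough mental model of what a column profile summarizes, the Python sketch below computes a few common per-column statistics. It is illustrative only: the function name column_profile, the chosen statistics, and the sample values are assumptions made for this example and do not reproduce the Developer tool's actual output.

```python
from collections import Counter

def column_profile(values):
    """Summarize one column: row count, nulls, distinct values, lengths, top values."""
    # Treat None and blank strings as nulls (an assumption for this sketch).
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(non_null)
    lengths = [len(str(v)) for v in non_null]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "top_values": counts.most_common(3),
    }

# Hypothetical sample data, not taken from the lab files.
sample = ["UK", "US", "UK", None, "", "DE", "UK"]
profile = column_profile(sample)
print(profile)
```

Comparing statistics like these across the data objects is what lets you spot unexpected nulls or value lengths when you review each object in the Data Objects Profiled table.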

7. Review the Domain Discovery Results:


We also chose to perform Data Domain Discovery on our objects in the Enterprise
Profile. Click Data Domains to review any data domains that were identified.


Note: You can approve or reject an identified domain by right-clicking the column and
choosing Approve/Reject.

8. Working in Default View:


Click on the Default View Tab. It may take a minute or two to open initially.
Note: The Default View displays the modeling editor, which allows you to view the
objects in the Discovery profile. You can add or remove objects and create profiles.
You will need to add the objects to the canvas. Do this by selecting them in the Outline
View dialog in the bottom left of the screen.


Note: Once selected they will appear in the workspace canvas.

Click on an open space in the workspace and note the Properties view is updated.


Select the Profiles Tab.


Note: All of the profiles in the Enterprise Discovery Profile can be accessed and run
from here.

9. Running profiles from the Properties view.


In the Profiles view, select Run Multiple.
Make sure all the profiles are selected, then click OK.


Note: The status changes to QUEUED and then to RUNNING. When the run completes, the
status changes to SUCCESS.


Once complete, select Profile_CUSTOMER_SHIPPING and click Open.

Note: The profile opens in a new tab in the Enterprise Discovery Profile.

Note: Both the profile definition and the profile results can be accessed here.
Click Results > Column Profiling and review the profile.


Review the Data Domain Discovery Results for the profile also.

Once you have reviewed the profile, close it and return to the Default View.


Lab 2-2: Create a Join Analysis Profile


Overview:
You have created an Enterprise Discovery Profile that contains profiles and data domain
discovery.
You will now look at creating a Join Analysis profile.
Objectives:
• Create a Join Analysis profile.
• Export the profile results.
Duration:
20 minutes

Tasks
1. Create a Join Analysis Profile.
In the Default View for edp_CUSTOMERORDERS, right-click an open space and click
Select All.
Right-click any one of the objects and choose Join Profile.

Set:
• Name = ja_SHIPPING_ORDERS_DETAILS
Observe that all three data objects have been selected.


Click Next.
Ensure that all of the columns are selected.
Click Next.
Note: It is possible to create multiple joins in the profile. We will first create a join
between the CUSTOMER_SHIPPING and Orders objects to ensure that there are no
orders with customer IDs that do not exist in the CUSTOMER_SHIPPING table. We will
then create a join to verify that there are no order details without a corresponding order
in the Orders file.
2. Create the join to verify that there are no orders with customer IDs that do not exist in
the CUSTOMER_SHIPPING table.
Click Add.
From the first Data Object Dropdown select the Orders object.


From the second dropdown select the CUSTOMER_SHIPPING data object.


Click the Add row icon.

Select CUST_ID for the Left and the Right columns.

Note: We want to verify that each order in Orders has a customer ID that exists in the
CUSTOMER_SHIPPING table.
Click OK.
3. Create a join to verify that every order in the Order_Details object exists in the Orders
object.
Click Add.
Select Order_Details and Orders as the data objects.
Note: We want to check that each order number in the Order_Details object exists in the
Orders object.
Once again, click the Add row icon.
Select ORDER_NO from both the Order_Details and Orders objects.


Click OK.

Click Finish and click Yes in the message box.


Note: Watch the progress bar in the lower right corner. Do not open the profile until the run
has completed. If you are unsure, check the status of the profiles in the Discovery Profile.


4. The join profile will open once complete.

In the Navigator, select Results > Join Result.

Select the Orders:CUST_ID and CUSTOMER_SHIPPING:CUST_ID row.


Scroll to the right to see the number of rows on each side and the number of Join Rows.
Note: Left Only Rows = 3. This means there are 3 records in the Orders file whose
CUST_ID does not exist in the CUSTOMER_SHIPPING table. This is an issue.


There are 8 records in the CUSTOMER_SHIPPING table whose CUST_ID does not appear in the
Orders file. This is not an issue, as some customers may not have any orders.
Double click on the Orange Square in the diagram to view the orphan records in the
Orders file. They are displayed in the viewer below.

Note: These could be exported and sent to the appropriate department to rectify. These
are orders that were placed for customers that do not exist in our customer shipping
table.
Select the Order_Details:ORDER_NO and Orders:ORDER_NO join.
Note: No orphan records exist in this join. Every order number in each file has a matching
order number in the other. This is good.
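
Note: The join result categories can be pictured as simple set comparisons on the key column. The Python sketch below is illustrative only: the function join_analysis, its counting rules, and the CUST_ID values are assumptions made for this example, not the tool's implementation or the lab data.

```python
def join_analysis(left_keys, right_keys):
    """Classify key values into joined, left-only, and right-only rows (assumed semantics)."""
    left_set = set(left_keys)
    right_set = set(right_keys)
    return {
        # Left rows whose key also exists on the right.
        "join_rows": [k for k in left_keys if k in right_set],
        # Orphans on the left: no matching key on the right.
        "left_only_rows": [k for k in left_keys if k not in right_set],
        # Rows on the right that nothing on the left refers to.
        "right_only_rows": [k for k in right_keys if k not in left_set],
    }

# Hypothetical CUST_ID values; the real lab data is not reproduced here.
orders_cust_ids = [101, 102, 102, 999, 888]
shipping_cust_ids = [101, 102, 103]

result = join_analysis(orders_cust_ids, shipping_cust_ids)
orphan_orders = result["left_only_rows"]
print(orphan_orders)
```

In this sketch the left-only values correspond to orders placed for customers that are missing from the shipping table, while the right-only value is a customer with no orders, which is not a problem.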


Close any open Profiles.

This concludes the lab.


Module 3: Functional Dependency & Primary Key Profiling


Lab 3 - 1: Reviewing the Titles Objects
Overview:
Your company has acquired a small publishing firm. They maintained their data in several
different files and tables. We do not know how valid this data is, what is contained in each file or
table, or if any business relationships exist between the objects.
We perform an Enterprise Discovery profile on the data objects to determine:
• Content
• Relationships
Objectives:
• Import the object definitions.
• Create an Enterprise Discovery Profile.
• Review the results.
• Review column profiles.
Duration:
30 minutes

Tasks
1. Create a new Project called Titles.
a. Right click on the MRS_SVC_DEV_DQ and choose New > Project.

a. Set:
• Project Name = Titles
b. Press Finish.
2. Right-click on the project and choose Import.


c. Choose Informatica > Import Object Metadata File (Advanced) and press Next.

d. Navigate to C:\INFA_Shared_DQ\DQ_Mappings\AD_PROF_10.1.1, select Titles.xml, and
press Open.


i. In the Sources to the left, click on Titles.


ii. In the Target to the right, click on Titles.
iii. Choose Add Contents to Target.

Note that all the objects are added to the new project.


e. Choose Next, Next, and Next.


f. In the final dialog review the settings and press Finish.


Note: Six new objects have been added to the project: three tables and three flat files. We
will work with these for the remaining modules.

2. Create an Enterprise Discovery Profile.


a. Expand the Titles project.
b. Select all of the data objects: expand DISCOVERY, select the first table, EMPLOYEES,
then hold down the Shift key and select the last data object, titles.
c. Right-click and select New > Profile.


d. Select Enterprise Discovery Profile and click Next.


e. Set:
• Name = edp_TITLES
f. Verify that the location is the Titles project.

g. Click Next.


Note that all of the data objects have been added to the profile.

h. Click Next and Next again.


3. Select Data Domain Discovery > Data Domain Selection.
a. Select the checkbox beside All Data Domains.


b. In the Navigator, select Inference Options.


c. Select Override the default inference options.
i. Select Data and column name.
ii. Select Exclude null values from data domain discovery.
Note: By default, the Domain Discovery is performed on the data only. Selecting this
option lets us run discovery on data and column name.
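
Note: The difference between inferring a domain from the data, from the column name, and with or without null values can be sketched in Python as follows. Everything in this example is hypothetical: the e-mail rules, the function infer_domain, the min_conformance threshold, and the sample values are assumptions for illustration and are not the data domains shipped with the product.

```python
import re

# Hypothetical domain definition: a data rule (regex on values) and a
# column-name rule (regex on the column name). Not an Informatica built-in.
EMAIL_DATA_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
EMAIL_NAME_RULE = re.compile(r"e[-_]?mail", re.IGNORECASE)

def infer_domain(column_name, values, exclude_nulls=True, min_conformance=0.8):
    """Return (matched_by_name, matched_by_data, conformance_ratio)."""
    if exclude_nulls:
        # Drop None and blank strings before measuring conformance.
        values = [v for v in values if v is not None and v.strip() != ""]
    matched_by_name = bool(EMAIL_NAME_RULE.search(column_name))
    if not values:
        return matched_by_name, False, 0.0
    conforming = sum(1 for v in values if v is not None and EMAIL_DATA_RULE.match(v))
    ratio = conforming / len(values)
    return matched_by_name, ratio >= min_conformance, ratio

# Hypothetical column content with two empty entries.
sample = ["a@x.com", "b@y.org", "c@z.net", "d@w.io", None, ""]
with_nulls_excluded = infer_domain("CONTACT_EMAIL", sample)
with_nulls_included = infer_domain("CONTACT_EMAIL", sample, exclude_nulls=False)
print(with_nulls_excluded, with_nulls_included)
```

In this sketch the column passes the data rule only when nulls are excluded, which suggests why the Exclude null values option can change which domains are inferred for sparsely populated columns.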


4. Select Sampling Options.


a. Verify that all rows are profiled.
Note: The All Rows option means that each profile in the Enterprise Discovery profile runs
against every row of its data object rather than a sample. This is the default setting.


5. Select Primary Key Profiling > Inference Options.


a. Verify that the option is enabled.
b. Select the Override the default inference options.
i. Set:
Label                     Value
Max Key Columns           2
Maximum Violation Rows    10

Note that Dependency Inference Profiling must be set in each profile later. It is not available
here.
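
Note: To see how the Max Key Columns and Maximum Violation Rows options above could interact, consider the Python sketch below. It is a minimal illustration under stated assumptions: the function infer_keys, the way violating rows are counted, and the sample rows are all hypothetical and may differ from how the profiling engine actually evaluates candidate keys.

```python
from collections import Counter
from itertools import combinations

def infer_keys(rows, columns, max_key_columns=2, max_violation_rows=10):
    """Return candidate keys as (column_tuple, violation_row_count) pairs."""
    candidates = []
    for size in range(1, max_key_columns + 1):
        for combo in combinations(columns, size):
            counts = Counter(tuple(row[c] for c in combo) for row in rows)
            # Assumed definition: a row violates the key if its key value
            # is shared with at least one other row.
            violations = sum(n for n in counts.values() if n > 1)
            if violations <= max_violation_rows:
                candidates.append((combo, violations))
    return candidates

# Hypothetical rows, not taken from the Titles project.
rows = [
    {"emp_id": 1, "dept": "A", "desk": 1},
    {"emp_id": 2, "dept": "A", "desk": 2},
    {"emp_id": 3, "dept": "B", "desk": 1},
]
columns = ["emp_id", "dept", "desk"]

strict_keys = [combo for combo, _ in infer_keys(rows, columns, 2, 0)]
single_column_keys = [combo for combo, _ in infer_keys(rows, columns, 1, 0)]
print(strict_keys)
```

Raising the key-column limit lets composite keys such as (dept, desk) be considered, and raising the violation limit tolerates a small number of duplicate rows, which is useful when the data is not perfectly clean.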


6. Select Foreign Key Profiling > Inference Options.


a. Select the Override the default inference options.
b. Set:
• Comparison case-sensitivity = Case-insensitive
• Trim values before comparison = Both

c. Click Finish.
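
Note: The effect of the two foreign key comparison options can be sketched in Python as below. This is an illustration under assumptions: the functions normalize and unmatched_child_values, the "leading", "trailing", and "none" trim values, and the sample key values are hypothetical and are not taken from the product or the lab data.

```python
def normalize(value, case_sensitive=False, trim="both"):
    """Apply the trim and case options before two key values are compared."""
    text = str(value)
    if trim == "both":
        text = text.strip()
    elif trim == "leading":   # hypothetical option name
        text = text.lstrip()
    elif trim == "trailing":  # hypothetical option name
        text = text.rstrip()
    # Any other value (for example "none") leaves whitespace untouched.
    return text if case_sensitive else text.casefold()

def unmatched_child_values(child_values, parent_values, **options):
    """Return child key values that have no match among the parent key values."""
    parents = {normalize(v, **options) for v in parent_values}
    return [v for v in child_values if normalize(v, **options) not in parents]

# Hypothetical key values with stray spaces and mixed case.
parent = ["BU1032", "PC8888"]
child = [" bu1032", "pc8888 ", "XX0000"]

relaxed = unmatched_child_values(child, parent)
strict = unmatched_child_values(child, parent, case_sensitive=True, trim="none")
print(relaxed, strict)
```

With case-insensitive comparison and trimming on both sides, only the genuinely missing value is reported. With strict comparison, formatting differences alone would hide an otherwise valid relationship.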

7. Once complete, refresh the MRS.


Note: The profile is still running. Please wait until it finishes to open. This may take a few
minutes.

8. Review the results.


a. After the refresh is complete, expand Titles > Profiles.
b. Select edp_TITLES.


c. Right-click and select Open. It will open on the Results Tab.


d. Select the Default View Tab, click anywhere on the canvas and then select the
Properties > Profiles tab.


Note: Be sure that all of the profiles have completed before viewing the results.
e. Verify that all of the profiles completed with a status of SUCCEEDED.
f. Click on the Results tab.


g. Verify that you are viewing the Relationships results and click the Entity
diagram. This is the blue circle in the main picture.
Note: The related data objects are displayed to the right. We will look at this in more
detail later.
9. Review the Data Domains identified.
a. In the Navigator, select Data Domains and then check the Show data domain group
in hierarchy checkbox.

b. Scroll through and review some of the discovered domains: select a domain in
the viewer, right-click, and choose Drill Down to see the underlying data.
c. Scroll down the list and choose Contact:Firstname.


The domain is contained in the EMPLOYEES data object.


d. Right-click and select Drill Down.
Note: You can only drill down on columns when the data has been identified using a
data rule. You cannot drill down on domains that have been identified by a column
name rule.
e. Review the results, checking both Conforming Rows and Non-Conforming Rows. To
see the latter, select Non-Conforming Rows and click Run.

i. Verify that you can see the 7 rows that do not match.
Note: Data domains can also be approved or rejected from here.
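
The drill-down in step 9 separates rows by whether a value satisfies the domain's data rule. The following Python sketch illustrates that split. The regular expression and the sample names are hypothetical stand-ins chosen for illustration; they are not the rule used by the Contact:Firstname domain.

```python
import re

# Hypothetical data rule (an assumption): letters, optionally joined by a
# single hyphen or apostrophe, such as "Jean-Luc" or "O'Neil".
FIRSTNAME_RULE = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")


def split_by_rule(values, rule=FIRSTNAME_RULE):
    """Return (conforming, non_conforming) lists for a column of values."""
    conforming, non_conforming = [], []
    for value in values:
        # Treat NULLs as empty strings so they fail the rule.
        text = "" if value is None else str(value).strip()
        if rule.fullmatch(text):
            conforming.append(value)
        else:
            non_conforming.append(value)
    return conforming, non_conforming


# Illustrative sample values, not the lab data.
sample = ["Maria", "Jean-Luc", "O'Neil", "X1", "", None, "Ann"]
ok, bad = split_by_rule(sample)
print(len(ok), "conforming;", len(bad), "non-conforming")
```

Reviewing the non-conforming list is the equivalent of selecting Non-Conforming Rows in the drill-down: it shows which values would need cleansing or a relaxed rule.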
10. Review Column Profiles.
a. In the Results view, select Column Profile.
b. Ensure all of the data objects are listed.
c. Select each of the data objects and review the results.


d. From the analysis, we can see the following:


Data Object    Description
Stores         Store name and address information
EMPLOYEES      Employee information
SALES          Information on the sales of titles
Jobs           Job information
TITLES         Information about titles, such as price, type, sales, and pubdate
PUBLISHERS     Publisher address information
e. Select the Default view tab and note that the main canvas is currently empty. To
populate the view, select all of the objects in the Outline View to the left.
Note: There may be a slight delay when selecting the tables in the list. This is
normal on the lab image.

f. Note that the Workspace area now displays all the objects. Rearrange them by
clicking and dragging them into position.


g. Observe that there are no links between any of the objects.


11. Review data object profiles.
a. Select the stores data object.
b. Expand the Columns and Profile section.

c. Under Profile, click the Profile_stores link.


d. The Profile_stores opens in a new tab.
e. To the left, select Results > Column Profiling.


f. Select the city column.

g. Notice that the Details section lists the values.


h. In the Details section, select Patterns.


i. The X(8) pattern has a frequency of 2.


i. Select Statistics.
i. The Maximum Length is 9.

j. Select Datatypes.
i. Verify the city column has a datatype of String (9).
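
The Patterns, Statistics, and Datatypes details in step 11 can be approximated outside the tool. The sketch below assumes that X stands for a letter and 9 for a digit, and that a run of repeated symbols is written with its length in parentheses. That notation is inferred from the X(8) pattern shown in the lab, and the city values are illustrative sample data.

```python
from collections import Counter
from itertools import groupby


def char_class(ch):
    """Map a character to a pattern symbol: X for letters, 9 for digits."""
    if ch.isalpha():
        return "X"
    if ch.isdigit():
        return "9"
    return ch  # spaces and punctuation are kept literally


def value_pattern(value):
    """Collapse runs of the same symbol, e.g. 'Portland' -> 'X(8)'."""
    parts = []
    for symbol, run in groupby(char_class(c) for c in value):
        length = len(list(run))
        if symbol in ("X", "9") and length > 1:
            parts.append(f"{symbol}({length})")
        else:
            parts.append(symbol * length)
    return "".join(parts)


def profile_column(values):
    """Return (pattern frequencies, maximum value length) for a column."""
    patterns = Counter(value_pattern(v) for v in values)
    max_length = max((len(v) for v in values), default=0)
    return patterns, max_length


# Illustrative sample values (an assumption, not read from the lab files).
cities = ["Seattle", "Tustin", "Los Gatos", "Portland", "Fremont", "Remulade"]
patterns, max_length = profile_column(cities)
print(patterns.most_common())
print("Maximum length:", max_length)
```

With these sample values, two cities share the X(8) pattern and the longest value has 9 characters, which is the kind of evidence behind an inferred datatype such as String (9).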


12. Return to the Default view.


13. Take a few minutes to review the column profiles for the EMPLOYEES, PUBLISHERS
and SALES data objects.


Lab 3-2: Functional Dependency and Primary Key Inference


Overview:
You have taken a look at the new objects and created an Enterprise Discovery Profile.
Functional Dependency Profiling was not run as part of the profile, and we would like to run it for
our flat file data objects to help with reviewing the Primary Keys for these objects.
Note that the database tables already have a Primary Key defined.
Objectives:
• Configure Functional Dependency Profiling.
• Rerun the profiles.
• Review the results.
• Define the Primary Keys for the flat file data objects.
Duration:
30 minutes

Tasks
1. Within the Enterprise Discovery Profile open the following:
• Profile_stores
• Profile_jobs
• Profile_titles

2. Configure Functional Dependency Profiling:


a. In Profile_stores expand Definition > Functional Dependency Profiling and select
Column Selection.
b. Select the checkboxes beside Titles\stores for both Determinant and Dependent
columns.


c. Verify that all of the columns are selected and that the Enabled as part of the
‘Run Profile’ action checkbox is checked. Leave the Inference Options as they are.
d. Continue to Profile_jobs and repeat these steps.

e. Enable Functional Dependency Profiling on the Profile_titles also and Save the
changes to the Enterprise Profile.
3. Return to the Default view and click on the Profiles tab.
a. Choose to Run Multiple profiles.


b. In the dialog deselect the profiles for the 3 tables as these don’t need to be re-run.
c. Press OK and let the profiles run. These may take some time to complete.

4. Once the profiles have finished running, return to Profile_stores.


a. Choose Results > Primary Key Inference.


b. Select stor_id. Right-click and select Verify.


c. Scroll across and note the green tick to show that it has been verified.

Note: No violations are found.


d. Before approving the key, click Results > Functional Dependency Inference.
e. Note that the stor_id can be used to determine all the columns in the object.
Note: The fields are highlighted below for clarification only.


f. Return to the Primary Key Inference tab, right-click on stor_id and select Approve.

g. Close Profile_stores and select the Default View.


i. Select the stores data object.
ii. Expand Keys, and verify that the stor_id is the key.
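
Step 4 relies on two checks: a candidate key must be unique and non-null (Verify reports no violations), and it should determine every other column (Functional Dependency Inference). The Python sketch below shows both checks on a few hypothetical rows. The function names and the sample data are assumptions made for illustration and do not describe how the profiling engine is implemented.

```python
from collections import Counter


def determines(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        if key in seen and seen[key] != row[dependent]:
            return False  # same determinant, different dependent value
        seen.setdefault(key, row[dependent])
    return True


def key_violations(rows, column):
    """Return rows whose candidate key value is null or duplicated."""
    counts = Counter(row[column] for row in rows)
    return [r for r in rows if r[column] is None or counts[r[column]] > 1]


# Hypothetical rows shaped like the stores data object.
stores = [
    {"stor_id": "S1", "stor_name": "Store A", "city": "Seattle", "state": "WA"},
    {"stor_id": "S2", "stor_name": "Store B", "city": "Tustin", "state": "CA"},
    {"stor_id": "S3", "stor_name": "Store C", "city": "Fremont", "state": "CA"},
]

others = [c for c in stores[0] if c != "stor_id"]
stor_id_determines_all = all(determines(stores, "stor_id", c) for c in others)
state_determines_city = determines(stores, "state", "city")
violations = key_violations(stores, "stor_id")

print(stor_id_determines_all, state_determines_city, violations)
```

A column that passes both checks, as stor_id does here, is a reasonable primary key candidate; state fails the dependency check because one state maps to more than one city.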


5. While in Default view review the jobs object:


a. Note that currently no Primary Keys are defined.

b. Return to the Profile_jobs and select Results > Column Profiling.

c. Review the contents of the Column Profiling section.


d. Select Results > Functional Dependency Inference.


i. Note that the columns can be determined by either the job_id or the job_desc column.

e. Select Results > Primary Key Inference.


f. Select job_id, right-click and select Verify.

Note: It is verified, and there are no key violations.


g. Right-click on job_id and select Approve.


h. Close the Profile_jobs tab and return to the Default View.
i. Verify that the Key has been added.

j. Select the Key.

i. Select Properties > Key.


k. Review the information.
6. Review the titles data object.
a. In the Default view check to see if a key has been defined for the object.


b. Open the Profile_titles and review the data.

c. Review the Results > Functional Dependency Inference tab to help select a key.


Note: Both the TITLE_ID and TITLE have been identified as determinant columns.
d. Open the Results > Primary Key Inference tab to review the inferred keys.
e. Verify the title_id and check there are no key violations.

f. Approve the key and close the open profiles within the Enterprise Discovery Profile
to return to the Default View tab.
g. On the Default View tab verify all the Primary keys have been defined.

This concludes the lab.


Module 4: Enterprise Discovery – Overlap Discovery and Foreign Key Profiling

Lab 4-1: Overlap Discovery Profiling
Overview:
Create an Overlap Discovery Profile including all of the data objects within the enterprise
discovery profile. This will enable us to search for objects containing overlapping data.
Objectives:
• Create and run an Overlap Discovery Profile.
• Review the results, verifying overlap in some columns.
• This information can be used to help define PK-FK relationships in the next section.
Duration:
20 minutes

Tasks
1. Perform Overlap Analysis on the Titles data objects.
a. If needed, open the edp_TITLES profile.
b. In the Default View, select all of the data objects:
i. In the Default View, in an empty space, right-click and Select All.

c. Right-click on any selected object and select Overlap Discovery.


d. In the Overlap Discovery window set:


• Name = OverlapDiscovery
i. Verify that all of the data objects are available. If not, add the missing objects
using Add.

ii. Click Next.


iii. Accept the default selection and click Next.


iv. Select Override the default inference options.
v. Select Case-insensitive for the Comparison.
vi. Click Finish.

vii. In the message box, click Yes to save the profile and let the profile run.


2. When the Profile completes it will open. It is then possible to review the Overlap
Discovery Results:
a. In the OverlapDiscovery tab, select Results > Overlap Discovery.

b. Scroll down and find the SALES data object.


c. Select the STOR_ID and verify that there is a 100% Overlap with the stores.stor_id
column by right-clicking on the column and selecting Verify.


d. After the verification, select the STOR_ID column. Note that a checkmark has been
added to the Verified column.
e. Observe that you can view a Venn diagram and note the overlap between the data in
both columns.

f. In the Overlap Discovery section, select titles > pub_id with PUBLISHERS.PUB_ID.
g. Right-click and select Verify.


h. Observe the Venn diagram and the percentage of rows for Overlap.
i. Double-click on the outer PUB_ID area to review the non-overlapping records.

j. Continue reviewing and verifying the overlap between columns in the other data
objects.
k. When you have finished reviewing the results, close the Overlap Discovery tab.
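
The overlap figures and the two sides of the Venn diagram reviewed in step 2 can be reasoned about with a small set comparison. The Python sketch below is one plausible reading, assuming the percentage is the share of rows in the first column whose value also appears in the second; the lab guide does not state the exact formula, and the ID values are hypothetical. The case_insensitive flag mirrors the Case-insensitive comparison option chosen in the wizard.

```python
def overlap_stats(left, right, case_insensitive=True):
    """Compare two columns and report overlap and non-overlapping values."""

    def normalize(value):
        text = str(value).strip()
        return text.lower() if case_insensitive else text

    left_values = [normalize(v) for v in left if v is not None]
    right_set = {normalize(v) for v in right if v is not None}
    left_set = set(left_values)

    # Assumption: overlap is measured over the rows of the left column.
    matched = sum(1 for v in left_values if v in right_set)
    pct = 100.0 * matched / len(left_values) if left_values else 0.0
    return {
        "overlap_pct": pct,
        "left_only": sorted(left_set - right_set),
        "right_only": sorted(right_set - left_set),
    }


# Hypothetical values standing in for SALES.STOR_ID and stores.stor_id.
sales_stor_id = ["6380", "6380", "7066", "7067"]
stores_stor_id = ["6380", "7066", "7067", "8042"]

stats = overlap_stats(sales_stor_id, stores_stor_id)
print(stats)
```

The right_only list corresponds to the outer area of the Venn diagram that step i drills into: values present in one column but never referenced by the other.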


Lab 4-2: Foreign Key Profiling


Overview:
Review and interpret the Foreign Key Profile that was generated as part of the Enterprise
Discovery Profile. We will approve relationships and verify that they are reflected on the data
objects on the model canvas.
Finally, we will export the updated model to DDL.
Objectives:
• Review the FK profile that was generated.
• Approve PK-FK relationships.
• Verify the relationship is updated on the model canvas.
• Export the updated model as DDL.
Duration:
20 minutes

Tasks
1. Review the Foreign Key Profile.
a. In Default view, click on the workspace and from the Properties Tab select Profiles.

b. In Properties > Profiles, open ForeignKeyProfile.


c. Select Results > Foreign Key Profiling.


d. Click on the circle icon.


Note: The Entity_1 section is now populated.

e. Double-click on the circle to open the drill down. The EMPLOYEES data object is the
selected object.
Note: Green represents the currently selected objects. Blue objects represent the first
level of relationships. A single arrowhead indicates a primary key to foreign key
relationship. The arrowhead points to the data object with the primary key.


f. Hover the mouse over the connector, and the relationship is displayed.


g. Select titles and note that its direct relationships are displayed.

h. In the chart, select EMPLOYEES.


i. Right-click and select Pin Selected Data Object as Focus.


2. Discover the business relationships:


a. Select titles.
b. Right-click and select View Column Relationships.

c. Select the title_id row with SALES.TITLE_ID as the Related Data Object.


d. Right-click and select Verify.


e. After the verification, a green checkmark appears in the Verified column.
f. Select the row to display the Venn chart.

g. Notice that you have 2 orphan rows. Double-click on the orange titles area to drill
down and review the orphan records.

Note: The orphan rows mean that there are titles that have not sold.


This is a valid Primary Key – Foreign Key Relationship.


h. Right-click on the row and select Approve. Note that the status changes to
Approved.

i. Return to the Default View by clicking on the Default View tab.

Note: Observe that there is now a link between the data objects.
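
The orphan rows reviewed in step 2 are titles with no matching row in SALES. The Python sketch below shows that comparison in both directions. The title IDs are hypothetical, and the helper name unmatched_rows is an assumption introduced for illustration.

```python
def unmatched_rows(rows, column, other_values):
    """Return rows whose value in `column` is absent from `other_values`."""
    other = set(other_values)
    return [row for row in rows if row[column] not in other]


# Hypothetical data: four titles, of which only two have been sold.
titles = [
    {"title_id": "T1"},
    {"title_id": "T2"},
    {"title_id": "T3"},
    {"title_id": "T4"},
]
sales = [
    {"TITLE_ID": "T1"},
    {"TITLE_ID": "T1"},
    {"TITLE_ID": "T3"},
]

# Titles that never appear in SALES: the "titles that have not sold" case.
unsold = unmatched_rows(titles, "title_id", (s["TITLE_ID"] for s in sales))

# Sales that point at a missing title would break the PK-FK relationship.
dangling = unmatched_rows(sales, "TITLE_ID", (t["title_id"] for t in titles))

print([t["title_id"] for t in unsold], dangling)
```

Unsold titles are harmless to the relationship, which is why it can still be approved. Sales rows that reference a title missing from titles would be the case to investigate before approving.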


3. Select Sales in the Default view.


a. Select Properties > Relationships and review the relationship defined between the
two objects.

b. Return to the Foreign Key Profile, select titles and PUBLISHERS > pub_id.
c. Verify the relationship.

d. There are orphan rows in the Publishers data object. Double-click the orange icon to
populate the Data Viewer view and view the orphan records.


e. Right-click in the Data Viewer view, and select Export Data.

f. Export the data to a file on the Desktop.


g. Set:
• Name = PUBLISHERSwithoutTITLES
h. Approve the relationship and verify that it was updated on the model workspace.


4. Create relationships with the EMPLOYEES data object.


a. In the Foreign Key Profile, select All Data Objects in the group.

b. Expand the EMPLOYEES data object.


c. Select the PUB_ID.
d. Verify the relationship.


e. The Conformance Percentage is 100%. There are no orphan rows.


f. Approve this relationship.
5. In EMPLOYEES, select the JOB_ID > jobs.job_id relationship.
a. Verify the relationship.

b. There is 1 orphan row. Double-click on the orange square to view the orphan record
in the drill-down below.


c. Approve the relationship.


6. Configure the SALES relationships.
a. Select SALES > STOR_ID > stores.stor_id.
b. Verify the relationship.

c. There are no orphan rows. Approve the relationship.


7. Verify that the data objects have relationships.


a. Open the Default view.
b. In each of the data objects, collapse all of the Keys, Columns, or Profiles.
c. In the menu, select Layout > Arrange All.


8. Generate DDL.
a. In the Object Explorer, right-click on edp_TITLES and choose Generate DDL.

b. Save the file to the desktop and call it TITLES.


c. Set the Target Database Type to ORACLE and click OK.

d. Once complete, check the Desktop and note that two files are generated:
TITLES_CREATE and TITLES_DROP.
e. Open and review the contents.
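
As a rough aid for reading the two generated files, the Python sketch below builds Oracle-style CREATE and DROP statements from a small model containing one approved primary key and one approved foreign key. It is not the output of the Generate DDL action. The column datatypes, constraint names, and helper functions are assumptions made for illustration.

```python
def create_table(name, columns, primary_key=None, foreign_keys=()):
    """Build a CREATE TABLE statement with optional PK and FK constraints."""
    lines = [f"  {col} {dtype}" for col, dtype in columns]
    if primary_key:
        lines.append(f"  CONSTRAINT PK_{name} PRIMARY KEY ({primary_key})")
    for col, ref_table, ref_col in foreign_keys:
        lines.append(
            f"  CONSTRAINT FK_{name}_{col} FOREIGN KEY ({col}) "
            f"REFERENCES {ref_table} ({ref_col})"
        )
    return f"CREATE TABLE {name} (\n" + ",\n".join(lines) + "\n);"


def drop_table(name):
    """Build a DROP TABLE statement that also removes dependent constraints."""
    return f"DROP TABLE {name} CASCADE CONSTRAINTS;"


# Hypothetical datatypes; the real ones come from the profiled data objects.
stores_sql = create_table(
    "STORES",
    [("STOR_ID", "VARCHAR2(4)"), ("CITY", "VARCHAR2(9)")],
    primary_key="STOR_ID",
)
sales_sql = create_table(
    "SALES",
    [("STOR_ID", "VARCHAR2(4)"), ("QTY", "NUMBER(5)")],
    foreign_keys=[("STOR_ID", "STORES", "STOR_ID")],
)

print(stores_sql)
print(sales_sql)
print(drop_table("SALES"))
print(drop_table("STORES"))
```

When reviewing TITLES_CREATE, look for the same two kinds of constraint: the primary keys approved in Lab 3-2 and the foreign key relationships approved in this lab.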


9. Export Entity Relationships.


a. Return to the Foreign Key Profile and choose the Export icon to the right.


b. Export the Excel file to the Desktop.

c. Accept the default file name and Save.


d. Open the new spreadsheet on the Desktop and review the information.


e. Review each of the tabs.


f. Close the spreadsheet and return to Developer.

This concludes the lab.
