
Movies Sample for DQS

The Movies sample for Data Quality Services (DQS) demonstrates how you can use DQS to eliminate duplicate
records in your data. Consider a scenario where you have a large movies database with
many entries from different sources and theaters. You might ask the following questions:
• Are there double entries and other problems in your data?
• Does your data include duplicates?
• Is your data clean?

Sample Scenarios
The sample covers the following scenarios in DQS:

• Scenario 1: Create knowledge base from an existing knowledge base (.dqs file)
• Scenario 2: Perform matching activity

Scenario 1: Create knowledge base from an existing knowledge base (.dqs file)
In this scenario, you will use the Movies.dqs file to create a knowledge base with a matching policy.
You will later use this new knowledge base to match records in the
MoviesSampleData.xls file.

1. Start Data Quality Client, and create a knowledge base from the Movies.dqs file. For step-by-step
instructions, see Import a Knowledge Base from a .dqs File.
2. Select Domain Management, and then click Next. All the domains from the .dqs file will be
imported to the new knowledge base.
3. Click Finish to publish the knowledge base. The knowledge base is created with the following
matching rule:

Scenario 2: Perform matching activity


1. Create a new data quality project, select the knowledge base created in scenario 1, select the
Matching activity, and then click Next.
2. In the Mapping stage, select Excel File as the data source, and then select the
MoviesSampleData.xls file. Map the columns in the Excel file to the domains as shown below:
3. Click Next to go to the Match stage. Click Start to run the matching. After the matching is done,
the resulting clusters of matched records are displayed.

4. Eliminate the duplicate records, and then proceed to export the processed data.
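
To make the idea behind the matching activity more concrete, the following is a minimal Python sketch of clustering similar records and keeping one survivor per cluster. It is not the algorithm that DQS uses, and it does not reproduce the matching rule in the Movies knowledge base. It assumes a single text column (movie titles), uses the standard library's difflib similarity ratio as a stand-in for a matching score, and treats 0.8 as an assumed minimum matching score. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio in [0, 1] after trimming and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def cluster_titles(titles, threshold=0.8):
    """Group record indexes into clusters of likely duplicates.

    Each record is compared with the first record (the pivot) of every
    existing cluster and joins the first cluster whose pivot is at least
    `threshold` similar. Otherwise it starts a new cluster.
    """
    clusters = []
    for index, title in enumerate(titles):
        for cluster in clusters:
            if similarity(titles[cluster[0]], title) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


def survivors(titles, threshold=0.8):
    """Keep only the pivot record of each cluster, dropping the duplicates."""
    return [titles[cluster[0]] for cluster in cluster_titles(titles, threshold)]


if __name__ == "__main__":
    # Illustrative data only; these rows are not taken from MoviesSampleData.xls.
    sample = [
        "The Godfather",
        "the godfather ",
        "The Godfather Part II",
        "Casablanca",
        "Casablanka",
    ]
    for cluster in cluster_titles(sample):
        print([sample[i] for i in cluster])
    print(survivors(sample))
```

In this sketch the first record of each cluster acts as the pivot and is the one that survives. Which record should survive in a real project is a separate decision, and a lower threshold would merge more records at the risk of grouping distinct movies such as sequels.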
