Perform data quality checks against the criteria below, using your mapping document as the basis:

1. Data Type - Check if the ingested data matches your defined data type.
SHOW CREATE TABLE <TABLE NAME>

2. Number of Columns - Check if the number of ingested columns matches the number defined in your mapping.
SHOW CREATE TABLE <TABLE NAME>

3. Order of Columns - Check if the order of the ingested columns matches the order defined in your mapping.
SHOW CREATE TABLE <TABLE NAME>
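
For checks 1 to 3, a compact alternative to reading the full DDL is DESCRIBE, which lists one row per column with its name and data type, in table order. The number of rows then gives the number of columns. A minimal sketch, with the table name as a placeholder to replace:

```sql
-- One output row per column, in table order: col_name, data_type, comment.
-- For partitioned tables, the partition columns are repeated in a
-- separate section at the end of the output.
DESCRIBE <table_name>;
```

Compare the output line by line with the mapping document: same names, same types, same sequence.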

4. Date Format - Check that date columns follow the defined format. (If applicable)
SELECT <column_name> FROM <table_name>
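
The SELECT above only lists the values for visual inspection. The sketch below returns only the values that break the format. It assumes the column is stored as STRING and that the defined format is yyyy-MM-dd; both are assumptions, so adjust the pattern to the format in your mapping document.

```sql
-- Non-null values that do not have the shape yyyy-MM-dd.
-- The pattern checks digits and separators only, not calendar
-- validity (for example, month 13 would still pass).
SELECT <column_name>, COUNT(*) AS row_count
FROM <table_name>
WHERE <column_name> IS NOT NULL
  AND NOT (<column_name> RLIKE '^[0-9]{4}-[0-9]{2}-[0-9]{2}$')
GROUP BY <column_name>;
```

An empty result means every populated value follows the format.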

5. Precision and Scale - Check if the ingested data fits your defined field size (including decimal places, if applicable).
SELECT MAX(LENGTH(<column_name>)) FROM <table_name>
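
MAX(LENGTH(...)) measures the longest value as a whole. For decimal fields, the integer digits and the decimal places can be measured separately. The sketch below assumes the values are stored as STRING with '.' as the decimal separator; the output column names are illustrative.

```sql
-- Longest integer part and longest fractional part found in the column.
-- regexp_extract returns the captured group, or an empty string when
-- the pattern does not match (LENGTH is then 0).
SELECT
  MAX(LENGTH(regexp_extract(<column_name>, '^-?([0-9]*)', 1))) AS max_integer_digits,
  MAX(LENGTH(regexp_extract(<column_name>, '[.]([0-9]+)$', 1))) AS max_scale
FROM <table_name>;
```

For a field defined as DECIMAL(p, s), max_scale should not exceed s and max_integer_digits should not exceed p - s. Leading zeros, if present, are counted as digits.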

6. Validity - Define the validity of your columns in HQL.
SELECT <column_name> FROM <table_name>
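
What counts as valid depends on the rules in your mapping document. As an illustration only, the sketch below lists values that fall outside an allowed set. The values 'ACTIVE' and 'INACTIVE' are hypothetical and stand in for whatever domain your mapping defines.

```sql
-- Values outside the allowed set, with how often each occurs.
-- 'ACTIVE' and 'INACTIVE' are hypothetical; replace them with the
-- domain from the mapping document.
-- NULLs are not returned by NOT IN; the completeness check covers them.
SELECT <column_name>, COUNT(*) AS row_count
FROM <table_name>
WHERE <column_name> NOT IN ('ACTIVE', 'INACTIVE')
GROUP BY <column_name>;
```

The same pattern works for range rules by swapping the WHERE condition, for example a numeric column that must not be negative.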

7. Completeness - Define the completeness of your columns in HQL (mandatory or nullable).
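
A possible completeness query for a mandatory column is sketched below. It assumes a STRING column and treats blank strings as missing; whether blanks count as missing is an assumption to confirm against your mapping document.

```sql
-- Rows where the mandatory column is NULL or blank, next to the total.
SELECT
  COUNT(*) AS total_count,
  SUM(CASE WHEN <column_name> IS NULL OR TRIM(<column_name>) = '' THEN 1 ELSE 0 END) AS missing_count
FROM <table_name>;
```

A mandatory column should show missing_count = 0. For a nullable column, the figure is informational only.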

8. Uniqueness - Check if your defined primary key is unique.
SELECT COUNT(*) AS total_count, COUNT(DISTINCT <primary key column>) AS distinct_count
FROM <table_name>
Null count: Also make sure that your primary key has no null values.
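
The null count has no query of its own above, so a minimal sketch follows, together with a query that lists the offending key values when the two counts differ. COUNT(DISTINCT ...) ignores NULLs, so nulls in the key also make total_count and distinct_count disagree.

```sql
-- Null count for the primary key; the expected result is 0.
SELECT COUNT(*) AS null_count
FROM <table_name>
WHERE <primary key column> IS NULL;

-- Key values that occur more than once.
SELECT <primary key column>, COUNT(*) AS occurrences
FROM <table_name>
GROUP BY <primary key column>
HAVING COUNT(*) > 1;
```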

9. Full Row Duplicates - Check that no row is repeated in its entirety.
SELECT column1,
       columnN, -- list all the columns
       COUNT(*) AS duplicate_count
FROM <table_name>
GROUP BY
       column1,
       columnN -- list all the columns
HAVING COUNT(*) > 1
