Audit Your ETL With CHECKSUM

Loading your target incrementally offers a huge performance benefit over running full truncate/reloads, but there is a danger of missing inserts or updates in your source system. It can take hours or days to track down the source of these problems (if it can be done at all), and problems are generally not found until the customer picks up on it.

So anything you can do to verify the target data against the source is good for your business. You can do simple rowcounts, but that's an unreliable test. For example, if you're missing one inserted row and one deleted row, your rowcount test will still pass.
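The rowcount blind spot is easy to reproduce. The sketch below uses made-up key sets (not data from any real system): the target missed one insert (key 4) and one delete (key 2), so the counts agree even though the tables do not.

```python
# Hypothetical key sets for illustration only.
# Since the last load, the source inserted key 4 and deleted key 2;
# the target picked up neither change.
source_keys = {1, 3, 4}
target_keys = {1, 2, 3}


def rowcount_check(source, target):
    """Naive audit: pass when both sides hold the same number of rows."""
    return len(source) == len(target)


def key_diff(source, target):
    """Stricter audit: list the keys that exist on only one side."""
    return {
        "missing_in_target": sorted(source - target),
        "extra_in_target": sorted(target - source),
    }


counts_match = rowcount_check(source_keys, target_keys)
diff = key_diff(source_keys, target_keys)

print(counts_match)  # the rowcount test passes
print(diff)          # yet two keys are out of sync
```

The missed insert and the missed delete cancel each other out in the count, which is why a per-key comparison is needed.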

A good checksum technique is a lightweight alternative that can be run asynchronously or run after the load is complete. In this example, I am querying a SQL Server target and using ODBC to connect to an Oracle source to get a list of keys that are not synchronized.

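SELECT T.key, S.key, T.checksum, T.num_rows, S.checksum, S.num_rows
FROM (
    -- Per-key row count and checksum on the SQL Server target
    SELECT key, count(*) num_rows,
           sum(len(char_col)+len(num_col)+num_col+...) checksum
    FROM target_table
    GROUP BY key
) T
-- Same summary computed on the Oracle source through the linked server
FULL OUTER JOIN OPENQUERY(source_server, '
    SELECT key, count(*) num_rows,
           sum(length(char_col)+length(num_col)+num_col+...) checksum
    FROM source_table
    GROUP BY key') S
    ON T.key = S.key
WHERE S.num_rows <> T.num_rows
   OR S.checksum <> T.checksum
   -- A key present on only one side has NULLs on the other side,
   -- and <> never matches NULL, so test for that case explicitly
   OR S.key IS NULL
   OR T.key IS NULL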
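To try the idea without a SQL Server or Oracle instance, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for both systems. The table contents and the column name k (used in place of key) are assumptions made for illustration, and the dictionary comparison plays the role of the FULL OUTER JOIN and WHERE clause above; it is not the linked-server setup itself.

```python
import sqlite3


def summarize(conn, table):
    """Return {key: (num_rows, checksum)} for one table.

    The checksum mirrors the article's expression: string lengths plus
    the numeric value, summed per key.
    """
    sql = (
        "SELECT k, COUNT(*), "
        "SUM(LENGTH(char_col) + LENGTH(num_col) + num_col) "
        f"FROM {table} GROUP BY k"
    )
    return {k: (n, c) for k, n, c in conn.execute(sql)}


def out_of_sync(source, target):
    """Keys whose summaries differ, including keys found on one side only."""
    all_keys = source.keys() | target.keys()
    return sorted(k for k in all_keys if source.get(k) != target.get(k))


# Hypothetical sample data: key 2 was updated in the source, key 3 was
# deleted from the source, and key 4 was inserted into the source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_table (k INTEGER, char_col TEXT, num_col INTEGER);
CREATE TABLE target_table (k INTEGER, char_col TEXT, num_col INTEGER);
INSERT INTO source_table VALUES (1, 'alpha', 10), (2, 'beta', 25), (4, 'delta', 40);
INSERT INTO target_table VALUES (1, 'alpha', 10), (2, 'beta', 20), (3, 'gamma', 30);
""")

source_summary = summarize(conn, "source_table")
target_summary = summarize(conn, "target_table")
mismatched = out_of_sync(source_summary, target_summary)
print(mismatched)
```

Both tables hold three rows here, so a rowcount test would pass, while the per-key comparison reports the updated, deleted, and inserted keys.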
