Complex ETL Testing: A Strategic Approach
2011 Real-Time Technology Solutions (212) 240-9050 www.rtts.com info@rttsweb.com 360 Lexington Ave. Fl 9, New York, NY 10017
Data Verification
The recommended pre-deployment strategy is to build test automation (both functional and performance) for every test entry point in the system (feeds, databases, internal messaging, front-end transactions). The goal of the strategy is to provide automated tools for rapid localization of issues between test entry points (see Figure 1).
[Figure 1: data sources (HL7, CDISC, XML, file, data), ETL, Data Warehouse, Datamart(s), BI Tools]
In Figure 1, data from a variety of sources is transformed by the ETL into the Data Warehouse. A second ETL leg aggregates data from the Data Warehouse into Datamart tables for efficient reporting. Front-end applications and Business Intelligence applications access the Datamart in order to provide historical and statistical analyses of company data.
Data Comparison
As suggested by Figure 1, there are four major points of data comparison in the scheme. These are indicated in Figure 2 and are found between the:
1. Source(s) and Data Warehouse
2. Data Warehouse and Datamart
3. Datamart and BI Tool
4. Source(s) and Datamart
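A comparison at any one of these points can be sketched as a multiset difference between the rows read from the upstream side and the rows read from the downstream side. The example below is a minimal illustration, not a description of any particular tool: the function name `compare_rows` and the sample rows are assumptions made for this sketch.

```python
from collections import Counter


def compare_rows(source_rows, target_rows):
    """Compare two result sets as multisets of row tuples.

    Returns (missing, unexpected): rows present upstream but absent
    downstream, and rows present downstream but absent upstream.
    Duplicate rows are counted, so a dropped duplicate is reported.
    """
    source_counts = Counter(map(tuple, source_rows))
    target_counts = Counter(map(tuple, target_rows))
    missing = source_counts - target_counts
    unexpected = target_counts - source_counts
    return sorted(missing.elements()), sorted(unexpected.elements())


# Hypothetical rows: one amount was altered on the way into the warehouse.
source = [(1, "alice", 100), (2, "bob", 250), (3, "carol", 75)]
warehouse = [(1, "alice", 100), (2, "bob", 205), (3, "carol", 75)]

missing, unexpected = compare_rows(source, warehouse)
print("missing from target:", missing)
print("unexpected in target:", unexpected)
```

An altered value appears once in each list, which makes it easy to tell a changed row from one that was dropped or duplicated.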
In automated Data Warehouse testing, the emphasis is on validating data integrity between all points of comparison to ensure that the ETL mappings and transformations are implemented properly across the architecture. Using test automation, data can be tracked from the source layer, through the ETL processing and the Data Warehouse and Datamart components, to the front-end applications. If corrupt data is found in a front-end application, executing the automated tests can quickly determine whether the problem is located in the data source, an ETL process, a Data Warehouse database, a Datamart, or the front-end/Business Intelligence tool. Rapid determination of the application tier in which an issue is located can dramatically lower turnaround times for remediation and improve quality.
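The localization idea can be illustrated with a small sketch that runs one comparison per leg and reports which legs disagree. Everything here is an assumption for illustration: the in-memory SQLite database stands in for the real tiers, the table names (`source_orders`, `dw_orders`, `mart_totals`) and the `localize` helper are hypothetical, and the Datamart-to-BI-tool leg is omitted because it cannot be modeled with SQL alone.

```python
import sqlite3


def fetch(conn, sql):
    """Run a query and return its rows in a stable order for comparison."""
    return sorted(conn.execute(sql).fetchall())


def localize(conn, checks):
    """Run the comparison for each leg in order; return the names of failing legs."""
    failures = []
    for name, upstream_sql, downstream_sql in checks:
        if fetch(conn, upstream_sql) != fetch(conn, downstream_sql):
            failures.append(name)
    return failures


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (id INTEGER, customer TEXT, amount INTEGER);
    CREATE TABLE dw_orders     (id INTEGER, customer TEXT, amount INTEGER);
    CREATE TABLE mart_totals   (customer TEXT, total INTEGER);
    INSERT INTO source_orders VALUES (1,'acme',100),(2,'acme',50),(3,'zenith',70);
    INSERT INTO dw_orders     VALUES (1,'acme',100),(2,'acme',50),(3,'zenith',70);
    -- Deliberate defect in the second ETL leg: acme's total should be 150.
    INSERT INTO mart_totals   VALUES ('acme',100),('zenith',70);
""")

# Each check pairs an upstream query with the downstream query it should match.
checks = [
    ("source -> warehouse",
     "SELECT id, customer, amount FROM source_orders",
     "SELECT id, customer, amount FROM dw_orders"),
    ("warehouse -> mart",
     "SELECT customer, SUM(amount) FROM dw_orders GROUP BY customer",
     "SELECT customer, total FROM mart_totals"),
    ("source -> mart",
     "SELECT customer, SUM(amount) FROM source_orders GROUP BY customer",
     "SELECT customer, total FROM mart_totals"),
]

failing = localize(conn, checks)
print(failing)
```

In this sketch the source-to-warehouse leg passes while both comparisons that end at the Datamart fail, which points to the aggregation leg between the warehouse and the Datamart rather than to the source feed or the first ETL leg.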
On the performance side, test entry points analogous to the data comparison points are used to characterize subsystem response under load. These are indicated in Figure 3. Using industry-standard performance tools, the performance of each component in the ETL processes can be evaluated. Often, a major focus of performance and scalability testing is the Business Intelligence tool client interfaces, because these are the user-facing components of the Data Warehouse architecture. Poor performance as perceived by the user community is frequently a significant concern, and performance measurements across a series of releases can often reduce project tensions around this issue.
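As a rough illustration of characterizing response under load at a single entry point, the sketch below issues concurrent requests and summarizes the response times. It is not a substitute for an industry-standard performance tool; `measure_under_load` and `sample_report_query` are hypothetical names, and the sleep call is only a stand-in for a real request against the component under test.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(operation):
    """Return the wall-clock duration of one call, in seconds."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start


def measure_under_load(operation, concurrent_users, requests_per_user):
    """Issue requests from a pool of worker threads and summarize response times."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        timings = list(pool.map(timed_call, [operation] * total))
    return {
        "requests": total,
        "median_s": statistics.median(timings),
        # With n=20 the last cut point is the 95th percentile.
        "p95_s": statistics.quantiles(timings, n=20)[-1],
        "max_s": max(timings),
    }


def sample_report_query():
    # Stand-in workload: replace with a real report or query request.
    time.sleep(0.01)


summary = measure_under_load(sample_report_query, concurrent_users=5, requests_per_user=4)
print(summary)
```

Recording the same summary for each release gives the release-over-release series of measurements that the paragraph above refers to.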
Overall Strategy
The emphasis on rapid localization of either data or performance problems in complex data warehouse architectures provides a key tool for promoting efficiencies in the development cycle, shortening build cycles and meeting release targets, and delivering high-quality Data Warehouse architectures.
About RTTS
RTTS offers the most comprehensive suite of quality assurance services to help organizations drive positive results from their critical software projects. Headquartered in New York, NY, our expert team has worked closely with over 400 clients to improve their testing processes, tool knowledge, and application deployment outcomes. RTTS was founded in 1996 and has forged partnerships with the world's leading test tool vendors. Our satellite locations are in Philadelphia, Atlanta, and Phoenix, and many of our consulting and education services are offered through the cloud. No matter where you are, RTTS will ensure application functionality, performance, scalability, and security for your organization.
About QuerySurge
RTTS's team of test experts developed QuerySurge to address the unique testing needs in the data warehousing space. It has been implemented on projects ranging from data warehousing and ETL processes to data migrations and database upgrades. QuerySurge can verify as much as 100% of all data from source systems, through the ETL process, to the target data warehouse. The tool has increased test coverage and reduced test cycle time for several organizations, helping them to mitigate risk and meet business requirements. QuerySurge is offered exclusively by RTTS, as are any accompanying Data Warehouse Testing services. To learn more or to schedule a private demonstration, please contact us by e-mail at info@rttsweb.com. To speak with our Sales team, please call (212) 240-9050.