Flat File Testing

FLAT FILE TESTING

Testing of delimited files such as CSV files or fixed width flat files.
What are Flat Files?
Flat files are extensively used for exchanging data between enterprises or between
organizations within an enterprise. Flat files come in two forms: delimited files,
such as CSV (comma-separated) files, and fixed-width files.
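
To make the two forms concrete, here is a minimal Python sketch that parses the same records once as a delimited file and once as a fixed-width file. The sample data, the column names, and the offsets in FIXED_LAYOUT are assumptions made up for illustration; a real file would come with its own layout specification.

```python
import csv
import io

# Hypothetical sample records; names and values are illustrative only.
RECORDS = [(1, "Alice", 10.5), (2, "Bob", 7.25)]

# The same records as a comma-delimited file with a column heading row.
delimited_text = "id,name,amount\n" + "".join(
    f"{i},{name},{amount:.2f}\n" for i, name, amount in RECORDS
)

# The same records as a fixed-width file: 5-char id, 10-char name, 7-char amount.
fixed_width_text = "".join(
    f"{i:05d}{name:<10}{amount:07.2f}\n" for i, name, amount in RECORDS
)

# Assumed fixed-width layout: (column name, start offset, end offset).
FIXED_LAYOUT = [("id", 0, 5), ("name", 5, 15), ("amount", 15, 22)]


def parse_delimited(text):
    """Parse a comma-delimited file that has a column heading row."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_fixed_width(text, layout):
    """Slice each line at the offsets given in the layout and trim padding."""
    rows = []
    for line in text.splitlines():
        rows.append({name: line[start:end].strip() for name, start, end in layout})
    return rows


delimited_rows = parse_delimited(delimited_text)
fixed_rows = parse_fixed_width(fixed_width_text, FIXED_LAYOUT)
```

In both cases every value arrives as a string, which is why data type checks have to be done by the consumer of the file.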

What is Flat File Testing?


Flat File testing is the process of validating the quality of data in the flat file
as well as ensuring that the data in the flat file has been consumed appropriately
by the application or ETL process.

Challenges in Flat File Testing


Testing of inbound flat files presents unique challenges because the producer of
the flat file is usually a different organization within the enterprise or an
external vendor. Consequently, the format and content of the files may vary, since
there is no easy way to enforce data type and data quality constraints on the data
in flat files. Issues in flat file data can cause failures in the consuming
process. While file processing requirements differ from project to project, this
use case lists some of the common checks that should be performed when validating
flat files.

Flat File Testing Categories


File Ingestion Testing
Data Type Testing
Data Quality Testing
Data Completeness Testing
Data Transformation Testing
Performance Testing
FLAT FILE INGESTION TESTING
When data is moved using flat files between enterprises or between organizations
within an enterprise, it is important to perform a set of file ingestion
validations on the inbound flat files before consuming the data in those files.
File name validation
Files are transferred by FTP or copied over to a specific folder for processing.
These files usually follow a specific naming convention so that the process
consuming the file can identify its contents and date. From a testing standpoint,
the file name pattern needs to be validated to verify that it meets the requirement.

Example: A government agency gets files from multiple vendors on a periodic
basis. The arriving files should follow a naming convention of
'CompanyCode_ContentType_DateTimestamp.csv'. However, the files coming in from a
specific vendor do not have the correct company code in the file name.
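
A file name check like the one in this example can be sketched with a regular expression. The source only gives the convention 'CompanyCode_ContentType_DateTimestamp.csv'; the exact shape of each part (alphanumeric company code, alphabetic content type, 14-digit YYYYMMDDHHMMSS timestamp) and the set of known company codes are assumptions made for this sketch.

```python
import re
from datetime import datetime

# Assumed shape of 'CompanyCode_ContentType_DateTimestamp.csv':
# alphanumeric company code, alphabetic content type, 14-digit timestamp.
FILE_NAME_PATTERN = re.compile(
    r"^(?P<company>[A-Za-z0-9]+)_(?P<content>[A-Za-z]+)_(?P<stamp>\d{14})\.csv$"
)


def validate_file_name(file_name, known_company_codes):
    """Return a list of problems found in the file name (empty means valid)."""
    match = FILE_NAME_PATTERN.match(file_name)
    if match is None:
        return ["name does not match the expected pattern"]
    problems = []
    if match.group("company") not in known_company_codes:
        problems.append("unknown company code: " + match.group("company"))
    try:
        # Assumed timestamp format: YYYYMMDDHHMMSS.
        datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S")
    except ValueError:
        problems.append("timestamp is not a valid date and time")
    return problems


# Hypothetical vendor codes and file names, for illustration only.
KNOWN_CODES = {"ACME", "GLOBEX"}
good = validate_file_name("ACME_Claims_20240131093000.csv", KNOWN_CODES)
bad_code = validate_file_name("ACMEE_Claims_20240131093000.csv", KNOWN_CODES)
bad_date = validate_file_name("ACME_Claims_20240230093000.csv", KNOWN_CODES)
```

Returning a list of problems instead of a single pass/fail flag lets a test report every violation in a file name at once.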
Size and Format of the flat files
Although flat files are generally delimited or fixed width, it is common to have a
header and footer in these files. Sometimes these headers contain a row count that
can be used to verify that the file contains all of the expected data.

Some of the relevant checks are:


Verify that the size of the file is within the expected range where applicable.
Verify that the header, footer and column heading rows have the expected format and have the expected location within the flat file.
Perform any row count checks to cross check the data in the header with the values in the delimited data.

Example: A financial reporting company generates files with a header that contains
the summary amount with the line items having the detailed split. The sum of the
amounts in the line items should match the summary amount in the header.
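
The row count and summary amount checks can be sketched as follows. The record layout used here (a header line 'H|row_count|total_amount' followed by detail lines 'D|item|amount') is a hypothetical one chosen for illustration; the source does not specify a layout.

```python
from decimal import Decimal


def check_header_against_details(text):
    """Compare the header's row count and summary amount with the detail rows.

    Assumed layout: first line 'H|<row_count>|<total_amount>',
    detail lines 'D|<item>|<amount>'. Returns a list of problems.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    record_type, declared_count, declared_total = lines[0].split("|")
    if record_type != "H":
        return ["first line is not a header record"]
    details = [line.split("|") for line in lines[1:] if line.startswith("D|")]
    problems = []
    if len(details) != int(declared_count):
        problems.append(
            f"row count mismatch: header says {declared_count}, found {len(details)}"
        )
    # Decimal avoids the rounding noise that float sums of amounts can show.
    actual_total = sum((Decimal(fields[2]) for fields in details), Decimal("0"))
    if actual_total != Decimal(declared_total):
        problems.append(
            f"amount mismatch: header says {declared_total}, "
            f"detail rows sum to {actual_total}"
        )
    return problems


# Hypothetical files: one complete, one truncated during transfer.
complete_file = "H|3|60.00\nD|A100|10.00\nD|A101|20.00\nD|A102|30.00\n"
truncated_file = "H|3|60.00\nD|A100|10.00\nD|A101|20.00\n"

complete_problems = check_header_against_details(complete_file)
truncated_problems = check_header_against_details(truncated_file)
```

A truncated file fails both checks here, which is the situation the header row count is meant to catch.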
File arrival, processing and deletion times
Files arrive periodically in a specific network folder or an FTP location before
being consumed by a process. Usually, there are specific requirements regarding
file arrival time, order of arrival, and retention.

Example: A pharma company gets a set of files from a vendor on a daily basis. The
process consuming these files expects the complete set of files to be available
before processing.
1. A file that was supposed to arrive yesterday was delayed. It came in some time
after today's file arrived, causing issues because the files were processed out of
order.
2. After a file is processed, it is supposed to be moved to a specific directory,
retained there for a specified period of time, and then deleted. However, the file
did not get copied over.
3. Compare the transformed data in the target table with the expected values for
the test data.
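
The arrival order and retention issues in the example above can be detected with a few simple checks. This is a sketch under stated assumptions: the file names, the tuple layout of the arrivals list, and the idea of passing in the archive listing are all hypothetical; in practice the arrival times might be taken from file modification times and the archive listing from the retention directory.

```python
from datetime import date, datetime


def find_out_of_order(arrivals):
    """Return names of files that arrived after a file with a later business date.

    arrivals: list of (file_name, business_date, arrival_time) tuples.
    """
    late = []
    latest_business_date = None
    # Walk the files in the order they actually arrived.
    for name, business_date, _arrived in sorted(arrivals, key=lambda a: a[2]):
        if latest_business_date is not None and business_date < latest_business_date:
            late.append(name)
        else:
            latest_business_date = business_date
    return late


def missing_from_archive(processed_names, archived_names):
    """Return processed files that were not moved to the archive directory."""
    return sorted(set(processed_names) - set(archived_names))


def past_retention(archived, retention_days, today):
    """Return archived files older than the retention period.

    archived: dict mapping file name to the date it was archived.
    """
    return sorted(
        name for name, archived_on in archived.items()
        if (today - archived_on).days > retention_days
    )


# Hypothetical data: yesterday's file arrives after today's file.
arrivals = [
    ("sales_20240102.csv", date(2024, 1, 2), datetime(2024, 1, 2, 6, 0)),
    ("sales_20240101.csv", date(2024, 1, 1), datetime(2024, 1, 2, 9, 30)),
]
late_files = find_out_of_order(arrivals)
not_archived = missing_from_archive(
    ["sales_20240101.csv", "sales_20240102.csv"], ["sales_20240102.csv"]
)
expired = past_retention(
    {"sales_20231101.csv": date(2023, 11, 1), "sales_20240101.csv": date(2024, 1, 1)},
    retention_days=30,
    today=date(2024, 1, 3),
)
```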
Automate file ingestion testing using ETL Validator
ETL Validator comes with the Component Test Case and File Watcher, which can be
used to test flat files.

Flat File Component: The Flat File Component is part of the Component Test Case. It
can be used to define data type and data quality rules on the incoming flat file.
The data in the flat file can also be compared with data from the database.
File Watcher: Using File Watcher, test plans can be triggered automatically when
a new file comes into a directory, so that the test cases on the file are
executed before the file is used further by the consuming process.
SFTP Connection: Makes it easy to compare and validate flat files located in a
remote SFTP location.