ETL Testing - NULL Validations
ETL Testing : Null Validations
August 9, 2017
Slowly Changing Dimensions
Test Cases for SCD Type 2
August 16, 2017
Show all

How to compare data between Source and Target?

DataSources

Based on the volumes of the data and the coverage of data for your test cases you need to decide, on which tool the comparison should be done.

In any case, it is better to automate the process. But, if it is a onetime comparison and if the data is not more than 10-30 records, you can do the tests and verify them with the naked eye.

So, there are usually two options which are considered one is Excel and the other is the Database. Please note there are some automated tools and packages with the tools which do this, but assuming we don’t have any right now then we have to go with either Excel or Database.

When it comes to Excel, it is pretty easy, straight forward if you know the basic usage and the formulas and connecting to the data sources. But, we have a limitation of the volumes or the number of records to compare and if the volumes are high (which is common, now a days) then the excel starts to crash and you would have to redo the work over and over again.

So, the option left for us here is database. Now, it is not mandatory that your source or target or reference data is always a relational type. So, you need to load the various other data source formats into the database first.

Once you have all the related data to compare, you should start with the queries and applying the related transformation and filter conditions based on the Business Requirements.

SELECT
*
FROM
source_tablename
Where condition or joins (based on business rule and transformation)

MINUS

SELECT
*
FROM
target_tablename

And the other way (Vice Versa)

SELECT
*
FROM
target_tablename

MINUS

SELECT
*
FROM
source_tablename
Where condition or joins

For both of these queries, the output should be zero records. You also have to consider the where conditions and joins here which will exclude and include additional records for your testing and review of the Business Requirements.

If it is done with the database specific queries, then you can save the queries, upload them as your test execution effort and re-purpose them as and when required and as many number of times as possible during the regressions and re-tests.

There will be situations where you might not be able to do all the comparison at the database level. If you take a case, where the source and target are different databases we can copy the data into an excel sheet and compare using some formulas or we can copy data into notepad++ and compare using compare plugins like Beyond Compare and multiple other comparison tools.

Another option is to import source table into a temp table in the target database and perform minus query as shown above.

Leave a Reply

Your email address will not be published. Required fields are marked *

Share
+1
Tweet
Pin
Share