SAP Data Services - Data Integrity Check
SAP Data Services is my primary ETL (Extract, Transform, Load) tool for moving data in batches from one enterprise software system to another. Back in 2014 I noticed a gap in its functionality for verifying that your data actually made it from the source system to the target, especially when heavy transformation and/or multiple data flows were involved. For example: how do you know that the sum of your sales transactions made it into the weekly sales bucket in your data warehouse?
Data Services does have some auditing functionality built into its data flows, but it is simplistic and clunky to use. So, I created a reusable custom solution, which consists of:
- A database table in an ETL support database to hold data about each use of the functions
- Two custom functions in Data Services:
  - An insert function to store the job execution and source value in the table
  - An update function to store the target value and check the source/target variance against the specified error and warning tolerance levels
- Two scripts in the Data Services job:
  - An initialize script that captures the source value and calls the insert function
  - A finalize script that captures the target value and calls the update function
    - The finalize script can also take additional action, like emailing someone if there is a problem
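To make the moving parts concrete, here is a rough sketch of the insert/update pattern in Python with SQLite standing in for the ETL support database. The table name, column names, and percentage-based tolerance semantics are my assumptions for illustration, not the actual Data Services implementation on GitHub:

```python
import sqlite3

# In-memory stand-in for the ETL support database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE etl_integrity_check (     -- hypothetical table name
        job_name     TEXT,
        run_id       INTEGER,
        source_value REAL,
        target_value REAL,
        status       TEXT
    )
""")

def insert_check(job_name, run_id, source_value):
    """Called by the initialize script: record the job execution
    and the source-side value (e.g. sum of sales transactions)."""
    conn.execute(
        "INSERT INTO etl_integrity_check (job_name, run_id, source_value) "
        "VALUES (?, ?, ?)",
        (job_name, run_id, source_value),
    )

def update_check(job_name, run_id, target_value, warn_pct=0.5, error_pct=2.0):
    """Called by the finalize script: record the target-side value and
    grade the source/target variance against the tolerance levels.
    Tolerance defaults here are arbitrary placeholders."""
    (source_value,) = conn.execute(
        "SELECT source_value FROM etl_integrity_check "
        "WHERE job_name = ? AND run_id = ?",
        (job_name, run_id),
    ).fetchone()
    variance = abs(target_value - source_value) / abs(source_value) * 100
    if variance > error_pct:
        status = "ERROR"      # finalize script could email someone here
    elif variance > warn_pct:
        status = "WARNING"
    else:
        status = "OK"
    conn.execute(
        "UPDATE etl_integrity_check SET target_value = ?, status = ? "
        "WHERE job_name = ? AND run_id = ?",
        (target_value, status, job_name, run_id),
    )
    return status

# Usage: source total captured up front, target total checked at the end.
insert_check("WeeklySales", 1, 100000.00)
status = update_check("WeeklySales", 1, 99900.00)  # 0.1% variance -> "OK"
```

In the real solution the equivalent logic lives in Data Services custom functions and scripts, but the shape is the same: capture early, compare late, and classify the difference.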
To see a more detailed explanation and code examples, head over to the project on GitHub. Let me know if you run into any problems or have suggestions for improvement!