The solution covers several phases: data ingestion, data validation, and Slowly Changing Dimensions (SCD) data processing. It combines multiple data frameworks, such as Generic Data Ingestion, Data Validation, and SCD Type 1 and Type 2, each of which is easily configurable, customizable, and deployable in any Microsoft Azure environment.
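For readers less familiar with the two SCD types named above, the following is a minimal, framework-free sketch of how they differ. It is not the vDataAid implementation: the function names, the `valid_from`, `valid_to`, and `is_current` columns, and the use of plain Python dictionaries in place of Spark tables are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical bookkeeping columns used by the Type 2 sketch.
SCD2_COLUMNS = ("valid_from", "valid_to", "is_current")


def apply_scd1(dimension, incoming, key):
    """Type 1: overwrite the matching row in place, keeping no history."""
    by_key = {row[key]: dict(row) for row in dimension}
    for row in incoming:
        by_key[row[key]] = dict(row)
    return list(by_key.values())


def apply_scd2(dimension, incoming, key, today):
    """Type 2: close the current version and append a new one."""
    result = [dict(row) for row in dimension]  # copy so the input is untouched
    current = {row[key]: row for row in result if row["is_current"]}
    for row in incoming:
        existing = current.get(row[key])
        if existing is not None:
            tracked = {k: v for k, v in existing.items() if k not in SCD2_COLUMNS}
            if tracked == row:
                continue  # nothing changed, so no new version is needed
            existing["valid_to"] = today
            existing["is_current"] = False
        new_row = {**row, "valid_from": today, "valid_to": None, "is_current": True}
        result.append(new_row)
        current[row[key]] = new_row
    return result
```

With Type 1, a changed attribute simply replaces the old value. With Type 2, the old row is retained with an end date, so earlier states of the dimension remain queryable.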
The vDataAid solution is built with Azure Data Factory for data ingestion and a Spark notebook for data validation. The Azure data integration pipeline is a single generic pipeline for ingestion and validation, driven entirely by metadata.
For instance, the first step in configuring any data source is to capture the ingestion details, such as source and target paths and the objects to be ingested, in pre-configured metadata tables. The single generic pipeline then handles ingestion, validation, and SCD transformations for all objects, so there is no need to create and maintain multiple pipelines.
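The metadata-driven idea can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the actual Azure Data Factory pipeline: the `IngestionConfig` fields, the `run_generic_pipeline` function, and the three step callables are hypothetical names standing in for the metadata table columns and the pipeline activities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestionConfig:
    """One row of a hypothetical metadata table describing an object to ingest."""
    source_path: str
    target_path: str
    object_name: str
    scd_type: int  # 1 or 2; assumed column selecting the SCD strategy


def run_generic_pipeline(metadata, ingest, validate, transform):
    """Run the same three steps for every metadata row.

    The step functions are passed in, so supporting a new object means
    adding a metadata row rather than writing a new pipeline.
    """
    summary = {}
    for cfg in metadata:
        rows = ingest(cfg)                    # e.g. copy from source to target
        valid_rows = validate(cfg, rows)      # e.g. schema and null checks
        summary[cfg.object_name] = transform(cfg, valid_rows)  # e.g. SCD merge
    return summary
```

A caller would load the metadata rows once and supply the concrete steps, for example `run_generic_pipeline(rows, ingest=copy_object, validate=check_rows, transform=merge_scd)`, where the three step names are again placeholders.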