data engineering
13 August 2023
This article explores the importance of data lineage, which tracks the flow and transformations of data from source to destination, playing a vital role in ensuring data integrity and transparency in data processes.
20 June 2023
In this blog, we explore how to ensure data quality in a Spark Scala ETL (Extract, Transform, Load) job. To achieve this, we leverage Deequ, an open-source library, to define and enforce various data quality checks.
12 May 2023
This blog delves into the importance of data quality and provides insight into how Data and MLOps Engineers can ensure that quality is maintained throughout the system lifecycle.