Session: Open Source for Data Quality
You’ve probably heard the expression, “Garbage in, garbage out.” In practice, engineers and data scientists often have to create business value from imperfect data. In this talk we’ll explore open source projects and methodologies you can use to build valuable data pipelines with real-world data. We’ll cover data anonymization, pipeline testing, and automated validation, with a focus on action items you can bring to your current or future projects.