TREATING DATA AS A PRODUCT
How Snowplow approaches enforced workflows
Validating data up front enforces workflows around the ruleset of definitions.
At Snowplow, we have spent time thinking through what these workflows look like.
Snowplow is a first-party data delivery platform that validates events in the
pipeline prior to loading to targets. Good events load to the warehouse (and
other targets) while bad events are stored for debugging and reprocessing.
Snowplow tracking can also be versioned – definitions can be updated
according to semantic versioning, and every change is reflected automatically
in the warehouse table structure.
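To make the good/bad split concrete, here is a minimal sketch of up-front validation against a versioned ruleset. It is an illustration only, not Snowplow's implementation: the `RULES` table, the `button_click` event, its fields, and the `validate` and `route` helpers are all hypothetical, and the hand-rolled type check stands in for a real schema validator.

```python
# Hypothetical ruleset: one event definition keyed by (name, version).
# Each field maps to (expected type, required?).
RULES = {
    ("button_click", "1.0.0"): {
        "button_id": (str, True),
        "label": (str, False),
    },
}


def validate(event):
    """Return a list of validation errors; an empty list means the event is good."""
    key = (event.get("name"), event.get("version"))
    rules = RULES.get(key)
    if rules is None:
        return [f"no definition for {key}"]
    data = event.get("data", {})
    if not isinstance(data, dict):
        return ["data must be an object"]
    errors = []
    for field, (expected_type, required) in rules.items():
        if field not in data:
            if required:
                errors.append(f"missing required field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    # Reject keys that the definition does not declare.
    errors.extend(f"unexpected field: {f}" for f in data if f not in rules)
    return errors


def route(events):
    """Split events into good ones (to load) and bad ones (kept with their errors)."""
    good, bad = [], []
    for event in events:
        errors = validate(event)
        if errors:
            # Keep the original payload so it can be debugged and reprocessed.
            bad.append({"event": event, "errors": errors})
        else:
            good.append(event)
    return good, bad
```

Storing each failed event alongside its error list, rather than dropping it, is what makes later debugging and reprocessing possible in this sketch.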
Typical tracking workflow:
1. Collaborate in a tracking design workbook
2. Upload the rules (event and entity definitions) to the pipeline
3. Test tracking against these rules in a sandbox environment
4. Set up integrated tests to ensure each code push takes analytics into account
5. Set up alerting for any spike in events failing validation
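The alerting step could be approximated by comparing the current share of events failing validation with a recent baseline. The sketch below is one possible approach, not a description of Snowplow's tooling: the function names, the `factor` multiplier, and the `floor` threshold are assumptions chosen for illustration.

```python
from statistics import mean


def failure_rate(failed, total):
    """Share of events that failed validation in a time window."""
    return failed / total if total else 0.0


def is_spike(baseline_rates, current_rate, factor=3.0, floor=0.01):
    """Flag the current window when its failure rate exceeds both an
    absolute floor and `factor` times the mean of the baseline windows."""
    if not baseline_rates:
        # No history yet: fall back to the absolute floor alone.
        return current_rate > floor
    return current_rate > max(floor, factor * mean(baseline_rates))
```

For example, `is_spike([0.01, 0.02, 0.015], failure_rate(120, 1000))` flags a window in which 12% of events fail against a baseline of roughly 1.5%. The floor keeps very quiet pipelines from alerting on a handful of failures.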
Summary
The case for enforced rulesets:
• Front-end devs don’t need to interpret an unenforced event dictionary packed full of naming conventions
• Consumers of the raw data don’t need to guess what keys and values mean
• High-quality analytics in every code push, thanks to the wealth of QA tooling available for machine-readable rulesets
• Far less data cleaning required, since data is validated up front