workingwithdata_ebook_april21_awc2op 4

28.04.2021

TREATING DATA AS A PRODUCT

How Snowplow approaches enforced workflows

Validating data up front enforces workflows around the ruleset of definitions. At Snowplow, we have done some thinking around these workflows.

Snowplow is a first-party data delivery platform that validates events in the pipeline prior to loading to targets. Good events load to the warehouse (and other targets) while bad events are stored for debugging and reprocessing. Snowplow tracking can also be versioned – definitions can be updated according to semantic versioning with all changes automatically manifesting in the warehouse table structure.

Typical tracking workflow:

1 Collaborate in a tracking design workbook
2 Upload the rules (event and entity definitions) to the pipeline
3 Test tracking against these rules in a sandbox environment
4 Set up integrated tests to ensure each code push takes analytics into account
5 Set up alerting for any spike in events failing validation

Summary

The case for enforced rulesets:

• Front-end devs don’t need to interpret an unenforced event dictionary packed full of naming conventions
• Consumers of the raw data don’t need to guess what keys and values mean
• High quality analytics in every code push given the wealth of QA tooling that exists when working with machine readable rulesets
• Far less data cleaning required since data is validated up-front
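To make the idea of up-front validation concrete, here is a minimal sketch of the good/bad routing described in this section. It is not Snowplow's API: the `RULES` table, the `button_click` event, its fields, and the `Pipeline` class are all hypothetical names chosen for illustration, and a real pipeline would validate against full machine-readable schemas rather than a dictionary of Python types.

```python
from dataclasses import dataclass, field

# Hypothetical ruleset: (event name, version) -> required fields and their types.
# Version 1.1.0 adds a "position" field to the 1.0.0 definition.
RULES = {
    ("button_click", "1.0.0"): {"button_id": str, "page": str},
    ("button_click", "1.1.0"): {"button_id": str, "page": str, "position": int},
}


def validate(event: dict) -> list:
    """Return a list of validation errors; an empty list means the event is good."""
    key = (event.get("name"), event.get("version"))
    rules = RULES.get(key)
    if rules is None:
        return [f"no definition for {key[0]} version {key[1]}"]

    data = event.get("data", {})
    errors = []
    for field_name, expected_type in rules.items():
        if field_name not in data:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(data[field_name], expected_type):
            errors.append(f"wrong type for field: {field_name}")
    # Keys that the definition does not mention are rejected as well.
    for extra in sorted(data.keys() - rules.keys()):
        errors.append(f"unexpected field: {extra}")
    return errors


@dataclass
class Pipeline:
    """Routes good events onward and keeps bad events for debugging."""
    good: list = field(default_factory=list)
    bad: list = field(default_factory=list)

    def process(self, event: dict) -> bool:
        errors = validate(event)
        if errors:
            # Keep the original event with its errors so it can be reprocessed.
            self.bad.append({"event": event, "errors": errors})
            return False
        self.good.append(event)
        return True

    def failure_rate(self) -> float:
        """Share of processed events that failed validation (useful for alerting)."""
        total = len(self.good) + len(self.bad)
        return len(self.bad) / total if total else 0.0
```

In this sketch, an event sent as version 1.1.0 without `position` lands in `bad` with the error `missing field: position`, while the same payload sent as version 1.0.0 is accepted. A check on `failure_rate()` is one simple way to approximate the alerting on validation spikes mentioned in step 5.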

CHAPTER 4 REDUCING DATA DOWNTIME WITH DATA OBSERVABILITY

