28.04.2021 Views

workingwithdata_ebook_april21_awc2op 4

TREATING DATA AS A PRODUCT

How Snowplow approaches enforced workflows

Validating data up front enforces workflows around the ruleset of definitions.

At Snowplow, we have done some thinking around these workflows.

Snowplow is a first-party data delivery platform that validates events in the pipeline prior to loading to targets. Good events load to the warehouse (and other targets) while bad events are stored for debugging and reprocessing.
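The good/bad split described above can be illustrated with a minimal sketch. It is not Snowplow's implementation: the rule format, the `ADD_TO_CART_RULES` definition and the `validate` and `route` helpers are all hypothetical names invented for this example, and a real pipeline would typically validate against full JSON Schema definitions rather than a simple type map.

```python
from dataclasses import dataclass, field

# Hypothetical rule set for one event: field name -> (expected type, required?)
ADD_TO_CART_RULES = {
    "sku": (str, True),
    "quantity": (int, True),
    "currency": (str, False),
}


@dataclass
class Routed:
    good: list = field(default_factory=list)  # would be loaded to the warehouse
    bad: list = field(default_factory=list)   # stored with reasons for debugging


def validate(event: dict, rules: dict) -> list:
    """Return a list of failure messages; an empty list means the event is valid."""
    errors = []
    for name, (expected_type, required) in rules.items():
        if name not in event:
            if required:
                errors.append(f"missing required field: {name}")
        elif not isinstance(event[name], expected_type):
            errors.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    # Reject keys the definition does not know about.
    for name in event:
        if name not in rules:
            errors.append(f"unexpected field: {name}")
    return errors


def route(events: list, rules: dict) -> Routed:
    """Split events into good (valid) and bad (invalid, with failure reasons)."""
    routed = Routed()
    for event in events:
        errors = validate(event, rules)
        if errors:
            routed.bad.append({"event": event, "errors": errors})
        else:
            routed.good.append(event)
    return routed
```

Keeping the failure reasons next to each bad event is what makes later debugging and reprocessing practical: once the tracking or the definition is fixed, the stored events can be validated again.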

Snowplow tracking can also be versioned: definitions can be updated according to semantic versioning, with all changes automatically manifesting in the warehouse table structure.
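To make the versioning idea concrete, here is a rough sketch of how a change to a definition might be classified and mapped to a version bump. The classification policy (any removal, any change to an existing field, or a new required field counts as breaking) is an assumption chosen for illustration, not Snowplow's documented rules, and `classify_change`, `bump` and `new_columns` are hypothetical helpers.

```python
def classify_change(old: dict, new: dict) -> str:
    """Classify a definition change as 'major', 'minor' or 'patch'.

    Both arguments map field name -> (type name, required?).
    Assumed policy: removing a field, changing an existing field, or adding
    a required field is breaking; adding optional fields is not.
    """
    removed = old.keys() - new.keys()
    added = new.keys() - old.keys()
    changed = {k for k in old.keys() & new.keys() if old[k] != new[k]}
    if removed or changed or any(new[k][1] for k in added):
        return "major"
    if added:
        return "minor"
    # Same fields, e.g. only a description was edited.
    return "patch"


def bump(version: str, change: str) -> str:
    """Apply a semantic-versioning bump to a 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def new_columns(old: dict, new: dict) -> list:
    """Fields that would need a new column in the warehouse table."""
    return sorted(new.keys() - old.keys())
```

For example, adding an optional `currency` field to a definition at version `1.0.0` would be classified as `minor` under this policy, giving `1.1.0` and one new column.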

Typical tracking workflow:

1 Collaborate in a tracking design workbook

2 Upload the rules (event and entity definitions) to the pipeline

3 Test tracking against these rules in a sandbox environment

4 Set up integrated tests to ensure each code push takes analytics into account

5 Set up alerting for any spike in events failing validation
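Step 5 can be sketched as a simple threshold check on the share of events failing validation. This is one possible approach under stated assumptions: the window-based counts, the `factor` multiplier and the `floor` value are illustrative choices, and `failure_rate` and `is_spike` are hypothetical names rather than part of any Snowplow tooling.

```python
from statistics import mean


def failure_rate(good: int, bad: int) -> float:
    """Share of events in a time window that failed validation."""
    total = good + bad
    return bad / total if total else 0.0


def is_spike(history: list, current: float,
             factor: float = 3.0, floor: float = 0.01) -> bool:
    """Flag a spike when the current rate exceeds `factor` times the baseline.

    `history` holds failure rates from recent windows. `floor` is a minimum
    threshold so that tiny baselines do not trigger alerts on noise.
    """
    baseline = mean(history) if history else 0.0
    return current > max(baseline * factor, floor)
```

A scheduled job could compute `failure_rate` for the latest window and notify the team when `is_spike` returns `True`, for instance after a release that changes tracking code without updating the definitions.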

Summary

The case for enforced rulesets:

• Front-end devs don’t need to interpret an unenforced event dictionary packed full of naming conventions

• Consumers of the raw data don’t need to guess what keys and values mean

• High-quality analytics in every code push, given the wealth of QA tooling that exists when working with machine-readable rulesets

• Far less data cleaning required since data is validated up front
