Beam Enrich is the latest real-time enrichment platform developed at Snowplow Analytics. It takes as input a stream of raw data collected by the Scala Stream Collector, enriches it using scala-common-enrich, and outputs two streams: one of successfully enriched events and one of events that failed enrichment.
The raw data is read from the GCP PubSub topic that the Scala Stream Collector writes to. Successfully enriched events and events that failed enrichment are each written to their own PubSub topic.
It runs on GCP’s Dataflow.
Beam Enrich turns the raw events into TSV enriched events following our Canonical event model. These events are ready to be fed into the BigQuery Loader, which takes care of loading the data into BigQuery.
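
The routing described above, one input stream split into an enriched output and a failed output, can be illustrated with a minimal Python sketch. This is not Beam Enrich's code and does not use scala-common-enrich, PubSub, or Dataflow. The names `toy_enrich` and `route`, the `k=v&k=v` input format, and the three-column TSV layout are all assumptions made for illustration only.

```python
from typing import Callable, Iterable, List, Tuple


def toy_enrich(raw: str) -> str:
    """Hypothetical stand-in for the enrichment step.

    Parses a 'k=v&k=v' string and returns a three-column TSV line.
    The real raw payload and the real enriched TSV layout are different.
    """
    # A pair without '=' makes dict() raise ValueError, which we treat as a failure.
    fields = dict(pair.split("=", 1) for pair in raw.split("&"))
    if "e" not in fields:
        raise ValueError("missing event type 'e'")
    return "\t".join([fields["e"], fields.get("aid", ""), fields.get("p", "")])


def route(
    raw_events: Iterable[str],
    enrich: Callable[[str], str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Send each raw event to exactly one of two outputs.

    Returns (enriched, bad): enriched holds TSV lines, bad holds
    (original raw event, error message) pairs.
    """
    enriched: List[str] = []
    bad: List[Tuple[str, str]] = []
    for raw in raw_events:
        try:
            enriched.append(enrich(raw))
        except ValueError as err:
            # Keep the original input next to the reason it failed.
            bad.append((raw, str(err)))
    return enriched, bad


if __name__ == "__main__":
    good, failed = route(["e=pv&aid=shop&p=web", "aid=shop", "garbage"], toy_enrich)
    print(len(good), "enriched;", len(failed), "failed")
```

In this sketch every input ends up in exactly one output, so nothing is silently dropped, and the failed output keeps the original raw event alongside the error so it can be inspected later.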