Beam Enrich is an application that consumes raw data from the raw Pub/Sub topic (output by the collector). It validates the data against schemas stored in Iglu Central or in the user’s own schema registry (or registries), enriches the data using one or more enrichments, and then writes the processed data to the enriched Pub/Sub topic, from where it can be, for example, loaded into BigQuery.
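
The validate, enrich, write flow described above can be sketched in a few lines of Python. This is only a conceptual illustration, not Beam Enrich's actual code: the schema dictionary, the add_url_scheme enrichment, the field names, and the process function are all hypothetical, in-memory lists stand in for the Pub/Sub topics, and collecting rejected events in a separate list is an assumption made for the example.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Enrichment = Callable[[Event], Event]

# Hypothetical stand-in for a schema fetched from a schema registry:
# maps each required field to the type its value must have.
PAGE_VIEW_SCHEMA: Dict[str, type] = {"event_id": str, "page_url": str}


def validate(event: Event, schema: Dict[str, type]) -> bool:
    """Return True when every required field is present with the expected type."""
    return all(isinstance(event.get(name), expected) for name, expected in schema.items())


def add_url_scheme(event: Event) -> Event:
    """Toy enrichment (hypothetical): derive the URL scheme from page_url."""
    url = str(event["page_url"])
    scheme = url.split("://", 1)[0] if "://" in url else ""
    return {**event, "page_urlscheme": scheme}


def process(
    raw_events: List[Event],
    schema: Dict[str, type],
    enrichments: List[Enrichment],
) -> Tuple[List[Event], List[Event]]:
    """Validate each raw event, then apply every enrichment in order.

    Returns (enriched, rejected). The rejected list is an assumption of this
    sketch, used only to show that invalid events are not enriched.
    """
    enriched: List[Event] = []
    rejected: List[Event] = []
    for event in raw_events:
        if not validate(event, schema):
            rejected.append(event)
            continue
        for enrich in enrichments:
            event = enrich(event)
        enriched.append(event)
    return enriched, rejected


if __name__ == "__main__":
    raw = [
        {"event_id": "e1", "page_url": "https://example.com/home"},
        {"event_id": "e2"},  # missing page_url, so validation fails
    ]
    enriched, rejected = process(raw, PAGE_VIEW_SCHEMA, [add_url_scheme])
    print(len(enriched), "enriched;", len(rejected), "rejected")
```

In the real job, the input and output would be the raw and enriched Pub/Sub topics rather than lists, and validation would be performed against the schemas resolved from the configured registries.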

Beam Enrich needs to be set up as a Dataflow job. It is available as a ZIP archive (from the Snowplow Bintray) or as a Docker image (from Docker Hub). You can also build the container yourself from source with sbt docker:publishLocal, or build the archive from source with sbt universal:packageBin.