Pipeline Components and Applications

Enrich PubSub is a standalone JVM application that reads from and writes to PubSub topics. It can run anywhere, as long as it has permission to access the topics. For example, you can run it as a Kubernetes job, on a GCP compute instance, or even on your laptop.

Run Enrich PubSub

Enrich PubSub is published on Docker Hub:

docker pull snowplow/snowplow-enrich-pubsub:2.0.1

The Docker container can be run with the following command:

docker run \
  -it --rm \
  -v $PWD:/snowplow \
  -e GOOGLE_APPLICATION_CREDENTIALS=/snowplow/snowplow-gcp-account-11aa55ff6b1b.json \
  snowplow/snowplow-enrich-pubsub:latest \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver.json \
  --config /snowplow/config.hocon

The command above assumes that the current directory ($PWD), which is mounted into the container at /snowplow, contains the following:

  1. GCP credentials JSON file (snowplow-gcp-account-11aa55ff6b1b.json in the example)
  2. enrichments directory (possibly empty) containing all enrichment configuration JSONs
  3. Iglu Resolver configuration JSON (resolver.json)
  4. Enrich PubSub configuration HOCON (config.hocon)
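
The layout above can be checked before starting the container. The following is a minimal sketch, not part of Enrich PubSub itself: check_layout is a hypothetical helper name, and the file names resolver.json, config.hocon and enrichments are taken from the example command, so adjust them if your setup differs. The credentials file name is passed as an argument because it varies per service account.

```shell
#!/bin/sh
# Hypothetical preflight helper (an illustration, not shipped with Enrich PubSub).
# Usage: check_layout DIR CREDENTIALS_FILE_NAME
# Returns 0 if DIR contains everything the example docker run command expects,
# and 1 otherwise, printing each missing item to stderr.
check_layout() {
  dir="$1"
  creds="$2"
  missing=0

  # Regular files referenced by the example command.
  for f in "$creds" resolver.json config.hocon; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing file: $dir/$f" >&2
      missing=1
    fi
  done

  # The enrichments directory may be empty, but it has to exist.
  if [ ! -d "$dir/enrichments" ]; then
    echo "missing directory: $dir/enrichments" >&2
    missing=1
  fi

  return "$missing"
}
```

For example, check_layout "$PWD" snowplow-gcp-account-11aa55ff6b1b.json can be run first, and the container started only if it succeeds.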

Alternatively, you can download the jar file from the GitHub release and run it directly:

java -jar snowplow-enrich-pubsub-2.0.1.jar \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver.json \
  --config /snowplow/config.hocon