Run stream enrich

Stream Enrich can be run against any of the following message queues:

  • kinesis
  • kafka
  • nsq
  • stdin
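The chosen queue name is what gets substituted for ${message_queue} in the Docker image and fat jar names used later in this guide. As a minimal sketch, assuming the image naming pattern from the docker run command in section 1.1, a small helper can build the image reference and reject queue names that are not in the list above. The function name stream_enrich_image and the version 1.0.0 are hypothetical and exist only for illustration.

```shell
# Hypothetical helper (not part of Stream Enrich): build the Docker image
# reference for a given message queue and version.
# Assumption: images follow the pattern snowplow/stream-enrich-<queue>:<version>
# shown in the docker run command of this guide.
stream_enrich_image() {
  queue="$1"
  version="$2"
  case "$queue" in
    kinesis|kafka|nsq|stdin)
      printf 'snowplow/stream-enrich-%s:%s\n' "$queue" "$version"
      ;;
    *)
      # Anything outside the list in this section is treated as an error.
      printf 'unsupported message queue: %s\n' "$queue" >&2
      return 1
      ;;
  esac
}

# Example usage with a placeholder version number:
stream_enrich_image kinesis 1.0.0
# prints "snowplow/stream-enrich-kinesis:1.0.0"
```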

1. Run

1.1. Docker image (recommended)

With the configuration files in the path_to_config_dir directory:

docker run \
  -d \
  --name stream-enrich \
  --restart always \
  --log-driver awslogs \
  --log-opt awslogs-group=${log_group_name} \
  --log-opt awslogs-stream=`ec2metadata --instance-id` \
  --network host \
  -v ${path_to_config_dir}:/snowplow/config \
  -e "JAVA_OPTS=-Xms${heap_size} -Xmx${heap_size} -Dorg.slf4j.simpleLogger.defaultLogLevel=${log_level}" \
  snowplow/stream-enrich-${message_queue}:${version} \
  --config /snowplow/config/config.hocon \
  --resolver file:/snowplow/config/iglu_resolver.json \
  --enrichments file:/snowplow/config/enrichments/ \
  --force-cached-files-download
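The command mounts path_to_config_dir at /snowplow/config and then reads config.hocon, iglu_resolver.json and the enrichments/ directory from it. A pre-flight check along the following lines may help catch a missing file before the container starts. This is only a sketch: the function name check_config_dir is hypothetical, and the expected layout is inferred from the paths passed to --config, --resolver and --enrichments above.

```shell
# Hypothetical pre-flight check (not part of Stream Enrich).
# Assumption: the mounted directory must contain the three paths that the
# docker run command refers to under /snowplow/config.
check_config_dir() {
  dir="$1"
  missing=0
  for f in config.hocon iglu_resolver.json; do
    if [ ! -f "$dir/$f" ]; then
      printf 'missing file: %s/%s\n' "$dir" "$f" >&2
      missing=1
    fi
  done
  if [ ! -d "$dir/enrichments" ]; then
    printf 'missing directory: %s/enrichments\n' "$dir" >&2
    missing=1
  fi
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

It could be called as check_config_dir "${path_to_config_dir}" && docker run ..., so that the container is only started when the layout is complete.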

1.2. Fat jar

java -Dorg.slf4j.simpleLogger.defaultLogLevel=${log_level} \
  -jar snowplow-stream-enrich-${message_queue}-${version}.jar \
  --config config.hocon \
  --resolver file:iglu_resolver.json \
  --enrichments file:path/to/enrichments

2. Config in DynamoDB / Datastore

2.1. DynamoDB

When running with Kinesis, the configuration of the resolver and/or the enrichments can be stored in DynamoDB. In that case, use the dynamodb: prefix in place of the file: prefix:

--resolver dynamodb:eu-west-1/ConfigurationTable/resolver \
--enrichments dynamodb:eu-west-1/ConfigurationTable/enrichment

This example assumes that the resolver and the enrichments are stored in a table named ConfigurationTable in the eu-west-1 region, that the key of that table is id, that the resolver JSON is stored in the item whose key has the value resolver, and that the enrichments are stored in items whose keys have values beginning with enrichment.
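To make the region/table/key structure of these arguments concrete, the following sketch splits a dynamodb: argument into its three parts using POSIX parameter expansion. It illustrates the format described above and is not Stream Enrich's own parsing code; the function name parse_dynamodb_uri is hypothetical.

```shell
# Hypothetical illustration (not Stream Enrich's parser): split an argument of
# the form dynamodb:<region>/<table>/<key or key prefix> into its parts.
parse_dynamodb_uri() {
  uri="${1#dynamodb:}"   # drop the dynamodb: prefix
  region="${uri%%/*}"    # text before the first slash
  rest="${uri#*/}"       # text after the first slash
  table="${rest%%/*}"    # text before the second slash
  key="${rest#*/}"       # item key (resolver) or key prefix (enrichments)
  printf 'region=%s table=%s key=%s\n' "$region" "$table" "$key"
}

parse_dynamodb_uri dynamodb:eu-west-1/ConfigurationTable/resolver
# prints "region=eu-west-1 table=ConfigurationTable key=resolver"
```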

2.2. Datastore

When running with Google PubSub, the configuration of the resolver and/or the enrichments can be stored in Datastore. In that case, use the datastore: prefix in place of the file: prefix:

--resolver datastore:resolver/iglu \
--enrichments datastore:enrichment/enrich-

This example assumes that the resolver has the kind resolver and the key iglu, with its value stored in the “json” column. It also assumes that the enrichments have the kind enrichment and names starting with enrich-, with their values stored in the “json” column.
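The kind/name structure and the prefix matching for enrichments can be sketched in the same way. The helpers below are hypothetical illustrations of the format described above, not Stream Enrich's own code, and the entity name enrich-ip_lookups is an invented example.

```shell
# Hypothetical illustration: split datastore:<kind>/<name or name prefix>.
parse_datastore_uri() {
  uri="${1#datastore:}"  # drop the datastore: prefix
  kind="${uri%%/*}"      # text before the first slash
  name="${uri#*/}"       # entity name (resolver) or name prefix (enrichments)
  printf 'kind=%s name=%s\n' "$kind" "$name"
}

# Hypothetical illustration: succeed when entity name $1 starts with prefix $2,
# which is how the enrichments argument selects several entities at once.
has_name_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

parse_datastore_uri datastore:resolver/iglu
# prints "kind=resolver name=iglu"

# enrich-ip_lookups is an invented entity name used only for this example.
has_name_prefix enrich-ip_lookups enrich- && echo "selected as an enrichment"
```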