1. Dependencies
Running
Stream Enrich is distributed as an executable jar file. Provide the configuration file and the resolver file as parameters:
$ java -jar snowplow-stream-enrich-[targeted platform]-[version].jar --config my.conf --resolver file:resolver.json
Where [targeted platform] is one of:
- kinesis
- google-pubsub
- kafka
- nsq
- stdin
This starts the Stream Enrich app, which reads raw events from Kinesis, Google PubSub, Kafka, NSQ, or stdin, depending on the targeted platform, and writes enriched events back to Kinesis, Google PubSub, Kafka, NSQ, or stdout.
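The jar naming pattern above can be sketched as a small shell helper. This is only an illustration: the function name stream_enrich_jar and the version 1.0.0 are hypothetical placeholders, not part of Stream Enrich, and the platform list includes google-pubsub because that jar name is used in the Datastore example on this page.

```shell
# Hypothetical helper (not shipped with Stream Enrich): builds the jar file
# name for a platform and version, rejecting platforms this page does not list.
stream_enrich_jar() {
  local platform="$1" version="$2"
  case "$platform" in
    kinesis|google-pubsub|kafka|nsq|stdin) ;;
    *) echo "unknown platform: $platform" >&2; return 1 ;;
  esac
  if [ -z "$version" ]; then
    echo "version is required" >&2
    return 1
  fi
  printf 'snowplow-stream-enrich-%s-%s.jar\n' "$platform" "$version"
}

# Print (without running) the command for Kafka; 1.0.0 is a placeholder version.
echo "java -jar $(stream_enrich_jar kafka 1.0.0) --config my.conf --resolver file:resolver.json"
```

Validating the platform up front gives a clear error message instead of a "jar not found" failure from java.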
If you are using configurable enrichments, provide the path to your enrichments directory as a parameter:
$ java -jar snowplow-stream-enrich-[targeted platform]-[version].jar --config my.conf --resolver file:resolver.json --enrichments file:path/to/enrichments
If you are using Kinesis and storing the resolver and/or enrichments in DynamoDB, use the “dynamodb:” prefix in place of the “file:” prefix:
$ java -jar snowplow-stream-enrich-kinesis-[version].jar --config my.conf --resolver dynamodb:eu-west-1/ConfigurationTable/resolver --enrichments dynamodb:eu-west-1/ConfigurationTable/enrichment
The above command assumes that the enrichments and resolver are stored in a table named “ConfigurationTable” in eu-west-1, that the key for that table is “id”, that the resolver JSON is stored in an item whose key has value “resolver”, and that the enrichments are stored in items whose keys have values beginning with “enrichment”.
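To make the layout of the “dynamodb:” argument concrete, the sketch below splits it into region, table, and key. It reflects only the format visible in the example command (dynamodb:region/table/key); the function name parse_dynamodb_uri is hypothetical, and this is not how Stream Enrich itself parses the value.

```shell
# Hypothetical helper: split "dynamodb:<region>/<table>/<key>" into its parts.
# The format is inferred from the example command on this page.
parse_dynamodb_uri() {
  local uri="$1" rest region table key
  case "$uri" in
    dynamodb:*/*/*) ;;
    *) echo "not a dynamodb: URI: $uri" >&2; return 1 ;;
  esac
  rest="${uri#dynamodb:}"   # strip the prefix
  region="${rest%%/*}"      # text before the first slash
  rest="${rest#*/}"         # drop the region
  table="${rest%%/*}"       # text before the next slash
  key="${rest#*/}"          # remainder: item key (resolver) or key prefix (enrichments)
  printf 'region=%s table=%s key=%s\n' "$region" "$table" "$key"
}

parse_dynamodb_uri dynamodb:eu-west-1/ConfigurationTable/resolver
parse_dynamodb_uri dynamodb:eu-west-1/ConfigurationTable/enrichment
```

For the resolver the last segment names a single item, while for enrichments it acts as a prefix matched against item keys.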
If you are using Google PubSub and storing the resolver and/or enrichments in Datastore, use the “datastore:” prefix in place of the “file:” prefix:
$ java -jar snowplow-stream-enrich-google-pubsub-[version].jar --config my.conf --resolver datastore:resolver/iglu --enrichments datastore:enrichment/enrich-
The above command assumes that the resolver has kind “resolver” and has “iglu” as key, the value being stored in column “json”. It also assumes that the enrichments have kind “enrichment” and that their names start with “enrich-”, with their values being stored in column “json”.
Configuring the log level
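Stream Enrich logs through SLF4J. To change the log level, set the org.slf4j.simpleLogger.defaultLogLevel system property when starting the JVM: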
$ java -Dorg.slf4j.simpleLogger.defaultLogLevel=debug -jar snowplow-stream-enrich-[version].jar --config my.conf --resolver file:resolver.json
This also affects messages logged by the Kinesis Client Library, which Stream Enrich uses when running on Kinesis.
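As a minimal sketch, the log level flag can be built and checked in a wrapper script before the JVM starts. The function name log_level_flag is hypothetical; the accepted values are the levels documented for SLF4J's SimpleLogger (trace, debug, info, warn, error, off).

```shell
# Hypothetical helper: print the JVM flag for a given SimpleLogger level,
# failing early on values SimpleLogger does not document.
log_level_flag() {
  case "$1" in
    trace|debug|info|warn|error|off)
      printf '%s\n' "-Dorg.slf4j.simpleLogger.defaultLogLevel=$1" ;;
    *)
      echo "unsupported log level: $1" >&2
      return 1 ;;
  esac
}

# Print (without running) a command that starts Stream Enrich at debug level.
echo "java $(log_level_flag debug) -jar snowplow-stream-enrich-[version].jar --config my.conf --resolver file:resolver.json"
```

Checking the level in the wrapper surfaces typos such as "debgu" at startup rather than leaving the log level unchanged without any warning.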
All done?
Now you are ready to set up the S3 Loader to sink the enriched data from Kinesis to S3.