Running Google Cloud Storage Loader

Docker images can be found on Docker Hub.

The loader can be run with the command below. The GOOGLE_APPLICATION_CREDENTIALS variable is only needed when running outside GCP, and every flag from --outputFilenamePrefix onwards is optional.

docker run \
  -v $PWD/config:/snowplow/config \
  -e GOOGLE_APPLICATION_CREDENTIALS=/snowplow/config/credentials.json \
  snowplow/snowplow-google-cloud-storage-loader:0.3.2 \
  --runner=DataFlowRunner \
  --jobName=[JOB-NAME] \
  --project=[PROJECT] \
  --streaming=true \
  --zone=[ZONE] \
  --inputSubscription=projects/[PROJECT]/subscriptions/[SUBSCRIPTION] \
  --outputDirectory=gs://[BUCKET] \
  --outputFilenamePrefix=output \
  --shardTemplate=-W-P-SSSSS-of-NNNNN \
  --outputFilenameSuffix=.txt \
  --windowDuration=5 \
  --compression=none \
  --numShards=1 \
  --dateFormat=YYYY/MM/dd/HH/ \
  --labels='{"label": "value"}' \
  --partitionedOutputDirectory=gs://[BUCKET]/[SUBDIR]

--windowDuration is given in minutes, and --compression accepts gzip, bz2 or none.
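
The bracketed placeholders above have to be filled in by hand, and the subscription path is easy to get wrong. As a minimal sketch, the helpers below assemble the flags that are not marked optional from plain values. The function names subscription_path and loader_args are hypothetical and are not part of the loader or its Docker image.

```shell
#!/bin/sh
# Hypothetical helpers for filling in the placeholders of the loader command.
# They only print text; nothing here calls docker or GCP.

# Build the fully qualified Pub/Sub subscription path for --inputSubscription.
# $1 = GCP project ID, $2 = subscription name
subscription_path() {
  printf 'projects/%s/subscriptions/%s\n' "$1" "$2"
}

# Print the non-optional loader flags, one per line.
# $1 = job name, $2 = project, $3 = zone, $4 = subscription, $5 = bucket
loader_args() {
  printf '%s\n' \
    "--runner=DataFlowRunner" \
    "--jobName=$1" \
    "--project=$2" \
    "--streaming=true" \
    "--zone=$3" \
    "--inputSubscription=$(subscription_path "$2" "$4")" \
    "--outputDirectory=gs://$5"
}
```

With these defined, a run could look like `docker run snowplow/snowplow-google-cloud-storage-loader:0.3.2 $(loader_args my-job my-project europe-west1-b my-sub my-bucket)`, where all five values are made-up examples. The unquoted substitution relies on none of the values containing whitespace, and the -v and -e options are still needed when running outside GCP.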

To display the help message:

docker run snowplow/snowplow-google-cloud-storage-loader:0.3.2 \
  --help

To display documentation about Cloud Storage Loader-specific options:

docker run snowplow/snowplow-google-cloud-storage-loader:0.3.2 \
  --help=com.snowplowanalytics.storage.googlecloudstorage.loader.Options