Guide for devopsΒΆ

Dataflow runner is a zero-dependency binary and can be found at the following locations:

  • Linux: http://dl.bintray.com/snowplow/snowplow-generic/dataflow_runner_0.3.0_linux_amd64.zip
  • Windows: http://dl.bintray.com/snowplow/snowplow-generic/dataflow_runner_0.3.0_windows_amd64.zip
  • macOS: http://dl.bintray.com/snowplow/snowplow-generic/dataflow_runner_0.3.0_darwin_amd64.zip

Run it like so:

./dataflow-runner
NAME:
   dataflow-runner - Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR

USAGE:
   dataflow-runner [global options] command [command options] [arguments...]

VERSION:
   0.3.0

AUTHOR:
   Joshua Beemster <support@snowplowanalytics.com>

COMMANDS:
     up             Launches a new EMR cluster
     run            Adds jobflow steps to a running EMR cluster
     down           Terminates a running EMR cluster
     run-transient  Launches, runs and then terminates an EMR cluster
     help, h        Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --log-level value  logging level, possible values are debug,info,warning,error,fatal,panic (default: "info")
   --help, -h     show help
   --version, -v  print the version

COPYRIGHT:
   (c) 2016-2017 Snowplow Analytics Ltd

Once downloaded, unzip it (Linux for example):

$ unzip dataflow_runner_0.3.0_linux_amd64.zip