Getting started on Snowplow Open Source

Setup RDB Loader (post-R35)

Data is loaded from S3 to Redshift by two applications: RDB Shredder and RDB Loader.

Before you can load data into Redshift, you need to set up an S3 sink for enriched data.

After setting up the S3 Loader for enriched data, you need to set up a FIFO SQS queue so that RDB Shredder and RDB Loader can communicate.
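As an illustration, the FIFO queue could be created with boto3 roughly as follows. This is a minimal sketch, not the official setup procedure: the helper names, the queue name and the region are hypothetical, and enabling content-based deduplication is an assumption rather than a requirement stated on this page. The only hard constraints shown are SQS's own: a FIFO queue name must end in `.fifo` and the `FifoQueue` attribute must be set to `"true"`.

```python
from typing import Dict


def fifo_queue_params(base_name: str, content_based_dedup: bool = True) -> Dict[str, object]:
    """Build keyword arguments for the SQS CreateQueue call for a FIFO queue."""
    # SQS requires FIFO queue names to end with the ".fifo" suffix.
    name = base_name if base_name.endswith(".fifo") else base_name + ".fifo"
    attributes = {"FifoQueue": "true"}
    if content_based_dedup:
        # Assumption: deduplicate on message body; adjust if your setup differs.
        attributes["ContentBasedDeduplication"] = "true"
    return {"QueueName": name, "Attributes": attributes}


def create_fifo_queue(base_name: str, region: str) -> str:
    """Create the queue and return its URL (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    sqs = boto3.client("sqs", region_name=region)
    response = sqs.create_queue(**fifo_queue_params(base_name))
    return response["QueueUrl"]
```

For example, `create_fifo_queue("rdb-loader-messages", "eu-central-1")` would return the URL of a queue named `rdb-loader-messages.fifo`; both values are placeholders.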

Once the infrastructure is ready, you need to:

  1. Configure both applications
  2. Run RDB Loader as a long-running process with access to the message queue
  3. Schedule EMR jobs with S3DistCp and Shredder
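To make the last step more concrete, here is a hedged sketch of how the S3DistCp part of such an EMR job might be described and submitted with boto3. The function names, step name and S3 paths are hypothetical. The Shredder step is deliberately left out, because its jar location and arguments are not given in this section.

```python
from typing import Dict, List


def s3distcp_step(name: str, src: str, dest: str, delete_source: bool = False) -> Dict[str, object]:
    """Describe an EMR step that copies data between S3 locations with S3DistCp."""
    args: List[str] = ["s3-dist-cp", "--src", src, "--dest", dest]
    if delete_source:
        # Remove the source files once the copy has succeeded.
        args.append("--deleteOnSuccess")
    return {
        "Name": name,
        "ActionOnFailure": "CANCEL_AND_WAIT",
        # command-runner.jar is how EMR runs built-in tools such as s3-dist-cp.
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


def submit_steps(cluster_id: str, region: str, steps: List[Dict[str, object]]) -> List[str]:
    """Add steps to a running EMR cluster (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    emr = boto3.client("emr", region_name=region)
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return response["StepIds"]
```

A scheduler (cron, Airflow or similar) could then call `submit_steps` periodically with a step such as `s3distcp_step("Stage enriched data", "s3://my-enriched/sink/", "s3://my-enriched/archive/run=2021-01-01/")`, where the bucket names and run folder are placeholders.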