Loading transformed data

For a high-level overview of the RDB Loader architecture, of which the loader is a part, see RDB Loader.

Each loader application is specialised for a specific storage target. Each one performs three key tasks (sketched in code after this list):

  • Consume messages from SQS / SNS to discover information about transformed data: where it is stored and what it looks like.
  • Use the information from the message to determine whether any changes to the target table(s) are required, e.g. adding a column for a new event field. If so, submit the appropriate SQL statement for execution by the storage target.
  • Prepare and submit for execution the appropriate SQL COPY statement.
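
The sketch below illustrates these three tasks in deliberately simplified Scala, using the AWS SDK for SQS and plain JDBC only to keep it self-contained. The queue URL, connection details, table name, message shape and the `parseMessage` helper are invented placeholders rather than the loader's actual internals; real loaders also handle manifests, retries, transactions and message acknowledgement, all omitted here.

```scala
import java.sql.DriverManager
import scala.jdk.CollectionConverters._
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest

object LoaderSketch {

  // Hypothetical, simplified view of a "batch of transformed data is ready" message.
  // The real message is a self-describing JSON with considerably more detail.
  final case class TransformedBatch(
    dataLocation: String,      // where the transformed files live, e.g. an S3 prefix
    newColumns: List[String]   // event fields the target table has no column for yet
  )

  def main(args: Array[String]): Unit = {
    val queueUrl = "https://sqs.eu-central-1.amazonaws.com/123456789012/loader-queue" // placeholder
    val sqs      = SqsClient.create()
    val conn     = DriverManager.getConnection("jdbc:redshift://example:5439/snowplow", "loader", "secret") // placeholder

    // Task 1: consume a message describing where the transformed data is and what it looks like.
    val request  = ReceiveMessageRequest.builder().queueUrl(queueUrl).maxNumberOfMessages(1).build()
    val messages = sqs.receiveMessage(request).messages().asScala

    messages.foreach { msg =>
      val batch = parseMessage(msg.body()) // hypothetical decoding step

      // Task 2: if the batch contains fields the table does not know about, alter the table first.
      batch.newColumns.foreach { column =>
        conn.createStatement().execute(
          s"""ALTER TABLE atomic.events ADD COLUMN "$column" VARCHAR(4096)"""
        )
      }

      // Task 3: prepare and submit the COPY statement; the warehouse itself reads the files.
      conn.createStatement().execute(
        s"""COPY atomic.events
           |FROM '${batch.dataLocation}'
           |IAM_ROLE 'arn:aws:iam::123456789012:role/loader'
           |DELIMITER '\t'""".stripMargin
      )
    }
  }

  // Placeholder: a real loader decodes and validates the self-describing JSON payload here.
  private def parseMessage(body: String): TransformedBatch =
    TransformedBatch(dataLocation = "s3://transformed-bucket/run=2024-01-01-00-00-00/", newColumns = Nil)
}
```

Note that the heavy lifting of reading and parsing the files is done by the warehouse via COPY; the loader only orchestrates.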

For loading into Redshift, use the Redshift loader. This loads shredded data into multiple Redshift tables (one table per event or entity type, plus the atomic events table).
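
To make "multiple tables" concrete, here is a rough sketch of the kind of per-table COPY statements the Redshift loader submits for one batch. The table names, bucket, run folder and IAM role are placeholders; the real loader adds further options and wraps the copies in a transaction.

```scala
object RedshiftCopySketch {
  // Hypothetical: each shredded type in a batch gets its own COPY into its own table.
  val copyStatements: List[String] = List(
    """COPY atomic.events
      |FROM 's3://transformed-bucket/run=2024-01-01-00-00-00/output=good/'
      |IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-loader'
      |DELIMITER '\t' TRUNCATECOLUMNS ACCEPTINVCHARS""".stripMargin,
    """COPY atomic.com_acme_button_press_1
      |FROM 's3://transformed-bucket/run=2024-01-01-00-00-00/output=good/vendor=com.acme/name=button_press/'
      |IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-loader'
      |DELIMITER '\t' TRUNCATECOLUMNS ACCEPTINVCHARS""".stripMargin
  )
}
```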

For loading into Snowflake, use the Snowflake loader. This loads data in the wide row JSON format into a single Snowflake table. (Not to be confused with the Snowplow Snowflake Loader, a completely separate application that is not part of the RDB Loader architecture; in the long run, snowplow-snowflake-loader will be phased out in favour of RDB Loader.)
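
For contrast with the Redshift case, single-table wide row loading might look roughly like the statement below. The stage name, path and options are invented placeholders rather than the loader's exact SQL; the point is only that everything lands in one events table.

```scala
object SnowflakeCopySketch {
  // Hypothetical: one COPY INTO per batch, with all events landing in a single wide table.
  val copyStatement: String =
    """COPY INTO atomic.events
      |FROM @snowplow_stage/run=2024-01-01-00-00-00/output=good/
      |FILE_FORMAT = (TYPE = 'JSON')
      |MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE""".stripMargin
}
```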

For loading into Databricks, use the Databricks loader. This loads data in the wide row Parquet format into a single Databricks table.
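
A comparable, equally hypothetical sketch for Databricks: a single COPY INTO statement that reads Parquet files from the transformer's output into one table. Paths and options are placeholders, not the loader's exact SQL.

```scala
object DatabricksCopySketch {
  // Hypothetical: Parquet files from the transformer are copied into a single wide table.
  val copyStatement: String =
    """COPY INTO atomic.events
      |FROM 's3://transformed-bucket/run=2024-01-01-00-00-00/output=good/'
      |FILEFORMAT = PARQUET
      |COPY_OPTIONS ('mergeSchema' = 'true')""".stripMargin
}
```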
