RDB loader configuration reference

caution

You are reading documentation for an outdated version. Here’s the latest one!

Shredder and loader use different configurations starting from 2.0.0. An example config for loader can be found here.

This is a complete list of the options that can be configured:

region	AWS region of the S3 bucket. Optional if it can be resolved with the AWS region provider chain.
messageQueue	Required. Name of the SQS queue used by the shredder and loader to communicate.
jsonpaths	Optional. An S3 URI that holds JSONPath files.
storage.host	Required. Host name of Redshift.
storage.port	Required. Port of Redshift.
storage.database	Required. Name of the database.
storage.roleArn	Required. AWS Role ARN allowing Redshift to load data from S3.
storage.schema	Required. Redshift schema name, e.g. "atomic".
storage.username	Required. DB user with permission to load data.
storage.password	Required. Password of the DB user.
storage.jdbc.blockingRows	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.disableIsValidQuery	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.dsiLogLevel	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.filterLevel	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.loginTimeout	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.logLevel	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.socketTimeout	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.ssl	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.sslMode	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.sslRootCert	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.tcpKeepAlive	Optional. Refer to the Redshift JDBC driver reference.
storage.jdbc.tcpKeepAliveMinutes	Optional. Refer to the Redshift JDBC driver reference.
storage.maxError	Optional. Configures the Redshift MAXERROR load option. Default value 10.
monitoring.webhook.endpoint	Optional. An HTTP endpoint where monitoring alerts should be sent.
monitoring.webhook.tags	Optional. Custom key-value pairs which can be added to the monitoring webhooks. E.g. {"tag1": "label1"}
monitoring.snowplow.appId	Optional. When using Snowplow tracking, set this appId in the event.
monitoring.snowplow.collector	Optional. Set to a collector URL to turn on Snowplow tracking.
monitoring.sentry.dsn	Optional. For tracking runtime exceptions.
monitoring.statsd.hostname	Optional, for sending loading metrics (latency and event counts) to a statsd server.
monitoring.statsd.port	Optional, port of the statsd server.
monitoring.statsd.tags	Tags used to annotate the statsd metric with any contextual information. E.g. { "key1": "value1", "key2": "value2" }
monitoring.statsd.prefix	Optional, default “snowplow.rdbloader”. Configures the prefix of statsd metric names.
monitoring.folders.staging	Required if the folder monitoring section is included in the config. Path where the loader can store auxiliary logs for the periodic checks for unloaded or corrupted folders. The loader must be able to write here, and Redshift must be able to load from here.
monitoring.folders.period	Required if the folder monitoring section is included in the config. How often to check for unloaded or corrupted folders.
monitoring.folders.since	Required if the folder monitoring section is included in the config. Specifies from when folder monitoring will start to monitor.
monitoring.folders.until	Required if the folder monitoring section is included in the config. Specifies until when folder monitoring will monitor.
monitoring.folders.shredderOutput	Required if the folder monitoring section is included in the config. Path to the shredded archive.
monitoring.healthCheck.frequency (added in 2.1.0)	Optional. How often to run a periodic DB health check, which raises a warning if DB does not respond to a SELECT 1.
monitoring.healthCheck.timeout (added in 2.1.0)	Optional. How long to wait for a health check response.
retryQueue.period (added in 2.1.0)	Optional. Configures a backlog of recently failed folders that could be automatically retried. period is how often a batch of failed folders should be pulled into a discovery queue.
retryQueue.size (added in 2.1.0)	Required if retryQueue section is included. How many failures should be kept in memory. After the limit is reached, new failures are dropped.
retryQueue.maxAttempts (added in 2.1.0)	Required if retryQueue section is included. How many attempts to make for each folder. After the limit is reached, new failures are dropped.
retryQueue.interval (added in 2.1.0)	Required if retryQueue section is included. Artificial pause after each failed folder before being added to the retry queue.
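
To show how the dotted option names above could fit together, here is a minimal sketch of a loader config. It assumes the file is written in HOCON and that each dotted name maps to a nested object. Every value (queue name, host, ARN, credentials, hostnames, durations) is a hypothetical placeholder, and the duration-style strings such as "30 minutes" are an assumption about the expected format. A real config may need fields that are not listed in this reference, so treat this as an illustration rather than a working file.

```hocon
{
  # Can be omitted if the AWS region provider chain resolves it
  "region": "us-east-1",

  # Required: SQS queue shared by shredder and loader (placeholder name)
  "messageQueue": "example-loader-queue",

  "storage": {
    # All values below are placeholders
    "host": "example-cluster.redshift.amazonaws.com",
    "port": 5439,
    "database": "snowplow",
    "roleArn": "arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
    "schema": "atomic",
    "username": "loader_user",
    "password": "change-me",

    # Optional: documented default is 10
    "maxError": 10
  },

  "monitoring": {
    # Optional: send loading metrics to a statsd server
    "statsd": {
      "hostname": "localhost",
      "port": 8125,
      "tags": { "key1": "value1" },
      "prefix": "snowplow.rdbloader"
    }
  },

  # Optional section (2.1.0+); once present, size, maxAttempts
  # and interval are required. Duration formats are assumed.
  "retryQueue": {
    "period": "30 minutes",
    "size": 64,
    "maxAttempts": 3,
    "interval": "5 seconds"
  }
}
```

Leaving out the whole "retryQueue" or "monitoring" block should be fine, since those sections are optional. The "required if section is included" rule only applies once the section itself is present.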