Why is there a limit on throughput?
Each Snowplow application is deployed as a Docker image on a single EC2 instance. These instances, along with the streams themselves, are the limiting factors when it comes to throughput. We made this decision for the following reasons:
- We wanted to keep the costs of this experience low, and using ECS Fargate or Kubernetes would be more expensive
- A single EC2 instance per application is more than enough resource for a proof of concept and to get you started with our OSS
How do I shut down the pipeline?
If you would like to shut down your pipeline, you can do so by running `terraform destroy` from the directory containing your Terraform configuration.
Note that if you want to delete your S3 bucket and Postgres databases, you will need to do that from within the AWS Management Console. If you would rather keep them, you can – just be aware that the next time you spin up your pipeline, you might see errors while the script that creates the S3 bucket and Postgres databases is running.
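As a sketch of the standard Terraform teardown workflow (the directory path below is hypothetical – use wherever you ran `terraform apply` from):

```shell
# Move to the directory containing your Terraform configuration
cd ~/snowplow-quickstart   # hypothetical path – use your own checkout

# Review the resources that will be destroyed, then confirm with "yes"
terraform destroy

# To skip the interactive confirmation prompt (use with care):
terraform destroy -auto-approve
```

Remember that the S3 bucket and Postgres databases are not removed by this step, as noted above.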
How do I migrate to a production ready pipeline?
If you are at the point where you would like to deliver a production use case, we recommend that you fork the terraform modules in order to make the necessary changes. You should also make sure you are using the secure variant of the Quick Start. The extensive AWS documentation on Auto Scaling and high availability will guide you on what needs to be done to deliver a highly available, autoscaling pipeline.
How do I upgrade the version of an application that I am using?
We release new versions of our pipeline components very frequently; however, the versions used within the terraform modules are updated in line with our platform releases, since these are the most stable and recommended versions of our components. Sign up to get the latest updates on platform releases and new features.
When a new version of a module is released, follow these instructions to upgrade:
- Update the module version in your terraform
- Run `terraform plan` to check what changes will be made
- Run `terraform apply` to apply those changes
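The steps above can be sketched as follows (the module name and version shown are illustrative – substitute the module and release you are actually using):

```shell
# 1. In main.tf, bump the pinned module version, e.g.:
#      module "collector_kinesis" {
#        source  = "snowplow-devops/collector-kinesis-ec2/aws"
#        version = "0.6.0"   # <- set this to the newly released version
#        ...
#      }

# 2. Re-initialise so Terraform fetches the new module version
terraform init -upgrade

# 3. Review exactly what will change before applying
terraform plan

# 4. Roll out the upgrade
terraform apply
```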
With the standard deployment, you will have only a single collector instance. This means you will experience brief downtime during the upgrade, typically less than a minute. To prevent this, you will need to move to a multi-collector setup, so that there are multiple collector instances behind the load balancer.
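A minimal sketch of a multi-collector setup, assuming the collector module exposes autoscaling-group sizing inputs – the `min_size`/`max_size` names below are illustrative, so check the inputs documented for the module version you are using:

```hcl
# Fragment of main.tf – run more than one collector behind the load balancer.
module "collector_kinesis" {
  source  = "snowplow-devops/collector-kinesis-ec2/aws"
  version = "0.6.0"

  min_size = 2   # keep at least two instances, so one can continue serving
  max_size = 4   # traffic while another is being replaced during an upgrade

  # ... remaining required inputs unchanged ...
}
```

With two or more instances, the load balancer drains connections from the instance being replaced while the others keep accepting events, avoiding the downtime described above.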