Getting started on Snowplow Open Source

Run the collector as a single instance VM in GCP

To run the collector on a single GCP instance, you’ll first need to spin one up:

  • Go to the GCP dashboard, and once again, make sure your project is selected.
  • Click the hamburger menu in the top-left corner, and select Compute Engine, under Compute.
  • Enable billing if you haven’t already (until you do, the only option you’ll see at this point is a button to enable it).
  • Click “Create instance” and pick the appropriate settings for your case, making sure of at least the following:
    • Under Access scopes, select “Set access for each API” and enable “Cloud PubSub”
    • Under Firewall, select “Allow HTTP traffic”
    • Optional: click “Management, disk, networking, SSH keys” and, under Networking, add a Tag, such as “collector”. (This is needed to add a tagged Firewall rule, explained below.)
  • Click the hamburger menu in the top-left corner, and click on “VPC Network”, under Networking
  • On the sidebar, click on “Firewall rules”
  • Click “Create Firewall Rule”
  • Name your rule
  • Under Source filter pick “Allow from any source”
  • Under Protocols and ports add “tcp:8080”
    • Note that 8080 is the port assigned to the collector in the configuration file. If you choose another port here, make sure you change the config file to match.
  • Under Target tags add the Tag with which you labeled your instance (here, “collector”)
  • Click “Create”
  • Then click “Upload Files” and upload your configuration file
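
The instance and firewall steps above can also be approximated from the command line. The following is a minimal sketch, not the documented procedure: the function names, the instance name collector-vm, the zone europe-west1-b, and the rule name allow-collector-8080 are all hypothetical placeholders. It assumes the gcloud CLI is installed and pointed at your project. The functions only print the commands so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: each function prints a gcloud command for review.
# All names and values passed in are placeholders, not Snowplow defaults.

# Instance with the Pub/Sub access scope and a network tag.
# The http-server tag mirrors the console's "Allow HTTP traffic" checkbox,
# assuming the matching default firewall rule already exists in the project.
# Note that --scopes replaces the default scope set rather than adding to it.
instance_cmd() {
  local name="$1" zone="$2" tag="$3"
  printf '%s\n' "gcloud compute instances create $name --zone $zone --scopes pubsub --tags $tag,http-server"
}

# Firewall rule opening the collector port to any source,
# applied only to instances carrying the given tag.
firewall_cmd() {
  local tag="$1" port="$2"
  printf '%s\n' "gcloud compute firewall-rules create allow-$tag-$port --allow tcp:$port --source-ranges 0.0.0.0/0 --target-tags $tag"
}

instance_cmd collector-vm europe-west1-b collector
firewall_cmd collector 8080
```

Running the script prints two commands. If they look right for your project, copy them into your shell to execute them. Keep the port passed to firewall_cmd in step with the port in your collector configuration file.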

Once you have your config file in place, ssh into your instance:

$ gcloud compute ssh your-instance-name --zone your-instance-zone

And then run:

$ sudo apt-get update
$ sudo apt-get -y install default-jre
$ sudo apt-get -y install unzip
$ wget https://dl.bintray.com/snowplow/snowplow-generic/snowplow_scala_stream_collector_google_pubsub_<VERSION>.zip
$ gsutil cp gs://<YOUR-BUCKET-NAME>/<YOUR-CONFIG-FILE-NAME> .
$ unzip snowplow_scala_stream_collector_google_pubsub_<VERSION>.zip
$ java -jar snowplow-stream-collector-google-pubsub-<VERSION>.jar --config <YOUR-CONFIG-FILE-NAME>
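
Once the collector process is running, it can be useful to confirm that it answers on the configured port. The sketch below is a hypothetical helper, not part of the Snowplow distribution. It assumes curl is installed and that your collector version exposes a /health endpoint on the port from the config file (8080 here); check your collector's documentation if the request fails.

```shell
# Hypothetical helper: poll the collector's health endpoint until it answers.
# Arguments: base URL, number of attempts (default 10), delay in seconds (default 2).
wait_for_collector() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors as well as connection errors.
    if curl --silent --fail --max-time 5 "$url/health" >/dev/null 2>&1; then
      echo "collector is up at $url"
      return 0
    fi
    # Wait between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "collector did not respond at $url" >&2
  return 1
}
```

From a second SSH session on the instance, you might call it as wait_for_collector http://localhost:8080. From your own machine, replace localhost with the instance's external IP to check the firewall rule as well.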