Self Hosting the JavaScript Tracker on Google Cloud

We recommend self-hosting the Snowplow JavaScript Tracker, sp.js, as it has some definite advantages over using a third-party-hosted JavaScript:

  1. Hosting your own JavaScript allows you to use your own JavaScript minification and asset pipelining approach (e.g. bundling all JavaScripts into one minified JavaScript)
  2. As Douglas Crockford put it about third-party JavaScripts: “it is extremely unwise to load code from servers you do not control.”
  3. Renaming sp.js will ensure that Snowplow continues to function in the presence of ad/content blockers, which typically block sp.js (see e.g. EasyPrivacy)

Below is a guide for hosting the minified Snowplow Analytics sp.js asset on Google Cloud Storage. This is one recommended strategy for self-hosting; there are other options, for example bundling sp.js directly into your application or hosting it with a different provider.

The latest minified version of the Snowplow JavaScript Tracker, called sp.js, is available from the Snowplow GitHub repo.

Pre-requisites

For the purposes of this guide, we are going to assume that you want to rename and serve the standard sp.js from Google Cloud Storage. (We discuss other approaches in the Advanced options section below.) To accomplish this, you will need the following:

  • An account with Google Cloud
  • Some technical chops (not too many)

Once you have those ready, please read on…

Self-hosting instructions

Download and gzip tracker JavaScript file

1. Navigate to https://github.com/snowplow/snowplow-javascript-tracker/releases and download the latest version of the Snowplow JavaScript Tracker sp.js file

2. gzip the sp.js file to reduce its size and the associated cloud storage and egress costs.
With the gzip command, we will also rename sp.js to a random 8-character string (e.g. gh7rnghq.js) to reduce the chance of ad blockers preventing the script from loading.

From a Terminal/Command Prompt window, navigate to where you have downloaded sp.js then run the following command:
On macOS and Linux: gzip -c sp.js > gh7rnghq.js
On Windows: Download gzip binaries from http://gnuwin32.sourceforge.net/packages/gzip.htm then run gzip -c sp.js > gh7rnghq.js

N.B. gzipping and renaming the file are optional but highly recommended. If you did not gzip the file, you can skip step 11 in the Upload to Google Cloud Storage instructions.
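If you would rather script the rename and gzip step than run it by hand, the same idea can be sketched in Node.js. This is a minimal sketch under assumptions, not part of the Snowplow tooling: the helper names randomAssetName and gzipTracker are hypothetical, and the block only uses Node's built-in crypto and zlib modules.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Hypothetical helper: build a random lowercase alphanumeric file name,
// similar in shape to the "gh7rnghq.js" example used in this guide.
function randomAssetName(length = 8) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
  let name = '';
  for (let i = 0; i < length; i++) {
    // crypto.randomInt(max) returns an integer in the range [0, max).
    name += alphabet[crypto.randomInt(alphabet.length)];
  }
  return `${name}.js`;
}

// Hypothetical helper: gzip the tracker source (a string or Buffer) in memory.
function gzipTracker(source) {
  return zlib.gzipSync(source, { level: zlib.constants.Z_BEST_COMPRESSION });
}

// Example use against a downloaded sp.js (left commented out so the block
// has no side effects when loaded):
//   const fs = require('fs');
//   fs.writeFileSync(randomAssetName(), gzipTracker(fs.readFileSync('sp.js')));
```

The output of gzipTracker is equivalent in purpose to the result of gzip -c sp.js, so the Content-Encoding metadata step still applies to the uploaded file.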

Upload to Google Cloud Storage

1. Navigate to Google Cloud Console: https://console.cloud.google.com/

2. Ensure you are in the correct Google Cloud Project that you wish to host the Snowplow JavaScript Tracker in, using the Drop Down at the top of the page.

3. Then navigate to the Storage section of Google Cloud Console, either directly (https://console.cloud.google.com/storage/browser) or using the navigation bar.

4. Create a new bucket by clicking the “Create Bucket” button

5. Name your bucket, for example “company-name-sp-js” and select a suitable storage region – Multi Region offers the widest availability and highest SLAs.

N.B. The bucket name can take the form of a subdomain of your site (“spassets.acme.com”) if you wish to later complete the optional step of creating a DNS record that points to this bucket. That step requires additional work to function over HTTPS, as described in a section below.

6. Select “Standard” for the storage class and “Fine Grained” for the access control options

7. Leave the Advanced Options as their Defaults and click “Create”

8. You should now be inside your new storage bucket. Click Upload Files and upload the file you renamed and gzipped earlier (gh7rnghq.js in our example).

Optionally, you may also wish to create a folder for each version of the JavaScript tracker, which will make managing future updates easier. If you wish to do this, first create a folder corresponding to the version you downloaded from GitHub, e.g. 2.14.0. This is recommended, but for brevity this tutorial skips creating the folder.

9. Once uploaded your bucket should look similar to this:

10. We now need to edit the Metadata so Google Cloud knows the file is gzipped. Click the three dots at the end of the file's row and select “Edit Metadata” to open the popup:

11. Alter the Content-Encoding value to be gzip:

12. Set the Cache-Control value to max-age=31536000, which is a max-age of 1 year. Long-lived caching will also help reduce costs.

13. Save the Metadata by clicking the Save button on the popup window.

14. Once again, click the three dots at the end of the row, but this time select “Edit Permissions”

15. Add a new Item in the Pop up window and enter “Group – allUsers – Reader” and click Save:

16. Your file should now be publicly accessible:

17. Click the link icon next to “Public to internet” and you will now have the URL for your self-hosted Snowplow JavaScript Tracker:
e.g. https://storage.googleapis.com/company-name-sp-js/gh7rnghq.js 
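Once the file is public, it can be useful to confirm that the URL and metadata match what the steps above set up. The following is a rough sketch of such a check in Node.js; publicUrl and checkTrackerHeaders are hypothetical helper names introduced for illustration, and the header map is assumed to come from whichever HTTP client you use to request the file.

```javascript
// One year in seconds, matching the max-age value set in step 12.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Build the public URL for an object in a bucket, following the
// https://storage.googleapis.com/<bucket>/<object> pattern shown above.
function publicUrl(bucket, objectName) {
  return `https://storage.googleapis.com/${bucket}/${objectName}`;
}

// Hypothetical check: given a plain object of response headers, return a
// list of problems (an empty list means the metadata looks as expected).
function checkTrackerHeaders(headers, { gzipped = true } = {}) {
  const lower = {};
  for (const [key, value] of Object.entries(headers)) {
    lower[key.toLowerCase()] = String(value);
  }

  const problems = [];
  if (gzipped && lower['content-encoding'] !== 'gzip') {
    problems.push('Content-Encoding is not gzip');
  }
  const match = /max-age=(\d+)/.exec(lower['cache-control'] || '');
  if (!match || Number(match[1]) !== ONE_YEAR_SECONDS) {
    problems.push(`Cache-Control max-age is not ${ONE_YEAR_SECONDS}`);
  }
  return problems;
}
```

If you skipped gzipping, pass { gzipped: false } so that only the Cache-Control value is checked.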

Optional: Add DNS Record for bucket

To connect your domain to your Cloud Storage bucket, you will need to create a CNAME record. A CNAME record is a type of DNS record. For spassets.acme.com, the CNAME record will need to be:

NAME                 TYPE     DATA
spassets.acme.com    CNAME    c.storage.googleapis.com.

This step will only work correctly if you earlier created your bucket with a name corresponding to the subdomain used in the CNAME (i.e. spassets.acme.com, see Step 5) and you have verified ownership of this domain in Google Cloud: https://cloud.google.com/storage/docs/domain-name-verification.

CNAME redirection only works on HTTP. To ensure this works on HTTPS, you must follow one of Google’s Troubleshooting steps detailed here: https://cloud.google.com/storage/docs/troubleshooting#https
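The relationship between the bucket name, the CNAME record and the resulting tracker URL can be sketched as follows. This is an illustrative sketch only: cnameRecordFor and customDomainUrl are hypothetical helpers, and the hostname check is deliberately loose.

```javascript
// Hypothetical helper: describe the CNAME record needed for a bucket whose
// name is a subdomain such as "spassets.acme.com".
function cnameRecordFor(bucketName) {
  // Loose check that the bucket name looks like a hostname with at least one dot.
  if (!/^[a-z0-9-]+(\.[a-z0-9-]+)+$/.test(bucketName)) {
    throw new Error(`Bucket name "${bucketName}" is not in subdomain format`);
  }
  return { name: bucketName, type: 'CNAME', data: 'c.storage.googleapis.com.' };
}

// Hypothetical helper: the HTTP URL of an object served through the CNAME.
function customDomainUrl(bucketName, objectName) {
  return `http://${bucketName}/${objectName}`;
}
```

A bucket named like company-name-sp-js is rejected by this sketch because it is not a subdomain, which mirrors the requirement that the bucket name match the CNAME. After the record is published, Node's built-in dns.promises.resolveCname could be used to confirm that the subdomain points at c.storage.googleapis.com.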

Update your tracking tags to use the self-hosted version of sp.js

The standard Snowplow tracking tag looks something like:

<!-- Snowplow starts plowing -->
<script type="text/javascript">
;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//d1fc8wv8zag5ca.cloudfront.net/2.10.2/sp.js","snowplow"));

window.snowplow('newTracker', 'sp', '{{MY-COLLECTOR-URI}}', { // Initialise a tracker
  appId: '{{MY-SITE-ID}}'
});

window.snowplow('trackPageView');
</script>
<!-- Snowplow stops plowing -->

The reference to '//d1fc8wv8zag5ca.cloudfront.net/2.10.2/sp.js' loads sp.js, the Snowplow JavaScript Tracker. In this example tag, the version loaded is the one hosted by the Snowplow team on our own CloudFront subdomain (and provided free to the community).

To use the self-hosted version, update the source string to point to your own self-hosted, renamed copy of sp.js (gh7rnghq.js in this example). If you have not added your own DNS record, this will be https://storage.googleapis.com/company-name-sp-js/gh7rnghq.js; if you have completed the optional DNS record step, it will be similar to http://spassets.acme.com/gh7rnghq.js:

;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//storage.googleapis.com/company-name-sp-js/gh7rnghq.js","snowplow"));
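Because the loader is minified, it can be hard to see which argument carries the script URL. Below is a readable expansion of the same logic, written as a sketch for explanation rather than as a replacement for the official tag. The function name loadSnowplow and its parameter names are our own labels for the minified p, l, o, w and i arguments.

```javascript
// Readable sketch of the minified loader. Parameters map to the minified
// names as: win = p, doc = l, tagName = o, scriptUrl = w, globalName = i.
function loadSnowplow(win, doc, tagName, scriptUrl, globalName) {
  // Do nothing if the global (e.g. window.snowplow) is already defined.
  if (win[globalName]) return;

  // Record the name of the global so the tracker script can find it.
  win.GlobalSnowplowNamespace = win.GlobalSnowplowNamespace || [];
  win.GlobalSnowplowNamespace.push(globalName);

  // Define a stub that queues calls until the tracker script has loaded.
  win[globalName] = function () {
    (win[globalName].q = win[globalName].q || []).push(arguments);
  };
  win[globalName].q = win[globalName].q || [];

  // Insert an async script element before the first existing script element.
  const script = doc.createElement(tagName);
  const firstScript = doc.getElementsByTagName(tagName)[0];
  script.async = 1;
  script.src = scriptUrl; // this is the only value that changes when self-hosting
  firstScript.parentNode.insertBefore(script, firstScript);
}

// In a browser, the call equivalent to the tag above would be:
//   loadSnowplow(window, document, 'script',
//     '//storage.googleapis.com/company-name-sp-js/gh7rnghq.js', 'snowplow');
```

Only the fourth argument differs between the Snowplow-hosted tag and the self-hosted one, so the rest of your tracking tag (newTracker, trackPageView and so on) can stay unchanged.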