Self Hosting the JavaScript Tracker on AWS

We recommend self-hosting the Snowplow JavaScript Tracker, sp.js, as it has some definite advantages over using a third-party-hosted JavaScript:

  1. Hosting your own JavaScript allows you to use your own JavaScript minification and asset pipelining approach (e.g. bundling all JavaScripts into one minified JavaScript)
  2. As Douglas Crockford put it about third-party JavaScripts: “it is extremely unwise to load code from servers you do not control.”
  3. Renaming sp.js will ensure that Snowplow continues to function in the presence of ad/content blockers, which typically block sp.js (see e.g. EasyPrivacy)

Below is a guide to hosting the minified Snowplow Analytics sp.js asset on Amazon CloudFront. This is our recommended strategy for self-hosting, but there are other options: for example, you may choose to bundle sp.js into your application directly, or host it with a different provider.

The latest minified version of the Snowplow JavaScript Tracker, called sp.js, is available from the Snowplow GitHub repo.

Pre-requisites

For the purposes of this guide, we are going to assume that you want to serve the standard sp.js from CloudFront. (We discuss other approaches in the Advanced options section below.) To accomplish this, you will need the following:

  • An account with Amazon Web Services
  • S3 and CloudFront enabled within your AWS account
  • Some technical chops (not too many)

Once you have those ready, please read on…

Self-hosting instructions

1. Create a bucket for the JavaScript

First, create a new bucket within your Amazon S3 account to store the JavaScript file.

js-bucket

2. Upload the JavaScript

You want to upload the minified version of the Snowplow JavaScript, which is called sp.js. You can obtain the latest version of the JavaScript from the Snowplow GitHub repo.

Now you’re ready to upload the JavaScript file into S3. Within the S3 pane, hit Upload and browse to your file:

js-select

Then hit Open and you will see the following screen:

js-upload

Hit Set Details >, then hit Set Permissions > to set permissions on this file allowing Everyone to Open/Download it:

js-permissions

Now hit Start Upload to upload the JavaScript file into your bucket. When done, you should have something like this:

js-ready

3. Set the Cache-Control header

Now that sp.js has been uploaded, we recommend that you set the Cache-Control max-age property on the file. This property determines both how long CloudFront caches sp.js in its edge locations and, crucially, how long individual browsers cache sp.js before requesting a fresh copy from CloudFront. By setting a long expiration date, you can reduce the number of browser requests for sp.js, which can significantly decrease your CloudFront costs, especially if you run a large website or network of sites.

The only disadvantage of a long expiration is that you need to find a way to force end users to fetch a fresh copy of sp.js when you upgrade to a newer version. This is easily managed by saving your new version to a new folder in your S3 bucket, and updating your Snowplow tags to point to the new version.
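One way to keep the versioned-folder scheme manageable is to build the script URL from a version string in a single place. The sketch below is only an illustration: the helper name `selfHostedTrackerUrl` and the `<version>/<filename>` folder layout are assumptions, not part of Snowplow.

```javascript
// Hypothetical helper: builds a protocol-relative URL for a self-hosted tracker
// stored under a per-version folder, e.g. //<subdomain>.cloudfront.net/2.7.2/sp.js
function selfHostedTrackerUrl(subdomain, version, filename = "sp.js") {
  if (!subdomain || !version) {
    throw new Error("subdomain and version are required");
  }
  const folder = encodeURIComponent(version);
  const file = encodeURIComponent(filename);
  return `//${subdomain}.cloudfront.net/${folder}/${file}`;
}

// Upgrading then means uploading to a new folder and changing one string.
console.log(selfHostedTrackerUrl("d1234example", "2.7.2"));
```

Because the filename is a parameter, the same helper also covers the case where sp.js has been renamed.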

To set a long expiration date on sp.js, right click on it in the S3 console, and select Properties:

open-permissions

Click on the Metadata dropdown and then click on the Add more metadata button. New drop downs appear to enable you to enter a new key/value pair:

enter-key-value-pair

In the Key dropdown, select Cache-Control. In the value field, enter

max-age=$value_in_seconds

For example, if you want your items to expire in 10 years, that is 10 x 365 x 24 x 60 x 60 = 315,360,000 seconds, so enter:

max-age=315360000
entered-values
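The arithmetic above can be checked with a few lines of JavaScript. This is a minimal sketch; the function name `cacheControlForYears` is hypothetical, and a year is treated as 365 days, as in the calculation above.

```javascript
// Hypothetical helper: converts a cache lifetime in years into a
// Cache-Control value. Leap days are ignored, matching the text.
function cacheControlForYears(years) {
  const secondsPerYear = 365 * 24 * 60 * 60; // 31,536,000
  return `max-age=${years * secondsPerYear}`;
}

console.log(cacheControlForYears(10)); // max-age=315360000
```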

Now click the Save button (bottom right of the screen). You are ready to create your CloudFront distribution!
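If you would rather script the upload than click through the console, the settings chosen so far map onto the parameters of an S3 PutObject request. The sketch below only builds that parameter object; the helper name `buildUploadParams` is hypothetical, and the `public-read` ACL assumes your bucket still permits object ACLs.

```javascript
// Hypothetical helper: describes the sp.js upload as a plain object whose keys
// match S3 PutObject request parameters (Bucket, Key, ContentType, ...).
function buildUploadParams(bucket, key, maxAgeSeconds) {
  return {
    Bucket: bucket,
    Key: key,
    ContentType: "application/javascript",
    CacheControl: `max-age=${maxAgeSeconds}`,
    ACL: "public-read", // assumption: object ACLs are enabled on the bucket
  };
}

console.log(buildUploadParams("snowplow-static-js", "sp.js", 315360000));
```

An actual upload would also need the file contents (the `Body` parameter) and AWS credentials, which are left out here.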

4. Create your CloudFront distribution

Now you are ready to create the CloudFront distribution which will serve your JavaScript. In the CloudFront tab, hit the Create Distribution button:

dist-create

Select the Download option and hit Continue:

dist-origin

For the Origin Domain Name, choose your S3 bucket (snowplow-static-js) from the dropdown, and accept the suggested Origin ID. Now hit Continue:

dist-behaviour

The defaults are fine on this screen, so hit Continue again:

dist-details

On this screen leave Logging as Off and hit Continue to review a summary of your new distribution:

dist-review

Hit Create Distribution and then you should see something like this:

dist-enabled

Write down your CloudFront distribution’s Domain Name – e.g. http://d1fc8wv8zag5ca.cloudfront.net. You will need this later when you integrate Snowplow into your website.

5. Testing your JavaScript file on CloudFront

Before testing, take a 10-minute coffee or brandy break (that’s how long CloudFront takes to synchronize).

Done? Now just check that you can access your JavaScript file over both HTTP and HTTPS using a browser, wget or curl:

http://{{SUBDOMAIN}}.cloudfront.net/sp.js
https://{{SUBDOMAIN}}.cloudfront.net/sp.js

If you have any problems, then double-check your CloudFront distribution’s URL, and check the permissions on your sp.js file: it must be Openable by Everyone.
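The same check can be scripted. The sketch below is one possible approach rather than an official tool: the helper names `assetUrls` and `checkAsset` are hypothetical, and it assumes a runtime with a global `fetch` (Node.js 18+ or a browser) unless you pass your own implementation.

```javascript
// Hypothetical helper: the two URLs to test for a given CloudFront subdomain.
function assetUrls(subdomain, filename = "sp.js") {
  return ["http", "https"].map(
    (scheme) => `${scheme}://${subdomain}.cloudfront.net/${filename}`
  );
}

// Requests each URL in turn and records whether it returned a success status.
// fetchImpl defaults to the global fetch; pass a stub to test without a network.
async function checkAsset(subdomain, fetchImpl = fetch) {
  const results = [];
  for (const url of assetUrls(subdomain)) {
    try {
      const response = await fetchImpl(url);
      results.push({ url, ok: response.ok, status: response.status });
    } catch (error) {
      results.push({ url, ok: false, status: null, error: String(error) });
    }
  }
  return results;
}

// Usage (performs real network requests):
// checkAsset("your-subdomain").then(console.table);
```

A 403 status in the results usually points to the permissions problem described above.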

That’s it – you now have a CloudFront distribution which will serve your Snowplow JavaScript to anybody anywhere in the world, fast. Now all that remains is to update your Snowplow header tag to fetch your own version of sp.js, rather than the version hosted by the Snowplow team.

6. Update your tracking tags to use the self-hosted version of sp.js

The standard Snowplow tracking tag looks something like:

<!-- Snowplow starts plowing -->
<script type="text/javascript">
;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//d1fc8wv8zag5ca.cloudfront.net/2.7.2/sp.js","snowplow"));

window.snowplow('newTracker', 'cf', '{{MY-COLLECTOR-URI}}', { // Initialise a tracker
  appId: '{{MY-SITE-ID}}',
  cookieDomain: '{{MY-COOKIE-DOMAIN}}'
});

window.snowplow('trackPageView');
</script>
<!-- Snowplow stops plowing -->

The reference to '//d1fc8wv8zag5ca.cloudfront.net/2.7.2/sp.js' loads sp.js, the Snowplow JavaScript Tracker. In this example tag, the version loaded is the one hosted by the Snowplow team on our own CloudFront subdomain (and provided free to the community).

To use the version hosted yourself, update the source string to point to your own self-hosted sp.js:

;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//{{SUBDOMAIN}}.cloudfront.net/sp.js","snowplow"));