
Send test events to your pipeline

Now that your pipeline is up and running, we can send some events to it. By the end of this section, your pipeline will have collected and processed some sample events, and you will have taken a look at them in Postgres and S3.

Step 1: Send a simple page_view event to your collector

We have two quick examples here, simply as a test. Choose whether you’d like to use cURL or our example events form. We’ll show you how to use our SDKs within your own applications in later steps.

Using cURL

Send a request using cURL from your terminal. This example is a typical page_view event, taken from the docs.snowplowanalytics.com website. It also includes a “failed event”: a custom product_view event that will fail because an appropriate schema doesn’t exist in your Iglu Repository.

curl 'https://{{COLLECTOR_URL}}/com.snowplowanalytics.snowplow/tp2' \
  -H 'Content-Type: application/json; charset=UTF-8' \
  -H 'Cookie: _sp=305902ac-8d59-479c-ad4c-82d4a2e6bb9c' \
  --data-raw '{"schema":"iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4","data":[{"e":"pv","url":"https://docs.snowplowanalytics.com/docs/send-test-events-to-your-pipeline/","page":"Send test events to your pipeline - Snowplow Docs","refr":"https://docs.snowplowanalytics.com/","tv":"js-2.17.2","tna":"spExample","aid":"docs-example","p":"web","tz":"Europe/London","lang":"en-GB","cs":"UTF-8","res":"3440x1440","cd":"24","cookie":"1","eid":"4e35e8c6-03c4-4c17-8202-80de5bd9d953","dtm":"1626182778191","cx":"eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6W3sic2NoZW1hIjoiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvd2ViX3BhZ2UvanNvbnNjaGVtYS8xLTAtMCIsImRhdGEiOnsiaWQiOiI0YTU2ZjQyNy05MTk2LTQyZDEtOWE0YS03ZjRlNzk2OTM3ZmEifX1dfQ","vp":"863x1299","ds":"848x5315","vid":"3","sid":"87c18fc8-2055-4ec4-8ad6-fff64081c2f3","duid":"5f06dbb0-a893-472b-b61a-7844032ab3d6","stm":"1626182778194"},{"e":"ue","ue_px":"eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy91bnN0cnVjdF9ldmVudC9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6eyJzY2hlbWEiOiJpZ2x1OmNvbS5teV9jb21wYW55L3Byb2R1Y3Rfdmlldy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6eyJpZCI6IjVOMFctUEwwVyIsImN1cnJlbnRfcHJpY2UiOjQ0Ljk5LCJkZXNjcmlwdGlvbiI6IlB1cnBsZSBTbm93cGxvdyBIb29kaWUifX19","tv":"js-2.17.2","tna":"spExample","aid":"docs-example","p":"web","tz":"Europe/London","lang":"en-GB","cs":"UTF-8","res":"3440x1440","cd":"24","cookie":"1","eid":"542a79d3-a3b8-421c-99d6-543ff140a56a","dtm":"1626182778193","cx":"eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6W3sic2NoZW1hIjoiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvd2ViX3BhZ2UvanNvbnNjaGVtYS8xLTAtMCIsImRhdGEiOnsiaWQiOiI0YTU2ZjQyNy05MTk2LTQyZDEtOWE0YS03ZjRlNzk2OTM3ZmEifX1dfQ","vp":"863x1299","ds":"848x5315","vid":"3","sid":"87c18fc8-2055-4ec4-8ad6-fff64081c2f3","duid":"5f06dbb0-a893-472b-b61a-7844032ab3d6","refr":"https://docs.snowplowanalytics.com/","url":"https://docs.snowplowanalytics.com/docs/send-test-events-to-your-pipeline/","stm":"1626182778194"}]}'
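
Before moving on, you can check that your collector is reachable. This is a minimal sketch, assuming your collector exposes its standard /health endpoint at the same URL:

# A healthy collector responds to the health endpoint with a 200 OK
curl -i 'https://{{COLLECTOR_URL}}/health'
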
Using example events

If you have set up HTTPS for your collector, you can test it using our example events form. To set up SSL on a domain you already own, you will need to:

  • Create your SSL certificate using AWS Certificate Manager (ACM)
  • Once you have done that, add the SSL certificate as an input variable for your Collector load balancer (ssl_certificate_arn) and run terraform apply (see the sketch below)
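
As a rough sketch of that second step (the certificate ARN below is just a placeholder; you can equally set the variable in your Terraform variables file rather than on the command line):

# Point the Collector load balancer at your ACM certificate (placeholder ARN) and re-apply
terraform apply -var='ssl_certificate_arn=arn:aws:acm:eu-west-1:123456789012:certificate/your-certificate-id'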

The form’s button sends a page_view event from this page, and a “failed event”: a custom product_view event that will fail because an appropriate schema doesn’t exist in your Iglu Repository. The JavaScript behind the form looks like this:

function setupAndTrack() {
  // Strip the protocol and any slashes so only the bare collector domain remains
  const protocol_strip = /(http|https):/i;
  const slash_strip = /\//g;
  const input = document.getElementById("endpoint").value;
  const endpoint = input.replace(protocol_strip, "").replace(slash_strip, "");

  // Initialise a tracker pointed at your collector
  window.snowplow('newTracker', 'spExample', endpoint, {
    appId: 'docs-example',
    stateStorageStrategy: 'cookieAndLocalStorage',
    cookieName: '_sp_ex_',
    contexts: {
      webPage: true
    }
  });

  // Send a page_view event
  window.snowplow('trackPageView:spExample');

  // Send a custom product_view event; this will fail validation because
  // its schema does not yet exist in your Iglu Repository
  window.snowplow('trackSelfDescribingEvent:spExample', {
    schema: 'iglu:com.my_company/product_view/jsonschema/1-0-0',
    data: {
      id: '12345-ABCDE',
      current_price: 44.99
    }
  });
}

We recommend installing the Snowplow Inspector Chrome extension (by Poplin Data), which makes it easy to see which events are triggered as you browse a web page.

Step 2: Query your rich, high quality data in Postgres

In the last section you created a Postgres database where all of your data is stored. Your Postgres database will contain the following standard Snowplow schemas:

  • atomic: this is your rich, high quality data
  • atomic_bad: this is the data that has failed pipeline validation. We will come back to this in the next step.

To query this data, you will first need to connect to your Postgres database.

  • Connect to the database using the username and password you provided when creating the pipeline, along with the db_address and db_port you noted down after the pipeline was created.
    • If you need to reset your username or password, you can follow these steps
    • If your Postgres RDS was configured to be publicly accessible, there are a number of tools you can use to connect to a Postgres database from your local machine (for example, the psql command line client, as sketched after this list)
  • Run a query against your atomic.events table to take a look at the page view event that you generated in the previous step (where event_name = ‘page_view’). You can understand more about each field in the canonical event here.
    • SELECT * FROM atomic.events WHERE event_name = 'page_view';
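
As a minimal sketch of both steps using the psql command line client (db_address, db_port and your username come from the previous section; the database name snowplow is an assumption, so substitute whatever you configured):

# Connect to the pipeline database and look at the page view you just sent
psql -h <db_address> -p <db_port> -U <username> -d snowplow \
  -c "SELECT event_id, event_name, page_url, collector_tstamp FROM atomic.events WHERE event_name = 'page_view';"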

By default, there are 5 enrichments enabled, as listed below. These enrichments add extra properties and values to your events in real time as they are being processed by the Enrich application.

  • Campaign attribution
  • Referer parser
  • Event fingerprint
  • UA parser
  • YAUAA

Some enrichments are legacy and therefore populate your atomic.events table directly. From the above list, these are the campaign attribution, referer parser and event fingerprint enrichments. The UA parser and YAUAA enrichments each create a separate context (also referred to as an entity), which is loaded into its own table:

  • atomic.com_snowplowanalytics_snowplow_ua_parser_context_1
  • atomic.nl_basjes_yauaa_context_1

Note: you can join these contexts back to your atomic.events using root_id = event_id.
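
For example, a sketch of joining the YAUAA context back onto the page view you sent earlier (same connection assumptions as above):

# Each row of the context table links to its parent event via root_id
psql -h <db_address> -p <db_port> -U <username> -d snowplow \
  -c "SELECT e.event_id, e.event_name, y.* FROM atomic.events e JOIN atomic.nl_basjes_yauaa_context_1 y ON y.root_id = e.event_id WHERE e.event_name = 'page_view';"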

Step 3: Query your bad data in Postgres

Your atomic_bad schema holds events that have failed to be processed by your pipeline. These are called failed events.

You will see in Postgres that you have a table called atomic_bad.com_snowplowanalytics_snowplow_badrows_schema_violation_1. When we triggered the events in Step 1 of this section, we also triggered one that failed to be processed by your pipeline. This is a fundamental aspect of Snowplow: ensuring that only good quality data reaches your enriched stream, while siphoning off poor quality data so that you have the ability to correct and recover it.

As the event passed through your pipeline, the Enrich application tried to fetch the schema for the event. It does this so it can validate that the structure of the event conforms to the schema definition that was defined up front, ensuring it is of the expected quality.

We purposely sent a custom event to your collector without having created the schema to validate against first so that we can demonstrate how failed events work. As such, the Enrich application was unable to validate the quality of the event and generated a failed event, complete with rich metadata as to why it failed.
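
To see that metadata for yourself, you can query the schema violations table directly (a sketch, reusing the same connection assumptions as in Step 2; selecting everything avoids guessing at the loader’s column layout):

# Inspect the failed product_view event and the reasons it was rejected
psql -h <db_address> -p <db_port> -U <username> -d snowplow \
  -c "SELECT * FROM atomic_bad.com_snowplowanalytics_snowplow_badrows_schema_violation_1 LIMIT 10;"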

This is what we refer to as a ‘self-describing event’; every custom event that you collect with your Snowplow pipeline describes how it should be structured, so that your pipeline knows when the event is ‘good’ or ‘bad’ quality. 

In the next section, we guide you through creating a custom schema so that your custom event will validate against it and end up in your atomic.events table rather than in atomic_bad.

Schemas

Learn more about self-describing events and schemas, and the different types of failures here.

Note: you might also see adapter failure failed events in Postgres. Many adapter failures are caused by bot traffic, so do not be surprised to see some of them in your pipeline. Find out more here.

Step 4: Inspect your data in S3

S3 provides an important backup of your data and can also serve as your data lake. 

  • Navigate to the AWS Management Console, search for S3 and select it
  • If you already have multiple S3 buckets, you can find the correct one by searching for the S3 bucket name you entered when spinning up your pipeline

When you created your pipeline you also created three directories in your S3 bucket: raw/, enriched/ and bad/.

The enriched/ and bad/ directories hold your enriched data and the data that has failed to be validated by your pipeline, respectively. We took a look at this data in Postgres in the last step.

The raw/ directory holds the events that come straight out of your collector and have not yet been validated (i.e. quality checked) or enriched by the Enrich application. They are Thrift records and are therefore a little tricky to decode. There are not many reasons to use this data, but backing it up gives you the flexibility to replay it should something go wrong further downstream in the pipeline.
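
If you prefer the command line to the console, a quick way to see what has been written so far is the AWS CLI (a sketch; the bucket name is a placeholder for the one you chose when spinning up your pipeline):

# List the contents of each of the three directories
aws s3 ls s3://<your-pipeline-bucket>/raw/ --recursive --human-readable
aws s3 ls s3://<your-pipeline-bucket>/enriched/ --recursive --human-readable
aws s3 ls s3://<your-pipeline-bucket>/bad/ --recursive --human-readable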

Athena

You can learn more about querying your data at scale on S3 using Athena here.

Further explore your pipeline by adding tracking to your own application, collecting custom events, and enabling further enrichments >>>