Referer Parser enrichment

Summary

The referer parser enrichment uses the Snowplow referer-parser to extract attribution data from referer URLs. You can provide a list of internal subdomains which will be treated as “internal” rather than unknown.

Overview

Knowing which sites refer users to your website is a staple of analytics and helps you understand your traffic patterns. The referer parser takes the referring URL and matches it to the company or site it belongs to.

This is particularly useful when looking for specific traffic from, for example, search engine providers or social networks. Rather than making you scour a full list of referrer URLs, this enrichment adds fields that let you build reports combining the subdomains of the bigger referrers.

The results of the referer parser lookup end up in the atomic.events table in your data warehouse, in the columns refr_medium (categories such as social or search), refr_source (companies such as Google or Facebook) and others with the refr_ prefix.

If you specify particular subdomains in the enrichment configuration file, traffic from those subdomains is grouped into “Internal” rather than “Unknown”, which makes reports clearer.

Example

Snowplow has several subdomains, such as console.snowplowanalytics.com and discourse.snowplowanalytics.com. As users move from these subdomains to our main snowplowanalytics.com domain, we would like to capture that traffic as being referred from an “Internal” medium. We would therefore configure the enrichment as follows:

{
  "schema": "iglu:com.snowplowanalytics.snowplow/referer_parser/jsonschema/2-0-0",
  "data": {
    "vendor": "com.snowplowanalytics.snowplow",
    "name": "referer_parser",
    "enabled": true,
    "parameters": {
      "database": "referers-latest.json",
      "internalDomains": [
        "console.snowplowanalytics.com",
        "discourse.snowplowanalytics.com"
      ],
      "uri": "https://s3-eu-west-1.amazonaws.com/snowplow-hosted-assets/third-party/referer-parser/"
    }
  }
}

Enabling this enrichment with the above configuration would fill the refr_medium column in our data warehouse with “Internal” whenever the referring URL of a page matches one of the subdomains above.
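To make the grouping concrete, here is a minimal Python sketch of how a referring URL could be classified. It is not the referer-parser implementation: the function name classify_medium, the known_media lookup and the exact-host matching rule are all assumptions made for illustration, and the “Internal” and “Unknown” labels simply follow the wording used on this page.

```python
from urllib.parse import urlsplit

# Hypothetical set mirroring "internalDomains" in the configuration above.
INTERNAL_DOMAINS = {
    "console.snowplowanalytics.com",
    "discourse.snowplowanalytics.com",
}


def classify_medium(referer_url, known_media=None):
    """Return an illustrative medium label for a referer URL.

    Assumption: the host is compared exactly (case-insensitively) against
    the internal domains first, then against an optional host -> medium
    lookup standing in for the referers database.
    """
    host = (urlsplit(referer_url).hostname or "").lower()
    if host in INTERNAL_DOMAINS:
        return "Internal"
    if known_media and host in known_media:
        return known_media[host]
    return "Unknown"
```

With this sketch, classify_medium("https://console.snowplowanalytics.com/login") gives "Internal", while a host that appears in neither list falls back to "Unknown".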

If we then queried the DISTINCT values of refr_medium alongside a count of sessions for each, we could get a table like the one below:

REFR_MEDIUM    SESSIONS
Search         272,699
Internal       142,555
Unknown        127,335
Social         14,525
Email          5,345
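
A query of this kind can be sketched as follows. This is an illustrative example only: it uses an in-memory SQLite table named events in place of atomic.events, the five sample rows are made up for the sketch, and it assumes sessions are counted as distinct domain_sessionid values. The SQL would likely need adapting to your warehouse.

```python
import sqlite3

# In-memory stand-in for the warehouse; "events" is a hypothetical
# substitute for atomic.events with only the two columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (domain_sessionid TEXT, refr_medium TEXT)")

# Made-up sample rows for illustration (session id, referer medium).
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("s1", "Search"),
        ("s1", "Search"),  # second event in the same session
        ("s2", "Internal"),
        ("s3", "Search"),
        ("s4", "Unknown"),
    ],
)

# One row per distinct refr_medium, with the number of distinct sessions.
QUERY = """
SELECT refr_medium, COUNT(DISTINCT domain_sessionid) AS sessions
FROM events
GROUP BY refr_medium
ORDER BY sessions DESC, refr_medium
"""

rows = conn.execute(QUERY).fetchall()
for medium, sessions in rows:
    print(f"{medium:<10}{sessions}")
```

Counting distinct session identifiers rather than rows keeps a session with several events from being counted more than once.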