What is the enrichment process?

Overview

During enrichment, your events have extra properties and values attached to them, a process also known as dimension widening.

Snowplow enrichments can be categorized into three brackets:

  • Hardcoded enrichments loading atomic.events (legacy)
  • Configurable enrichments loading atomic.events (legacy)
  • Configurable enrichments adding new contexts to the derived_contexts JSON array

Legacy enrichments are those that populate the atomic.events table rather than dedicated enrichment tables. The hardcoded legacy enrichments normally take place as part of the common enrichment process, and they precede configurable enrichments. During the common enrichment process, the data received from the collector(s) is mapped according to our Canonical Event Model.

Configurable enrichments often depend on the data produced by the common enrichment process.

Hardcoded enrichments

The following fields are populated depending on whether the tracker provided the corresponding value or not.

| Raw Parameter | Enriched Parameter | Purpose |
|---|---|---|
| eid | event_id | The unique event identifier (UUID). Assigned during enrichment if not provided with eid. |
| cv | v_collector | Collector type/version |
| tnuid | network_userid | User ID set by Snowplow using a third-party cookie. Overwritten with the tracker-set tnuid, if provided. |
| ip | user_ipaddress | Snowplow collectors log the IP address as standard. However, you can override the value derived from the collector by populating this value in the tracker. |
| ua | useragent | Raw useragent (browser string). Can be overwritten with ua. |
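
The precedence described in the table above can be sketched as follows. This is only an illustration, not the pipeline's actual code: the function name `resolve_hardcoded_fields` and the keys of the `collector` dictionary are hypothetical, and only the raw parameter names (eid, tnuid, ip, ua) and enriched field names come from the table.

```python
import uuid


def resolve_hardcoded_fields(raw: dict, collector: dict) -> dict:
    """Prefer tracker-supplied raw parameters; fall back to collector values.

    `raw` holds the parameters sent by the tracker. `collector` holds values
    observed by the collector; its key names are assumptions for this sketch.
    """
    return {
        # Generate a fresh UUID only when the tracker did not send eid.
        "event_id": raw.get("eid") or str(uuid.uuid4()),
        # A tracker-set tnuid overrides the cookie-based ID from the collector.
        "network_userid": raw.get("tnuid") or collector.get("network_userid"),
        # A tracker-set ip overrides the IP address logged by the collector.
        "user_ipaddress": raw.get("ip") or collector.get("ip"),
        # A tracker-set ua overrides the useragent seen by the collector.
        "useragent": raw.get("ua") or collector.get("useragent"),
    }
```

For instance, calling it with `{"ip": "10.0.0.1"}` as the raw parameters keeps the tracker's IP address, generates a new event_id, and takes the remaining fields from the collector values.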

The following fields are populated depending on the collector and ETL (Extract, Transform, Load) utilized in the pipeline.

| Added Parameter | Purpose |
|---|---|
| v_etl | Host ETL version |
| etl_tstamp | Timestamp when the event began ETL |
| collector_tstamp | Timestamp for the event as recorded by the collector |

The raw parameter res (if present) represents the screen/monitor resolution as a combination of width and height (ex. 1280x1024). It is split into two separate fields.

| Added Parameter | Purpose |
|---|---|
| dvce_screenwidth | Screen / monitor width |
| dvce_screenheight | Screen / monitor height |
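
A minimal sketch of this split, assuming the value always has the form `<width>x<height>`. The function name `split_resolution` is hypothetical, and returning `None` for a missing or malformed value is a choice made for this example rather than documented pipeline behavior.

```python
from typing import Optional, Tuple


def split_resolution(res: Optional[str]) -> Tuple[Optional[int], Optional[int]]:
    """Split a 'WIDTHxHEIGHT' string into (dvce_screenwidth, dvce_screenheight)."""
    if not res:
        # The res parameter is optional, so there may be nothing to split.
        return None, None
    width, _, height = res.partition("x")
    try:
        return int(width), int(height)
    except ValueError:
        # Covers a missing separator or non-numeric parts.
        return None, None
```

With the example value above, `split_resolution("1280x1024")` yields a width of 1280 and a height of 1024.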

The url parameter provides the value for page_url in atomic.events, which represents the current page’s URL. The following parts are extracted and populate separate fields as outlined below.

| Added Parameter | Purpose |
|---|---|
| page_urlscheme | Scheme (protocol), ex. “http” |
| page_urlhost | Host (domain), ex. “www.snowplowanalytics.com” |
| page_urlport | Port if specified, 80 if not |
| page_urlpath | Path to page, ex. “/product/index.html” |
| page_urlquery | Querystring, ex. “id=GTM-DLRG” |
| page_urlfragment | Fragment (anchor), ex. “4-conclusion” |
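
The decomposition can be illustrated with Python's standard `urllib.parse.urlsplit`. This is a sketch of the idea rather than the enrichment's actual parser: the function name `split_page_url` is hypothetical, and mapping an empty query string or fragment to `None` is an assumption. The same approach would apply to the refr_ fields derived from the referrer URL.

```python
from urllib.parse import urlsplit


def split_page_url(page_url: str) -> dict:
    """Break a page URL into the page_url* fields listed above."""
    parts = urlsplit(page_url)
    return {
        "page_urlscheme": parts.scheme,
        "page_urlhost": parts.hostname,
        # Per the table above: the port if specified, 80 if not.
        "page_urlport": parts.port if parts.port is not None else 80,
        "page_urlpath": parts.path,
        # Assumption: empty components are reported as None.
        "page_urlquery": parts.query or None,
        "page_urlfragment": parts.fragment or None,
    }
```

Applied to `http://www.snowplowanalytics.com/product/index.html?id=GTM-DLRG#4-conclusion`, this reproduces the example values shown in the table, with the port defaulting to 80.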

Similarly, page_referrer takes its value from refr, which represents the referrer’s URL, and the following parts are extracted into separate fields as shown below.

| Added Parameter | Purpose |
|---|---|
| refr_urlscheme | Scheme (protocol) |
| refr_urlhost | Host (domain) |
| refr_urlport | Port if specified, 80 if not |
| refr_urlpath | Path to page |
| refr_urlquery | Querystring |
| refr_urlfragment | Fragment (anchor) |

Additionally, the derived timestamp, derived_tstamp, is calculated. See this blog post for more details.
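
As a rough illustration, the calculation is commonly described as correcting the collector timestamp by the delay between the event being created and being sent on the device. The sketch below rests on that assumption. The fields dvce_created_tstamp, dvce_sent_tstamp and true_tstamp are not described on this page, and the fallback to collector_tstamp when device timestamps are missing is also an assumption. The function name `derive_tstamp` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def derive_tstamp(
    collector_tstamp: datetime,
    dvce_created_tstamp: Optional[datetime] = None,
    dvce_sent_tstamp: Optional[datetime] = None,
    true_tstamp: Optional[datetime] = None,
) -> datetime:
    """Estimate when the event actually occurred, in collector time."""
    if true_tstamp is not None:
        # Assumption: an explicitly supplied true timestamp takes precedence.
        return true_tstamp
    if dvce_created_tstamp is None or dvce_sent_tstamp is None:
        # Assumption: without device timestamps, use the collector's time.
        return collector_tstamp
    # Subtract the time the event spent on the device before being sent.
    return collector_tstamp - (dvce_sent_tstamp - dvce_created_tstamp)
```

Because only the difference between the two device timestamps is used, an inaccurate device clock does not skew the result as long as it runs at a consistent rate.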

Finally, contexts, unstructured events and the relevant configurable enrichments (if enabled) are validated against their corresponding JSON schemas and the array of the derived contexts is assembled.

Configurable enrichments

All configurable enrichments are listed on the Available Enrichments page.

The following configurable enrichments write data into the atomic.events table (legacy enrichments):

All other configurable enrichments create a separate context and thus are loaded into their own dedicated tables.