IAB Enrichment

Summary

The IAB Spiders & Robots Enrichment uses the IAB/ABC International Spiders and Bots List to determine whether an event was produced by a human user or by a robot/spider, based on the event's IP address and user agent.

Overview

Spiders & bots are sometimes considered a necessary evil of the web. We want search engine crawlers to find our site, but we don't want a lot of non-human traffic clouding our reporting.

The Interactive Advertising Bureau (IAB) is an advertising business organization that develops industry standards, conducts research, and provides legal support for the online advertising industry.

Their internationally recognized list of spiders and bots is regularly maintained to identify the IP addresses and user agents of known bots and spiders.

This enrichment uses the IAB/ABC International Spiders & Robots List to look up information about an event based on its IP address and user agent.
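To make the lookup idea concrete, here is a toy sketch of that kind of classification. This is not the real list format or the enrichment's actual matching rules (the IAB list and its semantics are proprietary); the sample CIDR ranges and user-agent tokens below are invented for illustration only.

```python
import ipaddress

# Hypothetical miniature stand-ins for the IAB list files (illustrative only).
ip_exclude_cidrs = ["66.249.64.0/19"]          # example CIDR range of a known crawler
exclude_useragent_tokens = ["bot", "spider"]   # substrings that flag a robot user agent

def classify(ip: str, user_agent: str) -> str:
    """Classify an event as coming from a browser or a spider/robot."""
    addr = ipaddress.ip_address(ip)
    # An IP inside any excluded CIDR range marks the event as robot traffic.
    if any(addr in ipaddress.ip_network(cidr) for cidr in ip_exclude_cidrs):
        return "spider_or_robot"
    # Likewise for a user agent containing a known robot token.
    ua = user_agent.lower()
    if any(token in ua for token in exclude_useragent_tokens):
        return "spider_or_robot"
    return "browser"
```

The real enrichment additionally consults an include list of valid browser user agents, so treat the two-check flow above purely as a mental model.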


For help configuring this enrichment, please reach out to support@snowplowanalytics.com.

Example

Here is an example configuration JSON, referencing all three database files required by the enrichment:

{
    "schema": "com.snowplowanalytics.snowplow.enrichments/iab_spiders_and_robots_enrichment/jsonschema/1-0-0",
    "data": {
        "name": "iab_spiders_and_robots_enrichment",
        "vendor": "com.snowplowanalytics.snowplow.enrichments",
        "enabled": true,
        "parameters": {
            "ipFile": {
                "database": "ip_exclude_current_cidr.txt",
                "uri": "s3://my-private-bucket/iab"
            },
            "excludeUseragentFile": {
                "database": "exclude_current.txt",
                "uri": "s3://my-private-bucket/iab"
            },
            "includeUseragentFile": {
                "database": "include_current.txt",
                "uri": "s3://my-private-bucket/iab"
            }
        }
    }
}
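As a quick sanity check before deploying, the configuration above can be parsed and inspected for the fields the enrichment needs. The following sketch assumes only the structure shown in the example (each of the three file entries carries a "database" file name and a "uri" location):

```python
import json

# The example configuration from above, embedded as a string for the check.
raw = """
{
    "schema": "com.snowplowanalytics.snowplow.enrichments/iab_spiders_and_robots_enrichment/jsonschema/1-0-0",
    "data": {
        "name": "iab_spiders_and_robots_enrichment",
        "vendor": "com.snowplowanalytics.snowplow.enrichments",
        "enabled": true,
        "parameters": {
            "ipFile": {"database": "ip_exclude_current_cidr.txt", "uri": "s3://my-private-bucket/iab"},
            "excludeUseragentFile": {"database": "exclude_current.txt", "uri": "s3://my-private-bucket/iab"},
            "includeUseragentFile": {"database": "include_current.txt", "uri": "s3://my-private-bucket/iab"}
        }
    }
}
"""

config = json.loads(raw)
params = config["data"]["parameters"]

# Each of the three database entries needs both a file name and a location.
for key in ("ipFile", "excludeUseragentFile", "includeUseragentFile"):
    entry = params[key]
    assert {"database", "uri"} <= entry.keys(), f"{key} is missing a field"
```

Replace the s3://my-private-bucket/iab URIs with the location where your own licensed copies of the IAB list files are stored.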