Managing data structures via the API

In addition to managing data structures through the Snowplow Console, Snowplow Insights customers can manage them programmatically through the data structures API.

This makes it possible to automate existing processes, including workflows built around version control systems like GitHub.

Combined with other tools like our CI tool and/or Snowplow Micro, the API supports a robust, automated data structure workflow that ensures data quality before data reaches your pipeline.

Note

Data structures interfaces are only compatible with pipelines that have been upgraded to use Iglu Server registries rather than static S3 registries. Please check in the console to see whether you need an upgrade or whether your registries are ready to go.

Authorization

To start using the API you’ll need authorization credentials.

First, you’ll need to:

Once you have these you can use this authorization flow to exchange credentials for a token.

This token will be needed in any request to the API in the form of Authorization: Bearer {{token}}
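As a minimal sketch of the header format above, the helper below builds the headers to attach to each request. The function name `auth_headers` is a hypothetical choice for illustration, and obtaining the token itself is left to the authorization flow.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Return the header that carries the bearer token on every API request."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # Matches the documented form: Authorization: Bearer {{token}}
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed to whichever HTTP client you use.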

Getting started

You can have a look at and interact with all available endpoints in the API documentation.

The endpoints focus on the main operations in the workflow around:

  1. Retrieving existing data structures and their associated schemas
  2. Creating new data structures or editing existing ones
  3. Validating a schema
  4. Deploying a schema to a registry

Each request must include your company’s organizationId, a UUID that appears in the URL immediately after the .com when you visit the console.
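If you would rather extract the ID than copy it by hand, the following sketch reads the first path segment of a console URL and checks that it parses as a UUID. The host `console.example.com` used in the usage note is a placeholder, and the function name is hypothetical.

```python
import uuid
from urllib.parse import urlparse


def organization_id_from_console_url(url: str) -> str:
    """Return the first path segment of a console URL if it is a valid UUID."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path segment after the host")
    # uuid.UUID raises ValueError when the segment is not a well-formed UUID.
    return str(uuid.UUID(segments[0]))
```

For example, calling it with `https://console.example.com/12345678-1234-5678-1234-567812345678/data-structures` returns the UUID segment.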

Retrieving data structures

The following GET requests allow you to retrieve data structures from both your development and production environment registries.

GET /api/schemas/v1/organizations/{organizationId}/schemas

Use this request to:

  • Retrieve a list of all data structures
  • Retrieve a list of data structures filtered by vendor or name query parameters
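The listing and filtering cases above differ only in the query string. The sketch below builds the URL for either case. `BASE_URL` is a placeholder host, not the real console address, and the function name is hypothetical; only the path and the `vendor` and `name` parameters come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://console.example.com"  # placeholder host (assumption)


def list_schemas_url(organization_id, vendor=None, name=None):
    """Build the list URL, adding vendor/name filters only when given."""
    path = f"/api/schemas/v1/organizations/{organization_id}/schemas"
    params = {
        key: value
        for key, value in (("vendor", vendor), ("name", name))
        if value is not None
    }
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"
```

Calling `list_schemas_url(org_id)` yields the unfiltered list URL, while passing `vendor` or `name` appends the matching query parameters.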

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}

Use this request to retrieve a specific data structure by its hash, which is generated on creation.

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/versions/{versionNumber}

Use this request to retrieve a specific version of a particular data structure.

See the detailed API documentation for all options.
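To illustrate how the hash and version paths fit together with the bearer token, here is a sketch that prepares (but does not send) a GET request using the standard library. `BASE_URL`, the function name, and the example values are assumptions for illustration.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://console.example.com"  # placeholder host (assumption)


def get_schema_request(organization_id, schema_hash, token, version_number=None):
    """Prepare a GET request for a data structure, optionally for one version."""
    path = (
        f"/api/schemas/v1/organizations/{quote(organization_id, safe='')}"
        f"/schemas/{quote(schema_hash, safe='')}"
    )
    if version_number is not None:
        # Appends the /versions/{versionNumber} suffix from the third endpoint.
        path += f"/versions/{quote(version_number, safe='')}"
    headers = {"Authorization": f"Bearer {token}"}
    return Request(BASE_URL + path, headers=headers, method="GET")
```

The prepared request can be sent with `urllib.request.urlopen`, or the same URL and headers can be reused with another HTTP client.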

Validation

To validate that your schema is in proper JSON format and complies with warehouse loading requirements, you can use the validation POST requests.

POST /api/schemas/v1/organizations/{organizationId}/validation-requests

POST /api/schemas/v1/organizations/{organizationId}/validation-requests/sync

GET /api/schemas/v1/organizations/{organizationId}/validation-requests/{validationRequestId}

Please note:

  • The sync version of the endpoint returns the validation response immediately. The version without sync returns a validation request ID that you can poll using the GET request.
  • Be sure to include any meta data (see below) as a separate property of the data structure you are trying to validate.
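For the non-sync endpoint, polling might look like the sketch below. It is deliberately independent of any HTTP client: `fetch` is a callable you supply that performs the GET request, and `is_finished` decides when to stop. The response shape is not described on this page, so the `status` field in the usage note is purely hypothetical.

```python
import time
from typing import Any, Callable


def poll_until(
    fetch: Callable[[], Any],
    is_finished: Callable[[Any], bool],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() repeatedly until is_finished(result) is true."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_finished(result):
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"request did not finish after {max_attempts} attempts")
```

A call could look like `poll_until(lambda: get_validation(request_id), lambda r: r["status"] != "PENDING")`, where `get_validation` and the `status` values are placeholders for your own client code and the real response format.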

Deployments

The deployment endpoints deal with getting a new or edited version of your data structure into your development and production environments.

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/deployments

POST /api/schemas/v1/organizations/{organizationId}/deployment-requests

POST /api/schemas/v1/organizations/{organizationId}/deployment-requests/sync

GET /api/schemas/v1/organizations/{organizationId}/deployment-requests/{deploymentRequestId}

Please note:

  • You deploy from one environment to another. This includes a deployment from (virtual environment) “VALIDATED” to “DEV”, then “DEV” to “PROD”.
  • Only users designated as “admin” in the console have permission to promote from “DEV” to “PROD”. This follows the data structure workflow: validate, test on development, then deploy to production.
  • There is a sync option that will return the response of the deployment request directly. Otherwise you can poll for deployment responses using the deployment ID.
  • A message property can be sent with a deployment to capture change log notes, which are stored against the deployment. This feature is only available to Enterprise tier accounts in the console.
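The promotion rules above can be expressed as a small client-side pre-check, sketched below. This is only an illustration of the described workflow; the API itself remains the authority on what is permitted, and the function and constant names are hypothetical.

```python
# Environment pairs named in the notes above: VALIDATED -> DEV, then DEV -> PROD.
ALLOWED_PROMOTIONS = {("VALIDATED", "DEV"), ("DEV", "PROD")}


def check_promotion(source: str, target: str, is_admin: bool) -> None:
    """Raise if a deployment from source to target does not fit the workflow."""
    if (source, target) not in ALLOWED_PROMOTIONS:
        raise ValueError(f"unsupported promotion: {source} -> {target}")
    # Only admins may promote from DEV to PROD.
    if (source, target) == ("DEV", "PROD") and not is_admin:
        raise PermissionError("only admin users can promote from DEV to PROD")
```

Running such a check before sending a deployment request can surface an obvious mistake early, for example in a CI job.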

Managing meta data

Meta data is used to add additional information to a Data Structure.

"meta": { "hidden": false, "schemaType": "event", "customData": {} }

The hidden property sets whether the schema is hidden (true) or visible (false) in the console.

The schemaType property can be set as null | “event” | “entity”.

The customData property is a map of string keys to string values and can be used to send any key/value pairs you’d like to associate with the schema.

For example, if you wanted to specify departmental ownership through meta data:
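The three properties described above can be assembled and checked before sending, as in this sketch. The function name `build_meta` and its validation rules are an illustrative reading of this page, not an official client.

```python
ALLOWED_SCHEMA_TYPES = (None, "event", "entity")


def build_meta(hidden=False, schema_type=None, custom_data=None):
    """Build a meta object with hidden, schemaType and customData properties."""
    if not isinstance(hidden, bool):
        raise TypeError("hidden must be a boolean")
    if schema_type not in ALLOWED_SCHEMA_TYPES:
        raise ValueError("schemaType must be None, 'event' or 'entity'")
    custom = dict(custom_data or {})
    for key, value in custom.items():
        # customData is described as string-to-string pairs.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("customData keys and values must be strings")
    return {"hidden": hidden, "schemaType": schema_type, "customData": custom}
```

Serializing the result with `json.dumps` turns a `schema_type` of `None` into JSON `null`.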

"customData": { "department": "marketing" }

You can update the meta data for a data structure using the PUT endpoint:

PUT /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/meta
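A request to this endpoint might be prepared as in the sketch below, which builds (but does not send) a PUT request with the standard library. `BASE_URL` is a placeholder host, the function name is hypothetical, and sending the meta object itself as the JSON body is an assumption; check the API documentation for the exact body shape.

```python
import json
from urllib.request import Request

BASE_URL = "https://console.example.com"  # placeholder host (assumption)


def update_meta_request(organization_id, schema_hash, token, meta):
    """Prepare a PUT request that updates a data structure's meta data."""
    url = (
        f"{BASE_URL}/api/schemas/v1/organizations/{organization_id}"
        f"/schemas/{schema_hash}/meta"
    )
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Assumption: the body is the meta object serialized as JSON.
    body = json.dumps(meta).encode("utf-8")
    return Request(url, data=body, headers=headers, method="PUT")
```

The prepared request can then be passed to `urllib.request.urlopen` once the placeholder host is replaced with your console address.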