
Managing data structures via the API

As well as managing data structures through the Snowplow Console, Snowplow Insights customers can manage them programmatically through the data structures API.

This functionality is key to automating any existing process you may have, including workflows in version control systems like GitHub.

Combined with other tools, such as our CI tool or Snowplow Micro, the API makes it possible to build a robust, automated data structure workflow that protects data quality before data reaches your pipeline.

Note

Data structures interfaces are only compatible with pipelines that have been upgraded to use Iglu Server registries, rather than static S3 registries. Please check in the console to see whether you need an upgrade or your registries are ready to go.

Authorization

To start using the API you’ll need authorization credentials.

First, you’ll need to:

Once you have these you can exchange credentials for a token.

Here is an example curl command to fetch the token:

curl --request POST \
  --url 'https://id.snowplowanalytics.com/oauth/token' \
  --header 'content-type: application/x-www-form-urlencoded' \
  --data grant_type=password \
  --data username=USER@DOMAIN.COM \
  --data password='PASSWORD' \
  --data audience=https://snowplowanalytics.com/api/ \
  --data client_id='YOUR_CLIENT_ID' \
  --data client_secret='YOUR_CLIENT_SECRET'

Every request to the API must include this token in a header of the form Authorization: Bearer {{token}}
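The token exchange can also be scripted. The following is a minimal Python sketch using only the standard library. It mirrors the form fields of the curl example. The function names are illustrative, and the assumption that the JSON response carries the token in an access_token field is not confirmed on this page, so check the actual response before relying on it.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://id.snowplowanalytics.com/oauth/token"
AUDIENCE = "https://snowplowanalytics.com/api/"


def build_token_request(username, password, client_id, client_secret):
    """Build (but do not send) the form-encoded POST from the curl example."""
    form = {
        "grant_type": "password",
        "username": username,
        "password": password,
        "audience": AUDIENCE,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    return urllib.request.Request(
        TOKEN_URL,
        data=urllib.parse.urlencode(form).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_token(request):
    """Send the request. ASSUMPTION: the response JSON has an `access_token` field."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]


def auth_header(token):
    """Header to attach to every subsequent API request."""
    return {"Authorization": f"Bearer {token}"}
```

Building the request separately from sending it makes it possible to inspect or log the request (without the secrets) before any network call is made.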

Getting started

You can browse and try out all available endpoints in the API documentation.

Authorizing in the API documentation

To post sample requests in the documentation, click the Authorize button at the top of the document and authorize with your token. This authorization overwrites the value of the token field in each individual request.

The endpoints focus on the main operations in the workflow around:

  1. Retrieving existing data structures and their associated schemas
  2. Creating new data structures or editing existing ones
  3. Validating a schema
  4. Deploying a schema to a registry

Each request must include your company's organization ID, a UUID that appears in the URL immediately after the .com when you visit the console.
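If you copy a console URL into a script, the organization ID can be pulled out programmatically. This is a minimal sketch that assumes the UUID is the first path segment after the host, as described above. The function name and the example URL path are hypothetical.

```python
import uuid
from urllib.parse import urlparse


def organization_id_from_url(console_url):
    """Return the first path segment of a console URL if it is a valid UUID."""
    segments = [s for s in urlparse(console_url).path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path segment after the host")
    # uuid.UUID raises ValueError when the segment is not a well-formed UUID.
    return str(uuid.UUID(segments[0]))


# Hypothetical console URL; only the position of the UUID matters here.
example = "https://console.snowplowanalytics.com/cad39ca5-3e1e-4e88-91af-87d977a4acd8/data-structures"
print(organization_id_from_url(example))
```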

Retrieving data structures

The following GET requests allow you to retrieve data structures from both your development and production environment registries.

Retrieve a list of all data structures

Use this request to:

  • Retrieve a list of all data structures
  • Retrieve a list of data structures filtered by vendor or name query parameters

GET /api/schemas/v1/organizations/{organizationId}/schemas
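As an illustration of the filtering described above, this sketch builds the list URL with optional vendor and name query parameters. The host is taken from the curl examples elsewhere on this page. The helper name is hypothetical, and the exact matching behaviour of the filters is not specified here.

```python
from urllib.parse import urlencode

API_BASE = "https://console.snowplowanalytics.com/api/schemas/v1"


def list_schemas_url(organization_id, vendor=None, name=None):
    """URL for listing data structures, optionally filtered by vendor and/or name."""
    url = f"{API_BASE}/organizations/{organization_id}/schemas"
    # Only include the filters that were actually supplied.
    filters = {k: v for k, v in (("vendor", vendor), ("name", name)) if v is not None}
    return f"{url}?{urlencode(filters)}" if filters else url


print(list_schemas_url("cad39ca5-3e1e-4e88-91af-87d977a4acd8", vendor="com.acme"))
```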

Retrieve a specific data structure

Use this request to retrieve a specific data structure by its hash (see ‘Generating a schema hash’ below), which is generated on creation.

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}

Retrieve a specific version of a specific data structure

Use this request to retrieve a single version of a specific data structure, identified by its hash (see ‘Generating a schema hash’ below) and version number.

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/versions/{versionNumber}

See the detailed API documentation for all options.

Generating a schema hash

The requests that retrieve a specific data structure identify it by a hash. To generate the hash, join its identifying parameters (organization ID, vendor, name and format) and hash the result with SHA-256.

Example:
organization ID: 38e97db9-f3cb-404d-8250-cd227506e544
vendor: com.acme.event
schema name: search
format: jsonschema 

First concatenate the information with a dash (-) as the separator:
38e97db9-f3cb-404d-8250-cd227506e544-com.acme.event-search-jsonschema

Then hash the resulting string with SHA-256 to get: a41ef92847476c1caaf5342c893b51089a596d8ecd28a54d3f22d922422a6700
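The two steps above can be sketched in Python with the standard hashlib module. The function name is illustrative, and the sketch assumes the string is UTF-8 encoded before hashing, which this page does not state explicitly. It is worth comparing the printed value with the digest in the example above before using the helper.

```python
import hashlib


def schema_hash(organization_id, vendor, name, schema_format="jsonschema"):
    """SHA-256 hex digest of the dash-joined identifying parameters."""
    identifier = "-".join([organization_id, vendor, name, schema_format])
    # ASSUMPTION: the string is hashed as UTF-8 bytes.
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


print(schema_hash("38e97db9-f3cb-404d-8250-cd227506e544", "com.acme.event", "search"))
```

The resulting hex string is the value to substitute for {schemaHash} in the retrieval endpoints.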

Validation

To validate that your schema is in proper JSON format and complies with warehouse loading requirements, you can use the validation POST requests.

POST /api/schemas/v1/organizations/{organizationId}/validation-requests

POST /api/schemas/v1/organizations/{organizationId}/validation-requests/sync

GET /api/schemas/v1/organizations/{organizationId}/validation-requests/{validationRequestId}

Example

curl 'https://console.snowplowanalytics.com/api/schemas/v1/organizations/cad39ca5-3e1e-4e88-91af-87d977a4acd8/validation-requests/sync' \
  -H 'authorization: Bearer YOUR_TOKEN' \
  -H 'content-type: application/json' \
  --data-binary '{
    "meta": {
      "hidden": false,
      "schemaType": "event",
      "customData": {}
    },
    "data": {
      "description": "Schema for an example event",
      "properties": {
        "example_field_1": {
          "type": "string",
          "description": "the example_field_1 means x",
          "maxLength": 128
        }
      },
      "additionalProperties": false,
      "type": "object",
      "required": [ "example_field_1" ],
      "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
      "self": {
        "vendor": "com.acme",
        "name": "example_schema_name",
        "format": "jsonschema",
        "version": "1-0-0"
      }
    }
  }'

Please note:

  • the request’s body has two parts:
    • one for data structure metadata as value to the meta key
    • one for the schema itself as value to the data key
  • this example uses the synchronous version of validation that responds with the result immediately. There is also an asynchronous version available that returns a request ID that you can later poll to get the result.
  • you can add metadata specific to your organization to the schema as key/value pairs in the customData object. See ‘Managing meta data’ for more information.
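The two-part body and the asynchronous flow described in the notes above can be sketched as follows. The helper names are hypothetical. Because this page does not document the shape of the asynchronous response, the polling helper takes the fetch function and the completion check as arguments instead of guessing field names.

```python
import time


def validation_body(schema, schema_type="event", hidden=False, custom_data=None):
    """Wrap a self-describing JSON schema in the two-part request body."""
    return {
        "meta": {
            "hidden": hidden,
            "schemaType": schema_type,
            "customData": dict(custom_data or {}),
        },
        "data": schema,
    }


def poll(fetch, is_finished, attempts=10, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until is_finished(result) is true or the attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if is_finished(result):
            return result
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before asking again
    raise TimeoutError(f"no final result after {attempts} attempts")
```

In practice, fetch would issue the GET request for the validation request ID returned by the asynchronous POST, and is_finished would inspect whatever status field the response actually contains.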

Deployments

The deployment endpoints deal with getting a new or edited version of your data structure into your development and production environments.

GET /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/deployments

POST /api/schemas/v1/organizations/{organizationId}/deployment-requests

POST /api/schemas/v1/organizations/{organizationId}/deployment-requests/sync

GET /api/schemas/v1/organizations/{organizationId}/deployment-requests/{deploymentRequestId}

Example

curl 'https://console.snowplowanalytics.com/api/schemas/v1/organizations/cad39ca5-3e1e-4e88-91af-87d977a4acd8/deployment-requests/sync' \
  -H 'authorization: Bearer MY_TOKEN' \
  -H 'content-type: application/json' \
  --data-binary '{
    "message": "",
    "source": "VALIDATED",
    "target": "DEV",
    "vendor": "com.acme",
    "name": "example_schema_name",
    "format": "jsonschema",
    "version": "1-0-0"
  }'

Please note:

  • This example deploys from VALIDATED to DEV. The method is the same for production, but you would set "source": "DEV" and "target": "PROD".
  • The schema does not need to be added here again. It has been temporarily saved in the virtual VALIDATED environment, and will remain there for 60 minutes after validation.
  • The API enforces a workflow of validating, testing on development and then deploying to production. To achieve this you deploy from one environment to another; from (virtual environment) VALIDATED to DEV, then DEV to PROD.
  • Only users designated as “admin” in the console have the permissions to promote from DEV to PROD.
  • There is a sync option that returns the response of the deployment request directly. Otherwise you can poll for the deployment response using the deployment request ID.
  • A message property can be sent with a deployment to capture change log notes, which are stored against the deployment. This feature is only available to Enterprise tier accounts in the console.
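The promotion workflow described in these notes can be sketched as a small helper that builds the request bodies and rejects any step other than VALIDATED to DEV or DEV to PROD. This is a client-side illustration only. The function names are hypothetical, and the API performs its own checks (including the admin permission for DEV to PROD).

```python
ALLOWED_PROMOTIONS = (("VALIDATED", "DEV"), ("DEV", "PROD"))


def deployment_body(source, target, vendor, name, version,
                    schema_format="jsonschema", message=""):
    """Build a deployment request body for one step of the workflow."""
    if (source, target) not in ALLOWED_PROMOTIONS:
        raise ValueError(f"unsupported promotion: {source} -> {target}")
    return {
        "message": message,
        "source": source,
        "target": target,
        "vendor": vendor,
        "name": name,
        "format": schema_format,
        "version": version,
    }


def promotion_plan(vendor, name, version):
    """Bodies for the full path: VALIDATED -> DEV, then DEV -> PROD."""
    return [deployment_body(s, t, vendor, name, version) for s, t in ALLOWED_PROMOTIONS]


for step in promotion_plan("com.acme", "example_schema_name", "1-0-0"):
    print(step["source"], "->", step["target"])
```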

Managing meta data

Meta data adds additional information to a data structure.

"meta": { "hidden": false, "schemaType": "event", "customData": {} }

The hidden property controls whether the schema is hidden (true) or visible (false) in the console.

The schemaType property can be set to null, "event" or "entity".

The customData property is a map of string keys to string values and can be used to send any key/value pairs you'd like to associate with the schema.

For example, to record departmental ownership through meta data:

"customData": { "department": "marketing" }
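Before sending a meta object, it can be checked locally against the property descriptions above. The following is a minimal sketch with a hypothetical function name. It treats a missing schemaType as null, which is an assumption, and the API may apply different or additional rules.

```python
ALLOWED_SCHEMA_TYPES = (None, "event", "entity")


def validate_meta(meta):
    """Return a list of problems found in a meta object (empty if none)."""
    problems = []
    if not isinstance(meta.get("hidden"), bool):
        problems.append("hidden must be true or false")
    # ASSUMPTION: a missing schemaType is treated the same as null.
    if meta.get("schemaType") not in ALLOWED_SCHEMA_TYPES:
        problems.append("schemaType must be null, 'event' or 'entity'")
    custom = meta.get("customData", {})
    if not isinstance(custom, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in custom.items()
    ):
        problems.append("customData must map string keys to string values")
    return problems


meta = {"hidden": False, "schemaType": "event", "customData": {"department": "marketing"}}
print(validate_meta(meta))
```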

You can update the meta data for a data structure using the PUT endpoint:

PUT /api/schemas/v1/organizations/{organizationId}/schemas/{schemaHash}/meta