
Analytics SDK

Overview

The Analytics SDKs were created for data engineers and data scientists who work with Snowplow data in a number of languages.

Some good use cases for the SDK include:

  1. Performing event data modeling in Apache Spark as part of our Spark batch pipeline
  2. Developing machine learning models on your event data using Apache Spark (e.g. using Databricks or Zeppelin on EMR)
  3. Performing analytics-on-write in AWS Lambda as part of our Kinesis real-time pipeline

We are hugely excited about developing our Analytics SDK initiative in four directions:

  1. Adding more SDKs for other languages popular for data analytics and engineering, including Python, Node.js (for AWS Lambda) and Java
  2. Adding additional event transformers to the Scala Analytics SDK – please let us know any suggestions!
  3. “Dogfooding” the Scala Analytics SDK by starting to use it in standard Snowplow components, such as our Kinesis Elasticsearch Sink
  4. Adding functions that are particularly useful for processing event data and sequences of event data

Snowplow Analytics SDKs

  • Scala Analytics SDK – lets you work with Snowplow enriched events in your Scala event processing, data modeling and machine-learning jobs. You can use this SDK with Apache Spark, AWS Lambda, Apache Flink, Scalding, Apache Samza and other Scala-compatible data processing frameworks.
  • Python Analytics SDK – lets you work with Snowplow enriched events in your Python event processing, data modeling and machine-learning jobs. You can use this SDK with Apache Spark, AWS Lambda, and other Python-compatible data processing frameworks.
  • .NET Analytics SDK – lets you work with Snowplow enriched events in your .NET event processing, data modeling and machine-learning jobs. You can use this SDK with Azure Data Lake Analytics, Azure Functions, AWS Lambda, Microsoft Orleans and other .NET-compatible data processing frameworks.
  • JavaScript and TypeScript Analytics SDK – lets you work with Snowplow enriched events in your Node.js or other JavaScript environments. This SDK can be used with AWS Lambda and Google Cloud Functions.
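
To give a rough feel for what "working with enriched events" means in practice, here is a minimal Python sketch of the idea behind an event transformer: it takes one tab-separated enriched event line and turns it into a dictionary that can be serialized as JSON. This is not the API of any of the SDKs listed above. The function name `toy_transform` and the five-column `FIELDS` list are hypothetical and heavily shortened for illustration; the real enriched event format has many more columns, and the SDKs also handle typing and self-describing JSON fields.

```python
import json

# Hypothetical, shortened column list used only for this illustration.
# The real enriched event format contains many more fields.
FIELDS = ["app_id", "platform", "collector_tstamp", "event", "event_id"]


def toy_transform(line: str) -> dict:
    """Turn one tab-separated event line into a dict keyed by field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(values)}"
        )
    # Empty strings stand for missing values, so they are left out.
    return {name: value for name, value in zip(FIELDS, values) if value != ""}


# A made-up sample line matching the hypothetical column list above.
sample = "shop\tweb\t2024-01-01 00:00:00.000\tpage_view\tabc-123"
event = toy_transform(sample)
print(json.dumps(event))
```

In a real job you would call the transformer provided by the SDK for your language instead of hand-rolling the parsing, and apply it to each line read from your enriched event stream or files.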

Articles