Creating Your First .air File

For those who want to use machine learning in a production environment, training the model is just the first step. Here we show how to prepare your model for integration into the browser.
Applications built with the AI Squared JavaScript SDK (including the AI Squared Google Chrome Extension) make it as easy as possible to integrate a machine learning model or analytic into an end-user application. For the model/analytic to reach its full potential there, the application needs to know (1) what data to pass to the model/analytic, (2) what outputs to expect from it, and (3) how to display its output within the end-user environment. This tutorial walks you through using our high-level Python API to create the configuration document that defines how your model/analytic interacts with the AI Squared application in the end-user environment.
Good to know: check out our collection of use case examples for end-to-end walkthroughs of training a model using our sparse model architecture and configuring it for use in an AI Squared application.

The Codebase

The aisquared.config package is a fully-featured Python API that supports the creation of .air files using both local and remote models/analytics. Please see the AI Squared Docs page for in-depth documentation.

Setting up the Configuration File

In creating the configuration file, you are (1) defining how the model/analytic you are integrating into the browser will interact with the extension and the data in the production environment, and (2) creating a compressed .air file containing the configuration file and the model/analytic. The model/analytic is bundled only if it is local; if you choose a remote model/analytic, the output .air file contains only the configuration file, which points to the remote endpoint.
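The local-versus-remote distinction above can be pictured with a small standard-library sketch. This is not the actual .air format or the aisquared API: the zip container, the member names config.json and model.bin, and the build_archive helper are all assumptions made purely for illustration.

```python
import io
import json
import zipfile
from typing import List, Optional


def build_archive(config: dict, model_bytes: Optional[bytes] = None) -> bytes:
    """Pack a configuration, and optionally a local model, into one compressed archive.

    Hypothetical layout: the real .air file structure may differ.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The configuration document is always included.
        archive.writestr("config.json", json.dumps(config, indent=2))
        # A local model travels inside the archive; a remote one is only referenced.
        if model_bytes is not None:
            archive.writestr("model.bin", model_bytes)
    return buffer.getvalue()


def list_members(archive_bytes: bytes) -> List[str]:
    """Return the sorted member names stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return sorted(archive.namelist())


# Local case: configuration plus placeholder model bytes.
local = build_archive({"name": "example name", "analytic": {"path": "model.bin"}}, b"\x00\x01")
# Remote case: configuration only, pointing at a made-up endpoint.
remote = build_archive({"name": "example name", "analytic": {"url": "https://example.com/predict"}})

print(list_members(local))
print(list_members(remote))
```

The local archive holds two members and the remote archive holds one, which mirrors the difference described above.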


The rough outline for defining the .air file is as follows:
1) Naming the .air file
2) Defining how you'll harvest your model's input data
3) Defining how the harvested data should be preprocessed
4) Identifying the model to use for inference
5) Defining how to post-process model output (e.g. applying a labelmap)
6) Defining how model results will be rendered in the end user environment
7) *optional* Creating a feedback survey for user-generated feedback on the model's performance
8) *optional* Pointing to an MLflow experiment which will track the performance of the integrated model
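To make the order of steps 2 through 6 concrete, here is a toy pipeline in plain Python. It does not use the aisquared package; every function below (harvest, preprocess, analytic, postprocess, render) is a hypothetical stand-in, and the word-counting "model" exists only so the example runs end to end.

```python
from typing import List


def harvest(page_text: str) -> List[str]:
    # Stand-in for harvesting: treat each non-empty line as one model input.
    return [line.strip() for line in page_text.splitlines() if line.strip()]


def preprocess(text: str) -> List[str]:
    # Stand-in for preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def analytic(tokens: List[str]) -> float:
    # Toy "model": fraction of tokens found in a tiny positive-word list.
    positive = {"great", "good", "love"}
    if not tokens:
        return 0.0
    return sum(token in positive for token in tokens) / len(tokens)


def postprocess(score: float) -> str:
    # Turn the raw score into a human-readable label.
    return "positive" if score >= 0.25 else "neutral"


def render(text: str, label: str) -> str:
    # Stand-in for rendering: append the label to the original text.
    return f"{text} [{label}]"


def run_pipeline(page_text: str) -> List[str]:
    results = []
    for item in harvest(page_text):
        label = postprocess(analytic(preprocess(item)))
        results.append(render(item, label))
    return results


output = run_pipeline("Great product, love it\n\nArrived on Tuesday")
print(output)
```

Each stage consumes only the previous stage's output, which is why the configuration can describe the stages independently.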


This example uses the aisquared.config.ModelConfiguration object.
import aisquared

# harvester, preprocessor, analytic, postprocessor, and renderer are the objects defined in the steps that follow
config = aisquared.config.ModelConfiguration(
    name = 'example name', # name of the output .air file
    harvesting_steps = harvester, # how to scrape data to run through the model
    preprocessing_steps = preprocessor, # how to clean the scraped data
    analytic = analytic, # points to location of local model or remote endpoint
    postprocessing_steps = postprocessor, # how to manipulate model output
    rendering_steps = renderer, # how to display the model results in the DOM
    version = None, # version number of model, defaults to None
    description = 'New model', # description of model visible in the extension
    mlflow_uri = None, # URI of MLflow experiment
    mlflow_user = None, # MLflow username
    mlflow_token = None # MLflow token
)

Harvesting Data

To get value from a model, you need data to run through it. And once you have an inference to insert back into the browser environment, you need an element to map that inference to. This step addresses both needs by defining how the extension gathers text or image data from the DOM.
Data harvesting is facilitated by the aisquared.config.harvesting objects.
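As a rough picture of what gathering text and image data from the DOM involves, the sketch below walks an HTML string with Python's standard html.parser module. It is only an analogy: the extension does its harvesting in the browser, and the PageHarvester class here is a hypothetical name, not one of the aisquared.config.harvesting objects.

```python
from html.parser import HTMLParser
from typing import List


class PageHarvester(HTMLParser):
    """Collect visible text chunks and image sources from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.image_sources: List[str] = []
        self.text_chunks: List[str] = []
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_sources.append(src)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only text that a reader would actually see on the page.
        if text and self._skip_depth == 0:
            self.text_chunks.append(text)


page = '<div><p>Quarterly revenue rose.</p><img src="chart.png"><script>var x = 1;</script></div>'
page_harvester = PageHarvester()
page_harvester.feed(page)
page_harvester.close()

print(page_harvester.text_chunks)
print(page_harvester.image_sources)
```

Keeping the harvested items in page order matters later, because rendering needs to map each prediction back to the element it came from.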

Preprocessing Model Inputs

The aisquared.config.preprocessing package is designed to take the raw data harvested in the previous step and transform it into an appropriate input for the model or analytic you will use.
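For text models, turning raw harvested data into model input often means cleaning, tokenizing, and padding. The following is a minimal sketch of those generic operations in plain Python. The preprocess_text function, its parameters, and the tiny vocabulary are assumptions for illustration and are not part of aisquared.config.preprocessing.

```python
import string
from typing import Dict, List


def preprocess_text(
    text: str,
    vocabulary: Dict[str, int],
    max_length: int,
    pad_id: int = 0,
    unknown_id: int = 1,
) -> List[int]:
    """Lowercase, strip punctuation, map tokens to ids, then truncate or pad."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    token_ids = [vocabulary.get(token, unknown_id) for token in cleaned.split()]
    token_ids = token_ids[:max_length]
    # Pad on the right so every input has the same length.
    return token_ids + [pad_id] * (max_length - len(token_ids))


vocabulary = {"the": 2, "movie": 3, "was": 4, "great": 5}
encoded = preprocess_text("The movie was GREAT, really!", vocabulary, max_length=6)
print(encoded)
```

Whatever steps you configure should match the preprocessing applied when the model was trained; otherwise the model sees inputs it was never fitted on.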

Identify Model or Analytic

Now that you've harvested and preprocessed your input data, it's time to identify the model or analytic you will run that data through using the aisquared.config.analytic package. This package contains classes which allow you to point to wherever your model or analytic is deployed.

Postprocessing Model Results

Most models output values that aren't directly useful to human end-users. Since the magic of AI Squared is putting model results directly within the end-user's field of view, doesn't it make sense to transform model outputs so that they tell humans what they need to know?
We thought so, so we created the postprocessing subpackage to help. This subpackage contains classes for transforming common model output types into human-readable values.
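Two common transformations of this kind are thresholding a binary probability and applying a label map to multiclass scores. The sketch below shows both in plain Python. The function names binary_label and multiclass_label are hypothetical and do not correspond to classes in the postprocessing subpackage.

```python
from typing import Sequence


def binary_label(
    probability: float,
    threshold: float = 0.5,
    labels: Sequence[str] = ("negative", "positive"),
) -> str:
    """Map a single probability to one of two labels using a threshold."""
    return labels[1] if probability >= threshold else labels[0]


def multiclass_label(scores: Sequence[float], label_map: Sequence[str]) -> str:
    """Return the label whose score is highest (an argmax over the scores)."""
    if not scores or len(scores) != len(label_map):
        raise ValueError("scores and label_map must be non-empty and equal in length")
    best_index = max(range(len(scores)), key=lambda index: scores[index])
    return label_map[best_index]


print(binary_label(0.83))
print(multiclass_label([0.1, 0.7, 0.2], ["cat", "dog", "bird"]))
```

The order of the label map has to match the order of the model's output units, so it is worth saving the label list alongside the model at training time.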

Displaying Model Results

So, now you've got human-readable model results - great job so far! Now it's time to take those model outputs and map them back into the browser. This is highly case- and model-dependent, so we've built several classes which make this a snap. At a high level, this process involves identifying how/where to map model predictions, and how to visually display them. Rendering options can be defined using the aisquared.config.rendering subpackage.

Computer Vision Use Cases

For cases where we've harvested image data from the DOM, processed it, and returned some class label(s), the next step is to map the predicted class label(s) back to images in the DOM.

Natural Language Use Cases

For cases where we want to inject natural language results into the DOM, we will need to identify keywords to use to map model results back to the DOM.
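The keyword-mapping idea can be sketched with a regular expression that finds each keyword in the page text and attaches the corresponding prediction. This is a plain-Python illustration only: annotate_keywords is a hypothetical helper, the bracketed suffix stands in for whatever visual treatment the extension applies, and none of it reflects the aisquared.config.rendering API.

```python
import re
from typing import Dict


def annotate_keywords(text: str, predictions: Dict[str, str]) -> str:
    """Append each keyword's predicted label after every occurrence in the text."""
    if not predictions:
        return text
    # Try longer keywords first so overlapping keywords match greedily.
    keywords = sorted(predictions, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(keyword) for keyword in keywords) + r")\b",
        re.IGNORECASE,
    )
    lookup = {keyword.lower(): label for keyword, label in predictions.items()}

    def add_label(match: "re.Match[str]") -> str:
        word = match.group(0)
        return f"{word} [{lookup[word.lower()]}]"

    return pattern.sub(add_label, text)


annotated = annotate_keywords(
    "Acme shares fell while Globex rallied.",
    {"Acme": "negative", "Globex": "positive"},
)
print(annotated)
```

Matching on word boundaries keeps a keyword from being annotated inside a longer, unrelated word.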

Compile .air File

Now that you've instantiated your configuration and defined all of the logic needed to integrate model results into the browser, you can use the aisquared.config.ModelConfiguration object's .compile() method to create a compressed .air file that can be dragged and dropped into the AI Squared extension. For more information, please see the Extension Quickstart guide, which walks you through importing .air files into the AI Squared extension and using them to perform inference on data in the browser.

Special Cases

Please note that this guide is an introduction to, rather than an exhaustive overview of, the capabilities of AI Squared. It summarizes the processes we've simplified through a high-level Python API to help data scientists and other Python-literate users deploy models using AI Squared.
The Python API calls included here are simply an abstracted way to interact with the underlying JavaScript SDK. Users well-versed in JavaScript can be much more flexible in their use of AI Squared, including enabling use cases beyond the scope of the current Python API. We welcome feedback on our current API and input on what capabilities you would like us to support, as well as interest in accessing our JavaScript SDK. Please reach out to [email protected] with feedback, suggestions, or comments!