Sentiment Analysis

This use case shows how to use our MANN and aisquared libraries to build and configure a .air file that can be dragged and dropped into a widget created with air JS for use in the browser.

In this example, we will show how to create a .air file to perform sentiment analysis in the browser using a neural network. To do this, we will utilize the IMDB Movie Reviews data set to build the initial model and then package the model using the aisquared Python SDK.

Dependencies

For this example, the following dependency is required:

  • aisquared

The aisquared package is available on PyPI via pip. The following cell installs it and imports it into the notebook environment, along with TensorFlow, pandas, and scikit-learn, which the rest of the example uses.

! pip install aisquared

from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
import tensorflow as tf
import pandas as pd
import aisquared

Model Creation

Now that the required packages have been installed and imported, it is time to create the sentiment analysis model. To do this, we first load and preprocess the data, create the model, prune the model so that it can perform well in the browser, and then package the model in the .air format. The following cells walk through each step of this process.

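# Load the data from the working directory

df = pd.read_csv('IMDB Dataset.csv')

# Tokenize the reviews: keep the 10,000 most frequent words and use the integer 1 as the out-of-vocabulary token

tokenizer = tf.keras.preprocessing.text.Tokenizer(10000, oov_token = 1)
tokenizer.fit_on_texts(df.review)
vocab = tokenizer.word_index

# Convert each review to a fixed-length sequence of 256 token ids (pad at the front, truncate at the end)

sequences = tokenizer.texts_to_sequences(df.review)
sequences = tf.keras.preprocessing.sequence.pad_sequences(sequences, 256, padding = 'pre', truncating = 'post')

# Encode the labels: positive -> 0, negative -> 1

labels = df.sentiment.map({'positive' : 0, 'negative' : 1}).values

# Hold out 30% of the data for testing

x_train, x_test, y_train, y_test = train_test_split(sequences, labels, train_size = 0.7)

# Drop the out-of-vocabulary entry (keyed by the integer 1) so that vocab contains only real words

del vocab[1]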
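The padding and truncation settings used above can be easy to mix up, so here is a minimal pure-Python sketch of what `padding = 'pre'` combined with `truncating = 'post'` does to a single sequence. The helper name `pad_pre_truncate_post` is hypothetical and only mirrors the documented behavior of `pad_sequences`; it is not part of TensorFlow or aisquared.

```python
def pad_pre_truncate_post(ids, length=256, pad_value=0):
    """Hypothetical helper mirroring pad_sequences(padding='pre', truncating='post')."""
    # 'post' truncation drops tokens from the END of an over-long sequence
    truncated = ids[:length]
    # 'pre' padding adds pad_value at the FRONT of a short sequence
    padding = [pad_value] * (length - len(truncated))
    return padding + truncated


short = pad_pre_truncate_post([5, 9, 2], length=6)
long = pad_pre_truncate_post([1, 2, 3, 4, 5, 6, 7, 8], length=6)

print(short)  # [0, 0, 0, 5, 9, 2]
print(long)   # [1, 2, 3, 4, 5, 6]
```

The same two settings appear again in the PadSequences step of the packaging configuration, so that text seen in the browser is shaped the same way as the training data.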

# Create the model

input_layer = tf.keras.layers.Input(sequences.shape[1:])
embedding_layer = tf.keras.layers.Embedding(
    10000,
    4
)(input_layer)
x = tf.keras.layers.Flatten()(embedding_layer)
for _ in range(5):
    x = tf.keras.layers.Dense(1000, activation = 'relu')(x)
output_layer = tf.keras.layers.Dense(1, activation = 'sigmoid')(x)

model = tf.keras.models.Model(input_layer, output_layer)
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
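To get a feel for how large this network is before it is shipped to the browser, the parameter count can be worked out by hand from the layer sizes above. The following is a rough sketch of that arithmetic only (the variable names are illustrative); `model.summary()` should report the authoritative numbers.

```python
def dense_params(n_in, n_out):
    """Weights plus biases for one fully connected layer."""
    return n_in * n_out + n_out


vocab_size, embed_dim, seq_len = 10000, 4, 256
hidden_units, hidden_layers = 1000, 5

embedding = vocab_size * embed_dim   # one 4-d vector per token id
flat = seq_len * embed_dim           # Flatten output: 256 tokens x 4 dims

# Layer widths from the flattened embedding through the sigmoid output
widths = [flat] + [hidden_units] * hidden_layers + [1]
dense = sum(dense_params(a, b) for a, b in zip(widths, widths[1:]))

total = embedding + dense
print(total)                 # 5070001
print(total * 2 / 1e6)       # approximate size of the weights in MB at 2 bytes each (float16)
```

Almost all of the parameters sit in the five Dense layers, which is why reducing the parameter count matters for in-browser use.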

Model Training and Evaluation

Now that you've created a model, you can get it ready for production. Since this model will run in an AI Squared-powered application in the browser, it should be as lightweight as possible. One way to accomplish this is active sparsification: as the model trains, the parameter count is reduced, ideally without sacrificing accuracy, yielding a model that requires fewer computational resources once it's deployed. Note that the training cell in this section passes callbacks = None, so no sparsification is applied as written. We then test the model's accuracy to ensure that it is up to the task of your production workloads.
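As an illustration of the general idea behind sparsification, the sketch below zeroes out the smallest-magnitude weights in a flat list. This is a generic magnitude-pruning example in plain Python, offered as an assumption about the kind of technique involved; it is not the callback or algorithm used by MANN or by this notebook, and `prune_by_magnitude` is a hypothetical name.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune weights closest to zero
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.45, -0.1]
pruned = prune_by_magnitude(weights, sparsity=0.5)

print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0, 0.45, 0.0]
```

In a real training loop the zeroed positions would typically be held at zero while the remaining weights continue to train, which is what lets accuracy recover.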

# Train the model (callbacks = None here, so no sparsification callback is applied)
model.fit(
    x_train,
    y_train.reshape(-1,1),
    epochs = 20,
    batch_size = 512,
    validation_split = 0.2,
    verbose = 2,
    callbacks = None
)

# Check model performance
preds = (model.predict(x_test) >= 0.5).astype(int)
print('Model Performance on Test Data:')
print('\n')
print(confusion_matrix(y_test, preds))
print(classification_report(y_test, preds))

# Save the model
model.save('SentimentClassifier.h5')
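The evaluation cell thresholds the sigmoid outputs at 0.5 and then tabulates them against the true labels. The following small sketch reproduces that logic in plain Python on made-up numbers, to show how the confusion matrix is laid out (rows are true classes, columns are predicted classes, matching scikit-learn's convention). The data and helper names are illustrative only.

```python
def threshold(probs, cutoff=0.5):
    """Turn sigmoid outputs into 0/1 predictions, as in (preds >= 0.5).astype(int)."""
    return [int(p >= cutoff) for p in probs]


def confusion(y_true, y_pred):
    """2x2 confusion matrix: rows = true class, columns = predicted class."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Made-up labels and scores; recall that 0 = positive and 1 = negative in this example
y_true = [0, 0, 1, 1, 1, 0]
probs = [0.10, 0.70, 0.90, 0.40, 0.50, 0.20]

y_pred = threshold(probs)
matrix = confusion(y_true, y_pred)
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)

print(y_pred)  # [0, 1, 1, 0, 1, 0]
print(matrix)  # [[2, 1], [1, 2]]
```

Because class 1 is "negative" here, the bottom-right cell counts reviews correctly identified as negative.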

Package the Model

Now that the model has been created and saved, we can package it into a single .air file that enables integration into the browser.

To perform this packaging, we will use the aisquared package. Each stage of the workflow (harvesting, preprocessing, analytic, postprocessing, rendering, and feedback) is configured separately, and the resulting objects are then passed to ModelConfiguration and compiled.

# Harvesting step
harvester = aisquared.config.harvesting.InputHarvester()

# Preprocessing mirrors the training pipeline: same vocabulary, OOV id, and padding settings
preprocessor = aisquared.config.preprocessing.text.TextPreprocessor(
    [
        aisquared.config.preprocessing.text.RemoveCharacters(),
        aisquared.config.preprocessing.text.ConvertToCase(lowercase = True),
        aisquared.config.preprocessing.text.Tokenize(),
        aisquared.config.preprocessing.text.ConvertToVocabulary(vocabulary = vocab, oov_character = 1, start_character = 0),
        aisquared.config.preprocessing.text.PadSequences(length = 256, pad_location = 'pre', truncate_location = 'post')
    ]
)

# The saved Keras model, which takes text input
analytic = aisquared.config.analytic.LocalModel('SentimentClassifier.h5', 'text')

# Label order matches the training labels: positive = 0, negative = 1
postprocessor = aisquared.config.postprocessing.BinaryClassification(['positive', 'negative'], 0.5)

renderer = aisquared.config.rendering.DocumentRendering(include_probability = True)

feedback = aisquared.config.feedback.BinaryFeedback(['positive', 'negative'])

# Assemble the configuration and compile it into the .air file
aisquared.config.ModelConfiguration(
    'SentimentClassifier',
    harvester,
    preprocessor,
    analytic,
    postprocessor,
    renderer,
    feedback).compile(dtype = 'float16')
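To make the role of the preprocessing and postprocessing settings more concrete, here is a rough plain-Python sketch of what those steps conceptually do to a piece of text and to a model output. It is an illustration under assumptions, not the aisquared implementation: the set of characters removed, the threshold comparison (`>=`), and the function names `encode` and `label_for` are all hypothetical, and the `start_character` setting is not modeled.

```python
import re


def encode(text, vocab, oov_id=1):
    """Strip punctuation, lowercase, split on whitespace, and look up token ids."""
    cleaned = re.sub(r"[^A-Za-z0-9\s]", "", text).lower()
    # Words missing from the vocabulary fall back to the OOV id
    return [vocab.get(token, oov_id) for token in cleaned.split()]


def label_for(probability, labels=("positive", "negative"), threshold=0.5):
    """Map a sigmoid output to a label; index 1 is chosen at or above the threshold."""
    return labels[1] if probability >= threshold else labels[0]


# Toy vocabulary standing in for tokenizer.word_index
toy_vocab = {"the": 2, "movie": 3, "was": 4, "great": 5}

ids = encode("The movie was GREAT!!!", toy_vocab)
ids_oov = encode("The plot was dull.", toy_vocab)

print(ids)             # [2, 3, 4, 5]
print(ids_oov)         # [2, 1, 4, 1]
print(label_for(0.9))  # negative
print(label_for(0.2))  # positive
```

The key point is that the label order passed to the postprocessor has to match the encoding used for training, where positive was mapped to 0 and negative to 1.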

Congratulations! Now you are ready to drag & drop your model into the browser with the AI Squared Extension!
