IP Address Lookup Analytic

This use case demonstrates a more complex lookup performed with a local analytic.

In this example, we will show how to create a .air file that performs a lookup of analytic results from a local analytic - in this case, a JSON data structure. To do this, we will build a simple example that uses a regular expression to identify strings of digits that could potentially be IP addresses. Once those are identified, the IP addresses will be looked up in a fictional user database to see whether there is any additional information about the associated user.

Dependencies

For this notebook, the following dependencies are required:

  • aisquared

This package is available on PyPI via pip. The following cells run the command to install this dependency and import it into the notebook environment.

! pip install aisquared

import aisquared
import json

Analytic Creation

Now that the required packages have been installed and imported, it is time to create the results of the analytic. These results are recorded in JSON form via a Python dictionary, with the top-level keys being IP addresses and the lower-level keys being the associated individual's name and user ID.

# Configure the example results

results = {
    '111.111.11.11' : {
        'name' : 'John Doe',
        'userID' : 11111
    },
    '222.222.22.22' : {
        'name' : 'Jane Doe',
        'userID' : 22222
    },
    '233.233.23.23' : {
        'name' : 'Alice Doe',
        'userID' : 33333
    }
}

# Save the analytic as a JSON file
file_name = 'analytic.json'
with open(file_name, 'w') as f:
    json.dump(results, f)
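
As a quick sanity check (purely illustrative, and not part of the .air workflow), we can reload the saved file and look up one of the example IP addresses directly:

# Illustrative only: reload the saved file and confirm the lookup
# table round-trips as expected
with open(file_name, 'r') as f:
    loaded = json.load(f)

print(loaded['111.111.11.11'])
# {'name': 'John Doe', 'userID': 11111}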

Create the Model Configuration

In the following blocks, we configure the harvesting, preprocessing, analytic, postprocessing, and rendering steps. Once those are created, we combine them in a ModelConfiguration object and compile it into the .air file.

Harvesting data from the webpage

# For harvesting, we need to harvest using a regular expression
# that identifies possible IP addresses. The following lines of
# code configure such harvesting

regex = r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
harvester = aisquared.config.harvesting.TextHarvester(
    how = 'regex',
    regex = regex
)
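
Before moving on, it can be helpful to see the end-to-end lookup logic outside the extension. The following standalone sketch applies the same pattern with Python's built-in re module and looks each match up in the saved JSON file. The sample text is made up purely for illustration, and this cell is not part of the .air configuration itself.

# Standalone illustration (not part of the .air file): find IP-like
# strings with Python's re module and look them up in the saved results
import re

sample_text = 'Login attempts from 111.111.11.11 and 222.222.22.22 were flagged.'

with open(file_name, 'r') as f:
    lookup_table = json.load(f)

for match in re.finditer(regex, sample_text):
    ip = match.group(0)
    print(ip, '->', lookup_table.get(ip, 'no record found'))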

Pre- and Post-processing

# No pre- or postprocessing steps are needed, so we can set those values to None
preprocessor = None
postprocessor = None

Analytic

# The analytic for this configuration is going to be a LocalAnalytic
# object, created by passing the saved file name to the class

analytic = aisquared.config.analytic.LocalAnalytic(file_name)

Rendering

# To render results, we are going to use the WordRendering class to
# initialize the rendering of badges for individual words. By default,
# the WordRendering class renders the specific words/tokens that were
# input into the analytic

renderer = aisquared.config.rendering.WordRendering()

Putting it all Together

Now that the pieces are in place, we can put them together into a .air file that is ready to use in the AI Squared Extension.

# Finally, we will take the previous objects and put them all 
# together into a single ModelConfiguration object, which is then 
# compiled into the .air file

config = aisquared.config.ModelConfiguration(
    name = 'IPAddressLookup',
    harvesting_steps = harvester,
    preprocessing_steps = preprocessor,
    analytic = analytic,
    postprocessing_steps = postprocessor,
    rendering_steps = renderer,
    version = None,
    description = 'IP address lookup which shows name and user ID (if present) for IP addresses found',
    mlflow_uri = None,
    mlflow_user = None,
    mlflow_token = None
)

# compile to create .air file
config.compile()
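
To confirm that compilation produced an output, you can list the .air files in the working directory. The exact file name is determined by the aisquared package (typically based on the configuration name), so this check is just a convenience:

# Illustrative only: list any .air files written to the working directory
import glob

print(glob.glob('*.air'))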

Try it now!

To try this analytic without running the Python code, here is a copy of the compiled .air file.
