# Model Onboarding

### Overview

This guide walks through the steps of onboarding a model deployed in production to Arthur. Once your deployed model is onboarded, you can use Arthur to retrieve insights about model performance efficiently and at scale.

```{note}
This walkthrough uses tabular data. To onboard models of other input types, see {doc}`cv_onboarding` and {doc}`nlp_onboarding`.
```

### Requirements

You will need access to the data your model ingests and the predictions it produces. The model object itself is _not_ required, but it can be uploaded to enable the explainability enrichment. See our {doc}`/more-info/FAQs` for more info.

***

### Outline

This guide covers the three main steps to onboarding a model to the Arthur platform:

- [Model Registration](#model-registration) is the process of registering the model schema with Arthur and sending reference data
- [Onboarding Existing Inferences](#onboarding-existing-inferences) sends your model's historical predictions to the Arthur platform
- [Production Integration](#production-integration) connects your model's ongoing predictions in deployment to be logged with Arthur

***

## Model Registration

### Connect to Arthur

The first step is to import functions from the `arthurai` package and establish a connection with an Arthur username and password.

```python
# Arthur imports
from arthurai import ArthurAI
from arthurai.common.constants import InputType, OutputType, Stage, ValueType, Enrichment

arthur = ArthurAI(url="https://app.arthur.ai",
                  login="<YOUR_USERNAME_OR_EMAIL>")
```

### Register Model Type

To register a model, we start by creating a model object and defining its {ref}`high-level metadata <basic_concepts_input_output_types>`:

```python
arthur_model = arthur.model(
    partner_model_id="OnboardingModel_123",
    display_name="OnboardingModel",
    input_type=InputType.Tabular,
    output_type=OutputType.Multiclass,
    is_batch=False)
```

In particular, we set `is_batch=False` to define this as a {ref}`streaming model <basic_concepts_streaming_vs_batch>`, which means the Arthur platform will receive the model's inferences as they are produced live in deployment.

### Register Attributes with [ArthurModel.build()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.build)

Next we'll add more detail to the model metadata, defining the model's {ref}`attributes <basic_concepts_attributes_and_stages>`. The simplest method of registering your attributes is to use [ArthurModel.build()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.build), which parses a Pandas DataFrame of your {ref}`reference dataset <basic_concepts_reference_dataset>` containing inputs, metadata, predictions, and ground truth labels. In addition, a `pred_to_ground_truth_map` is required, which tells Arthur which of your attributes represent your model's predicted values, and how those predicted attributes correspond to your model's ground truth attributes.

Here we build a model with a `pred_to_ground_truth_map` configured for a binary classification model.

```python
# Map the PredictedValue attribute to its corresponding GroundTruth attribute value.
# This tells Arthur that in the data you send to the platform,
# the `predicted_probability` column represents
# the probability that the ground-truth column has the value 1
pred_to_ground_truth_map = {
    'predicted_probability': 1
}

arthur_model.build(
    reference_df,
    ground_truth_column='ground_truth_label',
    pred_to_ground_truth_map=pred_to_ground_truth_map)
```
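For concreteness, here is a minimal sketch of what the `reference_df` passed to `build()` above could look like for this walkthrough. The column names and value ranges mirror the example model schema shown in the [Review Model](#review-model) section below; your own reference data will of course use your model's columns.

```python
import numpy as np
import pandas as pd

# toy reference dataset: one continuous input (X0), a binary ground truth
# label, and the model's predicted probability for class 1
rng = np.random.default_rng(seed=0)
reference_df = pd.DataFrame({
    "X0": rng.uniform(16.0, 58.0, size=1000),
    "ground_truth_label": rng.integers(0, 2, size=1000),
    "predicted_probability": rng.uniform(0.0, 1.0, size=1000),
})
```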
#### Non Input Attributes

Some features of your data may be important to track for monitoring model performance even though they are not model inputs or outputs. These features can be added as non-input attributes in the ArthurModel:

```python
# Specify additional non-input attributes when building a model.
# This tells Arthur to monitor ['age','sex','race','education']
# in the reference and inference data you send to the platform
arthur_model.build(
    reference_df,
    ground_truth_column='ground_truth_label',
    pred_to_ground_truth_map=pred_to_ground_truth_map,
    non_input_columns=['age','sex','race','education']
)
```

### Register Attributes Manually

As an alternative to passing a DataFrame to [ArthurModel.build()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.build), attributes can also be registered for your model manually. Registering attributes manually may be preferable if you don't use the Pandas library, or if there are attribute properties not configurable from parsing your reference data alone. [ArthurModel.add_attribute()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.add_attribute) is the generic method to add any type of attribute to a model; its [docstring](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.add_attribute) also links to additional attribute registration methods tailored to specific model and data types for convenience.
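As a rough sketch, manually registering the continuous input attribute from the walkthrough above might look like the following. The parameter names here are an assumption based on the method's documentation; consult the [`add_attribute()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.add_attribute) docstring for the authoritative signature.

```python
from arthurai.common.constants import Stage, ValueType

# manually register a single continuous float input attribute
# (see the add_attribute() docstring for all available parameters)
arthur_model.add_attribute(
    name="X0",
    stage=Stage.ModelPipelineInput,
    value_type=ValueType.Float,
    categorical=False,
)
```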
#### Binary Classifier with Two Ground Truth Classes

If the data you send to the platform for a binary classifier has columns for the predicted probability and ground-truth status of class 0, as well as columns for the predicted probability and ground-truth status of class 1, then map each predicted value column to its corresponding ground truth column:

```python
# Map PredictedValue attributes to their corresponding GroundTruth attribute names
pred_to_ground_truth_map = {'pred_0': 'gt_0',
                            'pred_1': 'gt_1'}

# add the ground truth and predicted attributes to the model,
# specifying that the `pred_1` attribute is the
# positive predicted attribute, which means it corresponds to the
# probability that the binary target attribute is 1
arthur_model.add_binary_classifier_output_attributes(
    positive_predicted_attr='pred_1',
    pred_to_ground_truth_map=pred_to_ground_truth_map)
```

#### More Than Two Ground Truth Classes

If you are using a multiclass model, you will have more than two ground truth classes. To make this work with the Arthur platform, you will need to:

1. Ensure that you are using `predict_proba` (or a similar function) to predict the probability of a specific ground truth class
2. Ensure that each class probability is included in its own column in your dataset
3. Ensure that your ground truth mapping contains all possible classes that might be predicted

For example, if your model identifies whether an image contains a dog, a cat, or a horse, your ground truth mapping must contain an item for each of these classes (even if the model output doesn't predict a value for some of these categories).

If the data you send to the platform has ground truth one-hot encoded, then map predictions to each column name:

```python
# Map PredictedValue attributes to their corresponding GroundTruth attribute names.
# This pred_to_ground_truth_map maps predicted values to one-hot encoded ground truth columns.
# For example, this tells Arthur that the `probability_dog` column represents
# the probability that the `dog_ground_truth` column has the value 1.
pred_to_ground_truth_map = {
    "probability_dog": "dog_ground_truth",
    "probability_cat": "cat_ground_truth",
    "probability_horse": "horse_ground_truth"
}

arthur_model.add_multiclass_classifier_output_attributes(
    pred_to_ground_truth_map=pred_to_ground_truth_map
)
```

If the data you send to the platform has ground truth values in a single column, then map predictions to each column value:

```python
# Map PredictedValue attributes to their corresponding GroundTruth attribute values.
# This pred_to_ground_truth_map maps predicted values to the values of the ground truth column.
# For example, this tells Arthur that the `probability_dog` column represents
# the probability that the ground truth column has the value "dog".
pred_to_ground_truth_map = {
    "probability_dog": "dog",
    "probability_cat": "cat",
    "probability_horse": "horse"
}

arthur_model.add_classifier_output_attributes_gtclass(
    pred_to_ground_truth_map=pred_to_ground_truth_map,
    ground_truth_column="animal"
)
```

#### Regression Attributes

If you are registering a regression model, then specify the type of the predicted and ground truth values when registering the attributes:

```python
# Map the PredictedValue attribute to its corresponding GroundTruth attribute
pred_to_ground_truth_map = {
    "predicted_value": "ground_truth_value",
}

# add the pred_to_ground_truth_map, and specify the type of the
# predicted and ground truth values
arthur_model.add_regression_output_attributes(
    pred_to_ground_truth_map=pred_to_ground_truth_map,
    value_type=ValueType.Float
)
```

### Set Reference Data

If you used your reference data to register your model's attributes with [ArthurModel.build()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.build), you don't need to complete this step: the DataFrame you pass to `build()` is automatically saved as your model's reference data in the Arthur system. If you didn't use `build()`, or you want to update the reference dataset sent to Arthur, you can set it directly with the [`ArthurModel.set_reference_data()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.set_reference_data) method. This is also necessary if your reference dataset is too large to fit into memory as a Pandas DataFrame.
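If you do set reference data directly, the call might look like this — a minimal sketch; see the [`set_reference_data()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.set_reference_data) reference above for the exact parameters:

```python
# set the reference dataset from an in-memory DataFrame ...
arthur_model.set_reference_data(data=reference_df)

# ... or, if the dataset is too large for memory, point the method
# at a directory of Parquet files instead
arthur_model.set_reference_data(directory_path="/path/to/reference_data/")
```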
### Review Model

The method [ArthurModel.review()](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html#arthurai.core.models.ArthurModel.review) returns the model schema, a DataFrame of properties for each of your model's registered attributes. The `review()` method is called automatically when using `build()`, and can also be called on its own. Inspecting the model schema that `review()` returns is recommended to verify that attribute properties have been inferred correctly.

```{note}
Some important properties to check in the model schema:
- Check that attributes have the correct value types
- Check that attributes are correctly marked as categorical or continuous
- Check that attributes you want to monitor for bias have `monitor_for_bias=True`
```

By default, printing the model schema doesn't display all the attribute properties. To examine the model schema in its entirety, raise the maximum number of rows and columns to display:

```python
pd.set_option('display.max_columns', 10)
pd.set_option('max_rows', 50)
arthur_model.review()
```

The model schema should look like this:

```
                    name            stage value_type categorical is_unique                categories  bins         range monitor_for_bias
0                     X0   PIPELINE_INPUT      FLOAT       False     False                        []  None  [16.0, 58.0]            False
1     ground_truth_label     GROUND_TRUTH    INTEGER        True     False  [{value: 0}, {value: 1}]  None  [None, None]            False
2  predicted_probability  PREDICTED_VALUE      FLOAT       False     False                        []  None        [0, 1]            False
```

```{note}
To modify attribute properties in the model schema table, see the docstring for [ArthurAttribute](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.attributes.ArthurAttribute.html#arthurai.core.attributes.ArthurAttribute) for a complete description of model attribute properties and their configuration methods.
```

### Save Model

Once you have reviewed your model schema and made any necessary modifications to your model's attributes, you are ready to save your model to Arthur. Calling `arthur_model.save()` returns the unique ID Arthur creates for your model. You can easily load the model from the Arthur system later on using either this ID or the `partner_model_id` you specified when you first created the model.

```python
arthur_model_id = arthur_model.save()
```
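For example, loading the model back in a later session can use either identifier; the `partner_model_id` form below is the same call the [Quick Integration](#quick-integration) section uses later in this guide.

```python
# load the model by the Arthur-assigned ID returned from save() ...
arthur_model = arthur.get_model(arthur_model_id)

# ... or by the partner model ID chosen at registration time
arthur_model = arthur.get_model("OnboardingModel_123", id_type="partner_model_id")
```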
### Activate Enrichments

[Enrichments](../basic_concepts.html#enrichments) are model monitoring services Arthur provides that can be activated once your model is saved to Arthur. Models have the {ref}`Anomaly Detection <enrichments_anomaly_detection>` enrichment enabled by default if your plan supports it. Here we first enable {ref}`Hotspots <enrichments_hotspots>`, which doesn't require any configuration, and then activate explainability, which requires more configuration and therefore comes with its own helper function.

```python
# first activate hotspots
arthur_model.enable_hotspots()

# enable explainability using its own helper function for convenience
arthur_model.enable_explainability(
    df=X_train,
    project_directory="/path/to/model_folder/",
    requirements_file="requirements.txt",
    user_predict_function_import_path="model_entrypoint",
    # optionally exclude directories within the project folder
    # from being bundled with the predict function
    ignore_dirs=["folder_to_ignore"]
)
```

For more information on enabling enrichments and updating their configurations, see {doc}`/user-guide/walkthroughs/enrichments`.

***

## Onboarding Existing Inferences

If your model is already running in production, a good next step is to send your historical inferences to Arthur. In this section, we'll gather those historical inferences and then send them to the platform.

### Collecting Historical Inferences

When logging inferences with Arthur, you may include:

- **Model Inputs** which were sent to your model to make predictions
- **Model Predictions** which you can fetch from storage, or re-compute from your input data if you don't have them saved
- **Non-Input Data** that you registered with your Arthur model but that doesn't feed into your model
- **Ground Truth** labels for the inputs, if you have them available
- **Partner Inference IDs** that uniquely identify your predictions and can be used to update inferences with ground truth labels in the future (details below)
- **Inference Timestamps** which you can approximate with the [`generate_timestamps()` function](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.util.generate_timestamps.html?highlight=generate_timestamps#arthurai.util.generate_timestamps) if you're simulating production data, or omit to use the current time
- **Ground Truth Timestamps** which you can approximate with the [`generate_timestamps()` function](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.util.generate_timestamps.html?highlight=generate_timestamps#arthurai.util.generate_timestamps) if you're simulating production data, or omit to use the current time
- **Batch IDs** that denote something like a unique "run ID" if your model is a batch model

You might have all the data you need in one convenient place, but more often you'll need to gather it from a few tables or data stores. For example, you might:

- collect your input and non-input data from your data warehouse
- fetch your predictions and timestamps from the blob storage used with your model deployment
- match them to your ground truth labels in a different legacy system

#### Partner Inference IDs

Arthur offers partner inference IDs as a way to match specific inferences in Arthur against your other systems and update your inferences with ground truth labels as they become available in the future. The most appropriate choice for a partner inference ID depends on your specific circumstances, but common strategies include _using existing IDs_ and _joining metadata with non-unique IDs_.

If you already have existing IDs that are unique to each inference and easily attached to future ground truth labels, you can simply use those (casting to strings if needed).

Another common approach is to construct a partner inference ID from multiple pieces of metadata. For example, if your model makes predictions about your customers at most once per day, you might construct your partner inference IDs as `{customer_id}-{date}`. This is easy to reconstruct when sending ground truth labels much later: simply look up the labels for all the customers passed to the model on a given day and append that date to their IDs.

If you don't supply partner inference IDs, the SDK will generate them for you and return them from your `send_inferences()` call. These can be kept for future reference, or discarded if you've already sent ground truth values or don't plan to send them in the future.
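Here's a brief sketch of the `{customer_id}-{date}` strategy; the column names are illustrative:

```python
import pandas as pd

# suppose each inference row carries a customer ID and the prediction date
inference_data = pd.DataFrame({
    "customer_id": ["c-001", "c-002"],
    "prediction_date": ["2023-01-15", "2023-01-15"],
    # ... plus your input and non-input columns ...
})
predictions = ...  # your model's predictions for these rows

# deterministic IDs like "c-001-2023-01-15" are easy to reconstruct
# later, once the ground truth labels for that day become available
partner_inference_ids = (
    inference_data["customer_id"] + "-" + inference_data["prediction_date"]
).tolist()

arthur_model.send_inferences(
    inference_data,
    predictions=predictions,
    partner_inference_ids=partner_inference_ids)
```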
### Sending Inferences

Arthur offers many flexible options for sending your inferences. A few SDK methods accept Pandas DataFrames, native Python objects, and Parquet files, with data grouped into single datasets or spread across separate method calls and parameters. Two examples are outlined below, but for all the available usages see our SDK Reference for:

- the [`ArthurModel.send_inferences()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html?highlight=send_inferences#arthurai.core.models.ArthurModel.send_inferences) and [`update_inference_ground_truths()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html?highlight=update_inference_ground_truths#arthurai.core.models.ArthurModel.update_inference_ground_truths) methods, which are recommended for non-Parquet datasets under 100,000 rows
- the [`ArthurModel.send_bulk_inferences()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html?highlight=send_bulk_inferences#arthurai.core.models.ArthurModel.send_bulk_inferences) and [`send_bulk_ground_truths()`](https://docs.arthur.ai/sdk/sdk_v3/apiref/arthurai.core.models.ArthurModel.html?highlight=send_bulk_ground_truths#arthurai.core.models.ArthurModel.send_bulk_ground_truths) methods, which are recommended for sending large datasets or Parquet files

If you'd prefer to send data directly to the REST API, see the [Inferences section of our API Reference](https://docs.arthur.ai/api-documentation/v3-api-docs.html#tag/inferences).

#### A Simple Case

Here we suppose we've gathered our inputs, non-input data, and ground truth labels into a single DataFrame. We also fetch our predictions and the times at which they were made, and send everything in a single method call. Here we're passing the predictions and timestamps as parameters into the method, but we could also simply add them to the `inference_data` DataFrame. We don't worry about partner inference IDs here, leaving them to be auto-generated.

```python
# load model input and non-input values, plus ground truth labels and timestamps,
# as a Pandas DataFrame
inference_data = ...

# retrieve predictions and timestamps as lists
# note that we could also include these as columns in the DataFrame above
predictions, inference_timestamps = ...

# send the inferences to Arthur,
# using auto-generated partner inference IDs since we're sending ground truth right now
arthur_model.send_inferences(
    inference_data,
    predictions=predictions,
    inference_timestamps=inference_timestamps)
```
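If ground truth weren't available yet, you could instead keep the partner inference IDs that `send_inferences()` generates and returns, and use them later with `update_inference_ground_truths()`. A hedged sketch, assuming the labels arrive as a list aligned with the saved IDs; the exact response and parameter shapes are documented in the method references above.

```python
# keep the auto-generated partner inference IDs from the send_inferences()
# response for later use (the precise response format is documented in the
# SDK reference linked above)
response = arthur_model.send_inferences(
    inference_data,
    predictions=predictions,
    inference_timestamps=inference_timestamps)

# ... later, once labels are available, attach them to the saved IDs
# (saved_partner_inference_ids and labels are hypothetical variables here)
ground_truth_records = [
    {"partner_inference_id": pid, "ground_truth_label": label}
    for pid, label in zip(saved_partner_inference_ids, labels)
]
arthur_model.update_inference_ground_truths(ground_truth_records)
```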
### Sending Inferences at Scale with Delayed Ground Truth

Next, we consider a more complex case: a batch model with many inferences whose ground truth is sent separately, relying on our partner inference IDs to join the ground truth values to the previous inferences. We assume the data is neatly collected as described above. This may rely on an [ETL job](https://en.wikipedia.org/wiki/Extract,_transform,_load): perhaps a Spark job, a Redshift or Snowflake export, an Apache Beam job in Google Cloud Dataflow, Pandas `read_sql()` and `to_parquet()` calls, or whatever data wrangling toolkit you're most comfortable with.

```python
# we collect a set of folder names, each corresponding to a batch run, containing one or
# more Parquet files with the input attribute columns, non-input attribute columns, and
# prediction attribute columns, as well as a "partner_inference_id" column with our unique
# identifiers and an "inference_timestamp" column
inference_batch_dirs = ...

# then suppose we have a directory with one or more Parquet files containing matching
# "partner_inference_id"s and our ground truth attribute columns, as well as a
# "ground_truth_timestamp" column
ground_truth_dir = ...

# send the inferences to Arthur
for batch_dir in inference_batch_dirs:
    batch_id = batch_dir.split("/")[-1]  # use the directory name as the Batch ID
    arthur_model.send_bulk_inferences(directory_path=batch_dir, batch_id=batch_id)

# send the ground truths to Arthur
arthur_model.send_bulk_ground_truths(directory_path=ground_truth_dir)
```

### See Model in Dashboard

To confirm that the inferences have been sent, you can view your model and its inferences in the Arthur dashboard.

### Performance Results

Once you've logged your model's inferences with Arthur, you can evaluate your model's performance. You can open your Arthur dashboard to view model performance in the UI, or use the code snippets below to fetch the same results right from your Python environment using {doc}`Arthur's Query API </user-guide/api-query-guide/index>`.

#### Query Overall Performance

You can query the overall accuracy rate with the following snippet; for non-classifier models, consider replacing the `accuracyRate` function with another {doc}`model evaluation function </user-guide/api-query-guide/model_evaluation_functions>`.

```python
# query model accuracy across the batches
query = {
    "select": [
        {"function": "accuracyRate"}
    ]
}
query_result = arthur_model.query(query)
```

#### Visualize Performance Results

Visualize performance metrics over time:

```python
# plot model performance metrics over time
arthur_model.viz.metric_series(
    ["auc", "falsePositiveRate"],
    time_resolution="hour")
```

Visualize data drift over time:

```python
# plot drift over time of attributes
# from their baseline distribution in the model's reference data
arthur_model.viz.drift_series(
    ["X0", "predicted_probability"],
    drift_metric="KLDivergence",
    time_resolution="hour")
```

#### {doc}`API Query Guide </user-guide/api-query-guide/index>`

For more analysis of model performance, the {doc}`/user-guide/api-query-guide/index` shows how to use the Arthur API to get the model performance results you need, efficiently and at scale. Our backend query engine allows for fine-grained and customizable performance analysis.

***

## Production Integration

Now that you have registered your model and retrieved initial performance metrics on your model's historical inferences, you are ready to connect your production pipeline to Arthur.

Arthur has several methods of receiving your production model's inference data. Most involve some process making a call to one of the SDK methods described above, but where that process runs and reads data from depends on your production environment. We explore a few common patterns below, as well as some of Arthur's direct {doc}`integrations </user-guide/integrations/index>`.

For a quick start, consider the [quick integration](#quick-integration), which only involves adding a few lines of code to your model prediction code. If your model inputs and predictions are written out to a data stream such as a Kafka topic, consider [adding a stream listener](#streaming-integrations). If you don't mind a bit of latency between when your predictions are made and when they're logged with Arthur, or it's much easier to read your inference data at rest, consider setting up an [inference upload job](#inference-upload-jobs).

Note that these methods can be combined for prediction and ground truth values: you might use the quick integration or streaming approach for inference data but a batch job to update ground truth labels.

### API Keys

API keys authorize your requests to send and receive data to and from the Arthur platform. With a valid API key added to your production environment, your model deployment code can be augmented to send your model's inferences to Arthur. See the {doc}`/platform-management/access-control-overview/standard_access_control` guide to obtain an Arthur API key.

### Quick Integration

Quick integration with Arthur means calling the `send_inferences()` method *when* and *where* your model object produces inferences. This is the simplest and quickest way to connect a production model to Arthur. However, it adds some latency to your model's inference path; for approaches that decouple logging from serving, see the streaming and upload-job patterns below.

For example, suppose your model is hosted in production behind an API built with Flask. The call to `arthur_model.send_inferences()` just needs to be included wherever your `predict` function is defined, so your updated code might look something like this:

```python
####################################################
# New code to fetch the ArthurModel

# connect to Arthur
import os
from arthurai import ArthurAI
arthur = ArthurAI(
    url="https://app.arthur.ai",
    access_key=os.environ["ARTHUR_API_KEY"])

# retrieve the arthur model
arthur_model = arthur.get_model(os.environ["ARTHUR_PARTNER_MODEL_ID"],
                                id_type='partner_model_id')

####################################################
# your original model prediction function,
# which can be on its own as a python script
# or wrapped by an API like a Flask app
def predict():
    # get data to apply model to
    inference_data = ...

    # generate inferences
    # in this example, the predictions are classification probabilities
    predictions = model.predict_proba(...)

    ####################################################
    # NEW PART OF YOUR MODEL'S PREDICTION SCRIPT:
    # send new inferences to Arthur
    arthur_model.send_inferences(
        inference_data,
        predictions=predictions)
    ####################################################

    return predictions
```

Alternatively, if you have a batch model that runs in jobs, you might add similar code to the very end of your job rather than inside the `predict()` function.

### Streaming Integrations

If you write your model's inputs and outputs to a data stream, you can add a listener to that stream to log those inferences with Arthur. For example, if you have a Kafka topic, you might add a new `arthur` consumer group to listen for new events and pass them to the `send_inferences()` method, as sketched below. If your inputs and predictions live in different topics, or you want to add non-input data from another topic, you might use [Kafka Streams](https://kafka.apache.org/documentation/streams/) to join the various topics before sending to Arthur.
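A minimal sketch of such a listener using the kafka-python client, assuming a hypothetical `model-inferences` topic whose messages are JSON records containing the model's input values and its prediction:

```python
import json
from kafka import KafkaConsumer

# a dedicated "arthur" consumer group, so logging to Arthur doesn't
# interfere with the model's other consumers
consumer = KafkaConsumer(
    "model-inferences",  # hypothetical topic name
    bootstrap_servers=["localhost:9092"],
    group_id="arthur",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # assumed message layout: {"inputs": {...}, "prediction": ...}
    arthur_model.send_inferences(
        [event["inputs"]],
        predictions=[event["prediction"]])
```

In practice you would likely buffer events and send them in batches rather than making one call per message.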
### Inference Upload Jobs

Another approach is to run jobs that read data at rest and send it to the Arthur platform. These jobs might be scheduled or event-driven, depending on your architecture. For example, you might have regularly scheduled jobs that (see the sketch after this list):

1. look up the inference or ground truth data since the last run
1. format the data and write it to a few Parquet files
1. send the Parquet files to the Arthur platform using `send_bulk_inferences()` or `send_bulk_ground_truths()`
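A hedged sketch of such a scheduled job, where `fetch_inferences_since()` is a hypothetical helper standing in for whatever query pulls new rows from your own data store:

```python
import datetime
import tempfile
import pandas as pd

def run_upload_job(last_run: datetime.datetime) -> None:
    # hypothetical helper: fetch inference rows produced since the last
    # run from your warehouse or blob storage as a Pandas DataFrame
    new_inferences: pd.DataFrame = fetch_inferences_since(last_run)
    if new_inferences.empty:
        return

    # write the rows to a Parquet file in a temporary directory ...
    with tempfile.TemporaryDirectory() as upload_dir:
        new_inferences.to_parquet(f"{upload_dir}/inferences.parquet")

        # ... and bulk-upload the whole directory to Arthur
        arthur_model.send_bulk_inferences(directory_path=upload_dir)
```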
### Integrations

Rather than hand-rolling your own inference upload jobs, Arthur also offers more direct integrations. For example, our {ref}`SageMaker Data Capture Integration <sagemaker_integration>` makes integrating with SageMaker models a breeze by using Data Capture to log inferences into files in S3 and triggering upload jobs in response to those file write events. Our {ref}`Batch Ingestion from S3 <s3_batch_ingestion>` integration lets you simply upload your Parquet files to S3, and Arthur will automatically import them into the system.

```{toctree}
:hidden:
:maxdepth: 3

General Onboarding <self>
cv_onboarding
nlp_onboarding
```