# Data Drift

## Querying Drift in Python

To run a drift query with the Python SDK, set the `query_type` parameter to `'drift'`:
```python
query = {...}

arthur_model.query(query, query_type='drift')
```

## Data Drift Endpoint

Data drift has a dedicated endpoint at `/models/{model_id}/inferences/query/data_drift`.

Returns the data drift metric between a `base` dataset and a `target` dataset. This endpoint supports up to 100 properties per request.

* `num_bins` - Specifies the granularity of bucketing for continuous distributions and will be ignored if the attribute is categorical.
* `metric` - Specify one metric among {ref}`the data drift metrics Arthur offers <glossary_data_drift>`.
* `filter` - Optional blocks, set separately on the base and target data (reference or inference set), that specify which data is used in the data drift calculation.
* `group_by` - Optional. Applies globally to both the base and target data.
* `rollup` - Optional parameter that will aggregate the calculated data drift value by the supported time dimension.

For `HypothesisTest`, the returned value is transformed as -log_10(P_value) to maintain directional parity with the other data drift metrics. That is, a lower P_value is more significant and implies data drift, which is reflected in a higher -log_10(P_value). Further mathematical details are in the {ref}`glossary <glossary_hypothesis_test>`.
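
As a minimal sketch of that transform, the snippet below maps a few p-values to the score described above. The function name `drift_score_from_p_value` is hypothetical and is not part of the Arthur SDK; it only illustrates the arithmetic.

```python
import math


def drift_score_from_p_value(p_value: float) -> float:
    """Return -log10(p_value), the transform described for HypothesisTest.

    Hypothetical helper for illustration only; not an Arthur SDK function.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -math.log10(p_value)


# Smaller p-values (stronger evidence of drift) map to larger scores,
# which matches the direction of the other drift metrics.
for p in (0.5, 0.05, 0.001):
    print(f"p={p}: score={drift_score_from_p_value(p):.3f}")
```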

Query Request:
```json
{
    "properties": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference|ground_truth]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
```

Query Response:
```json
{
    "query_result": [
        {
            "<attribute1_name>": "<attribute1_data_drift> [float]",
            "<attribute2_name>": "<attribute2_data_drift> [float]",
            "<attribute3_name>": "<attribute3_data_drift> [float]",
            "<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
            "rollup": "<rollup_attribute_value> [string|null]"
        }
    ]
}
```

See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

### Example: Reference vs. Inference

Sample Request: Calculate data drift for males, grouped by country, rolled up by hour.

```json
{
    "properties": [
        "age"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "reference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            },
            {
                "property": "inference_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "inference_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "group_by": [
        {
            "property": "country"
        }
    ],
    "rollup": "hour"
}
```
Sample Response:
```json
{
    "query_result": [
        {
            "age": 2.3,
            "country": "Canada",
            "rollup": "2020-07-22T10:00:00Z"
        },
        {
            "age": 2.4,
            "country": "United States",
            "rollup": "2020-07-22T10:00:00Z"
        }
    ]
}
```
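
Request bodies like the one above can be assembled programmatically. The following is a sketch under stated assumptions: `make_filter` and `build_drift_query` are hypothetical helpers (not SDK functions) that only build a dictionary in the shape of the request schema and enforce the 100-property limit mentioned earlier.

```python
from typing import Any, Dict, List, Optional, Sequence

MAX_PROPERTIES = 100  # per-request limit stated in the endpoint description


def make_filter(prop: str, comparator: str, value: Any) -> Dict[str, Any]:
    """Build one filter clause in the shape shown in the request schema."""
    return {"property": prop, "comparator": comparator, "value": value}


def build_drift_query(
    properties: Sequence[str],
    metric: str,
    base_source: str,
    target_source: str,
    num_bins: int = 10,
    base_filter: Optional[List[Dict[str, Any]]] = None,
    target_filter: Optional[List[Dict[str, Any]]] = None,
    group_by: Optional[Sequence[str]] = None,
    rollup: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a data drift request body; optional keys are omitted if unset."""
    if len(properties) > MAX_PROPERTIES:
        raise ValueError(f"at most {MAX_PROPERTIES} properties per request")
    query: Dict[str, Any] = {
        "properties": list(properties),
        "num_bins": num_bins,
        "metric": metric,
        "base": {"source": base_source},
        "target": {"source": target_source},
    }
    if base_filter:
        query["base"]["filter"] = list(base_filter)
    if target_filter:
        query["target"]["filter"] = list(target_filter)
    if group_by:
        query["group_by"] = [{"property": name} for name in group_by]
    if rollup:
        query["rollup"] = rollup
    return query


# Rebuild the reference vs. inference sample request shown above.
query = build_drift_query(
    properties=["age"],
    metric="PSI",
    base_source="reference",
    target_source="inference",
    base_filter=[make_filter("gender", "eq", "male")],
    target_filter=[
        make_filter("gender", "eq", "male"),
        make_filter("inference_timestamp", "gte", "2020-07-22T10:00:00Z"),
        make_filter("inference_timestamp", "lt", "2020-07-23T10:00:00Z"),
    ],
    group_by=["country"],
    rollup="hour",
)
```

The resulting dictionary could then be passed to `arthur_model.query(query, query_type='drift')` as described in the Python section of this page.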

### Example: Inference vs. Inference

Sample Request: Compare data drift between two batches, with no grouping and no rollup. Each side uses a single `batch_id` filter to select its batch.

```json
{
    "properties": [
        "age"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "5"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "6"
            }
        ]
    }
}
```
Sample Response:
```json
{
    "query_result": [
        {
            "age": 2.3
        }
    ]
}
```

[back to top](#data-drift)

### Example: Reference vs. Ground Truth

Sample Request: Calculate data drift for individual ground truth class prediction probabilities, rolled up by hour.

```json
{
    "properties": [
        "gt_1"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "ground_truth",
        "filter": [
            {
                "property": "ground_truth_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "ground_truth_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "rollup": "hour"
}
```
Sample Response:
```json
{
    "query_result": [
        {
            "gt_1": 0.03,
            "rollup": "2020-07-22T10:00:00Z"
        },
        {
            "gt_1": 0.4,
            "rollup": "2020-07-22T11:00:00Z"
        }
    ]
}
```

[back to top](#data-drift)


## Data Drift PSI Bucket Table Values

This metric has a dedicated endpoint at `/models/{model_id}/inferences/query/data_drift_psi_bucket_calculation_table`.

Returns the [PSI](https://scholarworks.wmich.edu/cgi/viewcontent.cgi?article=4249&context=dissertations) scores by bucket using the reference set data. The query for this endpoint omits `metric` and takes a single `property`, but is otherwise identical to the [data drift endpoint](#data-drift-endpoint).

Note when using this endpoint with categorical features, the `bucket_min` and `bucket_max` fields will not be
returned in the response. Instead, the `bucket` field will contain the category name.

Query Request:
```json
{
    "property": "<attribute_name> [string]",
    "num_bins": "<num_bins> [int]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
```
Query Response:
```json
{
    "query_result": [
        {
            "bucket": "string",
            "rollup": "string|null",
            "group_by_property_1": "string|null",
            "base_bucket_max": "number",
            "base_bucket_min": "number",
            "base_count_per_bucket": "number",
            "base_ln_probability_per_bucket": "number",
            "base_probability_per_bucket": "number",
            "base_total": "number",
            "target_bucket_max": "number",
            "target_bucket_min": "number",
            "target_count_per_bucket": "number",
            "target_ln_probability_per_bucket": "number",
            "target_probability_per_bucket": "number",
            "target_total": "number",
            "probability_difference": "number",
            "ln_probability_difference": "number",
            "psi": "number"
        }
    ]
}
```

See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

***

Sample Request: Calculate data drift bucket components for males, grouped by country, rolled up by hour.
```json
{
    "property": "age",
    "num_bins": 2,
    "base": {
        "source": "reference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            },
            {
                "property": "inference_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "inference_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "group_by": [
        {
            "property": "country"
        }
    ],
    "rollup": "hour"
}
```
Sample Response:
```json
{
    "query_result": [
        {
            "bucket": "bucket_1",
            "rollup": "2020-01-01T00:00:00Z",
            "country": "Canada",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "rollup": "2020-01-01T00:00:00Z",
            "country": "United States",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_1",
            "rollup": "2020-01-01T01:00:00Z",
            "country": "Canada",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "rollup": "2020-01-01T01:00:00Z",
            "country": "United States",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        }
    ]
}
```
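
The fields in a response row are related arithmetically. The sketch below recomputes one row from its counts, assuming the usual per-bucket PSI term (difference in probabilities multiplied by difference in natural-log probabilities). That formula is inferred from the field names and is consistent with the sample values above; `psi_bucket` is a hypothetical helper, not an SDK function.

```python
import math
from typing import Dict


def psi_bucket(
    base_count: int, base_total: int, target_count: int, target_total: int
) -> Dict[str, float]:
    """Recompute the per-bucket fields from raw counts.

    Assumes both counts are positive; an empty bucket would make the
    logarithm undefined and is not handled in this sketch.
    """
    base_p = base_count / base_total
    target_p = target_count / target_total
    probability_difference = base_p - target_p
    ln_probability_difference = math.log(base_p) - math.log(target_p)
    return {
        "base_probability_per_bucket": base_p,
        "target_probability_per_bucket": target_p,
        "probability_difference": probability_difference,
        "ln_probability_difference": ln_probability_difference,
        "psi": probability_difference * ln_probability_difference,
    }


# Counts taken from the first row of the sample response above.
row = psi_bucket(base_count=4988, base_total=10000,
                 target_count=2487, target_total=4861)
print(row["psi"])
```

Both differences carry the same sign, so each bucket's contribution is non-negative.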

Sample Request: Compare data drift bucket components between two batches, with no grouping and no rollup. Each side uses a single `batch_id` filter to select its batch.

```json
{
    "property": "age",
    "num_bins": 10,
    "base": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "5"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "6"
            }
        ]
    }
}
```
Sample Response:
```json
{
    "query_result": [
        {
            "bucket": "bucket_1",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        }
    ]
}
```

[back to top](#data-drift)

## Data Drift for Classification Outputs

For classification outputs, one may want to examine drift among a collection of different classes, i.e. the system of outputs, instead of the drift of the probability predictions of a single class. The query uses one of `"predicted_classes": ["*"]` or `"ground_truth_classes": ["*"]` but otherwise is identical to a standard data drift query. Rather than using the star operator to select all prediction or ground truth classes, respectively, in a model, a list of string classes can be provided for looking at drift of a subset of multiclass outputs. 

* `predicted_classes` - Specifies which prediction classes to use for `predictedClass` data drift. 
* `ground_truth_classes` - Specifies which ground truth classes to use for `groundTruthClass` data drift.

`properties` can be included in the same query as long as the target `source` corresponds to the classification output tag. For example, one can query drift on input attributes and `predictedClass` in the same query with a target `source` of `inference`; one can query drift on individual ground truth labels and `groundTruthClass` in the same query with a target `source` of `ground_truth`.

Query Request:
```json
{
    "properties [Optional]": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "[predicted_classes|ground_truth_classes]": [
        "<class0_name> [string]",
        "<class1_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference|ground_truth]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
```

Query Response:
```json
{
    "query_result": [
        {
            "<attribute1_name>": "<attribute1_data_drift> [float]",
            "<attribute2_name>": "<attribute2_data_drift> [float]",
            "<attribute3_name>": "<attribute3_data_drift> [float]",
            "[predictedClass|groundTruthClass]": "<classification_data_drift> [float]",
            "<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
            "rollup": "<rollup_attribute_value> [string|null]"
        }
    ]
}
```

See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

***

Sample Request: Calculate data drift on all prediction classes.
```json
{
    "predicted_classes": [
        "*"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "inference"
    },
    "metric": "PSI"
}
```

Sample Response:
```json
{
    "query_result": [
        {
            "predictedClass": 0.021
        }
    ]
}
```

Sample Request: Calculate data drift on ground truth using the first and third ground truth classes.
```json
{
    "ground_truth_classes": [
        "gt_1",
        "gt_3"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "ground_truth"
    },
    "metric": "PSI"
}
```

Sample Response:
```json
{
    "query_result": [
        {
            "groundTruthClass": 0.021
        }
    ]
}
```
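
The pairing between the class key and the target `source` described in this section can be checked before a query is sent. This is a sketch only: `check_class_drift_query` is a hypothetical client-side helper, and the pairing table is an assumption drawn from the prose above (`predicted_classes` with `inference`, `ground_truth_classes` with `ground_truth`), not a documented validation rule.

```python
from typing import Any, Dict

# Assumed pairing, taken from the description in this section.
EXPECTED_TARGET_SOURCE = {
    "predicted_classes": "inference",
    "ground_truth_classes": "ground_truth",
}


def check_class_drift_query(query: Dict[str, Any]) -> str:
    """Return the class key used by the query, or raise ValueError."""
    keys = [key for key in EXPECTED_TARGET_SOURCE if key in query]
    if len(keys) != 1:
        raise ValueError(
            "use exactly one of predicted_classes or ground_truth_classes"
        )
    key = keys[0]
    classes = query[key]
    if not classes or not all(isinstance(name, str) for name in classes):
        raise ValueError(f"{key} must be a non-empty list of strings")
    target_source = query.get("target", {}).get("source")
    if target_source != EXPECTED_TARGET_SOURCE[key]:
        raise ValueError(
            f"{key} expects target source "
            f"{EXPECTED_TARGET_SOURCE[key]!r}, got {target_source!r}"
        )
    return key


# All prediction classes, compared against inference data.
all_predictions = {
    "predicted_classes": ["*"],
    "num_bins": 20,
    "base": {"source": "reference"},
    "target": {"source": "inference"},
    "metric": "PSI",
}
print(check_class_drift_query(all_predictions))
```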

[back to top](#data-drift)

(automated_data_drift_thresholds)=
## Automated Data Drift Thresholds

What is a sufficiently high data drift value to suggest that the target data has actually drifted from the base data? For `HypothesisTest`, we can reverse engineer -log_10(P_value) and plug in the conventional .05 alpha level to establish a lower bound of -log_10(.05). 

For the other data drift metrics, it is not sufficient to pin a constant. We abstract this away for the user and allow queries to obtain automatically generated data drift thresholds (lower bounds) based on a model's data. These thresholds can be used in alerting. For more information see: [Automating Data Drift Thresholding in Machine Learning Systems](https://arthur.ai/blog/automating-data-drift-thresholding-in-machine-learning-systems).

The query uses `"metric": "Thresholds"` and neither requires nor uses the `"target"` and `"rollup"` fields, but is otherwise identical to a standard data drift query.

Query Request:
```json
{
    "properties": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "Thresholds",
    "base": {
        "source": "reference",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ]
}
```

Query Response:
```json
{
    "query_result": [
        {
            "<attribute1_name>": {
                "HellingerDistance": "<threshold> [float]",
                "JSDivergence": "<threshold> [float]",
                "KLDivergence": "<threshold> [float]",
                "PSI": "<threshold> [float]"
            },
            "<attribute2_name>": {
                "HellingerDistance": "<threshold> [float]",
                "JSDivergence": "<threshold> [float]",
                "KLDivergence": "<threshold> [float]",
                "PSI": "<threshold> [float]"
            }
        }
    ]
}
```

See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

***

Sample Request: 
```json
{
    "properties": [
        "AGE"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "metric": "Thresholds"
}
```

Sample Response:
```json
{
    "query_result": [
        {
            "AGE": {
                "HellingerDistance": 0.00041737395239735647,
                "JSDivergence": 2.959228131592643,
                "KLDivergence": 0.001893866910388703,
                "PSI": 0.0018945640055550161
            }
        }
    ]
}
```
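
One way to use these thresholds is to compare a drift value from a standard query against the matching lower bound. The sketch below assumes an ungrouped thresholds response (a single row in `query_result`), treats a value strictly above the bound as drifted, and uses -log_10(.05) for `HypothesisTest` as described at the start of this section. `exceeds_threshold` is a hypothetical helper, not an SDK function.

```python
import math
from typing import Any, Dict

# Lower bound for HypothesisTest at the conventional 0.05 alpha level.
HYPOTHESIS_TEST_BOUND = -math.log10(0.05)


def exceeds_threshold(
    drift_value: float,
    metric: str,
    attribute: str,
    thresholds_response: Dict[str, Any],
) -> bool:
    """Return True if drift_value is above the lower bound for the metric.

    Assumes thresholds_response has one ungrouped row in "query_result".
    """
    if metric == "HypothesisTest":
        bound = HYPOTHESIS_TEST_BOUND
    else:
        bound = thresholds_response["query_result"][0][attribute][metric]
    return drift_value > bound


# Thresholds copied from the sample response above.
thresholds = {
    "query_result": [
        {
            "AGE": {
                "HellingerDistance": 0.00041737395239735647,
                "JSDivergence": 2.959228131592643,
                "KLDivergence": 0.001893866910388703,
                "PSI": 0.0018945640055550161,
            }
        }
    ]
}

print(exceeds_threshold(0.021, "PSI", "AGE", thresholds))
```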

[back to top](#data-drift)