Spaces:

maxcembalest
/

ask-arthur

Sleeping

App Files Files Community

ask-arthur / files /arthur-docs-markdown /user-guide /api-query-guide /data_drift.md.txt

maxcembalest

Upload 184 files

ad8da65 over 2 years ago

raw

history blame

22.8 kB

	# Data Drift

	## Querying Drift in Python

	The basic format of a drift query using the Python SDK involves specifying that the
	`query_type` parameter has the value 'drift':
	```python
	query = {...}

	arthur_model.query(query, query_type='drift')
	```

	## Data Drift Endpoint

	Data drift has a dedicated endpoint at `/models/{model_id}/inferences/query/data_drift`.

	Returns the data drift metric between a `base` dataset with a `target` dataset. This endpoint can support up to 100 properties in one request.

	* `num_bins` - Specifies the granularity of bucketing for continuous distributions and will be ignored if the attribute is categorical.
	* `metric` - Specify one metric among {ref}`the data drift metrics Arthur offers <glossary_data_drift>`.
	* `filter` - Optional blocks specific to either reference or inference set and specify which data should be used in the data drift calculation.
	* `group_by` - Global and applies to both the base and target data.
	* `rollup` - Optional parameter that will aggregate the calculated data drift value by the supported time dimension.

	For `HypothesisTest`, the returned value is transformed as -log_10(P_value) to maintain directional parity with the other data drift metrics. That is, lower P_value is more significant and implies data drift, reflected in a higher -log_10(P_value). Further mathematical details are in the {ref}`glossary <glossary_hypothesis_test>.

	Query Request:
	```json
	{
	"properties": [
	"<attribute1_name> [string]",
	"<attribute2_name> [string]",
	"<attribute3_name> [string]"
	],
	"num_bins": "<num_bins> [int]",
	"metric": "[PSI\|KLDivergence\|JSDivergence\|HellingerDistance\|HypothesisTest]",
	"base": {
	"source": "[inference\|reference]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"target": {
	"source": "[inference\|reference\|ground_truth]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"group_by [Optional]": [
	{
	"property": "<group_by_attribute_name> [string]"
	}
	],
	"rollup [Optional]": "minute\|hour\|day\|month\|year\|batch_id"
	}
	```

	Query Response:
	```json
	{
	"query_result": [
	{
	"<attribute1_name>": "<attribute1_data_drift> [float]",
	"<attribute2_name>": "<attribute2_data_drift> [float]",
	"<attribute3_name>": "<attribute3_data_drift> [float]",
	"<group_by_attribute_name>": "<group_by_attribute_value> [string\|int\|null]",
	"rollup": "<rollup_attribute_value> [string\|null]"
	}
	]
	}
	```

	See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

	#### Example: Reference vs. Inference

	Sample Request: Calculate data drift for males, grouped by country, rolled up by hour.

	```json
	{
	"properties": [
	"age"
	],
	"num_bins": 10,
	"metric": "PSI",
	"base": {
	"source": "reference",
	"filter": [
	{
	"property": "gender",
	"comparator": "eq",
	"value": "male"
	}
	]
	},
	"target": {
	"source": "inference",
	"filter": [
	{
	"property": "gender",
	"comparator": "eq",
	"value": "male"
	},
	{
	"property": "inference_timestamp",
	"comparator": "gte",
	"value": "2020-07-22T10:00:00Z"
	},
	{
	"property": "inference_timestamp",
	"comparator": "lt",
	"value": "2020-07-23T10:00:00Z"
	}
	]
	},
	"group_by": [
	{
	"property": "country"
	}
	],
	"rollup": "hour"
	}
	```
	Sample Response:
	```json
	{
	"query_result": [
	{
	"age": 2.3,
	"country": "Canada",
	"rollup": "2020-07-22T10:00:00Z"
	},
	{
	"age": 2.4,
	"country": "United States",
	"rollup": "2020-07-22T10:00:00Z"
	}
	]
	}
	```

	### Example: Inference vs. Inference

	Sample Request: Compare data drift between two batches, with no grouping, no filters, and no rollups.

	```json
	{
	"properties": [
	"age"
	],
	"num_bins": 10,
	"metric": "PSI",
	"base": {
	"source": "inference",
	"filter": [
	{
	"property": "batch_id",
	"comparator": "eq",
	"value": "5"
	}
	]
	},
	"target": {
	"source": "inference",
	"filter": [
	{
	"property": "batch_id",
	"comparator": "eq",
	"value": "6"
	}
	]
	}
	}
	```
	Sample Response:
	```json
	{
	"query_result": [
	{
	"age": 2.3
	}
	]
	}
	```

	[back to top](#data-drift)

	### Example: Reference vs. Ground Truth

	Sample Request: Calculate data drift for individual ground truth class prediction probabilities, rolled up by hour.

	```json
	{
	"properties": [
	"gt_1"
	],
	"num_bins": 10,
	"metric": "PSI",
	"base": {
	"source": "reference"
	},
	"target": {
	"source": "ground_truth",
	"filter": [
	{
	"property": "ground_truth_timestamp",
	"comparator": "gte",
	"value": "2020-07-22T10:00:00Z"
	},
	{
	"property": "ground_truth_timestamp",
	"comparator": "lt",
	"value": "2020-07-23T10:00:00Z"
	}
	]
	},
	"rollup": "hour"
	}
	```
	Sample Response:
	```json
	{
	"query_result": [
	{
	"gt_1": 0.03,
	"rollup": "2020-07-22T10:00:00Z"
	},
	{
	"gt_1": 0.4,
	"rollup": "2020-07-22T11:00:00Z"
	}
	]
	}
	```

	[back to top](#data-drift)


	## Data Drift PSI Bucket Table Values

	This metric has a dedicated endpoint at `/models/{model_id}/inferences/query/data_drift_psi_bucket_calculation_table`.

	Returns the [PSI](https://scholarworks.wmich.edu/cgi/viewcontent.cgi?article=4249&context=dissertations) scores by bucket using the reference set data. This query for this endpoint omits the need for `metric` and takes in a single `property` but otherwise is identical to the [data drift endpoint](#data-drift-endpoint)

	Note when using this endpoint with categorical features, the `bucket_min` and `bucket_max` fields will not be
	returned in the response. Instead, the `bucket` field will contain the category name.

	Query Request:
	```json
	{
	"property": "<attribute_name> [string]",
	"num_bins": "<num_bins> [int]",
	"base": {
	"source": "[inference\|reference]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"target": {
	"source": "[inference\|reference]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"group_by [Optional]": [
	{
	"property": "<group_by_attribute_name> [string]"
	}
	],
	"rollup [Optional]": "minute\|hour\|day\|month\|year\|batch_id"
	}
	```
	Query Response:
	```json
	{
	"query_result": [
	{
	"bucket": "string",
	"rollup": "string\|null",
	"group_by_property_1": "string\|null",
	"base_bucket_max": "number",
	"base_bucket_min": "number",
	"base_count_per_bucket": "number",
	"base_ln_probability_per_bucket": "number",
	"base_probability_per_bucket": "number",
	"base_total": "number",
	"target_bucket_max": "number",
	"target_bucket_min": "number",
	"target_count_per_bucket": "number",
	"target_ln_probability_per_bucket": "number",
	"target_probability_per_bucket": "number",
	"target_total": "number",
	"probability_difference": "number",
	"ln_probability_difference": "number",
	"psi": "number"
	}
	]
	}
	```

	See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

	***

	Sample Request: Calculate data drift bucket components for males, grouped by country, rolled up by hour.
	```json
	{
	"property": "age",
	"num_bins": 2,
	"base": {
	"source": "reference",
	"filter": [
	{
	"property": "gender",
	"comparator": "eq",
	"value": "male"
	}
	]
	},
	"target": {
	"source": "inference",
	"filter": [
	{
	"property": "gender",
	"comparator": "eq",
	"value": "male"
	},
	{
	"property": "inference_timestamp",
	"comparator": "gte",
	"value": "2020-07-22T10:00:00Z"
	},
	{
	"property": "inference_timestamp",
	"comparator": "lt",
	"value": "2020-07-23T10:00:00Z"
	}
	]
	},
	"group_by": [
	{
	"property": "country"
	}
	],
	"rollup": "hour"
	}
	```
	Sample Response:
	```json
	{
	"query_result": [
	{
	"bucket": "bucket_1",
	"rollup": "2020-01-01T00:00:00Z",
	"country": "Canada",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	},
	{
	"bucket": "bucket_2",
	"rollup": "2020-01-01T00:00:00Z",
	"country": "United States",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	},
	{
	"bucket": "bucket_1",
	"rollup": "2020-01-01T01:00:00Z",
	"country": "Canada",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	},
	{
	"bucket": "bucket_2",
	"rollup": "2020-01-01T01:00:00Z",
	"country": "United States",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	}
	]
	}
	```

	Sample Request: Compare data drift bucket components between two batches, with no grouping, no filters, and no rollups.

	```json
	{
	"property": "age",
	"num_bins": 10,
	"base": {
	"source": "inference",
	"filter": [
	{
	"property": "batch_id",
	"comparator": "eq",
	"value": "5"
	}
	]
	},
	"target": {
	"source": "inference",
	"filter": [
	{
	"property": "batch_id",
	"comparator": "eq",
	"value": "6"
	}
	]
	}
	}
	```
	Sample Response:
	```json
	{
	"query_result": [
	{
	"bucket": "bucket_1",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	},
	{
	"bucket": "bucket_2",
	"base_bucket_max": 0.9999971182990177,
	"base_bucket_min": 0.5009102069226075,
	"base_count_per_bucket": 4988,
	"base_ln_probability_per_bucket": -0.6955500651756032,
	"base_probability_per_bucket": 0.4988,
	"base_total": 10000,
	"target_bucket_max": 0.9999971182990177,
	"target_bucket_min": 0.5009102069226075,
	"target_count_per_bucket": 2487,
	"target_ln_probability_per_bucket": -0.6701670131762315,
	"target_probability_per_bucket": 0.5116231228142357,
	"target_total": 4861,
	"probability_difference": -0.012823122814235699,
	"ln_probability_difference": -0.025383051999371742,
	"psi": 0.00032548999318807485
	}
	]
	}
	```

	[back to top](#data-drift)

	## Data Drift for Classification Outputs

	For classification outputs, one may want to examine drift among a collection of different classes, i.e. the system of outputs, instead of the drift of the probability predictions of a single class. The query uses one of `"predicted_classes": [""]` or `"ground_truth_classes": [""]` but otherwise is identical to a standard data drift query. Rather than using the star operator to select all prediction or ground truth classes, respectively, in a model, a list of string classes can be provided for looking at drift of a subset of multiclass outputs.

	* `predicted_classes` - Specifies which prediction classes to use for `predictedClass` data drift.
	* `ground_truth_classes` - Specifies which prediction classes to use for `groundTruthClass` data drift.

	`properties` can be included in the same query as long as the target `source` corresonds to the classification output tag. For example, one can query drift on input attributes and `predictedClass` in the same query with target `source` of `inference`; one can query drift on individual ground truth labels and `groundTruthClass` in the same query with target `source` of `ground_truth`.

	Query Request:
	```json
	{
	"properties [Optional]": [
	"<attribute1_name> [string]",
	"<attribute2_name> [string]",
	"<attribute3_name> [string]"
	],
	"[predicted_classes\|ground_truth_classes]": [
	"<class0_name> [string]"
	"<class1_name> [string]"
	],
	"num_bins": "<num_bins> [int]",
	"metric": "[PSI\|KLDivergence\|JSDivergence\|HellingerDistance\|HypothesisTest]",
	"base": {
	"source": "[inference\|reference]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"target": {
	"source": "[inference\|reference\|ground_truth]",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"group_by [Optional]": [
	{
	"property": "<group_by_attribute_name> [string]"
	}
	],
	"rollup [Optional]": "minute\|hour\|day\|month\|year\|batch_id"
	}
	```

	Query Response:
	```json
	{
	"query_result": [
	{
	"<attribute1_name>": "<attribute1_data_drift> [float]",
	"<attribute2_name>": "<attribute2_data_drift> [float]",
	"<attribute3_name>": "<attribute3_data_drift> [float]",
	"[predictedClass\|groundTruthClass]": "<classification_data_drift> [float]",
	"<group_by_attribute_name>": "<group_by_attribute_value> [string\|int\|null]",
	"rollup": "<rollup_attribute_value> [string\|null]"
	}
	]
	}
	```

	See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

	***

	Sample Request: Calculate data drift on all prediction classes.
	```json
	{
	"predicted_classes": [
	"*"
	],
	"num_bins": 20,
	"base": {
	"source": "reference"
	},
	"target": {
	"source": "inference"
	},
	"metric": "PSI"
	}
	```

	Sample Response:
	```json
	{
	"query_result": [
	{
	"predictedClass": 0.021
	}
	]
	}
	```

	Sample Request: Calculate data drift on ground truth using the first and third ground truth classes.
	```json
	{
	"predicted_classes": [
	"gt_1",
	"gt_3"
	],
	"num_bins": 20,
	"base": {
	"source": "reference"
	},
	"target": {
	"source": "ground_truth"
	},
	"metric": "PSI"
	}
	```

	Sample Response:
	```json
	{
	"query_result": [
	{
	"groundTruthClass": 0.021
	}
	]
	}
	```

	[back to top](#data-drift)

	(automated_data_drift_thresholds)=
	## Automated Data Drift Thresholds

	What is a sufficiently high data drift value to suggest that the target data has actually drifted from the base data? For `HypothesisTest`, we can reverse engineer -log_10(P_value) and plug in the conventional .05 alpha level to establish a lower bound of -log_10(.05).

	For the other data drift metrics, it is not sufficient to pin a constant. We abstract this away for the user and allow queries to obtain automatically generated data drift thresholds (lower bounds) based on a model's data. These thresholds can be used in alerting. For more information see: [Automating Data Drift Thresholding in Machine Learning Systems](https://arthur.ai/blog/automating-data-drift-thresholding-in-machine-learning-systems).

	The query uses `"metric": "Thresholds"` and does not require nor use `"target"` and `"rollup"` fields but otherwise is identical to a standard data drift query.

	Query Request:
	```json
	{
	"properties": [
	"<attribute1_name> [string]",
	"<attribute2_name> [string]",
	"<attribute3_name> [string]"
	],
	"num_bins": "<num_bins> [int]",
	"metric": "Thresholds",
	"base": {
	"source": "reference",
	"filter [Optional]": [
	{
	"property": "<filter_attribute_name> [string]",
	"comparator": "<comparator> [string]",
	"value": "<filter_threshold_value> [string\|int\|float]"
	}
	]
	},
	"group_by [Optional]": [
	{
	"property": "<group_by_attribute_name> [string]"
	}
	]
	}
	```

	Query Response:
	```json
	{
	"query_result": [
	{
	"<attribute1_name>": {
	"HellingerDistance": "<threshold> [float]",
	"JSDivergence": "<threshold> [float]",
	"KLDivergence": "<threshold> [float]",
	"PSI": "<threshold> [float]"
	},
	"<attribute2_name>": {
	"HellingerDistance": "<threshold> [float]",
	"JSDivergence": "<threshold> [float]",
	"KLDivergence": "<threshold> [float]",
	"PSI": "<threshold> [float]"
	}

	}
	]
	}
	```

	See {ref}`endpoint_overview_filter_comparators` for a list of valid comparators.

	***

	Sample Request:
	```json
	{
	"properties": [
	"AGE"
	],
	"num_bins": 20,
	"base": {
	"source": "reference"
	},
	"metric": "Thresholds"
	}
	```

	Sample Response:
	```json
	{
	"query_result": [
	{
	"AGE": {
	"HellingerDistance": 0.00041737395239735647,
	"JSDivergence": 2.959228131592643,
	"KLDivergence": 0.001893866910388703,
	"PSI": 0.0018945640055550161
	}
	}
	]
	}
	```

	[back to top](#data-drift)