# Enrichments

This guide outlines how to enable, disable, and configure Enrichments.

Enrichment | Constant | Description
---|---|---
[Explainability](#explainability) | `Enrichment.Explainability` | Generates feature importance scores for inferences. Requires the user to provide model files.
[Anomaly Detection](#anomaly-detection) | `Enrichment.AnomalyDetection` | Calculates a multivariate anomaly score on each inference. Requires a reference set to be uploaded.
[Hotspots](#hotspots) | `Enrichment.Hotspots` | Finds data points on which the model underperforms. Calculated for each batch, or over 7 days' worth of data for streaming models.
[Bias Mitigation](#bias-mitigation) | `Enrichment.BiasMitigation` | Calculates possible sets of group-conditional thresholds that may be used to produce fairer classifications.
(enrichments_explainability)=
## Explainability

### Compatibility

Explainability is supported for all InputTypes, and all OutputTypes except for ObjectDetection.

### Usage

To enable explainability, we advise using the helper function `model.enable_explainability()`, which simplifies some of the steps of updating the explainability Enrichment. For more detail, see our guide on {ref}`enabling explainability <enabling_explainability>`.

Once enabled, you can use the generic functions (`model.update_enrichment()` or `model.update_enrichments()`) to update the configuration or disable explainability.
```python
from arthurai.common.constants import Enrichment

# view configuration
arthur_model.get_enrichment(Enrichment.Explainability)

# enable
arthur_model.enable_explainability(
    df=X_train.head(50),
    project_directory="/path/to/model_code/",
    requirements_file="example_requirements.txt",
    user_predict_function_import_path="example_entrypoint"
)

# update configuration
config_to_update = {
    'explanation_algo': 'shap',
    'streaming_explainability_enabled': False
}
arthur_model.update_enrichment(Enrichment.Explainability, True, config_to_update)

# disable
arthur_model.update_enrichment(Enrichment.Explainability, False, {})
```
### {doc}`Explainability Walkthrough <explainability>`

See our {doc}`explainability walkthrough <explainability>` for a thorough guide on setting up the explainability enrichment.

---
(enrichments_anomaly_detection)=
## Anomaly Detection

### Compatibility

Anomaly Detection can be enabled for models with any InputType and OutputType.
Only a reference set is required; this can be a set of the model's training or test data. Once a reference set is uploaded, anomaly scores are calculated automatically.

### Usage
```python
# view current configuration
arthur_model.get_enrichment(Enrichment.AnomalyDetection)

# enable
arthur_model.update_enrichment(Enrichment.AnomalyDetection, True, {})

# disable
arthur_model.update_enrichment(Enrichment.AnomalyDetection, False, {})
```
### Configuration

No additional configuration is needed for Anomaly Detection.

### {ref}`Algorithm <arthur_algorithms_anomaly_detection>`

See the explanation of our anomaly detection functionality from an algorithms perspective {ref}`here <arthur_algorithms_anomaly_detection>`.

---
(enrichments_hotspots)=
## Hotspots

When a system has high-dimensional data, finding the right input regions for troubleshooting becomes a difficult problem. Hotspots automates the identification of regions associated with poor ML performance, significantly reducing the time and error involved in finding such regions.

### Compatibility

Hotspots can only be enabled for tabular binary classifiers (that is, models with Tabular input types, Multiclass output
types, and at most two predicted value / ground truth attributes).

If your model sends data in batches, hotspot trees
will be created for each batch that has ground truth uploaded. For streaming models, hotspot trees will be generated
for inferences with ground truth on a weekly basis (Monday to Sunday).
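To make the weekly window concrete, the sketch below computes the Monday-to-Sunday week that contains a given date. It is illustrative only: `hotspot_week` is a hypothetical helper, not part of the Arthur SDK, and it assumes the window is aligned to calendar weeks as described above.

```python
from datetime import date, timedelta
from typing import Tuple


def hotspot_week(day: date) -> Tuple[date, date]:
    """Return the (Monday, Sunday) calendar week that contains `day`.

    Hypothetical helper for illustration; not an Arthur SDK function.
    """
    monday = day - timedelta(days=day.weekday())  # weekday(): Monday == 0
    sunday = monday + timedelta(days=6)
    return monday, sunday


start, end = hotspot_week(date(2023, 3, 15))  # a Wednesday
print(start, end)
```

Any inference timestamped between `start` and `end` (inclusive) would fall in the same weekly window under this assumption.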
### Usage

```python
# view current configuration
arthur_model.get_enrichment(Enrichment.Hotspots)

# enable
arthur_model.update_enrichment(Enrichment.Hotspots, True, {})

# disable
arthur_model.update_enrichment(Enrichment.Hotspots, False, {})
```
### Configuration

There is currently no additional configuration for Hotspots.

### Fetching Hotspots

If hotspots are enabled, we can fetch them via the [API endpoint](https://docs.arthur.ai/api-documentation/v3-api-docs.html#tag/enrichments/paths/~1models~1{model_id}~1enrichments~1hotspots~1find/get). From the SDK, with a loaded Arthur model, we can fetch hotspots as follows:

```python
model.find_hotspots(metric="accuracy", threshold=0.7, batch_id="batch_2903")
```
The method signature is as follows:

```python
def find_hotspots(self,
                  metric: AccuracyMetric = AccuracyMetric.Accuracy,
                  threshold: float = 0.5,
                  batch_id: str = None,
                  date: str = None,
                  ref_set_id: str = None) -> Dict[str, Any]:
    """Retrieve hotspots from the model

    :param metric: accuracy metric used to filter the hotspots tree by, defaults to "accuracy"
    :param threshold: threshold of the performance metric used for filtering hotspots, defaults to 0.5
    :param batch_id: string id for the batch to find hotspots in, defaults to None
    :param date: string used to define date, defaults to None
    :param ref_set_id: string id for the reference set to find hotspots in, defaults to None
    :raise: ArthurUserError: failed due to user error
    :raise: ArthurInternalError: failed due to an internal error
    """
```
### Interpreting Hotspots

For a toy classification model with two inputs X0 and X1, a returned list of hotspots could be as follows:

```json
[
    {
        "regions": {
            "X1": {
                "gt": -7.839450836181641,
                "lte": -2.257883667945862
            },
            "X0": {
                "gt": -6.966174602508545,
                "lte": -2.8999762535095215
            }
        },
        "accuracy": 0.42105263157894735
    },
    {
        "regions": {
            "X1": {
                "gt": -7.839450836181641,
                "lte": -5.140551567077637
            },
            "X0": {
                "gt": 4.7409820556640625,
                "lte": "inf"
            }
        },
        "accuracy": 0.35714285714285715
    },
    {
        "regions": {
            "X1": {
                "gt": 3.8619565963745117,
                "lte": 6.9831953048706055
            },
            "X0": {
                "gt": -0.9038164913654327,
                "lte": 0.9839221835136414
            }
        },
        "accuracy": 0.125
    }
]
```
Here we have three hotspots. Taking the last one, the input region is `-0.90 < X0 <= 0.98` and `3.86 < X1 <= 6.98`, and the data points in that region have an accuracy of 0.125. This lets the user investigate the "needle in the haystack" immediately.
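One way to act on a returned hotspot is to check which of your own data points fall inside its region. The sketch below is a minimal illustration: `in_hotspot` is a hypothetical helper rather than an SDK function, it reads each bound as `gt < value <= lte`, and it assumes that a missing bound means the region is unbounded on that side.

```python
import math
from typing import Any, Dict


def in_hotspot(point: Dict[str, float], hotspot: Dict[str, Any]) -> bool:
    """Return True if `point` satisfies gt < value <= lte for every feature in the hotspot."""
    for feature, bounds in hotspot["regions"].items():
        # float() also parses string bounds such as "inf" and "-inf"
        lower = float(bounds.get("gt", -math.inf))
        upper = float(bounds.get("lte", math.inf))
        if not (lower < point[feature] <= upper):
            return False
    return True


# The last hotspot from the example response
hotspot = {
    "regions": {
        "X1": {"gt": 3.8619565963745117, "lte": 6.9831953048706055},
        "X0": {"gt": -0.9038164913654327, "lte": 0.9839221835136414},
    },
    "accuracy": 0.125,
}

points = [
    {"X0": 0.5, "X1": 4.0},  # inside the region
    {"X0": 2.0, "X1": 4.0},  # X0 is above its upper bound
]
inside = [p for p in points if in_hotspot(p, hotspot)]
print(inside)
```

The same check can be applied row by row to a sample of inferences to pull out the examples worth inspecting first.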
### {ref}`Algorithm <arthur_algorithms_hotspots>`

See the explanation of our Hotspots functionality from an algorithms perspective {ref}`here <arthur_algorithms_hotspots>`.

---
(enrichments_bias_mitigation)=
## Bias Mitigation

### Compatibility

Bias Mitigation can be enabled for binary classification models of any input type, as long as at least one attribute
is marked as `monitor_for_bias=True` and a reference set has been uploaded to Arthur.
### Usage

```python
# view current configuration
arthur_model.get_enrichment(Enrichment.BiasMitigation)

# enable
arthur_model.update_enrichment(Enrichment.BiasMitigation, True, {})

# or
arthur_model.enable_bias_mitigation()
```
Enabling Bias Mitigation will automatically train a mitigation model for all attributes marked as `monitor_for_bias=True`, for the constraints demographic parity, equalized odds, and equal opportunity.
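As a rough illustration of what a set of group-conditional thresholds means in practice, the sketch below applies a different decision threshold to each group's scores. The group names, threshold values, and `classify` helper are all hypothetical; this is not how Arthur trains or serves the mitigation model.

```python
from typing import Dict, List, Tuple


def classify(scores: List[Tuple[str, float]], thresholds: Dict[str, float]) -> List[int]:
    """Label each (group, score) pair using that group's own threshold."""
    return [int(score >= thresholds[group]) for group, score in scores]


# Hypothetical thresholds for a sensitive attribute with two groups
thresholds = {"group_a": 0.5, "group_b": 0.4}
scores = [("group_a", 0.45), ("group_b", 0.45), ("group_a", 0.70)]

labels = classify(scores, thresholds)
print(labels)
```

Here the same score of 0.45 yields a different label depending on the group, which is the lever a fairness constraint would tune.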
### Configuration

There is currently no additional configuration for Bias Mitigation.

### {ref}`Algorithm <arthur_algorithms_bias_mitigation>`

See the explanation of our bias mitigation functionality from an algorithms perspective {ref}`here <arthur_algorithms_bias_mitigation>`.

---
(enrichments_configuring_multiple_enrichments)=
## Configuring Multiple Enrichments

### Viewing Current Enrichments

You can use the SDK to fetch all enrichment settings for a model:

```python
arthur_model.get_enrichments()
```

This will return a dictionary containing the configuration for all available enrichments:
```python
{'anomaly_detection': {'enabled': True, 'config': {}},
 'bias_mitigation': {'enabled': False},
 'explainability': {'enabled': False},
 'hotspots': {'enabled': False}}
```
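A dictionary of this shape is easy to summarise. The sketch below lists the enrichments that are currently enabled; `enabled_enrichments` is a hypothetical helper, and the `settings` literal simply copies the example response so the snippet runs without a connection to Arthur.

```python
from typing import Any, Dict, List


def enabled_enrichments(settings: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the sorted names of enrichments whose 'enabled' flag is true."""
    return sorted(name for name, cfg in settings.items() if cfg.get("enabled", False))


# Same shape as the example response; in practice this would come from
# arthur_model.get_enrichments()
settings = {
    "anomaly_detection": {"enabled": True, "config": {}},
    "bias_mitigation": {"enabled": False},
    "explainability": {"enabled": False},
    "hotspots": {"enabled": False},
}

active = enabled_enrichments(settings)
print(active)
```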
### Updating Enrichment Configurations

You can configure multiple enrichments at once:

```python
enrichment_configs = {
    Enrichment.Explainability: {'enabled': False, 'config': {}},
    Enrichment.AnomalyDetection: {'enabled': True, 'config': {}}
}
arthur_model.update_enrichments(enrichment_configs)
```
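When many enrichments are toggled together, it can help to build this mapping programmatically. The sketch below is one possible approach, not an SDK feature: `build_enrichment_configs` is a hypothetical helper, and plain strings stand in for the `Enrichment` constants so that the snippet runs on its own.

```python
from typing import Any, Dict, Hashable, Iterable


def build_enrichment_configs(
    enable: Iterable[Hashable] = (),
    disable: Iterable[Hashable] = (),
) -> Dict[Hashable, Dict[str, Any]]:
    """Build a mapping in the same shape as `enrichment_configs`.

    Hypothetical helper for illustration; not part of the Arthur SDK.
    """
    to_enable, to_disable = set(enable), set(disable)
    overlap = to_enable & to_disable
    if overlap:
        names = sorted(overlap, key=str)
        raise ValueError(f"Enrichments listed as both enabled and disabled: {names}")
    configs = {name: {"enabled": True, "config": {}} for name in to_enable}
    configs.update({name: {"enabled": False, "config": {}} for name in to_disable})
    return configs


# Plain strings stand in for the Enrichment constants (an assumption for this sketch)
enrichment_configs = build_enrichment_configs(
    enable=["anomaly_detection"],
    disable=["explainability"],
)
print(enrichment_configs)
```

In real use, the keys would be the `Enrichment` constants and the result would be passed to `arthur_model.update_enrichments()`.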