(sagemaker_integration)=

## AWS SageMaker Data Capture Integration

Models deployed with AWS SageMaker can be configured to automatically push their real-time inferences to the Arthur platform by utilizing [SageMaker Data Capture](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture.html). This guide walks through setting up that integration and using a Lambda function to send Data Capture log files to the Arthur platform for ingestion.

### Prerequisites

- The model for which inferences are being ingested has already been onboarded onto Arthur.
- The SageMaker model schema matches that of its Arthur model counterpart.

### SageMaker Configuration

AWS SageMaker offers two features that enable this Arthur integration: real-time endpoints and Data Capture. Endpoints are APIs that expose a trained model; users call the endpoint API to retrieve predictions from the hosted model. Data Capture is a feature that logs the inputs and outputs of each prediction served by a hosted model endpoint. To enable Data Capture in a way that accurately logs all input and output data needed for the Arthur integration, a configuration must be passed in when deploying an endpoint (see below).

#### Configuring Data Capture through the SageMaker SDK

An extended description of the following configuration can be found in the "SageMaker Python SDK" tab of the [SageMaker Data Capture documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture.html#model-monitor-data-capture-defing.title).

```python
from sagemaker.model import Model
from sagemaker.model_monitor import DataCaptureConfig

s3_capture_upload_path = f"s3://{bucket_name}/{model_specific_path}/datacapture"

model = Model(
    ...
)

data_capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,
    destination_s3_uri=s3_capture_upload_path,
    capture_options=['REQUEST', 'RESPONSE'],
)

model.deploy(
    data_capture_config=data_capture_config,
    ...
)
```

This integration requires that `DataCaptureConfig` be set such that:

- `capture_options` includes both `REQUEST` and `RESPONSE` to record model inputs and outputs for each inference
- `sampling_percentage` is set to `100` in order to comprehensively ingest all new inferences
- `enable_capture` is set to `True`

#### Configuring Data Capture through the SageMaker API

To create a real-time endpoint via the API, users can also call the [CreateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) API. To ensure that this endpoint is deployed with Data Capture enabled, it must receive an `EndpointConfigName` that matches an `EndpointConfig` created using the [CreateEndpointConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html) API with the following specifications:

```
{
    ...,
    "DataCaptureConfig": {
        "CaptureContentTypeHeader": {
            "CsvContentTypes": [ "string" ],
            "JsonContentTypes": [ "string" ]
        },
        "CaptureOptions": [
            { "CaptureMode": "Input" },
            { "CaptureMode": "Output" }
        ],
        "DestinationS3Uri": "string",
        "EnableCapture": true,
        "InitialSamplingPercentage": 100,
        "KmsKeyId": "string"
    },
    "EndpointConfigName": "string",
    ...
}
```

This integration requires that `DataCaptureConfig` be set such that:

- `CaptureContentTypeHeader` is specified with an Arthur-supported content type (see the section below). If no `CsvContentTypes` or `JsonContentTypes` are specified, SageMaker `base64`-encodes the captured data by default; that encoding is currently not supported by the Arthur platform.
- `CaptureOptions` includes both the `Input` and `Output` capture modes.
- `EnableCapture` is set to `true`.
- `InitialSamplingPercentage` is set to `100`.
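For reference, a minimal boto3 sketch of such a `CreateEndpointConfig` call might look like the following; the config, model, and endpoint names, instance settings, and S3 path are illustrative assumptions rather than values prescribed by this guide.

```python
import boto3

sagemaker_client = boto3.client("sagemaker")

# A sketch of CreateEndpointConfig with Data Capture enabled. All names,
# instance settings, and paths below are hypothetical placeholders.
sagemaker_client.create_endpoint_config(
    EndpointConfigName="my-endpoint-config",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-model",
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
        }
    ],
    DataCaptureConfig={
        "EnableCapture": True,
        "InitialSamplingPercentage": 100,
        "DestinationS3Uri": "s3://my-bucket/my-model/datacapture",
        "CaptureOptions": [
            {"CaptureMode": "Input"},
            {"CaptureMode": "Output"},
        ],
        # Declare content types so captures are stored as plain CSV/JSON
        # rather than base64-encoded payloads.
        "CaptureContentTypeHeader": {
            "CsvContentTypes": ["text/csv"],
            "JsonContentTypes": ["application/json"],
        },
    },
)

# Deploy an endpoint that references the config above.
sagemaker_client.create_endpoint(
    EndpointName="my-endpoint",
    EndpointConfigName="my-endpoint-config",
)
```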
#### Supported Data Formats

AWS SageMaker algorithms can accept and produce numerous MIME types for the HTTP payloads used in retrieving predictions from endpoint-hosted models. The MIME type used in an endpoint invocation also determines the format of the captured inference data. The Arthur platform supports the following MIME types and data formats:

##### MIME Type: `text/csv`

```
37,Self-emp-not-inc,227253,Preschool,1,Married-civ-spouse,Sales,Husband,White,Male,0,0,30,Mexico\n24,Private,211129,Bachelors,13,Never-married,Exec-managerial,Other-relative,White,Female,0,0,60,United-States\n
```

- Each inference is represented as an ordered row of comma-separated values, where each value represents a feature in the inference
- These features must be specified in the same order as their counterparts in the corresponding Arthur model
- If multiple inferences are included in a single call to `invoke_endpoint`, the inferences are separated by `\n` (see the invocation sketch at the end of this section)

##### MIME Type: `application/json`

Arthur currently supports two distinct JSON formats, described with examples below.

###### Option 1: Column-Ordered List of Feature Values

```json
{
    "instances": [
        {
            "features": [1.5, 16, "testStringA", false]
        },
        {
            "features": [2.0, 12, "testStringB", true]
        }
    ]
}
```

- Each inference is represented as an object inside a JSON array
- The top-level key mapping to this inference array must be named one of the following: `instances`, `predictions`
- Each object within this JSON array contains a key mapping to an ordered array of features
- The second-level key mapping to this feature array must be named one of the following: `features`, `probabilities`

###### Option 2: Feature-Name Keys to Values Map

```json
{
    "predictions": [
        {
            "closest_cluster": 5,
            "distance_to_cluster": 36.5
        },
        {
            "closest_cluster": 2,
            "distance_to_cluster": 90.3
        }
    ]
}
```

- Each inference is represented as an object inside a JSON array
- The top-level key mapping to this inference array must be named one of the following: `instances`, `predictions`
- Each object within this JSON array has keys representing feature names mapped to their corresponding feature values
- The names of these features cannot be any one of the following: `instances`, `predictions`, `features`, `probabilities`
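To illustrate how the payload format relates to what is captured, here is a minimal boto3 sketch that sends two CSV-formatted inferences in a single request; the endpoint name and feature values are assumptions for the example.

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# Two inferences in one request, newline-separated, matching the
# `text/csv` format above. Endpoint name and values are illustrative.
csv_payload = (
    "37,Self-emp-not-inc,227253,Preschool,1,Married-civ-spouse,Sales,Husband,White,Male,0,0,30,Mexico\n"
    "24,Private,211129,Bachelors,13,Never-married,Exec-managerial,Other-relative,White,Female,0,0,60,United-States\n"
)

response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",        # hypothetical endpoint
    ContentType="text/csv",            # also determines the Data Capture format
    Body=csv_payload.encode("utf-8"),
)
print(response["Body"].read().decode("utf-8"))
```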
#### Specifying Partner Inference ID on Arthur-Ingested Data Capture Inferences

The Arthur platform enforces that each uploaded inference has a Partner Inference ID, a unique identifier used as the matching mechanism for joining ground truth data. Arthur's SageMaker integration populates the Partner Inference ID from two possible sources in SageMaker. The default is SageMaker's EventId, a random ID auto-generated by SageMaker for each request and captured in the `eventMetadata/eventId` field of the Data Capture output files. Alternatively, SageMaker allows Invoke-Endpoint API callers to specify an `InferenceId` (or `inference-id`) when using the API, SDK function, or CLI to invoke an endpoint. When an InferenceId is specified, SageMaker appends an `eventMetadata/inferenceId` field to the Data Capture event. Both approaches generate a single `eventId` or `inferenceId` for each call to Invoke-Endpoint.

If an InferenceId is specified, Arthur uses it as the Arthur Partner Inference ID; otherwise, Arthur defaults to the SageMaker EventId. One subtlety of SageMaker's Invoke-Endpoint API is that it allows requesting multiple inferences in a single call. In this case, the SageMaker EventId or caller-specified InferenceId is shared by all inferences in the call and is not unique. When this occurs, the Arthur integration appends an index number to either the EventId or InferenceId based on the order of the inference within the call to Invoke-Endpoint.
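For example, an InferenceId can be supplied when invoking an endpoint through boto3, as in this sketch; the endpoint name, payload, and ID value are illustrative assumptions.

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# Supplying an InferenceId makes SageMaker record
# `eventMetadata/inferenceId` in the Data Capture event, which Arthur
# then uses as the Partner Inference ID. Names and values are illustrative.
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",            # hypothetical endpoint
    ContentType="application/json",
    Body=b'{"instances": [{"features": [1.5, 16, "testStringA", false]}]}',
    InferenceId="order-123",               # hypothetical caller-chosen ID
)
```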
When ingesting Data Capture inferences from SageMaker, the following table describes the Partner Inference ID assigned to any given inference on the Arthur platform.

| SageMaker Invoke Call | Without InferenceId Provided in Invoke-Endpoint | With InferenceId Provided in Invoke-Endpoint |
|---|---|---|
| Single inference in Invoke-Endpoint | EventId | InferenceId |
| Multiple inferences in Invoke-Endpoint | EventId_{index_within_invoke_endpoint_call} | InferenceId_{index_within_invoke_endpoint_call} |
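To make the table concrete, the following sketch mirrors the ID assignment logic described above. It assumes zero-based indexing for the appended index purely for illustration; consult the ingested inferences for the exact suffix format.

```python
# Illustrative sketch of the Partner Inference ID assignment in the table
# above. The zero-based index is an assumption for illustration only.
def partner_inference_ids(event_id, inference_id, num_inferences):
    base = inference_id if inference_id is not None else event_id
    if num_inferences == 1:
        return [base]
    return [f"{base}_{i}" for i in range(num_inferences)]

# Single inference, no InferenceId: falls back to the auto-generated EventId.
print(partner_inference_ids("sagemaker-event-id", None, 1))
# -> ['sagemaker-event-id']

# Multiple inferences with a caller-provided InferenceId: index appended.
print(partner_inference_ids("sagemaker-event-id", "order-123", 2))
# -> ['order-123_0', 'order-123_1']
```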