# AWS SageMaker Deployment Guide
This guide provides step-by-step instructions for deploying the Image Description application to AWS SageMaker.
## Prerequisites
- AWS account with SageMaker permissions
- AWS CLI installed and configured
- Docker installed on your local machine
- The source code from this repository
## Step 1: Create an Amazon ECR Repository
```bash
aws ecr create-repository --repository-name image-descriptor
```
Note the repository URI returned by this command. You'll use it in the next step.
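The image URI used in the following steps is the repository URI plus a tag. As a minimal sketch, assuming the standard ECR hostname pattern shown in this guide (other AWS partitions use a different domain suffix), a small Python helper can assemble it from your account ID and region. The function name `ecr_image_uri` and the example account ID are hypothetical.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build an ECR image URI in the form used throughout this guide."""
    # Assumption: standard commercial-region registry hostname.
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Example with placeholder values; substitute your own account ID and region.
print(ecr_image_uri("123456789012", "us-east-1", "image-descriptor"))
```

Keeping the URI in one place helps avoid mismatches between the `docker tag`, `docker push`, and `model.json` values.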
## Step 2: Build and Push the Docker Image
1. Log in to ECR:
```bash
aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin your-account-id.dkr.ecr.your-region.amazonaws.com
```
2. Build the Docker image:
```bash
docker build -t image-descriptor .
```
3. Tag and push the image:
```bash
docker tag image-descriptor:latest your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
docker push your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
```
## Step 3: Create a SageMaker Model
1. Create a model.json file:
```json
{
  "ModelName": "QwenVLImageDescriptor",
  "PrimaryContainer": {
    "Image": "your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest",
    "Environment": {
      "PORT": "8080"
    }
  },
  "ExecutionRoleArn": "arn:aws:iam::your-account-id:role/service-role/AmazonSageMaker-ExecutionRole"
}
```
2. Create the SageMaker model:
```bash
aws sagemaker create-model --cli-input-json file://model.json
```
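If you prefer not to hand-edit the placeholders, the model definition can be generated instead. The following is a sketch only: the helper names `build_model_definition` and `write_model_json` are hypothetical, and it assumes the execution role lives under the `service-role/` path as in the example above.

```python
import json


def build_model_definition(account_id, region, role_name,
                           repository="image-descriptor", tag="latest",
                           model_name="QwenVLImageDescriptor", port="8080"):
    """Return a dict with the same shape as the model.json shown above."""
    image = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image,
            "Environment": {"PORT": port},
        },
        # Assumption: the role uses the service-role/ path, as in this guide.
        "ExecutionRoleArn": f"arn:aws:iam::{account_id}:role/service-role/{role_name}",
    }


def write_model_json(definition, path="model.json"):
    """Write the definition to disk for use with --cli-input-json."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(definition, f, indent=2)


# Example with placeholder values; call write_model_json(definition) to save it.
definition = build_model_definition("123456789012", "us-east-1", "AmazonSageMaker-ExecutionRole")
print(json.dumps(definition, indent=2))
```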
## Step 4: Create an Endpoint Configuration
1. Create a config.json file:
```json
{
  "EndpointConfigName": "QwenVLImageDescriptorConfig",
  "ProductionVariants": [
    {
      "VariantName": "AllTraffic",
      "ModelName": "QwenVLImageDescriptor",
      "InstanceType": "ml.g5.2xlarge",
      "InitialInstanceCount": 1
    }
  ]
}
```
2. Create the endpoint configuration:
```bash
aws sagemaker create-endpoint-config --cli-input-json file://config.json
```
## Step 5: Create the Endpoint
```bash
aws sagemaker create-endpoint --endpoint-name qwen-vl-image-descriptor --endpoint-config-name QwenVLImageDescriptorConfig
```
Deployment takes several minutes. The endpoint is ready to serve requests once its status is `InService`.
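Rather than checking the status by hand, you can block until the endpoint is ready. This is a minimal sketch using the boto3 `endpoint_in_service` waiter; the function name `wait_until_in_service` and the default polling values are illustrative assumptions, not part of the original application.

```python
def wait_until_in_service(sagemaker_client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Block until the endpoint reaches InService, then return its status.

    The waiter raises an exception if the endpoint fails to deploy or the
    attempts run out.
    """
    waiter = sagemaker_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"]


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   status = wait_until_in_service(boto3.client("sagemaker"), "qwen-vl-image-descriptor")
#   print(status)
```

Passing the client in as a parameter keeps the helper easy to test with a stub and lets you reuse a client configured for a specific region.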
## Step 6: Invoke the Endpoint
You can invoke the endpoint using the AWS SDK or AWS CLI.
Using the AWS SDK for Python (boto3):
```python
import base64
import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Load the image and base64-encode it
with open('data_temp/page_2.png', 'rb') as f:
    image_data = f.read()
image_b64 = base64.b64encode(image_data).decode('utf-8')

# Create the request payload
payload = {
    'image_data': image_b64
}

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName='qwen-vl-image-descriptor',
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
print(json.dumps(result, indent=2))
```
## Step 7: Set Up API Gateway (Optional)
For public HTTP access, set up an API Gateway:
1. Create a new REST API in API Gateway
2. Create a new resource and POST method
3. Configure the integration to use the SageMaker endpoint
4. Deploy the API to a stage
5. Note the API Gateway URL for client use
## Cost Optimization
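To optimize costs:
1. Use SageMaker Serverless Inference instead of a dedicated endpoint if the model can run on CPU; Serverless Inference does not offer GPU instances such as the `ml.g5.2xlarge` used in this guide
2. Implement auto-scaling for your endpoint
3. Use Spot Instances for non-critical workloads where SageMaker supports them, such as training jobs with Managed Spot Training; real-time endpoints run on On-Demand instances
4. Schedule endpoints to be active only during business hours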
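As an illustration of the auto-scaling item, the sketch below builds the arguments for the Application Auto Scaling calls `register_scalable_target` and `put_scaling_policy`. The helper names, the capacity limits, and the target of 10 invocations per instance are placeholder assumptions; tune them to your own traffic.

```python
def _resource_id(endpoint_name, variant_name):
    # Resource ID format for a SageMaker endpoint variant.
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scalable_target_args(endpoint_name, variant_name="AllTraffic",
                         min_capacity=1, max_capacity=2):
    """Arguments for register_scalable_target (capacity limits are placeholders)."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def target_tracking_policy_args(endpoint_name, variant_name="AllTraffic",
                                invocations_per_instance=10.0):
    """Arguments for put_scaling_policy (target value is a placeholder)."""
    return {
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }


def enable_autoscaling(autoscaling_client, endpoint_name):
    """Register the variant as a scalable target and attach the policy."""
    autoscaling_client.register_scalable_target(**scalable_target_args(endpoint_name))
    autoscaling_client.put_scaling_policy(**target_tracking_policy_args(endpoint_name))


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   enable_autoscaling(boto3.client("application-autoscaling"), "qwen-vl-image-descriptor")
```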
## Monitoring
Set up CloudWatch Alarms to monitor:
1. Endpoint invocation metrics
2. Error rates
3. Latency
4. Instance utilization
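The alarms listed above can be created with the CloudWatch `put_metric_alarm` call. The following is a sketch under stated assumptions: the helper names, alarm names, thresholds, and evaluation windows are illustrative, while the `AWS/SageMaker` namespace and the `ModelLatency` and `Invocation5XXErrors` metrics are the ones SageMaker publishes for endpoints.

```python
def _endpoint_dimensions(endpoint_name, variant_name):
    return [
        {"Name": "EndpointName", "Value": endpoint_name},
        {"Name": "VariantName", "Value": variant_name},
    ]


def latency_alarm_args(endpoint_name, variant_name="AllTraffic", threshold_ms=5000):
    """Alarm when average model latency stays above threshold_ms (placeholder)."""
    return {
        "AlarmName": f"{endpoint_name}-model-latency",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": _endpoint_dimensions(endpoint_name, variant_name),
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 5,
        # ModelLatency is reported in microseconds, so convert from milliseconds.
        "Threshold": threshold_ms * 1000,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def server_error_alarm_args(endpoint_name, variant_name="AllTraffic", max_errors=1):
    """Alarm when 5XX errors in a one-minute period reach max_errors (placeholder)."""
    return {
        "AlarmName": f"{endpoint_name}-5xx-errors",
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocation5XXErrors",
        "Dimensions": _endpoint_dimensions(endpoint_name, variant_name),
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": max_errors,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   cloudwatch.put_metric_alarm(**latency_alarm_args("qwen-vl-image-descriptor"))
#   cloudwatch.put_metric_alarm(**server_error_alarm_args("qwen-vl-image-descriptor"))
```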
## Cleanup
To avoid ongoing charges, delete resources when not in use:
```bash
aws sagemaker delete-endpoint --endpoint-name qwen-vl-image-descriptor
aws sagemaker delete-endpoint-config --endpoint-config-name QwenVLImageDescriptorConfig
aws sagemaker delete-model --model-name QwenVLImageDescriptor
```