# AWS SageMaker Deployment Guide

This guide provides step-by-step instructions for deploying the Image Description application to AWS SageMaker.

## Prerequisites

- AWS account with SageMaker permissions
- AWS CLI installed and configured
- Docker installed on your local machine
- The source code from this repository

## Step 1: Create an Amazon ECR Repository

```bash
aws ecr create-repository --repository-name image-descriptor
```

Note the `repositoryUri` value in the command output. You'll use it in the next step.

## Step 2: Build and Push the Docker Image

1. Log in to ECR, replacing `your-region` and `your-account-id` with your own values:

```bash
aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin your-account-id.dkr.ecr.your-region.amazonaws.com
```

2. Build the Docker image:

```bash
docker build -t image-descriptor .
```

3. Tag and push the image:

```bash
docker tag image-descriptor:latest your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
docker push your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
```
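The same registry and image URI appears in the login, tag, and push commands, so it can help to build it once. The sketch below is illustrative only: `ecr_image_uri` is a hypothetical helper, not part of this repository, and the account ID and region in the usage comment are placeholders.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build an ECR image URI in the same form used by the docker commands above.

    Hypothetical helper for illustration; it only formats a string and
    does not contact AWS.
    """
    # AWS account IDs are 12-digit numbers; catch an unreplaced placeholder early.
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"expected a 12-digit AWS account ID, got {account_id!r}")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Example usage with placeholder values:
# uri = ecr_image_uri("123456789012", "us-east-1", "image-descriptor")
# then: docker tag image-descriptor:latest <uri> && docker push <uri>
```

Validating the account ID up front is a small guard against pushing to a URI that still contains the literal `your-account-id` placeholder.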

## Step 3: Create a SageMaker Model

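1. Create a `model.json` file. Replace the image URI and `ExecutionRoleArn` with your own values; the role must allow SageMaker to pull the image from ECR: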

```json
{
    "ModelName": "QwenVLImageDescriptor",
    "PrimaryContainer": {
        "Image": "your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest",
        "Environment": {
            "PORT": "8080"
        }
    },
    "ExecutionRoleArn": "arn:aws:iam::your-account-id:role/service-role/AmazonSageMaker-ExecutionRole"
}
```

2. Create the SageMaker model:

```bash
aws sagemaker create-model --cli-input-json file://model.json
```
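If you would rather generate `model.json` than edit it by hand, a minimal sketch follows. The function names `build_model_definition` and `write_model_definition` are assumptions made for this example; the field names and default values mirror the JSON shown above.

```python
import json


def build_model_definition(image_uri, execution_role_arn,
                           model_name="QwenVLImageDescriptor", port="8080"):
    """Return a dict with the same structure as the model.json shown above.

    Hypothetical helper: the defaults copy the values used in this guide.
    """
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            # Environment values must be strings in the SageMaker API.
            "Environment": {"PORT": str(port)},
        },
        "ExecutionRoleArn": execution_role_arn,
    }


def write_model_definition(path, definition):
    """Write the definition to disk so it can be passed via file://model.json."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(definition, f, indent=4)


# Example usage with placeholder values:
# definition = build_model_definition(
#     "123456789012.dkr.ecr.us-east-1.amazonaws.com/image-descriptor:latest",
#     "arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-ExecutionRole",
# )
# write_model_definition("model.json", definition)
```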

## Step 4: Create an Endpoint Configuration

1. Create a config.json file:

```json
{
    "EndpointConfigName": "QwenVLImageDescriptorConfig",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "QwenVLImageDescriptor",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 1
        }
    ]
}
```

2. Create the endpoint configuration:

```bash
aws sagemaker create-endpoint-config --cli-input-json file://config.json
```

## Step 5: Create the Endpoint

```bash
aws sagemaker create-endpoint --endpoint-name qwen-vl-image-descriptor --endpoint-config-name QwenVLImageDescriptorConfig
```

Deployment takes several minutes. The endpoint is ready once its status changes from `Creating` to `InService`.
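To block until the endpoint is ready, you can poll `describe_endpoint` on the boto3 SageMaker client. The sketch below is one possible approach, not the only one: `wait_for_endpoint` is a hypothetical helper, and the poll interval and attempt limit are arbitrary choices. boto3 also ships a built-in waiter, `client.get_waiter('endpoint_in_service')`, which covers the common case.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_attempts=60,
                      sleep=time.sleep):
    """Poll until the endpoint reports InService, or raise on failure/timeout.

    `client` is expected to be a boto3 SageMaker client (or any object with a
    compatible describe_endpoint method). `sleep` is injectable for testing.
    """
    for _ in range(max_attempts):
        description = client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return description
        if status == "Failed":
            reason = description.get("FailureReason", "no reason given")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        # Still Creating or Updating: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(
        f"Endpoint {endpoint_name} not InService after {max_attempts} checks"
    )


# Example usage (requires boto3 and AWS credentials):
# import boto3
# sagemaker = boto3.client("sagemaker")
# wait_for_endpoint(sagemaker, "qwen-vl-image-descriptor")
```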

## Step 6: Invoke the Endpoint

You can invoke the endpoint using the AWS SDK or AWS CLI.

Using the Python SDK (boto3):

```python
import base64
import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Load and encode the image
with open('data_temp/page_2.png', 'rb') as f:
    image_data = f.read()
image_b64 = base64.b64encode(image_data).decode('utf-8')

# Create the request payload
payload = {
    'image_data': image_b64
}

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName='qwen-vl-image-descriptor',
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
print(json.dumps(result, indent=2))
```

## Step 7: Set Up API Gateway (Optional)

For public HTTP access, set up an API Gateway:

1. Create a new REST API in API Gateway
2. Create a new resource and POST method
3. Configure the integration to use the SageMaker endpoint
4. Deploy the API to a stage
5. Note the API Gateway URL for client use

## Cost Optimization

To optimize costs:

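1. Consider SageMaker Serverless Inference instead of a dedicated endpoint. Serverless endpoints run on CPU only, so this fits only if the model performs acceptably without the GPU instance configured in Step 4.
2. Implement auto-scaling for your endpoint so the instance count follows traffic.
3. For non-critical workloads, consider Asynchronous Inference or Batch Transform. Managed Spot capacity applies to SageMaker training jobs, not to real-time endpoints.
4. Schedule endpoints to be active only during business hours by deleting the endpoint when idle and recreating it from the saved endpoint configuration.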
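As an illustration of the auto-scaling option, the sketch below prepares a target-tracking policy through the Application Auto Scaling API, which is the service that scales SageMaker endpoint variants. The helper names, the policy name, the capacity limits, and the target of 10 invocations per instance are all assumptions chosen for the example; tune them against your own traffic.

```python
def build_autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                               min_instances=1, max_instances=2,
                               invocations_per_instance=10.0):
    """Return (scalable_target, scaling_policy) keyword arguments.

    Hypothetical helper: the dicts are shaped for the boto3
    'application-autoscaling' client's register_scalable_target and
    put_scaling_policy calls.
    """
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",  # assumed name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return target, policy


def apply_autoscaling(autoscaling_client, target, policy):
    """Register the variant as a scalable target, then attach the policy."""
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)


# Example usage (requires boto3 and AWS credentials):
# import boto3
# target, policy = build_autoscaling_requests("qwen-vl-image-descriptor")
# apply_autoscaling(boto3.client("application-autoscaling"), target, policy)
```

Keeping the request construction separate from the API calls makes it possible to review or log the exact parameters before anything is changed in the account.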

## Monitoring

Set up CloudWatch Alarms to monitor:

1. Endpoint invocation metrics
2. Error rates
3. Latency
4. Instance utilization
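A rough sketch of two such alarms, one for latency and one for server-side errors, is shown below. `build_endpoint_alarms` is a hypothetical helper, and the alarm names, thresholds, and evaluation windows are placeholder assumptions. It relies on the `AWS/SageMaker` CloudWatch namespace, where `ModelLatency` is reported in microseconds and `Invocation5XXErrors` counts failed invocations.

```python
def build_endpoint_alarms(endpoint_name, variant_name="AllTraffic",
                          latency_threshold_ms=5000.0, max_5xx_errors=1.0):
    """Return keyword-argument dicts for cloudwatch.put_metric_alarm.

    Hypothetical helper: thresholds and periods are illustrative defaults.
    """
    common = {
        "Namespace": "AWS/SageMaker",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Period": 60,             # seconds per datapoint
        "EvaluationPeriods": 5,   # consecutive datapoints that must breach
        "TreatMissingData": "notBreaching",
    }
    latency_alarm = {
        **common,
        "AlarmName": f"{endpoint_name}-high-latency",
        "MetricName": "ModelLatency",
        "Statistic": "Average",
        # ModelLatency is emitted in microseconds, so convert from ms.
        "Threshold": latency_threshold_ms * 1000.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    error_alarm = {
        **common,
        "AlarmName": f"{endpoint_name}-5xx-errors",
        "MetricName": "Invocation5XXErrors",
        "Statistic": "Sum",
        "Threshold": float(max_5xx_errors),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
    return [latency_alarm, error_alarm]


# Example usage (requires boto3 and AWS credentials):
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# for alarm in build_endpoint_alarms("qwen-vl-image-descriptor"):
#     cloudwatch.put_metric_alarm(**alarm)
```

Without an `AlarmActions` entry these alarms only change state in the console; add an SNS topic ARN there if you want notifications.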

## Cleanup

To avoid ongoing charges, delete resources when not in use:

```bash
aws sagemaker delete-endpoint --endpoint-name qwen-vl-image-descriptor
aws sagemaker delete-endpoint-config --endpoint-config-name QwenVLImageDescriptorConfig
aws sagemaker delete-model --model-name QwenVLImageDescriptor
```