# AWS SageMaker Deployment Guide
This guide provides step-by-step instructions for deploying the Image Description application to AWS SageMaker.
## Prerequisites
- AWS account with SageMaker permissions
- AWS CLI installed and configured
- Docker installed on your local machine
- The source code from this repository
## Step 1: Create an Amazon ECR Repository
```bash
aws ecr create-repository --repository-name image-descriptor
```
Note the `repositoryUri` value in the command output. You'll use it when tagging and pushing the image in the next step.
## Step 2: Build and Push the Docker Image
1. Log in to ECR:
```bash
aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin your-account-id.dkr.ecr.your-region.amazonaws.com
```
2. Build the Docker image:
```bash
docker build -t image-descriptor .
```
3. Tag and push the image:
```bash
docker tag image-descriptor:latest your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
docker push your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
```
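The registry host and image URI in the commands above follow a fixed pattern built from the account ID, region, and repository name. As a minimal sketch (the helper names `ecr_registry` and `ecr_image_uri` are illustrative and not part of this repository, and the pattern shown assumes a standard commercial AWS region), the strings can be generated once and reused in later steps:

```python
def ecr_registry(account_id: str, region: str) -> str:
    """Return the ECR registry host passed to `docker login`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str,
                  repository: str = "image-descriptor",
                  tag: str = "latest") -> str:
    """Return the full image URI used by `docker tag` and `docker push`."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


if __name__ == "__main__":
    # Placeholder account ID and region; substitute your own values.
    print(ecr_image_uri("123456789012", "us-east-1"))
```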
## Step 3: Create a SageMaker Model
1. Create a model.json file:
```json
{
  "ModelName": "QwenVLImageDescriptor",
  "PrimaryContainer": {
    "Image": "your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest",
    "Environment": {
      "PORT": "8080"
    }
  },
  "ExecutionRoleArn": "arn:aws:iam::your-account-id:role/service-role/AmazonSageMaker-ExecutionRole"
}
```
2. Create the SageMaker model:
```bash
aws sagemaker create-model --cli-input-json file://model.json
```
## Step 4: Create an Endpoint Configuration
1. Create a config.json file:
```json
{
  "EndpointConfigName": "QwenVLImageDescriptorConfig",
  "ProductionVariants": [
    {
      "VariantName": "AllTraffic",
      "ModelName": "QwenVLImageDescriptor",
      "InstanceType": "ml.g5.2xlarge",
      "InitialInstanceCount": 1
    }
  ]
}
```
2. Create the endpoint configuration:
```bash
aws sagemaker create-endpoint-config --cli-input-json file://config.json
```
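The `ModelName` in `config.json` has to match the one in `model.json`, which is easy to break when editing the two files by hand. The sketch below is one way to generate both files from a single set of values. The function names are hypothetical, and the defaults simply mirror the JSON shown in Steps 3 and 4:

```python
import json
from pathlib import Path


def build_model_definition(image_uri, role_arn,
                           model_name="QwenVLImageDescriptor"):
    """Return the dict written to model.json (Step 3)."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            "Environment": {"PORT": "8080"},
        },
        "ExecutionRoleArn": role_arn,
    }


def build_endpoint_config(model_name,
                          config_name="QwenVLImageDescriptorConfig",
                          instance_type="ml.g5.2xlarge",
                          instance_count=1):
    """Return the dict written to config.json (Step 4)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }


def write_definitions(image_uri, role_arn, directory="."):
    """Write model.json and config.json so both share one model name."""
    model = build_model_definition(image_uri, role_arn)
    config = build_endpoint_config(model["ModelName"])
    out_dir = Path(directory)
    (out_dir / "model.json").write_text(json.dumps(model, indent=2))
    (out_dir / "config.json").write_text(json.dumps(config, indent=2))
    return model, config
```

The generated files can then be passed to the same `aws sagemaker create-model` and `aws sagemaker create-endpoint-config` commands.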
## Step 5: Create the Endpoint
```bash
aws sagemaker create-endpoint --endpoint-name qwen-vl-image-descriptor --endpoint-config-name QwenVLImageDescriptorConfig
```
Endpoint creation takes several minutes. The endpoint is ready to serve requests once its status is `InService`.
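Rather than checking the status by hand, a script can block until the endpoint is ready. The sketch below uses the boto3 SageMaker waiter `endpoint_in_service`. The function name, the 30-minute default timeout, and the polling interval are assumptions chosen for illustration:

```python
import math


def wait_until_in_service(sm_client, endpoint_name,
                          timeout_seconds=1800, delay_seconds=30):
    """Block until the endpoint is InService, then return its status.

    `sm_client` is expected to be a boto3 SageMaker client, for example
    boto3.client("sagemaker").
    """
    # Translate the overall timeout into a number of polling attempts.
    max_attempts = max(1, math.ceil(timeout_seconds / delay_seconds))
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
```

With boto3 installed and credentials configured, this would be called as `wait_until_in_service(boto3.client("sagemaker"), "qwen-vl-image-descriptor")`. If the endpoint fails to deploy or the timeout is reached, the waiter raises `botocore.exceptions.WaiterError`.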
## Step 6: Invoke the Endpoint
You can invoke the endpoint using the AWS SDK or the AWS CLI.
Using the Python SDK (boto3):
```python
import base64
import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Load the image and base64-encode it
with open('data_temp/page_2.png', 'rb') as f:
    image_data = f.read()
image_b64 = base64.b64encode(image_data).decode('utf-8')

# Create the request payload
payload = {
    'image_data': image_b64
}

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName='qwen-vl-image-descriptor',
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
print(json.dumps(result, indent=2))
```
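For the AWS CLI route, the request body has to exist as a file first. The following sketch writes the same `image_data` payload to disk. The helper names and the `payload.json` file name are assumptions made for this example:

```python
import base64
import json
from pathlib import Path


def build_payload(image_bytes: bytes) -> str:
    """Return the JSON request body with the image base64-encoded."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"image_data": encoded})


def write_payload_file(image_path: str,
                       payload_path: str = "payload.json") -> Path:
    """Read an image from disk and write the request body next to it."""
    out = Path(payload_path)
    out.write_text(build_payload(Path(image_path).read_bytes()),
                   encoding="utf-8")
    return out
```

After calling `write_payload_file("data_temp/page_2.png")`, the endpoint can be invoked with `aws sagemaker-runtime invoke-endpoint --endpoint-name qwen-vl-image-descriptor --content-type application/json --body fileb://payload.json output.json`, which writes the response body to `output.json`.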
## Step 7: Set Up API Gateway (Optional)
For public HTTP access, set up an API Gateway:
1. Create a new REST API in API Gateway
2. Create a new resource and POST method
3. Configure the integration to use the SageMaker endpoint
4. Deploy the API to a stage
5. Note the API Gateway URL for client use
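Once the API is deployed, clients call it over plain HTTPS instead of through the AWS SDK. The sketch below shows what such a client might look like using only the standard library. It assumes the API passes the JSON body through to the endpoint unchanged and returns the endpoint's JSON response. The URL in the usage note is a hypothetical placeholder, and the function names are illustrative:

```python
import base64
import json
import urllib.request


def build_api_request(api_url: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request carrying the base64-encoded image as JSON."""
    body = json.dumps(
        {"image_data": base64.b64encode(image_bytes).decode("utf-8")}
    ).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe_image(api_url: str, image_bytes: bytes, timeout: float = 60.0):
    """Send the image to the API and return the parsed JSON response."""
    request = build_api_request(api_url, image_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `describe_image("https://your-api-id.execute-api.your-region.amazonaws.com/prod/describe", image_bytes)`, with the stage and resource path replaced by the ones you configured. If the method requires an API key or IAM authorization, the request needs the corresponding headers as well.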
## Cost Optimization
To optimize costs:
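1. Use SageMaker Serverless Inference instead of a dedicated endpoint where the model can run without a GPU (Serverless Inference does not offer GPU instances such as `ml.g5.2xlarge`)
2. Implement auto-scaling for your endpoint
3. Use Spot Instances only for non-critical workloads that run outside SageMaker real-time endpoints, which do not run on Spot capacity
4. Schedule endpoints to be active only during business hours (endpoints cannot be paused, so this means deleting the endpoint and recreating it from the saved endpoint configuration)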
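Endpoint auto-scaling is configured through Application Auto Scaling rather than through SageMaker itself. The sketch below builds the arguments for `register_scalable_target` and `put_scaling_policy` for the `AllTraffic` variant created earlier. The policy name, the capacity limits, and the target of 10 invocations per instance per minute are illustrative assumptions and should be tuned from load tests:

```python
def _resource_id(endpoint_name, variant_name):
    """Resource identifier format used for SageMaker endpoint variants."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scalable_target_args(endpoint_name, variant_name="AllTraffic",
                         min_instances=1, max_instances=2):
    """Arguments for register_scalable_target."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


def scaling_policy_args(endpoint_name, variant_name="AllTraffic",
                        invocations_per_instance=10.0):
    """Arguments for put_scaling_policy (target tracking)."""
    return {
        "PolicyName": f"{endpoint_name}-invocations-target",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }


def enable_autoscaling(autoscaling_client, endpoint_name):
    """Register the variant and attach the policy.

    `autoscaling_client` is expected to be
    boto3.client("application-autoscaling").
    """
    autoscaling_client.register_scalable_target(**scalable_target_args(endpoint_name))
    autoscaling_client.put_scaling_policy(**scaling_policy_args(endpoint_name))
```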
## Monitoring
Set up CloudWatch Alarms to monitor:
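1. Endpoint invocation metrics (`Invocations`, `InvocationsPerInstance`)
2. Error rates (`Invocation4XXErrors`, `Invocation5XXErrors`)
3. Latency (`ModelLatency`, `OverheadLatency`)
4. Instance utilization (`CPUUtilization`, `MemoryUtilization`, `GPUUtilization`)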
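As one possible starting point for the latency item, the sketch below builds the arguments for a CloudWatch `put_metric_alarm` call on the endpoint's `ModelLatency` metric. The alarm name, the 5-second threshold, and the 5-minute evaluation window are assumptions for illustration. Note that `ModelLatency` is reported in microseconds, so the threshold is converted from milliseconds:

```python
def latency_alarm_args(endpoint_name, variant_name="AllTraffic",
                       threshold_ms=5000.0):
    """Arguments for put_metric_alarm on the endpoint's ModelLatency."""
    return {
        "AlarmName": f"{endpoint_name}-model-latency",  # hypothetical name
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 1,
        # ModelLatency is reported in microseconds.
        "Threshold": threshold_ms * 1000.0,
        "ComparisonOperator": "GreaterThanThreshold",
        # An idle endpoint emits no datapoints; do not alarm on that.
        "TreatMissingData": "notBreaching",
    }


def create_latency_alarm(cloudwatch_client, endpoint_name):
    """Create the alarm.

    `cloudwatch_client` is expected to be boto3.client("cloudwatch").
    """
    cloudwatch_client.put_metric_alarm(**latency_alarm_args(endpoint_name))
```

To send notifications, an `AlarmActions` list with an SNS topic ARN can be added to the returned dict before the call.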
## Cleanup
To avoid ongoing charges, delete resources when not in use:
```bash
aws sagemaker delete-endpoint --endpoint-name qwen-vl-image-descriptor
aws sagemaker delete-endpoint-config --endpoint-config-name QwenVLImageDescriptorConfig
aws sagemaker delete-model --model-name QwenVLImageDescriptor
```