# AWS SageMaker Deployment Guide

This guide provides step-by-step instructions for deploying the Image Description application to AWS SageMaker.

## Prerequisites

- AWS account with SageMaker permissions
- AWS CLI installed and configured
- Docker installed on your local machine
- The source code from this repository

## Step 1: Create an Amazon ECR Repository

```bash
aws ecr create-repository --repository-name image-descriptor
```

Note the `repositoryUri` value in the command's JSON output. You'll use it when tagging and pushing the image in the next step.

## Step 2: Build and Push the Docker Image

1. Log in to ECR:

```bash
aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin your-account-id.dkr.ecr.your-region.amazonaws.com
```

2. Build the Docker image:

```bash
docker build -t image-descriptor .
```

3. Tag and push the image:

```bash
docker tag image-descriptor:latest your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
docker push your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
```
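The registry host and image URI used in the commands above follow a fixed pattern, so a deployment script can assemble them once and reuse them. The following is a minimal sketch; the function names and the example account ID are illustrative and not part of this repository, and it assumes the standard `amazonaws.com` domain.

```python
def ecr_registry(account_id: str, region: str) -> str:
    """Return the ECR registry host passed to `docker login`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str,
                  repository: str = "image-descriptor", tag: str = "latest") -> str:
    """Return the full image URI used by `docker tag` and `docker push`."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


if __name__ == "__main__":
    # Placeholder values; substitute your own account ID and region.
    print(ecr_image_uri("123456789012", "us-east-1"))
```

The same URI is what goes into the `Image` field of the model definition in Step 3.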

## Step 3: Create a SageMaker Model

1. Create a model.json file:

```json
{
    "ModelName": "QwenVLImageDescriptor",
    "PrimaryContainer": {
        "Image": "your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest",
        "Environment": {
            "PORT": "8080"
        }
    },
    "ExecutionRoleArn": "arn:aws:iam::your-account-id:role/service-role/AmazonSageMaker-ExecutionRole"
}
```

2. Create the SageMaker model:

```bash
aws sagemaker create-model --cli-input-json file://model.json
```

## Step 4: Create an Endpoint Configuration

1. Create a config.json file:

```json
{
    "EndpointConfigName": "QwenVLImageDescriptorConfig",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "QwenVLImageDescriptor",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 1
        }
    ]
}
```

2. Create the endpoint configuration:

```bash
aws sagemaker create-endpoint-config --cli-input-json file://config.json
```
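The `ModelName` in each production variant must match the `ModelName` used when the model was created in Step 3. A small check like the one below can catch a mismatch before calling AWS. This is a sketch only; `check_model_references` and `load_json` are hypothetical helper names, not part of this repository.

```python
import json


def load_json(path: str) -> dict:
    """Read a JSON file such as model.json or config.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def check_model_references(model: dict, config: dict) -> list:
    """Return the names of variants whose ModelName differs from the model's."""
    expected = model["ModelName"]
    return [
        variant["VariantName"]
        for variant in config["ProductionVariants"]
        if variant["ModelName"] != expected
    ]
```

To run it against the files from this guide, call `check_model_references(load_json("model.json"), load_json("config.json"))`; an empty list means every variant points at the model you defined.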

## Step 5: Create the Endpoint

```bash
aws sagemaker create-endpoint --endpoint-name qwen-vl-image-descriptor --endpoint-config-name QwenVLImageDescriptorConfig
```

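Deployment takes several minutes. The endpoint is ready once its status changes from `Creating` to `InService`, which you can check with `aws sagemaker describe-endpoint --endpoint-name qwen-vl-image-descriptor`.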
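Endpoint creation is asynchronous, so a script usually needs to wait before sending requests. The sketch below polls `describe_endpoint` until the endpoint is in service or has failed. The function name, polling interval, and timeout are assumptions chosen for illustration; boto3 also ships a built-in `endpoint_in_service` waiter that serves the same purpose.

```python
import time


def wait_until_in_service(sagemaker_client, endpoint_name: str,
                          poll_seconds: float = 30.0,
                          timeout_seconds: float = 1800.0) -> None:
    """Poll DescribeEndpoint until the endpoint is InService, or raise."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Endpoint {endpoint_name} still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)


# Example usage (makes real AWS calls):
#   import boto3
#   wait_until_in_service(boto3.client("sagemaker"), "qwen-vl-image-descriptor")
```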

## Step 6: Invoke the Endpoint

You can invoke the endpoint using the AWS SDK or AWS CLI.

Using the Python SDK (boto3):

```python
import base64
import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Load and encode the image
with open('data_temp/page_2.png', 'rb') as f:
    image_data = f.read()
image_b64 = base64.b64encode(image_data).decode('utf-8')

# Create the request payload
payload = {
    'image_data': image_b64
}

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName='qwen-vl-image-descriptor',
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
print(json.dumps(result, indent=2))
```

## Step 7: Set Up API Gateway (Optional)

For public HTTP access, set up an API Gateway:

1. Create a new REST API in API Gateway
2. Create a new resource and POST method
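3. Configure the integration to call the SageMaker endpoint, either directly through an AWS service integration with SageMaker Runtime or through a Lambda function that calls `InvokeEndpoint`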
4. Deploy the API to a stage
5. Note the API Gateway URL for client use
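One common way to wire step 3 is a Lambda function between API Gateway and the endpoint. The sketch below assumes a Lambda proxy integration, where the request arrives in `event["body"]` as a JSON string, and reuses the `image_data` field from Step 6. The `make_handler` factory and the `ENDPOINT_NAME` environment variable are hypothetical and not part of this repository.

```python
import json


def make_handler(runtime_client, endpoint_name: str):
    """Build a Lambda handler that forwards the request body to the endpoint."""

    def handler(event, context):
        try:
            payload = json.loads(event.get("body") or "")
        except json.JSONDecodeError:
            return {"statusCode": 400, "body": json.dumps({"error": "body must be JSON"})}
        if not isinstance(payload, dict) or "image_data" not in payload:
            return {"statusCode": 400, "body": json.dumps({"error": "missing image_data"})}

        # Forward the validated payload unchanged to the SageMaker endpoint.
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps(payload),
        )
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": response["Body"].read().decode("utf-8"),
        }

    return handler


# In the Lambda module (makes real AWS calls):
#   import os, boto3
#   lambda_handler = make_handler(boto3.client("sagemaker-runtime"),
#                                 os.environ["ENDPOINT_NAME"])
```

Passing the client in as an argument keeps the handler testable without AWS credentials. The Lambda execution role would still need permission to call `sagemaker:InvokeEndpoint` on the endpoint.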

## Cost Optimization

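To reduce costs:

1. Consider SageMaker Serverless Inference instead of a dedicated endpoint. Serverless Inference does not support GPUs, so this only applies if the model runs acceptably on CPU rather than on the `ml.g5.2xlarge` instance configured in Step 4
2. Configure auto-scaling for the endpoint so the instance count follows traffic
3. For non-critical workloads, consider Asynchronous Inference, which can scale down to zero instances when idle. Spot Instances are not available for real-time endpoints; Managed Spot applies to SageMaker training jobs only
4. Schedule the endpoint to be active only during business hours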
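Auto-scaling (item 2) is configured through Application Auto Scaling rather than through SageMaker itself. The sketch below builds the two requests involved: registering the variant as a scalable target, then attaching a target-tracking policy on invocations per instance. The capacity limits, target value, cooldowns, and policy name are placeholder assumptions to tune for your own traffic.

```python
def scaling_requests(endpoint_name: str, variant_name: str = "AllTraffic",
                     min_instances: int = 1, max_instances: int = 2,
                     invocations_per_instance: float = 10.0):
    """Build kwargs for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Target average invocations per minute, per instance.
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,
            "ScaleOutCooldown": 60,
        },
    }
    return target, policy


def apply_scaling(autoscaling_client, endpoint_name: str) -> None:
    """Register the variant and attach the policy (makes real AWS calls)."""
    target, policy = scaling_requests(endpoint_name)
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)


# Example usage:
#   import boto3
#   apply_scaling(boto3.client("application-autoscaling"), "qwen-vl-image-descriptor")
```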

## Monitoring

Set up CloudWatch Alarms to monitor:

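1. Endpoint invocation metrics (`Invocations`)
2. Error rates (`Invocation4XXErrors`, `Invocation5XXErrors`)
3. Latency (`ModelLatency`, `OverheadLatency`)
4. Instance utilization (`CPUUtilization`, `MemoryUtilization`, `GPUUtilization`)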
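As an illustration of the latency item, the sketch below builds the arguments for a CloudWatch alarm on the endpoint's `ModelLatency` metric. The alarm name, threshold, period, and evaluation settings are placeholder assumptions; note that SageMaker reports `ModelLatency` in microseconds, so the helper converts from milliseconds.

```python
def latency_alarm_request(endpoint_name: str, variant_name: str = "AllTraffic",
                          threshold_ms: float = 5000.0) -> dict:
    """Build kwargs for cloudwatch.put_metric_alarm on ModelLatency."""
    return {
        "AlarmName": f"{endpoint_name}-model-latency",  # hypothetical name
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # consecutive breaching datapoints required
        # ModelLatency is reported in microseconds.
        "Threshold": threshold_ms * 1000.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


# Example usage (makes a real AWS call):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **latency_alarm_request("qwen-vl-image-descriptor"))
```

An alarm on `Invocation5XXErrors` can be built the same way by swapping the metric name, using the `Sum` statistic, and choosing a count-based threshold.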

## Cleanup

To avoid ongoing charges, delete resources when not in use:

```bash
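# Delete the endpoint first; it is the resource that incurs instance charges
aws sagemaker delete-endpoint --endpoint-name qwen-vl-image-descriptor
aws sagemaker delete-endpoint-config --endpoint-config-name QwenVLImageDescriptorConfig
aws sagemaker delete-model --model-name QwenVLImageDescriptor
# Optional: also remove the ECR repository and the images stored in it
aws ecr delete-repository --repository-name image-descriptor --force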
```