AWS SageMaker Deployment Guide

This guide provides step-by-step instructions for deploying the Image Description application to AWS SageMaker.

Prerequisites

  • AWS account with SageMaker permissions
  • AWS CLI installed and configured
  • Docker installed on your local machine
  • The source code from this repository

Step 1: Create an Amazon ECR Repository

aws ecr create-repository --repository-name image-descriptor

Note the repository URI returned by this command. You'll use it in the next step.
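The later steps use the repository URI with a tag appended. As a minimal sketch, the helper below assembles that string from its parts so it can be reused in scripts. The function name `ecr_image_uri` and the example account ID and region are placeholders for illustration, not values from this repository.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build an image URI in the form used by the docker tag/push commands."""
    # AWS account IDs are 12-digit numbers; catch obvious placeholder mistakes early.
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder account ID and region, for illustration only.
print(ecr_image_uri("123456789012", "us-east-1", "image-descriptor"))
```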

Step 2: Build and Push the Docker Image

  1. Log in to ECR:
aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin your-account-id.dkr.ecr.your-region.amazonaws.com
  2. Build the Docker image:
docker build -t image-descriptor .
  3. Tag and push the image:
docker tag image-descriptor:latest your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest
docker push your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest

Step 3: Create a SageMaker Model

  1. Create a model.json file:
{
    "ModelName": "QwenVLImageDescriptor",
    "PrimaryContainer": {
        "Image": "your-account-id.dkr.ecr.your-region.amazonaws.com/image-descriptor:latest",
        "Environment": {
            "PORT": "8080"
        }
    },
    "ExecutionRoleArn": "arn:aws:iam::your-account-id:role/service-role/AmazonSageMaker-ExecutionRole"
}
  2. Create the SageMaker model:
aws sagemaker create-model --cli-input-json file://model.json
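If you would rather generate model.json than edit it by hand, a sketch along these lines may help. The helper names `build_model_request` and `write_model_json` are hypothetical; the field names and default values mirror the JSON shown in this step.

```python
import json


def build_model_request(image_uri, role_arn, model_name="QwenVLImageDescriptor", port="8080"):
    """Return a dict with the same structure as the model.json in this step."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            # Environment values must be strings, as in the original JSON.
            "Environment": {"PORT": str(port)},
        },
        "ExecutionRoleArn": role_arn,
    }


def write_model_json(path, request):
    """Write the request to disk so the CLI can load it with file://."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(request, f, indent=4)
```

The same dict can also be passed to boto3 directly, for example `boto3.client("sagemaker").create_model(**request)`, which takes the same top-level keys as the CLI input JSON.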

Step 4: Create an Endpoint Configuration

  1. Create a config.json file:
{
    "EndpointConfigName": "QwenVLImageDescriptorConfig",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "QwenVLImageDescriptor",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 1
        }
    ]
}
  2. Create the endpoint configuration:
aws sagemaker create-endpoint-config --cli-input-json file://config.json

Step 5: Create the Endpoint

aws sagemaker create-endpoint --endpoint-name qwen-vl-image-descriptor --endpoint-config-name QwenVLImageDescriptorConfig

Endpoint creation takes several minutes. The endpoint can serve requests once its status changes from Creating to InService.
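Rather than polling by hand, a script can block until the endpoint is ready. The sketch below uses the `endpoint_in_service` waiter that boto3 provides on the SageMaker client. The function name `wait_until_in_service` is hypothetical, and the client is passed in as an argument so the function can be tried without AWS credentials.

```python
def wait_until_in_service(sm_client, endpoint_name):
    """Block until the endpoint is ready, then return its reported status."""
    # The waiter polls DescribeEndpoint and raises
    # botocore.exceptions.WaiterError if the endpoint ends up in a failed state.
    sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   status = wait_until_in_service(boto3.client("sagemaker"), "qwen-vl-image-descriptor")
#   print(status)
```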

Step 6: Invoke the Endpoint

You can invoke the endpoint using the AWS SDK or AWS CLI.

Using the Python SDK (boto3):

import base64
import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Load and encode the image
with open('data_temp/page_2.png', 'rb') as f:
    image_data = f.read()
image_b64 = base64.b64encode(image_data).decode('utf-8')

# Create the request payload
payload = {
    'image_data': image_b64
}

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName='qwen-vl-image-descriptor',
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
print(json.dumps(result, indent=2))
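Base64 encoding grows the image by roughly a third, so large slides can exceed the request size limit of a real-time endpoint. The sketch below builds the same `image_data` payload as above and checks its size before sending. The 6 MB default is an assumption, so verify it against the current SageMaker quotas. `build_payload` and `MAX_REQUEST_BYTES` are hypothetical names.

```python
import base64
import json

# Assumed request size limit; check the current SageMaker quotas for your account.
MAX_REQUEST_BYTES = 6 * 1024 * 1024


def build_payload(image_bytes: bytes, max_bytes: int = MAX_REQUEST_BYTES) -> str:
    """Return the JSON request body, or raise if it is too large to send."""
    body = json.dumps({"image_data": base64.b64encode(image_bytes).decode("utf-8")})
    size = len(body.encode("utf-8"))
    if size > max_bytes:
        raise ValueError(f"request body is {size} bytes, over the {max_bytes}-byte limit")
    return body
```

The returned string can be passed as the `Body` argument of `invoke_endpoint` in place of `json.dumps(payload)`.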

Step 7: Set Up API Gateway (Optional)

For public HTTP access, set up an API Gateway:

  1. Create a new REST API in API Gateway
  2. Create a new resource and POST method
  3. Configure the integration to use the SageMaker endpoint
  4. Deploy the API to a stage
  5. Note the API Gateway URL for client use
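The steps above describe a direct integration between API Gateway and SageMaker. A common alternative, sketched here and not taken from this repository, is to put a small Lambda function between the two so that requests can be validated first. The sketch assumes API Gateway's Lambda proxy event format, where the request body arrives as a string under `body`. `make_handler` is a hypothetical factory name, and the runtime client is injected so the handler can be exercised locally.

```python
import json

ENDPOINT_NAME = "qwen-vl-image-descriptor"


def _error(status, message):
    """Build a proxy-style error response."""
    return {"statusCode": status, "body": json.dumps({"error": message})}


def make_handler(runtime_client, endpoint_name=ENDPOINT_NAME):
    """Return a Lambda handler that forwards valid requests to the endpoint."""

    def handler(event, context):
        try:
            payload = json.loads(event.get("body") or "")
        except json.JSONDecodeError:
            return _error(400, "request body must be JSON")
        if not isinstance(payload, dict) or "image_data" not in payload:
            return _error(400, "missing 'image_data' field")

        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps(payload),
        )
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": response["Body"].read().decode("utf-8"),
        }

    return handler


# In a real Lambda module (requires boto3, which the Lambda Python runtime includes):
#   import boto3
#   lambda_handler = make_handler(boto3.client("sagemaker-runtime"))
```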

Cost Optimization

To optimize costs:

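  1. Use SageMaker Serverless Inference instead of a dedicated endpoint, if the model can run on CPU (Serverless Inference does not offer GPU instances such as the ml.g5.2xlarge used above)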
  2. Implement auto-scaling for your endpoint
  3. Use Managed Spot Training for non-critical training or fine-tuning jobs (Spot capacity is not offered for real-time endpoints)
  4. Schedule endpoints to be active only during business hours
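For the auto-scaling item, endpoint variants are scaled through Application Auto Scaling. The sketch below only builds the two request dicts, so it can be inspected without touching AWS. The instance bounds and the target of 10 invocations per instance are illustrative assumptions, and `autoscaling_requests` is a hypothetical helper name.

```python
def autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                         min_instances=1, max_instances=2,
                         invocations_per_instance=10.0):
    """Return (scalable_target, scaling_policy) kwargs for Application Auto Scaling."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Illustrative target; tune it from observed load.
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return target, policy


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   aas = boto3.client("application-autoscaling")
#   target, policy = autoscaling_requests("qwen-vl-image-descriptor")
#   aas.register_scalable_target(**target)
#   aas.put_scaling_policy(**policy)
```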

Monitoring

Set up CloudWatch Alarms to monitor:

  1. Endpoint invocation metrics
  2. Error rates
  3. Latency
  4. Instance utilization
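As one concrete case, a latency alarm can be defined on the `ModelLatency` metric that SageMaker publishes in the `AWS/SageMaker` namespace. The sketch below builds the arguments for CloudWatch's `put_metric_alarm`. The 5-second threshold and the five one-minute evaluation periods are assumptions to adjust, and `latency_alarm_request` is a hypothetical helper name.

```python
def latency_alarm_request(endpoint_name, variant_name="AllTraffic", threshold_ms=5000.0):
    """Return kwargs for CloudWatch put_metric_alarm on average model latency."""
    return {
        "AlarmName": f"{endpoint_name}-model-latency",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 60,            # seconds per data point
        "EvaluationPeriods": 5,  # alarm after five consecutive breaches
        # ModelLatency is reported in microseconds, so convert from milliseconds.
        "Threshold": threshold_ms * 1000.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **latency_alarm_request("qwen-vl-image-descriptor"))
```

Error-rate alarms follow the same shape with `MetricName` set to `Invocation5XXErrors` and `Statistic` set to `Sum`.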

Cleanup

To avoid ongoing charges, delete resources when not in use:

aws sagemaker delete-endpoint --endpoint-name qwen-vl-image-descriptor
aws sagemaker delete-endpoint-config --endpoint-config-name QwenVLImageDescriptorConfig
aws sagemaker delete-model --model-name QwenVLImageDescriptor
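The commands above leave the ECR repository from Step 1 in place. A scripted cleanup that optionally removes it as well might look like the sketch below. `cleanup` is a hypothetical function name, and both clients are passed in so the deletion order can be checked without AWS access.

```python
def cleanup(sm_client, ecr_client=None,
            endpoint_name="qwen-vl-image-descriptor",
            config_name="QwenVLImageDescriptorConfig",
            model_name="QwenVLImageDescriptor",
            repository="image-descriptor"):
    """Delete the endpoint, its configuration, the model, and optionally the ECR repo."""
    # Same order as the CLI commands: endpoint first, since it is what incurs
    # instance charges, then the configuration and model it was built from.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_model(ModelName=model_name)
    if ecr_client is not None:
        # force=True also deletes the images stored in the repository.
        ecr_client.delete_repository(repositoryName=repository, force=True)


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   cleanup(boto3.client("sagemaker"), boto3.client("ecr"))
```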