Spaces:

Shyamnath
/

inferencing-llm

Running

File size: 6,337 Bytes

469eae6

# LiteLLM Proxy Client

A Python client library for interacting with the LiteLLM proxy server. This client provides a clean, typed interface for managing models, keys, credentials, and making chat completions.

## Installation

```bash
pip install litellm
```

## Quick Start

```python
from litellm.proxy.client import Client

# Initialize the client
client = Client(
    base_url="http://localhost:4000",  # Your LiteLLM proxy server URL
    api_key="sk-api-key"               # Optional: API key for authentication
)

# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)
print(response.choices[0].message.content)
```

## Features

The client is organized into several resource clients for different functionality:

- `chat`: Chat completions
- `models`: Model management
- `model_groups`: Model group management
- `keys`: API key management
- `credentials`: Credential management

## Chat Completions

Make chat completion requests to your LiteLLM proxy:

```python
# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the capital of France?"}
    ]
)

# Stream responses
for chunk in client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
):
    print(chunk.choices[0].delta.content or "", end="")
```

## Model Management

Manage available models on your proxy:

```python
# List available models
models = client.models.list()

# Add a new model
client.models.add(
    model_name="gpt-4",
    litellm_params={
        "api_key": "your-openai-key",
        "api_base": "https://api.openai.com/v1"
    }
)

# Delete a model
client.models.delete(model_name="gpt-4")
```

## API Key Management

Manage virtual API keys:

```python
# Generate a new API key
key = client.keys.generate(
    models=["gpt-4", "gpt-3.5-turbo"],
    aliases={"gpt4": "gpt-4"},
    duration="24h",
    key_alias="my-key",
    team_id="team123"
)

# List all keys
keys = client.keys.list(
    page=1,
    size=10,
    return_full_object=True
)

# Delete keys
client.keys.delete(
    keys=["sk-key1", "sk-key2"],
    key_aliases=["alias1", "alias2"]
)
```

## Credential Management

Manage model credentials:

```python
# Create new credentials
client.credentials.create(
    credential_name="azure1",
    credential_info={"api_type": "azure"},
    credential_values={
        "api_key": "your-azure-key",
        "api_base": "https://example.azure.openai.com"
    }
)

# List all credentials
credentials = client.credentials.list()

# Get a specific credential
credential = client.credentials.get(credential_name="azure1")

# Delete credentials
client.credentials.delete(credential_name="azure1")
```

## Model Groups

Manage model groups for load balancing and fallbacks:

```python
# Create a model group
client.model_groups.create(
    name="gpt4-group",
    models=[
        {"model_name": "gpt-4", "litellm_params": {"api_key": "key1"}},
        {"model_name": "gpt-4-backup", "litellm_params": {"api_key": "key2"}}
    ]
)

# List model groups
groups = client.model_groups.list()

# Delete a model group
client.model_groups.delete(name="gpt4-group")
```

## Low-Level HTTP Client

The client provides access to a low-level HTTP client for making direct requests
to the LiteLLM proxy server. This is useful when you need more control or when
working with endpoints that don't yet have a high-level interface.

```python
# Access the HTTP client
client = Client(
    base_url="http://localhost:4000",
    api_key="sk-api-key"
)

# Make a custom request
response = client.http.request(
    method="POST",
    uri="/health/test_connection",
    json={
        "litellm_params": {
            "model": "gpt-4",
            "api_key": "your-api-key",
            "api_base": "https://api.openai.com/v1"
        },
        "mode": "chat"
    }
)

# The response is automatically parsed from JSON
print(response)
```

### HTTP Client Features

- Automatic URL handling (handles trailing/leading slashes)
- Built-in authentication (adds Bearer token if `api_key` is provided)
- JSON request/response handling
- Configurable timeout (default: 30 seconds)
- Comprehensive error handling
- Support for custom headers and request parameters

### HTTP Client `request` method parameters

- `method`: HTTP method (GET, POST, PUT, DELETE, etc.)
- `uri`: URI path (will be appended to base_url)
- `data`: (optional) Data to send in the request body
- `json`: (optional) JSON data to send in the request body
- `headers`: (optional) Custom HTTP headers
- Additional keyword arguments are passed to the underlying requests library

## Error Handling

The client provides clear error handling with custom exceptions:

```python
from litellm.proxy.client.exceptions import UnauthorizedError

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except UnauthorizedError as e:
    print("Authentication failed:", e)
except Exception as e:
    print("Request failed:", e)
```

## Advanced Usage

### Request Customization

All methods support returning the raw request object for inspection or modification:

```python
# Get the prepared request without sending it
request = client.models.list(return_request=True)
print(request.method)  # GET
print(request.url)     # http://localhost:8000/models
print(request.headers) # {'Content-Type': 'application/json', ...}
```

### Pagination

Methods that return lists support pagination:

```python
# Get the first page of keys
page1 = client.keys.list(page=1, size=10)

# Get the second page
page2 = client.keys.list(page=2, size=10)
```

### Filtering

Many list methods support filtering:

```python
# Filter keys by user and team
keys = client.keys.list(
    user_id="user123",
    team_id="team456",
    include_team_keys=True
)
```

## Contributing

Contributions are welcome! Please check out our [contributing guidelines](../../CONTRIBUTING.md) for details.

## License

This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.