# LiteLLM Proxy Client

A Python client library for interacting with the LiteLLM proxy server. The client provides a clean, typed interface for managing models, keys, and credentials, and for making chat completions.
## Installation

```bash
pip install litellm
```
## Quick Start

```python
from litellm.proxy.client import Client

# Initialize the client
client = Client(
    base_url="http://localhost:4000",  # Your LiteLLM proxy server URL
    api_key="sk-api-key"               # Optional: API key for authentication
)

# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)
```
## Features

The client is organized into several resource clients, each covering a different area of functionality:

- `chat`: Chat completions
- `models`: Model management
- `model_groups`: Model group management
- `keys`: API key management
- `credentials`: Credential management
## Chat Completions

Make chat completion requests to your LiteLLM proxy:

```python
# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the capital of France?"}
    ]
)

# Stream responses
for chunk in client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
):
    print(chunk.choices[0].delta.content or "", end="")
```
## Model Management

Manage the models available on your proxy:

```python
# List available models
models = client.models.list()

# Add a new model
client.models.add(
    model_name="gpt-4",
    litellm_params={
        "api_key": "your-openai-key",
        "api_base": "https://api.openai.com/v1"
    }
)

# Delete a model
client.models.delete(model_name="gpt-4")
```
## API Key Management

Manage virtual API keys:

```python
# Generate a new API key
key = client.keys.generate(
    models=["gpt-4", "gpt-3.5-turbo"],
    aliases={"gpt4": "gpt-4"},
    duration="24h",
    key_alias="my-key",
    team_id="team123"
)

# List all keys
keys = client.keys.list(
    page=1,
    size=10,
    return_full_object=True
)

# Delete keys
client.keys.delete(
    keys=["sk-key1", "sk-key2"],
    key_aliases=["alias1", "alias2"]
)
```
## Credential Management

Manage model credentials:

```python
# Create new credentials
client.credentials.create(
    credential_name="azure1",
    credential_info={"api_type": "azure"},
    credential_values={
        "api_key": "your-azure-key",
        "api_base": "https://example.azure.openai.com"
    }
)

# List all credentials
credentials = client.credentials.list()

# Get a specific credential
credential = client.credentials.get(credential_name="azure1")

# Delete credentials
client.credentials.delete(credential_name="azure1")
```
## Model Groups

Manage model groups for load balancing and fallbacks:

```python
# Create a model group
client.model_groups.create(
    name="gpt4-group",
    models=[
        {"model_name": "gpt-4", "litellm_params": {"api_key": "key1"}},
        {"model_name": "gpt-4-backup", "litellm_params": {"api_key": "key2"}}
    ]
)

# List model groups
groups = client.model_groups.list()

# Delete a model group
client.model_groups.delete(name="gpt4-group")
```
## Low-Level HTTP Client

The client provides access to a low-level HTTP client for making direct requests
to the LiteLLM proxy server. This is useful when you need more control or when
working with endpoints that don't yet have a high-level interface.

```python
# Access the HTTP client
client = Client(
    base_url="http://localhost:4000",
    api_key="sk-api-key"
)

# Make a custom request
response = client.http.request(
    method="POST",
    uri="/health/test_connection",
    json={
        "litellm_params": {
            "model": "gpt-4",
            "api_key": "your-api-key",
            "api_base": "https://api.openai.com/v1"
        },
        "mode": "chat"
    }
)

# The response is automatically parsed from JSON
print(response)
```
### HTTP Client Features

- Automatic URL handling (normalizes trailing/leading slashes)
- Built-in authentication (adds a Bearer token when `api_key` is provided)
- JSON request/response handling
- Configurable timeout (default: 30 seconds)
- Comprehensive error handling
- Support for custom headers and request parameters
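The slash normalization described above can be sketched as a small helper. `join_url` below is a hypothetical illustration of the documented behavior, not a function exposed by the client:

```python
def join_url(base_url: str, uri: str) -> str:
    """Join a base URL and a URI path regardless of slash placement.

    A sketch of the "automatic URL handling" feature described above;
    this helper is illustrative and not part of the litellm client API.
    """
    # Strip any trailing slash from the base and any leading slash from
    # the path, then join with exactly one slash between them.
    return base_url.rstrip("/") + "/" + uri.lstrip("/")
```

With this normalization, `join_url("http://localhost:4000/", "/models")` and `join_url("http://localhost:4000", "models")` produce the same URL.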
### HTTP Client `request` method parameters

- `method`: HTTP method (GET, POST, PUT, DELETE, etc.)
- `uri`: URI path (appended to `base_url`)
- `data`: (optional) data to send in the request body
- `json`: (optional) JSON data to send in the request body
- `headers`: (optional) custom HTTP headers
- Additional keyword arguments are passed through to the underlying `requests` library
## Error Handling

The client provides clear error handling with custom exceptions:

```python
from litellm.proxy.client.exceptions import UnauthorizedError

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except UnauthorizedError as e:
    print("Authentication failed:", e)
except Exception as e:
    print("Request failed:", e)
```
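If you want resilience on top of this exception handling, a generic retry wrapper can be layered around any client call. `with_retries` below is a sketch of that pattern, not part of the litellm library:

```python
import time


def with_retries(call, attempts=3, delay=1.0):
    """Invoke `call` up to `attempts` times, re-raising the final failure.

    A generic retry sketch for wrapping client calls; `call` is any
    zero-argument callable, e.g. a lambda around a completions request.
    In practice you would catch only the transient exceptions you expect,
    not bare Exception.
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # simple fixed backoff between attempts
    raise last_exc


# Usage (assuming `client` from the Quick Start):
# response = with_retries(lambda: client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": "Hello"}],
# ))
```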
## Advanced Usage

### Request Customization

All methods support returning the prepared request object for inspection or modification:

```python
# Get the prepared request without sending it
request = client.models.list(return_request=True)

print(request.method)   # GET
print(request.url)      # http://localhost:4000/models
print(request.headers)  # {'Content-Type': 'application/json', ...}
```
### Pagination

Methods that return lists support pagination:

```python
# Get the first page of keys
page1 = client.keys.list(page=1, size=10)

# Get the second page
page2 = client.keys.list(page=2, size=10)
```
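To walk every page rather than fetching them one at a time, you can loop until a page comes back empty or short. `iter_pages` below is a hypothetical helper built on the page/size parameters shown above; it assumes the paged method returns a plain list of items:

```python
def iter_pages(fetch_page, size=10):
    """Yield items from successive pages until an empty or short page.

    `fetch_page` is any callable accepting page/size keyword arguments,
    e.g. `client.keys.list`. This is an illustrative sketch, not part of
    the litellm client, and assumes each call returns a list of items.
    """
    page = 1
    while True:
        batch = fetch_page(page=page, size=size)
        if not batch:
            return          # past the last page
        yield from batch
        if len(batch) < size:
            return          # short page: this was the last one
        page += 1


# Usage (assuming `client` from the Quick Start):
# all_keys = list(iter_pages(client.keys.list, size=10))
```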
### Filtering

Many list methods support filtering:

```python
# Filter keys by user and team
keys = client.keys.list(
    user_id="user123",
    team_id="team456",
    include_team_keys=True
)
```
## Contributing

Contributions are welcome! Please check out our [contributing guidelines](../../CONTRIBUTING.md) for details.

## License

This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.