# 💥 LiteLLM Proxy Server
LiteLLM Server manages:

- Calling 100+ LLMs (Huggingface/Bedrock/TogetherAI/etc.) in the OpenAI ChatCompletions & Completions format
- Setting custom prompt templates + model-specific configs (temperature, max_tokens, etc.)
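Both points boil down to one unified call shape. A minimal sketch of that OpenAI-style format, using the litellm Python SDK directly (the same library the proxy wraps); the model string and prompt here are illustrative:

```python
import litellm

# Any supported provider is addressed with a "provider/model" string;
# the request/response shape stays the OpenAI ChatCompletions format.
response = litellm.completion(
    model="huggingface/bigcode/starcoder",
    messages=[{"role": "user", "content": "def fibonacci(n):"}],
    temperature=0.2,  # model-specific params pass straight through
    max_tokens=100,
)
print(response.choices[0].message.content)
```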
## Quick Start
View all the supported args for the Proxy CLI here.
```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```
### Test

In a new shell, run the following; it makes an `openai.chat.completions` request to the proxy:

```shell
litellm --test
```

The proxy will now automatically route any requests for gpt-3.5-turbo to bigcode/starcoder, hosted on Huggingface Inference Endpoints.
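If you prefer to verify the endpoint by hand, a raw HTTP request works too. A minimal sketch, assuming the proxy is running on the default address from the Quick Start:

```python
import requests

# Hit the proxy's /chat/completions endpoint directly.
# The "gpt-3.5-turbo" alias is routed to the model the proxy was started with.
resp = requests.post(
    "http://0.0.0.0:8000/chat/completions",
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Hey!"}],
    },
)
print(resp.json())
```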
### Replace openai base

```python
import openai

# Point the OpenAI client at the local proxy; an API key is required but unused.
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")
print(client.chat.completions.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))
```
## Supported LLMs
- Bedrock
- Huggingface (TGI)
- Anthropic
- VLLM
- OpenAI Compatible Server
- TogetherAI
- Replicate
- Petals
- Palm
- Azure OpenAI
- AI21
- Cohere
For example, to serve a model on Bedrock:

```shell
$ export AWS_ACCESS_KEY_ID=""
$ export AWS_REGION_NAME="" # e.g. us-west-2
$ export AWS_SECRET_ACCESS_KEY=""

$ litellm --model bedrock/anthropic.claude-v2
```
## Server Endpoints
- `POST /chat/completions` - chat completions endpoint to call 100+ LLMs
- `POST /completions` - completions endpoint
- `GET /models` - available models on the server
- `POST /embeddings` - embedding endpoint for Azure, OpenAI, Huggingface endpoints
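A quick sketch exercising the two endpoints not shown above. The embedding model name is an assumption; use whatever embedding-capable model your proxy is actually configured to serve:

```python
import requests

BASE_URL = "http://0.0.0.0:8000"  # proxy address from the Quick Start

# GET /models - list the models the server exposes
print(requests.get(f"{BASE_URL}/models").json())

# POST /embeddings - "text-embedding-ada-002" is a placeholder model name
resp = requests.post(
    f"{BASE_URL}/embeddings",
    json={"model": "text-embedding-ada-002", "input": ["Hey!"]},
)
print(resp.json())
```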