The following commands start a local Xinference server, launch a quantized Llama 3 8B Instruct model through the llama.cpp engine, and expose it via the LiteLLM proxy:

```shell
# Starting server
echo "Starting server"

# Start the local Xinference server, listening on all interfaces
xinference-local -H 0.0.0.0

# Launch Llama 3 8B Instruct (GGUF v2 format, Q4_K_M quantization) with the llama.cpp engine
xinference launch --model-engine llama.cpp --model-name llama-3-instruct --size-in-billions 8 --model-format ggufv2 --quantization Q4_K_M

# Expose the model through the LiteLLM proxy; --drop_params drops request parameters the backend does not support
litellm --model llama-3-instruct --drop_params
```
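
Once the proxy is up, the model can be queried through LiteLLM's OpenAI-compatible API. Below is a minimal sketch, assuming the proxy is listening on localhost at its default port (4000 in recent LiteLLM versions); adjust the host and port if your setup differs:

```shell
# Send a chat completion request to the LiteLLM proxy's OpenAI-compatible endpoint
# (assumes the proxy is reachable at http://localhost:4000)
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3-instruct",
        "messages": [{"role": "user", "content": "Hello, are you running?"}]
      }'
```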