Spaces: adaptiveaiventures/llama2-interference
Runtime error

adaptiveaiventures committed on Jan 18

Commit f48b421 · verified · 1 Parent(s): c9ad6e4

Create Dockerfile

Files changed (1)

Dockerfile ADDED

+FROM ghcr.io/huggingface/text-generation-inference:latest
+# Define the model to use
+ENV MODEL_ID="adaptiveaiventures/Llama-2-7b-chat-finetune"
+# Set the number of GPU shards (adjust based on GPU availability)
+ENV NUM_SHARD=1
+# Run the TGI server. Exec-form CMD does not expand ${VAR}, so only the port is
+# passed here; the launcher reads MODEL_ID and NUM_SHARD from the environment.
+CMD ["--port", "8080"]
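Once the container is up, the server can be exercised with a small client. The following is a minimal sketch, not part of this commit: it assumes port 8080 from the Dockerfile is reachable at `http://localhost:8080` (the address depends on how the container is published) and uses TGI's `/generate` endpoint. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: the container's port 8080 is reachable at this address.
BASE_URL = "http://localhost:8080"


def build_generate_request(prompt, max_new_tokens=64, base_url=BASE_URL):
    """Build a POST request for TGI's /generate endpoint (hypothetical helper)."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_new_tokens=64, base_url=BASE_URL, timeout=60):
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(prompt, max_new_tokens, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]
```

Calling `generate("What is a Dockerfile?")` against a running server should return the completion as a string. Building the request is kept separate from sending it so the payload can be inspected without a live server.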