
Hydra Decoder (Codebase)


We currently support inference in the single-GPU, batch-size-1 setting, which matches Medusa and is also the most common setup for local model hosting.

Model Weights. We uploaded three model weights for users to try, alongside the original Medusa weights for comparison.

| Base Model | Description | Hugging Face Repo |
| --- | --- | --- |
| Vicuna-7b | Medusa Model (Original) | FasterDecoding/medusa-vicuna-7b-v1.3 |
| Vicuna-7b | Medusa Model - 3 Head (Ours) | Rango2000/medusa-3h-vicuna-7b-v1.3 |
| Vicuna-7b | Hydra Model - 3 Head - 1 Decoding Layer (Ours) | shiqihe/hydra-decoder-1l-vicuna-7b-v1.3 |
| Vicuna-7b | Hydra Model - 3 Head - 2 Decoding Layer (Ours) | shiqihe/hydra-decoder-2l-vicuna-7b-v1.3 |

Inference. You can use the following command to launch a CLI interface:

```shell
# optional: restrict the process to a single GPU
export CUDA_VISIBLE_DEVICES=0
# run the CLI
python -m medusa.inference.cli --model [path/repo of hydra decoder]
```
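If you switch between the released checkpoints often, a small stdlib-only helper can map short labels to the Hub repos from the table above and compose the CLI invocation. This is only a convenience sketch: the short keys are hypothetical labels of our own choosing, not names used by the codebase.

```python
# Sketch: build the argv for `python -m medusa.inference.cli --model <repo>`
# for one of the released checkpoints. The short keys are hypothetical labels;
# the repo ids come from the model table in this card.
REPOS = {
    "medusa-3h": "Rango2000/medusa-3h-vicuna-7b-v1.3",
    "hydra-1l": "shiqihe/hydra-decoder-1l-vicuna-7b-v1.3",
    "hydra-2l": "shiqihe/hydra-decoder-2l-vicuna-7b-v1.3",
}

def cli_command(name: str) -> list[str]:
    """Return the command as an argv list, ready for subprocess.run."""
    repo = REPOS[name]  # raises KeyError for an unknown checkpoint label
    return ["python", "-m", "medusa.inference.cli", "--model", repo]
```

For example, `subprocess.run(cli_command("hydra-1l"))` would launch the interactive CLI for the 1-decoding-layer Hydra checkpoint.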

License: MIT
