This is a Llama2-7B model quantized to 2 bits with FrameQuant. To run it, use the inference script from https://github.com/vsingh-group/FrameQuant (see the sketch below).
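A minimal sketch of fetching the checkpoint before running the FrameQuant inference script. The repo id below is a placeholder, and the script name/arguments in the comments are assumptions; check the FrameQuant repository's README for the exact invocation.

```python
# Sketch: download the quantized checkpoint, then point the FrameQuant
# inference script at it. The repo_id below is a placeholder -- replace it
# with this model's actual Hugging Face repo id.
from huggingface_hub import snapshot_download

checkpoint_dir = snapshot_download(repo_id="your-namespace/Llama2-7B-FrameQuant-2bit")

# Next, run the inference script from https://github.com/vsingh-group/FrameQuant
# against the downloaded checkpoint, e.g. something along the lines of:
#   python inference.py --model_path <checkpoint_dir>
# (script name and flags are assumptions; see the FrameQuant repo for details).
print(f"Checkpoint downloaded to: {checkpoint_dir}")
```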