---
pipeline_tag: text-generation
tags:
- exllama
- exl2
- phi
- code
- math
- tchat
- text-generation
---

**Quantization**: ExLlamaV2 (ExL2) at **2.5 bits** per weight.

## Overview

This is an ExLlamaV2 (ExL2) 2.5 bpw quantized version of [microsoft/phi-4](https://huggingface.co/microsoft/phi-4).

## Quantization By

I often have idle A100 GPUs while building and testing the app, so I put them to work quantizing models. I hope the community finds these quantizations useful.

Andrew Webby @ [RolePlai](https://roleplai.app/)
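For a rough sense of what 2.5 bits per weight means in practice, here is a back-of-the-envelope estimate of the weight footprint. This is a sketch, not a measurement: it assumes phi-4's roughly 14B parameter count, and it ignores KV cache, activations, and quantization metadata overhead, so actual VRAM use will be higher.

```python
# Back-of-the-envelope weight-memory estimate for a 2.5 bpw quant.
# Assumption: phi-4 has roughly 14e9 parameters; KV cache and
# activation memory are NOT included in this figure.
PARAMS = 14e9          # approximate parameter count (assumption)
BITS_PER_WEIGHT = 2.5  # ExL2 quantization level of this repo

weight_bytes = PARAMS * BITS_PER_WEIGHT / 8
weight_gib = weight_bytes / 2**30
print(f"~{weight_gib:.1f} GiB for quantized weights")
```

This lands around 4 GiB for the weights alone, which is why a 2.5 bpw quant of a 14B model can fit on consumer GPUs that the full-precision checkpoint cannot.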