---
base_model: gordicaleksa/YugoGPT
inference: false
language:
- sr
- hr
license: apache-2.0
model_creator: gordicaleksa
model_name: YugoGPT
model_type: mistral
quantized_by: Luka Secerovic
---
## About the model

YugoGPT is currently the best open-source base 7B LLM for BCS (Bosnian, Croatian, Serbian).

This repository contains the model in GGUF format, which is well suited for local inference and doesn't require expensive hardware.
## Versions

The model is compressed into a couple of smaller versions. Compression drops the quality slightly, but significantly increases inference speed. It's suggested to use the `Q4_1` version, as it's the fastest one.
| Name | Size (GB) | Note |
|---|---|---|
| Q4_1 | 4.55 | Weights compressed to 4 bits. The fastest version. |
| q8_0 | 7.7 | Weights compressed to 8 bits. |
| fp16 | 14.5 | Weights compressed to 16 bits. |
| fp32 | 29 | Original 32-bit weights. Not recommended. |
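
If you prefer scripting the download instead of using a GUI, here is a minimal sketch using the `huggingface_hub` library. The repo id and filename below are assumptions; check this repository's file list for the exact names.

```python
# Minimal sketch: download one quantized version programmatically.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="alkibijad/YugoGPT",   # assumed repo id -- verify on the Hub
    filename="yugogpt.Q4_1.gguf",  # assumed filename for the recommended Q4_1 version
)
print(model_path)  # local path to the cached GGUF file
```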
## How to run this model locally?
### LMStudio - the easiest way ⚡️
1. Install [LMStudio](https://lmstudio.ai/).
2. After installation, search for "alkibijad/YugoGPT".
3. Choose a model version (recommended: `Q4_1`).
4. After the model finishes downloading, click "Chat" on the left side and start chatting.
5. [Optional] You can set up a system prompt, e.g. "You're a helpful assistant", or however else you want.

That's it!
### llama.cpp - advanced 🤓
If you're an advanced user who wants to use the CLI and learn more about the GGUF format, go to [llama.cpp](https://github.com/ggerganov/llama.cpp) and follow the instructions there 🙂
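
For a quick scripted test without the full llama.cpp CLI, the `llama-cpp-python` bindings can load the same GGUF files. A minimal sketch, assuming the `Q4_1` file has already been downloaded (the path below is a placeholder):

```python
# Minimal sketch: local inference with llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="yugogpt.Q4_1.gguf",  # placeholder path -- point to your downloaded file
    n_ctx=2048,                      # context window size
)

# YugoGPT is a base model (not chat-tuned), so completion-style
# prompts tend to behave better than conversational questions.
out = llm("Najveći grad u Srbiji je", max_tokens=32)
print(out["choices"][0]["text"])
```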