|
--- |
|
base_model: gordicaleksa/YugoGPT |
|
inference: false |
|
language: |
|
- sr |
|
- hr |
|
license: apache-2.0 |
|
model_creator: gordicaleksa |
|
model_name: YugoGPT |
|
model_type: mistral |
|
quantized_by: Luka Secerovic |
|
--- |
|
[![sr](https://img.shields.io/badge/lang-sr-green.svg)](https://huggingface.co/alkibijad/YugoGPT-GGUF/blob/main/README.md) |
|
[![en](https://img.shields.io/badge/lang-en-red.svg)](https://huggingface.co/alkibijad/YugoGPT-GGUF/blob/main/README.en.md) |
|
|
|
# About the model |
|
[YugoGPT](https://huggingface.co/gordicaleksa/YugoGPT) is currently the best open-source base 7B LLM for BCS (Bosnian, Croatian, Serbian). |
|
|
|
This repository contains the model in [GGUF](https://github.com/ggerganov/llama.cpp/tree/master) format, which is well suited for local inference and doesn't require expensive hardware.
|
|
|
# Versions |
|
The model is quantized into several smaller versions. Quantization slightly reduces output quality, but significantly increases inference speed and lowers memory requirements.
|
|
|
The `Q4_1` version is recommended, as it's the fastest one (a download sketch follows the table below).
|
|
|
|
|
| Name | Size (GB) | Note                                                             |
|------|-----------|------------------------------------------------------------------|
| Q4_1 | 4.55      | Weights quantized to 4 bits. The fastest version; recommended.   |
| Q8_0 | 7.7       | Weights quantized to 8 bits.                                     |
| fp16 | 14.5      | Weights in 16-bit (half-precision) floating point.               |
| fp32 | 29        | Original 32-bit weights. Not recommended for local inference.    |
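If you'd rather fetch a file programmatically than through a GUI, here is a minimal sketch using the official `huggingface_hub` client (`pip install huggingface_hub`). The exact filename is an assumption — check this repo's "Files" tab for the real name:

```python
# Minimal sketch: download one quantized version from the Hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="alkibijad/YugoGPT-GGUF",
    filename="YugoGPT.Q4_1.gguf",  # assumed filename; verify in the repo's file list
)
print(f"Model downloaded to: {path}")
```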
|
|
|
# How to run this model locally? |
|
## LM Studio - the easiest way ⚡️

Install [LM Studio](https://lmstudio.ai/).
|
|
|
- After installation, search for "alkibijad/YugoGPT":

![Search](./media/lm_studio_screen_1.png "Model search")

- Choose a model version (`Q4_1` is recommended):

![Choose a model](./media/lm_studio_screen_2.1.png "Choose a model version")

- Once the download finishes, click "chat" in the left sidebar and start chatting.

- [Optional] You can set up a system prompt, e.g. "You're a helpful assistant", or anything else you like.

![Chat](./media/lm_studio_screen_3.png "Chat")
|
|
|
That's it! |
|
|
|
## llama.cpp - advanced 🤓 |
|
If you're an advanced user who wants to work from the command line and learn more about the `GGUF` format, head over to [llama.cpp](https://github.com/ggerganov/llama.cpp/tree/master) and follow the instructions 🙂
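If you prefer scripting, here is a minimal sketch using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings (`pip install llama-cpp-python`); the model path and prompt are illustrative:

```python
# Minimal sketch: run a GGUF file locally via llama-cpp-python.
from llama_cpp import Llama

# Assumes you've already downloaded a quantized file; path is illustrative.
llm = Llama(model_path="./YugoGPT.Q4_1.gguf", n_ctx=2048)

# YugoGPT is a base (non-instruct) model, so plain text completion works best.
output = llm("Najveći grad u Srbiji je", max_tokens=64, temperature=0.7)
print(output["choices"][0]["text"])
```

Since this is a base model rather than an instruction-tuned one, phrase inputs as text to be continued instead of questions or commands.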