	---
	license: apache-2.0
	language:
	- mk
	- en
	tags:
	- axolotl
	---

	# MKLLM-7B

	MKLLM-7B is an open-source Large Language Model for the Macedonian language. It is built on top of the amazing [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) model through continued pretraining on a mix of Macedonian and English text.
	The training corpus contains around 300M tokens and was used for 2 epochs. Although this is small compared to similar projects, the resulting model is very capable of understanding and processing Macedonian.
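
To put the corpus size in perspective, the token arithmetic can be sketched as follows. Only the 300M-token corpus and the 2 epochs come from this card; the sequence length and batch size are hypothetical values chosen for illustration, not the settings actually used for training.

```python
import math

# From the model card: a corpus of around 300M tokens, trained for 2 epochs.
CORPUS_TOKENS = 300_000_000
EPOCHS = 2

# Hypothetical values for illustration only; the card does not state them.
ASSUMED_SEQ_LEN = 4096
ASSUMED_BATCH_SIZE = 64


def tokens_seen(corpus_tokens: int, epochs: int) -> int:
    """Total number of tokens processed over all epochs."""
    return corpus_tokens * epochs


def optimizer_steps(total_tokens: int, seq_len: int, batch_size: int) -> int:
    """Number of optimizer steps if every step consumes seq_len * batch_size tokens."""
    tokens_per_step = seq_len * batch_size
    return math.ceil(total_tokens / tokens_per_step)


total = tokens_seen(CORPUS_TOKENS, EPOCHS)
steps = optimizer_steps(total, ASSUMED_SEQ_LEN, ASSUMED_BATCH_SIZE)
print(f"tokens seen: {total:,}")
print(f"optimizer steps under the assumed settings: {steps:,}")
```

Under these assumptions the model sees roughly 600M tokens in total, which is a small budget next to the original pretraining of the underlying model.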

	We have built two instruction models on top of the base model which showcase the potential of the model.

	1. [MKLLM-7B-Instruct](https://huggingface.co/trajkovnikola/MKLLM-7B-Instruct): An instruction-tuned model that performs better than leading models of the same size:

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f85631e019bdfd8cd83f10/k0ztAR-H8xdPZHNxhu35_.png)

	2. [MKLLM-7B-Translate](https://huggingface.co/trajkovnikola/MKLLM-7B-Translate): An LLM-as-translator implementation with impressive translation performance:

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f85631e019bdfd8cd83f10/Pi2fRyGjorfsJAaj-B5wW.png)

	## Notes

	- MKLLM-7B is a base model and is not intended for deployment without fine-tuning. The model has no moderation mechanisms.
	- MKLLM-7B can hallucinate and produce factually incorrect output. This is especially pronounced on Macedonian topics because of the relatively small training dataset.
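
Because MKLLM-7B is a base model, it continues text rather than following instructions. Below is a minimal completion sketch, assuming the `transformers` and `torch` packages are installed and that the repository ID `trajkovnikola/MKLLM-7B` (inferred from the sibling model links) is correct. The function name `complete` and the sampling settings are illustrative choices, not recommended values from the authors.

```python
MODEL_ID = "trajkovnikola/MKLLM-7B"  # assumed repo ID, inferred from the sibling model links


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (plain completion, no chat template)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory use relative to float32
    ).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,   # illustrative sampling settings
            temperature=0.7,
        )
    # The decoded text includes the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `complete("Скопје е главен град на")` downloads the weights on first use and returns the prompt plus a sampled continuation. Since the model has no moderation mechanisms, treat the output as unfiltered.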

	[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)