Polish-Lobotomy: An awful Polish fine-tune

Model Description

This model is a first attempt at fine-tuning Phi-3 for Polish. It performs very badly, probably because of the fine-tuning method (teaching a model a new language likely requires a full fine-tune rather than a parameter-efficient one) and the small dataset.

Training Details

  • Trained on a single RTX 4060 for approximately 1 hour
  • Utilized 8-bit QLoRA for memory-efficient training
  • Despite the short training period, the model still managed to learn something (though not very well)


Dataset

The model was trained on the Polish subset of the AYA dataset, which can be found at https://huggingface.co/datasets/CohereForAI/aya_dataset.
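Reproducing the data selection amounts to filtering AYA rows by language. Sketched here on inline sample records so it runs standalone; in practice the rows would come from `load_dataset("CohereForAI/aya_dataset", split="train")`, and the field names below follow the published AYA schema.

```python
# Sketch of selecting the Polish subset from AYA-style records.
# In practice the rows come from the Hugging Face `datasets` library:
#   load_dataset("CohereForAI/aya_dataset", split="train")
# Field names ("inputs", "targets", "language") follow the AYA schema.
records = [
    {"inputs": "Jaka jest stolica Polski?", "targets": "Warszawa.", "language": "Polish"},
    {"inputs": "What is 2 + 2?", "targets": "4.", "language": "English"},
]

# Keep only rows labelled as Polish
polish = [row for row in records if row["language"] == "Polish"]
print(len(polish))  # prints 1
```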

Prompt Template

The prompt template used for this model is identical to the standard Phi-3 template.
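A minimal sketch of that format is below. The special tokens follow the published Phi-3 instruct template; the exact whitespace may differ slightly from the tokenizer's built-in chat template, so treat this as illustrative.

```python
def build_phi3_prompt(user_message: str) -> str:
    """Wrap a single user message in the Phi-3 instruct chat format.

    Token layout follows the published Phi-3 template; exact whitespace
    is an assumption and may differ from the tokenizer's own template.
    """
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

prompt = build_phi3_prompt("Jaka jest stolica Polski?")
print(prompt)
```

In practice, `tokenizer.apply_chat_template` from `transformers` builds this string for you from a list of chat messages.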

Disclaimer

Please be advised that this model's output may contain nonsensical responses. Viewer discretion is strongly advised (but not really necessary).

Use this model at your own risk, and please engage with the output responsibly (but let's be real, it's not like it's going to be useful for anything).
