---
license: llama2
language:
- hu
- en
tags:
- puli
- llama
- finetuned
base_model: ariel-ml/PULI-LlumiX-32K-instruct-f16-0.2
pipeline_tag: text-generation
---

# PULI LlumiX 32K instruct (6.74 billion parameters)

<img src="logo.webp" width="340" style="margin-left: auto; margin-right: auto; display: block;"/>

Instruction-finetuned version of NYTK/PULI-LlumiX-32K.

## Provided files

| Quant method | Bits | Use case |
| ---- | ---- | ---- |
| Q3_K_M | 3 | very small, high quality loss |
| Q4_K_S | 4 | small, greater quality loss |
| Q4_K_M | 4 | medium, balanced quality - recommended |
| Q5_K_S | 5 | large, low quality loss - recommended |
| Q5_K_M | 5 | large, very low quality loss - recommended |
| Q6_K | 6 | very large, extremely low quality loss |
| Q8_0 | 8 | very large, extremely low quality loss - not recommended |
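
These GGUF files can be run with llama.cpp or its Python bindings. A minimal sketch with llama-cpp-python follows; the filename is hypothetical, so substitute the actual file you downloaded from this repo.

```python
from llama_cpp import Llama

# Load one of the provided quants (hypothetical filename).
# n_ctx can be raised up to the model's 32K context window.
llm = Llama(
    model_path="PULI-LlumiX-32K-instruct.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=32768,
)

# Plain completion call; see the ChatML section below for how
# instructions should actually be formatted.
out = llm("Magyarország fővárosa", max_tokens=32)  # prompt: "The capital of Hungary"
print(out["choices"][0]["text"])
```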

## Training platform

Trained on [RunPod](https://runpod.io) with an RTX 4090 GPU.

## Hyperparameters

- Epochs: 3
- LoRA rank (r): 16
- LoRA alpha: 16
- Learning rate: 2e-4
- Learning rate scheduler: cosine
- Optimizer: adamw_8bit
- Weight decay: 0.01
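
The training code is not included in this card, but as a rough sketch, these settings map onto a standard PEFT + Trainer setup as follows; the target modules and output directory are assumptions, not values taken from the card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings as listed above; target_modules is an assumption.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Optimizer and schedule as listed above.
training_args = TrainingArguments(
    output_dir="puli-llumix-32k-instruct-lora",  # assumption
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",  # "adamw_bnb_8bit" in older transformers releases
    weight_decay=0.01,
)
```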

## Dataset

[boapps/szurkemarha](https://huggingface.co/datasets/boapps/szurkemarha)

Only the Hungarian instructions were selected: ~53,000 prompts.
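
A hypothetical selection sketch with the `datasets` library; the `language` column name is an assumption about the dataset schema, not something this card confirms.

```python
from datasets import load_dataset

ds = load_dataset("boapps/szurkemarha", split="train")

# Keep only the Hungarian rows; "language" is an assumed column name.
hu = ds.filter(lambda row: row.get("language") == "hu")
print(len(hu))  # roughly 53,000 prompts were used for finetuning
```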

## Prompt format: ChatML

```
<|im_start|>system
Egy segítőkész mesterséges intelligencia asszisztens vagy. Válaszold meg a kérdést legjobb tudásod szerint!<|im_end|>
<|im_start|>user
Ki a legerősebb szuperhős?<|im_end|>
<|im_start|>assistant
A legerősebb szuperhős a Marvel univerzumában Hulk.<|im_end|>
```
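
A minimal generation sketch with transformers, assembling the ChatML prompt by hand. The repo id is the f16 model listed in the metadata above; whether `<|im_end|>` is registered as an EOS token in this tokenizer is not stated, so the output is cut at that marker manually.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ariel-ml/PULI-LlumiX-32K-instruct-f16-0.2"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Assemble the ChatML prompt exactly as shown above, ending with the
# opening of the assistant turn so the model continues from there.
prompt = (
    "<|im_start|>system\n"
    "Egy segítőkész mesterséges intelligencia asszisztens vagy. "
    "Válaszold meg a kérdést legjobb tudásod szerint!<|im_end|>\n"
    "<|im_start|>user\n"
    "Ki a legerősebb szuperhős?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
text = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(text.split("<|im_end|>")[0])  # cut at the end-of-turn marker if present
```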

## Base model

- Trained with [OpenChatKit](https://github.com/togethercomputer/OpenChatKit)
- The [LLaMA-2-7B-32K](https://huggingface.co/togethercomputer/LLaMA-2-7B-32K) model was continually pretrained on a Hungarian dataset
- The context length was extended to 32K with position interpolation (see the sketch below)
- Checkpoint: 100,000 steps
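
For illustration: linear position interpolation rescales RoPE positions by a constant factor, and going from LLaMA-2's native 4,096-token context to 32,768 tokens implies a factor of 32768 / 4096 = 8. The released checkpoints should already carry this in their config, so the explicit `rope_scaling` override below only demonstrates the mechanism.

```python
from transformers import AutoModelForCausalLM

# Linear RoPE position interpolation: 32768 / 4096 = 8.
# The published config presumably sets this already; passing it
# explicitly here is only to illustrate the mechanism.
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/LLaMA-2-7B-32K",
    rope_scaling={"type": "linear", "factor": 8.0},
)
```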

## Base model dataset for continued pretraining

- Hungarian: 7.9 billion words from 763K documents, each exceeding 5,000 words in length
- English: Long Context QA (2 billion words), BookSum (78 million words)

## Limitations

- max_seq_length: 32,768
- dtype: float16
- vocab size: 32,000
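
These figures can be checked against the published config; a quick sketch, assuming the f16 repo listed in the metadata above:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("ariel-ml/PULI-LlumiX-32K-instruct-f16-0.2")
print(cfg.max_position_embeddings)  # expected: 32768
print(cfg.vocab_size)               # expected: 32000
print(cfg.torch_dtype)              # expected: torch.float16
```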