---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-2
language:
- is
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---
|
This is a SmolLM2-135M-Instruct model fine-tuned on the Icelandic portion of FineWeb-2. It is intended for my own research and has not yet been evaluated more broadly.
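
Since this is a standard SmolLM2 fine-tune, it should load with the usual `transformers` text-generation pipeline. A minimal usage sketch follows; the repository ID is a placeholder for this model's actual ID:

```python
# Minimal usage sketch with the transformers text-generation pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-username/SmolLM2-135M-Instruct-fineweb2-is",  # placeholder repo ID
)

# Icelandic prompt: "Reykjavik is the capital of Iceland."
output = generator(
    "Reykjavík er höfuðborg Íslands.",
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```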
|
|
|
Training:

- Epochs: 1
- Learning rate: 5e-4
- LR scheduler: cosine
- Warmup ratio: 0.05
- Per-device batch size: 1
- GPUs: 4× A100 (40 GB)
- Gradient accumulation steps: 64
- Effective batch size: 256 (1 × 4 GPUs × 64 accumulation steps)
- Max. context length: 8192 tokens
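
For reference, these hyperparameters map onto a standard Hugging Face `Trainer` run roughly as sketched below. This is not the actual training script: the FineWeb-2 config name, the tokenization step, and the bf16 choice are assumptions.

```python
# Hedged sketch of the fine-tuning setup implied by the hyperparameters above.
# Launch across 4 GPUs with, e.g.: torchrun --nproc_per_node=4 train.py
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "HuggingFaceTB/SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Icelandic subset of FineWeb-2; config name assumed per the dataset card.
dataset = load_dataset("HuggingFaceFW/fineweb-2", name="isl_Latn", split="train")

def tokenize(batch):
    # Truncate to the 8192-token max context length listed above.
    return tokenizer(batch["text"], truncation=True, max_length=8192)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="smollm2-135m-is",
    num_train_epochs=1,
    learning_rate=5e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=1,   # 1 per GPU
    gradient_accumulation_steps=64,  # 1 x 4 GPUs x 64 = 256 effective
    bf16=True,                       # assumption: bf16 on A100s
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```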