# jekunz/smollm-135m-lora-fineweb-faroese
Safetensors · Dataset: HuggingFaceFW/fineweb-2 · Language: Faroese · License: apache-2.0
LoRA setup:
- Rank: 256
- Alpha: 512
- Target modules: ["up_proj", "down_proj", "gate_proj", "o_proj"]

Training:
- 1 epoch
- Learning rate: 8e-4
- LR scheduler: cosine
- Warmup ratio: 0.05
- Per-device batch size: 1
- 4 × A100 (40 GB) GPUs
- Gradient accumulation steps: 64
- Effective batch size: 256 (4 GPUs × batch size 1 × 64 accumulation steps)
- Max. context length: 8192 tokens
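
The settings above map onto a peft/transformers configuration roughly as follows. This is a minimal sketch, assuming training used Hugging Face peft; the training script is not published, so the task type, output directory, and Trainer wiring are assumptions:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Base model named in the model tree below.
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# LoRA setup from the card: rank 256, alpha 512, MLP projections
# plus the attention output projection as target modules.
lora_config = LoraConfig(
    r=256,
    lora_alpha=512,
    target_modules=["up_proj", "down_proj", "gate_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Training hyperparameters from the card. Per-device batch size 1 with
# 64 gradient accumulation steps on 4 GPUs gives the effective batch
# size of 256. output_dir is a placeholder, not from the card.
args = TrainingArguments(
    output_dir="smollm-135m-lora-fineweb-faroese",
    num_train_epochs=1,
    learning_rate=8e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
)
```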
(renamed from jekunz/smollm-135m-lora-fineweb-fao-test3)
Model tree for jekunz/smollm-135m-lora-fineweb-faroese:
- Base model: HuggingFaceTB/SmolLM2-135M
- Quantized: HuggingFaceTB/SmolLM2-135M-Instruct
- Finetuned (60): this model
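
The card does not document how to load the model; below is a minimal sketch, assuming the repository hosts PEFT LoRA adapter weights (rather than merged full-model weights) on top of the base model named above. The prompt is an arbitrary Faroese example:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the repo stores LoRA adapter weights, not merged weights.
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
model = PeftModel.from_pretrained(base, "jekunz/smollm-135m-lora-fineweb-faroese")

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
inputs = tokenizer("Føroyar eru", return_tensors="pt")  # "The Faroe Islands are"
output = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```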
Dataset used to train jekunz/smollm-135m-lora-fineweb-faroese:
- HuggingFaceFW/fineweb-2
Collection including jekunz/smollm-135m-lora-fineweb-faroese:
- SmolLM CPT LoRA (collection, 3 items, updated 3 days ago)