|
--- |
|
license: agpl-3.0 |
|
datasets: |
|
- HeNLP/HeDC4 |
|
- HebArabNlpProject/HebNLI |
|
- pig4431/HeQ_v1 |
|
language: |
|
- he |
|
pipeline_tag: text-generation |
|
tags: |
|
- RWKV |
|
- Hebrew |
|
--- |
|
|
|
# TamiLM Hebrew Nano |
|
|
|
A specialized Modern Hebrew LLM based on the RWKV v6 architecture.

Trained only on Modern Hebrew datasets, with a custom vocabulary optimized for Modern Hebrew.
|
|
|
Trained at [Tel Aviv Makers Hackerspace](https://wiki.telavivmakers.org/) |
|
|
|
### Params |
|
|
|
Layers `12` |
|
Depth `512` |
|
Head size `64` |
|
Train ctx_len `512` |
|
Train tokens `6,841,411,389 (≈6.8 billion)`
|
Vocab size `65536` |
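A rough, back-of-the-envelope parameter count can be derived from the figures above. This is a sketch under two assumptions not stated in the card: that "Depth 512" is the embedding dimension (d_model), and that an RWKV-style block costs roughly 13·d² parameters; neither is exact for this checkpoint.

```python
# Rough sizing from the card's figures. ASSUMPTIONS (not from the card):
# "Depth" = embedding dimension, and ~13*d^2 params per RWKV block.
vocab_size = 65536
d_model = 512
n_layers = 12

embedding = vocab_size * d_model       # input embedding table
lm_head = vocab_size * d_model         # output projection
blocks = n_layers * 13 * d_model ** 2  # very rough per-block estimate

total = embedding + lm_head + blocks
print(f"~{total / 1e6:.0f}M parameters")  # on the order of ~100M
```

This places the model firmly in the "nano" range, dominated roughly equally by the two vocabulary projections and the transformer-style blocks.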
|
|
|
### Train Compute |
|
|
|
All compute was performed on a single Nvidia P40 card.

Experiments: `62 hours 52 minutes (≈2.6 days)`

Training run: `208 hours 10 minutes (≈8.7 days)`
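The token count and wall-clock time above imply an average training throughput, a quick sanity check on the single-P40 setup:

```python
# Implied average throughput from the card's figures.
train_tokens = 6_841_411_389
seconds = 208 * 3600 + 10 * 60  # 208 h 10 min training run
print(f"{train_tokens / seconds:,.0f} tokens/sec")
```

That works out to roughly nine thousand tokens per second sustained over the full run.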
|
|
|
### How to run |
|
|
|
1. Load the model into your favourite RWKV v6 runtime

2. Swap the RWKV tokenizer for a Hugging Face `tokenizers` tokenizer

3. Load the vocabulary JSON from this repo
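The vocab step above can be sketched without any model-specific code. This is a stdlib-only stand-in: the filename, vocab contents, and the greedy longest-match scheme are all illustrative assumptions, not this repo's actual tokenizer (for real use, load the repo's JSON with the Hugging Face `tokenizers` library as in step 2).

```python
import json

# Hypothetical toy vocab; the real repo ships its own vocabulary JSON.
vocab_json = '{"של": 3, "ום": 4, "ש": 1, "ל": 2}'
vocab = json.loads(vocab_json)

def greedy_encode(text, vocab):
    """Greedy longest-match tokenization over a token->id mapping."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            i += 1  # no match: skip the character
    return ids

print(greedy_encode("שלום", vocab))  # matches "של" then "ום" -> [3, 4]
```

The resulting token IDs are what the RWKV runtime consumes in place of the stock RWKV tokenizer's output.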