---
license: agpl-3.0
datasets:
- HeNLP/HeDC4
- HebArabNlpProject/HebNLI
- pig4431/HeQ_v1
language:
- he
pipeline_tag: text-generation
tags:
- RWKV
- Hebrew
---
# TamiLM Hebrew Nano
A Modern Hebrew-specialized LLM based on the RWKV v6 architecture.
Trained exclusively on Modern Hebrew datasets, with a custom vocabulary optimized for Modern Hebrew.
Trained at the [Tel Aviv Makers Hackerspace](https://wiki.telavivmakers.org/).
### Params
Layers `12`
Depth (embedding dimension) `512`
Head size `64`
Train ctx_len `512`
Train tokens `6,841,411,389 (~6.8 billion)`
Vocab size `65536`
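The hyperparameters above can be collected into a training config in the style of BlinkDL's RWKV-LM trainer. This is a hypothetical sketch: the key names (`n_layer`, `n_embd`, `head_size_a`, `ctx_len`, `vocab_size`) are assumptions based on that trainer's conventions, not taken from this repo.

```python
# Hypothetical sketch: the card's hyperparameters as an RWKV-LM-style
# config dict (key names are assumptions, not from this repo).
config = {
    "n_layer": 12,        # Layers
    "n_embd": 512,        # Depth (embedding dimension)
    "head_size_a": 64,    # Head size
    "ctx_len": 512,       # Training context length
    "vocab_size": 65536,  # Custom Modern Hebrew vocabulary
}

# A derived sanity check: embedding dim / head size gives the
# number of heads per layer.
heads = config["n_embd"] // config["head_size_a"]
print(heads)  # 8
```

Note that the 65,536-entry vocabulary matches the default RWKV World vocab size, but is rebuilt here for Modern Hebrew.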
### Train Compute
All compute was performed on a single NVIDIA Tesla P40 GPU.
Experiments: `62 hours 52 minutes (~2.6 days)`
Training run: `208 hours 10 minutes (~8.7 days)`
### How to run
1. Load the model into your favourite RWKV v6 runtime
2. Swap the default RWKV tokenizer for a Hugging Face `tokenizers` tokenizer
3. Load the vocab JSON from this repo
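The tokenizer-swap step above can be sketched with the Hugging Face `tokenizers` library. In practice you would load the vocab JSON shipped in this repo (e.g. with `Tokenizer.from_file`); since that file name and format are not specified here, this sketch builds a tiny in-memory word-level vocab just to show the API shape.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace

# Hypothetical sketch: a tiny in-memory vocab standing in for the
# repo's 65536-entry Hebrew vocab JSON (real loading would use
# Tokenizer.from_file on the file from this repo).
vocab = {"שלום": 0, "עולם": 1, "[UNK]": 2}
tok = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tok.pre_tokenizer = Whitespace()

# Encode Hebrew text into ids that would be fed to the RWKV v6 runtime.
enc = tok.encode("שלום עולם")
print(enc.ids)  # [0, 1]

# Decode ids coming back from the runtime into text.
print(tok.decode(enc.ids))
```

The same `encode`/`decode` calls replace the RWKV runtime's built-in tokenizer wherever it converts between text and token ids.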