
TamiLM Hebrew Nano

A Modern Hebrew-specialized LLM based on the RWKV v6 architecture,
trained exclusively on Modern Hebrew datasets with a custom vocabulary optimized for the language.

Trained at Tel Aviv Makers Hackerspace

Params

Layers: 12
Depth (embedding dimension): 512
Head size: 64
Training context length (ctx_len): 512
Training tokens: 6,841,411,389 (~6.8 billion)
Vocabulary size: 65,536
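
For reference, a sketch of how these hyperparameters might map onto BlinkDL's RWKV-LM trainer flags; the flag names follow that trainer's conventions and are an assumption here, not something stated on this card:

```python
# Assumed mapping of the card's hyperparameters onto RWKV-LM training flags.
# Flag names follow BlinkDL's RWKV-LM convention; not confirmed by this card.
train_args = {
    "n_layer": 12,        # Layers
    "n_embd": 512,        # Depth (embedding dimension) = 8 heads x 64 head size
    "head_size_a": 64,    # Head size
    "ctx_len": 512,       # Training context length
    "vocab_size": 65536,  # Vocabulary size
}
```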

Train Compute

All compute was performed on a single NVIDIA P40 GPU.
Experiments: 62 hours 52 minutes (~2.6 days)
Training run: 208 hours 10 minutes (~8.7 days)

How to run

  1. Load the model into your favourite RWKV v6 runtime.
  2. Replace the built-in RWKV tokenizer with Hugging Face tokenizers.
  3. Load the vocabulary JSON from this repo (see the sketch below).
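
A minimal inference sketch of these steps, assuming the `rwkv` pip package as the RWKV v6 runtime; the file names `TamiLM-Hebrew-Nano.pth` and `vocab.json` are illustrative, and the sketch assumes the repo's JSON is a full Hugging Face `tokenizers` file loadable with `Tokenizer.from_file`:

```python
# Sketch: greedy generation with the `rwkv` pip package (assumed runtime)
# and a Hugging Face `tokenizers` tokenizer swapped in for the default one.
import os
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # "1" compiles the optional CUDA kernel

from rwkv.model import RWKV        # pip install rwkv
from tokenizers import Tokenizer   # pip install tokenizers

# Hypothetical file names -- adjust to the actual files in this repo.
model = RWKV(model="TamiLM-Hebrew-Nano.pth", strategy="cpu fp32")
tok = Tokenizer.from_file("vocab.json")

prompt = "שלום עולם"  # "Hello world"
state = None
logits, state = model.forward(tok.encode(prompt).ids, state)

# Greedily extend the prompt one token at a time, reusing the RWKV state.
generated = []
for _ in range(32):
    next_id = int(logits.argmax())
    generated.append(next_id)
    logits, state = model.forward([next_id], state)

print(prompt + tok.decode(generated))
```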
