metadata
language:
- en
- zh
- de
- fr
- es
- pt
- ru
- it
- ja
- ko
- vi
- ar
tags:
- pytorch
- text-generation
- causal-lm
- rwkv
license: apache-2.0
datasets:
- EleutherAI/pile
- togethercomputer/RedPajama-Data-1T
RWKV-4 World
Model Description
RWKV-4 trained on 100+ world languages.
How to use:
- install the latest rwkv pip package (0.7.4 or newer)
- test with the latest v2/benchmark_world.py from ChatRWKV
Differences between World and Raven:
- set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of passing 20B_tokenizer.json. Write the name EXACTLY as shown here; "rwkv_vocab_v20230424" is included in rwkv 0.7.4+
- use a Question/Answer, User/AI, or Human/Bot prompt. DO NOT USE Bob/Alice or Q/A
- use fp32 (fp16 currently overflows; this may be fixed in a future release)
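The settings in this list can be combined into a small loader. The following is a minimal sketch, not the author's reference script: the helper names (fp32_strategy, load_world_pipeline) and the model_path argument are hypothetical, and the two environment switches are an assumption about a typical setup. It assumes rwkv 0.7.4+ is installed and that a World checkpoint has already been downloaded.

```python
import os

# Vocabulary name quoted in the list above; it ships with rwkv 0.7.4+.
WORLD_VOCAB = "rwkv_vocab_v20230424"


def fp32_strategy(device: str = "cpu") -> str:
    """Build an rwkv strategy string that keeps the weights in fp32."""
    return f"{device} fp32"


def load_world_pipeline(model_path: str, device: str = "cpu"):
    """Load a World checkpoint and pair it with the World vocabulary.

    model_path is a hypothetical local path to a downloaded checkpoint.
    """
    # Assumption: these switches are read when rwkv.model is imported,
    # so they are set before the import below.
    os.environ.setdefault("RWKV_JIT_ON", "1")
    os.environ.setdefault("RWKV_CUDA_ON", "0")  # 0 = do not build the custom CUDA kernel

    # Imported lazily so the helpers above work without rwkv installed.
    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE

    model = RWKV(model=model_path, strategy=fp32_strategy(device))
    # Pass the vocabulary name, not 20B_tokenizer.json.
    return PIPELINE(model, WORLD_VOCAB)
```

With a checkpoint on disk, load_world_pipeline("/path/to/checkpoint", device="cuda") should return a pipeline whose generate method accepts a prompt string; the path here is a placeholder.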
NOTE: the new greedy tokenizer tokenizes '\n\n' as a single token instead of ['\n','\n'].
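To illustrate why a greedy tokenizer behaves this way, here is a toy longest-match tokenizer. It is only a sketch of the general technique: the function name and the tiny vocabulary are made up for this example, and the real rwkv_vocab_v20230424 tokenizer may differ in its details (for instance, in how it handles bytes).

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Split text by always taking the longest vocab entry at the cursor."""
    max_len = max(len(tok) for tok in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"no vocab entry matches at position {i}")
    return tokens


# Toy vocabulary for illustration only; NOT the real rwkv_vocab_v20230424.
TOY_VOCAB = {"\n", "\n\n", "Question", "Answer", ":", " ", "hi"}

tokens = greedy_tokenize("Question: hi\n\nAnswer:", TOY_VOCAB)
print(tokens)
# ['Question', ':', ' ', 'hi', '\n\n', 'Answer', ':']
```

Because "\n\n" is itself a vocabulary entry and is longer than "\n", the longest-match rule picks it as one token.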
A good prompt example:
Question: hi
Answer: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
Question: xxxxxx
Answer:
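The prompt layout above can be assembled programmatically. This is a minimal sketch with a hypothetical helper name (build_prompt). It assumes that turns are separated by a blank line ('\n\n'), which the layout above does not confirm, and that blank lines inside the user's question should be collapsed so they are not mistaken for a turn boundary; both are assumptions, not documented requirements.

```python
import re

GREETING_QUESTION = "hi"
GREETING_ANSWER = (
    "Hi. I am your assistant and I will provide expert full response in "
    "full details. Please feel free to ask any question and I will always "
    "answer it."
)


def build_prompt(question: str, sep: str = "\n\n") -> str:
    """Build a Question/Answer prompt ending in an open 'Answer:' turn.

    sep is the assumed turn separator (a blank line by default).
    """
    # Assumption: collapse blank lines inside the question so the only
    # blank lines in the prompt are the turn separators.
    question = re.sub(r"\n{2,}", "\n", question.strip())
    return (
        f"Question: {GREETING_QUESTION}{sep}"
        f"Answer: {GREETING_ANSWER}{sep}"
        f"Question: {question}{sep}"
        f"Answer:"
    )


prompt = build_prompt("What is RWKV?")
```

The resulting string can be passed as the context to the pipeline's generate call; the model is then expected to continue after the final "Answer:".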