|
--- |
|
language: |
|
- en |
|
- zh |
|
- de |
|
- fr |
|
- es |
|
- pt |
|
- ru |
|
- it |
|
- ja |
|
- ko |
|
- vi |
|
- ar |
|
tags: |
|
- pytorch |
|
- text-generation |
|
- causal-lm |
|
- rwkv |
|
license: apache-2.0 |
|
datasets: |
|
- EleutherAI/pile |
|
- togethercomputer/RedPajama-Data-1T |
|
--- |
|
|
|
# RWKV-4 World |
|
|
|
## Model Description |
|
|
|
RWKV-4 World is an RWKV-4 model trained on text in 100+ world languages.
|
|
|
How to use:

* use the latest rwkv pip package (0.7.4+)

* use the latest ChatRWKV v2/benchmark_world.py to test
|
|
|
The differences between World & Raven:

* set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of 20B_tokenizer.json (EXACTLY AS WRITTEN HERE; "rwkv_vocab_v20230424" is included in rwkv 0.7.4+)

* use a Question/Answer, User/AI, or Human/Bot prompt. **DO NOT USE Bob/Alice or Q/A**

* use **fp32** (fp16 currently overflows; this is fixable in the future)
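Putting these points together, here is a minimal loading sketch, assuming the rwkv pip package (0.7.4+) and a locally downloaded World checkpoint; the checkpoint path below is a placeholder, not a real filename:

```python
# Sketch: load an RWKV-4 World checkpoint with the rwkv pip package.
# 'RWKV-4-World.pth' is a placeholder path for a downloaded checkpoint.
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

model = RWKV(model='RWKV-4-World.pth', strategy='cpu fp32')  # fp32: fp16 overflows at this moment
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")           # NOT 20B_tokenizer.json

args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
out = pipeline.generate("Question: hi\n\nAnswer:", token_count=64, args=args)
print(out)
```

Note that the vocab name string is passed to PIPELINE in place of a tokenizer file; the package resolves it internally.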
|
|
|
NOTE: the new greedy tokenizer tokenizes '\n\n' as a single token instead of ['\n', '\n'].
|
|
|
A good prompt example: |
|
``` |
|
Question: hi |
|
|
|
Answer: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it. |
|
|
|
Question: xxxxxx |
|
|
|
Answer: |
|
``` |
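The prompt above can be assembled programmatically. A minimal sketch follows; the `make_prompt` helper is hypothetical, not part of the rwkv package:

```python
# Build a World-style Question/Answer prompt. Turns are separated by
# '\n\n', which the new greedy tokenizer encodes as a single token.
def make_prompt(question: str) -> str:
    intro = (
        "Question: hi\n\n"
        "Answer: Hi. I am your assistant and I will provide expert full response "
        "in full details. Please feel free to ask any question and I will always answer it.\n\n"
    )
    return intro + f"Question: {question}\n\nAnswer:"

prompt = make_prompt("What is RWKV?")
print(prompt)
```

Ending the prompt at "Answer:" (with no trailing space or newline) leaves the model to generate the response text directly.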