rwkv7-0.1B-g1

This is the RWKV-7 g1 model in flash-linear-attention format. The g1 model series was trained on significantly more data and incorporates deep thinking abilities.

Model Details

Model Description

  • Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
  • Funded by: RWKV Project (Under LF AI & Data Foundation)
  • Model type: RWKV7
  • Language(s) (NLP): Multilingual
  • License: Apache-2.0
  • Parameter count: 191M
  • Tokenizer: RWKV World tokenizer
  • Vocabulary size: 65,536

Model Sources

Uses

Install flash-linear-attention and the latest version of transformers before using this model:

pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'

Direct Use

You can use this model just like any other Hugging Face model:

from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
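The loading calls above can be extended into a small generation helper. This is a minimal sketch, not an official recipe: the function name `generate_text`, the `max_new_tokens` default, and the device selection are assumptions made for illustration, and the flash-linear-attention kernels may require a CUDA GPU.

```python
MODEL_ID = "fla-hub/rwkv7-0.1B-g1"


def generate_text(prompt, max_new_tokens=64):
    """Load the model and return the prompt plus its generated continuation."""
    # Imported lazily so that defining this helper does not need the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    # Assumption: prefer a CUDA GPU; CPU execution may not be supported by the kernels.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()

    # Tokenize the prompt and generate a continuation without tracking gradients.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text(...)` downloads the weights on first use. Loading the model inside the function keeps the sketch short; in a real application you would load it once and reuse it across calls.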

Training Data

This model was trained on the World v3.5 dataset, with a total of more than 5 trillion tokens.

FAQ

Q: The safetensors metadata is None.

A: Upgrade transformers to version 4.48.0 or later: pip install 'transformers>=4.48.0'
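To check the installed version before loading the model, something like the following may help. It is a rough sketch using only the standard library; `parse_release` and `transformers_is_recent_enough` are hypothetical helper names, and the parser deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 48, 0)


def parse_release(version_string):
    """Return the leading numeric components of a version as a 3+ element tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions such as "5.0" so tuple comparison behaves as expected.
    return parts + (0,) * max(0, 3 - len(parts))


def transformers_is_recent_enough(installed=None):
    """Check an explicit version string, or the installed transformers package."""
    if installed is None:
        try:
            installed = version("transformers")
        except PackageNotFoundError:
            return False
    return parse_release(installed) >= MIN_TRANSFORMERS
```

If `transformers_is_recent_enough()` returns False, run the pip command above and restart the Python process.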

Thinking Prompt

<|rwkv_tokenizer_end_of_text|>User: <Your Question Here>

Assistant: <think

Leave the <think tag open: do not add the closing > bracket.
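The template can be wrapped in a small helper so that the unclosed tag is never "fixed" by accident. This is a sketch under one assumption: the blank line between the User and Assistant turns in the template is taken to be a literal double newline. The name `build_thinking_prompt` is hypothetical.

```python
EOT = "<|rwkv_tokenizer_end_of_text|>"


def build_thinking_prompt(question):
    """Format a single-turn question using the thinking template."""
    # Assumption: the blank line in the template is a literal "\n\n".
    # The trailing "<think" is deliberately left without a closing ">".
    return f"{EOT}User: {question.strip()}\n\nAssistant: <think"
```

For example, `build_thinking_prompt("What is 2+2?")` yields a string that starts with the end-of-text token and ends with `Assistant: <think`.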

Additional Caveats for Prompting

Always add <|rwkv_tokenizer_end_of_text|> (token ID = 0) before your prompt. The model cannot attend to the first token it receives because of state initialization issues.

Bad prompt example:

Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"

"The longer I wait to tell her, the worse it will be for both of us."

"Good luck. You're going to need it," said

The model is unable to recall Mathews because it is the very first token of the input.

Good prompt example:

<|rwkv_tokenizer_end_of_text|>Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"

"The longer I wait to tell her, the worse it will be for both of us."

"Good luck. You're going to need it," said

With this prompt, the model outputs Mathews as expected.

Without this token: lambada_openai ppl=13.84 acc=48.13%

With this token added: lambada_openai ppl=12.36 acc=49.12%
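To put the two lambada_openai results in proportion, the differences can be computed directly from the reported figures. The snippet uses only the numbers quoted above; the variable names are illustrative.

```python
# Reported lambada_openai results without and with the leading token.
ppl_without, ppl_with = 13.84, 12.36
acc_without, acc_with = 48.13, 49.12

# Relative perplexity reduction and absolute accuracy gain.
relative_ppl_reduction = (ppl_without - ppl_with) / ppl_without
accuracy_gain_points = acc_with - acc_without

print(f"perplexity reduction: {relative_ppl_reduction:.1%}")
print(f"accuracy gain: {accuracy_gain_points:.2f} percentage points")
```

That works out to a perplexity reduction of roughly 10.7% and an accuracy gain of 0.99 percentage points.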

Note: this phenomenon is very rare for Transformers but significant for RNNs. We speculate that the model uses the first token to pin the states, to better acquire information from later tokens.
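One way to apply this rule consistently is to normalize every prompt before it reaches the model. The helpers below are a sketch with hypothetical names. They work on raw strings and on token ID lists respectively, and assume, as stated above, that the end-of-text token has ID 0.

```python
EOT = "<|rwkv_tokenizer_end_of_text|>"
EOT_ID = 0


def with_leading_eot(prompt):
    """Prepend the end-of-text token to a text prompt unless it is already there."""
    if prompt.startswith(EOT):
        return prompt
    return EOT + prompt


def with_leading_eot_id(token_ids):
    """Prepend token ID 0 to an already tokenized prompt unless it is already there."""
    ids = list(token_ids)
    if ids and ids[0] == EOT_ID:
        return ids
    return [EOT_ID] + ids
```

Both helpers are idempotent, so applying them to a prompt that already starts with the token leaves it unchanged.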

