This is the first model to convert a Qwen2.5-7B checkpoint to the RWKV-7 architecture.
It was trained on a single server with 8x A800 GPUs for one day, so it may not be very versatile. Even so, it performs acceptably and can chat fluently.
The main shortcoming is that this base model can't do math and related tasks. I'll add more balanced data later to improve the model's capabilities.
Please refer to https://github.com/yynil/RWKVinLLAMA/blob/rwkv_7/gradio/chat_rwkv7.py for how to use it.