** Model Detail
Model type: RWKV7 SigLIP2 is an open-source chatbot built on the RWKV7 architecture with a SigLIP2 encoder.
Model date: February 2025
Paper or resources for more information: https://github.com/JL-er/WorldRWKV
Where to send questions or comments about the model: https://github.com/JL-er/WorldRWKV/issues
** Training datasets:
- Pretrain: LLaVA 595k
- Fine-tune: LLaVA 665k
** Evaluation dataset
So far, we have evaluated RWKV7 SigLIP2 on 4 benchmarks proposed for instruction-following LMMs. Results on more benchmarks will be released soon.
Benchmarks

| Encoder | LLM | VQAV2 | TextVQA | GQA | ScienceQA |
|---------|-----|-------|---------|-----|-----------|
| SigLIP2 | RWKV7-3B | 78.30 | 51.09 | 60.75 | 70.93 |

Inference

    from infer.worldmodel import Worldinfer
    from PIL import Image

    llm_path = 'WorldRWKV/RWKV7-3B-siglip2/rwkv-0'  # Local model path
    encoder_path = 'google/siglip2-base-patch16-384'
    encoder_type = 'siglip'

    model = Worldinfer(model_path=llm_path, encoder_type=encoder_type, encoder_path=encoder_path)

    img_path = './docs/03-Confusing-Pictures.jpg'
    image = Image.open(img_path).convert('RGB')

    text = '\x16User: What is unusual about this image?\x17Assistant:'

    result = model.generate(text, image)

    print(result)
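
The prompt in the example above places the control character `\x16` before `User:` and `\x17` before `Assistant:`. The sketch below shows one way to build such a prompt for an arbitrary question. It assumes that this single-turn layout is all that is needed; `build_prompt` and the two constant names are hypothetical and not part of the WorldRWKV codebase.

```python
# Hypothetical helper; not part of the WorldRWKV codebase.
# The two markers are copied from the example prompt above.
USER_MARK = '\x16'       # control character placed before "User:"
ASSISTANT_MARK = '\x17'  # control character placed before "Assistant:"


def build_prompt(question: str) -> str:
    """Wrap a single user question in the layout of the example prompt."""
    return f"{USER_MARK}User: {question.strip()}{ASSISTANT_MARK}Assistant:"


prompt = build_prompt("What is unusual about this image?")
print(repr(prompt))
```

The resulting string can be passed as the `text` argument to `model.generate(text, image)` in place of the hand-written prompt.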