---
license: apache-2.0
language:
- fa
library_name: hezar
tags:
- hezar
---

A vision encoder decoder model initialized from `hezarai/roberta-base-fa` and `google/vit-base-patch16-224` weights.

**This model cannot perform image-to-text inference out of the box without finetuning.**
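
The finetuning note can be illustrated with a minimal sketch. It uses Hugging Face `transformers` directly rather than Hezar, which is an assumption made for illustration and not necessarily how this checkpoint was built. It also uses tiny, randomly initialized configs as stand-ins for the real `google/vit-base-patch16-224` and `hezarai/roberta-base-fa` weights, so it runs without downloads. The sketch shows that pairing a ViT encoder with a RoBERTa decoder adds cross-attention parameters to the decoder. A standalone RoBERTa checkpoint has no such layers, so they would start untrained, which is a plausible reason the combined model needs finetuning before it can generate text from images.

```python
from transformers import (
    RobertaConfig,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

# Tiny stand-in configs (hypothetical sizes, chosen only to keep the sketch fast).
encoder_config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=16,
)
decoder_config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

# Combining the two configs marks the text model as a decoder and
# enables cross-attention over the image encoder's outputs.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = VisionEncoderDecoderModel(config=config)

# Collect the decoder parameters that exist only because of cross-attention.
cross_attention_params = [
    name for name, _ in model.named_parameters() if "crossattention" in name
]

print(f"decoder.add_cross_attention = {config.decoder.add_cross_attention}")
print(f"cross-attention parameter tensors: {len(cross_attention_params)}")
```

With the real pretrained weights, the encoder and the decoder's self-attention layers would be loaded from their checkpoints, while the parameters listed in `cross_attention_params` would have no pretrained counterpart.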