unum-cloud/uform-gen


ashvardanian committed on Dec 29, 2023

Commit dbe9130 · 1 Parent(s): 8844b54

Update README.md

Files changed (1)

README.md +19 -1


@@ -1,14 +1,32 @@
 ---
-license: apache-2.0
 language:
 - en
 ---
 <h1 align="center">UForm</h1>
 <h3 align="center">
 Pocket-Sized Multimodal AI<br/>
 For Content Understanding and Generation<br/>
 </h3>
 ## Description
 UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model consists of two parts:

 ---
+pipeline_tag: image-to-text
+tags:
+- image-captioning
+- visual-question-answering
+datasets:
+- sbu_captions
+- visual_genome
+- HuggingFaceM4/VQAv2
+- ChristophSchuhmann/MS_COCO_2017_URL_TEXT
+widget:
+- text: "What is the invoice number?"
+  src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png"
+- text: "What is the purchase amount?"
+  src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/contract.jpeg"
 language:
 - en
+license: apache-2.0
+base_model: unum-cloud/uform-vl-english
 ---
 <h1 align="center">UForm</h1>
 <h3 align="center">
 Pocket-Sized Multimodal AI<br/>
 For Content Understanding and Generation<br/>
 </h3>
+<Gallery />
 ## Description
 UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model consists of two parts: