ashvardanian committed on
Commit
dbe9130
1 Parent(s): 8844b54

Update README.md

Files changed (1)
  1. README.md +19 -1
README.md CHANGED
@@ -1,14 +1,32 @@
 ---
-license: apache-2.0
+pipeline_tag: image-to-text
+tags:
+- image-captioning
+- visual-question-answering
+datasets:
+- sbu_captions
+- visual_genome
+- HuggingFaceM4/VQAv2
+- ChristophSchuhmann/MS_COCO_2017_URL_TEXT
+widget:
+- text: "What is the invoice number?"
+  src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png"
+- text: "What is the purchase amount?"
+  src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/contract.jpeg"
 language:
 - en
+license: apache-2.0
+base_model: unum-cloud/uform-vl-english
 ---
+
 <h1 align="center">UForm</h1>
 <h3 align="center">
 Pocket-Sized Multimodal AI<br/>
 For Content Understanding and Generation<br/>
 </h3>
 
+<Gallery />
+
 ## Description
 
 UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model consists of two parts: