ashvardanian committed on
Commit 9f6638f
1 Parent(s): dbe9130

Add previews

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -9,30 +9,30 @@ datasets:
9
  - HuggingFaceM4/VQAv2
10
  - ChristophSchuhmann/MS_COCO_2017_URL_TEXT
11
  widget:
12
- - text: "What is the invoice number?"
13
- src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png"
14
- - text: "What is the purchase amount?"
15
- src: "https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/contract.jpeg"
16
  language:
17
  - en
18
  license: apache-2.0
19
  base_model: unum-cloud/uform-vl-english
20
  ---
21
 
 
 
22
  <h1 align="center">UForm</h1>
23
  <h3 align="center">
24
  Pocket-Sized Multimodal AI<br/>
25
  For Content Understanding and Generation<br/>
26
  </h3>
27
 
28
- <Gallery />
29
-
30
  ## Description
31
 
32
  UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model consists of two parts:
33
 
34
- 1. [UForm Vision Encoder](https://huggingface.co/unum-cloud/uform-vl-english)
35
- 2. [Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B) manually tuned on the instructions dataset
36
 
37
  The model was pre-trained on: MSCOCO, SBU Captions, Visual Genome, VQAv2, GQA and a few internal datasets.
38
 
 
9
  - HuggingFaceM4/VQAv2
10
  - ChristophSchuhmann/MS_COCO_2017_URL_TEXT
11
  widget:
12
+ - text: "The living room is cozy, featuring a red leather chair and a white table. The chair is in the center, and the table is on the left side. A lamp on the left side illuminates the space. A large picture hangs on the wall, adding artistic flair. A vase on the table adds a decorative touch. The room is well-lit, creating a warm and inviting atmosphere."
13
+ src: "https://github.com/ashvardanian/usearch-images/blob/main/assets/uform-gen-interior.png?raw=true"
14
+ - text: "A young girl stands in a grassy field, holding an umbrella to shield herself from the rain. She dons a yellow dress and seems to relish her time outdoors. The umbrella is open, offering protection from the rain. The field is bordered by trees, fostering a tranquil and natural ambiance"
15
+ src: "https://github.com/ashvardanian/usearch-images/blob/main/assets/uform-gen-umbrella.png?raw=true"
16
  language:
17
  - en
18
  license: apache-2.0
19
  base_model: unum-cloud/uform-vl-english
20
  ---
21
 
22
+ <Gallery />
23
+
24
  <h1 align="center">UForm</h1>
25
  <h3 align="center">
26
  Pocket-Sized Multimodal AI<br/>
27
  For Content Understanding and Generation<br/>
28
  </h3>
29
 
 
 
30
  ## Description
31
 
32
  UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model consists of two parts:
33
 
34
+ 1. [`uform-vl-english`](https://huggingface.co/unum-cloud/uform-vl-english) visual encoder,
35
+ 2. [`Sheared-LLaMA-1.3B`](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B) language model tuned on instruction datasets.
36
 
37
  The model was pre-trained on: MSCOCO, SBU Captions, Visual Genome, VQAv2, GQA and a few internal datasets.
38