Update README.md
README.md CHANGED
@@ -8,11 +8,20 @@ language:
 library_name: transformers
 ---
 
-
+## Model type
 TinyLLaVA, a tiny model (1.4B) trained using the exact recipe of [LLaVA-1.5](https://github.com/haotian-liu/LLaVA).
 We trained our TinyLLaVA using [TinyLlama](https://huggingface.co/PY007/TinyLlama-1.1B-Chat-v0.3) as our LLM backbone, and [clip-vit-large-patch14-336](https://huggingface.co/openai/clip-vit-large-patch14-336) as our vision backbone.
 
-
+## Model Performance
+We have evaluated TinyLLaVA on [GQA](https://cs.stanford.edu/people/dorarad/gqa/about.html), [VizWiz](https://www.vizwiz.com/), [VQAv2](https://visualqa.org/), [TextVQA](https://textvqa.org/) and [SQA](https://github.com/lupantech/ScienceQA).
+
+| Model | VQAv2 | GQA | SQA | TextVQA | VizWiz |
+| -------------------- | :------------: | :------------: | :------------: | :------------: | :------------: |
+| TinyLLaVA-v1 | 73.41 | 57.54 | 59.40 | 46.37 | 49.56 |
+
+More evaluations are ongoing.
+
+## Model use
 The weights have been converted to hf format.
 
 ## How to use the model
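The hunk ends at the "## How to use the model" heading, so the card's own usage instructions are not shown here. As a hedged sketch only, the snippet below shows what loading hf-format LLaVA-style weights with transformers typically looks like, assuming the converted checkpoint follows the llava layout handled by `LlavaForConditionalGeneration`; the repo id is a placeholder and the prompt template is the generic LLaVA-1.5 chat format, not something taken from this model card.

```python
# Hedged sketch, not the model card's own usage section: loading hf-format
# LLaVA-style weights with transformers. Assumes the converted checkpoint uses
# the llava layout handled by LlavaForConditionalGeneration; the repo id below
# is a placeholder, not the actual TinyLLaVA repository name.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "<org>/tinyllava-v1-hf"  # placeholder repo id

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# LLaVA-1.5-style prompt with an <image> placeholder token.
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

Adjust the repo id and prompt template to match the actual checkpoint; the model card's "How to use the model" section remains the authoritative reference.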