AIRI-Institute/OmniFusion

razzant committed on Dec 29, 2023

Commit acdc403 · 1 parent: daa6833

Update README.md

Files changed (1)

README.md +4 -4

@@ -8,7 +8,7 @@ license: apache-2.0
 ### Architecture
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/architecture.png" width="70%">
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/architecture.png" width="100%">
 </p>
@@ -25,14 +25,14 @@ To further enhance the model's multimodal capabilities, we employ trainable spec
 2. Once the adapter has learned to map ViT's visual embeddings to the language model's textual space, we proceed to unfreeze Mistral for improved understanding of dialog formats and complex queries.
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/datasets.png" width="50%">
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/datasets.png" width="70%">
 </p>
 ### Results
 OmniFusion was benchmarked against the latest multimodal SOTA models. It excelled in generative metrics and classification benchmarks like VisualDialog.
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/radar.png" width="50%">
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/radar.png" width="70%">
 </p>
 Model Performance on Visual Dialog Benchmark
@@ -45,7 +45,7 @@ Model Performance on Visual Dialog Benchmark
 ### Examples
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/examples.png" width="70%">
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/examples.png" width="100%">
 </p>
 ### Future Plans