PhelixZhen committed (verified)
Commit e5c2401 · Parent(s): e0d5fc4

Update README.md

Files changed (1):
  1. README.md +12 -3
README.md CHANGED
@@ -1,3 +1,12 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ # Algea-VE: A Tiny Multimodal Language Model with Only 0.8B Parameters
+
+
+ Algea-VE is trained on the LAION-CC-SBU dataset using [algea-550M-base](https://huggingface.co/PhelixZhen/Algae-550M-base) as the base model and fine-tuned on llava_v1_5_mix665k. It uses CLIP ViT-L/14-336 as the visual encoder. The model is very small: fine-tuning requires only 32GB of VRAM, and inference requires about 3GB.
+
+ Because the base model was insufficiently trained, the current model exhibits some hallucination and repetition issues. To address this, I am training a new model that keeps the same size but offers better performance.
+
+ This model is built on the llavaphi project. To use the model, please click [here](https://github.com/phelixzhen/Algea-VE).
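As a rough sanity check on the memory figures in the README above (my own back-of-envelope arithmetic, not from the source): fp16 weights for a 0.8B-parameter model occupy about 1.5 GiB, which is consistent with the quoted ~3GB inference footprint once activations, the KV cache, and the CLIP encoder are accounted for.

```python
def fp16_weight_gib(n_params: float) -> float:
    """Approximate weight memory in GiB, assuming 2 bytes per parameter (fp16)."""
    return n_params * 2 / 1024**3

# Algea-VE has roughly 0.8B parameters
print(f"{fp16_weight_gib(0.8e9):.2f} GiB")  # → 1.49 GiB
```

The remaining headroom up to the quoted 3GB is what runtime buffers and the visual encoder would occupy; the exact split depends on sequence length and image resolution.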