nectec/Pathumma-llm-vision-1.0.0

Visual Question Answering


Thirawarit committed 9 days ago

Commit d675ad7 • 1 Parent(s): f1b4e16

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -9,7 +9,7 @@ base_model:
 pipeline_tag: visual-question-answering
 ---
-# Pathumma-llm-vision-Idefic3-8b-llama3-1.0.0
+# Pathumma-llm-vision-1.0.0
 ## Model Overview
 Pathumma-llm-vision-1.0.0 is a multi-modal language model fine-tuned for Visual Question Answering (VQA) and Image Captioning tasks. It contains 8 billion parameters and leverages both image and text processing to understand and generate multi-modal content.