Update README.md
README.md
CHANGED
@@ -11,8 +11,8 @@ pipeline_tag: image-text-to-text
 `BLIP3` is a series of foundational vision-language models (VLMs) developed by Salesforce AI Research. \
 These models have been trained at scale on high-quality image caption datasets and interleaved image-text data. BLIP3 highlights a few features below,
 
-* The pretrained foundation model, `blip3-phi3-mini-base-r-v1`, achieves state-of-the-art performance under 5b parameters and demonstrates strong in-context learning capabilities.
-* The instruct fine-tuned model, `blip3-phi3-mini-instruct-r-v1`, achieves state-of-the-art performance among open-source and closed-source VLMs under 5b parameters.
+* The **pretrained** foundation model, `blip3-phi3-mini-base-r-v1`, achieves state-of-the-art performance under 5b parameters and demonstrates strong in-context learning capabilities.
+* The **instruct** fine-tuned model, `blip3-phi3-mini-instruct-r-v1`, achieves state-of-the-art performance among open-source and closed-source VLMs under 5b parameters.
 * `blip3-phi3-mini-instruct-r-v1` supports flexible high-resolution image encoding with efficient visual token sampling.
 
 More technical details will come with a technical report soon.