**VARCO-VISION-14B** is a powerful English-Korean Vision-Language Model (VLM) developed through four distinct training phases, culminating in a final preference optimization stage. Designed to excel in both multimodal and text-only tasks, VARCO-VISION-14B not only surpasses other models of similar size in performance but also achieves scores comparable to those of proprietary models. The model currently accepts a single image and accompanying text as input, generating text as output. It supports grounding—the ability to identify the locations of objects within an image—as well as OCR (Optical Character Recognition) to recognize text within images.
- **Developed by:** NC Research, Multimodal Generation Team
- **Technical Report:** [VARCO-VISION: Expanding Frontiers in Korean Vision-Language Models](https://arxiv.org/pdf/2411.19103)
- **Demo Page:** [Coming Soon]()
- **Languages:** Korean, English
- **License:** CC BY-NC 4.0
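As a rough illustration of the single-image-plus-text interface described above, the sketch below shows how a VLM of this kind is typically queried through Hugging Face `transformers`. The repo id, the LLaVA-OneVision-style processor/model classes, and the generation settings are assumptions for illustration, not confirmed by this card; consult the model repository for the exact loading code.

```python
# Minimal usage sketch for an image+text -> text VLM such as VARCO-VISION-14B.
# ASSUMPTIONS (not from this card): the model loads through transformers with a
# LLaVA-OneVision-style processor/model pair; "NCSOFT/VARCO-VISION-14B" is a
# placeholder repo id.

def build_conversation(question: str):
    """One user turn containing a single image plus a text question, in the
    chat-template format used by transformers multimodal processors."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": question},
            ],
        }
    ]


def run_inference(image, question: str, repo_id: str = "NCSOFT/VARCO-VISION-14B"):
    """Load the (assumed) model and generate a text answer for one image."""
    # Heavy imports kept local so the prompt-building helper stays lightweight.
    import torch
    from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

    processor = AutoProcessor.from_pretrained(repo_id)
    model = LlavaOnevisionForConditionalGeneration.from_pretrained(
        repo_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Render the conversation into the model's prompt format.
    prompt = processor.apply_chat_template(
        build_conversation(question), add_generation_prompt=True
    )
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A grounding or OCR query would use the same call with a different question string; the model returns its answer as generated text.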