Umitcan Sahin PRO

ucsahin

AI & ML interests

Visual Language Models, Large Language Models, Vision Transformers

Recent Activity

reacted to davanstrien's post with 👍 4 days ago

Updated the ColPali Query Generator Space https://huggingface.co/spaces/davanstrien/ColPali-Query-Generator to use https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct. Given an input image, it generates several queries along with explanations to justify them. This approach can generate synthetic data for fine-tuning ColPali models.

liked a model 7 days ago

Qwen/Qwen2.5-VL-72B-Instruct

liked a model 7 days ago

Qwen/Qwen2.5-VL-7B-Instruct

Organizations

None yet

Posts 2

Post

🚀 Introducing TraVisionLM: Turkish Visual Language Model - The First of Its Kind! 🇹🇷🖼️

I'm thrilled to share TraVisionLM on Hugging Face! With 875M parameters, this lightweight, efficient model handles Turkish instructions for image inputs. Fully compatible with the Transformers library, it’s easy to load, fine-tune, and use—no external libraries needed!

Developed solo, TraVisionLM is a strong foundation for low-resource language research. It is still being improved, but it is a key step for Turkish-language AI. Your feedback is welcome as I refine the model.

🎉 Explore it now:

- Model: ucsahin/TraVisionLM-base
- Demo: https://huggingface.co/spaces/ucsahin/TraVisionLM-Turkish_Visual_Language_Model
- Object Detection Finetune: ucsahin/TraVisionLM-Object-Detection-ft

Let’s push Turkish visual language processing forward!

---

🚀 Presenting TraVisionLM: the First Turkish Visual Language Model of Its Kind! 🇹🇷🖼️

I have released TraVisionLM on Hugging Face! With 875M parameters, this lightweight and efficient model is designed to process Turkish instructions based on images. It is fully compatible with the Transformers library and very easy to load, train, and use, with no external libraries required!

TraVisionLM, which I developed on my own, offers a solid foundation for research in low-resource languages. I look forward to your feedback as I continue developing it.

🎉 Explore it now:

- Model: ucsahin/TraVisionLM-base
- Demo: https://huggingface.co/spaces/ucsahin/TraVisionLM-Turkish_Visual_Language_Model
- Object Detection Fine-tune: ucsahin/TraVisionLM-Object-Detection-ft

Let's push the boundaries of Turkish visual language processing together!

Post

Florence-2 is very good at detecting a wide range of objects in a zero-shot setting with the task prompt "<OD>". However, if you want to detect specific objects that the base model cannot detect in its current form, you can easily finetune it for that task. Below I show how to finetune the model to detect tables in a given image, but a similar process can be applied to other objects. Thanks to @andito, @merve, and @SkalskiP for sharing the fix for finetuning the Florence-2 model. Please also check their great blog post at https://huggingface.co/blog/finetune-florence2.

Colab notebook: https://colab.research.google.com/drive/1Y8GVjwzBIgfmfD3ZypDX5H1JA_VG0YDL?usp=sharing
Finetuned model: ucsahin/Florence-2-large-TableDetection
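
To make the finetuning setup a bit more concrete, here is a minimal sketch of how a training target for the "<OD>" task could be built from pixel-space bounding boxes. It is not taken from the notebook. It assumes Florence-2 encodes each box as four `<loc_N>` tokens, with coordinates quantized into 1000 bins (0 to 999) relative to the image width and height. The helper names `quantize`, `box_to_loc_tokens`, and `build_od_target` are hypothetical, so check the notebook and the model's processor for the exact format before relying on this.

```python
import math
from typing import Iterable, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def quantize(value: float, size: float, bins: int = 1000) -> int:
    """Map a pixel coordinate to an integer bin in [0, bins - 1].

    Assumption: floor-based binning with 1000 bins per axis.
    """
    index = math.floor(value * bins / size)
    return max(0, min(bins - 1, index))


def box_to_loc_tokens(box: Box, width: float, height: float, bins: int = 1000) -> str:
    """Render one box as four location tokens: x_min, y_min, x_max, y_max."""
    x_min, y_min, x_max, y_max = box
    coords = (
        quantize(x_min, width, bins),
        quantize(y_min, height, bins),
        quantize(x_max, width, bins),
        quantize(y_max, height, bins),
    )
    return "".join(f"<loc_{c}>" for c in coords)


def build_od_target(
    annotations: Iterable[Tuple[str, Box]], width: float, height: float
) -> str:
    """Concatenate 'label + location tokens' for every annotated object."""
    return "".join(
        label + box_to_loc_tokens(box, width, height) for label, box in annotations
    )


# One table annotated on a 1000x500 image.
example = build_od_target([("table", (100, 50, 300, 200))], width=1000, height=500)
print(example)  # table<loc_100><loc_100><loc_300><loc_400>
```

During finetuning, a string like this would be the text label paired with the "<OD>" prompt and the image. Keeping the coordinates relative to the image size means the same target stays valid when the image is resized.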