
Orkut Murat Yılmaz

orkut

AI & ML interests

Geo Sciences, Free Software

Recent Activity

liked a dataset about 1 hour ago
alibayram/yapay_zeka_turkce_mmlu_model_cevaplari
liked a model about 1 hour ago
TencentARC/InstantMesh
liked a model about 1 hour ago
VAST-AI/MIDI-3D

Organizations

Mathematical Intelligence · GeoPerformans Ar-Ge Bilişim Haritacılık Sanayi ve Ticaret Limited Şirketi · Karakulaklar

orkut's activity

upvoted an article about 1 month ago
Open-source DeepResearch – Freeing our search agents

reacted to merve's post with 🔥 8 months ago
We have recently merged Video-LLaVA into transformers! 🤗🎞️
What makes this model different?

Demo: llava-hf/video-llava
Model: LanguageBind/Video-LLaVA-7B-hf

Compared to other models that take image and video input and either project them separately or downsample the video and project selected frames, Video-LLaVA converts images and videos into a unified representation and projects them through a shared projection layer.

It uses Vicuna 1.5 as the language model and LanguageBind's own OpenCLIP-based encoders; these encoders map each modality into the unified representation before it is passed to the projection layer.


I feel like one of the coolest features of this model is its joint understanding, which many models have also introduced recently.

It's a relatively older model, but it was ahead of its time and works very well! That means you can, for example, pass the model an image of a cat and a video of a cat and ask whether the cat in the image appears in the video 🤩
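For that joint image + video case, a minimal inference sketch against the transformers integration might look like the following. The checkpoint ID comes from the post; the file paths, the question, the frame count, and the exact prompt wording are assumptions based on the model card's USER/ASSISTANT chat format.

```python
# Minimal sketch: joint image + video QA with Video-LLaVA in transformers.
# Paths and the question are placeholders; the checkpoint is the one named above.
import av
import numpy as np
from PIL import Image
from transformers import VideoLlavaProcessor, VideoLlavaForConditionalGeneration

def sample_frames(path, num_frames=8):
    """Uniformly sample num_frames RGB frames from a video file."""
    container = av.open(path)
    total = container.streams.video[0].frames
    keep = set(np.linspace(0, total - 1, num_frames).astype(int))
    frames = [f.to_ndarray(format="rgb24")
              for i, f in enumerate(container.decode(video=0)) if i in keep]
    return np.stack(frames)

model_id = "LanguageBind/Video-LLaVA-7B-hf"
processor = VideoLlavaProcessor.from_pretrained(model_id)
model = VideoLlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# One <image> and one <video> token, following the model card's chat format
prompt = "USER: <image> <video> Does the cat in the image appear in the video? ASSISTANT:"
inputs = processor(
    text=prompt,
    images=Image.open("cat.jpg"),           # placeholder image path
    videos=sample_frames("cat_video.mp4"),  # placeholder video path
    return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=60)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```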
reacted to harpreetsahota's post with 🔥 9 months ago
The Coachella of Computer Vision, CVPR, is right around the corner. In anticipation of the conference, I curated a dataset of the papers.

I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.

The dataset consists of the following fields:

- An image of the first page of the paper
- title: The title of the paper
- authors_list: The list of authors
- abstract: The abstract of the paper
- arxiv_link: Link to the paper on arXiv
- other_link: Link to the project page, if found
- category_name: The primary category of this paper, according to the [arXiv taxonomy](https://arxiv.org/category_taxonomy)
- all_categories: All categories this paper falls into, according to arXiv taxonomy
- keywords: Extracted using GPT-4o
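Since the dataset comes from Voxel51, a plausible way to inspect those fields is through FiftyOne's Hugging Face Hub loader. This is a sketch assuming the dataset (Voxel51/CVPR_2024_Papers, linked at the end of this post) is stored in FiftyOne format and keeps the field names listed above.

```python
# Minimal sketch: load the dataset from the Hub and inspect the fields above.
# Assumes FiftyOne-format storage; field names are taken from the list in the post.
from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub("Voxel51/CVPR_2024_Papers")
print(dataset)  # prints the schema, including the metadata fields

sample = dataset.first()
print(sample["title"], sample["arxiv_link"], sample["keywords"])
```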

Here's how I created the dataset 👇🏼

Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).

This dataset was built using the following steps:

- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (python wrapper for the arXiv API) to extract the abstract and categories, and download the pdf for each paper
- Use pdf2image to save the image of paper's first page
- Use GPT-4o to extract keywords from the abstract
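As a rough illustration of the arXiv and first-page steps (the generic code linked above is the authoritative version), here is a sketch using the arxiv package and pdf2image; the arXiv ID and file names are placeholders, not values from the post.

```python
# Sketch of the arXiv metadata + first-page-image steps. The ID below is a
# placeholder; pdf2image also needs the poppler utilities installed.
import arxiv
from pdf2image import convert_from_path

client = arxiv.Client()
paper = next(client.results(arxiv.Search(id_list=["2304.08485"])))  # placeholder ID

abstract = paper.summary           # maps to the dataset's abstract field
primary = paper.primary_category   # e.g. "cs.CV" -> category_name
categories = paper.categories      # -> all_categories

pdf_path = paper.download_pdf(filename="paper.pdf")

# Render only page 1 and save it as the first-page image
first_page = convert_from_path(pdf_path, first_page=1, last_page=1)[0]
first_page.save("first_page.png")
```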

Voxel51/CVPR_2024_Papers