

Skier8402

https://nyab.notion.site

Shuyib

AI & ML interests

Practicing Computer Vision, Optimization, NLP and multimodal system implementation.

Recent Activity

updated a collection about 12 hours ago

Datasets

liked a dataset about 12 hours ago

glaiveai/glaive-function-calling-v2

updated a collection about 12 hours ago

Computer vision



Skier8402's activity

liked a dataset about 12 hours ago

glaiveai/glaive-function-calling-v2

Viewer • Updated Sep 27, 2023 • 113k • 1.46k • 420

liked a Space about 12 hours ago

RF-DETR

🔥

SOTA real-time object detection model

liked a Space 7 days ago

115

OctoTools

🚀

An Agentic Framework with Tools for Complex Reasoning

liked a dataset 8 days ago

FreedomIntelligence/medical-o1-reasoning-SFT

Viewer • Updated Feb 22 • 50.1k • 27.7k • 542

liked a model 12 days ago

sesame/csm-1b

Text-to-Speech • Updated 9 days ago • 37.7k • 1.63k

liked a Space 14 days ago

101

Phi 4 Multimodal

🌖

Interact with AI using text, images, or audio

liked 2 Spaces 15 days ago

Magma UI

📚

Magma-8B model for UI Agents

394

OmniParser V2

🏢

OmniParser, turn your LLM into GUI agent

liked a Space 20 days ago

152

Agent Dino

🌠

@image @rAgent @web @text @tts1 @tts2 @3d

liked a dataset 22 days ago

allenai/olmOCR-mix-0225

Viewer • Updated 29 days ago • 259k • 5.69k • 98

liked 2 models 25 days ago

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated about 7 hours ago • 767k • 1.23k

allenai/olmOCR-7B-0225-preview-GGUF

Updated 27 days ago • 7.95k • 23

liked a model 27 days ago

Qwen/Qwen2.5-VL-3B-Instruct

Image-Text-to-Text • Updated 3 days ago • 1.24M • 289

liked a Space 27 days ago

1.34k

Wan2.1

💻

Wan: Open and Advanced Large-Scale Video Generative Models

liked 6 Spaces 28 days ago

Talk to OpenAI (Gradio UI)

🗣

Talk to OpenAI (Gradio UI)

Hello Computer (Gradio)

💻

Say computer (Gradio)

Talk to Gemini

♊

Talk to Gemini using Google's multimodal API

Phonic AI Chat

🎙

Talk to Phonic AI's speech-to-speech model

LLM Voice Chat (Gradio)

💻

LLM Voice by ElevenLabs (Gradio)

Whisper Realtime Transcription (Gradio UI)

👂

Transcribe audio in realtime - Gradio UI version