AK

akhaliq

_akhaliq

akhaliq's activity

upvoted a collection 1 day ago

Llama-3.1-Nemotron-70B

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 3 days ago • 76

upvoted a paper 7 days ago

ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

Paper • 2406.04312 • Published Jun 6 • 1

upvoted a paper 8 days ago

CursorCore: Assist Programming through Aligning Anything

Paper • 2410.07002 • Published 9 days ago • 12

upvoted an article 8 days ago

Welcome, Gradio 5

Article • 9 days ago • 56

upvoted a collection 17 days ago

NVLM 1.0

A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks. • 1 item • Updated 17 days ago • 42

upvoted a collection 21 days ago

Emu3

3 items • Updated 21 days ago • 50

upvoted a collection 28 days ago

Oryx

Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 5 items • Updated 29 days ago • 11

upvoted a paper about 1 month ago

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5 • 30

upvoted 4 collections about 2 months ago

Sapiens

Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated 29 days ago • 40

Qwen2-VL

Vision-language model series based on Qwen2 • 15 items • Updated 30 days ago • 138

LongVILA

A series of VILA models that specialize in long-context abilities • 4 items • Updated Aug 21 • 4

XGen-MM-1 models and datasets

A collection of all XGen-MM (Foundation LMM) models! • 14 items • Updated 10 days ago • 34

upvoted 4 collections 2 months ago

DeepSeek-Prover

DeepSeek-V1-and-V1.5-Series • 7 items • Updated Aug 16 • 13

Hermes 3

The Hermes 3 Series of Models • 8 items • Updated Aug 23 • 84

Qwen2-Audio

Audio-language model series based on Qwen2 • 4 items • Updated 30 days ago • 41

Qwen2-Math

Math-specific model series based on Qwen2 • 8 items • Updated 30 days ago • 45

upvoted a collection 6 months ago

OpenELM Instruct Models

4 items • Updated 14 days ago • 113

upvoted an article 6 months ago

AI Apps in a Flash with Gradio's Reload Mode

Article • Apr 16 • 21

upvoted a collection 8 months ago

Playground v2.5

2 items • Updated Feb 27 • 23

upvoted a collection 9 months ago

AIM

AIM: Autoregressive Image Models • 5 items • Updated 14 days ago • 48

upvoted a paper 10 months ago

Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 34

upvoted a collection 10 months ago

Diffusion model Spaces

313 items • Updated 8 days ago • 31

upvoted 2 papers 11 months ago

TokenCompose: Grounding Diffusion with Token-level Supervision

Paper • 2312.03626 • Published Dec 6, 2023 • 5

Single-Image 3D Human Digitization with Shape-Guided Diffusion

Paper • 2311.09221 • Published Nov 15, 2023 • 20

upvoted 3 papers about 1 year ago

ILLUME: Rationalizing Vision-Language Models through Human Interactions

Paper • 2208.08241 • Published Aug 17, 2022 • 2

DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion

Paper • 2303.09604 • Published Mar 16, 2023 • 6

Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 170

upvoted 15 papers over 1 year ago

Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

Paper • 2307.06439 • Published Jul 12, 2023 • 9

Example-based Motion Synthesis via Generative Motion Matching

Paper • 2306.00378 • Published Jun 1, 2023 • 6

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

Paper • 2307.04725 • Published Jul 10, 2023 • 64

Focused Transformer: Contrastive Training for Context Scaling

Paper • 2307.03170 • Published Jul 6, 2023 • 11

A Survey on Evaluation of Large Language Models

Paper • 2307.03109 • Published Jul 6, 2023 • 42

Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing

Paper • 2306.17848 • Published Jun 30, 2023 • 8

One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization

Paper • 2306.16928 • Published Jun 29, 2023 • 38

Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust

Paper • 2305.20030 • Published May 31, 2023 • 8

Fast Segment Anything

Paper • 2306.12156 • Published Jun 21, 2023 • 34

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Paper • 2301.08243 • Published Jan 19, 2023 • 6

TART: A plug-and-play Transformer module for task-agnostic reasoning

Paper • 2306.07536 • Published Jun 13, 2023 • 11

Weakly supervised information extraction from inscrutable handwritten document images

Paper • 2306.06823 • Published Jun 12, 2023 • 4

Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions

Paper • 2306.06212 • Published Jun 9, 2023 • 9

Transformers learn through gradual rank increase

Paper • 2306.07042 • Published Jun 12, 2023 • 9

FasterViT: Fast Vision Transformers with Hierarchical Attention

Paper • 2306.06189 • Published Jun 9, 2023 • 30