Vikramjeet Singh

VikramSingh178

https://vikramxd.github.io

AI & ML interests

Computer Vision | Transformers | Diffusion Models | ML Systems

Recent Activity

upvoted a paper 3 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

liked a model 19 days ago

rhymes-ai/Allegro

upvoted a paper 20 days ago

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

VikramSingh178's activity

upvoted a paper 3 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 4 days ago • 184

upvoted a paper 20 days ago

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Paper • 2407.06189 • Published Jul 8, 2024 • 26

upvoted a paper 25 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published about 1 month ago • 85

upvoted a paper 27 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 30 days ago • 136

upvoted 2 papers about 1 month ago

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Paper • 2402.00769 • Published Feb 1, 2024 • 22

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Paper • 2412.00174 • Published Nov 29, 2024 • 22

upvoted a collection about 1 month ago

Daily Papers

Collection

1 item • Updated Oct 26, 2023 • 66

upvoted 2 papers about 2 months ago

VEnhancer: Generative Space-Time Enhancement for Video Generation

Paper • 2407.07667 • Published Jul 10, 2024 • 14

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Paper • 2411.13503 • Published Nov 20, 2024 • 30

upvoted 3 papers 2 months ago

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 28

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

Paper • 2411.04997 • Published Nov 7, 2024 • 37

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 33

upvoted 2 papers 3 months ago

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Paper • 2410.19355 • Published Oct 25, 2024 • 23

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Paper • 2410.10629 • Published Oct 14, 2024 • 9

upvoted a collection 3 months ago

3D Reconstruction

Collection

42 items • Updated 42 minutes ago • 3

upvoted a paper 3 months ago

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 40

upvoted 2 papers 4 months ago

Colorful Diffuse Intrinsic Image Decomposition in the Wild

Paper • 2409.13690 • Published Sep 20, 2024 • 13

Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos

Paper • 2312.10300 • Published Dec 16, 2023 • 1

upvoted a paper 5 months ago

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 60

upvoted an article 5 months ago

Optimum-NVIDIA - Unlock blazingly fast LLM inference in just 1 line of code

Article • Dec 5, 2023 • 4