Sarah Thompson

crimsonFalcon91

AI & ML interests

None yet

Recent Activity

liked a model about 4 hours ago

upvoted a paper about 4 hours ago

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

upvoted a paper 12 days ago

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Organizations

None yet

crimsonFalcon91's activity

liked a model about 4 hours ago

answerdotai/ModernBERT-base

Fill-Mask • Updated 6 days ago • 27.6k • 433

upvoted a paper about 4 hours ago

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

Paper • 2412.18597 • Published about 22 hours ago • 10

upvoted 12 papers 12 days ago

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Paper • 2410.16266 • Published Oct 21 • 4

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs

Paper • 2410.16267 • Published Oct 21 • 17

Mitigating Object Hallucination via Concentric Causal Attention

Paper • 2410.15926 • Published Oct 21 • 16

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22 • 45

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Paper • 2410.17249 • Published Oct 22 • 41

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21 • 14

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22 • 14

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

Paper • 2410.16930 • Published Oct 22 • 6

Improve Vision Language Model Chain-of-thought Reasoning

Paper • 2410.16198 • Published Oct 21 • 22

liked a dataset 12 days ago

foursquare/fsq-os-places

Viewer • Updated 22 days ago • 105M • 3.09k • 63

liked a model 12 days ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated 4 days ago • 301k • 1.29k

upvoted 2 papers 12 days ago

Phi-4 Technical Report

Paper • 2412.08905 • Published 14 days ago • 92

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 13 days ago • 90