
Xiangtai Li

LXT

https://lxtgh.github.io/

AI & ML interests

Computer Vision, Multi-Modal Understanding, Generative AI

Recent Activity

authored a paper about 1 month ago

Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models

authored a paper about 1 month ago

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

authored a paper about 1 month ago

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology


Organizations

authored 3 papers about 1 month ago

Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models

Paper • 2505.24164 • Published May 30

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

Paper • 2506.13691 • Published Jun 16 • 2

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Paper • 2507.07999 • Published Jul 10 • 47

upvoted a paper about 1 month ago

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Paper • 2507.07999 • Published Jul 10 • 47

updated 4 datasets about 1 month ago

updated 2 models about 1 month ago

General-Level/General-Bench-Closeset

Updated Jul 9

General-Level/General-Bench-Closeset

Updated Jul 9

published a model about 1 month ago

General-Level/General-Bench-Closeset

Updated Jul 9

upvoted 3 papers about 2 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 48

Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29 • 61

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 226

liked a model about 2 months ago

black-forest-labs/FLUX.1-Kontext-dev

Image-to-Image • Updated Jun 27 • 356k • 2.11k

upvoted a paper about 2 months ago

VMoBA: Mixture-of-Block Attention for Video Diffusion Models

Paper • 2506.23858 • Published Jun 30 • 31

authored 4 papers about 2 months ago

OmniAudio: Generating Spatial Audio from 360-Degree Video

Paper • 2504.14906 • Published Apr 21

Towards Semantic Equivalence of Tokenization in Multimodal LLM

Paper • 2406.05127 • Published Jun 7, 2024

So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection

Paper • 2505.18660 • Published May 24 • 1

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

Paper • 2505.23727 • Published May 29 • 4
