Maxi PRO
maxiw
AI & ML interests
Computer Agents | VLMs
GUI Models
Research on GUI Models
- Qwen2.5-VL Technical Report
  Paper • 2502.13923 • Published • 200
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
  Paper • 2404.05719 • Published • 83
- ShowUI: One Vision-Language-Action Model for GUI Visual Agent
  Paper • 2411.17465 • Published • 89
- Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
  Paper • 2409.12191 • Published • 78
GUI Datasets
Datasets from the graphical user interfaces domain (screenshots).