- Octopus v2: On-device language model for super agent
  Paper • 2404.01744 • Published • 58
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
  Paper • 2404.05719 • Published • 82
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
  Paper • 2404.07972 • Published • 48
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 55
Shaoguang Mao
dawnmsg
AI & ML interests
None yet
Recent Activity
authored a paper about 19 hours ago
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation
liked a dataset 3 months ago
microsoft/MMLU-CF
upvoted a paper 8 months ago
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Organizations
None yet
Collections
1
models
None public yet
datasets
None public yet