TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 7 days ago • 43
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published 19 days ago • 45
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published 20 days ago • 43
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21 • 28
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation Paper • 2410.17250 • Published Oct 22 • 14
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published Oct 21 • 43
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18 • 36
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4 • 28
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23 • 68 • 5