meghsn's picture
orby-agent-v0 (#8)
3011461 verified

OrbyAgent-Claude-3.5-Sonnet

This agent is developed by Orby AI.

The agent does not use any benchmark-specific information in the prompts. For WebArena benchmark, we use the original evaluator and task definitions for fair comparison.

It uses Claude-3.5-sonnet-20241022 as a backend, with both screenshot and HTML as inputs. More details can be found in our research blog.