Commit History
fe65f69
4425c4b
1661f8d
c4e8028
b11f271
Remove placeholder text
50ce699
Update title
84e21ef
Simplify
eba4aa7
Add data
b0b7fbb
Reformat
423bf9b
xeon27
committed on
Update reproducibility text
5652cd0
xeon27
committed on
Update examples
363cbd2
Update content and style
c8da037
Add agentharm and swe-bench tasks
1289818
xeon27
committed on
Add results for GAIA and GDM tasks
2718fde
xeon27
committed on
Update about page
8596ab1
Add model name links and change single-turn to base
9c55d6d
xeon27
committed on
Remove filtering
e344502
Change nomenclature to single-turn
eb538cb
xeon27
committed on
Add new tasks
6eaffc5
xeon27
committed on
Add task link in description
ba14348
xeon27
committed on
[WIP] Add task link in description
6410971
xeon27
committed on
[WIP] Add task link in description
159e996
xeon27
committed on
[WIP] Add task link in description
fcd47ae
xeon27
committed on
Make task names clickable and link to inspect-evals repo
15e5347
xeon27
committed on
Make values clickable
bbde2b0
xeon27
committed on
Add title and required text
ba2f546
xeon27
committed on
Add GAIA and GDM-InterCode-CTF tasks
0dddab1
xeon27
committed on
Add base eval tasks
006ba57
xeon27
committed on