Space: ryanhoangt/OpenHands-evaluation
Duplicated from: OpenHands/evaluation
6 contributors
History: 32 commits
Latest commit: 743d952, "plot success rate with cost when available", Xingyao Wang, 10 months ago
Name                        Size      Scan  Last commit                                              Updated
outputs                     -         -     add results for deepseek chat v2                         10 months ago
pages                       -         -     visualize swe-bench-lite & fix stuck in look             11 months ago
utils                       -         -     Merge commit 'f6d9f43457bdadd36685181efda2fd45e813a02c'  11 months ago
.gitattributes              1.61 kB   Safe  initial results                                          11 months ago
.gitignore                  79 Bytes  Safe  update gitignore                                         10 months ago
0_π_OpenDevin_Benchmark.py  4.06 kB   Safe  plot success rate with cost when available               10 months ago
README.md                   277 Bytes Safe  Update README.md                                         11 months ago
requirements.txt            52 Bytes  Safe  update visualizer on multi-page                          11 months ago