Stanford Autonomous Agent Lab
university
https://www.autonomousagents.stanford.edu/
Recent Activity
fagunpatel98 updated a dataset 14 days ago: SAA-Lab/SLPHelmOutputs
sangttruong authored a paper about 2 months ago: ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
sangttruong authored a paper about 2 months ago: Reliable and Efficient Amortized Model-based Evaluation
Team members: 8
SAA-Lab's datasets (118)
SAA-Lab/SLPHelmOutputs • Preview • Updated 14 days ago • 3.01k
SAA-Lab/LitBench-new-rationales • Viewer • Updated 23 days ago • 43.7k • 100
SAA-Lab/litbench-rationales-gpt4 • Viewer • Updated 23 days ago • 24.2k • 96
SAA-Lab/LitBench-Test • Viewer • Updated Jul 7 • 2.38k • 35
SAA-Lab/LitBench-Train • Viewer • Updated Jul 7 • 43.8k • 198 • 1
SAA-Lab/SLPHelmBenchmarkOutput • Preview • Updated Jul 5 • 95
SAA-Lab/LitBench-Test-Release • Viewer • Updated Jun 18 • 2.38k • 6
SAA-Lab/LitBench-Test-IDs-Complete-Final • Viewer • Updated Jun 18 • 2.48k • 3
SAA-Lab/LitBench-Test-IDs-Complete • Viewer • Updated Jun 17 • 2.48k • 5
SAA-Lab/LitBench-Test-Enhanced • Viewer • Updated Jun 17 • 2.48k • 15
SAA-Lab/LitBench-Test-IDs • Viewer • Updated Jun 17 • 2.48k
SAA-Lab/SLPHelmUltraSuite • Viewer • Updated May 16 • 7.54k • 602
SAA-Lab/LitBench-Rationales • Viewer • Updated May 16 • 43.7k • 66
SAA-Lab/human-exp-1 • Viewer • Updated May 15 • 40 • 12
SAA-Lab/SLPHelmDataset • Viewer • Updated May 15 • 19.4k • 1.48k
SAA-Lab/wp_non_length_corrected • Viewer • Updated May 14 • 65.5k
SAA-Lab/SLPHelmManualLabels • Viewer • Updated May 14 • 926 • 699
SAA-Lab/wp_shp • Preview • Updated May 14
SAA-Lab/wp_naive • Viewer • Updated May 14 • 395k • 1
SAA-Lab/test_jan25-cwv-genrm_qwen1.5b-ckptNone • Viewer • Updated May 13 • 155 • 4
SAA-Lab/test_jan25-cwv-genrm_qwen3b-ckptNone • Viewer • Updated May 13 • 155 • 3
SAA-Lab/test_jan25-cwv-genrm_qwen7b-ckptNone • Viewer • Updated May 13 • 155 • 3
SAA-Lab/test_jan25-cwv-genrm_llama1b-ckptNone • Viewer • Updated May 13 • 155 • 2
SAA-Lab/test_jan25-cwv-genrm_llama3b-ckptNone • Viewer • Updated May 13 • 155 • 4
SAA-Lab/test_jan25-cwv-genrm_llama8b-ckptNone • Viewer • Updated May 13 • 155 • 3
SAA-Lab/test_jan25-cwv-genrm_cot_qwen1.5b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 3
SAA-Lab/test_jan25-cwv-genrm_cot_qwen3b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 3
SAA-Lab/test_jan25-cwv-genrm_cot_qwen7b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 4
SAA-Lab/test_jan25-cwv-genrm_cot_llama1b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 2
SAA-Lab/test_jan25-cwv-genrm_cot_llama3b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 3