Alex Gu

minimario

minimario's activity

updated a dataset 18 days ago

minimario/cruxeval-ocaml

Viewer • Updated 18 days ago • 90 • 172

published a dataset 29 days ago

minimario/cruxeval-ocaml

Viewer • Updated 18 days ago • 90 • 172

liked a dataset 9 months ago

bigcode/bigcodebench

Viewer • Updated 18 days ago • 5.7k • 12.6k • 52

liked a Space 9 months ago

185

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data

updated a model 9 months ago

minimario/your-model

Updated Jun 7, 2024

New activity in meta-llama/Meta-Llama-3-8B 10 months ago

[READ IF YOU DO NOT HAVE ACCESS] Getting access to the model

#172 opened 10 months ago by

osanseviero

published an article 11 months ago

Article

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Apr 16, 2024

• 15

authored 3 papers 12 months ago

SantaCoder: don't reach for the stars!

Paper • 2301.03988 • Published Jan 9, 2023 • 7

LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

Paper • 2310.15164 • Published Oct 23, 2023 • 1

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

Paper • 2403.07974 • Published Mar 12, 2024 • 2

updated a dataset about 1 year ago

minimario/math-openwebmath-retrievals

Viewer • Updated Mar 8, 2024 • 12.5k • 77 • 1

authored a paper about 1 year ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 138

New activity in cruxeval-org/cruxeval about 1 year ago

Add metadata to dataset card

#2 opened about 1 year ago by

albertvillanova

updated 3 datasets about 1 year ago

authored a paper about 1 year ago

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

Paper • 2401.03065 • Published Jan 5, 2024 • 11

updated 2 datasets about 1 year ago

minimario/FOLIO

Viewer • Updated Jan 2, 2024 • 1.21k • 226 • 1

minimario/livecodebench-execute-v2

Viewer • Updated Dec 17, 2023 • 2.07k • 137 • 1

updated a dataset over 1 year ago

minimario/livecodebench-execute

Viewer • Updated Nov 30, 2023 • 709 • 66 • 1