CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Paper: https://arxiv.org/pdf/2503.23145

Code: https://github.com/Anjiang-Wei/CodeARC

Website: https://anjiang-wei.github.io/CodeARC-Website/

Dataset: https://huggingface.co/datasets/anjiangwei/CodeARC-Problems

10 Input-Output examples for each problem: https://huggingface.co/datasets/anjiangwei/CodeARC-Invocations
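The benchmark's core setting is inductive program synthesis: a candidate program is judged only by whether it reproduces a hidden target function's behavior on input-output examples. The snippet below is a hypothetical illustration of that checking step (the function and examples are invented for illustration, not taken from the CodeARC codebase):

```python
# Hypothetical sketch (not CodeARC code): check a candidate program
# against input-output examples, as in inductive program synthesis.

def check_candidate(candidate, examples):
    """Return True iff the candidate matches every (input, output) pair."""
    return all(candidate(*inp) == out for inp, out in examples)

# Suppose the hidden target function doubles its argument and adds one.
examples = [((0,), 1), ((2,), 5), ((10,), 21)]

good = lambda x: 2 * x + 1   # consistent with all examples
bad = lambda x: x + 1        # fails on x = 2 and x = 10

print(check_candidate(good, examples))  # True
print(check_candidate(bad, examples))   # False
```

In the agentic setting studied by the paper, the model can also query the hidden function on new inputs to gather more examples before committing to a candidate.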

Fine-tuned models:

https://huggingface.co/LLM4Code/CodeARC_annotated_llama3.1

https://huggingface.co/LLM4Code/CodeARC_anonymous_llama3.1

@article{wei2025codearc,
  title={CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis},
  author={Wei, Anjiang and Suresh, Tarun and Cao, Jiannan and Kannan, Naveen and Wu, Yuheng and Yan, Kai and Teixeira, Thiago SFX and Wang, Ke and Aiken, Alex},
  journal={arXiv preprint arXiv:2503.23145},
  year={2025}
}
Model size: 8.03B params (Safetensors, BF16)