CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Paper: https://arxiv.org/pdf/2503.23145

Code: https://github.com/Anjiang-Wei/CodeARC

Website: https://anjiang-wei.github.io/CodeARC-Website/

Dataset: https://huggingface.co/datasets/anjiangwei/CodeARC-Problems

10 Input-Output examples for each problem: https://huggingface.co/datasets/anjiangwei/CodeARC-Invocations
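The benchmark's core setting is inductive program synthesis: a candidate program is judged only by whether it reproduces a hidden target function's behavior on input-output examples. The snippet below is a hypothetical illustration of that checking step (the function and examples are invented for illustration, not taken from the CodeARC codebase):

```python
# Hypothetical sketch (not CodeARC code): check a candidate program
# against input-output examples, as in inductive program synthesis.

def check_candidate(candidate, examples):
    """Return True iff the candidate matches every (input, output) pair."""
    return all(candidate(*inp) == out for inp, out in examples)

# Suppose the hidden target function doubles its argument and adds one.
examples = [((0,), 1), ((2,), 5), ((10,), 21)]

good = lambda x: 2 * x + 1   # consistent with all examples
bad = lambda x: x + 1        # fails on x = 2 and x = 10

print(check_candidate(good, examples))  # True
print(check_candidate(bad, examples))   # False
```

In the agentic setting studied by the paper, the model can also query the hidden function on new inputs to gather more examples before committing to a candidate.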

Fine-tuned models:

https://huggingface.co/LLM4Code/CodeARC_annotated_llama3.1

https://huggingface.co/LLM4Code/CodeARC_anonymous_llama3.1

@article{wei2025codearc,
  title={CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis},
  author={Wei, Anjiang and Suresh, Tarun and Cao, Jiannan and Kannan, Naveen and Wu, Yuheng and Yan, Kai and Teixeira, Thiago SFX and Wang, Ke and Aiken, Alex},
  journal={arXiv preprint arXiv:2503.23145},
  year={2025}
}
Model size: 8.03B params (Safetensors, BF16)