Training Software Engineering Agents and Verifiers with SWE-Gym

Jiayi Pan*¹, Xingyao Wang*², Graham Neubig³, Navdeep Jaitly⁴, Heng Ji², Alane Suhr^¹, Yizhe Zhang^⁴

¹UC Berkeley, ²UIUC, ³CMU, ⁴Apple
*Equal contribution, ^Equal supervision

We present SWE-Gym, the first environment for training real-world software engineering agents. We use it to train strong LM agents that achieve state-of-the-art open results on SWE-Bench, with early, promising scaling characteristics as we increase training and inference-time compute.

Progress in agents for software engineering has been limited by the lack of training environments that both include rigorous verification for reinforcement learning and cover the expansive tasks encountered in real-world repository-level engineering.

We introduce SWE-Gym: An Open Environment for Training Software Engineering Agents & Verifiers. Our baselines achieve a new open state of the art of 32% on SWE-Bench Verified and 26% on SWE-Bench Lite, with promising scaling trends.

SWE-Gym enables scalable improvements for software engineering agents at both training and inference time. Our current results are primarily bottlenecked by training and inference compute rather than by the size of our environment.

Reproducing Results

Please refer to our GitHub repo for more details. See docs/OpenHands.md and docs/MoatlessTools.md for instructions on reproducing our training and inference-time results with the OpenHands and MoatlessTools agents.

📚 Citation

@misc{pan2024trainingsoftwareengineeringagents,
      title={Training Software Engineering Agents and Verifiers with SWE-Gym}, 
      author={Jiayi Pan and Xingyao Wang and Graham Neubig and Navdeep Jaitly and Heng Ji and Alane Suhr and Yizhe Zhang},
      year={2024},
      eprint={2412.21139},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2412.21139}, 
}