Created on February 03, 2026
2026
New paper on a diverse evaluation benchmark for code generation agents is out on arXiv (arXiv:2602.02262).