The SWE-bench leaderboard, a benchmark that measures how well AI models resolve real GitHub issues in open-source repositories, has been updated with results for the current generation of models. The refresh gives an up-to-date view of how these models compare on practical software engineering tasks.
Source: Simon Willison