Read the full write-up: Notion blog post with full details.
DeepSWE is a 32B software engineering agent that achieves 59% on SWE-Bench-Verified with test-time scaling (42.2% Pass@1). It tops the SWE-Bench leaderboard for open-weight models.
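The gap between the two figures comes from how many attempts are counted per task. The sketch below illustrates the distinction on toy data: Pass@1 averages the success rate of individual attempts, while a test-time scaling score draws several attempts per task and keeps one. Best-of-n selection by a score is used here purely as an assumed, illustrative form of test-time scaling; the selection procedure DeepSWE actually uses is not described in this section, and the function names, outcomes, and scores are all hypothetical.

```python
from typing import Sequence


def pass_at_1(outcomes: Sequence[Sequence[bool]]) -> float:
    """Average single-attempt success rate, computed per task and then averaged."""
    per_task = [sum(attempts) / len(attempts) for attempts in outcomes]
    return sum(per_task) / len(per_task)


def best_of_n(outcomes: Sequence[Sequence[bool]],
              scores: Sequence[Sequence[float]]) -> float:
    """Fraction of tasks solved when only the highest-scored attempt is submitted.

    ASSUMPTION: `scores` stands in for some selector (for example a verifier);
    it is a hypothetical placeholder, not DeepSWE's actual mechanism.
    """
    solved = 0
    for attempts, task_scores in zip(outcomes, scores):
        best = max(range(len(attempts)), key=lambda i: task_scores[i])
        solved += attempts[best]
    return solved / len(outcomes)


# Toy data (made up): 3 tasks, 4 attempts each. True means the attempt passed.
outcomes = [
    [True, False, False, True],
    [False, False, True, False],
    [False, False, False, False],
]
# Hypothetical selector scores for each attempt.
scores = [
    [0.9, 0.2, 0.1, 0.7],
    [0.3, 0.4, 0.8, 0.1],
    [0.5, 0.6, 0.2, 0.3],
]

print(f"Pass@1:    {pass_at_1(outcomes):.3f}")          # 0.250
print(f"Best-of-4: {best_of_n(outcomes, scores):.3f}")  # 0.667
```

On this toy data the selector picks a passing attempt whenever one exists, so best-of-4 exceeds Pass@1. A weaker selector would narrow the gap.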

Results

| Model   | Parameters | SWE-Bench-Verified |
|---------|------------|--------------------|
| DeepSWE | 32B        | 59.0%              |

Approach

DeepSWE is trained on top of Qwen3-32B to search, view, and navigate codebases, using reinforcement learning (RL) to improve its completion of software engineering tasks. See the DeepSWE example for instructions on reproducing this with rLLM.

Released: July 2025