Training long-horizon terminal agents with reinforcement learning
GitHub repository: view source code and documentation
Terminal-Bench-RL is a benchmark and training framework for long-horizon task completion in terminal environments. It provides a suite of tasks for evaluating and training terminal agents using rLLM’s RL pipeline.
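To make the "long-horizon" framing concrete, the sketch below shows the general shape of a terminal-agent episode: the agent issues one command per step, reads the output, and receives a reward only when the task is checked at the end. This is a toy illustration under stated assumptions, not Terminal-Bench-RL's or rLLM's API. `ToyTerminalEnv`, `run_episode`, `scripted_policy`, and the `write`/`cat`/`submit` commands are all hypothetical names, and the in-memory dictionary stands in for a real shell or container.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]  # (command, observation) pairs


@dataclass
class ToyTerminalEnv:
    """Hypothetical stand-in for a terminal task: put `target_text` in `target_path`."""

    target_path: str = "out.txt"
    target_text: str = "done"
    max_steps: int = 8
    files: Dict[str, str] = field(default_factory=dict)
    steps: int = 0

    def step(self, command: str) -> Tuple[str, float, bool]:
        """Run one command and return (observation, reward, done)."""
        self.steps += 1
        parts = command.split(maxsplit=2)
        if parts[:1] == ["write"] and len(parts) == 3:
            self.files[parts[1]] = parts[2]
            obs = f"wrote {parts[1]}"
        elif parts[:1] == ["cat"] and len(parts) == 2:
            obs = self.files.get(parts[1], "cat: no such file")
        elif command == "submit":
            # Sparse reward: only the final state is checked, once, at submission.
            success = self.files.get(self.target_path) == self.target_text
            return "submitted", 1.0 if success else 0.0, True
        else:
            obs = f"unknown command: {command}"
        # Intermediate steps earn nothing; the episode ends at the step budget.
        return obs, 0.0, self.steps >= self.max_steps


def run_episode(env: ToyTerminalEnv, policy: Callable[[History], str]) -> Tuple[History, float]:
    """Roll out one episode and return the trajectory and its total reward."""
    history: History = []
    total_reward = 0.0
    done = False
    while not done:
        command = policy(history)
        obs, reward, done = env.step(command)
        history.append((command, obs))
        total_reward += reward
    return history, total_reward


def scripted_policy(history: History) -> str:
    """Placeholder for a learned policy: inspect, act, then submit."""
    script = ["cat out.txt", "write out.txt done", "submit"]
    return script[min(len(history), len(script) - 1)]


if __name__ == "__main__":
    trajectory, total = run_episode(ToyTerminalEnv(), scripted_policy)
    for command, obs in trajectory:
        print(f"$ {command}\n{obs}")
    print("episode reward:", total)
```

In an RL setting, the scripted policy would be replaced by a language model conditioned on the command history, and the trajectory together with its episode reward would be what the trainer consumes. The reward arrives only at the end of a multi-step interaction, which is what makes credit assignment hard in long-horizon tasks.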