The MigrationBench agent is not an AgentFlow plugin: it runs inside an AWS Bedrock AgentCore Runtime container built and deployed from the agentcore-rl-toolkit repo. rLLM drives rollouts over the network and trains the policy locally with the verl backend.
Because the agent lives in an external container, there is no pip install -e cookbooks/<name> plugin and no rllm eval/train CLI entry point. The cookbook in cookbooks/migrationbench/ only handles the rLLM side: registering the dataset from S3 metadata and launching verl training against the remote runtime.
Pattern
| Aspect | Value |
|---|---|
| Loop shape | Multi-turn coding agent, running inside an AgentCore Runtime container |
| Dataset | migration_bench — MigrationBench Java repos, registered from S3 metadata |
| Runtime | AWS Bedrock AgentCore Runtime (remote, auto-scaling microVMs) |
| Backend | verl only (distributed multi-GPU) |
| Reward shape | Agent runs mvn build/tests in-container, writes the reward to S3; rLLM polls it |
| Model (reference) | Qwen3-Coder-30B-A3B-Instruct (LoRA), tested on a single 8×B200 node with verl 0.8.0 |
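
The reward shape in the table, where the agent writes its reward to S3 and rLLM polls for it, can be sketched as a generic polling loop. This is a minimal illustration, not rLLM's implementation: `poll_for_reward`, the `fetch` callable, and the JSON payload with a `reward` field are all assumptions made for this example.

```python
import json
import time
from typing import Callable, Optional


def poll_for_reward(
    fetch: Callable[[], Optional[bytes]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Call `fetch` until it returns a payload, then parse the reward.

    `fetch` returns the raw object bytes, or None while the result is
    not there yet. The payload layout ({"reward": <number>}) is a
    hypothetical format chosen for this sketch.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch()
        if payload is not None:
            return float(json.loads(payload)["reward"])
        if time.monotonic() >= deadline:
            raise TimeoutError("no reward object appeared before the timeout")
        sleep(interval_s)
```

In practice `fetch` could wrap an S3 read against the output bucket and return `None` while the object key does not exist yet. Keeping the storage call behind a callable also lets the loop be tested without AWS access.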
Prerequisites
The agent code, container build, and dataset upload all live in the toolkit. Build and deploy the agent by following the strands_migration_agent example. That produces the three inputs this cookbook needs:
- An AgentCore agent runtime ARN: the deployed container that performs the migrations.
- A data S3 bucket: holds the prepared MigrationBench repo tarballs and their metadata.json, written by the toolkit’s preprocess.py. The container downloads repos from here at runtime, and prepare_migrationbench_data.py reads its metadata to register the dataset.
- An output S3 bucket: where the agent writes rollout results (rewards, etc.); rLLM polls it each rollout. May differ from the data bucket.
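
The three inputs listed above can be grouped and sanity-checked before anything is launched. The sketch below is only an illustration: the `RuntimeInputs` class and its field names are hypothetical, and the checks are deliberately loose (a non-empty bucket name, a value that looks like an ARN) rather than a statement of what the cookbook itself validates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeInputs:
    """The three values produced by the toolkit deployment (hypothetical field names)."""

    agent_runtime_arn: str
    data_bucket: str
    output_bucket: str

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means the values look plausible."""
        found = []
        # AWS ARNs always begin with the literal prefix "arn:".
        if not self.agent_runtime_arn.startswith("arn:"):
            found.append("agent_runtime_arn should be a full ARN starting with 'arn:'")
        for label, bucket in (
            ("data_bucket", self.data_bucket),
            ("output_bucket", self.output_bucket),
        ):
            if not bucket:
                found.append(f"{label} is empty")
        return found
```

Returning a list of problems instead of raising on the first one makes it easier to report every missing value at once.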
Configure the environment
Run the remaining steps from the cookbook folder (cd cookbooks/migrationbench); the train script sources .env from the current directory. Copy the provided example file to .env and fill it in.
Register the dataset
prepare_migrationbench_data.py reads the metadata.json files from s3://<data-bucket>/tars/{train,test}/, then registers migration_bench/{train,test} with the rLLM DatasetRegistry:
- Train: repos under tars/train/ with num_test_cases > 0.
- Test: all repos under tars/test/.
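
The split rules above amount to a small filter over the metadata entries. The following is a rough sketch of that rule only, not the code in prepare_migrationbench_data.py: the `select_split` helper and the `repo` field are hypothetical, and the real metadata.json layout may differ. Only `num_test_cases` comes from the description above.

```python
from typing import Any


def select_split(entries: list[dict[str, Any]], split: str) -> list[dict[str, Any]]:
    """Apply the split rules to a list of metadata entries.

    Train keeps only repos with num_test_cases > 0; test keeps everything.
    Entries missing the field are treated as having zero test cases
    (a choice made for this sketch).
    """
    if split == "train":
        return [e for e in entries if e.get("num_test_cases", 0) > 0]
    if split == "test":
        return list(entries)
    raise ValueError(f"unknown split: {split!r}")


# Hypothetical metadata entries for illustration.
entries = [
    {"repo": "example/a", "num_test_cases": 12},
    {"repo": "example/b", "num_test_cases": 0},
]
train_rows = select_split(entries, "train")  # keeps only example/a
test_rows = select_split(entries, "test")    # keeps both entries
```

Filtering the train split this way presumably avoids training on repos whose reward could not reflect any test outcome, while the test split is left unfiltered.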
Run training
The train script sources .env for the agent ARN and output bucket, then runs verl with rllm.remote_runtime.backend=agentcore so rollouts execute in the deployed container. Tune MODEL_PATH, parallelism (TP/EP/CP), batch sizes, and trainer.n_gpus_per_node and trainer.nnodes in the script to match your hardware.
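
As a rough illustration of the hardware-related settings, the sketch below assembles the `key=value` overrides named above into a list. It assumes the overrides are passed as plain `key=value` arguments, as the form `rllm.remote_runtime.backend=agentcore` suggests. The `build_overrides` helper is hypothetical and covers only the three keys this section mentions; the actual train script sets many more.

```python
def build_overrides(n_gpus_per_node: int, nnodes: int) -> list[str]:
    """Build the key=value overrides for the remote backend and cluster shape."""
    if n_gpus_per_node < 1 or nnodes < 1:
        raise ValueError("n_gpus_per_node and nnodes must both be at least 1")
    return [
        # Route rollouts to the deployed AgentCore container.
        "rllm.remote_runtime.backend=agentcore",
        f"trainer.n_gpus_per_node={n_gpus_per_node}",
        f"trainer.nnodes={nnodes}",
    ]


# Matches the reference setup from the table: one node with 8 GPUs.
overrides = build_overrides(n_gpus_per_node=8, nnodes=1)
```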
