rLLM calls ray.init() with a runtime_env that controls which environment variables every Ray worker sees. This page describes what get_ppo_ray_runtime_env() puts in that dict, the precedence order between rLLM defaults, your shell, and ray job submit --runtime-env-json, and the knobs you have for overriding each layer.
Precedence
get_ppo_ray_runtime_env() composes the runtime env from three layers. Higher layers win on key conflicts.
| Priority | Source | Where it comes from |
|---|---|---|
| 1 (low) | rLLM defaults | PPO_RAY_RUNTIME_ENV in rllm/trainer/verl/ray_runtime_env.py |
| 2 | Forwarded host env | Variables in your shell whose names match a FORWARD_PREFIXES entry |
| 3 (high) | Job-submit runtime env | runtime_env field of --runtime-env-json passed to ray job submit |
The first two layers are merged inside get_ppo_ray_runtime_env() itself: the host-forwarded vars overwrite the rLLM defaults via a plain dict.update. The third layer is honored by Ray’s own merge inside ray.init(). To make that merge behave as a proper override (instead of raising on a key conflict), rLLM pops every key the job config sets from the dict it returns, so the job config’s value is the one that survives.
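The layering can be modelled with plain dicts. The following is an illustrative sketch, not rLLM's actual code: the function name compose_env_vars and every sample value are hypothetical, and the final dict merge is only a stand-in for what Ray does inside ray.init().

```python
def compose_env_vars(defaults, forwarded, job_env_vars):
    """Model the three-layer precedence with plain dicts (illustrative only)."""
    env_vars = dict(defaults)      # layer 1: rLLM defaults
    env_vars.update(forwarded)     # layer 2: host-forwarded vars overwrite defaults
    for key in job_env_vars:       # layer 3: drop every key the job config sets,
        env_vars.pop(key, None)    # so the later merge sees no conflict
    return env_vars


# Sample values only; these are not rLLM's real defaults.
defaults = {"NCCL_DEBUG": "WARN", "TOKENIZERS_PARALLELISM": "true"}
forwarded = {"NCCL_DEBUG": "INFO", "VLLM_USE_V1": "1"}
job_env_vars = {"VLLM_USE_V1": "0"}

rllm_env = compose_env_vars(defaults, forwarded, job_env_vars)

# Stand-in for Ray's merge of the job config with the ray.init() runtime env.
effective = {**rllm_env, **job_env_vars}
print(effective)
```

In this model the shell value of NCCL_DEBUG beats the default, and the job config's VLLM_USE_V1 beats the shell value.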
This is the contract as of PR #521. Earlier versions did not look at the job-submit runtime env at all, so a ray job submit --runtime-env-json=... invocation could collide with rLLM’s defaults and fail.
Default environment variables
These are the values rLLM sets at the lowest priority. Anything you forward from your shell or specify in --runtime-env-json will override them.
Forwarding from your shell
Any variable in your shell whose name starts with one of these prefixes is forwarded to workers automatically:
| Category | Prefixes |
|---|---|
| Inference engines | VLLM_, SGL_, SGLANG_ |
| HuggingFace | HF_, TOKENIZERS_, DATASETS_ |
| Training frameworks | TORCH_, PYTORCH_, DEEPSPEED_, MEGATRON_ |
| CUDA / NCCL | NCCL_, CUDA_, CUBLAS_, CUDNN_, NV_, NVIDIA_ |
| rLLM extension hooks | RLLM_ (used by RLLM_EXTRA_WORKER_SETUP_HOOK; see below) |
WARN) on every worker.
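Prefix-based forwarding amounts to a name filter over the shell environment. A minimal sketch, assuming a simple startswith match; SAMPLE_PREFIXES is a hypothetical subset of the table above (the real list is FORWARD_PREFIXES) and select_forwarded is not an rLLM function.

```python
# Hypothetical subset of the prefixes listed in the table.
SAMPLE_PREFIXES = ("VLLM_", "HF_", "NCCL_", "RLLM_")


def select_forwarded(environ, prefixes=SAMPLE_PREFIXES):
    """Return the variables whose names start with one of the prefixes."""
    prefixes = tuple(prefixes)
    return {name: value for name, value in environ.items() if name.startswith(prefixes)}


# Sample shell environment; VLLMX lacks the underscore, so it does not match VLLM_.
shell = {"NCCL_DEBUG": "WARN", "HF_HOME": "/data/hf", "PATH": "/usr/bin", "VLLMX": "1"}
forwarded = select_forwarded(shell)
print(forwarded)
```

Passing os.environ instead of the sample dict would show which of your own variables match.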
Suppressing forwarding with RLLM_EXCLUDE
RLLM_EXCLUDE is a comma-separated list that opts variables out of the host-forwarding step. Two forms are supported:
A value ending in * removes the corresponding entry from the forward-prefix list (so VLLM* drops the VLLM_ prefix entirely). Values without * are matched against full variable names and excluded individually. The two forms can be combined.
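The two forms can be sketched as a small parser. This is an assumption-laden illustration rather than rLLM's implementation: parse_exclude and forward_with_exclusions are hypothetical names, and matching a wildcard entry to a prefix by comparing both without the trailing underscore is a guess at the exact rule.

```python
def parse_exclude(raw, prefixes):
    """Split an RLLM_EXCLUDE-style string into (kept prefixes, excluded names)."""
    entries = [item.strip() for item in raw.split(",") if item.strip()]
    # Entries ending in "*" name a prefix to drop; compare without trailing "_".
    dropped_stems = {e[:-1].rstrip("_") for e in entries if e.endswith("*")}
    # Entries without "*" are full variable names to exclude individually.
    excluded_names = {e for e in entries if not e.endswith("*")}
    kept_prefixes = [p for p in prefixes if p.rstrip("_") not in dropped_stems]
    return kept_prefixes, excluded_names


def forward_with_exclusions(environ, prefixes, raw_exclude):
    """Apply prefix forwarding after honoring the exclusion list."""
    kept_prefixes, excluded_names = parse_exclude(raw_exclude, prefixes)
    return {
        name: value
        for name, value in environ.items()
        if name.startswith(tuple(kept_prefixes)) and name not in excluded_names
    }


# Sample values only.
shell = {"VLLM_USE_V1": "1", "HF_HOME": "/data/hf", "HF_TOKEN": "secret", "NCCL_DEBUG": "WARN"}
forwarded = forward_with_exclusions(shell, ["VLLM_", "HF_", "NCCL_"], "VLLM*, HF_TOKEN")
print(forwarded)
```

Here VLLM* removes the whole VLLM_ family, while HF_TOKEN is dropped by name and the other HF_ variables are still forwarded.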
Overriding with ray job submit
If you launch training via ray job submit, the runtime_env field of --runtime-env-json wins over both rLLM defaults and host-forwarded vars. This is the right escape hatch when you need a different value on the cluster than on the submitting node.
get_ppo_ray_runtime_env() reads this dict from RAY_JOB_CONFIG_JSON_ENV_VAR (which the Ray job runtime sets for you), pops the listed keys from its own env_vars so they cannot collide, and only sets working_dir=None when the job config does not specify a working_dir. Without that handling, ray.init() would refuse to merge and raise unless you also exported RAY_OVERRIDE_JOB_RUNTIME_ENV=1.
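The handling described above could look roughly like the sketch below. It is not rLLM's code: reconcile_with_job_config is a hypothetical name, and the assumption that the JSON payload carries the job's runtime env under a "runtime_env" key should be checked against your Ray version.

```python
import json
import os

# Name of the variable the Ray job runtime exports.
JOB_CONFIG_VAR = "RAY_JOB_CONFIG_JSON_ENV_VAR"


def reconcile_with_job_config(env_vars, environ=os.environ):
    """Return a runtime_env dict that cannot collide with the job-submit one."""
    # Assumption: the payload is JSON with the job's runtime env under "runtime_env".
    job_config = json.loads(environ.get(JOB_CONFIG_VAR, "{}"))
    job_runtime_env = job_config.get("runtime_env") or {}

    env_vars = dict(env_vars)
    for key in job_runtime_env.get("env_vars") or {}:
        env_vars.pop(key, None)  # the job config's value must be the one that survives

    runtime_env = {"env_vars": env_vars}
    if "working_dir" not in job_runtime_env:
        runtime_env["working_dir"] = None  # only set when the job does not choose one
    return runtime_env


# Simulated job environment with sample values.
fake_environ = {
    JOB_CONFIG_VAR: json.dumps(
        {"runtime_env": {"env_vars": {"NCCL_DEBUG": "INFO"}, "working_dir": "/app"}}
    )
}
under_job = reconcile_with_job_config({"NCCL_DEBUG": "WARN", "HF_HOME": "/data/hf"}, fake_environ)
standalone = reconcile_with_job_config({"NCCL_DEBUG": "WARN"}, {})
```

Under the simulated job, NCCL_DEBUG is removed and no working_dir key is emitted. Outside a job, the dict is returned unchanged apart from working_dir=None.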
Worker process setup hook
Beyond environment variables, rLLM also wires a worker_process_setup_hook into the runtime env. Ray invokes this once at the start of every worker interpreter, before any user code runs. rLLM points it at rllm.trainer.verl.patch.apply_all_verl_patches, which monkey-patches the verl modules that the FSDP / vLLM workers re-import on startup.
This matters because verl is pinned (currently 0.7.1) and rLLM ships small backports for known issues — e.g. the qwen3-VL inplace-add fix from verl#5881, the NCCL-deadlock fix on dynamic-batching from verl#5750, and the jagged-NestedTensor _ragged_idx preservation fix from verl#6127. Ray spawns each worker as a fresh Python interpreter that re-imports verl from disk, so a driver-side import verl; verl.X = patched_X never propagates. The setup hook runs the patches inside every worker.
If your --runtime-env-json already specifies a worker_process_setup_hook, rLLM steps out of the way (same precedence rule as for env_vars): the job-submitted hook wins and rLLM does not register its own. In that case you are responsible for calling apply_all_verl_patches() yourself if you want the verl backports.
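If you do supply your own hook, a wrapper along these lines keeps the backports. This is a sketch: worker_setup_hook is a hypothetical name, and how you reference it from --runtime-env-json depends on your Ray setup.

```python
def worker_setup_hook():
    """Custom worker hook that still applies rLLM's verl backports (illustrative)."""
    # Imported lazily so this module can be imported on machines without rLLM.
    from rllm.trainer.verl.patch import apply_all_verl_patches

    apply_all_verl_patches()  # apply the backports before any site-specific setup
    # ... your own per-worker setup goes here ...
```

Calling the patches first mirrors the ordering rLLM uses for its own extension hook.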
Adding your own setup hook (RLLM_EXTRA_WORKER_SETUP_HOOK)
Sometimes you need to apply environment-specific tweaks on every worker — for example, disabling cuDNN on a host whose cuDNN install is broken, or pointing Triton at a per-PID cache to avoid races between concurrent FSDP workers. These do not belong in rLLM itself (they are properties of one machine), so we expose an extension point instead of asking you to fork the runtime env.
Set the RLLM_EXTRA_WORKER_SETUP_HOOK environment variable to "<absolute-path-to-file.py>:<function-name>". apply_all_verl_patches reads it on every worker, loads the file via importlib.util.spec_from_file_location (so it works even when the file is not on sys.path, e.g. when it lives under a gitignored tmp/ directory), and calls the named function. Failures are logged but never re-raised.
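The loading mechanism can be reproduced with the standard library alone. The sketch below is not rLLM's implementation; run_extra_hook and the module name rllm_extra_hook are hypothetical, and splitting on the last colon is an assumption about how the spec string is parsed.

```python
import importlib.util
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def run_extra_hook(spec_string):
    """Load '<path>:<function>' and call the function; log failures, never raise."""
    try:
        # Split on the last colon so paths that contain colons still work.
        path, _, func_name = spec_string.rpartition(":")
        module_spec = importlib.util.spec_from_file_location("rllm_extra_hook", path)
        module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(module)  # runs the file without touching sys.path
        getattr(module, func_name)()
        return True
    except Exception:
        logger.exception("extra worker setup hook %r failed", spec_string)
        return False


# Demo: a throwaway hook file, plus a path that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    hook_path = os.path.join(tmp, "my_local_setup.py")
    with open(hook_path, "w") as f:
        f.write("import os\n\ndef setup():\n    os.environ['DEMO_HOOK_RAN'] = '1'\n")
    ok = run_extra_hook(f"{hook_path}:setup")
    missing = run_extra_hook(os.path.join(tmp, "nope.py") + ":setup")
```

The second call logs a traceback and returns False instead of crashing, which matches the "logged but never re-raised" behavior described above.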
my_local_setup.py
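A hook file for the two tweaks mentioned earlier might look like this. It is a sketch under assumptions: the function name setup_worker and the cache location are hypothetical, and the cuDNN step is skipped when torch is not installed.

```python
import os
import tempfile


def setup_worker():
    """Per-worker tweaks for one specific machine (illustrative)."""
    # Give each worker process its own Triton cache directory to avoid races.
    cache_dir = os.path.join(tempfile.gettempdir(), f"triton_cache_{os.getpid()}")
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TRITON_CACHE_DIR"] = cache_dir

    # Disable cuDNN on a host whose cuDNN install is broken.
    try:
        import torch
    except ImportError:
        return
    torch.backends.cudnn.enabled = False
```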
train.py
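On the driver side, the only required step is to set the variable before the trainer builds the runtime env. A minimal sketch, assuming a hook file at the hypothetical path tmp/my_local_setup.py that defines a function named setup_worker:

```python
import os
from pathlib import Path

# Hypothetical location of your hook file; resolve() makes the path absolute.
hook_file = Path("tmp/my_local_setup.py").resolve()

# "<absolute-path-to-file.py>:<function-name>", set before the trainer starts Ray.
os.environ["RLLM_EXTRA_WORKER_SETUP_HOOK"] = f"{hook_file}:setup_worker"

# ... construct and run AgentTrainer as usual ...
```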
The RLLM_ prefix is in FORWARD_PREFIXES, so the variable is forwarded to workers without any extra config. The hook runs after rLLM’s verl patches.
Programmatic access
AgentTrainer calls get_ppo_ray_runtime_env() for you. If you need the same dict for a custom Ray actor, import it directly:
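A minimal sketch of that usage follows. The import path is an assumption based on the module that defines PPO_RAY_RUNTIME_ENV, and init_ray_with_rllm_env is a hypothetical helper name.

```python
def init_ray_with_rllm_env():
    """Start Ray with the same runtime env that AgentTrainer would use."""
    import ray

    # Assumed import path; adjust if the function lives elsewhere in your rLLM version.
    from rllm.trainer.verl.ray_runtime_env import get_ppo_ray_runtime_env

    runtime_env = get_ppo_ray_runtime_env()
    if not ray.is_initialized():
        ray.init(runtime_env=runtime_env)  # custom actors then inherit this env
    return runtime_env
```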
The returned dict has an env_vars key. It contains a working_dir key only when no job-submit working_dir is in effect.
