When rLLM launches verl training, it calls ray.init() with a runtime_env that controls which environment variables every Ray worker sees. This page describes what get_ppo_ray_runtime_env() puts in that dict, the precedence order between rLLM defaults, your shell, and ray job submit --runtime-env-json, and the knobs you have for overriding each layer.

Precedence

get_ppo_ray_runtime_env() composes the runtime env from three layers. Higher layers win on key conflicts.
| Priority | Source | Where it comes from |
| --- | --- | --- |
| 1 (low) | rLLM defaults | PPO_RAY_RUNTIME_ENV in rllm/trainer/verl/ray_runtime_env.py |
| 2 | Forwarded host env | Variables in your shell whose names match a FORWARD_PREFIXES entry |
| 3 (high) | Job-submit runtime env | runtime_env field of --runtime-env-json passed to ray job submit |
The first two layers are merged inside get_ppo_ray_runtime_env() itself: the host-forwarded vars overwrite the rLLM defaults via a plain dict.update. The third layer is honored by Ray’s own merge inside ray.init(). To make that merge behave as a proper override (instead of raising on a key conflict), rLLM pops every key the job config sets from the dict it returns, so the job config’s value is the one that survives.
This is the contract as of PR #521. Earlier versions did not look at the job-submit runtime env at all, so a ray job submit --runtime-env-json=... invocation could collide with rLLM’s defaults and fail.
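The three-layer merge described above can be sketched in a few lines. This is a minimal illustration, not rLLM's actual implementation: the function name compose_env_vars and the sample values are hypothetical, and only the dict.update and pop behavior is taken from the description.

```python
def compose_env_vars(defaults, forwarded, job_env_vars):
    """Merge layers 1 and 2, then drop every key that layer 3 supplies."""
    env_vars = dict(defaults)      # layer 1: rLLM defaults (copied, not mutated)
    env_vars.update(forwarded)     # layer 2: host-forwarded vars overwrite defaults
    for key in job_env_vars:       # layer 3: leave these keys to Ray's own merge
        env_vars.pop(key, None)
    return env_vars


# Hypothetical sample values for illustration.
defaults = {"NCCL_DEBUG": "WARN", "VLLM_LOGGING_LEVEL": "WARN"}
forwarded = {"NCCL_DEBUG": "INFO", "HF_HOME": "/data/hf"}
job_env_vars = {"VLLM_LOGGING_LEVEL": "DEBUG"}

result = compose_env_vars(defaults, forwarded, job_env_vars)
print(result)
```

In this sketch NCCL_DEBUG ends up with the shell's value, while VLLM_LOGGING_LEVEL is absent from the returned dict so that the job config's value is the only one Ray sees.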

Default environment variables

These are the values rLLM sets at the lowest priority. Anything you forward from your shell or specify in --runtime-env-json will override them.
PPO_RAY_RUNTIME_ENV = {
    "env_vars": {
        "TOKENIZERS_PARALLELISM": "true",
        "NCCL_DEBUG": "WARN",
        "VLLM_LOGGING_LEVEL": "WARN",
        "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true",
        "CUDA_DEVICE_MAX_CONNECTIONS": "1",
        "VLLM_DISABLE_COMPILE_CACHE": "1",
        "NCCL_CUMEM_ENABLE": "0",
    },
}
The last two, VLLM_DISABLE_COMPILE_CACHE and NCCL_CUMEM_ENABLE, are present to work around known issues: vLLM’s compile cache corruption (vllm-project/vllm#31199) and a hang during weight sync in disaggregated mode, respectively. Override them with care.

Forwarding from your shell

Any variable in your shell whose name starts with one of these prefixes is forwarded to workers automatically:
| Category | Prefixes |
| --- | --- |
| Inference engines | VLLM_, SGL_, SGLANG_ |
| HuggingFace | HF_, TOKENIZERS_, DATASETS_ |
| Training frameworks | TORCH_, PYTORCH_, DEEPSPEED_, MEGATRON_ |
| CUDA / NCCL | NCCL_, CUDA_, CUBLAS_, CUDNN_, NV_, NVIDIA_ |
| rLLM extension hooks | RLLM_ (used by RLLM_EXTRA_WORKER_SETUP_HOOK; see below) |
So, to raise the log verbosity of vLLM and NCCL for a single run, export the variables before launching training:
export VLLM_LOGGING_LEVEL=DEBUG
export NCCL_DEBUG=INFO
python train.py ...
Both values will overwrite rLLM’s defaults (WARN) on every worker.
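Prefix-based forwarding amounts to a filter over the process environment. The sketch below assumes a plain startswith match; EXAMPLE_PREFIXES is a hypothetical subset of the real list, and forwardable is an illustrative helper, not an rLLM function.

```python
# Hypothetical subset of the forward-prefix list, for illustration only.
EXAMPLE_PREFIXES = ("VLLM_", "NCCL_", "HF_")


def forwardable(environ, prefixes=EXAMPLE_PREFIXES):
    """Return only the variables whose names start with a listed prefix."""
    return {
        name: value
        for name, value in environ.items()
        if name.startswith(tuple(prefixes))
    }


# A stand-in for os.environ so the example does not depend on your shell.
fake_shell = {
    "VLLM_LOGGING_LEVEL": "DEBUG",
    "NCCL_DEBUG": "INFO",
    "PATH": "/usr/bin",
    "MY_VLLM_FLAG": "1",  # contains "VLLM_" but does not start with it
}
selected = forwardable(fake_shell)
print(selected)
```

Passing os.environ instead of fake_shell would show which of your own shell variables match under the same assumption.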

Suppressing forwarding with RLLM_EXCLUDE

RLLM_EXCLUDE is a comma-separated list that opts variables out of the host-forwarding step. Entries take one of two forms, exact variable names or wildcard patterns. The example below uses exact names:
export RLLM_EXCLUDE="CUDA_VISIBLE_DEVICES,HF_TOKEN"
An entry containing * removes the corresponding entry from the forward-prefix list (so VLLM* drops the VLLM_ prefix entirely). Entries without * are matched against full variable names and excluded individually. The two forms can be combined in one list, for example VLLM*,HF_TOKEN.
Reach for RLLM_EXCLUDE when a host variable is leaking into workers and breaking them — a stray CUDA_VISIBLE_DEVICES from your launcher is the classic case. It only affects the host-forwarding layer; the rLLM defaults and any --runtime-env-json keys are untouched.
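One way to picture the two entry forms is the sketch below. It is an assumption-laden illustration: parse_exclude and apply_exclude are hypothetical helpers, and the rule used to map a wildcard such as VLLM* onto the VLLM_ prefix (strip the * and a trailing underscore, then compare) is a guess at the matching logic, not rLLM's documented behavior.

```python
def parse_exclude(raw, prefixes):
    """Split an RLLM_EXCLUDE-style string into kept prefixes and excluded names."""
    kept_prefixes = list(prefixes)
    excluded_names = set()
    for item in (part.strip() for part in raw.split(",")):
        if not item:
            continue
        if "*" in item:
            # Assumed rule: "VLLM*" targets the prefix whose stem is "VLLM".
            stem = item.replace("*", "").rstrip("_")
            kept_prefixes = [p for p in kept_prefixes if p.rstrip("_") != stem]
        else:
            excluded_names.add(item)
    return kept_prefixes, excluded_names


def apply_exclude(environ, raw, prefixes):
    """Forward only variables that match a kept prefix and are not excluded by name."""
    kept, names = parse_exclude(raw, prefixes)
    return {
        k: v for k, v in environ.items()
        if k.startswith(tuple(kept)) and k not in names
    }


fake_shell = {
    "VLLM_LOGGING_LEVEL": "DEBUG",
    "HF_HOME": "/data/hf",
    "HF_TOKEN": "secret",
    "CUDA_VISIBLE_DEVICES": "0",
    "CUDA_DEVICE_MAX_CONNECTIONS": "1",
}
forwarded = apply_exclude(
    fake_shell, "VLLM*, CUDA_VISIBLE_DEVICES,HF_TOKEN", ["VLLM_", "HF_", "CUDA_"]
)
print(forwarded)
```

Under these assumptions only HF_HOME and CUDA_DEVICE_MAX_CONNECTIONS are forwarded: the wildcard removes everything under VLLM_, and the two exact names are dropped individually.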

Overriding with ray job submit

If you launch training via ray job submit, the runtime_env field of --runtime-env-json wins over both rLLM defaults and host-forwarded vars. This is the right escape hatch when you need a different value on the cluster than on the submitting node.
ray job submit \
  --runtime-env-json='{
    "env_vars": {
      "NCCL_DEBUG": "INFO",
      "VLLM_LOGGING_LEVEL": "DEBUG"
    },
    "working_dir": "."
  }' \
  -- python train.py ...
get_ppo_ray_runtime_env() reads this dict from RAY_JOB_CONFIG_JSON_ENV_VAR (which the Ray job runtime sets for you), pops the listed keys from its own env_vars so they cannot collide, and only sets working_dir=None when the job config does not specify a working_dir. Without that handling Ray’s ray.init() would refuse to merge and raise unless you also exported RAY_OVERRIDE_JOB_RUNTIME_ENV=1.
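The job-config handling described above might look roughly like this. It is a sketch under stated assumptions: the environment variable literal "RAY_JOB_CONFIG_JSON_ENV_VAR" and the JSON layout (a top-level runtime_env object) are assumed here, and job_runtime_env and reconcile are hypothetical helper names rather than rLLM or Ray APIs.

```python
import json


def job_runtime_env(environ, var_name="RAY_JOB_CONFIG_JSON_ENV_VAR"):
    """Return the job-submit runtime_env dict, or {} outside a Ray job."""
    raw = environ.get(var_name)  # var_name is an assumed literal
    if not raw:
        return {}
    return json.loads(raw).get("runtime_env") or {}


def reconcile(runtime_env, job_env):
    """Drop env_vars keys the job config sets; default working_dir only if unset."""
    out = {"env_vars": dict(runtime_env.get("env_vars", {}))}
    for key in job_env.get("env_vars", {}):
        out["env_vars"].pop(key, None)  # the job config's value must survive
    if "working_dir" not in job_env:
        out["working_dir"] = None
    return out


# Simulated environment of a process started by a job submission.
fake_environ = {
    "RAY_JOB_CONFIG_JSON_ENV_VAR": json.dumps(
        {"runtime_env": {"env_vars": {"NCCL_DEBUG": "INFO"}, "working_dir": "."}}
    )
}
ours = {"env_vars": {"NCCL_DEBUG": "WARN", "TOKENIZERS_PARALLELISM": "true"}}
merged = reconcile(ours, job_runtime_env(fake_environ))
print(merged)
```

Because the simulated job config sets both NCCL_DEBUG and working_dir, the returned dict carries neither, which leaves nothing for Ray's merge to conflict on.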

Worker process setup hook

Beyond environment variables, rLLM also wires a worker_process_setup_hook into the runtime env. Ray invokes this once at the start of every worker interpreter, before any user code runs. rLLM points it at rllm.trainer.verl.patch.apply_all_verl_patches, which monkey-patches the verl modules that the FSDP / vLLM workers re-import on startup.
This matters because verl is pinned (currently 0.7.1) and rLLM ships small backports for known issues, for example the qwen3-VL inplace-add fix from verl#5881, the NCCL-deadlock fix on dynamic-batching from verl#5750, and the jagged-NestedTensor _ragged_idx preservation fix from verl#6127. Ray spawns each worker as a fresh Python interpreter that re-imports verl from disk, so a driver-side import verl; verl.X = patched_X never propagates. The setup hook runs the patches inside every worker.
If your --runtime-env-json already specifies a worker_process_setup_hook, rLLM steps out of the way (same precedence rule as for env_vars): the job-submitted hook wins and rLLM does not register its own. In that case you are responsible for calling apply_all_verl_patches() yourself if you want the verl backports.
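The reason a driver-side patch never reaches workers can be shown without Ray at all. The sketch below uses a plain subprocess as a stand-in for a Ray worker (an analogy, not how Ray spawns workers internally) and patches math.pi instead of a verl attribute.

```python
import math
import subprocess
import sys

original_pi = math.pi
math.pi = 3.0  # "driver-side" monkey-patch, visible only in this process
try:
    # A fresh interpreter re-imports math from scratch, like a new worker would.
    child = subprocess.run(
        [sys.executable, "-c", "import math; print(math.pi == 3.0)"],
        capture_output=True,
        text=True,
        check=True,
    )
    seen_in_child = child.stdout.strip()
    seen_in_parent = math.pi == 3.0
finally:
    math.pi = original_pi  # undo the patch so later code is unaffected

print("parent sees patch:", seen_in_parent)
print("child sees patch:", seen_in_child)
```

The parent sees its own patch while the child does not, which is why the patches have to run inside each worker through a setup hook.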

Adding your own setup hook (RLLM_EXTRA_WORKER_SETUP_HOOK)

Sometimes you need to apply environment-specific tweaks on every worker, for example disabling cuDNN on a host whose cuDNN install is broken, or pointing Triton at a per-PID cache to avoid races between concurrent FSDP workers. These do not belong in rLLM itself (they are properties of one machine), so we expose an extension point instead of asking you to fork the runtime env.
Set the RLLM_EXTRA_WORKER_SETUP_HOOK environment variable to "<absolute-path-to-file.py>:<function-name>". apply_all_verl_patches reads it on every worker, loads the file via importlib.util.spec_from_file_location (so it works even when the file is not on sys.path, e.g. when it lives under a gitignored tmp/ directory), and calls the named function. Failures are logged but never re-raised.
my_local_setup.py
def apply():
    import torch
    torch.backends.cudnn.enabled = False  # workaround broken cuDNN install
    import os
    os.environ.setdefault("TRITON_CACHE_DIR", f"/tmp/triton-cache-{os.getpid()}")
train.py
import os
from pathlib import Path

# Set BEFORE importing rllm: get_ppo_ray_runtime_env() picks it up via
# the RLLM_ forwarded prefix and propagates it to every Ray worker.
# resolve() keeps the path absolute even if __file__ is relative.
hook_file = Path(__file__).resolve().parent / "my_local_setup.py"
os.environ["RLLM_EXTRA_WORKER_SETUP_HOOK"] = f"{hook_file}:apply"
The RLLM_ prefix is in FORWARD_PREFIXES, so the variable is forwarded to workers without any extra config. The hook runs after rLLM’s verl patches.
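For a sense of what loading a "<path>:<function>" value involves, here is a minimal sketch built on importlib.util.spec_from_file_location. It is not rLLM's code: run_extra_hook, the module name passed to importlib, and the boolean return value are all hypothetical choices made for this illustration.

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def run_extra_hook(spec_string):
    """Load '<path>:<function>' from disk and call it; log and swallow failures."""
    try:
        # rsplit keeps any earlier colons (e.g. a drive letter) in the path part.
        path, func_name = spec_string.rsplit(":", 1)
        spec = importlib.util.spec_from_file_location("extra_worker_setup_hook", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file; it need not be on sys.path
        getattr(module, func_name)()
        return True
    except Exception:
        # Mirrors the "logged but never re-raised" behavior described above.
        logger.exception("extra worker setup hook %r failed", spec_string)
        return False
```

Calling run_extra_hook("/abs/path/my_local_setup.py:apply") would execute the file and then apply(). A missing file or function results in a logged traceback and a False return instead of a crash.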

Programmatic access

AgentTrainer calls get_ppo_ray_runtime_env() for you. If you need the same dict for a custom Ray actor, import it directly:
import ray
from rllm.trainer.verl.ray_runtime_env import get_ppo_ray_runtime_env

runtime_env = get_ppo_ray_runtime_env()
ray.init(runtime_env=runtime_env)
The returned dict always contains an env_vars key. It contains a working_dir key only when no job-submit working_dir is in effect.