When rLLM launches verl training, it calls ray.init() with a runtime_env that controls which environment variables every Ray worker sees. This page describes what get_ppo_ray_runtime_env() puts in that dict, the precedence order between rLLM defaults, your shell, and ray job submit --runtime-env-json, and the knobs you have for overriding each layer.
Precedence
get_ppo_ray_runtime_env() composes the runtime env from three layers. Higher layers win on key conflicts.
| Priority | Source | Where it comes from |
|---|---|---|
| 1 (low) | rLLM defaults | PPO_RAY_RUNTIME_ENV in rllm/trainer/verl/ray_runtime_env.py |
| 2 | Forwarded host env | Variables in your shell whose names match a FORWARD_PREFIXES entry |
| 3 (high) | Job-submit runtime env | runtime_env field of --runtime-env-json passed to ray job submit |
The first two layers are merged inside get_ppo_ray_runtime_env() itself: the host-forwarded vars overwrite the rLLM defaults via a plain dict.update. The third layer is honored by Ray’s own merge inside ray.init(). To make that merge behave as a proper override (instead of raising on a key conflict), rLLM pops every key the job config sets from the dict it returns, so the job config’s value is the one that survives.
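The three layers can be sketched as follows. This is a minimal illustration, not rLLM's actual code: the default value and the abbreviated prefix tuple are stand-ins, and the function name is hypothetical.

```python
# Illustrative sketch of the three-layer precedence; the default value and
# the abbreviated prefix list below are stand-ins, not rLLM's real constants.
PPO_RAY_RUNTIME_ENV = {"env_vars": {"TOKENIZERS_PARALLELISM": "true"}}
FORWARD_PREFIXES = ("VLLM_", "HF_", "NCCL_")  # abbreviated for the example

def compose_env_vars(host_env, job_env_vars):
    # Layer 1: rLLM defaults.
    env_vars = dict(PPO_RAY_RUNTIME_ENV["env_vars"])
    # Layer 2: forwarded host vars overwrite defaults via a plain dict.update.
    env_vars.update(
        {k: v for k, v in host_env.items() if k.startswith(FORWARD_PREFIXES)}
    )
    # Layer 3: pop any key the job config sets, so Ray's merge in ray.init()
    # keeps the job-submit value instead of raising on the conflict.
    for key in job_env_vars:
        env_vars.pop(key, None)
    return env_vars
```

With host_env={"VLLM_USE_V1": "1"} and job_env_vars={"VLLM_USE_V1": "0"}, the returned dict omits VLLM_USE_V1 entirely, leaving the job-submit value to win inside ray.init().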
This is the contract as of PR #521. Earlier versions did not look at the job-submit runtime env at all, so a ray job submit --runtime-env-json=... invocation could collide with rLLM’s defaults and fail.
Default environment variables
These are the values rLLM sets at the lowest priority. Anything you forward from your shell or specify in --runtime-env-json will override them.
Forwarding from your shell
Any variable in your shell whose name starts with one of these prefixes is forwarded to workers automatically:
| Category | Prefixes |
|---|---|
| Inference engines | VLLM_, SGL_, SGLANG_ |
| HuggingFace | HF_, TOKENIZERS_, DATASETS_ |
| Training frameworks | TORCH_, PYTORCH_, DEEPSPEED_, MEGATRON_ |
| CUDA / NCCL | NCCL_, CUDA_, CUBLAS_, CUDNN_, NV_, NVIDIA_ |
For example, exporting VLLM_LOGGING_LEVEL=WARN in your shell sets vLLM’s log level (WARN) on every worker.
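The forwarding step reduces to a prefix match over the host environment. A sketch using the prefixes from the table above (the helper name is hypothetical):

```python
import os

# Prefix tuple mirrors the table above; the helper name is hypothetical.
FORWARD_PREFIXES = (
    "VLLM_", "SGL_", "SGLANG_",
    "HF_", "TOKENIZERS_", "DATASETS_",
    "TORCH_", "PYTORCH_", "DEEPSPEED_", "MEGATRON_",
    "NCCL_", "CUDA_", "CUBLAS_", "CUDNN_", "NV_", "NVIDIA_",
)

def forwarded_host_env(host_env=None):
    # Default to the real process environment when no dict is supplied.
    host_env = dict(os.environ) if host_env is None else host_env
    return {k: v for k, v in host_env.items() if k.startswith(FORWARD_PREFIXES)}
```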
Suppressing forwarding with RLLM_EXCLUDE
RLLM_EXCLUDE is a comma-separated list that opts variables out of the host-forwarding step. Two forms are supported:
- An entry ending in * removes the corresponding entry from the forward-prefix list (so VLLM* drops the VLLM_ prefix entirely).
- Entries without * are matched against full variable names and excluded individually.
The two forms can be combined.
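A sketch of how the two forms might be applied; the function name and the exact matching details are assumptions, not rLLM's actual implementation:

```python
def apply_rllm_exclude(forward_prefixes, host_env, exclude_spec):
    """Illustrative sketch of RLLM_EXCLUDE handling (hypothetical helper)."""
    entries = [e.strip() for e in exclude_spec.split(",") if e.strip()]
    # "VLLM*" drops every forward prefix whose name starts with the stem.
    prefix_stems = [e.rstrip("*") for e in entries if e.endswith("*")]
    exact_names = {e for e in entries if not e.endswith("*")}
    prefixes = tuple(
        p for p in forward_prefixes
        if not any(p.startswith(stem) for stem in prefix_stems)
    )
    # Plain names are excluded individually even when their prefix survives.
    return {
        k: v for k, v in host_env.items()
        if k.startswith(prefixes) and k not in exact_names
    }
```

For example, RLLM_EXCLUDE="VLLM*,HF_TOKEN" would stop every VLLM_* variable from being forwarded while still forwarding HF_HOME, but not HF_TOKEN.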
Overriding with ray job submit
If you launch training via ray job submit, the runtime_env field of --runtime-env-json wins over both rLLM defaults and host-forwarded vars. This is the right escape hatch when you need a different value on the cluster than on the submitting node.
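For example (the variable, its value, and the entrypoint script are placeholders):

```shell
ray job submit \
  --runtime-env-json='{"env_vars": {"NCCL_DEBUG": "INFO"}}' \
  -- python train.py
```

Here NCCL_DEBUG=INFO reaches every worker even if your submitting shell exports a different NCCL_DEBUG, because the job-submit layer has the highest priority.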
get_ppo_ray_runtime_env() reads this dict from RAY_JOB_CONFIG_JSON_ENV_VAR (which the Ray job runtime sets for you), pops the listed keys from its own env_vars so they cannot collide, and only sets working_dir=None when the job config does not specify a working_dir. Without that handling Ray’s ray.init() would refuse to merge and raise unless you also exported RAY_OVERRIDE_JOB_RUNTIME_ENV=1.
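That handling can be sketched as below. This is an illustration, not rLLM's actual code; it assumes the environment variable's literal name matches Ray's RAY_JOB_CONFIG_JSON_ENV_VAR constant, and the function name is hypothetical.

```python
import json

def apply_job_config(runtime_env, environ):
    """Sketch: defuse conflicts with the job-submit runtime env."""
    # Ray's job runtime serializes the job config into this variable
    # (name assumed to equal the RAY_JOB_CONFIG_JSON_ENV_VAR constant).
    raw = environ.get("RAY_JOB_CONFIG_JSON_ENV_VAR")
    job_env = {}
    if raw:
        job_env = json.loads(raw).get("runtime_env") or {}
    # Pop every key the job config sets, so ray.init() keeps the job's value
    # instead of raising on the conflict.
    for key in job_env.get("env_vars", {}):
        runtime_env["env_vars"].pop(key, None)
    # Only pin working_dir when the job config does not provide one.
    if "working_dir" not in job_env:
        runtime_env["working_dir"] = None
    return runtime_env
```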
Programmatic access
AgentTrainer calls get_ppo_ray_runtime_env() for you. If you need the same dict for a custom Ray actor, import it directly from rllm.trainer.verl.ray_runtime_env and pass the result as the runtime_env argument to ray.init(). The returned dict always contains an env_vars key; it contains a working_dir key only when no job-submit working_dir is in effect.
