UnifiedTrainer and its ecosystem.
Tutorials
Unified workflow trainer
Unified trainer: End-to-end training loop architecture and lifecycle
Backend protocol: Backend interface contract used by the unified trainer
Configuration: Config structure and forwarding behavior
Algorithms
Advantage estimator: rLLM-native advantage estimation and customization
Pre-computing advantage: Step-level precomputed training signals and mixed-mode training

