Task Owner Config Contract

A task owner YAML file is the identity of a composed task/backend/algorithm path. The contract is recorded in ADR-0003, "Task Owner And Config Compose Contract".

Owner Paths

PPO and APPO owner YAMLs use conf/{ppo,appo}/task/<task>/<backend>.yaml.
MLX PPO composes from conf/ppo/config_mlx.yaml and reuses the PPO task owner YAML layout.
Off-policy owner YAMLs include the algorithm dimension: conf/offpolicy/task/<algo>/<task>/<backend>.yaml.
Other existing config roots, such as conf/ppo_him/ and conf/hora_distill/, follow the same owner-YAML identity rule for their supported tasks.
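
The path layouts above can be sketched as a small helper. This is an illustration only, not code from the repository: the function name owner_yaml_path, the offpolicy parameter, and the ON_POLICY_ROOTS mapping are hypothetical, and the off-policy algorithm name used in the usage note is a placeholder.

```python
from pathlib import PurePosixPath

# Hypothetical mapping of on-policy algorithms to the config roots named above.
ON_POLICY_ROOTS = {"ppo": "conf/ppo", "appo": "conf/appo"}


def owner_yaml_path(algo: str, task: str, backend: str, offpolicy: bool = False) -> PurePosixPath:
    """Build the owner YAML path for a task/backend/algorithm combination.

    Illustrative helper (assumption): it only mirrors the layouts described
    in this section and does not check that the file exists.
    """
    if offpolicy:
        # Off-policy layout includes the algorithm dimension:
        # conf/offpolicy/task/<algo>/<task>/<backend>.yaml
        return PurePosixPath("conf/offpolicy/task") / algo / task / f"{backend}.yaml"
    if algo not in ON_POLICY_ROOTS:
        raise ValueError(f"no on-policy config root known for algo {algo!r}")
    # On-policy layout: conf/{ppo,appo}/task/<task>/<backend>.yaml
    return PurePosixPath(ON_POLICY_ROOTS[algo]) / "task" / task / f"{backend}.yaml"
```

For example, owner_yaml_path("ppo", "go2_joystick_flat", "mujoco") builds the PPO owner path listed at the end of this page, and owner_yaml_path("<algo>", "<task>", "<backend>", offpolicy=True) builds a path under conf/offpolicy/task/.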

Use the public CLI flags to switch backends. For example, uv run train --algo ppo --task go2_joystick_flat --sim mujoco selects the mujoco backend, and uv run train --algo ppo --task go2_joystick_flat --sim motrix selects the motrix backend.
For off-policy entrypoints, keep --algo <algo> aligned with the internal owner YAML path conf/offpolicy/task/<algo>/<task>/<backend>.yaml.
training.sim_backend is an identity field inside the selected owner YAML. It is not an independent backend switch.
Backend-specific reward, env, scene, and algorithm differences belong in the owner YAML, not in training scripts.
Reward config must be explicitly injected by the owner YAML when the task uses rewards.
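
The identity and reward rules above could be checked along the following lines. This is a minimal sketch under stated assumptions, not the project's actual guard or test code: check_owner_identity is a hypothetical name, the config is assumed to be already loaded into a plain dict, and the top-level "reward" key is an assumed name for the injected reward config. Only training.sim_backend comes from this document.

```python
from pathlib import PurePosixPath


def check_owner_identity(owner_path: str, cfg: dict, uses_rewards: bool = True) -> list[str]:
    """Return a list of contract violations for one loaded owner YAML.

    Hypothetical checker (assumption): it compares training.sim_backend
    against the backend encoded in the file name <backend>.yaml, and looks
    for an explicitly injected reward section when the task uses rewards.
    """
    problems = []

    # The backend identity is the file stem, e.g. "mujoco" for mujoco.yaml.
    expected_backend = PurePosixPath(owner_path).stem
    actual_backend = cfg.get("training", {}).get("sim_backend")
    if actual_backend != expected_backend:
        problems.append(
            f"training.sim_backend is {actual_backend!r}, expected {expected_backend!r}"
        )

    # "reward" is an assumed key name for the injected reward config.
    if uses_rewards and "reward" not in cfg:
        problems.append("task uses rewards but no reward config is injected")

    return problems
```

An empty list means the owner YAML satisfies both rules. A mismatch between the file name and training.sim_backend is reported as a violation rather than being treated as a backend override, which matches the rule that the field is an identity and not a switch.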

PPO owner example: conf/ppo/task/go2_joystick_flat/mujoco.yaml
APPO config root: conf/appo/config.yaml
Off-policy config root: conf/offpolicy/config.yaml
Off-policy task/algo guard: src/unilab/training/common.py
Config tests: tests/config/test_config_system.py, tests/scripts/test_train_script_configs.py, tests/envs/locomotion/g1/test_issue175_regression.py