
Hydra Config

UniLab composes its configuration with Hydra, built around task owner YAMLs. Each owner YAML is the single source of identity for a task: it defines the task, backend, reward, scene, and task-specific runtime fields.

Owner Paths

Stack	Owner YAML Shape
PPO	`conf/ppo/task/<task>/<backend>.yaml`
MLX PPO	`conf/ppo/task/<task>/<backend>.yaml` with `conf/ppo/config_mlx.yaml`
APPO	`conf/appo/task/<task>/<backend>.yaml`
SAC / TD3 / FlashSAC	`conf/offpolicy/task/<algo>/<task>/<backend>.yaml`
HIM-PPO	`conf/ppo_him/task/<task>/<backend>.yaml`
HORA distillation	`conf/hora_distill/task/<task>/<backend>.yaml`

Examples:

```bash
uv run train --algo ppo --task go2_joystick_flat --sim mujoco
uv run train --algo ppo --task go2_joystick_flat --sim motrix
uv run train --algo sac --task g1_walk_flat --sim mujoco
```

For off-policy algorithms, `--algo` selects the first owner-path segment under `conf/offpolicy/task/<algo>/`, so do not include the algorithm name in `--task`.
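The path rules above can be sketched as a small lookup. This is only an illustration of the table, not UniLab's actual resolver: `owner_yaml_path` is a hypothetical helper, and the CLI spellings `appo`, `td3`, and `flashsac` are assumptions (only `ppo` and `sac` appear in the examples).

```python
from pathlib import PurePosixPath

# Path roots copied from the Owner Paths table.
# Keys are assumed CLI names; only "ppo" and "sac" appear in the examples.
ON_POLICY_ROOTS = {
    "ppo": "conf/ppo/task",
    "appo": "conf/appo/task",
}
OFF_POLICY_ALGOS = {"sac", "td3", "flashsac"}  # assumed CLI spellings


def owner_yaml_path(algo: str, task: str, backend: str) -> PurePosixPath:
    """Return the owner YAML path implied by the table (hypothetical helper)."""
    if algo in OFF_POLICY_ALGOS:
        # Off-policy: the algorithm is its own path segment,
        # so the task name must not repeat it.
        return PurePosixPath("conf/offpolicy/task") / algo / task / f"{backend}.yaml"
    if algo in ON_POLICY_ROOTS:
        return PurePosixPath(ON_POLICY_ROOTS[algo]) / task / f"{backend}.yaml"
    raise ValueError(f"unknown algo: {algo!r}")


print(owner_yaml_path("ppo", "go2_joystick_flat", "mujoco"))
# conf/ppo/task/go2_joystick_flat/mujoco.yaml
print(owner_yaml_path("sac", "g1_walk_flat", "mujoco"))
# conf/offpolicy/task/sac/g1_walk_flat/mujoco.yaml
```

Keeping the algorithm as a separate path segment is what lets the same `--task` value be reused across off-policy algorithms.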

Safe Overrides

Hydra overrides can tune fields inside the selected owner path:

```bash
uv run train --algo ppo --task go2_joystick_flat --sim mujoco \
  algo.max_iterations=10 \
  algo.num_envs=128 \
  training.no_play=true
```

Common fields:

algo.max_iterations
algo.num_envs
algo.load_run
algo.seed
training.no_play
training.play_only
training.play_render_mode
training.logger
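To make the dotted field names concrete, here is a minimal sketch of how a `group.field=value` override maps onto a nested config. It is not Hydra's implementation (Hydra's override grammar is much richer); `parse_scalar` and `apply_override` are hypothetical helpers, and the starting values in `cfg` are placeholders rather than UniLab defaults.

```python
from typing import Any


def parse_scalar(raw: str) -> Any:
    """Rough scalar parsing for this sketch: booleans, ints, else string."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    try:
        return int(raw)
    except ValueError:
        return raw


def apply_override(cfg: dict[str, Any], override: str) -> None:
    """Set one `dotted.key=value` override on a nested dict, in place."""
    key, _, raw = override.partition("=")
    *parents, leaf = key.split(".")
    node = cfg
    for name in parents:
        # Walk down the nesting, creating intermediate groups if needed.
        node = node.setdefault(name, {})
    node[leaf] = parse_scalar(raw)


# Placeholder config standing in for a composed owner YAML.
cfg = {
    "algo": {"max_iterations": 1000, "num_envs": 4096},
    "training": {"no_play": False},
}
for item in ("algo.max_iterations=10", "algo.num_envs=128", "training.no_play=true"):
    apply_override(cfg, item)

print(cfg)
```

Each override only touches the leaf it names, which is why overrides of this kind leave the identity of the selected owner YAML intact.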

Inspect the Composed Config

To debug composition, append `--cfg job` to print the fully composed config without running training:

```bash
uv run train --algo ppo --task go2_joystick_flat --sim mujoco --cfg job
```

Backend Identity

`training.sim_backend` is an identity field set by the selected owner YAML, not an independent backend switch. To select a backend, use the unified CLI `--sim` flag rather than overriding this field.
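One way to picture the identity rule is as a consistency check between the composed config and the `--sim` value. The sketch below is illustrative only: `check_backend_identity` is a hypothetical helper, and it assumes the field holds the same string that is passed to `--sim` (for example `mujoco`). Whether UniLab performs such a check is not stated here.

```python
def check_backend_identity(cfg: dict, sim: str) -> None:
    """Raise if the composed config's backend identity disagrees with --sim.

    Hypothetical helper; assumes training.sim_backend stores the --sim string.
    """
    declared = cfg.get("training", {}).get("sim_backend")
    if declared != sim:
        raise ValueError(
            f"training.sim_backend is {declared!r} but --sim selected {sim!r}; "
            "choose the backend with --sim instead of overriding the field"
        )


# Matching identity: passes silently.
check_backend_identity({"training": {"sim_backend": "mujoco"}}, "mujoco")

# Mismatch, as would happen if the field were overridden by hand.
try:
    check_backend_identity({"training": {"sim_backend": "motrix"}}, "mujoco")
except ValueError as err:
    print(err)
```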

See the developer contract in Task Owner Config Contract.