Hydra Config
UniLab composes its configuration with Hydra, using task owner YAMLs. The owner YAML fixes the identity of the task, backend, reward, scene, and task-specific runtime fields.
Owner Paths
| Stack | Owner YAML Shape |
|---|---|
| PPO | |
| MLX PPO | |
| APPO | |
| SAC / TD3 / FlashSAC | |
| HIM-PPO | |
| HORA distillation | |
Examples:
uv run train --algo ppo --task go2_joystick_flat --sim mujoco
uv run train --algo ppo --task go2_joystick_flat --sim motrix
uv run train --algo sac --task g1_walk_flat --sim mujoco
For off-policy algorithms, --algo selects the first owner-path segment under
conf/offpolicy/task/<algo>/, so do not repeat the algorithm name in --task.
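The off-policy rule can be illustrated with a small sketch. It relies only on the directory prefix named in the text, conf/offpolicy/task/<algo>/. The helper names and the validation behaviour are hypothetical and are not UniLab's actual resolver.

```python
from pathlib import PurePosixPath

# Directory prefix taken from the text above; everything else here is a
# hypothetical illustration, not UniLab's real resolver.
OFFPOLICY_TASK_ROOT = PurePosixPath("conf/offpolicy/task")


def offpolicy_owner_dir(algo: str) -> PurePosixPath:
    """Return the directory that --algo selects for an off-policy run."""
    return OFFPOLICY_TASK_ROOT / algo


def check_task_name(algo: str, task: str) -> str:
    """Reject --task values that repeat the algorithm name as a leading segment."""
    first_segment = PurePosixPath(task).parts[0] if task else ""
    if first_segment == algo:
        raise ValueError(
            f"--task should not start with the algorithm name {algo!r}; "
            "pass it through --algo instead"
        )
    return task
```

With these helpers, check_task_name("sac", "g1_walk_flat") returns the task unchanged, while check_task_name("sac", "sac/g1_walk_flat") raises a ValueError.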
Safe Overrides
Hydra overrides can tune fields inside the selected owner path:
uv run train --algo ppo --task go2_joystick_flat --sim mujoco \
algo.max_iterations=10 \
algo.num_envs=128 \
training.no_play=true
Common fields:
- algo.max_iterations
- algo.num_envs
- algo.load_run
- algo.seed
- training.no_play
- training.play_only
- training.play_render_mode
- training.logger
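As a rough mental model of what a dotted override does, the sketch below applies key=value strings to a nested dictionary. It is a simplified stand-in written in plain Python, not Hydra's override parser. The default values are placeholders rather than UniLab's real defaults, and the choice to reject unknown keys is an assumption made to reflect that overrides tune existing fields.

```python
import copy
from typing import Any


def parse_value(raw: str) -> Any:
    """Convert an override value to bool, int, or float, falling back to str."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with each 'dotted.key=value' override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Only descend into groups that already exist, so an override can
            # tune a field but cannot create a new branch.
            if not isinstance(node.get(name), dict):
                raise KeyError(f"unknown config group {name!r} in {dotted_key!r}")
            node = node[name]
        if leaf not in node:
            raise KeyError(f"unknown config field {dotted_key!r}")
        node[leaf] = parse_value(raw_value)
    return result


# Placeholder defaults for illustration only.
base = {
    "algo": {"max_iterations": 1000, "num_envs": 4096},
    "training": {"no_play": False},
}
tuned = apply_overrides(
    base,
    ["algo.max_iterations=10", "algo.num_envs=128", "training.no_play=true"],
)
```

Here tuned carries the three overridden values while base is left untouched, which matches the idea that overrides adjust the composed config for one run without editing the owner YAML.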
Inspect the Composed Config
To debug composition, append --cfg job to print the fully composed config
without running training:
uv run train --algo ppo --task go2_joystick_flat --sim mujoco --cfg job
Backend Identity
training.sim_backend is an identity field set by the selected owner YAML. It
is not an independent backend switch. Use the unified CLI --sim flag to
select the backend.
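One way to picture the identity relationship is a consistency check between the composed config and the CLI selection. The function below is a hypothetical sketch, not part of UniLab. It assumes the composed config is available as a nested dictionary and that training.sim_backend holds the same string that is passed to --sim.

```python
def check_backend_identity(composed: dict, sim_flag: str) -> None:
    """Raise if training.sim_backend disagrees with the --sim selection.

    Hypothetical helper: assumes the owner YAML stores the backend name
    under training.sim_backend using the same spelling as the --sim flag.
    """
    declared = composed["training"]["sim_backend"]
    if declared != sim_flag:
        raise ValueError(
            f"owner YAML declares sim_backend={declared!r} but --sim is "
            f"{sim_flag!r}; change --sim rather than overriding the field"
        )


# A config selected with --sim mujoco passes the check silently.
composed_example = {"training": {"sim_backend": "mujoco"}}
check_backend_identity(composed_example, "mujoco")
```

Under this model, overriding training.sim_backend by hand would only create a mismatch; the owner YAML chosen through --sim remains what determines the backend.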
See the developer contract in Task Owner Config Contract.