CLI Reference¶
UniLab exposes package commands for common training and playback routes, and keeps the lower-level scripts available for debugging Hydra composition.
Unified Commands¶
Goal |
Command Shape |
Routed Script |
|---|---|---|
PPO |
|
|
MLX PPO |
|
|
APPO |
|
|
SAC |
|
|
TD3 |
|
|
FlashSAC |
|
|
Examples:
uv run train --algo ppo --task go2_joystick_flat --sim mujoco
uv run train --algo appo --task g1_motion_tracking --sim motrix training.no_play=true
uv run train --algo sac --task g1_walk_flat --sim mujoco training.no_play=true
uv run train --algo flashsac --task go2_joystick_flat --sim mujoco
The CLI builds the owner YAML path from --algo, --task, --sim, and
optional --profile. Route-defining values must use the CLI flags; Hydra
overrides after the command are for fields such as algo.max_iterations,
algo.num_envs, and training.no_play.
Per-Environment Invocation¶
The CLI prefix depends on the install flavor:
ROCm: run
make sync-rocmonce, then useuv run ....Intel XPU: use
uv run --no-sync ....
Tab Completion¶
Shell completion is optional. It only completes the uv run train / uv run eval
entrypoints, their flags, and some choices; it never changes command behavior.
On a fresh checkout, one setup command syncs the environment and installs the
completion:
make setup
# When you need Motrix:
make setup-motrix
make setup runs uv sync followed by uv run --no-sync unilab-complete install;
make setup-motrix runs uv sync --extra motrix followed by the same completion
install. The install command picks Bash or Zsh from $SHELL / platform and only
writes user-level rc files. The current shell is not auto-activated; reopen the
terminal or source the rc file to apply.
If make is unavailable, run the steps directly:
uv sync && uv run --no-sync unilab-complete install
Bash users (Linux / WSL) can instead add this to ~/.bashrc:
source scripts/completions/unilab.bash
Zsh users (default on macOS) can add this to ~/.zshrc:
autoload -Uz compinit
compinit
source scripts/completions/unilab.zsh
After reopening the terminal or sourcing the rc file, candidates appear for
uv run <TAB>, uv run train --algo <TAB>, and uv run train --sim <TAB>.
Evaluation¶
uv run eval sets training.play_only=true and optionally maps --load-run to
algo.load_run.
uv run eval --algo ppo --task go2_joystick_flat --sim mujoco --load-run -1
uv run eval --algo ppo --task go2_joystick_flat --sim motrix --load-run -1 \
--render-mode record
Supported render modes are auto, interactive, record, and none.
Demo¶
uv run demo dance
uv run demo wallflip
uv run demo boxtracking
uv run demo locomani
uv run demo inhandgrasp
uv run demo dance --refresh --device cpu
Available demos: teaser, dance, wallflip, boxtracking, locomani, inhandgrasp.
Each demo fetches a pre-trained checkpoint from the
unilabsim/unilab-checkpoints Hugging Face dataset on first run and caches it
under src/unilab/assets/checkpoints/<demo>/model_0.pt. Pass --refresh to
re-download.
Mainland China users: when huggingface.co is unreachable, switch to the
community mirror before running the demo:
export HF_ENDPOINT=https://hf-mirror.com
The demo entrypoint is implemented by src/unilab/demo.py and routed from
src/unilab/cli.py.
Low-Level Scripts¶
The lower-level scripts remain available when you need to inspect Hydra config groups or reproduce a script-level issue. For normal usage, keep route-defining values in the unified CLI flags above.
For off-policy routes, keep --algo aligned with the owner tree under
conf/offpolicy/task/<algo>/; do not include the algorithm name in --task.
Common Overrides¶
training.no_play=true
algo.max_iterations=10
algo.num_envs=128
algo.load_run=-1
training.logger=wandb
Use uv run eval ... --load-run ... for playback and --render-mode record
or --render-mode none for render behavior.
Backend selection belongs to the CLI --sim choice and the resulting owner
YAML. Do not use training.sim_backend=<backend> as a standalone switch.