
MLX PPO

MLX PPO reuses the PPO task-owner tree but swaps the training runtime for the MLX implementation. The entry script is scripts/train_mlx_ppo.py, the config is conf/ppo/config_mlx.yaml, and the implementation lives under src/unilab/algos/mlx/ppo/.

Quick Start

uv run train --algo mlx_ppo --task go2_joystick_flat --sim mujoco
uv run train --algo mlx_ppo --task go2_joystick_flat --sim motrix training.no_play=true

Notes

conf/ppo/config_mlx.yaml sets training.device=mlx.
The mlx dependency is enabled by the sys_platform == 'darwin' marker in pyproject.toml.
MLX compose coverage is tracked separately in the generated support matrix: Backend Support Matrix.
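
Because the mlx dependency is gated on sys_platform == 'darwin', a quick preflight check can tell you whether the MLX runtime is even installable before you launch a run. The sketch below is illustrative only and is not part of the repository: the helper name mlx_preflight is hypothetical, and it assumes the package is importable under the top-level name mlx.

```python
import importlib.util
import sys


def mlx_preflight(platform: str = sys.platform) -> tuple[bool, str]:
    """Return (ok, reason) for whether the MLX runtime looks usable here.

    Hypothetical helper for illustration; not part of the project.
    """
    # Mirrors the sys_platform == 'darwin' marker: off macOS the mlx
    # dependency is not installed, so there is nothing to import.
    if platform != "darwin":
        return False, f"mlx is only installed on darwin, found {platform!r}"
    # On macOS, confirm the package actually resolved in this environment.
    # Assumption: the package is importable as the top-level module "mlx".
    if importlib.util.find_spec("mlx") is None:
        return False, "mlx is not importable in this environment"
    return True, "mlx is available"


if __name__ == "__main__":
    ok, reason = mlx_preflight()
    print(reason)
```

If the check fails on a non-macOS host, the torch PPO path is the one to use.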

Use torch PPO when you need the default training path; use MLX PPO only when you are intentionally exercising the MLX runtime.