NaN Visualizer¶
PPO has a NaN guard under training.nan_guard in conf/ppo/config.yaml. When
enabled, scripts/train_rsl_rl.py installs NanGuard, checks observation dicts
and rewards, and writes a .npz dump plus model metadata when it detects
NaN/Inf values.
uv run train --algo ppo --task go2_joystick_flat --sim mujoco \
training.nan_guard.enabled=true \
training.nan_guard.output_dir=/tmp/unilab/nan_dumps
The viewer implementation is src/unilab/tools/viz_nan.py, registered as the
unilab-viz-nan console entry. It replays a dump path and lets you select the
environment index. Dump format and round-trip loading are covered by
tests/test_nan_guard.py.