Domain Randomization¶
This page only describes the current status of tasks in the repo that are already registered and already wired to a DR provider. All conclusions come from the code; nothing is inferred from design intent.
The current unified entry point lives in NpEnv._init_domain_randomization() and DomainRandomizationManager:
- init path: the task provider produces an `InitRandomizationPlan`; the manager calls the backend’s `apply_init_randomization(...)` during env initialization.
- reset path: the task provider produces a `ResetPlan`; the manager validates capability and then calls the backend’s `set_state(..., randomization=...)`.
- interval path: the task provider produces an `IntervalRandomizationPlan`; the manager calls the backend’s `apply_interval_randomization(...)` as needed before step.
These three paths correspond to three lifecycle classes:
- init-lifecycle DR: items that change the model identity or model geometry; they can only take effect during env/backend initialization and materialization, e.g. Sharpa-hand object `geom_size` scaling.
- reset-lifecycle DR: items that do not change model identity and only change parameters or reset state within the same model, e.g. `base_mass_delta`, `base_com_offset`, `gravity`, `kp`, `kd`.
- interval-lifecycle DR: external perturbations between steps, e.g. push.
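The three paths can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: the `Manager`, `RecordingBackend`, and plan dataclasses only borrow the hook names mentioned above (`apply_init_randomization`, `set_state`, `apply_interval_randomization`) and do not reproduce the repo's actual classes or signatures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InitPlan:
    """Hypothetical stand-in for InitRandomizationPlan."""
    geom_scales: list


@dataclass
class ResetPlan:
    """Hypothetical stand-in for ResetPlan."""
    randomization: dict


@dataclass
class IntervalPlan:
    """Hypothetical stand-in for IntervalRandomizationPlan."""
    push: Optional[list] = None


class RecordingBackend:
    """Hypothetical backend that only records which hook was called."""

    # Assumed capability set, chosen for this example only.
    reset_capabilities = frozenset({"kp", "kd", "gravity"})

    def __init__(self):
        self.calls = []

    def apply_init_randomization(self, plan):
        self.calls.append("init")

    def set_state(self, state, randomization=None):
        self.calls.append("reset")

    def apply_interval_randomization(self, plan):
        self.calls.append("interval")


class Manager:
    """Hypothetical stand-in for DomainRandomizationManager."""

    def __init__(self, backend):
        self.backend = backend

    def on_init(self, plan):
        # init path: applied once, while the env/backend is being built
        self.backend.apply_init_randomization(plan)

    def on_reset(self, state, plan):
        # reset path: validate capability first, then hand over the payload
        unsupported = set(plan.randomization) - self.backend.reset_capabilities
        if unsupported:
            raise ValueError(f"unsupported reset terms: {sorted(unsupported)}")
        self.backend.set_state(state, randomization=plan.randomization)

    def before_step(self, plan):
        # interval path: only dispatched when there is something to apply
        if plan.push is not None:
            self.backend.apply_interval_randomization(plan)


backend = RecordingBackend()
manager = Manager(backend)
manager.on_init(InitPlan(geom_scales=[0.9, 1.1]))
manager.on_reset(state={}, plan=ResetPlan(randomization={"kp": 1.0}))
manager.before_step(IntervalPlan(push=None))             # nothing to apply
manager.before_step(IntervalPlan(push=[1.0, 0.0, 0.0]))  # dispatched
print(backend.calls)  # ['init', 'reset', 'interval']
```

The point of the sketch is the separation of concerns: the plan objects carry what to randomize, the manager decides when and whether it is allowed, and the backend is the only component that applies it.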
Status Conclusions¶
- All tasks currently wired to a DR provider use the unified DR entry point; no task bypasses `DomainRandomizationManager` to run a separate DR flow inside `reset()`.
- They all follow roughly the same structure: task files define a `domain_rand` config dataclass, a `DomainRandomizationProvider`, and a `ResetPlan`; `G1WalkFlat` reuses `G1Walk`’s provider.
- What is “unified” today is mainly the entry point and execution flow, not every randomization item itself. The shared helper `build_common_reset_randomization()` currently generates `base_mass_delta`, `base_com_offset`, `gravity`, `kp`, `kd`; the shared interval helper currently only generates push.
- `ResetRandomizationPayload` can already express `gravity`, `body_iquat`, `body_inertia`, `kp`, `kd`, and `MuJoCoBackend` has declared support. Whether these are actually used still depends on whether the task provider samples and dispatches them.
- `MotrixBackend` currently supports `base_mass_delta`, `base_com_offset`, `kp`, `kd`, and interval push; it also requires all model actuators to be position actuators during initialization.
- `geom_size` is not a reset-lifecycle field; Sharpa-hand object geom scale is handled by init-lifecycle model materialization.
Uniformity Assessment Table¶
| Task | Uses unified DR entry? | Structured form? | reset form | interval form | Code |
|---|---|---|---|---|---|
| | Yes | Yes: | task state sampling + common payload | push | |
| | Yes | Yes: | task state sampling + common payload | push | |
| | Yes | Yes: | task state sampling + common payload | push | |
| | Yes | Yes: reuses | task state sampling + common payload | push | |
| | Yes | Yes: | extensive task-specific reset sampling + common payload | push | |
| | Yes | Yes: | task-specific reset sampling + common payload | none | |
| | Yes | Yes: | grasp cache sampling + common payload | object | |
| | Yes | Yes: reuses the Sharpa rotation provider and overrides reset sampling | grasp collection reset + common payload | none | |
Per-task Domain Randomization List¶
| Task | Currently implemented reset domain randomization | Currently implemented interval domain randomization | Default state |
|---|---|---|---|
| | base xy; base yaw; base qvel; command sampling; | | |
| | base xy; base yaw; base qvel; command sampling; | | kp/kd enabled by default; common payload and push disabled by default |
| | base xy; base yaw; base qvel sampled by | | kp/kd enabled by default; common payload and push disabled by default |
| | Same as | | kp/kd enabled by default; common payload and push disabled by default |
| | motion frame sampling; root pose perturbation | | |
| | If a grasp cache exists, sample a grasp randomly; otherwise apply | none | If the grasp cache path is available it is sampled by default; |
| | grasp cache bucketed sampling by | object | |
| | hand pose reset; object pose / quat reset; collects successful grasps and stores them bucketed by | none | Used by default to generate the Sharpa grasp cache; cache filename includes the single scale value; common payload disabled by default |
Current Unified DR Capabilities and Boundaries¶
1. Unified Entry Point Is Complete¶
The unified entry point is guaranteed by NpEnv and DomainRandomizationManager:
- Tasks only need to register a provider.
- The manager uniformly performs capability validation.
- The backend is uniformly responsible for actually applying the randomization payload.
So from an execution-path perspective, the tasks are already unified.
3. Backend Capabilities Already Exceed What Tasks Currently Use¶
ResetRandomizationPayload now contains:
- `base_mass_delta`
- `base_com_offset`
- `gravity`
- `body_iquat`
- `body_inertia`
- `kp`
- `kd`
Backend capability today:
- `MuJoCoBackend`: supports the 7 reset terms above, plus interval push and interval body force.
- `MotrixBackend`: supports `base_mass_delta`, `base_com_offset`, `kp`, `kd`, plus interval push; requires all actuators to be position actuators during initialization.
Notes:
- The current `IntervalRandomizationPlan` supports `push_perturbation_limit`, `body_linear_velocity_delta`, and `body_force`; among these, `body_force` expresses hot-path direct external-force perturbations without exposing the backend-private `xfrc_applied` details.
- The current MuJoCo backend’s interval push and interval body force are both dispatched through `xfrc_applied`; the Sharpa-hand object disturbance has been switched to direct force disturbance.
- The Motrix backend still does not support direct body-force disturbance, so task configs that use it must stay explicitly disabled under Motrix.
But on the task side, the current reality is: not every provider constructs these fields. The backend contract is the capability boundary; whether the task config and provider dispatch a payload is what determines whether a given task actually enables the corresponding DR item.
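The gap between what the payload can express and what each backend accepts can be made concrete with plain sets. The sketch below transcribes the capability lists from this section into two hypothetical constants and shows two ways to handle an unsupported term: dropping it or raising. The helper names (`unsupported_terms`, `filter_payload`, `validate_payload`) are illustrative assumptions, not functions from the repo.

```python
# Reset terms that ResetRandomizationPayload can express (from the list above).
RESET_FIELDS = (
    "base_mass_delta", "base_com_offset", "gravity",
    "body_iquat", "body_inertia", "kp", "kd",
)

# Capability sets transcribed from this page; the constant names are hypothetical.
MUJOCO_RESET = frozenset(RESET_FIELDS)
MOTRIX_RESET = frozenset({"base_mass_delta", "base_com_offset", "kp", "kd"})


def unsupported_terms(payload, capabilities):
    """Return the names of populated payload terms the backend cannot apply."""
    return sorted(
        name for name, value in payload.items()
        if value is not None and name not in capabilities
    )


def filter_payload(payload, capabilities):
    """Strategy A: silently keep only the populated terms the backend supports."""
    return {
        name: value for name, value in payload.items()
        if value is not None and name in capabilities
    }


def validate_payload(payload, capabilities):
    """Strategy B: fail loudly if any populated term is unsupported."""
    bad = unsupported_terms(payload, capabilities)
    if bad:
        raise ValueError(f"backend does not support reset terms: {bad}")
    return payload


payload = {"kp": 1.1, "gravity": (0.0, 0.0, -9.81), "body_inertia": None}
print(unsupported_terms(payload, MUJOCO_RESET))  # []
print(unsupported_terms(payload, MOTRIX_RESET))  # ['gravity']
print(filter_payload(payload, MOTRIX_RESET))     # {'kp': 1.1}
```

Terms left as `None` are treated as "not dispatched", which mirrors the point above: a backend capability only matters for a task once its provider actually fills in the corresponding field.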
Reset gravity Usage¶
gravity is a reset-lifecycle DR: on each reset, a full MuJoCo gravity vector (gx, gy, gz) is sampled per env subset and dispatched to the backend via ResetRandomizationPayload.gravity. This vector expresses both direction and magnitude:
- Direction: determined by the direction of `(gx, gy, gz)`.
- Magnitude: determined by the vector norm `sqrt(gx^2 + gy^2 + gz^2)`.
- Lifecycle: only sampled and written at reset; the env retains that gravity until the next reset re-samples it.
- Backend: currently in UniLab, only the MuJoCo backend declares support for this reset term; the Motrix backend does not. Some tasks filter it by capability and skip it; others raise an error in the validate stage.
The config entry is under each task’s env.domain_rand:
env:
  domain_rand:
    randomize_gravity: true
    gravity_range:
      - [-0.2, -0.2, -10.5]
      - [0.2, 0.2, -8.5]
Field semantics:
- `randomize_gravity`: whether to enable gravity reset DR; defaults to `false`.
- `gravity_range`: a `(2, 3)`-shaped per-dimension sampling range; the two rows give the bounds of each component, and their order does not matter.
- On each reset, each dimension is uniformly sampled within `[min(row0, row1), max(row0, row1)]`. The direction is not automatically normalized, and the gravity norm is not fixed.
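The sampling rule above can be sketched in a few lines of NumPy. The function name `sample_gravity` and its signature are assumptions made for this example; only the `(2, 3)` shape check and the per-dimension `[min, max]` uniform sampling come from the field semantics described here.

```python
import numpy as np


def sample_gravity(gravity_range, num_envs, rng):
    """Sample one gravity vector per env, uniformly and per dimension.

    Hypothetical helper: the row order of gravity_range does not matter
    because the bounds are taken as the element-wise min and max.
    """
    bounds = np.asarray(gravity_range, dtype=np.float64)
    if bounds.shape != (2, 3):
        raise ValueError(f"gravity_range must have shape (2, 3), got {bounds.shape}")
    low = bounds.min(axis=0)
    high = bounds.max(axis=0)
    return rng.uniform(low, high, size=(num_envs, 3))


rng = np.random.default_rng(0)
g = sample_gravity([[-0.2, -0.2, -10.5], [0.2, 0.2, -8.5]], num_envs=4, rng=rng)

# Direction and magnitude are both implied by the sampled vector.
magnitude = np.linalg.norm(g, axis=1)   # not fixed at 9.81
direction = g / magnitude[:, None]      # unit vectors, tilted off vertical
print(g.shape, magnitude.round(2))
```

Because each component is drawn independently, the norm varies from sample to sample; nothing in this scheme rescales the vector back to a fixed magnitude.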
If you only want to randomize the magnitude while keeping the vertical-down direction, only open up the z component:
uv run train --algo ppo --task g1_walk_flat --sim mujoco \
env.domain_rand.randomize_gravity=true \
'env.domain_rand.gravity_range=[[0.0,0.0,-10.5],[0.0,0.0,-8.5]]'
If you want to randomize both direction and magnitude, open up x/y/z:
uv run train --algo ppo --task g1_walk_flat --sim mujoco \
env.domain_rand.randomize_gravity=true \
'env.domain_rand.gravity_range=[[-0.3,-0.3,-10.5],[0.3,0.3,-8.5]]'
Notes:
- `gravity_range` must be convertible into a `(2, 3)` array; otherwise reset will raise an error when constructing the payload.
- This term does not call `mj_setConst`; MuJoCo step / forward reads `mjModel.opt.gravity` directly.
- Do not enable this term under the Motrix backend; the current Motrix capability does not include `gravity`.
- If your current environment still has a `mujoco-uni` package installed that does not include the `gravity` field, MuJoCo reset will raise an unsupported-field error; you need to use a `mujoco-uni` build/release that includes the field.
- During training it is recommended to start from a small tilt range; otherwise sampling too large a horizontal gravity early on may make the task unlearnable.
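To judge how aggressive a given `gravity_range` is before training, it can help to compute the worst-case tilt it allows. The helper below is a hypothetical convenience, not part of the repo. It assumes the z range stays strictly negative, in which case the largest angle from vertical occurs at the corner with the largest horizontal components and the weakest vertical component.

```python
import math


def max_tilt_deg(gravity_range):
    """Worst-case angle (degrees) between sampled gravity and straight down.

    Hypothetical helper; assumes gz is strictly negative over the whole range.
    """
    (x0, y0, z0), (x1, y1, z1) = gravity_range
    if max(z0, z1) >= 0.0:
        raise ValueError("this sketch assumes gz stays strictly negative")
    # Largest horizontal magnitude reachable inside the sampling box.
    horizontal = math.hypot(max(abs(x0), abs(x1)), max(abs(y0), abs(y1)))
    # Weakest vertical component gives the largest tilt for a given horizontal pull.
    weakest_gz = min(abs(z0), abs(z1))
    return math.degrees(math.atan2(horizontal, weakest_gz))


# Magnitude-only range from the first command above: no tilt at all.
print(max_tilt_deg([[0.0, 0.0, -10.5], [0.0, 0.0, -8.5]]))

# Direction-and-magnitude range from the second command: just under 3 degrees.
print(round(max_tilt_deg([[-0.3, -0.3, -10.5], [0.3, 0.3, -8.5]]), 2))
```

A range that looks small in absolute terms can still tilt gravity noticeably when the vertical bound is weak, so checking the angle is a cheap sanity test before widening x/y.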
Interval push Usage¶
Tasks supporting interval push configure it under env.domain_rand:
env:
  domain_rand:
    push_robots: true
    push_interval: 750
    max_force: [1.0, 1.0, 0.5]
    push_body_name: null
- `push_robots`: whether to enable push.
- `push_interval`: trigger every N env steps.
- `max_force`: a length-3 external-force upper limit; each dimension is sampled within `[-max_force, max_force]`.
- `push_body_name`: the target body / link to apply the force to. Defaults to `null`, meaning the backend’s `base_name` is used.
uv run train --algo ppo --task g1_walk_flat --sim mujoco \
env.domain_rand.push_robots=true \
env.domain_rand.push_interval=500 \
'env.domain_rand.max_force=[20.0,20.0,5.0]' \
env.domain_rand.push_body_name=torso_link
Notes:
- MuJoCo resolves by body name, Motrix resolves by link name; a missing name raises an error during env/backend initialization.
- `push_body_name` is an init config; changing it after env creation does not change the already-resolved target.
- The hot path only samples and applies the external force; it does not parse XML / asset and does not probe backend-private capability.
- MuJoCo push is implemented via `xfrc_applied` external force and does not directly overwrite base velocity.
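The push semantics described in this section can be sketched as a small sampling function. The name `maybe_sample_push` is hypothetical. The single shared step counter and the "trigger on exact multiples of `push_interval`" rule are simplifying assumptions for illustration, since the actual scheduling may be tracked per env.

```python
import numpy as np


def maybe_sample_push(step_count, push_interval, max_force, num_envs, rng):
    """Return a (num_envs, 3) force array on trigger steps, else None.

    Hypothetical helper: each component is drawn uniformly from
    [-max_force[i], max_force[i]], matching the field semantics above.
    """
    if push_interval <= 0:
        raise ValueError("push_interval must be a positive number of steps")
    if step_count <= 0 or step_count % push_interval != 0:
        return None
    limit = np.abs(np.asarray(max_force, dtype=np.float64))
    if limit.shape != (3,):
        raise ValueError(f"max_force must have length 3, got shape {limit.shape}")
    return rng.uniform(-limit, limit, size=(num_envs, 3))


rng = np.random.default_rng(0)
max_force = [20.0, 20.0, 5.0]

# Off-interval step: nothing is sampled, nothing is applied.
print(maybe_sample_push(499, 500, max_force, num_envs=8, rng=rng))

# Trigger step: one force vector per env, bounded per component.
forces = maybe_sample_push(500, 500, max_force, num_envs=8, rng=rng)
print(forces.shape)  # (8, 3)
```

Note that the function only produces forces. Which body or link receives them is resolved once at initialization from `push_body_name`, so nothing here needs to look up names or assets.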
geom_size Lifecycle Boundary¶
geom_size is explicitly not part of ResetRandomizationPayload, and must not be modified on the hot path via BatchEnvPool.reset(..., randomization=...).
The reason is that geom_size changes model geometry and model identity; the correct lifecycle is:
1. The task provider generates the model variants and env-to-model assignment in `build_init_randomization_plan(...)`.
2. The MuJoCo backend modifies geom size on the cold path using `MjSpec` and compiles scale-specific `MjModel`s.
3. The backend constructs `BatchEnvPool` with a model sequence of length `num_envs`.

The reset stage only performs state and parameter perturbations within the same model identity; it does not handle `geom_size`.
This boundary exists to honor the cold-path asset/model-metadata access principle: step(), reset(), and hot-path DR do not parse XML, do not read assets, and do not branch at runtime based on asset metadata.
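The cold-path flow can be illustrated without MuJoCo at all. In the sketch below, `InitPlanSketch`, `build_init_plan`, and `materialize` are hypothetical names. The round-robin assignment is just one possible policy, and `compile_model` is a placeholder for whatever the backend does with `MjSpec` to produce a scale-specific model.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class InitPlanSketch:
    """Hypothetical stand-in for an init randomization plan."""
    geom_scales: tuple        # one entry per model variant
    env_to_model: np.ndarray  # shape (num_envs,), index into geom_scales


def build_init_plan(scales, num_envs):
    """Assign envs to model variants round-robin (an assumed policy)."""
    env_to_model = np.arange(num_envs) % len(scales)
    return InitPlanSketch(tuple(scales), env_to_model)


def materialize(plan, compile_model):
    """Compile each variant once, then expand to one model per env."""
    variants = [compile_model(scale) for scale in plan.geom_scales]
    return [variants[index] for index in plan.env_to_model]


plan = build_init_plan([0.9, 1.0, 1.1], num_envs=5)

# Placeholder "compiler": a real backend would build and compile a model here.
models = materialize(plan, compile_model=lambda scale: {"geom_scale": scale})

print(len(models))                         # 5, i.e. one entry per env
print([m["geom_scale"] for m in models])   # [0.9, 1.0, 1.1, 0.9, 1.0]
print(models[0] is models[3])              # True: envs share a compiled variant
```

Compilation happens once per distinct scale rather than once per env, and the resulting sequence is fixed before any `step()` or `reset()` runs, which is what keeps geometry changes off the hot path.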