MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo

Yufei Jia*, Junzhe Wu — Tsinghua University

Abstract

We present MuJoCoUni, a downstream MuJoCo distribution for online robot learning and batched physics evaluation. Alongside the open-loop batched trajectory generation already provided by upstream mujoco.rollout, MuJoCoUni supplies runtime primitives for stateful environment execution. Its core object, BatchEnvPool, is a C++/pybind11 executor that owns per-environment mjModel copies, per-thread mjData workers, and an internal thread pool. It provides final-state-only short stepping, sparse reset, reset-lifecycle domain randomization, batched sensor forward evaluation without advancing dynamics, and batched Jacobian and height-field queries. The implementation is confined to the Python binding layer; MuJoCo's solver, contact model, integrator, and core source tree retain upstream semantics. This report describes the BatchEnvPool API, implementation boundary, relationship to rollout, and the validation and benchmark scripts shipped with mujoco-uni (pip install mujoco-uni).

1. Introduction

Robot-learning systems increasingly place the physics simulator inside the training loop. The runtime sends batched controls, advances a short time window, reads sensors and task state, and resets only terminated environments. MuJoCo already provides mature XML/MJB assets, sensors, contact solving, and debugging tools. When these fine-grained operations run at high frequency, however, interface overhead, object lifetime, and output shape directly affect training efficiency.

GPU-resident simulators and GPU-oriented MuJoCo backends are important paths for efficient training. When a task also needs upstream CPU MuJoCo behavior for models, sensors, contact or constraint handling, or debugging, a CPU-side batched runtime provides a complementary route.

Upstream MuJoCo already provides batched stepping through the official mujoco.rollout interface. It uses a C++ thread pool to run open-loop mj_step from many initial states and returns full state and sensor trajectories. Importantly, the persistence in rollout is limited to optional thread-pool reuse; environment models, data, state updates, reset semantics, and randomization lifecycles remain external to the call.

Online robot RL also needs an environment-runtime interface. The runtime should preserve environments and model variants across calls, return only the final state after short stepping windows, and apply sparse reset with domain randomization for terminated environments. Observation and control computation further need batched sensor forward passes, site Jacobians, and local terrain-height queries without advancing dynamics.

MuJoCoUni is a lightweight downstream distribution of MuJoCo with additions concentrated in the Python binding layer. Its core object, BatchEnvPool, creates per-environment mjModel copies, per-thread mjData workers, and an internal thread pool. It exposes step, forward, reset, compute_site_jacobians, and sample_hfield_height; MuJoCo's physics kernel and solver are unchanged.

The contribution of this report is engineering-oriented. We describe the complementary relationship between MuJoCoUni and upstream rollout, present the persistent environment pool and reset/forward/query primitives, and summarize the repository scripts for numerical parity, field-patching tests, and micro-benchmarks.

2. Related Work

2.1 Upstream MuJoCo Batching

The closest interface to MuJoCoUni is MuJoCo's upstream mujoco.rollout module. rollout generates open-loop trajectories from a batch of initial states and control sequences, supports single-threaded or thread-pool execution, and returns state and sensor arrays with shape nbatch × nstep × dim. The center of the rollout abstraction is "generate a full trajectory from input tensors"; the center of the MuJoCoUni abstraction is "maintain a repeatedly interactive environment pool."

rollout fits full-trajectory tasks such as planning, system identification, and trajectory optimization. BatchEnvPool is complementary when tasks need per-environment models to persist across calls, short steps to return only final states, sparse reset-time patches, or batched current-state queries.
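To make the difference in output shape concrete, the sketch below compares the bytes returned per call by a full-trajectory interface (nbatch × nstep × dim) and by a final-state-only interface (nbatch × dim). The function name and the numbers (4096 environments, 10 steps, 64 state entries, float64) are hypothetical and chosen only for illustration; they are not measurements from this report.

```python
def output_bytes(nbatch: int, nstep: int, nstate: int, itemsize: int = 8) -> dict:
    """Bytes returned per call under the two output conventions (float64 by default)."""
    full_trajectory = nbatch * nstep * nstate * itemsize  # shape: nbatch x nstep x dim
    final_state_only = nbatch * nstate * itemsize         # shape: nbatch x dim
    return {"full_trajectory": full_trajectory, "final_state_only": final_state_only}


# Hypothetical sizes, for illustration only.
sizes = output_bytes(nbatch=4096, nstep=10, nstate=64)
ratio = sizes["full_trajectory"] // sizes["final_state_only"]
print(sizes, ratio)
```

The ratio equals nstep, so the saving from returning only the final state grows with the length of the stepping window.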

2.2 Vectorized Environment Runtimes

Vectorized environment runtimes organize many environments behind one interface and are a common engineering layer in RL systems. EnvPool demonstrates the value of moving environment execution into a high-performance C++ runtime, and robot benchmarks such as ManiSkill expose batched task interfaces. MotrixSim shows a systems route that combines CPU-parallel simulation with reinforcement-learning algorithms for robot policy training. MuJoCoUni occupies a lower-level position: it extends the MuJoCo binding layer for systems that need the standard mjModel workflow, persistent model pools, reset-time domain randomization, and batched physics queries.

2.3 GPU-Resident Physics

Brax implements a vectorizable and differentiable physics kernel in JAX; MJX maps a subset of MuJoCo to JAX; Isaac Gym and Isaac Lab provide NVIDIA GPU-resident simulation through PhysX; Genesis and MuJoCo Warp also target GPU-side physics execution. These systems can provide high throughput at large parallel scales, but GPU paths typically require models, contact and constraint handling, and data layout to fit an accelerator-friendly execution model.

MuJoCoUni takes a complementary route. It preserves MuJoCo CPU physics semantics and concentrates batched execution plus common robot-task queries in the C++ binding layer. It does not claim to replace GPU-resident simulation; it is a CPU-batched backend for MuJoCo workloads where feature coverage matters more than accelerator residency.

2.4 Domain Randomization

Domain randomization is a basic technique for sim-to-real training and robust policy search. Standard MuJoCo Python workflows typically copy or mutate mjModel fields and call mj_setConst when required. MuJoCoUni moves common field patches into BatchEnvPool.reset, so sparse reset can handle both state reset and per-environment randomization.

2.5 Evolutionary and Optimization Workloads

Evolutionary computing, neuroevolution, and model search also rely on large numbers of physics evaluations. MuJoCoUni's persistent model pools, model-variant initialization, and final-state return semantics fit workloads that evaluate many candidate bodies or controllers in parallel.

3. System Design and API

**Architecture.** BatchEnvPool maintains persistent model and worker resources behind the Python interface and executes batched operations through standard MuJoCo calls without modifying the physics kernel.

3.1 Design Boundary

MuJoCoUni has a narrow design boundary: it adds a batched runtime inside the MuJoCo Python package without changing the physics kernel. Throughput improvements come from object lifetime, thread scheduling, and batched interfaces rather than from reducing the MuJoCo physics feature set. The core additions are batch_env.cc and batch_env.py.

3.2 Pool Construction

BatchEnvPool(model, *, nbatch, nthread=None) accepts either one MjModel or a compatible model sequence of length 1 or nbatch. The constructor creates one model copy per environment with mj_copyModel and one mjData per worker thread. When nthread > 0, an internal thread pool assigns chunks of environment indices to workers.
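The report does not specify how environment indices are chunked across workers. As one plausible scheme, the sketch below splits nbatch indices into contiguous, near-equal ranges, one per worker. The function name `chunk_env_indices` and the balancing rule are assumptions for illustration, not the policy used in batch_env.cc.

```python
def chunk_env_indices(nbatch: int, nthread: int) -> list[range]:
    """Split range(nbatch) into nthread contiguous chunks whose sizes differ by at most one."""
    if nbatch < 0 or nthread <= 0:
        raise ValueError("nbatch must be >= 0 and nthread must be > 0")
    base, extra = divmod(nbatch, nthread)
    chunks = []
    start = 0
    for worker in range(nthread):
        # The first `extra` workers take one additional environment each.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


chunks = chunk_env_indices(nbatch=10, nthread=4)
print([list(c) for c in chunks])
```

Because each worker owns a single mjData, a contiguous chunk lets that worker reuse its scratch data for every environment in its range.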

This supports parameter-level randomization through reset-time field patches and geometry-level randomization through precompiled MjModel variants (link lengths, mesh scales, collision geometry).

3.3 Execution Primitives

**Table 1:** Core BatchEnvPool primitives. N = nbatch.

| Primitive | Input | Output / Purpose |
| --- | --- | --- |
| `step` | (N, nstate), nstep, control | Final state (N, nstate); optional sensordata |
| `forward` | (N, nstate) | Sensordata (N, nsensordata) without advancing dynamics |
| `reset` | env_ids, states, randomization | Reset state/sensors for selected environments |
| `compute_site_jacobians` | state, site ids | Batched translational/rotational Jacobians |
| `sample_hfield_height` | state, geom id, XY offsets | Batched terrain heights or clearances |

Batched stepping. step(initial_state, nstep, control=None) runs mj_step for nstep steps on every environment and returns only the final state. Controls have shape (N, nstep, ncontrol). With return_sensor=True, final-step sensordata is also returned.

Forward evaluation. forward(initial_state) runs one mj_forward over all environments and returns sensors without advancing dynamics.

Sparse reset. reset(env_ids, initial_state, randomization=None) acts only on selected environments. Cost scales with the number of terminated environments.
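The sparse-reset semantics can be illustrated without MuJoCo. In the toy sketch below, only the rows named in env_ids are overwritten, and the work done is proportional to the number of ids passed in. The helper `sparse_reset` and the nested-list state layout are hypothetical stand-ins, not the BatchEnvPool implementation.

```python
def sparse_reset(states, env_ids, initial_state):
    """Overwrite only the rows listed in env_ids; all other environments are left untouched."""
    if len(env_ids) != len(initial_state):
        raise ValueError("need one initial state per selected environment")
    for row, env_id in enumerate(env_ids):
        states[env_id] = list(initial_state[row])
    return states


# Five toy environments with a two-entry state each.
states = [[i + 1.0, 0.5] for i in range(5)]
sparse_reset(states, env_ids=[1, 3], initial_state=[[0.0, 0.0], [0.0, 0.0]])
print(states)
```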

Site Jacobians. compute_site_jacobians computes jacp and/or jacr for one or more sites. Output shape: (N, K, 3, nv).
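A typical consumer of the (N, K, 3, nv) output is the mapping from joint velocities to site velocities, v = J_p q̇. The sketch below applies that product to nested lists with the stated shape. It assumes a joint-velocity array of shape (N, nv) and uses a hypothetical helper name; it does not call MuJoCo or BatchEnvPool.

```python
def site_linear_velocities(jacp, qvel):
    """jacp has shape (N, K, 3, nv) and qvel has shape (N, nv); returns shape (N, K, 3)."""
    out = []
    for env_jac, env_qvel in zip(jacp, qvel):
        env_out = []
        for site_jac in env_jac:
            # Each of the 3 Jacobian rows dotted with the joint velocities.
            env_out.append([sum(j * v for j, v in zip(row, env_qvel)) for row in site_jac])
        out.append(env_out)
    return out


# One environment, one site, nv = 2 (toy numbers).
jacp = [[[[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]]]
qvel = [[3.0, 4.0]]
velocities = site_linear_velocities(jacp, qvel)
print(velocities)
```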

Height-field sampling. sample_hfield_height bilinearly samples a MuJoCo hfield geom. Output is terrain height or clearance.
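For readers unfamiliar with bilinear sampling, the following is a minimal sketch on a unit-spaced grid. It deliberately ignores MuJoCo's hfield scaling, geom pose, and frame alignment, and the function names are hypothetical; it only shows the interpolation step and how a clearance follows from a height.

```python
import math


def bilinear_height(heights, x, y):
    """Interpolate heights[row][col] on a unit grid (at least 2x2); x indexes columns, y rows."""
    nrow, ncol = len(heights), len(heights[0])
    # Clamp the query to the grid so edge samples stay well defined.
    x = min(max(x, 0.0), ncol - 1.0)
    y = min(max(y, 0.0), nrow - 1.0)
    c0 = min(int(math.floor(x)), ncol - 2)
    r0 = min(int(math.floor(y)), nrow - 2)
    tx, ty = x - c0, y - r0
    top = (1.0 - tx) * heights[r0][c0] + tx * heights[r0][c0 + 1]
    bottom = (1.0 - tx) * heights[r0 + 1][c0] + tx * heights[r0 + 1][c0 + 1]
    return (1.0 - ty) * top + ty * bottom


def clearance(frame_z, heights, x, y):
    """Vertical distance from a frame at height frame_z to the terrain below it."""
    return frame_z - bilinear_height(heights, x, y)


grid = [[0.0, 1.0], [2.0, 3.0]]
center = bilinear_height(grid, 0.5, 0.5)
corner = bilinear_height(grid, 1.0, 0.0)
gap = clearance(2.0, grid, 0.5, 0.5)
print(center, corner, gap)
```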

3.4 Reset-Time Domain Randomization

The reset randomization payload is a dictionary mapping field names to float64 arrays. Fields that require a refresh trigger mj_setConst on the affected environments after patching.

**Table 2:** Supported reset-lifecycle model patches in BatchEnvPool.

| Field | mj_setConst | Use case |
| --- | --- | --- |
| `body_mass` | yes | Body mass and payload randomization |
| `body_ipos` | yes | Inertial-frame COM offsets |
| `body_iquat` | yes | Inertial-frame orientation perturbations |
| `body_inertia` | yes | Inertia tensor randomization |
| `dof_armature` | yes | Joint armature perturbations |
| `gravity` | no | Per-env gravity vectors |
| `geom_friction` | no | Contact friction randomization |
| `kp`, `kd` | no | Position-actuator gain randomization |
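The refresh rule in Table 2 can be written as a small lookup. The sketch below encodes the table and reports whether a given payload would need an mj_setConst refresh. The helper name, the error handling, and the array shapes in the example payload are assumptions; the report only states that the payload maps field names to float64 arrays.

```python
# Table 2 as a lookup: field name -> whether patching it requires mj_setConst.
NEEDS_SET_CONST = {
    "body_mass": True,
    "body_ipos": True,
    "body_iquat": True,
    "body_inertia": True,
    "dof_armature": True,
    "gravity": False,
    "geom_friction": False,
    "kp": False,
    "kd": False,
}


def requires_set_const(randomization):
    """Return True if any patched field in the payload needs an mj_setConst refresh."""
    unknown = sorted(set(randomization) - set(NEEDS_SET_CONST))
    if unknown:
        raise KeyError(f"unsupported randomization fields: {unknown}")
    return any(NEEDS_SET_CONST[name] for name in randomization)


# Hypothetical payloads for two environments; the shapes are illustrative only.
gravity_only = {"gravity": [[0.0, 0.0, -9.81], [0.0, 0.0, -9.6]]}
with_mass = {"gravity": gravity_only["gravity"], "body_mass": [[1.0, 2.0], [1.1, 2.2]]}
print(requires_set_const(gravity_only), requires_set_const(with_mass))
```

Grouping patches this way means a reset that only changes gravity, friction, or actuator gains can skip the refresh entirely.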

4. Validation and Benchmarks

This section reports MuJoCoUni benchmarks on four MuJoCo models compiled with the discardvisual compiler option. All data were collected on an Intel i9-14900HX running Ubuntu 20.04 with MuJoCoUni 3.8.0, Python 3.13, NumPy 2.4, and 16 simulation threads.

4.1 Step and Forward Throughput

The four test models are Unitree Go1 (18 DoF), Wonik Allegro (16 DoF), Franka Panda (9 DoF), and CMU Humanoid (56 DoF). Throughput saturates around 256–512 environments. At saturation, Allegro reaches ~1.8M steps/s, Go1 ~1.2M, Franka ~410k, and Humanoid ~290k.

**Figure 1a:** Four robot models used in benchmarks.

**Figure 1b:** Step and forward throughput.

4.2 Model-Variant Overhead

When each environment owns a distinct mjModel copy, cache locality decreases slightly. At saturation (256–512 environments) the throughput gap relative to a single shared model closes, and the two configurations perform essentially identically.

**Figure 2:** Step throughput: single shared model vs. per-environment model variants.

4.3 Reset Performance

At 4096 environments, the C++ path completes a full reset in 3.5 ms vs. 53 ms for a Python loop, a ~15× speedup. For partial resets, C++ reset latency scales linearly with the fraction of environments being reset.

**Figure 3:** Reset latency on Go1. Left: full reset across environment counts. Right: partial reset at 4096 environments.

4.4 Batched Jacobian Performance

The C++ pool computes Jacobians for 4096 environments in 0.53 ms vs. 11.9 ms for a Python loop — a ~22× speedup.

4.5 Height-Field Sampling Performance

At 4096 environments, the C++ path takes 0.52 ms vs. 290 ms for a Python loop — a ~555× speedup.

**Figure 5a:** Stairs height-field terrain.

**Figure 5b:** Height-field sampling time.

5. Applications

5.1 Robot Reinforcement Learning

Robot RL is the primary target workload. BatchEnvPool gathers short-horizon stepping, sensor reads, sparse reset, reset-time domain randomization, and current-state queries into one MuJoCo-side object. Downstream systems can consume final states through synchronous batch sampling, asynchronous collection, or offline data generation.
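The interaction pattern described above (short final-state-only step, termination check, sparse reset) can be sketched with a toy one-dimensional system in place of MuJoCo. Everything here is a hypothetical stand-in: `toy_step` mimics only the calling convention of a final-state-only step, and the termination rule and policy are invented for the example.

```python
DT = 0.1  # toy integration step


def toy_step(state, nstep, control):
    """Stand-in for a final-state-only step: integrate x += DT * u and return only the last state."""
    final = []
    for x, u_seq in zip(state, control):
        for k in range(nstep):
            x += DT * u_seq[k]
        final.append(x)
    return final


def collect(state, policy, nstep, iterations, limit):
    """Run a short-window control loop and reset only the environments that terminated."""
    resets = 0
    for _ in range(iterations):
        # One control value per environment, held for the whole window: shape (N, nstep).
        control = [[policy(x)] * nstep for x in state]
        state = toy_step(state, nstep, control)
        done_ids = [i for i, x in enumerate(state) if abs(x) > limit]
        for i in done_ids:  # sparse reset: untouched environments keep their state
            state[i] = 0.0
        resets += len(done_ids)
    return state, resets


final_state, n_resets = collect([0.0, 0.9], policy=lambda x: 1.0, nstep=2, iterations=1, limit=1.0)
print(final_state, n_resets)
```

In this run only the second environment crosses the limit, so it alone is reset while the first keeps its advanced state.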

5.2 Sim-to-Real Domain Randomization

MuJoCoUni places common MuJoCo field patches and required mj_setConst refreshes inside reset; geometry-level changes are represented by precompiled model variants at construction time.

5.3 Terrain-Aware Locomotion

sample_hfield_height samples MuJoCo hfield data in batch, supports yaw/world/body alignment, and returns either terrain height or frame clearance.

5.4 Manipulation and Kinematic Control

compute_site_jacobians runs the minimal kinematic prefix over the full pool and calls mj_jacSite in batch, supporting operational-space control, reward computation, constraint checks, and IK auxiliary objectives.

5.5 Batch Optimization

MuJoCoUni's persistent model pools, model-variant initialization, and final-state-only step fit evaluation loops whose objective depends on final state, terminal events, or aggregated rewards.

6. Discussion

6.1 Runtime Boundary and Tradeoffs

Per-environment model copies increase memory use, geometry-level randomization requires precompiled compatible models, and reset-time field patching covers the currently registered field set. The corresponding benefits are clear object ownership, lower-frequency Python interaction, and simulator-side interfaces embeddable in different systems.

6.2 System Context

GPU-resident simulation and CUDA stacks (Isaac Gym, Isaac Lab, MuJoCo Playground, Genesis) demonstrate large-scale GPU-parallel training efficiency. CPU MuJoCo preserves mature XML/MJB assets, sensors, debugging, and visualization workflows. For workloads needing full MuJoCo feature coverage, cross-platform deployment, or reuse of existing assets, MuJoCoUni provides a concrete CPU-batched engineering path.

6.3 Availability

MuJoCoUni is released as the open-source mujoco-uni Python package with unit tests and parity checks. The benchmark code is at github.com/unilabsim/mujoco_uni_bench.

References

[1] Emanuel Todorov, Tom Erez, and Yuval Tassa. MuJoCo: A physics engine for model-based control. IROS, 2012.
[2] Jiayi Weng et al. EnvPool: A highly parallel reinforcement learning environment execution engine. NeurIPS, 2022.
[3] Stone Tao et al. ManiSkill3: GPU parallelized robotics simulation and rendering for generalizable embodied AI. RSS, 2025.
[4] Yufei Jia et al. GS-Playground: A high-throughput photorealistic simulator for vision-informed robot learning. arXiv:2604.25459, 2026.
[5] C. Daniel Freeman et al. Brax — A differentiable physics engine for large scale rigid body simulation. arXiv:2106.13281, 2021.
[6] MuJoCo XLA Authors. MuJoCo XLA (MJX), 2024.
[7] Viktor Makoviychuk et al. Isaac Gym: High performance GPU-based physics simulation for robot learning. arXiv:2108.10470, 2021.
[8] Mayank Mittal et al. Isaac Lab: A GPU-accelerated simulation framework for multi-modal robot learning. arXiv:2511.04831, 2025.
[9] Genesis Authors. Genesis: A generative and universal physics engine for robotics and beyond, 2024.
[10] Google DeepMind and NVIDIA. MuJoCo Warp: GPU-optimized MuJoCo, 2025.
[11] Rustam Eynaliyev and Houcen Liu. Combining GPU and CPU for accelerating evolutionary computing workloads. arXiv:2502.11129, 2025.
[12] Kevin Zakka et al. MuJoCo Playground. arXiv:2502.08844, 2025.

Citation

@article{jia2026mujocouni,
  title={MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo},
  author={Jia, Yufei and Wu, Junzhe},
  journal={arXiv preprint arXiv:2605.24922},
  year={2026}
}
