unilab.ipc — Shared-Memory Runtime

unilab.ipc bridges the CPU simulation workers and the GPU learner. Every submodule here is a building block of the async runner that powers APPO, FastSAC, FastTD3, and FlashSAC.

Submodule             Role
--------------------  -------------------------------------------------------------
async_runner          The high-level orchestration loop
shared_buffer         NumPy-backed shared-memory ring/buffer
rollout_ring_buffer   Rollout window used by on-policy collectors
replay_buffer         Off-policy replay backed by shared memory
replay_pipelines.*    Host-to-device staging (CPU-pinned double buffer, native h2d)
shared_obs_stats      Running mean/std shared across workers
weight_sync           Push learner weights back to workers
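
To make "NumPy-backed shared memory" concrete, here is a minimal sketch built only on the standard library's multiprocessing.shared_memory and NumPy. It is not the shared_buffer implementation; the shape, dtype, and variable names are assumptions chosen for illustration. Both sides run in one process here to keep the example self-contained; a real worker would attach by name from another process.

```python
import numpy as np
from multiprocessing import shared_memory

# Hypothetical layout: 8 slots of 4 float32 values each.
shape, dtype = (8, 4), np.float32
nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize

# Owner side: allocate the block and wrap it in an array view (no copy).
shm = shared_memory.SharedMemory(create=True, size=nbytes)
writer = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
writer[:] = 0.0
writer[3] = [1.0, 2.0, 3.0, 4.0]

# Worker side: attach to the same block by name and read it through a view.
peer = shared_memory.SharedMemory(name=shm.name)
reader = np.ndarray(shape, dtype=dtype, buffer=peer.buf)
row = reader[3].copy()  # copy out so the data outlives the mapping

# Drop the array views before closing, then unlink exactly once (owner only).
del writer, reader
peer.close()
shm.close()
shm.unlink()
```

The explicit cleanup order matters: closing a block while an array view still references its buffer raises BufferError, and a block that is never unlinked stays allocated after the process exits.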

unilab.ipc

IPC primitives for multi-process RL training.

Async runner

Base async runner for multi-process RL training.

class unilab.ipc.async_runner.AsyncRunner

Bases: ABC

Base class for async RL algorithms.

Manages:
  • Shared memory allocation/cleanup
  • Collector process lifecycle
  • Error propagation from collector subprocess
  • Training loop skeleton
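
One common way to propagate errors from a collector is for the child to send its formatted traceback over a queue that the parent polls on each training iteration. The sketch below illustrates that pattern only; it is not taken from AsyncRunner, and CollectorError, collector_main, and raise_if_collector_failed are hypothetical names. It is exercised with an in-process queue.Queue so it runs anywhere; a multiprocessing.Queue exposes the same put/get_nowait interface and raises the same queue.Empty.

```python
import queue
import traceback


class CollectorError(RuntimeError):
    """Raised in the parent when the collector reported a failure (hypothetical name)."""


def collector_main(err_queue, step_fn, num_steps):
    """Collector body: run steps and report a failure instead of dying silently."""
    try:
        for i in range(num_steps):
            step_fn(i)
    except Exception:
        # Send the formatted traceback; plain strings pickle reliably.
        err_queue.put(traceback.format_exc())


def raise_if_collector_failed(err_queue):
    """Non-blocking check the learner loop can call once per iteration."""
    try:
        tb = err_queue.get_nowait()
    except queue.Empty:
        return
    raise CollectorError("collector failed:\n" + tb)


def failing_step(i):
    # Simulated environment crash on the third step.
    if i == 2:
        raise ValueError("simulated env crash")


errors = queue.Queue()  # stand-in for multiprocessing.Queue
raise_if_collector_failed(errors)  # nothing reported yet, returns quietly
collector_main(errors, failing_step, num_steps=5)

try:
    raise_if_collector_failed(errors)
    propagated = None
except CollectorError as exc:
    propagated = str(exc)
```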

__init__(env_name, env_cfg_overrides, rl_cfg, *, device=None, collector_device=None, sim_backend='mujoco', num_envs=4096)
abstract learn(max_iterations, save_interval=50, log_dir='logs')
Parameters:
  • max_iterations (int)
  • save_interval (int)
  • log_dir (str)

Return type: None
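
Because learn is abstract, a concrete algorithm has to subclass the runner and override it. The sketch below shows that relationship using a local stand-in base class that copies only the documented method signatures, so it runs without unilab installed. AsyncRunnerStandIn and CountingRunner are hypothetical, and reading save_interval as "checkpoint every N iterations" is an assumption that the reference above does not confirm.

```python
from abc import ABC, abstractmethod


class AsyncRunnerStandIn(ABC):
    """Local stand-in carrying only the documented method signatures."""

    @abstractmethod
    def learn(self, max_iterations: int, save_interval: int = 50,
              log_dir: str = "logs") -> None:
        ...

    def close(self) -> None:
        pass


class CountingRunner(AsyncRunnerStandIn):
    """Hypothetical subclass that records which iterations would checkpoint."""

    def __init__(self):
        self.checkpoints = []

    def learn(self, max_iterations, save_interval=50, log_dir="logs"):
        for it in range(1, max_iterations + 1):
            # A real learner update would run here.
            if it % save_interval == 0:  # assumed meaning of save_interval
                self.checkpoints.append(f"{log_dir}/iter_{it}")


# The base class cannot be instantiated while learn is unimplemented.
try:
    AsyncRunnerStandIn()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True

runner = CountingRunner()
runner.learn(max_iterations=120, save_interval=50)
runner.close()
```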

close()
Return type: None
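
Since the runner owns shared memory and a collector process, callers will usually want close() to run even when training fails. A minimal sketch of one way to guarantee that uses contextlib.closing from the standard library; DummyRunner is a hypothetical stand-in, not a unilab class, and whether AsyncRunner also supports the with statement directly is not stated here.

```python
from contextlib import closing


class DummyRunner:
    """Hypothetical stand-in exposing the documented learn/close methods."""

    def __init__(self):
        self.closed = False

    def learn(self, max_iterations, save_interval=50, log_dir="logs"):
        raise RuntimeError("simulated training failure")

    def close(self):
        self.closed = True


runner = DummyRunner()
try:
    # closing() calls runner.close() on exit, including on exceptions.
    with closing(runner):
        runner.learn(max_iterations=10)
except RuntimeError:
    pass  # the failure still propagates to the caller after cleanup
```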