unilab.ipc — Shared-Memory Runtime
This package is the bridge between the CPU simulation workers and the GPU learner. Every submodule here is a building block of the async runner that powers APPO, FastSAC, FastTD3, and FlashSAC.
| Submodule | Role |
|---|---|
| | The high-level orchestration loop |
| | NumPy-backed shared-memory ring/buffer |
| | Rollout window used by on-policy collectors |
| | Off-policy replay backed by shared memory |
| | Host-to-device staging (CPU-pinned double buffer, native h2d) |
| | Running mean/std shared across workers |
| | Push learner weights back to workers |
IPC primitives for multi-process RL training.
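To make the "NumPy-backed shared-memory ring/buffer" role concrete, here is a minimal sketch built on the standard library's `multiprocessing.shared_memory` and a NumPy view over the block. The class `SharedRing`, its `push`/`close` methods, and the `capacity`/`row_dim` parameters are hypothetical names chosen for illustration; they are not taken from `unilab.ipc`. A second handle attached by name in the same process stands in for a worker process.

```python
import numpy as np
from multiprocessing import shared_memory


class SharedRing:
    """Hypothetical fixed-capacity ring of float32 rows in one shared block."""

    def __init__(self, capacity, row_dim, name=None):
        self.capacity = capacity
        nbytes = capacity * row_dim * np.dtype(np.float32).itemsize
        # name=None allocates a new block; otherwise attach to an existing one.
        self.shm = shared_memory.SharedMemory(
            name=name, create=name is None, size=nbytes
        )
        # Zero-copy NumPy view over the shared bytes.
        self.rows = np.ndarray(
            (capacity, row_dim), dtype=np.float32, buffer=self.shm.buf
        )
        self.cursor = 0  # write position, local to this handle

    def push(self, row):
        # Overwrite the oldest slot once the ring has wrapped around.
        self.rows[self.cursor % self.capacity] = row
        self.cursor += 1

    def close(self, unlink=False):
        self.rows = None  # drop the view before closing the buffer
        self.shm.close()
        if unlink:
            self.shm.unlink()  # only the owning side should unlink


owner = SharedRing(capacity=4, row_dim=3)
# Stand-in for a worker: attach to the same block by name.
worker = SharedRing(capacity=4, row_dim=3, name=owner.shm.name)
for step in range(6):
    worker.push([step, step, step])

latest = owner.rows.copy()  # the owner sees the worker's writes
worker.close()
owner.close(unlink=True)
```

After six pushes into four slots, the two oldest rows have been overwritten, which is the wrap-around behaviour a ring buffer is meant to provide. A real multi-process version would also need a shared write cursor and some synchronization, which this sketch omits.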
Async runner
Base async runner for multi-process RL training.
- class unilab.ipc.async_runner.AsyncRunner
Bases: ABC

Base class for async RL algorithms.

Manages:
- Shared memory allocation/cleanup
- Collector process lifecycle
- Error propagation from collector subprocess
- Training loop skeleton
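The four responsibilities listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the actual `AsyncRunner` implementation: the names `SketchRunner`, `collect`, `train_step`, `run`, and `_collector_entry` are hypothetical, and the `fork` start method is used only to keep the example short (it is POSIX-only).

```python
import multiprocessing as mp
import queue
import time
import traceback
from abc import ABC, abstractmethod
from multiprocessing import shared_memory


def _collector_entry(collect_fn, shm_name, errors):
    """Child entry point: attach, collect, report failures to the parent."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        collect_fn(shm.buf)
    except Exception:
        # Exceptions do not cross process boundaries on their own,
        # so ship the formatted traceback through a queue.
        errors.put(traceback.format_exc())
    finally:
        shm.close()


class SketchRunner(ABC):
    """Hypothetical stand-in; not the unilab AsyncRunner API."""

    def __init__(self, nbytes=64):
        # Assumption of this sketch: "fork" avoids pickling bound methods.
        self._ctx = mp.get_context("fork")
        self._nbytes = nbytes
        self.shm_name = None

    @abstractmethod
    def collect(self, buf):
        """Runs in the collector process; writes samples into buf."""

    @abstractmethod
    def train_step(self, buf):
        """Runs in the parent; consumes samples from buf."""

    def run(self, num_updates):
        # 1. Shared memory allocation.
        shm = shared_memory.SharedMemory(create=True, size=self._nbytes)
        self.shm_name = shm.name
        errors = self._ctx.Queue()
        # 2. Collector process lifecycle.
        proc = self._ctx.Process(
            target=_collector_entry,
            args=(self.collect, shm.name, errors),
            daemon=True,
        )
        proc.start()
        try:
            # 4. Training loop skeleton.
            for _ in range(num_updates):
                # 3. Error propagation: re-raise collector failures here.
                try:
                    tb = errors.get_nowait()
                except queue.Empty:
                    pass
                else:
                    raise RuntimeError("collector failed:\n" + tb)
                self.train_step(shm.buf)
        finally:
            # Always stop the collector and release the block, even on error.
            proc.terminate()
            proc.join()
            shm.close()
            shm.unlink()


class FailingRunner(SketchRunner):
    def collect(self, buf):
        raise ValueError("simulator crashed")

    def train_step(self, buf):
        time.sleep(0.001)  # stand-in for a gradient update
```

Calling `FailingRunner().run(num_updates=10_000)` should surface the collector's `ValueError` as a `RuntimeError` in the parent rather than letting the learner spin on stale data, and the `finally` block unlinks the shared block either way.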
- Parameters:
- __init__(env_name, env_cfg_overrides, rl_cfg, *, device=None, collector_device=None, sim_backend='mujoco', num_envs=4096)