unilab.ipc.replay_buffer
Packed shared-memory replay buffer for off-policy RL.
Classes
ReplayBuffer | Shared replay buffer backed by authoritative packed CPU storage.
- class unilab.ipc.replay_buffer.ReplayBuffer
Bases: SharedBufferBase
Shared replay buffer backed by authoritative packed CPU storage.
Device transfer is owned by replay pipeline transfer backends. The fallback sample() path copies a sampled packed batch to self.device and keeps no per-device replay cache.
- Parameters:
- __init__(capacity, obs_dim, action_dim, device, defer_gpu=False, critic_dim=0, packed_cpu_storage=False)
- add(obs, actions, rewards, next_obs, dones, truncated, terminal_mask=None, terminal_next_obs=None, critic=None, next_critic=None, terminal_next_critic=None)
Add a batch of transitions (called by the collector).
dones follows the UniLab env lifecycle contract: done = terminated | truncated. Learners must pair it with truncated when computing bootstrap masks.
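The contract above implies that a true termination can be recovered as done and not truncated. The sketch below shows one common way a learner might turn that into a bootstrap mask and a one-step TD target. The helper names bootstrap_mask and td_targets are hypothetical and not part of unilab, and the choice to keep bootstrapping on truncated steps is an assumed convention rather than confirmed UniLab learner behavior.

```python
def bootstrap_mask(dones, truncated):
    """Return 1.0 where the next-state value is bootstrapped, else 0.0.

    Hypothetical helper, not part of unilab. Assumes the documented
    contract done = terminated | truncated, so terminated is recovered
    as (done and not truncated).
    """
    if len(dones) != len(truncated):
        raise ValueError("dones and truncated must have the same length")
    mask = []
    for done, trunc in zip(dones, truncated):
        terminated = bool(done) and not bool(trunc)
        # Assumed convention: cut the bootstrap only on true termination.
        mask.append(0.0 if terminated else 1.0)
    return mask


def td_targets(rewards, next_values, dones, truncated, gamma=0.99):
    """One-step TD targets: r + gamma * mask * V(next_obs)."""
    mask = bootstrap_mask(dones, truncated)
    return [r + gamma * m * v for r, m, v in zip(rewards, mask, next_values)]


# Three transitions: ongoing, terminated, truncated (e.g. time limit).
dones = [False, True, True]
truncated = [False, False, True]
mask = bootstrap_mask(dones, truncated)
targets = td_targets([1.0, 1.0, 1.0], [10.0, 10.0, 10.0], dones, truncated, gamma=0.5)
```

Using dones alone as the mask would zero the bootstrap on the truncated transition as well, which is why the two flags need to be read together.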