unilab.ipc.replay_buffer.ReplayBuffer
- class unilab.ipc.replay_buffer.ReplayBuffer
Bases: SharedBufferBase
Shared replay buffer backed by authoritative packed CPU storage.
Device transfer is owned by replay pipeline transfer backends. The fallback sample() path copies a sampled packed batch to self.device and keeps no per-device replay cache.
- Parameters:
Methods
__init__(capacity, obs_dim, action_dim, device)
add(obs, actions, rewards, next_obs, dones, ...): Add batch (called by collector).
sample(batch_size): Sample batch (called by learner).
- __init__(capacity, obs_dim, action_dim, device, defer_gpu=False, critic_dim=0, packed_cpu_storage=False)
- add(obs, actions, rewards, next_obs, dones, truncated, terminal_mask=None, terminal_next_obs=None, critic=None, next_critic=None, terminal_next_critic=None)
Add batch (called by collector).
dones follows the UniLab env lifecycle contract: done = terminated | truncated. Learners must pair it with truncated when computing bootstrap masks.
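The pairing of dones with truncated can be illustrated with a small NumPy sketch. It is not part of UniLab: the helper names bootstrap_mask and td_target are hypothetical, and the sketch assumes the common convention that the next-state value is bootstrapped on every step except a true termination, that is, a step where done is set but truncated is not.

```python
import numpy as np


def bootstrap_mask(dones, truncated):
    """Return 1.0 where the next-state value should be bootstrapped, else 0.0.

    Hypothetical helper, not a UniLab API.
    """
    dones = np.asarray(dones, dtype=bool)
    truncated = np.asarray(truncated, dtype=bool)
    # done = terminated | truncated, so a step that is done but not truncated
    # is treated here as a true termination (assumption of this sketch).
    terminated = dones & ~truncated
    return (~terminated).astype(np.float32)


def td_target(rewards, next_values, dones, truncated, gamma=0.99):
    """One-step TD target that cuts the bootstrap only on true termination."""
    rewards = np.asarray(rewards, dtype=np.float32)
    next_values = np.asarray(next_values, dtype=np.float32)
    return rewards + gamma * bootstrap_mask(dones, truncated) * next_values
```

Using 1 - dones alone as the mask would also zero the bootstrap on time-limit truncations, which is the case the contract above asks learners to avoid.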