unilab.algos.torch.appo.worker

APPO rollout worker that runs in a subprocess.

Collects rollout payloads and writes them to RolloutRingBuffer.

Functions

appo_collector_fn(stop_event, env_name, ...)

Entry point for the APPO collector subprocess.

compute_timeout_bootstrap_correction(critic, ...)

Compute gamma * V(final_observation) for current timeout envs.

put_latest_metrics(metrics_queue, msg, *, ...)

Best-effort metrics enqueue that keeps recent data under learner stalls.

unilab.algos.torch.appo.worker.put_latest_metrics(metrics_queue, msg, *, worker_name)[source]

Best-effort metrics enqueue that keeps recent data under learner stalls.

Return type:

None
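
The "best-effort" behaviour described above can be illustrated with a small sketch. It is not the library's implementation: the helper name put_latest_sketch, its max_retries argument, and its boolean return value are hypothetical, and the real function also takes worker_name and returns None. The sketch assumes a bounded queue and a drop-oldest policy: when the queue is full because the consumer has stalled, the oldest entry is evicted so the newest message can still be enqueued without blocking.

```python
import queue


def put_latest_sketch(metrics_queue, msg, max_retries=2):
    """Hypothetical helper: enqueue msg without blocking.

    If the queue is full, evict the oldest item and retry.
    Returns True if msg was enqueued, False if it was dropped.
    """
    for _ in range(max_retries):
        try:
            metrics_queue.put_nowait(msg)
            return True
        except queue.Full:
            try:
                metrics_queue.get_nowait()  # evict the oldest message
            except queue.Empty:
                pass  # a consumer drained the queue in the meantime; retry
    return False


# Simulate a stalled learner: nothing reads the queue while 5 messages arrive.
q = queue.Queue(maxsize=2)
for step in range(5):
    put_latest_sketch(q, {"step": step})

remaining = [q.get_nowait()["step"] for _ in range(q.qsize())]
print(remaining)  # [3, 4]
```

The producer never blocks in this sketch, and only the most recent messages survive a stall. multiprocessing.Queue raises the same queue.Full and queue.Empty exceptions, so the pattern carries over to inter-process queues.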

unilab.algos.torch.appo.worker.compute_timeout_bootstrap_correction(critic, collector_device, gamma, timeout_mask, final_obs, final_critic)[source]

Compute gamma * V(final_observation) for current timeout envs.

Return type:

ndarray
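
As an illustration of the quantity gamma * V(final_observation), the following NumPy sketch computes a per-environment correction that is nonzero only where an episode ended by timeout. It is a simplified stand-in, not the library's code: timeout_bootstrap_sketch and toy_value_fn are hypothetical names, the critic is replaced by a plain callable, and the collector_device and final_critic arguments of the real function are not modelled.

```python
import numpy as np


def timeout_bootstrap_sketch(value_fn, gamma, timeout_mask, final_obs):
    """Hypothetical helper: gamma * V(final_obs) for timed-out envs, 0 elsewhere."""
    timeout_mask = np.asarray(timeout_mask, dtype=bool)
    correction = np.zeros(timeout_mask.shape[0], dtype=np.float32)
    if timeout_mask.any():
        # Evaluate the value function only on envs that hit the time limit.
        values = np.asarray(value_fn(final_obs[timeout_mask]), dtype=np.float32)
        correction[timeout_mask] = gamma * values.reshape(-1)
    return correction


def toy_value_fn(obs):
    """Stand-in critic (assumption): value is the sum of observation features."""
    return obs.sum(axis=1)


final_obs = np.array([[1.0, 1.0], [2.0, 0.0], [0.5, 0.5]], dtype=np.float32)
timeout_mask = np.array([True, False, True])
correction = timeout_bootstrap_sketch(toy_value_fn, 0.5, timeout_mask, final_obs)
print(correction)  # [1.  0.  0.5]
```

Adding such a correction to the reward at a timeout step is a common way to avoid treating a time-limit truncation as a true terminal state. Whether the caller applies it exactly this way is not stated in this page.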

unilab.algos.torch.appo.worker.appo_collector_fn(stop_event, env_name, rl_cfg, num_envs, steps_per_env, shm_rollout_ring_buffer_name, sync_primitives, obs_dim, action_dim, critic_dim, actor_weight_sync_name, actor_weight_param_shapes, critic_weight_sync_name, critic_weight_param_shapes, metrics_queue, collector_device='cpu', sim_backend='mujoco', env_cfg_override=None, seed=None)[source]

Entry point for the APPO collector subprocess.

Creates the environment and policy, collects rollouts, and writes the raw payloads to the IPC ring buffer. Error handling is provided by _collector_entry_wrapper in async_runner.py.

Parameters:
  • stop_event (Any)

  • env_name (str)

  • rl_cfg (dict)

  • num_envs (int)

  • steps_per_env (int)

  • shm_rollout_ring_buffer_name (Dict[str, str])

  • sync_primitives (tuple)

  • obs_dim (int)

  • action_dim (int)

  • critic_dim (int)

  • actor_weight_sync_name (str)

  • actor_weight_param_shapes (dict)

  • critic_weight_sync_name (str)

  • critic_weight_param_shapes (dict)

  • metrics_queue (Any)

  • collector_device (str)

  • sim_backend (str)

  • env_cfg_override (dict | None)

  • seed (int | None)
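
The overall shape of a stop-event-driven collector can be sketched with standard-library stand-ins. Everything here is an assumption made for illustration: toy_collector and max_rollouts are hypothetical, a threading.Event replaces the cross-process stop_event, integers replace real transitions, and a bounded deque replaces the shared-memory RolloutRingBuffer. The deque silently overwrites its oldest entry when full, whereas the real buffer may block or synchronise through sync_primitives instead.

```python
import threading
from collections import deque


def toy_collector(stop_event, ring, steps_per_env, max_rollouts):
    """Hypothetical loop: collect fixed-length rollouts until stopped."""
    produced = 0
    step = 0
    while not stop_event.is_set() and produced < max_rollouts:
        payload = []
        for _ in range(steps_per_env):
            payload.append(step)  # stand-in for one environment transition
            step += 1
        ring.append(payload)  # a full deque drops its oldest payload
        produced += 1
    return produced


stop_event = threading.Event()
ring = deque(maxlen=2)
produced = toy_collector(stop_event, ring, steps_per_env=3, max_rollouts=4)
print(produced)    # 4
print(list(ring))  # [[6, 7, 8], [9, 10, 11]]
```

Setting stop_event before the call makes the loop exit immediately without producing a rollout, which mirrors how a parent process would typically request shutdown.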