unilab.algos.torch.hora.appo_worker
HORA-owned APPO rollout worker.
Functions
| compute_hora_timeout_bootstrap_correction | Compute timeout bootstrap values for grouped HORA observations. |
| hora_appo_collector_fn | Collect grouped HORA APPO rollouts into the shared IPC ring buffer. |
- unilab.algos.torch.hora.appo_worker.compute_hora_timeout_bootstrap_correction(critic, collector_device, gamma, timeout_mask, final_obs, final_critic=None, final_priv_info=None)
Compute timeout bootstrap values for grouped HORA observations.
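The page does not show how the correction is computed. As a minimal sketch of the usual timeout-bootstrap idea, assuming the correction is `gamma * V(final_obs)` for environments that ended on a time limit and zero elsewhere, the following framework-free example may help. The name `timeout_bootstrap_correction`, the `value_fn` callable, and the toy critic are hypothetical. The real function also takes `collector_device`, `final_critic`, and `final_priv_info`, which this sketch ignores.

```python
from typing import Callable, List, Sequence


def timeout_bootstrap_correction(
    value_fn: Callable[[Sequence[float]], float],
    gamma: float,
    timeout_mask: Sequence[bool],
    final_obs: Sequence[Sequence[float]],
) -> List[float]:
    """Return gamma * V(final_obs) where the episode timed out, else 0.0.

    Assumption: this mirrors the common truncation-bootstrap rule, not
    necessarily the exact HORA implementation.
    """
    return [
        gamma * value_fn(obs) if timed_out else 0.0
        for obs, timed_out in zip(final_obs, timeout_mask)
    ]


# Hypothetical toy critic: the value is the sum of the observation features.
toy_critic = sum

correction = timeout_bootstrap_correction(
    toy_critic,
    gamma=0.5,
    timeout_mask=[True, False],
    final_obs=[[1.0, 3.0], [2.0, 2.0]],
)
# Only the first environment timed out, so only it receives a correction.
```

In a rollout, a correction like this would typically be added to the reward of the truncated step so that a time-limit cutoff is not treated as a true terminal state.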
- unilab.algos.torch.hora.appo_worker.hora_appo_collector_fn(stop_event, env_name, rl_cfg, num_envs, steps_per_env, shm_rollout_ring_buffer_name, sync_primitives, obs_dim, action_dim, critic_dim, actor_weight_sync_name, actor_weight_param_shapes, critic_weight_sync_name, critic_weight_param_shapes, metrics_queue, collector_device='cpu', sim_backend='mujoco', env_cfg_override=None, priv_info_dim=0, seed=None)
Collect grouped HORA APPO rollouts into the shared IPC ring buffer.
- Parameters:
  - stop_event (Any)
  - env_name (str)
  - rl_cfg (dict)
  - num_envs (int)
  - steps_per_env (int)
  - sync_primitives (tuple)
  - obs_dim (int)
  - action_dim (int)
  - critic_dim (int)
  - actor_weight_sync_name (str)
  - actor_weight_param_shapes (dict)
  - critic_weight_sync_name (str)
  - critic_weight_param_shapes (dict)
  - metrics_queue (Any)
  - collector_device (str)
  - sim_backend (str)
  - priv_info_dim (int)
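The signature suggests a worker that loops until `stop_event` is set, writes fixed-length rollouts into a ring buffer, and reports progress through `metrics_queue`. The sketch below illustrates that general pattern only. It is not the HORA worker: `toy_collector` is a hypothetical name, a thread stands in for the worker process, a plain list stands in for the shared-memory ring buffer, and the rollouts are dummy integers rather than environment data.

```python
import queue
import threading
import time


def toy_collector(stop_event, ring, steps_per_env, metrics_queue):
    """Write dummy rollouts into ring slots until stop_event is set."""
    slot = 0
    written = 0
    while not stop_event.is_set():
        # Stand-in for stepping the environments steps_per_env times.
        rollout = [written * steps_per_env + t for t in range(steps_per_env)]
        time.sleep(0.001)
        ring[slot] = rollout              # overwrite the oldest slot
        slot = (slot + 1) % len(ring)     # advance and wrap around
        written += 1
        metrics_queue.put({"rollouts_written": written})


stop_event = threading.Event()
ring = [None] * 2                         # two-slot ring buffer (in-process)
metrics = queue.Queue()

worker = threading.Thread(
    target=toy_collector, args=(stop_event, ring, 4, metrics)
)
worker.start()

# Consume metrics until at least three rollouts were written, then stop.
while metrics.get(timeout=5.0)["rollouts_written"] < 3:
    pass
stop_event.set()
worker.join(timeout=5.0)
```

A real multi-process setup would replace the list with a named shared-memory segment and would need the synchronization primitives that `sync_primitives` presumably carries; those details are not documented on this page.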