unilab.algos.torch.him_ppo.algorithm.HIMPPO
- class unilab.algos.torch.him_ppo.algorithm.HIMPPO
Bases: object

Methods
- __init__(actor_critic[, ...])
- act(obs, critic_obs)
- compute_returns(last_critic_obs)
- init_storage(num_envs, ...)
- process_env_step(next_obs, rewards, dones, ...)
- update()

Attributes
- __init__(actor_critic, num_learning_epochs=1, num_mini_batches=1, clip_param=0.2, gamma=0.998, lam=0.95, value_loss_coef=1.0, entropy_coef=0.0, learning_rate=0.001, max_grad_norm=1.0, use_clipped_value_loss=True, schedule='fixed', desired_kl=0.01, device='cpu', **kwargs)
- actor_critic: HIMActorCritic
- storage: HIMRolloutStorage | None
- init_storage(num_envs, num_transitions_per_env, actor_obs_shape, critic_obs_shape, action_shape)
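
The `gamma` and `lam` defaults in `__init__` and the `compute_returns(last_critic_obs)` method suggest a standard PPO-style return computation with Generalized Advantage Estimation (GAE). This page does not confirm how HIMPPO implements it, so the following is only a minimal, framework-free sketch of textbook GAE under that assumption. The function name `compute_gae` and its arguments are hypothetical and are not part of the `unilab` API.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.998, lam=0.95):
    """Textbook GAE for a single environment (illustrative, not unilab code).

    rewards, values, dones: per-step lists of equal length.
    last_value: bootstrap value estimate for the state after the final step.
    gamma, lam: defaults copied from the HIMPPO.__init__ signature.
    Returns (returns, advantages) as lists of floats.
    """
    num_steps = len(rewards)
    advantages = [0.0] * num_steps
    gae = 0.0
    # Walk the rollout backwards so each step can reuse the next step's estimate.
    for t in reversed(range(num_steps)):
        next_value = last_value if t == num_steps - 1 else values[t + 1]
        # A finished episode cuts off both bootstrapping and the GAE recursion.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    # The value-function target is the advantage plus the baseline value.
    returns = [adv + val for adv, val in zip(advantages, values)]
    return returns, advantages


# Example call on a two-step rollout whose second step ends the episode.
returns, advantages = compute_gae(
    rewards=[1.0, 1.0],
    values=[0.0, 0.0],
    dones=[False, True],
    last_value=5.0,
)
```

In a batched implementation the same recursion would typically run over tensors shaped by `num_transitions_per_env` and `num_envs` from `init_storage`, but that layout is also an assumption here.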