unilab.algos.torch.rsl_rl_ppo
Classes
class unilab.algos.torch.rsl_rl_ppo.FinalObservationAwarePPO
Bases: PPO
PPO variant that bootstraps the value target at time-limit truncations from the environment's final_observation.
learning_rate: float
__init__(*args, enable_compile=False, **kwargs)
update()
- Return type:
dict[str, float]
process_env_step(obs, rewards, dones, extras)
- Return type:
None
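Example (illustrative sketch, not the library's implementation): the class description says time limits are bootstrapped from the final observation. One common way to do this is to add gamma * V(final_observation) to the reward of any step that ended because of a time limit, so the return is not cut off as if the episode had truly terminated. The helper name bootstrap_time_limit_rewards, the value_fn callable, the time_outs flags, and the gamma default below are all hypothetical, and plain Python floats stand in for tensors to keep the sketch dependency-free.

```python
from typing import Callable, List, Optional, Sequence


def bootstrap_time_limit_rewards(
    rewards: Sequence[float],
    time_outs: Sequence[bool],
    final_observations: Sequence[Optional[Sequence[float]]],
    value_fn: Callable[[Sequence[float]], float],
    gamma: float = 0.99,
) -> List[float]:
    """Add gamma * V(final_obs) to the reward of every timed-out step.

    All names here are hypothetical; this only illustrates the idea of
    bootstrapping from the final observation at a time-limit truncation.
    """
    adjusted = []
    for reward, timed_out, final_obs in zip(rewards, time_outs, final_observations):
        if timed_out and final_obs is not None:
            # The episode was cut short by the time limit, not by a real
            # terminal state, so estimate the remaining return from the
            # last observation seen before the reset.
            reward = reward + gamma * value_fn(final_obs)
        adjusted.append(reward)
    return adjusted


# Toy critic (assumption): the value of an observation is the sum of its entries.
toy_value_fn = lambda obs: float(sum(obs))

example = bootstrap_time_limit_rewards(
    rewards=[1.0, 1.0],
    time_outs=[True, False],
    final_observations=[[2.0, 3.0], None],
    value_fn=toy_value_fn,
    gamma=0.5,
)
```

In this toy call only the first environment timed out, so only its reward receives the bootstrap term; the second reward is returned unchanged.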