unilab.algos.torch.rsl_rl_ppo.FinalObservationAwarePPO

class unilab.algos.torch.rsl_rl_ppo.FinalObservationAwarePPO

Bases: PPO

PPO variant that bootstraps value targets at time-limit truncations from the environment's final_observation.
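
As a rough illustration of what time-limit bootstrapping usually means, the sketch below adds the discounted value estimate of the final observation to the reward of every environment that was cut off by a time limit. This is a plain-Python sketch under stated assumptions, not this class's implementation: the helper name bootstrap_time_limits, its arguments, and the idea that a per-environment time-out flag and final-observation value are available at each step are all hypothetical.

```python
def bootstrap_time_limits(rewards, time_outs, final_values, gamma=0.99):
    """Hypothetical helper: return rewards with gamma * V(final_obs) added
    for every environment whose episode ended because of a time limit.

    rewards      -- per-environment rewards for the current step
    time_outs    -- per-environment flags, True when the episode was truncated
    final_values -- critic estimates V(final_obs), one per environment
                    (assumed to be computed by the caller)
    gamma        -- discount factor
    """
    if not (len(rewards) == len(time_outs) == len(final_values)):
        raise ValueError("all inputs must have one entry per environment")

    bootstrapped = []
    for reward, timed_out, value in zip(rewards, time_outs, final_values):
        # A truncated episode did not really terminate, so the return that
        # would have followed is approximated by the discounted value.
        bootstrapped.append(reward + gamma * value if timed_out else reward)
    return bootstrapped


# Only the second environment hit its time limit.
example = bootstrap_time_limits(
    rewards=[1.0, 0.5, 2.0],
    time_outs=[False, True, False],
    final_values=[10.0, 4.0, 7.0],
    gamma=0.5,
)
```

In this example only the second reward changes, because only that environment was truncated; rewards of environments that terminated normally or are still running are left untouched.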

Methods

__init__(*args[, enable_compile])

process_env_step(obs, rewards, dones, extras)

update()

Attributes

learning_rate: float

__init__(*args, enable_compile=False, **kwargs)

update()
Return type:

dict[str, float]

process_env_step(obs, rewards, dones, extras)
Return type:

None

__call__(*args, **kwargs)

Call self as a function.

Return type:

Any

__getitem__(key)
Parameters:

key (Any)

Return type:

_MockObject

__len__()
Return type:

int

static __new__(cls, *args, **kwargs)
Return type:

Any