unilab.algos.torch

Standalone PPO entrypoints

class unilab.algos.torch.rsl_rl_ppo.FinalObservationAwarePPO[source]

Bases: PPO

PPO variant that bootstraps value targets at time-limit truncations from the environment's final_observation.

learning_rate: float
__init__(*args, enable_compile=False, **kwargs)[source]
update()[source]
Return type:

dict[str, float]

process_env_step(obs, rewards, dones, extras)[source]
Return type:

None
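
The time-limit bootstrapping idea behind this class can be illustrated with a minimal sketch. It is not the unilab implementation: the helper name bootstrap_time_limit_rewards and its arguments (time_outs, final_values, gamma) are hypothetical, and plain Python lists stand in for batched tensors. The assumption is that, when an episode is cut off by a time limit rather than a true terminal state, the discounted value estimate of the final observation is added to the last reward so the truncation is not treated as a zero-return terminal.

```python
from typing import Sequence


def bootstrap_time_limit_rewards(
    rewards: Sequence[float],
    time_outs: Sequence[bool],
    final_values: Sequence[float],
    gamma: float = 0.99,
) -> list[float]:
    """Add gamma * V(final_observation) to the reward of each truncated env.

    Hypothetical helper for illustration only. `final_values` is assumed to
    hold the critic's value estimate of each env's final observation.
    """
    if not (len(rewards) == len(time_outs) == len(final_values)):
        raise ValueError("all inputs must have one entry per environment")
    return [
        r + gamma * v if timed_out else r  # only truncated envs are adjusted
        for r, timed_out, v in zip(rewards, time_outs, final_values)
    ]


# Three parallel environments; only the second hit its time limit.
rewards = [1.0, 0.5, 2.0]
time_outs = [False, True, False]
final_values = [10.0, 4.0, 7.0]

adjusted = bootstrap_time_limit_rewards(rewards, time_outs, final_values, gamma=0.5)
print(adjusted)  # [1.0, 2.5, 2.0]
```

Environments that terminated normally or are still running keep their rewards unchanged; only the truncated one receives the bootstrapped term.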

Runtime resolution helpers for RSL-RL PPO script assembly.

class unilab.algos.torch.rsl_rl_runtime.RslRlPPORuntime[source]

Bases: object

Resolved PPO runtime consumed by the generic RSL-RL entrypoint.

Parameters:

wrapper_cls (type[RslRlVecEnvWrapper])

wrapper_cls: type[RslRlVecEnvWrapper]
__init__(wrapper_cls)
Parameters:

wrapper_cls (type[RslRlVecEnvWrapper])

unilab.algos.torch.rsl_rl_runtime.resolve_rsl_rl_ppo_runtime(rl_cfg, *, default_wrapper_cls)[source]

Resolve the PPO runtime bundle from owner config.

Return type:

RslRlPPORuntime
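
As a rough illustration of the resolve-with-default pattern suggested by this signature, the sketch below uses stand-in names throughout. RuntimeSketch, resolve_runtime_sketch, DefaultWrapper, and CustomWrapper are hypothetical, and the optional "wrapper_cls" config key is an assumption; the page does not state how the real function reads rl_cfg.

```python
from dataclasses import dataclass
from typing import Any, Mapping


class DefaultWrapper:
    """Hypothetical stand-in for the default env wrapper class."""


class CustomWrapper:
    """Hypothetical stand-in for a wrapper class named in the config."""


@dataclass(frozen=True)
class RuntimeSketch:
    """Illustrative bundle holding the single documented field."""

    wrapper_cls: type


def resolve_runtime_sketch(
    rl_cfg: Mapping[str, Any], *, default_wrapper_cls: type
) -> RuntimeSketch:
    # Assumption: the config may carry an optional "wrapper_cls" entry;
    # when it is absent or None, the caller-supplied default is used.
    wrapper_cls = rl_cfg.get("wrapper_cls") or default_wrapper_cls
    return RuntimeSketch(wrapper_cls=wrapper_cls)


# No override in the config: the default wrapper is selected.
fallback = resolve_runtime_sketch({}, default_wrapper_cls=DefaultWrapper)

# Explicit override in the config: the configured wrapper wins.
override = resolve_runtime_sketch(
    {"wrapper_cls": CustomWrapper}, default_wrapper_cls=DefaultWrapper
)
```

Making default_wrapper_cls keyword-only, as in the documented signature, keeps call sites explicit about which class is the fallback.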