unilab.algos.torch
FlashSAC algorithm package.
Off-policy RL unified infrastructure.
Standalone PPO entrypoints
- class unilab.algos.torch.rsl_rl_ppo.FinalObservationAwarePPO
Bases: PPO
PPO variant that bootstraps time limits from the env's final_observation.
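The idea of bootstrapping at a time limit can be sketched in a few lines. This is a minimal illustration of the general technique, not the FinalObservationAwarePPO implementation: the function name `bootstrap_time_limits`, its parameters, and the toy value function are all assumptions made for the example. When an episode is cut off by a time limit rather than a true terminal state, the discounted value of the final observation is added to the last reward so the return target is not biased toward zero.

```python
from typing import Callable, List, Sequence


def bootstrap_time_limits(
    rewards: Sequence[float],
    time_outs: Sequence[bool],
    final_observations: Sequence[Sequence[float]],
    value_fn: Callable[[Sequence[float]], float],
    gamma: float = 0.99,
) -> List[float]:
    """Add gamma * V(final_obs) to the reward of every timed-out env.

    Hypothetical helper: all names here are illustrative only.
    """
    adjusted = []
    for reward, timed_out, final_obs in zip(rewards, time_outs, final_observations):
        if timed_out:
            # Truncated by the time limit: the episode would have continued,
            # so estimate the missing tail of the return with the critic.
            reward = reward + gamma * value_fn(final_obs)
        adjusted.append(reward)
    return adjusted


def toy_value(obs: Sequence[float]) -> float:
    """Stand-in critic for the example: the value is the sum of the observation."""
    return sum(obs)


# Two parallel envs: the first is still running, the second hit its time limit.
adjusted = bootstrap_time_limits(
    rewards=[1.0, 1.0],
    time_outs=[False, True],
    final_observations=[[0.0, 0.0], [2.0, 3.0]],
    value_fn=toy_value,
    gamma=0.5,
)
print(adjusted)
```

Evaluating the critic on the final observation matters in auto-resetting vectorized environments, where the observation returned after a truncation already belongs to the next episode.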
Runtime resolution helpers for RSL-RL PPO script assembly.
- class unilab.algos.torch.rsl_rl_runtime.RslRlPPORuntime
Bases: object
Resolved PPO runtime consumed by the generic RSL-RL entrypoint.
- Parameters:
wrapper_cls (type[RslRlVecEnvWrapper])
- wrapper_cls: type[RslRlVecEnvWrapper]
- __init__(wrapper_cls)
- Parameters:
wrapper_cls (type[RslRlVecEnvWrapper])
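A runtime object with a single `wrapper_cls` field could look roughly like the sketch below. It only mirrors the documented constructor signature; `PPORuntimeSketch`, `VecEnvWrapperStub`, and `CustomWrapperStub` are hypothetical stand-ins, and the use of a frozen dataclass is an assumption, not something this page confirms about RslRlPPORuntime.

```python
from dataclasses import dataclass


class VecEnvWrapperStub:
    """Hypothetical stand-in for RslRlVecEnvWrapper; it just stores the env."""

    def __init__(self, env):
        self.env = env


class CustomWrapperStub(VecEnvWrapperStub):
    """Hypothetical task-specific wrapper subclass."""


@dataclass(frozen=True)
class PPORuntimeSketch:
    """Illustrative bundle holding the wrapper class the entrypoint should use."""

    wrapper_cls: type[VecEnvWrapperStub]


# The entrypoint receives a resolved runtime and instantiates the wrapper from it,
# so it never needs to know which concrete wrapper class was selected.
runtime = PPORuntimeSketch(wrapper_cls=CustomWrapperStub)
wrapped = runtime.wrapper_cls("raw-env")
print(type(wrapped).__name__)
```

Storing the class rather than an instance lets the entrypoint decide when, and with which environment, the wrapper is constructed.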
- unilab.algos.torch.rsl_rl_runtime.resolve_rsl_rl_ppo_runtime(rl_cfg, *, default_wrapper_cls)
Resolve the PPO runtime bundle from owner config.
- Parameters:
default_wrapper_cls (type[RslRlVecEnvWrapper])
- Return type: