unilab.algos.torch.hora.appo

HORA-owned APPO entry helpers.

Functions

play_hora_appo(cfg, rl_cfg, *, root_dir, ...)

Play HORA APPO checkpoints with grouped actor and privileged inputs.

resolve_hora_appo_runtime(rl_cfg)

Resolve HORA APPO entrypoints from an explicit runtime marker.

Classes

HoraAPPORuntime

Resolved HORA APPO entrypoints used by the generic APPO script.

class unilab.algos.torch.hora.appo.HoraAPPORunner

Bases: APPORunner

APPO runner variant that preserves grouped HORA observations.

__init__(*args, **kwargs)
learn(max_iterations=1500, save_interval=50, log_dir='logs', logger_type='tensorboard')
Parameters:
  • max_iterations (int)

  • save_interval (int)

  • log_dir (str)

  • logger_type (str)

Return type:

None

class unilab.algos.torch.hora.appo.HoraAPPORuntime

Bases: object

Resolved HORA APPO entrypoints used by the generic APPO script.

Parameters:
  • runner_cls (type[HoraAPPORunner]) – Runner class used for HORA APPO training mode.

  • play_fn (Callable[..., str | None]) – Play-mode callable used for HORA APPO checkpoint playback.

Returns:

Immutable entrypoint bundle consumed by generic APPO script assembly.

runner_cls: type[HoraAPPORunner]
play_fn: Callable[..., str | None]
__init__(runner_cls, play_fn)
Parameters:
  • runner_cls (type[HoraAPPORunner])

  • play_fn (Callable[..., str | None])

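The "immutable entrypoint bundle" described above can be pictured with a small, self-contained sketch. It does not import unilab: RuntimeBundle, StubRunner, and stub_play are hypothetical stand-ins, and the use of a frozen dataclass is an assumption about how the immutability might be achieved, not a statement about the actual implementation.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Callable, Optional


class StubRunner:
    """Hypothetical stand-in for HoraAPPORunner (not the real class)."""


def stub_play(*args, **kwargs) -> Optional[str]:
    """Hypothetical stand-in for play_hora_appo; always returns None."""
    return None


# Assumption: immutability is modelled here with frozen=True.
@dataclass(frozen=True)
class RuntimeBundle:
    runner_cls: type
    play_fn: Callable[..., Optional[str]]


bundle = RuntimeBundle(runner_cls=StubRunner, play_fn=stub_play)

# Rebinding a field on a frozen dataclass raises FrozenInstanceError.
try:
    bundle.play_fn = print
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Bundling both entrypoints in one read-only object lets a caller pass training and playback hooks around together without risking that one is swapped out later.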
unilab.algos.torch.hora.appo.play_hora_appo(cfg, rl_cfg, *, root_dir, resolve_checkpoint_path)

Play HORA APPO checkpoints with grouped actor and privileged inputs.

Parameters:
  • cfg

  • rl_cfg

  • root_dir (keyword-only)

  • resolve_checkpoint_path (keyword-only)

Return type:

str | None

unilab.algos.torch.hora.appo.resolve_hora_appo_runtime(rl_cfg)

Resolve HORA APPO entrypoints from an explicit runtime marker.

Parameters:

rl_cfg (dict[str, Any]) – Resolved algorithm config dictionary from Hydra composition.

Return type:

HoraAPPORuntime | None

Returns:

HoraAPPORuntime when the owner config selects HORA APPO, otherwise None.
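
The resolve-or-None contract above suggests a simple dispatch pattern on the caller's side. The following is a minimal sketch under stated assumptions and does not use unilab itself: the marker key "runtime", its value "hora_appo", and the names RuntimeBundle, StubHoraRunner, DefaultRunner, resolve_runtime, and select_runner are all hypothetical. The real marker key and the way the generic APPO script consumes the result are not specified in this reference.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical marker; the real key and value are not documented here.
MARKER_KEY = "runtime"
MARKER_VALUE = "hora_appo"


class DefaultRunner:
    """Hypothetical stand-in for a generic APPO runner."""


class StubHoraRunner(DefaultRunner):
    """Hypothetical stand-in for HoraAPPORunner."""


def stub_play(*args, **kwargs) -> Optional[str]:
    """Hypothetical stand-in for play_hora_appo."""
    return None


@dataclass(frozen=True)
class RuntimeBundle:
    runner_cls: type
    play_fn: Callable[..., Optional[str]]


def resolve_runtime(rl_cfg: dict[str, Any]) -> Optional[RuntimeBundle]:
    """Return a bundle only when the config carries the explicit marker."""
    if rl_cfg.get(MARKER_KEY) == MARKER_VALUE:
        return RuntimeBundle(runner_cls=StubHoraRunner, play_fn=stub_play)
    return None


def select_runner(rl_cfg: dict[str, Any], default_runner_cls: type) -> type:
    """Fall back to the default runner when no runtime is resolved."""
    runtime = resolve_runtime(rl_cfg)
    return default_runner_cls if runtime is None else runtime.runner_cls


hora_choice = select_runner({MARKER_KEY: MARKER_VALUE}, DefaultRunner)
plain_choice = select_runner({"max_iterations": 1500}, DefaultRunner)
```

Returning None rather than raising keeps the generic path untouched for configs that never mention HORA, so the caller only needs a single None check.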