unilab.base
Environment registry and base classes.
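- class unilab.base.TerminalObservationContract
Bases: object
TerminalObservationContract(terminal_obs: 'np.ndarray | None', terminal_mask: 'np.ndarray', timeout_terminal_mask: 'np.ndarray', terminal_critic: 'np.ndarray | None' = None)
- Parameters:
- terminal_obs (np.ndarray | None)
- terminal_mask (np.ndarray)
- timeout_terminal_mask (np.ndarray)
- terminal_critic (np.ndarray | None)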
- class unilab.base.TransitionBootstrapContract
Bases: object
TransitionBootstrapContract(actor_next_obs: 'np.ndarray', transition_next_obs: 'np.ndarray', terminal_mask: 'np.ndarray', timeout_terminal_mask: 'np.ndarray', actor_next_critic: 'np.ndarray | None' = None, transition_next_critic: 'np.ndarray | None' = None)
- Parameters:
- actor_next_obs (np.ndarray)
- transition_next_obs (np.ndarray)
- terminal_mask (np.ndarray)
- timeout_terminal_mask (np.ndarray)
- actor_next_critic (np.ndarray | None)
- transition_next_critic (np.ndarray | None)
- __init__(actor_next_obs, transition_next_obs, terminal_mask, timeout_terminal_mask, actor_next_critic=None, transition_next_critic=None)
- unilab.base.ensure_registries(packages=None, *, optional_packages=None, fail_on_error=True)
Import env registry bootstrap modules.
- unilab.base.flatten_obs_dict(obs)
Concatenate obs groups in insertion order -> flat (N, total_dim) array.
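As a rough illustration of the behavior described above (not the library's implementation), here is a minimal NumPy sketch. It assumes `obs` is a dict mapping group names to arrays of shape `(N, dim)`. The helper name `flatten_obs_dict_sketch` and the group names `proprio` and `command` are hypothetical.

```python
import numpy as np

def flatten_obs_dict_sketch(obs):
    """Hypothetical stand-in: join groups along the last axis in dict insertion order."""
    return np.concatenate([np.asarray(v) for v in obs.values()], axis=-1)

# Two made-up observation groups for a batch of N = 4 environments.
obs = {"proprio": np.zeros((4, 3)), "command": np.ones((4, 2))}
flat = flatten_obs_dict_sketch(obs)
print(flat.shape)  # (4, 5)
```

Because Python dicts preserve insertion order, the column layout of the flat array follows the order in which the groups were added.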
- unilab.base.flatten_policy_obs_dict(obs)
Build actor-policy inputs from the single actor observation group.
- unilab.base.get_critic_base_dim(obs_groups_spec)
Get critic observation dim, falling back to actor obs when absent.
- unilab.base.get_obs_dims(obs_groups_spec)
Extract (actor_obs_dim, critic_obs_dim) from obs_groups_spec.
When no separate critic group exists, critic_obs_dim == actor_obs_dim.
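The fallback rule can be sketched as follows. The real layout of `obs_groups_spec` is not documented here, so this sketch assumes a plain dict mapping the hypothetical keys `"actor"` and `"critic"` to integer dimensions. `get_obs_dims_sketch` is an illustrative name, not the library function.

```python
def get_obs_dims_sketch(obs_groups_spec):
    """Hypothetical stand-in: return (actor_obs_dim, critic_obs_dim)."""
    actor_dim = obs_groups_spec["actor"]
    # Assumed fallback: reuse the actor dim when no critic group is declared.
    critic_dim = obs_groups_spec.get("critic", actor_dim)
    return actor_dim, critic_dim

print(get_obs_dims_sketch({"actor": 48, "critic": 96}))  # (48, 96)
print(get_obs_dims_sketch({"actor": 48}))                # (48, 48)
```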
- unilab.base.patch_transition_next_obs(next_obs, final_observation=None, done=None, info=None, next_critic=None)
Patch transition next obs with final_observation without mutating actor inputs.
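One way to picture this patching step is the sketch below. It assumes that `final_observation` is an array with the same shape as `next_obs` and that `done` is a boolean mask of shape `(N,)`. The library may also read these values from `info`, which the sketch ignores. `patch_next_obs_sketch` is a hypothetical name.

```python
import numpy as np

def patch_next_obs_sketch(next_obs, final_observation=None, done=None):
    """Hypothetical stand-in: return a patched copy, leaving next_obs untouched."""
    if final_observation is None or done is None:
        return next_obs
    done = np.asarray(done, dtype=bool)
    patched = np.array(next_obs, copy=True)  # copy so the actor's input is not mutated
    patched[done] = np.asarray(final_observation)[done]
    return patched

# Env 1 finished: next_obs already holds its post-reset observation.
next_obs = np.array([[0.0, 0.0], [1.0, 1.0]])
final_obs = np.array([[9.0, 9.0], [5.0, 5.0]])
done = np.array([False, True])
patched = patch_next_obs_sketch(next_obs, final_obs, done)
print(patched)   # row 1 replaced by the final observation
print(next_obs)  # unchanged
```

Keeping the two arrays separate lets the actor continue from the post-reset observation while the stored transition records the true last observation of the finished episode.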
- unilab.base.resolve_terminal_observation_contract(next_obs_batch_size, final_observation=None, done=None, info=None, truncated=None)
Resolve terminal observation facts without constructing patched next obs.
- unilab.base.resolve_transition_bootstrap_contract(next_obs, info=None, final_observation=None, done=None, truncated=None, next_critic=None)
Resolve actor/storage observations and timeout bootstrap masks for a step.
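To show why a separate timeout mask is useful, here is a sketch of a common one-step TD target that keeps bootstrapping when an episode ended only because of a time limit. The mask semantics are assumptions: `terminal_mask` marks every ended episode, and `timeout_terminal_mask` marks those ended by timeout. The function `td_target_sketch` is hypothetical and is not part of `unilab.base`.

```python
import numpy as np

def td_target_sketch(reward, next_value, terminal_mask, timeout_terminal_mask, gamma=0.99):
    """Hypothetical helper: r + gamma * V(s') unless the episode truly terminated."""
    terminal = np.asarray(terminal_mask, dtype=bool)
    timeout = np.asarray(timeout_terminal_mask, dtype=bool)
    # Bootstrap for ongoing episodes and for episodes cut off by a timeout.
    bootstrap = ~terminal | timeout
    return np.asarray(reward) + gamma * np.asarray(next_value) * bootstrap

reward = np.array([1.0, 1.0, 1.0])
next_value = np.array([10.0, 10.0, 10.0])
terminal = np.array([False, True, True])
timeout = np.array([False, False, True])
targets = td_target_sketch(reward, next_value, terminal, timeout, gamma=0.5)
print(targets)  # [6. 1. 6.]
```

In this setup `next_value` for a timed-out environment would be computed from the stored transition observation (the final observation), not from the post-reset observation fed to the actor.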
- unilab.base.split_obs_dict(obs)
Split observation dict into (actor_obs, critic_obs).
When no separate critic group exists, critic_obs == actor_obs.
Modules
Curriculum learning for adaptive difficulty adjustment.