unilab.logging.onpolicy.OnPolicyLogger

class unilab.logging.onpolicy.OnPolicyLogger[source]

Bases: BaseTrainingLogger

Rich logger for on-policy RL (PPO, A2C, etc.).

Methods

__init__([algo_name, max_iterations, ...])

close()

Release live terminal state and backend handles without printing a summary.

finish(*[, title, extra_summary])

log_save(path)

log_step(iteration[, metrics, reward, ...])

start(*[, status])

update_ep_length(length)
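
The method names above suggest a start / log_step / log_save / finish lifecycle. The following is a minimal sketch of that call order, not a documented recipe. `RecordingLogger` is a hypothetical stand-in that mirrors the signatures listed on this page, so the snippet runs without unilab installed. The shape of `metrics` (a dict of floats), the zero-based iteration index, the checkpoint path, and the save interval are all illustrative assumptions.

```python
import time


class RecordingLogger:
    """Hypothetical stand-in mirroring the documented method signatures.

    It only records which methods were called, in order.
    """

    def __init__(self):
        self.calls = []

    def start(self, *, status=""):
        self.calls.append("start")

    def log_step(self, iteration, metrics=None, reward=None,
                 reward_components=None, collect_time=0.0, train_time=0.0):
        self.calls.append(f"log_step:{iteration}")

    def log_save(self, path):
        self.calls.append(f"log_save:{path}")

    def finish(self, *, title="Training Summary", extra_summary=""):
        self.calls.append("finish")


def run_training(logger, num_iterations=3, save_every=2):
    """Drive a logger through start, per-iteration log_step, and finish."""
    logger.start(status="warming up")
    for it in range(num_iterations):
        t0 = time.perf_counter()
        # ... rollout collection would happen here ...
        collect_time = time.perf_counter() - t0

        t0 = time.perf_counter()
        # ... policy update would happen here ...
        train_time = time.perf_counter() - t0

        logger.log_step(
            it,
            metrics={"policy_loss": 0.0},  # assumed: mapping of name to float
            reward=0.0,
            collect_time=collect_time,
            train_time=train_time,
        )
        if (it + 1) % save_every == 0:
            # Illustrative path only; nothing is written to disk here.
            logger.log_save(f"checkpoints/model_{it}.pt")
    logger.finish(title="Training Summary")


logger = RecordingLogger()
run_training(logger)
print(logger.calls)
```

With unilab installed, an `OnPolicyLogger` instance could presumably be passed to `run_training` in place of the stand-in, since the function only uses the methods listed above.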

__init__(algo_name='PPO', max_iterations=1500, num_envs=4096, num_steps=24, env_name='', log_dir='', log_backend='tensorboard', wandb_project='unilab', wandb_entity=None, wandb_name='', wandb_group=None, wandb_job_type=None, wandb_tags=None, wandb_notes=None)[source]

start(*, status='')[source]
Parameters:

status (str)

finish(*, title='Training Summary', extra_summary='')[source]
Parameters:
  • title (str)

  • extra_summary (str)

log_step(iteration, metrics=None, reward=None, reward_components=None, collect_time=0.0, train_time=0.0)[source]

close()

Release live terminal state and backend handles without printing a summary.

Return type:

None
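
Because close() releases terminal state and backend handles without printing a summary, it looks well suited to cleanup on error paths. One way to guarantee it runs is `contextlib.closing` from the standard library, which calls `close()` on the wrapped object when the `with` block exits. The sketch below uses a hypothetical `FakeLogger` in place of `OnPolicyLogger`. Whether close() may safely follow finish() is not stated on this page, so the sketch calls only close().

```python
from contextlib import closing


class FakeLogger:
    """Hypothetical stand-in exposing only the documented close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeLogger()
try:
    with closing(fake) as logger:
        # Simulate a crash in the middle of training.
        raise RuntimeError("simulated training failure")
except RuntimeError:
    pass

# close() ran even though the block raised.
print(fake.closed)
```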

log_save(path)
Parameters:

path (str)

update_ep_length(length)
Parameters:

length (float)
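
update_ep_length takes a single float. A plausible use, assumed here rather than documented, is to pass the mean length of recently finished episodes. The helper `mean_episode_length` and the window size of 100 below are hypothetical; the sketch only shows how such a value might be computed before being handed to the logger.

```python
from collections import deque


def mean_episode_length(window):
    """Return the mean of the stored episode lengths as a float (0.0 if empty)."""
    return float(sum(window)) / len(window) if window else 0.0


# Bounded window: only the most recent 100 finished episodes are kept.
recent_lengths = deque(maxlen=100)
for finished_length in (120, 80, 100):
    recent_lengths.append(finished_length)

length = mean_episode_length(recent_lengths)
print(length)  # 100.0

# With a real OnPolicyLogger instance (not constructed here):
# logger.update_ep_length(length)
```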