unilab.algos.torch.hora.appo_learner
HORA-owned APPO learner with grouped actor and privileged observations.
Classes
HoraAPPOLearner | APPO learner variant for HORA grouped observations.
- class unilab.algos.torch.hora.appo_learner.HoraAPPOLearner
Bases: APPOLearner

APPO learner variant for HORA grouped observations.
- Parameters:
  - actor (MLPModel)
  - critic (MLPModel)
  - num_learning_epochs (int)
  - num_mini_batches (int)
  - clip_param (float)
  - gamma (float)
  - lam (float)
  - value_loss_coef (float)
  - entropy_coef (float)
  - learning_rate (float)
  - max_grad_norm (float)
  - use_clipped_value_loss (bool)
  - schedule (str)
  - desired_kl (float)
  - adaptive_kl_factor (float)
  - adaptive_lr_factor (float)
  - device (str)
  - optimizer (str)
  - tau (float)
  - target_update_freq (int)
  - vtrace_clip_rho (float)
  - vtrace_clip_c (float)
  - enable_compile (bool)
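
The parameter list does not say how `gamma`, `vtrace_clip_rho` and `vtrace_clip_c` are used. In the standard V-trace formulation, they are the discount factor and the two clipping thresholds on the importance-sampling ratio. The sketch below illustrates that textbook recursion in plain Python. It is not taken from `HoraAPPOLearner`: the function `vtrace_targets` and its arguments are hypothetical names, and the real learner may differ (for example in how it handles episode termination or `lam`).

```python
import math


def vtrace_targets(rewards, values, bootstrap_value, log_ratios,
                   gamma=0.99, clip_rho=1.0, clip_c=1.0):
    """Illustrative V-trace value targets for one trajectory (hypothetical helper).

    rewards, values, log_ratios: equal-length sequences of floats, where
    log_ratios[t] = log pi(a_t | x_t) - log mu(a_t | x_t) (target vs. behaviour policy).
    bootstrap_value: value estimate for the state after the last step.
    Assumes no episode termination inside the trajectory.
    """
    values = list(values)
    n = len(rewards)
    next_values = values[1:] + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # running value of vs_{t+1} - V(x_{t+1})
    for t in reversed(range(n)):
        ratio = math.exp(log_ratios[t])
        rho = min(clip_rho, ratio)  # clipped weight on the TD error
        c = min(clip_c, ratio)      # clipped weight on the propagated correction
        delta = rho * (rewards[t] + gamma * next_values[t] - values[t])
        acc = delta + gamma * c * acc
        targets[t] = values[t] + acc
    return targets
```

With on-policy data (all log-ratios zero) and both thresholds at 1.0, the targets reduce to bootstrapped discounted returns. Ratios above a threshold are truncated to it, which bounds the variance of the off-policy correction.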