unilab.algos.mlx.ppo.runner.MLXPPOAgent
- class unilab.algos.mlx.ppo.runner.MLXPPOAgent
Bases: object
High-level PPO wrapper to keep train script lightweight.
Methods
- __init__(cfg, obs_dim, action_dim, learning_rate)
- act(obs)
- current_action_std(action_shape)
- load_trainer_state(trainer_state_path)
- load_weights(path)
- normalize_rewards(rewards)
- policy_mean(obs)
- save_checkpoint(model_path, ...)
- update(buffer, last_obs)
- update_normalization(obs)
- update(buffer, last_obs)
- Parameters:
buffer (RolloutBuffer)
last_obs (array)
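The sketch below shows one way a training script might call act(obs) and update(buffer, last_obs) in sequence. It is not taken from unilab. `PPOAgentLike`, `run_iteration`, `step_env`, and `rollout_len` are hypothetical names introduced for illustration. The return values of act and update are not documented here, so the sketch treats them as opaque. It also assumes that last_obs is the observation that follows the final stored transition.

```python
from typing import Any, Callable, Protocol


class PPOAgentLike(Protocol):
    """Hypothetical stand-in that mirrors two method names from the listing."""

    def act(self, obs: Any) -> Any: ...
    def update(self, buffer: Any, last_obs: Any) -> Any: ...


def run_iteration(
    agent: PPOAgentLike,
    buffer: Any,
    first_obs: Any,
    step_env: Callable[[Any, Any, Any], Any],
    rollout_len: int,
) -> Any:
    """Collect one rollout, then call update(buffer, last_obs) once.

    step_env(buffer, obs, act_out) is a caller-supplied hook (an assumption):
    it steps the environment, stores the transition in buffer, and returns
    the next observation.
    """
    obs = first_obs
    for _ in range(rollout_len):
        act_out = agent.act(obs)  # return value treated as opaque here
        obs = step_env(buffer, obs, act_out)
    # Assumption: the observation after the last stored step is last_obs.
    return agent.update(buffer, obs)
```

Because the agent is typed as a Protocol, any object that exposes act and update with these argument lists can be passed in, which keeps the sketch runnable without the library installed.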