unilab.algos.torch.flash_sac.learner.RewardNormalizer
- class unilab.algos.torch.flash_sac.learner.RewardNormalizer
Bases: object
Adaptive reward scaling with running discounted-return statistics.
Methods
__init__(gamma, g_max, device[, eps])
load_state_dict(state_dict)
normalize(rewards)
update_from_transitions(rewards, dones)
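The page gives only method signatures, so the following is a minimal sketch of one common way to scale rewards from running discounted-return statistics. It is not the unilab implementation. The class name `SketchRewardNormalizer`, its attributes, the scale formula (the larger of the return's standard deviation and its largest magnitude divided by `g_max`), and the `state_dict()` helper are all assumptions made for illustration. The `device` argument is dropped because the sketch uses plain Python floats rather than tensors.

```python
import math


class SketchRewardNormalizer:
    """Hypothetical stand-in; NOT the unilab RewardNormalizer."""

    def __init__(self, gamma, g_max, eps=1e-8):
        self.gamma = gamma      # discount factor
        self.g_max = g_max      # assumed: cap on the scaled return magnitude
        self.eps = eps          # numerical floor inside the square root
        self.ret = 0.0          # running discounted return G_t
        self.count = 0          # number of returns seen
        self.mean = 0.0         # running mean of G_t (Welford)
        self.m2 = 0.0           # running sum of squared deviations (Welford)
        self.max_abs = 0.0      # largest |G_t| seen so far

    def update_from_transitions(self, rewards, dones):
        for r, d in zip(rewards, dones):
            # G_t = gamma * G_{t-1} + r_t
            self.ret = self.gamma * self.ret + r
            self.count += 1
            delta = self.ret - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (self.ret - self.mean)
            self.max_abs = max(self.max_abs, abs(self.ret))
            if d:
                # Episode ended: the next return starts from zero.
                self.ret = 0.0

    def scale(self):
        # Before any update, fall back to unit variance.
        var = self.m2 / self.count if self.count > 0 else 1.0
        return max(math.sqrt(var + self.eps), self.max_abs / self.g_max)

    def normalize(self, rewards):
        s = self.scale()
        return [r / s for r in rewards]

    def state_dict(self):
        # Assumed helper so the statistics can be checkpointed.
        return dict(vars(self))

    def load_state_dict(self, state_dict):
        for key, value in state_dict.items():
            setattr(self, key, value)


norm = SketchRewardNormalizer(gamma=0.5, g_max=1.0)
norm.update_from_transitions([1.0, 1.0], [False, True])
scaled = norm.normalize([3.0])

restored = SketchRewardNormalizer(gamma=0.5, g_max=1.0)
restored.load_state_dict(norm.state_dict())
```

Under these assumptions, the statistics are updated once per collected transition and `normalize` is applied to sampled rewards before they enter the critic target, so the target magnitude stays bounded as the policy improves.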