unilab.algos.torch.flash_sac.update

FlashSAC update helpers.

Functions

build_lr_lambda(init_lr, peak_lr, end_lr, ...)

compute_categorical_td_target(support, ...)

resolve_target_entropy(action_dim, ...)

select_min_q_log_probs(next_q_values, ...)

unilab.algos.torch.flash_sac.update.build_lr_lambda(init_lr, peak_lr, end_lr, warmup_steps, decay_steps)
Parameters:
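
The parameter descriptions are not reproduced here, so the following is only a sketch of one plausible reading of the signature: a linear warmup from init_lr to peak_lr over warmup_steps, followed by a cosine decay to end_lr over decay_steps. The name sketch_lr_lambda, the cosine shape, and the choice to express the result as a multiplier of peak_lr are all assumptions and may differ from what build_lr_lambda actually does.

```python
import math


def sketch_lr_lambda(init_lr, peak_lr, end_lr, warmup_steps, decay_steps):
    """Hypothetical stand-in for illustration; not the unilab implementation."""

    def lr_lambda(step):
        if warmup_steps > 0 and step < warmup_steps:
            # Assumed: linear ramp from init_lr up to peak_lr.
            lr = init_lr + (peak_lr - init_lr) * step / warmup_steps
        else:
            # Assumed: cosine decay from peak_lr down to end_lr, then hold.
            progress = min(1.0, (step - warmup_steps) / max(1, decay_steps))
            cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
            lr = end_lr + (peak_lr - end_lr) * cosine
        # Assumed: the multiplier is relative to peak_lr as the base rate.
        return lr / peak_lr

    return lr_lambda


schedule = sketch_lr_lambda(
    init_lr=1e-5, peak_lr=3e-4, end_lr=3e-5, warmup_steps=100, decay_steps=1000
)
```

A callable of this shape can be passed as the lr_lambda argument of torch.optim.lr_scheduler.LambdaLR, provided the optimizer's base learning rate matches the rate the multiplier is defined against (peak_lr in this sketch).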

unilab.algos.torch.flash_sac.update.select_min_q_log_probs(next_q_values, next_q_log_probs)
Parameters:
Return type:

Tensor

unilab.algos.torch.flash_sac.update.compute_categorical_td_target(support, target_log_probs, reward, dones, truncated, actor_entropy, gamma)
Parameters:
Return type:

Tensor

unilab.algos.torch.flash_sac.update.resolve_target_entropy(action_dim, target_sigma, target_entropy)
Parameters:
Return type:

float
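
The resolution rules for resolve_target_entropy are not documented here, so the following is a sketch under stated assumptions rather than the library's logic. It assumes that an explicit target_entropy takes precedence, that target_sigma is otherwise read as the per-dimension standard deviation of a diagonal Gaussian (whose differential entropy is 0.5 * log(2 * pi * e * sigma^2) per dimension), and that the common SAC heuristic of -action_dim is the fallback. The name sketch_target_entropy and the optional defaults are hypothetical, and the sketch ignores any correction for action squashing.

```python
import math


def sketch_target_entropy(action_dim, target_sigma=None, target_entropy=None):
    """Hypothetical stand-in for illustration; not the unilab implementation."""
    # Assumed: an explicitly supplied target entropy wins.
    if target_entropy is not None:
        return float(target_entropy)
    # Assumed: target_sigma is a per-dimension Gaussian standard deviation.
    if target_sigma is not None:
        per_dim = 0.5 * math.log(2.0 * math.pi * math.e * target_sigma ** 2)
        return float(action_dim * per_dim)
    # Assumed fallback: the widely used SAC heuristic of -action_dim.
    return -float(action_dim)
```

Under these assumptions, a smaller target_sigma yields a lower target entropy, which pushes the temperature update toward a more deterministic policy.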