unilab.algos.torch.fast_td3.learner.FastTD3Learner

class unilab.algos.torch.fast_td3.learner.FastTD3Learner

Bases: object

FastTD3 learner aligned with the reference FastTD3 repository.

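Key hyperparameters (from Go1JoystickFlat):

- gamma=0.97, tau=0.1
- AdamW with weight_decay=0.1
- Cosine LR schedule
- Distributional critic (C51, num_atoms=101, v_min/max=±10)
- CDQ (Clipped Double Q-learning) toggle
- Observation normalization

Note that the __init__ signature on this page defaults to tau=0.01 and weight_decay=0.001, which differ from the Go1JoystickFlat values listed here.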
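For the distributional critic settings above, the following is a minimal sketch of how a C51-style support is usually laid out: num_atoms evenly spaced return values between v_min and v_max, with the scalar Q-value taken as the mean of the categorical distribution over them. The helper names make_support and expected_q are hypothetical and written in plain Python; they are not part of FastTD3Learner, which presumably works on tensors.

```python
from typing import List


def make_support(v_min: float, v_max: float, num_atoms: int) -> List[float]:
    """Evenly spaced return values ("atoms") covering [v_min, v_max]."""
    delta = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * delta for i in range(num_atoms)]


def expected_q(probs: List[float], support: List[float]) -> float:
    """Scalar Q-value: the mean of the categorical return distribution."""
    return sum(p * z for p, z in zip(probs, support))


# Values taken from the hyperparameter list: 101 atoms on [-10, 10].
support = make_support(v_min=-10.0, v_max=10.0, num_atoms=101)
atom_spacing = support[1] - support[0]  # (10 - (-10)) / 100

# A uniform distribution over a symmetric support has a mean of about zero.
uniform = [1.0 / len(support)] * len(support)
q_uniform = expected_q(uniform, support)
```

With these settings the atoms are spaced about 0.2 apart, so returns outside [-10, 10] cannot be represented and would have to be clipped onto the ends of the support.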

Methods

__init__(obs_dim, action_dim, critic_obs_dim)
get_state_dict()
load_state_dict(state_dict)
normalize_obs(obs[, update])                   Normalize observations using running statistics.
soft_update()                                  Backward-compatible alias for older call sites.
soft_update_target()                           Polyak-average update of critic target network only (matching reference FastTD3).
update_actor(data)                             One actor update step.
update_critic(data)                            One critic update step.

__init__(obs_dim, action_dim, critic_obs_dim, num_envs=1024, device='cpu', gamma=0.97, tau=0.01, actor_lr=0.0003, critic_lr=0.0003, actor_hidden_dim=512, critic_hidden_dim=1024, num_atoms=101, v_min=-10.0, v_max=10.0, init_scale=0.01, log_std_min=-3.0, log_std_max=0.0, weight_decay=0.001, use_cdq=True, policy_noise=0.1, noise_clip=0.2, policy_frequency=2, max_iterations=50000, obs_normalization=True)
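The class description mentions a cosine LR schedule, and the signature exposes max_iterations=50000 together with actor_lr and critic_lr of 0.0003. As a hedged illustration of how such a schedule typically behaves, the sketch below computes a cosine-annealed learning rate in plain Python. The function cosine_lr and its min_lr floor are hypothetical; the learner's actual scheduler and its final learning rate are not documented on this page.

```python
import math


def cosine_lr(step: int, max_iterations: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine annealing from base_lr at step 0 down to min_lr at max_iterations."""
    # Clamp so that steps outside the schedule reuse the endpoint values.
    progress = min(max(step, 0), max_iterations) / max_iterations
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Defaults from the signature: lr = 3e-4 over 50000 iterations.
lr_start = cosine_lr(0, 50_000, 3e-4)
lr_mid = cosine_lr(25_000, 50_000, 3e-4)
lr_end = cosine_lr(50_000, 50_000, 3e-4)
```

Under these assumptions the rate starts at 3e-4, passes through half that value midway, and reaches min_lr at the final iteration.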
normalize_obs(obs, update=False)

Normalize observations using running statistics.

Return type:

Tensor
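To illustrate what normalizing with running statistics and an update flag can look like, here is a small plain-Python sketch using Welford's algorithm. The class RunningNorm, its eps parameter, and its per-sample update are assumptions made for illustration; the learner's own implementation is not shown on this page and most likely operates on batched tensors.

```python
import math
from typing import List


class RunningNorm:
    """Tracks a per-dimension running mean and variance (Welford's algorithm)."""

    def __init__(self, dim: int, eps: float = 1e-8) -> None:
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations from the mean
        self.eps = eps

    def _update(self, obs: List[float]) -> None:
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs: List[float], update: bool = False) -> List[float]:
        # update=True folds obs into the statistics before normalizing;
        # update=False (the default) leaves the statistics frozen.
        if update:
            self._update(obs)
        out = []
        for i, x in enumerate(obs):
            var = self.m2[i] / self.count if self.count > 0 else 1.0
            out.append((x - self.mean[i]) / math.sqrt(var + self.eps))
        return out


norm = RunningNorm(dim=1)
for value in (1.0, 2.0, 3.0):
    norm.normalize([value], update=True)  # e.g. while collecting data

frozen = norm.normalize([2.0])  # e.g. at evaluation time, statistics untouched
```

Keeping update=False as the default means a caller has to opt in before the statistics change, which avoids drifting them by accident during evaluation.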

update_critic(data)

One critic update step.

Parameters:

data (Dict[str, Tensor])

Return type:

Dict[str, float]
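The constructor's use_cdq, policy_noise, noise_clip, and gamma arguments correspond to the standard TD3 ingredients of clipped double Q-learning and target policy smoothing. The sketch below shows that arithmetic for a single scalar transition. It is a deliberate simplification: the learner uses a distributional (C51) critic, so its real target is a projected distribution rather than a scalar, and the function names, the [-1, 1] action range, and the scaling of the noise are assumptions.

```python
def clip(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def smoothed_target_action(action: float, noise: float,
                           policy_noise: float = 0.1,
                           noise_clip: float = 0.2) -> float:
    """Target policy smoothing: add clipped, scaled noise to the target action.

    noise is a standard-normal sample; actions are assumed to lie in [-1, 1].
    """
    perturbation = clip(noise * policy_noise, -noise_clip, noise_clip)
    return clip(action + perturbation, -1.0, 1.0)


def cdq_target(reward: float, done: float,
               q1_next: float, q2_next: float,
               gamma: float = 0.97) -> float:
    """Clipped double Q-learning: bootstrap from the smaller of two estimates."""
    q_next = min(q1_next, q2_next)
    return reward + gamma * (1.0 - done) * q_next


# Non-terminal transition: 1.0 + 0.97 * min(5.0, 4.0)
target = cdq_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=4.0)
# Terminal transition: the bootstrap term is dropped.
terminal_target = cdq_target(reward=1.0, done=1.0, q1_next=5.0, q2_next=4.0)
```

Taking the minimum of the two critics counteracts overestimation bias. How the learner forms its targets when use_cdq=False is not specified here, so that case is left out of the sketch.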

update_actor(data)

One actor update step.

Parameters:

data (Dict[str, Tensor])

Return type:

Dict[str, float]
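Because update_critic and update_actor are separate methods and the constructor takes policy_frequency=2, a caller presumably interleaves them in the delayed-update pattern familiar from TD3. The sketch below shows one plausible ordering using a hypothetical StubLearner that only records calls. The exact step on which the actor is updated, the placement of soft_update_target, and the contents of data are assumptions, not documented behaviour.

```python
from typing import Dict, List


class StubLearner:
    """Hypothetical stand-in that only records which method was called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def update_critic(self, data: Dict) -> Dict[str, float]:
        self.calls.append("critic")
        return {}

    def update_actor(self, data: Dict) -> Dict[str, float]:
        self.calls.append("actor")
        return {}

    def soft_update_target(self) -> None:
        self.calls.append("target")


def run_updates(learner, batches: List[Dict], policy_frequency: int = 2) -> None:
    for step, data in enumerate(batches):
        learner.update_critic(data)        # critic: every step
        if step % policy_frequency == 0:   # actor: every policy_frequency-th step
            learner.update_actor(data)
        learner.soft_update_target()       # assumed: target tracks every step


stub = StubLearner()
run_updates(stub, batches=[{} for _ in range(6)], policy_frequency=2)
```

With six batches and policy_frequency=2, the stub records six critic updates and three actor updates.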

soft_update_target()

Polyak-average update of critic target network only (matching reference FastTD3).

Return type:

None
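A Polyak-average update moves each target parameter a fraction tau of the way towards its online counterpart. The following plain-Python sketch shows the rule on flat lists of floats; polyak_update is a hypothetical helper, and the learner itself presumably applies the same rule to the critic's parameter tensors.

```python
from typing import List


def polyak_update(target: List[float], online: List[float], tau: float = 0.01) -> None:
    """In place: target <- (1 - tau) * target + tau * online."""
    for i, (t, o) in enumerate(zip(target, online)):
        target[i] = (1.0 - tau) * t + tau * o


target_params = [0.0, 2.0]
online_params = [1.0, 2.0]
polyak_update(target_params, online_params, tau=0.01)
# The first entry moves 1% of the way from 0.0 to 1.0;
# the second already matches the online value and stays put.
```

A small tau, such as the constructor default of 0.01, keeps the target network changing slowly, which stabilizes the bootstrapped critic targets.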

soft_update()

Backward-compatible alias for older call sites.

Return type:

None

get_state_dict()

Return type:

Dict

load_state_dict(state_dict)

Parameters:

state_dict (Dict)

Return type:

None