
TensorBoard Callback

lerax.callback.tensorboard.TensorBoardCallbackStepState

Bases: AbstractCallbackStepState

State for TensorBoardCallback.

Tracks the cumulative return and length of the current episode, along with exponential moving averages of both across episodes.

Attributes:

- step (Int[Array, '']): Current training step.
- episode_return (Float[Array, '']): Cumulative return for the current episode.
- episode_length (Int[Array, '']): Length of the current episode.
- episode_done (Bool[Array, '']): Boolean indicating if the current episode is done.
- average_return (Float[Array, '']): Exponential moving average of episode returns.
- average_length (Float[Array, '']): Exponential moving average of episode lengths.
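
The per-episode bookkeeping these attributes describe can be sketched in plain Python. This is an illustration only, not lerax's implementation: `EpisodeTracker` and `track_step` are hypothetical names, the real state holds JAX arrays rather than Python scalars, and restarting the counters on the step after `episode_done` is set is an assumption. The moving averages are left out of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EpisodeTracker:
    """Hypothetical stand-in for the step state (plain scalars, not JAX arrays)."""

    step: int = 0
    episode_return: float = 0.0
    episode_length: int = 0
    episode_done: bool = False


def track_step(state: EpisodeTracker, reward: float, done: bool) -> EpisodeTracker:
    """Fold one environment step into the running episode statistics."""
    # Assumption: if the previous step ended an episode, start counting from zero.
    base_return = 0.0 if state.episode_done else state.episode_return
    base_length = 0 if state.episode_done else state.episode_length
    return replace(
        state,
        step=state.step + 1,
        episode_return=base_return + reward,
        episode_length=base_length + 1,
        episode_done=done,
    )


state = EpisodeTracker()
# Two steps finish the first episode; the third step begins a new one.
for reward, done in [(1.0, False), (2.0, True), (5.0, False)]:
    state = track_step(state, reward, done)
```

After the loop, the tracker has seen three steps in total and is one step into the second episode, so its return and length reflect only the final reward.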

Parameters:

- step (Int[Array, ''], required): Current training step.
- episode_return (Float[ArrayLike, ''], required): Cumulative return for the current episode.
- episode_length (Int[ArrayLike, ''], required): Length of the current episode.
- episode_done (Bool[ArrayLike, ''], required): Boolean indicating if the current episode is done.
- average_return (Float[ArrayLike, ''], required): Exponential moving average of episode returns.
- average_length (Float[ArrayLike, ''], required): Exponential moving average of episode lengths.

lerax.callback.TensorBoardCallback

Bases: AbstractCallback[EmptyCallbackState, TensorBoardCallbackStepState]

Callback for recording training statistics to TensorBoard.

Each training iteration, the following statistics are logged:

- episode/return: The exponential moving average of episode returns.
- episode/length: The exponential moving average of episode lengths.
- train/:
    - learning_rate: The current learning rate.
    - ...: Any other statistics in the training log.
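
To make the tag layout concrete, the sketch below assembles the scalar tags for one iteration in plain Python. `build_scalar_tags` is a hypothetical helper, the `loss` entry is invented for illustration, and treating the training log as a flat dict of floats is an assumption; how lerax actually hands values to its writer may differ.

```python
def build_scalar_tags(
    average_return: float,
    average_length: float,
    training_log: dict[str, float],
) -> dict[str, float]:
    """Map episode averages and a training log onto TensorBoard-style tags."""
    tags = {
        "episode/return": average_return,
        "episode/length": average_length,
    }
    # Every entry of the (assumed flat) training log goes under the train/ prefix.
    for key, value in training_log.items():
        tags[f"train/{key}"] = value
    return tags


# "loss" is a made-up statistic standing in for "any other statistics".
tags = build_scalar_tags(12.5, 40.0, {"learning_rate": 3e-4, "loss": 0.7})
```

Grouping by prefix is what lets TensorBoard show the episode statistics and the training statistics in separate sections of the scalar dashboard.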

Note

If the callback is instantiated inside a JIT-compiled function, it may not work correctly.

Attributes:

- tb_writer (JITSummaryWriter): The TensorBoard summary writer.
- alpha (float): Smoothing factor for exponential moving averages.

Parameters:

- name (str | None, default None): Name for the TensorBoard log directory. If None, a name is generated based on the current time, environment name, and policy name.
- env (AbstractEnvLike | None, default None): The environment being trained on. Used for naming if name is None.
- policy (AbstractPolicy | None, default None): The policy being trained. Used for naming if name is None.
- alpha (float, default 0.9): Smoothing factor for exponential moving averages.
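
The role of alpha can be illustrated with a small plain-Python sketch of an exponential moving average. `ema_update` is a hypothetical function, and two details are assumptions rather than documented behavior: that alpha weights the previous average (with 1 - alpha applied to the new value), and that the average starts at zero.

```python
def ema_update(average: float, value: float, alpha: float = 0.9) -> float:
    """Blend a new episode statistic into the running average."""
    # Assumed convention: alpha weights the old average, (1 - alpha) the new value.
    return alpha * average + (1.0 - alpha) * value


average = 0.0  # assumed starting value
for episode_return in [10.0, 10.0, 10.0]:
    average = ema_update(average, episode_return)
```

Under these assumptions, three episodes that each return 10.0 move the average to only about 2.71, which shows how slowly a value of 0.9 reacts. A smaller alpha would track recent episodes more closely at the cost of a noisier curve.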

__init__

__init__(
    name: str | None = None,
    env: AbstractEnvLike | None = None,
    policy: AbstractPolicy | None = None,
    alpha: float = 0.9,
)

reset

reset(ctx: ResetContext, *, key: Key) -> EmptyCallbackState

step_reset

step_reset(
    ctx: ResetContext, *, key: Key
) -> TensorBoardCallbackStepState

on_step

on_step(
    ctx: StepContext, *, key: Key
) -> TensorBoardCallbackStepState

on_iteration

on_iteration(
    ctx: IterationContext, *, key: Key
) -> EmptyCallbackState

on_training_start

on_training_start(
    ctx: TrainingContext, *, key: Key
) -> EmptyCallbackState

on_training_end

on_training_end(
    ctx: TrainingContext, *, key: Key
) -> EmptyCallbackState