TensorBoard Callback

lerax.callback.tensorboard.TensorBoardCallbackStepState

Bases: AbstractCallbackStepState

State for TensorBoardCallback.

Tracks the cumulative return and length of the current episode, along with exponential moving averages of both over episodes.

Attributes:

Name	Type	Description
`step`	`Int[Array, '']`	Current training step.
`episode_return`	`Float[Array, '']`	Cumulative return for the current episode.
`episode_length`	`Int[Array, '']`	Length of the current episode.
`episode_done`	`Bool[Array, '']`	Boolean indicating if the current episode is done.
`average_return`	`Float[Array, '']`	Exponential moving average of episode returns.
`average_length`	`Float[Array, '']`	Exponential moving average of episode lengths.

Parameters:

Name	Type	Description	Default
`step`	`Int[Array, '']`	Current training step.	required
`episode_return`	`Float[ArrayLike, '']`	Cumulative return for the current episode.	required
`episode_length`	`Int[ArrayLike, '']`	Length of the current episode.	required
`episode_done`	`Bool[ArrayLike, '']`	Boolean indicating if the current episode is done.	required
`average_return`	`Float[ArrayLike, '']`	Exponential moving average of episode returns.	required
`average_length`	`Float[ArrayLike, '']`	Exponential moving average of episode lengths.	required
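The fields above suggest a per-step update that accumulates the return and length of the running episode and folds each finished episode into the moving averages. The following is a minimal sketch of such an update, not the library's implementation. `StepState`, `initial_state`, and `update` are hypothetical names, and the convention `new_avg = alpha * old_avg + (1 - alpha) * value` is an assumption; the library may weight the terms the other way around or initialize the averages differently.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class StepState(NamedTuple):
    """Hypothetical stand-in that mirrors the documented field names."""

    step: jax.Array
    episode_return: jax.Array
    episode_length: jax.Array
    episode_done: jax.Array
    average_return: jax.Array
    average_length: jax.Array


def initial_state() -> StepState:
    # All counters and averages start at zero (an assumption).
    return StepState(
        step=jnp.array(0, dtype=jnp.int32),
        episode_return=jnp.array(0.0, dtype=jnp.float32),
        episode_length=jnp.array(0, dtype=jnp.int32),
        episode_done=jnp.array(False),
        average_return=jnp.array(0.0, dtype=jnp.float32),
        average_length=jnp.array(0.0, dtype=jnp.float32),
    )


@jax.jit
def update(state: StepState, reward, done, alpha) -> StepState:
    # Restart the counters if the previous step ended an episode.
    prev_return = jnp.where(state.episode_done, 0.0, state.episode_return)
    prev_length = jnp.where(state.episode_done, 0, state.episode_length)
    episode_return = prev_return + reward
    episode_length = prev_length + 1

    # Assumed convention: alpha weights the old average.
    ema_return = alpha * state.average_return + (1 - alpha) * episode_return
    ema_length = alpha * state.average_length + (1 - alpha) * episode_length

    # Only a finished episode changes the averages.
    return StepState(
        step=state.step + 1,
        episode_return=episode_return,
        episode_length=episode_length,
        episode_done=jnp.asarray(done, dtype=jnp.bool_),
        average_return=jnp.where(done, ema_return, state.average_return),
        average_length=jnp.where(done, ema_length, state.average_length),
    )


# Three steps of reward 1.0 ending an episode, then one step of a new episode.
state = initial_state()
for reward, done in [(1.0, False), (1.0, False), (1.0, True), (2.0, False)]:
    state = update(state, reward, done, 0.9)
```

Using `jnp.where` rather than a Python `if` keeps the update traceable under `jax.jit`, since `done` is only known at run time.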

lerax.callback.TensorBoardCallback

Bases: AbstractCallback[EmptyCallbackState, TensorBoardCallbackStepState]

Callback for recording training statistics to TensorBoard.

Each training iteration, the following statistics are logged:

- episode/return: The exponential moving average of episode returns.
- episode/length: The exponential moving average of episode lengths.
- train/:
    - learning_rate: The current learning rate.
    - ...: Any other statistics in the training log.

Note

Instantiate the callback outside of any JIT-compiled function. A callback created inside one may not work correctly.

Attributes:

Name	Type	Description
`tb_writer`	`JITSummaryWriter`	The TensorBoard summary writer.
`alpha`	`float`	Smoothing factor for exponential moving averages.
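The `JITSummaryWriter` type suggests that scalars are written from inside jitted training code. How the writer does this is not documented here. As a general illustration only, one standard JAX mechanism for host-side logging from traced code is `jax.debug.callback`. In the sketch below, `logged`, `host_log`, and `train_iteration` are hypothetical names, and a plain Python list stands in for the TensorBoard writer.

```python
from functools import partial

import jax
import jax.numpy as jnp

# Host-side store standing in for a TensorBoard writer (hypothetical).
logged: list[tuple[str, float, int]] = []


def host_log(tag: str, value, step) -> None:
    # Runs on the host with concrete values, outside the traced computation.
    logged.append((tag, float(value), int(step)))


@jax.jit
def train_iteration(step, average_return):
    # Schedule a host call that receives the runtime values of the arguments.
    jax.debug.callback(partial(host_log, "episode/return"), average_return, step)
    return step + 1


next_step = train_iteration(jnp.array(0), jnp.array(1.5))
jax.effects_barrier()  # wait until pending host callbacks have run
```

The host-side store is created once, outside the jitted function, and only the logging call is issued from within it. This is consistent with the note above about instantiating the callback outside JIT-compiled code.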

Parameters:

Name	Type	Description	Default
`name`	`str | None`	Name for the TensorBoard log directory. If None, a name is generated based on the current time, environment name, and policy name.	`None`
`env`	`AbstractEnvLike | None`	The environment being trained on. Used for naming if `name` is None.	`None`
`policy`	`AbstractPolicy | None`	The policy being trained. Used for naming if `name` is None.	`None`
`alpha`	`float`	Smoothing factor for exponential moving averages.	`0.9`

__init__

__init__(
    name: str | None = None,
    env: AbstractEnvLike | None = None,
    policy: AbstractPolicy | None = None,
    alpha: float = 0.9,
)

reset

reset(ctx: ResetContext, *, key: Key) -> EmptyCallbackState

step_reset

step_reset(
    ctx: ResetContext, *, key: Key
) -> TensorBoardCallbackStepState

on_step

on_step(
    ctx: StepContext, *, key: Key
) -> TensorBoardCallbackStepState

on_iteration

on_iteration(
    ctx: IterationContext, *, key: Key
) -> EmptyCallbackState

on_training_start

on_training_start(
    ctx: TrainingContext, *, key: Key
) -> EmptyCallbackState

on_training_end

on_training_end(
    ctx: TrainingContext, *, key: Key
) -> EmptyCallbackState