Continuous Mountain Car
lerax.env.classic_control.ContinuousMountainCar
Bases: AbstractClassicControlEnv[ContinuousMountainCarState, Float[Array, ''], Float[Array, '2']]
Continuous Mountain Car environment, matching Gymnasium's Continuous Mountain Car environment (`MountainCarContinuous-v0`).
Note
To reproduce Gymnasium's dynamics exactly, set `solver=diffrax.Euler()`.
Action Space
The action space is a 1-dimensional continuous space representing the force applied to the car in the range [-1.0, 1.0].
Observation Space
The observation space is a 2-dimensional continuous space representing the position and velocity of the car:
| Index | Observation | Min Value | Max Value |
|---|---|---|---|
| 0 | Car Position | -1.2 | 0.6 |
| 1 | Car Velocity | -0.07 | 0.07 |
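As a reference point for the note on matching Gymnasium, the following is a minimal plain-Python sketch of the discrete per-step update that Gymnasium's implementation applies, using the default bounds documented for this class. The gravity term `0.0025 * cos(3 * position)` and the zeroing of leftward velocity at the left wall are taken from Gymnasium's source rather than from this page, and the function name `gymnasium_style_step` is hypothetical; it is not part of lerax.

```python
import math

# Defaults as documented for this class.
MIN_ACTION, MAX_ACTION = -1.0, 1.0
MIN_POSITION, MAX_POSITION = -1.2, 0.6
MAX_SPEED = 0.07
POWER = 0.0015
# Assumption: gravity-like constant taken from Gymnasium's implementation.
GRAVITY = 0.0025


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def gymnasium_style_step(position, velocity, action):
    """Hypothetical helper: one discrete update in the style of Gymnasium."""
    # The applied force is the action clipped to the action space.
    force = clip(action, MIN_ACTION, MAX_ACTION)
    # Engine force minus the slope-dependent gravity term.
    velocity = velocity + force * POWER - GRAVITY * math.cos(3.0 * position)
    velocity = clip(velocity, -MAX_SPEED, MAX_SPEED)
    # Position advances by the already-updated velocity, then is clipped.
    position = clip(position + velocity, MIN_POSITION, MAX_POSITION)
    # Hitting the left wall while moving left stops the car.
    if position == MIN_POSITION and velocity < 0.0:
        velocity = 0.0
    return position, velocity


position, velocity = gymnasium_style_step(-0.5, 0.0, 1.0)
print(position, velocity)
```

Both returned values stay inside the observation bounds in the table above, because velocity and position are clipped on every step.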
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `min_action` | `Float[ArrayLike, '']` | Minimum action value (default: -1.0). | `-1.0` |
| `max_action` | `Float[ArrayLike, '']` | Maximum action value (default: 1.0). | `1.0` |
| `min_position` | `Float[ArrayLike, '']` | Minimum position of the car (default: -1.2). | `-1.2` |
| `max_position` | `Float[ArrayLike, '']` | Maximum position of the car (default: 0.6). | `0.6` |
| `max_speed` | `Float[ArrayLike, '']` | Maximum speed of the car (default: 0.07). | `0.07` |
| `goal_position` | `Float[ArrayLike, '']` | Position at which the goal is reached (default: 0.5). | `0.5` |
| `power` | `Float[ArrayLike, '']` | Power of the car's engine (default: 0.0015). | `0.0015` |
| `dt` | `Float[ArrayLike, '']` | Time step for each action (default: 1.0). | `1.0` |
| `solver` | `diffrax.AbstractSolver \| None` | Diffrax solver to use for ODE integration (default: Tsit5). | `None` |
| `stepsize_controller` | `diffrax.AbstractStepSizeController \| None` | Step size controller for adaptive solvers (default: PIDController with rtol=1e-5, atol=1e-5). | `None` |
render_states
render_states(
states: Sequence[StateType],
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render a sequence of frames from multiple states.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `states` | `Sequence[StateType]` | A sequence of environment states to render. | required |
| `renderer` | `AbstractRenderer \| Literal['auto']` | The renderer to use for rendering. If "auto", uses the default renderer. | `'auto'` |
| `dt` | `float` | The time delay between rendering each frame, in seconds. | `0.0` |
render_stacked
render_stacked(
states: StateType,
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render multiple frames from stacked states.
Stacked states are typically batched states stored in a pytree structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `states` | `StateType` | A pytree of stacked environment states to render. | required |
| `renderer` | `AbstractRenderer \| Literal['auto']` | The renderer to use for rendering. If "auto", uses the default renderer. | `'auto'` |
| `dt` | `float` | The time delay between rendering each frame, in seconds. | `0.0` |
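The difference between a sequence of states (the input to render_states) and a single stacked state (the input to render_stacked) can be sketched without JAX. The `CarState` type and its `position` and `velocity` fields below are hypothetical stand-ins, since the fields of ContinuousMountainCarState are not documented on this page. In practice the leaves would be JAX arrays with a leading batch axis rather than Python lists.

```python
from typing import NamedTuple


class CarState(NamedTuple):
    """Hypothetical stand-in for an environment state."""

    position: object
    velocity: object


def stack(states):
    """Turn a sequence of states into one state whose fields hold every step."""
    return CarState(
        position=[s.position for s in states],
        velocity=[s.velocity for s in states],
    )


def unstack(stacked):
    """Turn a stacked state back into a list of per-step states."""
    return [CarState(p, v) for p, v in zip(stacked.position, stacked.velocity)]


# A sequence of states: one object per frame.
sequence = [CarState(-0.5, 0.0), CarState(-0.49, 0.01), CarState(-0.47, 0.02)]

# A stacked state: one object whose fields contain all frames.
stacked = stack(sequence)
print(stacked)
```

With JAX arrays, the same stacking is commonly done by mapping `jnp.stack` over the pytree leaves, for example `jax.tree_util.tree_map(lambda *xs: jnp.stack(xs), *states)`.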
reset
Wrap the functional logic into a Gym API reset method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `Key[Array, '']` | A JAX PRNG key for any stochasticity in the reset. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[StateType, ObsType, dict]` | A tuple of the initial state, initial observation, and additional info. |
step
step(
state: StateType,
action: ActType,
*,
key: Key[Array, ""],
) -> tuple[
StateType,
ObsType,
Float[Array, ""],
Bool[Array, ""],
Bool[Array, ""],
dict,
]
Wrap the functional logic into a Gym API step method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `state` | `StateType` | The current environment state. | required |
| `action` | `ActType` | The action to take. | required |
| `key` | `Key[Array, '']` | A JAX PRNG key for any stochasticity in the step. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[StateType, ObsType, Float[Array, ''], Bool[Array, ''], Bool[Array, ''], dict]` | A tuple of the next state, observation, reward, termination flag, truncation flag, and additional info. |
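The call pattern implied by the reset and step descriptions, where the state is passed in and returned explicitly rather than stored on the environment, can be sketched as a rollout loop. `StubEnv` below is a hypothetical stand-in written in plain Python so the example runs without lerax or JAX. It only mimics the documented return shapes, its reward is a toy value, and the integer `key` values stand in for JAX PRNG keys. Passing `key` to `reset` by keyword is also an assumption, since the reset signature is not shown on this page.

```python
class StubEnv:
    """Hypothetical environment that mimics the documented return shapes."""

    def reset(self, *, key):
        state = 0  # toy state: a step counter
        return state, float(state), {}

    def step(self, state, action, *, key):
        next_state = state + 1
        observation = float(next_state)
        reward = -0.1 * action**2  # toy reward, not the environment's reward
        terminated = next_state >= 3  # toy goal condition
        truncated = False
        return next_state, observation, reward, terminated, truncated, {}


def rollout(env, policy, max_steps, seed=0):
    """Run one episode, threading the state through every step call."""
    state, observation, info = env.reset(key=seed)
    total_reward, steps = 0.0, 0
    for t in range(max_steps):
        action = policy(observation)
        # A distinct placeholder key is passed for every step.
        state, observation, reward, terminated, truncated, info = env.step(
            state, action, key=seed + t + 1
        )
        total_reward += reward
        steps += 1
        # Stop on either flag: the goal was reached or the episode was cut off.
        if terminated or truncated:
            break
    return total_reward, steps


total_reward, steps = rollout(StubEnv(), policy=lambda obs: 1.0, max_steps=10)
print(total_reward, steps)
```

With a real environment, a fresh key would typically be derived for each step with `jax.random.split`, and the Python loop could be replaced by `jax.lax.scan` if the rollout needs to be jit-compiled.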