Humanoid
Lerax port of Gymnasium's Humanoid environment. A 3D humanoid with 17 actuated joints must walk in the +x direction while remaining upright.
Observation space
Flattened concatenation of qpos, qvel, cinert[1:], cvel[1:], qfrc_actuator[6:], and cfrc_ext[1:]. Each of the last four blocks is individually switchable via include_*_in_observation (all default True). When exclude_current_positions_from_observation=True (default) the first two entries of qpos are dropped. Unbounded Box.
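To make the layout concrete, here is a minimal sketch that computes the length of the flattened observation. The block sizes are assumptions taken from Gymnasium's Humanoid-v5 model (24 qpos entries, 23 qvel entries, 13 non-world bodies); the Lerax model may differ. The function name and the `include` argument are hypothetical stand-ins for the include_*_in_observation switches.

```python
# Assumed sizes from Gymnasium's Humanoid-v5 model; not read from Lerax.
QPOS_SIZE = 24
QVEL_SIZE = 23
OPTIONAL_BLOCKS = {
    "cinert": 13 * 10,        # cinert[1:]: 10 values per non-world body
    "cvel": 13 * 6,           # cvel[1:]: 6 values per non-world body
    "qfrc_actuator": 23 - 6,  # qfrc_actuator[6:]: drops the 6 root DoFs
    "cfrc_ext": 13 * 6,       # cfrc_ext[1:]: 6 values per non-world body
}


def observation_size(exclude_current_positions=True, include=tuple(OPTIONAL_BLOCKS)):
    """Length of the flattened observation for the given switches."""
    # The first two qpos entries are dropped when positions are excluded.
    qpos = QPOS_SIZE - 2 if exclude_current_positions else QPOS_SIZE
    return qpos + QVEL_SIZE + sum(OPTIONAL_BLOCKS[name] for name in include)
```

Under these assumptions the default configuration yields 348 entries, the same as Gymnasium's Humanoid-v5.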
Action space
Box(low, high) from the model's actuator_ctrlrange — 17 continuous joint torques.
Reward
forward_reward + healthy_reward - ctrl_cost - contact_cost with
- `forward_reward = forward_reward_weight * x_velocity_of_center_of_mass` (default weight `1.25`)
- `healthy_reward = is_healthy * healthy_reward` (default `5.0`)
- `ctrl_cost = ctrl_cost_weight * sum(action ** 2)` (default `0.1`)
- `contact_cost = contact_cost_weight * clip(sum(cfrc_ext ** 2), -inf, 10.0)` (default weight `5e-7`)
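The reward terms above can be sketched as a standalone NumPy function. This is an illustration of the formula as stated here, not Lerax's implementation; the function name and argument names are hypothetical, and the defaults are the ones listed above.

```python
import numpy as np


def humanoid_reward(
    x_velocity,
    is_healthy,
    action,
    cfrc_ext,
    forward_reward_weight=1.25,
    healthy_reward=5.0,
    ctrl_cost_weight=0.1,
    contact_cost_weight=5e-7,
):
    """Combine the four reward terms exactly as written in the formula above."""
    forward = forward_reward_weight * x_velocity
    healthy = float(is_healthy) * healthy_reward
    ctrl_cost = ctrl_cost_weight * np.sum(np.square(action))
    # The summed squared contact forces are clipped from above at 10.0
    # before the weight is applied.
    contact_cost = contact_cost_weight * np.clip(
        np.sum(np.square(cfrc_ext)), -np.inf, 10.0
    )
    return forward + healthy - ctrl_cost - contact_cost
```

With the default weights, a healthy humanoid moving at 1 m/s with zero action and zero contact forces receives 1.25 + 5.0 = 6.25.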
Termination
If terminate_when_unhealthy=True (default), terminates when torso z leaves healthy_z_range=(1.0, 2.0). No built-in truncation.
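The termination rule can be sketched as follows. The helper names are hypothetical, and the use of strict inequalities at the range boundaries is an assumption borrowed from Gymnasium's Humanoid; this page does not specify how the boundary values are treated.

```python
def is_healthy(torso_z, healthy_z_range=(1.0, 2.0)):
    """True while the torso height lies inside the healthy range."""
    z_min, z_max = healthy_z_range
    # Strict bounds are an assumption; see the note above.
    return z_min < torso_z < z_max


def is_terminated(torso_z, terminate_when_unhealthy=True, healthy_z_range=(1.0, 2.0)):
    """An episode terminates only if the flag is set and the state is unhealthy."""
    return terminate_when_unhealthy and not is_healthy(torso_z, healthy_z_range)
```

Because there is no built-in truncation, an episode with terminate_when_unhealthy=False never ends on its own, so the caller needs to impose a step limit.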
lerax.env.mujoco.Humanoid
Bases: AbstractMujocoEnv[Float[Array, '...'], Float[Array, '...']]
MJX-based humanoid environment roughly matching Gymnasium's Humanoid-v5.
transition
render_states
render_states(
states: Sequence[StateType],
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render a sequence of frames from multiple states.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `states` | `Sequence[StateType]` | A sequence of environment states to render. | required |
| `renderer` | `AbstractRenderer \| Literal['auto']` | The renderer to use for rendering. If "auto", uses the default renderer. | `'auto'` |
| `dt` | `float` | The time delay between rendering each frame, in seconds. | `0.0` |
render_stacked
render_stacked(
states: StateType,
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render multiple frames from stacked states.
Stacked states are typically batched states stored in a pytree structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `states` | `StateType` | A pytree of stacked environment states to render. | required |
| `renderer` | `AbstractRenderer \| Literal['auto']` | The renderer to use for rendering. If "auto", uses the default renderer. | `'auto'` |
| `dt` | `float` | The time delay between rendering each frame, in seconds. | `0.0` |
reset
Wrap the functional logic into a Gym API reset method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `Key[Array, '']` | A JAX PRNG key for any stochasticity in the reset. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[StateType, ObsType, dict]` | A tuple of the initial state, initial observation, and additional info. |
step
step(
state: StateType,
action: ActType,
*,
key: Key[Array, ""],
) -> tuple[
StateType,
ObsType,
Float[Array, ""],
Bool[Array, ""],
Bool[Array, ""],
dict,
]
Wrap the functional logic into a Gym API step method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `state` | `StateType` | The current environment state. | required |
| `action` | `ActType` | The action to take. | required |
| `key` | `Key[Array, '']` | A JAX PRNG key for any stochasticity in the step. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[StateType, ObsType, Float[Array, ''], Bool[Array, ''], Bool[Array, ''], dict]` | A tuple of the next state, observation, reward, termination flag, truncation flag, and additional info. |
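The reset and step methods documented above can be driven by a plain Python loop. The sketch below is a hedged illustration: `rollout` and `StubEnv` are hypothetical names, and `StubEnv` is a stand-in that only mimics the documented call shapes, not Lerax's Humanoid. The caller is assumed to supply one PRNG key for the reset and one per step (for example, produced with `jax.random.split`); the number of step keys doubles as an external step limit, since the environment has no built-in truncation.

```python
from typing import Any, Callable, Sequence


def rollout(
    env: Any,
    policy: Callable[[Any], Any],
    reset_key: Any,
    step_keys: Sequence[Any],
) -> tuple[float, int]:
    """Run one episode; return (total reward, number of steps taken)."""
    state, obs, info = env.reset(key=reset_key)
    total_reward, steps = 0.0, 0
    for key in step_keys:  # at most len(step_keys) steps are taken
        action = policy(obs)
        state, obs, reward, terminated, truncated, info = env.step(
            state, action, key=key
        )
        total_reward += float(reward)
        steps += 1
        if bool(terminated) or bool(truncated):
            break
    return total_reward, steps


class StubEnv:
    """Stand-in with the documented reset/step shapes; NOT Lerax's Humanoid."""

    def reset(self, *, key):
        return 0, 0.0, {}

    def step(self, state, action, *, key):
        next_state = state + 1
        terminated = next_state >= 3  # the stub episode ends after three steps
        return next_state, float(next_state), 1.0, terminated, False, {}
```

Passing the state explicitly into every step call, rather than storing it on the environment, follows the functional signatures shown above.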