
Humanoid Standup

Lerax port of Gymnasium's Humanoid Standup environment. Same humanoid model as Humanoid, but starts lying on the ground. The agent is rewarded for raising its torso z-coordinate as high as possible.

Observation space

Same layout as Humanoid: a flattened concatenation of qpos, qvel, cinert[1:], cvel[1:], qfrc_actuator[6:], and cfrc_ext[1:], each of which can be switched on or off individually. The current x/y position (the first two entries of qpos) is dropped by default. Unbounded Box.
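As a rough illustration of the layout described above, the following NumPy sketch assembles an observation from stand-in arrays. The sizes (24 positions, 23 velocities, 14 bodies) are assumptions taken from the standard Gymnasium humanoid model, and build_observation and its exclude_current_positions flag are hypothetical names, not part of the Lerax API.

```python
import numpy as np

# Assumed sizes for the standard Gymnasium humanoid model (not confirmed for Lerax).
NQ, NV, NBODY = 24, 23, 14


def build_observation(qpos, qvel, cinert, cvel, qfrc_actuator, cfrc_ext,
                      exclude_current_positions=True):
    """Flatten and concatenate the observation components in the documented order."""
    # Drop the leading x/y entries of qpos when requested.
    position = qpos[2:] if exclude_current_positions else qpos
    parts = [
        position,
        qvel,
        cinert[1:],         # skip the world body (row 0)
        cvel[1:],           # skip the world body (row 0)
        qfrc_actuator[6:],  # skip the 6 free-joint degrees of freedom
        cfrc_ext[1:],       # skip the world body (row 0)
    ]
    return np.concatenate([np.ravel(p) for p in parts])


# Stand-in data with MuJoCo's per-field shapes.
obs = build_observation(
    qpos=np.zeros(NQ),
    qvel=np.zeros(NV),
    cinert=np.zeros((NBODY, 10)),
    cvel=np.zeros((NBODY, 6)),
    qfrc_actuator=np.zeros(NV),
    cfrc_ext=np.zeros((NBODY, 6)),
)
```

Under these assumed sizes the default observation has 22 + 23 + 130 + 78 + 17 + 78 entries; keeping x/y adds two more.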

Action space

Box(low, high) from the model's actuator_ctrlrange — 17 continuous joint torques.
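The sketch below shows how bounds of this kind can be used to sample or clip an action. The ctrlrange array is a stand-in for the model's actuator_ctrlrange (one low/high row per actuator); the ±0.4 limits are an assumption borrowed from the standard Gymnasium humanoid model, and clip_action is a hypothetical helper.

```python
import numpy as np

# Stand-in for model.actuator_ctrlrange: shape (17, 2), columns are (low, high).
# The +/-0.4 limits are assumed, not read from the Lerax model.
ctrlrange = np.tile(np.array([-0.4, 0.4]), (17, 1))
low, high = ctrlrange[:, 0], ctrlrange[:, 1]


def clip_action(action, low, high):
    """Project an arbitrary action vector back into the Box bounds."""
    return np.clip(action, low, high)


# Sample a uniformly random action inside the bounds.
rng = np.random.default_rng(0)
random_action = rng.uniform(low, high)

# An out-of-range action is clipped elementwise to the nearest bound.
clipped = clip_action(np.full(17, 2.0), low, high)
```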

Reward

reward = uph_cost - ctrl_cost - impact_cost + 1, where

  • uph_cost = uph_cost_weight * (qpos[2] / dt) (torso z divided by control dt; default weight 1.0). Despite its name, this term enters the reward with a positive sign.
  • ctrl_cost = ctrl_cost_weight * sum(data.ctrl ** 2) (default weight 0.1)
  • impact_cost = clip(impact_cost_weight * sum(cfrc_ext ** 2), -inf, 10.0) (default weight 0.5e-6; the cost is capped at 10.0)
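The reward terms above can be sketched in NumPy as follows. This is a minimal reading of the formulas in this section, not the Lerax implementation; the function name standup_reward, the argument names, and the impact_cost_max parameter are hypothetical, and dt must be supplied by the caller.

```python
import numpy as np


def standup_reward(torso_z, ctrl, cfrc_ext, dt,
                   uph_cost_weight=1.0,
                   ctrl_cost_weight=0.1,
                   impact_cost_weight=0.5e-6,
                   impact_cost_max=10.0):
    """Combine the three documented terms into a scalar reward."""
    # Upward term: torso height divided by the control timestep.
    uph_cost = uph_cost_weight * (torso_z / dt)
    # Quadratic penalty on the applied controls.
    ctrl_cost = ctrl_cost_weight * np.sum(np.square(ctrl))
    # Quadratic penalty on external contact forces, capped from above only.
    impact_cost = np.clip(
        impact_cost_weight * np.sum(np.square(cfrc_ext)), -np.inf, impact_cost_max
    )
    # Constant +1 bonus per step.
    return uph_cost - ctrl_cost - impact_cost + 1.0
```

Because the upward term is divided by dt, it typically dominates the other two terms, so the reward scale depends strongly on the timestep in use.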

Termination

Never terminates. No built-in truncation.

lerax.env.mujoco.HumanoidStandup

Bases: AbstractMujocoEnv[Float[Array, '...'], Float[Array, '...']]

MJX-based humanoid standup environment matching Gymnasium's HumanoidStandup-v5.

unwrapped property

unwrapped: Self

Return the unwrapped environment.

action_mask

action_mask(
    state: MujocoEnvState, *, key: Key[Array, ""]
) -> None

transition

transition(
    state: MujocoEnvState,
    action: ActType,
    *,
    key: Key[Array, ""],
) -> MujocoEnvState

truncate

truncate(state: MujocoEnvState) -> Bool[Array, '']

default_renderer

default_renderer() -> MujocoRenderer

render

render(state: MujocoEnvState, renderer: AbstractRenderer)

render_states

render_states(
    states: Sequence[StateType],
    renderer: AbstractRenderer | Literal["auto"] = "auto",
    dt: float = 0.0,
)

Render a sequence of frames from multiple states.

Parameters:

  • states (Sequence[StateType], required): A sequence of environment states to render.
  • renderer (AbstractRenderer | Literal['auto'], default 'auto'): The renderer to use for rendering. If "auto", uses the default renderer.
  • dt (float, default 0.0): The time delay between rendering each frame, in seconds.

render_stacked

render_stacked(
    states: StateType,
    renderer: AbstractRenderer | Literal["auto"] = "auto",
    dt: float = 0.0,
)

Render multiple frames from stacked states.

Stacked states are typically batched states stored in a pytree structure.

Parameters:

  • states (StateType, required): A pytree of stacked environment states to render.
  • renderer (AbstractRenderer | Literal['auto'], default 'auto'): The renderer to use for rendering. If "auto", uses the default renderer.
  • dt (float, default 0.0): The time delay between rendering each frame, in seconds.
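To make the idea of stacked states concrete, the sketch below stacks a list of per-step states leaf by leaf along a new leading axis. It uses plain dicts of NumPy arrays as a stand-in pytree; the real MujocoEnvState is not a dict and its fields are not shown here, so the qpos and qvel keys and the stack_states helper are purely illustrative. With JAX, the same operation is commonly written as jax.tree_util.tree_map(lambda *xs: jnp.stack(xs), *states).

```python
import numpy as np


def stack_states(states):
    """Stack dict-shaped states leaf by leaf along a new leading (time) axis."""
    return {name: np.stack([s[name] for s in states]) for name in states[0]}


# Five hypothetical per-step states; each leaf has no time axis yet.
states = [{"qpos": np.full(24, float(t)), "qvel": np.zeros(23)} for t in range(5)]

# After stacking, every leaf gains a leading axis of length 5.
stacked = stack_states(states)
```

A list of states like this matches the input of render_states, while the stacked form matches the input of render_stacked.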

reset

reset(
    *, key: Key[Array, ""]
) -> tuple[StateType, ObsType, dict]

Wrap the functional logic into a Gym API reset method.

Parameters:

  • key (Key[Array, ''], required): A JAX PRNG key for any stochasticity in the reset.

Returns:

  • tuple[StateType, ObsType, dict]: A tuple of the initial state, initial observation, and additional info.

step

step(
    state: StateType,
    action: ActType,
    *,
    key: Key[Array, ""],
) -> tuple[
    StateType,
    ObsType,
    Float[Array, ""],
    Bool[Array, ""],
    Bool[Array, ""],
    dict,
]

Wrap the functional logic into a Gym API step method.

Parameters:

  • state (StateType, required): The current environment state.
  • action (ActType, required): The action to take.
  • key (Key[Array, ''], required): A JAX PRNG key for any stochasticity in the step.

Returns:

  • tuple[StateType, ObsType, Float[Array, ''], Bool[Array, ''], Bool[Array, ''], dict]: A tuple of the next state, observation, reward, terminal flag, truncate flag, and additional info.

model_from_path staticmethod

model_from_path(xml_file: str | Path) -> mujoco.MjModel