G1Locomotion
The 29-DoF Unitree G1 must track randomly sampled [vx, vy, yaw_rate] velocity commands while maintaining a periodic bipedal gait. A gait phase is carried on the state and advanced at a per-episode sampled frequency. Domain randomization, observation noise, and random pushes are applied for sim-to-real robustness.
Observation space
103-dim float vector with observation noise:
- local linear velocity (3)
- pelvis gyro (3)
- gravity vector in pelvis frame (3)
- velocity command
[vx, vy, yaw_rate](3) - joint angles relative to default pose (29)
- joint velocities (29)
- last action (29)
- gait phase
[cos_l, sin_l, cos_r, sin_r](4)
Action space
Box(-1, 1, shape=(29,)). Scaled to joint position targets around the default pose: targets = default_joint_positions + action * action_scale (default action_scale = 0.5).
Reward
Weighted sum multiplied by dt. Major terms (defaults in parentheses):
tracking_lin_vel(1.0):exp(-||cmd_xy - local_linvel_xy||^2 / tracking_sigma)tracking_ang_vel(0.75):exp(-(cmd_yaw - gyro_z)^2 / tracking_sigma)ang_vel_xy(-0.15),orientation(-2.0),feet_slip(-0.25)feet_air_time(2.0) rewarding first ground contacts with air time in[0.2, 0.5]feet_phase(1.0):exp(-phase_error / 0.01) * moving, wherephase_error = sum((foot_z - desired_z(phase))^2)termination(-100.0) on fall/self-contact,stand_still(-1.0) under zero commandcollision(-0.1) on hand-thigh contacts,dof_pos_limits(-1.0) outside soft limitspose(-0.1),joint_deviation_hip(-0.25),joint_deviation_knee(-0.1)torques,action_rate,dof_acc,alivedefault to0.0
All weights are overridable via the reward_weights dict.
Termination
Terminates on fall (torso_gravity_z < 0), NaN in qpos/qvel, or dangerous self-contacts (foot-foot, foot-shin). No built-in truncation.
Deviations from Gymnasium
Not a Gymnasium environment. Uses MJX with per-episode domain randomization (friction, armature, body masses, joint friction loss, torso inertia offset), observation noise, periodic random horizontal pushes, and a gait-phase signal carried on state. Velocity commands are resampled per episode; with zero_command_probability=0.1 the command is zeroed to teach standing still.
lerax.env.unitree.g1.G1Locomotion
Bases: AbstractG1Env
Unitree G1 locomotion: track velocity commands with a natural gait.
The agent controls 29 joint position targets (legs, waist, arms) and must track randomly sampled linear and angular velocity commands while maintaining a stable bipedal gait. Domain randomization is applied per-episode for sim-to-real transfer.
Observation (103 dims): - Local linear velocity (3) - Gyroscope angular velocity (3) - Gravity vector in body frame (3) - Velocity command [vx, vy, yaw_rate] (3) - Joint angles offset from default (29) - Joint velocities (29) - Last action (29) - Gait phase [cos_l, sin_l, cos_r, sin_r] (4)
Action: [-1, 1]^29 scaled to joint position targets around default pose.
Reward: Weighted sum of velocity tracking, energy penalties, gait phase tracking, and pose regularization, multiplied by dt.
transition
Step physics with domain-randomized model.
Computes motor targets from the action, optionally applies push
perturbations or pulling forces, steps physics via lax.scan
using the per-episode randomized model on the state, then updates
gait phase and tracking.
transition_info
render_states
render_states(
states: Sequence[StateType],
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render a sequence of frames from multiple states.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
states
|
Sequence[StateType]
|
A sequence of environment states to render. |
required |
renderer
|
AbstractRenderer | Literal['auto']
|
The renderer to use for rendering. If "auto", uses the default renderer. |
'auto'
|
dt
|
float
|
The time delay between rendering each frame, in seconds. |
0.0
|
render_stacked
render_stacked(
states: StateType,
renderer: AbstractRenderer | Literal["auto"] = "auto",
dt: float = 0.0,
)
Render multiple frames from stacked states.
Stacked states are typically batched states stored in a pytree structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
states
|
StateType
|
A pytree of stacked environment states to render. |
required |
renderer
|
AbstractRenderer | Literal['auto']
|
The renderer to use for rendering. If "auto", uses the default renderer. |
'auto'
|
dt
|
float
|
The time delay between rendering each frame, in seconds. |
0.0
|
reset
Wrap the functional logic into a Gym API reset method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
Key[Array, '']
|
A JAX PRNG key for any stochasticity in the reset. |
required |
Returns:
| Type | Description |
|---|---|
tuple[StateType, ObsType, dict]
|
A tuple of the initial state, initial observation, and additional info. |
step
step(
state: StateType,
action: ActType,
*,
key: Key[Array, ""],
) -> tuple[
StateType,
ObsType,
Float[Array, ""],
Bool[Array, ""],
Bool[Array, ""],
dict,
]
Wrap the functional logic into a Gym API step method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
state
|
StateType
|
The current environment state. |
required |
action
|
ActType
|
The action to take. |
required |
key
|
Key[Array, '']
|
A JAX PRNG key for any stochasticity in the step. |
required |
Returns:
| Type | Description |
|---|---|
tuple[StateType, ObsType, Float[Array, ''], Bool[Array, ''], Bool[Array, ''], dict]
|
A tuple of the next state, observation, reward, terminal flag, truncate flag, and additional info. |
_local_linvel
Local linear velocity from pelvis IMU velocimeter.
_gravity_vector
Gravity direction in pelvis frame (upvector sensor).
_torso_gravity_vector
Gravity direction in torso frame.
_global_linvel
Global linear velocity from frame sensor.
_global_angvel
Global angular velocity from frame sensor.
_joint_angles_offset
Joint angles relative to default pose.
_joint_velocities
Actuated joint velocities.
_foot_contact
Binary foot contact flags from contact sensors.
_self_contact_termination
Check for dangerous self-contacts (foot-foot, foot-shin).
_foot_positions
World-frame positions of foot sites.
_foot_velocities
Global linear velocities of foot sites.
_snap_to_ground
Shift the robot vertically so its lowest point just touches the ground.
Checks both body positions and foot site positions to find the
true lowest point, then adjusts qpos[2] so that point sits
at a small clearance above z = 0.
_init_common
_init_common(
xml_file: str | Path = "scene_mjx.xml",
control_frequency_hz: float = 50.0,
action_scale: float = 0.5,
keyframe_name: str = "knees_bent",
soft_joint_pos_limit_factor: float = 0.95,
push_enable: bool = True,
push_interval_range: tuple[float, float] = (5.0, 10.0),
push_magnitude_range: tuple[float, float] = (0.1, 2.0),
noise_level: float = 1.0,
noise_scales: dict[str, float] | None = None,
friction_range: tuple[float, float] = (0.4, 1.0),
friction_loss_scale_range: tuple[float, float] = (
0.5,
2.0,
),
armature_scale_range: tuple[float, float] = (1.0, 1.05),
mass_scale_range: tuple[float, float] = (0.9, 1.1),
torso_offset_range: tuple[float, float] = (-1.0, 1.0),
relative_actions: bool = False,
pulling_force_magnitude: float = 0.0,
unactuated_steps: int = 0,
)
Shared initialization for all G1 environments.