ros_gazebo_gym.robot_gazebo_goal_env

The Gazebo GOAL environment is mainly used to connect a simulated gymnasium GOAL environment to the Gazebo simulator. It takes care of resetting the simulator after each step and, when needed, resetting the controllers. It also handles all the operations that have to be performed on the simulator during a training step or a training reset (the typical steps in a reinforcement learning loop).

Important

This class is similar to the RobotGazeboEnv class, but it uses the gymnasium_robotics.GoalEnv class instead of the gym.Env class as its superclass. As a result, a goal-based environment is created. You should use this goal-based Gazebo environment when you are working with RL algorithms that require a sparse reward space (e.g. Hindsight Experience Replay (HER)).
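
Algorithms such as HER rely on the compute_reward() method of the GoalEnv interface to recompute the sparse reward for substituted (relabelled) goals. Below is a minimal sketch of that contract; the distance threshold and goal shapes are illustrative assumptions, not properties of this class:

    import numpy as np

    # Illustrative sparse reward following the GoalEnv compute_reward() contract:
    # 0.0 when the achieved goal lies within a threshold of the desired goal,
    # -1.0 otherwise (threshold and goal shapes are hypothetical).
    def compute_reward(achieved_goal, desired_goal, info, threshold=0.05):
        distance = np.linalg.norm(
            np.asarray(achieved_goal) - np.asarray(desired_goal), axis=-1
        )
        return np.where(distance <= threshold, 0.0, -1.0)

    # HER relabels a stored transition with a goal that was actually achieved,
    # so the recomputed reward for that transition becomes 0.0 (success).
    achieved = np.array([0.1, 0.2, 0.3])
    print(compute_reward(achieved, achieved.copy(), info={}))  # 0.0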

This goal-based environment behaves just like any regular gymnasium environment, but it imposes a required structure on the observation_space. More concretely, the observation space is required to contain at least three elements, namely observation, desired_goal, and achieved_goal. Here, desired_goal specifies the goal the agent should attempt to achieve, achieved_goal is the goal it has currently achieved instead, and observation contains the actual observations of the environment, as usual.
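
Concretely, such an observation space is a gymnasium.spaces.Dict with those three keys. A minimal sketch of the structure; the shapes below are illustrative assumptions, not the actual spaces of any particular robot environment:

    import numpy as np
    from gymnasium import spaces

    # Illustrative goal-based observation space; actual shapes depend on the robot.
    observation_space = spaces.Dict(
        {
            "observation": spaces.Box(-np.inf, np.inf, shape=(10,), dtype=np.float64),
            "desired_goal": spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float64),
            "achieved_goal": spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float64),
        }
    )

    # Observations returned by reset()/step() are dictionaries with the same keys.
    obs = observation_space.sample()
    print(obs["desired_goal"])  # the goal the agent should try to reach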

Further, unlike the RobotGazeboEnv, which publishes the cumulative cost to the /ros_gazebo_gym/reward topic during the environment reset, this goal environment publishes the cost at each step.
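
To monitor this per-step signal during training you can subscribe to the topic directly. A minimal sketch, assuming the reward is published as a std_msgs/Float32 message; the actual message type may differ, so verify it first (e.g. with rostopic info /ros_gazebo_gym/reward):

    import rospy
    from std_msgs.msg import Float32  # assumed message type; verify before use

    def reward_cb(msg):
        rospy.loginfo("step reward: %f", msg.data)

    rospy.init_node("reward_monitor", anonymous=True)
    rospy.Subscriber("/ros_gazebo_gym/reward", Float32, reward_cb)
    rospy.spin()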

Module Contents

Classes

RobotGazeboGoalEnv

Connects the simulated GOAL gymnasium environment to the Gazebo simulator.

class ros_gazebo_gym.robot_gazebo_goal_env.RobotGazeboGoalEnv(robot_name_space, reset_controls, controllers_list=None, reset_robot_pose=False, reset_world_or_sim='SIMULATION', log_reset=True, pause_simulation=False, publish_rviz_training_info_overlay=False)[source]

Bases: gymnasium_robotics.GoalEnv

Connects the simulated GOAL gymnasium environment to the Gazebo simulator.

gazebo

Gazebo connector which can be used to interact with the gazebo simulation.

Type:

GazeboConnection

episode_num

The current episode.

Type:

int

step_num

The current step.

Type:

int

step_reward

The reward achieved by the current step.

Type:

float

Initiate the RobotGazeboGoalEnv environment instance.

Parameters:
  • robot_name_space (str) – The namespace the robot is on.

  • reset_controls (bool) – Whether the controllers should be reset when the RobotGazeboGoalEnv.reset() method is called.

  • controllers_list (list, optional) – A list with currently available controllers to look for. Defaults to None, which means that the class will try to retrieve all the running controllers.

  • reset_robot_pose (bool) – Boolean specifying whether to reset the robot pose when the simulation is reset.

  • reset_world_or_sim (str, optional) – Whether you want to reset the whole simulation (“SIMULATION”) or only the world (“WORLD”), i.e. the object positions. Defaults to “SIMULATION”.

  • log_reset (bool, optional) – Whether we want to print a log statement when the world/simulation is reset. Defaults to True.

  • pause_simulation (bool, optional) – Whether the simulation should be paused after each step (i.e. after each action). Defaults to False.

  • publish_rviz_training_info_overlay (bool, optional) – Whether a RViz overlay should be published with the training results. Defaults to False.
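
In practice this class is usually subclassed by a concrete robot task environment rather than instantiated directly, but construction looks roughly as follows. The node name, namespace, and controller names below are hypothetical; substitute the ones your robot uses:

    import rospy
    from ros_gazebo_gym.robot_gazebo_goal_env import RobotGazeboGoalEnv

    rospy.init_node("gazebo_goal_env_example", anonymous=True)  # a ROS node is assumed to be required

    # Hypothetical namespace and controller names.
    env = RobotGazeboGoalEnv(
        robot_name_space="panda",
        reset_controls=True,
        controllers_list=["joint_state_controller", "arm_controller"],
        reset_robot_pose=True,
        reset_world_or_sim="WORLD",  # reset only object positions, not the full simulation
        log_reset=True,
        pause_simulation=False,
        publish_rviz_training_info_overlay=True,
    )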

step(action)[source]

Function executed at each time step. Here we take the action, execute it in the simulation for one time step, and retrieve the observations generated by that action. The step reward is also published on the /ros_gazebo_gym/reward topic.

Parameters:

action (numpy.ndarray) – The action we want to perform in the environment.

Returns:

tuple containing:

  • obs (numpy.ndarray): Environment observation.

  • cost (float): Cost of the action.

  • terminated (bool): Whether the episode is terminated.

  • truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.

  • info (dict): Additional information about the environment.

Return type:

(tuple)

Note

Here the action number should be converted into a movement action, the action executed in the simulation, and the observations that result from performing that action retrieved.
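
Put together, interacting with the environment follows the standard gymnasium five-value step loop. A minimal sketch, assuming env is a concrete task environment built on this class and a random action stands in for a policy:

    obs, info = env.reset(seed=42)
    terminated, truncated = False, False
    while not (terminated or truncated):
        action = env.action_space.sample()  # stand-in for a trained policy
        obs, reward, terminated, truncated, info = env.step(action)
        # In the goal-based setting obs is a dict with "observation",
        # "desired_goal", and "achieved_goal" entries.
    env.close()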

reset(seed=None, options=None)[source]

Function executed when resetting the environment.

Parameters:
  • seed (int, optional) – The seed to use for the random number generator. Defaults to None.

  • options (dict, optional) – The options to pass to the environment. Defaults to None.

Returns:

tuple containing:

  • obs (numpy.ndarray): The current state.

  • info_dict (dict): Dictionary with additional information.

Return type:

(tuple)
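
For reproducible experiments the reset can be seeded. A short sketch, again assuming env is a concrete task environment derived from this class and that seeding is honoured by the underlying implementation:

    # Two resets with the same seed should yield the same initial state and goal.
    obs_a, info_a = env.reset(seed=7)
    obs_b, info_b = env.reset(seed=7)
    assert (obs_a["desired_goal"] == obs_b["desired_goal"]).all()  # assumes deterministic seeding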

close()[source]

Function executed when closing the environment. Use it to close GUIs and other systems that need closing.

render(render_mode='human')[source]

Overloads the render method since rendering is handled by Gazebo.

pause_controllers(controllers_list=None, filter_list=[])[source]

Pauses the controllers.

Parameters:
  • controllers_list (list, optional) – The controllers you want to pause. Defaults to None, which means that the class will pause all the running controllers.

  • filter_list (list, optional) – The controllers you want to ignore when pausing. Defaults to [].

unpause_controllers()[source]

Un-pauses the paused controllers.
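
A typical use of these two methods is to temporarily disable control while manually changing the robot or world state. A minimal sketch (the controller name is hypothetical):

    # Pause all running controllers except the joint state broadcaster
    # (hypothetical name), modify the world, then resume control.
    env.pause_controllers(filter_list=["joint_state_controller"])
    # ... e.g. teleport the robot or respawn objects here ...
    env.unpause_controllers()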