stable_gym.envs.classic_control.ex3_ekf.ex3_ekf

The noisy master slave system (Ex3EKF) gymnasium environment.

Module Contents

Classes

Ex3EKF

Noisy master slave system

Attributes

EPISODES

RANDOM_STEP

env

stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.EPISODES = 10
stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.RANDOM_STEP = True
class stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.Ex3EKF(render_mode=None, clipped_action=True)

Bases: gymnasium.Env

Noisy master slave system

Description:

The goal of the agent in the Ex3EKF environment is to act in such a way that the estimator perfectly estimates the state of the original noisy system. In doing so, the agent serves as an RL-based stationary Kalman filter. The environment was first presented by Wu et al. 2023.

Observation:

Type: Box(4)

Num  Observation              Min         Max
0    The estimated angle      -10000 rad  10000 rad
1    The estimated frequency  -10000 Hz   10000 Hz
2    Actual angle             -10000 rad  10000 rad
3    Actual frequency         -10000 Hz   10000 Hz

Actions:

Type: Box(2)

Num  Action
0    First action coming from the RL Kalman filter
1    Second action coming from the RL Kalman filter

Cost:

A cost, computed as the sum of the squared differences between the estimated and the actual states:

\[C = {(\hat{x}_1 - x_1)}^2 + {(\hat{x}_2 - x_2)}^2\]
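As a minimal sketch, the cost above can be computed directly from an observation vector. This assumes that the estimated states occupy observation indices 0 and 1 and the actual states indices 2 and 3, as in the observation layout listed earlier; that mapping is an assumption and is not confirmed here. The function name `estimation_cost` is hypothetical and not part of stable_gym.

```python
import numpy as np


def estimation_cost(obs):
    """Sum of squared differences between estimated and actual states.

    Assumes obs = [est_angle, est_freq, actual_angle, actual_freq].
    """
    obs = np.asarray(obs, dtype=float)
    estimated, actual = obs[:2], obs[2:4]
    return float(np.sum((estimated - actual) ** 2))


cost = estimation_cost([1.0, 2.0, 0.5, 4.0])
print(cost)  # (1.0 - 0.5)**2 + (2.0 - 4.0)**2 = 4.25
```
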
Starting State:

All observations are assigned a uniform random value in the range [-0.05, 0.05].
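The sampling described above could look roughly like the following sketch. The helper `sample_initial_state` is hypothetical and the environment's own reset logic may differ. Note that NumPy's `uniform` draws from the half-open interval [low, high).

```python
import numpy as np


def sample_initial_state(seed=None, size=4, bound=0.05):
    """Draw `size` values uniformly from [-bound, bound).

    Hypothetical helper for illustration; not taken from stable_gym.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(low=-bound, high=bound, size=size)


state = sample_initial_state(seed=0)
print(state.shape)  # (4,)
```

Passing the same seed twice yields the same initial state, which is useful for reproducible experiments.
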

Episode Termination:
  • When the step cost is higher than 100.

Solved Requirements:

Considered solved when the average cost is lower than 300.

state

The current system state.

Type:

numpy.ndarray

t

The current time step.

Type:

float

dt

The environment step size. Also available as tau.

Type:

float

sigma

The variance of the system noise.

Type:

float

Initialise new Ex3EKF environment instance.

Parameters:
  • render_mode (str, optional) – The render mode you want to use. Defaults to None. Not used in this environment.

  • clipped_action (bool, optional) – Whether actions should be clipped when they exceed the set action limits. Defaults to True.
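To illustrate what action clipping typically means, here is a small sketch using `numpy.clip`. The limits of -1 and 1 are placeholders, because the actual action bounds of Ex3EKF are not stated in this section, and the helper `maybe_clip` is hypothetical.

```python
import numpy as np

# Placeholder limits; the real Ex3EKF action bounds are not given here.
ACTION_LOW = np.array([-1.0, -1.0])
ACTION_HIGH = np.array([1.0, 1.0])


def maybe_clip(action, clipped_action=True):
    """Clip a 2-element action to the limits when clipped_action is True."""
    action = np.asarray(action, dtype=float)
    if clipped_action:
        return np.clip(action, ACTION_LOW, ACTION_HIGH)
    return action


print(maybe_clip([2.0, -3.0]))         # clipped to the limits
print(maybe_clip([2.0, -3.0], False))  # passed through unchanged
```
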

property tau

Alias for the environment step size. Done for compatibility with the other gymnasium environments.

property physics_time

Returns the physics time. Alias for t.

step(action)

Take a step in the environment.

Parameters:

action (numpy.ndarray) – The action we want to perform in the environment.

Returns:

tuple containing:

  • obs (np.ndarray): Environment observation.

  • cost (float): Cost of the action.

  • terminated (bool): Whether the episode is terminated.

  • truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.

  • info (dict): Additional information about the environment.

Return type:

(tuple)
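The following sketch shows how the five-element tuple returned by step might be consumed in an episode loop. `StubEnv` is a hypothetical stand-in with toy dynamics that only mimics the documented reset/step signatures; it is not Ex3EKF. With stable_gym installed, an Ex3EKF instance would presumably be passed to `rollout` instead.

```python
import random


class StubEnv:
    """Hypothetical stand-in mimicking the documented reset/step signatures."""

    def reset(self, seed=None, options=None):
        rng = random.Random(seed)
        self._obs = [rng.uniform(-0.05, 0.05) for _ in range(4)]
        return list(self._obs), {}

    def step(self, action):
        # Toy dynamics: nudge the two "estimated" entries by the action.
        self._obs[0] += action[0]
        self._obs[1] += action[1]
        cost = ((self._obs[0] - self._obs[2]) ** 2
                + (self._obs[1] - self._obs[3]) ** 2)
        terminated = cost > 100  # mirrors the documented termination rule
        return list(self._obs), cost, terminated, False, {}


def rollout(env, policy, max_steps=50, seed=0):
    """Run one episode; return the accumulated cost and the step count."""
    obs, info = env.reset(seed=seed)
    total_cost, steps = 0.0, 0
    for _ in range(max_steps):
        obs, cost, terminated, truncated, info = env.step(policy(obs))
        total_cost += cost
        steps += 1
        if terminated or truncated:
            break
    return total_cost, steps


total_cost, steps = rollout(StubEnv(), policy=lambda obs: (0.0, 0.0))
print(steps)  # 50: the zero action never triggers termination in the stub
```

Checking both `terminated` and `truncated` keeps the loop correct when the environment is wrapped, for example by a time-limit wrapper.
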

reset(seed=None, options=None)

Reset gymnasium environment.

Parameters:
  • seed (int, optional) – A random seed for the environment. By default None.

  • options (dict, optional) – A dictionary containing additional options for resetting the environment. By default None. Not used in this environment.

Returns:

tuple containing:

  • obs (numpy.ndarray): Initial environment observation.

  • info (dict): Dictionary containing additional information.

Return type:

(tuple)

reference(x)

Returns the current value of the periodic reference signal that is tracked by the Synthetic oscillatory network.

Parameters:

x (float) – The reference value.

Returns:

The current reference value.

Return type:

float

abstract render(mode='human')

Render one frame of the environment.

Parameters:

mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as pop up a window.

Raises:

NotImplementedError – Raised because the render method has not yet been implemented.

Note

This method is not yet implemented.

stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.env