stable_gym.envs.classic_control.ex3_ekf.ex3_ekf

The noisy master slave system (Ex3EKF) gymnasium environment.

Attributes

`EPISODES`
`RANDOM_STEP`
`env`

Classes

Ex3EKF

Noisy master slave system

Module Contents

stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.EPISODES = 10[source]

stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.RANDOM_STEP = True[source]

class stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.Ex3EKF(render_mode=None, clipped_action=True)[source]

Bases: gymnasium.Env

Noisy master slave system

Description:

The goal of the agent in the Ex3EKF environment is to act in such a way that the estimator perfectly estimates the original noisy system. In doing so, it serves as an RL-based stationary Kalman filter. First presented by Wu et al. 2023.

Observation:

Type: Box(4)

Num	Observation	Min	Max
0	The estimated angle	-10000 rad	10000 rad
1	The estimated frequency	-10000 Hz	10000 Hz
2	Actual angle	-10000 rad	10000 rad
3	Actual frequency	-10000 Hz	10000 Hz

Actions:

Type: Box(2)

Num	Action
0	First action coming from the RL Kalman filter
1	Second action coming from the RL Kalman filter

Cost:

A cost, computed as the sum of the squared differences between the estimated and the actual states:

\[C = {(\hat{x}_1 - x_1)}^2 + {(\hat{x}_2 - x_2)}^2\]
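The cost formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the environment's own implementation; the helper name `estimation_cost` is hypothetical, and the observation layout is assumed to follow the observation table (estimated angle, estimated frequency, actual angle, actual frequency).

```python
def estimation_cost(obs):
    """Squared estimation error C = (x1_hat - x1)^2 + (x2_hat - x2)^2.

    Hypothetical helper: assumes obs is ordered as
    [estimated angle, estimated frequency, actual angle, actual frequency].
    """
    x1_hat, x2_hat, x1, x2 = obs
    return (x1_hat - x1) ** 2 + (x2_hat - x2) ** 2


# Estimates off by 0.25 rad and 0.5 Hz respectively.
print(estimation_cost([0.5, 1.0, 0.25, 0.5]))  # 0.3125
```

A perfect estimate gives a cost of zero, which is the value the agent tries to reach.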

Starting State:

All observations are assigned a uniform random value in [-0.05..0.05]
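As a rough sketch of this initialisation, the snippet below draws four values uniformly from [-0.05, 0.05]. The function name `sample_initial_state` is hypothetical and the standard-library `random` module is used only to keep the example dependency-free; the environment's own random generator may differ.

```python
import random


def sample_initial_state(seed=None, low=-0.05, high=0.05, size=4):
    """Draw `size` values uniformly from [low, high] (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return [rng.uniform(low, high) for _ in range(size)]


state = sample_initial_state(seed=0)
print(len(state), all(-0.05 <= value <= 0.05 for value in state))
```

Passing the same seed twice yields the same starting state, which mirrors the purpose of the `seed` argument of `reset`.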

Episode Termination:

When the step cost is higher than 100.

Solved Requirements:

Considered solved when the average cost is lower than 300.

state[source]

The current system state.

Type: numpy.ndarray

t[source]

The current time step.

Type: float

dt[source]

The environment step size. Also available as tau.

Type: float

sigma[source]

The variance of the system noise.

Type: float

Initialise new Ex3EKF environment instance.

Parameters:

render_mode (str, optional) – The render mode you want to use. Defaults to None. Not used in this environment.
clipped_action (bool, optional) – Whether the actions should be clipped if they exceed the set action limit. Defaults to True.
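The effect of `clipped_action=True` can be pictured with the following sketch. The helper `clip_action` and the limits of ±10 are assumptions made for illustration only; the actual action limits are defined by the environment's `action_space` and are not stated on this page.

```python
def clip_action(action, low, high):
    """Clamp each action component to its [low, high] interval (hypothetical helper)."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


# Hypothetical limits; the real bounds come from the environment's action_space.
ACTION_LOW = [-10.0, -10.0]
ACTION_HIGH = [10.0, 10.0]

print(clip_action([15.0, -0.5], ACTION_LOW, ACTION_HIGH))  # [10.0, -0.5]
```

Components already inside the limits pass through unchanged, so clipping only alters out-of-range actions.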

_action_clip_warning = False[source]

t = 0.0[source]

dt = 0.1[source]

q1 = 0.01[source]

g = 9.81[source]

l_net = 1.0[source]

mean1 = [0, 0][source]

cov1[source]

mean2 = 0[source]

cov2 = 0.01[source]

missing_rate = 0[source]

sigma = 0[source]

high[source]

action_space[source]

observation_space[source]

reward_range = (0.0, 100.0)[source]

_clipped_action[source]

viewer = None[source]

state = None[source]

output = None[source]

steps_beyond_done = None[source]

step(action)[source]

Take a step in the environment.

Parameters:

action (numpy.ndarray) – The action we want to perform in the environment.

Returns:

tuple containing:

obs (np.ndarray): Environment observation.

cost (float): Cost of the action.

terminated (bool): Whether the episode is terminated.

truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.

info (dict): Additional information about the environment.

Return type:

(tuple)
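The five-element return value is typically consumed in a loop such as the one sketched below. The `rollout` helper is hypothetical and works with any object that follows the `reset`/`step` signatures documented on this page; the commented import path is inferred from the module name and assumes that `stable_gym` is installed.

```python
def rollout(env, policy, max_steps=100):
    """Run one episode and return (total cost, number of steps taken).

    Hypothetical helper: `policy` is any callable mapping an observation
    to an action.
    """
    obs, info = env.reset()
    total_cost = 0.0
    steps = 0
    for _ in range(max_steps):
        obs, cost, terminated, truncated, info = env.step(policy(obs))
        total_cost += cost
        steps += 1
        # `terminated` is set by the environment (step cost above 100),
        # `truncated` by wrappers such as a time limit.
        if terminated or truncated:
            break
    return total_cost, steps


# Possible usage, assuming stable_gym is available:
# from stable_gym.envs.classic_control.ex3_ekf.ex3_ekf import Ex3EKF
# total_cost, steps = rollout(Ex3EKF(), lambda obs: [0.0, 0.0])
```

Keeping `terminated` and `truncated` separate lets a caller distinguish an episode that ended because of the cost threshold from one that was cut short externally.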

reset(seed=None, options=None)[source]

Reset gymnasium environment.

Parameters:

seed (int, optional) – A random seed for the environment. By default None.
options (dict, optional) – A dictionary containing additional options for resetting the environment. By default None. Not used in this environment.

Returns:

tuple containing:

obs (numpy.ndarray): Initial environment observation.

info (dict): Dictionary containing additional information.

Return type:

(tuple)

reference(x)[source]

Returns the current value of the periodic reference signal that is tracked by the Synthetic oscillatory network.

Parameters: x (float) – The reference value.
Returns: The current reference value.
Return type: float

abstract render(mode='human')[source]

Render one frame of the environment.

Parameters: mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as pop up a window.
Raises: NotImplementedError – Always raised, since the render method has not yet been implemented.

Note

This method is not yet implemented.

property tau[source]
Alias for the environment step size (dt). Provided for compatibility with
other gymnasium environments.

property physics_time[source]
Returns the physics time. Alias for t.
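The two alias properties can be pictured with the read-only property pattern sketched below. The class `SteppedSystem` is a hypothetical stand-in used only to show the pattern; it is not the environment class, and the actual implementation may differ.

```python
class SteppedSystem:
    """Hypothetical stand-in showing how alias properties can be exposed."""

    def __init__(self, dt=0.1):
        self.dt = dt  # step size
        self.t = 0.0  # current time

    @property
    def tau(self):
        """Alias for the step size dt."""
        return self.dt

    @property
    def physics_time(self):
        """Alias for the current time t."""
        return self.t


system = SteppedSystem()
system.t += system.dt  # advance the clock by one step
print(system.tau, system.physics_time)  # 0.1 0.1
```

Because both aliases are computed from the underlying attributes on access, they stay in sync when dt or t changes.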

stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.env[source]