stable_gym.envs.classic_control.ex3_ekf.ex3_ekf
The noisy master slave system (Ex3EKF) gymnasium environment.
Attributes
Classes
Ex3EKF | Noisy master slave system
Module Contents
- class stable_gym.envs.classic_control.ex3_ekf.ex3_ekf.Ex3EKF(render_mode=None, clipped_action=True)[source]
Bases:
gymnasium.Env
Noisy master slave system
- Description:
The goal of the agent in the Ex3EKF environment is to act such that the estimator perfectly estimates the state of the original noisy system. In doing so, the agent serves as an RL-based stationary Kalman filter. The environment was first presented by Wu et al. 2023.
- Observation:
Type: Box(4)
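===  =======================  ==========  =========
Num  Observation              Min         Max
===  =======================  ==========  =========
0    The estimated angle      -10000 rad  10000 rad
1    The estimated frequency  -10000 Hz   10000 Hz
2    Actual angle             -10000 rad  10000 rad
3    Actual frequency         -10000 Hz   10000 Hz
===  =======================  ==========  =========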
- Actions:
Type: Box(2)
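===  ==============================================
Num  Action
===  ==============================================
0    First action coming from the RL Kalman filter
1    Second action coming from the RL Kalman filter
===  ==============================================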
- Cost:
A cost, computed as the sum of the squared differences between the estimated and the actual states:
\[C = {(\hat{x}_1 - x_1)}^2 + {(\hat{x}_2 - x_2)}^2\]
- Starting State:
All observations are assigned a uniformly distributed random value in [-0.05, 0.05].
- Episode Termination:
When the step cost is higher than 100.
- Solved Requirements:
Considered solved when the average cost is lower than 300.
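As a rough illustration of the cost and termination rules described above, the sketch below computes the step cost from a 4-element observation, assuming the layout given under Observation (estimated angle, estimated frequency, actual angle, actual frequency). The helper names `step_cost` and `is_terminated` are hypothetical and not part of stable_gym; the environment's own implementation may differ in detail.

```python
def step_cost(x_hat, x):
    """Sum of squared estimation errors: (x_hat_1 - x_1)^2 + (x_hat_2 - x_2)^2."""
    return (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2


def is_terminated(cost, threshold=100.0):
    """Episode ends when the step cost is higher than the threshold (100 in the text)."""
    return cost > threshold


# Assumed observation layout: [est. angle, est. frequency, actual angle, actual frequency].
obs = [0.5, 1.0, 0.25, 1.5]
cost = step_cost(obs[:2], obs[2:])  # 0.25**2 + (-0.5)**2
print(cost, is_terminated(cost))
```

With these example values the estimator is close to the true state, so the cost stays far below the termination threshold.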
Initialise a new Ex3EKF environment instance.
- Parameters:
- step(action)[source]
Take a step in the environment.
- Parameters:
action (numpy.ndarray) – The action we want to perform in the environment.
- Returns:
tuple containing:
- obs (np.ndarray): Environment observation.
- cost (float): Cost of the action.
- terminated (bool): Whether the episode is terminated.
- truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.
- info (dict): Additional information about the environment.
- Return type:
(tuple)
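The five-element return value of `step` suggests the usual Gymnasium interaction loop. The sketch below shows one way to collect per-step costs over an episode. The `rollout` helper and the `StandInEnv` class are hypothetical: the stand-in only mimics the return shapes of `reset` and `step` so the example runs without stable_gym installed, and it does not reproduce the Ex3EKF dynamics.

```python
def rollout(env, policy, max_steps=200, seed=None):
    """Run one episode and return the list of per-step costs."""
    obs, info = env.reset(seed=seed)
    costs = []
    for _ in range(max_steps):
        action = policy(obs)
        obs, cost, terminated, truncated, info = env.step(action)
        costs.append(cost)
        if terminated or truncated:
            break
    return costs


class StandInEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step (not Ex3EKF)."""

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0, 0.0, 0.0, 0.0], {}

    def step(self, action):
        self.t += 1
        cost = float(self.t)
        terminated = cost > 3.0  # arbitrary demo threshold
        return [0.0, 0.0, 0.0, 0.0], cost, terminated, False, {}


# With stable_gym installed, an Ex3EKF() instance could be passed instead.
costs = rollout(StandInEnv(), policy=lambda obs: [0.0, 0.0])
print(costs)
```

Checking both `terminated` and `truncated` matters here because, as noted above, truncation is set by wrappers (for example a time limit) rather than by the environment itself.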
- reset(seed=None, options=None)[source]
Reset gymnasium environment.
- Parameters:
- Returns:
tuple containing:
- obs (numpy.ndarray): Initial environment observation.
- info (dict): Dictionary containing additional information.
- Return type:
(tuple)
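To illustrate how the `seed` argument relates to the starting state, the sketch below draws a 4-element initial state uniformly from [-0.05, 0.05] using a seeded generator. This is a standard-library stand-in, assuming the environment samples its starting state from a generator seeded through `reset`; the function name `sample_initial_state` is hypothetical, and the actual environment's random number generator will produce different values.

```python
import random


def sample_initial_state(seed=None, low=-0.05, high=0.05, size=4):
    """Draw `size` values uniformly from [low, high] with a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(size)]


# The same seed reproduces the same starting state.
first = sample_initial_state(seed=42)
second = sample_initial_state(seed=42)
print(first == second)
```

Passing a fixed seed in this way is the usual approach to making episodes reproducible when comparing policies.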
- reference(x)[source]
Returns the current value of the periodic reference signal that is tracked by the Synthetic oscillatory network.
- abstract render(mode='human')[source]
Render one frame of the environment.
- Parameters:
mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as popping up a window.
- Raises:
NotImplementedError – Raised because the render method has not yet been implemented.
Note
This method is not yet implemented.
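Since `render` raises `NotImplementedError`, calling code may want to guard against it. The sketch below shows one possible pattern; `try_render` and `NoRenderEnv` are hypothetical names, and the stand-in class only imitates the documented behaviour of raising the error.

```python
def try_render(env):
    """Attempt to render; return False if the environment does not support it."""
    try:
        env.render()
        return True
    except NotImplementedError:
        return False


class NoRenderEnv:
    """Hypothetical stand-in whose render method is not implemented."""

    def render(self, mode="human"):
        raise NotImplementedError("Rendering is not implemented.")


rendered = try_render(NoRenderEnv())
print(rendered)
```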