stable_gym.envs.classic_control.ex3_ekf
Noisy master slave system (Ex3EKF) gymnasium environment.
Dynamics
The dynamics of the system whose state is to be estimated are given by:
In which the state vector \(x(k)\) is given by:
and the measurement vector \(y(k)\) is given by:
Estimator design:
Submodules
Classes
| Ex3EKF | Noisy master slave system |
Package Contents
- class stable_gym.envs.classic_control.ex3_ekf.Ex3EKF(render_mode=None, clipped_action=True)
- Bases: gymnasium.Env
- Noisy master slave system
- Description:
- The goal of the agent in the Ex3EKF environment is to act in such a way that the estimator perfectly estimates the original noisy system. In doing so, it serves as an RL-based stationary Kalman filter. First presented by Wu et al. 2023.
- Observation:
- Type: Box(4)

| Num | Observation | Min | Max |
|-----|-------------|-----|-----|
| 0 | The estimated angle | -10000 rad | 10000 rad |
| 1 | The estimated frequency | -10000 Hz | 10000 Hz |
| 2 | Actual angle | -10000 rad | 10000 rad |
| 3 | Actual frequency | -10000 Hz | 10000 Hz |
- Actions:
- Type: Box(2)

| Num | Action |
|-----|--------|
| 0 | First action coming from the RL Kalman filter |
| 1 | Second action coming from the RL Kalman filter |
- Cost:
- A cost, computed as the sum of the squared differences between the estimated and the actual states: \[C = {(\hat{x}_1 - x_1)}^2 + {(\hat{x}_2 - x_2)}^2\]
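As a minimal sketch of the cost formula above, the snippet below computes the cost directly from an observation. It assumes that the observation ordering (estimated angle, estimated frequency, actual angle, actual frequency) corresponds to \(\hat{x}_1, \hat{x}_2, x_1, x_2\); the function name `estimation_cost` is hypothetical and not part of the package.

```python
def estimation_cost(obs):
    """Sum of squared differences between estimated and actual states.

    Assumes obs = [est_angle, est_freq, actual_angle, actual_freq].
    """
    est_angle, est_freq, angle, freq = obs
    return (est_angle - angle) ** 2 + (est_freq - freq) ** 2


# Angle error of 0.5 and frequency error of -2.0.
cost = estimation_cost([0.5, 1.0, 0.0, 3.0])
print(cost)  # 4.25
```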
- Starting State:
- All observations are assigned a uniform random value in [-0.05, 0.05].
- Episode Termination:
- When the step cost is higher than 100. 
 
- Solved Requirements:
- Considered solved when the average cost is lower than 300. 
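The starting-state, termination, and solved rules above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the environment's own code: the helper names are hypothetical, and because the text does not say which costs are averaged for the solved criterion, the sketch simply averages whatever list of costs it is given.

```python
import random


def sample_initial_state(rng, size=4, bound=0.05):
    """Draw each observation entry uniformly from [-bound, bound]."""
    return [rng.uniform(-bound, bound) for _ in range(size)]


def is_terminated(step_cost, threshold=100.0):
    """The episode terminates when the step cost is higher than the threshold."""
    return step_cost > threshold


def is_solved(costs, threshold=300.0):
    """Solved when the average of the given costs is lower than the threshold.

    Assumption: `costs` is whatever collection of costs is being averaged.
    """
    return sum(costs) / len(costs) < threshold


rng = random.Random(0)  # seeded for reproducibility
initial_state = sample_initial_state(rng)
```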
 - state
- The current system state. - Type:
 
 - Initialise new Ex3EKF environment instance. - Parameters:
 - _action_clip_warning = False
 - t = 0.0
 - dt = 0.1
 - q1 = 0.01
 - g = 9.81
 - l_net = 1.0
 - mean1 = [0, 0]
 - cov1
 - mean2 = 0
 - cov2 = 0.01
 - missing_rate = 0
 - sigma = 0
 - high
 - action_space
 - observation_space
 - reward_range = (0.0, 100.0)
 - _clipped_action
 - viewer = None
 - state = None
 - output = None
 - steps_beyond_done = None
 - step(action)
- Take step into the environment. - Parameters:
- action (numpy.ndarray) – The action we want to perform in the environment. 
- Returns:
- tuple containing:
  - obs (np.ndarray): Environment observation.
  - cost (float): Cost of the action.
  - terminated (bool): Whether the episode is terminated.
  - truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.
  - info (dict): Additional information about the environment.
 
- Return type:
- (tuple) 
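To illustrate how the five-element return value of `step` is typically consumed, here is a minimal rollout sketch. The `rollout` helper, the `StubEnv` class, and the constant policy are all hypothetical: `StubEnv` is a toy stand-in that only mimics the documented `reset`/`step` signatures so the example runs without the package installed, and it does not reproduce the Ex3EKF dynamics.

```python
def rollout(env, policy, max_steps=200):
    """Run one episode and return (total_cost, steps_taken)."""
    obs, info = env.reset(seed=0)
    total_cost = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(obs)
        obs, cost, terminated, truncated, info = env.step(action)
        total_cost += cost
        steps += 1
        if terminated or truncated:
            break
    return total_cost, steps


class StubEnv:
    """Hypothetical stand-in with the documented interface (not Ex3EKF)."""

    def reset(self, seed=None, options=None):
        self.error = 1.0
        return [0.0, 0.0, self.error, 0.0], {}

    def step(self, action):
        self.error *= 2.0  # toy dynamics: the estimation error doubles
        cost = self.error ** 2
        terminated = cost > 100.0  # same threshold as the termination rule
        return [0.0, 0.0, self.error, 0.0], cost, terminated, False, {}


total, steps = rollout(StubEnv(), policy=lambda obs: [0.0, 0.0])
print(total, steps)  # 340.0 4
```

With the package installed, the same `rollout` function should accept an `Ex3EKF()` instance in place of `StubEnv()`, since it relies only on the `reset` and `step` signatures documented here.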
 
 - reset(seed=None, options=None)
- Reset gymnasium environment. - Parameters:
- Returns:
- tuple containing:
  - obs (numpy.ndarray): Initial environment observation.
  - info (dict): Dictionary containing additional information.
 
- Return type:
- (tuple) 
 
 - reference(x)
- Returns the current value of the periodic reference signal that is tracked by the Synthetic oscillatory network. 
 - abstract render(mode='human')
- Render one frame of the environment. - Parameters:
- mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as pop up a window. 
- Raises:
- NotImplementedError – Always raised, since the render method has not yet been implemented.
 - Note: This method is not yet implemented.
 - property tau
- Alias for the environment step size. Provided for compatibility with the other gymnasium environments.
 - property physics_time
- Returns the physics time. Alias for `t`.
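The two alias properties can be pictured with a small sketch. The `SteppedClock` class and its `advance` method are hypothetical and exist only to show the read-only alias pattern; the default values mirror the `dt = 0.1` and `t = 0.0` attributes listed above, and this is not the environment's actual implementation.

```python
class SteppedClock:
    """Hypothetical minimal class showing read-only alias properties."""

    def __init__(self, dt=0.1):
        self.dt = dt  # step size
        self.t = 0.0  # elapsed physics time

    @property
    def tau(self):
        """Alias for the step size `dt`."""
        return self.dt

    @property
    def physics_time(self):
        """Alias for the elapsed time `t`."""
        return self.t

    def advance(self):
        """Move time forward by one step."""
        self.t += self.dt


clock = SteppedClock()
clock.advance()
```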