Ex3EKF gymnasium environment

A gymnasium environment for a noisy master-slave system. This environment can be used to train an RL-based stationary Kalman filter. First presented by Wu et al. 2023.

Observation space

hat_x_1: The estimated angle.
hat_x_2: The estimated frequency.
x_1: Actual angle.
x_2: Actual frequency.

Action space

u1: First action coming from the RL Kalman filter.
u2: Second action coming from the RL Kalman filter.

Episode termination

An episode is terminated when the maximum step limit is reached, or the step cost exceeds 100.

Environment goal

The agent’s goal in the Ex3EKF environment is to act so that the estimator estimates the original noisy system perfectly. By doing this, it serves as an RL-based stationary Kalman filter.

Cost function

The Ex3EKF environment uses the following cost function:

\[ cost = (hat_x_1 - x_1)^2 + (hat_x_2 - x_2)^2 \]

Environment step return

In addition to the observations, the environment returns an info dictionary containing the current reference and the error when a step is taken. This results in returning the following array:

[hat_x_1, hat_x_2, x_1, x_2, info_dict]

How to use

This environment is part of the Stable Gym package. It is therefore registered as a gymnasium environment when you import the Stable Gym package. If you want to use the environment in stand-alone mode, you can register it yourself.

Important

This environment does not have a render function.