stable_gym.envs.biological
Stable Gym gymnasium environments that are based on biological systems.
Subpackages
Classes
- Oscillator: Synthetic oscillatory network environment.
- OscillatorComplicated: Challenging (i.e. complicated) oscillatory network environment.
Package Contents
- class stable_gym.envs.biological.Oscillator(render_mode=None, max_cost=100.0, reference_target_position=8.0, reference_amplitude=7.0, reference_frequency=1 / 200, reference_phase_shift=0.0, clip_action=True, exclude_reference_from_observation=False, exclude_reference_error_from_observation=False, action_space_dtype=np.float64, observation_space_dtype=np.float64)[source]
Bases:
gymnasium.Env
Synthetic oscillatory network environment.
Note
Can also be used in a vectorized manner. See the gym.vector documentation.
- Description:
The goal of the agent in the oscillator environment is to act in such a way that one of the proteins of the synthetic oscillatory network follows a supplied reference signal.
- Source:
This environment corresponds to the Oscillator environment used in the paper Han et al. 2020. In our implementation several additional features were added to the environment to make it more flexible and easier to use:
Environment arguments now allow for modification of the reference signal parameters.
System parameters can now be individually adjusted for each protein, rather than applying the same parameters across all proteins.
The reference can be omitted from the observation.
Reference error can be included in the info dictionary.
The observation space was expanded to accurately reproduce the plots presented in Han et al. 2020, which was not possible with the original code’s observation space.
Added an adjustable max_cost threshold for episode termination, defaulting to 100 to match the original environment.
- Observation:
Type: Box(7) or Box(8) depending on the exclude_reference_error_from_observation argument.

| Num | Observation | Min | Max |
|-----|-------------|-----|-----|
| 0 | lacI mRNA transcripts concentration | 0 | \(\infty\) |
| 1 | tetR mRNA transcripts concentration | 0 | \(\infty\) |
| 2 | CI mRNA transcripts concentration | 0 | \(\infty\) |
| 3 | lacI (repressor) protein concentration (inhibits transcription of the tetR gene) | 0 | \(\infty\) |
| 4 | tetR (repressor) protein concentration (inhibits transcription of the CI gene) | 0 | \(\infty\) |
| 5 | CI (repressor) protein concentration (inhibits transcription of the lacI gene) | 0 | \(\infty\) |
| 6 | The reference we want to follow | 0 | \(\infty\) |
| 7 | Optional - The error between the current value of protein 1 and the reference | \(-\infty\) | \(\infty\) |
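To make the layout above concrete, the sketch below assembles an observation from the six state variables, the reference and the reference error, honouring the two exclude_* constructor flags. The helper name build_observation, the choice of index 3 as "protein 1" and the sign of the error are assumptions made for illustration; this is not the library's implementation.

```python
def build_observation(state, reference, exclude_reference=False,
                      exclude_reference_error=False):
    """Assemble an observation list following the table above.

    `state` holds the three mRNA and three protein concentrations
    (indices 0-5). Hypothetical helper, not part of stable_gym.
    """
    if len(state) != 6:
        raise ValueError("expected 6 state variables")
    obs = list(state)
    if not exclude_reference:
        obs.append(reference)  # index 6 in the default layout
    if not exclude_reference_error:
        # Assumption: the error is protein 1 (index 3) minus the reference.
        obs.append(state[3] - reference)
    return obs


state = [1.0, 2.0, 3.0, 4.0, 0.5, 0.25]
full = build_observation(state, reference=8.0)
short = build_observation(state, reference=8.0, exclude_reference_error=True)
print(len(full), len(short))  # 8 7
```

Under these assumptions the default observation has eight entries, and excluding the reference error drops it to seven, matching the Box(8) and Box(7) shapes listed above.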
- Actions:
Type: Box(3)

| Num | Action | Min | Max |
|-----|--------|-----|-----|
| 0 | Relative intensity of the light signal that induces the expression of the lacI mRNA gene. | 0 | 1 |
| 1 | Relative intensity of the light signal that induces the expression of the tetR mRNA gene. | 0 | 1 |
| 2 | Relative intensity of the light signal that induces the expression of the CI mRNA gene. | 0 | 1 |
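When clip_action is enabled, actions outside the action limits are clipped back into range. A minimal plain-Python sketch of that clamping, assuming the [0, 1] bounds from the table above (the function name clip_action_values is hypothetical and this is not the library's own code):

```python
def clip_action_values(action, low=0.0, high=1.0):
    """Clamp every action component to the interval [low, high]."""
    return [min(max(a, low), high) for a in action]


raw = [-0.2, 0.4, 1.7]
clipped = clip_action_values(raw)
print(clipped)  # [0.0, 0.4, 1.0]
```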
- Cost:
A cost, computed as the squared difference between the value of protein 1 (\(p_1\)) and the reference (\(r_1\)):
\[C = (p_1 - r_1)^2\]
- Starting State:
All observations are assigned a uniform random value in [0..5].
- Episode Termination:
An episode is terminated when the maximum step limit is reached or when the step cost exceeds a threshold (default is 100). This threshold can be adjusted using the max_cost environment argument.
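The cost and the cost-based termination rule can be sketched as follows. The function names are hypothetical, and the strict inequality is an assumption, since the text only says that the threshold is "exceeded".

```python
def step_cost(p1, r1):
    """Squared tracking error between protein 1 and the reference."""
    return (p1 - r1) ** 2


def is_terminated(cost, max_cost=100.0):
    """Assumption: terminate only when the cost is strictly above max_cost."""
    return cost > max_cost


cost_near = step_cost(6.0, 8.0)   # (6 - 8)^2 = 4.0
cost_far = step_cost(20.0, 8.0)   # (20 - 8)^2 = 144.0
print(is_terminated(cost_near), is_terminated(cost_far))  # False True
```

With max_cost set to infinity the cost-based check never fires, which leaves the step limit as the only way for an episode to end.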
- Solved Requirements:
Considered solved when the average cost is lower than 300.
- How to use:
    import stable_gym
    import gymnasium as gym
    env = gym.make("stable_gym:Oscillator-v1")

On reset, the options parameter allows the user to change the bounds used to determine the new random state when random=True.
- state
The current system state.
- Type:
Initialise a new Oscillator environment instance.
- Parameters:
render_mode (str, optional) – The render mode you want to use. Defaults to None. Not used in this environment.
max_cost (float, optional) – The maximum cost allowed before the episode is terminated. Defaults to 100.0.
reference_target_position – The reference target position, by default 8.0 (i.e. the mean of the reference signal).
reference_amplitude – The reference amplitude, by default 7.0.
reference_frequency – The reference frequency, by default 0.005.
reference_phase_shift – The reference phase shift, by default 0.0.
clip_action (bool, optional) – Whether the actions should be clipped if they are greater than the set action limit. Defaults to True.
exclude_reference_from_observation (bool, optional) – Whether the reference should be excluded from the observation. Defaults to False.
exclude_reference_error_from_observation (bool, optional) – Whether the error should be excluded from the observation. Defaults to False.
action_space_dtype (union[numpy.dtype, str], optional) – The data type of the action space. Defaults to np.float64.
observation_space_dtype (union[numpy.dtype, str], optional) – The data type of the observation space. Defaults to np.float64.
- max_cost
- _action_clip_warning = False
- _clip_action
- _exclude_reference_from_observation
- _exclude_reference_error_from_observation
- _action_space_dtype
- _observation_space_dtype
- _action_dtype_conversion_warning = False
- t = 0.0
- dt = 1.0
- _init_state
- _init_state_range
- K1 = 1.0
- K2 = 1.0
- K3 = 1.0
- a1 = 1.6
- a2 = 1.6
- a3 = 1.6
- gamma1 = 0.16
- gamma2 = 0.16
- gamma3 = 0.16
- beta1 = 0.16
- beta2 = 0.16
- beta3 = 0.16
- c1 = 0.06
- c2 = 0.06
- c3 = 0.06
- b1 = 5.0
- b2 = 5.0
- b3 = 5.0
- delta1 = 0.0
- delta2 = 0.0
- delta3 = 0.0
- delta4 = 0.0
- delta5 = 0.0
- delta6 = 0.0
- obs_low
- obs_high
- action_space
- observation_space
- reward_range
- viewer = None
- state = None
- steps_beyond_done = None
- reference_target_pos
- reference_amplitude
- reference_frequency
- phase_shift
- step(action)[source]
Take a step in the environment.
- Parameters:
action (numpy.ndarray) – The action we want to perform in the environment.
- Returns:
tuple containing:
obs (np.ndarray): Environment observation.
cost (float): Cost of the action.
terminated (bool): Whether the episode is terminated.
truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.
info (dict): Additional information about the environment.
- Return type:
(tuple)
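The five-element return value is typically consumed in a loop like the one sketched below. To keep the example runnable without the package installed, it uses a hypothetical StubEnv that only mimics the reset/step signatures; the stub's dynamics and cost are invented for illustration and have nothing to do with the oscillator model.

```python
import random


def rollout(env, policy, max_steps=200):
    """Run one episode and return the list of per-step costs."""
    obs, info = env.reset()
    costs = []
    for _ in range(max_steps):
        obs, cost, terminated, truncated, info = env.step(policy(obs))
        costs.append(cost)
        if terminated or truncated:
            break
    return costs


class StubEnv:
    """Hypothetical stand-in with the same reset/step return shapes."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        cost = float(sum(a * a for a in action))  # invented cost
        truncated = self.t >= self.horizon        # mimics a time limit
        return [float(self.t)], cost, False, truncated, {}


rng = random.Random(0)
costs = rollout(StubEnv(), lambda obs: [rng.random() for _ in range(3)])
print(len(costs))  # 5
```

Checking terminated and truncated separately matters because the first signals that the environment itself ended the episode, while the second is set by wrappers.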
- reset(seed=None, options=None, random=True)[source]
Reset gymnasium environment.
- Parameters:
seed (int, optional) – A random seed for the environment. By default None.
options (dict, optional) – A dictionary containing additional options for resetting the environment. By default None. Not used in this environment.
random (bool, optional) – Whether we want to randomly initialise the environment. By default True.
- Returns:
tuple containing:
obs (numpy.ndarray): Initial environment observation.
info (dict): Dictionary containing additional information.
- Return type:
(tuple)
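A random reset draws each state variable uniformly from the starting range, and passing a seed makes the draw reproducible. The sketch below illustrates the idea with the standard-library random module as a stand-in for the environment's own random number generator; the function name and the default range [0, 5] for six state variables are assumptions taken from the class description.

```python
import random


def sample_initial_state(n_states=6, low=0.0, high=5.0, seed=None):
    """Draw an initial state uniformly from [low, high] per variable."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_states)]


first = sample_initial_state(seed=42)
second = sample_initial_state(seed=42)
print(first == second)  # True
```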
- reference(t)[source]
Returns the current value of the periodic reference signal that is tracked by the synthetic oscillatory network.
- Parameters:
t (float) – The current time step.
- Returns:
The current reference value.
- Return type:
Note
This uses the general form of a periodic signal:
\[\begin{split}y(t) = A \sin(\omega t + \phi) + C \\ y(t) = A \sin(2 \pi f t + \phi) + C \\ y(t) = A \sin(\frac{2 \pi}{T} t + \phi) + C\end{split}\]
Where:
\(t\) is the time.
\(A\) is the amplitude of the signal.
\(\omega\) is the angular frequency of the signal.
\(f\) is the frequency of the signal.
\(T\) is the period of the signal.
\(\phi\) is the phase of the signal.
\(C\) is the offset of the signal.
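The second form of the signal translates directly into code. The sketch below is a standalone function rather than the class method, with keyword defaults mirroring the constructor defaults; mapping reference_target_position to the offset \(C\) is inferred from its description as the mean of the reference signal.

```python
import math


def reference(t, amplitude=7.0, frequency=1 / 200, phase_shift=0.0,
              target_position=8.0):
    """Evaluate y(t) = A * sin(2 * pi * f * t + phi) + C."""
    angle = 2 * math.pi * frequency * t + phase_shift
    return amplitude * math.sin(angle) + target_position


# Sample the signal at the start, quarter, half and three-quarter period.
values = [reference(t) for t in (0, 50, 100, 150)]
print([round(v, 6) for v in values])  # [8.0, 15.0, 8.0, 1.0]
```

With these defaults the signal has a period of 200 time units and swings between 1 and 15 around the mean of 8.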
- abstract render(mode='human')[source]
Render one frame of the environment.
- Parameters:
mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as pop up a window.
- Raises:
NotImplementedError – Raised because the render method has not yet been implemented.
Note
This method is not yet implemented.
- property tau
Alias for the environment step size. Done for compatibility with the other gymnasium environments.
- property physics_time
Returns the physics time. Alias for t.
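The two aliases can be pictured as read-only properties over the dt and t attributes listed above. The Clock class below is a hypothetical minimal illustration, and the assumption that t advances by dt on every step is not confirmed by this page.

```python
class Clock:
    """Hypothetical minimal class illustrating the two aliases."""

    def __init__(self, dt=1.0):
        self.t = 0.0
        self.dt = dt

    @property
    def tau(self):
        """Alias for the step size dt."""
        return self.dt

    @property
    def physics_time(self):
        """Alias for the current time t."""
        return self.t

    def advance(self):
        # Assumption: time advances by one step size per step.
        self.t += self.dt


clock = Clock()
for _ in range(3):
    clock.advance()
print(clock.tau, clock.physics_time)  # 1.0 3.0
```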
- class stable_gym.envs.biological.OscillatorComplicated(render_mode=None, max_cost=np.inf, reference_target_position=8.0, reference_amplitude=7.0, reference_frequency=1 / 200, reference_phase_shift=0.0, clip_action=True, exclude_reference_from_observation=False, exclude_reference_error_from_observation=False, action_space_dtype=np.float64, observation_space_dtype=np.float64)[source]
Bases:
gymnasium.Env
Challenging (i.e. complicated) oscillatory network environment. This environment class is based on the Oscillator environment class but has an additional protein, mRNA transcription and light input.
Note
Can also be used in a vectorized manner. See the gym.vector documentation.
- Description:
The goal of the agent in the oscillator environment is to act in such a way that one of the proteins of the synthetic oscillatory network follows a supplied reference signal.
- Source:
This environment corresponds to the Oscillator environment used in the paper Han et al. 2020. In our implementation several additional features were added to the environment to make it more flexible and easier to use:
Environment arguments now allow for modification of the reference signal parameters.
System parameters can now be individually adjusted for each protein, rather than applying the same parameters across all proteins.
The reference can be omitted from the observation.
Reference error can be included in the info dictionary.
The observation space was expanded to accurately reproduce the plots presented in Han et al. 2020, which was not possible with the original code’s observation space.
Added an adjustable max_cost threshold for episode termination, defaulting to \(\infty\) to match the original environment.
- Observation:
Type: Box(9) or Box(10) depending on the exclude_reference_error_from_observation argument.

| Num | Observation | Min | Max |
|-----|-------------|-----|-----|
| 0 | lacI mRNA transcripts concentration | 0 | \(\infty\) |
| 1 | tetR mRNA transcripts concentration | 0 | \(\infty\) |
| 2 | CI mRNA transcripts concentration | 0 | \(\infty\) |
| 3 | Extra protein mRNA transcripts concentration | 0 | \(\infty\) |
| 4 | lacI (repressor) protein concentration (inhibits transcription of the tetR gene) | 0 | \(\infty\) |
| 5 | tetR (repressor) protein concentration (inhibits transcription of the CI gene) | 0 | \(\infty\) |
| 6 | CI (repressor) protein concentration (inhibits transcription of the extra protein gene) | 0 | \(\infty\) |
| 7 | Extra (repressor) protein concentration (inhibits transcription of the lacI gene) | 0 | \(\infty\) |
| 8 | The reference we want to follow | 0 | \(\infty\) |
| 9 | Optional - The error between the current value of protein 1 and the reference | \(-\infty\) | \(\infty\) |
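Because the observation is a flat vector, it can help to give the entries names when inspecting them. The sketch below wraps the default ten-element layout in a NamedTuple; the class and field names are hypothetical and chosen only to mirror the rows of the table above.

```python
from typing import NamedTuple


class ComplicatedObservation(NamedTuple):
    """Hypothetical field names for the default 10-element layout."""

    mrna_laci: float
    mrna_tetr: float
    mrna_ci: float
    mrna_extra: float
    protein_laci: float
    protein_tetr: float
    protein_ci: float
    protein_extra: float
    reference: float
    reference_error: float


# Example values only; a real observation would come from reset() or step().
raw_obs = [0.1, 0.2, 0.3, 0.4, 6.0, 1.5, 2.5, 3.5, 8.0, -2.0]
named = ComplicatedObservation(*raw_obs)
print(named.protein_laci, named.reference)  # 6.0 8.0
```

This unpacking only applies when neither the reference nor the reference error is excluded; with either flag set the vector is shorter and the field list would need to be trimmed to match.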
- Actions:
Type: Box(4)

| Num | Action | Min | Max |
|-----|--------|-----|-----|
| 0 | Relative intensity of the light signal that induces the expression of the lacI mRNA gene. | 0 | 1 |
| 1 | Relative intensity of the light signal that induces the expression of the tetR mRNA gene. | 0 | 1 |
| 2 | Relative intensity of the light signal that induces the expression of the CI mRNA gene. | 0 | 1 |
| 3 | Relative intensity of the light signal that induces the expression of the extra protein mRNA gene. | 0 | 1 |
- Cost:
A cost, computed as the squared difference between the value of protein 1 (\(p_1\)) and the reference (\(r_1\)):
\[C = (p_1 - r_1)^2\]
- Starting State:
All observations are assigned a uniform random value in [0..5].
- Episode Termination:
An episode is terminated when the maximum step limit is reached or when the step cost exceeds a threshold (default is \(\infty\)). This threshold can be adjusted using the max_cost environment argument.
- Solved Requirements:
Considered solved when the average cost is lower than 300.
- How to use:
    import stable_gym
    import gymnasium as gym
    env = gym.make("stable_gym:OscillatorComplicated-v1")

On reset, the options parameter allows the user to change the bounds used to determine the new random state when random=True.
- state
The current system state.
- Type:
Initialise a new OscillatorComplicated environment instance.
- Parameters:
render_mode (str, optional) – The render mode you want to use. Defaults to None. Not used in this environment.
max_cost (float, optional) – The maximum cost allowed before the episode is terminated. Defaults to np.inf.
reference_target_position – The reference target position, by default 8.0 (i.e. the mean of the reference signal).
reference_amplitude – The reference amplitude, by default 7.0.
reference_frequency – The reference frequency, by default 0.005.
reference_phase_shift – The reference phase shift, by default 0.0.
clip_action (bool, optional) – Whether the actions should be clipped if they are greater than the set action limit. Defaults to True.
exclude_reference_from_observation (bool, optional) – Whether the reference should be excluded from the observation. Defaults to False.
exclude_reference_error_from_observation (bool, optional) – Whether the error should be excluded from the observation. Defaults to False.
action_space_dtype (union[numpy.dtype, str], optional) – The data type of the action space. Defaults to np.float64.
observation_space_dtype (union[numpy.dtype, str], optional) – The data type of the observation space. Defaults to np.float64.
- max_cost
- _action_clip_warning = False
- _clip_action
- _exclude_reference_from_observation
- _exclude_reference_error_from_observation
- _action_space_dtype
- _observation_space_dtype
- _action_dtype_conversion_warning = False
- t = 0.0
- dt = 1.0
- _init_state
- _init_state_range
- K1 = 1.0
- K2 = 1.0
- K3 = 1.0
- K4 = 1.0
- a1 = 1.6
- a2 = 1.6
- a3 = 1.6
- a4 = 1.6
- gamma1 = 0.16
- gamma2 = 0.16
- gamma3 = 0.16
- gamma4 = 0.16
- beta1 = 0.16
- beta2 = 0.16
- beta3 = 0.16
- beta4 = 0.16
- c1 = 0.06
- c2 = 0.06
- c3 = 0.06
- c4 = 0.06
- b1 = 5.0
- b2 = 5.0
- b3 = 5.0
- b4 = 5.0
- delta1 = 0.0
- delta2 = 0.0
- delta3 = 0.0
- delta4 = 0.0
- delta5 = 0.0
- delta6 = 0.0
- delta7 = 0.0
- delta8 = 0.0
- obs_low
- obs_high
- action_space
- observation_space
- reward_range
- viewer = None
- state = None
- steps_beyond_done = None
- reference_target_pos
- reference_amplitude
- reference_frequency
- phase_shift
- step(action)[source]
Take a step in the environment.
- Parameters:
action (numpy.ndarray) – The action we want to perform in the environment.
- Returns:
tuple containing:
obs (np.ndarray): Environment observation.
cost (float): Cost of the action.
terminated (bool): Whether the episode is terminated.
truncated (bool): Whether the episode was truncated. This value is set by wrappers when, for example, a time limit is reached or the agent goes out of bounds.
info (dict): Additional information about the environment.
- Return type:
(tuple)
- reset(seed=None, options=None, random=True)[source]
Reset gymnasium environment.
- Parameters:
seed (int, optional) – A random seed for the environment. By default None.
options (dict, optional) – A dictionary containing additional options for resetting the environment. By default None. Not used in this environment.
random (bool, optional) – Whether we want to randomly initialise the environment. By default True.
- Returns:
tuple containing:
obs (numpy.ndarray): Initial environment observation.
info (dict): Dictionary containing additional information.
- Return type:
(tuple)
- reference(t)[source]
Returns the current value of the periodic reference signal that is tracked by the synthetic oscillatory network.
- Parameters:
t (float) – The current time step.
- Returns:
The current reference value.
- Return type:
Note
This uses the general form of a periodic signal:
\[\begin{split}y(t) = A \sin(\omega t + \phi) + C \\ y(t) = A \sin(2 \pi f t + \phi) + C \\ y(t) = A \sin(\frac{2 \pi}{T} t + \phi) + C\end{split}\]
Where:
\(t\) is the time.
\(A\) is the amplitude of the signal.
\(\omega\) is the angular frequency of the signal.
\(f\) is the frequency of the signal.
\(T\) is the period of the signal.
\(\phi\) is the phase of the signal.
\(C\) is the offset of the signal.
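As a worked instance, substituting the default constructor arguments into the second form gives the expression below. Reading reference_amplitude as \(A\), reference_frequency as \(f\), reference_phase_shift as \(\phi\) and reference_target_position as \(C\) is inferred from the parameter descriptions rather than stated explicitly.

\[y(t) = 7 \sin\left(\frac{2 \pi}{200} t\right) + 8, \qquad T = \frac{1}{f} = 200, \qquad \omega = 2 \pi f \approx 0.0314\]

Under that reading the reference oscillates between 1 and 15 around a mean of 8 and completes one period every 200 time units, which is 200 steps at the default dt of 1.0.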
- abstract render(mode='human')[source]
Render one frame of the environment.
- Parameters:
mode (str, optional) – Gym rendering mode. The default mode will do something human friendly, such as pop up a window.
- Raises:
NotImplementedError – Raised because the render method has not yet been implemented.
Note
This method is not yet implemented.
- property tau
Alias for the environment step size. Done for compatibility with the other gymnasium environments.
- property physics_time
Returns the physics time. Alias for t.