stable_learning_control.utils.test_policy

A set of functions that can be used to see how an algorithm performs in the environment it was trained on.

Module Contents

Functions

_retrieve_iter_folder(fpath, itr)

Retrieves the path of the requested model iteration.

_retrieve_model_folder(fpath)

Tries to retrieve the model folder and backend from the given path.

load_policy_and_env(fpath[, itr])

Load a policy from a save directory, whether it's TF2 or PyTorch, along with the RL environment.

load_tf_policy(fpath, env[, itr])

Load a TensorFlow policy saved with the Stable Learning Control Logger.

load_pytorch_policy(fpath, env[, itr])

Load a PyTorch policy saved with the Stable Learning Control Logger.

run_policy(env, policy[, max_ep_len, num_episodes, ...])

Evaluates a policy inside a given gymnasium environment.

Attributes

parser

stable_learning_control.utils.test_policy._retrieve_iter_folder(fpath, itr)[source]

Retrieves the path of the requested model iteration.

Parameters:
  • fpath (str) – The path where the model is found.

  • itr (int) – The current policy iteration (checkpoint).

Raises:
Returns:

The model iteration path.

Return type:

str
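This helper is used internally by load_policy_and_env. A minimal sketch of a direct call, assuming a hypothetical run directory (the exact folder layout it resolves to is an implementation detail):

    from stable_learning_control.utils.test_policy import _retrieve_iter_folder

    # Resolve the folder that holds checkpoint (iteration) 10 of a saved run.
    # The run directory below is a hypothetical example.
    itr_path = _retrieve_iter_folder("data/my_experiment/my_experiment_s0", itr=10)
    print(itr_path)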

stable_learning_control.utils.test_policy._retrieve_model_folder(fpath)[source]

Tries to retrieve the model folder and backend from the given path.

Parameters:

fpath (str) – The path where the model is found.

Raises:
Returns:

tuple containing:

  • model_folder (str): The model folder.

  • backend (str): The inferred backend. Options are tf2 and torch.

Return type:

(tuple)
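This helper is used internally by load_policy_and_env to decide which backend loader to dispatch to. A minimal sketch of a direct call, assuming a hypothetical run directory:

    from stable_learning_control.utils.test_policy import _retrieve_model_folder

    # Infer where the saved model lives and which backend ("tf2" or "torch") produced it.
    # The run directory below is a hypothetical example.
    model_folder, backend = _retrieve_model_folder("data/my_experiment/my_experiment_s0")
    print(f"Found a '{backend}' model in '{model_folder}'")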

stable_learning_control.utils.test_policy.load_policy_and_env(fpath, itr='last')[source]

Load a policy from a save directory, whether it's TF2 or PyTorch, along with the RL environment.

Parameters:
  • fpath (str) – The path where the model is found.

  • itr (str, optional) – The current policy iteration (checkpoint). Defaults to "last".

  • deterministic (bool, optional) – Whether you want the action from the policy to be deterministic. Defaults to False.

Raises:
  • FileNotFoundError – Thrown when the fpath does not exist.

  • EnvLoadError – Thrown when something went wrong trying to load the saved environment.

  • PolicyLoadError – Thrown when something went wrong trying to load the saved policy.

Returns:

tuple containing:

  • env (gym.Env): The gymnasium environment.

  • get_action (func): The policy get_action function.

Return type:

(tuple)
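A minimal usage sketch, assuming a hypothetical output directory produced by a Stable Learning Control training run:

    from stable_learning_control.utils.test_policy import load_policy_and_env

    # Restore the environment and the policy's get_action function from a saved run.
    # itr="last" (the default) selects the final checkpoint.
    env, get_action = load_policy_and_env("data/my_experiment/my_experiment_s0")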

stable_learning_control.utils.test_policy.load_tf_policy(fpath, env, itr='last')[source]

Load a TensorFlow policy saved with the Stable Learning Control Logger.

Parameters:
  • fpath (str) – The path where the model is found.

  • env (gym.Env) – The gymnasium environment in which you want to test the policy.

  • itr (str, optional) – The current policy iteration. Defaults to “last”.

Returns:

The policy.

Return type:

tf.keras.Model
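A minimal usage sketch, assuming a hypothetical TF2 run directory and environment:

    import gymnasium as gym
    from stable_learning_control.utils.test_policy import load_tf_policy

    # The environment and output directory below are hypothetical examples.
    env = gym.make("Pendulum-v1")
    policy = load_tf_policy("data/my_tf2_experiment/my_tf2_experiment_s0", env=env, itr="last")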

stable_learning_control.utils.test_policy.load_pytorch_policy(fpath, env, itr='last')[source]

Load a PyTorch policy saved with the Stable Learning Control Logger.

Parameters:
  • fpath (str) – The path where the model is found.

  • env (gym.Env) – The gymnasium environment in which you want to test the policy.

  • itr (str, optional) – The current policy iteration. Defaults to “last”.

Returns:

The policy.

Return type:

torch.nn.Module
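A minimal usage sketch, assuming a hypothetical PyTorch run directory and environment:

    import gymnasium as gym
    from stable_learning_control.utils.test_policy import load_pytorch_policy

    # The environment and output directory below are hypothetical examples.
    env = gym.make("Pendulum-v1")
    policy = load_pytorch_policy("data/my_torch_experiment/my_torch_experiment_s0", env=env, itr="last")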

stable_learning_control.utils.test_policy.run_policy(env, policy, max_ep_len=None, num_episodes=100, render=True, deterministic=True)[source]

Evaluates a policy inside a given gymnasium environment.

Parameters:
  • env (gym.Env) – The gymnasium environment.

  • policy (Union[tf.keras.Model, torch.nn.Module]) – The policy.

  • max_ep_len (int, optional) – The maximum episode length. Defaults to None.

  • num_episodes (int, optional) – Number of episodes you want to perform in the environment. Defaults to 100.

  • deterministic (bool, optional) – Whether you want the action from the policy to be deterministic. Defaults to True.

  • render (bool, optional) – Whether you want to render the episode to the screen. Defaults to True.
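A minimal evaluation sketch that combines load_pytorch_policy and run_policy, assuming a hypothetical run directory and environment:

    import gymnasium as gym
    from stable_learning_control.utils.test_policy import load_pytorch_policy, run_policy

    # The environment and output directory below are hypothetical examples.
    env = gym.make("Pendulum-v1")
    policy = load_pytorch_policy("data/my_torch_experiment/my_torch_experiment_s0", env=env)

    # Roll the policy out for 10 episodes of at most 800 steps, without on-screen rendering.
    run_policy(env, policy, max_ep_len=800, num_episodes=10, render=False, deterministic=True)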

stable_learning_control.utils.test_policy.parser[source]