Evaluating Robustness
SLC ships with a handy utility for evaluating a policy's robustness. It does this by running the policy for several episodes inside a given environment while applying several disturbances and assessing the resulting performance. You can run it with:
python -m stable_learning_control.run eval_robustness [path/to/output_directory] [disturber] [-h] [--list_disturbers] [--disturber_config DISTURBER_CONFIG] [--data_dir DATA_DIR] [--itr ITR] [--len LEN] [--episodes EPISODES] [--render] [--deterministic]
[--disable_baseline] [--observations [OBSERVATIONS [OBSERVATIONS ...]]] [--references [REFERENCES [REFERENCES ...]]]
[--reference_errors [REFERENCE_ERRORS [REFERENCE_ERRORS ...]]] [--absolute_reference_errors] [--merge_reference_errors] [--use_subplots] [--use_time] [--save_result]
[--save_plots] [--figs_fmt FIGS_FMT] [--font_scale FONT_SCALE] [--use_wandb] [--wandb_job_type WANDB_JOB_TYPE] [--wandb_project WANDB_PROJECT] [--wandb_group WANDB_GROUP]
[--wandb_run_name WANDB_RUN_NAME]
The most important input arguments are:
- disturber (str). The name of the disturber you want to evaluate. Can include an unloaded module in 'module:disturber_name' style.
- --cfg, --disturber_config DISTURBER_CONFIG (str, default=None). The configuration you want to pass to the disturber. It sets up the range of disturbances you wish to evaluate. Expects a dictionary whose format depends on the specified disturber (e.g. "{'mean': [0.25, 0.25], 'std': [0.05, 0.05]}" for the ObservationRandomNoiseDisturber disturber).
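For example, to evaluate a trained policy against observation noise with the configuration above (the output directory path is a placeholder for your own experiment output):
python -m stable_learning_control.run eval_robustness [path/to/output_directory] ObservationRandomNoiseDisturber --disturber_config "{'mean': [0.25, 0.25], 'std': [0.05, 0.05]}"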
Note
For more information about all the input arguments available for the eval_robustness tool, you can use the --help flag or check the robustness evaluation utility documentation or the API reference.
Robustness eval configuration file (yaml)
The SLC CLI comes with a handy configuration file loader that can be used to load YAML configuration files. These configuration files provide a convenient way to store your robustness evaluation parameters such that results can be reproduced. You can supply the CLI with an experiment configuration file using the --eval_cfg flag. The configuration file format equals the format expected by the --exp_cfg flag of the run experiments utility.
- --eval_cfg (path str). Sets the path to the yml config file used for loading experiment hyperparameters.
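As an illustration, a minimal evaluation configuration could look like the sketch below. The keys are assumptions that mirror the CLI argument names above; consult the run experiments utility documentation for the exact format:
disturber: ObservationRandomNoiseDisturber
disturber_config:
  mean: [0.25, 0.25]
  std: [0.05, 0.05]
episodes: 10
save_result: True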
Available disturbers
The disturbers contained in the SLC package can be listed with the --list_disturbers
flag. The following disturbers are currently available:
- ActionImpulseDisturber: A gymnasium wrapper that can be used to disturb the action of a gymnasium environment with an impulse applied at a certain time step.
- ActionRandomNoiseDisturber: A gymnasium wrapper that can be used to disturb the action of a gymnasium environment with normally distributed random noise.
- EnvAttributesDisturber: A gymnasium wrapper that can be used to disturb a physics parameter of a gymnasium environment.
- ObservationRandomNoiseDisturber: A gymnasium wrapper that can be used to disturb the observation of a gymnasium environment with normally distributed random noise.
To get more information about the configuration values a given disturber expects, add the --help
flag after a given disturber name. For example:
python -m stable_learning_control.run eval_robustness [path/to/output_directory] ObservationRandomNoiseDisturber --help
Results
Saved files
The robustness evaluation tool can save several files to disk that contain information about the robustness evaluation:
- A directory containing the robustness evaluation plots when the --save_plots flag was used.
- A file with general performance diagnostics for the episodes and disturbances used during the robustness evaluation.
- eval_results.csv: A pandas data frame containing all the data that was collected for the episodes and disturbances used during the robustness evaluation. This file is only present when the --save_result flag is set and can be used to create custom plots.
These files will be saved inside the eval
directory inside the output directory.
Tip
You can also log these results to Weights & Biases by adding the --use_wandb flag to the CLI command (see the Robustness eval utility documentation for more information).
Plots
Default plots
By default, the following plots are displayed when running the robustness evaluation:
Todo
Update reference_errors plot.
Create custom plots
You can also create any plots you like using the eval_results.csv data frame saved during the robustness evaluation. An example of how this is done can be found in stable_learning_control/examples/manual_robustness_eval_plots.py. This example loads the eval_results.csv data frame and uses it to plot how the first observation changes under different disturbances, together with the first reference signal.
import argparse
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "fpath", type=str, help="The path where the robustness eval results are stored"
    )
    args = parser.parse_args()

    # Retrieve robustness eval dataframe.
    robustness_eval_df = pd.read_csv(Path(args.fpath).absolute())

    # Rename 'disturbance_label' column to 'disturbance'.
    robustness_eval_df.rename(
        columns={"disturbance_label": "disturbance"}, inplace=True
    )

    # Remove irrelevant observations and references from the dataframe.
    filtered_eval_df = robustness_eval_df[
        ["step", "disturber", "disturbance", "observation_1", "reference_1"]
    ]

    # Remove the number suffix from the observation and reference columns.
    filtered_eval_df.rename(
        columns={
            "observation_1": "observation",
            "reference_1": "reference",
        },
        inplace=True,
    )

    # Merge observations and references into one "value" column and add a "signal"
    # column that specifies whether the signal is a reference or an observation.
    filtered_eval_df = filtered_eval_df.melt(
        id_vars=["step", "disturbance", "disturber"],
        value_vars=["observation", "reference"],
        var_name="signal",
        value_name="value",
    )

    # Plot the selected observation and reference for each disturbance.
    fig, ax = plt.subplots(figsize=(12, 6), tight_layout=True)
    sns.lineplot(
        data=filtered_eval_df,
        x="step",
        y="value",
        hue="disturbance",
        style="signal",
        palette="tab10",
        legend="full",
        ax=ax,
    )
    ax.set_xlabel("Step")
    ax.set_ylabel("Value")
    ax.set_title("Observation and reference signals for each disturbance")
    plt.show()
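You can run the example against a saved results file like so (the path below assumes the default output directory structure described above):
python stable_learning_control/examples/manual_robustness_eval_plots.py [path/to/output_directory]/eval/eval_results.csv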
Running this code will give you the following figure:
Todo
Update figure.
Use with custom environments
The robustness evaluation utility can be used with any gymnasium environment. If you want to show the reference and reference error plots, add the reference and reference_error keys to the info dictionary that is returned by the environment's step and reset methods.
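A minimal sketch of what this could look like is shown below. The environment itself is hypothetical; only the reference and reference_error info keys come from the requirement above:
import gymnasium as gym
import numpy as np


class MyTrackingEnv(gym.Env):
    """Hypothetical environment that tracks a sinusoidal reference signal."""

    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-10.0, high=10.0, shape=(1,))
        self.action_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(1,))
        self._state = np.zeros(1, dtype=np.float32)
        self._t = 0

    def _info(self):
        # The robustness eval utility reads these keys for its plots.
        reference = np.sin(0.1 * self._t)
        return {
            "reference": reference,
            "reference_error": float(self._state[0] - reference),
        }

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._state = np.zeros(1, dtype=np.float32)
        self._t = 0
        return self._state, self._info()

    def step(self, action):
        self._t += 1
        self._state = np.clip(self._state + action, -10.0, 10.0).astype(np.float32)
        reward = -abs(self._info()["reference_error"])
        terminated = False
        truncated = self._t >= 200
        return self._state, reward, terminated, truncated, self._info()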
How to add a new disturber
The disturbers in the SLC package are implemented as gymnasium wrappers. Because of this, they can be used with any gymnasium environment. If you want to add a new disturber, you only have to ensure that it is a Python class that inherits from the gym.Wrapper class. For more information about gymnasium wrappers, please check out the gymnasium documentation. After implementing your disturber, you can create a pull request to add it to the SLC package or use it directly through the disturber argument by specifying the module containing your disturber and the disturber class name. For example:
python -m stable_learning_control.run eval_robustness [path/to/output_directory] "my_module:MyDisturber"
Special attributes
The SLC package looks for several attributes in the disturber class to get information about the disturber that can be used during the robustness evaluation. These attributes are:
- disturbance_label (str). Can be used to set the label of the disturber in the plots. If not present, the robustness evaluation utility will generate a label based on the disturber configuration.
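Putting this together, a minimal custom disturber could look like the sketch below. The offset logic is an illustrative assumption; only the gym.Wrapper base class and the optional disturbance_label attribute come from the text above:
import gymnasium as gym


class MyDisturber(gym.Wrapper):
    """Hypothetical disturber that adds a constant offset to every observation."""

    def __init__(self, env, offset=0.1):
        super().__init__(env)
        self._offset = offset
        # Optional: used by the robustness eval utility as the plot label.
        self.disturbance_label = f"offset: {offset}"

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        return obs + self._offset, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs + self._offset, reward, terminated, truncated, info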
Manual robustness evaluation
A script version of the eval robustness tool can be found in the examples
folder (i.e. eval_robustness.py). This script can be used
when you want to perform some quick tests without implementing a disturber class.