stable_learning_control.algos.tf2.policies.critics.Q_critic
Q critic policy.
This module contains a TensorFlow 2.x implementation of the Q Critic policy of Haarnoja et al. 2019.
Classes
QCritic – Soft Q critic network.
Module Contents
- class stable_learning_control.algos.tf2.policies.critics.Q_critic.QCritic(obs_dim, act_dim, hidden_sizes, activation=nn.relu, output_activation=None, name='q_critic', **kwargs)[source]
Bases:
tf.keras.Model
Soft Q critic network.
Initialise the QCritic object.
- Parameters:
  - obs_dim (int) – Dimension of the observation space.
  - act_dim (int) – Dimension of the action space.
  - hidden_sizes (list) – Sizes of the hidden layers.
  - activation (tf.keras.activations, optional) – The activation function. Defaults to tf.nn.relu.
  - output_activation (tf.keras.activations, optional) – The activation function used for the output layer. Defaults to None, which is equivalent to using the Identity activation function.
  - name (str, optional) – The Q critic name. Defaults to q_critic.
  - **kwargs – All kwargs to pass to tf.keras.Model. Can be used to add additional inputs or outputs.
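To illustrate what a network with this signature computes, here is a minimal NumPy sketch (not the actual QCritic class): an MLP that concatenates an observation and an action, passes them through the hidden layers with the activation function, and emits a scalar Q-value through a linear output layer (matching the documented output_activation=None default). The layer sizes and dimensions below are illustrative assumptions.

```python
import numpy as np


def mlp_q_critic(obs, act, weights, biases, activation=lambda x: np.maximum(x, 0.0)):
    """Sketch of a soft Q critic: maps a concatenated (observation,
    action) pair through an MLP to a scalar Q-value. The last layer is
    linear, mirroring the documented output_activation=None default."""
    x = np.concatenate([obs, act], axis=-1)
    for W, b in zip(weights[:-1], biases[:-1]):
        x = activation(x @ W + b)  # hidden layers use the activation fn
    return x @ weights[-1] + biases[-1]  # linear output layer


# Hypothetical dimensions: obs_dim=8, act_dim=2, hidden_sizes=[64, 64].
rng = np.random.default_rng(0)
sizes = [8 + 2, 64, 64, 1]
weights = [rng.normal(size=(m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

q = mlp_q_critic(rng.normal(size=8), rng.normal(size=2), weights, biases)
```

The real class builds the equivalent network with tf.keras layers, so it also gains tf.keras.Model features such as trainable_variables and call tracing.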