stable_learning_control.algos.pytorch.policies.critics.Q_critic

Soft Q critic policy.

This module contains a PyTorch implementation of the Q critic policy of Haarnoja et al. (2019).

Classes

QCritic

Soft Q critic network.

Module Contents

class stable_learning_control.algos.pytorch.policies.critics.Q_critic.QCritic(obs_dim, act_dim, hidden_sizes, activation=nn.ReLU, output_activation=nn.Identity)

Bases: torch.nn.Module

Soft Q critic network.

Q

The layers of the network.

Type: torch.nn.Sequential

Initialise the QCritic object.

Parameters:

obs_dim (int) – Dimension of the observation space.
act_dim (int) – Dimension of the action space.
hidden_sizes (list) – Sizes of the hidden layers.
activation (torch.nn.modules.activation, optional) – The activation function. Defaults to torch.nn.ReLU.
output_activation (torch.nn.modules.activation, optional) – The activation function used for the output layer. Defaults to torch.nn.Identity.
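
As a rough illustration of how these parameters could translate into a network, the sketch below builds a torch.nn.Sequential by hand. The helper build_q_layers is hypothetical and not part of the package. It assumes the critic takes the concatenated observation and action as input and produces a single Q value per sample, which this page does not state explicitly.

```python
import torch
import torch.nn as nn


def build_q_layers(obs_dim, act_dim, hidden_sizes,
                   activation=nn.ReLU, output_activation=nn.Identity):
    """Hypothetical helper: stack Linear layers for a Q(s, a) network."""
    # Assumption: input is the concatenation [obs, act], output is one scalar.
    sizes = [obs_dim + act_dim] + list(hidden_sizes) + [1]
    layers = []
    for i in range(len(sizes) - 1):
        # Hidden layers use `activation`; the last layer uses `output_activation`.
        act_cls = activation if i < len(sizes) - 2 else output_activation
        layers += [nn.Linear(sizes[i], sizes[i + 1]), act_cls()]
    return nn.Sequential(*layers)


# Example sizes chosen for illustration only.
q_layers = build_q_layers(obs_dim=3, act_dim=2, hidden_sizes=[64, 64])
```

With the default output_activation of torch.nn.Identity, the final layer's output is left unbounded, which suits a Q value estimate.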

__device_warning_logged = False

_obs_same_device = False

_act_same_device = False

forward(obs, act)

Perform a forward pass through the network.

Parameters:

obs (torch.Tensor) – The tensor of observations.
act (torch.Tensor) – The tensor of actions.

Returns:

The tensor containing the Q values of the input observations and actions.

Return type:

torch.Tensor
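
The call pattern for a forward pass can be sketched as follows. This is a minimal stand-in, not the package's implementation: the function q_forward and the small network net are hypothetical, and the sketch assumes observations and actions are concatenated along the last dimension and that the trailing singleton dimension is squeezed away.

```python
import torch
import torch.nn as nn

# Stand-in network with the input/output contract assumed for a Q critic:
# obs_dim=3, act_dim=2, one scalar output per sample.
net = nn.Sequential(nn.Linear(3 + 2, 64), nn.ReLU(), nn.Linear(64, 1))


def q_forward(net, obs, act):
    """Hypothetical forward pass: Q(obs, act) for a batch."""
    # Join observations and actions along the feature (last) dimension.
    q = net(torch.cat([obs, act], dim=-1))
    # Drop the trailing singleton dimension so the result has shape (batch,).
    return torch.squeeze(q, -1)


obs = torch.randn(8, 3)  # batch of 8 observations
act = torch.randn(8, 2)  # matching batch of 8 actions
q_values = q_forward(net, obs, act)
print(q_values.shape)
```

Both inputs need the same batch size and should live on the same device as the network, otherwise torch.cat or the first linear layer will raise an error.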