stable_learning_control.algos.pytorch.policies.critics
Critic network structures.
Classes
LCritic | Soft Lyapunov critic network.
QCritic | Soft Q critic network.
Package Contents
- class stable_learning_control.algos.pytorch.policies.critics.LCritic(obs_dim, act_dim, hidden_sizes, activation=nn.ReLU)
Bases:
torch.nn.Module
Soft Lyapunov critic network.
- L
The layers of the network.
- Type:
Initialise the LCritic object.
- Parameters:
obs_dim (int) – Dimension of the observation space.
act_dim (int) – Dimension of the action space.
hidden_sizes (list) – Sizes of the hidden layers.
activation (torch.nn.modules.activation, optional) – The activation function. Defaults to torch.nn.ReLU.
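As a rough illustration of how these constructor arguments might map onto layers, the sketch below stacks linear layers whose input width is obs_dim + act_dim. The helper name build_mlp is hypothetical and not part of the package, and the actual layer layout inside LCritic may differ.

```python
import torch
import torch.nn as nn


def build_mlp(obs_dim, act_dim, hidden_sizes, activation=nn.ReLU):
    """Hypothetical helper: stack Linear + activation layers.

    Assumes the critic consumes the concatenated (obs, act) vector.
    """
    sizes = [obs_dim + act_dim] + list(hidden_sizes)
    layers = []
    for in_size, out_size in zip(sizes[:-1], sizes[1:]):
        layers.append(nn.Linear(in_size, out_size))
        layers.append(activation())
    return nn.Sequential(*layers)


net = build_mlp(obs_dim=3, act_dim=1, hidden_sizes=[64, 32])
features = net(torch.zeros(5, 4))  # batch of 5 concatenated obs-act vectors
print(features.shape)  # torch.Size([5, 32])
```

The activation argument is passed as a class (for example nn.ReLU) and instantiated once per layer, which matches the default shown in the signature.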
- __device_warning_logged = False
- _obs_same_device = False
- _act_same_device = False
- L
- forward(obs, act)
Perform forward pass through the network.
- Parameters:
obs (torch.Tensor) – The tensor of observations.
act (torch.Tensor) – The tensor of actions.
- Returns:
The tensor containing the Lyapunov values of the input observations and actions.
- Return type:
torch.Tensor
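A minimal sketch of what such a forward pass could look like, under two assumptions that this page does not confirm: the observation and action tensors are concatenated along the last dimension, and the Lyapunov value is formed as a sum of squares so that it is non-negative. SketchLCritic is an illustrative stand-in, not the package's LCritic.

```python
import torch
import torch.nn as nn


class SketchLCritic(nn.Module):
    """Illustrative stand-in, not the package's LCritic."""

    def __init__(self, obs_dim, act_dim, hidden_sizes, activation=nn.ReLU):
        super().__init__()
        sizes = [obs_dim + act_dim] + list(hidden_sizes)
        layers = []
        for in_size, out_size in zip(sizes[:-1], sizes[1:]):
            layers += [nn.Linear(in_size, out_size), activation()]
        self.L = nn.Sequential(*layers)

    def forward(self, obs, act):
        hidden = self.L(torch.cat([obs, act], dim=-1))
        # Assumption: summing squared features keeps the value non-negative.
        return torch.sum(torch.square(hidden), dim=-1)


critic = SketchLCritic(obs_dim=3, act_dim=1, hidden_sizes=[16, 16])
obs, act = torch.randn(8, 3), torch.randn(8, 1)
l_values = critic(obs, act)  # one value per (obs, act) pair
```

With a batch of 8 observation-action pairs, l_values has shape (8,), and every entry is greater than or equal to zero by construction.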
- class stable_learning_control.algos.pytorch.policies.critics.QCritic(obs_dim, act_dim, hidden_sizes, activation=nn.ReLU, output_activation=nn.Identity)
Bases:
torch.nn.Module
Soft Q critic network.
- Q
The layers of the network.
- Type:
Initialise the QCritic object.
- Parameters:
obs_dim (int) – Dimension of the observation space.
act_dim (int) – Dimension of the action space.
hidden_sizes (list) – Sizes of the hidden layers.
activation (torch.nn.modules.activation, optional) – The activation function. Defaults to torch.nn.ReLU.
output_activation (torch.nn.modules.activation, optional) – The activation function used for the output layers. Defaults to torch.nn.Identity.
- __device_warning_logged = False
- _obs_same_device = False
- _act_same_device = False
- Q
- forward(obs, act)
Perform forward pass through the network.
- Parameters:
obs (torch.Tensor) – The tensor of observations.
act (torch.Tensor) – The tensor of actions.
- Returns:
The tensor containing the Q values of the input observations and actions.
- Return type:
torch.Tensor
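The following is a hedged sketch of a Q critic with the same constructor arguments, assuming a single scalar output per observation-action pair and that output_activation is applied only after the final linear layer. SketchQCritic is a hypothetical stand-in; the package's own QCritic may be structured differently.

```python
import torch
import torch.nn as nn


class SketchQCritic(nn.Module):
    """Illustrative stand-in, not the package's QCritic."""

    def __init__(self, obs_dim, act_dim, hidden_sizes,
                 activation=nn.ReLU, output_activation=nn.Identity):
        super().__init__()
        # Assumption: the last layer maps to one Q value per input pair.
        sizes = [obs_dim + act_dim] + list(hidden_sizes) + [1]
        layers = []
        for i, (in_size, out_size) in enumerate(zip(sizes[:-1], sizes[1:])):
            is_last = i == len(sizes) - 2
            act_cls = output_activation if is_last else activation
            layers += [nn.Linear(in_size, out_size), act_cls()]
        self.Q = nn.Sequential(*layers)

    def forward(self, obs, act):
        q = self.Q(torch.cat([obs, act], dim=-1))
        return torch.squeeze(q, -1)  # drop the trailing size-1 dimension


critic = SketchQCritic(obs_dim=3, act_dim=1, hidden_sizes=[16, 16])
q_values = critic(torch.randn(8, 3), torch.randn(8, 1))
```

Here q_values has shape (8,), one estimate per pair in the batch. Leaving output_activation at nn.Identity keeps the Q value unbounded, which is the usual choice when returns can be negative or large.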