stable_learning_control.algos.pytorch.latc
A Lyapunov (soft) Actor-Twin Critic Agent.
Submodules
Functions
latc(env_fn, actor_critic=None, *args, **kwargs) | Trains the LATC algorithm in a given environment.
Package Contents
- stable_learning_control.algos.pytorch.latc.latc(env_fn, actor_critic=None, *args, **kwargs)
Trains the LATC algorithm in a given environment.
- Parameters:
env_fn – A function which creates a copy of the environment. The environment must satisfy the gymnasium API.
actor_critic (torch.nn.Module, optional) –
The constructor method for a Torch Module with an act method, a pi module and several Q or L modules. The act method and pi module should accept batches of observations as inputs, and the Q* and L modules should accept a batch of observations and a batch of actions as inputs. When called, these modules should return:
Call | Output Shape | Description
act | (batch, act_dim) | Numpy array of actions for each observation.
Q*/L | (batch,) | Tensor containing one current estimate of Q*/L for the provided observations and actions. (Critical: make sure to flatten this!)
Calling pi should return:
Symbol | Shape | Description
a | (batch, act_dim) | Tensor containing actions from policy given observations.
logp_pi | (batch,) | Tensor containing log probabilities of actions in a. Importantly: gradients should be able to flow back into a.
Defaults to LyapunovActorTwinCritic.
*args – The positional arguments to pass to the lac() method.
**kwargs – The keyword arguments to pass to the lac() method.
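The shape requirements on actor_critic can be illustrated with a small module. This is only a minimal sketch, not the library's LyapunovActorTwinCritic: the class names (TinyPolicy, TinyCritic, TinyActorTwinCritic), the critic attribute names L and L2, and the integer obs_dim/act_dim constructor arguments are all assumptions made for illustration, and the policy is a plain (unsquashed) Gaussian.

```python
import torch
import torch.nn as nn


class TinyPolicy(nn.Module):
    """Hypothetical Gaussian policy; calling it returns (a, logp_pi)."""

    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mu = nn.Linear(obs_dim, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        dist = torch.distributions.Normal(self.mu(obs), self.log_std.exp())
        a = dist.rsample()  # reparameterised sample, so gradients reach a
        logp_pi = dist.log_prob(a).sum(dim=-1)  # shape (batch,)
        return a, logp_pi


class TinyCritic(nn.Module):
    """Hypothetical critic taking a batch of observations and actions."""

    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Linear(obs_dim + act_dim, 1)

    def forward(self, obs, act):
        out = self.net(torch.cat([obs, act], dim=-1))
        return out.squeeze(-1)  # flatten (batch, 1) to (batch,)


class TinyActorTwinCritic(nn.Module):
    """Illustrative container: an act method, a pi module, two critics."""

    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.pi = TinyPolicy(obs_dim, act_dim)
        self.L = TinyCritic(obs_dim, act_dim)   # attribute names are assumed
        self.L2 = TinyCritic(obs_dim, act_dim)

    def act(self, obs):
        with torch.no_grad():
            a, _ = self.pi(obs)
        return a.numpy()  # Numpy array of shape (batch, act_dim)


ac = TinyActorTwinCritic(obs_dim=3, act_dim=2)
obs = torch.randn(4, 3)  # batch of 4 observations
a, logp_pi = ac.pi(obs)
l1 = ac.L(obs, a)
print(a.shape, logp_pi.shape, l1.shape, ac.act(obs).shape)
```

The squeeze(-1) in the critic is what the "make sure to flatten this!" remark refers to: a final linear layer produces (batch, 1), which would broadcast incorrectly against (batch,) targets.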
Note
Wraps the lac() function so that the LyapunovActorTwinCritic architecture is used as the actor critic.
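The wrapping described in this note can be pictured with stand-ins. The sketch below does not reproduce the library's code: lac_stub, latc_stub, SingleCriticStandIn and TwinCriticStandIn are hypothetical names, and the stub trainer only reports what it received. It shows one plausible way a wrapper can swap in a different default architecture while forwarding every other argument unchanged.

```python
class SingleCriticStandIn:
    """Hypothetical default architecture of the wrapped function."""


class TwinCriticStandIn:
    """Hypothetical twin-critic architecture used by the wrapper."""


def lac_stub(env_fn, actor_critic=None, *args, **kwargs):
    """Stand-in for lac(); returns what it was given instead of training."""
    if actor_critic is None:
        actor_critic = SingleCriticStandIn
    return actor_critic.__name__, args, kwargs


def latc_stub(env_fn, actor_critic=None, *args, **kwargs):
    """Stand-in for latc(); only changes the default architecture."""
    if actor_critic is None:
        actor_critic = TwinCriticStandIn
    # All remaining positional and keyword arguments pass straight through.
    return lac_stub(env_fn, actor_critic, *args, **kwargs)


def make_env():
    """Placeholder environment factory; a real one returns a gymnasium env."""
    return None


default_call = latc_stub(make_env)
custom_call = latc_stub(make_env, actor_critic=SingleCriticStandIn, seed=0)
print(default_call)
print(custom_call)
```

With this pattern a caller can still pass an explicit actor_critic; the wrapper's default applies only when the argument is left as None.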