stable_learning_control.algos.tf2.latc
A Lyapunov (soft) Actor-Twin Critic Agent.
Functions
latc(env_fn, actor_critic=None, *args, **kwargs)    Trains the LATC algorithm in a given environment.
Package Contents
- stable_learning_control.algos.tf2.latc.latc(env_fn, actor_critic=None, *args, **kwargs)
Trains the LATC algorithm in a given environment.
- Parameters:
env_fn – A function which creates a copy of the environment. The environment must satisfy the gymnasium API.
actor_critic (tf.Module, optional) – The constructor method for a TensorFlow Module with an act method, a pi module and several Q or L modules. The act method and pi module should accept batches of observations as inputs, and the Q* and L modules should accept a batch of observations and a batch of actions as inputs. When called, these modules should return:

Call      Output Shape       Description
act       (batch, act_dim)   Numpy array of actions for each observation.
Q*/L      (batch,)           Tensor containing one current estimate of Q*/L
                             for the provided observations and actions.
                             (Critical: make sure to flatten this!)

Calling pi should return:

Symbol    Shape              Description
a         (batch, act_dim)   Tensor containing actions from policy given
                             observations.
logp_pi   (batch,)           Tensor containing log probabilities of actions
                             in a. Importantly: gradients should be able to
                             flow back into a.

Defaults to LyapunovActorTwinCritic.
*args – The positional arguments to pass to the lac() method.
**kwargs – The keyword arguments to pass to the lac() method.
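The shape requirements on the Q*/L modules can be illustrated with a minimal sketch. It uses NumPy arrays as a stand-in for TensorFlow tensors (the broadcasting rules are the same), and the linear critic, the dimensions and every variable name are assumptions made for illustration, not part of the package. It shows one plausible reason for the flattening warning: an unflattened (batch, 1) output silently broadcasts against (batch,) targets.

```python
import numpy as np

# Illustrative dimensions (assumed, not taken from the package).
batch, obs_dim, act_dim = 4, 3, 2
rng = np.random.default_rng(0)
obs = rng.normal(size=(batch, obs_dim))
act = rng.normal(size=(batch, act_dim))

# Hypothetical linear head standing in for a Q*/L module.
weights = rng.normal(size=(obs_dim + act_dim, 1))


def critic(obs, act):
    """Return one value estimate per (observation, action) pair."""
    raw = np.concatenate([obs, act], axis=-1) @ weights  # shape (batch, 1)
    return np.squeeze(raw, axis=-1)  # flattened to shape (batch,)


l_values = critic(obs, act)
targets = rng.normal(size=(batch,))

# Without the squeeze, (batch, 1) - (batch,) broadcasts to (batch, batch).
unflattened = np.concatenate([obs, act], axis=-1) @ weights
bad_shape = (unflattened - targets).shape
good_shape = (l_values - targets).shape
print(l_values.shape, bad_shape, good_shape)
```

In a real actor critic the same check would be made on the tensors returned by the Q*/L modules before they enter the loss.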
Note
Wraps the lac() function so that the LyapunovActorTwinCritic architecture is used as the actor critic.
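The wrapping described in the note can be sketched as follows. This is only an illustration of the pattern under stated assumptions: lac_stub and LyapunovActorTwinCriticStub are hypothetical placeholders for the real lac() function and LyapunovActorTwinCritic class, latc_sketch is not the package's implementation, and the epochs keyword is merely an example argument.

```python
def lac_stub(env_fn, actor_critic=None, *args, **kwargs):
    """Hypothetical stand-in for lac(); it only reports what it received."""
    return {
        "env": env_fn(),
        "actor_critic": actor_critic,
        "args": args,
        "kwargs": kwargs,
    }


class LyapunovActorTwinCriticStub:
    """Hypothetical placeholder for the LyapunovActorTwinCritic architecture."""


def latc_sketch(env_fn, actor_critic=None, *args, **kwargs):
    """Forward everything to the wrapped trainer with a twin-critic default."""
    # Fall back to the twin-critic architecture when none is supplied.
    if actor_critic is None:
        actor_critic = LyapunovActorTwinCriticStub
    return lac_stub(env_fn, actor_critic, *args, **kwargs)


# Example call: the environment factory and the keyword are illustrative only.
result = latc_sketch(lambda: "dummy-env", epochs=5)
print(result["actor_critic"].__name__, result["kwargs"])
```

Keeping the wrapper this thin means all training options remain those of the wrapped function, and only the default architecture differs.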