stable_learning_control.algos.tf2.latc

A Lyapunov (soft) Actor-Twin Critic Agent.

Submodules

stable_learning_control.algos.tf2.latc.latc

Functions

latc(env_fn[, actor_critic])

Trains the LATC algorithm in a given environment.

Package Contents

stable_learning_control.algos.tf2.latc.latc(env_fn, actor_critic=None, *args, **kwargs)

Trains the LATC algorithm in a given environment.

Parameters:

env_fn – A function which creates a copy of the environment. The environment must satisfy the gymnasium API.

actor_critic (tf.Module, optional) –

The constructor method for a TensorFlow Module with an act method, a pi module and several Q or L modules. The act method and pi module should accept batches of observations as inputs, and the Q* and L modules should accept a batch of observations and a batch of actions as inputs. When called, these modules should return:

Call	Output Shape	Description
`act`	(batch, act_dim)	Numpy array of actions for each observation.
`Q*/L`	(batch,)	Tensor containing one current estimate of `Q*/L` for the provided observations and actions. (Critical: make sure to flatten this!)

Calling pi should return:

Symbol	Shape	Description
`a`	(batch, act_dim)	Tensor containing actions from policy given observations.
`logp_pi`	(batch,)	Tensor containing log probabilities of actions in `a`. Importantly: gradients should be able to flow back into `a`.

Defaults to LyapunovActorTwinCritic.

*args – The positional arguments to pass to the lac() function.
**kwargs – The keyword arguments to pass to the lac() function.
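
The output shapes listed for actor_critic can be checked before training starts. The following is a minimal sketch, not part of the library: check_output_shapes is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only carry a shape. It assumes each output exposes a .shape attribute, as NumPy arrays and TensorFlow tensors do.

```python
from types import SimpleNamespace


def check_output_shapes(batch, act_dim, act, q_or_l, pi_action, logp_pi):
    """Raise ValueError if any output violates the documented shape contract.

    Hypothetical helper: each argument only needs a ``.shape`` attribute.
    """
    expected = {
        "act": (act, (batch, act_dim)),
        "Q*/L": (q_or_l, (batch,)),  # Must be flattened, not (batch, 1).
        "a": (pi_action, (batch, act_dim)),
        "logp_pi": (logp_pi, (batch,)),
    }
    for name, (value, shape) in expected.items():
        actual = tuple(value.shape)
        if actual != shape:
            raise ValueError(f"{name}: expected shape {shape}, got {actual}")
    return True


# Stand-ins that only carry a shape; real code would pass arrays or tensors.
batch, act_dim = 4, 2
actions = SimpleNamespace(shape=(batch, act_dim))
flat = SimpleNamespace(shape=(batch,))
unflattened = SimpleNamespace(shape=(batch, 1))

# A conforming set of outputs passes.
assert check_output_shapes(batch, act_dim, actions, flat, actions, flat)

# A critic output left as (batch, 1) is rejected.
try:
    check_output_shapes(batch, act_dim, actions, unflattened, actions, flat)
except ValueError as err:
    print(err)
```

The (batch, 1) case is the one the table warns about: an unflattened critic output can broadcast silently against a (batch,) target, so an explicit check is cheaper than debugging a wrong loss.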

Note

Wraps the lac() function so that the LyapunovActorTwinCritic architecture is used as the actor-critic.
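
The wrapping pattern described in this note can be sketched as follows. This is an illustration under stated assumptions, not the library's source: the lac() and LyapunovActorTwinCritic defined here are local stand-ins so the example runs without TensorFlow, and the epochs keyword is hypothetical. The real wrapper may differ in how it handles a caller-supplied actor_critic.

```python
def lac(env_fn, actor_critic=None, *args, **kwargs):
    """Stand-in for the real lac() function; only reports what it received."""
    return {
        "env": env_fn(),
        "actor_critic": actor_critic,
        "args": args,
        "kwargs": kwargs,
    }


class LyapunovActorTwinCritic:
    """Placeholder for the real actor-critic architecture."""


def latc(env_fn, actor_critic=None, *args, **kwargs):
    """Call lac() with LyapunovActorTwinCritic as the default architecture."""
    if actor_critic is None:
        actor_critic = LyapunovActorTwinCritic
    # Remaining positional and keyword arguments are forwarded unchanged.
    return lac(env_fn, actor_critic, *args, **kwargs)


# The lambda stands in for a function that builds an environment.
result = latc(lambda: "env", epochs=5)
assert result["actor_critic"] is LyapunovActorTwinCritic
assert result["kwargs"] == {"epochs": 5}
```

Because everything after actor_critic is forwarded, any option that lac() accepts can be passed to latc() without the wrapper needing to know about it.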