stable_learning_control.algos.pytorch.latc

A Lyapunov (soft) Actor-Twin Critic Agent.

Package Contents

Functions

latc(env_fn[, actor_critic])

Trains the LATC algorithm in a given environment.

stable_learning_control.algos.pytorch.latc.latc(env_fn, actor_critic=None, *args, **kwargs)

Trains the LATC algorithm in a given environment.

Parameters:
  • env_fn – A function that creates a copy of the environment. The environment must satisfy the Gymnasium API.

  • actor_critic (torch.nn.Module, optional) –

    The constructor method for a Torch Module with an act method, a pi module and several Q or L modules. The act method and pi module should accept batches of observations as inputs, and the Q* and L modules should accept a batch of observations and a batch of actions as inputs. When called, these modules should return:

    Call    Output Shape        Description
    act     (batch, act_dim)    Numpy array of actions for each observation.
    Q*/L    (batch,)            Tensor containing one current estimate of Q*/L for the provided observations and actions. (Critical: make sure to flatten this!)

    Calling pi should return:

    Symbol     Shape               Description
    a          (batch, act_dim)    Tensor containing actions from policy given observations.
    logp_pi    (batch,)            Tensor containing log probabilities of actions in a. Importantly: gradients should be able to flow back into a.

    Defaults to LyapunovActorTwinCritic.

  • *args – The positional arguments to pass to the lac() function.

  • **kwargs – The keyword arguments to pass to the lac() function.
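
The shape requirements listed for actor_critic can be illustrated with a small stand-in module. This is a minimal sketch, not the LyapunovActorTwinCritic implementation: the class names GaussianActor, Critic and TinyActorTwinCritic, the attribute names L and L2, and the network sizes are all assumptions made for illustration. It only shows one way to return outputs with the documented shapes, with gradients able to flow back into a.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal


class GaussianActor(nn.Module):
    """Hypothetical policy: maps observations to (a, logp_pi)."""

    def __init__(self, obs_dim, act_dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        h = self.net(obs)
        std = torch.exp(torch.clamp(self.log_std(h), -5.0, 2.0))
        dist = Normal(self.mu(h), std)
        # rsample() uses the reparameterization trick, so gradients
        # can flow back through the sampled actions.
        a = dist.rsample()                      # (batch, act_dim)
        logp_pi = dist.log_prob(a).sum(dim=-1)  # (batch,)
        return a, logp_pi


class Critic(nn.Module):
    """Hypothetical critic: maps (obs, act) to one value per sample."""

    def __init__(self, obs_dim, act_dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, obs, act):
        # squeeze(-1) flattens (batch, 1) to (batch,), as the table requires.
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)


class TinyActorTwinCritic(nn.Module):
    """Hypothetical module with an act method, a pi module and two critics."""

    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.pi = GaussianActor(obs_dim, act_dim)
        self.L = Critic(obs_dim, act_dim)
        self.L2 = Critic(obs_dim, act_dim)  # assumed name for the twin critic

    def act(self, obs):
        # Returns a Numpy array of actions, without tracking gradients.
        with torch.no_grad():
            a, _ = self.pi(obs)
        return a.cpu().numpy()


ac = TinyActorTwinCritic(obs_dim=3, act_dim=2)
obs = torch.randn(5, 3)
a, logp_pi = ac.pi(obs)
values = ac.L(obs, a)
actions = ac.act(obs)
```

Here a has shape (5, 2), while logp_pi and values have shape (5,); actions is a Numpy array of shape (5, 2).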

Note

Wraps the lac() function so that the LyapunovActorTwinCritic architecture is used as the actor critic.
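
The wrapping described in the note can be pictured roughly as follows. This is a sketch under stated assumptions, not the library source: lac and LyapunovActorTwinCritic below are local placeholders standing in for the real objects, and the fallback-when-None behaviour is inferred from the "Defaults to" remark in the parameter list. The keyword some_option is hypothetical.

```python
class LyapunovActorTwinCritic:
    """Placeholder standing in for the real architecture class."""


def lac(env_fn, actor_critic=None, *args, **kwargs):
    """Placeholder for the real lac(); echoes what it received."""
    return {
        "env_fn": env_fn,
        "actor_critic": actor_critic,
        "args": args,
        "kwargs": kwargs,
    }


def latc(env_fn, actor_critic=None, *args, **kwargs):
    """Sketch of a thin wrapper that forwards everything to lac()."""
    # Assumption: the twin-critic architecture is used when none is supplied.
    if actor_critic is None:
        actor_critic = LyapunovActorTwinCritic
    return lac(env_fn, actor_critic, *args, **kwargs)


# some_option is a made-up keyword used only to show the pass-through.
result = latc(lambda: "env", some_option=1)
```

In this sketch, every positional and keyword argument other than actor_critic reaches lac() unchanged.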