stable_learning_control.algos.pytorch.common.buffers

Contains several replay buffers used in the PyTorch algorithms.

Classes

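`ReplayBuffer`	Wrapper around the general FIFO ReplayBuffer which makes sure a torch.tensor is returned when sampling.
`FiniteHorizonReplayBuffer`	Wrapper around the general FIFO FiniteHorizonReplayBuffer which makes sure a torch.tensor is returned when sampling.
`TrajectoryBuffer`	Wrapper around the general TrajectoryBuffer which makes sure a torch.tensor is returned when sampling.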

Module Contents

class stable_learning_control.algos.pytorch.common.buffers.ReplayBuffer(device='cpu', *args, **kwargs)[source]

Bases: stable_learning_control.algos.common.buffers.ReplayBuffer

Wrapper around the general FIFO ReplayBuffer that ensures sampled experiences are returned as torch.Tensor objects.

device[source]

The device the experiences are placed on (options: cpu, gpu, gpu:0, gpu:1, etc.).

Type: str

Initialise the ReplayBuffer object.

Parameters:

device (str, optional) – The computational device to put the sampled experiences on (options: cpu, gpu, gpu:0, gpu:1, etc.). Defaults to cpu.
*args – All args to pass to the ReplayBuffer parent class.
**kwargs – All kwargs to pass to the ReplayBuffer parent class.
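PyTorch itself names GPU devices `cuda`, `cuda:0`, `cuda:1`, and so on, so the `gpu` options listed above presumably have to be translated before they reach PyTorch. The sketch below shows one way such a translation could look. The helper name `to_torch_device_string` is hypothetical and is not taken from this package; it only illustrates the mapping and is not the library's implementation.

```python
def to_torch_device_string(device="cpu"):
    """Hypothetical helper: map 'gpu' / 'gpu:N' to PyTorch's 'cuda' / 'cuda:N'."""
    name = device.strip().lower()
    if name == "gpu" or name.startswith("gpu:"):
        # Keep any ':N' index suffix and swap only the device type.
        return "cuda" + name[len("gpu"):]
    return name


for option in ("cpu", "gpu", "gpu:0", "gpu:1"):
    print(option, "->", to_torch_device_string(option))
```

The resulting string could then be handed to `torch.device(...)`, which accepts `cpu`, `cuda` and `cuda:N`.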

sample_batch(*args, **kwargs)[source]

Retrieve a batch of experiences from the buffer.

Parameters:

*args – All args to pass to the sample_batch() parent method.
**kwargs – All kwargs to pass to the sample_batch() parent method.

Returns:

A batch of experiences.

Return type:

dict
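The wrapper pattern described for this class can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: `ListReplayBuffer` is a hypothetical stand-in for the framework-agnostic parent class, and the field names `obs` and `rew`, the `store()` signature and the `size` and `batch_size` arguments are invented for the example. Only the idea is taken from the text above: `sample_batch()` forwards its arguments to the parent and converts the returned dict to tensors on the chosen device.

```python
import random

import torch


class ListReplayBuffer:
    """Hypothetical stand-in for the framework-agnostic FIFO parent buffer."""

    def __init__(self, size):
        self.size = size
        self.obs, self.rew = [], []

    def store(self, obs, rew):
        # FIFO behaviour: drop the oldest experience once the buffer is full.
        if len(self.obs) == self.size:
            self.obs.pop(0)
            self.rew.pop(0)
        self.obs.append(obs)
        self.rew.append(rew)

    def sample_batch(self, batch_size=2):
        # Sample uniformly (with replacement) from the stored experiences.
        idxs = [random.randrange(len(self.obs)) for _ in range(batch_size)]
        return {
            "obs": [self.obs[i] for i in idxs],
            "rew": [self.rew[i] for i in idxs],
        }


class TorchReplayBuffer(ListReplayBuffer):
    """Wrapper that converts every sampled field to a tensor on ``device``."""

    def __init__(self, device="cpu", *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.device = torch.device(device)

    def sample_batch(self, *args, **kwargs):
        batch = super().sample_batch(*args, **kwargs)
        return {
            key: torch.as_tensor(value, dtype=torch.float32, device=self.device)
            for key, value in batch.items()
        }


buffer = TorchReplayBuffer(device="cpu", size=3)
for step in range(5):
    buffer.store(obs=[float(step), float(step) + 0.5], rew=float(step))
batch = buffer.sample_batch(batch_size=4)
print({key: tuple(value.shape) for key, value in batch.items()})
```

Keeping the conversion in the subclass leaves the parent buffer free of any PyTorch dependency, which is presumably why the package separates the two.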

class stable_learning_control.algos.pytorch.common.buffers.FiniteHorizonReplayBuffer(device='cpu', *args, **kwargs)[source]

Bases: stable_learning_control.algos.common.buffers.FiniteHorizonReplayBuffer

Wrapper around the general FIFO FiniteHorizonReplayBuffer that ensures sampled experiences are returned as torch.Tensor objects.

device[source]

The device the experiences are placed on (options: cpu, gpu, gpu:0, gpu:1, etc.).

Type: str

Initialise the FiniteHorizonReplayBuffer object.

Parameters:

device (str, optional) – The computational device to put the sampled experiences on (options: cpu, gpu, gpu:0, gpu:1, etc.). Defaults to cpu.
*args – All args to pass to the FiniteHorizonReplayBuffer parent class.
**kwargs – All kwargs to pass to the FiniteHorizonReplayBuffer parent class.

sample_batch(*args, **kwargs)[source]

Retrieve a batch of experiences from the buffer.

Parameters:

*args – All args to pass to the sample_batch() parent method.
**kwargs – All kwargs to pass to the sample_batch() parent method.

Returns:

A batch of experiences.

Return type:

dict

class stable_learning_control.algos.pytorch.common.buffers.TrajectoryBuffer(device='cpu', *args, **kwargs)[source]

Bases: stable_learning_control.algos.common.buffers.TrajectoryBuffer

Wrapper around the general TrajectoryBuffer that ensures the retrieved data is returned as torch.Tensor objects.

device[source]

The device the experiences are placed on (options: cpu, gpu, gpu:0, gpu:1, etc.).

Type: str

Initialise the TrajectoryBuffer object.

Parameters:

device (str, optional) – The computational device to put the sampled experiences on (options: cpu, gpu, gpu:0, gpu:1, etc.). Defaults to cpu.
*args – All args to pass to the TrajectoryBuffer parent class.
**kwargs – All kwargs to pass to the TrajectoryBuffer parent class.

get(*args, **kwargs)[source]

Retrieve the trajectory buffer.

Call this at the end of an epoch to get all of the data from the buffer. It also resets some pointers in the buffer.

Parameters:

*args – All args to pass to the get() parent method.
**kwargs – All kwargs to pass to the get() parent method.

Returns:

The trajectory buffer.

Return type:

dict
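The get-then-reset behaviour described for `get()` can be illustrated with a small sketch. Everything specific here is an assumption made for the example: `ListTrajectoryBuffer` is a hypothetical stand-in for the parent class, and the single `obs` field and the `ptr` attribute are invented names, since the page does not say which pointers the real buffer resets.

```python
import torch


class ListTrajectoryBuffer:
    """Hypothetical stand-in for the framework-agnostic parent buffer."""

    def __init__(self, size):
        self.obs = [0.0] * size
        self.ptr = 0  # Index of the next free slot.

    def store(self, obs):
        self.obs[self.ptr] = obs
        self.ptr += 1

    def get(self):
        # Return only the filled part of the buffer.
        data = {"obs": self.obs[: self.ptr]}
        # Reset the pointer so the next epoch overwrites the old data.
        self.ptr = 0
        return data


class TorchTrajectoryBuffer(ListTrajectoryBuffer):
    """Wrapper that converts the retrieved data to tensors on ``device``."""

    def __init__(self, device="cpu", *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.device = torch.device(device)

    def get(self, *args, **kwargs):
        data = super().get(*args, **kwargs)
        return {
            key: torch.as_tensor(value, dtype=torch.float32, device=self.device)
            for key, value in data.items()
        }


buffer = TorchTrajectoryBuffer(device="cpu", size=4)
for value in (1.0, 2.0, 3.0):
    buffer.store(value)
epoch_data = buffer.get()
print(epoch_data["obs"], buffer.ptr)
```

Because `get()` has this side effect, calling it twice in a row in the sketch would return an empty tensor the second time; whether the real buffer behaves the same way is not stated on this page.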