MPI Tools 

Core MPI Utilities 

Module used for managing MPI processes.

stable_learning_control.utils.mpi_utils.mpi_tools.mpi_fork(n, bind_to_core=False)[source]

Re-launches the current script with workers linked by MPI.

Also, terminates the original process that launched it.

Taken almost without modification from the Baselines function of the same name.

Parameters:

n (int) – Number of processes to split into.
bind_to_core (bool, optional) – Bind each MPI process to a core. Defaults to False.
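
The relaunch pattern can be sketched as follows. This is an illustration modelled on the Baselines approach, not this package's exact code: the helper name build_mpirun_command is hypothetical, the -bind-to core flag assumes Open MPI-style syntax, and the command is only constructed, never executed.

```python
import sys


def build_mpirun_command(n, bind_to_core=False, script_args=None):
    """Build (but do not run) an mpirun command that relaunches a script.

    Hypothetical helper for illustration; it is not part of mpi_tools.
    """
    if script_args is None:
        script_args = sys.argv
    cmd = ["mpirun", "-np", str(n)]
    if bind_to_core:
        # Assumption: Open MPI-style flag syntax.
        cmd += ["-bind-to", "core"]
    # Re-run the same interpreter with the same script and arguments.
    cmd += [sys.executable] + list(script_args)
    return cmd


cmd = build_mpirun_command(
    4, bind_to_core=True, script_args=["train.py", "--seed", "0"]
)
print(" ".join(cmd))
```

A real relaunch would typically pass such a command to subprocess.check_call and then exit the parent process, guarded by an environment variable so that the workers do not relaunch themselves.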

stable_learning_control.utils.mpi_utils.mpi_tools.msg(m, string='')[source]

Send message from one MPI process to the other.

Parameters:

m (str) – Message you want to send.
string (str, optional) – Additional process description. Defaults to "".

stable_learning_control.utils.mpi_utils.mpi_tools.pprint(input_str='', end='\n', comm=mpi4py.MPI.COMM_WORLD)[source]

Print for MPI parallel programs: only the process with rank 0 prints the string.

Parameters:

input_str (str) – The string you want to print.
end (str) – The print end character.
comm (mpi4py.MPI.COMM_WORLD) – MPI communicator.
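
The rank-0 printing idea can be sketched without an MPI installation. The names FakeComm and rank_zero_print are hypothetical and exist only for this illustration; the only thing assumed about a real communicator is its Get_rank() method, which mpi4py communicators provide.

```python
class FakeComm:
    """Stand-in for an MPI communicator (assumption: lets the sketch run without MPI)."""

    def __init__(self, rank):
        self._rank = rank

    def Get_rank(self):
        return self._rank


def rank_zero_print(input_str, comm, end="\n"):
    """Print only when called from rank 0; return True if something was printed."""
    if comm.Get_rank() == 0:
        print(str(input_str), end=end)
        return True
    return False


# Simulate three processes calling the same function.
printed = [rank_zero_print("epoch done", FakeComm(rank)) for rank in range(3)]
```

Only the simulated rank 0 produces output, which keeps logs from being repeated once per process.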

stable_learning_control.utils.mpi_utils.mpi_tools.proc_id()[source]

Get rank of calling process.

stable_learning_control.utils.mpi_utils.mpi_tools.allreduce(*args, **kwargs)[source]

Reduce the results of an operation across all processes.

Parameters:

*args – All args to pass to thunk.
**kwargs – All kwargs to pass to thunk.

Returns:

Result object.

Return type:

object

stable_learning_control.utils.mpi_utils.mpi_tools.num_procs()[source]

Count active MPI processes.

Returns: The number of MPI processes.
Return type: int

stable_learning_control.utils.mpi_utils.mpi_tools.broadcast(x, root=0)[source]

Broadcast variable to other MPI processes.

Parameters:

x (object) – Variable you want to broadcast.
root (int, optional) – Rank of the root process. Defaults to 0.

stable_learning_control.utils.mpi_utils.mpi_tools.mpi_op(x, op)[source]

Perform an MPI operation.

Parameters:

x (object) – Python variable.
op (mpi4py.MPI.Op) – Operation type.

Returns:

Reduced MPI operation result.

Return type:

object

stable_learning_control.utils.mpi_utils.mpi_tools.mpi_sum(x)[source]

Take the sum of a scalar or vector over MPI processes.

Parameters: x (object) – Python variable.
Returns: Reduced sum.
Return type: object

stable_learning_control.utils.mpi_utils.mpi_tools.mpi_avg(x)[source]

Average a scalar or vector over MPI processes.

Parameters: x (object) – Python variable.
Returns: Reduced average.
Return type: object

stable_learning_control.utils.mpi_utils.mpi_tools.mpi_statistics_scalar(x, with_min_and_max=False)[source]

Get mean/std and optional min/max of scalar x across MPI processes.

Parameters:

x – An array containing samples of the scalar to produce statistics for.
with_min_and_max (bool, optional) – If true, return min and max of x in addition to mean and std. Defaults to False.

Returns:

Reduced mean and standard deviation, plus the minimum and maximum when with_min_and_max is True.

Return type:

tuple
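
One way to obtain such statistics is to reduce per-process sums and counts rather than gathering all samples. The sketch below simulates the processes with a list of lists. The function name global_mean_std is hypothetical, and the use of the population standard deviation (dividing by N) is an assumption, not something stated here.

```python
import math


def global_mean_std(per_rank_samples, with_min_and_max=False):
    """Simulated cross-process statistics; each inner list is one rank's samples."""
    # Step 1: a sum-reduction of the local sums and local counts.
    total = sum(sum(samples) for samples in per_rank_samples)
    count = sum(len(samples) for samples in per_rank_samples)
    mean = total / count
    # Step 2: a sum-reduction of squared deviations from the global mean.
    sq_dev = sum(
        sum((v - mean) ** 2 for v in samples) for samples in per_rank_samples
    )
    std = math.sqrt(sq_dev / count)  # assumption: population standard deviation
    if with_min_and_max:
        # Min- and max-reductions of the local extremes (no rank may be empty).
        lowest = min(min(samples) for samples in per_rank_samples)
        highest = max(max(samples) for samples in per_rank_samples)
        return mean, std, lowest, highest
    return mean, std


data = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]  # three simulated ranks
mean, std, lowest, highest = global_mean_std(data, with_min_and_max=True)
```

Note that the mean is weighted by the number of samples on each process, so processes holding different amounts of data are handled correctly.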

MPI + PyTorch Utilities

stable_learning_control.utils.mpi_utils.mpi_pytorch contains a few tools that make it easy to do data-parallel PyTorch optimization across MPI processes. The two main ingredients are syncing parameters and averaging gradients before the adaptive optimizer uses them. There is also a hacky fix for a problem where the PyTorch instance in each separate process tries to claim too many threads, so that the processes start to clobber each other.

The pattern for using these tools looks something like this:

At the beginning of the training script, call setup_pytorch_for_mpi(). (Avoids clobbering problem.)
After you’ve constructed a PyTorch module, call sync_params(module).
Then, during gradient descent, call mpi_avg_grads after the backward pass, like so:

optimizer.zero_grad()
loss = compute_loss(module)
loss.backward()
mpi_avg_grads(module)   # averages gradient buffers across MPI processes!
optimizer.step()
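
To see why syncing parameters once and averaging gradients at every step keeps all workers identical, the pattern can be simulated in plain Python. The sketch makes no PyTorch or MPI calls; local_gradient, averaged_step, and the toy data are hypothetical and stand in for the real backward pass and gradient averaging.

```python
def local_gradient(w, shard):
    """Gradient of the mean of 0.5 * (w * x - y) ** 2 over one worker's data shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def averaged_step(weights, shards, lr=0.1):
    """One SGD step in which every worker applies the same averaged gradient."""
    grads = [local_gradient(w, shard) for w, shard in zip(weights, shards)]
    avg_grad = sum(grads) / len(grads)  # plays the role of gradient averaging
    return [w - lr * avg_grad for w in weights]


# Two simulated workers, each with its own shard of data drawn from y = 2 * x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
weights = [0.5, -1.0]  # workers start with different parameters

# Plays the role of parameter syncing: everyone copies worker 0's value.
weights = [weights[0]] * len(weights)

for _ in range(50):
    weights = averaged_step(weights, shards)
```

Because every worker starts from the same value and applies the same averaged gradient, the copies never drift apart, and here they all converge towards the true slope of 2.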

Helper methods for managing PyTorch MPI processes.

Note

This module is not yet used in any of the current algorithms, but is kept here for future reference.

stable_learning_control.utils.mpi_utils.mpi_pytorch.setup_pytorch_for_mpi()[source]

Avoid slowdowns caused by each separate process’s PyTorch using more than its fair share of CPU resources.

stable_learning_control.utils.mpi_utils.mpi_pytorch.mpi_avg_grads(module)[source]

Average contents of gradient buffers across MPI processes.

Parameters: module (object) – Python object for which you want to average the gradients.

stable_learning_control.utils.mpi_utils.mpi_pytorch.sync_params(module)[source]

Sync all parameters of module across all MPI processes.

Parameters: module (object) – Python object whose parameters you want to sync.

MPI + TensorFlow Utilities 

Todo

Tools to make it easy to do data-parallel TensorFlow 2.x optimization across MPI processes still need to be implemented.
