bagua.torch_api.algorithms
Package Contents
- class bagua.torch_api.algorithms.Algorithm
This is the base class that all Bagua algorithms inherit from.
- reify(self, process_group)
Create an algorithm instance.
- Parameters
process_group (bagua.torch_api.communication.BaguaProcessGroup) – The process group to work on.
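One way to picture reify is as a small factory: the algorithm object holds configuration, and reify binds that configuration to a process group. The sketch below uses stand-in classes only. FakeProcessGroup, SketchAlgorithm, SketchAlgorithmImpl and the average option are hypothetical names that are not part of Bagua, and the assumption that reify returns an implementation object is not stated by the description above.

```python
class FakeProcessGroup:
    """Hypothetical stand-in for BaguaProcessGroup; not a Bagua class."""

    def __init__(self, ranks):
        self.ranks = list(ranks)


class SketchAlgorithmImpl:
    """Stand-in for an implementation object bound to one process group."""

    def __init__(self, process_group, average):
        self.process_group = process_group
        self.average = average


class SketchAlgorithm:
    """Stand-in for an algorithm object that only holds configuration."""

    def __init__(self, average=True):
        self.average = average

    def reify(self, process_group):
        # Assumption: reify builds the implementation for this process group.
        return SketchAlgorithmImpl(process_group, average=self.average)


group = FakeProcessGroup(ranks=[0, 1, 2, 3])
impl = SketchAlgorithm(average=False).reify(group)
```

Keeping configuration and group-bound state in separate objects would let the same configuration be reified against different process groups.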
- class bagua.torch_api.algorithms.AlgorithmImpl(process_group)
This is the base class that all Bagua algorithm implementations inherit from.
It provides methods that can be overridden to implement different kinds of distributed algorithms.
- Parameters
process_group (bagua.torch_api.communication.BaguaProcessGroup) – The process group to work on.
- init_backward_hook(self, bagua_module)
Given a BaguaModule, return a hook function that will be executed each time a parameter’s gradient computation completes.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
- Returns
A function that takes the name of a parameter (as in torch.nn.Module.named_parameters) and the parameter itself.
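The shape of such a hook can be illustrated without Bagua installed. In the sketch below, SketchImpl is a hypothetical stand-in for an AlgorithmImpl subclass, and the hook is called by hand with placeholder objects instead of real parameters; only the (name, parameter) signature comes from the description above.

```python
class SketchImpl:
    """Hypothetical stand-in for an AlgorithmImpl subclass; not Bagua code."""

    def __init__(self):
        # Names of parameters whose gradients have been reported as ready.
        self.ready = []

    def init_backward_hook(self, bagua_module):
        # bagua_module is unused in this sketch.
        def hook(parameter_name, parameter):
            # Record the order in which gradients become available.
            self.ready.append(parameter_name)

        return hook


impl = SketchImpl()
hook = impl.init_backward_hook(bagua_module=None)

# Simulate two gradients becoming ready, last layer first.
hook("fc2.weight", object())
hook("fc1.weight", object())
```

A closure is used so that the hook can reach the implementation's state without taking extra arguments.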
- init_forward_pre_hook(self, bagua_module)
Given a BaguaModule, return a hook function that will be executed before the forward pass.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
- Returns
A function that takes the model’s input.
- init_operations(self, bagua_module, bucket)
Given a BaguaModule and a BaguaBucket, register operations to be executed on the bucket.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
bucket (bagua.torch_api.bucket.BaguaBucket) – A single bucket to register operations on.
- init_post_backward_hook(self, bagua_module)
Given a BaguaModule, return a hook function that will be executed when the backward pass is done.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
- Returns
A function that takes no arguments.
- init_post_optimizer_step_hook(self, bagua_module)
Given a BaguaModule, return a hook function that will be executed when optimizer.step() is done.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
- Returns
A function that gets called after an optimizer’s step() method is called. The function takes the optimizer as its argument.
- init_tensors(self, bagua_module)
Given a BaguaModule, return the Bagua tensors to be used by Bagua in later operations.
- Parameters
bagua_module (bagua.torch_api.distributed.BaguaModule) – A PyTorch module initialized by the with_bagua method.
- Returns
A list of Bagua tensors for communication.
- Return type
- need_reset(self)
- Returns
True if all initialization methods of the current algorithm should be called again. This is useful for algorithms that have multiple stages where each stage needs different initialization.
- Return type
bool
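A multi-stage algorithm could signal a stage change through need_reset roughly as follows. This is a minimal sketch under stated assumptions: TwoStageImpl, its warmup_steps option and the stage names are hypothetical and not part of Bagua, and the loop calls the hook and need_reset by hand, since the point at which the framework queries need_reset is not described above.

```python
class TwoStageImpl:
    """Hypothetical two-stage stand-in; warmup_steps is an invented option."""

    def __init__(self, warmup_steps):
        self.warmup_steps = warmup_steps
        self.step_count = 0
        self.stage = "warmup"
        self._reset_pending = False

    def init_post_optimizer_step_hook(self, bagua_module):
        def hook(optimizer):
            # Count optimizer steps and switch stage once warmup is over.
            self.step_count += 1
            if self.stage == "warmup" and self.step_count >= self.warmup_steps:
                self.stage = "main"
                self._reset_pending = True

        return hook

    def need_reset(self):
        # Report the stage switch exactly once, then clear the flag.
        pending = self._reset_pending
        self._reset_pending = False
        return pending


impl = TwoStageImpl(warmup_steps=2)
hook = impl.init_post_optimizer_step_hook(bagua_module=None)

resets = []
for _ in range(4):
    hook(optimizer=None)  # stands in for a completed optimizer.step()
    resets.append(impl.need_reset())
```

Clearing the flag inside need_reset keeps the initialization methods from being rerun on every later iteration of the new stage.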
- tensors_to_buckets(self, tensors, do_flatten)
Given the bucketing suggestion from Bagua, return the actual Bagua buckets. The default implementation follows the suggestion.
- Parameters
tensors (List[List[bagua.torch_api.tensor.BaguaTensor]]) – Bagua tensors grouped in different lists, representing Bagua’s suggestion on how to bucket the tensors.
do_flatten (bool) – Whether to flatten the Bagua buckets.
- Returns
A list of Bagua buckets.
- Return type
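The behavior of following the suggestion, as described for tensors_to_buckets, can be sketched as one bucket per suggested group. The code below is an illustration only: SketchBucket is a hypothetical stand-in for BaguaBucket, plain strings stand in for Bagua tensors, and the bucket naming scheme is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchBucket:
    """Hypothetical stand-in for BaguaBucket; fields are illustrative."""

    name: str
    tensors: List[str]
    flatten: bool


def tensors_to_buckets(tensors, do_flatten):
    # Follow the suggestion as given: one bucket per inner list.
    buckets = []
    for index, group in enumerate(tensors):
        buckets.append(
            SketchBucket(name=str(index), tensors=list(group), flatten=do_flatten)
        )
    return buckets


# Strings stand in for Bagua tensors in this sketch.
suggestion = [
    ["layer3.weight", "layer3.bias"],
    ["layer2.weight"],
    ["layer1.weight"],
]
buckets = tensors_to_buckets(suggestion, do_flatten=True)
```

An override could instead merge or split the suggested groups before building the buckets, for example to place all tensors in a single bucket.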