
betty.optim

Iterative differentiation (ITD) differentiates through the optimization path, which requires tracking the intermediate parameter states produced during optimization. However, PyTorch's native optimizers overwrite these intermediate states because they update parameters with in-place operations to save memory. We therefore provide functionality for patching PyTorch's native optimizers, replacing all involved in-place operations so that ITD becomes possible. When users pass their optimizer to the IterativeProblem class through its constructor, the initialize method of IterativeProblem automatically calls patch_optimizer and patch_scheduler to patch the user-provided PyTorch native optimizer/scheduler along with the stateful modules.
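
For illustration, a minimal sketch of this automatic path, assuming IterativeProblem lives in betty.problems and accepts module, optimizer, and scheduler keyword arguments in its constructor (these names and the surrounding Engine/Config setup are assumptions, not taken from this page):

    import torch
    from betty.problems import IterativeProblem  # import path assumed

    module = torch.nn.Linear(10, 1)
    optimizer = torch.optim.SGD(module.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

    # Keyword names below are assumptions for illustration; initialize()
    # (invoked during engine setup) then calls patch_optimizer and
    # patch_scheduler so that gradients can flow through the unrolled
    # optimization path.
    inner = IterativeProblem(
        name="inner",
        module=module,
        optimizer=optimizer,
        scheduler=scheduler,
    )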

betty.optim.patch_optimizer(optimizer, module)[source]

Patch PyTorch’s native optimizer by replacing all in-place operations in its step method.

Parameters:
  • optimizer (torch.optim.Optimizer) – User-provided PyTorch native optimizer

  • module (torch.nn.Module) – Module whose parameters are updated by the optimizer

Returns:

Corresponding differentiable optimizer

Return type:

DifferentiableOptimizer
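
A hedged usage sketch (the SGD setup here is just an example):

    import torch
    from betty.optim import patch_optimizer

    module = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(module.parameters(), lr=0.1)

    # Returns a DifferentiableOptimizer (here a DifferentiableSGD) whose step
    # avoids in-place parameter updates, so ITD can differentiate through it.
    diff_optimizer = patch_optimizer(optimizer, module)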

betty.optim.patch_scheduler(old_scheduler, new_optimizer)[source]

Patch the original learning rate scheduler to work with a patched differentiable optimizer.

Parameters:
  • old_scheduler (torch.optim.lr_scheduler) – User-provided PyTorch learning rate scheduler

  • new_optimizer (DifferentiableOptimizer) – Patched differentiable optimizer

Returns:

Patched learning rate scheduler

Return type:

torch.optim.lr_scheduler
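
Continuing the idea above, the original scheduler can be re-attached to a patched optimizer; a self-contained sketch (the StepLR choice is only an example):

    import torch
    from betty.optim import patch_optimizer, patch_scheduler

    module = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(module.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

    diff_optimizer = patch_optimizer(optimizer, module)
    # Rebuild the scheduler so it drives the patched DifferentiableOptimizer
    # instead of the original torch.optim.SGD instance.
    diff_scheduler = patch_scheduler(scheduler, diff_optimizer)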

Supported optimizers

Below is the list of differentiable optimizers supported by Betty.

class betty.optim.sgd.DifferentiableSGD(optimizer, module)[source]

Differentiable version of PyTorch’s SGD optimizer. All in-place operations are replaced.

class betty.optim.adam.DifferentiableAdam(optimizer, module)[source]

Differentiable version of PyTorch’s Adam optimizer. All in-place operations are replaced.

class betty.optim.adamw.DifferentiableAdamW(optimizer, module)[source]

Differentiable version of PyTorch’s AdamW optimizer. All in-place operations are replaced.
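
Each class follows the (optimizer, module) constructor shown in the signatures above, so a differentiable optimizer can also be constructed directly instead of going through patch_optimizer; a minimal sketch:

    import torch
    from betty.optim.adam import DifferentiableAdam

    module = torch.nn.Linear(4, 2)
    adam = torch.optim.Adam(module.parameters(), lr=1e-3)

    # Wraps the user-provided Adam instance; all in-place operations in its
    # step are replaced so the update remains differentiable.
    diff_adam = DifferentiableAdam(adam, module)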