PyTorch – Connection Between loss.backward() and optimizer.step()

Understanding loss.backward() and optimizer.step() in PyTorch

In PyTorch, training a neural network involves iteratively updating the model’s parameters to minimize the loss function. The key components in this process are loss.backward() and optimizer.step(). This article explores the connection and interplay between these two essential functions.

1. loss.backward() – The Backpropagation Engine

1.1 Backpropagation: The Core Principle

Backpropagation is a fundamental algorithm in neural network training. It computes the gradient of the loss function with respect to each parameter in the model. These gradients indicate how sensitive the loss is to small changes in each parameter, and therefore in which direction each parameter should be adjusted to reduce the loss.
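
To see this mechanism in isolation, here is a minimal sketch with a single scalar parameter (the names w and x are purely illustrative):

import torch

w = torch.tensor(2.0, requires_grad=True)   # a single trainable parameter
x = torch.tensor(3.0)                       # a fixed input

loss = (w * x - 1.0) ** 2   # a toy loss
loss.backward()             # d(loss)/dw = 2 * (w*x - 1) * x = 30
print(w.grad)               # tensor(30.)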

1.2 loss.backward(): Triggering Backpropagation

The loss.backward() call initiates the backpropagation process. It computes the gradient of the loss with respect to every parameter that has requires_grad=True and accumulates the result into each parameter's .grad attribute.


import torch

# Example: a simple linear model trained with SGD
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

input = torch.randn(1, 10)   # one sample with 10 features
target = torch.randn(1, 1)   # target with the same shape as the model output

output = model(input)                        # forward pass
loss = torch.nn.MSELoss()(output, target)    # compute the loss

loss.backward()  # triggers backpropagation and populates each parameter's .grad
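
After loss.backward() returns, every parameter of the model holds its gradient in its .grad attribute, which you can inspect directly (continuing the example above):

print(model.weight.grad.shape)  # torch.Size([1, 10]) – one gradient per weight
print(model.bias.grad)          # gradient for the bias term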

2. optimizer.step(): Updating Parameters

2.1 Optimizers: Gradient-Based Updates

Optimizers in PyTorch are responsible for using the computed gradients to adjust the model's parameters. They apply different update rules, all aimed at minimizing the loss function. Common optimizers include the following (a construction sketch follows the list):

  • Stochastic Gradient Descent (SGD)
  • Adam
  • RMSprop
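
Each optimizer is constructed the same way: pass it the model's parameters along with algorithm-specific hyperparameters. A minimal sketch (the learning rates below are illustrative, not recommendations):

optimizer_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer_adam = torch.optim.Adam(model.parameters(), lr=0.001)
optimizer_rmsprop = torch.optim.RMSprop(model.parameters(), lr=0.01)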

2.2 optimizer.step(): Applying Parameter Updates

The optimizer.step() function applies the chosen optimization algorithm to update the parameters, using the gradients stored in each parameter's .grad attribute during backpropagation.


optimizer.step()  # Applies parameter updates based on calculated gradients
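
Conceptually, for plain SGD with no momentum or weight decay, optimizer.step() performs an update equivalent to the following sketch (lr = 0.01 here to match the earlier example):

with torch.no_grad():                  # parameter updates must not be tracked by autograd
    for param in model.parameters():
        param -= 0.01 * param.grad     # new value = old value - lr * gradient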

3. The Interplay: A Two-Step Dance

The interplay between loss.backward() and optimizer.step() forms the core of the training loop in PyTorch:

  1. Calculate Loss: Compute the loss function based on the model’s predictions and target values.
  2. Backpropagate: Call loss.backward() to calculate gradients for all parameters.
  3. Update Parameters: Execute optimizer.step() to update the parameters based on the computed gradients.

Function          Purpose
loss.backward()   Calculates the gradients of the loss with respect to each parameter
optimizer.step()  Updates the model parameters using those gradients
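
Putting the pieces together, a typical training loop repeats these steps. Note the optimizer.zero_grad() call: PyTorch accumulates gradients into .grad rather than overwriting them, so they must be cleared each iteration. A minimal sketch reusing the model, input, and target from the earlier example (the epoch count is arbitrary):

criterion = torch.nn.MSELoss()

for epoch in range(100):                # illustrative number of iterations
    optimizer.zero_grad()               # clear gradients from the previous iteration
    output = model(input)               # forward pass
    loss = criterion(output, target)    # compute the loss
    loss.backward()                     # backpropagate: compute gradients
    optimizer.step()                    # update parameters using the gradients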

4. Key Points to Remember

  • loss.backward() must be called before optimizer.step(); otherwise the optimizer has no gradients to work with.
  • Gradients accumulate in .grad by default, so call optimizer.zero_grad() before each backward pass in a training loop.
  • The optimizer uses the gradients computed by backpropagation to update the parameters.
  • Training iterates this two-step process (backward, then step) until the loss converges to a satisfactory level.
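
Because gradients accumulate, calling loss.backward() twice without clearing them adds the new gradients to the old ones, a common source of subtle bugs. A quick demonstration with a toy scalar parameter (retain_graph=True is needed only so the graph survives the first backward call):

w = torch.tensor(2.0, requires_grad=True)
loss = (w * 3.0) ** 2
loss.backward(retain_graph=True)
print(w.grad)    # tensor(36.)
loss.backward()
print(w.grad)    # tensor(72.) – the second call added to the existing gradient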

