Cross Entropy in PyTorch

Cross entropy is a fundamental loss function in machine learning, particularly in classification tasks. It measures the difference between two probability distributions, one representing the true labels and the other representing the predicted probabilities. In PyTorch, the nn.CrossEntropyLoss class provides a convenient way to calculate and use cross entropy loss for training models.

Understanding Cross Entropy

Consider a binary classification problem where the target labels are either 0 or 1 and the model predicts a probability for each class. Cross entropy is the average negative log-likelihood of the true labels under the predicted probabilities.

The formula for cross entropy is:


H(p, q) = - Σ p(x) * log(q(x))

where:

  • p(x) is the true probability of class x (1 for the correct class and 0 otherwise when labels are one-hot).
  • q(x) is the predicted probability of class x.

A lower cross entropy value indicates a better model performance, as the predicted distribution gets closer to the true distribution.
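
To make the formula concrete, here is a small worked example with one sample and three classes; the numbers are purely illustrative. Because the true distribution p is one-hot, the sum collapses to the negative log of the probability the model assigns to the correct class:


import torch

# One sample, three classes; the true class is index 2, so p is one-hot.
p = torch.tensor([0.0, 0.0, 1.0])   # true distribution p(x)
q = torch.tensor([0.1, 0.2, 0.7])   # predicted distribution q(x)

# H(p, q) = - Σ p(x) * log(q(x))
h = -(p * torch.log(q)).sum()
print(h)  # -log(0.7) ≈ 0.3567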

Using Cross Entropy Loss in PyTorch

The nn.CrossEntropyLoss class in PyTorch combines the LogSoftmax function with the negative log-likelihood loss (NLLLoss) in a single step, so you can pass your model's raw, unnormalized logits to it directly.
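
If you want to convince yourself of this, the following quick check (with arbitrarily chosen sizes) compares nn.CrossEntropyLoss on raw logits against nn.NLLLoss applied to the LogSoftmax of the same logits; the two values should match:


import torch
import torch.nn as nn

logits = torch.randn(4, 3)            # 4 samples, 3 classes (arbitrary sizes)
targets = torch.randint(0, 3, (4,))   # integer class indices

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)
print(torch.allclose(ce, nll))        # True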

Code Example


import torch
import torch.nn as nn

# Define a simple model
class MyModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(MyModel, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        return self.linear(x)

# Initialize model, loss function, and optimizer
model = MyModel(10, 3)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Sample data: a batch of 10 samples with 10 features each,
# and integer class indices in [0, 3) as targets
inputs = torch.randn(10, 10)
targets = torch.randint(0, 3, (10,))

# Forward pass and loss: the model outputs raw logits, and loss_fn applies LogSoftmax internally
outputs = model(inputs)
loss = loss_fn(outputs, targets)

# Print the loss value
print(loss)

# Backpropagation and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()

Advantages of Cross Entropy Loss

  • Simplicity: It’s easy to implement and use.
  • Efficiency: PyTorch’s built-in implementation is optimized for performance.
  • Probabilistic interpretation: As a proper scoring rule, it rewards well-calibrated predicted probabilities.

Considerations

  • Multi-class Classification: For multi-class classification problems, nn.CrossEntropyLoss expects the targets to be integer class indices (a 1-D tensor of dtype long) rather than one-hot vectors.
  • Raw Logits as Input: nn.CrossEntropyLoss applies LogSoftmax internally, producing log-probabilities whose exponentials sum to 1, so your model should output raw, unnormalized logits; do not apply softmax to the outputs yourself (see the sketch below).
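
The sketch below summarizes the expected inputs; the tensor sizes are illustrative. The model's outputs go into the loss as raw logits of shape (N, C), and the targets are a 1-D tensor of class indices:


import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()

logits = torch.randn(4, 3)             # raw logits, shape (N, C), illustrative sizes
targets = torch.tensor([0, 2, 1, 2])   # class indices, shape (N,), dtype long

loss = loss_fn(logits, targets)        # correct: pass the logits directly
# loss_fn(torch.softmax(logits, dim=1), targets)  # avoid: applying softmax first distorts the loss
print(loss)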

Conclusion

Cross entropy is a powerful loss function for training classification models. PyTorch’s nn.CrossEntropyLoss class provides a convenient and efficient way to leverage its advantages. By understanding the principles of cross entropy and its implementation in PyTorch, you can build robust and accurate classification models.

