Evaluation: Calculating Top-N Accuracy (Top-1 and Top-5)

Introduction

In machine learning, particularly in classification tasks, evaluating the performance of a model is crucial. Top-N accuracy is a metric that assesses how well a model predicts the top N most likely classes for a given input. This article focuses on calculating Top-1 and Top-5 accuracy, two commonly used metrics.

Top-N Accuracy: Definition

Top-N accuracy measures the percentage of instances where the true label is present within the top N predicted classes. It’s a valuable metric when the task involves ranking or finding the most likely candidates, not just predicting the single best class.
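As a tiny illustration in plain Python (the probabilities and label below are made-up values, not from any real model):

```python
# One instance with four class probabilities and true label 2.
probs = [0.1, 0.2, 0.6, 0.1]
true_label = 2

# Rank the class indices from most to least likely.
ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)

# Top-1: is the true label the single most likely class?
top1_hit = true_label == ranked[0]
# Top-2: is the true label among the two most likely classes?
top2_hit = true_label in ranked[:2]
```

Here the true label (class 2) has the highest probability, so this instance counts toward both Top-1 and Top-2 accuracy.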

Calculating Top-1 and Top-5 Accuracy

1. Predictions and Ground Truth

We start with the model’s predictions and the actual ground truth labels. Let’s assume:

* **Predictions**: For each instance, a list of predicted probabilities, one per class. These do not need to be pre-sorted; the ranking is derived from the probabilities themselves.
* **Ground Truth**: A list of the actual class labels, one per instance.

2. Top-N Selection

To calculate Top-N accuracy, we select the top N predicted classes for each instance.

3. Accuracy Calculation

  • **Top-1 Accuracy:** The percentage of instances where the true label matches the **first** predicted class.
  • **Top-5 Accuracy:** The percentage of instances where the true label is among the **top five** predicted classes.

Example: Code Implementation

Here’s a Python code snippet demonstrating the calculation of Top-1 and Top-5 accuracy:

```python
import numpy as np

def top_n_accuracy(predictions, ground_truth, n=1):
    """
    Calculates Top-N accuracy for a given set of predictions and
    ground truth labels.

    Args:
        predictions: A list of lists, where each inner list holds the
            predicted probabilities for a single instance (one per class).
        ground_truth: A list of ground truth labels.
        n: The value of N for Top-N accuracy.

    Returns:
        The Top-N accuracy as a float.
    """
    correct_predictions = 0
    for i, pred in enumerate(predictions):
        # argsort sorts ascending, so the last n indices are the top-n classes.
        top_n = np.argsort(pred)[-n:]
        if ground_truth[i] in top_n:
            correct_predictions += 1
    return correct_predictions / len(predictions)

# Example usage:
predictions = [
    [0.8, 0.1, 0.05, 0.05],  # Instance 1: top prediction = Class 0
    [0.1, 0.2, 0.6, 0.1],    # Instance 2: top prediction = Class 2
    [0.3, 0.5, 0.1, 0.1]     # Instance 3: top prediction = Class 1
]
# Instance 3's true label (0) is not the top prediction, so it misses
# at Top-1 but still counts at Top-5.
ground_truth = [0, 2, 0]

top_1_accuracy = top_n_accuracy(predictions, ground_truth, n=1)
top_5_accuracy = top_n_accuracy(predictions, ground_truth, n=5)
print(f"Top-1 Accuracy: {top_1_accuracy:.4f}")
print(f"Top-5 Accuracy: {top_5_accuracy:.4f}")
```

Output

```
Top-1 Accuracy: 0.6667
Top-5 Accuracy: 1.0000
```
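The loop-based function runs once per instance; with NumPy the same computation can be fully vectorized. A minimal sketch (the function name `top_n_accuracy_vec` and the example data are our own, not from any library):

```python
import numpy as np

def top_n_accuracy_vec(predictions, ground_truth, n=1):
    """Vectorized Top-N accuracy: the fraction of rows whose true label
    appears among the n highest-scoring columns."""
    preds = np.asarray(predictions)
    truth = np.asarray(ground_truth)
    top_n = np.argsort(preds, axis=1)[:, -n:]     # top-n class indices per row
    hits = (top_n == truth[:, None]).any(axis=1)  # row-wise membership test
    return hits.mean()

preds = [[0.8, 0.1, 0.05, 0.05],
         [0.1, 0.2, 0.6, 0.1],
         [0.3, 0.5, 0.1, 0.1]]
labels = [0, 2, 1]
print(top_n_accuracy_vec(preds, labels, n=1))
```

One caveat worth noting: in a toy example with only 4 classes, Top-5 accuracy is trivially 1.0, since the top 5 predictions necessarily include every class. The metric is most informative when N is much smaller than the number of classes, as in the 1000-class setting where Top-5 accuracy is standard.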

Conclusion

Top-N accuracy provides valuable insights into the model’s performance when multiple candidate classes are considered. It complements traditional accuracy measures and can be crucial in applications where finding the most likely options is important.
