Evaluating and Calculating Top-N Accuracy: Top 1 and Top 5
Introduction
Top-N accuracy is a common evaluation metric used in machine learning tasks where the goal is to predict a ranked list of items, such as in recommendation systems, image classification, and information retrieval.
Top-N Accuracy
Top-N accuracy measures how often the correct item appears among the model’s N highest-ranked predictions. It’s calculated as the number of samples whose correct item is within the top N, divided by the total number of samples.
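The definition above can be written as a formula. This is a sketch in notation the original text does not use: M is assumed to be the number of samples, y_i the correct item for sample i, and \hat{y}_{i,k} the model's k-th ranked prediction for that sample.

```latex
\text{Top-}N\ \text{accuracy}
  = \frac{1}{M} \sum_{i=1}^{M}
    \mathbf{1}\left[\, y_i \in \{\hat{y}_{i,1}, \dots, \hat{y}_{i,N}\} \,\right]
```

The indicator \mathbf{1}[\cdot] is 1 when the correct item is among the first N predictions and 0 otherwise, so the sum counts the hits.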
Top 1 Accuracy
Top 1 accuracy is the same as standard classification accuracy (it is not the same thing as precision). It measures the percentage of samples for which the model’s highest-ranked prediction is the correct item. It’s simply the number of correct predictions at rank 1 divided by the total number of samples.
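For a classifier that outputs one score per class, top 1 accuracy reduces to comparing the highest-scoring class with the label. The following is a minimal sketch under assumptions: the `scores` array of shape (samples, classes), the integer `labels`, and the function name `top_1_accuracy` are all hypothetical and chosen for illustration.

```python
import numpy as np

def top_1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # argmax along the class axis gives the rank-1 prediction for each sample
    top_1 = scores.argmax(axis=1)
    return float(np.mean(top_1 == labels))

# Hypothetical scores for 3 samples over 4 classes
scores = [
    [0.1, 0.6, 0.2, 0.1],  # highest score: class 1
    [0.5, 0.2, 0.2, 0.1],  # highest score: class 0
    [0.2, 0.2, 0.5, 0.1],  # highest score: class 2
]
labels = [1, 3, 2]

# Samples 1 and 3 are correct at rank 1, so 2 of 3 are hits
print("Top 1 Accuracy:", top_1_accuracy(scores, labels))
```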
Top 5 Accuracy
Top 5 accuracy measures the percentage of times the correct item is ranked within the top 5 predictions. It’s calculated as the number of correct predictions within the top 5 divided by the total number of predictions.
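When the model produces class scores rather than a ready-made ranked list, the top 5 check can be sketched with NumPy as shown here. The function name `top_k_accuracy`, the `scores` layout (samples by classes), and the example values are assumptions made for illustration. Tied scores are ordered arbitrarily by the sort.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose label is among the k highest-scoring classes."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Sort class indices by descending score and keep the first k per sample
    top_k = np.argsort(-scores, axis=1)[:, :k]
    # A sample is a hit if its label appears anywhere in its top-k row
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Hypothetical scores for 2 samples over 6 classes
scores = [
    [0.04, 0.11, 0.30, 0.25, 0.20, 0.10],  # label 4 has the 3rd-highest score
    [0.40, 0.25, 0.15, 0.10, 0.07, 0.03],  # label 5 has the lowest score
]
labels = [4, 5]

print("Top 1 Accuracy:", top_k_accuracy(scores, labels, k=1))
print("Top 5 Accuracy:", top_k_accuracy(scores, labels, k=5))
```

In this toy data neither label is ranked first, while the first sample's label falls inside the top 5 and the second sample's does not.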
Example:
Code:
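def calculate_top_n_accuracy(predictions, labels, n):
    """
    Calculates Top-N accuracy.

    Args:
        predictions: A list of ranked lists of predicted item IDs, best first.
        labels: A list of ground truth item IDs, one per prediction list.
        n: The value of N.

    Returns:
        The Top-N accuracy as a float between 0 and 1.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not predictions:
        raise ValueError("predictions must not be empty")
    correct_predictions = 0
    for ranked_items, label in zip(predictions, labels):
        # Count a hit when the true label is among the first n ranked items
        if label in ranked_items[:n]:
            correct_predictions += 1
    return correct_predictions / len(predictions)


# Example usage
predictions = [
    [1, 2, 3, 4, 5],
    [5, 1, 2, 3, 4],
    [2, 3, 1, 4, 5]
]
# Only the first label is ranked first; the other two appear lower in their lists
labels = [1, 2, 3]
top_1_accuracy = calculate_top_n_accuracy(predictions, labels, 1)
top_5_accuracy = calculate_top_n_accuracy(predictions, labels, 5)
print("Top 1 Accuracy:", top_1_accuracy)
print("Top 5 Accuracy:", top_5_accuracy)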
Output:
Top 1 Accuracy: 0.3333333333333333
Top 5 Accuracy: 1.0
Choosing the right Top-N
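The choice of N depends on the specific application and the desired trade-off between precision and recall. Because the top N predictions always include the top N-1, top-N accuracy can only stay the same or increase as N grows.
- Higher N: Higher recall, since the correct item is more likely to be included, but lower precision, since more incorrect items are returned alongside it.
- Lower N: Lower recall but higher precision, since the metric only rewards the model when the correct item is ranked near the very top.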
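One way to see the effect of N is to compute the accuracy for every N up to some maximum and compare the values. The sketch below does this from the rank of each correct item. The helper names `rank_of_label` and `top_n_curve`, and the toy data, are hypothetical and not part of the original example.

```python
def rank_of_label(ranked_items, label):
    """1-based position of label in ranked_items, or None if it is absent."""
    for position, item in enumerate(ranked_items, start=1):
        if item == label:
            return position
    return None

def top_n_curve(predictions, labels, max_n):
    """Top-N accuracy for N = 1..max_n, computed from each label's rank."""
    ranks = [rank_of_label(p, y) for p, y in zip(predictions, labels)]
    total = len(ranks)
    return {
        # A sample counts as a hit at N when its label is ranked at or above N
        n: sum(1 for r in ranks if r is not None and r <= n) / total
        for n in range(1, max_n + 1)
    }

# Hypothetical ranked predictions and labels, chosen for illustration
predictions = [
    ["a", "b", "c", "d"],
    ["b", "c", "a", "d"],
    ["d", "a", "b", "c"],
    ["c", "d", "b", "a"],
]
labels = ["a", "c", "b", "e"]  # "e" never appears, so it is never a hit

curve = top_n_curve(predictions, labels, max_n=4)
for n, acc in curve.items():
    print(f"Top-{n} accuracy: {acc:.2f}")
```

With this data the accuracy rises from 0.25 at N=1 to 0.75 at N=3 and then stays flat, because the last label is missing from its prediction list entirely.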
Conclusion
Top-N accuracy is an important evaluation metric for ranking-based tasks. It shows how often the model places the correct item among its N highest-ranked predictions, with Top 1 as the strictest case and larger values of N giving credit for near misses.