Introduction
Evaluating the performance of a multiclass classification model requires metrics beyond simple accuracy. Metrics such as precision, recall, and the F1-score, combined with macro and micro averaging, give a more nuanced picture of how well the model distinguishes between classes. This article demonstrates how to compute these metrics in Scikit-learn for multiclass classification problems.
Understanding the Metrics
Precision
Precision measures the proportion of correctly predicted positive instances among all instances predicted as positive for a specific class. In multiclass settings, we compute precision for each class individually.
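Put as a formula, precision for class c is TP_c / (TP_c + FP_c), where TP_c counts correct predictions of c and FP_c counts other instances mistakenly predicted as c. As a minimal plain-Python sketch (the helper name and toy labels below are invented for illustration):
def precision_for_class(y_true, y_pred, cls):
    # All instances the model predicted as `cls` (true positives + false positives)
    predicted_as_cls = [t for t, p in zip(y_true, y_pred) if p == cls]
    if not predicted_as_cls:
        return 0.0  # class never predicted; Scikit-learn would emit a warning here
    # Fraction of those predictions whose true label really is `cls`
    return sum(t == cls for t in predicted_as_cls) / len(predicted_as_cls)
print(precision_for_class([0, 1, 1], [0, 1, 0], cls=1))  # 1.0: the single class-1 prediction is correct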
Recall
Recall, also known as sensitivity, measures the proportion of correctly predicted positive instances among all actual positive instances for a specific class. Similar to precision, we calculate recall for each class.
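Likewise, recall for class c is TP_c / (TP_c + FN_c), where FN_c counts the class-c instances the model missed. A matching plain-Python sketch (again with an invented helper and toy labels):
def recall_for_class(y_true, y_pred, cls):
    # All instances whose true label is `cls` (true positives + false negatives)
    actually_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not actually_cls:
        return 0.0  # class absent from y_true
    # Fraction of them the model actually found
    return sum(p == cls for p in actually_cls) / len(actually_cls)
print(recall_for_class([0, 1, 1], [0, 1, 0], cls=1))  # 0.5: only one of the two class-1 instances is found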
Accuracy
Accuracy measures the overall proportion of correctly classified instances across all classes. It is a global metric, unlike precision and recall, which are class-specific.
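Because accuracy is simply the fraction of matching labels, it reduces to a one-liner (plain Python, with made-up labels for illustration):
y_true_demo = [0, 1, 2, 2]
y_pred_demo = [0, 1, 1, 2]
accuracy_demo = sum(t == p for t, p in zip(y_true_demo, y_pred_demo)) / len(y_true_demo)
print(accuracy_demo)  # 0.75: three of the four predictions match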
F1-Score
The F1-score represents the harmonic mean of precision and recall, providing a balanced measure of model performance. A higher F1-score indicates better balance between precision and recall.
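Concretely, F1 = 2 * precision * recall / (precision + recall). A quick worked example with hypothetical values shows how the harmonic mean punishes an imbalance between the two:
precision_demo, recall_demo = 0.75, 0.60  # hypothetical scores for illustration
f1_demo = 2 * precision_demo * recall_demo / (precision_demo + recall_demo)
print(round(f1_demo, 4))  # 0.6667, below the arithmetic mean of 0.675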
Using Scikit-learn for Multiclass Evaluation
Importing Libraries
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
Generating Sample Data
y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]  # True labels
y_pred = [0, 1, 2, 0, 1, 1, 0, 2, 2, 0]  # Predicted labels
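Before computing any scores, it can help to inspect the confusion matrix, whose rows are true classes and columns are predicted classes. This step uses Scikit-learn's `confusion_matrix` and is optional; nothing below depends on it:
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_true, y_pred))
# [[4 0 0]
#  [0 2 1]
#  [0 1 2]]
The two off-diagonal entries are the two errors: one class-2 instance predicted as 1, and one class-1 instance predicted as 2.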
Computing Metrics
Accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.4f}")
Accuracy: 0.8000
Precision
For multiclass targets, `precision_score` requires an explicit averaging strategy: its default, `average='binary'`, only applies to binary problems and raises an error otherwise. `average='macro'` computes the unweighted mean of the per-class precisions; `'micro'` and `'weighted'` are also available.
precision_macro = precision_score(y_true, y_pred, average='macro')
print(f"Macro Precision: {precision_macro:.4f}")
Macro Precision: 0.7778
precision_micro = precision_score(y_true, y_pred, average='micro')
print(f"Micro Precision: {precision_micro:.4f}")
Micro Precision: 0.8000
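The averaged numbers are easy to verify from the per-class scores: `average=None` returns one precision value per class, and `average='weighted'` weights each class's precision by its support (class 0 has 4 instances here; classes 1 and 2 have 3 each):
precision_per_class = precision_score(y_true, y_pred, average=None)
print(precision_per_class)  # approximately [1.0, 0.6667, 0.6667]

precision_weighted = precision_score(y_true, y_pred, average='weighted')
print(f"Weighted Precision: {precision_weighted:.4f}")  # Weighted Precision: 0.8000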
Recall
Similar to precision, you can choose different averaging methods for recall.
recall_macro = recall_score(y_true, y_pred, average='macro')
print(f"Macro Recall: {recall_macro:.4f}")
Macro Recall: 0.7778
recall_micro = recall_score(y_true, y_pred, average='micro')
print(f"Micro Recall: {recall_micro:.4f}")
Micro Recall: 0.8000
F1-Score
f1_macro = f1_score(y_true, y_pred, average='macro')
print(f"Macro F1-Score: {f1_macro:.4f}")
Macro F1-Score: 0.7778
f1_micro = f1_score(y_true, y_pred, average='micro')
print(f"Micro F1-Score: {f1_micro:.4f}")
Micro F1-Score: 0.8000
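If you want all of these numbers at once, Scikit-learn's `classification_report` prints per-class precision, recall, F1-score, and support together with the macro and weighted averages (shown here as an optional convenience, not a replacement for the individual calls above):
from sklearn.metrics import classification_report

print(classification_report(y_true, y_pred, digits=4))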
Understanding Macro vs. Micro Averaging
Macro averaging gives every class equal weight: it averages the per-class scores, so a poorly handled minority class pulls the result down just as much as a large one would. Micro averaging instead pools the true-positive, false-positive, and false-negative counts across all classes, so every instance counts equally; for single-label multiclass problems, micro-averaged precision, recall, and F1 all reduce to overall accuracy.
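The distinction matters most on imbalanced data. In the hypothetical example below (labels invented for illustration), the dominant class 0 is predicted well while minority classes 1 and 2 are each missed half the time: micro F1 stays at the 0.80 accuracy, but macro F1 drops because the minority classes count equally:
y_true_imb = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]  # class 0 dominates
y_pred_imb = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]  # each minority class is half-missed
print(f"Micro F1: {f1_score(y_true_imb, y_pred_imb, average='micro'):.4f}")  # 0.8000, equal to accuracy
print(f"Macro F1: {f1_score(y_true_imb, y_pred_imb, average='macro'):.4f}")  # 0.7302, dragged down by classes 1 and 2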
Conclusion
This article showed how to compute precision, recall, accuracy, and the F1-score for multiclass classification problems using Scikit-learn, and how macro and micro averaging change what those numbers mean. By understanding and applying these metrics, you can evaluate a model's performance far more reliably than with accuracy alone.