Model Evaluation & Optimization

1. Regression Metrics

These measure the distance between the predicted value ($\hat{y}$) and the actual value ($y$).

  • MAE (Mean Absolute Error): The average of the absolute differences. It’s easy to understand because it’s in the same units as your data.
    • Example: An MAE of 5 (with house prices measured in thousands of dollars) means your predictions are off by $5,000 on average.
  • MSE (Mean Squared Error): The average of the squared differences. Because it squares the error, it punishes large outliers more heavily than MAE.
  • RMSE (Root Mean Squared Error): The square root of MSE. It brings the unit back to the original scale while still penalizing large errors.
    • Formula: $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
  • $R^2$ (Coefficient of Determination): Measures how much of the variance in the data is explained by the model.
    • 1.0 = Perfect fit.
    • 0.0 = The model is no better than always predicting the mean value.
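
To make these formulas concrete, here is a minimal sketch using scikit-learn's metrics module. The y_true / y_pred arrays are made-up placeholder values (think house prices in thousands of dollars), not real data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([250, 300, 180, 420, 310])   # placeholder targets, e.g. prices in $1,000s
y_pred = np.array([245, 310, 190, 400, 305])   # placeholder model predictions

mae = mean_absolute_error(y_true, y_pred)       # average absolute difference
mse = mean_squared_error(y_true, y_pred)        # average squared difference
rmse = np.sqrt(mse)                             # back in the original units
r2 = r2_score(y_true, y_pred)                   # share of variance explained

print(f"MAE:  {mae:.2f}")
print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"R^2:  {r2:.3f}")
```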

2. Classification Metrics

For classification, “Accuracy” can be misleading, especially if your data is imbalanced (e.g., 99% of transactions are legitimate and only 1% are fraud).

Confusion Matrix

A table used to describe the performance of a classification model.

  • True Positive (TP): Predicted “Yes”, Actual “Yes”.
  • True Negative (TN): Predicted “No”, Actual “No”.
  • False Positive (FP): Predicted “Yes”, Actual “No” (Type I Error).
  • False Negative (FN): Predicted “No”, Actual “Yes” (Type II Error).
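
The short sketch below shows how these four counts can be read directly off scikit-learn's confusion_matrix. The label arrays are made-up placeholders, with 1 standing for "Yes" and 0 for "No".

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels (placeholder data)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # predicted labels (placeholder data)

# For binary labels [0, 1] the matrix is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```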

Core Metrics

  • Accuracy: $\frac{TP + TN}{Total}$. Overall correctness.
  • Precision: $\frac{TP}{TP + FP}$. “Of all predicted positives, how many were actually positive?” (Important for Spam filters).
  • Recall (Sensitivity): $\frac{TP}{TP + FN}$. “Of all actual positives, how many did we catch?” (Important for Cancer detection).
  • F1-Score: The harmonic mean of Precision and Recall. Use this when you want a balance between the two.
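
Using the same made-up labels as the confusion-matrix sketch, scikit-learn's built-in scoring functions map directly onto the formulas above:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # placeholder labels, as before
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Accuracy: ", accuracy_score(y_true, y_pred))    # (TP + TN) / Total
print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("F1:       ", f1_score(y_true, y_pred))          # harmonic mean of the two
```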

ROC-AUC

  • ROC Curve: A plot of the True Positive Rate vs. the False Positive Rate at various thresholds.
  • AUC (Area Under the Curve): A single number representing the model’s ability to distinguish between classes.
    • 0.5 = Random guessing.
    • 1.0 = Perfect classifier.
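
Here is a brief sketch of computing both the curve points and the AUC with scikit-learn. Note that, unlike the metrics above, it needs predicted probabilities (or scores) rather than hard 0/1 labels; the values shown are placeholders.

```python
from sklearn.metrics import roc_auc_score, roc_curve

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                    # placeholder labels
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]    # e.g. model.predict_proba(X)[:, 1]

fpr, tpr, thresholds = roc_curve(y_true, y_score)     # points that trace out the ROC curve
auc = roc_auc_score(y_true, y_score)                  # single-number summary
print(f"AUC = {auc:.3f}")
```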

3. Hyperparameter Tuning

Hyperparameters are the “settings” of an algorithm (like the depth of a tree or the $K$ in KNN) that you set before training.

  • Grid Search: You define a list of values for each hyperparameter, and the computer tries every single combination. It is thorough but very slow.
  • Random Search: The computer picks random combinations of hyperparameters from a range. It is often much faster and usually finds a result nearly as good as Grid Search.
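
The sketch below compares the two approaches on a KNN classifier; the dataset (Iris) and the parameter ranges are purely illustrative, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Illustrative search space: 6 values of K x 2 weighting schemes = 12 combinations.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11], "weights": ["uniform", "distance"]}

# Grid Search: tries every combination (all 12).
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X, y)
print("Grid best:  ", grid.best_params_, grid.best_score_)

# Random Search: samples only n_iter combinations from the same ranges.
rand = RandomizedSearchCV(KNeighborsClassifier(), param_grid, n_iter=5, cv=5, random_state=0)
rand.fit(X, y)
print("Random best:", rand.best_params_, rand.best_score_)
```

With this grid, Grid Search fits all 12 combinations per cross-validation run while Random Search samples only 5 of them, which is where the speed-up comes from.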

4. Model Selection Strategies

  1. Cross-Validation (K-Fold): Split your data into $K$ parts. Train on $K-1$ parts and test on the remaining part. Repeat $K$ times so every piece of data is used for testing once. This ensures your model isn’t just “lucky” on one specific split.
  2. Train/Test Split: The standard practice of holding back a portion of your data (commonly 20%) that the model never sees during training, so you can check how it handles unseen, “real-world” data.
  3. Occam’s Razor: If two models have similar performance, always choose the simpler one. It is less likely to overfit.
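
A minimal sketch combining a train/test split with 5-fold cross-validation; the dataset and model are just for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Train/Test Split: hold out 20% of the data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))

# K-Fold Cross-Validation: every sample is used for testing exactly once across 5 folds.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("CV accuracy per fold:", scores)
print("Mean CV accuracy:", scores.mean())
```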
