Can You Compare Regressions with AIC?

Comparing regression models using AIC (Akaike Information Criterion) is a common practice in statistical modeling. AIC provides a relative measure of model quality that rewards goodness of fit and penalizes complexity, so you can rank candidate models against one another. However, simply looking at the percentage difference between AIC values is misleading. This article explains how to correctly compare regressions using AIC and clarifies common misconceptions.

Understanding AIC in Regression Comparison

AIC is calculated as:

AIC = 2k - 2ln(L)

Where:

k represents the number of parameters in the model.
L is the maximized value of the likelihood function for the model given the data, so ln(L) is the maximized log-likelihood.
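As a minimal sketch of the formula above, the function below computes AIC from a maximized log-likelihood and a parameter count. The function name and the input values are hypothetical, chosen only for illustration. Whatever convention you use for counting k, apply it identically to every model you compare.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Return AIC = 2k - 2 ln(L), where log_likelihood is ln(L)."""
    return 2 * k - 2 * log_likelihood


# Hypothetical model: 3 parameters, maximized log-likelihood of -120.5
print(aic(log_likelihood=-120.5, k=3))  # 247.0
```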

A lower AIC indicates a better balance of fit and complexity. The key to using AIC for model comparison lies in examining the difference between AIC values, not their absolute values or percentage difference.

Comparing AIC Values: The Delta AIC

The crucial metric is the difference between the AIC of a given model (AIC_i) and the minimum AIC among all models considered (AIC_min):

Δ_i = AIC_i - AIC_min

This Delta AIC (Δ_i) puts every model on a common scale relative to the best one in the set. The model with Δ_i = 0 is the one AIC ranks best. Because of the complexity penalty, it is not necessarily the model with the highest likelihood.
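The calculation can be sketched in a few lines. The model names and AIC values below are made up for illustration, and `delta_aic` is a hypothetical helper, not a library function.

```python
def delta_aic(aic_values: dict) -> dict:
    """Return Δ_i = AIC_i - AIC_min for each model in the set."""
    aic_min = min(aic_values.values())
    return {name: value - aic_min for name, value in aic_values.items()}


# Hypothetical AIC values for three candidate regressions
aics = {"linear": 204.1, "quadratic": 201.3, "cubic": 203.0}
deltas = delta_aic(aics)

for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta AIC = {delta:.1f}")
```

The model with the smallest AIC ("quadratic" here) gets a Δ of exactly 0, and every other model is measured against it.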


Interpreting Delta AIC Values: Rules of Thumb

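Commonly cited rules of thumb for interpreting Δ_i are:

Δ_i < 2: Substantial support for the model; little evidence against it relative to the best model.
2 < Δ_i < 4: Moderate support for the model, weaker than for the best model.
4 < Δ_i < 7: Considerably less support for the model.
Δ_i > 10: Essentially no support for the model. It is unlikely to be a good representation of the data compared with the best model in the set.

These thresholds are conventions rather than strict cutoffs, and values between 7 and 10 sit between the last two categories.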

Why Percentage Difference in AIC is Misleading

Consider these scenarios:

AIC₁ = AIC_min = 100 and AIC₂ = 100.7 (0.7% difference). Here, Δ₂ = 0.7, indicating no substantial difference between the models.
AIC₁ = AIC_min = 100000 and AIC₂ = 100700 (0.7% difference). Now, Δ₂ = 700, strongly favoring the first model.


As demonstrated, the same percentage difference can lead to drastically different conclusions depending on the magnitude of the AIC values.
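The two scenarios can be reproduced with a short script. This is only a sketch using the numbers from the scenarios above, and `compare` is a hypothetical helper name.

```python
def compare(aic_min: float, aic_other: float) -> tuple:
    """Return (delta AIC, percentage difference) relative to the best model."""
    delta = aic_other - aic_min
    percent = 100 * delta / aic_min
    return delta, percent


for aic_min, aic_other in [(100, 100.7), (100_000, 100_700)]:
    delta, percent = compare(aic_min, aic_other)
    print(f"AIC_min={aic_min}, AIC_2={aic_other}: "
          f"delta={delta:.1f}, percent={percent:.1f}%")
```

Both rows report the same percentage, while the Δ values differ by a factor of 1000.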

Model Complexity and AIC

AIC penalizes model complexity through the 2k term. Each additional parameter raises AIC by 2, so a more complex model achieves a lower AIC only if it increases the maximized log-likelihood by more than 1 per added parameter. This penalty helps guard against overfitting, where a model captures noise rather than underlying patterns.
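To make the trade-off concrete, the sketch below computes how AIC changes when parameters are added, using AIC = 2k - 2ln(L). The function name and the log-likelihood gains are hypothetical values picked for illustration.

```python
def aic_change(extra_params: int, loglik_gain: float) -> float:
    """Change in AIC after adding parameters.

    Follows from AIC = 2k - 2 ln(L): the penalty grows by 2 per added
    parameter and the fit term drops by 2 per unit of log-likelihood gained.
    """
    return 2 * extra_params - 2 * loglik_gain


# One extra parameter that barely improves the fit: AIC goes up
print(aic_change(extra_params=1, loglik_gain=0.6))

# One extra parameter that improves the fit a lot: AIC goes down
print(aic_change(extra_params=1, loglik_gain=3.0))
```

A positive result means the larger model is ranked worse despite fitting at least as well; a negative result means the improvement in fit outweighs the penalty.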

AIC and Model Probability

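You can calculate the relative likelihood of a model from its Δ_i:

p_i = exp(-Δ_i / 2)

This quantity is proportional to the probability that the i-th model minimizes the estimated information loss. It is not itself a probability: the best model always has p_i = 1, and a model with Δ_i = 2 has p_i ≈ 0.37, meaning it is about 0.37 times as likely as the best model to minimize the information loss.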
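A short sketch of this calculation follows. The Δ values are hypothetical. The normalization step, which rescales the values so they sum to 1 (often called Akaike weights), is not part of the text above and is included here as an assumption about how you might want to report the results.

```python
import math


def relative_likelihoods(deltas: list) -> list:
    """Return exp(-Δ_i / 2) for each model."""
    return [math.exp(-d / 2) for d in deltas]


def akaike_weights(deltas: list) -> list:
    """Normalize the relative likelihoods so they sum to 1."""
    rel = relative_likelihoods(deltas)
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical Δ values for three models; the best model has Δ = 0
deltas = [0.0, 2.0, 10.0]

for d, r, w in zip(deltas, relative_likelihoods(deltas), akaike_weights(deltas)):
    print(f"delta={d:>4.1f}  relative likelihood={r:.3f}  weight={w:.3f}")
```

The unnormalized values compare each model to the best one, while the normalized weights compare each model to the whole candidate set.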

Conclusion: Focusing on Delta AIC for Model Selection

When comparing regression models using AIC, focus on the Delta AIC (Δ_i), not the percentage difference or absolute AIC values. Δ_i provides a reliable measure of relative model support, allowing for informed model selection based on both fit and complexity. Remember that AIC guides model selection but does not guarantee the chosen model perfectly represents the underlying reality. It helps you choose the best model among those considered based on the available data.