Model Selection Using the Akaike Information Criterion (AIC)

This web page summarizes information from Burnham and Anderson (2002). See that book for more information.

The Akaike Information Criterion (AIC) is a way of selecting a model from a set of candidate models. The chosen model is the one that minimizes the estimated Kullback-Leibler distance between the model and the truth. It’s based on information theory, but a heuristic way to think about it is as a criterion that seeks a model that fits well but has few parameters. It is defined as:

AIC = -2 ( ln ( likelihood )) + 2 K

where likelihood is the probability of the data given a model (evaluated at the maximum likelihood parameter estimates) and K is the number of free parameters in the model. AIC scores are often shown as ∆AIC scores: the difference between each model’s AIC and the AIC of the best model (the one with the smallest AIC), so the best model has a ∆AIC of zero.
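As a minimal sketch of the two definitions above, the following Python computes AIC and ∆AIC scores. The function names, the model labels, and the log-likelihood values are made up for illustration; they do not come from Burnham and Anderson (2002) or from any particular software package.

```python
def aic(log_likelihood, k):
    """AIC = -2 ln(likelihood) + 2 K, given the maximized log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * k


def delta_aic(scores):
    """Return each model's AIC minus the smallest AIC in the set."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical models: (maximized log-likelihood, number of free parameters).
models = {"model_A": (-120.5, 2), "model_B": (-118.0, 4)}

scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
deltas = delta_aic(scores)

for name in models:
    print(name, scores[name], deltas[name])
```

Note that the function takes the log-likelihood directly, since that is what most software reports; the model with a ∆AIC of zero is the preferred one in this set.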

The second-order information criterion, often called AICc, takes sample size into account by, essentially, increasing the relative penalty for model complexity when data sets are small. It is defined as:

AICc = -2 ( ln ( likelihood )) + 2 K * (n / ( n - K - 1))

where n is the sample size. As n gets much bigger than K, n - K - 1 approaches n, so n / (n - K - 1) approaches 1 and AICc converges to AIC. There’s therefore really no harm in always using AICc regardless of sample size. In phylogenetics, defining “sample size” isn’t always obvious. In model selection for tree inference, sample size often refers to the number of sites (e.g., Posada and Crandall, 2001). In model selection in comparative methods, sample size often refers to the number of taxa (Butler and King, 2004; O’Meara et al., 2006).
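The small-sample correction and its convergence to AIC can be sketched in Python as follows. The function name and the numbers are illustrative assumptions; the guard against n - K - 1 <= 0 is an addition here, included because the formula is undefined or meaningless in that case.

```python
def aicc(log_likelihood, k, n):
    """AICc = -2 ln(likelihood) + 2 K * (n / (n - K - 1))."""
    if n - k - 1 <= 0:
        # The correction term is undefined (or negative) here.
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * log_likelihood + 2.0 * k * (n / (n - k - 1))


# Hypothetical fit: log-likelihood of -50 with K = 3 free parameters.
lnl, k = -50.0, 3
plain_aic = -2.0 * lnl + 2.0 * k

# The gap between AICc and AIC shrinks as the sample size grows.
for n in (10, 100, 10000):
    print(n, aicc(lnl, k, n), aicc(lnl, k, n) - plain_aic)
```

With n = 10 the penalty is noticeably larger than under AIC, while with n = 10000 the two scores are nearly identical, which is why using AICc by default costs little.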

Akaike weights can be used in model averaging. They represent the relative likelihood of each model, scaled so that the weights sum to one across the set of models. To calculate them, first calculate the relative likelihood of each model, which is just exp( -0.5 * ∆AIC score for that model). The Akaike weight for a model is this value divided by the sum of these values across all models.
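The weight calculation described above can be sketched in a few lines of Python. The function name and the example scores are hypothetical; the same function works for AICc scores, since only differences between scores matter.

```python
import math


def akaike_weights(scores):
    """Convert a list of AIC (or AICc) scores into Akaike weights."""
    best = min(scores)
    # Relative likelihood of each model: exp(-0.5 * delta AIC).
    relative = [math.exp(-0.5 * (score - best)) for score in scores]
    total = sum(relative)
    return [value / total for value in relative]


# Hypothetical AIC scores for three candidate models.
example_scores = [244.0, 245.0, 250.0]
weights = akaike_weights(example_scores)

for score, weight in zip(example_scores, weights):
    print(score, round(weight, 3))
```

The weights sum to one, and the model with the smallest score receives the largest weight. In model averaging, a parameter estimate from each model would be multiplied by that model’s weight and the products summed.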

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York.

Butler, M. A., and A. A. King. 2004. Phylogenetic comparative analysis: A modeling approach for adaptive evolution. American Naturalist 164:683-695.

O’Meara, B. C., C. Ane, M. J. Sanderson, and P. C. Wainwright. 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60:922-933.

Posada, D., and K. A. Crandall. 2001. Selecting the best-fit model of nucleotide substitution. Systematic Biology 50:580-601.