Find Best DNA/Protein Models (ML)

Models | Find Best DNA/Protein Models (ML)

This analysis tests an alignment data file (nucleotide or amino acid) for goodness of fit to several popular models of evolution by calculating criteria that can be used to pick the most appropriate evolutionary model for phylogenetic analysis of the data. Specifically, for a given topology, which can be either auto-generated or user-provided, MEGA calculates BIC, AICc, and log likelihood (lnL) for each model to be tested. The model that produces the lowest BIC score is considered to be the optimal model. For nucleotide data, 24 models can be tested, and for amino acid data, 64 models can be tested. The results also show the estimated values of all parameters for each model (frequencies, transition probabilities, rate variation parameters, etc.), plus the total number of parameters. In most cases you would pick a model that has a low number of parameters (to keep variance low) yet is accurate enough for your needs, as measured by the goodness-of-fit criteria.
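To make the criteria concrete, the following minimal sketch computes BIC and AICc from a model's lnL and parameter count using the standard textbook formulas, then ranks models by BIC. It is an illustration only, not MEGA's code: the function names, the example lnL values, and the choice of what counts as the sample size n are all assumptions made here.

```python
import math

def bic(lnl, k, n):
    """Bayesian Information Criterion: k*ln(n) - 2*lnL (lower is better)."""
    return k * math.log(n) - 2.0 * lnl

def aic_c(lnl, k, n):
    """AIC with the small-sample correction: AIC + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = 2.0 * k - 2.0 * lnl
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def rank_by_bic(fits, n):
    """fits maps a model name to (lnL, number of parameters).

    Returns rows (name, BIC, AICc, lnL, k) sorted by ascending BIC,
    so the first row is the model considered optimal.
    """
    rows = [(name, bic(lnl, k, n), aic_c(lnl, k, n), lnl, k)
            for name, (lnl, k) in fits.items()]
    return sorted(rows, key=lambda row: row[1])

if __name__ == "__main__":
    # Hypothetical lnL values and parameter counts, for illustration only.
    example_fits = {"ModelA": (-1000.0, 10), "ModelB": (-995.0, 20)}
    for name, b, a, lnl, k in rank_by_bic(example_fits, n=500):
        print(f"{name}: BIC={b:.2f} AICc={a:.2f} lnL={lnl:.1f} k={k}")
```

In this made-up example the simpler model ranks first: the richer model's gain in lnL is too small to offset the penalty for its ten extra parameters.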

MEGA provides a Filtered option for the model test that reduces the computational burden by skipping the evaluation of derivative models that are unlikely to be optimal. MEGA first determines the BIC and AICc values for the primary models (6 models for nucleotide data and 8 for amino acid data). Next, MEGA eliminates all primary models whose BIC and AICc are worse, by more than a specified threshold (default 5), than those of the model with the lowest BIC. Derivative models (+G, +I, +F) are then evaluated only for the remaining primary models.
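The filtering step can be sketched as follows. This is one possible reading of the description above, not MEGA's implementation: it assumes a primary model is dropped only when both its BIC and its AICc exceed those of the lowest-BIC model by more than the threshold, and it assumes derivative models are formed from every combination of the listed modifiers. The function names, model labels, and scores are placeholders.

```python
from itertools import combinations

def surviving_primaries(scores, threshold=5.0):
    """scores maps a primary model name to (BIC, AICc).

    Assumed rule: drop a model only if BOTH its BIC and its AICc are
    worse than the lowest-BIC model's values by more than `threshold`.
    """
    ref = min(scores, key=lambda name: scores[name][0])
    ref_bic, ref_aicc = scores[ref]
    return [name for name, (b, a) in scores.items()
            if not (b - ref_bic > threshold and a - ref_aicc > threshold)]

def variants(primary, modifiers):
    """Return the primary model plus every combination of modifiers,
    e.g. 'X' with ['G', 'I'] gives X, X+G, X+I, X+G+I."""
    names = []
    for size in range(len(modifiers) + 1):
        for combo in combinations(modifiers, size):
            names.append(primary + "".join("+" + m for m in combo))
    return names

if __name__ == "__main__":
    # Placeholder scores for four primary models, for illustration only.
    primary_scores = {
        "JC": (2100.0, 2050.0),
        "K2": (2062.0, 2010.0),
        "HKY": (2065.0, 2008.0),
        "GTR": (2080.0, 2012.0),
    }
    kept = surviving_primaries(primary_scores)
    to_evaluate = [v for p in kept for v in variants(p, ["G", "I"])]
    print(kept)
    print(to_evaluate)
```

Under the all-combinations assumption, the counts agree with the totals stated earlier: 6 primary nucleotide models with +G and +I give 6 x 4 = 24 models, and 8 primary amino acid models with +G, +I, and +F give 8 x 8 = 64 models.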