Models for estimating distances

The evolutionary distance between a pair of sequences is usually measured by the number of nucleotide (or amino acid) substitutions that have occurred between them. Evolutionary distances are fundamental to the study of molecular evolution and are useful for phylogenetic reconstruction and the estimation of divergence times. Most of the widely used methods for estimating distances between nucleotide and amino acid sequences are included in MEGA. In the following three sections, we briefly discuss these methods: nucleotide substitutions, synonymous-nonsynonymous substitutions, and amino acid substitutions. Further details and general guidelines for their use are given in Nei and Kumar (2000). Note that, in addition to the distance estimates, MEGA also computes the standard errors of the estimates using analytical formulas and the bootstrap method.
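To make the two kinds of standard error concrete, the following is a minimal sketch for the simplest case, the p-distance (the proportion of sites that differ). It is an illustration only and not MEGA's implementation. The function names, the default of 500 replicates, and the fixed random seed are assumptions chosen for this example, and the sequences are assumed to be aligned, of equal length, and free of gaps and ambiguous characters.

```python
import math
import random


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def analytical_se(seq_a, seq_b):
    """Standard error of the p-distance from the binomial variance p(1 - p)/n."""
    p = p_distance(seq_a, seq_b)
    return math.sqrt(p * (1.0 - p) / len(seq_a))


def bootstrap_se(seq_a, seq_b, replicates=500, seed=0):
    """Standard error of the p-distance estimated by resampling sites.

    Each replicate draws n alignment columns with replacement and
    recomputes the distance; the standard deviation of the replicate
    estimates is returned. `replicates` and `seed` are illustrative
    defaults, not values taken from MEGA.
    """
    rng = random.Random(seed)
    n = len(seq_a)
    estimates = []
    for _ in range(replicates):
        columns = [rng.randrange(n) for _ in range(n)]
        resampled_a = [seq_a[i] for i in columns]
        resampled_b = [seq_b[i] for i in columns]
        estimates.append(p_distance(resampled_a, resampled_b))
    mean = sum(estimates) / replicates
    variance = sum((d - mean) ** 2 for d in estimates) / (replicates - 1)
    return math.sqrt(variance)
```

For two sequences of 100 sites that differ at 20 of them, the analytical standard error is sqrt(0.2 x 0.8 / 100) = 0.04, and the bootstrap estimate should fall close to that value.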

Distance methods included in MEGA are divided into three categories (Nucleotide, Synonymous-nonsynonymous, and Amino acid):

Nucleotide

Sequences are compared nucleotide-by-nucleotide.  These distances can be computed for protein coding and non-coding nucleotide sequences.

      No. of differences

      p-distance

      Jukes-Cantor Model

          with Rate Uniformity Among Sites

          with Rate Variation Among Sites

      Tajima-Nei Model

          with Rate Uniformity and Pattern Homogeneity

          with Rate Variation Among Sites

          with Pattern Heterogeneity Between Lineages

          with Rate Variation and Pattern Heterogeneity

      Kimura 2-Parameter Model

          with Same Rate Among Sites

          with Rate Variation Among Sites

      Tamura 3-Parameter Model

          with Rate Uniformity and Pattern Homogeneity

          with Rate Variation Among Sites

          with Pattern Heterogeneity Between Lineages

          with Rate Variation and Pattern Heterogeneity

      Tamura-Nei Model

          with Rate Uniformity and Pattern Homogeneity

          with Rate Variation Among Sites

          with Pattern Heterogeneity Between Lineages

          with Rate Variation and Pattern Heterogeneity

      Log-Det Method

          with Pattern Heterogeneity Between Lineages

      Maximum Composite Likelihood Model

          with Rate Uniformity and Pattern Homogeneity

          with Rate Variation Among Sites

          with Pattern Heterogeneity Between Lineages

          with Rate Variation and Pattern Heterogeneity
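As an illustration of how the simpler nucleotide distances in this list relate to one another, the sketch below computes the number of differences, the p-distance, the Jukes-Cantor distance (with uniform rates and with gamma-distributed rates among sites), and the Kimura 2-parameter distance using the standard textbook formulas. It is not MEGA's code. The function names and the `gamma_shape` parameter are hypothetical, and the sequences are assumed to be aligned and to contain only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq_a, seq_b):
    """Return (transitions, transversions, number of sites compared).

    Assumes aligned sequences containing only A, C, G, T; a mismatch
    that is not purine-purine or pyrimidine-pyrimidine is counted as
    a transversion.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions, len(seq_a)


def jukes_cantor(p, gamma_shape=None):
    """Jukes-Cantor distance from the p-distance.

    Uniform rates:  d = -(3/4) ln(1 - 4p/3)
    Gamma rates:    d = (3/4) a [(1 - 4p/3)^(-1/a) - 1], a = gamma_shape
    """
    inner = 1.0 - 4.0 * p / 3.0
    if inner <= 0:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    if gamma_shape is None:
        return -0.75 * math.log(inner)
    return 0.75 * gamma_shape * (inner ** (-1.0 / gamma_shape) - 1.0)


def kimura_2p(P, Q):
    """Kimura 2-parameter distance (uniform rates among sites).

    P and Q are the proportions of transitional and transversional
    differences: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)
    """
    w1 = 1.0 - 2.0 * P - Q
    w2 = 1.0 - 2.0 * Q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("differences too large for the Kimura 2-parameter correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Usage: two 10-site sequences with one transition and one transversion.
ts, tv, n = count_differences("ACGTACGTAC", "GCGTACGTAA")
n_diff = ts + tv                      # No. of differences
p = n_diff / n                        # p-distance
d_jc = jukes_cantor(p)                # Jukes-Cantor, uniform rates
d_jc_gamma = jukes_cantor(p, 1.0)     # Jukes-Cantor, gamma shape a = 1
d_k2p = kimura_2p(ts / n, tv / n)     # Kimura 2-parameter
```

In this example the p-distance is 0.2, while the corrected distances are larger (roughly 0.233 for Jukes-Cantor and 0.234 for Kimura 2-parameter), because the corrections account for multiple substitutions at the same site. Allowing rate variation among sites increases the estimate further for the same observed difference.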