Jukes-Cantor Gamma distance

 

In the Jukes and Cantor (1969) model, the rate of nucleotide substitution is the same for all pairs of the four nucleotides A, T, C, and G. The multiple-hit correction equation for this model, given below, produces a maximum likelihood estimate of the number of nucleotide substitutions between two sequences while relaxing the assumption that all sites evolve at the same rate. However, it assumes equal nucleotide frequencies and does not correct for the higher rate of transitional substitutions relative to transversional substitutions. Rate variation among sites is modeled using the Gamma distribution, so you need to provide a gamma parameter (a) to compute this distance.

The Jukes-Cantor model

image\ebx_363058755.gif

 

MEGA provides facilities for computing the following quantities for the Jukes-Cantor gamma distance:

 

d: Transitions + Transversions : Number of nucleotide substitutions per site.

L: No of valid common sites: Number of sites compared.

 

The formulas for computing these quantities are as follows:

Distance

$$ d = \frac{3}{4}\, a \left[ \left( 1 - \frac{4}{3}\, p \right)^{-1/a} - 1 \right] $$

where p is the proportion of sites with different nucleotides and a is the gamma parameter.
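As a rough illustration of the distance calculation, the following Python sketch computes p, L, and d for two aligned sequences. The function name `jc_gamma_distance` is hypothetical and is not part of MEGA. The sketch also assumes that any site where either sequence has a character other than A, C, G, or T is skipped, which may differ from how MEGA treats gaps and missing data.

```python
def jc_gamma_distance(seq1, seq2, a):
    """Return (d, p, L) for two aligned nucleotide sequences.

    d: Jukes-Cantor gamma distance (substitutions per site)
    p: proportion of compared sites with different nucleotides
    L: number of sites compared
    a: gamma parameter (must be positive)
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if a <= 0:
        raise ValueError("gamma parameter a must be positive")

    # Assumption: only sites where both sequences carry A, C, G, or T count.
    valid = set("ACGT")
    pairs = [
        (x, y)
        for x, y in zip(seq1.upper(), seq2.upper())
        if x in valid and y in valid
    ]
    L = len(pairs)
    if L == 0:
        raise ValueError("no valid common sites to compare")

    p = sum(x != y for x, y in pairs) / L
    if p >= 0.75:
        # The correction is undefined once 1 - 4p/3 reaches zero.
        raise ValueError("distance is undefined for p >= 3/4")

    d = 0.75 * a * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / a) - 1.0)
    return d, p, L


if __name__ == "__main__":
    d, p, L = jc_gamma_distance("ACGTACGTAC", "ACGTACGTAT", a=1.0)
    print(f"p = {p:.4f}, L = {L}, d = {d:.4f}")
```

As a becomes large, the value of d approaches the ordinary Jukes-Cantor distance, -(3/4) ln(1 - 4p/3), which corresponds to no rate variation among sites.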

Variance

$$ V(d) = \frac{p\,(1 - p)}{L} \left( 1 - \frac{4}{3}\, p \right)^{-2\left( 1/a + 1 \right)} $$
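The variance can be sketched in the same way. The Python function below assumes the large-sample form V(d) = [p(1 - p)/L] (1 - 4p/3)^(-2(1/a + 1)), which follows from applying the delta method to the distance formula with binomial sampling of p. The function name `jc_gamma_variance` is hypothetical and is not part of MEGA.

```python
import math


def jc_gamma_variance(p, L, a):
    """Approximate variance of the Jukes-Cantor gamma distance.

    p: proportion of sites with different nucleotides (0 <= p < 3/4)
    L: number of sites compared (positive)
    a: gamma parameter (positive)
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 3/4")
    if L <= 0:
        raise ValueError("L must be positive")
    if a <= 0:
        raise ValueError("gamma parameter a must be positive")

    # Squared derivative of d with respect to p, times the binomial
    # variance of p, which is p(1 - p)/L.
    slope_sq = (1.0 - 4.0 * p / 3.0) ** (-2.0 * (1.0 / a + 1.0))
    return p * (1.0 - p) / L * slope_sq


if __name__ == "__main__":
    v = jc_gamma_variance(p=0.1, L=10, a=1.0)
    print(f"variance = {v:.6f}, standard error = {math.sqrt(v):.6f}")
```

The square root of the variance gives an approximate standard error for d.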

See also Nei and Kumar (2000), page 36, and Estimating Gamma Parameter.