Tamura-Nei distance (Heterogeneous Patterns)

The Tamura-Nei model (Tamura and Nei 1993) corrects for multiple hits, taking into account the differences in substitution rate between nucleotides and the inequality of nucleotide frequencies. It distinguishes between the transitional substitution rates between purines and between pyrimidines, and it treats transversional substitutions separately. It assumes equality of substitution rates among sites (see the related gamma model). When nucleotide frequencies differ between the sequences, the modified formula (Tamura and Kumar 2002) relaxes the assumption of substitution pattern homogeneity.

 

The Tamura-Nei model

 

MEGA provides facilities for computing the following quantities for this method:

Quantity                          Description

d: Transitions & Transversions    Number of nucleotide substitutions per site.
s: Transitions only               Number of transitional substitutions per site.
v: Transversions only             Number of transversional substitutions per site.
R = s/v                           Transition/transversion ratio.
L: No of valid common sites       Number of sites compared.

 

Formulas for computing these quantities are as follows:

Distances

In these formulas, P_1 and P_2 are the proportions of transitional differences between nucleotides A and G and between T and C, respectively; Q is the proportion of transversional differences; g_XA, g_XC, g_XG, and g_XT are the respective frequencies of A, C, G, and T in sequence X; g_XR = g_XA + g_XG and g_XY = g_XT + g_XC; and g_A, g_C, g_G, g_T, g_R, and g_Y are the corresponding frequencies averaged over the pair of sequences.

 

The variances can be estimated by the bootstrap method.
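
The observed quantities defined above, and the bootstrap estimate of a variance, can be illustrated with a short Python sketch. It is not MEGA's implementation. The function names are hypothetical, and two choices are assumptions made for illustration: sites where either sequence has a character other than A, C, G, or T are excluded, and the base frequencies are computed over the remaining common sites. The sketch does not reproduce the Tamura-Nei distance formula itself; `p_distance` is only a stand-in to show how a distance function plugs into the bootstrap.

```python
import random
from collections import Counter

BASES = "ACGT"
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def observed_quantities(seq1, seq2):
    """Return L, P1, P2, Q and base frequencies for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: only sites with an unambiguous base in both sequences count.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid common sites")
    p1 = sum(1 for a, b in pairs if {a, b} == PURINES) / n      # A <-> G
    p2 = sum(1 for a, b in pairs if {a, b} == PYRIMIDINES) / n  # T <-> C
    # A transversion pairs a purine with a pyrimidine.
    q = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES)) / n
    count1 = Counter(a for a, _ in pairs)
    count2 = Counter(b for _, b in pairs)
    g1 = {base: count1[base] / n for base in BASES}
    g2 = {base: count2[base] / n for base in BASES}
    for g in (g1, g2):
        g["R"] = g["A"] + g["G"]  # purine frequency
        g["Y"] = g["T"] + g["C"]  # pyrimidine frequency
    g_avg = {key: (g1[key] + g2[key]) / 2 for key in g1}
    return {"L": n, "P1": p1, "P2": p2, "Q": q,
            "g1": g1, "g2": g2, "g": g_avg}


def p_distance(seq1, seq2):
    """Stand-in distance (proportion of differing sites), not Tamura-Nei."""
    obs = observed_quantities(seq1, seq2)
    return obs["P1"] + obs["P2"] + obs["Q"]


def bootstrap_variance(seq1, seq2, distance, replicates=500, seed=0):
    """Sample variance of `distance` over site-resampled alignments."""
    rng = random.Random(seed)
    sites = list(zip(seq1, seq2))
    estimates = []
    for _ in range(replicates):
        # Resample alignment columns with replacement, keeping pairs intact.
        sample = [rng.choice(sites) for _ in sites]
        resampled1 = "".join(a for a, _ in sample)
        resampled2 = "".join(b for _, b in sample)
        estimates.append(distance(resampled1, resampled2))
    mean = sum(estimates) / replicates
    return sum((e - mean) ** 2 for e in estimates) / (replicates - 1)


obs = observed_quantities("AAGGCCTTAC", "AGGACTTTCC")
variance = bootstrap_variance("AAGGCCTTAC", "AGGACTTTCC", p_distance)
```

Any function that takes two aligned sequences and returns a number can be passed as `distance`. For alignments containing gaps, a bootstrap replicate could in principle contain no valid common sites, in which case `observed_quantities` raises `ValueError`.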