Equal Input Model (Gamma rates and Heterogeneous Patterns)

In real data, the frequencies of the different amino acids are usually unequal. Therefore, the correction based on the equal input model gives a better estimate of the number of amino acid substitutions than the Poisson correction distance. To model rate variation among sites with the Gamma distribution, you need to provide a gamma parameter (a). When the amino acid frequencies differ between the sequences, the modified formula (Tamura and Kumar 2002) reduces the estimation bias.

 

MEGA provides facilities for computing the following quantities:

Quantity                        Description
d: distance                     Number of amino acid substitutions per site.
L: No. of valid common sites    Number of sites compared.

Formulas used are:

Distance

image\ebx_806362806.gif

where p is the proportion of sites at which the two sequences have different amino acids, a is the gamma parameter, gXi is the frequency of amino acid i in sequence X, gi is the average frequency of amino acid i over the pair of sequences, and

image\ebx_827147449.gif
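The quantities defined above (p, gXi, gi and L) can be computed directly from a pair of aligned sequences, as in the following Python sketch. This is an illustration, not MEGA's implementation. All function names are hypothetical. The frequencies are taken over the valid common sites only, which is an assumption. The distance expression in the code, d = a b [(1 - p/f)^(-1/a) - 1] with b = 1 - sum(gi^2) and f = 1 - sum(gXi gYi), is also an assumption about the form of the Tamura and Kumar (2002) correction and should be checked against the formula shown above before use.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def valid_common_sites(seq_x, seq_y):
    """Return residue pairs at sites where both sequences have a standard amino acid."""
    return [(x, y) for x, y in zip(seq_x.upper(), seq_y.upper())
            if x in AMINO_ACIDS and y in AMINO_ACIDS]


def frequencies(residues):
    """Relative frequency of each of the 20 amino acids in a list of residues."""
    counts = Counter(residues)
    n = len(residues)
    return {aa: counts[aa] / n for aa in AMINO_ACIDS}


def equal_input_gamma_het_distance(seq_x, seq_y, a):
    """Return (d, L) for two aligned sequences and gamma parameter a."""
    sites = valid_common_sites(seq_x, seq_y)
    L = len(sites)
    if L == 0:
        raise ValueError("no valid common sites")
    # p: proportion of sites with different amino acids
    p = sum(x != y for x, y in sites) / L
    # gXi, gYi: per-sequence frequencies; gi: their average for the pair
    g_x = frequencies([x for x, _ in sites])
    g_y = frequencies([y for _, y in sites])
    g = {aa: (g_x[aa] + g_y[aa]) / 2 for aa in AMINO_ACIDS}
    # ASSUMED forms of the two correction terms (see the lead-in)
    b = 1 - sum(v * v for v in g.values())
    f = 1 - sum(g_x[aa] * g_y[aa] for aa in AMINO_ACIDS)
    if p >= f:
        raise ValueError("distance is undefined when p >= f")
    # ASSUMED gamma-corrected distance
    d = a * b * ((1 - p / f) ** (-1 / a) - 1)
    return d, L
```

For example, `equal_input_gamma_het_distance(seq_x, seq_y, a=2.0)` returns the estimated distance together with the number of sites compared. Under the assumed form, d is 0 for identical sequences and exceeds p whenever the sequences differ.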

The variance of d can be estimated by the bootstrap method.
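A minimal sketch of the bootstrap idea follows: alignment sites are resampled with replacement, the distance is recomputed for each pseudo-replicate, and the sample variance of those estimates is reported. The function names, the default of 500 replicates and the use of a simple p-distance as the example estimator are all assumptions made for illustration. Any distance function taking two aligned sequences of equal length can be passed in instead.

```python
import random
import statistics


def p_distance(seq_x, seq_y):
    """Proportion of sites that differ (stand-in estimator for the demo)."""
    return sum(x != y for x, y in zip(seq_x, seq_y)) / len(seq_x)


def bootstrap_variance(seq_x, seq_y, distance_fn, replicates=500, seed=0):
    """Estimate the variance of distance_fn by resampling alignment sites."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    n = len(seq_x)
    estimates = []
    for _ in range(replicates):
        # draw n site indices with replacement; both sequences share the draw
        cols = [rng.randrange(n) for _ in range(n)]
        boot_x = "".join(seq_x[i] for i in cols)
        boot_y = "".join(seq_y[i] for i in cols)
        estimates.append(distance_fn(boot_x, boot_y))
    # sample variance across the bootstrap replicates
    return statistics.variance(estimates)
```

With a corrected distance, some replicates may give an undefined value (for example, when the resampled p is too large). How such replicates are handled is a design choice that this sketch leaves to the supplied distance function.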