Compute Sequence Diversity

Distances | Compute Sequence Diversity

The Sequence Diversity submenu provides four commands for computing the population and subpopulation diversities used in molecular population genetics studies. Before using these commands, define groups, each made up of a population of sequences. Unlike the generic within-group, between-group, and net between-group average distances calculated by other commands in the Distances menu, the following commands use formulas specific to population genetics analyses.

The commands are:

Mean Diversity within Subpopulations

In a subpopulation, the mean diversity is defined as

$images\compseqdiv_d1.gif$ , where $images\compseqdiv_d2.gif$ is the frequency of the i-th sequence in the sample from the subpopulation, and q is the number of different sequences in this subpopulation.
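As a rough illustration of this kind of calculation, the following Python sketch computes a diversity value for a single subpopulation sample. The exact formula is the one shown in the image above and is not reproduced in the text; the sketch assumes the commonly used form (q/(q-1)) multiplied by the sum of x_i * x_j * d_ij over all pairs of different sequences, with d_ij taken to be the proportion of differing sites (p-distance). The function names p_distance and mean_diversity are hypothetical and are not part of the program.

```python
from collections import Counter
from itertools import combinations


def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_diversity(sample):
    """Diversity of one subpopulation sample (list of aligned sequences).

    Assumed form: (q / (q - 1)) * sum over i != j of x_i * x_j * d_ij,
    where x_i is the sample frequency of the i-th distinct sequence and
    q is the number of distinct sequences. This is an assumption, not a
    transcription of the formula in the image.
    """
    counts = Counter(sample)
    q = len(counts)
    if q < 2:
        return 0.0  # a single distinct sequence has no diversity
    n = len(sample)
    total = 0.0
    # Each unordered pair (i, j) stands for the ordered pairs (i, j) and (j, i).
    for (s1, c1), (s2, c2) in combinations(counts.items(), 2):
        total += 2 * (c1 / n) * (c2 / n) * p_distance(s1, s2)
    return q / (q - 1) * total
```

Under these assumptions, calling mean_diversity(["AA", "AT"]) gives 0.5, because the two sequences each have frequency 0.5 and differ at one of two sites.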

Mean Diversity for Entire Population

For the entire population, the mean diversity is defined as

$images\compseqdiv_d3.gif$ , where $images\compseqdiv_d4.gif$ is the estimated average frequency of the i-th allele in the entire population, and q is the number of different sequences in the entire sample.
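The averaging step can be sketched in Python as follows. This is only an illustration under stated assumptions: the average frequency of each sequence is taken as a weighted mean of its subpopulation frequencies (equal weights unless others are supplied), and the diversity is assumed to have the form (q/(q-1)) multiplied by the sum of the averaged frequency products times d_ij. The weighting scheme, the p-distance, and the names average_frequencies and total_diversity are hypothetical choices, not taken from the program.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_frequencies(subpop_freqs, weights=None):
    """Weighted average frequency of each sequence across subpopulations.

    subpop_freqs: list of dicts mapping sequence -> frequency in one
    subpopulation. Equal weights are assumed when none are given.
    """
    k = len(subpop_freqs)
    if weights is None:
        weights = [1.0 / k] * k  # assumption: equal subpopulation weights
    avg = {}
    for w, freqs in zip(weights, subpop_freqs):
        for seq, x in freqs.items():
            avg[seq] = avg.get(seq, 0.0) + w * x
    return avg


def total_diversity(avg_freqs, distance=p_distance):
    """Assumed form: (q / (q - 1)) * sum over i != j of xbar_i * xbar_j * d_ij."""
    q = len(avg_freqs)
    if q < 2:
        return 0.0
    total = sum(
        2 * x1 * x2 * distance(s1, s2)
        for (s1, x1), (s2, x2) in combinations(avg_freqs.items(), 2)
    )
    return q / (q - 1) * total
```

For example, two subpopulations that are each fixed for a different sequence, [{"AA": 1.0}, {"AT": 1.0}], have average frequencies of 0.5 for each sequence under equal weights.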

Mean Interpopulational Diversity

The estimate of interpopulational diversity is given by

$images\compseqdiv_d5.gif$ .

Coefficient of Differentiation

The estimate of the proportion of interpopulational diversity is given by

$images\compseqdiv_d6.gif$ .
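The last two quantities can be sketched in Python from values already in hand. The formulas themselves appear only in the images above; this sketch assumes the usual definitions, in which the interpopulational diversity is the entire-population diversity minus the mean within-subpopulation diversity, and the coefficient of differentiation is that difference divided by the entire-population diversity. The function and parameter names are hypothetical.

```python
def interpopulational_diversity(pi_s, pi_t):
    """Assumed definition: entire-population diversity minus the mean
    within-subpopulation diversity."""
    return pi_t - pi_s


def coefficient_of_differentiation(pi_s, pi_t):
    """Assumed definition: interpopulational diversity as a proportion
    of the entire-population diversity."""
    if pi_t == 0:
        raise ValueError("undefined when the entire-population diversity is zero")
    return interpopulational_diversity(pi_s, pi_t) / pi_t
```

With a mean within-subpopulation diversity of 0.02 and an entire-population diversity of 0.05, these assumed definitions give an interpopulational diversity of about 0.03 and a coefficient of differentiation of about 0.6.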