Writing Command Statements for Defining Groups of Taxa and for Annotating Taxa with Meta Data

The MEGA format allows you to assign group definitions and other meta data to the taxa in sequence alignment files as well as in distance data files. Meta data is written inside a pair of curly brackets that follows the taxon name. The bracketed block can be joined to the taxon name with an underscore, or it can be appended directly to the name. There must be no spaces between the taxon name and the meta data block. (Groups of taxa can also be defined interactively through a dialog box.) MEGA supports the following meta data commands, which may appear in any order:

    group, species, population, continent, country, city, year, month, day, time

Meta data commands must adhere to the following rules:

 

The following example shows meta data commands for three pathogen sequences:

#pathogen_sample_20200520_Paris_{population=european|group=symptomatic|species=homo_sapiens|continent=Europe|country=France|city=Paris|year=2020|month=5|day=20|time=23:59:59}

TAATTAAAGG GCCGTGGTAT A-CTGACCAT GCGAAGGTAG CATAATCATT AGCCTTTTGA TTTGAGGCTG

#pathogen_sample_20200610_Canberra_{population=european|group=asymptomatic|species=homo_sapiens|continent=Australia|country=Australia|city=Canberra|year=2020|month=6|day=10|time=13:59:59}

GTG..G.... ....C..... TTT.....G. .......... .......... ..T.....A. ..GA.....C

#pathogen_sample_20180402_Sydney_{population=european|group=asymptomatic|species=felis_catus|continent=Australia|country=Australia|city=Sydney|year=2018|month=4|day=2|time=22:58:00}

AT...G.... ....C..... TT......G. .......... .......... ..T.....A. ..G......C
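
As an illustration of the label layout used in the example above, the following Python sketch splits a taxon label into its name and its meta data commands. It is not part of MEGA. The function name parse_taxon_label is hypothetical, and the sketch assumes that commands are written as key=value pairs separated by the | character, as they are in the example, and that names and values contain no spaces or curly brackets.

```python
import re

# The meta data commands listed in this section.
KNOWN_COMMANDS = {
    "group", "species", "population", "continent", "country",
    "city", "year", "month", "day", "time",
}

# '#', the taxon name, an optional underscore, then one {...} block.
# Assumption: names and values contain no spaces or curly brackets.
LABEL_RE = re.compile(r"^#(?P<name>[^{}\s]+?)_?\{(?P<meta>[^{}]*)\}$")


def parse_taxon_label(label):
    """Split '#name_{key=value|...}' into (name, dict of commands)."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a taxon label with meta data: {label!r}")
    meta = {}
    for pair in match.group("meta").split("|"):
        key, sep, value = pair.partition("=")
        key = key.strip().lower()
        if not sep or key not in KNOWN_COMMANDS:
            raise ValueError(f"unrecognized meta data command: {pair!r}")
        meta[key] = value
    return match.group("name"), meta


if __name__ == "__main__":
    name, meta = parse_taxon_label(
        "#pathogen_sample_20180402_Sydney_{group=asymptomatic|year=2018}"
    )
    print(name, meta)
```

Because the commands are stored in a dictionary, their order inside the brackets does not affect the result, which matches the statement that order does not matter.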

 

The following example designates human and mouse as members of the group Mammal and chicken as a member of the group Aves.

!Gene=FirstGene Domain=Exon1 Property=Coding;

#Human_{Mammal} ATGGTTTCTAGTCAGGTCACCATGATAGGTCTCAAT

#Mouse_{Mammal} ATGGTTTCTAGTCAGGTCACCATGATAGGTCCCAAT

#Chicken_{Aves} ATGGTTTCTAGTCAGCTCACCATGATAGGTCTCAAT

 

!Gene=SecondGene Domain=Intron Property=Noncoding;

#Human ATTCCCAGGGAATTCCCGGGGGGTTTAAGGCCCCTTTAAAGAAAGAT

#Mouse GTAGCGCGCGTCGTCAGAGCTCCCAAGGGTAGCAGTCACAGAAAGAT

#Chicken GTAAAAAAAAAAGTCAGAGCTCCCCCCAATATATATCACAGAAAGAT

 

!Gene=ThirdGene Domain=Exon2 Property=Coding;

#Human ATCTGCTCTCGAGTACTGATACAAATGACTTCTGCGTACAACTGA

#Mouse ATCTGATCTCGTGTGCTGGTACGAATGATTTCTGCGTTCAACTGA

#Chicken ATCTGCTCTCGAGTACTGCTACCAATGACTTCTGCGTACAACTGA
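
The group assignments in the example above can be collected with a short Python sketch such as the one below. It is an illustration only, not MEGA's own reader. The function name collect_groups is hypothetical, and the sketch assumes that each taxon line holds the label and its sequence separated by whitespace, that a bare name in curly brackets is the group name, and that a group given in one block (as for FirstGene above) also applies to later blocks where the label carries no brackets.

```python
import re

# '#Name_{Group}' or '#Name', then whitespace and the sequence.
# Assumption: a bare name in curly brackets is the group name.
TAXON_RE = re.compile(
    r"^#(?P<name>[^{}\s]+?)(?:_?\{(?P<group>[^{}|=]+)\})?\s+(?P<seq>\S+)$"
)


def collect_groups(text):
    """Return a dict mapping each group name to its list of taxa."""
    group_of = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and !Gene=... command statements
        match = TAXON_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized taxon line: {line!r}")
        name, group = match.group("name"), match.group("group")
        if group is not None:
            group_of[name] = group  # remember the group for this taxon
    members = {}
    for name, group in group_of.items():
        members.setdefault(group, []).append(name)
    return members


if __name__ == "__main__":
    example = """\
!Gene=FirstGene Domain=Exon1 Property=Coding;
#Human_{Mammal} ATGGTTTCTAGTCAGGTCACCATGATAGGTCTCAAT
#Mouse_{Mammal} ATGGTTTCTAGTCAGGTCACCATGATAGGTCCCAAT
#Chicken_{Aves} ATGGTTTCTAGTCAGCTCACCATGATAGGTCTCAAT
!Gene=SecondGene Domain=Intron Property=Noncoding;
#Human ATTCCCAGGGAATTCCCGGGGGGTTTAAGGCCCCTTTAAAGAAAGAT
"""
    print(collect_groups(example))
```

Taxon lines without brackets, such as those in the SecondGene and ThirdGene blocks, are read but do not change the recorded group of a taxon.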