MEGA supports conversions from several different file formats into MEGA formats. Each format is indicated by the file extension used. Supported formats include:
Extension | File type
.aln | CLUSTAL
.nexus | PAUP, MacClade
.phylip | PHYLIP Interleaved
.phylip2 | PHYLIP Noninterleaved
.gcg | GCG format
.fasta | FASTA format
.pir | PIR format
.nbrf | NBRF format
.msf | MSF format
.ig | IG format
.xml | Internet (NCBI) XML format
The following sections briefly describe each of these formats and how MEGA handles their conversion.
COMMON FILE CONVERSION ATTRIBUTES
The default input format is determined by a file’s extension (e.g., a file with the extension “.ig” is initially assumed to be in “IG” input format). However, you can specify any format for any file; the file extension is used only as an initial guide. Specifying an incorrect file format usually results in an erroneous conversion or another unexpected error.
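The extension-based default described above can be pictured as a simple lookup with an optional user override. The following is a minimal Python sketch, not MEGA's actual code: the function name `guess_format`, the `override` parameter, and the short format labels are hypothetical, and the CLUSTAL extension is assumed to be `.aln`.

```python
import os

# Hypothetical extension-to-format lookup, based on the list of supported formats.
DEFAULT_FORMATS = {
    ".aln": "CLUSTAL",  # assumed spelling of the CLUSTAL extension
    ".nexus": "PAUP, MacClade",
    ".phylip": "PHYLIP Interleaved",
    ".phylip2": "PHYLIP Noninterleaved",
    ".gcg": "GCG",
    ".fasta": "FASTA",
    ".pir": "PIR",
    ".nbrf": "NBRF",
    ".msf": "MSF",
    ".ig": "IG",
    ".xml": "Internet (NCBI) XML",
}


def guess_format(filename, override=None):
    """Return the format chosen by the user, or else the default for the extension.

    Returns None when no override is given and the extension is not recognized.
    """
    if override is not None:
        # The user's explicit choice always wins over the extension.
        return override
    extension = os.path.splitext(filename)[1].lower()
    return DEFAULT_FORMATS.get(extension)
```

For example, `guess_format("seqs.ig")` yields the IG default, while `guess_format("seqs.ig", override="FASTA")` returns the user's choice regardless of the extension.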
Input files can include any of the following characters in their sequence data:
The letters a-z and A-Z, for DNA and protein sequences
Period (.)
Hyphen (-)
The space character
Question mark (?)
Depending on their context, all other characters encountered in input files are either ignored or are interpreted as specific non-sequence data, such as comments, headers, etc.
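The character rule above can be illustrated with a small Python sketch that separates the permitted sequence characters from everything else on a line. The name `split_sequence_chars` is hypothetical, and the sketch makes no attempt to reproduce the context-dependent step in which the converter decides whether the remaining characters are ignored or treated as comments or headers.

```python
import string

# Characters permitted in sequence data: letters, period, hyphen, space, question mark.
SEQUENCE_CHARS = set(string.ascii_letters) | set(".-? ")


def split_sequence_chars(line):
    """Split a line into (sequence characters, all other characters).

    This is an illustration only; how the "other" characters are handled
    depends on context and on the input format.
    """
    kept = "".join(ch for ch in line if ch in SEQUENCE_CHARS)
    other = "".join(ch for ch in line if ch not in SEQUENCE_CHARS)
    return kept, other
```

Applied to a line such as `ACG-T?.N 12*`, the sketch keeps the letters, hyphen, question mark, period, and space, and sets aside the digits and the asterisk.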
The first line of all converted files is always: #Mega
The second line of all converted files is always: !Title: <filename>
where <filename> is the name of the input file.
The third line of all converted files is blank.
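The three fixed header lines can be sketched in Python as follows. The function name `mega_header_lines` is hypothetical, and the sketch assumes the title uses the input file name exactly as supplied.

```python
def mega_header_lines(filename):
    """Return the three lines that begin every converted file.

    Line 1 is the fixed marker, line 2 carries the input file name
    (assumed here to be used exactly as given), and line 3 is blank.
    """
    return ["#Mega", f"!Title: {filename}", ""]
```

Joining the result with newline characters for an input named `seqs.fasta` gives `#Mega`, then `!Title: seqs.fasta`, then an empty line.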
Many formats can specify the length of the sequences they contain. The MEGA conversion utility ignores this information and does not check whether the sequences are as long as they are declared to be.