MEGA Basics

In this tutorial, we will focus on opening and manipulating data files and saving results. All of the data files used in this tutorial can be found in the MEGA/Examples/ folder. The default location for Windows users is C:\Program Files\MEGA\Examples; for Mac users it is $HOME/MEGA/Examples, where $HOME is the user’s home directory.

You can directly access the Examples folder by clicking the ‘Examples’ button on the bottom bar of the main window.

Active Files vs. Open Files

To perform any calculation or analysis in MEGA, you must provide a data file. If the data file contains sequences, they must be aligned before analysis (all sequences must be the same length). To get the sequences ready for analysis, you may need to align them using the Alignment Editor, which provides automated and manual alignment facilities. If a file needs editing to conform to one of the file format standards, you can open it in the Text Editor for manual editing.
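The same-length requirement can also be checked outside MEGA before loading a file. The following is a minimal Python sketch, not part of MEGA; the function name check_aligned and the toy taxon names are hypothetical and only serve as illustration.

```python
def check_aligned(sequences):
    """Return the shared length if every sequence is equally long.

    `sequences` maps a taxon name to its sequence string.
    Raises ValueError when the lengths differ (data not aligned).
    """
    lengths = {name: len(seq) for name, seq in sequences.items()}
    distinct = set(lengths.values())
    if len(distinct) > 1:
        raise ValueError(f"sequences differ in length: {lengths}")
    return distinct.pop() if distinct else 0


# Hypothetical toy data; '-' marks an alignment gap.
aligned = {"taxon_A": "ATG-CC", "taxon_B": "ATGACC"}
print(check_aligned(aligned))
```

If the check raises an error, the sequences still need to be aligned, for example with the Alignment Editor.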


Viewing a Data File Using the MEGA Text Editor

From the main MEGA menu, you can open any text file for viewing and/or editing. In this example, we will open a native MEGA text file and explore its format. This feature isn’t used too often in MEGA, but is useful when you have a file which is corrupted or needs manual editing. If you want to start using MEGA ASAP, you can skip this example.

Example 1.1:

To open the text editor, click File | Edit a Text File from the main MEGA menu.

In the window that opens, select File | Open and use the file browser to navigate to the MEGA/Examples directory. Select the "Drosophila_Adh.meg" file to open it.

Examine the "Drosophila_Adh.meg" file. Notice the OTU (Operational Taxonomic Unit) names and the interleaved sequence data. This file is in the MEGA format which is one of the two formats which MEGA reads for analysis. The other format MEGA accepts for analysis is FASTA.
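For comparison, FASTA files are simpler than the MEGA format: each record starts with a header line beginning with ">" followed by one or more lines of sequence. The sketch below is a minimal Python reader for FASTA text, written for illustration only; parse_fasta is a hypothetical helper and not something MEGA provides.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary.

    A line starting with '>' begins a new record; the following lines,
    up to the next '>' line, are joined into one sequence.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical two-record example.
example = ">taxon_A\nATGG\nCC\n>taxon_B\nATGACC\n"
print(parse_fasta(example))
```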


From the text editor, you can make changes to the file such as specifying a format. Experiment with the menu options in the editor.

Exit the text editor before proceeding with data analysis. Select the File | Exit Editor main menu option from the text editor window. If the editor asks you if you would like to save the changes that you have made to the file, click ‘No’.


Opening (activating) a Data File for Analysis

You can activate a data file using any of the following methods:

Example 1.2:

Now we will select a file to activate using the first method. From the main MEGA window, select Data | Open a File/Session from the launch bar. Navigate to the Examples directory (MEGA/Examples) and open the "Drosophila_Adh.meg" file.

Below the main MEGA launch bar you will notice that two icons appear in the main MEGA window: a “TA” icon and a “Close Data” icon. Click the “TA” icon to view the data you just opened. Click the “Close Data” icon to close the currently open data file.

Note: Only one data file may be open at a time. To open a different data file, go to Data | Open a File/Session; a warning will ask whether you want to close the current file in order to open another. Click “yes”. Each time you select an analysis, MEGA will ask if you would like to use the currently active data. If you click “yes”, the analysis will use the data file you already have open; if you click “no”, the current data file will be closed and you will be asked for a new file.
Hint: You can turn this prompt off by selecting the checkbox “Remember to use currently active data file”. MEGA will then assume that you want to keep using that file until you open a different one or close MEGA.


Viewing Sequence Data

The Sequence Data Explorer allows you to visually explore your sequence data as well as perform a wide range of statistical analysis based on data composition. You can activate the Sequence Data Explorer window by using any of these methods:

Note: You must have a data file already opened to explore active data.

Example 1.3:

Re-open the Drosophila file as described in Example 1.2.

Select Data | Explore Active Data from the launch bar of the main MEGA window and the sequence data will be displayed in the Sequence Data Explorer window. Leave this window open for the next example.

Note: If you hover your mouse over the icons on the toolbar, each icon will display text describing the icon’s function. This window provides several options for saving the displayed data in various formats, translating and highlighting sequences, and setting site coverage, as well as various tools for locating information within the data file.


Translating Sequences

Using the Sequence Data Explorer, you can translate protein-coding sequences into amino acid sequences and back using any of the following methods:

Note: The T key is a toggle - it turns the translation on and off. You can tell whether the data is translated or not by clicking on the Sequence Data Explorer main menu option, Data. There will be a check mark next to the Translate Sequences option if the data is translated.
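To make the idea of translation concrete, here is a minimal Python sketch that converts a nucleotide sequence into amino acids codon by codon. It assumes the standard genetic code and reading frame 1, which may not match the settings used in MEGA; the names CODON_TABLE and translate are hypothetical and not part of MEGA.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a nucleotide string in frame 1.

    Stop codons become '*'; codons containing gaps or ambiguous
    bases become '?'. A trailing partial codon is ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "?"))
    return "".join(protein)


print(translate("ATGGCCTAA"))
```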

Example 1.4:

With the Drosophila file still open in Sequence Data Explorer (from the previous example), press the T key on the keyboard to translate the nucleotide sequences into amino acid sequences.

Once the sequences are translated, calculate the amino acid composition by selecting the Statistics | Amino Acid Composition main menu command from the Sequence Data Explorer window. If you do not have Microsoft Excel installed, we suggest that, before running the Amino Acid Composition report, you select Statistics | Display Results in Comma-delimited (CSV) or Statistics | Display Results in Text Editor to view the results in CSV or text format. If you do have Excel, MEGA will open an Excel workbook displaying the amino acid composition calculations. The exception is Mac, where you must save the results to a file.

Exit Excel.

Note: If Excel is not installed on your computer and you still select the Excel option, you will be prompted to save the results in Excel format somewhere on your hard drive.
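The kind of calculation behind an amino acid composition report can be sketched in a few lines of Python. This is an illustration under simple assumptions (percentages per residue for one sequence, with gaps and stop symbols ignored), not a description of how MEGA computes or lays out its report; amino_acid_composition is a hypothetical helper.

```python
from collections import Counter


def amino_acid_composition(protein):
    """Return {residue: percentage} for one amino acid sequence.

    Non-letter symbols such as '-' (gap) and '*' (stop) are ignored.
    """
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    total = len(residues)
    if total == 0:
        return {}
    counts = Counter(residues)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


# Hypothetical short peptide.
for residue, percent in amino_acid_composition("MAAK").items():
    print(f"{residue}: {percent:.1f}%")
```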


Exporting Sequence Data

Using the Sequence Data Explorer, you can save data in the following formats: Mega, Nexus (PAUP 4.0), Nexus (PAUP 3.0/MacClade), Phylip 3.0, Excel Workbook, or CSV (Excel Importable).

Example 1.5:

On the Sequence Data Explorer launch bar, click on the Export Data icon. The Exporting Sequence Data window will appear.

In this window, you can set the title and a description of the data as well as choose a format.

Note: If you choose any export format other than Excel, the data will open in the Text File Editor and Format Converter window; to save the exported text, go to File | Save As. If you choose the Excel option and Excel is installed on your computer, the data will appear in a new Excel workbook. If you choose Excel and Excel is not installed on your computer, an Excel file will be created and you will be prompted for a location in which to save it. For this example, select any of the options except Excel.


Saving Sessions

MEGA includes a feature for saving data sessions that allows you to save translation state, highlighting, font changes, taxa groups, genes and domains, and other changes associated with your current file into a single session file. If you open the saved session later, the data and all of the associated settings will be restored automatically.

Example 1.6:

From the Sequence Data Explorer main menu, select Data | Save Session. A ‘Save As’ dialog opens that will allow you to save the session in an “.mdsx” file at the location of your choice. Any translation, highlighting, font changes, etc. will be saved in the resulting session. Save the file as “Drosophila_Adh.mdsx”.

Close the Sequence Data Explorer window and the data file by clicking the Close Data icon in the main MEGA window.

Reopen the session by selecting Data | Open a File/Session… from the launch bar of the main MEGA window and selecting the “Drosophila_Adh.mdsx” file. Any changes made to the data are preserved.

Close the Sequence Data Explorer window and the Drosophila file.


Viewing Distance Data

MEGA allows you to save distance data in MEGA’s native “.meg” format and later explore the data using the Distance Data Explorer.

Example 1.7:

From the launch bar on the main MEGA window, select Data | Open a File/Session…

In the Open a File window, find the data file named "Distance Data.meg", then click the Open button to activate the data file. This file is located in the MEGA/Examples directory.

On the main MEGA window, select Data | Explore Active Data. The contents of the selected data file will be displayed in the Distance Data Explorer window.

In the leftmost column you will see the names of the taxa listed. You can resize this column by dragging the vertical bar at its right edge. The distances are displayed in the columns to the right.

You can change the number of decimal places displayed in the distances by clicking on the toolbar icons labeled 0.0 (Decrease Decimal) and 0.00 (Increase Decimal).


Exporting Distance Data

Throughout MEGA, you will find viewing windows, each with its own set of toolbar icons. Wherever appropriate, you will see a bank of "Export" icons. The set of icons being displayed depends on which viewer you are using and the current analysis. In the Distance Data Explorer, the available export formats and associated icons are: XL, CSV, MEGA and TXT.

Example 1.8:

Click on the icon labeled CSV. The Distance Write-out Options window will appear. Because you clicked on the CSV icon, the Output Format is automatically set to "CSV : Comma-separated file".

Click the Print/Save Matrix button. A new CSV file will open in the Text File Editor and File Format Converter, displaying your data. The new file is automatically given a name and saved to your computer’s Temporary folder. Use the File | Save As menu option in the text editor to give the file a different name and to specify a destination folder for it.


Calculating Average Distances

On the Distance Data Explorer task bar, you will find the Average menu. From here, you can calculate average pair-wise distances between sequences in several ways: Overall, Within Groups, Between Groups and Net Between Groups. Of course, in order to calculate based on groups, groups must be defined for your data. For more information on defining groups, see the Tutorial labeled “Managing Taxa with Groups”.

Example 1.9:

Select Average | Overall from the Distance Data Explorer main menu. A dialog is displayed that shows the calculated overall average pair-wise distance among all selected sequences.

Close the Distance Data Explorer window by selecting the File | Quit Viewer main menu option.
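The overall average reported in this example is, conceptually, the mean of all pairwise distances in the matrix. A minimal Python sketch of that calculation follows. It assumes the distances are held as a lower-left triangular matrix with each pair listed once; the function name overall_average and the toy values are hypothetical, and the sketch does not reproduce MEGA's own implementation.

```python
from statistics import mean


def overall_average(lower_triangle):
    """Mean of all pairwise distances in a lower-left triangular matrix.

    Row i holds the distances from taxon i to taxa 0..i-1, so the
    first row is empty and each pair appears exactly once.
    """
    distances = [d for row in lower_triangle for d in row]
    if not distances:
        raise ValueError("at least two taxa are needed to form a pair")
    return mean(distances)


# Hypothetical distances among three taxa.
matrix = [
    [],          # taxon 0
    [1.0],       # taxon 1 vs taxon 0
    [2.0, 3.0],  # taxon 2 vs taxa 0 and 1
]
print(overall_average(matrix))
```

The Within Groups and Between Groups averages follow the same idea but restrict the set of pairs according to the groups defined for the data.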


Note: If you wish to continue with the tutorial, leave MEGA open. If not, close MEGA by selecting the File | Exit MEGA menu command from the main MEGA window.

Note: If you close MEGA and then reopen it, MEGA will remember the settings you used previously for an analysis (bootstrap, model, etc.). If the settings you used last are not applicable to the analysis you are performing currently, MEGA will select the first available applicable options for you. MEGA tries to reuse as many settings as it can in order to save time and effort.