Preface

Genome sequencing is generating vast amounts of DNA sequence data from a wide range of organisms. As a result, gene sequence databases are growing rapidly. To analyze these data efficiently, researchers need easy-to-use computer programs that combine fast computational algorithms with useful statistical methods.

The objective of the MEGA software has been to provide tools for exploring, discovering, and analyzing DNA and protein sequences from an evolutionary perspective. The first version was developed for the limited computational resources available on the average personal computer in the early 1990s. MEGA1 made many methods of evolutionary analysis easily accessible to the scientific community for research and education. MEGA2 was designed to harness the exponentially greater computing power and the graphical interfaces of the late 1990s, fulfilling the fast-growing need for more extensive biological sequence analysis and exploration software. It expanded the scope of its predecessor from single-gene to genome-wide analyses. Two versions were developed (2.0 and 2.1), each supporting the analysis of molecular sequence data (DNA and protein sequences) and pairwise distance data. Both allowed users to specify domains and genes for multi-gene comparative sequence analysis and to create groups of sequences, which facilitated the estimation of within- and among-group diversities and the inference of higher-level evolutionary relationships of genes and species. MEGA2 implemented many methods for the estimation of evolutionary distances, the calculation of molecular sequence and genetic diversities within and among groups, and the inference of phylogenetic trees under the minimum evolution and maximum parsimony criteria. It included the bootstrap and confidence probability tests of the reliability of inferred phylogenies, and the disparity index test for examining the heterogeneity of substitution patterns between lineages.

MEGA4 continues where MEGA2 left off, emphasizing the integration of sequence acquisition with evolutionary analysis. It contains an array of input data and results explorers for the visual representation, handling, and editing of sequence data, sequence alignments, inferred phylogenetic trees, and estimated evolutionary distances. The results explorers allow users to browse, edit, summarize, export, and generate publication-quality captions for their results. MEGA4 also includes distance matrix and phylogeny explorers as well as advanced graphical modules for the visual representation of input data and output results. These features, which we discuss below, set MEGA apart from other comparative sequence analysis programs.

As with previous versions, MEGA5 was specifically designed to reduce the time needed for mundane tasks in data analysis and to provide statistical methods of molecular evolutionary genetic analysis in an easy-to-use computing workbench. While MEGA5 was distinct from previous versions, we made a special effort to retain the user-friendly interface that researchers have come to identify with MEGA. We also simplified the file activation process, so that you may select an analysis before opening a data file.

MEGA6 represents a leap forward in terms of performance. The multithreaded maximum likelihood (ML) system has been optimized for maximum efficiency. A new memory manager and an updated compiler have made it possible for MEGA to allocate twice as much memory on 64-bit systems as MEGA5 could. The naïve timing methods that were added in MEGA5 have been replaced by a RelTime-based system that is as accurate as (or better than) contemporary methodologies while running more than 1,000 times faster.

MEGA7 is a major refactoring of the MEGA source code and achieves another leap forward in terms of performance. MEGA is now optimized for 64-bit processor architectures and can utilize many GB of memory. In addition, the Tree Explorer window has been refactored to handle trees with more than 100,000 taxa (previously it could handle only about 4,000), depending on the available graphics processing resources. Beyond increased performance, improvements have been made to the user interface. The Timetree system has been completely restyled to use a wizard that guides the user through the steps of creating a timetree with the RelTime method.