Introduction

Computational diagnosis of amino acid variants in the human exome is the first step in assessing the disruptive impacts of non-synonymous single nucleotide variants (nsSNVs) on human health and disease. MEGA-MD (Molecular Evolutionary Genetics Analysis – Mutation Diagnosis) is a client-server application used to forecast the deleteriousness of nsSNVs using multiple methods and explore them in the context of the variability permitted in the long-term evolution of the affected positions.

MEGA-MD accesses a relational database (MD-DB), resident on our servers, that contains pre-computed diagnoses and associated information for all possible mutations at all amino acid positions in the human exome. We have included three primary methods for predicting the functional impact of amino acid variants: PolyPhen-2, SIFT, and EvoD. The first two are the most popular methods, and the third significantly improves performance for nsSNVs found at ultra-conserved and fast-evolving positions (Kumar et al., 2012). The PolyPhen-2 and SIFT diagnoses were obtained from dbNSFP. We have also included results from a multi-method consensus diagnosis, because consensus diagnoses have been shown to be more reliable. For the consensus diagnosis, we use the evolutionarily balanced versions of the PolyPhen-2 and SIFT diagnoses (see Liu and Kumar, 2013).

In addition to retrieving pre-computed predictions for variants in the human exome, MEGA-MD provides a facility to infer ancestral states at the position where a given amino acid mutation is found. This utility supports maximum parsimony and maximum likelihood approaches, and it uses the 46-species reference phylogeny together with the 46-species peptide alignment for the relevant gene (obtained from the UCSC resource).
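
To illustrate what maximum parsimony inference at a single position involves, the following is a minimal sketch of the bottom-up pass of Fitch's small-parsimony algorithm in Python. It is not MEGA-MD's implementation. The function name, the nested-tuple tree encoding, and the four-species toy tree and residues are all assumptions made for illustration, and they stand in for the 46-species phylogeny and alignment.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a species name, an internal node
# is a pair of subtrees (rooted, strictly binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_root_states(tree: Tree, leaf_states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up pass of Fitch's algorithm for one alignment column.

    Returns the set of candidate residues at the root of `tree` and the
    minimum number of substitutions needed to explain the leaf residues.
    """
    if isinstance(tree, str):
        # Leaf: the only candidate is the observed residue.
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_root_states(left, leaf_states)
    right_set, right_cost = fitch_root_states(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one residue: no extra substitution.
        return shared, left_cost + right_cost
    # Children disagree: keep all candidates and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Toy example (assumed data, not taken from the UCSC alignment).
toy_tree = (("human", "chimp"), ("mouse", "rat"))
toy_column = {"human": "R", "chimp": "R", "mouse": "K", "rat": "R"}
root_set, changes = fitch_root_states(toy_tree, toy_column)
print(sorted(root_set), changes)  # prints ['R'] 1
```

The sketch stops at the root state set. Assigning a state to every internal node would require the additional top-down pass of Fitch's algorithm, and a maximum likelihood reconstruction would further require a substitution model and branch lengths.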

MEGA-MD was developed using the MEGA (Molecular Evolutionary Genetics Analysis) software package.