Research Article
Common Pitfalls and Novel Opportunities for Predicting Variant Pathogenicity
Tom van den Bergh1,2*, Bas Vroling2, Remko KP Kuipers1,3, Henk-Jan Joosten1,2 and Gert Vriend3
1Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands
2Bio-Prodict, Nijmegen, The Netherlands
3CMBI, Radboud University Medical Centre, Nijmegen, The Netherlands
Corresponding Author: Tom van den Bergh, Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands and Bio-Prodict, Nijmegen, The Netherlands. Tel: 0031 24 845 7988. E-mail: tvandenbergh@bio-prodict.nl
Received January 05, 2016; Accepted January 27, 2016; Published February 03, 2016
Citation: van den Bergh T, Vroling B, Kuipers RKP, Joosten HJ, Vriend G (2016) Common Pitfalls and Novel Opportunities for Predicting Variant Pathogenicity. Biochem Physiol 5:197. doi:10.4172/2168-9652.1000197
Copyright: © 2016 van den Bergh T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
The pathogenicity of missense variants is normally predicted by analyzing multiple sequence alignments, optionally augmented with analyses of the (predicted) protein structure. The most straightforward approach, though, is to search the literature to see whether the variant has already been described. Variant data from homologous proteins are also valuable, because mutations in a homologous protein often have effects similar to those of mutations at the equivalent residues of the protein of interest. Transferring variant data seems trivial but is seriously hampered by the fact that homologous residue positions have different numbers in different species. This problem is even bigger when two proteins have such low sequence identity that they can no longer be aligned based on their sequences alone and their structures need to be compared to align them accurately. The protein superfamily analysis software suite 3DM solves these problems because it combines high-quality, structure-based multiple sequence alignments, in which aligned residues have the same number, with all published mutant and variant data for human and all other species. We have used 3DM to analyze nine human proteins for which many disease-related variants are known. This study reveals that mutation data can be transferred even between very distant homologous proteins. Thus, protein superfamily information systems, such as 3DM, offer a wealth of unused information that can be used in the analysis of human variants.
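The numbering problem described in the abstract can be illustrated with a minimal Python sketch. It is not the 3DM implementation. It only assumes that two homologous sequences are available as rows of the same alignment, with "-" marking gaps, so that the alignment column can serve as a shared residue number. The function names and the two toy sequences are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Optional

GAP = "-"

# Hypothetical toy alignment rows (not real proteins); both rows have equal length.
PROTEIN_OF_INTEREST = "MK-TAYIAK"
HOMOLOG = "M-QTA-IAK"


def residue_to_column(aligned_seq: str) -> Dict[int, int]:
    """Map each residue's own 1-based sequence number to its 1-based alignment column."""
    mapping: Dict[int, int] = {}
    residue_number = 0
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol != GAP:
            residue_number += 1
            mapping[residue_number] = column
    return mapping


def transfer_position(source_row: str, target_row: str, source_residue: int) -> Optional[int]:
    """Return the residue number in target_row aligned with source_residue in source_row.

    Returns None if the source residue does not exist or faces a gap in the target.
    """
    if len(source_row) != len(target_row):
        raise ValueError("alignment rows must have the same length")
    column = residue_to_column(source_row).get(source_residue)
    if column is None:
        return None
    # Invert the target mapping: alignment column -> target residue number.
    column_to_target = {col: res for res, col in residue_to_column(target_row).items()}
    return column_to_target.get(column)
```

In this toy alignment, residue 6 of the first row (I) shares a column with residue 5 of the second row, so a variant reported at position 5 of the homolog would be looked up at position 6 of the protein of interest. Residue 5 of the first row (Y) faces a gap, so no data can be transferred for it.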