Research Article |
Open Access |
|
|
Computational Annotation for Hypothetical Proteins
of Mycobacterium tuberculosis |
S. Anandakumar and P. Shanmughavel 1 * |
1Computational Biology and Bioinformatics Laboratory,
Department of Bioinformatics, Bharathiar University,
Coimbatore – 641046, Tamil Nadu, India |
| *Corresponding author: |
Dr. P. Shanmughavel,
Email : shanvel_99@yahoo.com |
|
| Received August 28, 2008; Accepted November 10, 2008; Published December 26, 2008 |
|
Citation: Anandakumar S, Shanmughavel P (2008) Computational Annotation for Hypothetical Proteins of Mycobacterium tuberculosis. J Comput Sci Syst Biol 1: 050-062. doi:10.4172/jcsb.1000004 |
| |
Copyright: © 2008 Anandakumar S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and
source are credited. |
| |
|
Deaths from tuberculosis are rising worldwide. The recent sequencing of the Mycobacterium tuberculosis genome holds promise for the development of new vaccines and the design of new drugs. In
this view, predicting the functions of hypothetical proteins from their genomic sequences will advance our knowledge
toward the identification of new drugs for tuberculosis. Various function prediction methods are available, and the
annotation of genes in newly sequenced genomes is currently based on sequence similarity. In this work, the functions
of about 250 hypothetical proteins of Mycobacterium tuberculosis were predicted using the bioinformatics web tools
BLAST, INTERPROSCAN, PFAM and COGs. |
Keywords |
| Tuberculosis; Hypothetical proteins; Sequence similarity; Bioinformatics web tools |
Introduction |
| Current research on sequencing of the Mycobacterium
tuberculosis genome holds promise for the development
of new vaccines and the design of new drugs (Prachee
Chakhaiyar and Hasnain, 2004). The functions of
hypothetical proteins are unknown because a hypothetical
protein is one whose existence has only been predicted
from the genomic sequence (Edward Eisenstein et al., 2000).
In-depth study of function prediction for such proteins
will offer opportunities for novel applications and help
researchers identify new drug molecules for tuberculosis.
Mycobacterium tuberculosis has a total of 3887 proteins,
of which 1985 are hypothetical proteins. Of these, 250
hypothetical proteins were taken for this work. All 250
were analyzed for function prediction using bioinformatics
web tools: BLAST, INTERPROSCAN, PFAM and COGs. The
results indicate 100% confidence for only 84 proteins and
75% confidence for 92 proteins; for some proteins the
function could not be predicted with much confidence
(unknown function). |
Methodology |
| The complete genome sequence of the pathogenic bacterium
Mycobacterium tuberculosis was downloaded from the PIR
Database (http://pir.georgetown.edu/) and the NCBI Database
(www.ncbi.nlm.nih.gov/). The complete genome contains 1985
hypothetical proteins. Of these, only 250 were selected for
analysis, and their sequences were downloaded from the site
(http://www.ncbi.nih.gov/genomes/lproks.cgi). Finally, the
sequence of each protein was submitted to function
prediction web tools: NCBI-BLAST2 (Wendy Baker et al., 2000),
INTERPROSCAN (Zdobnov and Rolf Apweiler, 2001),
PFAM (Bateman et al., 2002) and COG (Roman Tatusov et al., 2000).
A confidence level was then assigned to each prediction
according to the agreement among these tools. |
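The selection step described above can be sketched in Python. This is only an illustrative sketch: the source does not say how the 250 proteins were chosen, so taking the first `limit` matches is an assumption. The helper names `read_fasta` and `select_hypothetical`, and the header wording "hypothetical protein", are also assumptions about the downloaded FASTA file.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_hypothetical(records: Iterable[Record], limit: int = 250) -> List[Record]:
    """Keep the first `limit` records whose header mentions a hypothetical protein.

    Assumption: hypothetical proteins are labelled as such in the header.
    """
    selected: List[Record] = []
    for header, sequence in records:
        if "hypothetical protein" in header.lower():
            selected.append((header, sequence))
            if len(selected) == limit:
                break
    return selected


# Tiny in-memory example with made-up identifiers and sequences.
sample = """>prot1 hypothetical protein
MTEQ
LLAV
>prot2 some annotated enzyme
MKKA
>prot3 conserved hypothetical protein
MSTP
""".splitlines()

subset = select_hypothetical(read_fasta(sample), limit=250)
```

Each selected sequence would then be submitted to the four web tools by hand or through whatever interface each tool offers; that step is not sketched here.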
Table 1: Functional genomics of Mycobacterium tuberculosis. |
|
Table 2: Percentage of similarity.
(Of the 250 proteins, 84 were at the 100% confidence level, 92 at 75%, 56 at 50%,
12 at 25%, and 6 at 0%.) |
|
| 1 |
If all four tools indicate the same function,
the confidence level is 100 percent. |
| 2 |
If three tools indicate the same function and the
fourth indicates a different function, the confidence
level is 75 percent. |
| 3 |
If two tools indicate the same function and the
other two indicate different functions, the confidence
level is 50 percent. |
| 4 |
If all four tools indicate different functions,
the confidence level is 25 percent. |
| 5 |
If none of the tools indicates any function,
the confidence level is 0 percent. |
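The five rules above can be expressed as a short scoring function. The following Python sketch is not the authors' implementation. It assumes that each tool's prediction has been reduced to a single function label and that "the same function" means an exact match after case and whitespace normalization. The rules do not cover cases where only some tools return a prediction, so the sketch scores those by the size of the largest agreeing group, which is also an assumption. The name `confidence_level` is hypothetical.

```python
from collections import Counter
from typing import Mapping, Optional


def confidence_level(predictions: Mapping[str, Optional[str]]) -> int:
    """Return 0, 25, 50, 75 or 100 from the agreement among tool predictions.

    `predictions` maps a tool name to its predicted function label,
    or to None/"" when the tool returned no function.
    """
    # Keep only tools that returned a function; normalize the labels.
    labels = [p.strip().lower() for p in predictions.values() if p and p.strip()]
    if not labels:
        return 0  # rule 5: no tool indicates any function
    # Size of the largest group of tools that agree with each other.
    largest_group = max(Counter(labels).values())
    # 4 agree -> 100, 3 -> 75, 2 -> 50, all different -> 25 (rules 1 to 4).
    return min(100, 25 * largest_group)


example = {
    "BLAST": "Hydrolase",
    "INTERPROSCAN": "hydrolase",
    "PFAM": "hydrolase",
    "COG": "methyltransferase",
}
score = confidence_level(example)  # three of four tools agree
```

In practice, labels reported by different tools rarely match character for character, so a manual or curated mapping to common function names would be needed before applying such a function.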
|
Results and Discussion |
| Deaths from tuberculosis are rising worldwide
(Smith et al., 2004). A central goal of bioinformatics,
and a major area of research, is to determine protein
functions from genomic sequences and to support the
development of personalized medicine. Functional
annotation of hypothetical proteins from their genomic
sequences is of major importance in providing insight
into their molecular functions and will help in the
identification of new drugs for tuberculosis. Table 1
shows the functional genomics of Mycobacterium
tuberculosis obtained using BLAST, INTERPROSCAN, PFAM
and COG. Mycobacterium tuberculosis has a total of 3887
proteins. Of these 3887 proteins, 1985 are hypothetical
proteins, from which 250 were retrieved for this study.
These hypothetical proteins were submitted to the above
tools, which were used to determine the confidence level.
Of the 250 proteins, a function was obtained for only 244;
among these, dehydrogenases/reductases, hydrolases,
luciferases and methyltransferases were the most numerous. |
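The counts reported in the Table 2 caption can be checked with simple arithmetic. The short Python sketch below uses only the figures given in that caption; the variable names are illustrative.

```python
# Number of proteins at each confidence level, from the Table 2 caption.
counts = {100: 84, 75: 92, 50: 56, 25: 12, 0: 6}

total = sum(counts.values())    # all proteins analyzed
annotated = total - counts[0]   # proteins with at least one predicted function

# Share of the analyzed proteins at each confidence level, in percent.
shares = {level: round(100 * n / total, 1) for level, n in counts.items()}
```

The total comes to 250 and the number of proteins with a predicted function to 244, in agreement with the text; about a third of the proteins (33.6%) reach the 100% confidence level.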
References |
- Baker W, et al. (2000) The EMBL Nucleotide Sequence
Database. Nucleic Acids Research 28: 19-23.
- Bateman A, et al. (2002) The Pfam protein families database.
Nucleic Acids Research 30: 276-280.
- Chakhaiyar P, Hasnain SE (2004) Defining the Mandate
of Tuberculosis Research in a Postgenomic Era. Medical
Principles and Practice 13: 177-184.
- Eisenstein E, et al. (2000) Biological function made crystal
clear - annotation of hypothetical proteins via structural
genomics. Current Opinion in Biotechnology 11: 25-30.
- Pellegrini M, et al. (1999) Assigning protein functions by
comparative genome analysis: protein phylogenetic profiles.
Proc Natl Acad Sci USA 96: 4285-4288.
- Smith CV, et al. (2004) TB drug discovery: addressing
issues of persistence and resistance. Tuberculosis
84: 45-55.
- Tatusov RL, et al. (2000) The COG database: a tool for
genome-scale analysis of protein functions and evolution.
Nucleic Acids Research 28: 33-36.
- Zdobnov EM, Apweiler R (2001) InterProScan – an integration
platform for the signature-recognition methods in
InterPro. Bioinformatics 17: 847-848.
|
|