Wednesday, October 2, 2013

Bioinformatics-based databases and tools for whole-genome analysis:



a. Comparative genomics and genome structure:


The last few years have seen the completion of high-quality draft genome sequences from a wide range of organisms. The Genome database at NCBI currently lists a total of 3468 viral genomes, 15360 from prokaryotes and 2268 from eukaryotes (as of December 22, 2012; http://www.ncbi.nlm.nih.gov/genome/browse/). The availability of genome sequences from such a variety of organisms has allowed researchers to reconstruct genome structure at the highest resolution possible, i.e. at the level of nucleotides, and to compare entire genomes rather than just a few kilobases. In other words, sequencing not only gives researchers a bird's-eye view of the genome but also lets them zoom into virtually any region of it. Comparing a genome to itself reveals its overall organization and composition, such as duplications and inversions, whereas comparing two or more genomes shows how those genomes have evolved relative to one another.
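
As a rough illustration of what comparing a genome to itself means, the sketch below indexes every k-mer of a sequence and reports positions where the same k-mer occurs again (a candidate duplication) or where its reverse complement occurs (a candidate inversion). The function names and the toy sequences are made up for this example, and real tools work with far more sophisticated, error-tolerant methods; this is only a minimal exact-match version.

```python
from collections import defaultdict


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def self_matches(seq, k):
    """Compare a sequence to itself using exact k-mer matches.

    Returns two sorted lists of position pairs (i, j) with i < j:
      direct   - the k-mer at i occurs again at j (possible duplication)
      inverted - the reverse complement of the k-mer at i occurs at j
                 (possible inversion)
    """
    # Index every k-mer by the positions where it starts.
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)

    direct, inverted = [], []
    for kmer, positions in index.items():
        # Same k-mer seen at two different positions.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                direct.append((positions[a], positions[b]))
        # Reverse complement of this k-mer seen further downstream.
        rc = reverse_complement(kmer)
        if rc in index:
            for i in positions:
                for j in index[rc]:
                    if i < j:
                        inverted.append((i, j))
    return sorted(direct), sorted(inverted)


# Toy sequence (hypothetical): "GATTA" appears at positions 0 and 8.
direct, inverted = self_matches("GATTACCCGATTA", k=4)
print("direct repeats:", direct)
print("inverted repeats:", inverted)
```

Plotting the reported pairs as points on an x-y grid gives the familiar dot plot, in which duplications show up as diagonals parallel to the main diagonal and inversions as diagonals running the other way.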


For example, the Pairwise Sequence Comparison (PASC; http://www.ncbi.nlm.nih.gov/sutils/pasc/viridty.cgi?textpage=overview; Bao et al. 2008) at NCBI is a tool for comparison of sequences within viral genomes.


TaxPlot (http://www.ncbi.nlm.nih.gov/sutils/taxik2.cgi) allows users to compare genomes in a three-way plot that identifies protein homologs present across genomes. The tool is available for genome-wide comparisons across microbial and eukaryotic genomes.


Global alignment tools such as AVID (http://pipeline.lbl.gov/cgi-bin/gateway2) let users compare large segments or entire genomes from various organisms. They have become powerful tools for comparing genomes irrespective of phylogenetic distance or coding potential, in contrast to PASC, which is limited to viral genomes, and TaxPlot, which compares protein homologs.
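
To make the idea of a global alignment concrete, here is a minimal sketch of the textbook Needleman-Wunsch scoring recurrence, which aligns two sequences end to end. This is not the algorithm AVID itself uses, and it would be far too slow for whole genomes; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen purely for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best end-to-end (global) alignment of strings a and b.

    Classic Needleman-Wunsch dynamic programming, keeping only one row
    of the score table in memory at a time.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch) ...
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # ... or place a gap in one of the two sequences.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))
print(global_alignment_score("ACGT", "AGT"))
```

Because every position of both sequences must be accounted for, the score penalizes unmatched ends, which is what distinguishes a global alignment from a local one.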






b. Interactome and Biological Network tools:


Interactomics can be defined as a discipline of proteomics / genomics that aims to reveal the direct and indirect interactions among proteins and other macromolecules in a cellular environment; the full set of such interactions is called the interactome. Drawing up an interactome map helps us understand the regulation and functioning of the genome. IntAct (http://www.ebi.ac.uk/intact/), available at EMBL-EBI and used together with Cytoscape (a visualization tool), is one such resource, with nearly 304500 interaction datasets. The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu/dip/Main.cgi) is another.
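
An interactome map can be thought of as a graph in which each molecule is a node and each observed interaction is an edge. The sketch below builds such a graph from a list of pairwise interactions, lists the highly connected "hub" proteins, and finds indirect partners (molecules reachable through one intermediate). The protein names P1 to P4 are placeholders, not real database entries, and the function names are invented for this example; it does not use the IntAct or DIP file formats.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction network from (a, b) pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def hubs(network, min_degree):
    """Proteins with at least min_degree direct interaction partners."""
    return sorted(p for p, partners in network.items()
                  if len(partners) >= min_degree)


def indirect_partners(network, protein):
    """Proteins linked to `protein` only through one intermediate."""
    direct = network.get(protein, set())
    second = set()
    for partner in direct:
        second |= network[partner]
    # Drop the protein itself and anything it already touches directly.
    return second - direct - {protein}


# Placeholder interaction list (hypothetical data).
pairs = [("P1", "P2"), ("P1", "P3"), ("P3", "P4"), ("P1", "P4")]
net = build_network(pairs)
print("hubs:", hubs(net, min_degree=3))
print("indirect partners of P2:", sorted(indirect_partners(net, "P2")))
```

Visualization tools such as Cytoscape draw exactly this kind of node-and-edge structure, only at the scale of thousands of molecules.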





Apart from protein interaction data, KEGG (Kyoto Encyclopedia of Genes and Genomes; http://www.genome.jp/kegg/pathway.html) is a combined database and analysis tool covering a wide range of metabolic pathways for the biosynthesis and degradation of biological compounds.

