Wednesday, October 2, 2013

Bioinformatics-based databases and tools for DNA analysis






a. Annotation is one of the first steps in DNA sequence analysis and refers to adding identifying features to a sequence. In the simplest case, annotation is performed by comparing an unknown sequence with a DNA sequence that has already been annotated. This rests on the hypothesis that if two sequences are similar, they should also share characteristic features. Genome annotation, that is, identifying the features of a whole genome, is more challenging. Genome annotation approaches can be subdivided into two categories:


  • Comparative genomics based (i.e. based on comparison with other genomes)

  • Ab-initio based (i.e. from the beginning)



Depending on the source of the sequence, annotation tools are designed for either prokaryotic genomes or eukaryotic genomes.


GeneMark.hmm (Lomsadze et al. 2005; http://exon.gatech.edu/index.html), GENSCAN (http://genes.mit.edu/GENSCAN.html), and FGENESH (http://linux1.softberry.com/berry.phtml?topic=fgenesh&group=programs&subgroup=gfind) are ab initio eukaryotic gene prediction tools. Each takes an unknown DNA sequence as input, scores it against statistical models of gene structure trained on known genomes, and can predict:


  • presence of genes

  • position of genes

  • strand on which genes are present

  • exon positions

  • translation product
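The kinds of output listed above can be illustrated with a very naive open-reading-frame (ORF) scan. This is only a sketch and not how GeneMark.hmm, GENSCAN, or FGENESH work: those tools rely on trained statistical models and handle introns, whereas this example just looks for ATG...stop stretches on both strands. The function names and the `min_codons` threshold are assumptions made for the illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumes an unambiguous DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, protein) for every ATG...stop ORF.

    Coordinates are 0-based, end-exclusive, and always reported on the
    forward strand, so a '-' hit can be located in the input sequence.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the ATG that opened the current ORF
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and CODON_TABLE.get(codon) == "*":
                    end = i + 3  # include the stop codon in the reported span
                    protein = "".join(
                        CODON_TABLE.get(s[j:j + 3], "X") for j in range(start, i, 3)
                    )
                    if len(protein) >= min_codons:
                        lo, hi = (start, end) if strand == "+" else (n - end, n - start)
                        orfs.append((strand, lo, hi, protein))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGGGTAACC"):
        print(orf)
```

Each tuple reports the strand, the position of the candidate gene, and its translation product. Exon boundaries are not modelled here, which is one reason real eukaryotic gene finders need far richer models.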



Other software used for gene prediction and annotation includes Artemis, developed at the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/resources/software/artemis/).










b. Similarity searching can be performed using the BLASTN tool at NCBI (http://www.ncbi.nlm.nih.gov); a detailed description is available in the chapter on NCBI BLAST.


c. Molecular evolution can be studied from DNA sequences using multiple sequence alignment (MSA). The most common tool for MSA is Clustal, which can be used as a web-based service or downloaded from http://www.clustal.org/. Clustal employs progressive alignment: it first creates a global pairwise alignment for every pair of sequences and records an alignment/similarity score for each, then starts the MSA with the two highest-scoring sequences and progressively adds the remaining sequences to complete the alignment. The MSA can be analysed further using software such as MEGA to reveal evolutionary relationships. For details on how to construct a multiple sequence alignment and a phylogenetic tree, please refer to the chapter on molecular phylogeny and multiple sequence alignment.
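The first stage of the progressive strategy described above can be sketched as follows: score every pair with a global (Needleman-Wunsch) alignment, start from the best-scoring pair, and then decide the order in which the remaining sequences would be added. This is a simplified illustration under assumed scoring values, not Clustal's implementation. In particular, Clustal derives the joining order from a guide tree, and the actual merging of sequences into the alignment is not shown here.

```python
from itertools import combinations


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def joining_order(seqs):
    """Order in which sequences would enter a progressive alignment.

    seqs maps names to sequences and must hold at least two entries.
    """
    scores = {
        frozenset(pair): global_alignment_score(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }
    first = max(scores, key=scores.get)  # the highest-scoring pair seeds the MSA
    order = sorted(first)
    remaining = set(seqs) - first
    while remaining:
        # Add next whichever sequence is most similar to one already included.
        nxt = max(
            sorted(remaining),
            key=lambda r: max(scores[frozenset((r, o))] for o in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    toy = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTGGGG"}  # hypothetical data
    print(joining_order(toy))
```

Because early choices are never revisited, errors made when aligning the first pair propagate into the final alignment, which is a known limitation of progressive methods.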


