Biogenetics (databases)

  Sep 22, 2005 #1
    I think those who are well-immersed in the scientific community are familiar with the biological databases connected through the website www.ncbi.nlm.nih.gov.

    I was wondering if someone could explain a little about the BLAST database. In one section of a homework assignment, BLAST is needed to query for matching protein/gene sequences. My search for an unc. C. elegans protein/gene sequence returned a lot of data on species that have similar protein/gene sequences, along with the alignments.

    An example of the protein sequence results is:




    My questions are:
    Does each of the letters stand for a codon or for an amino acid?
    In terms of their relationship to each other and to my initial protein query, what are Query 1 and Subject 1?

    I'd deeply appreciate any further insights into this database. (Maybe there's a crash course on it somewhere?) My professor seems to be skimming the surface of the importance of these databases (BLAST/WormBase/FlyBase), but I want to know more since there is a good chance I will be using them in the future.

    Many thanks in advance!
  Sep 22, 2005 #2


    Each letter represents one amino acid.

    The "+" sign represents a match between amino acids that are different but belong to the same functional family. A blank space in the middle line marks a mismatch, and a dash in the query or subject line marks a gap.

    Query is the protein that you "BLASTed"; subject is the protein in the data bank. The number is the position of the amino acid within that sequence. It is not a number for the query or the subject itself.
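
    To make the layout above concrete, here is a minimal Python sketch of how the three lines of a protein alignment fit together. The sequences are made up for illustration (not real BLAST output), and the "similar" groups are a simplified assumption of mine; BLAST itself decides on "+" using a scoring matrix such as BLOSUM62.

```python
# Simplified groups of chemically similar amino acids. This grouping is an
# illustrative assumption; BLAST uses a scoring matrix (e.g. BLOSUM62).
SIMILAR_GROUPS = [
    set("ILVM"),  # hydrophobic, aliphatic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("STNQ"),  # polar, uncharged
]


def midline(query, subject):
    """Build the middle line of a BLAST-style protein alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            out.append(" ")   # gap in one of the sequences
        elif q == s:
            out.append(q)     # identical amino acid: the letter is repeated
        elif any(q in g and s in g for g in SIMILAR_GROUPS):
            out.append("+")   # different but similar amino acid
        else:
            out.append(" ")   # mismatch
    return "".join(out)


def format_alignment(query, subject, q_start=1, s_start=1):
    """Return the three alignment lines with start and end positions."""
    # Dashes are gaps, so they do not count as sequence positions.
    q_end = q_start + len(query.replace("-", "")) - 1
    s_end = s_start + len(subject.replace("-", "")) - 1
    width = max(len(str(q_start)), len(str(s_start)))
    return "\n".join([
        f"Query  {q_start:<{width}}  {query}  {q_end}",
        f"       {'':<{width}}  {midline(query, subject)}",
        f"Sbjct  {s_start:<{width}}  {subject}  {s_end}",
    ])


# Made-up aligned sequences; the subject match starts at its 5th amino acid.
print(format_alignment("MKVLA-GE", "MRILASGD", q_start=1, s_start=5))
```

    The numbers printed at each end are positions inside each sequence, which is why the query and subject lines can start and end on different numbers.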

    As for the database: BLAST is not a database, GenBank is the database. BLAST stands for Basic Local Alignment Search Tool. So BLAST is a program that helps you align sequences and find the best match.
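
    If it helps to see the idea of "align and find the best match" in code, here is a rough Python sketch. It uses the classic Smith-Waterman local alignment score, which is not what BLAST runs internally (BLAST uses a faster heuristic), and the scoring values and the tiny "database" dictionary are made up for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between strings a and b.

    The match/mismatch/gap values are arbitrary illustrative choices.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_match(query, database):
    """Return (name, score) of the database entry that best matches the query."""
    scores = {name: local_alignment_score(query, seq) for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical mini "database"; real searches run against GenBank-scale data.
database = {"seq1": "MKVLAGE", "seq2": "TTTTT"}
print(best_match("KVLA", database))
```

    Here the query is the sequence you submit and each database entry plays the role of a subject.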

    If you have any more questions, don't be shy about asking. I have taken a bioinformatics course and have been using BLAST and other databases for a few years now.
