How Do Biogenetics Databases Like NCBI's Blast Work?

  • Thread starter saysua
SUMMARY

The discussion centers on the functionality of the BLAST (Basic Local Alignment Search Tool) program, which is utilized to query biological databases like GenBank for matching protein and gene sequences. Participants clarify that each letter in the protein sequences represents an amino acid, and the "+" sign indicates a match within the same functional family. The query refers to the protein being searched, while the subject refers to the protein found in the database. Understanding these distinctions is crucial for effectively using BLAST in bioinformatics.

PREREQUISITES
  • Basic understanding of bioinformatics concepts
  • Familiarity with protein and gene sequence analysis
  • Knowledge of amino acid coding and alignment techniques
  • Experience with using online biological databases like GenBank
NEXT STEPS
  • Explore the functionalities of NCBI's BLAST tool in detail
  • Learn about sequence alignment algorithms and their applications
  • Investigate the structure and usage of GenBank as a biological database
  • Study the significance of functional families in protein sequences
USEFUL FOR

Researchers, bioinformaticians, and students in the biological sciences who are looking to deepen their understanding of sequence alignment and the use of biological databases for protein and gene analysis.

saysua
I think those who are well-immersed in the scientific community are familiar with the biological databases connected through the website www.ncbi.nlm.nih.gov.

I was just wondering if someone could explain a little about the BLAST database to me. In a particular section of a homework assignment, BLAST is needed to query for matching protein/gene sequences. The search for an unc. C. elegans protein/gene sequence returned a bunch of data on species that have similar protein/gene sequences, along with alignments.

An example of the protein sequence results:

Query 1 MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA 60
        MEHEKDPGWQYLRR+REQ+LEDQSKPYDSKKN WIPDPEEGYLAGEITATKGDQVTIVTA
Sbjct 1 MEHEKDPGWQYLRRSREQILEDQSKPYDSKKNCWIPDPEEGYLAGEITATKGDQVTIVTA 60


My questions are:
Is each of the letters a codon for an amino acid?
What are Query 1 and Sbjct 1, in terms of their relationship to each other and to my initial protein query?

I'd deeply appreciate any further insights into this database (maybe there's a crash course on it somewhere?). My professor seems to be only skimming the surface of why these databases (BLAST/WormBase/FlyBase) matter, but I want to know more since there is a good chance I will be using them in the future.

Many thanks in advance!
 
Each letter represents one amino acid (the standard one-letter code), not a codon.
http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/A/AminoAcids.html
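To make the letter-versus-codon distinction concrete, here is a small Python sketch. The table is the standard one-letter amino acid code; the function name `spell_out` is just something I made up for illustration.

```python
# Standard one-letter amino acid codes mapped to three-letter abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def spell_out(protein):
    """Expand a one-letter protein string into three-letter abbreviations.

    Hypothetical helper for illustration; raises KeyError on any character
    outside the 20 standard amino acids.
    """
    return "-".join(ONE_TO_THREE[residue] for residue in protein)


# First five residues of the query from the question.
print(spell_out("MEHEK"))  # Met-Glu-His-Glu-Lys
```

A codon, by contrast, is a triplet of nucleotides in the gene (ATG, for example, encodes Met), so one letter in a protein alignment corresponds to three letters in the underlying coding sequence.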

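In the middle line, a letter means the two sequences have the same amino acid at that position. The "+" sign means the amino acids are different but similar (same functional family), so the substitution is a conservative one that scores positively. A blank space in the middle line means a mismatch. A gap (an insertion or deletion) shows up as a dash "-" in the Query or Sbjct line.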
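Here is a rough Python sketch of how that middle line can be rebuilt from the two sequences. It is not BLAST's actual code. The scoring table is a tiny hand-copied subset of BLOSUM62 (only the three substitutions that occur in this alignment), and treating any pair missing from the table as a non-positive score is my own simplifying assumption. The name `match_line` is hypothetical.

```python
# Tiny subset of BLOSUM62 scores, just enough for this one alignment.
# A real tool would load the full 20x20 matrix.
BLOSUM62_SUBSET = {
    frozenset("TS"): 1,   # Thr/Ser: similar, positive score
    frozenset("VI"): 3,   # Val/Ile: similar, positive score
    frozenset("VC"): -1,  # Val/Cys: dissimilar, negative score
}


def match_line(query, subject, scores=BLOSUM62_SUBSET):
    """Build the middle line of a BLAST-style protein alignment.

    Identical residues are copied, positively scoring substitutions become
    '+', and everything else (including unknown pairs, by assumption)
    becomes a blank.
    """
    middle = []
    for q, s in zip(query, subject):
        if q == s:
            middle.append(q)
        elif scores.get(frozenset((q, s)), -1) > 0:
            middle.append("+")
        else:
            middle.append(" ")
    return "".join(middle)


query = "MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA"
subject = "MEHEKDPGWQYLRRSREQILEDQSKPYDSKKNCWIPDPEEGYLAGEITATKGDQVTIVTA"
print(match_line(query, subject))
```

For the example in the question this gives "+" at positions 15 (T vs S) and 19 (V vs I) and a blank at position 33 (V vs C).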

Query is the protein that you "BLASTed"; subject (Sbjct) is the matching protein found in the data bank. The numbers are amino acid positions at the start and end of each line (here residues 1 to 60), not a query or subject number.
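If it helps, here is a minimal Python sketch of how one of those lines breaks down into its parts. It assumes the simple four-field layout shown in the question (label, start position, sequence, end position); `parse_alignment_line` is a name I made up, not part of any BLAST library.

```python
def parse_alignment_line(line):
    """Split a line like 'Query 1 MEHEK... 60' into its four fields.

    Assumes whitespace-separated fields: label, start, sequence, end.
    """
    label, start, sequence, end = line.split()
    return label, int(start), sequence, int(end)


query_line = "Query 1 MEHEKDPGWQYLRRTREQVLEDQSKPYDSKKNVWIPDPEEGYLAGEITATKGDQVTIVTA 60"
sbjct_line = "Sbjct 1 MEHEKDPGWQYLRRSREQILEDQSKPYDSKKNCWIPDPEEGYLAGEITATKGDQVTIVTA 60"

q_label, q_start, q_seq, q_end = parse_alignment_line(query_line)
s_label, s_start, s_seq, s_end = parse_alignment_line(sbjct_line)

# The start/end numbers are residue positions, so with no gaps the span
# they cover equals the number of letters on the line.
span = q_end - q_start + 1

# Count positions where query and subject carry the same amino acid.
identities = sum(q == s for q, s in zip(q_seq, s_seq))

print(q_label, q_start, q_end, span, len(q_seq))
print(f"Identities = {identities}/{len(q_seq)}")
```

Here the query and subject both run from residue 1 to residue 60, and 57 of the 60 positions are identical.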

As for the database question: BLAST is not a database; GenBank is the database. BLAST stands for Basic Local Alignment Search Tool, so BLAST is a program that helps you align your sequence against the sequences in the database and find the best matches.

If you have any questions, don't be shy about asking. I have taken a bioinformatics course and have been using BLAST and various databases for a few years now.