|Aug4-07, 03:16 AM||#1|
DNA sequence alignment
I'm interested in learning more about DNA sequence alignment and have been reading up on the topic online.
I'm mostly interested in the Smith-Waterman algorithm for local alignment, but I'm quite confused about how it works.
I know the algorithm works on an M x N matrix, where M and N are the lengths of the two DNA sequences, but I'm not sure how the entries of the matrix are computed. Also, I keep coming across the substitution matrices PAM and BLOSUM, but I thought they're mostly used for amino acid sequences and that their entries are predetermined. So how do they fit into the Smith-Waterman algorithm, where the DNA sequences differ from one comparison to the next?
|Aug4-07, 02:49 PM||#2|
Try the NCBI blast help: http://www.ncbi.nlm.nih.gov/BLAST/Bl...TYPE=BlastDocs
Scroll down to PAM and BLOSUM substitution matrices: http://www.ncbi.nlm.nih.gov/BLAST/tu...ltschul-1.html
|Aug8-07, 11:57 PM||#3|
I have another question. Say we compare sequence A with sequences B, C and D using the Smith-Waterman algorithm, and the maximum scores for the 3 comparisons are 1, 2 and 3 respectively. Does that mean sequences A and D are the most similar and therefore the most useful for future research? If not, how do we determine which 2 sequences are the most similar?
|Dec15-09, 02:33 PM||#4|
Could anybody please provide me with a program that computes a distance matrix from DNA or protein sequences?
|Dec15-09, 03:18 PM||#6|
Your best bet is to work through this example.
If you're new to local alignment, I suggest you start with Needleman-Wunsch (global alignment) - it's simpler, and a precursor to Smith-Waterman.
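To make the matrix question concrete, here is a minimal sketch of how the Smith-Waterman score matrix gets filled in. The function name and the scoring values (match = 2, mismatch = -1, gap = -1, linear gap penalty) are assumptions I picked for illustration, not a standard. Note that the matrix is usually built with one extra row and column of zeros, so it is (M+1) x (N+1) rather than M x N.

```python
def smith_waterman_matrix(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the local-alignment score matrix for sequences a and b.

    Scoring values are illustrative assumptions; gaps use a simple
    linear penalty. Returns the matrix and the best local score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at zero: an alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] against a gap in b.
            up = H[i - 1][j] + gap
            # Align b[j-1] against a gap in a.
            left = H[i][j - 1] + gap
            # The floor at 0 is what makes the alignment local:
            # a bad prefix is dropped instead of dragging the score down.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return H, best


H, best = smith_waterman_matrix("GGACGTCC", "TTACGTAA")
print(best)
```

Each entry depends only on its upper, left, and upper-left neighbours, so the entries are computed fresh for every pair of sequences. Needleman-Wunsch fills the matrix the same way but without the floor at 0, and with gap penalties accumulated along the first row and column.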
If you're still stuck, try asking specific questions again, and I'll try to help you out.
As for substitution matrices: for DNA, substitutions between A and G (both purines) or between C and T (both pyrimidines) are typically penalized less than a purine-to-pyrimidine substitution (or vice versa), just as substitutions between phenylalanine and tyrosine are penalized less in protein matrices (similar side chains!)
The reason you come across PAM/BLOSUM is that Smith-Waterman (and Needleman-Wunsch) can be used not only for nucleotide sequence alignment, but for amino acid sequence alignment as well. The substitution matrix only supplies the score for pairing one residue with another; the alignment matrix itself is still computed anew for each pair of sequences. All that being said, you really ought to ignore substitution matrices for now.
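If it helps, the purine/pyrimidine idea can be sketched as a tiny scoring function. The name nucleotide_score and the numbers (match = 1, same-class substitution = -1, cross-class substitution = -2) are hypothetical values chosen only to show the ordering, not taken from any published matrix.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def nucleotide_score(x, y, match=1, transition=-1, transversion=-2):
    """Score a pair of nucleotides (illustrative values only).

    Purine<->purine or pyrimidine<->pyrimidine substitutions are
    penalized less than purine<->pyrimidine substitutions.
    """
    if x == y:
        return match
    pair = {x, y}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return transition if same_class else transversion


print(nucleotide_score("A", "G"), nucleotide_score("A", "C"))
```

A function like this would simply take the place of a fixed match/mismatch score when filling the alignment matrix.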