DNA Sequence Alignment: Understanding Smith-Waterman Algorithm

wu_weidong · Aug 4, 2007

Hi all,
I'm interested in learning more about DNA sequence alignment and have been reading up on the topic online.

I'm more interested in the Smith-Waterman algorithm for local alignment, but I'm quite confused about how the algorithm works.

I know the algorithm works on an M×N matrix, where M and N are the lengths of the two DNA sequences, but I'm not sure how the entries of the matrix are computed. Also, I keep coming across the substitution matrices PAM and BLOSUM, but I thought they were mostly used for amino acid sequences and that their entries are predetermined. So how do they fit into the Smith-Waterman algorithm, where the DNA sequences differ from one comparison to the next?

Thank you.

Regards,
Rayne

Monique · Aug 4, 2007

Try the NCBI blast help: http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs

Scroll down to PAM and BLOSUM substitution matrices: http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

wu_weidong · Aug 8, 2007

I have another question. Say we compare sequence A with sequences B, C and D using the Smith-Waterman algorithm, and the maximum scores for the 3 comparisons are 1, 2 and 3 respectively. Does that mean sequences A and D are the most similar and therefore the most useful for future research? If not, how do we determine which 2 sequences are the most similar?

Thanks.

sundus · Dec 15, 2009

Hi,
Could anybody please provide me with a program which computes a distance matrix from DNA or protein sequences?

farful · Dec 15, 2009

http://www.megasoftware.net/
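
If you just want to see what such a program computes, here is a minimal sketch of the simplest case, not a substitute for a full package. It assumes the sequences are already aligned to the same length and uses the p-distance (the fraction of positions that differ). The function names `p_distance` and `distance_matrix` are made up for this illustration.

```python
def p_distance(s, t):
    """Fraction of positions at which two pre-aligned sequences differ."""
    # Assumption: both sequences are already aligned, so lengths must match.
    if len(s) != len(t) or len(s) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(1 for x, y in zip(s, t) if x != y)
    return diffs / len(s)


def distance_matrix(seqs):
    """Symmetric matrix of pairwise p-distances (list of lists)."""
    n = len(seqs)
    return [[p_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for row in distance_matrix(["ACGT", "ACGA", "TCGA"]):
        print(row)
```

Real tools typically correct the raw p-distance for multiple substitutions at the same site, which this sketch does not attempt.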

farful · Dec 15, 2009

http://en.wikipedia.org/wiki/Smith-Waterman_algorithm#Example

Your best bet is to work through this example.
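
To show where the matrix entries come from, here is a minimal sketch of the matrix fill in Python. The scoring values (match = 2, mismatch = -1, linear gap penalty = -1) are illustrative assumptions, not necessarily the ones used in the linked example, and the function name `smith_waterman` is just a label for this sketch. Note that the matrix has one extra row and column of zeros, so it is (M+1) x (N+1).

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the local-alignment score matrix H.

    Returns (H, best_score, best_cell). Scoring values are assumptions
    chosen for illustration; a linear gap penalty is used.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at zero: a local alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a fresh local alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # a[i-1] aligned to a gap
                H[i][j - 1] + gap,    # b[j-1] aligned to a gap
            )
            if H[i][j] > best:
                best, best_cell = H[i][j], (i, j)
    return H, best, best_cell


if __name__ == "__main__":
    H, score, cell = smith_waterman("TTACGTT", "GGACGGG")
    print(score, cell)
```

Each entry depends only on its upper, left and upper-left neighbours. The largest entry is the local alignment score, and tracing back from that cell until you hit a zero recovers the aligned region.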

If you're new to sequence alignment, I suggest you start with Needleman-Wunsch (global alignment) - it's simpler, and a precursor to Smith-Waterman.
http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm

If you're still stuck, try asking specific questions again, and I'll try to help you out.

As for substitution matrices - for nucleotides, substitutions within the purines (A and G) or within the pyrimidines (C and T) are penalized less than a substitution of a purine for a pyrimidine (or vice versa). It's the same idea as for amino acids, where substitutions between phenylalanine and tyrosine are penalized less (similar side chains!).
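
To make the purine/pyrimidine idea concrete, here is a minimal sketch of a nucleotide scoring function. The actual numbers (2, -1, -2) are assumptions picked for illustration, and `nucleotide_score` is a made-up name. Unlike PAM/BLOSUM, nothing here is derived from data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def nucleotide_score(x, y, match=2, transition=-1, transversion=-2):
    """Score one aligned pair of nucleotides (illustrative values only)."""
    if x == y:
        return match
    # Purine<->purine or pyrimidine<->pyrimidine: the milder penalty.
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return transition if same_class else transversion


if __name__ == "__main__":
    for pair in [("A", "A"), ("A", "G"), ("C", "T"), ("A", "C")]:
        print(pair, nucleotide_score(*pair))
```

A function like this simply replaces the flat match/mismatch score when the alignment matrix is filled; the scores are fixed per pair of letters, so they do not depend on which sequences are being compared.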

You come across PAM/BLOSUM because Smith-Waterman (and Needleman-Wunsch) can be used not only for nucleotide sequence alignment, but for amino acid sequence alignment as well. All that being said, you really ought to ignore substitution matrices for now.
