DNA sequencing and restoring malformed sequences

  • Thread starter: Atran
  • Tags: DNA sequences
SUMMARY

The discussion centers on DNA sequencing, specifically modeling DNA as an ordered sequence of nucleobases. The author describes a random sequence and a malformed sequence, highlighting the positions of malformation. They assert that restoring the malformed sequence is computationally straightforward, particularly when visualizing the sequences in parallel. Additionally, they reference the complexity of genome assembly from overlapping fragments as a significant computational challenge in the field.

PREREQUISITES
  • Understanding of DNA structure and nucleobases
  • Familiarity with computational complexity, particularly NP-completeness
  • Knowledge of genome sequencing techniques
  • Basic skills in algorithm design and analysis
NEXT STEPS
  • Research DNA sequence alignment algorithms
  • Explore genome assembly methods, particularly de Bruijn graphs
  • Learn about mutation detection techniques in bioinformatics
  • Investigate the implications of NP-completeness in biological computations
USEFUL FOR

This discussion is beneficial for bioinformaticians, computational biologists, and researchers involved in DNA sequencing and genome assembly challenges.

Atran
I was just reading about DNA sequencing. In my view, DNA can be modeled as an ordered sequence of nucleobases, as if the two strands were joined into a single strand (just like in RNA). The first half of the sequence models the first strand. The four nucleobases are numbered from 0 to 3. So a random sequence S might look like (0 1 2 0 1 1 0 3 ...). The length of the sequence is on the order of billions.

Now assume a malformed copy of the same sequence: (0 0 2 0 1 1 3 3 ...). It differs from S at positions 1 and 6 (counting from 0). With the two sequences laid out in parallel, finding and restoring the malformed positions would be an easy computational task.
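
To make the comparison concrete, here is a minimal Python sketch. It assumes both sequences have the same length and are already aligned position by position (substitutions only, no insertions or deletions). The function names `mismatch_positions` and `restore` are made up for illustration.

```python
def mismatch_positions(reference, observed):
    """Return the 0-based indices where two equal-length sequences differ."""
    if len(reference) != len(observed):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(reference, observed)) if a != b]


def restore(reference, observed):
    """Copy the reference base into every position where observed differs."""
    fixed = list(observed)
    for i in mismatch_positions(reference, observed):
        fixed[i] = reference[i]
    return fixed


# The first eight bases of the two sequences from the post.
S = [0, 1, 2, 0, 1, 1, 0, 3]
malformed = [0, 0, 2, 0, 1, 1, 3, 3]

print(mismatch_positions(S, malformed))  # [1, 6]
print(restore(S, malformed) == S)        # True
```

Both functions make a single pass over the sequence, so under these assumptions the work grows linearly with the length, even for billions of bases.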

Am I missing something? I looked into this because I heard sequencing mentioned in the context of NP-completeness in my computational complexity class.
 
I don't think recognizing mutations is a computationally hard problem (provided the mutation rate is sufficiently low). In genome sequencing, though, one computationally hard problem that had to be addressed is assembling a full DNA sequence from many overlapping short fragments of that sequence. Here's a good source that describes it: http://www.cs.cmu.edu/afs/cs/academic/class/15210-s15/www/lectures/genome-notes.pdf
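
As a rough illustration of the assembly problem, here is a small Python sketch of a greedy overlap-merge heuristic. One common formalization of fragment assembly is the shortest common superstring problem, which is NP-hard. The greedy approach below is only a heuristic and is not guaranteed to find the shortest result. The sketch assumes error-free fragments written as strings over the digits 0 to 3 (the encoding from the first post), and the names `overlap` and `greedy_assemble` are hypothetical. Whether the linked notes use this exact approach is not something the thread states.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Join the best pair, writing the shared part only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags[0] if frags else ""


# Three overlapping reads taken from the sequence 0 1 2 0 1 1 0 3.
reads = ["01201", "20110", "1103"]
print(greedy_assemble(reads))  # 01201103
```

With real data the difficulty comes from repeats, read errors, and the sheer number of fragments, none of which this toy example captures.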
 
