How many copies of shredded text are needed to get 99.9% reconstruction of mutated text?

  • Thread starter Simfish
In summary, NASA astrobiologist Chris McKay is interested in reconstructing ancient genomes that have mutated over billions of years due to internal radioactivity. To do this, he proposes shredding a text of length N into pieces of average length X and determining the number of copies C needed to achieve a 99.9% reconstruction rate. However, this may not be a well-defined mathematical problem without information on how to recognize which fragments of text should go together.
  • #1
Simfish
This question was posed by NASA astrobiologist Chris McKay, who wants to put life back together from its pieces. He really wants an answer, since it could help us reconstruct ancient genomes that are billions of years old but have mutated through years of internal radioactivity.

- A text of length N composed of an alphabet of m distinct units (e.g., Hamlet has 29551 words and 60 distinct units: 52 letters, space, plus 7 punctuation marks)

- Shred the text into pieces of average length X

- How many copies C of the shredded text are needed to give 99.9% reconstruction?
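To get a feel for the scale of C, here is a rough sketch under strong simplifying assumptions that are not part of the question as posed: we already know where every fragment belongs (so the assembly step is ignored), each unit in each copy is mutated independently with some probability `p_mut` (a made-up parameter), and each position is reconstructed by majority vote over the C copies. "99.9% reconstruction" is read here as the expected fraction of positions recovered correctly. Requiring a strict majority of unmutated copies gives a conservative estimate, since mutated copies need not agree with each other.

```python
from math import comb
from typing import Optional


def majority_correct_prob(copies: int, p_mut: float) -> float:
    """Probability that strictly more than half of `copies` are unmutated
    at a single position (each copy mutated independently with p_mut)."""
    need = copies // 2 + 1
    return sum(
        comb(copies, k) * (1 - p_mut) ** k * p_mut ** (copies - k)
        for k in range(need, copies + 1)
    )


def min_copies(p_mut: float, target: float = 0.999,
               max_copies: int = 1001) -> Optional[int]:
    """Smallest odd number of copies whose per-position majority vote is
    correct with probability >= target, or None if not found.
    Odd counts are used so that a two-way tie cannot occur."""
    for c in range(1, max_copies + 1, 2):
        if majority_correct_prob(c, p_mut) >= target:
            return c
    return None


if __name__ == "__main__":
    # p_mut values are illustrative assumptions, not measured rates.
    for p in (0.001, 0.01, 0.1):
        print(p, min_copies(p))
```

Note that N, m and X do not appear in this estimate at all. They matter for the part this sketch skips, namely working out which fragments belong together.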
 
  • #2
I don't think this is a well-defined mathematical problem unless you have information about how we can recognize that two fragments of text go next to each other. For example, does "invited for a" go with "a little drink" or "a necktie party"?
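A small sketch of that ambiguity, assuming the only evidence for adjacency is a shared suffix/prefix between fragments (the `overlap` helper is hypothetical, written just for this illustration):

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


fragment = "invited for a"
candidates = ["a little drink", "a necktie party"]

# Score each candidate continuation by how much it overlaps the fragment.
scores = {c: overlap(fragment, c) for c in candidates}

if __name__ == "__main__":
    for candidate, score in scores.items():
        print(f"{candidate!r}: overlap = {score}")
```

Both candidates get the same score, so overlap alone cannot decide between them. Some extra rule or extra information is needed.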
 

1. What is the purpose of needing 99.9% reconstruction of mutated text?

The purpose of needing 99.9% reconstruction of mutated text is to accurately and effectively understand the changes that have occurred in the original text due to mutations. This is important in fields such as genetics, where even small changes in genetic code can have significant impacts on an organism.

2. How is the percentage of 99.9% determined for reconstruction of mutated text?

The 99.9% figure is determined by comparing the reconstructed text to the original text and calculating the percentage of matching characters. This means that out of every 1,000 characters, at least 999 must match in order to achieve the desired level of reconstruction.
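As a minimal sketch of that comparison, assuming the two texts have the same length and differ only by substitutions (insertions or deletions would need an alignment step first), and using a hypothetical helper name `match_percentage`:

```python
def match_percentage(original: str, reconstructed: str) -> float:
    """Percentage of positions at which the two texts agree.

    Assumes equal-length texts (substitutions only, no insertions/deletions).
    """
    if len(original) != len(reconstructed):
        raise ValueError("texts must have the same length")
    if not original:
        return 100.0
    matches = sum(a == b for a, b in zip(original, reconstructed))
    return 100.0 * matches / len(original)


if __name__ == "__main__":
    original = "a" * 1000
    reconstructed = "a" * 999 + "b"  # one wrong character in 1,000
    print(match_percentage(original, reconstructed))
```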

3. What methods are used to reconstruct mutated text?

There are various methods that can be used to reconstruct mutated text, including DNA sequencing, gene editing techniques, and computer algorithms. These methods may vary depending on the type of mutation and the complexity of the text being reconstructed.

4. Is 99.9% reconstruction of mutated text considered perfect?

No, 99.9% reconstruction of mutated text is not considered perfect. While it is a high level of reconstruction, it still allows for a small margin of error. However, this level of reconstruction is often considered sufficient for most scientific purposes.

5. What are the potential limitations of achieving 99.9% reconstruction of mutated text?

The potential limitations of achieving 99.9% reconstruction of mutated text can include technical errors, such as sequencing errors or limitations in the technology used, as well as the complexity of the mutated text itself. Additionally, mutations can occur in areas that are difficult to reconstruct, leading to potential gaps in the reconstructed text.
