How Many DNA Molecules to Sample for Sufficient Unique Sequences?

1eray · Aug 29, 2011

I have 2x10^12 unique sequences of DNA, and I have an average of 47 copies of each sequence (so 94x10^12 DNA molecules total).
How many molecules do I need to choose at random to be "confident" (defined as you please) that I have at least 10^10 unique sequences? 10^11? 10^12?
I would really like to know how to do this calculation.
Any help would be greatly appreciated.
Thanks,
Ed

bpet · Aug 31, 2011

If the number of copies of each sequence is not exactly 47, the answer could vary wildly. Consider the extreme case where every sequence but one has a single copy and the one remaining sequence accounts for all the other molecules.

The first step would be to show that the number required lies somewhere between the number required when there is exactly 1 copy of each sequence (trivial) and the number required when there are infinitely many copies of each (which is a version of the coupon collector's problem).
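To make the two extremes concrete, here is a sketch under an assumption the question does not guarantee: every sequence has exactly m copies and molecules are drawn without replacement. The symbols N (number of distinct sequences), n (number of molecules drawn) and D_n (number of distinct sequences among them) are introduced here for illustration only. A given sequence is absent from the sample when all n draws avoid its m copies, which gives the expected count:

```latex
\mathbb{E}[D_n] \;=\; N\left(1 - \frac{\binom{Nm-m}{n}}{\binom{Nm}{n}}\right),
\qquad
m = 1:\ \mathbb{E}[D_n] = n,
\qquad
m \to \infty:\ \mathbb{E}[D_n] \to N\left(1 - \left(1 - \tfrac{1}{N}\right)^{n}\right).
```

With m = 1 every draw is new, so n = k draws give exactly k distinct sequences with certainty. Intuitively, the case m = 47 should sit between the two limits, though that is the claim still to be shown.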

To solve the latter problem you'd use the same techniques as for the coupon collector's problem, but truncate the sums appropriately, since you want only a given number of distinct sequences rather than all of them. In effect you're modelling how the number of distinct sequences found increases (randomly) as you add one more molecule to the sample. Then, for example, apply Chebyshev's inequality using the 1st and 2nd moments of that distribution as a function of the sample size, which would give you a very conservative estimate of the minimum number required.
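A minimal Python sketch of the Chebyshev step, for the infinitely-many-copies case only (each draw is uniform over N sequences, independent of earlier draws). The function names are hypothetical. One simplification is an assumption of this sketch rather than part of the suggestion above: instead of the exact second moment it uses the upper bound Var[D_n] <= N q (1 - q) with q = (1 - 1/N)^n, which holds because the "sequence i is missing" indicators are negatively correlated. That avoids catastrophic cancellation when N is around 10^12.

```python
import math

def distinct_mean(N, n):
    """E[D_n] = N * (1 - (1 - 1/N)**n), computed stably for huge N."""
    return -N * math.expm1(n * math.log1p(-1.0 / N))

def distinct_var_bound(N, n):
    """Upper bound N*q*(1-q) on Var[D_n], with q = (1 - 1/N)**n.
    This is a bound (assumed here for simplicity), not the exact variance."""
    log_q = n * math.log1p(-1.0 / N)
    return N * math.exp(log_q) * (-math.expm1(log_q))

def failure_bound(N, n, k):
    """Chebyshev bound on P(D_n < k); returns 1.0 when it says nothing."""
    gap = distinct_mean(N, n) - k
    if gap <= 0:
        return 1.0
    return min(1.0, distinct_var_bound(N, n) / gap ** 2)

def min_draws(N, k, delta):
    """Smallest n whose Chebyshev failure bound is <= delta (0 < k < N)."""
    if not (0 < k < N) or not (0 < delta < 1):
        raise ValueError("need 0 < k < N and 0 < delta < 1")
    lo, hi = k, 2 * k          # n = k always fails: the mean is at most k
    while failure_bound(N, hi, k) > delta:
        lo, hi = hi, 2 * hi    # grow until the bound is satisfied
    while hi - lo > 1:         # bisect: lo fails, hi passes
        mid = (lo + hi) // 2
        if failure_bound(N, mid, k) <= delta:
            hi = mid
        else:
            lo = mid
    return hi
```

For the numbers in the question one could call `min_draws(2 * 10**12, 10**10, 0.01)`, taking "confident" to mean a failure probability of at most 1% (a choice, since the question leaves it open). When k is much smaller than N the result sits only slightly above k, because duplicates are rare early on.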
