Any good references on DNA profiling statistical methods?

indio007 · Jan 28, 2016

I'm looking for papers or other references on DNA profiling statistics. Something comprehensive that covers the whole shebang.

I'm kind of troubled by some of the match probabilities being linked to "race", i.e. Caucasian.
I'm not even sure what Caucasian means scientifically.
What happens when you combine a subjective measure with what is supposedly a random set?

Is the result still random?

It seems like the entire process of DNA profiling has probabilities involved, and then those preceding probabilities (errors etc.) are discarded when making the final judgment that, say, one in a million people will match some reference sample.

This final judgment is based on FBI tables which are subdivided into race.

indio007 · Jan 30, 2016

Ok ... kind of disturbing that no one can answer this.

How about this:
Does anyone have an opinion on the mathematical rigor of DNA profiling?

WWGD · Jan 30, 2016

Why don't you give people some more time? The thread is just 2 days old.

WWGD · Jan 30, 2016

Are you referring to, or looking for, fallacies and biases in profiling? If so, there is the fallacy of using DNA matching without considering the context: a match should be weighed only among the people who are considered suspects.

indio007 · Jan 30, 2016

I don't want to get into a specific instance.
I will spell out the issue.
Here's a statement from a DNA report:

The probability of randomly selecting an unrelated individual
that could have contributed to this mixture is 1 person in 1
billion in the Caucasian and Southwestern Hispanic populations
and 1 person in 2 billion in the African American and
Southeastern Hispanic populations.
These conclusions are based on population statistics derived from
a database of unrelated Caucasian, African American, Southeastern
Hispanic and Southwestern Hispanic populations obtained from the
Federal Bureau of Investigation (FBI)

The problem is that Caucasian, African American, Southeastern Hispanic and Southwestern Hispanic are arbitrary categories. They are social constructs, i.e. there is no "race" test they can give.

What is the effect of an arbitrary choice on the statistics?
Say they wanted to profile pizza lovers. Now, there is no correlation between pizza and DNA (I think :D).
However, you can still generate a table and probabilities out of this arbitrary choice and start convicting people by saying in a DNA report:

The probability of randomly selecting an unrelated individual
that could have contributed to this mixture is 1 person in 1
billion in the pizza lover population.
These conclusions are based on food preference derived from
a database of unrelated food eaters from the
Federal Bureau of Investigation (FBI).

A couple of take-out slips in evidence showing the suspect ordering pizza and you're golden.

The bottom line is: what effect does an arbitrary choice have on the presumption of randomness? It reminds me of post-selection in QP, in a way.

Also, what do CUMULATIVE error rates in the physical processes (PCR, mixed samples, electrophoresis, analyzer sensitivity to dye fluorescence) do to the final probability?

All I can really find are things from law enforcement and gov't.
I just want like maybe a primer that joins the math to the process of actually doing it.

It seems to me there are a lot of assumptions and.

WWGD · Jan 30, 2016

Sorry, I can't think of anything other than searching for info in the FBI's database. Maybe they obtained this data by studying separate 'populations' (where by 'population' I mean the construct used by the FBI). As to accumulated errors, I would hope many samples are taken from the same person and a test is made for each to avoid mistakes. I don't see how to say anything more specific without knowing the details of how the samples are taken and how the analysis is done, sorry.

mfb · Jan 31, 2016

indio007 said:

What is the effect of an arbitrary choice on the statistics.

As long as the choice is consistent, this is no problem.
If (!) there is no correlation between DNA and love of pizza, you'll get the same number for both groups (within statistical uncertainties), therefore you cannot make any statement about pizza consumption just based on the DNA.
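
A minimal sketch of that point, as a toy simulation. Everything in it is a made-up assumption for illustration (a single genotype with frequency 0.05, 200,000 people, a 50/50 "pizza lover" label, the function name `estimate_group_frequencies`); it has nothing to do with the actual FBI tables. The label is drawn independently of the genotype, so the two estimated frequencies should agree within their standard errors:

```python
import math
import random

def estimate_group_frequencies(n_people=200_000, genotype_freq=0.05,
                               label_prob=0.5, seed=0):
    """Estimate how often a genotype occurs in two arbitrarily labelled groups.

    The label (e.g. "pizza lover") is drawn independently of the genotype,
    which is the "no correlation" assumption. All numbers are hypothetical.
    Returns {label: (estimated frequency, standard error)}.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # label -> [carriers, group size]
    for _ in range(n_people):
        has_genotype = rng.random() < genotype_freq
        label = rng.random() < label_prob  # independent of the genotype
        counts[label][0] += has_genotype
        counts[label][1] += 1
    result = {}
    for label, (carriers, size) in counts.items():
        p = carriers / size
        se = math.sqrt(p * (1 - p) / size)  # binomial standard error
        result[label] = (p, se)
    return result

freqs = estimate_group_frequencies()
for label, (p, se) in freqs.items():
    name = "pizza lovers" if label else "everyone else"
    print(f"{name}: {p:.4f} +/- {se:.4f}")
```

If the label were correlated with the genotype instead, the two estimates would drift apart by more than the quoted uncertainties, and only then would the grouping carry any information.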
