Any good references on DNA profiling statistical methods?

Discussion Overview

The discussion revolves around the statistical methods used in DNA profiling, particularly focusing on the implications of race in match probabilities and the potential biases introduced by arbitrary classifications. Participants explore the mathematical rigor and assumptions underlying DNA profiling statistics, as well as the cumulative error rates in the processes involved.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants express concern about the link between match probabilities and race, questioning the scientific validity of racial classifications like "Caucasian."
  • There is a discussion on whether combining subjective measures with random sets affects the randomness of results in DNA profiling.
  • One participant highlights the arbitrary nature of racial categories used in DNA reports and compares it to hypothetical scenarios, such as profiling based on food preferences.
  • Concerns are raised about the cumulative error rates in the physical processes involved in DNA analysis, including PCR and electrophoresis, and how these might affect final probabilities.
  • Another participant suggests that as long as the arbitrary choice is consistent, it may not pose a problem for statistical analysis, provided there is no correlation between the chosen category and the DNA.

Areas of Agreement / Disagreement

Participants express differing views on the implications of using race in DNA profiling statistics, with some questioning its validity and others suggesting consistency in arbitrary choices may mitigate issues. The discussion remains unresolved regarding the impact of these factors on the reliability of DNA profiling.

Contextual Notes

Participants note the limitations of existing literature, primarily sourced from law enforcement and government databases, and express a desire for more comprehensive resources that connect mathematical principles to the practical aspects of DNA profiling.

indio007
I'm looking for papers or other references on DNA profiling statistics. Something comprehensive that covers the whole shebang.

I'm kind of troubled by some of the match probabilities being linked to "race", e.g. Caucasian.
I'm not even sure what Caucasian means scientifically.
What happens when you combine a subjective measure with what is supposedly a random set?

Is the result still random?

It seems like the entire process of DNA profiling has probabilities involved, but those preceding probabilities (errors etc.) are discarded when making the final judgment that, say, one in a million people will match some reference sample.

This final judgment is based on FBI tables which are subdivided into race.
 
OK... kind of disturbing that no one can answer this.

How about this:
Does anyone have an opinion on the mathematical rigor of DNA profiling?
 
Why don't you give people some more time? The thread is just 2 days old.
 
Are you referring to, or looking for, fallacies and biases in profiling? There is, for example, the fallacy of using DNA matching without considering the context: a match should be considered only among people who are already considered suspects.
 
I don't want to get into a specific instance.
I will spell out the issue.
Here's a statement from a DNA report:

The probability of randomly selecting an unrelated individual
that could have contributed to this mixture is 1 person in 1
billion in the Caucasian and Southwestern Hispanic populations
and 1 person in 2 billion in the African American and
Southeastern Hispanic populations.
These conclusions are based on population statistics derived from
a database of unrelated Caucasian, African American, Southeastern
Hispanic and Southwestern Hispanic populations obtained from the
Federal Bureau of Investigation (FBI).

The problem is that Caucasian, African American, Southeastern Hispanic and Southwestern Hispanic are arbitrary. They are social constructs, i.e. there is no "race" test they can give.
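To make concrete why the choice of table matters: for a single-source profile, the textbook calculation is the product rule. Under Hardy-Weinberg assumptions, the genotype frequency at each locus is p^2 for a homozygote and 2pq for a heterozygote, and the per-locus values are multiplied together on the assumption that the loci are independent. Here is a minimal sketch of that calculation. The locus names, alleles, frequencies and group names are all made up for illustration and are not FBI data, and the mixture statistic quoted in the report is computed differently, although it depends on the frequency table in the same way.

```python
from math import prod

# Hypothetical allele frequencies per locus for two database groups.
# All numbers are invented for illustration only.
FREQUENCY_TABLES = {
    "group_A": {"locus1": {"12": 0.10, "14": 0.20},
                "locus2": {"8": 0.05, "9": 0.30},
                "locus3": {"17": 0.15}},
    "group_B": {"locus1": {"12": 0.05, "14": 0.20},
                "locus2": {"8": 0.05, "9": 0.30},
                "locus3": {"17": 0.15}},
}

# Hypothetical profile: the two alleles observed at each locus.
PROFILE = {"locus1": ("12", "14"), "locus2": ("8", "9"), "locus3": ("17", "17")}


def genotype_frequency(a, b, locus_freqs):
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, 2pq otherwise."""
    p, q = locus_freqs[a], locus_freqs[b]
    return p * p if a == b else 2 * p * q


def random_match_probability(profile, table):
    """Product rule: multiply genotype frequencies across loci (assumes independence)."""
    return prod(genotype_frequency(a, b, table[locus])
                for locus, (a, b) in profile.items())


for group, table in FREQUENCY_TABLES.items():
    rmp = random_match_probability(PROFILE, table)
    print(f"{group}: about 1 in {1 / rmp:,.0f}")
```

The two made-up tables differ in a single allele frequency (0.10 versus 0.05), and that alone changes the reported figure by a factor of two. So whichever table gets picked feeds straight into the headline number.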

What is the effect of an arbitrary choice on the statistics?
Say they wanted to profile pizza lovers. Now, there is no correlation between pizza and DNA (I think :D).
However, you can still generate a table and probabilities out of this arbitrary choice and start convicting people by saying in a DNA report:

The probability of randomly selecting an unrelated individual
that could have contributed to this mixture is 1 person in 1
billion in the pizza lover population.
These conclusions are based on food preference derived from
a database of unrelated food eaters from the
Federal Bureau of Investigation (FBI).

A couple of take-out slips in evidence showing the suspect ordering pizza and you're golden.

The bottom line is: what effect does an arbitrary choice have on the presumption of randomness? It reminds me of post-selection in QP in a way.

Also, what do CUMULATIVE error rates in the physical processes (PCR, mixed samples, electrophoresis, analyzer sensitivities to dye fluorescence) do to the final probability?

All I can really find are things from law enforcement and gov't.
I just want like maybe a primer that joins the math to the process of actually doing it.

It seems to me there are a lot of assumptions and.
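To put rough numbers on the cumulative-error question, here is a minimal sketch under two simplifying assumptions that are mine and not from any lab protocol: (1) the steps fail independently, so the chance that at least one fails is 1 minus the product of the per-step success rates, and (2) a false positive from the process is independent of a coincidental match, so the chance of a reported match for someone who is not the source is rmp + (1 - rmp) * error_rate. The step names and all rates are hypothetical.

```python
def any_step_fails(step_error_rates):
    """Probability that at least one step fails, assuming independent steps."""
    p_all_ok = 1.0
    for rate in step_error_rates.values():
        p_all_ok *= 1.0 - rate
    return 1.0 - p_all_ok


def false_match_probability(rmp, false_positive_rate):
    """P(reported match | person is not the source).

    Either the profiles match by coincidence (rmp), or they do not but the
    process still reports a match (false_positive_rate). Independence of the
    two events is assumed.
    """
    return rmp + (1.0 - rmp) * false_positive_rate


# Hypothetical per-step false positive contributions (invented numbers).
steps = {"PCR": 1e-5, "mixture interpretation": 5e-5, "electrophoresis": 1e-5}
combined = any_step_fails(steps)

rmp = 1e-9  # the "1 in 1 billion" style figure from a report
for fpr in (0.0, 1e-6, combined, 1e-2):
    p = false_match_probability(rmp, fpr)
    print(f"false positive rate {fpr:.2e} -> about 1 in {1 / p:,.0f}")
```

Under these assumptions, any process error rate much larger than the random match probability dominates the result. The "1 in 1 billion" figure only answers the question of coincidental matches, not the question of errors.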
 
Sorry, I can't think of anything other than searching for info in the FBI's database. Maybe they obtained this data by studying separate 'populations' (where by 'population' I mean the construct used by the FBI). As to accumulated errors, I would hope that many samples are taken from the same person and each one is tested, to avoid mistakes. I don't see how to say anything more specific without knowing the details of how the samples are taken and the analysis is done, sorry.
 
indio007 said:
What is the effect of an arbitrary choice on the statistics.
As long as the choice is consistent, this is no problem.
If (!) there is no correlation between DNA and love of pizza, you'll get the same number for both groups (within statistical uncertainties), therefore you cannot make any statement about pizza consumption just based on the DNA.
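A quick way to check this is a toy simulation. This is only a sketch with invented numbers: one locus, one allele "A" with a hypothetical population frequency of 0.2, and a "pizza lover" label that is assigned either by a coin flip or with a probability that depends on carrying the allele.

```python
import random

ALLELE_FREQ = 0.2  # hypothetical population frequency of allele "A"


def group_frequencies(label_depends_on_dna, n_people=20000, seed=1):
    """Estimate the frequency of allele "A" separately in each label group."""
    rng = random.Random(seed)
    copies = {True: 0, False: 0}  # copies of "A" seen in each group
    people = {True: 0, False: 0}  # number of people in each group
    for _ in range(n_people):
        # Each person carries two allele copies at this locus.
        n_a = sum(rng.random() < ALLELE_FREQ for _ in range(2))
        if label_depends_on_dna:
            p_label = 0.8 if n_a > 0 else 0.4  # label correlated with DNA
        else:
            p_label = 0.5                      # label is a pure coin flip
        label = rng.random() < p_label
        copies[label] += n_a
        people[label] += 1
    return {label: copies[label] / (2 * people[label]) for label in (True, False)}


for correlated in (False, True):
    freqs = group_frequencies(correlated)
    print(f"label depends on DNA: {correlated}, "
          f"lovers: {freqs[True]:.3f}, others: {freqs[False]:.3f}")
```

With the coin-flip label, both groups give estimates near 0.2 and differ only by sampling noise, so splitting the table by that label changes nothing. With the correlated label, the two groups give clearly different frequencies, and that is the only case where a per-group table carries extra information.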
 
