What do the different symbols in 23andme raw data files mean?

  • Thread starter: mycotheology
SUMMARY

The discussion centers on the interpretation of symbols in 23andMe raw data files, specifically the genotype reported for each SNP (single nucleotide polymorphism). The symbol "--" marks a position that could not be read, so no genotype is reported for that SNP. The thread starter also asks whether a single-letter genotype means that only one allele was determined; the reply does not address this point. Readings are not guaranteed to be accurate: some positions are unreadable and others are read incorrectly, and an incorrect reading can only be detected by repeating the measurement several times.

PREREQUISITES
  • Understanding of genetic terminology, specifically SNPs (Single Nucleotide Polymorphisms)
  • Familiarity with genetic testing processes, particularly saliva sample analysis
  • Basic knowledge of genotype representation and allele concepts
  • Awareness of data accuracy issues in genetic sequencing
NEXT STEPS
  • Research the implications of SNPs in genetic traits and diseases
  • Learn about the methodologies used in genetic sequencing by companies like 23andMe
  • Explore the significance of genotype accuracy and methods for validation
  • Investigate the interpretation of genetic data in personal health and ancestry contexts
USEFUL FOR

Geneticists, bioinformaticians, individuals interested in personal genomics, and anyone seeking to understand the nuances of genetic data interpretation.

mycotheology
For anyone who doesn't know, 23andMe is a company that sequences your genome for you (you send them a saliva sample). It gives you a raw data file full of SNPs. Here's an example of some SNPs from a raw data file:

rs3128126 1 962210 GG
rs2710875 1 977780 TT
rs2465136 1 990417 CT
rs2488991 1 994391 --
rs7526076 1 998395 AA
rs3934834 1 1005806 CC
rs3766192 1 1017197 CC
rs3766191 1 1017587 CC
rs9442372 1 1018704 AA
rs10907177 1 1021346 AA
rs3737728 1 1021415 AA
rs10907178 1 1021583 AA
rs11260588 1 1021658 GG
rs9442398 1 1021695 --


I have a few questions about this. Firstly, what does -- mean? Does it mean the genotype of that SNP is unknown? Secondly, some SNPs only give one letter. What does that mean? Does it mean they could only determine the genotype of one of the alleles?
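To make the question concrete, here is a minimal Python sketch that splits rows like the ones above into fields and sorts each genotype into a rough category. The column meanings (SNP ID, chromosome, position, genotype) are an assumption read off the excerpt, which shows no header row, and the names `parse` and `classify` are made up for this illustration. The "single-letter" category only labels the case asked about; it does not explain it.

```python
from collections import Counter

# First five rows of the excerpt quoted in the post.
SAMPLE = """\
rs3128126 1 962210 GG
rs2710875 1 977780 TT
rs2465136 1 990417 CT
rs2488991 1 994391 --
rs7526076 1 998395 AA
"""

def parse(text):
    """Split each non-empty, non-comment line into its four fields."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumed column order: SNP ID, chromosome, position, genotype.
        rsid, chrom, pos, genotype = line.split()
        rows.append((rsid, chrom, int(pos), genotype))
    return rows

def classify(genotype):
    """Label a genotype string by its shape only."""
    if "-" in genotype:
        return "no-call"          # e.g. "--"
    if len(genotype) == 1:
        return "single-letter"    # the case the second question asks about
    if genotype[0] == genotype[1]:
        return "homozygous"       # e.g. "GG"
    return "heterozygous"         # e.g. "CT"

rows = parse(SAMPLE)
counts = Counter(classify(genotype) for _, _, _, genotype in rows)
print(counts)
```

Running the same tally over a whole file gives a quick sense of how many SNPs came back as "--" compared with ordinary two-letter genotypes.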
 
The test cannot determine all SNPs with 100% accuracy. At some random positions the reading is incorrect, or the position is not readable at all. In the latter case the SNP is simply not reported, which is what the "--" entries in your file indicate. If a wrong reading has been made, on the other hand, you will never know unless you repeat the measurement several times.
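The idea of repeating the measurement can be sketched as a comparison between two runs of the same sample. This is only an illustration: `compare_runs` is a hypothetical helper, the first dictionary reuses genotypes from the excerpt in the question, and the second dictionary contains invented values standing in for a repeat measurement. The letter order within a genotype is assumed not to matter, so "CT" and "TC" count as agreeing.

```python
def compare_runs(run_a, run_b):
    """Sort SNP IDs into agreeing, disagreeing, and not comparable."""
    agree, disagree, missing = [], [], []
    for rsid in sorted(set(run_a) | set(run_b)):
        a = run_a.get(rsid, "--")
        b = run_b.get(rsid, "--")
        if "-" in a or "-" in b:
            # At least one run has no reading, so nothing can be compared.
            missing.append(rsid)
        elif sorted(a) == sorted(b):
            agree.append(rsid)
        else:
            disagree.append(rsid)
    return agree, disagree, missing

# Genotypes taken from the excerpt in the question.
run_a = {
    "rs3128126": "GG",
    "rs2710875": "TT",
    "rs2465136": "CT",
    "rs2488991": "--",
}

# Invented values for a hypothetical repeat measurement.
run_b = {
    "rs3128126": "GG",
    "rs2710875": "CT",
    "rs2465136": "TC",
    "rs2488991": "AG",
}

agree, disagree, missing = compare_runs(run_a, run_b)
print("agree:", agree)
print("disagree:", disagree)
print("missing in at least one run:", missing)
```

A disagreement between two runs only shows that at least one of the readings is wrong; deciding which one would take further repeats.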
 
