DNA Data

  1. Jan 29, 2010 #1

    Char. Limit

    Gold Member

    Is it possible to find out how many bytes of data one chromosome (say, Chromosome 1 for humans) carries?

    Does the question have any meaning?
     
  3. Jan 29, 2010 #2
    That would depend on how you would define a 'byte' of data as it pertains to genomic information, but chromosome 1 has about 224 million base pairs.
     
  4. Jan 29, 2010 #3

    Ygggdrasil

    Science Advisor

    What kind of information are you counting? The DNA sequence is one part of the information carried on the chromosome, and it's fairly easy to quantify the amount of information present there. However, you could also argue that the way in which the DNA is packaged into chromatin, or the patterns of DNA methylation, also constitute information.
     
  5. Jan 29, 2010 #4

    Char. Limit

    Gold Member

    I'm not sure; I thought the information was only in the actual DNA sequence.
     
  6. Jan 30, 2010 #5

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    Information is also on top of the DNA, like Ygggdrasil mentioned.

    I'm not sure why you want to express it as bytes (I have a worm database, which contains all the genomic sequence and specifies the locations of genetic elements, and it is 1520 MB in size). It is better to express it in functional units, such as genes, CNVs, or SNPs: the database contains 24,114 curated coding sequences.
     
  7. Jan 30, 2010 #6

    Char. Limit

    Gold Member

    I want to express it in bytes for the same reason I like to convert distances in light-years to kilometers: to give me a reference.
     
  8. Jan 30, 2010 #7

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    You don't express length or weight in bytes, do you?
     
  9. Jan 30, 2010 #8
    You might want to express it in bytes if you are looking at it from an information theory (http://en.wikipedia.org/wiki/Information_theory) perspective.
     
    Last edited by a moderator: May 4, 2017
  10. Jan 30, 2010 #9

    sylas

    Science Advisor

    The raw number of base pairs is still not necessarily the best information theory perspective; but that's a quibble.

    The simplest answer for the human chromosome 1 is a bit under 62 Mbytes. There are about 247 million base pairs, and each base pair has four possibilities. So if you just take it as a sequence of units, each unit being two bits, you end up with nearly 62 million bytes.
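
    The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration under the post's assumptions (about 247 million base pairs, two bits per base pair, no compression); the function name `sequence_bytes` is made up for the example.

```python
def sequence_bytes(base_pairs: int, bits_per_base: int = 2) -> float:
    """Raw storage for a sequence, ignoring compression and epigenetic layers."""
    return base_pairs * bits_per_base / 8  # 8 bits per byte


# Approximate chromosome 1 length quoted in the post (an estimate, not exact).
chromosome_1_bp = 247_000_000

size_mb = sequence_bytes(chromosome_1_bp) / 1e6  # decimal megabytes
print(f"{size_mb:.2f} MB")
```

    With these inputs the result comes out a little under 62 million bytes, which matches the figure given above.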

    Cheers -- sylas
     
    Last edited by a moderator: May 4, 2017
  11. Jan 30, 2010 #10

    Char. Limit

    Gold Member

    I see. Thank you, Sylas.

    Also, to Monique, I would, of course, not express a length or weight with a unit of information.
     
  12. Jan 31, 2010 #11

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    You are being smart, but you've still not explained why you would want to know the number of bytes over simply the number of units (kilometers/basepairs/genes). You yourself stated that you questioned whether your question has any meaning.
     
  13. Jan 31, 2010 #12

    Char. Limit

    Gold Member

    I did indeed, because I wasn't sure if DNA could be said to carry data in the sense of bytes.

    The bytes are to give me a reference point. Can a chromosome carry more data than, say, an advanced computer?

    At nearly 62 MB, no way.
     
  14. Jan 31, 2010 #13

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    Alright, fair enough. Just note that it is not a valid comparison. You are comparing an arbitrary unit (one chromosome) to an advanced computer (carrying how many chips?). It is not for nothing that people are turning to DNA as the future of computing: you can store a lot of information on a very small scale.

    A typical diploid human cell has "1.5 GB of data" stored in a nucleus with a diameter of approximately 6 micrometers (only taking into account the base pairing of the DNA).

    According to Leonard Adleman, a pioneer in DNA computing:

    Also note that the biological information in DNA is contained on many layers (for instance epigenetic information): it is not simply a linear sequence of letters.
     
    Last edited: Feb 2, 2010
  15. Jan 31, 2010 #14
    I believe the benefit of DNA computing would be the massive amount of parallelization. This can be good for certain types of problems, like cracking encryption by brute force. But for general-purpose computing, where serial calculations are required, it's probably not very useful.
     
  16. Feb 1, 2010 #15
    Dear Monique, a nucleus has far, far more data stored than 750 MB. The DNA backbone alone (not considering DNA methylation) has about 3.3 GB in its haploid form for human DNA (one byte being 2 bits of information: A=T, G=C and T=A, C=G). Most of the information is stored in non-coding stretches (98%), which constitute the so-called regulatory network of information (the higher an organism, the more non-coding DNA its genome carries). If you consider RNA and proteins in the 3D space of a cell, which is usually highly structured, the stored information of a single cell (or nucleus) is easily in the low terabytes.

    All life forms are self-encoded, self-assembling, self-propagating nanostructures, and what is written in the DNA is only there to get from one existing nanostructure (the fertilized egg) to a complete mature organism and back to the initial nanostructure. If you mess up a fertilized egg's 3D RNA and protein structure (virtually destroying its information content), the micro-injected DNA cannot perform. The DNA is like the software in a robot (the fertilized cell), giving instructions on how to change itself for duplication. The fertilized egg with all its components is the hardware. So not everything is encoded in the DNA. That is why organisms (species) and their DNA (RNA) co-evolved over millions of generations.
     
  17. Feb 2, 2010 #16

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    Hi bioinfinity, the human genome carries about 3 billion base pairs. You need two bits to describe each base pair, so that is 6 billion bits; divide by 8 and you get 750 MB. That is indeed haploid (one copy); a diploid genome (two copies) would be 1.5 GB.
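
    To make the "two bits per base pair" step concrete, here is a minimal sketch that packs a short sequence four bases to a byte and then repeats the genome-scale estimate. The particular A/C/G/T bit assignment and the name `pack_bases` are assumptions chosen for illustration; any fixed mapping of four symbols to two bits would do.

```python
# Arbitrary 2-bit code for the four bases (an illustrative assumption).
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack_bases(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (last byte zero-padded)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # pad a short final chunk on the right
        out.append(byte)
    return bytes(out)


packed = pack_bases("GATTACA")
print(len(packed))  # number of bytes needed for 7 bases

# Same estimate as in the post: ~3 billion base pairs at 2 bits each.
haploid_bytes = 3_000_000_000 * 2 // 8
diploid_bytes = 2 * haploid_bytes
print(haploid_bytes, diploid_bytes)
```

    This counts only the raw sequence; it says nothing about methylation, chromatin packaging, or the other layers of information mentioned earlier in the thread.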

    I've already mentioned that there is additional information besides the bare genetic code.
     