Human Genome Project: How Scientists Determined 3 Billion Bases

  • Thread starter Jack
  • Tags
    Human Project
AI Thread Summary
Scientists determined the 3 billion bases of the human genome primarily through shotgun sequencing, which involves isolating human genomic DNA, fragmenting it with restriction enzymes, and cloning the fragments into plasmids for transformation into E. coli. This method allows for rapid sequencing and assembly of the genome, although it can introduce contamination and errors during replication. The human genome is still considered a draft, with ongoing revisions and improvements, particularly concerning repetitive sequences and potential misassemblies. Recent updates indicate that the genome is nearly complete, with only a small percentage of gaps remaining, but these gaps are deemed too costly to fill. Overall, while significant progress has been made, the final version of the human genome continues to evolve as technology advances.
Jack
How exactly did scientists go about determining each of the 3 billion bases that make up the human genome?
 
The procedure is quite simple. The most commonly used technique is called shotgun sequencing. It is a fast way of getting a draft sequence.

First you have to isolate human genomic DNA, and then the chromosome you want.

The next step is to treat the DNA with various restriction enzymes. You obtain fragments, ligate them into a cloning plasmid, and then transform E. coli. You select the bacteria (hundreds of thousands) that carry a plasmid with an insert. You then grow each bacterium separately and isolate the plasmid. Then the sequencing reaction takes place, and computers analyse its output. You get thousands of individual fragment sequences that you assemble using computers, and then you have a draft of the chromosome. The problem with this technique is that it brings contamination into the sequence, i.e. E. coli makes mistakes when replicating the plasmid, and you also get vector contamination.
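To make the "assemble using computers" step concrete, here is a minimal sketch of a greedy overlap assembler. It is a toy under stated assumptions, not the software the sequencing centres actually used: the reads are error-free, all come from the same strand, contain no repeats and no vector sequence, and the function names (`overlap`, `greedy_assemble`) and the `min_len` threshold are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap.

    Stops when no pair overlaps by at least `min_len` bases and
    returns whatever contigs are left.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up overlapping reads from a 15-base toy sequence.
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
print(greedy_assemble(reads))
```

With real data the greedy choice goes wrong exactly where the thread says it does: repeated sequence gives several equally good overlaps, so the merge can collapse or misplace a region.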

The human genome is mostly shotgun sequence, and only a few chromosomes have been sequenced on both strands with no contamination.

Here's a link to see the progress and more detailed information:
http://www.ncbi.nlm.nih.gov:80/mapview/map_search.cgi?taxid=9606
 
Thanks Ian. Does anyone know what the website is with the final version of the human genome (I think it's on the net)?
 
Let me just point out that the sequence of the human genome is not finished yet.. it is still in draft form, some parts more finished than others.

The main problem is repetitive sequences; I have seen the human genome shrink since it was first published. Right now we are at build 32. They say they are almost finished, and I think we should get the final sequence within the next year.

They promised us that Chromosome 19 will be finished in the coming months.. I hope that they are going to make some revisions, although that is not likely. The reason I say that is that my data suggest there are inversions in the sequence I am looking at: the genetic markers are probably not in the right order.
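One way to look for the kind of marker-order problem described above is to compare the order of markers on a genetic map with their positions in the assembled sequence. The sketch below is only an illustration: the marker names and positions are invented, and the function name `find_order_conflicts` is hypothetical.

```python
def find_order_conflicts(genetic_order, physical_pos):
    """Return adjacent marker pairs whose order in the assembly
    disagrees with their order on the genetic map.

    genetic_order: marker names in genetic-map order.
    physical_pos:  dict mapping marker name -> base position in the assembly.
    """
    conflicts = []
    for left, right in zip(genetic_order, genetic_order[1:]):
        if physical_pos[left] > physical_pos[right]:
            conflicts.append((left, right))
    return conflicts


# Invented example: M2..M4 sit in reverse order in the assembly.
genetic_order = ["M1", "M2", "M3", "M4", "M5"]
physical_pos = {"M1": 100, "M2": 400, "M3": 300, "M4": 200, "M5": 500}
print(find_order_conflicts(genetic_order, physical_pos))
```

A run of consecutive conflicts like this is consistent with an inverted block in the assembly, but it does not prove one: the genetic map itself could be wrong, or the region could be a real inversion in the individuals who were mapped.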
 
Originally posted by Monique
Let me just point out that the sequence of the human genome is not finished yet.. it is still in draft form, some parts more finished than others.


I thought I heard on the news a few weeks ago that they had completed it as much as they could. The draft was produced last year and completed recently?
 
You are right, I looked it up on the NCBI website and they supposedly 'finished' the human genome on April 14th.. I wonder how they define 'finished'.. let me read their press releases.
 
Originally posted by Monique
You are right, I looked it up on the NCBI website and they supposedly 'finished' the human genome on April 14th.. I wonder how they define 'finished'.. let me read their press releases.

Apparently it is as finished as it is going to get with current technology, and there are only a few small gaps. I think it's something like 1% incomplete.
 
OK, 1% incomplete. If that is what they call completing the sequence of the human genome, then they are mistaken about what 'complete' means.

Could you tell me what their confidence is that they assembled the sequence correctly?? As I said, the genome has been shrinking due to the elimination of redundant or repetitive sequences. And I am pretty sure that there are still misassemblies and inversions that need to be corrected.

So I say: they haven't completed the sequence of the human genome yet, but they have met their goal of a sequence coverage greater than 99%. Newer versions of the human genome are still to come.

Anyone know the status of the private genome? How finished are they? I would really like to see a publication about the comparison of the two sequences.
 
  • #10
Sorry I think I was wrong about the 1% incomplete bit. Here is an extract from the article on the BBC News website;

'In the draft in June 2000, 97% of the "book of life" had been read'

'The decoding is now close to 100% complete. The remaining tiny gaps are considered too costly to fill and those in charge of turning genomic data into medical and scientific progress have plenty to be getting on with.'

Here is the full article;

http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm
 
  • #11
Originally posted by Jack
Sorry I think I was wrong about the 1% incomplete bit. Here is an extract from the article on the BBC News website;

'In the draft in June 2000, 97% of the "book of life" had been read'

'The decoding is now close to 100% complete. The remaining tiny gaps are considered too costly to fill and those in charge of turning genomic data into medical and scientific progress have plenty to be getting on with.'

Here is the full article;

http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm

The case is worse!
Here NCBI reports the progress: http://www.ncbi.nlm.nih.gov/genome/seq/

I calculated from those figures what percentage of the sequence consists of determined bases: it is 92.3%. Telomeres, centromeres, and other heterochromatic regions have been left undetermined, as have a small number of unclonable gaps. That means 8,010,000 bases are missing on Chromosome 19, one of the smaller chromosomes and, might I say, one of the most gene-dense (if not THE most), and the one I am interested in.

That means 26,491,000 bases are missing on Chromosome 1 (the largest chromosome of all).
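For anyone who wants to redo this kind of calculation, the arithmetic is just determined bases divided by total length. The sketch below uses placeholder numbers chosen to give a round result, not the NCBI figures, and the function name `percent_determined` is made up for illustration.

```python
def percent_determined(total_bases: int, undetermined_bases: int) -> float:
    """Percentage of a sequence made up of determined (non-gap) bases."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if not 0 <= undetermined_bases <= total_bases:
        raise ValueError("undetermined_bases must be between 0 and total_bases")
    return 100.0 * (total_bases - undetermined_bases) / total_bases


# Placeholder values, NOT real chromosome figures:
# 77,000 undetermined bases out of 1,000,000 total.
pct = percent_determined(1_000_000, 77_000)
print(f"{pct:.1f}% determined")
```

Note that the total length has to include the estimated gap sizes, so the result is only as good as those estimates.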

I want to ask you about the decoding, what is meant by that? 100% completed??
 
  • #12
Btw, how can they know how large their gaps are and what the length of the chromosomes is in bases??
 
  • #13
I'm sorry I can't help you there. I haven't even done my GCSE in biology yet and don't know much about genetics. I took the 'decoding' to simply mean working out the bases that made up the human genome and the 100% bit was slightly exaggerated I think, judging by what you have told me.
 