Open Reading Frame (ORF): Definition & Meaning

AI Thread Summary
An open reading frame (ORF) is a DNA sequence that begins with an ATG start codon, continues with a series of amino acid-coding codons, and ends with a stop codon, indicating potential protein-coding regions. While many ORFs are statistically unlikely to occur by chance and are associated with protein-coding genes, not all ORFs encode proteins, and some protein-coding genes may lack traditional ORFs due to non-canonical start sites or the presence of introns. The term "open" in ORF suggests a long stretch of codons without an internal stop codon, making it a candidate for gene identification. Modern techniques for identifying protein-coding regions extend beyond ORFs, incorporating additional genetic information. Despite advancements, the concept of ORFs remains relevant in genomic studies and literature.
Suraj M
Could someone tell me what exactly an open reading frame (ORF) is?
What I know about it is that its sequence starts with ATG, codes for amino acids, and ends with a terminator codon. What does that mean? Is it like an mRNA (it can't be, as it has T), or part of the DNA that will form the mRNA (a cistron), or direct DNA-to-protein synthesis (unlikely)?
Thank you
 
The concept of an open reading frame comes from the early days of genome sequencing, when we had the full DNA sequence of an organism like a bacterium and wanted to figure out where the protein-coding genes were. One feature of protein-coding genes is that they contain an open reading frame: a DNA sequence beginning with ATG, followed by a long stretch of amino acid-coding codons, and ending with a stop codon. Because three of the 64 codons are stop codons, a random sequence of DNA would be expected to hit a stop codon about once every 21 codons (64/3), so open reading frames that arise by chance are only ~20 codons long. Thus, open reading frames with hundreds of codons are statistically unlikely to occur by chance and are putatively annotated as protein-coding genes.
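The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is not in the post itself: every one of the 64 codons is treated as equally likely and independent, which real genomes do not satisfy. The function names are made up for this example.

```python
# Assumption: all 64 codons are equally likely and independent (real genomes are not).
P_STOP = 3 / 64  # three stop codons out of 64


def expected_codons_to_stop(p_stop: float = P_STOP) -> float:
    """Mean number of codons read up to and including the first stop codon
    (mean of a geometric distribution)."""
    return 1 / p_stop


def prob_open_run(n_codons: int, p_stop: float = P_STOP) -> float:
    """Probability that n_codons consecutive random codons contain no stop codon."""
    return (1 - p_stop) ** n_codons


if __name__ == "__main__":
    print("expected codons to first stop:", round(expected_codons_to_stop(), 1))
    for n in (20, 100, 300):
        print(n, "codons with no stop:", prob_open_run(n))
```

Under this model a 20-codon run without a stop is common, while a 300-codon run is expected less than once in a million tries, which is why long ORFs stand out.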

Does every ORF encode a protein? No. As mentioned above, short ORFs can occur by chance in the genome, so while there are short ORFs encoding things like peptide hormones, many short ORFs do not encode proteins. Do all protein-encoding genes contain ORFs? The answer is again no. For example, in many species, there are non-canonical start sites that do not begin with ATG so these ORFs might not be recognized. Furthermore, the presence of introns in eukaryotic genes can make the identification of ORFs tricky.

Modern means of identifying protein-coding regions extend beyond the concept of ORFs and make use of other sources of information (such as identifying transcription and translation start sites, looking for splice sites in eukaryotes, and making use of evolutionary information, for example, whether the ORF is conserved across species).
 
In prokaryotes, how is it different from a transcription unit?
 
The transcription unit is the segment of DNA extending from the transcription start site of a gene to the transcription stop site. It contains the ORF as well as other regulatory sequences upstream and downstream of the ORF that are transcribed but not translated into protein. These regulatory regions of the transcription unit encode the 5' untranslated region (UTR) and 3' UTR of the mRNA, and these regions are important for regulating translation and mRNA stability. For example, prokaryotes require a ribosome binding sequence (the Shine-Dalgarno sequence) in the 5' UTR in order to allow the ribosome to bind to the mRNA and initiate translation. Many eukaryotic mRNAs contain microRNA (miRNA) binding sites in their 3' UTRs that regulate the mRNAs, for example, by degrading them.
 
This might be a stupid question after all your explanation, but if it's just a reading frame, why is it called an OPEN reading frame? Just curious.
So are ORFs not of any use now, after the identification of transcription units?
 
Suraj M said:
This might be a stupid question after all your explanation, but if it's just a reading frame, why is it called an OPEN reading frame? Just curious.
So are ORFs not of any use now, after the identification of transcription units?

I'm not sure why it's called an open reading frame, but if I were to guess, it's because it contains a long expanse of codons lacking a stop codon. The concept of an ORF is different from the concept of a reading frame, because an ORF refers to a specific segment within one reading frame.
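To make the distinction concrete, here is a minimal sketch of a naive ORF scan: each of the three forward reading frames is read codon by codon, and an ORF is reported as the segment from the first ATG after the previous stop codon to the next in-frame stop. The function name `find_orfs` and the `min_codons` parameter are invented for this illustration. It ignores the reverse strand, introns, and non-ATG start sites, so it is not how real gene finders work.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three standard stop codons, DNA alphabet


def find_orfs(dna: str, min_codons: int = 0):
    """Return (start, end, frame) tuples for ATG...stop segments on the given strand.

    start is the 0-based index of the ATG; end is exclusive and includes the
    stop codon. min_codons counts codons from the ATG up to, but not including,
    the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop in this frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # the frame is "closed" until the next ATG
    return orfs


if __name__ == "__main__":
    seq = "CCATGAAATTTTAGGG"
    print(find_orfs(seq))                # one short ORF in frame 2
    print(find_orfs(seq, min_codons=5))  # filtered out as too short
```

Raising `min_codons` to a hundred or more is the simple length filter implied by the statistical argument earlier in the thread.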

The concept of an ORF is still useful and the term is still used often in the literature. At least in the fields I work in, I hear the term ORF much more than I hear the term transcription unit (possibly because biology tends to be more protein-centric and the ORF is the portion of a gene that corresponds to the protein).
 
Ygggdrasil said:
an open reading frame, a DNA sequence beginning with ATG followed by a long stretch of amino acid-coding codons, and ending with a stop codon
Ygggdrasil said:
it's because it contains a long expanse of codons lacking a stop codon.
Contradictory statements, aren't they?
 
Suraj M said:
Contradictory statements, aren't they?
Well, on average you would expect to see a stop codon every ~20 codons, but in most ORFs you'll have stretches of much more than 20 codons without a stop codon (there is a stop codon at the end but none in the middle).
 
So they are just very long because of the number of amino acids in the protein?
 
  • #11
I'd like to venture another possible explanation for the word "open" in the term ORF. Again, it's a bit of a guess, but it's useful because it highlights something not mentioned here. An ATG codon can also encode an internal methionine, so the frames are open at the start end, not the stop-codon end.

There's more certainty attached to the end of the ORF, because stop codons are (or at least tend to be) unambiguous. However, how do we know whether a given ATG is just a methionine or a start codon? In that way, you can say the RF is open, and therefore a proper ORF.

Sure, if you have an ORF strongly associated with a certain protein or transcription unit, you might tend to say, "well, it's closed now, because we have strong evidence from the protein side that tells us where the start codon is". However, in those circumstances, you have typically advanced further and left the ORF behind.

To my detriment, I should also say that trying to attach word-by-word meaning to genetics terms is not often a very good idea.
 
  • #12
I also noticed there is a thread coming up as being associated with this one, called something like "Open Reading Frame?" It cannot be added to any longer, but it states that start and end codons are associated with exons. That is incorrect and will lead to severe inference problems in eukaryotes. Maybe not so much in prokaryotes, where the exon concept isn't necessary due to the lack of alternative splicing.
 
  • #13
OK, I want to suppress my interpretation of the word "open" here. As you recall, I thought it may refer to the open start-end of a reading frame.

From reading around, it appears that this is unlikely. Doolittle's 1986 book, "Of URFs and ORFs", is probably an early enough reference, and it agrees with Ygggdrasil's interpretation.

The point is, the word "open" has various meanings. It does not necessarily mean "with an open end" or "open-ended", which is what I was aiming for. Often "open" means "unresolved", as in "open for investigation". So we get a long stretch of codons without a stop codon, we realize it's statistically unlikely, and so we classify the region as open, in that it's a candidate for a possible gene.

I am, however, still a bit mystified as to why there is not much talk about deciding when an M is a start codon, but that may be because I'm not looking in the right places.
 
  • #14
stabu said:
I am, however, still a bit mystified as to why there is not much talk about deciding when an M is a start codon, but that may be because I'm not looking in the right places.

There are some context clues surrounding the ATG to determine whether it is a start codon or not. For example, in bacteria, true start codons are often directly downstream of a Shine-Dalgarno sequence (which directs ribosome binding for translation initiation). Similarly, in eukaryotes, true start codons often fall within a Kozak consensus sequence. Of course, methods of identifying start codons are not always perfect and new experimental techniques are often finding translation on parts of mRNAs that we did not think to be translated previously (http://www.sciencedirect.com/science/article/pii/S2211124714006299).
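As a rough illustration of the bacterial context clue, the sketch below checks whether a Shine-Dalgarno-like motif sits a short distance upstream of a candidate ATG. Everything specific here is an assumption made for the example rather than something stated in the thread: the exact motif `AGGAGG` (a commonly quoted consensus; real sites vary and are often only partial matches), the 4 to 12 base spacing window, and the function name `has_sd_upstream`.

```python
SD_CORE = "AGGAGG"  # assumed consensus motif; real Shine-Dalgarno sites vary


def has_sd_upstream(dna: str, atg_index: int,
                    min_gap: int = 4, max_gap: int = 12) -> bool:
    """Return True if SD_CORE ends between min_gap and max_gap bases
    upstream of the ATG at atg_index (the spacing window is an assumption)."""
    dna = dna.upper()
    if dna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    for gap in range(min_gap, max_gap + 1):
        end = atg_index - gap          # position just past the candidate motif
        begin = end - len(SD_CORE)     # position where the candidate motif starts
        if begin >= 0 and dna[begin:end] == SD_CORE:
            return True
    return False


if __name__ == "__main__":
    with_sd = "TTAGGAGGTTTTTTATGAAA"     # motif ends 6 bases before the ATG
    without_sd = "TTTTTTTTTTTTTTATGAAA"  # no motif upstream
    print(has_sd_upstream(with_sd, 14))
    print(has_sd_upstream(without_sd, 14))
```

An exact string match like this would miss many genuine start sites, which fits the caveat above that methods of identifying start codons are not always perfect.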
 
  • #15
Thanks Ygggdrasil. Will read. Cheers.
 