Open Reading Frame (ORF): Definition & Meaning

In summary: An open reading frame (ORF) is a stretch of DNA, or the corresponding stretch of mRNA, that begins with a start codon, continues through a run of amino acid-coding codons, and ends with a stop codon. It is not the same as a transcription unit, which runs from the transcription start site to the transcription stop site and therefore also includes the untranslated regions on either side of the ORF.
  • #1
Suraj M
Could someone tell me what exactly an open reading frame (ORF) is?
What I know about it is that it starts with ATG, codes for amino acids, and ends with a terminator codon. What does that mean? Is it like an mRNA (it can't be, as it has ##T##), or a part of the DNA that will form the mRNA (a cistron), or DNA that directs protein synthesis directly (unlikely)?
Thank you
 
  • #2
The concept of an open reading frame comes from the early days of genome sequencing, when we had the full DNA sequence of an organism such as a bacterium and wanted to figure out where the protein-coding genes were. One feature of protein-coding genes is that they contain an open reading frame, a DNA sequence beginning with ATG followed by a long stretch of amino acid-coding codons, and ending with a stop codon. Because three of the 64 codons are stop codons, a random sequence of DNA would hit a stop codon about once every 64/3 ≈ 21 codons, so open reading frames arising by chance would be only ~20 codons long. Thus, open reading frames with hundreds of codons are statistically unlikely to occur by chance and are putatively annotated as protein-coding genes.
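
To put rough numbers on the statistical argument, here is a minimal sketch. It assumes every codon is drawn independently and uniformly from the 64 possibilities, which real genomes do not strictly obey, and the function names are only illustrative.

```python
# Assumption: codons are independent and uniformly distributed over the 64 possibilities.
P_STOP = 3 / 64  # three of the 64 codons are stop codons


def expected_codons_until_stop(p_stop: float = P_STOP) -> float:
    """Mean number of codons read up to and including the first stop codon."""
    return 1 / p_stop


def prob_stop_free(n_codons: int, p_stop: float = P_STOP) -> float:
    """Probability that n consecutive random codons contain no stop codon."""
    return (1 - p_stop) ** n_codons


if __name__ == "__main__":
    print(f"expected codons until a stop: {expected_codons_until_stop():.1f}")
    for n in (20, 100, 300):
        print(f"P(no stop in {n} codons) = {prob_stop_free(n):.2e}")
```

Under this simple model the chance of a stop-free run falls off geometrically with length, which is why a run of several hundred codons is treated as a candidate gene rather than a coincidence.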

Does every ORF encode a protein? No. As mentioned above, short ORFs can occur by chance in the genome, so while there are short ORFs encoding things like peptide hormones, many short ORFs do not encode proteins. Do all protein-encoding genes contain ORFs? The answer is again no. For example, in many species, there are non-canonical start sites that do not begin with ATG so these ORFs might not be recognized. Furthermore, the presence of introns in eukaryotic genes can make the identification of ORFs tricky.

Modern means of identifying protein-coding regions extend beyond the concept of ORFs and make use of other sources of information (such as identifying transcription and translation start sites, looking for splice sites in eukaryotes, and making use of evolutionary information, for example, whether the ORF is conserved across species).
 
  • #3
In prokaryotes, how is it different from a transcription unit?
 
  • #4
The transcription unit is the segment of DNA extending from the transcription start site of a gene to the transcription stop site. It contains the ORF as well as regulatory sequences upstream and downstream of the ORF that are transcribed but not translated into protein. These regions of the transcription unit encode the 5' untranslated region (UTR) and 3' UTR of the mRNA, which are important for regulating translation and mRNA stability. For example, prokaryotes require a ribosome binding sequence (the Shine-Dalgarno sequence) in the 5' UTR to allow the ribosome to bind to the mRNA and initiate translation. Many eukaryotic mRNAs contain micro-RNA (miRNA) binding sites in their 3' UTRs that regulate the mRNAs, for example, by targeting them for degradation.
 
  • #5
This might be a stupid question after all your explanation, but if it's just a reading frame, why is it called an OPEN reading frame? Just curious.
So are ORFs not of any use now, after the identification of transcription units?
 
  • #6
Suraj M said:
This might be a stupid question after all your explanation, but if it's just a reading frame, why is it called an OPEN reading frame? Just curious.
So are ORFs not of any use now, after the identification of transcription units?

I'm not sure why it's called an open reading frame, but if I were to guess, it's because it contains a long expanse of codons lacking a stop codon. The concept of an ORF is different from the concept of a reading frame, because an ORF refers to a specific segment within one reading frame.

The concept of an ORF is still useful, and the term is still used often in the literature. At least in the fields I work in, I hear the term ORF much more than I hear the term transcription unit (possibly because biology tends to be more protein-centric and the ORF is the portion of a gene that corresponds to the protein).
 
  • #7
Ygggdrasil said:
an open reading frame, a DNA sequence beginning with ATG followed by a long stretch of amino acid-coding codons, and ending with a stop codon
Ygggdrasil said:
it's because it contains a long expanse of codons lacking a stop codon.
Contradictory statements, aren't they?
 
  • #8
Suraj M said:
Contradictory statements, aren't they?
Well, on average you would expect to see a stop codon every ~20 codons, but in most ORFs you'll have stretches of much more than 20 codons without a stop codon (there is a stop codon at the end but none in the middle).
 
  • #9
So they are just very long because of the number of amino acids in the protein?
 
  • #11
I'd like to venture another possible explanation for the word "open" in the term ORF. Again, it's a bit of a guess, but it's useful because it highlights something not mentioned here. The start codon ATG also encodes internal methionines, so the frames are open at the start end, not the stop-codon end.

There's more certainty attached to the end of the ORF, because stop codons are (or at least tend to be) unambiguous. However, how do we know whether a given ATG is just a methionine or a start codon? In that sense, you can say the reading frame is open, and therefore a proper ORF.

Sure, if you have an ORF strongly associated with a certain protein or transcription unit, you might tend to say, "well, it's closed now, because we have strong evidence from the protein side that tells us where the start codon is". However, in those circumstances, you have typically advanced further and left the ORF behind.

To my detriment, I should also say that trying to attach word-by-word meaning to genetics terms is not often a very good idea.
 
  • #12
I also noticed there is a thread coming up as being associated with this one, called something like "Open Reading Frame?" It can no longer be added to, but it states that start and end codons are associated with exons. That is incorrect and will lead to severe inference problems in eukaryotes. Maybe not so much in prokaryotes, where the exon concept isn't necessary due to the lack of alternative splicing.
 
  • #13
OK, I want to retract my interpretation of the word "open" here. As you recall, I thought it might refer to the open start end of a reading frame.

From reading around, it appears that this is unlikely. Doolittle's 1986 book, "Of URFs and ORFs", is probably an early enough reference, and it agrees with Ygggdrasil's interpretation.

The point is, the word "open" has various meanings. It does not necessarily mean "with an open end" or "open-ended", which is what I was aiming for. Often "open" means "unresolved", as in "open for investigation". So we get a long stretch of codons without a stop codon, we realize it's statistically unlikely, and so we classify the region as open, in that it's a candidate for a possible gene.

I am, however, still a bit mystified as to why there is not much talk about deciding when an M is a start codon, but that may be because I'm not looking in the right places.
 
  • #14
stabu said:
I am, however, still a bit mystified as to why there is not much talk about deciding when an M is a start codon, but that may be because I'm not looking in the right places.

There are some context clues surrounding the ATG to determine whether it is a start codon or not. For example, in bacteria, true start codons are often directly downstream of a Shine-Dalgarno sequence (which directs ribosome binding for translation initiation). Similarly, in eukaryotes, true start codons often fall within a Kozak consensus sequence. Of course, methods of identifying start codons are not always perfect and new experimental techniques are often finding translation on parts of mRNAs that we did not think to be translated previously (http://www.sciencedirect.com/science/article/pii/S2211124714006299).
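
As a rough illustration of using sequence context in bacteria, here is a minimal sketch that flags ATGs with a Shine-Dalgarno-like motif shortly upstream. The motif string `AGGAGG`, the 20-nucleotide window, and the function names are assumptions chosen for illustration; real Shine-Dalgarno sites vary in sequence and spacing, and real gene finders score context far more carefully.

```python
# Illustrative assumptions, not fixed biological constants:
SD_MOTIF = "AGGAGG"    # an assumed consensus-like motif; real sites vary
UPSTREAM_WINDOW = 20   # how many nucleotides upstream of the ATG to inspect


def atg_positions(seq: str) -> list[int]:
    """Return the index of every ATG in the sequence, in any reading frame."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


def has_upstream_motif(seq: str, atg_index: int,
                       motif: str = SD_MOTIF,
                       window: int = UPSTREAM_WINDOW) -> bool:
    """True if the motif occurs within `window` nucleotides upstream of the ATG."""
    seq = seq.upper()
    upstream = seq[max(0, atg_index - window):atg_index]
    return motif in upstream


if __name__ == "__main__":
    dna = "CCAGGAGGCTTAACATGGCT"  # made-up example sequence
    for pos in atg_positions(dna):
        print(pos, has_upstream_motif(dna, pos))
```

An exact string match is the crudest possible test; it is only meant to show the idea that an ATG with a ribosome binding site just upstream is a better start-codon candidate than one without.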
 
  • #15
Thanks Ygggdrasil. Will read. Cheers.
 

1. What is an open reading frame (ORF)?

An open reading frame (ORF) is a DNA sequence that begins with a start codon (ATG) and ends with a stop codon (TAA, TAG, or TGA). It is the portion of a gene or mRNA that can be translated into a protein.

2. What is the significance of an ORF?

An ORF is important because it specifies the amino acid sequence of a protein. It is the portion of a gene that corresponds to the protein, which is why long ORFs are used as candidates when annotating protein-coding genes in a genome.

3. How is an ORF identified?

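An ORF is identified by scanning the DNA sequence for a start codon (ATG) followed, in the same reading frame, by a series of codons that code for amino acids and ending with a stop codon (TAA, TAG, or TGA). The candidate region can then be conceptually translated into a protein sequence, and additional evidence, such as its length and whether it is conserved across species, helps decide whether it is likely to be a real protein-coding gene.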
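
The scan described above can be sketched as follows. This is a simplified illustration, not how any particular annotation tool works: it only looks at the forward strand, only accepts ATG as a start, ignores introns, and the function name `find_orfs` and its `min_codons` parameter are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 0) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each ATG...stop ORF on the forward strand.

    start is the index of the A in ATG; end is one past the last base of the
    stop codon. Within each frame, only the first ATG after the previous stop
    is used as the start. min_codons counts codons from the ATG up to, but not
    including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == START_CODON:
                    start = i
            elif codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    print(find_orfs("ATGAAATAG"))    # one ORF in frame 0
    print(find_orfs("CATGCCCTGAC"))  # one ORF in frame 1
```

Raising `min_codons` is the simple length filter discussed earlier in the thread: short ORFs are common by chance, so only long ones are kept as candidates.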

4. Can an ORF overlap with other ORFs?

Yes, an ORF can overlap with other ORFs. A DNA sequence can be read in three different reading frames on each strand, depending on which nucleotide the first codon starts at, so a single nucleotide can be part of different codons in different frames. Therefore, it is possible for multiple, overlapping ORFs to exist within a single DNA sequence.

5. Are all ORFs functional?

No, not all ORFs are functional. Some ORFs may contain mutations or may not code for a functional protein. Additionally, some ORFs may be transcribed into RNA, but may not be translated into a protein due to regulatory mechanisms.
