DNA Transcription: Where do genes begin and end?

  • Thread starter: mtanti
  • Tags: DNA, Genes

Discussion Overview

The discussion revolves around the mechanisms of DNA transcription, specifically how RNA polymerase identifies the start and end points of genes. Participants explore concepts related to gene structure, including start and stop codons, overlapping genes, and the implications for computational simulations of transcription.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant seeks clarification on how RNA polymerase recognizes the start and stop of transcription, using markers like '[' and ']' to represent gene boundaries.
  • Another participant states that translation starts at a start codon (ATG) and stops at a stop codon (TGA, TAG, or TAA), and that not every start codon marks the start of an open reading frame (ORF). A later reply stresses that this rule describes translation, not transcription.
  • It is mentioned that overlapping genes can exist, although some participants contest the extent of overlap that is possible.
  • A participant highlights the complexity of defining genes and suggests that standard textbooks may not provide sufficient detail for gene prediction from genome sequences.
  • Concerns are raised about the identification of "illegal" sequences in random DNA and the implications of encountering multiple start codons before a stop codon.
  • One participant describes a naive approach to gene-finding in bacterial genomes, which involves identifying long sequences without stop codons, while also noting that more sophisticated methods exist.

Areas of Agreement / Disagreement

Participants express differing views on the definitions and characteristics of genes, the nature of overlapping genes, and the clarity of transcription mechanisms. There is no consensus on how to definitively identify gene boundaries or the implications of overlapping start codons.

Contextual Notes

Participants acknowledge that definitions of genes may vary and that the complexity of transcription mechanisms can lead to uncertainty in identifying gene boundaries. There are also references to the limitations of standard textbook knowledge in addressing these issues.

Who May Find This Useful

This discussion may be of interest to those studying molecular biology, bioinformatics, or computational biology, particularly in the context of gene prediction and transcription mechanisms.

mtanti
Dear all,

I'm studying DNA-protein relationships for a computer simulation I have to make, and I'm finding it difficult to understand how RNA polymerase knows where to start and stop transcription on a strand of DNA. Could someone explain this to me?

Let's assume that '[' indicates the start of a gene (start codon? TATA box? promoter?) and ']' indicates the end of a gene (stop codon? transcription terminator?). Are all '[' and ']' markers well-formed brackets? Can you have overlapping genes as in [..[..]..]? Can you have missing brackets as in [.. or ..]? If not, what happens if this is the case?

Thanks a lot!
 
Simply stated, translation starts at a start codon (ATG) and stops at a stop codon (TGA, TAG, or TAA).

The promoter of a gene and its regulatory elements determine whether the transcription machinery can bind near the ATG, so not all start codons are starts of an ORF.

You can definitely have overlapping genes.

I guess you are also interested in 5'UTRs and 3'UTRs?

This is all textbook material, so it might be better to pick up one of those and read up on what is known about the transcription machinery.
 
Thanks
 
"Simply stated transcription starts at a start codon (ATG) and stops at a stop codon (TGA, TAG, or TAA)."

That would be translation, not transcription. Translation is somewhat straightforward, and the above definition should be good enough (but do remember ATG is not the only start codon).

If you're asking about transcription, that's a little less clear. Even less clear is what defines a gene. Everything you learn in a standard textbook is very general, and if you're looking into predicting genes from a genome sequence, gene predictors generally won't even look for ribosomal binding sites, because quite frankly it's useless to do so.

Also, if we're just talking very standard general biology stuff, then you CANNOT have overlapping genes as you indicated, [..[..]..]. However, [...[]...] is quite common (where there's very little overlap between the end of one gene and the start of another; this is quite common in operons).
 
My simulation is about the synthesis of mRNA from DNA. I need to work with random DNA sequences so I need to know if there are any "illegal" sequences. What would happen if you have multiple start codons before the next stop codon (overlapping)? You can assume the simplest form of transcription.

How can I know where genes begin and end without knowing where RNA polymerase binds to DNA and starts producing mRNA? Are you saying it's still unknown where genes begin and end?
 
What kind of organism(s) are you working with: eukaryote, prokaryote, other?
Within a genome (or large DNA fragment) are you looking for: gene, mRNA, CDS, other?

A start codon encountered after a start codon is simply read as an amino acid. So if you encounter an in-frame ATG somewhere downstream, it simply encodes a methionine.
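To make that concrete, here is a minimal Python sketch (not from the posts above) of translating from the first ATG to the first in-frame stop codon. The function name `translate_first_orf` and the tiny `CODON_TABLE` are assumptions made for illustration; a real simulation would need the full 64-codon genetic code. The point is only that a second, in-frame ATG shows up as an ordinary methionine (M) rather than restarting anything.

```python
# Partial codon table: only the codons used in the example below (an assumption
# for brevity; a real simulation needs all 64 codons).
CODON_TABLE = {
    "ATG": "M",  # methionine; also the usual start codon
    "GCC": "A",
    "AAA": "K",
    "TGG": "W",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate_first_orf(dna):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns the peptide as a string (translation also ends if the sequence
    runs out), or None if the sequence contains no ATG at all.
    Codons missing from CODON_TABLE are reported as 'X'.
    """
    start = dna.find("ATG")
    if start == -1:
        return None
    peptide = []
    # Step through the sequence three bases at a time, in the frame set by the ATG.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(CODON_TABLE.get(codon, "X"))
    return "".join(peptide)


# The second ATG (in frame) is read as a plain methionine.
print(translate_first_orf("CCATGGCCATGAAATGGTAAGG"))  # prints MAMKW
```

Note that the example sequence also contains a TGA that is out of frame (inside ATGAAA); it is ignored because only in-frame codons are read.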

Example of how gene-finding in bacterial genome works:
The naive approach to gene-finding within a genome (I'm warning you right now that when I say gene, I usually really mean ORF; different people use different definitions for "gene") is to simply find all regions that contain more than x codons (let's say x=30) without a stop codon. After all of those are marked for all 6 reading frames, you remove any that overlap, keeping the longest regions, and you say those are your genes. Whilst this method is very simple, it does decently. More computationally expensive methods include hidden Markov models, tBLASTX, etc., but realize that they can only detect genes that are already known. Prediction of novel genes (without biological experiments) is definitely possible, but that requires an even lengthier explanation.
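The naive approach described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anyone's actual gene finder: the function names (`reverse_complement`, `stop_free_runs`, `naive_orfs`) are made up for this example, the input is assumed to be uppercase A/C/G/T only, a "region" is taken to be any in-frame run of codons between stop codons (no ATG is required), and "remove any that are overlapping, choosing the longest" is interpreted as a greedy longest-first filter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(COMPLEMENT)[::-1]


def stop_free_runs(seq, min_codons):
    """Yield (start, end) for every in-frame run of more than min_codons
    codons without a stop codon, over the three frames of seq.
    Coordinates are 0-based and end is exclusive."""
    for frame in range(3):
        run_start = frame
        last = frame  # index just past the last complete codon seen
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - run_start) // 3 > min_codons:
                    yield (run_start, i)
                run_start = i + 3
            last = i + 3
        # A run may also reach the end of the sequence without hitting a stop.
        if (last - run_start) // 3 > min_codons:
            yield (run_start, last)


def naive_orfs(dna, min_codons=30):
    """Greedy sketch: collect stop-free runs in all 6 frames, then keep the
    longest ones that do not overlap an already kept run.
    Returns sorted (start, end, strand) tuples in forward-strand coordinates."""
    n = len(dna)
    candidates = [(s, e, "+") for s, e in stop_free_runs(dna, min_codons)]
    for s, e in stop_free_runs(reverse_complement(dna), min_codons):
        # Map reverse-strand coordinates back onto the forward strand.
        candidates.append((n - e, n - s, "-"))
    candidates.sort(key=lambda c: c[1] - c[0], reverse=True)
    kept = []
    for s, e, strand in candidates:
        if all(e <= ks or s >= ke for ks, ke, _ in kept):
            kept.append((s, e, strand))
    return sorted(kept)


# Toy example: the 12-base unit has stops in the five other frames,
# so only the frame-0 run between the two TAA codons is long enough.
unit = "CTAGTTAACTAC"
toy = "TAA" + "ATG" + unit * 4 + "TAA"
print(naive_orfs(toy, min_codons=10))  # prints [(3, 54, '+')]
```

On short or random sequences, ties between frames are common, and the greedy filter then simply keeps whichever equally long run it sees first, so treat the output as a rough first pass rather than a gene list.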
 
