# The central dogma: proteins first?

1. Dec 31, 2007

### Adeimantus

I recently borrowed the book Information Theory, Evolution, and the Origin of Life by Hubert Yockey. In it he points out that there can be no code from the amino acid alphabet (about 20 symbols) to the RNA/DNA alphabet (64 3-letter codons, the third extension of the 4-symbol alphabet {A,C,G,T}) and says this proves, beyond any doubt, that proteins were not the first step in the origin of life: no information can be transferred from proteins to DNA. However, he also says in the book that the genetic code probably evolved from an earlier code with 2-letter codons. This second extension of {A,C,G,T} has 16 symbols, which is fewer than 20. So doesn't that leave open the possibility of a code from the amino acid alphabet to the pre-DNA nucleic acid alphabet?
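The counting behind this question can be checked with a short sketch. It only enumerates the 2-letter and 3-letter words over {A,C,G,T} and compares their number with the roughly 20 amino acids mentioned above; the names `NUCLEOTIDES`, `AMINO_ACIDS`, and `words` are made up for illustration and do not come from the book.

```python
from itertools import product

NUCLEOTIDES = "ACGT"   # the 4-symbol alphabet from the post
AMINO_ACIDS = 20       # approximate count quoted in the post

def words(length):
    """Return every string of the given length over the nucleotide alphabet."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=length)]

doublets = words(2)
triplets = words(3)

print(len(doublets), len(triplets))   # 16 64

# Pigeonhole check: is there room to give every amino acid its own codon?
print(len(doublets) >= AMINO_ACIDS)   # False
print(len(triplets) >= AMINO_ACIDS)   # True
```

The check says nothing about biochemistry; it only shows that 16 doublets cannot give 20 amino acids one distinct codon each, while 64 triplets can.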

2. Jan 1, 2008

### HubertPYockey

The Central Dogma is why proteins cannot be first

Thank you for reading my book, Information Theory, Evolution and the Origin of Life (Cambridge University Press, 2005). However, no, it doesn’t.

It was the late Thomas Jukes who put forward the speculation that the triplet genetic code, which has three-letter codons, may have evolved from a doublet code of two-letter codons. I cited his work.

For information to go back and forth between two alphabets, the code to translate between them must have a one-to-one mapping between the symbols of each alphabet. That is the only way to know for sure which symbol in alphabet A specifies what symbol in alphabet B.

For example, let us have a pair of dice represent the DNA/RNA alphabet as a doublet code and the number 7, from an alphabet of 1 through 12, represent an amino acid. To specify the number 7, our metaphorical amino acid, the two dice can come up as 1,6; 6,1; 2,5; 5,2; 3,4; or 4,3. That means any one of six symbols from the dice (DNA/RNA) alphabet can specify one symbol from the alphabet representing amino acids. This is called a several-to-one mapping.
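The dice example can be reproduced in a few lines. This is only an illustrative sketch: it treats each ordered roll of two dice as one doublet symbol and lists the rolls whose total is 7. The variable names are hypothetical.

```python
from itertools import product

FACES = range(1, 7)

# Every ordered outcome of two dice: the 36 "doublet" symbols.
outcomes = list(product(FACES, repeat=2))

# The outcomes that are mapped to the symbol 7.
sevens = [pair for pair in outcomes if sum(pair) == 7]

print(len(outcomes))  # 36
print(sevens)         # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
```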

If all you know is the result of “7,” you can’t possibly know which of the six possible combinations from the doublet dice alphabet specified—or is mapped to—the “7.” That is why information cannot flow from the smaller alphabet to the larger alphabet—that is the central dogma. However, if you know any symbol in the larger alphabet, you know what symbol it will specify in the smaller alphabet. It doesn’t matter that more than one symbol in the larger alphabet may specify a particular symbol in the smaller alphabet. If you know the symbols of the larger alphabet and how they are mapped to the symbols of the smaller alphabet—the code—then when you know a particular symbol from the larger alphabet, you always know what symbol it is mapped to in the smaller alphabet. But just knowing the code doesn’t help you to go with precision from the smaller alphabet to the larger one because you can never know for sure which symbol in the larger alphabet specified the symbol in the smaller one.
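The asymmetry between the two directions can be sketched as follows. Going from a pair of dice to its total is an ordinary lookup with one answer, while going from a total back to a pair returns a list of candidates. The function names `decode`, `candidates`, and `residual_bits` are invented for this illustration, and the bit count assumes fair dice, so that all candidate pairs are equally likely.

```python
from collections import defaultdict
from itertools import product
from math import log2

# Forward code: each ordered pair of dice maps to exactly one total.
code = {pair: sum(pair) for pair in product(range(1, 7), repeat=2)}

# Reverse lookup: each total maps to every pair that could have produced it.
preimages = defaultdict(list)
for pair, total in code.items():
    preimages[total].append(pair)

def decode(pair):
    """Larger alphabet -> smaller alphabet: always a single answer."""
    return code[pair]

def candidates(total):
    """Smaller alphabet -> larger alphabet: a list of possible sources."""
    return preimages.get(total, [])

def residual_bits(total):
    """Uncertainty (in bits) about the pair when only the total is known,
    assuming fair dice so the candidates are equally likely."""
    return log2(len(candidates(total)))

print(decode((3, 4)))              # 7
print(len(candidates(7)))          # 6
print(round(residual_bits(7), 3))  # 2.585
print(candidates(12))              # [(6, 6)]
```

Note that a total such as 12 has only one candidate, so the ambiguity is a property of the symbols with several sources, not of every symbol in the smaller alphabet.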

This mathematical property of codes alone explains why no “proteins first” scenario for the origin of life is valid.

Our problem is that we know that life exists, so it must have had an origin. However, the origin of life must be an axiom of biology because there is no way to create an algorithm to show how it originated due to the central dogma that information can only flow between alphabets with a one-to-one mapping and only flows from larger alphabets to smaller ones.

It is a pity that biologists and molecular biologists are uncomfortable with things that are unknowable, like the origin of life. Mathematicians and physicists, however, are able to cope with things that are unknowable. Molecular biologists must now master information theory and coding theory to master their discipline, and I hope that as they do so they will come to accept that there are things that are unknowable. This will clear the decks of incorrect speculations and save young scientists from throwing away their careers pursuing dead ends. That can only be to the good of science.

The ancient Greek mathematicians knew they were unable to solve three problems: (1) the trisection of the angle; (2) construction of a square of exactly the same area as a given circle; and (3) construction of a cube of exactly twice the volume of a given cube. We now know that these problems are unknowable because no procedure, or algorithm, exists to solve them. (For #2, it is because pi has no end.)

The question of the origin of life is a traditional proxy battleground for persons who want to force others either to believe in God or to deny the existence of God. In this feud, I have had the role of Mercutio, “a pox on both your houses.” Religionists must stick to questions of faith, while scientists must stick to, as I quote Socrates in my book, what can be “counted and measured.”

Speaking of Socrates, Adeimantus, how is your brother, Plato?

Last edited: Jan 1, 2008

3. Jan 2, 2008

### Adeimantus

Thank you, Dr. Yockey, for the detailed response to my post. It is a welcome surprise to get a response from the author himself.

Your example with the dice helped make it clear why information can flow in only one direction when you have a code between a larger alphabet and a smaller alphabet.

I know nothing about biochemistry and very little about evolution, so I am still not entirely clear on some things. Given that the modern triplet genetic code allows information flow from the DNA/RNA alphabet to the protein alphabet and not vice-versa, can we conclude that information flow was always in this direction, even when there may have been a doublet code?

In other words, is it possible that initially there was a code from the 20-letter amino acid alphabet to the 16-letter doublet alphabet, and that at some later time a triplet code emerged and the direction of information flow switched? Of course I have no idea what selective pressure would make the receiving doublet alphabet change to a triplet code when 20-to-16 is already close to saturation. Again, I have no clue about molecular evolution so I'm probably missing something obvious.

I agree that it's a shame that some folks, even scientific ones, require a belief system that allows (encourages?) them to say "On this rock I stand!" I appreciate your citing Eric Hoffer's little book, The True Believer, in your book. I read his book fairly recently. I wish I had come across it ten years ago when I was in high school.

I got a chuckle out of that. I think it's mildly ironic that some creationists, intelligent design proponents, and even the transitional cdesign proponentsists (sic) enthusiastically cite your work to show how the materialist-reductionist camp oversteps the bounds of critical thinking, but are unable to apply this same critical eye to their own project.

Oh yeah, Plato is doing alright. He gave up on being a philosopher-king and went back to perceiving forms directly in their pristine, unchanging state. Unfortunately, Glaucon had a run-in with the law. Something about "unmusical" behavior. Heh heh heh.

Cheers.

4. Jan 2, 2008

### Hurkyl

Actually, all three problems were solved by the ancient Greeks; they just cannot be solved by a compass & straightedge construction. And plenty of ratios whose decimal representation has "no end" are constructible; $\sqrt{2}$ and 1/3 are simple examples.

5. Jan 2, 2008

### Moridin

Seems like a masked version of ID-proponent William Dembski's "Law" of the conservation of information.

Now, it turns out that the Shannon uncertainty and the physicist's entropy are identical up to a constant factor. Entropy is a measure of "disorder." The Shannon uncertainty is likewise a measure of the disorder in a signal, applied in communication theory. So, in essence, all those types of arguments demand a conservation of entropy, which is factually false and boils down to the claim that the 2nd law of thermodynamics is violated in either the origin or the evolution of life, which is a favorite creationist argument.
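For reference, the relationship between the two quantities can be written out. This is a sketch that assumes both are evaluated over the same discrete probability distribution $p_i$, with $H$ the Shannon uncertainty in bits, $S$ the Gibbs entropy, and $k_B$ Boltzmann's constant:

$$H = -\sum_i p_i \log_2 p_i, \qquad S = -k_B \sum_i p_i \ln p_i = (k_B \ln 2)\, H$$

The two differ only by the factor $k_B \ln 2$, which comes from the choice of units and of logarithm base.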

Please let me know if I have misinterpreted your work.

