Is information theory useful in biology?

In summary, information theory is used in many fields, including biology, to analyze communication systems and information transmission. It has been applied to the genetic code and to the evolution of life. Genomic compression is an active research area, driven by the need to store and analyze large amounts of DNA data. Because some of the noise and redundancy in genomes may have important functions, it cannot simply be treated as waste for a compression algorithm to eliminate.
  • #1
BWV
Likes BillTre
  • #3
Here is what is written on Wikipedia:
In addition to mathematics, computer science and telecommunications, the theoretical consideration of communication through information theory is also used to describe communication systems in other areas (e.g. media in journalism, the nervous system in neurology, DNA and protein sequences in molecular biology, knowledge in information science and documentation).

I guess the answer is a clear Yes.
 
Likes BillTre
  • #4
Yup, it is a clear yes. For example, species diversity and genetic diversity calculations use methods that date back to Claude Shannon.
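To make the diversity point concrete, here is a minimal sketch of the Shannon diversity index, which is Shannon's entropy applied to species proportions. The survey counts and species names are made up for illustration; only the formula H' = -sum(p_i ln p_i) is standard.

```python
import math

def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over species with nonzero counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical survey data: species -> number of individuals observed.
survey = {"oak": 40, "maple": 30, "birch": 20, "pine": 10}

h = shannon_diversity(survey.values())
# Pielou's evenness: H' divided by its maximum, ln(number of species).
evenness = h / math.log(len(survey))
print(f"H' = {h:.3f} nats, evenness = {evenness:.3f}")
```

A community where every species is equally common maximizes H' at ln(S) for S species, so evenness reads as "how close to uniform" the community is.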
 
Likes BillTre
  • #6
Excerpt from https://en.wikipedia.org/wiki/Genetic_code:
  • Information channels: Information-theoretic approaches model the process of translating the genetic code into corresponding amino acids as an error-prone information channel.[87] The inherent noise (that is, the error) in the channel poses the organism with a fundamental question: how can a genetic code be constructed to withstand noise[88] while accurately and efficiently translating information? These "rate-distortion" models[89] suggest that the genetic code originated as a result of the interplay of the three conflicting evolutionary forces: the needs for diverse amino acids,[90] for error-tolerance[85] and for minimal resource cost. The code emerges at a transition when the mapping of codons to amino acids becomes nonrandom. The code's emergence is governed by the topology defined by the probable errors and is related to the map coloring problem.[91]
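One simple way to see the error tolerance mentioned in the excerpt is to count how many single-base substitutions leave the encoded amino acid unchanged. The sketch below does only that tally for the standard genetic code (NCBI translation table 1); it is not the rate-distortion model from the cited papers, and the function names are my own.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... ; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def single_base_neighbors(codon):
    """Yield the 9 codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]

def substitution_counts():
    """Count silent vs. all single-base substitutions over the sense codons."""
    synonymous = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for neighbor in single_base_neighbors(codon):
            total += 1
            if CODON_TABLE[neighbor] == aa:
                synonymous += 1
    return synonymous, total

synonymous, total = substitution_counts()
print(f"{synonymous} of {total} substitutions are silent ({synonymous / total:.1%})")
```

Most of the silent substitutions fall in the third codon position, which is one reason the code is often described as buffered against translation and mutation errors.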
 
Likes BillTre and fresh_42
  • #7
It is quite easy: All parts of stochastics are useful in any natural science. This includes information theory and biology.
 
Likes BillTre and Fervent Freyja
  • #8
jedishrfu said:
Here's a 2015 Quanta magazine article on it:

https://www.quantamagazine.org/the-information-theory-of-life-20151119/

and Adami's related paper:

https://arxiv.org/ftp/arxiv/papers/1112/1112.3867.pdf

and this one from NCBI

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3220916/

I wasn't able to find any other uses other than using information theory to understand how life evolved.
From Adami:

we should think of the human genome — or the genome of any organism — as a repository of information about the world gathered in small bits over time through the process of evolution. The repository includes information on everything we could possibly need to know, such as how to convert sugar into energy, how to evade a predator on the savannah, and, most critically for evolution, how to reproduce or self-replicate.

I'm curious whether, under this definition, a spider whose behavior is 'hard-coded' rather than learned carries more information than a human infant, or whether it is more appropriate to think in terms of the species in aggregate.
 
  • #9
sysprog said:
Excerpt from https://en.wikipedia.org/wiki/Genetic_code:
  • Information channels: Information-theoretic approaches model the process of translating the genetic code into corresponding amino acids as an error-prone information channel.[87] The inherent noise (that is, the error) in the channel poses the organism with a fundamental question: how can a genetic code be constructed to withstand noise[88] while accurately and efficiently translating information? These "rate-distortion" models[89] suggest that the genetic code originated as a result of the interplay of the three conflicting evolutionary forces: the needs for diverse amino acids,[90] for error-tolerance[85] and for minimal resource cost. The code emerges at a transition when the mapping of codons to amino acids becomes nonrandom. The code's emergence is governed by the topology defined by the probable errors and is related to the map coloring problem.[91]
Right, so there should be some equivalent of a compression algorithm to eliminate noise and redundant information in a genome?
 
  • #10
BWV said:
Right, so there should be some equivalent of a compression algorithm to eliminate noise and redundant information in a genome?
Some of the noise and redundancy may have important function:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065233/

There's a lot of work being done regarding genomic compression:
https://spectrum.ieee.org/computing/software/the-desperate-quest-for-genomic-compression-algorithms

Here's an example:

PDF direct download link: https://www.mdpi.com/1999-4893/13/4/99/pdf

A New Lossless DNA Compression Algorithm Based on A Single-Block Encoding Scheme
Deloula Mansouri, Xiaohui Yuan* and Abdeldjalil Saidani
School of Computer Science and Technology,
Wuhan University of Technology, Wuhan 430070, China
Received: 24 March 2020; Accepted: 17 April 2020; Published: 20 April 2020
Abstract:
With the emergent evolution in DNA sequencing technology, a massive amount of genomic data is produced every day, mainly DNA sequences, craving for more storage and bandwidth. Unfortunately, managing, analyzing and specifically storing these large amounts of data become a major scientific challenge for bioinformatics. Therefore, to overcome these challenges, compression has become necessary. In this paper, we describe a new reference-free DNA compressor abbreviated as DNAC-SBE. DNAC-SBE is a lossless hybrid compressor that consists of three phases. First, starting from the largest base (Bi), the positions of each Bi are replaced with ones and the positions of other bases that have smaller frequencies than Bi are replaced with zeros. Second, to encode the generated streams, we propose a new single-block encoding scheme (SEB) based on the exploitation of the position of neighboring bits within the block using two different techniques. Finally, the proposed algorithm dynamically assigns the shorter length code to each block. Results show that DNAC-SBE outperforms state-of-the-art compressors and proves its efficiency in terms of special conditions imposed on compressed data, storage space and data transfer rate regardless of the file format or the size of the data.​
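As a rough illustration of the first phase described in the abstract, the sketch below turns a DNA string into one 0/1 stream per base, starting with the most frequent base. This is my reading of the abstract, not the paper's implementation: in particular, dropping already-encoded positions before building the next stream, and leaving the last base implicit, are assumptions, and the block-encoding and code-assignment phases are not shown. The function names are hypothetical.

```python
from collections import Counter

def to_bit_streams(seq):
    """Split `seq` into one 0/1 stream per base, most frequent base first.

    Assumption: positions of an already-encoded base are removed before the
    next stream is built, so the final base needs no stream of its own.
    """
    order = [base for base, _ in Counter(seq).most_common()]
    streams = []
    remaining = seq
    for base in order[:-1]:
        streams.append("".join("1" if ch == base else "0" for ch in remaining))
        remaining = remaining.replace(base, "")
    return order, streams, len(seq)

def from_bit_streams(order, streams, length):
    """Invert to_bit_streams, rebuilding from the least frequent base outward."""
    if not streams:
        return order[0] * length if order else ""
    remaining = order[-1] * streams[-1].count("0")
    for base, stream in zip(reversed(order[:-1]), reversed(streams)):
        rest = iter(remaining)
        remaining = "".join(base if bit == "1" else next(rest) for bit in stream)
    return remaining

order, streams, length = to_bit_streams("AACGTAAGCA")
print(order, streams)
print(from_bit_streams(order, streams, length))
```

The round trip shows the transform is lossless on its own; any actual saving would have to come from how the resulting bit streams are encoded afterwards.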
 
Likes BWV
  • #11
jedishrfu said:
Here's a 2015 Quanta magazine article on it:

https://www.quantamagazine.org/the-information-theory-of-life-20151119/

and Adami's related paper:

https://arxiv.org/ftp/arxiv/papers/1112/1112.3867.pdf

and this one from NCBI

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3220916/

I wasn't able to find any other uses other than using information theory to understand how life evolved.
I read the first paper. One thing to note is that one of its main examples, Integrated Information Theory (as a theory of consciousness), has taken some heavy criticism, and some would argue it has been debunked, most notably by Scott Aaronson, who concluded:

But let me end on a positive note. In my opinion, the fact that Integrated Information Theory is wrong—demonstrably wrong, for reasons that go to its core—puts it in something like the top 2% of all mathematical theories of consciousness ever proposed. Almost all competing theories of consciousness, it seems to me, have been so vague, fluffy, and malleable that they can only aspire to wrongness.

https://www.scottaaronson.com/blog/?p=1799

With these kinds of very high-level theories that aim to find the key to a mystery, you have to go back to Shannon's warning about the bandwagon. Using the word "information" in science is exciting, but what information means in information theory is usually not what people intuitively think it is. Because it is easy to conflate one's own idea of information with the information-theoretic one, it is fairly easy to write papers that sound both quantitative and deep or promising. Like Aaronson's description of most theories of consciousness, many of them are "so vague, fluffy, and malleable that they can only aspire to wrongness."
 
Likes jedishrfu, atyy and BWV
  • #12
Base pairs are essentially a base-4 system, so how could information theory not be relevant? There's also a lot of overlap with thermodynamics and Gibbs free energy.
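The base-4 remark has a direct coding consequence: with four symbols, a base needs at most log2(4) = 2 bits, and a skewed base composition lowers the per-base entropy below that. A minimal sketch, assuming sequences contain only A, C, G and T (the A=0, C=1, G=2, T=3 mapping is an arbitrary choice of mine):

```python
import math
from collections import Counter

BASE = "ACGT"
CODE = {b: i for i, b in enumerate(BASE)}  # arbitrary 2-bit code per base

def pack(seq):
    """Pack a DNA string into bytes at 2 bits per base (4 bases per byte)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for ch in chunk:
            byte = (byte << 2) | CODE[ch]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)

def unpack(data, length):
    """Recover the first `length` bases from 2-bit packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 3])
    return "".join(bases[:length])

def entropy_bits_per_base(seq):
    """Zero-order Shannon entropy of the base composition, in bits per base."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())

seq = "GATTACA"
packed = pack(seq)
print(len(seq), "bases ->", len(packed), "bytes;", unpack(packed, len(seq)))
print(f"{entropy_bits_per_base(seq):.3f} bits per base")
```

The entropy here only looks at single-base frequencies; real genomes also have longer-range structure (repeats, codon usage) that dedicated compressors try to exploit.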
 
Likes jim mcnamara

1. How is information theory related to biology?

Information theory is the mathematical study of how information is quantified, stored, and transmitted. In biology, it is used to understand how living organisms process and respond to information in their environment. It helps explain how genetic information is stored, transmitted, and expressed in living cells, and how organisms use information to adapt and survive.

2. What are the applications of information theory in biology?

Information theory has many applications in biology, including understanding the genetic code, gene expression, and protein folding. It is also used in bioinformatics to analyze and interpret large amounts of biological data, such as DNA sequences. Information theory is also useful in studying communication and signaling processes in living organisms, such as how cells communicate with each other and how animals use signals to interact with their environment.

3. How does information theory contribute to our understanding of evolution?

Information theory plays a crucial role in the study of evolution. It helps us understand how genetic information is passed down from one generation to the next, and how mutations and natural selection act on this information to drive evolutionary change. Information theory also helps us understand how organisms adapt and evolve in response to changes in their environment.

4. Can information theory be used to study complex biological systems?

Yes, information theory is a powerful tool for studying complex biological systems. It can be used to analyze and model the flow of information within a system, such as a cell, an organism, or an ecosystem. This allows us to gain insights into how these systems function and how they respond to changes in their environment.
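One common way to quantify "flow of information" between two parts of a system is mutual information. The sketch below computes it from a joint probability table; the signal/response numbers are hypothetical and stand in for something like a cell's input signal versus its response state.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over rows (X)
    py = [sum(col) for col in zip(*joint)]    # marginal over columns (Y)
    return sum(
        p * math.log2(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

# Hypothetical joint distribution: rows = signal absent/present,
# columns = response off/on. The response follows the signal 80% of the time.
noisy = [[0.4, 0.1],
         [0.1, 0.4]]
print(f"{mutual_information(noisy):.3f} bits")
```

A perfectly reliable binary response would carry 1 bit about the signal and an unrelated response would carry 0 bits, so the value gives a scale for how much the noise costs.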

5. What are the limitations of using information theory in biology?

While information theory has many applications in biology, it also has its limitations. One limitation is that it simplifies biological systems by focusing on the flow of information, rather than the complex interactions and processes that occur within living organisms. Additionally, information theory may not fully capture the nuances and complexities of biological systems, and it should be used in conjunction with other approaches for a more comprehensive understanding of biology.
