Is information theory useful in biology?

In summary, information theory is used in various fields, including biology, to understand communication systems and information transmission. It has been applied to the study of the genetic code and the evolution of life. There is ongoing research on using information theory for genomic compression to help with the storage and analysis of large amounts of DNA data. Some of the noise and redundancy in genomes may have important functions, making the use of compression algorithms more complex.
  • #1

BWV

  • #3
Here is what is written on Wikipedia:
In addition to mathematics, computer science and telecommunications, the theoretical consideration of communication through information theory is also used to describe communication systems in other areas (e.g. media in journalism, the nervous system in neurology, DNA and protein sequences in molecular biology, knowledge in information science and documentation).

I guess the answer is a clear Yes.
 
  • #4
Yup, it is a clear yes. For example, species diversity and genetic diversity calculations use methods that date back to Claude Shannon.
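To make the diversity remark concrete, here is a minimal sketch of the Shannon diversity index, H' = -sum(p_i ln p_i), where p_i is the fraction of individuals belonging to species i. The species names and counts are invented for illustration, and the evenness ratio at the end (Pielou's evenness) is one common normalisation, not something stated in the thread.

```python
import math
from collections import Counter

def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over the nonzero counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one observation")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical survey: individuals counted per species (made-up numbers).
survey = Counter({"oak": 40, "birch": 40, "pine": 20})

h = shannon_diversity(survey.values())
# Pielou's evenness: H' divided by its maximum, ln(number of species).
evenness = h / math.log(len(survey))
print(f"H' = {h:.3f} nats, evenness = {evenness:.3f}")
```

A community dominated by one species scores near zero, and one with equal counts across S species reaches the maximum, ln(S).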
 
  • #6
Excerpt from https://en.wikipedia.org/wiki/Genetic_code:
  • Information channels: Information-theoretic approaches model the process of translating the genetic code into corresponding amino acids as an error-prone information channel.[87] The inherent noise (that is, the error) in the channel poses the organism with a fundamental question: how can a genetic code be constructed to withstand noise[88] while accurately and efficiently translating information? These "rate-distortion" models[89] suggest that the genetic code originated as a result of the interplay of the three conflicting evolutionary forces: the needs for diverse amino acids,[90] for error-tolerance[85] and for minimal resource cost. The code emerges at a transition when the mapping of codons to amino acids becomes nonrandom. The code's emergence is governed by the topology defined by the probable errors and is related to the map coloring problem.[91]
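As a rough illustration of the error-tolerance idea in that excerpt, the sketch below counts how many single-base substitutions in the standard genetic code leave the encoded amino acid unchanged. This is a much simpler measure than the rate-distortion models the excerpt refers to, and it treats every substitution as equally likely, which is an assumption made only for this example. The function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_fraction(table):
    """Fraction of single-base substitutions, starting from a sense codon,
    that leave the amino acid unchanged (a mutation to a stop counts as a change)."""
    same = total = 0
    for codon, aa in table.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos in range(3):
            for b in BASES:
                if b == codon[pos]:
                    continue
                mutant = codon[:pos] + b + codon[pos + 1:]
                total += 1
                same += table[mutant] == aa
    return same / total

print(f"synonymous fraction: {synonymous_fraction(CODON_TABLE):.3f}")
```

Comparing this number against the same statistic for randomly shuffled codon assignments is one simple way to ask whether the real code is unusually robust to point errors.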
 
  • #7
It is quite easy: All parts of stochastics are useful in any natural science. This includes information theory and biology.
 
  • #8
jedishrfu said:
Here's a 2015 Quanta magazine article on it:

https://www.quantamagazine.org/the-information-theory-of-life-20151119/

and Adami's related paper:

https://arxiv.org/ftp/arxiv/papers/1112/1112.3867.pdf

and this one from NCBI

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3220916/

I wasn't able to find any other uses other than using information theory to understand how life evolved.
From Adami:

we should think of the human genome — or the genome of any organism — as a repository of information about the world gathered in small bits over time through the process of evolution. The repository includes information on everything we could possibly need to know, such as how to convert sugar into energy, how to evade a predator on the savannah, and, most critically for evolution, how to reproduce or self-replicate.

I'm curious whether, say, a spider whose behavior is 'hard-coded' rather than learned has more information than a human infant under this definition, or whether it is more appropriate to think in terms of the species in aggregate.
 
  • #9
sysprog said:
Excerpt from https://en.wikipedia.org/wiki/Genetic_code:
  • Information channels: Information-theoretic approaches model the process of translating the genetic code into corresponding amino acids as an error-prone information channel.[87] The inherent noise (that is, the error) in the channel poses the organism with a fundamental question: how can a genetic code be constructed to withstand noise[88] while accurately and efficiently translating information? These "rate-distortion" models[89] suggest that the genetic code originated as a result of the interplay of the three conflicting evolutionary forces: the needs for diverse amino acids,[90] for error-tolerance[85] and for minimal resource cost. The code emerges at a transition when the mapping of codons to amino acids becomes nonrandom. The code's emergence is governed by the topology defined by the probable errors and is related to the map coloring problem.[91]
Right, so there should be some equivalent of a compression algorithm to eliminate noise and redundant information in a genome?
 
  • #10
BWV said:
Right, so there should be some equivalent of a compression algorithm to eliminate noise and redundant information in a genome?
Some of the noise and redundancy may have important function:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065233/

There's a lot of work being done regarding genomic compression:
https://spectrum.ieee.org/computing/software/the-desperate-quest-for-genomic-compression-algorithms

Here's an example:

PDF direct download link: https://www.mdpi.com/1999-4893/13/4/99/pdf

A New Lossless DNA Compression Algorithm Based on A Single-Block Encoding Scheme
Deloula Mansouri, Xiaohui Yuan * and Abdeldjalil Saidani
School of Computer Science and Technology,
Wuhan University of Technology, Wuhan 430070, China
Received: 24 March 2020; Accepted: 17 April 2020; Published: 20 April 2020
Abstract:
With the emergent evolution in DNA sequencing technology, a massive amount of genomic data is produced every day, mainly DNA sequences, craving for more storage and bandwidth. Unfortunately, managing, analyzing and specifically storing these large amounts of data become a major scientific challenge for bioinformatics. Therefore, to overcome these challenges, compression has become necessary. In this paper, we describe a new reference-free DNA compressor abbreviated as DNAC-SBE. DNAC-SBE is a lossless hybrid compressor that consists of three phases. First, starting from the largest base (Bi), the positions of each Bi are replaced with ones and the positions of other bases that have smaller frequencies than Bi are replaced with zeros. Second, to encode the generated streams, we propose a new single-block encoding scheme (SEB) based on the exploitation of the position of neighboring bits within the block using two different techniques. Finally, the proposed algorithm dynamically assigns the shorter length code to each block. Results show that DNAC-SBE outperforms state-of-the-art compressors and proves its efficiency in terms of special conditions imposed on compressed data, storage space and data transfer rate regardless of the file format or the size of the data.
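The first phase described in that abstract can be sketched roughly as follows. This is only one reading of the abstract, not the authors' implementation: it assumes that bases already encoded by an earlier stream are dropped from later streams, and it makes no attempt at the second and third phases (the block encoding and code assignment). The function names are hypothetical.

```python
from collections import Counter

def split_into_streams(seq):
    """Return (base order, binary streams, tail length), most frequent base first."""
    order = [base for base, _ in Counter(seq).most_common()]
    streams, remaining = [], seq
    for base in order[:-1]:  # the rarest base is implied by whatever is left
        streams.append("".join("1" if c == base else "0" for c in remaining))
        remaining = "".join(c for c in remaining if c != base)
    return order, streams, len(remaining)

def rebuild(order, streams, tail_len):
    """Invert split_into_streams, working from the rarest base back up."""
    seq = order[-1] * tail_len if order else ""
    for base, bits in zip(reversed(order[:-1]), reversed(streams)):
        rest = iter(seq)
        seq = "".join(base if bit == "1" else next(rest) for bit in bits)
    return seq

order, streams, tail_len = split_into_streams("ACGTAAC")
print(order, streams, tail_len)
print(rebuild(order, streams, tail_len))
```

The round trip through `rebuild` is what makes the split lossless. Any real savings would come from how the resulting bit streams are encoded afterwards, which this sketch does not cover.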
 
  • #11
jedishrfu said:
Here's a 2015 Quanta magazine article on it:

https://www.quantamagazine.org/the-information-theory-of-life-20151119/

and Adami's related paper:

https://arxiv.org/ftp/arxiv/papers/1112/1112.3867.pdf

and this one from NCBI

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3220916/

I wasn't able to find any other uses other than using information theory to understand how life evolved.
I read the first paper. One thing to note is that one of the main examples in the paper, Integrated Information (as a theory of consciousness), has taken heavy criticism (some would argue it has been debunked), notably from Scott Aaronson, who concluded:

But let me end on a positive note. In my opinion, the fact that Integrated Information Theory is wrong—demonstrably wrong, for reasons that go to its core—puts it in something like the top 2% of all mathematical theories of consciousness ever proposed. Almost all competing theories of consciousness, it seems to me, have been so vague, fluffy, and malleable that they can only aspire to wrongness.

https://www.scottaaronson.com/blog/?p=1799

With these kinds of very high-level theories that aim to find the key to a mystery, you have to go back to Shannon's warning about the bandwagon. Using the word information in science is exciting. But what information is in information theory is usually not what people intuitively think information is. Because it is easy to conflate one's own idea of information with information in the information-theoretic sense, it is fairly easy to write papers that sound both quantitative and deep or promising. Like Aaronson's description of most theories of consciousness, many of them are "so vague, fluffy, and malleable that they can only aspire to wrongness."
 
  • #12
Base pairs are essentially a base-4 system, so how could information theory not be relevant? There's also a lot of overlap with thermodynamics and Gibbs free energy.
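Taking the base-4 remark literally, each base carries at most log2(4) = 2 bits, so four bases fit in one byte. The sketch below packs and unpacks an ACGT string that way. The particular bit assignment (A=0, C=1, G=2, T=3) is an arbitrary choice for this example, and ambiguity codes such as N are not handled.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # arbitrary assignment, chosen for this sketch
BASE = "ACGT"

def pack(seq):
    """Pack an ACGT string into bytes, four bases per byte; also return its length."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for ch in chunk:
            byte = (byte << 2) | CODE[ch]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)

def unpack(data, n):
    """Recover the first n bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 3])
    return "".join(bases[:n])

packed, n = pack("GATTACA")
print(len(packed), unpack(packed, n))
```

Two bits per base is the ceiling for a four-letter alphabet. A sequence with skewed base frequencies or repeated motifs has lower entropy than that, which is the slack compression algorithms try to exploit.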
 

1. How is information theory related to biology?

Information theory is a branch of mathematics that deals with the quantification, storage, and communication of information. In biology, information theory has been applied to understand how living organisms gather, process, and transmit information through genetic, neural, and behavioral processes.

2. What are the applications of information theory in biology?

Information theory has numerous applications in biology, including understanding the genetic code, analyzing gene regulatory networks, studying animal communication and behavior, and modeling neural networks. It has also been used to study the evolution of biological systems and to design efficient communication networks in artificial and natural systems.

3. Can information theory help in understanding complex biological systems?

Yes, information theory provides a framework for understanding how complex biological systems function, communicate, and evolve. By quantifying and analyzing the flow of information within and between systems, information theory can reveal patterns and principles that govern the behavior of these systems.

4. What are the limitations of using information theory in biology?

One limitation of using information theory in biology is that it simplifies the complexity of living systems and may not fully capture the nuances of biological processes. Additionally, the assumptions and models used in information theory may not always accurately represent biological systems, leading to potential errors in analysis and interpretation.

5. How can information theory contribute to advancements in biological research?

Information theory can contribute to advancements in biological research by providing a quantitative and mathematical framework for analyzing complex biological systems and processes. It can also aid in the design of experiments and models, leading to a deeper understanding of biological systems and potential applications in fields such as medicine and biotechnology.
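One standard way to quantify the "flow of information" between two parts of a system, as mentioned in the answer to question 3, is mutual information. The sketch below estimates it from paired observations using a simple plug-in estimator. The stimulus/response framing is a hypothetical example, and this estimator is known to be biased upward for small samples.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# Hypothetical data: a response that always mirrors a binary stimulus.
observations = [("on", "fire"), ("off", "rest")] * 50
print(f"{mutual_information(observations):.3f} bits")
```

Mutual information is zero when the two variables are statistically independent and grows as one becomes more predictable from the other.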
