
Biology based on Principles

  1. Dec 8, 2013 #1
    Hi all,

My understanding is that biology is mostly an empirical science, perhaps because biological systems are very complex, heterogeneous, and highly modular. My vague question is: is it possible to do for biological systems what physics has done for condensed matter systems? Those systems are probably less heterogeneous and complex, but physicists have been able to isolate what is essential and develop very satisfying models that begin with first principles rather than 'curve fitting'.

    Is the goal of systems biology to do something like this for biological signaling?
    Statistical physics essentially averages over equilibrium systems of many tiny parts to deliver useful results. What prevents us from doing the same in biological systems?

  3. Dec 9, 2013 #2



    This is far outside my field, but I was speaking the other day to a PhD working on mathematical modelling of gene regulatory networks. There is a lot of work on mathematically modelling large biological systems, but also a lot of scepticism about how good it can be for the foreseeable future, given the huge number of variables (most of which we don't know) as well as environmental factors. In the course of that discussion we looked at a paper that had modelled a network, and, as happens in the majority of these cases, when they tested it in real life they got wildly different results due to things they couldn't take into account.

    When you have tens of thousands of genes, even more proteins, sugars, nucleic acids, metabolites, environmental factors, etc., it's really impractical to make a model that takes into account anything but the simplest of systems for niche applications.
  4. Dec 9, 2013 #3



    That's one of the goals in some people's minds. For others it's just as simple as saying "you can't study genes/neurons/proteins as solitary things, you have to study them in the context of the system because their function is a product of the context". So systems biology has a pragmatic side, too.

    It's a multidisciplinary field, and each discipline has its own "paradigm" for approaching the subject. One fundamental idea, though, is that you essentially have a network of accelerators and brakes, so reaction-diffusion systems fit many biological models really well. But I am a dynamical systems theorist, so that's my "paradigm".

    There are probably many simplifying cases where it is valid to do that, but in general biological systems are degenerate, and the heterogeneity in the system may be required for functionality. There are cases in the following paper where averaging a parameter leads to a nonfunctional value because the functional regime has a doughnut-shaped distribution. When you take the average, you end up in the "hole" of the doughnut, thereby missing the functional regime of the model.
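    A toy sketch of that failure mode (all numbers invented): suppose the functional parameter pairs lie on a ring of radius 1 around the origin, and the origin itself is nonfunctional. The mean of the functional points lands squarely in the hole.

    ```python
    import math

    # Hypothetical example: "functional" parameter pairs lie on a ring
    # of radius 1 around the origin; the origin is a nonfunctional
    # parameter combination.
    points = [(math.cos(2 * math.pi * k / 100), math.sin(2 * math.pi * k / 100))
              for k in range(100)]

    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)

    # Every sampled point is functional (distance 1 from the origin)...
    assert all(abs(math.hypot(x, y) - 1.0) < 1e-9 for x, y in points)

    # ...but the naive average lands in the doughnut's hole.
    print(math.hypot(mean_x, mean_y))  # ~0: far from the functional ring
    ```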


    a more general paper about degeneracy in biological systems:

    I'd also note that this is probably the same with many complex physical systems, too (that you lose a lot of important information taking averages).
  5. Dec 9, 2013 #4



    The answer is sometimes yes (see van Vreeswijk and Sompolinsky or Nicolas Brunel's work for excellent biologically motivated examples). But I think the more important answer is that many things in condensed matter are also phenomenological. For example, Fermi liquid theory or even classical equilibrium statistical mechanics. Just as universality is one lesson of the renormalization group, another lesson is that many interesting regimes are not universal. Also, the renormalization group does not tell you what the order parameter should be.

    A field where complicated models have been usefully fit using lots of data is speech recognition (hidden Markov models, there seems to have been recent progress with multilayer artificial neural networks or "deep learning" too). I think this is closer to where biology is most of the time.

    I would also add as a warning in the same spirit Freeman Dyson's comment about bicycles: "You can't possibly get a good technology going without an enormous number of failures. It's a universal rule. If you look at bicycles, there were thousands of weird models built and tried before they found the one that really worked. You could never design a bicycle theoretically. Even now, after we've been building them for 100 years, it's very difficult to understand just why a bicycle works - it's even difficult to formulate it as a mathematical problem. But just by trial and error, we found out how to do it, and the error was essential. The same is true of airplanes." http://www.wired.com/wired/archive/6.02/dyson_pr.html

    Even the standard model of particle physics needed experimental guidance. And as we now know, because it is an effective theory, it too is curve fitting. So ironically, yes biology can be just like physics, because both are just curve fitting. :tongue:
    Last edited: Dec 9, 2013
  6. Dec 9, 2013 #5

    Throw me in this boat. I'm a bit skeptical of modeling papers, such as ones that try to reproduce cell signaling pathways reduced to circuits or other simplified models. If you read the fine print of many models, it is often assumed that certain processes are at steady state, which in reality isn't true in many of the cases that matter. Steady state is often assumed just to make the calculations and modeling easier.

    It is maddeningly difficult to model things for which no template exists. For example, even a perfect model of gene regulatory networks would in no way translate exactly into accurate predictions of protein expression, or of the morphology controlled by protein expression. Anyone who has ever done a PCR experiment followed by a western blot knows there can be massive chasms between gene expression and fully functional protein levels. Things like post-translational modifications (PTMs) are extremely complex and have no template (like the DNA code) from which we can model. If you took cells from two different types of tissue and looked at the same protein, those proteins would behave in different ways because of the PTM heterogeneities on them. In other words, even a perfect model for one cell type could easily fall apart for a different cell type. And how many different types of cells does a living organism have?

    I work in the field of glycobiology. If you think the genome-proteome and mathematical modeling of it is complex, consider this: 3 amino acids can only produce 6 different sequences. Since carbohydrate structures depend not only on sequence but on their 3D spatial arrangement, 3 of the carbohydrates typically used in mammalian cells could theoretically produce roughly 25,000 different structures. Take just 6 carbohydrates and the number of possible combinations explodes to about 1,000,000,000,000. The amount of information that could be encoded in the glycome could vastly exceed that of the genome.

    Each of those possible glycan structures attached to a protein can modulate everything from its function and cellular trafficking to its surface half-life and signaling. The problem is that there is simply no template controlling this "sugar code" that we could use to model this major PTM. And that's only one PTM. Each of those glycan structures could be modified even further with phosphorylation, sulfation, acetylation, etc., at different sites of the glycan and in multiple patterns (each modification potentially modulating cell signaling further) to yield a properly fine-tuned protein capable of signaling in response to spatial and temporal cues. The complexity one has to deal with to properly model a mammalian cell is truly mind-blowing.
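    To get a feel for the multiplicative explosion (the exact counts above come from the glycobiology literature; the per-linkage factor below is a placeholder, not the real number), here's a back-of-envelope sketch:

    ```python
    from math import factorial

    # Linear peptides: sequence order is the only degree of freedom.
    print(factorial(3))   # 3 distinct amino acids -> 6 possible sequences

    # Glycans multiply in extra degrees of freedom per glycosidic bond:
    # which hydroxyl position on the acceptor sugar is used, and the
    # alpha/beta anomeric configuration. (4 positions x 2 anomers is an
    # assumed, conservative factor; real counts also include branching
    # and ring size, which is why published estimates are far larger.)
    def linear_glycan_isomers(n_sugars, per_linkage=4 * 2):
        return factorial(n_sugars) * per_linkage ** (n_sugars - 1)

    print(linear_glycan_isomers(3))   # 6 * 8^2   = 384
    print(linear_glycan_isomers(6))   # 720 * 8^5 = 23,592,960
    ```

    Even with deliberately modest per-linkage factors and no branching, the count grows exponentially in chain length, which is the qualitative point.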
    Last edited: Dec 9, 2013
  7. Dec 10, 2013 #6
    That was a great synopsis!
  8. Dec 10, 2013 #7



    Just to make sure the narrative isn't getting too skewed: there are places where modelling has been successful in predicting reduced system dynamics, such as the Brusselator and possibly the Oregonator, which are common in biochemical engineering.
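    As a concrete illustration of that kind of reduced dynamics, here is a minimal Euler-integration sketch of the Brusselator (parameter values chosen only to put it in its oscillatory regime, not fit to any real system):

    ```python
    # Brusselator: dx/dt = A + x^2*y - (B+1)*x,  dy/dt = B*x - x^2*y.
    # Its fixed point (A, B/A) loses stability via a Hopf bifurcation
    # when B > 1 + A^2, giving sustained chemical oscillations.
    A, B = 1.0, 3.0          # B > 1 + A^2, so a limit cycle is expected
    x, y = 1.2, 1.0          # start near the fixed point (1, 3)
    dt, steps = 0.001, 200_000

    xs = []
    for _ in range(steps):
        dx = A + x * x * y - (B + 1.0) * x
        dy = B * x - x * x * y
        x, y = x + dt * dx, y + dt * dy
        xs.append(x)

    # Late-time x keeps swinging over a wide range instead of settling.
    late = xs[len(xs) // 2:]
    print(min(late), max(late))
    ```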

    Some amount of complexity is manageable with a multi-model approach (as demonstrated in the Eve Marder paper I cited above). But, as Ryan said, tens of thousands of genes is out of control if you try to model it in a reductionist way. A higher-level model that considers the whole system as one "particle" with stochastic properties would probably be more successful than a reductionist model that tries to describe every moving part (because of the degeneracy inherent in such systems).

    The steady-state assumptions are valid in the limit, but you should be able to demonstrate that limit experimentally (for instance, that the time scales are well enough separated). So "models = bad" shouldn't be what the reader takes away. Models that pay respect to experiment can help focus experimental efforts and make falsifiable predictions. Often, though, you do see people get carried away with their abstract models and go off to mathland, where experimental evidence takes a back seat; that becomes less science and more math. But many modellers do work closely with experimental collaborators, and it's frustrating and difficult and slow, but it's science.
  9. Dec 10, 2013 #8
    One example I like to think about is which comes first: does metabolism regulate genes, or do genetic networks regulate metabolism? A strong argument could be made either way (see the arguments over the Warburg effect). There are certainly examples of metabolic networks behaving as exquisite biosensors that regulate gene transcription through metabolite flux. Everyone knows that cells respond to spatial and temporal cues, and one likely route is through metabolism altering genetic regulation.

    There are models out there that do work at a higher level, such as treating stem cells and stem cell differentiation as stochastic processes, and one could even back such a model up with experimental data. However, how often do people in biology check what happens when you change established experimental techniques for cell work? For example, what happens if you do something as simple as changing the concentration of glucose in DMEM? Why is 4.5 g/L used in high-glucose media, and why do we use the concentrations of the 50 or so other ingredients in typical media formulations? Cells behave differently if you change glucose concentrations, because their metabolism changes, which can alter genetic transcription, patterns of glycosylation on proteins, and even protein levels and function.

    There are examples where people have knocked out (KOed) certain glycosylation genes and the cells look perfectly normal and happy. However, knock the same gene out at the whole-organism level and the defect becomes embryonically lethal. The point is that even patterns of glycosylation, which are inherently linked to cellular metabolism and environment, are critical for a properly functioning organism. So even though one may be able to perfectly model cell behavior at the cellular level and back it up experimentally with cell work done in a restricted, controlled environment, the same cell behavior may not be even remotely close to what happens at the whole-organism level, since the environment and the continuum of metabolism, metabolites, and metabolic flux are virtually impossible to replicate experimentally. It can be extremely difficult to detect that something is wrong with a model of cell behavior if all you are looking at is the cellular level, without observing from a higher plane (tissue/organism).

    I don't think modelling is bad, but one must realize the limitations of all models. For instance, modeling a bacterium or yeast that is producing a protein you want could be useful, because no one cares about the whole organism, just the rates or amounts of protein you can make.
    Last edited: Dec 10, 2013
  10. Dec 10, 2013 #9



    Yeah, I think that's an important point with all modelling, even outside of biology: the map is not the territory.

    In some cases, when experimenters do provide additional data, you can adjust your model to account for ambient modulation. Often you replace a parameter with a function: instead of a constant kinetic on-rate, your on-rate becomes a function of the glucose concentration. This is the effect modulation generally has on channel receptors... it changes the shape of the curve, so your parameters are described not as constants but as functions. And you still have the limitation that if glucose levels get too high, many other assumptions start being violated. I guess that's the key: knowing your assumptions well (they are often hidden) and knowing their limitations.
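    In code the change is tiny (every number here is invented for illustration): a rate that was a constant becomes a function of the ambient variable.

    ```python
    # Before: binding flux = K_ON * L * R with K_ON a fixed constant.
    # After: promote the constant to a function of glucose concentration.
    # The saturating, Michaelis-Menten-like shape is an assumption.
    def k_on(glucose_mM, k_max=1e6, half_mM=5.0):
        return k_max * glucose_mM / (half_mM + glucose_mM)

    L_conc, R_conc = 1e-9, 1e-9   # arbitrary ligand/receptor levels
    for g in (1.0, 5.0, 25.0):    # low, half-saturating, high glucose
        print(g, k_on(g) * L_conc * R_conc)
    ```

    The same caveat from the text applies: outside the range where the assumed shape was measured, the function is no more trustworthy than the constant it replaced.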
  11. Dec 10, 2013 #10



    While it remains difficult to trust models to make predictions in biology, what models can do effectively in biology is help us understand a set of data better. A good example from the field of systems biology and cell signaling is the study of various signaling pathways that lead to all-or-none responses (e.g. bistable systems). Constructing mathematical models from the various protein-protein interactions involved shows that many such systems require two essential parts: a positive feedback loop combined with a step involving an ultrasensitive response (i.e. a dose-response curve with a Hill coefficient > 1) (see Ferrell JE. 1999. Building a cellular switch: more lessons from a good egg. Bioessays 21: 866 doi:10.1002/(SICI)1521-1878(199910)21:10<866::AID-BIES9>3.0.CO;2-1). Indeed, while this model was originally developed to explain the response of frog eggs to steroid hormone, it has also been shown to apply to other bistable systems, such as lactose metabolism in E. coli. Of course, these models are very simplistic and often ignore important factors in these signaling pathways, but they are nevertheless useful for abstracting simple concepts that apply to a diverse set of biological problems.
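    A minimal one-variable sketch of that architecture (the rate law and all parameter values are invented; the Hill exponent n = 4 supplies the ultrasensitivity, and the Hill term feeding back on x supplies the positive feedback):

    ```python
    # dx/dt = b + V * x^n / (K^n + x^n) - k * x
    # For n > 1 and suitable parameters this is bistable: the same
    # pathway settles "off" or "on" depending only on where it starts.
    def steady_state(x, b=0.05, V=1.0, K=0.5, n=4, k=1.0,
                     dt=0.01, steps=20_000):
        for _ in range(steps):
            x += dt * (b + V * x**n / (K**n + x**n) - k * x)
        return x

    off = steady_state(0.0)   # starts low  -> settles near "off" (~0.05)
    on = steady_state(2.0)    # starts high -> settles near "on"  (~0.99)
    print(off, on)
    ```

    Dropping the Hill coefficient to n = 1 (a Michaelian, non-ultrasensitive step) collapses the two states into one, which is the Ferrell point: the switch needs both feedback and ultrasensitivity.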

    One problem with models that begin from first principles is that they must often work at a scale different from that accessible to experimentalists. For example, in the field of protein science, the most common ab initio method of modeling protein behavior is molecular dynamics (MD) simulation. These simulations must step through time femtosecond by femtosecond, and until recently provided data only on the nano- and microsecond scale. Thus, it was difficult to make predictions from these models, as many protein behaviors occur on at least the millisecond timescale, and experimentalists had little data on the nano- and microsecond-scale behavior of proteins.

    In the field of protein folding, however, fast-folding proteins fold on the microsecond to millisecond timescale, and the most powerful supercomputers can now generate MD simulations on the millisecond timescale (see for example Lindorff-Larsen et al 2011. How Fast-Folding Proteins Fold. Science 334:517. doi:10.1126/science.1208351). Thus, we are getting to the point where theoreticians and experimentalists can start comparing their results. More powerful or different computational approaches, however, are required to look at other protein behaviors, such as catalysis, that occur on much longer timescales.