Bioinformatics and observing genetic mutations

  1. Oct 31, 2015 #1

    For starters, I am an Electrical Engineering student and have very little formal training in the field of Biology. Also, I am not sure if this question is more suited for the Bio or CS section of the forum.

    My questions are:

    - How is Bioinformatics used to observe genetic mutations, specifically in a hereditary disorder such as Cystic Fibrosis?

    - Is this currently the (or one of the) most effective way to examine a disease like CF?

    I would like to learn more about Bioinformatics and the day-to-day research. If you don't mind telling me about some of your experiences, that would be great.

  3. Nov 1, 2015 #2
    I too am interested in knowing how a gene sequence is decoded. Even for normal genes I find it hard to figure out how each sequence governs a particular part of the bodily functions. We've all learned about tongue bending, opposable thumbs, and dominant and recessive traits, but how do scientists decode these things? Sounds like a trade secret to me.
  4. Nov 1, 2015 #3


    Science Advisor

    Doctors will take a sample from their patients (e.g. a cheek swab), extract the DNA, then perform DNA sequencing on the specific sites in the patient's DNA that they want to look at. Cystic fibrosis is caused by specific mutations in the CFTR gene, so sequencing that gene will tell us whether those mutations are present and therefore whether the patient has that specific disease.

    Yes, DNA sequencing is the gold standard diagnostic for genetic diseases like CF.

    Figuring out which DNA sequences govern particular functions in the body is the primary goal of the field of genetics, so to get a more complete answer you would need to read a textbook on genetics. Biologists have discovered the functions of genes in many ways. One way is to randomly mutate genes in an experimental organism (typically bacteria, yeast, fruit flies, plants, or worms), find specific mutants that have alterations in the trait that you're interested in studying, then figure out which genes have been altered in those mutants. This approach is called forward genetics, as it starts from a trait of interest and allows one to identify the genes that control the trait. The opposite approach (reverse genetics) involves mutating the particular gene that you are interested in studying, then figuring out what effect mutation of that gene has on the various traits of your experimental organism. These methods of genetic screening have contributed to much of our fundamental knowledge of biology.

    Of course, such studies cannot be done in humans. For human studies, we instead will group together individuals with a common trait (e.g. people with a specific disease that we think may be genetic), sequence or otherwise characterize their DNA, and look for similarities between the individuals with the particular trait versus the general population. These types of studies are called genome-wide association studies.
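
    As a rough illustration of the comparison described above (people with a trait versus the general population), here is a minimal Python sketch of an association test for a single variant. The counts are invented, the function name is hypothetical, and a real genome-wide association study repeats a test like this across a very large number of variants and corrects for multiple testing.

```python
def chi_square_2x2(case_carriers: int, case_noncarriers: int,
                   control_carriers: int, control_noncarriers: int) -> float:
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / denominator


# Invented counts: 30 of 100 cases carry the variant, 10 of 100 controls do.
chi2 = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square statistic: {chi2:.2f}")
```

    A larger statistic means the variant is distributed more unevenly between the two groups than chance alone would suggest.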
  5. Nov 1, 2015 #4
    Great reply Ygggdrasil.

    How is the DNA extracted and transferred to a form that can be observed? (From DNA to computer software/program)

    Also, once DNA information is present on a computer screen, how is it observable? In other words, there must be HUGE amounts of data from the DNA - how is it condensed to allow a researcher to view the information?

    Thank you for contributing!
  6. Nov 1, 2015 #5


    Science Advisor

    Here is a page with some links to sites that explain some of these processes: http://ghr.nlm.nih.gov/handbook/testing/procedure

    For tests that examine specific genetic diseases, the amount of data is not huge because you know where in the genome to look and can analyze the specific sequence of DNA that may or may not carry the particular mutation involved. The DNA sequencing procedure will give you a string of ~ 1,000 letters, and you (or a computer) look at a particular set of letters to see whether there is a typo or not. This is something I do routinely in my own research when introducing mutations into yeast or bacteria.
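
    The "look for a typo" step can be sketched in a few lines of Python. This is only a toy, assuming the read and the reference are already aligned and have the same length; the sequences and the function name are made up for illustration and are not taken from any real gene.

```python
def find_mismatches(reference: str, read: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, read_base) for every differing position.

    Positions are 0-based. Both strings must already be aligned and equal in length.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref_base, read_base)
        for position, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


# Toy sequences (hypothetical): the patient read differs at one position.
reference = "ATGGCGTACCTGAAAGTT"
patient = "ATGGCGTACCTGGAAGTT"

for position, ref_base, read_base in find_mismatches(reference, patient):
    print(f"position {position}: reference {ref_base}, patient {read_base}")
```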

    Whole genome sequencing is becoming more common as a diagnostic tool, and such procedures, which sequence the full 3 billion base pairs of the human genome (rather than just the 1,000 base pairs around a target site), generate much more data and require more sophisticated analysis tools. Most analyses identify mutations by comparing the patient's sequence to a reference genome, then compare those mutations to a database of known pathogenic mutations. Many mutations will not be in those databases, and in some cases it can be difficult to determine whether they will have any effect. This is especially a problem in cancer studies, as cancer cells are notoriously mutation-prone. Cancer cells therefore carry a number of mutations, and it's difficult to separate driver mutations (the mutations that are actually causing the cancer) from passenger mutations (mutations that have arisen spontaneously but do not contribute to the malignancy of the cancer).
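
    The two-step idea described above (compare to a reference, then look the differences up in a database) can be sketched as follows. This is a simplified Python toy, assuming aligned sequences of equal length and single-base substitutions only; the sequences, the database contents, and the function names are all hypothetical. Real pipelines also have to align reads and handle insertions and deletions.

```python
from typing import Dict, List, Tuple

# A variant is (0-based position, reference base, sample base).
Variant = Tuple[int, str, str]


def call_variants(reference: str, sample: str) -> List[Variant]:
    """List single-base differences between an aligned sample and the reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


def annotate(variants: List[Variant],
             known_pathogenic: Dict[Variant, str]) -> List[Tuple[Variant, str]]:
    """Label each variant with its database entry, or mark it as unknown."""
    return [
        (variant, known_pathogenic.get(variant, "unknown significance"))
        for variant in variants
    ]


# Toy data (hypothetical): one variant is in the database, one is not.
reference = "ACGTACGTAC"
sample = "ACGAACGTCC"
known_pathogenic = {(3, "T", "A"): "toy disorder X"}

for variant, label in annotate(call_variants(reference, sample), known_pathogenic):
    print(variant, "->", label)
```

    The variant that is missing from the toy database ends up labelled "unknown significance", which is the hard case the post describes.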

    If you're interested in hands-on learning of bioinformatics, ROSALIND is a good site, modeled after Codecademy, for learning some of the basics: http://rosalind.info/problems/locations/
  7. Nov 1, 2015 #6
    Thank you very much for taking the time to share this information, Ygggdrasil.
  8. Nov 2, 2015 #7
    I can add that Rosalind's Pevzner and Compeau have Coursera MOOC courses that are relatively accessible whether or not you program, and they got me started on things like recognizing replication initiation markers. It is stuff that you can do on your home computer because the strings are 1000 bases long or so, or that you can relatively quickly learn to study with freely available databases and sites that run the algorithms for you.

    It seems to be a new field where everyone is pretty new and helpful, with great discussion forums, and you can either work in it or else benefit from its success. Someone said that we shouldn't say "astronomical numbers" anymore: since the databases and data-crunching tools and needs are expanding much faster than in astronomy, the new term for "amazingly, stupendously large" is "genomic numbers"...
  9. Nov 2, 2015 #8



    Yeah, I'm not totally in agreement with Ygg on this, but he's far more knowledgeable than I. First, a gene isn't diseased; an organism is (or if you want to split hairs you can reduce it down to an "organ system"). One of the "dreams" of bioinformatics is to have the ability to model organ systems (and eventually the organism). That is a pipe dream today. Very, very few diseases are caused by only one mutation or even by mutations on only one "gene". The presentation and (especially) the severity of a disease usually depend on a constellation of genes (and possibly non-gene DNA). To use tongue-rolling as an example is way too ambitious. It's the equivalent of speaking about engineering interstellar engines while we struggle with Earth-Moon propulsion systems. Let's see, I think there are about 16 genes associated with eye color, which is far "simpler" than CF or tongue-rolling (it would seem). Picture a system of 20,000 moving parts and predict what effect changes to 16 of them would have... Finally, it's NOT just DNA!! You have: the organism, its environment, its epigenetics, its microbiome, the interactions between its various organs (at various stages of development and aging), the DNA, the RNA, the proteins. All of them are worthy of study and to some extent (the above list is from lesser to greater, roughly) should be/could be included in "bioinformatics".
  10. Nov 2, 2015 #9


    Science Advisor

    I definitely agree. Only for very few traits does a single gene control that trait. Most traits are controlled by many genes, and most genes influence multiple traits. Furthermore, almost all traits have some component that is genetic and some component that is environmental, and the environment can greatly affect how your genes affect your traits. The classic example here is phenylketonuria (PKU), a metabolic disease that can cause intellectual disability. However, if those with PKU follow a specific diet restricting consumption of the amino acid phenylalanine, which they have problems processing, the disease does not lead to intellectual disability and they can live normal lives.

    By looking at how certain traits vary among fraternal twins (they grow up in very similar environments but ~half of their genes are different) versus identical twins (similar environments and same genes), one can estimate the heritability of traits. For example, studies have estimated that genetics accounts for ~80% of the variation in height among individuals (with environmental factors such as diet and development accounting for the other 20% of variation). However, when researchers have tried to identify the genes that influence height, they've found that hundreds to thousands of loci are involved, and each of these loci has only a tiny effect on an individual's height; the combined effect of ~700 loci accounts for only ~20% of the variation in human height. It turns out many of the traits that we care about have very complex genetics like this, which suggests that it would be difficult to engineer such traits in the future.
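
    One classical way to turn the twin comparison above into a number is Falconer's formula, which estimates heritability as twice the difference between the trait correlation in identical twins and the trait correlation in fraternal twins. The post does not say which method the height studies used, so the Python sketch below is only an illustration, and the correlation values in it are invented.

```python
def falconer_heritability(r_identical: float, r_fraternal: float) -> float:
    """Falconer's estimate: h^2 = 2 * (r_identical - r_fraternal).

    r_identical: trait correlation between identical (monozygotic) twins.
    r_fraternal: trait correlation between fraternal (dizygotic) twins.
    """
    return 2.0 * (r_identical - r_fraternal)


# Invented correlations, chosen only to illustrate the arithmetic.
h2 = falconer_heritability(r_identical=0.9, r_fraternal=0.5)
print(f"estimated heritability: {h2:.2f}")
```

    The estimate rests on the assumption stated in the post: both kinds of twins share their environments to a similar degree, so the extra similarity of identical twins is attributed to genes.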

    Again, most of the explanations here will necessarily be incomplete. A full understanding of all the issues involved in deciphering how our genes affect us would require taking a few courses in genetics.