
Bioinformatics project to help find cure for ALS

  1. Nov 23, 2014 #1

    We (*) are trying to set up a combined neuroscience/information technology project for the following purpose:

    ALS is a fatal neurological disease that has no known cause or cure and kills its victims within a few years. Our view is that one reason all drugs have failed in clinical trials, in spite of promising preclinical data, is the assumption that ALS is a single disease. On the contrary, there seems to be a lot of variation in, for example, the site of onset, the relative involvement of upper and lower motor neurons, associated genes, aggregating proteins and anomalous biochemical processes.

    What we are trying to do is to assemble a group of neuroscientists and computer scientists to figure out a way to code an intelligent search robot to go through the ever-expanding mass of medical publications (e.g. PubMed) in order to make it possible to distinguish between different forms of ALS. The ultimate goal is to include in this model genetic profiling data, laboratory analysis data etc. as well as information on the effect of different drugs on the biochemical processes, so that one could design a personalized medication cocktail based on the genetic profiling and lab data. The fundamental task is to design and implement an optimal data structure for this purpose.
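As a starting point for discussing the data structure mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CaseFeatures` schema, its field names and the `index_by_gene` helper are hypothetical, and the records in the usage note are made-up placeholders, not real findings. It only shows one possible shape, a flat feature record plus an inverted index, and is not a proposal for the final design.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CaseFeatures:
    """Features extracted from one publication or record (hypothetical schema)."""
    source_id: str             # identifier of the source, e.g. a PubMed ID
    onset_site: str            # e.g. "bulbar" or "limb"
    neuron_involvement: str    # "upper", "lower" or "mixed"
    genes: frozenset = field(default_factory=frozenset)
    aggregating_proteins: frozenset = field(default_factory=frozenset)


def index_by_gene(records):
    """Build an inverted index: gene name -> set of source IDs that mention it."""
    index = defaultdict(set)
    for rec in records:
        for gene in rec.genes:
            index[gene].add(rec.source_id)
    return dict(index)
```

With two placeholder records, `index_by_gene([CaseFeatures("ex-1", "limb", "lower", frozenset({"SOD1"})), CaseFeatures("ex-2", "bulbar", "mixed", frozenset({"SOD1", "C9orf72"}))])` maps `"SOD1"` to both source IDs and `"C9orf72"` to `"ex-2"` only. Similar indexes over onset site or protein would allow grouping sources by shared features.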

    So far we have been conducting our search only in Finland, but since finding the right kind of computer scientist for this kind of project has proved quite a challenge, I thought it made sense to mention it here as well.

    (*) we = a group of private individuals trying to move things forward, not affiliated with any research organization. My motivation comes from having had ALS since 2010 and not expecting to live long enough to see a cure.
  3. Nov 28, 2014 #2
    Thanks for the post! This is an automated courtesy bump. Sorry you aren't generating responses at the moment. Do you have any further information, come to any new conclusions or is it possible to reword the post?
  4. Dec 6, 2014 #3


    Science Advisor

    Hey rmattila.

    The first thing I recommend is to work out what data you need, how it is structured and how you can obtain it before you ask someone to help you build a solution.

    If the data is scattered in many formats across many networks that each provide access in a different way, then the developer or analyst will have to take this into account.

    If the sites don't allow easy access for automated data collection (which is what a robot would do), then this will limit your ability to get the data. Some research sites track your access over time and raise red flags or suspend your account for a while.

    If the data sits in attachments or in hyperlinks to specific kinds of files, that is one thing - but if it is scattered throughout a paper in a non-uniform way, extracting it becomes a lot more complex.

    Answering these kinds of questions before you look for a developer to do the coding will be crucial, because then they will at least know what they are getting themselves into. The specificity will help them evaluate not only whether they could do it, but also whether they know others who could.

    If you don't know the structure or exactly what you are looking for, then it gets even more complicated and pattern recognition (aka data mining) will have to be used - and this drastically narrows the pool of people who could do something like this.

    I would first check whether PubMed allows robots to access its system, because if it doesn't (and there is often a reason for not allowing it) then everything else will be a moot point. You could email them directly to find out, or try accessing many papers within a given time-frame.
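One way to make the "are robots allowed?" check concrete is to read a site's robots.txt rules before crawling. The sketch below uses Python's standard `urllib.robotparser`; the `allowed_by_robots` helper, the `als-bot` user agent and the `SAMPLE_ROBOTS` text are hypothetical and hand-written for illustration. They are not PubMed's actual rules, and a robots.txt check does not replace reading a site's terms of use or asking its operators.

```python
from urllib.robotparser import RobotFileParser

# Hand-written sample rules for illustration only (NOT any real site's robots.txt).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)
```

For the sample rules, `allowed_by_robots(SAMPLE_ROBOTS, "als-bot", "https://example.org/articles/1")` is permitted, while a URL under `/private/` is not. In a real crawler the robots.txt text would be downloaded from the target site, and requests would also need to be spaced out to avoid the account suspensions mentioned above.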
  5. Dec 6, 2014 #4


    Science Advisor
    Gold Member
    2017 Award

    Early bioinformatics programming was dominated by Perl programmers, and they are still very active in the field. You might check with organizations that run web sites like bioperl.org or the National Center for Biotechnology Information (NCBI). NCBI supports Ebot, which generates Perl scripts for data access (see http://www.ncbi.nlm.nih.gov/Class/PowerTools/eutils/ebot/ebot.cgi). The European Bioinformatics Institute also has similar resources. There is an O'Reilly book, Beginning Perl for Bioinformatics, that has more references. There are now similar efforts for tools in the Python language; see http://www.open-bio.org/wiki/Main_Page. I don't know much more about it, but they might help you to focus your search for programmers. Maybe others here can give more information. Good luck in your efforts. It's a very worthy cause and I wish I could help more.
    Last edited: Dec 6, 2014
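To give a feel for what programmatic access through NCBI might look like in Python, here is a rough sketch against the NCBI E-utilities ESearch endpoint. The function names, the `tool` and `email` values and the sample XML are assumptions for illustration; the sample response is hand-written with placeholder IDs, not real output. The code only builds a query URL and parses a response, it makes no network request, and NCBI's current usage guidelines should be checked before running anything against the live service.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20, tool="als_lit_sketch", email="you@example.org"):
    """Build an ESearch query URL for PubMed (tool/email values are placeholders)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax,
              "tool": tool, "email": email}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_id_list(xml_text):
    """Extract the record IDs from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


# Hand-written sample response with placeholder IDs, for illustration only.
SAMPLE_RESPONSE = (
    "<eSearchResult><Count>2</Count>"
    "<IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>"
)
```

Calling `build_esearch_url("amyotrophic lateral sclerosis")` yields a URL that could be fetched with any HTTP client, and `parse_id_list(SAMPLE_RESPONSE)` returns the two placeholder IDs as strings. The Python tools linked from open-bio.org may already wrap this kind of access, so a hand-rolled version like this is mainly useful for understanding what happens underneath.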