Bioinformatics project to help find cure for ALS

SUMMARY

The forum discussion centers on a bioinformatics project aimed at finding a cure for ALS by assembling a team of neuroscientists and computer scientists. The project intends to develop an intelligent search robot to analyze medical publications, particularly from PubMed, to differentiate between various forms of ALS. Key challenges include determining optimal data structures, ensuring access to scattered data, and addressing the complexities of data mining. The discussion highlights the importance of understanding data access policies and suggests resources like Perl and Python programming tools for bioinformatics.

PREREQUISITES
  • Understanding of bioinformatics concepts and tools
  • Familiarity with data structures and their implementation
  • Knowledge of data mining techniques
  • Experience with programming languages such as Perl and Python
NEXT STEPS
  • Research PubMed's data access policies for automated data collection
  • Learn about optimal data structures for bioinformatics applications
  • Explore data mining techniques relevant to biomedical research
  • Investigate bioinformatics programming resources, including Perl and Python libraries
USEFUL FOR

Researchers, bioinformaticians, and software developers interested in ALS research and personalized medicine approaches.

rmattila
Hello,

we (*) are trying to set up a combined neuroscience/information technology project for the following purpose:

ALS is a fatal neurological disease that has no known cause or cure and kills its victims within a few years. Our view is that one reason why all drugs have failed in clinical tests, in spite of promising preclinical data, is the assumption that ALS is a single disease. On the contrary, there seems to be a great deal of variation regarding, e.g., site of onset, preference between upper and lower motor neurons, associated genes, aggregating proteins, anomalous biochemical processes, etc.

What we are trying to do is to assemble a group of neuroscientists and computer scientists to figure out a way to code an intelligent search robot to go through the ever-expanding mass of medical publications (e.g. PubMed) in order to make it possible to distinguish between different forms of ALS. The ultimate goal is to include in this model genetic profiling data, laboratory analysis data etc. as well as information on the effect of different drugs on the biochemical processes, so that one could design a personalized medication cocktail based on the genetic profiling and lab data. The fundamental task is to design and implement an optimal data structure for this purpose.
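As a starting point for the "optimal data structure" question, one could sketch a per-case record that captures the axes of variation mentioned above. The field names below are purely illustrative, not drawn from any established clinical schema:

```python
from dataclasses import dataclass, field

@dataclass
class ALSCase:
    """Hypothetical record for one reported ALS case.

    Field names are illustrative only; a real schema would need
    clinical and genetic expertise to design.
    """
    onset_site: str                                  # e.g. "bulbar", "spinal"
    neuron_preference: str                           # "upper", "lower", or "mixed"
    associated_genes: list = field(default_factory=list)
    aggregating_proteins: list = field(default_factory=list)
    drug_responses: dict = field(default_factory=dict)  # drug name -> observed effect

# Example record built from the kinds of attributes the post lists.
case = ALSCase(onset_site="bulbar",
               neuron_preference="mixed",
               associated_genes=["SOD1"],
               aggregating_proteins=["TDP-43"])
```

Even a rough sketch like this helps a prospective developer see what the extraction target looks like, and it makes the later question of how to mine such fields from free-text papers concrete.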

We have so far been conducting our search only in Finland, but as it seems quite a challenge to find the right kind of computer scientist for this kind of project, I thought it makes sense to mention it here as well.

(*) we = a group of private individuals trying to make things go forward, not affiliated with any research organization. My motivation is based on having had ALS since 2010 and not having the possibility to live long enough to see a cure.
 
Hey rmattila.

The first thing I recommend you do is figure out what data structures you need and how you can get them before you ask someone to help you get a solution.

If the data is scattered in many formats across many networks, each with its own way of providing access, then the developer or analyst will have to take this into account.

If the sites don't allow easy access for automated data collection (which is exactly what a robot would do), this will limit your ability to get the data: some research sites track your access over time and may raise red flags or temporarily suspend your account.

If the data lie in attachments or hyperlinks to specific kinds of files, that is one thing; but if the data are scattered throughout a paper in a non-uniform way, the complexity increases considerably.

Answering these kinds of questions before trying to interest a developer in doing the coding will be crucial: at the least, they will know what they are getting themselves into, and the specificity will help them evaluate not only whether they could do it, but also whether they know others who could.

If you don't know the structure, or exactly what you are looking for, then it gets even more complicated and pattern recognition (a.k.a. data mining) will have to be used, and this drastically constrains the number of people who could take on something like this.

I would first check whether PubMed allows robots to access its system, because if it doesn't (and there is often a reason for not allowing it), then anything else will be a moot point. You could email them directly to find out, or try accessing many papers within a given time-frame.
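For what it's worth, NCBI does publish a programmatic entry point for PubMed (the E-utilities), which is the sanctioned alternative to scraping. A minimal sketch of building an ESearch query URL and keeping request volume polite might look like this; no request is actually sent here, and the rate-limit comment reflects NCBI's published guidance as I understand it, so verify it against their current policy:

```python
import urllib.parse

# Documented base endpoint for NCBI E-utilities ESearch.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_search_url(term, retmax=20):
    """Construct a PubMed ESearch URL for the given query term.

    This only builds the URL; fetching it (and respecting NCBI's
    rate limits, roughly 3 requests/second without an API key)
    is left to the caller.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urllib.parse.urlencode(params)

url = build_search_url("amyotrophic lateral sclerosis", retmax=5)
```

A polite crawler would sleep between calls (e.g. `time.sleep(0.4)`) and identify itself via the `tool` and `email` parameters that NCBI asks automated clients to supply.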
 
Early bioinformatics programming was dominated by Perl programmers, and they are still very active in the field. You might check with organizations that run web sites like bioperl.org or the National Center for Biotechnology Information (NCBI). NCBI supports Ebot, which generates Perl scripts for data access (see http://www.ncbi.nlm.nih.gov/Class/PowerTools/eutils/ebot/ebot.cgi). The European Bioinformatics Institute has similar resources. There is an O'Reilly book, Beginning Perl for Bioinformatics, that has more references. There are now similar efforts for tools in the Python language; see http://www.open-bio.org/wiki/Main_Page. I don't know much more about it, but they might help you to focus your search for programmers. Maybe others here can give more information. Good luck in your efforts. It's a very worthy cause and I wish I could help more.
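To give a flavor of what the Python route looks like without any third-party library: PubMed's EFetch responses are XML, so field extraction can start with the standard library's `xml.etree.ElementTree`. The inline sample below only imitates the shape of real PubMed XML (actual responses are far richer), but the parsing pattern carries over:

```python
import xml.etree.ElementTree as ET

# Minimal inline sample imitating the structure of a PubMed EFetch
# response; element names follow the real schema, content is made up.
sample = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <Article><ArticleTitle>ALS subtype study</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""

root = ET.fromstring(sample)
# Pull every article title out of the document, one per PubmedArticle.
titles = [a.findtext(".//ArticleTitle") for a in root.iter("PubmedArticle")]
```

Structured fields like titles, PMIDs, and MeSH terms come out easily this way; the hard part the earlier reply describes, i.e. facts scattered non-uniformly through free text, is where this simple approach stops and text mining begins.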
 
