Cheminformatics career: what language to learn first?

  1. Apr 25, 2015 #1
    I'm a math and chemistry double major and hope to go to grad school in cheminformatics. My eventual goal is to work as a data scientist for a drug company. My question is: which languages should I learn first? The CS class I need to take for my math major is C++, but I'm not sure that is the best one to learn first. Learning programming online seems easy enough with all the resources available, so I'm curious what I need to learn.

    This question also applies to bioinformatics and other data-science-related careers. I'm sure the languages needed in those fields would be useful for cheminformatics as well.

    Thanks for any advice.
  4. Sep 29, 2015 #3
    Check out David Wild's site, learncheminformatics.com.

    Enoch Huang, PhD (@eshuang on Twitter) works in cheminformatics, and he's actually at a pharmaceutical company.

    Check out Sean Ekins' site too. He does discovery work on drugs for rare diseases, among other things. He's on Twitter and LinkedIn. He also edited an academic book on the topic. Your library may have a copy, or may be able to bring one in if it doesn't.

    Sean, Enoch and David have been sending information my way whenever I've asked. It's a specialized field, so connecting with people who currently work in it helps.

    I'm checking the field out because I've been working as a chemist for over 20 years and my company sold my business unit. Because I specialize in flavor design, I either have to move to places I don't want to live or change careers. I'm testing a career change that uses my breadth of knowledge and experience.

    I'm taking Java this semester while I'm still working. As for which languages you need, I don't know. David, Sean and Enoch may be able to make suggestions.
    When it comes to learning languages, it isn't a straightforward question. Just to give you context, I am currently a data scientist, but I came from a statistics background. In my group the most popular languages are Python and R, and that seems to be true for most data science groups I meet. However, within specializations there are deviations. Groups that do a lot of web scraping may know Ruby. My group, which works with large-scale production models, writes in Scala. We also know Java because we all originally wrote MapReduce jobs back when Pig and Hive were terrible. Speaking of which, Pig and Hive are also pretty common.

    With all that said, who knows what will be popular by the time you finish your degree. When I was finishing my degree, Scala existed but it wasn't a big data language per se. Then Scalding came along and Scala became big.

    So in short, Python, C, C++, R and Hive/Pig/SQL/NoSQL are probably the most common. More important, though, is understanding algorithms and being able to program in parallel, regardless of the language you pick.
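
    To make "program in parallel" a little more concrete, here is a minimal Python sketch of the split/map/combine pattern that MapReduce-style tools are built around. The function names and sample data are made up for illustration. It uses a thread pool to keep the example simple; for CPU-heavy work in Python you would likely swap in a process pool or a cluster framework, but the shape of the solution stays the same.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count word occurrences in one independent chunk of lines."""
    counts = {}
    for line in chunk:
        for word in line.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(partials):
    """Combine step: merge the per-chunk counts into one total."""
    total = {}
    for partial in partials:
        for word, n in partial.items():
            total[word] = total.get(word, 0) + n
    return total


def parallel_word_count(lines, n_workers=4):
    # Split the input into roughly equal chunks that do not depend on each other.
    size = max(1, -(-len(lines) // n_workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    # Hand each chunk to a worker, then merge the partial results.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


# Hypothetical sample data, purely for illustration.
lines = ["python r scala", "java python", "r python sql"]
result = parallel_word_count(lines)
```

    The skill that carries over between languages is the decomposition: find pieces of work that are independent of each other, process them separately, then combine the partial results.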
