Cheminformatics career: what language to learn first?

SUMMARY

The discussion centers on the essential programming languages for a career in cheminformatics, particularly for aspiring data scientists in the pharmaceutical industry. Key languages identified include Python, R, C, C++, and SQL, with an emphasis on understanding algorithms and parallel programming. Participants recommend resources such as learncheminformatics.com and highlight the value of networking with professionals like Enoch Huang and Sean Ekins for guidance. The conversation underscores that while foundational languages are crucial, adaptability to emerging technologies is equally important.

PREREQUISITES
  • Understanding of cheminformatics principles
  • Familiarity with programming concepts
  • Basic knowledge of data science methodologies
  • Awareness of algorithm design and parallel programming
NEXT STEPS
  • Learn Python for data analysis and cheminformatics applications
  • Explore R for statistical computing and graphics
  • Study SQL for database management and data retrieval
  • Investigate parallel programming techniques and algorithms
USEFUL FOR

Students and professionals in cheminformatics, data scientists transitioning from other fields, and anyone interested in data science applications within the pharmaceutical industry.

Loststudent22
I'm a math and chemistry double major and hope to go to grad school in cheminformatics. My eventual goal is to work as a data scientist for a drug company. My question is: what languages should I learn first? The CS class I need to take for my math major is taught in C++, but I'm not sure it's the best language to learn first. Learning programming online seems easier, with lots of resources available, so I'm curious what I need to learn.

This question also applies to bioinformatics and other data-science-related careers. I'm sure the languages needed in those fields would be useful for cheminformatics too.

Thanks for any advice.
 
Loststudent22 said:
I'm a math and chemistry double major and hope to go to grad school in cheminformatics. My eventual goal is to work as a data scientist for a drug company. My question is: what languages should I learn first?
Thanks for any advice.

Check out David Wild's learncheminformatics.com

Enoch Huang, PhD (@eshuang on Twitter) works in cheminformatics. He's actually at a pharmaceutical company.

Check out Sean Ekins's site too. He does discovery work on drugs for rare diseases, among other things. He's on Twitter and LinkedIn. He also edited an academic book on the topic. Your library may have a copy or be able to bring one in if it doesn't.

Sean, Enoch and David have been sending material my way whenever I've asked. It's a specialized field, so connecting with those who currently work in it helps.

I'm checking the field out because I've been working as a chemist for over 20 years and my company sold my business unit. Because I specialize in flavor design, I have to either move to places I don't want to live or change careers. I'm testing a career change that uses my breadth of knowledge and experience.

I'm taking Java this semester while I am still working. What languages do you need? I don't know. David, Sean and Enoch may be able to make suggestions.
 
So when it comes to learning languages, it isn't a straightforward question. Just to give you context, I am currently a data scientist, but I came from a statistics background. In my group, the most popular languages are Python and R, and that seems to be generally true for most data science groups I meet. However, within specializations there are deviations. Groups that do a lot of web scraping may know Ruby. My group, which works with large-scale production models, writes in Scala. We also know Java because we all happened to write MapReduce jobs originally, back when Pig and Hive were terrible. Speaking of which, Pig and Hive are also pretty common.

With all that said, though, who knows what will be popular by the time you finish your degree. When I was finishing my degree, Scala existed but it wasn't a big data language per se. Then Scalding came along and Scala became big.

So in short, Python, C, C++, R and Hive/Pig/SQL/NoSQL are probably the most common. More important, though, is understanding algorithms and being able to program in parallel, regardless of the language you pick.
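
To make the parallel-programming point a bit more concrete, here is a minimal sketch of the map/reduce pattern in Python. Everything in it is illustrative: the function names (chunked, map_chunk, parallel_char_count) and the toy input strings are made up for this example, and it only counts characters, so it is not a real chemistry parser. It uses threads so it runs anywhere; for CPU-bound work in CPython you would likely swap in ProcessPoolExecutor, which has the same map interface.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_chunk(chunk):
    """Map step: count characters in one chunk, independently of the others."""
    counts = Counter()
    for text in chunk:
        counts.update(text)
    return counts


def parallel_char_count(items, chunk_size=2, workers=4):
    """Run the map step on each chunk concurrently, then merge the results."""
    chunks = chunked(items, chunk_size)
    # Threads keep the sketch portable; ProcessPoolExecutor is the usual
    # choice when the map step is CPU-bound pure Python.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Reduce step: merge the per-chunk counters into one total.
    return reduce(lambda a, b: a + b, partials, Counter())


if __name__ == "__main__":
    toy_strings = ["CCO", "CCN", "OCC", "NCC", "CO"]  # hypothetical toy input
    print(parallel_char_count(toy_strings))
```

The idea carries over to any language: split the work into independent pieces, process the pieces concurrently, then combine the partial results. That is the same shape as a MapReduce job, just on one machine.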
 