
Math Career Advice -- Wanting to become a data scientist

  1. Jul 7, 2016 #1
    Hi, I'm stuck in an awkward situation: I just accepted a job offer, but I also had 5 other interviews that I haven't heard back from yet. (I guess I'll have to say no if I do hear back. There was one I was really hoping to get, but it's going to be too late now.) These were for data analyst positions at companies dealing with healthcare insurance, etc. I graduated with a master's in statistics two weeks ago, I know R, Python, and MySQL pretty well, and I have done an internship in data analysis. I took the current offer because I lack work experience, and it involves back-end work with the company's proprietary database. I was also a bit fearful that if I went more than 6 months without a job related to my field, my degree would be useless.

    However, after reading everywhere that there is a demand for data scientists, I applied for those positions all around as well. For some reason I did not get interviews, especially in the SF area. I've seen these positions go to everyone from people with no work experience to people with several years of it, and I fall somewhere in between.

    My question is, what should I do from here? I was thinking of working for a year or so and, around 9 months in, applying for DS jobs again. I hope the year of work experience will help me much more this time. I also hear that it's all about networking, but to me that just means adding a bunch of people on LinkedIn and asking for advice on applying. So I guess I should ask: what is the proper way to network on LinkedIn or other websites? I appreciate any advice, and thanks for helping out!
  3. Jul 7, 2016 #2


    Education Advisor

    I'm a data scientist and a director of many data scientists. Here are my general tips for getting started in data science, and why your resume may not be attracting data science recruiters:

    • Buzzwords! You need to know HDFS, Pig, Hive, Spark, H2O, MapReduce (less so now, but it's good to know about it), sklearn, and pandas. Without these words in your resume, you won't make it past a filter.
    • Do stuff! Kaggle.com exists for people like you. Do competitions and put them on your resume (even if you suck!).
    • GitHub! Document your work on GitHub and show your thought process. It helps me understand your experience!
    • Statistics is cool (it's my background), but if I asked you to tell me the fundamental differences between a Gradient Boosting Model and a Random Forest, could you? Why does GBM perform so well out of the box? What are the dangers of using a Lasso? What's better than a Lasso and Ridge? How do you tune a boosting model? If all that sounds foreign, then... learn it!
    • Network! This means go to data science meetups. Go to data science conferences. Data scientists, as a rule, go to conferences to discuss new ideas, as opposed to statisticians, who write papers. It's just a cultural thing. Find someone, pick their brain, tell them your dumb idea, and then watch as they gleefully explain to you why it's dumb but, in the process, teach you how to improve as a data scientist.
    Those are the basics. There's other stuff you need in order to get a real data science job (i.e. making models to do cool stuff as opposed to optimizing processes); however, if you're not doing any of the stuff above, then you don't have a chance of your resume even hitting my desk. Good luck!

    **Personal note: I've noticed there exists a bias toward hiring new candidates only from top-tier schools. One year of work experience is not enough to overcome this. I've only seen it overcome by doing an internship or excelling at Kaggle competitions.
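
For readers who want a concrete handle on the Lasso and Ridge questions above, here is a minimal sketch, not taken from the post. It assumes the textbook special case of an orthonormal design (X^T X = I), where the lasso objective (1/2)||y - Xb||^2 + lam*||b||_1 reduces to soft-thresholding each least-squares coefficient, and the ridge objective (1/2)||y - Xb||^2 + (lam/2)*||b||^2 reduces to dividing each one by (1 + lam). The coefficient values and `lam` are made up for illustration.

```python
def soft_threshold(b, lam):
    """Lasso estimate for one coefficient, assuming an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # small coefficients are set exactly to zero


def ridge_shrink(b, lam):
    """Ridge estimate for one coefficient, assuming an orthonormal design."""
    return b / (1.0 + lam)  # scaled toward zero, never exactly zero


# Hypothetical least-squares coefficients and penalty (illustration only).
ols = [3.0, -0.4, 0.2, -2.0]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)
print("ridge:", ridge)
```

In this special case the lasso drops the two small coefficients entirely, which is why it is used for variable selection, but it also pulls the large ones toward zero by the full `lam`. Ridge keeps every variable and only rescales them. Real designs are rarely orthonormal, so treat this as intuition rather than a recipe.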
  4. Jul 8, 2016 #3
    I have participated in Kaggle and created a recommender system in Python, using scikit-learn, matplotlib, etc., to predict whether a client would be satisfied or not, although I only got around 60% accuracy. I also worked with bagged and boosted trees, as well as standard decision trees, to build a predictive model for a data mining class project, which is documented with both the report and the code. I am also familiar with many of the questions you mentioned, but I would probably have to learn more about lasso/ridge regression, as I have not used it much. I have used Lasso for variable selection in a linear regression setting. I have the code up for the Kaggle one too. I have also done a previous internship where I was able to model some stuff and worked with the advanced parts of Excel, such as Power Pivot and pivot tables, to create reports. In terms of HDFS/MapReduce/Spark, I have a working knowledge of them and have created 2-3 mappers/reducers of my own. Eh, I graduated from UC Davis, so I don't know if that helps in any way. I'm guessing I probably need to do more machine learning projects while I work. Yeah, I wish I could go to those conferences, but I don't know if there are any in LA, as I'm much closer to there than to SF, which I know has them.
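
As an illustration of the mapper/reducer pattern mentioned above, here is a minimal single-process sketch in plain Python. It is not the poster's code and does not use Hadoop; the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample lines are hypothetical, chosen only to show the three phases of a word count.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big models", "data beats models"])
print(counts)
```

In a real cluster the framework performs the shuffle and runs many mappers and reducers in parallel; the sketch keeps everything in one process so the data flow is easy to follow.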