What programming language should I learn?

AxiomOfChoice · May 12, 2013

I'm a mathematical physicist expecting to finish his Ph.D. this summer. I'm not having any luck with academic positions, so I'm getting ready to try industry. Note that I am not at all sure what kind of position I want. Finance? Sure. Consulting? Yeah, why not? R&D somewhere? Yup.

BUT...my computer skills are, I think, pretty underdeveloped. (In fact, one of the reasons I went after academia in the first place is that my computer skills seemed too crappy to land a nice industry position; I think industry is a better fit for me than academia anyway.) I've used Mathematica a lot, but it doesn't seem very many people care about that. So what should I do? I took a class on C in college and did very, very well, and so I suppose I could (and probably should) brush up on that a bit...but what else?

Solkar · May 12, 2013

AxiomOfChoice said:

I took a class on C in college and did very, very well,

That's a good starting point.

Consider getting familiar with object-related and, more generally, class-related concepts, and consider learning C++, Java or Python. I would recommend C++, because if you can handle that, learning Java, Python or any other modern language (except assembly dialects) afterwards is a piece of cake.

As a foundation for OO, get yourself a copy of
Grady Booch
Object Oriented Analysis and Design
and start "playing" with those concepts in a language of your choice; preferably C++.

Another excellent book is
Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides
Design Patterns: Elements of Reusable Object-Oriented Software

Finally, to get a glimpse of more modern, generic concepts, have a look at the C++ standard library containers std::vector<T> and std::valarray<T> and at Boost (http://www.boost.org/), especially boost::ublas and boost::graph; I would somehow assume you

AxiomOfChoice said:

I'm a mathematical physicist

could like that.
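To give a flavour of those containers, here is a minimal sketch. The function names and the kinetic-energy example are made up for illustration; only std::vector, std::valarray and std::accumulate come from the standard library.

```cpp
#include <cassert>
#include <numeric>
#include <valarray>
#include <vector>

// Hypothetical example: total kinetic energy of a set of particles.
// std::valarray supports element-wise arithmetic, so the formula
// 0.5 * m_i * v_i^2 can be written without an explicit loop.
double total_kinetic_energy(const std::valarray<double>& mass,
                            const std::valarray<double>& speed) {
    assert(mass.size() == speed.size());
    const std::valarray<double> energy = 0.5 * mass * speed * speed;
    return energy.sum();
}

// Hypothetical example: arithmetic mean of samples held in a std::vector,
// which manages its own memory and knows its own size.
double mean(const std::vector<double>& samples) {
    if (samples.empty()) {
        return 0.0;
    }
    return std::accumulate(samples.begin(), samples.end(), 0.0) /
           static_cast<double>(samples.size());
}
```

Calling total_kinetic_energy({1.0, 2.0}, {2.0, 3.0}) adds 0.5*1*4 and 0.5*2*9; neither function needs a manual malloc/free pair or a separate length argument.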

Regards, S.

daveyrocket · May 13, 2013

Well, you should have a look at dice.com or indeed.com to see what kinds of programming jobs out there interest you. There's a huge variety, and most types of programming domains have a favorite language or two. So if you can figure out what type of development interests you, then learn the languages that are useful in that type of development.

One big area these days is web programming, which is largely PHP but also Java, Ruby and ASP.NET, with SQL being hugely important for all these (but very easy to learn). Another big area is mobile device programming which is dominated by Objective-C (iPhone) and Java (Android).

C++ is a good language to learn for a wide variety of fundamentals, but the majority of jobs don't use C++, and employers aren't generally willing to wait around for you to learn another language. So depending on how much time you have you may be better served to just learn the relevant language for the part of the industry you want to work in.

trickslapper · May 13, 2013

I would say C#/JAVA and SQL if you want a job

AlephZero · May 13, 2013

Solkar said:

As a foundation for OO, get yourself a copy of
Grady Booch
Object Oriented Analysis and Design
and start "playing" with those concepts in a language of your choice; preferably C++.

Another excellent book is
Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides
Design Patterns: Elements of Reusable Object-Oriented Software

Nothing wrong with those books (and there's even a lot right about the second one IMO) but I wouldn't call either of them good books to start learning OO.

IMO, better to start with a language where you can't "ignore OO" - i.e. Java, rather than C++, because if you already know C, you will probably fall into the bad habit of writing C++ code that looks exactly like C.

The time to buy those books is when you know enough to disagree with what's in them, IMO. That's when you will really learn something from them.

Solkar · May 13, 2013

AlephZero said:

because if you already know C, you will probably fall into the bad habit of writing C++ code that looks exactly like C.

Good point.

But I see this

AlephZero said:

The time to buy those books is when you know enough to disagree with what's in them, IMO. That's when you will really learn something from them.

quite contrarily - especially working with those books from the very beginning should point the learner in the right direction.

That's part of why I mentioned them.

Solkar

physwizard · May 29, 2013

I think C++ is a good language to learn. If you are planning to become a hard core programmer then what many of the people above are suggesting should work. Spend some time learning about algorithms and graph theory too, a little bit about operating systems, theoretical computer science and relational algebra and databases. These would be good for the brainier jobs.

D H · May 29, 2013

The topic of "what programming language should I learn?" comes up repeatedly at this site. In my opinion, that is the wrong question. The right question is "What programming languages should I learn?" The answer to this is "a lot". Good programmers, scientific or otherwise, eventually learn every step from low level assembly to fourth generation or higher language, and learn every major programming paradigm. Learning one language doesn't cut it. Learning at least one new language a year is a reasonable goal.

physwizard · May 29, 2013

D H said:

The topic of "what programming language should I learn?" comes up repeatedly at this site. In my opinion, that is the wrong question. The right question is "What programming languages should I learn?" The answer to this is "a lot". Good programmers, scientific or otherwise, eventually learn every step from low level assembly to fourth generation or higher language, and learn every major programming paradigm. Learning one language doesn't cut it. Learning at least one new language a year is a reasonable goal.

I feel this is overkill. If you know two or three languages it's easier to get a job. But once you're on the job you can learn new languages as and when you require them. You're not going to remember all of them anyway. I work in a scientific type of programming job where math and developing efficient algorithms are as important as knowledge of the language, and I am speaking only from my own experience. I don't know much about the mainstream programming jobs, but what I've seen from people I know is that once they start off with a specific language they tend to specialize in it; e.g., one guy I know specialized in Java and another in Oracle databases. I know one guy who worked as a C++ programmer but then lost that job and ended up having to learn Java because there were more jobs in that area. So life seems quite chaotic here, with people practically living on their wits.

D H · May 29, 2013

physwizard said:

I feel this is overkill.

Is it? Let's look at what languages / tools a physics grad student needs to learn.

Fortran. Nobody has mentioned this language. Physics departments are chock full of legacy code written mostly in Fortran. The scientific library is very likely to be written in Fortran.
At least one of Python or Perl. Sometimes one needs to write some experimental code, see if an algorithm has the slightest chance of working. Python and Perl are perfect for this. Sometimes you need to glue bits and pieces of multiple programs together. Python and Perl are the perfect glue code.
At least one of C, C++, or Java. Fortran isn't a marketable skill. Python is agonizingly slow.
Bash or tcsh. This is what one types at the command line. It's a good idea to learn what's under the hood.
Make. At some point a simple script is no longer good enough to build the complex program that models the interior of a failing star. Make is the tool one needs. It also provides a handy introduction to the declarative programming paradigm.
Matlab or Mathematica.
LaTeX. Even if the PhD thesis can be written in Word, the articles one wishes to publish most likely need to be written in TeX or LaTeX.

That's seven languages / tools, and unless one plans on being a tenured grad student, that's a language a year.

goingmeta · May 29, 2013

... depends on the domain and who you're working with. I wouldn't recommend learning C++ before you know what you're getting yourself into. It might just be a waste of time. If you go into web development, you're going to be using python, ruby, javascript, or something similar. *Nobody* uses C++ for web development. Does C++ help you learn other languages? Sure, but so what? The ideas are present in many other languages.

Find the people you want to work with and ask them what tools they use.

bhobba · May 29, 2013

I worked as a programmer for 30 years and knew a number of languages I learned in my degree before I started - Assembly, Fortran, Cobol, Pascal and Simula.

The fact of the matter is once you know two or three languages learning another one is a snap. As one of my professors said - syntax is boring - that's not what programming is about - it's about concepts that are language independent.

Learn some Assembly and two high level languages like C++ and Python and you will be fine.

Thanks
Bill

goingmeta · May 29, 2013

Who writes assembly by hand? Unless this is popular with people in computational physics, I would say that's really poor advice. It's really not a marketable skill.

bhobba · May 29, 2013

goingmeta said:

Who writes assembly by hand?

Occasionally, when you want performance you tune critical parts by writing in assembly language. It's also good for understanding what's going on inside the computer. And you do not write assembly by hand - you use an assembly language which is compiled exactly like a high level language - its advantage being it allows you access to low level functions normally hidden such as the stack.

I always remember when I learned it I thought - what the fook - why learn this rubbish. The professor said he at one time worked at a place that used it exclusively but they had this huge library of subroutines. Mostly you simply link them together - easy peasy - nothing to it. I sort of giggled - until I actually went out to work and realized how true it was - regardless of what language you use it's mostly linking together standard subroutines. To tune a system you have a look at the subroutines chewing the most resources and tune them. Using assembly makes that tuning easier.

Besides, high level languages like C have ways to include inline assembly - if you really want to tap the language's full power you need to know it.

Thanks
Bill

cgk · May 30, 2013

I second that. While the time when you would actually write assembly by hand in real-world to-be-delivered programs is long past, being able to understand it is still a valuable skill, especially when debugging or tuning applications[1].

[1] and even today, some kinds of problems are still best handled in terms of inline assembly (say, matrix multiplication kernels).

goingmeta · May 30, 2013

Yes, many people will tell you that you need to write the critical and most time consuming code in assembly, but how many people *actually* do that? How many people actually do that better than the compiler? How many people actually do it better than the programmers who wrote the matrix multiply library you're using? Also, unless you're programming for specialized hardware, or study computer architecture, you won't 'know assembly'. Assembly for x86 is a mess.

There are special domains where this sort of thing is relevant. But the guy isn't even sure where he wants to go.

It's like somebody asking about what kind of car to get and then getting recommendations for websites which sell car parts.

bhobba · May 30, 2013

goingmeta said:

Yes, many people will tell you that you need to write the critical and most time consuming code in assembly, but how many people *actually* do that?

I have done it - but it wasn't complex and it was more to access some of the low level functions of the database system we were using. Using the higher level language it was written in it was SVC bound - if you are not an IBM geek then don't worry about what that is; it's simply short for a supervisor call, and doing a lot of them can slow programs down. You would issue a call to read the associator in an inverted list database to get the next entry. It's in memory so it should happen quickly, but it still required an SVC. There was a low level call available to return the associator to memory in the program, bypassing the SVC. When the change was released the report was that it was like a turbocharger had been added. That's the main reason - on occasion you want to do things a higher level language won't allow you to do.

All I am saying is once you know 2-3 languages picking up a new one is a snap and if one of those is assembly you have a more rounded background and can pick up other low level languages. You do not need to do it often - but on occasion you do.

Thanks
Bill

DimReg · May 30, 2013

I'm going to come in on the side of the number of languages doesn't matter. If you learn a few sufficiently complex languages, then you'll have most of what you need to learn any other language. Especially for you mathematical physicists, it should be obvious that the syntax is not important, only the structures (an if statement is an if statement in every language it appears in).

I wouldn't focus on assembly language with your background. You shouldn't be marketing yourself as a coding specialist, because your only background in coding is a C course you took in college (that doesn't mean you can't get a coding job). Unfortunately, there are high school students with a much better background than that (and who will take the job for much less than you). Most employers would cringe at the idea of setting their employees on assembly, especially non-specialists. That's why languages like Java are so popular: they minimize the number of mistakes programmers make by shielding you from the computer. I guess my point is that since you aren't a comp sci major/phd, then most likely someone else will do the assembly language manipulation for you.

bhobba · May 30, 2013

DimReg said:

I guess my point is that since you aren't a comp sci major/phd, then most likely someone else will do the assembly language manipulation for you.

Actually that's true. It's guys like me with a CS background that are called on for stuff like that. So while it's desirable and nice to know assembly it's more important to know a higher level language like Python or Java. Turkeys like I was will be called in if tuning is required.

Thanks
Bill

D H · May 30, 2013

I'll give a specific example: spherical harmonics. Newton's law of gravitation only pertains to point masses and objects with a spherical mass distribution. The Earth and Moon are not spherical. Use Newton's law of gravitation and you won't capture key dynamics. Something more complex is needed. One widely used technique is to model gravitation via spherical harmonics. We had a very large simulation that used spherical harmonics to represent gravitational attraction. The simulation spent an inordinate amount of CPU time applying those spherical harmonics. In a boiled down version of that large simulation, almost all of the CPU time was spent in just a handful of lines of code.

A naive implementation of a spherical harmonic expansion involves multiply nested loops with all the calculations in the inner loop. After all, that's what the math says to do: $\sum_i \sum_j \text{hairy expressions}$. The formulae in that inner loop are quite complex as the code needs to calculate potential (a scalar), the gradient of potential (a vector), and the gradient of that (a second order tensor).

I looked at the assembly generated by the compiler. One problem was that the compiler wasn't finding common expressions that could have been calculated just once, or even better, pushed to the outer loop. While the assembly code showed the problem, hand tuning that assembly code would not have been the right solution. The right solution was to rewrite the C code so that those common expressions were calculated just once.
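As a rough illustration of that kind of rewrite (not the actual gravity code; the expressions below are toy stand-ins for the "hairy expressions", and the function names are made up), here is a doubly nested sum before and after moving a subexpression that depends only on the outer index out of the inner loop:

```cpp
#include <cmath>

// Naive version: everything is evaluated in the inner loop, so
// std::sin(i * x) is recomputed n times for each value of i.
double naive_sum(int n, double x) {
    double total = 0.0;
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= n; ++j) {
            total += std::sin(i * x) * std::cos(j * x) / (i + j);
        }
    }
    return total;
}

// Hand-hoisted version: the factor that depends only on i is computed
// once per outer iteration and reused in the inner loop.
double hoisted_sum(int n, double x) {
    double total = 0.0;
    for (int i = 1; i <= n; ++i) {
        const double sin_ix = std::sin(i * x);
        for (int j = 1; j <= n; ++j) {
            total += sin_ix * std::cos(j * x) / (i + j);
        }
    }
    return total;
}
```

Both functions compute the same sum; whether a given compiler would have done this hoisting on its own is exactly the kind of thing one checks by reading the generated assembly.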

The code was still too slow even after applying these optimizations. Those hairy expressions involved a number of integer to double conversions. Delving once again into the assembly, I could see that there was an immense amount of low level bit twiddling in performing those conversions. This bit twiddling is slow. Once again, hand tuning the assembly would not have been the right solution. The right solution in this case was to build an integer to double table that spanned the range of integers that needed to be converted to doubles and then perform those integer to double conversions via table lookup.
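A minimal sketch of the table-lookup idea, assuming the integers to convert are known to lie in a small non-negative range. The names and the table size of 64 are hypothetical, not taken from the original simulation:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical bound on the integers that need converting.
constexpr std::size_t kTableSize = 64;

// Builds the table once: entry n holds n as a double.
std::array<double, kTableSize> make_int_to_double_table() {
    std::array<double, kTableSize> table{};
    for (std::size_t n = 0; n < kTableSize; ++n) {
        table[n] = static_cast<double>(n);
    }
    return table;
}

// Replaces an int-to-double conversion with an array read.
double as_double(int n) {
    static const std::array<double, kTableSize> table =
        make_int_to_double_table();
    assert(n >= 0 && static_cast<std::size_t>(n) < kTableSize);
    return table[static_cast<std::size_t>(n)];
}
```

Whether the lookup actually beats the plain conversion depends on the compiler and the CPU, so it is worth measuring before and after rather than assuming.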

Bottom line: Even though we didn't write a bit of assembly to speed things up, knowledge of assembly was critical in identifying the problems with this code.

physwizard · May 30, 2013

D H said:

Is it? Let's look at what languages / tools a physics grad student needs to learn.

Fortran. Nobody has mentioned this language. Physics departments are chock full of legacy code written mostly in Fortran. The scientific library is very likely to be written in Fortran.

At least one of Python or Perl. Sometimes one needs to write some experimental code, see if an algorithm has the slightest chance of working. Python and Perl are perfect for this. Sometimes you need to glue bits and pieces of multiple programs together. Python and Perl are the perfect glue code.

At least one of C, C++, or Java. Fortran isn't a marketable skill. Python is agonizingly slow.

Bash or tcsh. This is what one types at the command line. It's a good idea to learn what's under the hood.

Make. At some point a simple script is no longer good enough to build the complex program that models the interior of a failing star. Make is the tool one needs. It also provides a handy introduction to the declarative programming paradigm.

Matlab or Mathematica.

LaTeX. Even if the PhD thesis can be written in Word, the articles one wishes to publish most likely need to be written in TeX or LaTeX.

That's seven languages / tools, and unless one plans on being a tenured grad student, that's a language a year.

Hi D H,
Everything you've mentioned is a very good thing to learn and your list would definitely be a very good starting point for the person who posted this query. But a lot of them don't qualify as full fledged programming languages. bash and tcsh are shell scripting languages. make is a file format. LaTeX is a formatting language. Matlab/Mathematica have some programming capabilities but they are interpreted languages suitable more for numerical/symbolic computations than for developing stand alone applications. Also, they don't all have the same learning curve. Some of these can be learned in a few days. So what you are basically saying is that it is a language or tool a year. But if you are looking for a job right now you can't wait seven years, so you've got to crash most of these in a few weeks or a month.

nsaspook · May 30, 2013

goingmeta said:

Yes, many people will tell you that you need to write the critical and most time consuming code in assembly, but how many people *actually* do that? How many people actually do that better than the compiler? How many people actually do it better than the programmers who wrote the matrix multiply library you're using? Also, unless you're programming for specialized hardware, or study computer architecture, you won't 'know assembly'. Assembly for x86 is a mess.

Actually, programming in assembly is a mess, but there are other reasons to know the machine. One of the things you quickly learn from understanding an actual machine and the code your language generates for it, instead of just programming the abstract machine defined by the language, is the limits of compiler optimization, and why languages have developed 'hints' that tell the compiler about the machine's interface to real hardware so that your code runs correctly, faster and more efficiently when compiled.

The C 'volatile' qualifier is a good example: http://blog.regehr.org/archives/28
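A small sketch of what that hint does, written as C++ so it matches the other examples in this thread (the qualifier behaves the same way in C). The variable device_status is a hypothetical stand-in for a memory-mapped hardware register; in real embedded code it would sit at a fixed address supplied by the hardware documentation:

```cpp
#include <cstdint>

// Hypothetical stand-in for a hardware status register. 'volatile' tells
// the compiler that the value may change outside the program's control,
// so every read and write in the source must really be performed.
volatile std::uint32_t device_status = 0;

// Reads the status twice. Because device_status is volatile, the compiler
// must emit two separate loads instead of reusing the first value.
std::uint32_t read_status_twice() {
    const std::uint32_t first = device_status;
    const std::uint32_t second = device_status;
    return first + second;
}
```

Without the qualifier an optimizer would be free to load the value once and double it. Note that volatile is about hardware-style accesses; it is not a thread synchronization tool (std::atomic covers that in C++).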

D H · May 30, 2013

DimReg said:

I'm going to come in on the side of the number of languages doesn't matter. If you learn a few sufficiently complex languages, then you'll have most of what you need to learn any other language.

I disagree. You'll at best have what you need to code poorly in languages similar to the ones you already know. Learning to program as opposed to code in that new language takes effort and time. Learning to program well in that new language means learning the language's idiosyncrasies and idioms.

Learning to program in a language that represents a programming paradigm dissimilar to those you already know is tougher yet. A person who knows only imperative, procedural languages can move to yet another imperative, procedural language with some ease but is going to face an uphill battle in moving to another paradigm such as object oriented programming or functional programming. C programmers who have just started learning C++ don't really write C++. They write a pidgin dialect better named C plus or minus. It takes a good amount of time to make the transition from pure procedural programming to object oriented programming. The transition to functional programming or data driven programming is harder yet.
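To make "C plus or minus" concrete, here is a small made-up example. Both functions are valid C++ and return the same value; the first is what a C programmer tends to write on day one, the second leans on the standard library to manage memory:

```cpp
#include <cstddef>
#include <cstdlib>
#include <numeric>
#include <vector>

// "C plus or minus": compiles as C++, but memory is managed by hand and
// every exit path has to remember to call free.
double sum_of_squares_c_style(std::size_t n) {
    double* values = static_cast<double*>(std::malloc(n * sizeof(double)));
    if (values == nullptr) {
        return 0.0;
    }
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<double>(i) * static_cast<double>(i);
    }
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    std::free(values);
    return total;
}

// More idiomatic C++: std::vector owns the storage and releases it
// automatically, and std::accumulate expresses the summation.
double sum_of_squares_cpp_style(std::size_t n) {
    std::vector<double> values(n);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<double>(i) * static_cast<double>(i);
    }
    return std::accumulate(values.begin(), values.end(), 0.0);
}
```

This only scratches the surface (no classes, templates or exceptions), but it shows how the same task reads differently once the language's own idioms are used.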

an if statement is an if statement in every language it appears in.

No, it's not. An if statement in Lisp looks and behaves like the ternary operator in C/C++ rather than a C/C++ if statement. A Haskell program chock full of if statements is a sign of a noob who doesn't really know how to program in Haskell. There's not much need for if/then/else in Haskell.
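The difference shows up even inside C++. In this made-up example the first function uses if as a statement that selects which assignment runs, while the second uses the conditional (ternary) operator, which is an expression that yields a value, much as a Lisp (if test a b) form does:

```cpp
// 'if' as a statement: it produces no value, it only chooses
// which assignment to execute.
int abs_with_statement(int x) {
    int result;
    if (x < 0) {
        result = -x;
    } else {
        result = x;
    }
    return result;
}

// Conditional operator: the whole construct is an expression whose
// value can be returned or passed along directly.
int abs_with_expression(int x) {
    return x < 0 ? -x : x;
}
```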

physwizard said:

Hi D H,
Everything you've mentioned is a very good thing to learn and your list would definitely be a very good starting point for the person who posted this query. But a lot of them don't qualify as full fledged programming languages.

Yes, I did abuse the notion of "language" in that list. However,

bash and tcsh are shell scripting languages.

Bash and tcsh are Turing complete. They are programming languages.

Why it's handy to learn (as opposed to hunt and peck your way through the command line): Suppose that after some fits and starts, you successfully execute a sequence of shell commands, including some loops and if statements. Suppose you need to do this all over again sometime later. How to preserve that work so you don't have to go through all those fits and starts? Simple: Pipe your command history to a file, edit that file, perhaps generalizing a bit, and tada! you have a handy shell script that you can run again at that later time.

make is a file format.

Huh? C is a file format. C++ is a different file format. Python, yet another. Make is a language. Some versions of make are even Turing complete. The make language is an example of the logic programming paradigm, somewhat akin to how Prolog works. One key difference: Make rules are not recursive. To get a make that is Turing complete, you'll have to use a version of make such as gnu make that supports lambda expressions (functional programming paradigm). With lambda expressions it's turtles all the way down.

LaTeX is a formatting language.

A rather important one, particularly for a math or physics PhD candidate. Learning to use it properly to write a thesis is a significant undertaking, as hard as learning a new programming language. Not that you'd want to program in it, but you can. It is Turing complete. Fun and games with TeX/LaTeX: Here's the (in)famous xii.tex (build as plain TeX rather than LaTeX).

Code:

\let~\catcode~`76~`A13~`F1~`j00~`P2jdefA71F~`7113jdefPALLF
PA''FwPA;;FPAZZFLaLPA//71F71iPAHHFLPAzzFenPASSFthP;A$$FevP
A@@FfPARR717273F737271P;ADDFRgniPAWW71FPATTFvePA**FstRsamP
AGGFRruoPAqq71.72.F717271PAYY7172F727171PA??Fi*LmPA&&71jfi
Fjfi71PAVVFjbigskipRPWGAUU71727374 75,76Fjpar71727375Djifx
:76jelse&U76jfiPLAKK7172F71l7271PAXX71FVLnOSeL71SLRyadR@oL
RrhC?yLRurtKFeLPFovPgaTLtReRomL;PABB71 72,73:Fjif.73.jelse
B73:jfiXF71PU71 72,73:PWs;AMM71F71diPAJJFRdriPAQQFRsreLPAI
I71Fo71dPA!FRgiePBt'el@ lTLqdrYmu.Q.,Ke;vz vzLqpip.Q.,tz;
;Lql.IrsZ.eap,qn.i. i.eLlMaesLdRcna,;!;h htLqm.MRasZ.ilk,%
s$;z zLqs'.ansZ.Ymi,/sx ;LYegseZRyal,@i;@ TLRlogdLrDsW,@;G
LcYlaDLbJsW,SWXJW ree @rzchLhzsW,;WERcesInW qt.'oL.Rtrul;e
doTsW,Wk;Rri@stW aHAHHFndZPpqar.tridgeLinZpe.LtYer.W,:jbye

Matlab/Mathematica have some programming capabilities but they are interpreted languages suitable more for numerical/symbolic computations than for developing stand alone applications.

The distinction between interpreted and compiled is a bit arbitrary. There are C and Fortran interpreters, and there are Matlab and Python compilers. As for Matlab not being suitable for developing standalone applications: tell that to the developers of multiple spacecraft control systems who each wrote their entire system in Matlab/Simulink.

Gauss M.D. · May 30, 2013

D H said:

Is it? Let's look at what languages / tools a physics grad student needs to learn.

Fortran. Nobody has mentioned this language. Physics departments are chock full of legacy code written mostly in Fortran. The scientific library is very likely to be written in Fortran.

At least one of Python or Perl. Sometimes one needs to write some experimental code, see if an algorithm has the slightest chance of working. Python and Perl are perfect for this. Sometimes you need to glue bits and pieces of multiple programs together. Python and Perl are the perfect glue code.

At least one of C, C++, or Java. Fortran isn't a marketable skill. Python is agonizingly slow.

Bash or tcsh. This is what one types at the command line. It's a good idea to learn what's under the hood.

Make. At some point a simple script is no longer good enough to build the complex program that models the interior of a failing star. Make is the tool one needs. It also provides a handy introduction to the declarative programming paradigm.

Matlab or Mathematica.

LaTeX. Even if the PhD thesis can be written in Word, the articles one wishes to publish most likely need to be written in TeX or LaTeX.

That's seven languages / tools, and unless one plans on being a tenured grad student, that's a language a year.

This list is just utterly ridiculous.

jim mcnamara · May 30, 2013

Gauss M.D. said:

This list is just utterly ridiculous.

I see. And how do you back up this assertion?

goingmeta · May 30, 2013

I don't think his list is ridiculous. Competent programmers need to know the whole stack. I don't think you need in-depth knowledge of assembly, make, or bash (this is something you can easily learn on demand), but you should be able to pick out appropriate tools for each job.

Whether or not the OP, or physics students, have to know all that--I have no idea.

Gauss M.D. · May 30, 2013

jim mcnamara said:

I see. And how do you back up this assertion?

57% of the list consists of stuff that's either not a programming language, or stuff that is really easy to pick up if you're already well versed in C++ or whatever. He is implying that you need in-depth knowledge of a bunch of really tough-to-master languages to be considered programming savvy when really, a solid grounding in one good language and passing familiarity with the other six points is basically fine.

ParticleGrl · May 31, 2013

I think some of the discussion here is people talking past each other. The original poster asked, essentially, what programming language should I concentrate on learning to maximize my ability to get a job right here, right now.

DH gave a list of useful things to learn as a graduate student, some of which aren't so marketable (Fortran, LaTeX, Mathematica). Also, there is a difference between being a truly good programmer and being a hire-able programmer. The first step for a physicist looking for a job is to become merely hire-able.

I think the best thing the original poster can do is figure out what type of work he/she wants to do, and learn the languages being used there first. If you want to do statistical data work, start with R. If you want to program iPhone apps, start with Objective-C. Pick one of Python or Perl and get good with it.

Bottom line: Even though we didn't write a bit of assembly to speed things up, knowledge of assembly was critical in identifying the problems with this code.

I'm almost positive a good profiler/debugger would have found the same problems without a person having to get down into the assembly.

DimReg · May 31, 2013

ParticleGrl said:

I think some of the discussion here is people talking past each other. The original poster asked, essentially, what programming language should I concentrate on learning to maximize my ability to get a job right here, right now.

DH gave a list of useful things to learn as a graduate student, some of which aren't so marketable (Fortran, LaTeX, Mathematica). Also, there is a difference between being a truly good programmer and being a hire-able programmer. The first step for a physicist looking for a job is to become merely hire-able.

I think the best thing the original poster can do is figure out what type of work he/she wants to do, and learn the languages being used there first. If you want to do statistical data work, start with R. If you want to program iPhone apps, start with Objective-C. Pick one of Python or Perl and get good with it.

I'm almost positive a good profiler/debugger would have found the same problems without a person having to get down into the assembly.

I think I was looking for the words to say this, but gave up. Particularly that there is a difference between a hire-able programmer and a good programmer.

@OP: For the most part, all your first employer is likely to care about is how good you are at the one language you are expected to use. Search for job openings, and learn the languages that are asked for in jobs you want.

C++ and Java are perfectly good languages for hireability too.

D H · May 31, 2013

ParticleGrl said:

DH gave a list of useful things to learn as a graduate student, some of which aren't so marketable (Fortran, LaTeX, Mathematica).

A physics graduate student, in particular. Those skills are essential side skills that many physics grad students need to acquire somewhere along the way to getting that PhD. Side skills, mind you. A physics grad student has ten or so hours a week, tops, to learn those side skills. The other 80 or more hours of a typical grad student's work week are split between jumping through the hoops that the student's PhD adviser keeps erecting and jumping through the hoops that the administration keeps erecting. Somewhere along the way the student also needs to do the research needed for the thesis.

Of course some of those skills are easy to obtain. They have to be. There's just not enough time.

I think the best thing the original poster can do is figure out what type of work he/she wants to do, and learn the languages being used there first.

Go back to the original post. The number one item: "Finance? Sure." In other words, a quant. That requires a good deal of numerical analysis and programming dexterity, and not just in one computer language. Good knowledge of multiple languages is a key skill. A quant's day might start with pulling data from a database (SQL), then performing some statistical analysis on it (R, or some Fortran-based statistics package), and then performing a principal component analysis (Fortran, or maybe python/numpy).

If you want to do statistical data work, start with R.

That's a good one to add to the list if the goal is to be a quant.

Bottom line: Even though we didn't write a bit of assembly to speed things up, knowledge of assembly was critical in identifying the problems with this code.

I'm almost positive a good profiler/debugger would have found the same problems without a person having to get down into the assembly.

No, it wouldn't. How's a debugger going to help? Yes, you can make a debugger switch to executing the assembly code, but that's not going to be much help if you don't know assembly. Debuggers and profilers aren't all that good with optimized code. (Some debuggers can't even run on optimized code. They skip right over it.) The problem is that the as-built code and the as-written code no longer map to one another nicely once the code is optimized. A good optimizer will push part of a statement here, another part there. It might isolate expressions common to multiple statements. Which statement gets the blame in profiling?

A profiler did tell us that one particular function was hogging all the CPU time, and when compiled unoptimized, only a handful of lines were responsible. What the profiler didn't tell us was that the compiler wasn't doing a bang-up job when the code was compiled optimized. Looking at the assembly did. Only then was it obvious that common or invariant subexpressions were not recognized as such. The profiler didn't tell us that the standard int to double conversion is best avoided in a tight, multiply nested loop. How could it? It's not obvious to the human reading the as-written code, either. The conversion looked so benign, particularly since it was but one small part of a complex expression and since the conversion was implicit rather than shouted out via static_cast<double> (n).
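To make that concrete, here is a minimal sketch of the pattern described above. It is not the code from that project; the function names and the arithmetic are made up purely for illustration. The first version leaves the implicit int to double conversion and the invariant product inside the innermost loop, the second hoists both by hand.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel, as-written form: n is silently converted to double
// on every pass through the inner loop, and scale * n is recomputed each
// time even though neither operand changes inside the loops.
double weighted_sum_as_written(const std::vector<double>& a,
                               std::size_t rows, std::size_t cols,
                               int n, double scale) {
    double total = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            total += a[i * cols + j] * (scale * n);
        }
    }
    return total;
}

// Hand-hoisted form: the conversion is explicit and happens once, and the
// loop-invariant factor is computed once, outside both loops.
double weighted_sum_hoisted(const std::vector<double>& a,
                            std::size_t rows, std::size_t cols,
                            int n, double scale) {
    const double factor = scale * static_cast<double>(n);
    double total = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            total += a[i * cols + j] * factor;
        }
    }
    return total;
}
```

Both versions compute the same number. Whether a given compiler performs the hoisting on its own depends on the compiler and its flags, which is why looking at the generated assembly can be the only way to know.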

DimReg · May 31, 2013

D H said:

Go back to the original post. The number one item: "Finance? Sure." In other words, a quant. That requires a good deal of numerical analysis and programming dexterity, and not just in one computer language. Good knowledge of multiple languages is a key skill. A quant's day might start with pulling data from a database (SQL), then performing some statistical analysis on it (R, or some Fortran-based statistics package), and then performing a principal component analysis (Fortran, or maybe python/numpy).

I'm not sure what your point here is. Is it that big of a jump to go from what I said to the idea that, if an employer is asking for knowledge of multiple programming languages, you should learn those multiple programming languages? MY point is that employers tell you what they are looking for, because it saves everyone's (especially their) time when they do. So learn what they are looking for. Simple.

Solkar · May 31, 2013

D H said:

Is it? Let's look at what languages / tools a physics grad student needs to learn.

Fortran. Nobody has mentioned this language. Physics departments are chock full of legacy code written mostly in Fortran. The scientific library is very likely to be written in Fortran.

At least one of python or perl. Sometimes one needs to write some experimental code, see if an algorithm has the slightest chance of working. Python and perl are perfect for this. Sometimes you need to glue bits and pieces of multiple programs together. Python and perl are the perfect glue code.

At least one of C, C++, or Java. Fortran isn't a marketable skill. Python is agonizingly slow.

Bash or tcsh. This is what one types at the command line. It's a good idea to learn what's under the hood.

Make. At some point a simple script is no longer good enough to build the complex program that models the interior of a failing star. Make is the tool one needs. It also provides a handy introduction to the declarative programming paradigm.

Matlab or Mathematica.

LaTeX. Even if the PhD thesis can be written in Word, the articles one wishes to publish most likely need to be written in TeX or LaTeX.

That's seven languages / tools, and unless one plans on being a tenured grad student, that's a language a year.

That's an excellent sketch for a medium-distance road map; but I would nevertheless recommend C++ as the first language the OP should study given his C-affinity.

---

Fortran is a miracle in itself.
It's really simple and lightning fast at the same time.

BUT - I would not recommend that someone starts his way to software development with Fortran.
Fortran legacy code is often packed with bad habits that have survived from punch card times, which could be a job killer if picked up in a modern language - cryptic identifiers, low comment density etc.

Writing industry-strength Fortran takes an experienced developer who transfers the good coding skills he developed writing C, C++ or Java (or whatever modern language) to Fortran.

Also Fortran is not a general purpose language; it's a workhorse for numerics.

But granted - it's likely still the best workhorse for that.

---

goingmeta said:

[...] assembly [...] I would say that's really poor advice. It's really not a marketable skill.

is that so...

In twenty years in the industry I've never met a good programmer who had not also had a liaison with an assembly dialect once in his career.

Let alone that I have ever been tempted to consider someone lacking that experience for a job.

---

In addition, I'm quite surprised that many folks recommend python for beginners.

That's a very powerful, but also very complicated language; and from what I gather from over a decade of experience with it, one of the most often badly used as well.

Regards, Solkar

D H · May 31, 2013

Solkar said:

BUT - I would not recommend that someone starts his way to software development with Fortran.

Agreed, but this is not the typical "someone" I targeted with my list. My list is aimed at physics graduate students whose research involves computation and who want to make themselves marketable as a programmer / numerical analyst as a backup plan in case the climb up the academic ladder doesn't pan out. The same applies to meteorologists who have to deal with old Fortran weather simulations, and in a few other areas where Fortran still rules. In most areas Fortran is at best a niche language. Fortran still does rule in a small handful of highly numerical, highly intensive, and computationally large domains. Computational physics oftentimes is one of those domains.

goingmeta · May 31, 2013

I don't think there was any attachment or commitment to finance just because he enumerated it in a list of options. Maybe you should pay attention to what's actually being asked instead of gratuitously telling us about what physics grad students have to go through.

D H · May 31, 2013

goingmeta said:

I don't think there was any attachment or commitment to finance just because he enumerated it in a list of options.

Sure there was.

Perhaps you aren't aware of this, but Wall Street and PhD physicists are: The very same numeric codes that describe what happens inside a star also can be used to describe what happens to a hedge fund. Wall Street has hired lots of physics PhD graduates to work as quants because of this.

Upsides: A six digit starting salary, sometimes with a non-unitary leading digit, and working at the very heart of the engine that drives our economy.

Downsides: Ridiculous work hours (no chance to spend that huge salary), living in New York City with its high cost of living (that huge salary isn't as big as it seems), and working at the very heart of one of the more despised industries in the country.

meanrev · May 31, 2013

I second D H's roadmap: A numerical language such as MATLAB, a set of scripting languages (Python/Perl + bash + make), and an OO language such as C++/C#/Java. LaTeX has been the single most useful skill and productivity enhancer in my life, both in university and out of it. SQL is particularly useful and I wish I had learned more of it. Many seem to have missed out on the importance of the concurrent paradigm and distributed systems (e.g. via Erlang), and I feel this is what will really give you an edge in the coming half a decade, whatever you plan to be doing.

D H said:

Sure there was.

Perhaps you aren't aware of this, but Wall Street and PhD physicists are: The very same numeric codes that describe what happens inside a star also can be used to describe what happens to a hedge fund. Wall Street has hired lots of physics PhD graduates to work as quants because of this.

Upsides: A six digit starting salary, sometimes with a non-unitary leading digit, and working at the very heart of the engine that drives our economy.

Downsides: Ridiculous work hours (no chance to spend that huge salary), living in New York City with its high cost of living (that huge salary isn't as big as it seems), and working at the very heart of one of the more despised industries in the country.

meanrev · May 31, 2013

I forgot to comment on D H's blurb on finance. I should preface this by saying that I speak with limited authority in this area (I'm not as familiar with the sell-side as the buy-side), but I've recently had the misfortune to go through the regulatory red tape of forming a hedge fund. D H's post is mostly true, except I don't see a close relation between physical processes in stars and hedge fund liquidity - I'm not ruling out the likelihood that someone has created a model that associates the two, but this is not commonly used in practice. What's useful, though, is the combination of quantitative skills and personality traits that an astrophysicist typically has.

Solkar · Jun 1, 2013

D H said:

Agreed, but this is not the typical "someone" I targeted with my list. My list is aimed at physics graduate students whose research involves computation and who want to make themselves marketable as a programmer / numerical analyst as a backup plan in case the climb up the academic ladder doesn't pan out.

The OP did not restrict the set of possible careers to classical numerical programming, let alone to the "academic ladder".

But even for the "academic ladder" - learning Fortran for that purpose means learning Fortran by using netlib libs.
Those libs are highly efficient but provide no role model for how to code in the years from 2010 on.

But Bjarne Stroustrup, e.g., provides very good advice on coding maintainable numerics in a short chapter of his famous The C++ Programming Language; advice which will also be valuable if you need C++ just to collect data to push it into a Fortran maths kernel or interface via MPI.

physwizard · Jun 1, 2013

D H said:
I disagree. You'll at best have what you need to code poorly in languages similar to the ones you already know. Learning to program as opposed to code in that new language takes effort and time. Learning to program well in that new language means learning the language's idiosyncrasies and idioms.

Learning to program in a language that represents a programming paradigm dissimilar to those you already know is tougher yet. A person who knows only imperative, procedural languages can move to yet another imperative, procedural language with some ease but is going to face an uphill battle in moving to another paradigm such as object oriented programming or functional programming. C programmers who have just started learning C++ don't really write C++. They write a pidgin dialect better named C plus or minus. It takes a good amount of time to make the transition from pure procedural programming to object oriented programming. The transition to functional programming or data driven programming is harder yet.

No, it's not. An if statement in Lisp looks and behaves like the ternary statement in C/C++ rather than a C/C++ if statement. A Haskell program chock full of if statements is a sign of a noob who doesn't really know how to program in Haskell. There's not much need for if/then/else in Haskell.
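A small C++ sketch of the distinction being drawn here, with made-up function names: an if statement only chooses which statements run, while the conditional (ternary) operator is an expression that yields a value, which is roughly how if behaves in Lisp.

```cpp
#include <string>

// Statement form: the if/else has no value of its own; it only selects
// which assignment gets executed.
std::string sign_with_statement(int x) {
    std::string label;
    if (x < 0) {
        label = "negative";
    } else {
        label = "non-negative";
    }
    return label;
}

// Expression form: the conditional operator evaluates to one of its two
// branches, so the result can be returned or passed along directly.
std::string sign_with_expression(int x) {
    return x < 0 ? std::string("negative") : std::string("non-negative");
}
```

Someone carrying C habits into Lisp tends to write the first form everywhere; the second is closer to the idiom of an expression-oriented language.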

Yes, I did abuse the notion of "language" in that list. However,

Bash and tcsh are Turing complete. They are programming languages.

Why it's handy to learn (as opposed to hunt and peck your way through the command line): Suppose that after some fits and starts, you successfully execute a sequence of shell commands, including some loops and if statements. Suppose you need to do this all over again sometime later. How to preserve that work so you don't have to go through all those fits and starts? Simple: Pipe your command history to a file, edit that file, perhaps generalizing a bit, and tada! you have a handy shell script that you can run again at that later time.

Huh? C is a file format. C++ is a different file format. Python, yet another. Make is a language. Some versions of make are even Turing complete. The make language is an example of the logic programming paradigm, somewhat akin to how Prolog works. One key difference: Make rules are not recursive. To get a make that is Turing complete, you'll have to use a version of make such as gnu make that supports lambda expressions (functional programming paradigm). With lambda expressions it's turtles all the way down.

A rather important one, particularly for a math or physics PhD candidate. Learning to use it properly to write a thesis is a significant undertaking, as hard as learning a new programming language. Not that you'd want to program in it, but you can. It is Turing complete. Fun and games with TeX/LaTeX: Here's the (in)famous xii.tex (build as plain TeX rather than LaTeX).
Code:
\let~\catcode~`76~`A13~`F1~`j00~`P2jdefA71F~`7113jdefPALLF
PA''FwPA;;FPAZZFLaLPA//71F71iPAHHFLPAzzFenPASSFthP;A$$FevP
A@@FfPARR717273F737271P;ADDFRgniPAWW71FPATTFvePA**FstRsamP
AGGFRruoPAqq71.72.F717271PAYY7172F727171PA??Fi*LmPA&&71jfi
Fjfi71PAVVFjbigskipRPWGAUU71727374 75,76Fjpar71727375Djifx
:76jelse&U76jfiPLAKK7172F71l7271PAXX71FVLnOSeL71SLRyadR@oL
RrhC?yLRurtKFeLPFovPgaTLtReRomL;PABB71 72,73:Fjif.73.jelse
B73:jfiXF71PU71 72,73:PWs;AMM71F71diPAJJFRdriPAQQFRsreLPAI
I71Fo71dPA!!FRgiePBt'el@ lTLqdrYmu.Q.,Ke;vz vzLqpip.Q.,tz;
;Lql.IrsZ.eap,qn.i. i.eLlMaesLdRcna,;!;h htLqm.MRasZ.ilk,%
s$;z zLqs'.ansZ.Ymi,/sx ;LYegseZRyal,@i;@ TLRlogdLrDsW,@;G
LcYlaDLbJsW,SWXJW ree @rzchLhzsW,;WERcesInW qt.'oL.Rtrul;e
doTsW,Wk;Rri@stW aHAHHFndZPpqar.tridgeLinZpe.LtYer.W,:jbye

The distinction between interpreted and compiled is a bit arbitrary. There are C and Fortran interpreters, and there are Matlab and Python compilers. That Matlab isn't suitable for developing standalone applications: Tell that to the developers of multiple spacecraft control systems that each wrote their entire system in Matlab/Simulink.

This is quite ridiculous. Try convincing a prospective employer that Make and LaTeX are programming languages. Chances are you won't last 2 minutes into the interview.

D H · Jun 1, 2013

Solkar said:

The OP did not restrict the set of possible careers to classical numerical programming, let alone to the "academic ladder".

Put yourself in the shoes of an employer who has an opening for an entry level, code monkey programming position in which there's not one lick of numerical programming in the target application. A freshly minted PhD mathematical physicist is simultaneously overqualified and underqualified for that position.

Typical mathematical physicists are underqualified because their programming skills are largely self-taught and just cover the basics, with little knowledge of structured programming, data structures and algorithms, parallelism, and all the other stuff that a computer science major learns as an undergraduate. They are overqualified because there are other jobs out there that represent a much better match between skills and job needs and that pay double the salary of that entry level code monkey position.

Most employers are reluctant to hire someone who is either overqualified or underqualified. Both simultaneously? Not a chance.

Regarding my list: It was aimed not just at the OP, but also at others in a similar boat. That said, starting to work on the "what do I do after I get my PhD" backup plan just a few months before graduating is far too late a start. My list (or something like it) is best suited for incoming candidates who plan on research that involves using numerical programming to solve a problem in physics. The sooner one gets started on that post-graduation backup plan, the better.

Having some kind of backup plan is essential. The math demands it. Academia produces far more graduates in the technical fields than academia itself needs. There's nothing wrong with this. What is wrong is for grad students to expect that they will get one of those rare jobs in academia once they graduate and jump through the post-doc hoops.

But even for the "academic ladder" - learning Fortran for that purpose means learning Fortran by using netlib libs.

I should have qualified my recommendation to learn Fortran with "but only if you need it for your thesis work."

Learning Fortran is not something to be done after getting that PhD. It's something to be done along the way to getting that PhD, and as I said above, only if it's needed. Knowledge of Fortran is irrelevant to climbing the academic ladder. The only things that count for that are the number of papers one has published and the number of research grant dollars one has obtained.

D H · Jun 1, 2013

physwizard said:

This is quite ridiculous. Try convincing a prospective employer that Make and LaTeX are programming languages. Chances are you won't last 2 minutes into the interview.

Oh please.

Have you evaluated resumes? Interviewed? I wouldn't blink at someone who listed on their resume "Languages and tools" as including C++, Fortran, python, bash, make, and LaTeX. I might poke during the interview to see if they really know those, but I wouldn't scoff.

The advice to learn a new language a year isn't mine. It isn't even from Andrew Hunt and David Thomas, although they did write this very advice in their book The Pragmatic Programmer. I had heard this recommendation multiple times, from multiple people, long before that book came out. Does SQL count? Of course it does. It's an important skill, and it requires a different kind of thinking. Make? Of course. Using make right requires a different kind of thinking. LaTeX? Of course. LaTeX is a macro language. Write your own macros and you'll definitely agree that LaTeX is a computer language.

physwizard · Jun 2, 2013

Just a bit of advice. The way I see it is that anybody getting into a physics program (either undergraduate or graduate) should do it because he/she wants to learn physics and not because he/she wants to get a job. Once you get your degree you're satisfied (hopefully, if the education system at the particular university is satisfactory) that you've learned what you wanted to learn. If you don't get an academic job after your degree, a bit of advice - just forget that you are (or were) a physicist and focus on getting skills which will get you any kind of job. You should be able to envisage this situation (not getting an academic job or even a physics-related job) even before you get into a graduate or undergraduate program.

D H · Jun 2, 2013

physwizard said:

Just a bit of advice. ... If you don't get an academic job after your degree, a bit of advice - just forget that you are (or were) a physicist and focus on getting skills which will get you any kind of job.

That is terrible advice in my opinion. Broaden one's perspective? Yes. Absolutely. Forget that you are (or were) a physicist? No.

There are plenty of places where PhD physicists can work as physicists outside of academia. Both government and industry need those PhD physicists. There are plenty of places outside of academia that need the strong analytic skills that PhD physicists have learned over the course of their academic careers. There is absolutely no reason to throw all of that away and try to land a job as a web developer, as some have implied in this thread.

Lavabug · Jun 2, 2013

D H said:

That is terrible advice in my opinion. Broaden one's perspective? Yes. Absolutely. Forget that you are (or were) a physicist? No.

There are plenty of places where PhD physicists can work as physicists outside of academia. Both government and industry need those PhD physicists. There are plenty of places outside of academia that need the strong analytic skills that PhD physicists have learned over the course of their academic careers. There is absolutely no reason to throw all of that away and try to land a job as a web developer, as some have implied in this thread.

I think his justification is mainly the scarcity of physics-relevant employment available to PhDs (although it's greater than it is for any undergrad). I don't disagree with his sentiment about forgetting that one is trained as a physicist when it comes to a job hunt, though, if one's training ends at the BS level: it definitely eases the feelings of failure that many of us who don't get to progress to graduate studies have, and it helps us stay motivated in the job hunt for unrelated employment.

But I agree with everything else you say. The right question to ask is what languages. I learned the basics of programming in LabVIEW, did lots of numerics in Scilab (a freeware MATLAB clone) and Fortran, have played with Python, and completed a senior thesis in LaTeX. I know very well that I still need way more formal programming experience and more exposure to web-developer languages or C++ to be employable.

In a recent job interview I had (for a temporary summer research position), I got a slight bit of flak for not knowing the precise difference between a function and a subroutine in Fortran (I was asked to write a function on the spot that did a certain thing, which did its job correctly, but I declared it as a subroutine). I'm afraid I will not get the job.
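For anyone else unsure of that distinction: a Fortran function returns a value and can be used inside an expression, while a subroutine is invoked with CALL and hands its results back through its arguments. A rough C++ analogy follows (not Fortran, and the names are made up for illustration).

```cpp
// Function-like: returns its result, so it can sit inside an expression,
// e.g. double total = kinetic_energy(m, v) + potential;
double kinetic_energy(double mass, double speed) {
    return 0.5 * mass * speed * speed;
}

// Subroutine-like: returns nothing; the result comes back through the
// reference argument, much like an output argument of a Fortran subroutine.
void compute_kinetic_energy(double mass, double speed, double& energy) {
    energy = 0.5 * mass * speed * speed;
}
```

Both do the same job, which is presumably why the answer worked, but only the first matches what "write a function" asks for.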

D H · Jun 2, 2013

Lavabug said:

I know very well that I still need way more formal programming experience and more exposure to web-developer languages or C++ to be employable.

If you want to be employable as a web developer, yes. If you want to be employable as someone doing numerical programming, not necessarily.

There's an optimal point in the concept of broadening one's perspective with regard to future employment. If one thinks the only way to be happy with life is to be a theoretical physicist at MIT, that's too narrow a perspective. The odds are vastly against ever landing that dream job. On the other hand, happily taking any job that comes one's way is too broad a perspective.

Going after specific jobs for which one doesn't have the skills is also perhaps a bit too broad a perspective. Looking for a job as a web developer or as a developer of a suite of multiprocessor client-server applications is an example of this. The majority of the skills you learned in obtaining that degree in physics are orthogonal to the skills needed in these jobs. You are competing with people who have a big head start.

There are a number of jobs out there for which the ability to program is an essential but nonetheless secondary skill. Quantitative analysis has already been mentioned in this thread. Others include data mining, data fusion, remote sensing, navigation, guidance and control, and robotics. Here you have an advantage over a good percentage of computer science majors because many of them took the bare minimum of required math courses in getting their degree.

Lavabug · Jun 2, 2013

D H said:

If you want to be employable as a web developer, yes. If you want to be employable as someone doing numerical programming, not necessarily.

Do jobs like these even exist for physics majors at the undergrad level? My recent job hunt leads me to believe anything that falls under scientific or numerical programming is strictly for STEM PhDs, while every single entry-level programming job ad typically expects proficiency specifically in Java, SQL (what I was referring to as webdev languages) and/or C++, skills which are in general absent from physics bachelor's programs but common/mandatory for engineers and CS.

Most of everything else you mention sounds like work at a defense contractor or the military, which makes sense as I know they snatch up physics graduates (unfortunately I do not meet the citizenship requirement at all the posts I've found at the DOE, DoD, contractors, etc.).

ParticleGrl · Jun 2, 2013

There are plenty of places where PhD physicists can work as physicists outside of academia. Both government and industry need those PhD physicists.

I think this is highly dependent on your subspecialty. Hardly any of the physics PhDs I know work as physicists (and all of us tried) anywhere, but my subspecialty was HEP theory. I imagine condensed matter experimentalists have a different job market experience.

Order of magnitude, it seems very likely there are more physicists than jobs where physicists can work as physicists, so a lot (perhaps most) of us seem to retrain for very different work.

There is absolutely no reason to throw all of that away and try to land a job as a web developer, as some have implied in this thread.

I know more physics PhDs developing iPhone and Android apps than I do working as physicists. This is because people follow the jobs: find something it seems like you would like, and learn to do it.

I feel lucky to have the statistical work that I do, but pretty much nothing specific I learned while getting my PhD has been useful, and I was only able to land the job after much self-study, and by replacing most of my physics-related job skills on my resume with machine learning/statistics skills. Some of the experience with technical writing/giving presentations has been helpful, but I would have gotten that same experience almost anywhere.

Solkar · Jun 3, 2013

D H said:

Put yourself in the shoes of an employer who has an opening for an entry level, code monkey programming position in which there's not one lick of numerical programming in the target application.

(emphasis mine)
I do not intend to do so; a company where people are considered "code monkeys" (your words) is the wrong company. Regardless of whether the applicant is a PhD or not.

D H said:

They are overqualified because there are other jobs out there that represent a much better match between skills and job needs and that pay double the salary of that entry level code monkey position.

Knowing graduate level physics but no CS is not an "overqualification" for a software development job; it is no qualification at all for that job.

D H said:

I should have qualified my recommendation to learn Fortran with "but only if you need it for your thesis work."

Your company will not care that you have finished your thesis and think learning Fortran is unnecessary afterwards. When you have to, e.g., fit something to some data, they will expect you to get your hands "dirty" as much as needed.

Where is this

D H said:

Learning Fortran is not something to be done after getting that PhD.

etched in stone?
