# Comparison of high-level computer programming languages

1. Apr 22, 2017

### hilbert2

A little question about the appropriateness of a certain research subject...

Would it be useful to study the computational efficiency of equivalent codes written in Matlab, Mathematica, R, Julia, Python, etc. on a set of typical computational engineering problems (the Navier-Stokes equations, numerical heat conduction, thermal radiation intensity fields, and so on) and test how the computation time scales with increasing resolution of the discretization? Or would this be redundant, since the languages have already been benchmarked on problems like matrix multiplication and linear system solution?

Just got this idea after reading about how the relatively new Julia language is computationally very fast despite being as simple to write code in as Matlab or R.

2. Apr 22, 2017

### Dr.D

With the exception of (1) real time applications, such as robotics, and (2) very massive computations like FEA, I think computational speed is a matter of little concern. Does anyone really care whether the result appears on your screen in a tenth of a second or requires a whole half second? I certainly don't; I'm just not that quick to react and not in that big a hurry.

3. Apr 22, 2017

### hilbert2

There are plenty of computational tasks that take several days even on supercomputers running a large amount of cores.

Supercomputing facilities always need a queue system to ensure no researcher uses more than their share of the computing resources.

4. Apr 22, 2017

### Dr.D

I'm sure that this is true. I would argue, however, that this is a specialist concern, not a general user concern. How many folks do you suppose are doing those problems that run for days on a supercomputer?

5. Apr 22, 2017

### FactChecker

Comparisons are complicated. If execution speed of MATLAB programs is an issue, it is possible to auto-generate C code and get much faster execution. In that case, the ease with which C code can be auto-generated, efficient libraries can be used, or parallel processing can be applied is important. There are many other complications to consider. I think that the existing benchmarks do a reasonably good job for the initial comparisons of languages and there are other issues to consider.

6. Apr 22, 2017

### hilbert2

Thanks for the answer. I have gotten the impression that the best thing about the Julia language is that it can also be easily parallelized despite being simple to write code with.

Maybe I'll compare some simple diffusion/Schrödinger equation solvers written in R and Julia on the same computer and make some plots of the computation time versus number of grid points, and see if there's anything interesting in there.

It seems to be quite difficult to find any peer reviewed publications about that kind of comparisons, here's one exception: http://economics.sas.upenn.edu/~jesusfv/comparison_languages.pdf (not from natural sciences though).

7. Apr 22, 2017

### FactChecker

I got a good impression of Julia, but I don't have any experience with it. I know that MATLAB has some options for parallel processing and have seen large programs that are run on a network of computers, but I was not involved in those efforts and don't know how hard it was to do.

8. Apr 22, 2017

### Staff: Mentor

Computational speed is an important matter to software engineers in general. By that, I'm not referring to the speed at which computations are performed, but the overall responsiveness of an application. A general rule of thumb is that if an application takes longer than a tenth of a second, users will perceive the application as being "slow." I can't cite any references for this -- this is just something I learned while working in the software industry for fifteen years.

As an example, one of the design goals for Windows XP was to decrease the startup time when the computer was turned on, relative to previous Windows versions. Microsoft invested a lot of resources to achieve this goal.

9. Apr 23, 2017

### mpresic

I am not sure a comparison of processing speeds from high level languages would be all that useful. I would expect different languages would perform differently on all the benchmarks and there would be no strong conclusions that could be drawn. That is, for example, I do not think Julia would outperform C on every benchmark, or C would outperform python on every benchmark, etc. I think it would be a mixed bag.

I think the importance of speed is overstated. It is more important for code to be well-documented, so that it can be understood when, inevitably, it is handed off to colleagues and users, than to gain incremental increases in speed. Many times old code written by experts is used by novices on problems where it is misapplied. This is often not the fault of the "novices", as the "experts" have retired or moved on before documenting the purpose, methods, and limitations of their code. It seems the experts thought they would be around forever.

It has also been my experience that experts were not replaced, not because the organization/business wanted to reduce the payroll through attrition, but because the organization/business was not thinking in terms of legacy.

Back to the main point: documentation and understandability should be a higher priority than speed. There are engineers who can make the most high-level, user-friendly language inscrutable, and engineers who can make even structured Fortran or assembly language understandable.

10. Apr 23, 2017

### FactChecker

It's an interesting set of results. When several popular languages take hundreds of times longer to run than C++, there is something to consider. But many interpreted languages can be made to run much faster by using compiled libraries for critical calculations. That makes the comparison more complicated since people are likely to apply techniques that speed the program up when speed becomes a serious issue. In my opinion, no modern languages will run significantly faster than C or FORTRAN, and some will be hundreds of times slower.

11. Apr 23, 2017

### FactChecker

As a person who spent many weekends (and holidays) nursing batch programs through runs that took several days, I tend to disagree. I have also spent a lot of time massaging real-time programs to run in hard real-time of a few milliseconds. There is nothing uglier and harder to document than code that has been squeezed to run in a small time frame.

12. Apr 23, 2017

### mpresic

I see the importance of speeding up code that may take days to run. Commonly, system administrators may take the computers or servers down for maintenance, requiring an interruption in service within that several-day timeframe. I also agree with your comments regarding FORTRAN and C.

Real-time code can be difficult to document. Nevertheless, it is important that the code is maintained. In this respect, generations of workers familiar with the methods and techniques should be retained. For example, I am sure the real-time code for the Apollo guidance computers that got us to the Moon was hard to understand. I for one would be reassured if that expertise were maintained for when (or if) we try to get back to the Moon. I understand a good book was written about the Apollo guidance computer. Sounds intriguing.

13. Apr 26, 2017

### hilbert2

I did an experiment with calculating a numerical heat transfer (or diffusion) problem in 2D with R, Julia and C++ codes. The problem is like the one in this blog post I have written https://physicscomputingblog.wordpr...solution-of-pdes-part-3-2d-diffusion-problem/ .

I made a square computational domain containing N×N discrete cells, where N took the values 30, 37, 45, 52 and 60 on different runs. The method used was implicit finite differencing. The number of timesteps was only 10 in all runs.

The C++ code used simple Gauss-Jordan elimination taken from the book "Numerical Recipes in C", and was compiled on Ubuntu with "g++ -ffast-math -O3". There was no attempt made to use parallel processing, or to account for the sparseness of the linear system. The matrix inverse was computed only on the first time step, and simple matrix-vector multiplication was used in consecutive time steps.

The R code used the built-in "solve(A,b)" function for solving the system of equations.

The Julia code used the backslash operator "A\b" for solving the system.

The computation times used by the three codes (not including compilation time) are plotted below for the runs done on my own (slow) laptop (AMD E2-3800 APU with Radeon(TM) HD Graphics × 4).

Next the runs were also made with my work computer (Intel Xeon(R) CPU E31230 @ 3.20GHz × 8), and the computation times are shown on the next plot.

At first I thought the Julia code was fastest because it could somehow notice the sparseness of the matrix and use that to speed up the computation without being explicitly told to do so, but when I tried inverting a matrix filled with random double-precision numbers from the interval [-1,1], it ran just as fast as the inversion of the 2D diffusion equation matrix did. So the Julia compiler can probably somehow automatically parallelize the code.

The C++ code would most likely be the fastest if I used some LAPACK functions for solving the linear system, but I haven't done that yet.

Note that if the computational domain has $N^2$ cells, the matrix to be inverted has $N^4$ elements.

14. Apr 26, 2017

### FactChecker

The slow speed of C++ is surprising (although I have more confidence in the speed of C than of C++). There must be some catch -- some difference in the algorithm. If you want to see whether Julia is parallelizing the calculations, you should be able to see something in the performance monitor. If you really want to study it, you can use scripts and DOS commands to collect data. I cannot believe that Julia is fundamentally faster than C or even C++ (I can believe a tie, and that would support what others have said about Julia). C++ slower than R is completely unbelievable to me.

PS. I think you are seeing why people do not use complicated benchmarks for comparisons of greatly dissimilar languages. They involve so much more than the basic calculations and the algorithms are not comparable without a lot of work on specific computers.

15. Apr 27, 2017

### Staff: Mentor

I would also assume that this is due to the algorithm. Numerical Recipes is notorious for not having efficient implementations (although it is a great book to learn the basics, including the code supplied). You should try GSL.

16. Apr 27, 2017

### hilbert2

I compiled and ran the C++ code

Code (Text):
#include <iostream>
#include <ctime>

int main()
{
    std::clock_t start;
    double duration;
    double x = 1000.0;

    start = std::clock();

    for (int n = 0; n < 1e8; n++)
    {
        x *= 0.9999999;
    }

    duration = (std::clock() - start) / (double) CLOCKS_PER_SEC;

    std::cout << "x=" << x << "\n";
    std::cout << "time (s): " << duration << '\n';

    return 0;
}
result: x=0.0453999
time (s): 0.341655

Then an equivalent Julia code:

Code (Text):
t1 = time_ns()

x = 1000.0

for k in 1:100000000
    x *= 0.9999999
end

t2 = time_ns()

print("x= ",x,"\n")
print("time (s): ",(t2 - t1)/1.0e9,"\n")
result: x=0.04539990730150107
time (s): 8.537868758

So quite a large difference in favor of C++ with this kind of calculation, at least. I'm not sure if telling Julia to use fewer significant figures would make it faster.

17. Apr 27, 2017

### Staff: Mentor

Slightly modified from @hilbert2's benchmark.
Code (C):
#include <stdio.h>
#include <time.h>

int main()
{
    clock_t start;
    double duration;
    double x = 1000.0;

    start = clock();
    for (int n = 0; n < 1e8; n++)
    {
        x *= 0.9999999;
    }
    duration = (clock() - start) / (double)CLOCKS_PER_SEC;
    printf("x: %f\n", x);
    printf("time (s): %f\n", duration);
    return 0;
}
Compiled as straight C code, release version, VS 2015, on Intel i7-3770 running at 3.40 GHz
Output:
x: 0.045400
time (s): 0.297000
This time is about 10% less than the time hilbert2 posted.

18. Apr 27, 2017

### FactChecker

They have to be compared on the same computer and run at a high, noninterrupted, priority on a dedicated core.

19. Apr 28, 2017

### f95toli

I believe one problem with such a comparison is that in many (most?) well-written programs in e.g. Matlab or even Python, you will find that most of the time is spent calling routines that are already coded in, say, C; and in some cases they even use the same routines (say LAPACK or FFTW).
Hence, you wouldn't necessarily be comparing the languages as such but the libraries they use to do the "heavy lifting".
Actually solving problems like the Navier-Stokes equations directly in ANY high-level language sounds extremely inefficient, and I don't imagine it is needed very often.

20. Apr 29, 2017

### hilbert2

This kind of calculation is an example of something that can't be parallelized, because the value of x after the nth round of the for loop depends on what it was after the (n-1)th round. On the other hand, when computing something like a matrix-vector product $Ax=y$ between an $N\times N$ matrix $A$ and an $N$-vector $x$, you can compute several sums of the form $y_k = \sum_{l=1}^{N} A_{kl}x_l$ at the same time on different processors, as they are independent.