
Comparison of high-level computer programming languages

  1. May 18, 2017 #21
    The Julia example is missing something important: Julia only compiles code inside
    functions! So the Julia timing above was for interpreted rather than compiled code.
    Just wrap a function around it:
    Code (Text):

    function comp()
        x = 1000.0
        for k in 1:100000000
            x *= 0.9999999
        end
        return x
    end
     
    I get these times
    Code (Text):

    julia> t1 = time_ns(); x = comp(); t2 = time_ns();
    julia> println(x);print("time (s): ",(t2 - t1)/1.0e9,"\n")
    0.04539990730150107
    time (s): 0.148448348
     
    versus C++:
    Code (Text):

    ~/tmp> g++ -O foo.cc -o foo
    ~/tmp> foo
    x=0.0453999
    time (s): 0.149457
     
    Another important factor is "type stability" -- the type of a variable should not change within a function.
    Have a look here: http://www.stochasticlifestyle.com/7-julia-gotchas-handle/
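    To illustrate the type-stability point, here is a toy example of my own (not from the linked article):
    Code (Text):

    # Type-unstable: s starts as an Int and is rebound to a Float64 inside the
    # loop, so the compiler cannot generate tight specialized code for it.
    function unstable_sum(n)
        s = 0
        for k in 1:n
            s += 0.9999999
        end
        return s
    end

    # Type-stable: s is a Float64 from the start.
    function stable_sum(n)
        s = 0.0
        for k in 1:n
            s += 0.9999999
        end
        return s
    end

    # @code_warntype unstable_sum(10)   # flags the Union{Int64, Float64}
    # @code_warntype stable_sum(10)     # everything is a concrete Float64
     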

     
  2. May 20, 2017 #22

    hilbert2

    Science Advisor
    Gold Member

    Thanks for the helpful info, bw. :)

    I checked the activity monitor while solving the diffusion equation numerically, and it seemed that the Julia program really did use multiple processors while the C++ version used only one.
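    If that parallelism comes from Julia's built-in linear algebra (OpenBLAS is multithreaded by default), it can be pinned to one thread for a like-for-like comparison with single-threaded C++. This is only a guess at the cause, not a diagnosis of this particular program; on recent Julia versions the sketch would be:
    Code (Text):

    # Limit the BLAS library (used by dense matrix operations such as * and \)
    # to a single thread so the comparison with single-threaded C++ is fair.
    using LinearAlgebra
    BLAS.set_num_threads(1)
     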
     
  3. May 20, 2017 #23

    anorlunda

    Staff: Mentor

    Interest in benchmarks does not fade through the years. But I think that magazine articles and blog posts are the usual venue, not peer-reviewed papers. The reason is that to make scientifically significant conclusions, the benchmark suite must include the entire spectrum of applications the language is used for. In other words, mind-numbingly huge. A blog post can show benchmarks for a niche.

    Every benchmark comparison attracts both praise and criticism for the benchmarks chosen and the details of the tests. That too I expect to remain unchanged for the foreseeable future.
     
  4. May 20, 2017 #24

    jedishrfu

    Staff: Mentor

    On the Julia website there's a set of benchmarks comparing various numerical languages:

    Www.Julialang.org

    One strength of Julia is that it can interoperate with other languages, allowing you to mash up components written in different languages to get things done.
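    For example, calling straight into a C library needs no wrapper code at all. A minimal sketch using the standard ccall syntax (libm's cos is just a convenient stand-in, and the library name may differ by platform):
    Code (Text):

    # Call the C math library's cos directly: no glue code, no build step.
    y = ccall((:cos, "libm"), Float64, (Float64,), 1.0)
    println(y)   # 0.5403023058681398
     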

    Be aware that benchmarking can favor one language over another by choice of algorithms, their implementations and the environment they are run in.

    Database vendors routinely compete with each other, running tests that are curated to favor their own product.
     
  5. May 22, 2017 #25

    Svein

    Science Advisor

    BYTE magazine ran a series of such coding tests in the 1980s. The algorithm chosen was the "Sieve of Eratosthenes" (finding all primes in a range of integers).

    The algorithm is implemented in many programming languages and documented at http://rosettacode.org/wiki/Sieve_of_Eratosthenes.
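    For reference, a minimal transcription of the sieve in recent Julia (my own, not BYTE's original benchmark code):
    Code (Text):

    # Sieve of Eratosthenes: return all primes up to n.
    function sieve(n)
        flags = trues(n)       # flags[k] == true means k is still a prime candidate
        flags[1] = false       # 1 is not prime
        for p in 2:isqrt(n)
            if flags[p]
                for q in p*p:p:n
                    flags[q] = false   # cross out multiples of p
                end
            end
        end
        return findall(flags)
    end

    println(sieve(50))   # [2, 3, 5, 7, 11, ..., 47]
     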
     
    Last edited: May 22, 2017
  6. Jul 22, 2017 #26

    SixNein

    Gold Member

    The simple answer is no.

    The long answer is that languages all get reduced down eventually into machine code. The difference in speed between languages more or less depends upon how many layers these languages have to go through in order to accomplish that task and how often they have to do it. There is also a factor in how good a compiler is at creating optimal machine code, but layering would probably dwarf that for the most part.

    A much more useful study is what algorithm gives you the lowest complexity and what dependencies it has. A good example is sorting algorithms: when would you use a hash sort vs a merge sort vs a selection sort? That will give you much more useful information than a comparison between languages.
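    To make that concrete, here is a toy Julia comparison of my own; the exact numbers depend on the machine, but the O(n^2) versus O(n log n) gap dominates any language-level difference:
    Code (Text):

    # O(n^2) selection sort versus the built-in O(n log n) sort.
    function selection_sort!(a)
        n = length(a)
        for i in 1:n-1
            m = i
            for j in i+1:n
                if a[j] < a[m]
                    m = j        # remember index of smallest remaining element
                end
            end
            a[i], a[m] = a[m], a[i]
        end
        return a
    end

    a = rand(20_000)
    @time selection_sort!(copy(a))   # O(n^2): far slower
    @time sort(copy(a))              # O(n log n): far faster
     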
     
  7. Jul 10, 2018 #27
    I had a bit of fun with this, and it illustrates SixNein's points completely. First, I chose--as always when I have a choice--to write in Ada. Why use Ada? Aren't those compilers for embedded systems and very expensive? Some are, but there is an Ada compiler, GNAT, built into gcc. The advantage over using C with gcc is that the GNAT toolset manages the linking so you don't have to create a makefile. The code you get for the same algorithm in Ada and C should run at exactly the same speed. Of course, if you make significant use of Ada generics or tasking, lots of luck generating the same code in C. But that doesn't happen here. I'll put the complete code in the next post, so you can compile and run it on your system if you like.

    I wanted to try writing the key loop two ways in Ada. That became three, and then four, and needed details on the real-time clock Ada was providing, and the resolution of the time types provided with it, to understand the results. I broke up the second example so compilations wouldn't last for hours while I was debugging my code. Why did I have to break up the second case? Ada rules say that 0.999_999_9**100_000_000 is a numeric literal, where ** is exponentiation, and it has to be computed exactly. ;-) The problem isn't evaluating the expression, it is keeping all those decimals around while doing so. The compiler (gcc) runs for around four hours, then hits Storage_Error when the number is too big to fit in 2 gigabytes. (Yes, I have much more memory than that on my system, but the limit is on the bignum type.) Anyway, I broke the numeric expression up differently than in the third case, and the whole program now compiles in under a minute.

    The next issue is why I have a lot of large multipliers in the timing code, and why I added the output at the head. I found that the compiler was using 80-bit arithmetic for Long_Long_Float and storing it in three 32-bit words. Interesting. But let me show you the full output:

    Code (Text):

    Sanity Checks.
    Long_Long_Float'Size is 96 bits.
    Duration'Small is 0.0010 Microseconds.
    Real_Time.Tick is 0.2910 Microseconds.
    Real_Time.Time_Unit is 0.0010 Microseconds.

    Multiplication result was 0.045399907063 and took 147318.304 Microseconds.
    Exponentiation result was 0.045399907063 and took 0.291 Microseconds.
    Exponentiation 2 result was 0.045399907062 and took 0.583 Microseconds.
    Fast exponentiation result was 0.045399907062 and took 0.875 Microseconds.

    Now to discuss the results. That first time looks huge, but it is in microseconds: it is actually 0.1473... seconds. That may be the fastest time reported here so far, but if I ran the C code I should get close to the same thing. The thing to do if your program is too slow, though, is not to look for tweaks here and there, but to use a better algorithm. I understand that this program was intended as a benchmark, but the other results, one, two, and three clock ticks respectively for a fast real-time clock, indicate that there is some magic going on under the hood.

    When I wrote the second case (exponentiation) I realized that the compiler was going to try to do everything at compile time, and it did. Ada rules say that numeric literals evaluated at compile time must be evaluated exactly. Breaking the expression up this way (X := 1000.0; Y := 0.999_999_9**10_000; Y := Y**10_000; X := X*Y;) took maybe thirty seconds of grinding at compile time, then stuffed the first 64 bits plus exponent into Y, and raised that to the 10,000th power at run time. But how did it do that last step so quickly? Probably by calling the built-in power function in the chip.

    We can guess that exponentiation 2 used the same trick, but calling the power function with an exponent of 100,000,000 instead of 10,000 apparently used another clock tick. (Incidentally, if the clock function is called twice during a tick it adds the smallest increment here, one nanosecond, to the value returned. Twice that for the third call, and so on. This means that you always get a unique value for the clock call. With a six core processor, and this code running on just one core, this can add a few nanoseconds to the value which should be ignored. It can also subtract nanoseconds if the starting call to clock is not the first call in that interval.)

    Finally, the fast exponentiation approach can't be short-circuited by trig or exponential functions. It computes the 100,000,000th power of 0.9999999 at run time. That code would work even if both values were entered from the keyboard while the program was already running, and it did the calculation 168 thousand times faster than the plain multiplication loop.

    So:
    1. Use a language which makes the structure of the problem visible.
    2. Use that to find a better algorithm, if needed.
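    The "fast exponentiation" case above is just binary exponentiation (repeated squaring). The same idea in Julia, as a sketch of my own rather than a line-for-line translation of the Ada:
    Code (Text):

    # Binary exponentiation: x^n in O(log n) multiplications instead of n-1.
    function fast_pow(x, n)
        result = one(x)
        while n > 0
            if isodd(n)
                result *= x   # fold in this bit of the exponent
            end
            x *= x            # square for the next bit
            n >>= 1
        end
        return result
    end

    println(1000.0 * fast_pow(0.9999999, 100_000_000))   # ~0.0453999...
     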
     
  8. Jul 10, 2018 #28

    FactChecker

    Science Advisor
    Gold Member
    2017 Award

    @eachus, I wish no offense, but in summary, was there a run-time difference between C and Ada? I like a "you were there" type of description, but only after a summary that tells me whether it is worth reading the details.
    These types of timing comparisons of a single calculation done many times may not reflect the true difference between language execution speeds.
     
  9. Jul 10, 2018 #29
    Code (Text):

    -- Save this file as power.adb, open a command window.
    -- Invoke gcc with "gnatmake -O3" if you have the GNAT tools and libraries installed.
    -- That command will issue "gcc -c -O3 power.adb", then call gnatlink and gnatbind.
    -- Type power in a command window to execute the program.
    with Ada.Text_IO;   use Ada; use Ada.Text_IO;
    with Ada.Real_Time; use Ada.Real_Time;

    procedure Power is
       Start   : Time;
       Elapsed : Time_Span;
       X, Y    : Long_Long_Float := 1000.0;
       package Duration_IO   is new Fixed_IO (Duration);
       package Long_Float_IO is new Text_IO.Float_IO (Long_Float);
    begin
       Text_IO.Put_Line(" Sanity Checks.");
       Text_IO.Put_Line(" Long_Long_Float'Size is" &
                        Integer'Image(Long_Long_Float'Size) & " bits.");
       Text_IO.Put(" Duration'Small is ");
       Duration_IO.Put(Duration'Small * 1000_000, 2, 4, 0);
       Text_IO.Put_Line(" Microseconds.");
       Text_IO.Put(" Real_Time.Tick is ");
       Duration_IO.Put(To_Duration(Tick * 1000_000), 2, 4, 0);
       Text_IO.Put_Line(" Microseconds.");
       Text_IO.Put(" Real_Time.Time_Unit is ");
       Duration_IO.Put(Duration(Time_Unit * 1000_000), 2, 4, 0);
       Text_IO.Put_Line(" Microseconds.");
       -- Print Duration'Small, Real_Time.Tick, and Real_Time.Time_Unit
       -- to understand the issues that can come up if they are
       -- inappropriate or useless.

       New_Line;
       X := 1000.0;
       Start := Clock;
       for I in 1 .. 100_000_000 loop
          -- Some differences are just doing things the Ada way,
          -- like using underscores to make reading numbers easier.
          -- Here I could have written 0..1E8-1. If I were actually
          -- used for something other than a loop count, I might have.
          X := X * 0.999_999_9;
       end loop;
       Elapsed := Clock - Start;
       Ada.Text_IO.Put("Multiplication result was ");
       Long_Float_IO.Put(Long_Float(X), 4, 12, 0);
       Ada.Text_IO.Put(" and took ");
       Duration_IO.Put(To_Duration(Elapsed * 1000_000), 2, 3, 0);
       Text_IO.Put_Line(" Microseconds.");

       Start := Clock;
       X := 1000.0;
       Y := 0.999_999_9**10_000;  -- Lots of CPU time at compile time.
       Y := Y**10_000;            -- Broken up to avoid a compiler crash.
       X := X * Y;                -- Ada requires evaluating literal expressions exactly.
       Elapsed := Clock - Start;
       Ada.Text_IO.Put("Exponentiation result was ");
       Long_Float_IO.Put(Long_Float(X), 4, 12, 0);
       Ada.Text_IO.Put(" and took ");
       Duration_IO.Put(To_Duration(Elapsed * 1000_000), 2, 3, 0);
       Text_IO.Put_Line(" Microseconds.");

       Start := Clock;
       X := 1000.0;
       X := X * 0.999_999_9**Integer(100.0 * X * X);  -- Not a numeric literal.
       Elapsed := Clock - Start;
       Ada.Text_IO.Put("Exponentiation 2 result was ");
       Long_Float_IO.Put(Long_Float(X), 4, 12, 0);
       Ada.Text_IO.Put(" and took ");
       Duration_IO.Put(To_Duration(Elapsed * 1000_000), 2, 3, 0);
       Text_IO.Put_Line(" Microseconds.");
       -- That may do the same. I may have to pull over the Ident_Int function from the ACVC tests.

       -- As a compiler writer, this is the sort of optimization I would want to happen
       -- if the value raised to a power, and the power, were variables:

       declare
          I        : Integer := 1;
          Value    : Long_Long_Float := 0.999_999_9;
          Exponent : Integer := 100_000_000;
          Result   : Long_Long_Float := 1.0;
          Powers   : array (Integer range 1 .. 32) of Long_Long_Float;
          Control  : array (Integer range 1 .. 32) of Integer;
       begin
          Start := Clock;
          X := 1000.0;
          Powers(1)  := Value;
          Control(1) := 1;
          while Control(I) <= Exponent loop
             Powers(I + 1)  := Powers(I) * Powers(I);
             Control(I + 1) := Control(I) + Control(I);
             I := I + 1;
          end loop;
          for J in reverse 1 .. I loop
             if Control(J) <= Exponent then
                Result   := Powers(J) * Result;
                Exponent := Exponent - Control(J);
             end if;
          end loop;
          X := X * Result;
          Elapsed := Clock - Start;
          Ada.Text_IO.Put("Fast exponentiation result was ");
          Long_Float_IO.Put(Long_Float(X), 4, 12, 0);
          Ada.Text_IO.Put(" and took ");
          Duration_IO.Put(To_Duration(Elapsed * 1000_000), 2, 3, 0);
          Text_IO.Put_Line(" Microseconds.");
       end;
    end Power;
     
  10. Jul 10, 2018 #30

    FactChecker

    Science Advisor
    Gold Member
    2017 Award

    If you have ever been on a project that fell into the Ada "strict typing hell", then you know that the advertised Ada development advantages are not guaranteed. And upper management often prefers strict rules over "best programmer judgement". That preference can lead straight to the Ada "strict typing hell" (among other bad things).
     
  11. Jul 10, 2018 #31

    jedishrfu

    Staff: Mentor

    COBOL had this issue of strictness too. It wasn't so bad, though, because we'd use an older program as a template for the newer one. I remember the classic error of forgetting a single period in the IDENTIFICATION section and winding up with literally hundreds of errors as the compiler failed to recover from it.
     
  12. Jul 10, 2018 #32
    One part of my job at MITRE, and there were a half dozen of us who did this, was to get all of the misunderstandings about Ada out of the software design rules well before coding started on Air Force electronics projects. Sometimes, though, we ran into managers who had added their own rules, gotten out of a magazine somewhere. Like: can't use Unchecked_Conversion. All UC means is that it is the programmer's job to wrap it in any necessary checks. Free almost always is UC, because it is your job to be sure there are no other accesses out there.

    Another is the one you are complaining about. I didn't declare any non-standard types in that fragment. Where you should use your own types are 1: in Generics and 2: when physical units are involved. There were a slew of nice papers on how to have one type for SI units that covered most units, with the type checking done at compile time. I preferred to stick to things like measuring time in Duration, with constants like milliseconds defined for use when declaring values. Anyway, define one type per package, and convert the package to a generic if necessary. (This doesn't apply to enumeration types used for convenience: type Color is (Red, ... or type Switch is (Off, On); although you might want to do that one as Off: constant Boolean := False; and so on.)

    The most important rule in Ada programming though, is that if the language seems to be getting in your way, it is trying to tell you something. If you are getting wrapped around the axle, ask what is the simplest way to do what you are trying to do, then figure out why you can't do that. Adding a parameter to a subprogram, or a subprogram to a package may require changing the design documents. Just realize that people make mistakes and hope no one will blow up. (Because their work, of course, was perfect.)
     
  13. Jul 10, 2018 #33

    anorlunda

    Staff: Mentor

    That was a most interesting post. It reminds us that, until the day we turn coding over to AIs, rules and discipline must give way to humanity at some point. Code written by humans will always remain partially an art.

    I recently watched a very interesting documentary (see below) about MIT's Draper Labs and the navigation computers for the Apollo moon missions. According to this, the software project was nearly a disaster under the loosey-goosey academic culture until NASA sent in a disciplinarian. After a tough time, the software got finished and performed admirably for Apollo 8 and 11.

    My point is that you can err in either direction, too much discipline or too much humanity. Finding the right balance has little to do with programming languages.

     
  14. Jul 10, 2018 #34
    Personally, I did in several different scientific areas.
     
  15. Jul 10, 2018 #35
    I remember that. As a freshman I got into a class under Doc Draper* at the I-Lab (much later, Draper Labs). I got assigned to a project to determine whether early ICs (I think they had three transistors and six diodes) were any good or not. In the lab, chips which had worked for months would suddenly fail. I had what I thought was a very simple idea: if failure equalled too slow, I should test not static switching voltages but whether the 10% to 90% (or vice versa) output voltage swing was taking too long. Turned out you only had to test one transistor; all of them on a chip had pretty much identical characteristics. Why did this time-domain measurement matter? When the transistor was switching, it was the highest-resistance component in the circuit. So the chips that switched too slowly eventually overheated and died. Of course, what killed one chip might not touch the one next to it, because it hadn't had to switch as often.

    I remember Apollo 8 being given the go for TLI (trans-lunar injection); that was the vote that said I had done my job.

    As for the 1202 alarms, I was at my parents' home, where we were celebrating one of my sisters' 18th birthday. All the family was there, including my father, who had designed power supplies for radars at Cape Canaveral (before it became KSC), and my grandfather, who had literally learned to fly from the Wright Brothers well before WWI. Of course, every time the TV talking heads said computer problem, my first reaction was, oh no! I goofed. Then I realized it was a software issue. Whew! Not my problem.

    Finally, Apollo 11 landed, and while they were depressurizing the LM, I started to explain why the pictures we were going to see live from the moon were going to be black and white, not color like Apollo 8's.

    My mother interrupted, "Live from the moon? Live from the moon! When I was your age we would say about as likely as flying to the moon, as a way to indicate a thing was impossible."
    "Helen," her father said, "Do you remember when I came to your room and said I had to go to New York to see a friend off on a trip? I never thought Lindbergh would make it!" (My grandfather flew in the same unit as Lindbergh in WWI. Swore he would never fly again and didn't. My grandmother, his wife, flew at least a million miles on commercial airlines. She worked for a drug company, and she was one of their go to people for getting convictions against druggists diluting the products, or even selling colored water. (She would pick up on facial features hard to disguise, so that if they shaved a beard, died their hair, etc., she could still make a positive ID, and explain it to the judge.)

    They both lived long enough to see pictures from Voyager II at Uranus, and my mother to see pictures from Neptune.

    * There were very few people who called Doc Draper anything other than Doc Draper. His close friends call him Doc. I have no idea what his mother called him. (Probably Sonny like my father's mother called him.)
     
  16. Jul 10, 2018 #36

    FactChecker

    Science Advisor
    Gold Member
    2017 Award

    In theory, our rules were guided by some Carnegie Mellon advice. I thought that their advice was very wise, flexible, and appropriate. The part that management disliked and eliminated from our rules was the flexibility.
    On a large program, it doesn't matter what the code is telling me. We have to follow the programming standards that management presents to the government.
     
  17. Jul 11, 2018 #37
    We granted far more waiver requests than we turned down. The only one I can remember turning down was for 25 KSLOC of C. The project had no idea what the code did, since the author had left over two years earlier. I looked at the code, and it was a simulation of a chip that had been developed for the project, to let them test the rest of the code without the chip. Since the chip was now there, I insisted that they replace the emulation with code (about 100 lines) that actually used the chip. The software ran a lot faster then. Another waiver request I remember was to allow for 17 lines of assembler. I showed them how to write a code insert in Ada. Issue closed.

    In general, we found that the most decisive factor in whether a software project succeeded or not was the number of MIPS of development machines available for developing and testing code, divided by the number of software engineers. A number significantly under one was trouble; two or three, no problem. Of course, today everybody has a PC faster than that, so problems only came up when the software was being developed in a classified lab.
     
    Last edited: Jul 11, 2018
  18. Jul 11, 2018 #38
    In general, there will be project requirements, and those requirements must be met. It sounds a bit religious to emphasize how one must address one potential requirement over another.

    If I need to present results at a meeting that's two hours away, I will be concentrating on rapid short-term development and execution. If I need to control a mission-critical military platform that will be in service in 8 years, I will be concentrating on traceability, maintainability, ease of testing, version control, auditability, etc.

    To address the OP's question:
    If benchmarks using the Navier-Stokes equations will document ground not covered in existing benchmarks, then there is potential use in them. I don't know much about the Navier-Stokes equations, but if they are used in simulations that tend to run for more than several minutes, then there may be consumers of this data.

    As far as using Matlab-generated C code, by all means include that in the survey. You will be documenting hardware, software, everything: version numbers, configuration data, and the specific method(s) used to implement the solution on each platform.

    Since the code you produce will be part of your report, it should be exemplary in style and function.
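    On the Julia side of such a survey, most of that environment description can be captured mechanically; a sketch of what could be recorded alongside each timing on a recent Julia version (the Matlab and C sides would need their own equivalents):
    Code (Text):

    # Record the exact Julia build, OS, CPU, and library configuration
    # next to the timings so the report is reproducible.
    using InteractiveUtils
    versioninfo(verbose = true)
     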
     
  19. Jul 11, 2018 #39

    anorlunda

    Staff: Mentor

    This thread reminds me of a PF Insights Article. The article and the ensuing discussion parallel this thread in many ways.

    The article: https://www.physicsforums.com/insights/software-never-perfect/

    The discussion: https://www.physicsforums.com/threa...r-perfect-comments.873741/page-2#post-5565499

    I'll quote myself complaining that modern software engineering methods and discipline are not sufficiently down-scalable, and that is a serious problem because of the IoT.

     
  20. Jul 11, 2018 #40

    FactChecker

    Science Advisor
    Gold Member
    2017 Award

    The Navier-Stokes equations are at the core of Computational Fluid Dynamics and are, indeed, used in very long series of runs. For instance, aerodynamics calculations that account for every combination of angle of attack, angle of sideslip, Mach number, altitude, and surface positions would take a very long time to run. Supercomputers are sometimes necessary.
     
    Last edited: Jul 11, 2018