Why is Fortran so Fast? |
| May12-07, 07:50 AM | #1 |
|
|
Why is Fortran so Fast?
I have some code in Fortran 77 that executes at a pretty good speed. I also have a highly optimized version of the same algorithm in C. When using g77 and gcc, the unoptimized Fortran is just slightly faster than the highly optimized C. When using MSVC++ and g77, the C is a bit faster. This leads to my question:
If I recode my highly optimized algorithm in Fortran 77 and use a commercial Fortran compiler, will the execution speed be far above that of my highly optimized algorithm compiled with MSVC++? If speed is an issue, should I try to compile Fortran 95 into desktop applications that are written mostly in C/C++? |
| May12-07, 10:59 AM | #2 |
|
|
The question is, does one language have an advantage in translating to machine instructions?
Fortran 77 (don't know about 90/95) does have one advantage--no pointers per se. This eliminates the possibility of what compiler writers call aliasing, where the same memory location can be modified through two separate variables. If aliasing is a possibility, as in C, then the compiler needs to be more conservative to ensure correct machine code. Put differently, Fortran 77 restricts the programmer to a subset of C that can be more easily optimized. It would seem feasible for a C programmer to restrict oneself to that subset, and have performance that's just as good. There are methods to tell a C compiler that you are not aliasing (the restrict keyword, compiler flags). Would you be interested/able to post your codes? I would find that enlightening, as I have never worked much in Fortran, and I've always been a bit curious about the performance differences that people claim to get. I'm also curious about your optimizations. As for mixed language programming, F77 & C/C++ is no big deal. I've never tried F95 & C/C++, but people I respect strongly dislike it. |
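To make the aliasing point above concrete, here is a minimal C sketch (C99 or later). The function names `scale_add` and `scale_add_restrict` are made up for illustration and are not taken from the original poster's code. In the first version the compiler has to assume that `factor` might point into `out`, so it generally cannot keep `*factor` in a register across iterations. In the second, `restrict` is the programmer's promise that the three pointers do not overlap, which is roughly the assumption a Fortran 77 compiler gets to make about its array arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: out[i] += (*factor) * in[i].
   The pointers may alias, so *factor has to be treated as possibly
   changing every time out[i] is written. */
void scale_add(size_t n, double *out, const double *in, const double *factor)
{
    assert(out != NULL && in != NULL && factor != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] += *factor * in[i];
}

/* Same arithmetic, but restrict promises that out, in and factor
   refer to non-overlapping memory. The optimizer is then free to load
   *factor once and to vectorize the loop. */
void scale_add_restrict(size_t n, double *restrict out,
                        const double *restrict in,
                        const double *restrict factor)
{
    assert(out != NULL && in != NULL && factor != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] += *factor * in[i];
}
```

Calling `scale_add(3, c, in, c)`, with `factor` pointing at `c[0]`, is legal and gives a different answer than passing a separate copy of the factor, which is exactly why the compiler must be conservative. Making the same call through `scale_add_restrict` would break the promise and is undefined behavior.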
| May12-07, 12:24 PM | #3 |
|
|
Fortran was the first high-level language. Its designers had the burden of proof to show that a high-level language could be competitive in performance with assembly code, and so Fortran was the target of the largest optimization effort ever.
Comparing Fortran 77 against ANSI C, I doubt that's the issue. The issue is that Fortran 77 is totally static: the size of all data structures is known at compile time. Once the program has loaded into memory, the stack size will not change. This allows for absolute addressing (with an offset, of course), and that is very fast. C, on the other hand, is stack dynamic. This allows for more flexible programming, but at the cost of wasted memory and slower execution. Try declaring your C variables as static, and see if you can close the performance gap. Note: Fortran 95 is no longer static, and so it allows for recursion and other non-Fortran nonsense. |
| May13-07, 12:18 PM | #4 |
|
Static vs dynamic storage allocation has nothing to do with the speed difference (and for modern compilers, there IS no speed difference worth bothering about).
Professional Fortran compilers have NEVER implemented just the ANSI language standard, so long as I've been using them (about 40 years). There is a de facto set of extensions which every serious compiler supports. If it didn't support them, it would be unmarketable, because most serious existing Fortran programs already use the extensions. Those extensions have included stack-based and heap-based memory allocation, recursive functions and subroutines, pointers, etc, ever since those things were "popularised" by C and Unix (and Unix is 30 years old now). If you knew anything about compiler writing, you would know that there is no overhead for stack-based storage allocation compared with static, and no wasted memory either. In fact on modern hardware, stack-based allocation is usually more efficient. The main historical reason why C got a reputation of being slow compared with Fortran was simply that the early C compilers did no optimisation. In some senses, C was originally designed as a human-readable machine-independent assembler language, that was trivial to compile onto the sort of hardware that was being built in 1970 (when a typical computer had 0.0001 Gbytes of memory, and 0.0001 GHz CPU speed). For example, on the early C compilers it was quite common for a loop written using pointers to run much faster than the same loop written using array subscripts, because the compiler didn't even try to move the repeated address-calculation code out of the loop. With modern compilers (built after compiler-writing had been transformed from a task requiring genius to something teachable to any average system programmer) that is no longer relevant. Another stimulus to Fortran optimisation was the vector-architecture machines of the 1980s (Cray, CDC, etc). They taught Fortran programmers to build software that could use that architecture efficiently, and the same code can be optimised easily on modern fast scalar machines. |
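As an illustration of the subscript-versus-pointer point, the two hypothetical functions below compute the same sum. On the early compilers described above, the pointer form could be noticeably faster. With a current optimizing compiler one would expect both to compile to essentially the same loop, though that is worth checking in the generated assembly rather than taking on faith.

```c
#include <assert.h>
#include <stddef.h>

/* Array-subscript form: each access is written as a[i], so a naive
   compiler recomputes the address a + i on every iteration. */
double sum_indexed(const double *a, size_t n)
{
    assert(a != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer form: the address is carried in p and simply incremented,
   which is the hand optimization early C programmers relied on. */
double sum_pointer(const double *a, size_t n)
{
    assert(a != NULL);
    double total = 0.0;
    for (const double *p = a, *end = a + n; p != end; p++)
        total += *p;
    return total;
}
```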
| May13-07, 01:49 PM | #5 |
|
|
I haven't quite figured out aliasing and the restrict keyword, but I've seen improvement in both the inefficient Fortran and the efficient C by using the -O3 flag with gcc and g77. The C code is faster, but still not nearly as fast as I would expect. I'll have to do more coding to really do a good comparison.
|
| May26-07, 06:12 PM | #6 |
|
|
C is closer to machine code than Fortran, so with enough effort, a programmer should be able to generate the same or faster code, but it could end up being an exercise in trial and error to get the compiler to generate the code you want. I'm not aware of any C compilers that have the extensions / optimizations that some Fortran compilers have to take advantage of hardware features like vector / parallel / pipeline / out of order processing / register scoreboarding oriented operations on super computers, where IBM and Cray still remain relatively popular at the very high end: http://www.top500.org/lists/2006/11 Regarding a PC, I'm not sure how many "hardware specific" optimizations there are in existing Fortran or C/C++ compilers. A good compiler might auto-generate multi-threaded code, or take advantage of out-of-order instruction handling on CPUs with register scoreboarding. |
| May26-07, 10:53 PM | #7 |
|
|
Not sure what you mean by "parallel" (in "vector / parallel / pipeline"). Do you mean superscalar issue? That is a hardware feature, not a software feature, beyond not emitting an instruction stream that inhibits multiple issue. Out of order processing is similar: the compiler/assembly programmer does not do out of order processing. The order that's "out of" is the order in which instructions were laid down by the compiler. It's hardware that decides to execute out of order in order to compensate for unpredictable run time latencies (cache misses). The compiler can't do that, for the obvious reason that cache state is unknown at compile time. Not sure what you mean by "on super computers", as most supercomputers nowadays are massively parallel assemblies of Opterons, Itaniums, and similar commodity (more or less) chips. All decent C/C++/Fortran compilers pay attention to register allocation, etc. In fact, I suspect that a lot of compilers first generate an abstract representation of the code, and then use that to generate instructions, schedule them, and allocate registers. This has nothing to do with the initial language. |
| May27-07, 02:05 AM | #8 |
|
|
Massively parallel systems, based on microprocessor chips, have an issue with microsecond or longer latency on communications between CPUs, just because of the length of the wires. Some problems are solved more quickly with the high-end vector-oriented type of computer. http://en.wikipedia.org/wiki/Supercomputer From the link below, Fortran "is the primary language for some of the most intensive supercomputing tasks, such as weather and climate modeling, computational fluid dynamics, computational chemistry, quantum chromodynamics, simulations of long-term solar system dynamics, high-fidelity evolution artificial satellite orbits, and simulation of automobile crash dynamics." http://en.wikipedia.org/wiki/Fortran From the same link on Fortran, regarding extensions: "Vendors of high-performance scientific computers (e.g., Burroughs, CDC, Cray, Honeywell, IBM, Texas Instruments, and UNIVAC) added extensions to Fortran to take advantage of special hardware features such as instruction cache, CPU pipelines, and vector arrays." My understanding is that similar extensions are still used on the current high-end supercomputers. Again from the same link on Fortran, a reference to the "out of ordering" done by a compiler from the 1970s (date not mentioned in the article): "one of IBM's FORTRAN compilers (H Extended IUP) had a level of optimization which reordered the machine language instructions to keep multiple internal arithmetic units busy simultaneously". So the main reason "Fortran is so fast" is that speed is one of the goals of Fortran compilers. To this end, the scientific community willingly accepts machine-specific extensions to the language. |
| May27-07, 08:35 PM | #9 |
|
|
Some codes do not scale well to MPP. Various solutions have been pursued, including replacing with algorithms that do scale well, and doing nothing and getting poor performance. |
| May28-07, 10:56 AM | #10 |
|
|
Can all C/C++ programs be coded in such a way that they are just as fast as Fortran?
Are Fortran 2003, 95, 90 just as fast as 77? |
| May28-07, 02:48 PM | #11 |
|
|
Yes, at least in the trivial sense that you can take the assembly output from the Fortran compiler and inline that in C/C++, and it's a valid C/C++ program.
I haven't seen an example of a Fortran code that could not be done as well in C/C++, given mature compilers for both. I would welcome such an example, if anyone has one. I think the problem is that there are a lot more ways to hobble performance in C/C++ than Fortran 77: the space of possible programs is significantly larger given OO, templates, the equivalence of arrays and pointers, the lack of built-in multidimensional arrays, etc. You have to know more about what you're doing to avoid those traps. Really, it's not that hard to have good performance in C++ for numerical codes. Whatever language you're using, it's far more important to understand the target architecture, and how your compiler interacts with your code to produce an instruction stream for that architecture. You have to think about data locality (what fits in cache), getting as many ops per load as possible, branch elimination/prediction, etc. I don't know about later flavors of Fortran. I've heard Fortran programmers complain about performance of F77 codes degrading as they go to newer F95 compilers. But I don't know anything systematic about it. |
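A small sketch of the data-locality and multidimensional-array points, assuming a matrix stored by hand as a flat row-major array (the function name `matvec_row_major` is hypothetical). The inner loop walks memory contiguously, so each cache line that gets loaded is fully used. Fortran arrays are column-major, so an equivalent Fortran routine would want the first index in the inner loop instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y = A * x, where A is rows x cols and stored
   row-major in a flat array, so element (i, j) lives at a[i * cols + j].
   restrict states the assumption that a, x and y do not overlap. */
void matvec_row_major(size_t rows, size_t cols,
                      const double *restrict a,
                      const double *restrict x,
                      double *restrict y)
{
    assert(a != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < rows; i++) {
        double acc = 0.0;                 /* accumulate locally, store once per row */
        for (size_t j = 0; j < cols; j++)
            acc += a[i * cols + j] * x[j]; /* j varies fastest: contiguous access */
        y[i] = acc;
    }
}
```

Swapping the two loops would compute the same result but stride through `a` by `cols` elements at a time, which tends to waste most of every cache line once the matrix no longer fits in cache.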
| May28-07, 03:59 PM | #12 |
|
|
|
| Jun14-07, 01:18 AM | #13 |
|
|
|
| Jun14-07, 07:49 AM | #14 |
|
|
|
| Jun17-07, 07:17 PM | #15 |
|
|
Is Cray the last "holdout" still making true vector machines? |
| Jun18-07, 08:35 AM | #16 |
|
|
Cray is not the last holdout making true vector machines. No one is, because memory bandwidth is too poor and managing the cache too complex. You're better off spending your transistor budget running more ops on narrower vectors at higher frequency. The newer Crays use Opterons, with 128 bit vector units. Again, your information is out of date. While I'm up, on the subject of intrinsics (compiler extensions that expose hardware functionality not addressable by the language), you make the claim that Fortran is faster because it has these. But, as I pointed out previously and even you have acknowledged, C compilers have these extensions, too. So if C has the same extensions, how can Fortran be faster than C by virtue of these extensions? And as I said before, but you seem to not comprehend, ALL decent C++ compilers offer these extensions on their target platforms. And even if they did not (note that they do!), C++ programmers could always inline some assembler while taking advantage of compiler extensions that defer register allocation and instruction scheduling to the optimizer. Compiler extensions are not a differentiating factor in performance. |
| Jun18-07, 10:56 PM | #17 |
|
|
http://www.cray.com/products/x1e/index.html I've read that there will be a follow-on to this, a Cray X2. The US government partially funds this line of supercomputers, so there must still be some need for "traditional vector systems". I'm still under the impression that a significant part of scientific programming is done in Fortran, probably because a large amount of code already exists and the cost of conversion would be high. |