(Fortran) Does order of accessing a matrix affect the performance?

confi999 · Mar 17, 2010

Hello,

In Fortran 90, will the performance (in terms of computational efficiency) of the following two be the same. If not, which one will be faster. Please advise with the reasoning:

(i)
do i = imin,imax
do j = jmin,jmax
do k = kmin,kmax

Several statements (rigorous arithmetic manipulation) involving a(i,j,k) and b(1,i,j,k)

end do
end do
end do

(ii)
do k = kmin,kmax
do j = jmin,jmax
do i = imin,imax

Several statements (rigorous arithmetic manipulation) involving a(i,j,k) and b(1,i,j,k)

end do
end do
end do

Thank you very much.

Bartleby50 · Mar 17, 2010

The performance should be the same for every case in every programming language.

minger · Mar 17, 2010

Yes it matters a lot! Always access your arrays as in your version (ii). I work in CFD where huge arrays of data are accessed frequently, and that's one of the first things my adviser told me to do to increase computational efficiency.

Bartleby50 · Mar 17, 2010

minger said:

Yes it matters a lot! Always access your arrays as in your version (ii). I work in CFD where huge arrays of data are accessed frequently, and that's one of the first things my adviser told me to do to increase computational efficiency.

Could you explain, why this is so ? Is this a special of fortran ?

minger · Mar 17, 2010

Did some Googling:

http://www.enm.bris.ac.uk/staff/pjn/Programming/C-FORTRAN.html said:

Because of this behaviour you do not want to be continually jumping between, apparently, arbitrary memory locations as the computer will need to keep swapping pages around the system in order to get the one containing the memory address you want into the cache. This is illustrated by the two pairs of nested loops in the example programs stepping through a large two dimensional array.

Although C and FORTRAN allow us to have arrays with multi-dimensional indices, the computer's memory address space is only one dimensional. If we consider a simple 3x3 array in C (A[3][3]) then this is actually laid out in memory as:

A[0][0] A[0][1] A[0][2] A[1][0] A[1][1] A[1][2] A[2][0] A[2][1] A[2][2]

i.e. the rightmost index is the most rapidly varying.

Whereas in FORTRAN the 3x3 array A(3,3) is laid out in memory as:

A(1,1) A(2,1) A(3,1) A(1,2) A(2,2) A(3,2) A(1,3) A(2,3) A(3,3)

i.e. the leftmost index is the most rapidly varying.

If we have an array which is thousands of elements in each dimension then varying the first index by 1 in a C program, or the second index by 1 in a FORTRAN program, will take you into a different page which may not currently be readily accessible, thus causing a delay in processing. Consequently, it is important that loops are nested in the correct order to ensure that, wherever practical, you process all of the elemnents in the current page before going onto the next one, rather than continually processing a single element in a page and then jumping to a completely different one.

So, the answer is in the way that Fortran stores and accesses memory. It puts the left-most column as neighbors, so to avoid skipping around the memory, vary the indices from left to right.

matonski · Mar 17, 2010

To quote Scientific Software Development with Fortran by Drew McCormack,

Why is loop order of any importance? This has to do with the order in which array elements are stored in the memory of a computer. In Fortran — unlike the popular C programming language — elements are stored in column major order. This means that elements along the columns of a two dimensional array (ie, matrix) are consecutive in memory. To be more concrete, in the example above, the element r(5,4) is just before r(6,4), but is not next to element r(5,5). The accompanying figure may help you visualize this.

When you loop over an array, you should try to do it in small steps through memory, because this allows for the most optimal use of the computer's memory architecture. A modern computer brings chunks of memory into its cache for the CPU to operate on; if the array elements in a series of operations are close together in memory, the CPU can perform many operations before it has to write the results back out to main memory.

If, on the other hand, the operations require big strides through memory, the computer will bring in a chunk of memory — technically called a cache line — but will only be able to operate on one or a few elements before having to write the chunk back out and get a new one. Retrieving and storing cache lines is a relatively time consuming operation on a modern computer.

Bartleby50 · Mar 17, 2010

Thanks a lot for the question and the interesting answers. I oversaw the implications for memory handling. I apologize for my first answer, that was obviously totally wrong.

confi999 · Mar 18, 2010

Thank you so much for all the answers and the questions.
All these gave me the answer and the explanation that I was looking for.
Thanks a lot to everyone.

amitavorc · May 15, 2012

Dear Sir,

I am using FORTRAN-90.
I have little experience and i have one problem.

I want to store data (i.e. X and Y values) in an array (A(900),B(900)) in such a way that at first I used only 30 elements of each array for storage 30 initial values of X and Y and then I have some scientific calculations to change the values of X and Y and then again want to store 30 modified values of X and Y in the same array from the 31th array element of both arrays. In this way I want to store my data and finally want to print these arrays.

Could you please help me out how I can write this algorithm I FORTRAN code…..

Please help me.

Please guide me

With best regards

(Fortran) Does order of accessing a matrix affect the performance?

Is A.I. more than the sum of its parts?

AI vs. Humans as Processors in an Environment

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

Newest Blender features

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

(Fortran) Does order of accessing a matrix affect the performance?

Similar threads