# Textbook wrong?

• Fortran

## Main Question or Discussion Point

See attached image.

I have never heard that before. Why wouldn't one be able to implement a data structure that stores a matrix by its columns using C? Is that true?

#### Attachments

• 19.2 KB Views: 336

Related Programming and Computer Science News on Phys.org
Mark44
Mentor
See attached image.

I have never heard that before. Why wouldn't one be able to implement a data structure that stores a matrix by its columns using C? Is that true?
What this is saying is that for the definition below, C stores the elements in what is called row-major order.
Code:
int arr[3][3] = {{1, 1, 1}, {2, 2, 2}, {3, 3, 3}};
In the memory of a C program, the numbers are stored as 1, 1, 1, 2, 2, 2, 3, 3, 3, in consecutive memory locations. Fortran, in contrast, stores array elements in column-major order, so that the same array would appear in memory as 1, 2, 3, 1, 2, 3, 1, 2, 3, in consecutive memory locations.

So, no, the textbook is correct.

Last edited:
I get it now. Thanks.

But there are two problems still:
Mark44: Fixed the first array index, which is now 3.

Second, why wouldn't you use {{1, 2, 3}, {1, 2, 3}, {1, 2, 3}} for this same matrix? If this was properly documented, I don't see a reason why not to (supposing the performance gain would be noticeable).

Last edited by a moderator:
Mark44
Mentor
I get it now. Thanks.

But there are two problems still:

Second, why wouldn't you use {{1, 2, 3}, {1, 2, 3}, {1, 2, 3}} for this same matrix? If this was properly documented, I don't see a reason why not to (supposing the performance gain would be noticeable).
What same matrix? The matrix (the terminology is "array" in programming) I was talking about is the one in my example.
Neither of the points you raised is a problem, except possibly in your understanding. My declaration isn't "wrong." I get to declare my own array any way that I want, as long as I follow the syntactic rules of C. What I posted is a proper declaration for a two-dimensional array in C.

If I had wanted the array to be as you wrote it, with 1, 2, and 3 in each row, I would have declared it that way. As I wrote my declaration, the first row is 1, 1, 1, and the second row is 2, 2, 2, and the third row is 3, 3, 3. As I said before, memory would look like this: 1 1 1 2 2 2 3 3 3, with the numbers appearing in consecutive memory locations.

What I was saying is that if I had declared the array as I did in post #2, memory would look as I described it.

" My declaration isn't "wrong." "
int arr[2][3] = {{1, 1, 1}, {2, 2, 2}, {3, 3, 3}};
Is your compiler okay with this?
Mark44 edit: Changed to int arr[3][3] = ...

"If I had wanted the array to be as you wrote it, with 1, 2, and 3 in each row, I would have declared it that way. As I wrote my declaration, the first row is 1, 1, 1, and the second row is 2, 2, 2, and the third row is 3, 3, 3. As I said before, memory would look like this: 1 1 1 2 2 2 3 3 3, with the numbers appearing in consecutive memory locations."

Yes, I get it. And pretty much every living creature also does. What I was telling you is that it is perfectly possible to get the same disposition in memory you would get in Fortran by just declaring each column in what would be each row. Then operate on that if it offers better performance.

Last edited by a moderator:
And even when the author says that "the most widely used algorithms (...) Fortran.", from what I have seen MatLab is pretty widely used both in industry and research and its implemented mostly on C.

Mark44
Mentor
Yes, that 2 should be a 3 in the declaration. I wish you had been more specific. I thought you meant the numbers I put in my array, not that the first array index was too small.

Mark44
Mentor
And even when the author says that "the most widely used algorithms (...) Fortran.", from what I have seen MatLab is pretty widely used both in industry and research and its implemented mostly on C.
I don't know this for certain, but I would bet that a lot of the back-end code is implemented in Fortran. There are lots of math libraries, such as LINPACK and many others that do the heavy lifting for matrix inversion, eigenvalue/eigenvector calculation, solving differential equations, and so on.

Yes, I've seen some stuff of numerical linear algebra packages after Googling about it and coming here.

But the most important part, which you did not refer to in your last answers is

(...) What I was telling you is that it is perfectly possible to get the same disposition in memory you would get in Fortran by just declaring each column in what would be each row. Then operate on that if it offers better performance.
Am I correct? You just need to write your matrices differently to get what would be the same disposition on memory.

C:
int a[b][c];
It seems counter intuitive that iterating (suppose we have the array above) all values of b for each value of c should be faster, but if it is, then I learned something new.

And even when the author says that "the most widely used algorithms (...) Fortran.", from what I have seen MatLab is pretty widely used both in industry and research and its implemented mostly on C.
Interestingly, some of the scripting functionality compiles to Java (it even allows Swing access). It used to be Fortran, but they switched it to C because at the time Fortran didn't have dynamic memory allocation.

FactChecker
Gold Member
It's not a question of what you can do in C, it is a question of what the compiler automatically does by default for you. M[row][column] is the most natural way to reference elements of a matrix M. In C if you want a matrix, M, to have an element in row=r and column=c referenced by M[r][c], then the compiler forces where those elements are stored consecutively:
M[1][1], M[1][2], M[1][3],... M[2][1], M[2][2], ...

The reason no one makes a structure to change the C order is that matrix calculations are easy to do efficiently the way it is now. It's easy to define loop calculations that use the matrix numbers from memory in sequence.

P.S. The advice in the book regarding the speed of calculations is very debatable. There are many steps and optimizations between your code and the speed of the resulting execution. There are look-ahead data fetches and guesses about which paths your code execution will take that make it virtually impossible for you to anticipate what will be fast.

M[row][column] is the most natural way to reference elements of a matrix M.
Nope. This is matter of convention. A program consistently using the other convention, especially if well document, would read just as natural.

The C language API for LAPACK supports both conventions: http://www.netlib.org/lapack/lapacke.html#_array_arguments.

wle
Yes, I get it. And pretty much every living creature also does. What I was telling you is that it is perfectly possible to get the same disposition in memory you would get in Fortran by just declaring each column in what would be each row. Then operate on that if it offers better performance.
The issue isn't Fortan's way of storing arrays being better than C's or vice versa. The languages simply follow different conventions which you should keep in mind when you write code that loops over arrays.

In C, the array element arr[2][2] is always directly followed by the array element arr[2][3] in memory. Technically, in terms of the C language definition and for the example array defined above, if p is a pointer to the array element arr[2][2], then p + 1 is guaranteed to point to the array element arr[2][3] and p + 3 points to the array element arr[3][2].

Fortran is a bit less straightforward. First of all, Fortran has no concept of pointer arithmetic, so there's no language-level way to define where array elements are going to be placed in memory. As far as I know, that means it's simply a convention that compiler writers follow to store arrays by column first in memory rather than by row. Second, even that convention isn't necessarily always true. I actually experimented several years ago with gfortran and passing arrays between C and Fortran to figure out what gfortran was doing with arrays. Here's what I remember for a two-dimensional array:
• Internally, gfortran creates two "multiplier" integer variables to keep track of where array elements are located. Calling the variables m and n, it would compute something like m * i + n * j to figure out where the array element arr(i,j) is located in memory.
• For a newly created array, m = 1 and n is set to the number of columns, so array elements in the same column are stored sequentially in memory.
• Certain array manipulation functions, like the transpose and array slicing functions, work by changing these values. For example, transpose(arr) works by interchanging the values of m and n rather than moving all the array elements around in memory.

FactChecker
Gold Member
Nope. This is matter of convention. A program consistently using the other convention, especially if well document, would read just as natural.
Perhaps I should have been more specific. In mathematics, matrix indices or subscripts are normally in row,column order. So the C syntax is much more natural for a mathematician.

Perhaps I should have been more specific. In mathematics, matrix indices or subscripts are normally in row,column order. So the C syntax is much more natural for a mathematician.
I think you got it wrong. It is not about C syntax. C does not have any idea of what a matrix is. Yes, if we talk about $A_{m,n}$, I would agree with you that, undoubtedly, $A$ has $m$ rows and $n$ columns.

But what I suggested, and voko also talked about, is using the "reversed" convention when writing it using C in order to get the same memory disposition that Fortran would generate by following the "normal" convention.

I also find array_identifier[row][col] to be simpler, but nonetheless suggested using it the other way around to get better performance. voko not only say that it is acceptable to do it the other way around, but that "A program consistently using the other convention, especially if well document, would read just as natural.".

FactChecker
Gold Member
I think you got it wrong. It is not about C syntax. C does not have any idea of what a matrix is. Yes, if we talk about $A_{m,n}$, I would agree with you that, undoubtedly, $A$ has $m$ rows and $n$ columns.
I stand by my statement. What voko complained about was my statement: "M[row][column] is the most natural way to reference elements of a matrix M.". I then admitted that I should have referred to the math convention.

Mrow,column <=> M[row][column] is a
straightforward mapping to the long-standing convention in math.

In this case, you are right.

But still, "the C syntax is much more natural for a mathematician." does not make much sense to me as C does not even imply what the "outer" array represents. If instead you made a direct reference to a given example of C that used [row][column] you would make sense.

FactChecker
Gold Member
In this case, you are right.

But still, "the C syntax is much more natural for a mathematician." does not make much sense to me as C does not even imply what the "outer" array represents. If instead you made a direct reference to a given example of C that used [row][column] you would make sense.
I think I see your point. In C, M[i1][i2] could be referring to anything from a simple matrix of numbers to a set of complicated structure pointers. So the math matrix interpretation is not as paramount as I was assuming. I will concede the point.

Having been thinking on and off about this for half a day, I now believe that the quoted passage is fundamentally wrong on two accounts.

1. While not explicitly stating that, it suggests very heavily that Fortran's internal representation is superior. It has been pointed out in this thread that modern hardware makes this rather dubious. But more fundamentally, the statement is based on (assumed) performance of matrix multiplication. But that may not be a meaningful metric; a program solving a linear system or finding eigenvalues may not have a single matrix multiplication. Would Fortran's representation be superior in this case?

2. C has no intrinsic knowledge of "rows" or "columns". A multi-dimensional arrays in C is a block of memory with a particular relationship between indices and the address in the block. The programmer is free to interpret any index as a row or as a column. The situation is similar in Fortran, except that the relationship is not part of the language, but is an implementation detail. The programmer can assign "row" or "column" meaning to Fortran's indices regardless. In fact, so can a mathematician, provided that notation is consistent and well-defined.

Having been thinking on and off about this for half a day, I now believe that the quoted passage is fundamentally wrong on two accounts.

1. While not explicitly stating that, it suggests very heavily that Fortran's internal representation is superior. It has been pointed out in this thread that modern hardware makes this rather dubious. But more fundamentally, the statement is based on (assumed) performance of matrix multiplication. But that may not be a meaningful metric; a program solving a linear system or finding eigenvalues may not have a single matrix multiplication. Would Fortran's representation be superior in this case?

2. C has no intrinsic knowledge of "rows" or "columns". A multi-dimensional arrays in C is a block of memory with a particular relationship between indices and the address in the block. The programmer is free to interpret any index as a row or as a column. The situation is similar in Fortran, except that the relationship is not part of the language, but is an implementation detail. The programmer can assign "row" or "column" meaning to Fortran's indices regardless. In fact, so can a mathematician, provided that notation is consistent and well-defined.
As a side note, nothing has been mentioned here to support the claim that "(...) the most widely used professional algorithms for matrix computations are written in fortran (...)".

wle
1. While not explicitly stating that, it suggests very heavily that Fortran's internal representation is superior. It has been pointed out in this thread that modern hardware makes this rather dubious. But more fundamentally, the statement is based on (assumed) performance of matrix multiplication. But that may not be a meaningful metric; a program solving a linear system or finding eigenvalues may not have a single matrix multiplication. Would Fortran's representation be superior in this case?
The point isn't that one convention is intrinsically better than the other. It's that there's often different ways of looping over arrays that will produce the same end result, and where that's the case you should know which convention your favourite programming language/library/environment is using and try to loop in a way that will minimise how much you jump around in memory. As an example, suppose you have a large 5000x5000 element array and you want to do something with each element of the array. In C, you should iterate over the second index in the inner loop:
C:
for (i = 0; i < 5000; ++i) {
for (j = 0; j < 5000; ++j) {
/* Do something with arr[i][j] */
}
}
In Fortran, you should instead iterate over the first index in the inner loop:
Fortran:
do j=1,5000
do i=1,5000
! Do something with arr(i,j)
end do
end do
(or better still in this case, use Fortran's forall loop and let the compiler figure out the best way to do it.)

If you're wondering why this matters, by the way, it's because there's more than one type of memory on a computer. Specifically, modern computers have one or more levels of processor cache memory, main memory itself, and virtual memory (i.e. a page file or swap partition on the hard disk), and there are big differences in how quickly the computer accesses each of these. If you write code that jumps around needlessly over a very large array, your computer could waste a lot of time copying stuff between cache and main memory or (worse) main and virtual memory.

The situation is similar in Fortran, except that the relationship is not part of the language, but is an implementation detail. The programmer can assign "row" or "column" meaning to Fortran's indices regardless.
That's not completely true. Fortran has a few built-in functions for matrix manipulation and these follow the standard mathematical (first index is row, second index is column) convention. So if you're storing matrices in Fortran rank-2 arrays, you should follow that convention if you want code like C = matmul(A, B) to work as expected.

The point isn't that
My impression upon reading and re-reading the passage several times, was that the author was of the opinion that Fortran's representation is superior to C's. Which is false.

I did not think that a trivial statement on the benefits of memory locality was worthy of any additional discussion.

If you're wondering why this matters, the reason is that there are different levels of memory on a computer.
I would like to stay out of analyzing this with respect to modern hardware. Otherwise I would have to point out that in the examples given by you, you would find it very difficult to measure any impact of the loop ordering on performance, if compiled with a modern optimizing compiler and run on a modern CPU.

wle
My impression upon reading and re-reading the passage several times, was that the author was of the opinion that Fortran's representation is superior to C's.
I didn't get that impression. Just that C and Fortran use different conventions and that one should be aware of them.

I would like to stay out of analyzing this with respect to modern hardware. Otherwise I would have to point out that in the examples given by you, you would find it very difficult to measure any impact of the loop ordering on performance, if compiled with a modern optimizing compiler and run on a modern CPU.
Have you actually tried this? I just tested it out with GCC and the difference can be quite measurable.

Specifically, I compiled the code below (saved in a file called "arrloop.c") with gcc -O3 -o arrloop arrloop.c. I increased the array size to 20000x20000 to make the code run longer. Timing the resulting executable produces a result like this:
Code:
$time ./arrloop real 0m0.752s user 0m0.223s sys 0m0.526s Same code except with the loop orders interchanged ("arrloop2.c") and same compilation option: Code: $ time ./arrloop2

real    0m5.853s
user    0m5.248s
sys     0m0.598s
There'll be some variation each time you run the code, but the difference is quite substantial: a few hundred milliseconds of user time vs. over five seconds.

Code for arrloop.c:

C:
#define ASIZE 20000

int arr[ASIZE][ASIZE];

int main(void)
{
int i, j, res = 0;

for (i = 0; i < ASIZE; ++i) {
for (j = 0; j < ASIZE; ++j) {
arr[i][j] = 2 * i + j;
}
}

for (i = 0; i < ASIZE; ++i) {
for (j = 0; j < ASIZE; ++j) {
res += arr[i][j];
}
}

return res;
}
arrloop2.c is identical except that the two inner/outer loops are interchanged.

I increased the array size to 20000x20000 to make the code run longer.
That makes a huge difference. My comment was about the 5K version that you originally posted.

The original version also did not contain a dedicated single "read" loop. That loop is where the majority of time will be spent in the 20K version. Take it out, and you will get very different results.

wle
That makes a huge difference. My comment was about the 5K version that you originally posted.
That still makes a difference. On my machine, 23 milliseconds vs. 189 milliseconds of user time (best times out of two or three tries).

That said, if you're counting on me using a specific array size, I think you're missing the point.

The original version also did not contain a dedicated single "read" loop. That loop is where the majority of time will be spent in the 20K version. Take it out, and you will get very different results.
Commenting out the first ("arr[ i][j] = 2 * i + j;") loop entirely and still with 5000x5000 arrays: 10 ms vs. 101 ms of user time (best time of three tries).