GCC & C++: Allocating Large Arrays in Windows

  • Thread starter CRGreathouse
  • Start date
  • Tags
    Arrays Gcc
In summary: const correctness occasionally helps a lot with runtimes.const correctness occasionally helps a lot with runtimes.
  • #1
CRGreathouse
Science Advisor
Homework Helper
2,844
0
I'm having trouble allocating large arrays in C++ using gcc (using cygwin/MSYS in Windows). If I declare an array at the beginning of main as:

unsigned int arr[1000][1000]

there's no problem, but even something as small (8 MB) as

unsigned int arr[1000][2000]

will fail. This is something of a problem for me, since the program I'm working on requires large precalculations and an array of "only" a million elements just won't do it. Without those precalculations I'd have to do at least a dozen, and often more like 300, modulus operations inside a tight loop executing > 1 trillion times, and that just takes too much time.

Am I doing something dumb? I did create the registry key heap_chunk_in_mb to 0x400 (1 GB) but I still get segmentation faults (or something) when I run the program with larger constants.
 
Technology news on Phys.org
  • #2
Your problem is that it's going on the stack, because it's a local variable. You either need to allocate it dynamically, or change its linkage. (e.g. by declaring it static, or making it global)
 
  • #3
malloc is your friend :-) I've only done it in C (when I had to do S[500][120000]), but I'm assuming C++ has some sort of dynamic memory allocation feature.
 
  • #4
Beeza said:
malloc is your friend :-) I've only done it in C (when I had to do S[500][120000]), but I'm assuming C++ has some sort of dynamic memory allocation feature.
new is the preferred method in C++, if you want to do the allocation by hand. The vector class will do the memory management for you... though I think vector would be awkward for CRGreathouse's purposes, since I don't think you can declare a vector<int[2000]>.
 
Last edited:
  • #5
Hurkyl said:
new is the preferred method in C++, if you want to do the allocation by hand. The vector class will do the memory management for you... though I think vector would be awkward for CRGreathouse's purposes, since I don't think you can declare a vector<int[2000]>.
For my sort test programs, I used vector <__int64> v1(1048576) and it worked fine.
 
  • #6
The problem here is that CRGreathouse wants a two-dimensional array. A vector of int[2000]'s, I believe, doesn't make sense. A vector of vectors is probably inefficient, due to double indirection. A flat vector would be efficient, but he would have to manually convert from a double index into a single index.1 He would then have to spend the extra effort to ensure that his code is just as fast as if he was using ordinary 2D arrays.



1: of course, he could write a class that does this conversion behind the scenes
 
  • #7
Hurkyl said:
[...] A vector of vectors is probably inefficient, due to double indirection.

Very slightly inefficient with recent compilers, and only in special setups. I was recently working on a code where you could switch via single define from bare-metal flat memory allocation and computation of indices by hand, to a vector of vectors storage for matrices. There was no difference in performance between these two with GCC (I think it was 4.0.x), and with Intel compiler (9.1) there was less than 10% loss with vectors and only when specifically compiling for an Intel CPU with aggressive optimization.

So, unless performance is of paramount importance, I'd just go with:
Code:
vector<vector<uint> > arr(1000, vector<uint>(2000));

--
Chusslove Illich (Часлав Илић)
 
  • #8
caslav.ilic said:
So, unless performance is of paramount importance, I'd just go with:
Code:
vector<vector<uint> > arr(1000, vector<uint>(2000));
According to the original post, performance is of paramount importance; CRGreathouse already has a working program (I presume), and is now in the process of making it blindingly fast. Otherwise, I would agree with you: vectors so darned convenient and flexible that you should use them when writing your program, only using something else if profiling says you should. (or you happen to have handy have some special-purpose class that suits your needs)
 
Last edited:
  • #9
Thanks, that solved it. I feel only a little silly; I don't think that was particularly obvious (except in hindsight).

Hurkyl said:
The problem here is that CRGreathouse wants a two-dimensional array. A vector of int[2000]'s, I believe, doesn't make sense. A vector of vectors is probably inefficient, due to double indirection. A flat vector would be efficient, but he would have to manually convert from a double index into a single index.1 He would then have to spend the extra effort to ensure that his code is just as fast as if he was using ordinary 2D arrays.

Bingo, speed is all.
 
  • #10
CRGreathouse said:
Bingo, speed is all.
In that case make it a single array, and size the "second dimension" to be a power of 2. Then to index, it's:

array[(first_index<<X)+(second_index)]

Declaring the array as opposed to dynamically allocating it will save one memory reference (LEA array versus MOV reg,[ptr_to_array]), unless the compiler can leave the address in a register.
 
Last edited:
  • #11
Another little thing... const correctness occasionally helps a lot with runtimes. I've even gone so far as precomputing lookup tables in a second program, and having that program write a header file containing the array declared as a static constant.

(For larger lookup tables, I had to write assembly code to define the array, because the g++ I was using couldn't handle it)
 
  • #12
I hadn't compared a vector-of-vectors matrix to a matrix in a flat vector in several years, so I just wrote an integer matrix-vector multiply both ways. I was surprised by the results: the difference in run time was <~ 1%. Run 1000 times for a 1000x2000 matrix, the flat vector matrix took 3.68 s, the vector of vectors matrix took 3.71 s.

GCC 4.0.1 on a Mac Core Duo. 32b. Nothing special in terms of SSE.
 
  • #13
I wouldn't expect to see much difference on a matrix multiply, because it accesses the entries in a sequential fashion. If I wrote a program like:

Code:
vector< vector<int> > myarray(1000, vector<int>(1000));
vector<int> input(1000);
vector<int> output(1000);

fill(output.begin(), output.end(), 0);
for(int i = 0; i < 1000; ++i) {
  for(int j = 0; j < 1000; ++j) {
    output[i] += input[j] * myarray[i][j];
  }
}

then in the course of optimization, the compiler will identify the constant subexpression myarray. In particular, it will generate essentially the same code as

Code:
vector< vector<int> > myarray(1000, vector<int>(1000));
vector<int> input(1000);
vector<int> output(1000);

fill(output.begin(), output.end(), 0);
for(int i = 0; i < 1000; ++i) {
  vector< vector<int> >::const_reference row = myarray[i];
  for(int j = 0; j < 1000; ++j) {
    output[i] += input[j] * row[j];
  }
}

The point is that the upper1 level of indirection only happens once per row, whereas the lower level happens every access... so you wouldn't expect to see much difference at all. In fact, I would guess that the runtime difference that you saw is entirely due to other effects.

(1: Is there a technical term for this?)


Now, if you accessed the matrix in column major order instead, there's a chance that you might see a bigger difference... I don't know if gcc is smart enough to reorder loops.
 
Last edited:
  • #14
Jeff Reid said:
In that case make it a single array, and size the "second dimension" to be a power of 2. Then to index, it's:

array[(first_index<<X)+(second_index)]

Declaring the array as opposed to dynamically allocating it will save one memory reference (LEA array versus MOV reg,[ptr_to_array]), unless the compiler can leave the address in a register.

I totally agree on that one. When I was beginning to despair of making the 2-D array work properly, I started hand-coding a 1-D solution* where I was doing the shifts myself. I'm pretty sure that gcc is smart enough to do the shift optimization with any kind of -O level.

* Yes, this isn't too hard -- but I was trying to further optimize it while so doing, which probably wasn't too bright -- small savings there, I'd wager.

Hurkyl said:
Another little thing... const correctness occasionally helps a lot with runtimes. I've even gone so far as precomputing lookup tables in a second program, and having that program write a header file containing the array declared as a static constant.

(For larger lookup tables, I had to write assembly code to define the array, because the g++ I was using couldn't handle it)

I'm not sure I can reasonably do that, since the table would actually take a while to read from disk.
 

1. What is GCC?

GCC, or GNU Compiler Collection, is a free and open-source compiler system that is commonly used to compile programming languages such as C, C++, and Fortran. It is maintained by the GNU Project and supports various platforms, including Windows.

2. What is C++?

C++ is a high-level programming language that is an extension of the C programming language. It is commonly used for developing operating systems, games, and other applications that require high performance. C++ is an object-oriented language and allows for low-level memory manipulation, making it a popular choice for systems programming.

3. Why is allocating large arrays in Windows a concern for GCC and C++ programmers?

Allocating large arrays in Windows can be a concern for GCC and C++ programmers because Windows has a limited amount of virtual address space available for a single process. This means that if a program tries to allocate a large array, it may run out of virtual address space, causing unexpected errors or crashes.

4. How can I allocate large arrays in Windows using GCC and C++?

There are a few ways to allocate large arrays in Windows using GCC and C++. One way is to use the "malloc" function, which allocates memory from the heap. Another option is to use the "VirtualAlloc" function, which allows for the allocation of memory pages directly from the operating system. Additionally, using a memory pool or memory-mapped files can also be effective for managing large arrays in Windows.

5. Are there any best practices for allocating large arrays in Windows with GCC and C++?

Yes, there are some best practices that can help with allocating large arrays in Windows using GCC and C++. One important practice is to avoid using the "new" operator, as it can cause issues with memory fragmentation. It is also recommended to use memory management techniques, such as garbage collection or smart pointers, to ensure that allocated memory is properly released when no longer needed. Additionally, it can be helpful to monitor memory usage and optimize code to minimize the need for large array allocations.

Similar threads

  • Programming and Computer Science
Replies
14
Views
1K
  • Programming and Computer Science
Replies
17
Views
1K
  • Programming and Computer Science
Replies
6
Views
2K
  • Engineering and Comp Sci Homework Help
Replies
17
Views
1K
  • Programming and Computer Science
Replies
10
Views
6K
  • Programming and Computer Science
Replies
4
Views
3K
  • Programming and Computer Science
Replies
7
Views
6K
  • Programming and Computer Science
Replies
15
Views
3K
  • Engineering and Comp Sci Homework Help
Replies
5
Views
3K
  • Engineering and Comp Sci Homework Help
Replies
3
Views
2K
Back
Top