Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Big arrays in gcc?

  1. Sep 1, 2007 #1

    CRGreathouse

    User Avatar
    Science Advisor
    Homework Helper

    I'm having trouble allocating large arrays in C++ using gcc (using cygwin/MSYS in Windows). If I declare an array at the beginning of main as:

    unsigned int arr[1000][1000]

    there's no problem, but even something as small (8 MB) as

    unsigned int arr[1000][2000]

    will fail. This is something of a problem for me, since the program I'm working on requires large precalculations and an array of "only" a million elements just won't do it. Without those precalculations I'd have to do at least a dozen, and often more like 300, modulus operations inside a tight loop executing > 1 trillion times, and that just takes too much time.

    Am I doing something dumb? I did create the registry key heap_chunk_in_mb to 0x400 (1 GB) but I still get segmentation faults (or something) when I run the program with larger constants.
     
  2. jcsd
  3. Sep 1, 2007 #2

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Your problem is that it's going on the stack, because it's a local variable. You either need to allocate it dynamically, or change its linkage. (e.g. by declaring it static, or making it global)
     
  4. Sep 1, 2007 #3
    malloc is your friend :-) I've only done it in C (when I had to do S[500][120000]), but I'm assuming C++ has some sort of dynamic memory allocation feature.
     
  5. Sep 1, 2007 #4

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    new is the preferred method in C++, if you want to do the allocation by hand. The vector class will do the memory management for you... though I think vector would be awkward for CRGreathouse's purposes, since I don't think you can declare a vector<int[2000]>.
     
    Last edited: Sep 1, 2007
  6. Sep 1, 2007 #5

    rcgldr

    User Avatar
    Homework Helper

    For my sort test programs, I used vector <__int64> v1(1048576) and it worked fine.
     
  7. Sep 1, 2007 #6

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    The problem here is that CRGreathouse wants a two-dimensional array. A vector of int[2000]'s, I believe, doesn't make sense. A vector of vectors is probably inefficient, due to double indirection. A flat vector would be efficient, but he would have to manually convert from a double index into a single index.1 He would then have to spend the extra effort to ensure that his code is just as fast as if he was using ordinary 2D arrays.



    1: of course, he could write a class that does this conversion behind the scenes
     
  8. Sep 2, 2007 #7
    Very slightly inefficient with recent compilers, and only in special setups. I was recently working on a code where you could switch via single define from bare-metal flat memory allocation and computation of indices by hand, to a vector of vectors storage for matrices. There was no difference in performance between these two with GCC (I think it was 4.0.x), and with Intel compiler (9.1) there was less than 10% loss with vectors and only when specifically compiling for an Intel CPU with aggressive optimization.

    So, unless performance is of paramount importance, I'd just go with:
    Code (Text):
    vector<vector<uint> > arr(1000, vector<uint>(2000));
    --
    Chusslove Illich (Часлав Илић)
     
  9. Sep 2, 2007 #8

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    According to the original post, performance is of paramount importance; CRGreathouse already has a working program (I presume), and is now in the process of making it blindingly fast. Otherwise, I would agree with you: vectors so darned convenient and flexible that you should use them when writing your program, only using something else if profiling says you should. (or you happen to have handy have some special-purpose class that suits your needs)
     
    Last edited: Sep 2, 2007
  10. Sep 2, 2007 #9

    CRGreathouse

    User Avatar
    Science Advisor
    Homework Helper

    Thanks, that solved it. I feel only a little silly; I don't think that was particularly obvious (except in hindsight).

    Bingo, speed is all.
     
  11. Sep 2, 2007 #10

    rcgldr

    User Avatar
    Homework Helper

    In that case make it a single array, and size the "second dimension" to be a power of 2. Then to index, it's:

    array[(first_index<<X)+(second_index)]

    Declaring the array as opposed to dynamically allocating it will save one memory reference (LEA array versus MOV reg,[ptr_to_array]), unless the compiler can leave the address in a register.
     
    Last edited: Sep 2, 2007
  12. Sep 2, 2007 #11

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Another little thing... const correctness occasionally helps a lot with runtimes. I've even gone so far as precomputing lookup tables in a second program, and having that program write a header file containing the array declared as a static constant.

    (For larger lookup tables, I had to write assembly code to define the array, because the g++ I was using couldn't handle it)
     
  13. Sep 2, 2007 #12
    I hadn't compared a vector-of-vectors matrix to a matrix in a flat vector in several years, so I just wrote an integer matrix-vector multiply both ways. I was surprised by the results: the difference in run time was <~ 1%. Run 1000 times for a 1000x2000 matrix, the flat vector matrix took 3.68 s, the vector of vectors matrix took 3.71 s.

    GCC 4.0.1 on a Mac Core Duo. 32b. Nothing special in terms of SSE.
     
  14. Sep 2, 2007 #13

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    I wouldn't expect to see much difference on a matrix multiply, because it accesses the entries in a sequential fashion. If I wrote a program like:

    Code (Text):

    vector< vector<int> > myarray(1000, vector<int>(1000));
    vector<int> input(1000);
    vector<int> output(1000);

    fill(output.begin(), output.end(), 0);
    for(int i = 0; i < 1000; ++i) {
      for(int j = 0; j < 1000; ++j) {
        output[i] += input[j] * myarray[i][j];
      }
    }
     
    then in the course of optimization, the compiler will identify the constant subexpression myarray. In particular, it will generate essentially the same code as

    Code (Text):

    vector< vector<int> > myarray(1000, vector<int>(1000));
    vector<int> input(1000);
    vector<int> output(1000);

    fill(output.begin(), output.end(), 0);
    for(int i = 0; i < 1000; ++i) {
      vector< vector<int> >::const_reference row = myarray[i];
      for(int j = 0; j < 1000; ++j) {
        output[i] += input[j] * row[j];
      }
    }
     
    The point is that the upper1 level of indirection only happens once per row, whereas the lower level happens every access... so you wouldn't expect to see much difference at all. In fact, I would guess that the runtime difference that you saw is entirely due to other effects.

    (1: Is there a technical term for this?)


    Now, if you accessed the matrix in column major order instead, there's a chance that you might see a bigger difference... I don't know if gcc is smart enough to reorder loops.
     
    Last edited: Sep 2, 2007
  15. Sep 4, 2007 #14

    CRGreathouse

    User Avatar
    Science Advisor
    Homework Helper

    I totally agree on that one. When I was beginning to despair of making the 2-D array work properly, I started hand-coding a 1-D solution* where I was doing the shifts myself. I'm pretty sure that gcc is smart enough to do the shift optimization with any kind of -O level.

    * Yes, this isn't too hard -- but I was trying to further optimize it while so doing, which probably wasn't too bright -- small savings there, I'd wager.

    I'm not sure I can reasonably do that, since the table would actually take a while to read from disk.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?