No use of mentioning size in an array in C/C++

Tonauac · Feb 4, 2007

int a[10];
for(int i=0;i<55;i++)
    cin>>a[i]; // this will store 55 elements, however the size of the array is 10

and it also prints them out:

for(int i=0;i<55;i++)
    cout<<a[i];

How is this possible? And if it is OK, then what is the use of mentioning the size of an array in C/C++?

Hurkyl · Feb 4, 2007

The C++ standard dictates that this program results in undefined behavior. It might do what you want, it might format your hard drive, or it might do anything in between -- it's all allowed by the standard.

However, the typical behavior of a buffer overflow is that it will corrupt other data, and possibly your stack or your heap, and usually leads to your program crashing with the dreaded segmentation fault.

Worse, buffer overflows represent (sometimes very large) security risks.

ChrisLeslie · Feb 4, 2007

There is a purpose in mentioning the size of an array. An area of memory is set aside for your array, sized for the number of elements you give when you declare it. If your compiler lets you violate the dimensions of the array by writing outside of it, you risk overwriting something else in memory and crashing the program. You just have to remember not to exceed the array dimensions. Many other compilers do have array bounds checking and will report an error if you exceed the array dimensions, but this is not always possible at compile time if the index is a variable whose value is only known at run time.

Chris

Tonauac · Feb 5, 2007

thanks...

But a C++ compiler doesn't check for it, and it doesn't destroy any other data either, because it just allocates from the free store (heap).

verty · Feb 5, 2007

...and it doesn't destroy any other data either, because it just allocates from the free store (heap).

Except when memory after the array is allocated.

Hurkyl · Feb 5, 2007

Tonauac said:

But a C++ compiler doesn't check for it, and it doesn't destroy any other data either, because it just allocates from the free store (heap).

If you overflow an array allocated on the free store (the heap), you will corrupt other data on the heap, which generally includes a lot of information used by the memory management library. Heap corruption is a good way to get a seg fault when calling new or delete.

But the array in your example is not allocated on the heap: it's allocated on the stack. So when you overwrite it, you will corrupt your stack.

NoTime · Feb 5, 2007

Tonauac said:

But a C++ compiler doesn't check for it, and it doesn't destroy any other data either, because it just allocates from the free store (heap).

One reason it doesn't check is that determining at compile time what value an index variable can have is generally impossible.
Your example is the exception rather than the norm.

Also, you can malloc storage for your array at run time.

nightowl03d · Feb 8, 2007

There is a difference between malloc and new. In C++, new[] and delete[] are to be preferred over malloc and free, for the simple reason that malloc and free know nothing about constructors and destructors. In addition new and delete can be made to perform faster, and with lower heap fragmentation than malloc and free. So if you REALLY have to make an array in C++, use new[] and delete[].

In most C++ shops the use of malloc and free is considered a terrorist attack upon the codebase because in effect a time bomb is being planted.

However, you rarely need to use arrays at all in C++; instead you should usually be using std::vector. The STL libraries are quite performant, and you save yourself a whole range of bugs. For example, the stack buffer overflow in the parent post would be eliminated if it were written as...

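#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a;
    int tmp = -1;
    for(int i=1;i<5;i++)   // reads four values
    {
        std::cin>>tmp;
        a.push_back(tmp);  // the vector grows as needed, so nothing is overrun
    }
    std::vector<int>::iterator end = a.end();
    for(std::vector<int>::iterator it = a.begin(); it != end;++it)
    {
        std::cout<<*it;
    }
    std::cout<<std::endl;
}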

nmtim · Feb 10, 2007

nightowl03d said:

In addition new and delete can be made to perform faster, and with lower heap fragmentation than malloc and free.

Just curious, which std C++ lib implementations don't use malloc and free under the covers? And even if new has its own memory allocator, what makes it inherently superior to malloc?

...you should usually be using std::vector. The STL libraries are quite performant, and you save yourself a whole range of bugs. For example, the stack buffer overflow in the parent post would be eliminated if it were written as...

std::vector<int> a;
int tmp = -1;
for(int i=1;i<5;i++)
{
    std::cin>>tmp;
    a.push_back(tmp);
}
std::vector<int>::iterator end = a.end();
for(std::vector<int>::iterator it = a.begin(); it != end;++it)
{
    std::cout<<*it;
}
std::cout<<std::endl;

Vectors are great in certain situations like the example shown. But they're more expensive than the humble array, and that means there are situations when arrays are better.

The example has at least one heap allocation hidden in there, probably two or three. Automatic arrays, in the appropriate setting, are great because they skip that allocation (and any system call behind it). In HPC code, this can be crucial. Using a vector for small arrays can be brutal if one is talking millions/billions of them over the course of a run.

If you know your array size (for instance, good ole 3d vectors), you may save a lot of time with an array. Not to mention space--with 64-bit pointers, std::vector typically burns 24 bytes, the same amount as 3 doubles--100% overhead on your 3d vector. Plus a few more bytes for heap maintenance, though that's hidden.

Another question--how much do you pay for instantiating iterators? On some implementations, it's just a pointer. On others, it's a ctor call.

One more comment, with push_back, you're opening yourself to repeated reallocation and copying, typically every time you cross a power of two. And the library may allocate more than you ask for, to leave you some room to grow. STL is great, but it's not free.

Hurkyl · Feb 10, 2007

nmtim said:

Vectors are great in certain situations like the example shown. But they're more expensive than the humble array, and that means there are situations when arrays are better.

Of course, this is only relevant when you get to the point where it's time to optimize your code. Worrying about this before you're at that point is one of the great programming sins.

Another question--how much do you pay for instantiating iterators? On some implementations, it's just a pointer. On others, it's a ctor call.

If the ctor call isn't optimized away, then you shouldn't be using that C++ compiler for HPC in the first place. :-P

nightowl03d · Feb 10, 2007

Disclaimer:
I work on moderately computationally intensive code (without getting too domain specific, it does real-time Bayesian and Markov analysis on about a 50 MB/s telemetry stream), and in the last 6 years we have never had an issue pop up from using templates instead of arrays. I suppose there are scenarios where this becomes an issue, but by using the techniques of Alexandrescu and Bulka, I have never seen it become a problem. So my comments are more about general programming than about when you need to make the most of your allocated time on a 128-processor NUMA cluster.

nmtim said:

Just curious, which std C++ lib implementations don't use malloc and free under the covers? And even if new has its own memory allocator, what makes it inherently superior to malloc?

It is inherently superior to malloc because malloc/free know nothing about constructors/destructors. The use of malloc and free can lead to memory leaks when you need to switch from POD to real classes. In my experience this happens quite often, particularly in developing commercial software.

Also, by overriding operator new and delete for a class you can optimize object creation and deletion appropriate for a specific class in a way that malloc and free simply can't do. For example, I have done allocators which round up subsets of objects to the same size, created object pools, and the like without the caller ever having to change their semantics.

nmtim said:

Vectors are great in certain situations like the example shown. But they're more expensive than the humble array, and that means there are situations when arrays are better.

I am not aware of a single competent professional developer who uses arrays anymore, UNLESS there is a compelling set of profiling data which indicates that they should do so.

In fact it was my team's observation that getting rid of the array code tended to help speed things up since programs don't run very fast after they seg fault. :-)

Admittedly we didn't get rid of the array code because of its slow performance, we got rid of it because it kept causing crashes because the person who did them kept scribbling on the stack or over-writing heap memory that didn't belong to them. An example of how easy it is to do this is illustrated by the start of this thread.

nmtim said:

The example has at least one heap allocation hidden in there, probably two or three. Automatic arrays, in the appropriate setting, are great because they skip that allocation (and any system call behind it). In HPC code, this can be crucial. Using a vector for small arrays can be brutal if one is talking millions/billions of them over the course of a run.

You should act based on what the profiler says, not what you think it will say. I echo Hurkyl's comment in this regard.

nmtim said:

If you know your array size (for instance, good ole 3d vectors), you may save a lot of time with an array. Not to mention space--with 64-bit pointers, std::vector typically burns 24 bytes, the same amount as 3 doubles--100% overhead on your 3d vector. Plus a few more bytes for heap maintenance, though that's hidden.

You would probably have more reliable and more memory-efficient code if you created a 3d vector class with a custom operator new and delete, plus a template library which optimized its operations, than if you used an array. If you haven't already read Alexandrescu's and Josuttis's books, I think you would find them interesting.

Whether you have the time to engineer this subset of data sets is another question, but once you grok Alexandrescu and Josuttis, you will realize that arrays are not always the path to optimal performance. (For truly trivial operations they are, but there aren't that many truly trivial operations that need to be developed any more.)

nmtim said:

Another question--how much do you pay for instantiating iterators? On some implementations, it's just a pointer. On others, it's a ctor call.

In a release build almost nothing. In a debug build, quite a bit because a decent compiler will put in a bunch of extra run-time checks to help you catch problems while your code is still in development.

nmtim said:

One more comment, with push_back, you're opening yourself to repeated reallocation and copying, typically every time you cross a power of two. And the library may allocate more than you ask for, to leave you some room to grow. STL is great, but it's not free.

push_back doesn't incur any reallocation cost if you call the reserve method with an appropriate value before you call it. Since the size of my loop was known, I could have called reserve(5), and the cost of reallocs would have been 0. (On my compiler the default is 16 elements, so it was 0 on my machine.)

Everything has a cost, but as a rule of thumb, I am a firm believer in the guideline that you should be using the Boost and STL libraries until the profiler tells you otherwise. This isn't a guideline I invented; it was given by Bjarne Stroustrup. It is my opinion that people who ignore him on this point do so at their own peril.

The real point I was trying to make is that arrays are the enemy of stability, flexibility, robustness, and productivity. And if you are programming something which will be used by other people, malloc/free are bad practice in C++ object oriented programming. From a software engineering standpoint, C++ arrays should be used as a last resort based on profiling data that indicates that their use is unavoidable. There are exceptions to every rule: if your particular program is gathering data to feed into some Fortran or C program, well, then arrays are unavoidable. Sometimes the last resort is the only resort.

Unfortunately, in C++ the advanced template techniques require a bit of study, but ultimately, they are often significantly faster than array based programs. (Compilers do a better job of optimization than with a corresponding array based algorithm).

STL/Boost don't require much study, but template meta-programming does.
I am sure the person who I am replying to is familiar with the following books on the subject. But for the others out there, here are some I found helpful.

Efficient C++: Performance Programming Techniques, by Dov Bulka and David Mayhew

Modern C++ Design: Generic Programming and Design Patterns Applied, by Andrei Alexandrescu

C++ Templates: The Complete Guide, by David Vandevoorde and Nicolai M. Josuttis

C++ Template Metaprogramming, by David Abrahams and Aleksey Gurtovoy
