Disclaimer:
I work on moderately computationally intensive code (without getting too domain-specific, it does real-time Bayesian and Markov analysis on a telemetry stream of about 50 MB/s), and in the last 6 years we have never had an issue pop up from using templates instead of arrays. I suppose there are scenarios where this becomes an issue, but by using the techniques of Alexandrescu and Bulka, I have never seen it become a problem. So my comments are more about general programming than about making the most of your allocated time on a 128-processor NUMA cluster.
nmtim said:
Just curious, which std C++ lib implementations don't use malloc and free under the covers? And even if new has its own memory allocator, what makes it inherently superior to malloc?
new is inherently superior to malloc because malloc/free know nothing about constructors/destructors. Code that uses malloc and free tends to leak memory once a type has to change from POD to a real class, because free never runs the destructor that would release whatever the members own. In my experience this change happens quite often, particularly in developing commercial software.
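To make the constructor/destructor point concrete, here is a minimal sketch. Tracker, live_after_malloc, and live_after_new are hypothetical names invented for this illustration; Tracker simply counts how many constructed instances are alive.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical type that counts constructed-but-not-yet-destroyed instances.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;

// malloc hands back raw bytes: no constructor runs, so the count stays put.
int live_after_malloc() {
    void* raw = std::malloc(sizeof(Tracker));
    assert(raw != nullptr);
    int seen = Tracker::live;   // still 0: no Tracker object was ever created
    std::free(raw);             // and free would not run a destructor either
    return seen;
}

// new allocates AND constructs; delete destroys AND releases.
int live_after_new() {
    Tracker* t = new Tracker;
    int seen = Tracker::live;   // 1: the constructor ran
    delete t;                   // the destructor runs, count drops back to 0
    return seen;
}
```

If Tracker later grew a member that owns memory (a std::string, say), the malloc/free version would neither initialize nor release it, which is the POD-to-class trap described above.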
Also, by overloading operator new and delete for a class you can tailor object creation and deletion to that specific class in a way that malloc and free simply can't. For example, I have written allocators which round up subsets of objects to the same size, built object pools, and the like, without the caller ever having to change how they create and destroy objects.
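A rough sketch of the object-pool idea follows. It is an illustration under assumptions, not the allocator I actually used: Particle, FreeNode, and cached() are made-up names, the pool is a simple free list, it is not thread-safe, and cached blocks are never handed back to the system.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Node type used to thread freed blocks into a singly linked free list.
struct FreeNode { FreeNode* next; };

// Hypothetical small class with a class-specific allocator.
class Particle {
public:
    double x = 0.0, y = 0.0, z = 0.0;

    static void* operator new(std::size_t size);
    static void operator delete(void* p, std::size_t size) noexcept;

    // Number of freed blocks currently waiting for reuse.
    static std::size_t cached() { return cached_; }

private:
    static FreeNode* free_list_;
    static std::size_t cached_;
};

static_assert(sizeof(Particle) >= sizeof(FreeNode), "block must hold a node");
static_assert(alignof(Particle) >= alignof(FreeNode), "block must align a node");

FreeNode* Particle::free_list_ = nullptr;
std::size_t Particle::cached_ = 0;

void* Particle::operator new(std::size_t size) {
    // A derived class of a different size falls back to the global allocator.
    if (size != sizeof(Particle)) return ::operator new(size);
    if (free_list_ != nullptr) {
        FreeNode* node = free_list_;   // reuse the most recently freed block
        free_list_ = node->next;
        --cached_;
        return node;
    }
    return ::operator new(size);       // pool empty: ask the global allocator
}

void Particle::operator delete(void* p, std::size_t size) noexcept {
    if (p == nullptr) return;
    if (size != sizeof(Particle)) { ::operator delete(p); return; }
    // Keep the block: build a list node in place and push it on the free list.
    free_list_ = ::new (p) FreeNode{free_list_};
    ++cached_;
}
```

Callers still write plain new Particle and delete p; the pooling is invisible to them.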
nmtim said:
Vectors are great in certain situations like the example shown. But they're more expensive than the humble array, and that means there are situations when arrays are better.
I am not aware of a single competent professional developer who uses arrays anymore, UNLESS there is a compelling set of profiling data which indicates that they should do so.
In fact it was my team's observation that getting rid of the array code tended to help speed things up since programs don't run very fast after they seg fault. :-)
Admittedly, we didn't get rid of the array code because it was slow. We got rid of it because it kept causing crashes: the person who wrote it kept scribbling on the stack or overwriting heap memory that didn't belong to them. The start of this thread illustrates how easy that is to do.
nmtim said:
The example has at least one heap allocation hidden in there, probably two or three. Automatic arrays, in the appropriate setting, are great because they skip that system call. In HPC code, this can be crucial. Using a vector for small arrays can be brutal if one is talking millions/billions of them over the course of a run.
You should act based on what the profiler says, not what you think it will say. I echo Hurky's comment in this regard.
nmtim said:
If you know your array size (for instance, good ole 3d vectors), you may save a lot of time with an array. Not to mention space--with 64 bit pointers, std::vector burns 24 bytes, the same amount as 3 doubles--100% overhead on your 3d vector. Plus a few more bytes for heap maintenance, though that's hidden.
You would probably have more reliable and more memory-efficient code if you created a 3d vector class with a custom operator new and delete, plus a template library that optimizes its operations, than you would by using an array. If you haven't already read Alexandrescu's and Josuttis's books, I think you would find them interesting.
Whether you have the time to engineer this subset of data sets is another question, but once you grok Alexandrescu and Josuttis, you will realize that arrays are not always the path to optimal performance. (For truly trivial operations they are, but there aren't that many truly trivial operations that need to be developed anymore.)
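As a starting point, a fixed-size 3d vector can be a small value type that stores its components inline, so there is no heap allocation and no per-object bookkeeping. The sketch below is only the basic shape: Vec3 and dot are names I made up for the example, and it leaves out both the custom operator new/delete and the expression-template style optimizations those books cover.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size 3d vector: three components stored inline.
template <typename T>
struct Vec3 {
    T x, y, z;

    Vec3& operator+=(const Vec3& r) {
        x += r.x; y += r.y; z += r.z;
        return *this;
    }
    Vec3& operator*=(T s) {
        x *= s; y *= s; z *= s;
        return *this;
    }
};

// Component-wise sum; takes the left operand by value and reuses it.
template <typename T>
Vec3<T> operator+(Vec3<T> a, const Vec3<T>& b) {
    a += b;
    return a;
}

// Scalar product of two vectors.
template <typename T>
T dot(const Vec3<T>& a, const Vec3<T>& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// On mainstream ABIs the object is exactly three doubles wide.
static_assert(sizeof(Vec3<double>) == 3 * sizeof(double),
              "expected no per-object overhead");
```

Unlike a raw double[3], this type can be copied, returned by value, and stored in containers, and there is no index to run past the end.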
nmtim said:
Another question--how much do you pay for instantiating iterators? On some implementations, it's just a pointer. On others, it's a ctor call.
In a release build almost nothing. In a debug build, quite a bit because a decent compiler will put in a bunch of extra run-time checks to help you catch problems while your code is still in development.
nmtim said:
One more comment, with push_back, you're opening yourself to repeated reallocation and copying, typically every time you cross a power of two. And the library may allocate more than you ask for, to leave you some room to grow. STL is great, but it's not free.
push_back doesn't incur any reallocation cost if you call reserve with an appropriate value before you start pushing. Since the size of my loop was known, I could have called reserve(5), and the cost of reallocations would have been 0. (On my compiler the default allocation is 16 elements, so it was 0 on my machine anyway.)
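You can check this on your own library by watching capacity() while you push. In the sketch below, count_reallocations is a hypothetical helper written for this post; it counts how often the capacity changes during n calls to push_back, with and without a reserve up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector's capacity changes while n elements
// are appended with push_back.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);                    // one allocation, done before the loop
    }
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != capacity) {  // the buffer was regrown and copied
            ++reallocations;
            capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserve_first set to true the count is 0 for any n, because the standard guarantees no reallocation until size() would exceed the reserved capacity. Without it, the count depends on the library's growth policy.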
Everything has a cost, but as a rule of thumb, I am a firm believer in the guideline that you should be using the Boost and STL libraries until the profiler tells you otherwise. This isn't a guideline I invented; it was given by Bjarne Stroustrup. It is my opinion that people who ignore him on this point do so at their own peril.
The real point I was trying to make is that arrays are the enemy of stability, flexibility, robustness, and productivity. And if you are programming something which will be used by other people, malloc/free are bad practice in object-oriented C++. From a software engineering standpoint, C++ arrays should be used as a last resort, based on profiling data that indicates that their use is unavoidable. There are exceptions to every rule: if your particular program is gathering data to feed into some Fortran or C program, then arrays are unavoidable. Sometimes the last resort is the only resort.
Unfortunately, in C++ the advanced template techniques require a bit of study, but ultimately they are often significantly faster than array-based programs. (Compilers do a better job of optimizing them than they do with the corresponding array-based algorithm.)
STL/Boost don't require much study, but template meta-programming does.
I am sure the person I am replying to is familiar with the following books on the subject. But for the others out there, here are some I found helpful:
Efficient C++: Performance Programming Techniques, by Dov Bulka and David Mayhew
Modern C++ Design: Generic Programming and Design Patterns Applied, by Andrei Alexandrescu
C++ Templates: The Complete Guide, by David Vandevoorde and Nicolai M. Josuttis
C++ Template Metaprogramming, by David Abrahams and Aleksey Gurtovoy