Why is making a single iteration through a list/vector/array/etc.

Discussion Overview

The discussion revolves around the efficiency and elegance of iterating through data structures, particularly linked lists, in programming. Participants explore the implications of single-pass versus multiple-pass solutions in various contexts, including algorithm design and performance considerations.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants question the notion that a single iteration is always the "best" solution, suggesting that clarity and simplicity can sometimes be more valuable.
  • Others argue that the definition of "best" can vary, encompassing factors like speed, readability, and code length, which may differ based on the programming language used.
  • A participant proposes that using a method to directly access the middle element of a linked list could be more efficient than a single-pass approach, especially if the list maintains its size as an attribute.
  • Another participant humorously suggests that returning the first element could be a valid answer to the middle element question, highlighting the ambiguity in the problem statement.
  • Concerns are raised about cache performance when using multiple pointers in a single-pass solution, suggesting that in some cases, multiple passes may be more efficient due to memory access patterns.
  • Participants discuss the trade-offs between single and multiple loops in array operations, noting that memory management becomes increasingly important as the size of the data grows.

Areas of Agreement / Disagreement

Participants express differing views on the efficiency of single versus multiple iterations, with no consensus reached on which approach is superior. The discussion remains unresolved, with multiple competing perspectives presented.

Contextual Notes

Participants highlight various assumptions regarding data structure implementations, such as the behavior of STL containers and the implications of memory access patterns, which may affect the performance of different looping strategies.

Jamin2112
... always considered the "best" solution, as opposed to a solution that iterates through twice or more times? Isn't it sometimes more efficient and elegant-looking to accomplish a task step-by-step instead of trying to cram everything into a single pass?

I'm wondering because so many little programming challenge problems I see on the internet always want a single pass, e.g. "Find the middle element of a linked list using a single pass."
 
Asking "Find the middle element of a linked list using a single pass" does not imply that it is the "best" solution.
And what does "best" mean? Is it "fastest", "simplest", "most readable", or "shortest code"?
It also depends on the language you are using. In LINQ it makes less sense to talk about "iterating" at all.

Were you able to find the middle element of a linked list using a single pass?
It is a fun challenge, after all, isn't it?
 
I concur with maajdl. These puzzles oftentimes are not the "best" solution. Think of them as puzzles whose intent is to make you think about data structures and algorithms.

As for the "best" way to find the middle element of a linked list in the real world, I would view something along the lines of the following as pretty hard to beat. Anything else is micro-optimization.
Code:
// Find the middle element of the linked list.
middle = list.get(list.size() / 2);
Compared to a loop that walks over the entire list and keeps a separate middle iterator that is incremented once for every two times the list iterator is incremented, the above code
  • Is much shorter (a *big* plus in the real world),
  • Is much more obvious in what it is doing (another big plus),
  • Uses the high-level functionality of the linked list object rather than its low-level functionality (yet another big plus), and
  • Might well be faster than the loop over the whole list.
    Suppose the linked list object maintains the size of the list as an easily accessible attribute. This makes list.size() an O(1) algorithm. If list.get() walks over the list to the midpoint, the above code is twice as fast as the puzzle solution. If the underlying implementation of the list is a random access array, then list.get() is also O(1).
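
As a rough C++ rendering of the one-liner above, here is a minimal sketch assuming the list is a `std::list`. The helper name `middle_element` is hypothetical and not from the thread. `std::list::size()` is constant time since C++11, but `std::next` still has to walk `size()/2` links because list iterators are not random access.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>

// Hypothetical helper: returns the element at index size()/2.
// For an even-length list this is the second of the two middle elements.
template <typename T>
const T& middle_element(const std::list<T>& items) {
    if (items.empty()) {
        throw std::out_of_range("middle_element: empty list");
    }
    // size() is O(1) for std::list since C++11; std::next walks size()/2 links.
    const auto steps = static_cast<std::ptrdiff_t>(items.size() / 2);
    return *std::next(items.begin(), steps);
}
```

Calling `middle_element` on a list holding 1, 2, 3, 4, 5 yields 3. Whether an even-length list should return the first or second middle element is a choice the original puzzle leaves open.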
 
Jamin2112 said:
"Find the middle element of a linked list using a single pass."

Just return the first element. The question doesn't say what list traversal algorithm you were supposed to use. The element you returned is the middle element, for some traversal algorithm :devil:
 
D H said:
Might well be faster than the loop over the whole list.
Suppose the linked list object maintains the size of the list as an easily accessible attribute. This makes list.size() an O(1) algorithm.

All STL containers, including list, are required to implement ::size() at constant complexity, i.e., as O(1).
 
voko said:
All STL containers, including list, are required to implement ::size() at constant complexity, i.e., as O(1).
Not std::forward_list (new in C++11).

Note that STL is a bit of a misnomer. The containers library in C++ is not the STL. The STL is code developed in the 1980s by Alexander Stepanov that is no longer maintained. While it did serve as a template (pun intended) for significant chunks of the C++ standard library, what is in C++ is not the STL.
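
To illustrate the `std::forward_list` point: that container has no `size()` member, so the count has to come from a walk. The following is a minimal sketch, not a recommended implementation; the helper name `forward_middle` is hypothetical.

```cpp
#include <forward_list>
#include <iterator>
#include <stdexcept>

// Hypothetical helper for std::forward_list, which has no size() member.
template <typename T>
const T& forward_middle(const std::forward_list<T>& items) {
    if (items.empty()) {
        throw std::out_of_range("forward_middle: empty list");
    }
    // First pass: std::distance counts the elements by walking the list, O(n).
    const auto count = std::distance(items.begin(), items.end());
    // Second (half) pass: advance count/2 links from the front.
    return *std::next(items.begin(), count / 2);
}
```

Here the "get the size, then index" approach necessarily costs about one and a half traversals, since nothing caches the element count.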
 
D H said:
Not std::forward_list (new in C++11).

That was taken from 23.2.1 General Container Requirements. I assumed that applied to all containers, including the forward list. I did overlook that the forward list did not have the size member in the first place.

Note that STL is a bit of a misnomer.

Yeah, I know. It is just so much easier to type STL than "the C++ Standard Library". I think these days "STL" is generally understood as a moniker for "the C++ Standard Library".
 
D H said:
Compared to a loop that walks over the entire list and keeps a separate middle iterator that is incremented once for every two times the list iterator is incremented ...
That would effectively be making 1 1/2 passes over a linked list. If the list was large enough and/or nodes scattered through the memory space, then the two node pointers could end up competing for common cache lines in the processor, taking more time than making separate passes, one full pass to get a count (assuming count doesn't already exist), one half pass to get to the midpoint.
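
The separate-passes alternative described here (one full pass to count, one half pass to reach the midpoint) could be sketched as follows. The `node` struct and the function name `middle_two_pass` are assumptions for illustration. For even lengths this version returns the first of the two middle nodes.

```cpp
#include <cstddef>

// Hypothetical singly linked node type, assumed for illustration.
struct node {
    int value;
    node* next;
};

// Returns the middle node, or nullptr for an empty list.
node* middle_two_pass(node* head) {
    // Pass 1: count the nodes with a single pointer.
    std::size_t count = 0;
    for (node* p = head; p != nullptr; p = p->next) {
        ++count;
    }
    if (count == 0) {
        return nullptr;
    }
    // Pass 2: walk (count - 1) / 2 links from the head, again with one pointer.
    node* mid = head;
    for (std::size_t i = 0; i < (count - 1) / 2; ++i) {
        mid = mid->next;
    }
    return mid;
}
```

Only one pointer is live in each loop, which is the property the cache argument relies on. Whether that actually wins depends on the machine and on how the nodes are laid out in memory.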
 
rcgldr said:
That would effectively be making 1 1/2 passes over a linked list. If the list was large enough and/or nodes scattered through the memory space, then the two node pointers could end up competing for common cache lines in the processor, taking more time than making separate passes, one full pass to get a count (assuming count doesn't already exist), one half pass to get to the midpoint.
The exact same is true for a "single pass" solution.
Code:
node* median = head;
if (head != NULL) {
   node* curr_node = head->next;
   node* next_node;
   while ((curr_node != NULL) && ((next_node = curr_node->next) != NULL)) {
      median = median->next;
      curr_node = next_node->next;
   }
}
There are still competing cache lines. In fact, the situation is quite possibly worsened because now we're keeping two pointers going simultaneously that point to disparate areas of memory. If consecutive nodes are more or less organized consecutively in memory, the two loop solution may well be faster than the single pass solution.

There are many cases where two loops are better than one. The above is one such case. Here's another: increment each element of array c by the corresponding element of array a, and increment each element of array d by the corresponding element of array b. All arrays have the same number of elements. One might think a single loop would be the better solution.
Code:
// Single loop solution
for (int ii = 0; ii < N; ++ii) {
   c[ii] += a[ii];
   d[ii] += b[ii];
}
This code is fine if N is small. Problems arise as N gets larger. Now there are four separate areas of memory to be managed inside the loop. This makes for lots of cache conflicts.

Here's the two loop solution. Admittedly it's not as pretty.
Code:
// Two loop solution
for (int ii = 0; ii < N; ++ii) {
   c[ii] += a[ii];
}
for (int ii = 0; ii < N; ++ii) {
   d[ii] += b[ii];
}
This code is still fine if N is small. Registers are fast, so having two loops is not a burden. It's the memory access that's the killer, not the number of loops. As N grows larger, there are now only two separate areas of memory to be managed inside each loop. This code is significantly faster for moderate values of N. When N gets very, very large, the single-loop and two-loop solutions are equally bad. There's too much memory thrashing no matter how you cut it.
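
Claims like this are best checked on the machine in question. Below is a minimal timing sketch, assuming `std::vector<double>` storage and `std::chrono::steady_clock`. The names `add_single_loop`, `add_two_loops`, and `time_us` are hypothetical. No particular outcome is implied: results vary with hardware, compiler, optimization flags, and N.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Single-loop version: touches four arrays per iteration.
void add_single_loop(Vec& c, const Vec& a, Vec& d, const Vec& b) {
    const std::size_t n = c.size();
    for (std::size_t i = 0; i < n; ++i) {
        c[i] += a[i];
        d[i] += b[i];
    }
}

// Two-loop version: touches two arrays per loop.
void add_two_loops(Vec& c, const Vec& a, Vec& d, const Vec& b) {
    const std::size_t n = c.size();
    for (std::size_t i = 0; i < n; ++i) {
        c[i] += a[i];
    }
    for (std::size_t i = 0; i < n; ++i) {
        d[i] += b[i];
    }
}

// Runs `work` once and returns the elapsed wall-clock time in microseconds.
template <typename F>
long long time_us(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A fair comparison would call `time_us` on each version for a range of N, repeat each measurement several times, and build with optimizations enabled. A single run is too noisy to support a conclusion either way.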
 
