Why is making a single iteration through a list/vector/array/etc.

Jamin2112 · Jan 8, 2014

... always considered the "best" solution, as opposed to a solution that iterates through twice or more times? Isn't it sometimes more efficient and elegant-looking to accomplish a task step-by-step instead of trying to cram everything into a single pass?

I'm wondering because so many little programming challenge problems I see on the internet always want a single pass, e.g. "Find the middle element of a linked list using a single pass."

maajdl · Jan 8, 2014

Asking "Find the middle element of a linked list using a single pass" does not imply that it is the "best" solution.
And what does "best" mean: "fastest", "simplest", "most readable", or "shortest code"?
It also depends on the language you are using; in LINQ it would make less sense to talk about "iterating".

Were you able to "Find the middle element of a linked list using a single pass."?
After all it is a funny challenge, isn't it?

D H · Jan 8, 2014

I concur with maajdl. These puzzles oftentimes are not the "best" solution. Think of them as puzzles whose intent is to make you think about data structures and algorithms.

As far as the "best" way to find the middle element of a linked list in the real world, I would view something along the lines of the following as pretty hard to beat. Anything else is micro optimization.

Code:

// Find the middle element of the linked list.
middle = list.get(list.size() / 2);

Compared to a loop that walks over the entire list and keeps a separate middle iterator that is incremented once for every two times the list iterator is incremented, the above code:

Is much shorter (a *big* plus in the real world),
Is much more obvious in what it is doing (another big plus),
Uses high level functionality rather than low level functionality of the linked list object (yet another big plus), and
Might well be faster than the loop over the whole list.

Suppose the linked list object maintains the size of the list as an easily accessible attribute. This makes list.size() an O(1) algorithm. If list.get() walks over the list to the midpoint, the above code is twice as fast as the puzzle solution. If the underlying implementation of the list is a random access array, then list.get() is also O(1).
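
For what it's worth, here is a minimal sketch of the same idea with std::list, assuming C++11 or later, where size() is required to be constant time. The helper name middle_of is made up for illustration; std::list has no get() member, so std::next does the walk to the midpoint.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper (not from the thread): returns an iterator to the
// element at index size()/2, mirroring list.get(list.size()/2).
// Precondition: the list is not empty.
template <typename T>
typename std::list<T>::const_iterator middle_of(const std::list<T>& items) {
    assert(!items.empty());
    // size() is O(1) for std::list since C++11; std::next then follows
    // size()/2 links, i.e. it walks half of the list.
    return std::next(items.begin(),
                     static_cast<std::ptrdiff_t>(items.size() / 2));
}
```

For an even number of elements this picks the second of the two middle elements, which is just what integer division by two gives.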

AlephZero · Jan 8, 2014

Jamin2112 said:

"Find the middle element of a linked list using a single pass."

Just return the first element. The question doesn't say what list traversal algorithm you were supposed to use. The element you returned is the middle element, for some traversal algorithm :devil:

voko · Jan 9, 2014

D H said:

Might well be faster than the loop over the whole list.
Suppose the linked list object maintains the size of the list as an easily accessible attribute. This makes list.size() an O(1) algorithm.

All STL containers, including list, are required to implement ::size() at constant complexity, i.e., as O(1).

D H · Jan 9, 2014

voko said:

All STL containers, including list, are required to implement ::size() at constant complexity, i.e., as O(1).

Not std::forward_list (new in C++11).
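
To illustrate: std::forward_list has no size() member at all, so code that wants the element count has to walk the list, for example with std::distance. A minimal sketch, where count_elements is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical helper: counts the elements of a forward_list by walking it.
// This is O(n), unlike std::list::size(), because forward_list does not
// store its length.
template <typename T>
std::size_t count_elements(const std::forward_list<T>& items) {
    const auto n = std::distance(items.begin(), items.end());
    assert(n >= 0);  // begin() always precedes or equals end()
    return static_cast<std::size_t>(n);
}
```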

Note that STL is a bit of a misnomer. The containers library in C++ is not the STL. The STL is code developed in the 1980s by Alexander Stepanov that is no longer maintained. While it did serve as a template (pun intended) for significant chunks of the C++ standard library, what is in C++ is not the STL.

voko · Jan 9, 2014

D H said:

Not std::forward_list (new in C++11).

That was taken from 23.2.1 General Container Requirements. I assumed that applied to all containers, including the forward list. I did overlook that the forward list did not have the size member in the first place.

Note that STL is a bit of a misnomer.

Yeah, I know. It is just so much easier to type STL than "the C++ Standard Library". I think these days "STL" is generally understood as a moniker for "the C++ Standard Library".

rcgldr · Jan 9, 2014

D H said:

Compared to a loop that walks over the entire list and keeps a separate middle iterator that is incremented once for every two times the list iterator is incremented ...

That would effectively be making 1 1/2 passes over a linked list. If the list was large enough and/or nodes scattered through the memory space, then the two node pointers could end up competing for common cache lines in the processor, taking more time than making separate passes, one full pass to get a count (assuming count doesn't already exist), one half pass to get to the midpoint.
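
A rough sketch of the two separate passes described above, assuming a bare singly linked node type with no stored count. The node struct and the name middle_two_pass are assumptions for illustration, and returning the first of the two middle nodes for even lengths is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type, assumed here for illustration.
struct node {
    int value;
    node* next;
};

// Returns the middle node using two separate passes, or nullptr for an
// empty list. For an even count it returns the first of the two middle nodes.
node* middle_two_pass(node* head) {
    // Pass 1: one full pass to get the count.
    std::size_t count = 0;
    for (node* p = head; p != nullptr; p = p->next) {
        ++count;
    }
    if (count == 0) {
        return nullptr;
    }
    // Pass 2: one half pass to reach the midpoint.
    node* mid = head;
    for (std::size_t i = 0; i < (count - 1) / 2; ++i) {
        assert(mid->next != nullptr);  // guaranteed by the count above
        mid = mid->next;
    }
    return mid;
}
```

Only one pointer is chasing nodes at any given time here, which is the property the cache argument relies on.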

D H · Jan 9, 2014

rcgldr said:

That would effectively be making 1 1/2 passes over a linked list. If the list was large enough and/or nodes scattered through the memory space, then the two node pointers could end up competing for common cache lines in the processor, taking more time than making separate passes, one full pass to get a count (assuming count doesn't already exist), one half pass to get to the midpoint.

The exact same is true for a "single pass" solution.

Code:

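// Find the middle node of a singly linked list in a single pass.
node* median = head;
if (head != NULL) {
   node* curr_node = head->next;
   node* next_node;
   // curr_node advances two nodes per iteration; median advances one.
   while ((curr_node != NULL) && ((next_node = curr_node->next) != NULL)) {
      median = median->next;
      curr_node = next_node->next;
   }
}
// For an even number of nodes, median ends on the first of the two middle nodes.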

There are still competing cache lines. In fact, the situation is quite possibly worsened because now we're keeping two pointers going simultaneously that point to disparate areas of memory. If consecutive nodes are more or less organized consecutively in memory, the two loop solution may well be faster than the single pass solution.

There are many cases where two loops are better than one. The above is one such case. Here's another: increment each element of array c by the corresponding element in array a, and increment each element in array d by the corresponding element in array b. All arrays have the same number of elements. This might make one think a single loop would be the better solution.

Code:

// Single loop solution
for (int ii = 0; ii < N; ++ii) {
   c[ii] += a[ii];
   d[ii] += b[ii];
}

This code is fine if N is small. Problems arise as N gets larger. Now there are four separate areas of memory to be managed inside the loop. This makes for lots of cache conflicts.

Here's the two loop solution. Admittedly it's not as pretty.

Code:

// Two loop solution
for (int ii = 0; ii < N; ++ii) {
   c[ii] += a[ii];
}
for (int ii = 0; ii < N; ++ii) {
   d[ii] += b[ii];
}

This code is still fine if N is small. Registers are fast. That this has two loops is not a burden. It's the memory access that's the killer, not the number of loops. In this case, as N grows larger, there are now only two separate areas of memory to be managed inside each loop. This code is significantly faster for moderate values of N. When N gets very, very large, the single loop and two loop solutions are equally bad. There's too much memory thrashing no matter how you cut it.
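
If you want to check this on your own machine, something like the following sketch could be used to time the two variants. The function names and the time_seconds helper are made up for illustration, std::vector<double> stands in for the arrays, and the results will depend heavily on the hardware, N, and compiler flags (an optimizer may fuse or split the loops by itself).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Single loop variant: touches all four arrays in each iteration.
void add_single_loop(std::vector<double>& c, const std::vector<double>& a,
                     std::vector<double>& d, const std::vector<double>& b) {
    assert(a.size() == c.size() && b.size() == d.size() && a.size() == b.size());
    for (std::size_t ii = 0; ii < c.size(); ++ii) {
        c[ii] += a[ii];
        d[ii] += b[ii];
    }
}

// Two loop variant: each loop touches only two arrays.
void add_two_loops(std::vector<double>& c, const std::vector<double>& a,
                   std::vector<double>& d, const std::vector<double>& b) {
    assert(a.size() == c.size() && b.size() == d.size() && a.size() == b.size());
    for (std::size_t ii = 0; ii < c.size(); ++ii) {
        c[ii] += a[ii];
    }
    for (std::size_t ii = 0; ii < d.size(); ++ii) {
        d[ii] += b[ii];
    }
}

// Hypothetical timing helper: runs work() once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Call each variant through the helper for a range of N, e.g. time_seconds([&] { add_single_loop(c, a, d, b); }), and repeat each measurement several times before drawing any conclusions.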
