Why is making a single iteration through a list/vector/array/etc.

  1. Jan 8, 2014 #1
    ... always considered the "best" solution, as opposed to a solution that iterates through twice or more times? Isn't it sometimes more efficient and elegant-looking to accomplish a task step-by-step instead of trying to cram everything into a single pass?

    I'm wondering because so many little programming challenge problems I see on the internet always want a single pass, e.g. "Find the middle element of a linked list using a single pass."
     
  3. Jan 8, 2014 #2

    maajdl

    Gold Member

    Asking "Find the middle element of a linked list using a single pass" does not imply that it is the "best" solution.
    And what does "best" mean: "fastest", "simplest", "most readable", or "shortest code"?
    It also depends on the language you are using; in LINQ it would make less sense to talk about "iterating" at all.

    Were you able to "Find the middle element of a linked list using a single pass."?
    After all it is a funny challenge, isn't it?
     
  4. Jan 8, 2014 #3

    D H

    Staff Emeritus
    Science Advisor

    I concur with maajdl. These puzzles oftentimes are not the "best" solution. Think of them as puzzles whose intent is to make you think about data structures and algorithms.

    As for the "best" way to find the middle element of a linked list in the real world, I would view something along the lines of the following as pretty hard to beat. Anything else is micro-optimization.
    Code (Text):

    // Find the middle element of the linked list.
    middle = list.get(list.size()/2);
     
    Compared to a loop that walks over the entire list and keeps a separate middle iterator that is incremented once for every two times the list iterator is incremented, the above code
    • Is much shorter (a *big* plus in the real world),
    • Is much more obvious in what it is doing (another big plus),
    • Uses high level functionality rather than low level functionality of linked list object (yet another big plus), and
    • Might well be faster than the loop over the whole list.
      Suppose the linked list object maintains the size of the list as an easily accessible attribute. This makes list.size() an O(1) algorithm. If list.get() walks over the list to the midpoint, the above code is twice as fast as the puzzle solution. If the underlying implementation of the list is a random access array, then list.get() is also O(1).
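    The one-liner above reads like Java-style pseudocode. As a rough C++ counterpart, here is a minimal sketch assuming the list is a std::list, whose size() has been constant time since C++11. The helper name middle_of is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical helper (not a standard function): returns an iterator to the
// element at zero-based index size()/2 of a non-empty std::list. For an even
// number of elements this is the second of the two middle elements.
template <typename T>
typename std::list<T>::const_iterator middle_of(const std::list<T>& items) {
    assert(!items.empty());
    using diff_t = typename std::list<T>::difference_type;
    // size() is O(1) for std::list since C++11; std::next then follows
    // size()/2 links from the front, i.e. half a pass over the list.
    return std::next(items.begin(), static_cast<diff_t>(items.size() / 2));
}
```

    For example, calling middle_of on a list holding 10, 20, 30, 40, 50 yields an iterator to 30. The walk to the midpoint is still linear, but it touches only half of the nodes.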
     
    Last edited: Jan 8, 2014
  5. Jan 8, 2014 #4

    AlephZero

    Science Advisor
    Homework Helper

    Just return the first element. The question doesn't say what list traversal algorithm you were supposed to use. The element you returned is the middle element, for some traversal algorithm :devil:
     
  6. Jan 9, 2014 #5
    All STL containers, including list, are required to implement ::size() at constant complexity, i.e., as O(1).
     
  7. Jan 9, 2014 #6

    D H

    Staff Emeritus
    Science Advisor

    Not std::forward_list (new in C++11).

    Note that STL is a bit of a misnomer. The containers library in C++ is not the STL. The STL is code developed in the 1980s by Alexander Stepanov that is no longer maintained. While it did serve as a template (pun intended) for significant chunks of the C++ standard library, what is in C++ is not the STL.
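    To make the std::forward_list point concrete: since that container has no size() member, the element count has to come from a full walk. The following is only a sketch of that count-then-advance approach; middle_of_forward is a hypothetical name chosen for this example.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>

// Hypothetical helper (not a standard function): middle element of a
// non-empty std::forward_list, at zero-based index count/2.
template <typename T>
typename std::forward_list<T>::const_iterator
middle_of_forward(const std::forward_list<T>& items) {
    assert(!items.empty());
    // std::forward_list provides no size(), so count the nodes with a full
    // O(n) walk from begin() to end().
    auto count = std::distance(items.begin(), items.end());
    // Then walk half the list again to reach the midpoint.
    return std::next(items.begin(), count / 2);
}
```

    This costs one and a half passes in total, compared with the half pass needed when the count is already stored.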
     
  8. Jan 9, 2014 #7
    That was taken from 23.2.1 General Container Requirements. I assumed that applied to all containers, including the forward list. I did overlook that the forward list did not have the size member in the first place.

    Yeah, I know. It is just so much easier to type "STL" than "the C++ Standard Library". I think these days "STL" is generally understood as a moniker for "the C++ Standard Library".
     
  9. Jan 9, 2014 #8

    rcgldr

    Homework Helper

    That would effectively be making 1 1/2 passes over a linked list. If the list were large enough and/or its nodes were scattered through memory, the two node pointers could end up competing for the same cache lines in the processor, taking more time than making separate passes: one full pass to get a count (assuming a count doesn't already exist) and one half pass to get to the midpoint.
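    The "separate passes" alternative can be sketched as follows. The node type and the function name middle_two_pass are assumptions made for this example; the thread does not define either.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical singly linked node type, assumed for illustration only.
struct node {
    int value;
    node* next;
};

// Two separate passes: one full pass to count the nodes, then a half pass to
// reach the midpoint. For an even count this returns the first of the two
// middle nodes. Returns nullptr for an empty list.
node* middle_two_pass(node* head) {
    std::size_t count = 0;
    for (node* p = head; p != nullptr; p = p->next) {
        ++count;  // only one pointer is live during this pass
    }
    if (count == 0) {
        return nullptr;
    }
    node* mid = head;
    for (std::size_t i = 0; i < (count - 1) / 2; ++i) {
        mid = mid->next;
    }
    assert(mid != nullptr);  // invariant: (count - 1) / 2 steps stay in range
    return mid;
}
```

    Each loop follows a single pointer, which is the property the cache argument relies on. Whether this actually beats a two-pointer loop depends on the hardware and on how the nodes are laid out in memory, so it would need to be measured.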
     
    Last edited: Jan 9, 2014
  10. Jan 9, 2014 #9

    D H

    Staff Emeritus
    Science Advisor

    The exact same is true for a "single pass" solution.
    Code (Text):

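    // median advances one node for every two nodes the scan pointer advances.
    node* median = head;
    if (head != NULL) {
       node* curr_node = head->next;
       node* next_node;
       // Continue only while two more nodes (curr_node and its successor) exist.
       while ((curr_node != NULL) && ((next_node = curr_node->next) != NULL)) {
          median = median->next;
          curr_node = next_node->next;
       }
    }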
     
    There are still competing cache lines. In fact, the situation is quite possibly worsened because now we're keeping two pointers going simultaneously that point to disparate areas of memory. If consecutive nodes are more or less organized consecutively in memory, the two loop solution may well be faster than the single pass solution.

    There are many cases where two loops are better than one. The above is one such case. Here's another: increment each element of array c by the corresponding element of array a, and increment each element of array d by the corresponding element of array b. All arrays have the same number of elements. This might make one think a single loop would be the better solution.
    Code (Text):

    // Single loop solution
    for (int ii = 0; ii < N; ++ii) {
       c[ii] += a[ii];
       d[ii] += b[ii];
    }
     
    This code is fine if N is small. Problems arise as N gets larger. Now there are four separate areas of memory to be managed inside the loop. This makes for lots of cache conflicts.

    Here's the two loop solution. Admittedly it's not as pretty.
    Code (Text):

    // Two loop solution
    for (int ii = 0; ii < N; ++ii) {
       c[ii] += a[ii];
    }
    for (int ii = 0; ii < N; ++ii) {
       d[ii] += b[ii];
    }
     
    This code is still fine if N is small. Registers are fast. That this has two loops is not a burden. It's the memory access that's the killer, not the number of loops. In this case, as N grows larger, there are now only two separate areas of memory to be managed inside each loop. This code is significantly faster for moderate values of N. When N gets very, very large, the single-loop and two-loop solutions are equally bad. There's too much memory thrashing no matter how you cut it.
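    Claims like these are best checked by measurement on the target machine. The following is a minimal sketch of how the two variants could be wrapped and timed, assuming std::vector<double> storage. The names add_fused, add_split, and seconds_taken are made up for this example, and a single timed run is only a rough indication.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Single-loop variant: touches four arrays in every iteration.
void add_fused(std::vector<double>& c, const std::vector<double>& a,
               std::vector<double>& d, const std::vector<double>& b) {
    assert(a.size() == c.size() && b.size() == d.size() && c.size() == d.size());
    for (std::size_t i = 0; i < c.size(); ++i) {
        c[i] += a[i];
        d[i] += b[i];
    }
}

// Two-loop variant: each loop touches only two arrays.
void add_split(std::vector<double>& c, const std::vector<double>& a,
               std::vector<double>& d, const std::vector<double>& b) {
    assert(a.size() == c.size() && b.size() == d.size());
    for (std::size_t i = 0; i < c.size(); ++i) {
        c[i] += a[i];
    }
    for (std::size_t i = 0; i < d.size(); ++i) {
        d[i] += b[i];
    }
}

// Hypothetical timing helper: runs fn once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename Fn>
double seconds_taken(Fn fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

    A call such as seconds_taken([&] { add_split(c, a, d, b); }) can then be repeated for several values of N. Both variants produce identical results; which one is faster, and at what N, depends on the cache hierarchy, the compiler, and the optimization settings.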
     