Jamin2112 attempts C++ solutions

Jamin2112 · Sep 22, 2013

Okay, in this thread I'm going to attempt some problems and I would appreciate it if the C++ purists could offer feedback. I Googled to get some ideas for problems. Here's a preview:

Eliminate whitespace from a string
Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B")
Return the sum of the two largest integers in an integer array
Sum the n largest integers in an array of integers where every integer is between 0 and 9

I will add more problems as I go along in this thread.

Jamin2112 · Sep 22, 2013

Eliminate whitespace from a string:

Code:

void eliminate_whitespace(std::string &str) { 
   std::string::iterator it = str.begin(); 
   while (it != str.end()) { 
      if (*it == ' ') { 
         str.erase(it);
      } else { 
         it++;
      }
   }   
}

Jamin2112 · Sep 22, 2013

Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B"):

Code:

void remove_duplicates(std::string &str) { 
  std::string::iterator it = str.begin(); 
   while (it != str.end()) { 
      if (*it == *(it + 1)) { 
         str.erase(it);
      } else { 
         it++;
      }
   }   
}

Jamin2112 · Sep 22, 2013

Good so far? Anything stylistically poor?

Borek · Sep 22, 2013

You may want to check what "whitespace" is.

Jamin2112 · Sep 22, 2013

Borek said:

You may want to check what "whitespace" is.

Ok, so it's spaces, tabs and newlines?

Borek · Sep 22, 2013

Yes. Note that newline can be \n, \r or both, depending on the OS conventions.

D H · Sep 22, 2013

Jamin2112 said:
Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B"):
Code:
...
if (*it == *(it + 1))
...

The highlighted code invokes undefined behavior when your iterator points to the last character in the string. A better approach: Look at the preceding character rather than the following one. The first character is not a duplicate (there's nothing before it), so start at str.begin()+1. There's a subtlety here: What if the string is empty? (Hint: Use a different comparison operator in your loop.)

Jamin2112 · Sep 22, 2013

D H said:

The highlighted code invokes undefined behavior when your iterator points to the last character in the string. A better approach: Look at the preceding character rather than the following one. The first character is not a duplicate (there's nothing before it), so start at str.begin()+1. There's a subtlety here: What if the string is empty? (Hint: Use a different comparison operator in your loop.)

Revised:

Code:

void remove_duplicates(std::string &str) { 
  std::string::iterator it = str.begin() + 1; 
   while (it != str.end() + 1) { // meaning *it != '\0'
      if (*it == *(it - 1)) { 
         str.erase(it);
      } else { 
         it++;
      }
   }   
}

D H · Sep 22, 2013

That's still undefined behavior (str.end()+1). Don't use == or !=. Using something else. String iterators are random access iterators. The full panoply of comparison operators is available for your use.

Jamin2112 · Sep 22, 2013

D H said:

That's still undefined behavior (str.end()+1). Don't use == or !=. Using something else.

Are you talking about string::compare?

Jamin2112 · Sep 22, 2013

Return the sum of the two largest integers in an integer array:

I think the fastest way, on average, is brute force: Iterate through the array to find the largest, iterate through again to find the second largest, then sum the largest and second largest.

Another way is to quicksort the array and sum the last two elements. Not as fast, on average.

Yet another way would be to add the elements of the array to a binary tree and keeps track of the largest and second largest added so far. I think that will be still be slower than brute force, on average.

D H · Sep 22, 2013

Jamin2112 said:

Are you talking about string::compare?

No. I'm talking about (it != str.end()), or even worse (it != str.end()+1) (which you really do not want to do).

D H · Sep 22, 2013

Jamin2112 said:

Return the sum of the two largest integers in an integer array:[/U][/B]

I think the fastest way, on average, is brute force: Iterate through the array to find the largest, iterate through again to find the second largest, then sum the largest and second largest.

One pass through the array will suffice. Keep track of the largest and second largest values. If array is less than or equal to the second largest you don't have to update either. If it's greater than the largest, the (former) largest is the second largest and this new value is now your largest value. Otherwise, it's in between your largest and second largest, so this new value is now the second largest. Sum the two at the end.

Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

Grep · Sep 22, 2013

Jamin2112 said:

Are you talking about string::compare?

Take a look at how other comparison operators work with iterators.

One way or another, do not count on any value or specific behaviour past .end(). It is undefined. Keep in mind that .end() is not at the last element of the sequence, but past-the-end.

In fact, if it's an empty string, even (str.begin() + 1) is technically undefined since str.begin() == str.end(). Adding one to str.begin() puts the iterator past the .end().

So how would you prevent that from happening? Like D H said, always think of these corner cases. Make tests for them, even. When is there effectively no work needing to be done on the input string? How do you make sure (str.begin() + 1) is always defined?

Jamin2112 · Sep 22, 2013

D H said:

One pass through the array will suffice. Keep track of the largest and second largest values. If array is less than or equal to the second largest you don't have to update either. If it's greater than the largest, the (former) largest is the second largest and this new value is now your largest value. Otherwise, it's in between your largest and second largest, so this new value is now the second largest. Sum the two at the end.

Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

The special cases would have to be runtime errors since the function isn't defined for arrays of length 0 or 1.

Code:

int sum_largest_two(int* arr, int n) { // some sort of error handling here ... int largest(arr[0]), secondLargest(arr[0]); for (int i = 1; i < n; i++) { if (arr[i] > largest) { secondLargest = largest; largest = arr[i]; } else if (arr[i] < largest && secondLargest < arr[i]) { secondLargest = arr[i]; } } return (largest + secondLargest); }

EDIT ... I probably should use vectors since we're doing C++.

Code:

int sum_largest_two(const std::vector<int> &vec) { // some sort of error handling here ... int largest(vec[0]), secondLargest(vec[0]); for (std::vector::iterator it = vec.begin() + 1; it != vec.end(); ++it) { if (vec[i] > largest) { secondLargest = largest; largest = vec[i]; } else if (vec[i] < largest && secondLargest < vec[i]) { secondLargest = vec[i]; } } return (largest + secondLargest); }

Grep · Sep 22, 2013

For your whitespace removal, you might want to take a look at this pattern:

http://en.wikipedia.org/wiki/Erase-remove_idiom

Jamin2112 · Sep 22, 2013

D H said:

How do you make sure (str.begin() + 1) is always defined?

I actually don't know the answer to this. If you create a string of length 0, then I think that *str.begin(), as well as *str.end(), is the sentinel character. So the only thing you can do is add a statement that forces the string to length 1 if it's length 0.

Grep · Sep 22, 2013

Jamin2112 said:

I actually don't know the answer to this. If you create a string of length 0, then I think that *str.begin(), as well as *str.end(), is the sentinel character. So the only thing you can do is add a statement that forces the string to length 1 if it's length 0.

That quote is from my post, not D H's, but no matter.

You're going to facepalm when you realize the answer. Ask yourself what needs to be done so the output is correct (no repeated sequential characters) when the input size is 0 or 1.

AlephZero · Sep 22, 2013

Borek said:

You may want to check what "whitespace" is.

Jamin2112 said:

Ok, so it's spaces, tabs and newlines?

... and some other characters. The definition of "whitespace" depends what locale you are using (e.g. what language your users are working in).

If you are using unicode characters, there are a whole raft of different "space" characters with different properties to trip up your code.

Look at the functions in <cctype>, and use them. As well as giving the right results, they are likely to run faster than "if this && that && something_else && yet_another_test" logic.

Grep · Sep 22, 2013

AlephZero said:

Look at the functions in <cctype>, and use them. As well as giving the right results, they are likely to run faster than "if this && that && something_else && yet_another_test" logic.

There's also a locale aware one in <locale>: http://www.cplusplus.com/reference/locale/isspace/. For removing whitespace, it works nicely with bind and the erase-remove idiom I already mentioned (using C++11).

AlephZero · Sep 22, 2013

D H said:

Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

Jamin2112 said:

The special cases would have to be runtime errors since the function isn't defined for arrays of length 0 or 1.

That's usually a poor way to handle the situation. If your function throws an exception, what is the caller supposed to do with it? Unless he/she bothered to read the function documentation (cue hollow laughter, in real life!) he/she probably won't even expect the exception.

Usually it's better to make a sensible decision about what the return value "means", in the limiting cases. For example returning 0 for an array of length 0, or the single array value for an array of length 1, might be sensible and useful.

Similarly for your "duplicate characters" function: there's an obvious definition for what you get after removing duplicate characters from a zero-length string, so that isn't an error situation.

D H · Sep 22, 2013

Grep said:

(using C++11).

Which is a whole nother ball of wax. A whole lot of other balls of wax, in fact.

Speaking of balls of wax, Jamin2112, do you know how to write a proper assignment operator? Do you know when you need to write one? Have you heard of the "rule of three" (which sometimes becomes the "rule of five" in C++11)? By focusing on algorithms, you are missing out on huge chunks of what C++ is all about. Object oriented programming is obviously a big part, but there's also templates and template metaprogramming, and functional programming.

Back to the original ball of wax, C++11 adds a whole lot of machinery for template metaprogramming and functional programming.

Jamin2112 · Sep 22, 2013

D H said:

Speaking of balls of wax, Jamin2112, do you know how to write a proper assignment operator? Do you know when you need to write one? Have you heard of the "rule of three" (which sometimes becomes the "rule of five" in C++11)? By focusing on algorithms, you are missing out on huge chunks of what C++ is all about. Object oriented programming is obviously a big part, but there's also templates and template metaprogramming, and functional programming.

Well, I've read C++ Primer. I know what templates are. Not sure about template metaprogramming.

Jamin2112 · Sep 22, 2013

Grep said:

For your whitespace removal, you might want to take a look at this pattern:

http://en.wikipedia.org/wiki/Erase-remove_idiom

Ok. Here's the recipe for removing white-space, using the erase-remove idiom and the isspace function (http://www.cplusplus.com/reference/cctype/isspace/). You need <ctype.h> and <algorithm>. The entire thing is 1 line.

For a string named str, do:

str.erase(std::remove_if(str.begin(), str.end(), isspace), str.end());

Grep · Sep 22, 2013

Jamin2112 said:

You need <ctype.h> and <algorithm>.

That should be <cctype> and not <ctype.h>, if you'll recall from previous commentaries I've made on your code and using headers that come from C in C++. But you've got the idea. Just keep in mind, as previously mentioned, it is not locale aware, so make sure its assumptions are not violated.

That sort of functional programming can be very useful in the right place. As D H says, though, all that and C++/C++11 in general, is a big ball of wax. Planet sized ball of wax. Maybe a solar system's worth. Like music, it's a lifelong journey of learning. May as well just enjoy the voyage, one step at a time.

Adding to what D H said, here's a page on template metaprogramming: http://en.wikipedia.org/wiki/Template_metaprogramming. And on the "rule of three": http://en.wikipedia.org/wiki/Rule_of_three_(C++_programming). Wikipedia is really useful. Make full use of it. :)

You know, at the risk of sounding like a fanboy, if D H is right any more often, I may need to get some "Listen to D H" T-shirts made. lol

Jamin2112 · Sep 22, 2013

Sum the n largest integers in an array of integers where every integer is between 0 and 9

I'm going to approach this in a think-out-loud style so you guys can critique my thinking as well as my final solution.

Whenever a problem says "where every integer is between a and b ...", that's a dead giveaway of how you should best approach the problem. You're going to create some sort of container of length a - b + 1, which will record the frequency of each number a through b. The recipe I think will work efficiently is below.

Code:

int sum_largest_n_09(int* arr, int size, int n) { 
    vector<int> cntVec(10, 0); 
    for (int k = 0; k < size; k++) cntVec[arr[k]]++;
    // now an algorithm to calculate and return the sum ...
}

Note that the above assume n ≤ size and that the numbers in the array are indeed between 0 and 9. The algorithm to calculate the sum is a bit messy to write. I'm working on it.

Another recipe would be simpler to write but more costly in terms of time, especially when the size of the array is large. Sort the array and sum the last n elements.

Code:

int sum_largest_n_09(int* arr, int size, int n) { 
   sort (arr, arr + n);
   int sum = 0;
   for (int i = 1; i <= 10; i++) sum += arr[size - i];
   return sum;
}

Same assumptions as before.

Is my thinking right so far?

D H · Sep 22, 2013

What is it with this excessive focus on algorithms, Jamin2112? I understand that some interviewers nowadays have an excessive focus on obscure algorithms, as if that will help them weed the wheat from the chaff. It won't, and that they do should be a clue that you do not want to work at those places.

That's my take, anyhow. My take may not be the best advice. I've been working for 35 years, solid, and I've looked for and found a job *once* (and that was over 30 years ago). Every other time I've switched employers the job has found me.

My insight is from the other side of the fence: I occasionally interview candidates. For freshouts with only a bachelors degree, I want to see if they have the knowledge they claim they have, and whether they can work as part of a team. I do not understand this weird concept of asking candidates to write a program for some *extremely* obscure problem whose solution is nonetheless easily found with a google search. It makes absolutely no sense to me. That is not checking any of those boxes I am supposed to check as an interviewer. (Do you know *anything*? Do you know anything relevant to my employer? Can you function as a part of a team? Would I want *you* to be a part of my team?)Back on topic, here are some of the mistakes I see you making:

You aren't taking full advantage of the standard library. That's quite forgivable; the standard library is huge. Add in the stuff from Boost and C++11 and it's ridiculously huge.
You are making fencepost errors. That is not forgivable. Fencepost errors are, at least to me, one of the key things that separate the wheat from the chaff.
You are not dealing with corner cases. That, too, is not forgivable. "Check early, check often."

Jamin2112 · Sep 23, 2013

D H said:

Back on topic, here are some of the mistakes I see you making:

You aren't taking full advantage of the standard library. That's quite forgivable; the standard library is huge. Add in the stuff from Boost and C++11 and it's ridiculously huge.

You are making fencepost errors. That is not forgivable. Fencepost errors are, at least to me, one of the key things that separate the wheat from the chaff.

You are not dealing with corner cases. That, too, is not forgivable. "Check early, check often."

What sort of fencepost error did I make?

Grep · Sep 23, 2013

Jamin2112 said:

Sum the n largest integers in an array of integers where every integer is between 0 and 9

Whenever a problem says "where every integer is between a and b ...", that's a dead giveaway of how you should best approach the problem. You're going to create some sort of container of length a - b + 1, which will record the frequency of each number a through b. The recipe I think will work efficiently is below.

Be careful of letting the requirements frame your approach to the implementation. In this case, a vector (*cough*stopusingarrays*cough*) is very much a good idea. But if that range was very large and would result in a sparse array of counts, then you might consider something else like a map.

Jamin2112 said:

Code:

int sum_largest_n_09(int* arr, int size, int n) { 
    vector<int> cntVec(10, 0); 
    for (int k = 0; k < size; k++) cntVec[arr[k]]++;
    // now an algorithm to calculate and return the sum ...
}

First, no reason to pass in an array rather than a vector. If you're just missing being able to initialize a vector from a list of integers, then check out C++11 and use initializer lists. But don't let that affect the function signature, either way. I think for implementing that in C++, you should probably assume that 'array' means you should use a vector, and not forcing you into making a subpar choice. Perhaps the requirements would better be stated as '...integers in a list of integers'.

I'm also going to suggest you go back to advice I gave you about a month ago on your other code, because it still applies and I don't want to repeat myself. In this case, I suggest not leaving out the curly braces for the for loop body. As D H said recently, it invites Murphy's Law. And of course, put the for on its own line, and the body between those curly braces.

I'll just add that, when choosing different approaches, if it doesn't need to be super high performance and won't have a large impact on the overall program's performance, then I tend to prefer the simplest and most easily maintained solution (but not grossly inefficient and wasteful, of course!). The profiler can tell me where I need to optimize later during development.

Your sort seems wrong. Is this right: sort(arr, arr + n);? If you were using std::vector, you could use the std::sort algorithm: http://www.cplusplus.com/reference/algorithm/sort/ and set a comparator that will make it come out largest first. That makes it even more obvious, simpler and less error prone.

Jamin2112 attempts C++ solutions

Similar threads

Is A.I. more than the sum of its parts?

AI vs. Humans as Processors in an Environment

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect