# Jamin2112 attempts C++ solutions

• C/++/#
Okay, in this thread I'm going to attempt some problems and I would appreciate it if the C++ purists could offer feedback. I Googled to get some ideas for problems. Here's a preview:

• Eliminate whitespace from a string
• Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B")
• Return the sum of the two largest integers in an integer array
• Sum the n largest integers in an array of integers where every integer is between 0 and 9

I will add more problems as I go along in this thread.

Eliminate whitespace from a string:

Code:
void eliminate_whitespace(std::string &str) {
std::string::iterator it = str.begin();
while (it != str.end()) {
if (*it == ' ') {
str.erase(it);
} else {
it++;
}
}
}

Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B"):

Code:
void remove_duplicates(std::string &str) {
std::string::iterator it = str.begin();
while (it != str.end()) {
if (*it == *(it + 1)) {
str.erase(it);
} else {
it++;
}
}
}

Good so far? Anything stylistically poor?

Borek
Mentor
You may want to check what "whitespace" is.

You may want to check what "whitespace" is.

Ok, so it's spaces, tabs and newlines?

Borek
Mentor
Yes. Note that newline can be \n, \r or both, depending on the OS conventions.

D H
Staff Emeritus
Remove duplicate characters from a string (e.g. "AAA BBB" -> "A B"):

Code:
...
if (*it == *(it + 1))
...

The highlighted code invokes undefined behavior when your iterator points to the last character in the string. A better approach: Look at the preceding character rather than the following one. The first character is not a duplicate (there's nothing before it), so start at str.begin()+1. There's a subtlety here: What if the string is empty? (Hint: Use a different comparison operator in your loop.)

The highlighted code invokes undefined behavior when your iterator points to the last character in the string. A better approach: Look at the preceding character rather than the following one. The first character is not a duplicate (there's nothing before it), so start at str.begin()+1. There's a subtlety here: What if the string is empty? (Hint: Use a different comparison operator in your loop.)

Revised:

Code:
void remove_duplicates(std::string &str) {
std::string::iterator it = str.begin() + 1;
while (it != str.end() + 1) { // meaning *it != '\0'
if (*it == *(it - 1)) {
str.erase(it);
} else {
it++;
}
}
}

D H
Staff Emeritus
That's still undefined behavior (str.end()+1). Don't use == or !=. Using something else. String iterators are random access iterators. The full panoply of comparison operators is available for your use.

That's still undefined behavior (str.end()+1). Don't use == or !=. Using something else.

Return the sum of the two largest integers in an integer array:

I think the fastest way, on average, is brute force: Iterate through the array to find the largest, iterate through again to find the second largest, then sum the largest and second largest.

Another way is to quicksort the array and sum the last two elements. Not as fast, on average.

Yet another way would be to add the elements of the array to a binary tree and keeps track of the largest and second largest added so far. I think that will be still be slower than brute force, on average.

Last edited:
D H
Staff Emeritus
No. I'm talking about (it != str.end()), or even worse (it != str.end()+1) (which you really do not want to do).

D H
Staff Emeritus
Return the sum of the two largest integers in an integer array:[/SIZE][/U][/B]

I think the fastest way, on average, is brute force: Iterate through the array to find the largest, iterate through again to find the second largest, then sum the largest and second largest.
One pass through the array will suffice. Keep track of the largest and second largest values. If array is less than or equal to the second largest you don't have to update either. If it's greater than the largest, the (former) largest is the second largest and this new value is now your largest value. Otherwise, it's in between your largest and second largest, so this new value is now the second largest. Sum the two at the end.

Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

Take a look at how other comparison operators work with iterators.

One way or another, do not count on any value or specific behaviour past .end(). It is undefined. Keep in mind that .end() is not at the last element of the sequence, but past-the-end.

In fact, if it's an empty string, even (str.begin() + 1) is technically undefined since str.begin() == str.end(). Adding one to str.begin() puts the iterator past the .end().

So how would you prevent that from happening? Like D H said, always think of these corner cases. Make tests for them, even. When is there effectively no work needing to be done on the input string? How do you make sure (str.begin() + 1) is always defined?

One pass through the array will suffice. Keep track of the largest and second largest values. If array is less than or equal to the second largest you don't have to update either. If it's greater than the largest, the (former) largest is the second largest and this new value is now your largest value. Otherwise, it's in between your largest and second largest, so this new value is now the second largest. Sum the two at the end.

Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

The special cases would have to be runtime errors since the function isn't defined for arrays of length 0 or 1.

Code:
int sum_largest_two(int* arr, int n) {
// some sort of error handling here ...
int largest(arr[0]), secondLargest(arr[0]);
for (int i = 1; i < n; i++) {
if (arr[i] > largest) {
secondLargest = largest;
largest = arr[i];
} else if (arr[i] < largest && secondLargest < arr[i]) {
secondLargest = arr[i];
}
}
return (largest + secondLargest);
}

EDIT .... I probably should use vectors since we're doing C++.

Code:
int sum_largest_two(const std::vector<int> &vec) {
// some sort of error handling here ...
int largest(vec[0]), secondLargest(vec[0]);
for (std::vector::iterator it = vec.begin() + 1; it != vec.end(); ++it) {
if (vec[i] > largest) {
secondLargest = largest;
largest = vec[i];
} else if (vec[i] < largest && secondLargest < vec[i]) {
secondLargest = vec[i];
}
}
return (largest + secondLargest);
}

How do you make sure (str.begin() + 1) is always defined?

I actually don't know the answer to this. If you create a string of length 0, then I think that *str.begin(), as well as *str.end(), is the sentinel character. So the only thing you can do is add a statement that forces the string to length 1 if it's length 0.

I actually don't know the answer to this. If you create a string of length 0, then I think that *str.begin(), as well as *str.end(), is the sentinel character. So the only thing you can do is add a statement that forces the string to length 1 if it's length 0.

That quote is from my post, not D H's, but no matter.

You're going to facepalm when you realize the answer. Ask yourself what needs to be done so the output is correct (no repeated sequential characters) when the input size is 0 or 1.

AlephZero
Homework Helper
You may want to check what "whitespace" is.

Ok, so it's spaces, tabs and newlines?

... and some other characters. The definition of "whitespace" depends what locale you are using (e.g. what language your users are working in).

If you are using unicode characters, there are a whole raft of different "space" characters with different properties to trip up your code.

Look at the functions in <cctype>, and use them. As well as giving the right results, they are likely to run faster than "if this && that && something_else && yet_another_test" logic.

Look at the functions in <cctype>, and use them. As well as giving the right results, they are likely to run faster than "if this && that && something_else && yet_another_test" logic.

There's also a locale aware one in <locale>: http://www.cplusplus.com/reference/locale/isspace/. For removing whitespace, it works nicely with bind and the erase-remove idiom I already mentioned (using C++11).

AlephZero
Homework Helper
Special cases: What if the array is empty? What if it has only one element? *Always* think of those special cases.

The special cases would have to be runtime errors since the function isn't defined for arrays of length 0 or 1.

That's usually a poor way to handle the situation. If your function throws an exception, what is the caller supposed to do with it? Unless he/she bothered to read the function documentation (cue hollow laughter, in real life!) he/she probably won't even expect the exception.

Usually it's better to make a sensible decision about what the return value "means", in the limiting cases. For example returning 0 for an array of length 0, or the single array value for an array of length 1, might be sensible and useful.

Similarly for your "duplicate characters" function: there's an obvious definition for what you get after removing duplicate characters from a zero-length string, so that isn't an error situation.

D H
Staff Emeritus
(using C++11).
Which is a whole nother ball of wax. A whole lot of other balls of wax, in fact.

Speaking of balls of wax, Jamin2112, do you know how to write a proper assignment operator? Do you know when you need to write one? Have you heard of the "rule of three" (which sometimes becomes the "rule of five" in C++11)? By focusing on algorithms, you are missing out on huge chunks of what C++ is all about. Object oriented programming is obviously a big part, but there's also templates and template metaprogramming, and functional programming.

Back to the original ball of wax, C++11 adds a whole lot of machinery for template metaprogramming and functional programming.

Speaking of balls of wax, Jamin2112, do you know how to write a proper assignment operator? Do you know when you need to write one? Have you heard of the "rule of three" (which sometimes becomes the "rule of five" in C++11)? By focusing on algorithms, you are missing out on huge chunks of what C++ is all about. Object oriented programming is obviously a big part, but there's also templates and template metaprogramming, and functional programming.

Well, I've read C++ Primer. I know what templates are. Not sure about template metaprogramming.

You need <ctype.h> and <algorithm>.

That should be <cctype> and not <ctype.h>, if you'll recall from previous commentaries I've made on your code and using headers that come from C in C++. But you've got the idea. Just keep in mind, as previously mentioned, it is not locale aware, so make sure its assumptions are not violated.

That sort of functional programming can be very useful in the right place. As D H says, though, all that and C++/C++11 in general, is a big ball of wax. Planet sized ball of wax. Maybe a solar system's worth. Like music, it's a lifelong journey of learning. May as well just enjoy the voyage, one step at a time.

Adding to what D H said, here's a page on template metaprogramming: http://en.wikipedia.org/wiki/Template_metaprogramming. And on the "rule of three": http://en.wikipedia.org/wiki/Rule_of_three_(C++_programming). Wikipedia is really useful. Make full use of it. :)

You know, at the risk of sounding like a fanboy, if D H is right any more often, I may need to get some "Listen to D H" T-shirts made. lol

Sum the n largest integers in an array of integers where every integer is between 0 and 9

I'm going to approach this in a think-out-loud style so you guys can critique my thinking as well as my final solution.

Whenever a problem says "where every integer is between a and b ...", that's a dead giveaway of how you should best approach the problem. You're going to create some sort of container of length a - b + 1, which will record the frequency of each number a through b. The recipe I think will work efficiently is below.

Code:
int sum_largest_n_09(int* arr, int size, int n) {
vector<int> cntVec(10, 0);
for (int k = 0; k < size; k++) cntVec[arr[k]]++;
// now an algorithm to calculate and return the sum ...
}

Note that the above assume n ≤ size and that the numbers in the array are indeed between 0 and 9. The algorithm to calculate the sum is a bit messy to write. I'm working on it.

Another recipe would be simpler to write but more costly in terms of time, especially when the size of the array is large. Sort the array and sum the last n elements.

Code:
int sum_largest_n_09(int* arr, int size, int n) {
sort (arr, arr + n);
int sum = 0;
for (int i = 1; i <= 10; i++) sum += arr[size - i];
return sum;
}

Same assumptions as before.

Is my thinking right so far?

Last edited:
D H
Staff Emeritus
What is it with this excessive focus on algorithms, Jamin2112? I understand that some interviewers nowadays have an excessive focus on obscure algorithms, as if that will help them weed the wheat from the chaff. It won't, and that they do should be a clue that you do not want to work at those places.

That's my take, anyhow. My take may not be the best advice. I've been working for 35 years, solid, and I've looked for and found a job *once* (and that was over 30 years ago). Every other time I've switched employers the job has found me.

My insight is from the other side of the fence: I occasionally interview candidates. For freshouts with only a bachelors degree, I want to see if they have the knowledge they claim they have, and whether they can work as part of a team. I do not understand this weird concept of asking candidates to write a program for some *extremely* obscure problem whose solution is nonetheless easily found with a google search. It makes absolutely no sense to me. That is not checking any of those boxes I am supposed to check as an interviewer. (Do you know *anything*? Do you know anything relevant to my employer? Can you function as a part of a team? Would I want *you* to be a part of my team?)

Back on topic, here are some of the mistakes I see you making:
• You aren't taking full advantage of the standard library. That's quite forgivable; the standard library is huge. Add in the stuff from Boost and C++11 and it's ridiculously huge.
• You are making fencepost errors. That is not forgivable. Fencepost errors are, at least to me, one of the key things that separate the wheat from the chaff.
• You are not dealing with corner cases. That, too, is not forgivable. "Check early, check often."

1 person
Back on topic, here are some of the mistakes I see you making:
• You aren't taking full advantage of the standard library. That's quite forgivable; the standard library is huge. Add in the stuff from Boost and C++11 and it's ridiculously huge.
• You are making fencepost errors. That is not forgivable. Fencepost errors are, at least to me, one of the key things that separate the wheat from the chaff.
• You are not dealing with corner cases. That, too, is not forgivable. "Check early, check often."

What sort of fencepost error did I make?

Sum the n largest integers in an array of integers where every integer is between 0 and 9

Whenever a problem says "where every integer is between a and b ...", that's a dead giveaway of how you should best approach the problem. You're going to create some sort of container of length a - b + 1, which will record the frequency of each number a through b. The recipe I think will work efficiently is below.

Be careful of letting the requirements frame your approach to the implementation. In this case, a vector (*cough*stopusingarrays*cough*) is very much a good idea. But if that range was very large and would result in a sparse array of counts, then you might consider something else like a map.

Code:
int sum_largest_n_09(int* arr, int size, int n) {
vector<int> cntVec(10, 0);
for (int k = 0; k < size; k++) cntVec[arr[k]]++;
// now an algorithm to calculate and return the sum ...
}

First, no reason to pass in an array rather than a vector. If you're just missing being able to initialize a vector from a list of integers, then check out C++11 and use initializer lists. But don't let that affect the function signature, either way. I think for implementing that in C++, you should probably assume that 'array' means you should use a vector, and not forcing you into making a subpar choice. Perhaps the requirements would better be stated as '...integers in a list of integers'.

I'm also going to suggest you go back to advice I gave you about a month ago on your other code, because it still applies and I don't want to repeat myself. In this case, I suggest not leaving out the curly braces for the for loop body. As D H said recently, it invites Murphy's Law. And of course, put the for on its own line, and the body between those curly braces.

I'll just add that, when choosing different approaches, if it doesn't need to be super high performance and won't have a large impact on the overall program's performance, then I tend to prefer the simplest and most easily maintained solution (but not grossly inefficient and wasteful, of course!). The profiler can tell me where I need to optimize later during development.

Your sort seems wrong. Is this right: sort(arr, arr + n);? If you were using std::vector, you could use the std::sort algorithm: http://www.cplusplus.com/reference/algorithm/sort/ and set a comparator that will make it come out largest first. That makes it even more obvious, simpler and less error prone.

Be careful of letting the requirements frame your approach to the implementation. In this case, a vector (*cough*stopusingarrays*cough*) is very much a good idea. But if that range was very large and would result in a sparse array of counts, then you might consider something else like a map.

First, no reason to pass in an array rather than a vector. If you're just missing being able to initialize a vector from a list of integers, then check out C++11 and use initializer lists. But don't let that affect the function signature, either way. I think for implementing that in C++, you should probably assume that 'array' means you should use a vector, and not forcing you into making a subpar choice. Perhaps the requirements would better be stated as '...integers in a list of integers'.

I'm also going to suggest you go back to advice I gave you about a month ago on your other code, because it still applies and I don't want to repeat myself. In this case, I suggest not leaving out the curly braces for the for loop body. As D H said recently, it invites Murphy's Law. And of course, put the for on its own line, and the body between those curly braces.

I'll just add that, when choosing different approaches, if it doesn't need to be super high performance and won't have a large impact on the overall program's performance, then I tend to prefer the simplest and most easily maintained solution (but not grossly inefficient and wasteful, of course!). The profiler can tell me where I need to optimize later during development.

Your sort seems wrong. Is this right: sort(arr, arr + n);? If you were using std::vector, you could use the std::sort algorithm: http://www.cplusplus.com/reference/algorithm/sort/ and set a comparator that will make it come out largest first. That makes it even more obvious, simpler and less error prone.

Thanks for the advice! I'm taking it to heart.

I'm not sure how to set a comparator. Can you show me?

Thanks for the advice! I'm taking it to heart.

I'm not sure how to set a comparator. Can you show me?

Sure, you'll note that URL I gave you to the STL sort from <algorithm> contains an example bit of code that should make how to do it fairly clear, I hope. It's a really useful site that I use all the time. A very small and hopefully obvious change to their example comparator function should be all you need.

Sure, you'll note that URL I gave you to the STL sort from <algorithm> contains an example bit of code that should make how to do it fairly clear, I hope. It's a really useful site that I use all the time. A very small and hopefully obvious change to their example comparator function should be all you need.

Ah, I see. The solution should look something like this.

Code:
bool greater (int i, int j) {
return (i > j);
}

int sum_largest_n(std::vector<int> vec, int n) {
std::vector<int>::iterator itbeg = vec.begin();
std::sort(itbeg, vec.end(), greater);
int sum = 0;
for (std::vector<int>::iterator it = itbeg ; it != itbeg  + 9; ++it)
sum += *it;
return sum;
}

But of course this doesn't use the fact that every number in vec is between 0 and 9.

D H
Staff Emeritus