Is there anything wrong with this C code?

Jamin2112 · Aug 19, 2013

I had an interview the other day and the lady was asking me about different ways to find whether there are any repeated elements in an array. At the end of the interview she told me to send her a code sample showing how to implement two of the ways we discussed. I thought that what I wrote was pretty straightforward, but I ended up not getting another interview (possibly for another reason). I'm wondering whether my code shows any indications that I'm a stylistic amateur or whether anything is downright wrong.

Code:

#include <algorithm>
    #include <string>
    #include <iostream>
    using namespace std;

    // This function returns true if the input array arr of length n, with values known to be in the
    // range between floor and ceiling inclusive, contains any repeated elements.
    bool contains_repeats(int* arr, int n, int floor, int ceiling) { 
	    bool binaryMask[ceiling - floor + 1];
	    for (int i = 0; i < n; i++) {
		    if (binaryMask[ceiling - arr[i]]) { 
			    return true;
		    } else {
			    binaryMask[ceiling - arr[i]] = 1;
		    }
	    }
        return false;
    }

    // An overloading of the previous function uses quicksort and then checks whether any    adjacent
    // elements are equal.
    bool contains_repeats(int* arr, int n) { 
    	sort(arr, arr + n);
    	for (int i = 0; i < n - 1; i++) if (arr[i] == arr[i+1]) return true;
	return false;
    }


    int main(int argc, char* argv[]) { 

	// Example of implemenation for the binary mask:
	int numbers[] = {1, 3, 5, 1, 30, 4, 38, 94, 14, 41, 40, 69, 15, 11, 33, 3};
	int low = 0;
	int high = 100; 
	cout << "Are there any repeated elements?\n"; 
	string answer = contains_repeats(numbers, sizeof(numbers)/sizeof(int), low, high) ? "Yes." : "No.";
	cout << answer << endl;
	// Example of implemenation using quicksort:
	cout <<"Let's see if we get the same answer from the other procedure ...\n";
	answer = contains_repeats(numbers, sizeof(numbers)/sizeof(int)) ? "Yes." : "No.";
	cout << answer;
	}

Grep · Aug 19, 2013

I assume you meant C++ and not C, because it's not C. If she asked for C, that might be an issue. Guessing not.

One thing that jumps out at me is your indentation troubles. You seem to have mixed tabs and spaces. I like spaces and not tabs for a few reasons, but whichever you choose, stick to one and don't mix them. And things don't line up properly everywhere. I know I'd count that against you. If you're careless with indentation for a small bit of code for an interview, I would expect no better on the job.

I'd not have counted it against you, but I also would avoid omitting curly braces or putting things on the same line that maybe shouldn't be. Clarity is important, and that line starting with the for statement in the function using the sort would be more obvious and clear if you'd not tried to put it all on one line. In fact, I've found that using the curly braces even when not strictly necessary avoids a few potential errors, particularly when modifying the code later, or quickly scanning the code. I suppose that last bit about the braces is somewhat debatable, though. But I strive for maximum obviousness and clarity, and I think that helps. The reason I'd not count it against you too much is that if I wanted it done that way, I'd put it in a coding standard you would have to follow, anyways.

Since it's C++, perhaps a C++ container would have been better than an array, as well. For example, vector<bool> or bitset. Read the description of them, and keep in mind they pack the bits, which you may not want. For an unspecialized vector that doesn't pack bits, you would need to use a non-bool data type. Certainly, Stroustroup would tell you not to use arrays in C++. Also possibly debatable, but the container would be safer and clearer. No need to use an array for speed, either, unless you find that area needs optimization later. Don't optimize prematurely. The speed difference between a vector and an array is far less than one would think. Just try and avoid resizing it by allocating sufficient space at the start. But if she said "array", then that's what you gave her, so hard to complain.

Lastly, a lot of people would say that you shouldn't include all of std at the start of a file (the using namespace std; line). I'll say that I'm guilty of this sometimes, but here's a good explanation why a lot of people consider it bad practice: http://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice.

That's about all I can see at first glance that she might have dinged you on (looks technically correct, though I haven't gone through it with a fine tooth comb). Anyways, the indentation is the more serious problem, in my opinion, and the one that I'd personally count against you were I the interviewer. Other than that, it's not so bad. Perhaps other people have comments on something I've missed (or wish to debate the debatable parts ;) ).

Good luck with your job search!

EDIT: Ouch, you didn't initialize binaryMask to all falses. That's a serious issue. It's certainly not guaranteed to contain all 0s/falses. Also, did you really mean to index your binaryMask with (ceiling - arr)? (arr - floor) makes a lot more logical sense to me. If the lowest is 3, for example, then it would store 5 at index 5 - 3 = 2. IMO, you need to test your code more thoroughly. A good test would have weeded that out.

Jamin2112 · Aug 19, 2013

Grep said:

Since it's C++, perhaps a C++ container would have been better than an array, as well. For example, vector<bool> or bitset.

When I was talking with her on the phone, I mentioned that I could use bitset for the binary mask. But then, when I tried to use it in my code, I realized it wouldn't work because the parameter in the constructor of bitset needs to be a constant set before run time. I can't make a function that has a parameter for the size of the bitset created within unless that parameter is a constant set before run time.

Grep · Aug 19, 2013

Jamin2112 said:

When I was talking with her on the phone, I mentioned that I could use bitset for the binary mask. But then, when I tried to use it in my code, I realized it wouldn't work because the parameter in the constructor of bitset needs to be a constant set before run time. I can't make a function that has a parameter for the size of the bitset created within unless that parameter is a constant set before run time.

Quite right, it's a template parameter. My bad. In that case, vector<bool> or vector<char> (if you don't want the bits packed) or something like that would be some other options. In fact, just dropping in 'vector<bool> binaryMask(ceiling - floor + 1);' for your initialization works and solves your initialization bug.

Also note my edit about the bug with not initializing your binaryMask I just added. P.S. Also, it's not really a mask, strictly speaking.

Jamin2112 · Aug 20, 2013

Grep said:

EDIT: Ouch, you didn't initialize binaryMask to all falses. That's a serious issue.
It's certainly not guaranteed to contain all 0s/falses.

Why not?

IMO, you need to test your code more thoroughly. A good test would have weeded that out.

Yea, I thought of making a little test module to go along with the script I submitted. I didn't really know what she wanted since she wasn't every specific.

Grep · Aug 20, 2013

Jamin2112 said:

Why not?

Just the way the specifications are. A holdover from C. From http://www.cplusplus.com/doc/tutorial/arrays/ :

"When declaring a regular array of local scope (within a function, for example), if we do not specify otherwise, its elements will not be initialized to any value by default, so their content will be undetermined until we store some value in them. The elements of global and static arrays, on the other hand, are automatically initialized with their default values, which for all fundamental types this means they are filled with zeros."

jtbell · Aug 20, 2013

Grep said:

EDIT: Ouch, you didn't initialize binaryMask to all falses. That's a serious issue. It's certainly not guaranteed to contain all 0s/falses.

Jamin2112 said:

Why not?

In C++ (and I assume in C also) array elements are not initialized automatically. They preserve whatever was previously in those memory locations when the array is allocated.

When you construct a C++ vector, you can specify a value to which all elements are to be initialized:

Code:

vector<bool> binaryMask (ceiling - floor + 1, false);

[added] Grep beat me to it, with the more precise statement that local arrays are not initialized automatically.

Jamin2112 · Aug 20, 2013

Grep said:

Just the way the specifications are. A holdover from C. From http://www.cplusplus.com/doc/tutorial/arrays/ :

"When declaring a regular array of local scope (within a function, for example), if we do not specify otherwise, its elements will not be initialized to any value by default, so their content will be undetermined until we store some value in them. The elements of global and static arrays, on the other hand, are automatically initialized with their default values, which for all fundamental types this means they are filled with zeros."

Then you're right that I made an "ouch" mistake.

I'll remember that for the future.

I was wondering, Could you maybe list a few programming questions and then review my code? Basic stuff you'd find on a computer science test in college, like the mode of an array, median of two sorted arrays, etc. I'm going to try to do them like a C++ purist would, then you can critique my code. That would be much appreciated.

Grep · Aug 20, 2013

Jamin2112 said:

Then you're right that I made an "ouch" mistake.

I'll remember that for the future.

I was wondering, Could you maybe list a few programming questions and then review my code? Basic stuff you'd find on a computer science test in college, like the mode of an array, median of two sorted arrays, etc. I'm going to try to do them like a C++ purist would, then you can critique my code. That would be much appreciated.

Past my bedtime already, and tomorrow (well, today, now) I DM a game of Pathfinder all night, so I won't be able to get to it right away. But I'll think it over and see if I can come up with some ideas. Not that I consider myself a C++ master, but I'd find someone claiming such status to be suspect, to be honest. Anyways, if someone disagrees with my critique, I'll likely be corrected. That's one thing you can usually count on in an internet forum. lol

Perhaps someone else will pitch in some good ideas (and hopefully help with the critique, which could be somewhat time consuming).

In the meanwhile, for general practice, I enjoy Project Euler: http://projecteuler.net/. They get pretty challenging. :) Not sure it's best in this case, since you want to focus on the language and not the CS aspect, I guess, but a cool site I thought I'd mention.

Borek · Aug 20, 2013

Some compilers do initialize arrays with zeros in debug mode, but not in the final code. I think Visual C++ does that. That means your code can run OK in debug mode and stop working once compiled for release.

Grep · Aug 20, 2013

I suppose you could go for one of your own suggestions: Write a function that computes the mode of a series of integers. I'll add some specifications:

Numbers may have a very wide range, and the list could be fairly large.
Try to not traverse the list of numbers more than once.
If multi-modal, just return one of them arbitrarily.
Also, perhaps use C++11 features. In gcc, just add the -std=c++11 option.

inotyce · Aug 20, 2013

If your array is small in size, you can also do this (without sorting):
Create all of its permutated partners (complicated unless you know how to create them)
Make a sum: a+b_k[j] (i=[0,n-1] original array index) (j=[0,n-1], permutated array index) (b_k is k^th permutator, k=[0,n-1])
If the sum=2*a then there are duplicated members

Jamin2112 · Aug 20, 2013

inotyce said:

If your array is small in size, you can also do this (without sorting):
Create all of its permutated partners (complicated unless you know how to create them)
Make a sum: a+b_k[j] (i=[0,n-1] original array index) (j=[0,n-1], permutated array index) (b_k is k^th permutator, k=[0,n-1])
If the sum=2*a then there are duplicated members

Can show a simple example to demonstrate this procedure?

inotyce · Aug 20, 2013

Yes, sure

a={1,5,0,2,5};
b_0={5,0,2,5,1};
b_1={0,2,5,1,5};
b_2={2,5,1,5,0};
b_3={5,1,5,0,2};
----------------
a+b_1={6,5,2,7,6};
...
a+b_2={3,10,1,7,5};
...
---------------------

In case a+b_2, the resulted second element is doubled, compared to array a's second element. Then you can conclude there is a duplicated element in the original array.

By the way, I used a wrong word "permutation", too general, since we only need left/right shifted cases and thus we have N-1 cases, and it might cost O(N*N) to get the work done.

rcgldr · Aug 21, 2013

Borek said:

Some compilers do initialize arrays with zeros in debug mode, but not in the final code. I think Visual C++ does that. That means your code can run OK in debug mode and stop working once compiled for release.

Visual C++ usally sets unitialized variables to all hex cc...cc . This ends up being helpful when debugging code.

jtbell · Aug 21, 2013

jtbell said:

In C++ (and I assume in C also) array elements are not initialized automatically. They preserve whatever was previously in those memory locations when the array is allocated.

[...]

[added] Grep beat me to it, with the more precise statement that local arrays are not initialized automatically.

Now I suppose I'd better try to be even more precise: The C++ standard does not require that elements of local arrays of native data types be automatically initialized to any particular value when the arrays are allocated. Therefore, a program should not depend on such automatic initialization.

With an array of a non-native data type, e.g. a class or struct defined as part of the program, I think the default constructor of that class is invoked. I've always used vectors instead of arrays for that sort of thing, and in that case the default constructor is used if you don't provide a value to initialize the elements with, when declaring the vector.

D H · Aug 21, 2013

In addition to the problem that the contents of the array binaryMask are not initialized, there are several other problems with bool binaryMask[ceiling - floor + 1];.

One big problem is that this array is allocated on the stack rather than the heap. Use the stack for small things. A lot of operating systems limit the size of the stack. Exceed this size and you get stack overflow.

Another problem is that it is illegal. The size of an array in C++ must be a compile-time constant expression. You can get away with run-time bounds on many compilers because many compilers use a common core engine for C and C++, and C does allow dynamically-sized arrays. Allowing dynamically-sized arrays in C++ is a common extension that may make it's way into the next version of the standard.

Yet another problem: What if those limits are huge, or what if the array elements aren't some integral type? You certainly don't want to allocate an array of 2⁶⁴ elements, or even larger.

There are also a couple of problems with the sorted array approach. One is that you are sorting the array. That's not a very nice thing to do. What if the initial order somehow matters to the caller of your function contains_repeats? The sort-based version of your function is unusable that caller. Ideally, the input array should be const. You probably should be sorting a copy of the input array.

Another problem is that it's an O(N log N) algorithm. It would be nice to do this as an O(N) algorithm. Your bit flag implementation is an O(N) algorithm.

Another is to create a set from your input array. A set, by definition, has no duplicates. One way to do this is to use std::set. However, creating a std::set is an O(N log N) algorithm. If you are using C++11, you can create a std::unordered_set. This is O(N) on average, but O(N²) with some pathological cases. So how to tell if you have duplicates? Simple: The size of the set is equal to the size of the input array if there are no duplicates. The set will be smaller in size if the input array contains duplicates.

There's nothing wrong per se with the dumb but simple O(N²) algorithm of checking whether arr[1] is the same as arr[0], arr[2] is the same as arr[0] or arr[1], and so on. The algorithm may be inefficient, but the code is short and sweet, blatantly obvious, and stands by itself.

One last point: Do not use using namespace std. It's key indicator of someone best not hired.

Jamin2112 · Aug 21, 2013

D H said:

In addition to the problem that the contents of the array binaryMask are not initialized, there are several other problems with bool binaryMask[ceiling - floor + 1];.

One big problem is that this array is allocated on the stack rather than the heap. Use the stack for small things. A lot of operating systems limit the size of the stack. Exceed this size and you get stack overflow.

Another problem is that it is illegal. The size of an array in C++ must be a compile-time constant expression. You can get away with run-time bounds on many compilers because many compilers use a common core engine for C and C++, and C does allow dynamically-sized arrays. Allowing dynamically-sized arrays in C++ is a common extension that may make it's way into the next version of the standard.

Yet another problem: What if those limits are huge, or what if the array elements aren't some integral type? You certainly don't want to allocate an array of 2⁶⁴ elements, or even larger.

There are also a couple of problems with the sorted array approach. One is that you are sorting the array. That's not a very nice thing to do. What if the initial order somehow matters to the caller of your function contains_repeats? The sort-based version of your function is unusable that caller. Ideally, the input array should be const. You probably should be sorting a copy of the input array.

Another problem is that it's an O(N log N) algorithm. It would be nice to do this as an O(N) algorithm. Your bit flag implementation is an O(N) algorithm.

Another is to create a set from your input array. A set, by definition, has no duplicates. One way to do this is to use std::set. However, creating a std::set is an O(N log N) algorithm. If you are using C++11, you can create a std::unordered_set. This is O(N) on average, but O(N²) with some pathological cases. So how to tell if you have duplicates? Simple: The size of the set is equal to the size of the input array if there are no duplicates. The set will be smaller in size if the input array contains duplicates.

There's nothing wrong per se with the dumb but simple O(N²) algorithm of checking whether arr[1] is the same as arr[0], arr[2] is the same as arr[0] or arr[1], and so on. The algorithm may be inefficient, but the code is short and sweet, blatantly obvious, and stands by itself.

One last point: Do not use using namespace std. It's key indicator of someone best not hired.

Thanks. This is all very good advice.

Is there anything wrong with this C code?

Is A.I. more than the sum of its parts?

AI vs. Humans as Processors in an Environment

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Is there anything wrong with this C code?

Similar threads