# Compt Sci; drawing ball program

1. Mar 6, 2013

### sandy.bridge

1. The problem statement, all variables and given/known data
The idea is that the program asks the console user to provide them with a number that tells the program how many different colors of balls are to be used. Then, a ball is randomly drawn from a jar. That ball is put back in the jar, and balls are drawn until the same ball as the first is drawn again. This act is averaged over N times to get a good average (value of N is also provided by the user).

Now, it is suggested that we are to be within 1/100th of the following results:
How many different ball colors are there? 50
How many trials would you like to run? 100000
Average number of balls drawn to get a duplicate over 100000 trials: 9.52394

3. The attempt at a solution

I believe my issue lies in my use of the random function. Here is the code:

PHP:
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

int random(int k)
{
int y = rand() % k, i;
for (i = 1;; i++)
{
int x = rand() % k;
if (y == x)
{
break;
}
}
return i;
}

int main() {
int i = 0, j, k, sum = 0;

srand(time(0));
cout << "Please enter the number of colours: ";
cin >> k;
cout << "Please enter the number of trials: ";
cin >> j;

{while (i < j)
{
sum = sum + random(k);
i++;
} cout << sum/k;}

return 0;
}

What happens is averaging it over N trials results in the average approaching the value of how ever many balls the user chose to use.

2. Mar 6, 2013

### Staff: Mentor

Those numbers do not match the problem statement. Assuming all colors (let's call this k) have the same probability to be drawn (this is missing), the average should approach k+1. Or k, if you do not count the first ball.

If you stop as soon as any color appeared more than once (not necessarily the color of the first ball!), the average might be close to 9.52.
Edit: I can confirm this, the average approaches 9.5431...
And this is correct.
I don't understand why you calculate sum/k, however.

Last edited: Mar 6, 2013
3. Mar 6, 2013

### sandy.bridge

I must be reading this wrong:
"Write a program that will input the number of ball colors, and a value for N. Estimate the approximate number of balls that will have to be drawn from the bucket before you see the same ball color a second time. Once it has this value, print it out to the user.

So I have to change the code such that it stops when any number is first repeated, rather than the first number that is drawn?

Did you have to end up changing the code significantly in order to determine which number was first repeated? For example, should I be utilizing an array?

Last edited: Mar 6, 2013
4. Mar 6, 2013

### Staff: Mentor

Right (both)
I did not use simulations at all, there is a direct way to calculate the expected value.
An array is possible, but there are easier ways to do this. You don't have to keep track of specific colors drawn, it is sufficient to know how many colors were drawn before.

5. Mar 6, 2013

### sandy.bridge

So if I do N trials, what I want to do is keep track of how many of each number was drawn throughout all the N trials? Would these not be different depending on the number, or should they all be the same after a lot of trials?

Edit: Actually, if I merely went with one of the random numbers generated, then counted how many of that one number was generated in total for all N trials, that should yield results close to that value, no?

Last edited: Mar 6, 2013
6. Mar 7, 2013

### Staff: Mentor

What do you mean with "how many of each number"?

Here is an example:
Draw a ball: Color 24
Draw a ball: Color 4. No color appeared twice, continue.
Draw a ball: Color 13. No color appeared twice, continue.
Draw a ball: Color 4. Color 4 appeared twice, stop.
Total balls drawn in this simulation: 4.

Now do this 100000 times and calculate the average of the balls drawn.

Instead of keeping track of all drawn ball numbers, you can calculate the probability P that draw X gives a color which did not appear before. If P (here you need randomness), continue, otherwise stop. This simplifies the algorithm.

7. Mar 7, 2013

### sandy.bridge

Should I be trying to scan for doubles while the loop is randomly generating, or do I generate a bunch of randoms before I start scanning? The difficult part I am having is scanning through to check for doubles. I can do it easily if I know the specific one in mine, but thats it.

8. Mar 7, 2013

### sandy.bridge

I finally got it!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Displays:
PHP:
Enter a number of balls: 50
Enter a number of trials: 1000000
9.54841

Last edited: Mar 7, 2013
9. Mar 8, 2013

### Staff: Mentor

Good :).

Here is what I would have done:
Code (Text):
int simulate(int k) {
for(int i=2;i<=k+1; i++) //the earliest possible stop is at the 2nd ball and the last one is k+1
if(rand()%k < i-1) //we had i-1 colors drawn before the i'th ball is drawn => (i-1)/k stopping probability
return i;
return 0; // should never happen
}

10. Mar 10, 2013

### sandy.bridge

That is far more simple than the one I have. Regardless, thanks for the help!