Compt Sci; drawing ball program

AI Thread Summary
The discussion revolves around a programming assignment that requires creating a simulation to estimate how many balls need to be drawn from a jar before a duplicate color is drawn. The initial code provided does not correctly implement the logic to stop drawing when any color appears twice, leading to incorrect average results. Participants suggest using an array to track drawn colors or calculating probabilities to simplify the process. The correct approach involves averaging the number of draws across multiple trials until a duplicate is found, with the expected average for 50 colors being approximately 9.52394. The conversation concludes with a more efficient simulation method being proposed, emphasizing the importance of accurately tracking drawn colors to achieve the desired outcome.
sandy.bridge
Messages
797
Reaction score
1

Homework Statement


The idea is that the program asks the console user to provide them with a number that tells the program how many different colors of balls are to be used. Then, a ball is randomly drawn from a jar. That ball is put back in the jar, and balls are drawn until the same ball as the first is drawn again. This act is averaged over N times to get a good average (value of N is also provided by the user).

Now, it is suggested that we are to be within 1/100th of the following results:
How many different ball colors are there? 50
How many trials would you like to run? 100000
Average number of balls drawn to get a duplicate over 100000 trials: 9.52394


The Attempt at a Solution



I believe my issue lies in my use of the random function. Here is the code:

PHP:
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

int random(int k)
{
    int y = rand() % k, i;
    for (i = 1;; i++)
    {
        int x = rand() % k;
            if (y == x)
        {
            break;
        }
    }
    return i;
}

int main() {
    int i = 0, j, k, sum = 0;

    srand(time(0));
    cout << "Please enter the number of colours: ";
    cin >> k;
    cout << "Please enter the number of trials: ";
    cin >> j;


    {while (i < j)
    {
    	sum = sum + random(k);
    	i++;
    } cout << sum/k;}



    return 0;
    }

What happens is averaging it over N trials results in the average approaching the value of how ever many balls the user chose to use.
 
Physics news on Phys.org
sandy.bridge said:
Now, it is suggested that we are to be within 1/100th of the following results:
How many different ball colors are there? 50
How many trials would you like to run? 100000
Average number of balls drawn to get a duplicate over 100000 trials: 9.52394
Those numbers do not match the problem statement. Assuming all colors (let's call this k) have the same probability to be drawn (this is missing), the average should approach k+1. Or k, if you do not count the first ball.

If you stop as soon as any color appeared more than once (not necessarily the color of the first ball!), the average might be close to 9.52.
Edit: I can confirm this, the average approaches 9.5431...
What happens is averaging it over N trials results in the average approaching the value of how ever many balls the user chose to use.
And this is correct.
I don't understand why you calculate sum/k, however.
 
Last edited:
I must be reading this wrong:
"Write a program that will input the number of ball colors, and a value for N. Estimate the approximate number of balls that will have to be drawn from the bucket before you see the same ball color a second time. Once it has this value, print it out to the user.

So I have to change the code such that it stops when any number is first repeated, rather than the first number that is drawn?

Did you have to end up changing the code significantly in order to determine which number was first repeated? For example, should I be utilizing an array?
 
Last edited:
sandy.bridge said:
I must be reading this wrong:
[...]
So I have to change the code such that it stops when any number is first repeated, rather than the first number that is drawn?
Right (both)
Did you have to end up changing the code significantly in order to determine which number was first repeated? For example, should I be utilizing an array?
I did not use simulations at all, there is a direct way to calculate the expected value.
An array is possible, but there are easier ways to do this. You don't have to keep track of specific colors drawn, it is sufficient to know how many colors were drawn before.
 
So if I do N trials, what I want to do is keep track of how many of each number was drawn throughout all the N trials? Would these not be different depending on the number, or should they all be the same after a lot of trials?

Edit: Actually, if I merely went with one of the random numbers generated, then counted how many of that one number was generated in total for all N trials, that should yield results close to that value, no?
 
Last edited:
So if I do N trials, what I want to do is keep track of how many of each number was drawn throughout all the N trials?
What do you mean with "how many of each number"?

Here is an example:
Draw a ball: Color 24
Draw a ball: Color 4. No color appeared twice, continue.
Draw a ball: Color 13. No color appeared twice, continue.
Draw a ball: Color 4. Color 4 appeared twice, stop.
Total balls drawn in this simulation: 4.

Now do this 100000 times and calculate the average of the balls drawn.

Instead of keeping track of all drawn ball numbers, you can calculate the probability P that draw X gives a color which did not appear before. If P (here you need randomness), continue, otherwise stop. This simplifies the algorithm.
 
Should I be trying to scan for doubles while the loop is randomly generating, or do I generate a bunch of randoms before I start scanning? The difficult part I am having is scanning through to check for doubles. I can do it easily if I know the specific one in mine, but that's it.
 
I finally got it!

Displays:
PHP:
Enter a number of balls: 50
Enter a number of trials: 1000000
9.54841
 
Last edited:
Good :).

Here is what I would have done:
Code:
int simulate(int k) {
  for(int i=2;i<=k+1; i++) //the earliest possible stop is at the 2nd ball and the last one is k+1
   if(rand()%k < i-1) //we had i-1 colors drawn before the i'th ball is drawn => (i-1)/k stopping probability
     return i;
  return 0; // should never happen
}
 
  • #10
That is far more simple than the one I have. Regardless, thanks for the help!
 

Similar threads

Replies
2
Views
3K
Replies
3
Views
1K
Replies
3
Views
1K
Replies
14
Views
4K
Replies
7
Views
2K
Replies
7
Views
2K
Back
Top