Random generated doubles: can I get a number more than once?

RaamGeneral · May 3, 2016

Hello. I recall that the probability of generating a particular real number (in a bounded or unbounded interval) is exactly zero.
If I keep generating reals for a limited amount of time, can I get a number more than once?

About the computer, I know that the amount of different values the type double (for instance) can store is not infinite. This should mean I could get duplicates, but very unlikely, right?.

I'm working on an assignment where there are a bunch of random (in an interval) doubles inside a matrix and I have to pick the position of the least one. The assignment says: "if there are more, choose randomly among them".

If the answer to my first question is "no", then theoretically I will never be in that situation. But because of the machine limitations actually I could (especially if the matrix is read from a file which contains the numbers with a small number of digits).

I'm using C and in my particular assignment I find difficult to use rand(), because I'm writing this code inside a function, where I can't call srand(time(0)) [because the function will be executed in a loop (the matrix is modified each time in a way that I can't consider that position any more)]. srand is not called in the main() and I don't write the main().I decided to assume that the numbers are indeed all different. Even if there were some equal and are the smallest I will pick the first when iterating the matrix, and I can say something like this: I should take one of them randomly, but because they were created and put randomly inside the matrix it seems like the same to take the first.
Another explanation is this: because the program simulates a physical phenomenon (that I know nothing about, it's called invasion percolation), IF the phenomenon is deterministic, considering the same matrix, the evolution of the system should always be the same and so the evolution of the matrix calling my function in the loop.Sorry for writing so much and the terrible english. I'd like to have some discussion about this problem.

Ibix · May 3, 2016

You will eventually get repeated numbers because, as you note, you are actually drawing from a finite list of possible values. How long it is before you expect to get a repeat, I don't know. However, IIRC the C rand() function returns a number in the range 0-65535, so if you are actually using that you must get a repeat in 65536 iterations or less. Probably much less. This is the repeated birthday problem - how many people do you need before it is 50:50 that two of them have the same birthday? 23. Apply the same methodology to your random number generator and see how likely you are to have a repeat given your matrix size.

BvU · May 3, 2016

IEEE-754 double data type in C is 8 bytes. So there are ##2^{64}## (##\approx 1.8e19##) possible values (minus a few that get branded 'not a number' or ##\pm##inf). as a rule I remember you can have 15 decimal digits of precision that way, and I 'never'

ran over the upper limit (1.7e308).
So you can generate quite a few doubles before you generate the same one twice, even when observing Ibix' caveat.

RaamGeneral · May 3, 2016

BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.

newjerseyrunner · May 3, 2016

RaamGeneral said:

BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.

I'm not 100% that you'll get all numbers, it would depend on the "random" algorithm, there may be discreet values that will never be presented by that function.

I think your formula is for creating a value within a certain range? If you, your algorithm is not correct in using division, you'll end up with an integer division, which may not do what you expect it to do, you want to use modulus.

Code:

int random_between(int minimum, int maximum){
    return (rand() % (maximum + 1 - minimum)) + minimum;
}

glappkaeft · May 3, 2016

BvU said:

IEEE-754 double data type in C is 8 bytes. So there are ##2^{64}## (##\approx 1.8e19##) possible values (minus a few that get branded 'not a number' or ##\pm##inf). as a rule I remember you can have 15 decimal digits of precision that way, and I 'never' ran over the upper limit (1.7e308).
So you can generate quite a few doubles before you generate the same one twice, even when observing Ibix' caveat.

Like others have mention this depends entirely on how good the algorithm you are using is. You calculation is only true for a trully cryptagraphic grade random number generator. If you use the ones commonly built in you may have little or no true randomness. For instance, if you use the c rand() function naively it is completely deterministic and might repeat pretty quickly. The only way to know for sure is to look up the characteristics of the algoritm you are using or if designing it yourself, do the math.

BvU · May 3, 2016

RaamGeneral said:

BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.

I don't see that. As long as a and b are double you should get a heck of a lot more. Or are you generating integers ?

The expression is weird anyway. What is the intention of having a, b AND rand_max ?

BvU · May 3, 2016

glappkaeft said:

If you use the ones commonly built in you may have little or no true randomness. For instance, if you use the c rand() function naively it is completely deterministic and might repeat pretty quickly. The only way to know for sure is to look up the characteristics of the algoritm you are using or if designing it yourself, do the math.

I agree you are dependent on the compiler and whatever library it uses (and options settings).

Ibix · May 3, 2016

BvU said:

I don't see that. As long as a and b are double you should get a heck of a lot more. Or are you generating integers ?

The expression is weird anyway. What is the intention of having a, b AND rand_max ?

The basic C standard library rand() function returns an integer in the range 0-RAND_MAX, and RAND_MAX=65535. His formula converts that into a double in the range a to b. But since all the variation comes from one call to rand(), he can only get one of 65536 doubles.

BvU · May 3, 2016

I'll accept your authority in this. No wonder I never used C but stuck to Fortran

. Not just an integer function, but a 2byte function to boot, and with a name starting with the letter r . Yuck !

Ibix · May 3, 2016

By way of not pretending to be an authority, here's a reference: http://www.tutorialspoint.com/c_standard_library/c_function_rand.htm

Apparently RAND_MAX is only guaranteed to be at least 32767. The return type is int, so it may be a four-byte integer depending on implementation. The implementation in Kernighan and Ritchie uses RAND_MAX=65535; I haven't seen any other but I haven't exactly made a study of it. OP can just printf("%d",RAND_MAX); if he wants to know what his actually is.

RaamGeneral · May 3, 2016

The program has these functions (that I have to write) among others:

/* allocate memory for the matrix; given, I can't modify this */
double** new_matrix(unsigned n, unsigned m);
/* set the elements of the matrix randomly */
int init_percolation (double ** mat, unsigned n, unsigned m, double a, double b);
/* load the matrix from file */
int load_matrix (FILE * f, double ** a, unsigned n, unsigned m);
/* "removes" the element in the center of the matrix */
int first_step (double ** mat, unsigned n, unsigned m);
/* of all the elements near the removed ones, remove the least */
int step (double ** mat, unsigned n, unsigned m, double a, double b);

I must use the function rand() when needed, I don't have to design something like that. This is only a simple assignment for the university, but if I'm not missing something, there is something wrong here.

The main() is given:
step() is inside a loop, and calling srand() inside a loop doesn't really work from what I know.
Now I see even init_matrix() is called in a loop so the same argument applies: the matrices will be all the same (I didn't test it yet, actually).

I really don't know how and where call the srand(), or maybe I'm not knowing the usage of this function very well.EDIT. Now I see (then I will verify with printf) that RAND_MAX "in the GNU C Library, it is 2147483647".

NascentOxygen · May 4, 2016

RaamGeneral said:

The assignment says: "if there are more, choose randomly among them".

It seems to me that, regardless of how small the likelihood of encountering identical reals, because it cannot be guaranteed to never occur your code should be written to gracefully handle that event: it's in the specifications.

The fact that you may find you can execute your program a few hundred times without encountering the special case of a repeated element means nothing. ⚄

RaamGeneral · May 4, 2016

I agree.

What about the srand() problem?

Here it is my function and a portion of the main().

Code:

int init_percolation(double ** mat, unsigned n, unsigned m, double a, double b) 
{ 
   
    size_t i,j;
   
    srand(time(NULL));

    for(i=0;i<n;i++)
        for(j=0;j<m;j++)
            mat[i][j]=a+(b-a)*rand()/RAND_MAX;
   
   
    return 0;

}

int main (void){

for (k=0; k<NI; k++) { 
    if ( init_percolation(a,N,M, low, up) != 0 ) return EXIT_FAILURE;
    print_matrix(stdout,a,N,M) ;
    ...
}

...
}

It works, but shouldn't it be wrong to call srand() inside a function that is called inside a loop?

rcgldr · May 4, 2016

In most implementations of rand(), the internal value will go through all 2^32 values only once before cycling. However the modulo operation performed afterwards will bias the results. Wiki article includes a table of some common implementations:

http://en.wikipedia.org/wiki/Linear_congruential_generator#Parameters_in_common_use

NascentOxygen · May 4, 2016

It works, but shouldn't it be wrong to call srand() inside a function that is called inside a loop?

You're talking about seeding the PRNG miltiple times? I imagine there could be consequences. So why don't you seed it just once and inside main()?

RaamGeneral · May 4, 2016

Because I can't; the main() is given and I can't modify it.
Never mind, the program is almost ready.

Thank you folks.

JorisL · May 4, 2016

The part of the code you gave seems to sample NI random matrices.
It does this by defining a matrix in the main function of the program.
Then it loops NI times, in this loop it tests if the initialization of the matrix succeeds (it initializes by filling the matrix with random numbers).
Then it does some work with the matrix to get a result (I suppose) after which the cycle starts over.

The idea of the srand in the init_percolation function is to make sure that the sampled matrices are as independent as possible.
In this case it's alright to use it I would think.

NascentOxygen · May 4, 2016

RaamGeneral said:

srand(time(NULL));

if (k==0) { srand(time(NULL)); }

EDITED

RaamGeneral · May 4, 2016

NascentOxygen, in a "stroke of genius" (and with your help) I now understand. If the professor didn't make mistakes arranging the assignment, every problem is to test the student and has a solution.
I cannot use your advice because I can't modify the main, and in init_percolation I can't check if I already called srand()... unless I use a static variable (maybe you meant this but didn't want to tell directly).

But in this program I have actually two functions that make use of srand(). Remember step() in which I have to choose randomly between possible duplicates?

The professor said in a note: "don't use global variables unless you really need it" and, because I can't use static variables "among" more functions I really need it this time (I can check if init_percolation() already called srand() but I won't know if step() did or vice-versa).EDIT: I can actually use a static variable.

Random generated doubles: can I get a number more than once?

1. Can a random generated double be the same number as a previously generated one?

2. How can I ensure that I do not get a repeated number when generating random doubles?

3. Is there a specific method for generating random doubles that guarantees no repeats?

4. Can I manipulate the range and precision of the random number generator to avoid repeated numbers?

5. What is the impact of getting repeated numbers when generating random doubles?

Similar threads

Hot Threads

Recent Insights