Random generated doubles: can I get a number more than once?

  • #1

Main Question or Discussion Point

Hello.


I recall that the probability of generating a particular real number (in a bounded or unbounded interval) is exactly zero.
If I keep generating reals for a limited amount of time, can I get a number more than once?

About the computer, I know that the amount of different values the type double (for instance) can store is not infinite. This should mean I could get duplicates, but very unlikely, right?.

I'm working on an assignment where there are a bunch of random (in an interval) doubles inside a matrix and I have to pick the position of the least one. The assignment says: "if there are more, choose randomly among them".

If the answer to my first question is "no", then theoretically I will never be in that situation. But because of the machine limitations actually I could (especially if the matrix is read from a file which contains the numbers with a small number of digits).

I'm using C and in my particular assignment I find difficult to use rand(), because I'm writing this code inside a function, where I can't call srand(time(0)) [because the function will be executed in a loop (the matrix is modified each time in a way that I can't consider that position any more)]. srand is not called in the main() and I don't write the main().


I decided to assume that the numbers are indeed all different. Even if there were some equal and are the smallest I will pick the first when iterating the matrix, and I can say something like this: I should take one of them randomly, but because they were created and put randomly inside the matrix it seems like the same to take the first.
Another explaination is this: because the program simulates a physical phenomenon (that I know nothing about, it's called invasion percolation), IF the phenomenon is deterministic, considering the same matrix, the evolution of the system should always be the same and so the evolution of the matrix calling my function in the loop.


Sorry for writing so much and the terrible english. I'd like to have some discussion about this problem.
 

Answers and Replies

  • #2
Ibix
Science Advisor
Insights Author
5,985
4,578
You will eventually get repeated numbers because, as you note, you are actually drawing from a finite list of possible values. How long it is before you expect to get a repeat, I don't know. However, IIRC the C rand() function returns a number in the range 0-65535, so if you are actually using that you must get a repeat in 65536 iterations or less. Probably much less. This is the repeated birthday problem - how many people do you need before it is 50:50 that two of them have the same birthday? 23. Apply the same methodology to your random number generator and see how likely you are to have a repeat given your matrix size.
 
  • #3
BvU
Science Advisor
Homework Helper
2019 Award
13,059
3,029
IEEE-754 double data type in C is 8 bytes. So there are ##2^{64}## (##\approx 1.8e19##) possible values (minus a few that get branded 'not a number' or ##\pm##inf). as a rule I remember you can have 15 decimal digits of precision that way, and I 'never' :smile: ran over the upper limit (1.7e308).
So you can generate quite a few doubles before you generate the same one twice, even when observing Ibix' caveat.
 
  • #4
BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.
 
  • #5
1,516
617
BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.
I'm not 100% that you'll get all numbers, it would depend on the "random" algorithm, there may be discreet values that will never be presented by that function.

I think your formula is for creating a value within a certain range? If you, your algorithm is not correct in using division, you'll end up with an integer division, which may not do what you expect it to do, you want to use modulus.
Code:
int random_between(int minimum, int maximum){
    return (rand() % (maximum + 1 - minimum)) + minimum;
}
 
  • #6
233
97
IEEE-754 double data type in C is 8 bytes. So there are ##2^{64}## (##\approx 1.8e19##) possible values (minus a few that get branded 'not a number' or ##\pm##inf). as a rule I remember you can have 15 decimal digits of precision that way, and I 'never' :smile: ran over the upper limit (1.7e308).
So you can generate quite a few doubles before you generate the same one twice, even when observing Ibix' caveat.
Like others have mention this depends entirely on how good the algorithm you are using is. You calculation is only true for a trully cryptagraphic grade random number generator. If you use the ones commonly built in you may have little or no true randomness. For instance, if you use the c rand() function naively it is completely deterministic and might repeat pretty quickly. The only way to know for sure is to look up the characteristics of the algoritm you are using or if designing it yourself, do the math.
 
  • #7
BvU
Science Advisor
Homework Helper
2019 Award
13,059
3,029
BvU.
I used the formula: a+(b-a)*rand()/RAND_MAX;
if RAND_MAX is 65535 then I really get 65536 different values, independently of how many different double I can represent in the machine.
I don't see that. As long as a and b are double you should get a heck of a lot more. Or are you generating integers ?

The expression is weird anyway. What is the intention of having a, b AND rand_max ?
 
  • #8
BvU
Science Advisor
Homework Helper
2019 Award
13,059
3,029
If you use the ones commonly built in you may have little or no true randomness. For instance, if you use the c rand() function naively it is completely deterministic and might repeat pretty quickly. The only way to know for sure is to look up the characteristics of the algoritm you are using or if designing it yourself, do the math.
I agree you are dependent on the compiler and whatever library it uses (and options settings).
 
  • #9
Ibix
Science Advisor
Insights Author
5,985
4,578
I don't see that. As long as a and b are double you should get a heck of a lot more. Or are you generating integers ?

The expression is weird anyway. What is the intention of having a, b AND rand_max ?
The basic C standard library rand() function returns an integer in the range 0-RAND_MAX, and RAND_MAX=65535. His formula converts that into a double in the range a to b. But since all the variation comes from one call to rand(), he can only get one of 65536 doubles.
 
Last edited:
  • #10
BvU
Science Advisor
Homework Helper
2019 Award
13,059
3,029
I'll accept your authority in this. No wonder I never used C but stuck to Fortran :smile: . Not just an integer function, but a 2byte function to boot, and with a name starting with the letter r . Yuck !
 
  • #11
Ibix
Science Advisor
Insights Author
5,985
4,578
By way of not pretending to be an authority, here's a reference: http://www.tutorialspoint.com/c_standard_library/c_function_rand.htm

Apparently RAND_MAX is only guaranteed to be at least 32767. The return type is int, so it may be a four-byte integer depending on implementation. The implementation in Kernighan and Ritchie uses RAND_MAX=65535; I haven't seen any other but I haven't exactly made a study of it. OP can just printf("%d",RAND_MAX); if he wants to know what his actually is.
 
  • #12
The program has these functions (that I have to write) among others:

/* allocate memory for the matrix; given, I can't modify this */
double** new_matrix(unsigned n, unsigned m);
/* set the elements of the matrix randomly */
int init_percolation (double ** mat, unsigned n, unsigned m, double a, double b);
/* load the matrix from file */
int load_matrix (FILE * f, double ** a, unsigned n, unsigned m);
/* "removes" the element in the center of the matrix */
int first_step (double ** mat, unsigned n, unsigned m);
/* of all the elements near the removed ones, remove the least */
int step (double ** mat, unsigned n, unsigned m, double a, double b);

I must use the function rand() when needed, I don't have to design something like that. This is only a simple assignment for the university, but if I'm not missing something, there is something wrong here.

The main() is given:
step() is inside a loop, and calling srand() inside a loop doesn't really work from what I know.
Now I see even init_matrix() is called in a loop so the same argument applies: the matrices will be all the same (I didn't test it yet, actually).

I really don't know how and where call the srand(), or maybe I'm not knowing the usage of this function very well.


EDIT. Now I see (then I will verify with printf) that RAND_MAX "in the GNU C Library, it is 2147483647".
 
Last edited:
  • #13
NascentOxygen
Staff Emeritus
Science Advisor
9,244
1,071
The assignment says: "if there are more, choose randomly among them".
It seems to me that, regardless of how small the likelihood of encountering identical reals, because it cannot be guaranteed to never occur your code should be written to gracefully handle that event: it's in the specifications.

The fact that you may find you can execute your program a few hundred times without encountering the special case of a repeated element means nothing. ⚄
 
  • Like
Likes BvU
  • #14
I agree.

What about the srand() problem?

Here it is my function and a portion of the main().

Code:
int init_percolation(double ** mat, unsigned n, unsigned m, double a, double b) 
{ 
   
    size_t i,j;
   
    srand(time(NULL));

    for(i=0;i<n;i++)
        for(j=0;j<m;j++)
            mat[i][j]=a+(b-a)*rand()/RAND_MAX;
   
   
    return 0;

}

int main (void){

for (k=0; k<NI; k++) { 
    if ( init_percolation(a,N,M, low, up) != 0 ) return EXIT_FAILURE;
    print_matrix(stdout,a,N,M) ;
    ....
}

....
}

It works, but shouldn't it be wrong to call srand() inside a function that is called inside a loop?
 
  • #15
rcgldr
Homework Helper
8,681
515
  • #16
NascentOxygen
Staff Emeritus
Science Advisor
9,244
1,071
It works, but shouldn't it be wrong to call srand() inside a function that is called inside a loop?
You're talking about seeding the PRNG miltiple times? I imagine there could be consequences. So why don't you seed it just once and inside main()?
 
  • #17
Because I can't; the main() is given and I can't modify it.
Never mind, the program is almost ready.

Thank you folks.
 
  • #18
489
188
The part of the code you gave seems to sample NI random matrices.
It does this by defining a matrix in the main function of the program.
Then it loops NI times, in this loop it tests if the initialization of the matrix succeeds (it initializes by filling the matrix with random numbers).
Then it does some work with the matrix to get a result (I suppose) after which the cycle starts over.

The idea of the srand in the init_percolation function is to make sure that the sampled matrices are as independent as possible.
In this case it's alright to use it I would think.
 
  • #19
NascentOxygen
Staff Emeritus
Science Advisor
9,244
1,071
  • #20
NascentOxygen, in a "stroke of genius" (and with your help) I now understand. If the professor didn't make mistakes arranging the assignment, every problem is to test the student and has a solution.
I cannot use your advice because I can't modify the main, and in init_percolation I can't check if I already called srand().... unless I use a static variable (maybe you meant this but didn't want to tell directly).

But in this program I have actually two functions that make use of srand(). Remember step() in which I have to choose randomly between possible duplicates?

The professor said in a note: "don't use global variables unless you really need it" and, because I can't use static variables "among" more functions I really need it this time (I can check if init_percolation() already called srand() but I won't know if step() did or vice-versa).


EDIT: I can actually use a static variable.
 
Last edited:

Related Threads on Random generated doubles: can I get a number more than once?

  • Last Post
Replies
4
Views
2K
  • Last Post
Replies
7
Views
3K
Replies
9
Views
1K
  • Last Post
Replies
1
Views
7K
  • Last Post
Replies
3
Views
4K
Replies
2
Views
3K
Replies
1
Views
3K
  • Last Post
Replies
2
Views
1K
Replies
4
Views
3K
  • Last Post
Replies
13
Views
3K
Top