
Finding randomization algorithm

  1. Jan 26, 2010 #1
    Hello all,

    I am trying to take a set of data I have acquired and trace it back to the algorithm that produced it. My problem is that my math only goes as far as differential equations, and I have no knowledge of advanced statistical analysis. Does anyone have advice on where I can find quality information on the subject, or on whether what I'm trying to do is even possible?

    Here is some background information:

    Basically, I am playing a video game show that asks a series of multiple-choice questions with answer choices A, B, and C. I have not yet recorded a sample of data, but once I do, I would like to figure out the algorithm that produced the sample and use it to predict future answer choices.

    Any advice and help would be much appreciated.
  3. Jan 26, 2010 #2
    Wait, you want an algorithm that produces a random number between 1 and 3, which is then converted to its alphabetical equivalent?
  4. Jan 26, 2010 #3
    Can I not assign the value 1 to A, 2 to B, and 3 to C?
  5. Jan 26, 2010 #4
    That's what I mean by saying the number is converted to its alphabetical equivalent.

    But most if not all programs have a random number generator that returns a value between 0 and 1. You could just run a loop, say 20 times, with:

    if randomnumber <= 0.33
        answer = A
    else if 0.33 < randomnumber <= 0.66
        answer = B
    else
        answer = C

    I mean, what kind of algorithm are you trying to decipher here?
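
    As a rough runnable version of the pseudocode above, here is a minimal Python sketch. The function name `random_answer` and the seed value are made up for illustration, and the thresholds are 1/3 and 2/3 so that the three answers are equally likely. Nothing here is known to be what the game actually does.

```python
import random
from collections import Counter

def random_answer(rng):
    """Map one uniform draw in [0, 1) to 'A', 'B', or 'C' with equal probability."""
    u = rng.random()
    if u < 1 / 3:
        return "A"
    elif u < 2 / 3:
        return "B"
    return "C"

# Fixed seed (arbitrary, hypothetical) so the same 20 answers come out on every run.
rng = random.Random(2010)
answers = [random_answer(rng) for _ in range(20)]

print("".join(answers))   # the 20 generated answers
print(Counter(answers))   # how often each letter came up
```

    With only 20 draws the three counts will usually be uneven, which is expected from a fair generator.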
  6. Jan 26, 2010 #5
    I am trying to decipher the algorithm a game is using: take its output (i.e. the number of A's, B's, and C's), trace it back to the algorithm that produced it, and so make predictions about future answers.
  7. Jan 26, 2010 #6
  8. Jan 27, 2010 #7
    That helps a little. Thanks zli034. But I am looking more for actual equations I can follow to fit my own data into. Anyone have any ideas?
  9. Jan 27, 2010 #8
    And you are under the assumption that this game didn't use a random number generator? You think they produced a function for how the answer changes with time?
  10. Jan 27, 2010 #9
    No. I do think they are using a random number generator. Every so often the same answer will appear 3 or 4 times in a row, and at other times it will go 20-some questions without a certain answer coming up.
    Last edited: Jan 27, 2010
  11. Jan 27, 2010 #10
    Do you think or don't you think they are using a random number generator? And the information you've given me is not a pattern, so it does not contradict the idea of a random number generator.
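
    To put rough numbers on the streaks mentioned in this exchange, here is a small sketch. It assumes each answer is A, B, or C with probability 1/3, independently of the previous answers, which is an assumption about the game and not something known.

```python
from fractions import Fraction

p = Fraction(1, 3)  # assumed chance of each answer on any single question

# Chance that 3 (or 4) consecutive answers are all the same letter:
# the first answer is free, every later one has to match it.
p_run3 = p ** 2
p_run4 = p ** 3

# Chance that one particular letter is missing from 20 consecutive questions.
p_gap20 = (1 - p) ** 20

print(f"3 in a row starting at a given question: {float(p_run3):.4f}")
print(f"4 in a row starting at a given question: {float(p_run4):.4f}")
print(f"20 questions without a given answer:     {float(p_gap20):.6f}")
```

    Under that assumption a run of 3 starts at about one question in nine, so short streaks are routine. A gap of 20 is much rarer for any single stretch of questions, but a long session offers many stretches in which it can occur.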
  12. Jan 27, 2010 #11
    Okay. I agree that it is not a pattern. So it looks like it is using a random number generator. Now what?
  13. Jan 27, 2010 #12
    Now you find the program they use for the random number generator, or quit.
  14. Jan 27, 2010 #13
    Well, that's not what I was looking for, but thanks I guess. I was initially asking for some assistance on what types of equations or mathematics I could use to try to uncover the program they used.
  15. Jan 27, 2010 #14
    dacruick essentially gave you the "equation" that you would need to figure out what program they used. All software (pseudo)random number generators produce a deterministic sequence. Once you know which RNG was used and the seed that started the sequence, you have exact knowledge of the rest of the sequence.
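
    The point about determinism can be seen with any seeded generator. This sketch uses Python's standard `random.Random`; the helper name `answer_stream` and the seed values are hypothetical, and the game may use a completely different generator.

```python
import random

def answer_stream(seed, n):
    """Return n answers from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.choice("ABC") for _ in range(n)]

first = answer_stream(12345, 15)
second = answer_stream(12345, 15)   # same seed, so the same sequence
other = answer_stream(54321, 15)    # different seed, an unrelated sequence

print("".join(first))
print("".join(second))
print("".join(other))
print("same seed reproduces the sequence:", first == second)
```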

    It seems that you are looking for some sort of method to uncover the RNG given a certain set of data. That is, you are essentially trying to decode a sequence. There is no surefire way to do this, otherwise internet poker would not have survived. If I were going to take a crack at it, I would:
    (1) Collect a number of popular uniform random number generators
    (2) Simulate sequences from a large number of seeds for each of the random number generators
    (3) See which model performs the best in terms of matching the sequence (there are a number of different ways you could define "best" here).
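
    The three steps above can be sketched as a brute-force seed search. This example tries only one candidate generator, the sample `rand()` that is widely reproduced from the C standard, and it assumes the game maps each output to a letter with `% 3`. Both of those choices, the function names, and the seed 4242 standing in for recorded data are assumptions for illustration, and "best" is taken to mean an exact match.

```python
def ansi_c_rand(seed):
    """Yield outputs of the sample rand() from the C standard, started at seed."""
    state = seed
    while True:
        state = (state * 1103515245 + 12345) % 2**32
        yield (state // 65536) % 32768

def answers_from(seed, n):
    """First n answers, assuming (hypothetically) the game uses output % 3."""
    gen = ansi_c_rand(seed)
    return "".join("ABC"[next(gen) % 3] for _ in range(n))

def matching_seeds(observed, seed_range):
    """Return every seed in seed_range that reproduces the observed answers exactly."""
    n = len(observed)
    return [s for s in seed_range if answers_from(s, n) == observed]

# Stand-in for recorded data: 25 answers produced from a seed we pretend not to know.
observed = answers_from(4242, 25)

candidates = matching_seeds(observed, range(10000))
print("seeds that reproduce the data:", candidates)
```

    The search only succeeds here because the data really did come from this generator, this mapping, and a seed inside the searched range. With real game data any of those three can be wrong, and the seed space is usually far too large to enumerate this way.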

    Depending on which RNG you use, you may also be able to calibrate its parameters to better match the sequence. Keep in mind, however, that it will probably be nothing more than dumb luck if you stumble onto a seed-RNG combination that works well. Statisticians put a lot of work into making these RNGs generate sequences that are as close to a truly random process as possible.
  16. Jan 27, 2010 #15
    Okay, thanks tjm7582
  17. Jan 27, 2010 #16
    Another complication is that even if you know the RNG algorithm, the seed is probably itself randomly generated, e.g. based on the system clock, to ensure that the same sequence isn't produced each time you switch the machine on and off.

    Although the reverse-engineering problem is a little ambitious, there are still lots of interesting statistical questions you can ask about the "black box" RNG, e.g.
    - is any answer A/B/C more likely?
    - is any answer A/B/C more likely to follow a given answer?
    - what sample size is needed to be confident of the above?

    - how many questions are in the database, given how many we've observed and how many have been asked twice?
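
    The first two questions in the list above can be checked with a few lines of standard-library Python. This is a sketch under the assumption that "fair" means each answer has probability 1/3. The function names are made up, and the sample string is artificial data chosen to make a point, not anything recorded from a game.

```python
import math
from collections import Counter

def frequency_test(answers):
    """Chi-square goodness-of-fit test against equal A/B/C frequencies."""
    counts = Counter(answers)
    expected = len(answers) / 3
    chi2 = sum((counts[c] - expected) ** 2 / expected for c in "ABC")
    # Three categories give 2 degrees of freedom, and for 2 degrees of
    # freedom the chi-square tail probability is exactly exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2)
    return counts, chi2, p_value

def transition_counts(answers):
    """Count how often each answer is followed by each other answer."""
    return Counter(zip(answers, answers[1:]))

# Artificial data: perfectly balanced letters, but in a fixed cycle.
sample = "ABC" * 10

counts, chi2, p_value = frequency_test(sample)
transitions = transition_counts(sample)

print("counts:", dict(counts))
print("chi-square:", chi2, "p-value:", p_value)
for (prev, nxt), k in sorted(transitions.items()):
    print(f"{prev} -> {nxt}: {k}")
```

    The artificial sample passes the frequency test because every letter appears equally often, while the transition counts show that A is always followed by B. That is why the second question is worth asking separately from the first. On the third question, a small p-value is only meaningful once the sample is large enough, and a common rule of thumb is an expected count of at least 5 per category.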

    Have fun in your investigations!
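
    For the database-size question in the list above, one rough approach is to match the number of distinct questions seen against what a pool of a given size would be expected to produce. The sketch below assumes every question is drawn uniformly at random, with replacement, from a fixed pool. That is an assumption about the game, and the function names and example numbers are made up.

```python
def expected_distinct(pool_size, asked):
    """Expected number of distinct questions seen after `asked` uniform draws."""
    return pool_size * (1 - (1 - 1 / pool_size) ** asked)

def estimate_pool_size(asked, distinct, upper=10**7):
    """Smallest pool size whose expected distinct count reaches `distinct`."""
    if distinct >= asked:
        raise ValueError("no repeats observed yet, so the data gives no upper bound")
    lo, hi = max(distinct, 1), upper
    # expected_distinct grows with pool_size, so a binary search works.
    while lo < hi:
        mid = (lo + hi) // 2
        if expected_distinct(mid, asked) < distinct:
            lo = mid + 1
        else:
            hi = mid
    return lo

# Hypothetical tally: 100 questions asked, 80 of them different.
estimate = estimate_pool_size(asked=100, distinct=80)
print("estimated number of questions in the database:", estimate)
```

    The estimate is only as good as the uniform-draw assumption, and with few repeats it is very uncertain, since a small change in the number of distinct questions moves it a long way.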
  18. Jan 27, 2010 #17
    Thanks bpet. I've come to realize that the questions you posed are realistically the most I am going to be able to answer.