Challenge Micromass' big statistics challenge

  • #51
fresh_42 said:
Another try on 9)...
Correction: I made a mistake and used the wrong denominator in one line (#words instead of #letters), so the picture of the letter occurrences is different. However, my next step would be to attempt to decode the text, now that I have some clues for setting up a bijection of the alphabet. Since this is forbidden, my last hypothesis is
9) is an English text.
and now I'll try to decode it.
 
  • #52
PeroK said:
Yes, in the first sequence there seems to be a correlation between the outcome of each toss and the previous one (it's more likely to be the same). For once I was more interested in the psychology than the maths. The normal tendency, I believe, is to do the opposite: try to even things up by favouring a Head after a Tail and vice versa. If sequence 1 is human-generated, then whoever did it appears to have known this and overcompensated by favouring another Head after a Head and another Tail after a Tail. Or, perhaps, it was @micromass doing this deliberately!
Hopefully he gives me credit for it already, but if not, a similar test could be done by observing the two coin flips after a head first appears. In that case, a simple observation is that there are often two heads following it. I didn't compute it numerically yet, but it looks as if that might give about ## 4 \sigma ## or ## 5 \sigma ## from the mean for the top sequence, and even ## 3 \sigma ## is very unlikely to happen by chance. Edit: this second test of the two subsequent coin tosses (if I computed it correctly) gave just over ## 2 \sigma ## from the mean for the first set (and not ## 4 \sigma ## or ## 5 \sigma ##), but it is still an indicator that the sequence was likely man-made.
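A minimal Python sketch of that count (one reading of the test described; the function name and the binomial ## \sigma ## are my own additions, and the ## \sigma ## is only approximate because overlapping windows are correlated):

Code:
def two_heads_after_head(seq):
    # positions of each H that still has two flips after it
    positions = [i for i, c in enumerate(seq[:-2]) if c == "H"]
    # count how often the next two flips are both H (expected 1/4 of the time)
    k = sum(seq[i + 1] == "H" and seq[i + 2] == "H" for i in positions)
    n = len(positions)
    mu = n * 0.25
    sigma = (n * 0.25 * 0.75) ** 0.5  # Binomial(n, 1/4); approximate, windows overlap
    return k, (k - mu) / sigma

seq = "THHHHTTTTHHHH"  # placeholder; paste one of the full sequences here
print(two_heads_after_head(seq))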
 
Last edited:
  • #53
Charles Link said:
Hopefully he gives me credit for it already, but if not, a similar test could be done by observing the two coin flips after a head first appears. In that case, a simple observation is that there are often two heads following it. I didn't compute it numerically yet, but it looks as if that might give about ## 4 \sigma ## or ## 5 \sigma ## from the mean for the top sequence, and even ## 3 \sigma ## is very unlikely to happen by chance.

This is also the result I seemed to get when observing the Shannon entropy, which is a (fancy but strongly physics-related) measure of randomness and is thus expected to be higher for the real sequence. I seem to have made a mistake last time, because now I also observe that the THT-sequence has higher entropy for virtually any combination.

I don't know how to make a table here in Physics Forums, but the entropies (THH-sequence / THT-sequence) are:
- lists {(n,n+1)} of two subsequent flips: 1.365 / 1.382
- lists {(n,n+2)}: 1.376 / 1.383
- lists {(n,n+30)}: 1.368 / 1.382
- lists {(n,n+100)}: 1.364 / 1.383

And for the lists {(n,n+1,n+2)} I get 2.032 / 2.072.
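A minimal Python sketch of this pair-entropy computation (using the natural logarithm, i.e. nats, which matches the values above since the maximum for pairs is ln 4 ≈ 1.386; the sequence string is a placeholder):

Code:
from collections import Counter
from math import log

def pair_entropy(seq, lag):
    # Shannon entropy (in nats) of the distribution of pairs (flip n, flip n+lag)
    pairs = Counter(seq[i] + seq[i + lag] for i in range(len(seq) - lag))
    total = sum(pairs.values())
    return -sum(c / total * log(c / total) for c in pairs.values())

seq = "THTHTTTHTTTT"  # placeholder; paste a full sequence here
print(pair_entropy(seq, 1))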

In conclusion, the THT-sequence is real and the THH-sequence is fake. A disadvantage of my approach is that, unlike the approaches from other people using the more canonical statistical methods, I cannot give an uncertainty/probability that the random sequence accidentally seemed more man-made.
 
  • Like
Likes Charles Link
  • #54
I'll let you guys continue to think about this a bit more. I'll give the answer in a few days' time.
 
  • #55
@micromass Thank you. I now have a nice little mini Enigma for Caesar ciphers, random letter encryption, word and letter counting, decoding attempts ... guess I'll have to add the RSA feature to finish it.
 
  • #56
fresh_42 said:
@micromass Thank you. I now have a nice little mini Enigma for Caesar ciphers, random letter encryption, word and letter counting, decoding attempts ... guess I'll have to add the RSA feature to finish it.

I might make a cryptography challenge later :smile:
 
  • #57
micromass said:
I might make a cryptography challenge later :smile:
I don't know whether this is good or bad news for my little playground. As far as I know you, the chances are high that you will put some noise on the channel, and I'm not sure whether I want to deal with error-correcting mechanisms. :nb)
 
  • #58
fresh_42 said:
I'm not sure whether I want to deal with error-correcting mechanisms. :nb)

Of course you do!
 
  • #59
micromass said:
I'll let you guys continue to think about this a bit more. I'll give the answer in a few days' time.
I have a more definitive statistical test for #2. The statistic that separates the first sequence from the second is the number of changes of state (from the previous flip). In 100 coin tosses, there are ## N=99 ## possible changes of state. It's a simple binomial test with ## \mu=Np ## and ## \sigma^2=Npq ## (with ## p=q=1/2 ##). This gives ## \mu=49.5 ## and ## \sigma=5.0 ##. For the first sequence I counted k=33 changes of state. This gives ## z=(49.5-33)/5.0 ##, which is slightly greater than 3 (more than ## 3 \sigma ## from the mean). For the second sequence I still need to tally it, but I want to post this before someone else beats me to it. Edit: for the second sequence I counted k=50 (approximately) transitions (my eyes aren't real steady, so the count may be off by one or two). This second case is well within ## 1 \sigma ## of the mean and is the kind of result to be expected from a random coin toss.
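A minimal Python version of this count (the sequence string is a placeholder):

Code:
def state_changes(seq):
    # number of flips that differ from the previous one
    return sum(a != b for a, b in zip(seq, seq[1:]))

seq = "THHHHTTTTHHHH"  # placeholder; paste the first 100 tosses here
N = len(seq) - 1       # possible changes of state
mu, sigma = N / 2, (N / 4) ** 0.5  # binomial with p = q = 1/2
k = state_changes(seq)
print(k, (mu - k) / sigma)         # z-score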
 
Last edited:
  • #60
Problem 1: I don't get it. In what sense "optimal"?
 
  • #61
Erland said:
Problem 1: I don't get it. In what sense "optimal"?

That's for you to decide on. I left the question very open for this purpose. All I'm giving is the number of people on the train on several days; the question is how you would decide how many seats the train should have. Clearly we don't want it to be too long, since that's a waste of money. But we don't want it to be too short either, otherwise we get complaints about people not finding seats.
 
  • #62
micromass said:
I left the question very open for this purpose.
Darn it micromass...
 
  • #63
micromass said:
That's for you to decide on. I left the question very open for this purpose. All I'm giving is the number of people on the train on several days; the question is how you would decide how many seats the train should have. Clearly we don't want it to be too long, since that's a waste of money. But we don't want it to be too short either, otherwise we get complaints about people not finding seats.
Ok, then I say that the optimal number of seats is 254. Obviously, more than 254 seats would just be a waste of money. On the other hand, if you have fewer than 254 seats, then the orcs who miss the train on Day 3 because of lack of seats would get very angry and kill you, and your life is worth more than the cost of a few extra seats.
 
  • #64
ProfuselyQuarky said:
Darn it micromass...
Guess there is a completely different solution for Japanese subways ...
 
  • Like
Likes ProfuselyQuarky
  • #65
Trains:
Assuming the number of passengers follows a Gaussian distribution, we can estimate its mean and standard deviation as 226 ± 20. If we plan for 266 seats, we have people standing with a probability of about 2%, and even then only a small number has to stand. We probably cannot choose exactly 266 seats, and it is not necessary either; something around that value should be fine.

Real passenger numbers do not follow Gaussian distributions, however - they have much longer tails. A good railroad company would check if a larger capacity might be needed (e.g. some large celebrations at Mordor or Rohan), and take a longer train (or even a second train) in that case. Assuming Middle-earth follows a weekly cycle like Earth, we might also consider Fridays and the weekends separately. A sketch of the seat calculation follows below.

Coin tosses: I'll check the numbers tomorrow, but just by looking at the sequences, the second one is completely missing longer runs of the same side; it was generated by a human.
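Back to the trains: a minimal sketch of the Gaussian estimate above (Python; the counts list is a placeholder for the daily numbers given in the problem statement):

Code:
from statistics import NormalDist, mean, stdev

counts = [200, 220, 230, 240, 250]  # placeholder values, not the real data
dist = NormalDist(mean(counts), stdev(counts))
print(round(dist.inv_cdf(0.98)))    # seat count exceeded on only ~2% of days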
 
  • #66
Another way of doing #2

If the sequence was generated by a fair coin toss, then the number of heads is ##Binomial(199,0.5)##. Since each flip is independent, the probability that flip ##n+1## is identical to flip ##n## is ##0.5##, so the number of repetitions is ##Binomial(198,0.5)##.

Let ##k## be the number of repetitions. We want to assess whether ##p = 0.5##.
The first sequence contains 117 repetitions, which gives (under a uniform prior) a posterior distribution of ##Beta(1 + 117, 198 - 117 + 1)##, which is greater than ##0.5## with probability ##>0.99##.

The second contains 96 repetitions, which gives (under a uniform prior) a posterior distribution of ##Beta(1 + 96, 198 - 96 + 1)##, which is compatible with ##p = 0.5##.
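A minimal sketch of this computation (Python with scipy; 117 and 96 are the repetition counts above):

Code:
from scipy.stats import beta

for k in (117, 96):                       # repetitions in sequences 1 and 2
    posterior = beta(1 + k, 198 - k + 1)  # uniform prior + Binomial(198, p) data
    print(k, 1 - posterior.cdf(0.5))      # posterior probability that p > 0.5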
 
  • #67
For number 6:

This really depends on what you mean by "can tell the difference". If by "guessing" we mean flipping a fair coin, then the number of successes by guessing is ##Binomial(10,p=0.5)##. Under a flat prior, 8 successes gives a posterior of ##Beta(9,3)##, which suggests that ##p > 0.5## with probability ##\approx 0.97##, which is probably enough for me to accept a low-risk claim like "can distinguish between coke and pepsi". You could do some kind of formal model comparison, but I hate that kind of stuff.
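For what it's worth, the posterior tail probability can be checked directly (a one-liner with scipy; I get about 0.97):

Code:
from scipy.stats import beta

print(1 - beta(9, 3).cdf(0.5))  # posterior P(p > 0.5) for Beta(9, 3), ≈ 0.967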

For number 1

0 seats. Nobody in Rohan is going to Mordor consensually, and Rohan doesn't want Orc refugees.
 
  • #68
Number Nine said:
The first sequence contains 117 repetitions, which gives (under a uniform prior) a posterior distribution of ##Beta(1 + 117, 198 - 117 + 1)##, which is greater than ##.5## with probability ##>0.99##.
One concern that is difficult to accurately quantify is the fact that we are all looking for statistics which will reflect some unlikely condition that is manifested by one of the two strings of coin flips. The more tests we run looking for unlikely coincidences, the more likely it is that we will find an unlikely coincidence. If, for instance, one runs 50 [independent] statistical tests on each of two random data sets then it is about 63% likely that one of those will claim "non-random" at the 99% confidence level.
 
  • #69
7.

We have 7 days, thus 7 bins, and 12 tickets. Without regard to distribution, we can say that there are C(7-1+12,12) ways to get 12 tickets. Furthermore, we have C(2-1+12,12) ways to get 12 tickets on just Tuesday and Thursday (treating the tickets as indistinguishable). Thus the professor has a 13/18564 probability of getting all tickets on Tuesday and Thursday. Whether it's worth getting a garage on those days depends on the professor's valuation of the time lost (or gained) by using a garage versus the cost of a ticket.

For example, if the garage is a 30-minute walk through a rough neighborhood and the parking ticket is 10 bucks, then the person may choose the ticket. If the garage is 1 minute further and costs 10 dollars but the ticket is 100, then the garage is probably a good idea.
 
  • #70
jbriggs444 said:
One concern that is difficult to accurately quantify is the fact that we are all looking for statistics which will reflect some unlikely condition that is manifested by one of the two strings of coin flips. The more tests we run looking for unlikely coincidences, the more likely it is that we will find an unlikely coincidence. If, for instance, one runs 50 [independent] statistical tests on each of two random data sets then it is about 63% likely that one of those will claim "non-random" at the 99% confidence level.

Yes, there are any number of statistics one could look at to quantify "non-randomness", so this will probably be a problem no matter which approach we use. The number of repetitions is just the simplest, and probably one of the most plausible places in which to find a difference, since most people tend to misjudge the expected amount of repetition when constructing "random" sequences.

The best approach might be to test a variety of features and look for consensus.
 
  • #71
Number Nine said:
Another way of doing #2

If the sequence was generated by a coin toss, then it is ##Binomial(199,0.5)##. Since each flip is independent, the probability that flip ##n+1## is identical to flip ##n## is ##0.5##, so the number of repetitions is ##Binomial(198,0.5)##.

Let ##k## be the number of repetitions. We want to estimate ##p = 0.5##.
The first sequence contains 117 repetitions, which gives (under a uniform prior) a posterior distribution of ##Beta(1 + 117, 198 - 117 + 1)##, which is greater than ##.5## with probability ##>0.99##.

The second contains 96 repetitions, which gives (under a uniform prior) a posterior distribution of ##Beta(1 + 96, 198 - 96 + 1)##, which is compatible with ##p = 0.5##.
This is compatible with my ##\chi^2## test (posts #2 and #9). However, it seems to lead to the wrong answer.
 
  • #72
I posted about #2 in posts 49, 52, and 59, but I didn't realize each sequence is 200 characters; I only analyzed the first 100 because the second 100 in each sequence requires scrolling across. I counted "changes of state" in post 59, which is essentially ##k = 199 - (\text{the repetition count})## that @Number Nine used in post 66 (and is also binomial). In any case, using Number Nine's counting, there are 82 "changes of state" and ## \sigma^2=Npq=199(1/2)(1/2)=50 ##. This gives ## \sigma=7.1 ## and ## z=(100-82)/7.1 \approx 2.5 ##. (The binomial statistics approach the Gaussian for large N. The Gaussian tells us a z of 2.5 is rather unlikely.)
 
Last edited:
  • #73
Here is my statistical analysis for the coin tosses (problem 2). Both sequences have a length of 199. Real coin tosses are independent, so we expect:

- about the same number of H and T. The first has 91 T, the second has 94, we expect 99.5, both are reasonable.
- about 50% probability that two subsequent tosses have the same result. The first sequence has this 117 out of 198 times, the second one 95 times out of 198. That 117 is a bit high, let's check further.
- sequences TTT or HHH should occur on average 197/4=49.25 times. The first sequence has this 63 times, the second one 45 times. First is a bit high here as well, but the numbers are strongly correlated.
- sequences TTTT or HHHH? We expect 24.5, we get 35 and 12 respectively. Both are off.
- T^5 or H^5? We expect 12.2, we get 20 and 1. The probability to have zero or one sequence of that length is 9.8%. The probability to have 20 or more such sequences is 16.5%.

Okay, weird. My initial impression that the second sequence is lacking runs was right, but the other one has too many, both with still somewhat reasonable probabilities.

TxT or HxH? No anomaly. TxxT/HxxH? Also no.
THT or HTH? We expect 49.25, we have 27 and 53.
THTH or HTHT? We expect 24.5, the first sequence has 8 while the second has 27.
THTHT/HTHTH? We expect 12.2, we have 4 and 13, respectively.
6 alternating? We expect 6.1, we get 1 and 6.
HTTH or THHT? We expect 24.5, we get 26 and 17.

In principle, both sequences can be generated by a human, and both can be generated randomly, so we can never be sure. Looking at the various tests described here, sequence 1 looks more odd than sequence 2. Humans normally tend to include fewer longer runs ("TTTTT" and so on) if they try to be random, but micromass (or whoever made that problem) knows about this and can manipulate the sequence.

I would expect sequence 1 to be made by a human knowing about typical biases humans have when trying to generate random sequences. The main point is the high probability to get two identical tosses in a row, and the low number of sequences of alternating coin tosses (again, correlated of course, didn't run toy studies to make a probability out of that).
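For anyone who wants to reproduce these counts, a minimal Python sketch (with 199 tosses there are 199 - n + 1 windows of length n, each monochromatic or strictly alternating with probability 2^(1-n), which matches the expectations quoted above):

Code:
def count_same_runs(seq, n):
    # windows of n consecutive identical tosses (TTT/HHH for n = 3, etc.)
    return sum(len(set(seq[i:i + n])) == 1 for i in range(len(seq) - n + 1))

def count_alternating(seq, n):
    # windows of n strictly alternating tosses (THT/HTH for n = 3, etc.)
    return sum(all(seq[i + j] != seq[i + j + 1] for j in range(n - 1))
               for i in range(len(seq) - n + 1))

seq = "THHHHTTTTHHHH"  # placeholder; paste a full sequence here
for n in (3, 4, 5):
    expect = (len(seq) - n + 1) / 2 ** (n - 1)
    print(n, expect, count_same_runs(seq, n), count_alternating(seq, n))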
 
  • Like
Likes PeroK and Charles Link
  • #74
micromass said:
Take the following two sequences of coin tosses:

One of these sequences is from an actual coin toss experiment. The other is invented by a human. Find out which of these is which.
Well, after close examination, I have determined that the real coin toss was the second one. The probability of a string of 6 or more of the same result is less than or equal to 1/64. Still possible. Having a string of 6 or more of the same result 4 times has a probability of 1/1024; the first one had more than 4 strings of 6 or more, the second had none of them. I thus say that the first is fake and the second is real.
 
  • #75
Oh, mfb beat me to it. I didn't even read any of the other responses before I posted.
 
  • #76
micromass said:
Given the following encoded text, find out whether this is a real text or randomly generated using some scheme. Attempting to decode the text doesn't count.
Man-made (real text). Definitely. A few of the most common letters are y, u, i, h and j, letters that are all next to each other. Of course, everyone thinks to put in a lot of spaces, and this text has plenty of them. What letter doesn't appear at all? The letter nobody thinks about, at the bottom left corner of the keyboard: z. The probability of z not appearing at all is infinitesimally small.
 
  • #77
Isaac0427 said:
Man-made (real text). Definitely. A few of the most common letters are y, u, i, h and j, letters that are all next to each other. Of course, everyone thinks to put in a lot of spaces, and this text has plenty of them. What letter doesn't appear at all? The letter nobody thinks about, at the bottom left corner of the keyboard: z. The probability of z not appearing at all is infinitesimally small.

The text doesn't have to be generated from a uniform distribution, it could have been generated some other way. That said, I agree that it was probably man-made. The character distribution roughly matches the letter frequencies of the English language, so I assume that it was generated using some kind of substitution cipher (although, amusingly, he seems to have replaced the e's with spaces).

[Attached image Rplot.png: ciphertext character frequencies]
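The frequency count itself is a few lines in any language; a Python sketch (the ciphertext placeholder here is not the actual problem text):

Code:
from collections import Counter

ciphertext = "wkh txlfn eurzq ira"  # placeholder; use the actual problem text
letters = [c for c in ciphertext.lower() if c.isalpha()]
freq = Counter(letters)
for ch, n in freq.most_common():
    print(ch, f"{100 * n / len(letters):.1f}%")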
 
  • #78
Isaac0427 said:
Well, after close examination, I have determined that the real coin toss was the second one. The probability of a string of 6 or more of the same result is less than or equal to 1/64. Still possible. Having a string of 6 or more of the same result 4 times has a probability of 1/1024; the first one had more than 4 strings of 6 or more, the second had none of them. I thus say that the first is fake and the second is real.
I don't know where the 1/1024 comes from, but it is not right. The 1/64 applies to a specific position only.
 
  • #79
micromass said:
That's for you to decide on. I left the question very open for this purpose. All I'm giving is the number of people on the train on several days; the question is how you would decide how many seats the train should have. Clearly we don't want it to be too long, since that's a waste of money. But we don't want it to be too short either, otherwise we get complaints about people not finding seats.

I know some train companies in England who would take all the seats out (if they were allowed to), have everyone stand throughout the journey in order to minimise the number of carriages, and call that optimal. Then double the season-ticket prices (if they were allowed to)!
 
  • Like
Likes micromass
  • #80
A professor got a ticket twelve times for illegal overnight parking. All twelve tickets were given either Tuesdays or Thursdays. Is it justified for him to rent a garage on these days?
Insufficient information. How much does renting a garage cost, how much do the tickets cost, averaged over all the times he parked there on Tuesdays and Thursdays?
Does he park there on other days as well? If he parks there on all work days with equal frequency and gets tickets with the same probability, the probability that all parking tickets are limited to two weekdays is just 0.00017. While no mathematical proof is possible, I would expect that those days are indeed more "dangerous" than the others. It is not unreasonable to have more checks on two specific days.
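Presumably the 0.00017 comes from something like the following (a sketch assuming 12 independent tickets, uniform over the 5 workdays, counting any of the C(5,2) = 10 pairs of days; the small overlap from all tickets landing on a single day is ignored):

Code:
from math import comb

p_one_pair = (2 / 5) ** 12            # all 12 tickets on one fixed pair of workdays
p_any_pair = comb(5, 2) * p_one_pair  # any of the 10 pairs (slight overcount)
print(p_any_pair)                     # ≈ 0.00017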
 
  • Like
Likes QuantumQuest
  • #81
Here are my thoughts on number 10:

How did the psychic get his infallibility rating? It seems fairly obvious to take both boxes. If the psychic always predicts that the player will take both boxes, then in 99.9% of cases this is what happens. Only 1 in a thousand decides simply to take box A.

I reckon box B has $1M in it.
 
  • #82
I knew box problems like 10 already*, so here is an unconventional approach, as a psychic is involved:

I would try to find someone convinced of the psychic's ability, make a $1000-to-$10000 bet with them that the psychic is wrong, and take box B only. It is a win/win situation: Psychic put money in? I gain $990,000. Psychic didn't put money in? I gain $1000 (the same as I would by taking two boxes), and collect additional evidence that psychics do not exist.

* ;)
 
  • #83
#5
micromass said:
I have a big box filled with balls. All balls have a number. I draw 5 balls at random and record their number. They are: 10, 50, 104, 130, 213. How many balls do you expect to be in the box?

At the beginning of the experiment, there were at least five balls. Any larger estimates require assumptions about the manner in which the balls were numbered.
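For illustration, if one does assume the balls are numbered 1 through N (the classic "German tank" setup), the standard minimum-variance unbiased estimate would be:

Code:
nums = [10, 50, 104, 130, 213]
k, m = len(nums), max(nums)  # sample size and sample maximum
print(m * (1 + 1 / k) - 1)   # m(1 + 1/k) - 1 ≈ 254.6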

source: https://www.reddit.com/r/statistics/comments/1du3r0/favorite_statistics_joke/
3 Americans are on a train through Scotland: a statistician, a physicist and a mathematician. They all see a brown cow out the window.

The statistician says "Oh, cows in Scotland must be brown!"

The physicist says "Well, we know there's a brown cow in Scotland."

The mathematician says "Not quite! We know there is at least one cow in Scotland, and at least half of it is brown!"
 
  • #84
  • Like
Likes mfb and Ygggdrasil
  • #87
I guess mine was too bad, or too hidden :(.

The numbers remind me of a story I saw a while ago (not sure if it really happened): Some students released three pigs at a university, and had them labeled as "1", "2" and "4". The search for pig "3" took quite some time!

Problem 4 needs some love. I'll assume that a detection has the same probability for all x between 1 and 20, otherwise we have insufficient information to start working on it. As we do not know the total number of particles, the distribution within the interval is the only thing we can use. We expect an exponential distribution; its shape is invariant under shifts, so for simplicity subtract 1 from all measured values and from the range: we measure from 0 to 19. Unfortunately, within the small experimental sample, the best fit is a flat distribution. Looking at the data, this is not surprising, as we have 4 out of 6 decays in the second half. We cannot set an upper limit on λ. For each event, we can calculate a likelihood:
$$L(x)=\frac {e^{-x/\lambda}} { \lambda \left(1-e^{-19/\lambda}\right) }$$
Calculate the product of all 6 events for a total likelihood, and take the negative logarithm of it for a nice scaling.
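A minimal Python sketch of that scan (the six data values here are placeholders, not the real measurements):

Code:
from math import exp, log

def nll(lam, xs, width=19.0):
    # negative log likelihood for an exponential truncated to [0, width]
    norm = lam * (1 - exp(-width / lam))
    return -sum(log(exp(-x / lam) / norm) for x in xs)

xs = [2.0, 5.0, 11.0, 13.0, 16.0, 18.0]  # placeholder values only
for lam in (2, 5, 10, 50, 1e6):
    print(lam, round(nll(lam, xs), 3))   # flattens out for large lam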

In images, first the distribution for this dataset, then the distribution for a "more normal" dataset, with more events for small x:

[Attached image loglikelihood_orig.png: given data, red line at the limit for λ → ∞]


[Attached image loglikelihood_normal.png: example data with a more usual distribution, red line at the limit for λ → ∞ again]


A particle physicist would now probably look for the range where the negative log likelihood is not more than ##1.92## above the minimum (half the 95% quantile of the ##\chi^2## distribution with one degree of freedom, ##3.84/2=1.92##), which leads to ##\lambda > 4.0## at 95% confidence level.
 
  • #88
Problem 1) The answer is obviously 0...

- Anyone traveling from Rohan to Mordor will be wanting to take their horse as well. Therefore much more room is needed on each carriage, and horses will make quite a mess on such a journey.

- Anyone/anything traveling from Mordor to Rohan is either an orc or a troll, or something even nastier, so they can bloody-well stand. (There won't be any Rohirrim returning from Mordor to Rohan, since they and their horses will have been eaten by the trolls. That's why the trolls want to go to Rohan -- for 2nd helpings.)

But on 2nd thoughts, the answer is: Go out and shoot whoever had the bright idea of building a Rohan--Mordor railway in the first place! :doh:

:oldbiggrin:
 
  • Like
Likes micromass
  • #89
micromass said:
That's for you to decide on. I left the question very open for this purpose. All I'm giving is the number of people on the train on several days; the question is how you would decide how many seats the train should have. Clearly we don't want it to be too long, since that's a waste of money. But we don't want it to be too short either, otherwise we get complaints about people not finding seats.

But depending on how we value "cost incurred by passenger in not finding a seat" vs. "cost incurred by company in making a seat available that is unused", the answers would differ. If you do not make those values clear upfront, how can an answer be found?
 
  • #90
DocZaius said:
But depending on how we value "cost incurred by passenger in not finding a seat" vs. "cost incurred by company in making a seat available that is unused" the answers would differ. If you do not make those values clear upfront, how can an answer be found?

You assume the answer is unique. It is not. You can make all assumptions you want, just clearly state them and try to be somewhat realistic.
 
  • #91
The answer to 2: the first one is the real random sequence.
The answer to 9: the text is a simple Caesar code of an English text. Up to you to decode it to see what it says. I think fresh_42 already did this.
 
  • #92
micromass said:
The answer to 2: the first one is the real random sequence.
The answer to 9: the text is a simple Caesar code of an English text. Up to you to decode it to see what it says. I think fresh_42 already did this.


And now I'm struggling with an overflow in RSA ...
 
  • Like
Likes ProfuselyQuarky
  • #93
micromass said:
The answer to 2: the first one is the real random sequence.
An odd one, however. So my initial impression was right, and then I overthought the problem.
 
  • #94
mfb said:
An odd one, however. So my initial impression was right, and then I overthought the problem.
Yep. One more brick in my wall of discomfort when I hear someone, esp. politicians, argue with statistics.
 
  • Like
Likes micromass
  • #95
This one (#2) surprised me too, but the probability of either sequence coming up (again) exactly the same in a random flipping of 200 coins is ## p=(1/2)^{200} ##.
 
  • #96
Charles Link said:
This one (#2) surprised me too, but the probability of either sequence coming up (again) exactly the same in a random flipping of 200 coins is ## p=(1/2)^{200} ##.
Well.. yes, but that's not what the question was about.
 
  • #97
mfb said:
Well.. yes, but that's not what the question was about.
It would appear that when micromass selected a "random" coin flip sequence, the unlikely occurred. A couple of simple tests showed it wound up ## 2 \sigma ## or more from the mean. Not too unlikely, but not the most common case.
 
  • #98
I agree that the answer to #2 is a bit odd. The standard way to test for randomness in a dataset (or at least to test whether events are independent) is the Wald-Wolfowitz runs test. Dataset 1 contains 108 heads and 91 tails, so one would expect to see 99.8 runs with a standard deviation of 6.98. The actual number of runs observed was 82, which is 2.5σ below expected, yielding a p-value ~ 0.013.

Dataset 2 contains 105 heads and 94 tails, so one would expect to see 100.2 runs with a standard deviation of 7.01. The actual number of observed runs is 104, just 0.5σ away from expected, yielding a p-value ~ 0.64.

(Calculations done with the runstest function in MATLAB)
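The same statistic is easy to reproduce without MATLAB; a small Python version of the Wald-Wolfowitz computation (it reproduces the 99.8 and 6.98 figures for 108 heads and 91 tails):

Code:
from math import sqrt

def runs_test(seq):
    n1, n2 = seq.count("H"), seq.count("T")
    runs = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return runs, mu, sqrt(var), (runs - mu) / sqrt(var)

seq = "THHHHTTTTHHHH"  # placeholder; paste a full sequence here
print(runs_test(seq))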
 
  • Like
Likes Charles Link
  • #99
micromass said:
  1. Take the following two sequences of coin tosses:

    Code:
    THHHHTTTTHHHHTHHHHHHHHTTTHHTTHHHHHTTTTTTHHTHHTHHHTTTHTTHHHHTHTTHTTTHHTTTTHHHHHHTTTHHTTHHHTHHHHHTTTTTHTTTHHTTHTTHHTTTHHTTTHHTHHTHHTTTTTHHTHHHHHHTHTHTTHTHTTHHHTTHHTHTHHHHHHHHTTHTTHHHTHHTTHTTTTTTHHHTHHH

    Code:
    THTHTTTHTTTTHTHTTTHTTHHHTHHTHTHTHTTTTHHTTHHTTHHHTHHHTTHHHTTTHHHTHHHHTTTHTHTHHHHTHTTTHHHTHHTHTTTHTHHHTHHHHTTHTHHTHHHTTTHTHHHTHHTTTHHHTTTTHHHTHTHHHHTHTTHHTTTTHTHTHTTHTHHTTHTTTHTTTTHHHHTHTHHHTTHHHHHTHHH

    One of these sequences is from an actual coin toss experiment. The other is invented by a human. Find out which of these is which.

What about the following solution:
We can assume that if a human invents a binary string, then one or both of the following things may happen:

1) the human does not (or can not) keep track of how many H's and T's he has previously generated, so he/she ends up creating a string where the observations are highly biased towards H (or T).

2) the human does not (or can not) remember the whole sequence of H's and T's he/she has generated so far, so he/she tends to generate new observations based on the (few) previous observation(s).

Based on the above assumptions I would propose a decision rule based on the result of the two following tests:

Test #1: Let's call ##\alpha## the probability of getting H in a coin toss. Estimate ##\alpha## by calculating the arithmetic average of H's in the string (this is a maximum likelihood estimator) and call this number ##\hat{\alpha}##. Perform a likelihood-ratio test between the probability of obtaining the given string under the two hypotheses ##\alpha=\hat{\alpha}## and ##\alpha=\frac{1}{2}##. If the likelihood of the former hypothesis is higher, then conclude that the string was generated by a human. This test can be expressed by checking whether ##\hat{\alpha}## exceeds a certain threshold: ##\hat{\alpha} < \frac{\ln \left( 2(1-\hat{\alpha}) \right)}{\ln(\hat{\alpha}-1)}##

Test #2: Consider two consecutive observations X, Y. Estimate the joint probability distribution ##P_{X,Y}## by considering all the consecutive pairs of observations in the given string and by filling a 2x2 contingency table of the occurrences of the four sequences HH, TH, HT, TT in the given string. Perform a ##\chi^2##-test of independence. If the hypothesis of independence is rejected, then we can deduce that there was probably a correlation between consecutive observations. In such a case, we conclude that the string was generated by a human.

The two strings provided in the original post both pass Test #1, but the first string does not pass Test #2: the corresponding p-values were 0.0164 for the first string and 0.4944 for the second string, with the threshold set to 0.05. Thus we conclude that the first string was generated by a human.
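Test #2 can be reproduced with scipy's contingency-table test; a minimal sketch (the sequence string is a placeholder; note that scipy applies Yates' continuity correction to 2x2 tables by default, which can shift the p-value slightly):

Code:
from collections import Counter
from scipy.stats import chi2_contingency

def pair_independence_pvalue(seq):
    pairs = Counter(zip(seq, seq[1:]))
    table = [[pairs[("H", "H")], pairs[("H", "T")]],
             [pairs[("T", "H")], pairs[("T", "T")]]]
    chi2, p, dof, expected = chi2_contingency(table)
    return p

seq = "THTHHTTHHH"  # placeholder; paste a full sequence here
print(pair_independence_pvalue(seq))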

As an additional remark I can see a possible connection between this problem and the theory of data compression. Data that is intelligible (or generated) by a human often contains redundancies which are typically due to the correlation of different portions of data. Such correlations allow compression algorithms to predict the successive portions of a stream of data, given the knowledge of the previous data. This is typically not true for random noise, which is why a realization of true random noise is difficult to compress.
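That connection is easy to poke at directly: compress both strings and compare (a rough sketch; on strings this short the compressor's fixed overhead matters, so only the relative comparison between the two sequences is meaningful):

Code:
import zlib

seq = "THTHHTTHHH"  # placeholder; paste a full 199-toss sequence here
packed = zlib.compress(seq.encode(), level=9)
print(len(packed) / len(seq))  # lower ratio = more redundancy detected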
 
Last edited:
  • #100
mnb96 said:
1) the human does not (or can not) keep track of how many H's and T's he has previously generated, so he/she ends up creating a string where the observations are highly biased towards H (or T).

2) the human does not (or can not) remember the whole sequence of H's and T's he/she has generated so far, so he/she tends to generate new observations based on the (few) previous observation(s).
The random generation doesn't keep track of that by definition, a human must reproduce that in order to generate a plausible random distribution.
mnb96 said:
If the likelihood of the former hypothesis is higher, then conclude that the string was generated by a human.
It will always be higher, unless we happen to have exactly as many T as H, which is unlikely for a randomly generated string.
Also, ##\hat \alpha \leq 1##, so ##\ln(\hat \alpha -1)## is undefined.
mnb96 said:
If the hypothesis of independence is rejected, then we can deduce that there was probably a correlation between consecutive observations. In such case, we conclude that the string was generated by a human.
A randomly generated string will have correlations, because HH can follow after TH or HH, but not after HT or TT.
 