Confused about using Monte Carlo for error estimation

In summary: No. At least nothing there seems wrong to me. Why do you think there is something wrong?I think you may be confusing the standard deviation with the standard error of the mean. The uncertainty of ##A## is ##\sigma_A##, the standard deviation of ##A##. It seems like you are concerned that your ##\bar A## is much closer to ##\mu_A## (0.5) than ##\sigma_A##. But that is not a problem. ##\bar A## has a different distribution than ##A##. The uncertainty in ##\bar A## is usually called the standard error and will be about ##\sigma_{\bar A}=\
  • #1
kelly0303
563
33
Hello! I have 2 probabilities ##p_1## and ##p_2## which governs the probability of measuring some events. I measure event 1 N times and get ##N_1## counts and event 2 N times and get ##N_2## counts. Then I need to build the function ##A = \frac{N_1-N_2}{N_1+N_2}##. I am trying to estimate the uncertainty on ##A## using Monte Carlo but I can't seem to get it right. What I am doing now is to generate N events 1 and N events 2 sampled from a binomial distribution with probability ##p_1## and ##p_2##, respectively (I am using ##p_1 = 0.06## and ##p_2 = 0.02##). I sum the events for each case and obtain ##N_1## and ##N_2## then I compute ##A##. I am doing this 10000 times and I am using the mean and standard deviation of the 10000 A values as the estimate and uncertainty for A, which should be 0.5. However, when I tried N = 100, I obtained ##0.496 \pm 0.328##, for N = 1000 I obtained ##0.499 \pm 0.097## and for N = 10000 I obtained ##0.500 \pm 0.030## (or in general very closed value to these numbers for that specific N). It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong? Do I need to divide by ##\sqrt{N}## at a point (the uncertainty seems to already scale as ~##\sqrt{N}## so I assumed this is not needed)?
 
Physics news on Phys.org
  • #2
kelly0303 said:
It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong?
I agree that it seems strange. You might want to make a scatter plot of the results, say for N=100, (maybe for less than 10000 runs) to see if your calculated standard deviation seems reasonable.
 
  • #3
kelly0303 said:
Hello! I have 2 probabilities ##p_1## and ##p_2## which governs the probability of measuring some events. I measure event 1 N times and get ##N_1## counts and event 2 N times and get ##N_2## counts.
Can you explain -- for the rest of us :wink: -- what it means if you 'measure an event' ?

I have measured a lot of things, but never events.

##\ ##
 
  • #4
BvU said:
Can you explain -- for the rest of us :wink: -- what it means if you 'measure an event' ?

I have measured a lot of things, but never events.

##\ ##
In my case I have a 2 level system, which is prepared in a given state, and it then evolves to the other state with a given probability. In this case an event is a measurement in the second state that produces a count.
 
  • Like
Likes BvU
  • #5
kelly0303 said:
In my case I have a 2 level system, which is prepared in a given state, and it then evolves to the other state with a given probability. In this case an event is a measurement in the second state that produces a count.
One of my concerns was whether the two events you are measuring are independent. I don't know if a dependence between them might explain the phenomenon that you have observed.
 
  • #6
FactChecker said:
One of my concerns was whether the two events you are measuring are independent. I don't know if a dependence between them might explain the phenomenon that you have observed.
Sorry for the confusion. My question is purely theoretical now (I want to understand how to estimate an uncertainty using Monte Carlo). For the purpose of the question we can assume they are independent (the are also independent in the actual experiment, but the situation there is much more complicated that this example). Basically what I did was to generate these numbers in Python, compute A, and then get the mean and standard deviation. For reference this is all the code I used:

Code:
import numpy as np

N = 1000
M = 10000

counts_up = np.random.binomial(N,0.06,M)
counts_down = np.random.binomial(N,0.02,M)
A = (counts_up-counts_down)/(counts_up+counts_down)

print(np.nanmean(A),np.nanstd(A))
 
Last edited:
  • Like
Likes Dale
  • #7
I don't know numpi, but I am very leary of reusing the exact same name of a variable and an array as you have for counts_up in line 8 (same for counts_down). Why not use a different name? You are not charged for them. ;-)
 
  • #8
kelly0303 said:
It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong?
No. At least nothing there seems wrong to me. Why do you think there is something wrong?

I think you may be confusing the standard deviation with the standard error of the mean. The uncertainty of ##A## is ##\sigma_A##, the standard deviation of ##A##. It seems like you are concerned that your ##\bar A## is much closer to ##\mu_A## (0.5) than ##\sigma_A##. But that is not a problem. ##\bar A## has a different distribution than ##A##. The uncertainty in ##\bar A## is usually called the standard error and will be about ##\sigma_{\bar A}=\sigma_A/\sqrt{M}##. This seems reasonable based on the numbers you reported.
 
Last edited:
  • Like
Likes FactChecker
  • #9
Dale said:
No. At least nothing there seems wrong to me. Why do you think there is something wrong?
I am really new to MC, but I am not sure what I should get out of it. If I need to use A in other equations, where I also need to apply MC, will I need to sample A from the mean and standard deviation obtained above? Or should I use the mean and standard deviation obtained above directly?
 
  • #10
Dale said:
No. At least nothing there seems wrong to me. Why do you think there is something wrong?
It does seem strange to me that his sample means in three cases are so close to the theoretical value that the sample standard deviations are orders of magnitude greater than the errors. I would like to see the results of a few more runs to see if that was just luck.
 
  • #11
kelly0303 said:
I am really new to MC, but I am not sure what I should get out of it. If I need to use A in other equations, where I also need to apply MC, will I need to sample A from the mean and standard deviation obtained above? Or should I use the mean and standard deviation obtained above directly?
What you have is a large number of samples of ##A##. Let’s denote that as ##\{A_i\}=\{A_1,…,A_M\}##. If you need future draws of ##A## then there are three approaches:

1) you can generate a new draw directly using the generating process of ##A## that you described in the OP and in your python code.

2) you can use the sample mean, standard deviation, etc. to approximate ##A## as being from some other distribution, like a normal. Then draw from that distribution.

3) you can draw a sample of ##A## from ##\{A_i\}## with uniform probability. This is called the empirical distribution.
 
  • #12
FactChecker said:
It does seem strange to me that his sample means in three cases are so close to the theoretical value that the sample standard deviations are orders of magnitude greater than the errors. I would like to see the results of a few more runs to see if that was just luck.
Why is that strange? That is what we expect.

Since ##M=10000## we expect ##\bar A## to be closer to ##\mu_A## than ##\sigma_A## by a factor of ##\sqrt{M}=100##. The data appear consistent with that
 
Last edited:
  • Like
Likes FactChecker
  • #13
Dale said:
Why is that strange? That is what we expect.

Since ##M=10000## we expect ##\bar A## to be closer to ##\mu_A## than ##\sigma_A## by a factor of ##\sqrt{M}=100##. The data appear consistent with that
I didn't catch that. I stand corrected. The printed values are the sample mean of 10,000 values of A and the standard deviation of A, not the standard deviation of the sample mean of A. So yes, we should expect the sample mean of 10,000 to have a standard deviation smaller by a factor of ##\sqrt {10,000}##.
 
  • Like
Likes Dale

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
22
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
842
  • Set Theory, Logic, Probability, Statistics
Replies
18
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
1K
Replies
67
Views
5K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
1K
  • Programming and Computer Science
Replies
1
Views
817
  • Atomic and Condensed Matter
Replies
3
Views
998
  • Set Theory, Logic, Probability, Statistics
Replies
16
Views
1K
Back
Top