Confused about using Monte Carlo for error estimation

kelly0303 · Apr 18, 2023

Hello! I have 2 probabilities ##p_1## and ##p_2## which governs the probability of measuring some events. I measure event 1 N times and get ##N_1## counts and event 2 N times and get ##N_2## counts. Then I need to build the function ##A = \frac{N_1-N_2}{N_1+N_2}##. I am trying to estimate the uncertainty on ##A## using Monte Carlo but I can't seem to get it right. What I am doing now is to generate N events 1 and N events 2 sampled from a binomial distribution with probability ##p_1## and ##p_2##, respectively (I am using ##p_1 = 0.06## and ##p_2 = 0.02##). I sum the events for each case and obtain ##N_1## and ##N_2## then I compute ##A##. I am doing this 10000 times and I am using the mean and standard deviation of the 10000 A values as the estimate and uncertainty for A, which should be 0.5. However, when I tried N = 100, I obtained ##0.496 \pm 0.328##, for N = 1000 I obtained ##0.499 \pm 0.097## and for N = 10000 I obtained ##0.500 \pm 0.030## (or in general very closed value to these numbers for that specific N). It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong? Do I need to divide by ##\sqrt{N}## at a point (the uncertainty seems to already scale as ~##\sqrt{N}## so I assumed this is not needed)?

FactChecker · Apr 18, 2023

kelly0303 said:

It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong?

I agree that it seems strange. You might want to make a scatter plot of the results, say for N=100, (maybe for less than 10000 runs) to see if your calculated standard deviation seems reasonable.

BvU · Apr 18, 2023

kelly0303 said:

Hello! I have 2 probabilities ##p_1## and ##p_2## which governs the probability of measuring some events. I measure event 1 N times and get ##N_1## counts and event 2 N times and get ##N_2## counts.

Can you explain -- for the rest of us

-- what it means if you 'measure an event' ?

I have measured a lot of things, but never events.

##\ ##

kelly0303 · Apr 18, 2023

BvU said:

Can you explain -- for the rest of us -- what it means if you 'measure an event' ?

I have measured a lot of things, but never events.

##\ ##

In my case I have a 2 level system, which is prepared in a given state, and it then evolves to the other state with a given probability. In this case an event is a measurement in the second state that produces a count.

FactChecker · Apr 18, 2023

kelly0303 said:

In my case I have a 2 level system, which is prepared in a given state, and it then evolves to the other state with a given probability. In this case an event is a measurement in the second state that produces a count.

One of my concerns was whether the two events you are measuring are independent. I don't know if a dependence between them might explain the phenomenon that you have observed.

kelly0303 · Apr 18, 2023

FactChecker said:

One of my concerns was whether the two events you are measuring are independent. I don't know if a dependence between them might explain the phenomenon that you have observed.

Sorry for the confusion. My question is purely theoretical now (I want to understand how to estimate an uncertainty using Monte Carlo). For the purpose of the question we can assume they are independent (the are also independent in the actual experiment, but the situation there is much more complicated that this example). Basically what I did was to generate these numbers in Python, compute A, and then get the mean and standard deviation. For reference this is all the code I used:

Code:

import numpy as np

N = 1000
M = 10000

counts_up = np.random.binomial(N,0.06,M)
counts_down = np.random.binomial(N,0.02,M)
A = (counts_up-counts_down)/(counts_up+counts_down)

print(np.nanmean(A),np.nanstd(A))

FactChecker · Apr 18, 2023

I don't know numpi, but I am very leary of reusing the exact same name of a variable and an array as you have for counts_up in line 8 (same for counts_down). Why not use a different name? You are not charged for them. ;-)

Dale · Apr 18, 2023

kelly0303 said:

It seems like the actual estimate for A is much closer to the real value (0.5) than the obtained uncertainty. Am I doing something wrong?

No. At least nothing there seems wrong to me. Why do you think there is something wrong?

I think you may be confusing the standard deviation with the standard error of the mean. The uncertainty of ##A## is ##\sigma_A##, the standard deviation of ##A##. It seems like you are concerned that your ##\bar A## is much closer to ##\mu_A## (0.5) than ##\sigma_A##. But that is not a problem. ##\bar A## has a different distribution than ##A##. The uncertainty in ##\bar A## is usually called the standard error and will be about ##\sigma_{\bar A}=\sigma_A/\sqrt{M}##. This seems reasonable based on the numbers you reported.

kelly0303 · Apr 18, 2023

Dale said:

No. At least nothing there seems wrong to me. Why do you think there is something wrong?

I am really new to MC, but I am not sure what I should get out of it. If I need to use A in other equations, where I also need to apply MC, will I need to sample A from the mean and standard deviation obtained above? Or should I use the mean and standard deviation obtained above directly?

FactChecker · Apr 19, 2023

Dale said:

No. At least nothing there seems wrong to me. Why do you think there is something wrong?

It does seem strange to me that his sample means in three cases are so close to the theoretical value that the sample standard deviations are orders of magnitude greater than the errors. I would like to see the results of a few more runs to see if that was just luck.

Dale · Apr 19, 2023

kelly0303 said:

I am really new to MC, but I am not sure what I should get out of it. If I need to use A in other equations, where I also need to apply MC, will I need to sample A from the mean and standard deviation obtained above? Or should I use the mean and standard deviation obtained above directly?

What you have is a large number of samples of ##A##. Let’s denote that as ##\{A_i\}=\{A_1,…,A_M\}##. If you need future draws of ##A## then there are three approaches:

1) you can generate a new draw directly using the generating process of ##A## that you described in the OP and in your python code.

2) you can use the sample mean, standard deviation, etc. to approximate ##A## as being from some other distribution, like a normal. Then draw from that distribution.

3) you can draw a sample of ##A## from ##\{A_i\}## with uniform probability. This is called the empirical distribution.

Dale · Apr 19, 2023

FactChecker said:

It does seem strange to me that his sample means in three cases are so close to the theoretical value that the sample standard deviations are orders of magnitude greater than the errors. I would like to see the results of a few more runs to see if that was just luck.

Why is that strange? That is what we expect.

Since ##M=10000## we expect ##\bar A## to be closer to ##\mu_A## than ##\sigma_A## by a factor of ##\sqrt{M}=100##. The data appear consistent with that

FactChecker · Apr 19, 2023

Dale said:

Why is that strange? That is what we expect.

Since ##M=10000## we expect ##\bar A## to be closer to ##\mu_A## than ##\sigma_A## by a factor of ##\sqrt{M}=100##. The data appear consistent with that

I didn't catch that. I stand corrected. The printed values are the sample mean of 10,000 values of A and the standard deviation of A, not the standard deviation of the sample mean of A. So yes, we should expect the sample mean of 10,000 to have a standard deviation smaller by a factor of ##\sqrt {10,000}##.

Confused about using Monte Carlo for error estimation

What is Monte Carlo simulation?

How does Monte Carlo simulation help with error estimation?

What are the limitations of using Monte Carlo simulation for error estimation?

How do you determine the number of simulations needed for accurate error estimation using Monte Carlo?

What are some common applications of Monte Carlo simulation in scientific research?

Similar threads

Hot Threads

Recent Insights