Puzzled about the non-teaching of Monte Carlo method(s) for error analysis

AI Thread Summary
The discussion centers on the application of the Monte Carlo method for uncertainty analysis, which is viewed as more intuitive and accurate compared to traditional analytical methods. Participants highlight the limitations of the standard deviation formula, particularly its assumptions about variable independence and linearity. While Monte Carlo techniques are praised for their ease of understanding and implementation, concerns are raised about their reliance on proper input distributions and the potential lack of insight into error sources. The conversation also touches on the educational curriculum, questioning why Monte Carlo methods are not more prominently featured in science education, despite their practical applications. Ultimately, the consensus emphasizes the importance of understanding both Monte Carlo and traditional analytical methods for effective error analysis.
  • #51
FactChecker said:
I would have to question whether the mean of a ratio is simply the ratio of the means.

Also, the ratio of two independent normally distributed random variables with zero means has a Cauchy distribution, whose mean does not exist. (I gather the idea is to estimate the true resistance as the mean of the V/I values.) The ratio of independent normals with non-zero means may be better behaved.

A paper by Marsaglia on ratios of normal random variables:

https://www.google.com/url?sa=t&rct...sg=AFQjCNEgO1dvktreWiL-rt-ZPcS3K1FmYQ&cad=rja
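
A quick simulation (my own sketch, with arbitrary parameters) illustrates the pathology: the running mean of a zero-mean ratio never settles down, while a ratio whose numerator and denominator means are many standard deviations from zero stabilizes.

```python
# Sketch: running means of two ratios of independent normals.
# Zero means give a Cauchy variate (no mean exists); means far
# from zero give a well-behaved ratio.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

cauchy_like = rng.normal(0.0, 1.0, n) / rng.normal(0.0, 1.0, n)
well_behaved = rng.normal(10.0, 1.0, n) / rng.normal(5.0, 0.25, n)

for k in (1_000, 10_000, 100_000):
    print(k, cauchy_like[:k].mean(), well_behaved[:k].mean())
# The first column of running means jumps around erratically as k
# grows; the second settles near 10/5 = 2.
```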

As to the merits of pencil-and-paper methods of "error propagation", my impression (based only on questions that appear in the forum about error propagation) is that (statistically!) instruction in the pencil-and-paper methods seems to teach a set of procedures, but does not connect those procedures with the theory of probability models. Many questioners ask how to compute "uncertainties" in things and are confident that "uncertainty" has an obvious meaning. It may indeed have a well-defined meaning in their particular technical discipline, but I don't detect that this meaning is identical to any particular concept in probability and statistics. Of course, there is a potential bias in my sample, since people with less understanding of the subject are more inclined to ask questions about it.
 
  • #52
FactChecker said:
Thanks. That answers my question. So you are measuring a fixed, but unknown, resistance using meters that have the errors you describe. The voltage and current errors are simply measurement errors that are independent of each other. That makes sense to me now. So your analysis and Monte Carlo simulation are about the distribution of the calculated resistance using Ohm's law with the measurement errors. I guess that the means of the MC voltage and current simulations are set for the specific true resistance. I would have to question whether the mean of a ratio is simply the ratio of the means.
I think the median nails it pretty accurately. The mean is going to be biased towards greater values. That's part of the reason I asked why the GUM says to report the mean and SD rather than the median and confidence intervals.
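
A minimal sketch of what I mean, with made-up numbers (true R = 2 Ω from V = 10 V and I = 5 A, Gaussian errors):

```python
# Sketch with made-up numbers: Gaussian V and I, true R = V/I = 2.
# The median of the simulated ratios lands close to the true value;
# the mean is biased upward, since E[1/I] > 1/E[I] by Jensen.
import numpy as np

rng = np.random.default_rng(1)
V = rng.normal(10.0, 0.5, 1_000_000)    # assumed 5% relative error
I = rng.normal(5.0, 0.25, 1_000_000)    # assumed 5% relative error
R = V / I

print("true R  :", 10.0 / 5.0)
print("median R:", np.median(R))   # ~2.000
print("mean R  :", R.mean())       # ~2.005, biased toward greater values
```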
 
  • #53
fluidistic said:
That's part of the reason I asked why the GUM says to report the mean and SD rather than the median and confidence intervals.

A simple argument for reporting the mean and SD:
Suppose we do N independent replications of a simulation and each replication produces one value of Y. Then, by the central limit theorem, the mean of all the Y's is approximately normally distributed for large N, provided the distribution of Y (as an individual random variable) has finite variance. (The distribution of Y need not be a normal distribution for this to hold.) So if the purpose of doing the simulation is to estimate the "true" value of Y by using the mean of the Y data, the estimated standard deviation of Y is relevant to computing confidence intervals.

In the above approach, the size of confidence intervals depends on the number of samples (replications) since that number enters into the calculation of the sample standard deviation of the Y data. By contrast, you suggest looking at a histogram and its median to compute a confidence interval. How does the size of this confidence interval depend on the number of replications you did to create the histogram? If the number you report is not a function of the number of replications of the simulation, then can it be interpreted as a "confidence interval" or should we use some other terminology for it?
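
A sketch of this argument (mine, with a deliberately non-normal Y) shows the half-width of the interval shrinking with N:

```python
# Sketch: the mean of N replications is approximately normal (CLT),
# so a standard-error confidence interval shrinks like 1/sqrt(N).
# Y is exponential here: non-normal but with finite variance.
import numpy as np

rng = np.random.default_rng(2)

def one_replication():
    return rng.exponential(scale=3.0)   # one simulated value of Y

for N in (10, 100, 1_000, 10_000):
    y = np.array([one_replication() for _ in range(N)])
    half = 1.96 * y.std(ddof=1) / np.sqrt(N)   # approx. 95% half-width
    print(N, round(y.mean(), 3), "+/-", round(half, 3))
# The half-width is an explicit function of N, which is the point.
```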
 
  • #54
Stephen Tashi said:
Also, the ratio of two independent normally distributed random variables with zero means has a Cauchy distribution, whose mean does not exist.
These variables do not have zero means.
 
  • #55
The distribution of the ratio of uncorrelated normal variables with non-zero means is discussed here. They give the equation for the variance. It uses the individual means and variances. I do not see anything about the mean. It makes me tired. :cool:
This is something where I would feel more comfortable with the Monte Carlo simulation than with my analysis, and it really is a very basic thing.
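
For what it's worth, here is a quick check (my sketch; I am assuming the linked formula is the usual first-order approximation) of that variance against a Monte Carlo estimate:

```python
# Sketch: Monte Carlo variance of V/I versus the first-order
# approximation Var(V/I) ~ (muV/muI)^2 (sigV^2/muV^2 + sigI^2/muI^2).
# Whether this matches the linked formula exactly is my assumption.
import numpy as np

rng = np.random.default_rng(3)
muV, sigV = 10.0, 1.0    # made-up numbers: 10% error on V
muI, sigI = 5.0, 0.25    # 5% error on I

R = rng.normal(muV, sigV, 1_000_000) / rng.normal(muI, sigI, 1_000_000)

approx = (muV / muI) ** 2 * ((sigV / muV) ** 2 + (sigI / muI) ** 2)
print("MC variance    :", R.var())    # ~0.052
print("first-order    :", approx)     # 0.050
```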
 
  • #56
Zero over zero is never fun.
 
  • #57
FactChecker said:
This is something where I would feel more comfortable with the Monte Carlo simulation than with my analysis, and it really is a very basic thing.

And the most basic thing is: what is the problem being analyzed? The question highlights the distinction between the usual type of "confidence interval" analysis and "error propagation" analysis.

In confidence interval analysis, we usually have data in the form of independent samples. The confidence interval analysis depends on the data and the amount of data we have.

By contrast, the example being simulated assumes some given distributions. There is no mention of how the parameters of these distributions relate to any data. It is as if the parameters of the distributions are handed to us and assumed to be exact.

The general form of confidence interval analysis is that we have a formula ##\hat{c}## for estimating some parameter ##c## of a distribution as a function of data. The usual form of a confidence interval is ##(\hat{c} - L/2, \hat{c} + L/2)## where ##L## is a positive constant. A confidence interval depends on data that has randomness, so it is a randomly generated interval. The probability associated with confidence intervals of length ##L## is the probability that such a randomly generated interval covers the true value of the parameter. (This probability is not associated with any single confidence interval. For example, a "95% confidence" is not a probability that can be attached to a single interval like ##(25.3 - L/2, 25.3 + L/2)##.)

So how do the simulation results relate to a confidence interval analysis? I think the histogram of ##V/I## is relevant to analyzing whether a single measurement (a sample of size 1) of ##(V,I)## is within a certain distance of the true value of ##R##. If multiple measurements are taken, further analysis is needed. However, the parameters of the distributions for ##V,I## used in the simulation results are not presented as estimates from data. It appears logically contradictory to analyze the situation of multiple ##V/I## measurements using a simulation that ignores them.
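
The "randomly generated interval" point can itself be checked by simulation (a sketch, with assumed parameters): repeat the experiment many times and count how often the interval covers the truth.

```python
# Sketch: "95% confidence" describes the procedure, not one interval.
# Repeat the experiment many times; about 95% of the generated
# intervals cover the true mean (assumed parameters below).
import numpy as np

rng = np.random.default_rng(4)
true_mu, sigma, n = 2.0, 0.3, 10

trials, covered = 10_000, 0
for _ in range(trials):
    x = rng.normal(true_mu, sigma, n)
    half = 1.96 * x.std(ddof=1) / np.sqrt(n)
    covered += (x.mean() - half) <= true_mu <= (x.mean() + half)

print(covered / trials)   # a bit under 0.95, since 1.96 is the normal
                          # quantile and n = 10 really calls for the t
```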
 
  • #58
It's a very simple Monte Carlo simulation where the V and I measurements are modeled as Gaussian. I think that the sample results can be used for probability distributions and confidence intervals as with any other random sample. The validity of the model to represent the particular resistance problem might be questionable.
 
  • #59
FactChecker said:
It's a very simple Monte Carlo simulation where the V and I measurements are modeled as Gaussian. I think that the sample results can be used for probability distributions and confidence intervals as with any other random sample.

I agree that the simulation is simple. But what is the analysis? For example, what is the length of a 95% confidence interval centered on the mean (or median, if we prefer) of 10 pairs of ##V/I## measurements?
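
Concretely, with made-up numbers (true R = 2), the question looks like this:

```python
# Sketch: a 95% t-interval from 10 simulated V/I pairs (true R = 2,
# made-up error sizes). The half-width answers "what is the length?"
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
r = rng.normal(10.0, 1.0, 10) / rng.normal(5.0, 0.25, 10)   # 10 ratios

half = stats.t.ppf(0.975, df=9) * r.std(ddof=1) / np.sqrt(10)
print(f"mean {r.mean():.3f}, 95% CI length {2 * half:.3f}")
```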
 
  • #60
I guess I have to take back any strong statement about the confidence intervals. I would treat the Monte Carlo sample the same as any other statistical sample.
I am also troubled by an analytical approach that lets the result be a ratio of normal random variables. Suppose we ignore the problem of division by zero. What do we say if the voltage measurement is positive and the current measurement is negative, or vice versa? Do we just throw out that possibility? And what if both happen to be negative? Since they are assumed to be uncorrelated, that is possible. Would we treat that case as more valid just because the signs are not mixed and they give a positive resistance? This resistor problem is a hypothetical problem, and I am not sure that we are modeling it properly.
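One quick check (a sketch, with the error sizes I have been assuming) is to count how often these sign pathologies actually occur:

```python
# Sketch: frequency of sign pathologies for V ~ N(10, 1), I ~ N(5, 0.25)
# (made-up numbers). A negative draw is a 10- or 20-sigma event.
import numpy as np

rng = np.random.default_rng(6)
n = 1_000_000
V = rng.normal(10.0, 1.0, n)
I = rng.normal(5.0, 0.25, n)

print("mixed signs  :", np.mean((V > 0) != (I > 0)))   # 0.0 in practice
print("both negative:", np.mean((V < 0) & (I < 0)))    # 0.0 in practice
```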
The problem hurts my head. I think I will leave this problem to others.
 
  • #61
Stephen Tashi said:
has a Cauchy distribution

You mean a Breit-Wigner, right? Or maybe a Lorentzian? :wink:
 
  • #62
FactChecker said:
The problem hurts my head. I think I will leave this problem to others.
I am certainly glad I was never "educated" in this stuff. And I used it a lot with great practical success, largely oblivious, I guess.
All I require is that the deviations (rms errors) be small compared to the means, and that they be "uncorrelated". Then the leading-order terms for the rms deviations are as described, independent of the form of the distributions. Of course, if the Taylor expansion for the functional dependence blows up there is a problem, but in the real world this seldom happens.
These techniques are extraordinarily useful and robust. In my experience the only places requiring some care are low-probability events (the wings of the distribution), where wrong assumptions will bite you. Do not be afraid. Say the magic words: "central limit theorem".
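
The leading-order recipe is short enough to write down generically (a sketch; central-difference derivatives stand in for the Taylor coefficients):

```python
# Sketch of first-order error propagation for f(x1, ..., xk) with
# small, uncorrelated rms errors: var(f) ~ sum_i (df/dx_i * sigma_i)^2,
# with the derivatives taken numerically at the means.
import numpy as np

def propagate(f, means, sigmas, h=1e-6):
    means = np.asarray(means, dtype=float)
    var = 0.0
    for i, s in enumerate(sigmas):
        step = np.zeros_like(means)
        step[i] = h
        dfdx = (f(means + step) - f(means - step)) / (2 * h)
        var += (dfdx * s) ** 2
    return np.sqrt(var)

# Example: R = V/I with V = 10 +/- 1 and I = 5 +/- 0.25 (made-up)
print(propagate(lambda p: p[0] / p[1], [10.0, 5.0], [1.0, 0.25]))
# ~0.224, i.e. about 11% of R = 2, as the quadrature sum predicts
```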
 
  • #63
First, the thing we really want is for the probability that the true value is inside the error bars to be 68%. That's not well-defined. (And to also require that the probability that the true value is inside twice the error bars be 95% is even less well-defined.) So our whole statistical technique, analytic or Monte Carlo, isn't built on a mathematically rigorous foundation. But it's the best we have. And "not mathematically rigorous" is not the same as useless - there's real meaning in comparing error bars of 1%, 5% and 10%, even if none of these are exactly what we hoped to know.

FactChecker said:
What do we say if the voltage measurement is positive and the current measurement is negative?

Here's where you have to think like a physicist, not a mathematician. If my knowledge of the current is so poor I can't tell which way it is flowing, or even if it is flowing at all, I shouldn't be using it to calculate the resistance.
 
  • #64
hutchphd said:
In my experience the only places requiring some care are low probability events (the wings of the distribution) where wrong assumptions will bite you. Do not be afraid. Say the magic words: "central limit theorem".
You are correct, of course. I was thinking of the 10% voltage error as being a huge error without realizing that a negative voltage would be 10 standard deviations below the mean. Stranger things have happened, but not since Moses.
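For the record, the rarity of that 10-sigma event:

```python
# Sketch: probability of a negative voltage reading when the relative
# error is 10%, i.e. a 10-sigma downward fluctuation.
from scipy import stats
print(stats.norm.cdf(-10))   # ~7.6e-24
```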
 
  • #65
If the goal of the simulation example was to compare the simulation technique with a theoretical pencil-and-paper technique then we'd need to see the pencil-and-paper technique worked out to make the comparison. But to see the pencil-and-paper technique worked out, we'd need to define what problem is being analyzed in the first place!
 
  • #66
hutchphd said:
The bottom line is you need to know what the hell you are doing...The Monte Carlo methods give you numbers and very little insight. If you don't understand the theory, you will not know what you are doing.
My PhD advisor said exactly the same thing to me 25 years ago.
 
  • #67
Insight and theory are great when they are correct. But things often get very complicated and confusing. If nothing else, Monte Carlo simulations can often be used to verify theoretical results or to point out errors.

Here is an example showing the difficulty of analyzing a fairly simple queueing problem: https://www.physicsforums.com/threads/waiting-time-in-a-queue-using-poisson-arrival.902175
The only way I would feel confident of the analytical results is if there were a simulation that supported them. The only way to do real work in queueing theory is with MC simulation. IMHO, the only queueing problems that can be solved analytically are trivial ones.

Here is a problem involving a dice game where the analytical solution is messy and the MC simulation is simple. https://www.physicsforums.com/threads/probability-in-a-dice-game.989492/
Again, I would not be confident of the analytical solution at all if there were not an MC simulation result to support it.
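
As an illustration of that workflow (a generic sketch, not the exact problem in the linked thread): simulate an M/M/1 queue with the Lindley recursion and compare the mean wait against the textbook formula.

```python
# Sketch: MC check of the M/M/1 mean waiting time in queue,
# Wq = rho / (mu - lambda), via the Lindley recursion
#   W[k] = max(0, W[k-1] + S[k-1] - A[k]).
import numpy as np

rng = np.random.default_rng(7)
lam, mu, n = 0.8, 1.0, 500_000    # arrival/service rates, rho = 0.8

A = rng.exponential(1 / lam, n)   # interarrival times
S = rng.exponential(1 / mu, n)    # service times

W = np.zeros(n)
for k in range(1, n):
    W[k] = max(0.0, W[k - 1] + S[k - 1] - A[k])

print("MC      :", W[n // 10:].mean())        # drop warm-up; ~4
print("analytic:", (lam / mu) / (mu - lam))   # 4.0
```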
 
  • #68
I had a professor who taught Classical Electrodynamics at the graduate level, not from Jackson, but from his own notes built exclusively on differential forms. I knew a professor who proposed to teach graduate Classical Mechanics, not from Goldstein, but using category theory alone. Both suggested these areas were sadly lacking in graduate physics education. After attending the Electrodynamics course, I had two graduate courses from Jackson. (I went to graduate school twice.)

It may seem I am in favor of the conventional treatment of the physics curriculum. I think there is danger in uniformity, and it is good for some students to have different tools in their toolbox. However, it is hard to come up with areas in the tight physics curriculum that could be left out. Certainly, including MC methods at the expense of other important topics is going to be objected to by others.

It seems like, when the poster becomes the instructor in charge of a course, he or she can then teach whatever he or she wants. There is quite a bit of academic freedom in the USA, anyway. The professor who taught differential forms did not get much pushback. The professor who proposed category theory (as far as I know) never got his course, because no student was interested in taking it.

Also, the training of a physicist contains more than just physics courses. Physicists can run into MC techniques in computer science or statistics courses. A good argument could be made that statistics and probability should be required. Maybe some would say to substitute probability for complex analysis. However, most graduate physics programs seem to value complex analysis over probability. As I wrote, you can always find somebody who feels that something overlooked should be part of the education.
 