Pathological PDFs. eg: ratio of normals... including Cauchy. |
| Aug3-12, 01:36 PM | #18 |
Chiro,
I made an out-of-context remark regarding Marsaglia. He is quite justified in going to the trouble of using a 50-digit number. My statement that his approximations were "crude" was based on the general notion of approximation vs. exact integrals. I didn't mean to give the impression that his work is garbage, and after re-reading my quick remark I realized you might have taken my comment that way. Until I actually give my own approximation, I have no idea whether or not I can trump his. Cheers. --Andrew.
| Aug3-12, 03:15 PM | #19 |
If you go back to the earlier thread from which this one branched out and look for Chiro giving me a suggestion on computing the integral -- that's where Fourier came into the discussion...
All distributions using Gaussian randoms implicitly come from a limiting approach based on samples. The Gaussian was discovered by asking: if a (discrete) measurement is repeated, what is the probability of a given error? To answer the question, Gauss took a limit for infinite repetitions, and the bell curve was arrived at. There is, then, in each distribution (Cauchy included) an implicit limit going from the discrete/finite to the continuous.
If the continuous integrals for the mean had always diverged toward positive infinity or negative infinity, there would be no question about what theory predicted. But the moment theory arrives at infinity minus infinity -- that isn't a result of any kind. If that is true, then an estimate of 0 is as good as any other estimate -- they are all *equally* bad.
BUT: in a real experiment, there is going to be a sampling granularity -- and an actual finite value for the mean. When I hear that the integral does not converge, I understand that the QUESTION was somehow asked wrong -- and the boundary conditions need to be looked at to decide why. Simulations are a way to look for problems and inconsistencies, because they indicate the results a real measurement would report. The problem may be resolved in many different ways without changing the theory itself. But I need *quantitative* models of how a simulation of the Cauchy distribution using discrete samples is going to differ from the continuous case.
I think (intuition) what we're missing is the idea of a confidence interval: how likely an experiment is to "hit" the div/zero discontinuity closely enough to disturb an otherwise tranquil mean... (Don't answer that! It's in English -- and I need to quantify it.)
I am looking at your equations, and I don't find them objectionable -- but I'm not sure they give the same result that I have... (I'll comment later.) The question, to me, is not whether we condition the distributions -- the question is how, on what basis, and why. As I said in the OP, I'll have to explain some of this later in the thread (I'm working on it, hard...). When I wrote the OP, I was expecting corrections inside the derivation; your approach is unexpected, and that's difficult for me to adjust to. I'm trying.
| Aug3-12, 04:22 PM | #20 |
In contrast, if I now take this new view you are giving: I implicitly gave the information about how the Gaussians were conditioned in my derivation, even if I didn't know I was doing it. At most it's a notational issue... and I asked people for help with formalizing the derivation, intentionally attempting to uncover issues like this one directly. Hindsight, of course... Hopefully, tomorrow will be better.
| Aug3-12, 09:38 PM | #21 |
Stephen, I think this will be a simple idea...
The Cauchy has a CDF of [itex] {1 \over 2} + {1 \over \pi} \tan^{-1}(x) [/itex]. When estimating a mean, I might think like this (assume purely positive numbers): the mean must be between the lowest value in the sample and the highest. If I assume the highest value takes the whole weight of the average (n samples times the highest value, divided by n), then the mean must be lower than or equal to that value. Hence, I can ask -- what is the greatest distance from 0 that the mean could be?
90% of the time, the largest sample will be no higher than 6.31376; hence the weight of that is 5.68... (0.9 * 6.31376)
99% of the time, the largest sample will be no more than 63.65675; hence the weight of that is 5.72... (0.09 * 63.65675)
99.9% of the time -- 636.61925, and weight = 5.72... (0.009 * 636.61925)
So I can expect, with some confidence, that a mean of no more than 18 will be computed at least a certain definite percentage of the time, depending on the number of samples taken. Hence, it doesn't seem right to say the mean is totally unbounded -- there must be some kind of relationship between sample size and a typical mean. If there were no rhyme or reason to the mean, all values would be equally likely. But that's not the case...
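The thresholds and weights above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, under the assumption that each threshold is read as a quantile of |x| for a single draw from the standard Cauchy; the function names `abs_cauchy_quantile` and `band_weights` are made up for illustration.

```python
import math

def abs_cauchy_quantile(p):
    """Return x such that P(|X| <= x) = p for a standard Cauchy X."""
    # From the CDF 1/2 + atan(x)/pi it follows that P(|X| <= x) = (2/pi) * atan(x).
    return math.tan(math.pi * p / 2.0)

def band_weights(levels):
    """For each level: (level, upper edge, probability of the band * upper edge)."""
    rows, previous = [], 0.0
    for p in levels:
        edge = abs_cauchy_quantile(p)
        rows.append((p, edge, (p - previous) * edge))
        previous = p
    return rows

rows = band_weights([0.9, 0.99, 0.999])
for p, edge, weight in rows:
    print(f"{p:6.3f}  edge = {edge:10.5f}  weight = {weight:.4f}")
print("sum of weights:", sum(w for _, _, w in rows))
```

Note that the sum ignores the remaining 0.1% of the tail, which is exactly where the unbounded contribution sits, so it bounds only the censored part of the mean.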
| Aug3-12, 09:47 PM | #22 |
Basically, it looks like what you are doing is talking about a different distribution with each level of 'confidence'.
One suggestion I have is that for the X/Y problem, you should modify Y so that you exclude a neighbourhood of 0 (i.e. you censor this region so that P(-e < Y < e) = 0 for some epsilon e) and then recompute the density function. The idea of using a pure Gaussian for the denominator, even for something like NIST, is absolutely stupid, and if they want to use X/Y without any modification, then they are going to have to deal with the case of no moments existing.
Since all you are doing is effectively changing the distribution for each level of confidence, you are probably IMO better off just creating the distribution you intended and then calculating the mean in the way that it is defined, rather than trying to fudge the calculation of the mean for a distribution where it does not exist. This way, you'll keep to the definitions (which are there for a reason: they work both theoretically and practically), and you will be able to clarify your assumptions through the definition of the actual distribution (for example, censoring the region around 0 is there to get rid of division by numbers close to zero).
(Also remember that if you do censoring, you have to renormalize the distribution to make sure it integrates to 1.)
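To make the renormalization step concrete, here is a minimal sketch of a Gaussian density with the window (-e, e) cut out and the remainder rescaled. The parameter values (mu = 1, sigma = 1, e = 0.1) and the helper names are assumptions chosen only for illustration, and the integral is checked with a plain midpoint rule rather than anything exact.

```python
import math

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def make_censored_pdf(mu, sigma, eps):
    """Density of N(mu, sigma) with (-eps, eps) removed and the rest renormalized."""
    removed = normal_cdf(eps, mu, sigma) - normal_cdf(-eps, mu, sigma)
    def pdf(y):
        if -eps < y < eps:
            return 0.0                      # censored window around zero
        return normal_pdf(y, mu, sigma) / (1.0 - removed)
    return pdf

def integrate(f, a, b, steps=100_000):
    """Midpoint rule; crude but adequate for a sanity check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

g = make_censored_pdf(mu=1.0, sigma=1.0, eps=0.1)
print("integral of censored density:", integrate(g, -9.0, 11.0))
```

The printed integral should come out very close to 1; without the division by (1 - removed) it would fall short by the probability mass of the window.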
| Aug4-12, 01:30 AM | #23 |
I'm not a glutton for punishment... But consider a realistic case: there is a wall 2000(1) mm away. Divide that distance by a *SLOPPY* 1 meter scale 1000(1) mm long. What are the results? Well, the result is obviously about 2 (on mean). However, this still factors into a form which Marsaglia treats as having a Cauchy. (I forget exactly what he said about that -- my eyes glazed over...)
N(2000,1) / N(1000,1) = (2000 + N(0,1)) / (1000 + N(0,1)) = 2000/(1000 + N(0,1)) + N(0,1)/(1000 + N(0,1))
My solution so far is just for the first part -- and even that is supposedly invalid because I used a Cauchy principal value... Yet the second part is the only thing really Cauchy-distribution-like in the problem, so I assume that's a Cauchy with the mode offset in one direction or the other.... But in any event, it's not possible to avoid having that second part in the equation -- and if its mean can be anything, then the sum of the two means can be anything -- and, well, the *very* practical problem has just become theoretically impossible when ACTUALLY using the theory and not faking it. I find that quite perplexing. Gosh! The zero point is 1000 sigmas away from the denominator's mean. It *ain't* gonna happen... BUT -- it's still a Cauchy?!!!!!
Now, let's talk about the choice of "theory". There is no reason both of these lengths couldn't be measured over and over -- so a Gaussian is the most appropriate distribution. But once we do a ratio, we are going to have a Cauchy and a 1/Gaussian. WOW.
But, yes, that sounds possible -- although I need to start with the confidence interval: e.g. I need someone to be able to tell me they want my result 99.9% certain, and then I need to compute how much of the zero divide to censor... THEN I can do it. I'm going to sleep on it tonight... Dunno....
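For what it's worth, the wall example is easy to simulate. The sketch below draws independent N(2000,1) and N(1000,1) values with Python's standard `random` module and reports the sample mean and spread of the ratio; the function name, seed, and sample count are arbitrary choices, and a finite simulation like this says nothing about whether the theoretical mean exists.

```python
import random
import statistics

def simulate_ratio(n, seed=0):
    """Sample mean and standard deviation of N(2000,1)/N(1000,1) over n draws."""
    rng = random.Random(seed)
    ratios = [rng.gauss(2000.0, 1.0) / rng.gauss(1000.0, 1.0) for _ in range(n)]
    return statistics.fmean(ratios), statistics.stdev(ratios)

mean, spread = simulate_ratio(100_000)
print("sample mean:", mean)
print("sample std :", spread)
```

In practice no draw of the denominator comes anywhere near zero, so the simulated ratio behaves like an ordinary narrow distribution centred near 2, which is the tension being described.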
| Aug4-12, 02:06 AM | #24 |
For the practical part, the first thing to focus on is getting the distribution for 1/X where X is censored and then look at Y/X after you get the censored distribution for 1/X.
It's best if you leave the 1/X distribution in terms of the e mentioned above, so that later you can see how this e affects the calculation of the mean of Y/X. This solves your problem of analyticity, and you can use it to compare how many standard deviations you need to get a mean of a particular value, by looking at how the epsilon affects the final calculation of E[Y/X], where Y is Gaussian and 1/X is the inverse of your censored variable.
You can simulate this extremely easily by using a method to simulate from a censored distribution (an MCMC approach will do this) and then simply simulating from the Gaussian, giving a simulation for Y/X.
The assumption of censoring is one that can be quantified in the context of more general assumptions in the domain (i.e. engineering) by considering the nature of what is being calculated (i.e. the scales of things, what these things are) in relation to the epsilon used in the censoring process.
I think that the above suggestion will help you not only derive a distribution and ultimately a mean using censorship around 0 for the denominator RV, but also actually quantify the characteristics and how the epsilon changes the value of not only the mean, but the other moments as well.
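A rough sketch of that simulation follows. Instead of MCMC it uses plain rejection sampling (redraw whenever the denominator lands inside the window), which is a swapped-in simplification; the means, sigma, epsilons, and function names are all assumptions picked for illustration.

```python
import random
import statistics

def sample_censored_gauss(rng, mu, sigma, eps):
    """Draw from N(mu, sigma) conditioned on |x| >= eps, by rejection."""
    while True:
        x = rng.gauss(mu, sigma)
        if abs(x) >= eps:
            return x

def mean_ratio(mu_y, mu_x, sigma, eps, n, seed=0):
    """Sample mean of Y/X with Y ~ N(mu_y, sigma) and X a censored N(mu_x, sigma)."""
    rng = random.Random(seed)
    ratios = [
        rng.gauss(mu_y, sigma) / sample_censored_gauss(rng, mu_x, sigma, eps)
        for _ in range(n)
    ]
    return statistics.fmean(ratios)

# A denominator centred only one sigma from zero, so the window actually matters.
for eps in (0.5, 0.1, 0.01):
    print(eps, mean_ratio(mu_y=2.0, mu_x=1.0, sigma=1.0, eps=eps, n=50_000))
```

Rerunning with different seeds for each epsilon gives a feel for how the spread of the estimate grows as the window shrinks.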
| Aug4-12, 05:25 PM | #25 |
As a very general observation (general enough to apply to the whole of mathematical society, not particularly to yourself), there is always a mental contest between formal mathematics and the philosophy of mathematics that I would call Mathematical Platonism. Wikipedia distinguishes quite a number of species of Mathematical Platonism, but to me the common element in this philosophy is the belief that things with mathematical definitions have a reality that exists apart from the definition. In your particular case, you believe that the concept of "mean value of a distribution" has a reality beyond the formal definition, so you allow yourself to reason about this reality and reach conclusions based on your private vision of it. I think almost everybody does this to some degree.
Sometimes Mathematical Platonism leads nowhere. For example, if you look at threads on the forum that are invitations to Mathematical Platonists, such as "Is multiplication repeated addition?" or "Is dy/dx a ratio?", you find that many of the posts with a Platonic slant are opinionated and unimaginative. But sometimes you do find Platonic outlooks that are very helpful, intuitive ways to think about mathematical ideas. Physicists and engineers often take the Platonic view of mathematics, and I suspect the reason that physics and engineering are able to cruise along with the Platonists on board is that most concepts they deal with don't depend on a legalistic and precise application of logic.
On the other hand, mathematics gets into a mess if it tries to develop results based on Platonic arguments. There are simply too many different, contradictory private concepts of things like "limits", "infinity", "probability" etc. among human beings. The only way to arrive at definite results is to have formal definitions and develop arguments based on those definitions, not on people's private visions of what things are. I don't want to discourage you from Platonic reasoning. I just want to make the point that whatever conclusions you reach by that reasoning have to be reconciled with the formal mathematical definitions and presented in those terms in order for them to be accepted as mathematics.
----
As to the observations about the sample mean: the sample mean of the Cauchy distribution is a statistic that does have a distribution. For a sample size of 1, it is obviously just the Cauchy distribution. It's an interesting question what the distribution is for larger sample sizes. The Central Limit Theorem (that the mean of an independent sample of size n is approximately normally distributed for large n) doesn't apply to the Cauchy distribution, since that theorem requires that the distribution being sampled have a (finite) variance.
(This sets me wondering about such things as: Are there distributions whose k-th moment doesn't exist, but for which the k-th moment of the sampling distribution of the mean (for sample size > 1) exists? Are there distributions whose k-th moment doesn't exist, but such that the k-th moment of the limiting distribution of their sampling distribution (as sample size approaches infinity) exists? If no kind soul happens to tell me, I may start a thread with such questions someday.)
The non-existence of the mean of the Cauchy distribution involves (according to the formal definition) the non-existence of an integral that is done using the distribution. Thus the entire distribution is considered when doing the integration. The fact that a particular large value of a Cauchy random variable is unlikely in a sample doesn't mean that you can leave that value out when you do the integral. The problem with the (formal) existence of the integral depends on how you define integrals (Riemann vs Lebesgue -- either way, there's a problem). Again, the people who frequent the Calculus & Analysis section can probably give an authoritative answer about that.
I'm sure you've studied integrals involving infinite limits and various theorems about when they exist or not. Some functions die off quickly enough that the integral from 0 to infinity exists; others die off, but not quickly enough. That's the type of thing involved in the integral for the mean of the Cauchy. In the integral for the mean of the ratio of Gaussians, the problem is that the integrand is unbounded.
I don't think you should give up on using the Cauchy principal value in your calculations. I merely suggest that you rephrase the claim about what you are calculating. My (Platonic) view is that you are calculating a limit of the means of distributions that are "conditioned" by setting them equal to zero on parts of the real line. (A density f(x) defined on the real line can be used to define another density g(x) that leaves out intervals of the real line. On the part that is not left out, define the modified density to be g(x) = f(x)/(1 - P), where P is the probability of the left-out part. On the part that is left out, define g(x) = 0.)
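The conditioned density g(x) = f(x)/(1 - P) can be tried out numerically. The sketch below keeps only a symmetric window [-T, T] of the standard Cauchy (so the left-out part is the two tails) and computes the mean of g by a midpoint rule; the window sizes and the helper name `conditioned_mean` are assumptions for illustration, not anything from the thread.

```python
import math

def cauchy_pdf(x):
    return 1.0 / (math.pi * (1.0 + x * x))

def conditioned_mean(pdf, a, b, steps=200_000):
    """Mean of g = pdf / (1 - P), where only [a, b] is kept and P is the mass left out.

    Returns (mean, kept_mass) with kept_mass = 1 - P, both by the midpoint rule.
    """
    h = (b - a) / steps
    kept_mass = 0.0
    first_moment = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        w = pdf(x) * h
        kept_mass += w
        first_moment += x * w
    return first_moment / kept_mass, kept_mass

for T in (10.0, 100.0, 1000.0):
    mean, kept = conditioned_mean(cauchy_pdf, -T, T)
    print(f"T = {T:7.1f}  kept mass = {kept:.6f}  conditioned mean = {mean:.3e}")
```

For symmetric windows every conditioned mean is zero up to rounding, so their limit as T grows is zero as well, which matches the principal-value result; it is a statement about the limit of conditioned means, not about the mean of the Cauchy itself.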
| Aug6-12, 12:48 PM | #26 |
Introducing a censoring window causes the angle of integration to change from 0 to 2π to a set of bounds that depend on r. I am unable to solve the new integral itself. Not only that, but in trying to solve for the Cauchy distribution itself (often done implicitly when solving for the first moment), I noticed that the Cauchy distribution requires the same integral I did before, divided by r; which means that it, too, has a point which is infinite. Hence, the Cauchy itself requires the use of a Cauchy principal value to derive....
If changing the limits with which one integrates any such integral is capable of changing the results, I would have to assume the Cauchy distribution is itself invalid unless there is a way to justify the use of a particular Cauchy principal value... I need to find a derivation of the Cauchy distribution and find out on what grounds they justify the existence of the integral in the first place !!! or whether there is a way to work around the issue.
The censoring window is roughly equivalent to truncating the tails of the Cauchy at *very* large values. The non-existence of the integral of the mean, for example, is based on asymmetrical areas for the left- and right-hand portions of the integral; but notice that truncating around the zero-denominator point symmetrically has the effect of forcing the limits of the left- and right-hand values to have symmetric end points. Therefore your idea (nearly) reduces to mine as the window shrinks to zero.
The symmetry of the censoring window is equivalent to my idea -- but the flaw in my idea is that the limits of the integral used to compute the mean can be chosen to be asymmetrical as one approaches infinity. Just so, your censoring window is symmetrical -- but we could choose an asymmetrical window... I see no way, based on engineering considerations, to justify this window's symmetry or asymmetry. In the ideal case, it wouldn't exist -- it is just a fix-up to work around a problem not defined by engineering at all...
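The dependence on the choice of window can be checked in closed form for the standard Cauchy. As a sketch, truncate the tails at -T and k*T (k is a hypothetical asymmetry factor introduced here only for illustration) and use the antiderivative ln(1 + x^2)/(2π) of x f(x):

```python
import math

def truncated_cauchy_mean(lower, upper):
    """Mean of a standard Cauchy conditioned on lower <= x <= upper (closed form)."""
    kept_mass = (math.atan(upper) - math.atan(lower)) / math.pi
    # Integral of x / (pi * (1 + x^2)) is ln(1 + x^2) / (2 * pi).
    first_moment = math.log((1.0 + upper ** 2) / (1.0 + lower ** 2)) / (2.0 * math.pi)
    return first_moment / kept_mass

for k in (1.0, 2.0, 10.0):
    for T in (1e2, 1e4, 1e6):
        print(f"k = {k:4.1f}  T = {T:9.0f}  mean = {truncated_cauchy_mean(-T, k * T):.6f}")
    print(f"          ln(k)/pi = {math.log(k) / math.pi:.6f}")
```

As T grows the truncated mean settles at ln(k)/π, so a symmetric window (k = 1) gives 0 while any other fixed ratio gives a different finite number, which is the ambiguity being described.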
| Aug6-12, 02:31 PM | #27 |
See my comments to Chiro regarding the windowing issue in general, and not being able to solve integrals with a cut-out window. Really, it's not my habit to walk into a buzz saw, even if someone else asks me to. I'm sure he'd agree with you on these particular questions.
| Aug6-12, 02:59 PM | #28 |
In the second question, I think the issue (likely) turns on whether we define the distribution to be the limit of the ratio of a set of finite binomials, where we take the mean of the finite ratio and then let the number of elements in the binomials approach infinity -- or whether we take the limit before computing the mean, and end up computing the mean of the continuous distribution.
| Aug16-12, 03:22 AM | #29 |
Stephen Tashi!
Now, I'm making progress! You said...
The graph is for the distribution of 100,000 means of 200-point samples from a true Cauchy. The resulting shape is another Cauchy. I tried many more points, many fewer points, many more and fewer repetitions -- the shape is exactly the same...! So what you are saying is correct; but the reason is simply that a Cauchy added to a Cauchy is .... a Cauchy; and that has implications....
By analogy: when something is Gaussian, say with a mean of mu and a sigma of s, then if we add two independent samples of the distribution together, the result's distribution is again a Gaussian: Gaussian + Gaussian = Gaussian. So: N(mu,s) + N(mu,s) = N(2mu, 1.414213...s). When doing an average, we simply add (as above) and then divide by 2. The result is NOW: N(mu, 0.7071...s). The significant detail is that the sigma has gotten *smaller* after the average. Hence the new result is "closer" on average to mu.
As long as the measure of the width of the distribution (which does not have to be sigma, but can be any "x" scaling one can invent) shrinks with each averaging, the result converges toward an "average". It really doesn't matter what the shape is -- Gaussian, Cauchy, etc. In the OP, I originally said that I thought the Cauchy had a mean in the "limit"; but I think I edited that out...
It's important to recognize, however, that a ratio of Gaussians is *not* a Cauchy in the strict sense. We've been mixing ideas carelessly... Only when a=0 and b=0 is it truly a Cauchy. In all other cases, I'm pretty certain, the "scale" of the distribution is not preserved on addition. So convergence may not happen with the Cauchy -- but could with the others (even if very slowly...).
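A small simulation along these lines, as a sketch only: it measures the interquartile range (one possible "width" that exists for both shapes) of the mean of n draws, for a standard Cauchy and a standard Gaussian. It uses far fewer repetitions than the 100,000 mentioned above so that pure Python stays quick, and the function names and seed are arbitrary.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    # Inverse-CDF sampling from the CDF 1/2 + atan(x)/pi.
    return math.tan(math.pi * (rng.random() - 0.5))

def gauss_draw(rng):
    return rng.gauss(0.0, 1.0)

def iqr_of_means(draw, sample_size, repetitions, seed=0):
    """Interquartile range of the sample mean over many repeated samples."""
    rng = random.Random(seed)
    means = [
        sum(draw(rng) for _ in range(sample_size)) / sample_size
        for _ in range(repetitions)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

for size in (1, 20, 200):
    print(
        f"n = {size:3d}  Cauchy IQR = {iqr_of_means(cauchy_draw, size, 2000):.3f}"
        f"  Gaussian IQR = {iqr_of_means(gauss_draw, size, 2000):.3f}"
    )
```

The Gaussian width shrinks roughly like 1/sqrt(n), while the Cauchy width stays near 2 (the quartiles of a standard Cauchy are at -1 and +1) however many points are averaged.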
I have a comment to make to Chiro, but I know enough to see that I probably ought to restart the thread with more consistent and accurate labeling. At this point, I think certain things that were confounded early on in the thread are preventing wider participation, and clarifying and summing up the useful things said in a concentrated new OP would be best.
| Aug16-12, 04:17 AM | #30 |
If you want to check theoretically whether it is preserved, and you have an exact analytic representation, then you should first look at the nature of the MGF (or, where the MGF does not exist, as for the Cauchy, the characteristic function). If you can combine the MGFs to get something that has the same form, then that's it.
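As a sketch of that check, the snippet below uses characteristic functions in place of MGFs (a swapped-in tool, since the Cauchy has no MGF). For independent draws the characteristic function of the mean of n values is φ(t/n)^n; the function names are made up for illustration.

```python
import math

def cf_cauchy(t):
    """Characteristic function of the standard Cauchy: exp(-|t|)."""
    return math.exp(-abs(t))

def cf_normal(t, sigma=1.0):
    """Characteristic function of N(0, sigma): exp(-sigma^2 t^2 / 2)."""
    return math.exp(-0.5 * (sigma * t) ** 2)

def cf_of_mean(cf, t, n):
    """Characteristic function of the mean of n independent draws: cf(t/n) ** n."""
    return cf(t / n) ** n

t = 1.5
for n in (1, 4, 200):
    print(n, cf_of_mean(cf_cauchy, t, n), cf_of_mean(cf_normal, t, n))
```

The Cauchy column does not change with n (same form, same scale), while the Gaussian column matches N(0, 1/sqrt(n)), i.e. the same form with a shrinking scale.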
Also, with regard to the sigma: when you are estimating the mean you want the sigma to get smaller, and this is what you should get, since the average of the sample is an unbiased estimator of the mean (when the mean exists). I know in this example you have two distinct random variables, so you are not actually doing it in the above context, but by the standard variance operator laws the average of random variables with finite variance always has a smaller variance than their sum (for two variables, a quarter of it). I'd be interested, though, if you derived an analytic form of a distribution by using the censoring technique we discussed earlier in this thread, because that would really nail the behaviour of what is going on when you start to allow values close to 0. Did you look into this, if not analytically, then via simulation of some sort?