Probability Density of a Constrained Chi-Square

In summary, the thread discusses determining a probability distribution from a function built from measurements with normally distributed errors. The original poster imposes a constraint and a change of variables to reduce the dimensionality to two, asks whether the resulting function is still a chi-square with two degrees of freedom, and seeks references for theoretical grounding. The respondent recommends verifying the distribution by simulation, studying the standard transformation results and joint distributions of dependent variables, estimating the covariance matrix, and working out the entangled integration limits.
  • #1
Soveraign
Hello PF! It's been a while. How are things?

In my research I'm faced with determining a probability distribution from a function built as follows:

Perform three measurements X, Y, Z that have normally distributed errors.

Impose a constraint and variable change that allows me to reduce the dimensionality to 2.

My question is: Can I assume the resulting function is a chi-square with 2 dof and therefore write my pdf as

[tex]\exp(- \chi^2 / 2)[/tex]

The long version with specifics:

I am measuring the energies and opening angle of two photons with a common point of origin, and I wish to determine the probability density of the true energies and angle from this single measurement. For simplicity I am assuming Gaussian errors on the measurements. The opening angle is transformed a bit to make the calculations easier, and I start with an initial chi-square of (subscript "m" denotes my measured value and "z" my transformed angle measurement):

[tex] \chi^2 = \frac {(E_1 - E_{1m})^2} {\sigma_{E1}^2} + \frac {(E_2 - E_{2m})^2} {\sigma_{E2}^2} + \frac {(z - z_{m})^2} {\sigma_{z}^2} [/tex]

The photons are produced by a common particle and therefore I can impose the constraint that the invariant mass of these photons is a specific value "M" (p is four-momentum).

[tex] C = (\mathbf p_{\gamma 1} + \mathbf p_{\gamma 2})^2 - M^2 = 0 [/tex]

This allows me to reduce the variables from 3 to 2, but in a fairly non-linear way. My final chi-square is a function of energy of the original common particle and the cosine of the center of momentum decay angle of the photons:

[tex] \chi^2 = f(E, \cos{\theta^*}) [/tex]

As one would expect, the transformations are quite non-linear, but in practice they are frequently "close" to linear for the actual values being considered. I don't want to further burden this post with the ugly transformation details, but I would happily provide them if needed.
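For concreteness, the constraint C above comes from the standard invariant-mass relation for two massless photons, which is easy to write down in code. A minimal sketch (the numerical values are illustrative, not from my analysis):

```python
def inv_mass_sq(E1, E2, cos_theta):
    """Invariant mass squared of two massless photons:
    (p1 + p2)^2 = 2 * E1 * E2 * (1 - cos(theta)),
    where theta is the opening angle between them."""
    return 2.0 * E1 * E2 * (1.0 - cos_theta)

# Illustrative check: a pi0 -> gamma gamma decay at rest gives two
# back-to-back 67.5 MeV photons and an invariant mass of ~135 MeV.
m_sq = inv_mass_sq(0.0675, 0.0675, -1.0)
print(m_sq)  # 0.018225 = 0.135**2 (GeV^2)
```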

So the long version of the question is: Can I assume the above expression is still a chi-square? Is the dof 2? Does the fact that E and cos(th*) are not independent play a role in determining the proper dof?

Many, many thanks to anyone that can help. I am especially interested in sources I can reference so I know I'm standing on strong theoretical grounds.
 
  • #2
Hey Soveraign.

One recommendation I have (and this applies to any situation similar to yours) is to use simulation as a verification tool whenever you need to check a distribution against gut instinct or theoretical verification.

If you have dependencies then you will either need to find a joint distribution and the entangled limits (since they are dependent) or you will need to express one variable as a function of another.

The theory for things like this includes transformation theorems for functions of random variables, probability transforms such as the characteristic function, and a variety of other results.

Given that you have complex constraints, my first suggestion is to resort to simulation. You can use a package like R, or if you have massive complex conditional distributions, you would want to use something like WinBUGS. Both are free and R is open source.

After you do a simulation with enough data points (say 10,000 - 100,000) you can then plot the distribution, calculate its moments, and even do a goodness of fit test against the chi-square distribution with two degrees of freedom.
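The workflow above can be sketched in Python rather than R. This is a minimal example assuming hypothetical true values and resolutions (placeholders, not numbers from the thread); without the mass constraint the 3-term statistic should follow a chi-square with 3 dof, and after the constraint one would repeat the exercise with the transformed variables and test against 2 dof instead:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000  # number of simulated pseudo-experiments

# Hypothetical true values and Gaussian resolutions (placeholders)
mu = np.array([1.0, 2.0, 0.5])      # true E1, E2, z
sigma = np.array([0.1, 0.2, 0.05])  # measurement errors

# Simulate measurements and evaluate the unconstrained 3-term chi-square
x = rng.normal(mu, sigma, size=(n, 3))
chi2_vals = (((x - mu) / sigma) ** 2).sum(axis=1)

# Goodness-of-fit check against chi-square with 3 dof
ks = stats.kstest(chi2_vals, stats.chi2(df=3).cdf)
print(chi2_vals.mean(), ks.pvalue)  # mean ~ 3, large p-value
```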
 
  • #3
Thanks for the reply. I do check assumptions about distributions with high-stat simulations to try and catch mistakes. In the situation I'm describing above, it is the high-stat sims that show I must be doing something wrong... but not horribly so. I have both a bias in my final results as well as a suspect chi-square/dof when trying to combine multiple events from the simulations.

While giving an exam I was thinking this over and began to wonder if I should be approaching my constraint as more of a "condition" like so:

[tex] P(E1=a, E2=b, Z=c | M=m) = P(M=m | E1=a, E2=b, Z=c) P(E1=a, E2=b, Z=c) / P(M=m)[/tex]

The P(E1=a, E2=b, Z=c) would be trivariate normal and the P(M=m | E1=a, E2=b, Z=c) might end up as a chi-square with dof 1. But then I realized the constraint imposes zero probability on a large range of E1, E2, Z (thus why I am able to reduce to the two variables mentioned in the first post) and might be the wrong approach.

Do you recommend any links/books that would go into some detail about joint distributions for non-independent variables (and what you mean by entangled limits)?

Am meeting with some stats people tomorrow about it, hoping for some more insight.
 
  • #4
I'd recommend getting a basic book on probability that covers the various kinds of transforms. These include finding the distribution of a function of a random variable, characteristic and probability-generating transforms, moment generating functions, and results to do with sums, products, and quotients of random variables.

I'd also suggest understanding how to take conditional and marginal distributions to get the joint distribution which is what you were getting at in the post above.

If you are dealing with normal random variables then you should be looking at how to estimate the covariance matrix. If you can derive the covariance matrix from your constraints (which you should be able to) then you can get the final joint distribution. Also any linear combination of normals is normal and there are theorems that allow you to get the conditional means and variances of any compound normal against another compound normal.

Entangled limits are just limits that depend on other variables and not constants. An example is say 0 < x < y as opposed to 0 < x < 1. If you have any kind of entanglement then you have dependencies between the random variables that have the entangled limits.

If you can find the entangled limits, you can use that to specify the distribution but I don't know of a lot of solid theories or results to do it in general.
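The conditional mean/variance theorem mentioned above can be checked by simulation. A sketch with hypothetical bivariate-normal parameters (not from the thread): for jointly normal (X, Y), the theorem gives Y | X = x ~ N(mu_y + rho*(sig_y/sig_x)*(x - mu_x), sig_y^2*(1 - rho^2)), which we compare against samples in a thin band around x:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bivariate normal parameters (placeholders)
mu_x, mu_y = 0.0, 1.0
sig_x, sig_y, rho = 1.0, 2.0, 0.6
cov = [[sig_x ** 2, rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y ** 2]]
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=500_000)

# Theoretical conditional mean and variance at X = x0
x0 = 0.5
pred_mean = mu_y + rho * (sig_y / sig_x) * (x0 - mu_x)  # 1.6
pred_var = sig_y ** 2 * (1 - rho ** 2)                  # 2.56

# Empirical check: condition by slicing a thin band around x0
band = np.abs(xy[:, 0] - x0) < 0.02
emp_mean = xy[band, 1].mean()
emp_var = xy[band, 1].var()
print(pred_mean, emp_mean, pred_var, emp_var)
```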
 
  • #5
Thought I would follow up here and try to more succinctly describe the problem. At its core, I have three random variables that are normally distributed:

[tex] x_1 \sim N(\mu_1, \sigma_1) \\
x_2 \sim N(\mu_2, \sigma_2) \\
x_3 \sim N(\mu_3, \sigma_3) [/tex]

I also know there is a relationship among the distributions:

[tex] \mu_1 \mu_2 \mu_3 = C [/tex]

where "C" is a constant. Given exactly one sample from each distribution, I want to determine the probability density for hypothesized values of the means. Specifically I need:

[tex] p(\bar \mu \vert \bar x) [/tex]

My latest attempt to work this out has taken me to a Bayes style of looking at it:

[tex] p(\bar \mu \vert \bar x) = \frac {p(\bar x \vert \bar \mu) p(\bar \mu)} {p(\bar x)} [/tex]

where:

[tex] p(\bar x \vert \bar \mu) [/tex]

is simply the joint probability of the three normals for specific x's. The relationship among the means seems to certainly be prior information and my instinct is to model it as:

[tex] p(\bar \mu) =
\begin{cases}
A, & \mu_1\mu_2\mu_3 = C \\
0, & \text{otherwise}
\end{cases}
[/tex]

where "A" is just a constant. This would enforce zero probability whenever the constraint is not met. Then p(x) serves only as a normalization. The final posterior distributions look "ok", but I'm unsure whether I'm doing this right.
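One way to sketch this numerically is to enforce the delta-function prior by substitution, mu3 = C/(mu1*mu2), leaving a 2-D posterior over (mu1, mu2). A minimal example with placeholder numbers (note that the flat "A" on the constraint surface implicitly fixes a parameterization/measure on that surface, which could be one source of the bias mentioned earlier):

```python
import numpy as np

# One observed sample and resolutions (placeholder numbers)
x = np.array([1.1, 2.0, 0.45])
sigma = np.array([0.1, 0.2, 0.05])
C = 1.0  # constraint: mu1 * mu2 * mu3 = C

# Substitute mu3 = C/(mu1*mu2) on a 2-D grid over (mu1, mu2)
m1 = np.linspace(0.8, 1.4, 301)
m2 = np.linspace(1.4, 2.6, 301)
M1, M2 = np.meshgrid(m1, m2, indexing="ij")
M3 = C / (M1 * M2)

# Gaussian log-likelihood p(x | mu), up to an additive constant
log_post = -0.5 * (((x[0] - M1) / sigma[0]) ** 2 +
                   ((x[1] - M2) / sigma[1]) ** 2 +
                   ((x[2] - M3) / sigma[2]) ** 2)
post = np.exp(log_post - log_post.max())
post /= post.sum()  # grid normalization plays the role of p(x)

# Posterior mode on the grid satisfies the constraint by construction
i, j = np.unravel_index(post.argmax(), post.shape)
print(M1[i, j], M2[i, j], M3[i, j])
```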

Thoughts anyone? Thanks!
 

1. What is the probability density of a constrained chi-square distribution?

The probability density function of a constrained chi-square distribution describes the relative likelihood of values of the statistic; integrating it over a range of values gives the probability of the statistic falling in that range.

2. How is the probability density of a constrained chi-square distribution calculated?

The probability density of a chi-square distribution is calculated using the formula f(x) = x^(k/2 - 1) * e^(-x/2) / (2^(k/2) * Γ(k/2)), where k is the degrees of freedom and x is the chi-square variable. This is a special case of the gamma distribution, which is the underlying distribution of the chi-square distribution.
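The standard chi-square density, f(x) = x^(k/2-1) e^(-x/2) / (2^(k/2) Γ(k/2)), can be checked numerically against SciPy's reference implementation:

```python
import numpy as np
from math import gamma
from scipy import stats

def chi2_pdf(x, k):
    # f(x) = x^(k/2 - 1) * exp(-x/2) / (2^(k/2) * Gamma(k/2))
    return x ** (k / 2 - 1) * np.exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))

xs = np.linspace(0.1, 10.0, 50)
for k in (1, 2, 5):
    assert np.allclose(chi2_pdf(xs, k), stats.chi2.pdf(xs, k))
print("formula matches scipy.stats.chi2.pdf")
```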

3. What is the relationship between the probability density and cumulative distribution function of a constrained chi-square?

The probability density function and cumulative distribution function of a constrained chi-square are related in that the probability density function calculates the probability of obtaining a specific value, while the cumulative distribution function calculates the probability of obtaining a value less than or equal to a specific value. The two functions are related by integration, with the cumulative distribution function being the integral of the probability density function.

4. How does the probability density of a constrained chi-square distribution change with different degrees of freedom?

The probability density of a chi-square distribution depends strongly on the degrees of freedom: the mean is k and the variance is 2k, so both the location and the spread of the distribution grow with k. As the degrees of freedom increase, the distribution also becomes more symmetric and approaches a normal distribution.

5. What are some practical applications of the probability density of a constrained chi-square distribution?

The probability density of a constrained chi-square distribution is commonly used in statistical analyses, such as hypothesis testing and confidence interval calculations. It is also used in fields such as finance, physics, and engineering to model various phenomena and make predictions based on data. Additionally, it is used in quality control processes to assess the variability of a process or product.
