- Thread starter tmj143
- #1

- #2

That is the mean for x and y.

- #3

? I don't think you understand. There's some probability distribution which says that each data point can lie somewhere between some x1 and x2 and some y1 and y2. These uncertainties are of different magnitudes for each data point, which means the data points need to be weighted differently in the regression calculation. But because the error bars are asymmetric, I can't just do a straight weighted fit...

- #4

Stephen Tashi

Science Advisor

- #5

      l
      l
      l
------x-------------------l

Or something like that where x is the data point and the l's/-'s represent the error (but of different magnitudes for each point, like I said). I don't know how I can better describe this...

- #6

Stephen Tashi

Science Advisor

If you find a way, please post it. I'm too busy to conduct a detailed interrogation. If you really know what you're doing, your question will have an answer. If you don't know what you're doing (for example, if you just think regression and correlation are the "right" thing to do, but you don't understand what you're trying to optimize by using them) then you are beyond help.

- #7

Look, this is purely a statistical question; if you want me to go into science details, I could, but they're absolutely irrelevant. I have data. I need to figure out whether there is a correlation between the x and y variables, and the slope of the line in the case of a linear regression. I'm sure that I could do some sort of complicated simulation to randomly sample imaginary data points from within my error bars and calculate fits for all of them to see if I get anything significant, but that is far more complicated than something I want to deal with.

I know that if the error bars were symmetric, I could do a weighted least squares fit. But they're not. So all I'm asking is whether anyone knows how to deal with asymmetric error bars in such cases... I'm sure people do such fits all the time, but my attempts to google any sort of explanation haven't been successful.
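The resampling idea mentioned above is less work than it sounds. Here is a rough sketch, assuming each error bar describes a two-piece normal with its mode at the quoted value (an assumption; the thread never says what the bars mean) and using plain unweighted least squares on each resampled data set. All function names are hypothetical.

```python
import random
import statistics

def sample_two_piece(rng, centre, sigma_plus, sigma_minus):
    """Draw from a two-piece normal whose mode is `centre` (assumed error model)."""
    # The upper side carries probability sigma_plus / (sigma_plus + sigma_minus).
    if rng.random() < sigma_plus / (sigma_plus + sigma_minus):
        return centre + abs(rng.gauss(0.0, sigma_plus))
    return centre - abs(rng.gauss(0.0, sigma_minus))

def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def monte_carlo_slopes(points, n_draws=2000, seed=0):
    """points: list of (x, x_plus, x_minus, y, y_plus, y_minus) tuples."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_draws):
        xs = [sample_two_piece(rng, x, xp, xm) for x, xp, xm, _, _, _ in points]
        ys = [sample_two_piece(rng, y, yp, ym) for _, _, _, y, yp, ym in points]
        slopes.append(ols_slope(xs, ys))
    return slopes
```

The spread of the returned slopes (for example, the fraction of draws with a positive slope) gives a rough feel for whether the trend survives the uncertainties. Scattering x as well as y tends to bias the slope toward zero, so treat this as a sanity check rather than a proper fit.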

- #8

Stephen Tashi

Science Advisor

- #9

Stephen Tashi

Science Advisor

- #10

Perhaps you can use a model of a piecewise Gaussian variable. Suppose the variable has a mean [itex]a[/itex] and a different width parameter on each side of it, [itex]\sigma_{1}[/itex] for [itex]x > a[/itex] and [itex]\sigma_{2}[/itex] for [itex]x < a[/itex], i.e. its distribution is:

[tex]

\varphi(x) = \left\{\begin{array}{ll}

A_{1} \exp\left(-\frac{(x - a)^{2}}{2 \sigma^{2}_{1}}\right)&, x > a \\

A_{2} \exp\left(-\frac{(x - a)^{2}}{2 \sigma^{2}_{2}}\right)&, x < a

\end{array}\right.

[/tex]

You have to adjust [itex]A_{1}[/itex] and [itex]A_{2}[/itex] so that:

[tex]

E(X) - a = \int_{-\infty}^{\infty}{(x - a) \varphi(x) \, dx} = 0 \Rightarrow A_{1} \int_{0}^{\infty}{t e^{-\frac{t^{2}}{2 \sigma^{2}_{1}}} \, dt} = A_{2} \int_{0}^{\infty}{t e^{-\frac{t^{2}}{2 \sigma^{2}_{2}}} \, dt} \Rightarrow A_{1} \, \sigma^{2}_{1} = A_{2} \, \sigma^{2}_{2}

[/tex]

Of course, the probability density must be normalized:

[tex]

\int_{-\infty}^{\infty}{\varphi(x) \, dx} = 1 \Rightarrow A_{1} \, \int^{\infty}_{0}{e^{-\frac{t^{2}}{2\sigma^{2}_{1}}} \, dt} + A_{2} \, \int^{\infty}_{0}{e^{-\frac{t^{2}}{2\sigma^{2}_{2}}} \, dt} = 1 \Rightarrow \sqrt{\frac{\pi}{2}} \left(A_{1} \, \sigma_{1} + A_{2} \, \sigma_{2} \right) = 1

[/tex]

These two equations allow you to express [itex]A_{1}[/itex] and [itex]A_{2}[/itex] in terms of [itex]\sigma_{1}[/itex] and [itex]\sigma_{2}[/itex]. Try to find the variance of the variable.
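For what it's worth, solving the two conditions above appears to give [itex]A_{1} = \sqrt{2/\pi} \, \sigma_{2} / (\sigma_{1} (\sigma_{1} + \sigma_{2}))[/itex] and [itex]A_{2} = \sqrt{2/\pi} \, \sigma_{1} / (\sigma_{2} (\sigma_{1} + \sigma_{2}))[/itex], and the variance then comes out as [itex]\sigma_{1} \sigma_{2}[/itex]. A minimal Python sketch to check this numerically follows; the function names and the midpoint-rule integration are illustrative assumptions, not part of the post.

```python
import math

def piecewise_gaussian_constants(sigma_plus, sigma_minus):
    """Return (A1, A2) for the density whose mean sits at a.

    sigma_plus is sigma_1 (branch x > a), sigma_minus is sigma_2 (branch x < a).
    """
    norm = math.sqrt(2.0 / math.pi) / (sigma_plus + sigma_minus)
    a1 = norm * sigma_minus / sigma_plus
    a2 = norm * sigma_plus / sigma_minus
    return a1, a2

def piecewise_gaussian_variance(sigma_plus, sigma_minus, steps=20_000):
    """Integrate (x - a)^2 * phi(x) numerically with a midpoint rule."""
    a1, a2 = piecewise_gaussian_constants(sigma_plus, sigma_minus)
    total = 0.0
    for amp, sigma in ((a1, sigma_plus), (a2, sigma_minus)):
        h = 12.0 * sigma / steps  # integrate each side out to 12 sigma
        for k in range(steps):
            t = (k + 0.5) * h
            total += amp * t * t * math.exp(-t * t / (2.0 * sigma * sigma)) * h
    return total
```

With `sigma_plus = 2.0` and `sigma_minus = 0.5`, the numerical variance should agree with 2.0 * 0.5 = 1.0 to within integration error.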

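Next, consider the variable (here [itex]a[/itex], [itex]b[/itex] and [itex]c[/itex] are the coefficients of the line, not the mean [itex]a[/itex] used above):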

[tex]

\varepsilon_{i} = a \, X_{i} + b \, Y_{i} + c, \; a^{2} + b^{2} = 1, \; i = 1, \ldots, N

[/tex]

If [itex]X_{i}[/itex] and [itex]Y_{i}[/itex] have the above distribution, what are the expectation value and variance of [itex]\varepsilon_{i}[/itex]?

Approximate these variables as Normally distributed with the above expectation values and variances, and use the maximum likelihood method, which then reduces to a least-squares method, to estimate the parameters of the general linear dependence:

[tex]

a \, x + b \, y + c = 0, \; a^{2} + b^{2} = 1

[/tex]
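One way to make this concrete, sketched under several assumptions: each point already has a variance in x and in y (for instance [itex]\sigma_{1} \sigma_{2}[/itex] from the piecewise model), [itex]X_{i}[/itex] and [itex]Y_{i}[/itex] are independent so that the variance of [itex]\varepsilon_{i}[/itex] is [itex]a^{2} \mathrm{Var}(X_{i}) + b^{2} \mathrm{Var}(Y_{i})[/itex], and the variance-dependent log term of the likelihood is ignored so that only the weighted sum of squares is minimized. The name `fit_line` and the grid search over the line direction are illustrative choices, not something prescribed in the post.

```python
import math

def fit_line(xs, ys, var_x, var_y, steps=20_000):
    """Minimize sum_i (a*x_i + b*y_i + c)^2 / Var(eps_i) with a^2 + b^2 = 1.

    The direction is scanned as (a, b) = (cos t, sin t) for t in (0, pi);
    for each direction the best c has a closed form.
    All variances are assumed to be strictly positive.
    Returns (chi2, a, b, c) for the best grid point.
    """
    best = None
    for k in range(steps):
        theta = math.pi * (k + 0.5) / steps
        a, b = math.cos(theta), math.sin(theta)
        # Variance of eps_i, assuming X_i and Y_i are independent.
        weights = [1.0 / (a * a * vx + b * b * vy) for vx, vy in zip(var_x, var_y)]
        proj = [a * x + b * y for x, y in zip(xs, ys)]
        # Setting the derivative with respect to c to zero gives a weighted mean.
        c = -sum(w * p for w, p in zip(weights, proj)) / sum(weights)
        chi2 = sum(w * (p + c) ** 2 for w, p in zip(weights, proj))
        if best is None or chi2 < best[0]:
            best = (chi2, a, b, c)
    return best
```

Since b is positive on this grid, the fitted line can be read back as y = -(a/b) x - c/b. A finer grid or a local optimizer started from the best grid point would sharpen the estimate.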

- #11

This method only works when your error is symmetric in log space, but this is the main kind of asymmetric error I usually run across. In fact, a lot of the error that we call symmetrical is actually symmetrical only in log space, which makes it close to symmetrical for small errors but quite asymmetrical for larger ones. Very often when we say +/-25%, we really mean */÷1.25, which actually works out to +25%/-20%, of course.
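The arithmetic behind that last remark can be checked in a few lines. This is only an illustration; the helper names are made up here, and the values are assumed to be positive with a lower error smaller than the value itself.

```python
import math

def percent_bounds(factor):
    """Upper and lower percent errors implied by a multiplicative factor."""
    upper = (factor - 1.0) * 100.0
    lower = (1.0 - 1.0 / factor) * 100.0
    return upper, lower

def log_space_errors(value, err_plus, err_minus):
    """Return (log value, upward half-width, downward half-width) in log space."""
    up = math.log((value + err_plus) / value)
    down = math.log(value / (value - err_minus))
    return math.log(value), up, down

upper, lower = percent_bounds(1.25)
print(f"+{upper:.0f}% / -{lower:.0f}%")  # prints "+25% / -20%"
```

For a value of 100 with errors +25/-20, `log_space_errors` returns equal upward and downward half-widths, so an ordinary symmetric weighted fit could be done on the logged data.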

- #12

Ha. That's amusing.

The problem is that there's no right way to do this without knowing where those error bars come from. Error bars by themselves have no definite meaning. But when people use symmetric error bars, we know by convention that they probably represent something like root-mean-squared error, or some multiple of it. There's no such universal meaning of asymmetric error bars, so without more information about the error distribution they're intended to summarize, it's hard to say how to handle them correctly.
