Advantage of having more measurements

kelly0303
Hello! I have some points in the plane, with errors on both the x and y coordinates. The goal of the experiment is to check whether the points are consistent with a straight line or not, i.e. whether they can be described by a function of the form ##y = f(x)=a+bx##, or whether there is some non-linearity involved (e.g. ##y = f(x)=a+bx+cx^2##).

Assume first that we measure only 3 points. In this case, the approach is to calculate the area of the triangle they form and the associated error, so we get something of the form ##A\pm dA##. If ##dA>A##, then we are consistent with linearity and we can set a constraint (at some given confidence level) on the magnitude of a possible non-linearity (e.g. ##c<c_0##). If we have 4 points, we can do something similar: for example, calculate the area of the triangle formed by the first 3 points (in order of the x coordinate), ##A_1\pm dA_1##, and the area of the last 3 points, ##A_2\pm dA_2##, then add them, do error propagation to get ##A\pm dA##, and proceed as above (in this experiment we expect not to see a non-linearity, so we just aim for upper bounds).

My question is, what is the advantage of having more points? Intuitively, I expect that the more points you have, the more information you gain and hence the better you can constrain the non-linearity. But it seems like the error just keeps getting bigger, simply because we have more points and error propagation (you can assume that the errors on x and y are the same, or at least very similar, for the different measurements). So, assuming the points are actually on the line, for 3 points we get ##0\pm dA_3## and for, say, 10 points we get ##0\pm dA_{10}## with ##dA_{10}>dA_3##, so the upper bounds we can set on the non-linearity are better (smaller) in the case of 3 points. But intuitively that doesn't make sense. Can someone help me understand what I am doing wrong? Why is it better to have more points? Thank you!
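For concreteness, here is a minimal Monte Carlo sketch of the comparison being asked about (an illustration with made-up numbers, assuming equal Gaussian errors on y only and ignoring the x errors). It tracks how the spread of a summed-triangle-area statistic and the spread of a directly fitted quadratic coefficient behave as more points are added over a fixed x range.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.05  # assumed (hypothetical) Gaussian error on each y; x errors ignored

def summed_triangle_area(x, y):
    """Sum of the areas of the triangles formed by consecutive point triples."""
    total = 0.0
    for i in range(len(x) - 2):
        # cross product of the two edge vectors = twice the triangle area
        total += 0.5 * abs((x[i + 1] - x[i]) * (y[i + 2] - y[i])
                           - (x[i + 2] - x[i]) * (y[i + 1] - y[i]))
    return total

def spreads(N, trials=20000):
    x = np.linspace(0.0, 1.0, N)  # fixed x range, more points as N grows
    areas, c_hats = [], []
    for _ in range(trials):
        y = 1.0 + 2.0 * x + rng.normal(0.0, sigma, N)  # truly linear data
        areas.append(summed_triangle_area(x, y))
        c_hats.append(np.polyfit(x, y, 2)[0])  # fitted quadratic coefficient c
    return np.std(areas), np.std(c_hats)

for N in (3, 5, 10, 20):
    dA, dc = spreads(N)
    print(f"N = {N:2d}: spread of summed area = {dA:.4f}, spread of fitted c = {dc:.3f}")
```

In this toy setup the spread of the summed area indeed grows with N, while the uncertainty on the directly fitted curvature coefficient shrinks, which is exactly the tension the question is about.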
 
kelly0303 said:
My question is, what is the advantage of having more points?
I believe that the more observations or trials we do, the more information we get about the physical system, including its intrinsic disturbances, noise, or probabilistic behavior.
 
anuttarasammyak said:
I believe that the more observations or trials we do, the more information we get about the physical system, including its intrinsic disturbances, noise, or probabilistic behavior.
Well yeah, this is what I believe intuitively, but I am not sure how to show it mathematically.
 
The law of large numbers and the central limit theorem would be of interest to you.
 
I don't understand the point of the areas. Why not just estimate c directly using least squares? Or even a Bayesian estimation?
 
Dale said:
I don't understand the point of the areas. Why not just estimate c directly using least squares? Or even a Bayesian estimation?
But in order to estimate c, I would need to know the functional form of the non-linearity. However, the actual form is very model dependent, so in our case we don't want to set constraints on a given model; we just want to set a constraint on any deviation from linearity, regardless of its actual form. Am I misunderstanding your point?

Basically I want to quantify how far the points are from being on a straight line. I decided to use this area as a quantifier, but I am totally open to suggestions for better ways to do it.
 
anuttarasammyak said:
The law of large numbers and the central limit theorem would be of interest to you.
I know about these in general, I am just not sure how they apply to my particular case. For example, in general the error would go as ##1/\sqrt{N}##, where N is the number of measurements, but I don't see that in my expressions above explicitly, so I am probably doing something wrong.
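As a rough aside (treating the individual triangle areas as approximately independent, which is not exact since adjacent triangles share vertices), the ##1/\sqrt{N}## behaviour shows up in the averaged statistic rather than in the sum:
$$\sigma\Big(\sum_{i=1}^{N-2} A_i\Big) \approx \sqrt{N-2}\;dA, \qquad \sigma\Big(\frac{1}{N-2}\sum_{i=1}^{N-2} A_i\Big) \approx \frac{dA}{\sqrt{N-2}},$$
so a summed-area statistic gets noisier with more points, even though the average area (or a fitted curvature coefficient) is constrained more and more tightly.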
 
anuttarasammyak said:
I do not know exactly what your system is, but I expect that many (x, y) data points would show some dense and sparse pattern, which becomes clearer for larger N, as the linked experiment video shows as an example: https://www.hitachi.com/rd/research/materials/quantum/doubleslit/index.html
I don't have many points, though. Here is a paper that might explain it better (the physics of it is involved, but the details are not important for my question), in figure S2. In the experiments so far, people measured 3 points and got something like figure S2. What is usually done in the literature is to calculate the area created by these 3 points and the error associated with it (by propagating the error from each of the 3 points), and from there set a constraint on the non-linearity (so far all the areas have been smaller than the uncertainties, so we were only able to set upper limits). My question is simply: if I am able to measure a 4th point on that plot, how would that help me? (I am sure it would, as I would gain more data, but I am not sure mathematically how the error on the area is reduced by adding one more point.)
 
  • #10
What you’re trying to do is what a linear regression does. It finds the best line through a set of points. If it looks to be a poor line after a lot of points then you must consider that there’s a different relationship.

Sometimes folks will apply a linear regression to the log values of x or y or both. This scheme can discover polynomial functions like ##y = x^2 ## because a log plot would show a straight line for ##log(y) = 2 log(x)##

Here’s more on linear regression:

https://en.wikipedia.org/wiki/Linear_regression
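As a minimal sketch of that log-log trick (illustrative numbers only, not tied to the data discussed in this thread):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 20)
y = x**2 * np.exp(rng.normal(0.0, 0.02, x.size))  # noisy data following y ~ x^2

# A straight-line fit in log-log space recovers the exponent of the power law.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print(f"log-log slope = {slope:.2f}")  # close to 2, revealing y ~ x^2
```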


 
  • #11
jedishrfu said:
What you’re trying to do is what a linear regression does. It finds the best line through a set of points. If it looks to be a poor line after a lot of points then you must consider that there’s a different relationship.

Sometimes folks will apply a linear regression to the log values of x or y or both. This scheme can discover polynomial functions like ##y = x^2 ## because a log plot would show a straight line for ##log(y) = 2 log(x)##

Here’s more on linear regression:

https://en.wikipedia.org/wiki/Linear_regression



I know what linear regression is; that is not what I am trying to do... As I said in the previous reply, the paper I linked to might explain better what I want to do, especially figure S2. There they measure 3 points, calculate the area of the triangle created by them and quantify the deviation from linearity based on the value of that area. I don't see how doing a linear regression on these 3 points would help me quantify that non-linearity.
 
  • #12
kelly0303 said:
There they measure 3 points, calculate the area of the triangle created by them and quantify the deviation from linearity based on the value of that area.
I observe that in S2 they set half of the hexagonal volume with the three momentum vectors as axes as NL, right? Do these three vectors come from a single experiment's data? I would like to understand how you want to add data or vectors to it in your question.
 
  • #13
anuttarasammyak said:
I observe that in S2 they set half of the hexagonal volume with the three momentum vectors as axes as NL, right? Do these three vectors come from a single experiment's data? I would like to understand how you want to add data or vectors to it in your question.
I am not sure what you mean. What hexagonal volume are you referring to?
 
  • #14
Equation (6) and its explanation by S2.
 
  • #15
anuttarasammyak said:
Equation (6) and its explanation by S2.
Equation (6) is just the area of that triangle in figure S2. In the experiment they measure the 6 values ##m\nu_i^{AA_j}## on the x and y axes in figure S2, and from there they calculate the area created.
 
  • #16
Equation (6) seems to have the dimension of a volume, ##p^3##, in momentum space, i.e. ##|(A \times B)\cdot C|##, not of an area, to me.
 
  • #17
anuttarasammyak said:
But equation (6) seems to have the dimension of a volume, ##p^3##, in momentum space, ##(A \times B)\cdot C##.
If you look just before equation (5), ##m_\mu## is just a constant, without units.
 
  • #18
I see. And the paper saying "Equivalently, in our geometrical picture it is the volume of the parallelepiped defined by ##\vec{m\nu}_{1,2}## and ##\vec{m\mu}##." confirms my view.

Going back to your point, what would you like to do beyond this triplet of vectors? Make a quartet by incorporating another vector? Get a set of triplets from many experiments?
 
  • #19
You mentioned area. Area==zero. That is how to test for collinearity of points:
https://www.geeksforgeeks.org/program-check-three-points-collinear/

You can also use the distance test, if that makes any difference to you.

Now we are on the same page I hope.

The above is the best way to test when you want yes/no answers. Or use some kind of minimum-area test if you are okay with a not-"perfect" result. What you do in this case is up to you; this is arbitrary, you realize. Regression seems okay here, as others mentioned.

This is an example for "not perfect", which you already know:
https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html

Tolerance test of multi-collinearity -- what you are asking about i.e., "more points":
https://www.statisticshowto.com/tolerance-level-statistics/
 
  • #20
kelly0303 said:
But in order to estimate c, I would need to know the functional form of the non-linearity.
Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

kelly0303 said:
Basically I want to quantify how far the points are from being on a straight line. I decided to use this area as a quantifier, but I am totally open to suggestions for better ways to do it.
I suggest least squares regression to a polynomial.

kelly0303 said:
My question is simply: if I am able to measure a 4th point on that plot, how would that help me? (I am sure it would, as I would gain more data, but I am not sure mathematically how the error on the area is reduced by adding one more point.)
With one more point you could fit a third order polynomial.
 
  • #21
kelly0303 said:
in figure S2. In the experiments so far, people measured 3 points and got something like figure S2.
On the page following that figure, the paper says:
Our procedure above applies to cases with enough experimental data. For systems lacking (sufficiently precise) measurements, we can still derive projections provided that an acceptable estimation of the F21 constant is available from either theory calculation or hyperfine splitting data (whenever available).

So I think the three points in figure S2 are themselves not necessarily 3 single measurements; instead, each of those points may be the mean value of many measurements.
 
  • #22
Dale said:
Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

I suggest least squares regression to a polynomial.

With one more point you could fit a third order polynomial.
I am not sure I understand; I do want to set very accurate bounds on the non-linearity. Basically I want to describe my points by ##y=ax+b+g(x)##, with ##g(x) \ll ax, b##. From there I want to set constraints as tight as possible on ##g(x)##. If I use a polynomial, won't that influence how tight the constraints are? On a more practical note, all the papers on this topic use this area method, so I assume that if polynomials worked they would have used them. But given that the literature uses areas, I would still like to find out the answer to my question for the case of using areas to define the non-linearity.
 
  • #23
jim mcnamara said:
You mentioned area. Area==zero. That is how to test for collinearity of points:
https://www.geeksforgeeks.org/program-check-three-points-collinear/

You can also use the distance test, if that makes any difference to you.

Now we are on the same page I hope.

The above is the best way to test when you want yes/no answers. Or use some kind of minimum-area test if you are okay with a not-"perfect" result. What you do in this case is up to you; this is arbitrary, you realize. Regression seems okay here, as others mentioned.

This is an example for "not perfect", which you already know:
https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html

Tolerance test of multi-collinearity -- what you are asking about i.e., "more points":
https://www.statisticshowto.com/tolerance-level-statistics/
I am not sure I understand what you mean. Of course area = 0 for collinear points. But in practice they won't be exactly on a straight line, as we have experimental errors. So the area will be of the form ##3 \pm 5##, which is not zero, but is consistent with zero within the error. My question is: if I add one more point and calculate the area formed by these 4 points, what do I gain compared to the case of having only 3 points? Sending me links to statistics webpages doesn't help me; I know the basics, I just don't know how to apply them to my problem.
 
  • #24
Stephen Tashi said:
On the page following that figure, the paper says: [...] So I think the three points in figure S2 are themselves not necessarily 3 single measurements; instead, each of those points may be the mean value of many measurements.
Oh yes, the points in figure S2 are the results of many measurements. In the experiment one measures the x and y for a given point several times, then places it on that plot in S2. After measuring 3 such points we quantify the non-linearity by calculating that area. My question is: if I measure a 4th point, with the same uncertainty as the other 3 points, do I gain anything in terms of better constraining the non-linearity compared to using only the 3 points?
 
  • #25
anuttarasammyak said:
I see. And the paper saying "Equivalently, in our geometrical picture it is the volume of the parallelepiped defined by ##\vec{m\nu}_{1,2}## and ##\vec{m\mu}##." confirms my view.

Going back to your point, what would you like to do beyond this triplet of vectors? Make a quartet by incorporating another vector? Get a set of triplets from many experiments?
I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).
 
  • #26
Dale said:
Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

I suggest least squares regression to a polynomial.

With one more point you could fit a third order polynomial.
Just to clarify: you might be right that using this area method is not the best. But whether it is the best or not, if I want to do something in practice and compare it with the literature, I need to use the same method as the one in the literature, which is this area approach. Even if I want to claim there is a better method, I still need to apply the old method to my problem to actually show by direct comparison that the new method does better. So for now let's assume that I have to use this area method, whether it is the best thing to do or not. So my question is: if I quantify the non-linearity using this method for more points, how does that help me set better constraints on the non-linearity than using only 3 points? Thank you!
 
  • #27
kelly0303 said:
On a more practical note, all the papers on this topic use this area method, so I assume that if polynomials worked they would have used them. But given that the literature uses areas, I would still like to find out the answer to my question for the case of using areas to define the non-linearity.
I did read the paper you posted earlier. The physics is far outside of my area of expertise. But from a statistical perspective the area approach makes no sense to me.

In general there is nothing particularly superior about modeling ##g(x)## as a piecewise linear function rather than as a polynomial with no zero- or first-order terms. The statistical methods for polynomials are very well studied and have been optimized over many decades, so personally I would prefer those. In my field we use the polynomial approach as the standard method to characterize non-linearity.

However, I understand the value of using the same strategy that has been previously used in the relevant literature. Unfortunately, I don’t know that there is a good way to generalize this niche approach to additional points. With this approach there may be no advantage to additional points (a feature that should call into question the approach)
 
  • #28
kelly0303 said:
I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).

Figure S2 https://arxiv.org/pdf/1704.05068.pdf is a plot where each point corresponds to data for one isotope pair taken from 3 distinct isotopes? How do you propose to add another point to that situation?
 
  • #29
Dale said:
I did read the paper you posted earlier. The physics is far outside of my area of expertise. But from a statistical perspective the area approach makes no sense to me.

In general there is nothing particularly superior about modeling ##g(x)## as a piecewise linear function rather than as a polynomial with no zero- or first-order terms. The statistical methods for polynomials are very well studied and have been optimized over many decades, so personally I would prefer those. In my field we use the polynomial approach as the standard method to characterize non-linearity.

However, I understand the value of using the same strategy that has been previously used in the relevant literature. Unfortunately, I don’t know that there is a good way to generalize this niche approach to additional points. With this approach there may be no advantage to additional points (a feature that should call into question the approach)
Thank you for this. So if I am to use the polynomial approach, could you please give me a bit more details (or point me towards some readings about that)?
 
  • #30
Stephen Tashi said:
Figure S2 https://arxiv.org/pdf/1704.05068.pdf is a plot where each point corresponds to data for one isotope pair taken from 3 distinct isotopes? How do you propose to add another point to that situation?
Yes, each point on x and y corresponds to a transition in a given isotope pair, so you measure 4 isotopes for that. My question was for the case in which you measure a 5th isotope. In that case you would add one more point on both axes, which would be ##m\nu_i^{AA_4}##. In principle, if you are able to measure the 2 transitions in more isotopes, you can add as many points as you want (you keep the reference isotope the same all the time, usually the one with the smallest uncertainties, labeled just as ##A## in that plot).
 
  • #31
kelly0303 said:
Thank you for this. So if I am to use the polynomial approach, could you please give me a bit more details (or point me towards some readings about that)?
Sure. The basic idea is that you model your data as ##y_i= b_0 x_i^0 + b_1 x_i^1 + b_2 x_i^2 + ... + \epsilon## where ##\epsilon \sim N(0,\sigma)## and the ##b_j## are the least squares fit terms. Note that even though the ##x_i^j## terms are non-linear for ##j\ge 2##, the fit is still an ordinary least squares linear fit because the ##b_j## terms are linear. So any typical ordinary least squares package will be able to fit this model.

Many fit packages will also be able to test for significance of the ##b_j## and give you both an estimate and a confidence interval for each. And if you need the area then you can simply evaluate the area under the polynomial to whatever order you wish and subtract the area under the first order polynomial.
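A minimal sketch of this in code (hypothetical numbers; assuming equal, known Gaussian errors on y only, whereas real data may also have x errors):

```python
import numpy as np

# Hypothetical measurements with an assumed known error sigma_y on each y
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.02, 3.01, 4.95, 7.08, 8.97])
sigma_y = 0.05

# Weighted least-squares fit of y = b0 + b1*x + b2*x^2
# (np.polyfit returns coefficients with the highest power first)
coeffs, cov = np.polyfit(x, y, 2, w=np.full_like(y, 1.0 / sigma_y), cov="unscaled")
b2, b2_err = coeffs[0], np.sqrt(cov[0, 0])

print(f"b2 = {b2:.4f} +/- {b2_err:.4f}")
print("approx. 95% interval:", (b2 - 1.96 * b2_err, b2 + 1.96 * b2_err))
```

The confidence interval on ##b_2## then plays the role of the ##A \pm dA## statement, and for a fixed x range it shrinks as points are added.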
 
  • #32
Dale said:
But from a statistical perspective the area approach makes no sense to me.
I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one leg of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another leg of the triangle, and their cross product has magnitude equal to the area of a parallelogram. So half the area of the parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements of the ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
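A rough numerical sketch of this suggestion (hypothetical points and errors; Monte Carlo resampling is used here in place of the analytic first-order expansion, which automatically handles the shared-vertex correlations, although the absolute value makes the distribution of ##\langle A \rangle## non-Gaussian when the points are nearly collinear):

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_triangle_area(pts):
    """<A>: average area of the triangles formed by consecutive point triples."""
    pts = np.asarray(pts, dtype=float)
    areas = []
    for i in range(len(pts) - 2):
        v1 = pts[i + 1] - pts[i]
        v2 = pts[i + 2] - pts[i + 1]
        areas.append(0.5 * abs(v1[0] * v2[1] - v1[1] * v2[0]))
    return np.mean(areas)

# Hypothetical measured points and their (x, y) uncertainties
pts = np.array([[1.0, 2.1], [2.0, 4.0], [3.0, 6.1], [4.0, 7.9]])
errors = np.full_like(pts, 0.05)

# Monte Carlo error propagation: resample the points within their errors
samples = [mean_triangle_area(pts + rng.normal(0.0, errors)) for _ in range(20000)]
print(f"<A> = {mean_triangle_area(pts):.4f} +/- {np.std(samples):.4f}")
```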
 
  • #33
kelly0303 said:
I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).
I observe that in the paper PHYSICAL REVIEW RESEARCH 2, 043444 (2020),
https://journals.aps.org/prresearch/pdf/10.1103/PhysRevResearch.2.043444
equation (12) and Fig. 1, on NL estimation in a multi-dimensional space, would be of help to you.
 
  • #34
anuttarasammyak said:
I observe that in the paper PHYSICAL REVIEW RESEARCH 2, 043444 (2020),
https://journals.aps.org/prresearch/pdf/10.1103/PhysRevResearch.2.043444
equation (12) and Fig. 1, on NL estimation in a multi-dimensional space, would be of help to you.
I came across that paper, but they are actually not doing what I am looking for. I am looking for the case in which you have just 2 transitions but more isotopes (so the system is overdetermined). In the paper you mentioned they have more than 2 transitions, but the system is not overdetermined.
 
  • #35
Let us say we repeat the 2-transition experiment on the isotopes.
We get:
1st experiment data plot graph DP_1 and NL estimation NL_1
2nd experiment data plot graph DP_2 and NL estimation NL_2
3rd experiment data plot graph DP_3 and NL estimation NL_3
----
n th experiment data plot graph DP_n and NL estimation NL_n
----

Is this the right story we are dealing with ?

EDIT
To be clearer, I add:
We keep our eyes on the specific level transitions during the experiment. We do our best to repeat the experiments in the same manner and under the same conditions. The NL_n are numerical data for the well-defined physical quantity NL, whose level transitions are defined and shared across all the experiments. In principle, we can make n as large as we wish.
 
  • #36
Twigg said:
I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one leg of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another leg of the triangle, and their cross product has magnitude equal to the area of a parallelogram. So half the area of the parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements of the ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
@Twigg @Dale Thank you for your replies! @Twigg I need to look a bit more into your idea (I'm not 100% sure I understand it). I just found this paper, which does what I need in equations 9 and 10, in terms of generalizing the formula to more than 4 isotopes. However, just by looking at that formula, I am still not sure I see how having more isotopes helps. It still looks to me like more isotopes would increase the propagated error, which makes no sense to me. Even looking at formula 12, which seems to be a rough approximation of the error on the parameter of interest (I still need to look at the derivation in more detail), and assuming that the errors on the measured transitions are the same for all isotopes, the only improvement comes from ##\Delta A_j^{max}##, which is the maximum difference between 2 isotopes. However, in practice, measuring one more isotope (assuming we use even ones) would increase that number from something like 8 to 10, or 10 to 12, which is not much of an improvement, and it doesn't even have to do with the statistics (not to mention that the bound presented there is an upper bound, so the actual improvement would be even lower). Am I missing something? I know for a fact that there is a big effort to measure isotope shifts for as many isotopes as possible (hence the new radioactive beam facilities), but at least for this particular problem that doesn't seem to help much.
 
  • #37
anuttarasammyak said:
Let us say we repeat the 2-transition experiment on the isotopes.
We get:
1st experiment data plot graph DP_1 and NL estimation NL_1
2nd experiment data plot graph DP_2 and NL estimation NL_2
3rd experiment data plot graph DP_3 and NL estimation NL_3
----
n th experiment data plot graph DP_n and NL estimation NL_n
----

Is this the right story we are dealing with ?
No, this is not what I am asking for. My question involves only 2 transitions, so there is just one NL. In figure 1 of the paper you mentioned, each NL corresponds to a different pair of transitions.
 
  • #38
I added an EDIT to #35 to make it clearer. Still no? Then what kind of experiment do you do to get a new vector to incorporate?
 
  • #39
anuttarasammyak said:
I added an EDIT to #35 to make it clearer. Still no? Then what kind of experiment do you do to get a new vector to incorporate?
This is the kind of experiment I am talking about (not sure if this is what you meant). Also, we don't want to add a new vector to the data; we want to make the previous vectors longer.
 
  • #40
Twigg said:
I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one leg of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another leg of the triangle, and their cross product has magnitude equal to the area of a parallelogram. So half the area of the parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements of the ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
I see that in the papers I read, they seem to completely ignore the theoretical error. Even in the paper I mentioned, in equation 11, they claim that we just need to propagate the error on the isotope shift measurement. However, the theoretical parameters (##X_1##, ##X_2##, ##F_{12}##) are usually quite poorly calculated (their relative error is a few percent, which is huge compared to the relative error of the IS measurements). Why can we just ignore those errors when setting bounds?
 
  • #41
Thanks for sharing the papers as you find them, it's very helpful and I've learned a few things. I think the method I shared above only applies to having extra electronic transitions, not extra isotopes. Sorry about that, and thanks for bearing with us! (Edit: I was right the first time. Whoops. Still, the method outlined by Solaro et al seems more consistent than what I was proposing, because my method could give you a fake "linear" reading for a curve with an inflection point in the middle.)

I'm still wrapping my head around the 4-dimensional shenanigans of the Solaro et al paper. But I think what you said about the only improvement coming from ##\Delta A^{max}_j## actually makes sense. The way I see it, Solaro et al were trying to measure just a non-zero non-linearity, and they observed pure linearity. I know they say they got ##NL = \frac{V}{\sigma_V} = 1.26##, but I have no idea what kind of nightmarish probability distribution NL follows, and they say in the main text, in the paragraph under Fig. 2, that they got a regression fit consistent with linearity. (I can at least interpret the second part with my pea brain; no idea what the first part means, lol.) The range ##\Delta A^{max}_j## represents the span of their experiment in unexplored parameter space. In other words, adding isotopes increases the scope of the experiment, but doesn't increase its sensitivity. The sensitivity only cares about the precision (in Hz) of the spectroscopic measurements. Does that make sense?

I would point out that the math one finds in the supplementary material can be a little typo-prone, so that's something to keep in mind. There's definitely a typo in Eqn 10, since the inside of the square root isn't dimensionally sound (I think they just forgot an exponent of 2).
 
  • #42
About them not counting the theoretical uncertainty: this is just a thing precision measurement people do. I can vouch for that.

The reason is that they're just looking to make a measurement not consistent with 0. The error bars on coupling constants may change the non-zero value you eventually see, but they don't affect whether or not you see 0. Only your experimental sensitivity determines that. If they get a measurement that's just barely not consistent with zero, they'll get the grad students (sounds like that might be you!) to work around the clock and get another 100 hours of data to push the measurement one way or the other (push it towards being consistent with 0, or reduce the error bar enough to definitively say it's non-zero). This is why, when I was on a precision measurement, I would pray for a measurement of zero. Beyond Standard Model physics is cool and all, but I'll take my weekends to myself, thank you.

Also, just wanted to say about the Solaro et al paper, that's one heck of a setup they've got. That's one intense spectroscopy experiment!
 
  • #43
Twigg said:
Thanks for sharing the papers as you find them, it's very helpful and I've learned a few things. I think the method I shared above only applies to having extra electronic transitions, not extra isotopes. Sorry about that, and thanks for bearing with us!

I'm still wrapping my head around the 4-dimensional shenanigans of the Solaro et al paper. But I think what you said about the only improvement coming from ##\Delta A^{max}_j## actually makes sense. The way I see it, Solaro et al were trying to measure just a non-zero non-linearity, and they observed pure linearity. I know they say they got ##NL = \frac{V}{\sigma_V} = 1.26##, but I have no idea what kind of nightmarish probability distribution NL follows, and they say in the main text, in the paragraph under Fig. 2, that they got a regression fit consistent with linearity. (I can at least interpret the second part with my pea brain; no idea what the first part means, lol.) The range ##\Delta A^{max}_j## represents the span of their experiment in unexplored parameter space. In other words, adding isotopes increases the scope of the experiment, but doesn't increase its sensitivity. The sensitivity only cares about the precision (in Hz) of the spectroscopic measurements. Does that make sense?

I would point out that the math one finds in the supplementary material can be a little typo-prone, so that's something to keep in mind. There's definitely a typo in Eqn 10, since the inside of the square root isn't dimensionally sound (I think they just forgot an exponent of 2).
So basically, from a statistics point of view, measuring more isotopes doesn't improve the bounds (up to that ##\Delta A^{max}_j## factor). What needs to be done is to actually reduce the uncertainty on each individual IS measurement. It still confuses me that adding more data doesn't help you. Intuitively I would imagine that observing linearity with 10 points would help you set much tighter bounds on new physics than with 3 points.

I am not totally sure I understand the theoretical uncertainty part. In this case the new physics coupling constant, call it ##\alpha##, is of the form ##\frac{A}{B}##, with A depending only on experimental data and B depending on both theory and experiment, but for now let's assume it depends only on theory. Assume that we get ##A = 10 \pm 20## and from theory we have ##B = 100 \pm 10##. Whether A is consistent with zero or not has nothing to do with the theory, and from the example above we can set a 95% confidence limit on A of ##A < 50##. However, their limits are on ##\alpha##, whose central value is ##10/100 = 0.1##. But I don't understand how we can just ignore the theoretical errors in this case. It seems like the way they quote the error would be ##\alpha < 0.1 + 20 + 20 = 40.1##, where only the experimental error is considered. Why can we ignore the ##\pm 10## coming from the theory?
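As an aside, standard first-order error propagation for ##\alpha = A/B## with the numbers above gives
$$\sigma_\alpha \approx \sqrt{\Big(\frac{\sigma_A}{B}\Big)^2 + \Big(\frac{A\,\sigma_B}{B^2}\Big)^2} = \sqrt{\Big(\frac{20}{100}\Big)^2 + \Big(\frac{10\cdot 10}{100^2}\Big)^2} \approx 0.20,$$
so here the theory term contributes almost nothing, precisely because it multiplies the central value of A, which is itself consistent with zero.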
 
  • #44
kelly0303 said:
I am not totally sure I understand the theoretical uncertainty part.
Your analysis here is correct, but the purpose is different. The goal isn't to quote the total uncertainty on ##\alpha##, but to give the best estimate of the experiment's sensitivity to ##\alpha##. That estimate is itself uncertain due to theory error bars, as your analysis shows. I couldn't tell you exactly where this practice started, but I believe the reason precision measurement folks do this is to compare experiments and rate them.
Imagine people quoted the total uncertainty on ##\alpha##, and imagine a case where the theory uncertainty dominated over the experimental uncertainty. Every experiment would approximately have the same "sensitivity" with this convention. So instead, people ignore the theory uncertainty and quote only the experiment uncertainty. That way it's easier to "rank" experiments. Of course, in reality this number doesn't mean anything on its own because for all you know an experiment with a lower statistical sensitivity could be bloated with systematic uncertainty.
 
  • #45
kelly0303 said:
Intuitively I would imagine that observing linearity with 10 points would help you set much tighter bounds on new physics than with 3 points.
I think what the expression in eqn 12 is saying is that what really matters is how far apart those 10 points are. When measuring non-linearity, you want a large lever arm over which to see the change in slope. You can make a better measurement with 3 points very far apart than with 10 points bunched together.
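To make the lever-arm point concrete (a standard result, not from the thread): for three equally spaced points at ##x##, ##x+h##, ##x+2h## on ##y = a + bx + cx^2##, the second difference gives ##y_1 - 2y_2 + y_3 = 2ch^2##, so the curvature estimate ##\hat{c} = (y_1 - 2y_2 + y_3)/(2h^2)## has uncertainty ##\sigma_{\hat{c}} = \sqrt{6}\,\sigma_y/(2h^2)## for independent errors ##\sigma_y##; doubling the spacing ##h## tightens the constraint on ##c## by a factor of four.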
 
  • #46
Twigg said:
Your analysis here is correct, but the purpose is different. The goal isn't to quote the total uncertainty on ##\alpha##, but to give the best estimate of the experiment's sensitivity to ##\alpha##. That estimate is itself uncertain due to theory error bars, as your analysis shows. I couldn't tell you exactly where this practice started, but I believe the reason precision measurement folks do this is to compare experiments and rate them.
Imagine people quoted the total uncertainty on ##\alpha##, and imagine a case where the theory uncertainty dominated over the experimental uncertainty. Every experiment would approximately have the same "sensitivity" with this convention. So instead, people ignore the theory uncertainty and quote only the experiment uncertainty. That way it's easier to "rank" experiments. Of course, in reality this number doesn't mean anything on its own because for all you know an experiment with a lower statistical sensitivity could be bloated with systematic uncertainty.
Thanks a lot! So basically it is something agreed upon when making these exclusion plots, ignoring the theoretical part? However, now there is something else that confuses me. In all these papers they talk about non-linearities coming from the SM, and how they can significantly affect the sensitivity to new physics. So as far as I understand, they can be calculated, but the errors are pretty big, so future experiments will try to measure more transitions in order to get rid of them using data, not theory. However, say that from the experiment we obtain a non-linearity of ##10 \pm 20##. If we assume no SM non-linearity, we would set a limit at ##<50##. Now, if from theory we have a predicted SM non-linearity of 5, the value of the non-linearity due to new physics would be ##(10-5) \pm 20 = 5 \pm 20##, from which we get a bound of ##<45##. So it looks as if by including the SM non-linearity we get an even better bound. What am I doing wrong here? How does the SM non-linearity reduce the sensitivity to new physics?
 
  • #47
kelly0303 said:
So basically it is something agreed upon when making these exclusion plots, ignoring the theoretical part?
Yes, but it's not specific to this problem. Here's a better explanation than what I gave before: say you measured a value of ##A## that's ##5\sigma## away from 0. The confidence in this measurement does not depend on theory whatsoever. Even if there's a lot of uncertainty on ##B## (and therefore ##\alpha##), you still proved the existence of novel physics.

The problem with the SM non-linearity is that it introduces a systematic uncertainty. When looking for minuscule effects, you always want to measure something that would be 0 in the SM. Under this condition, you can separate experimental and theoretical uncertainties. When you have to subtract a systematic shift from your measurement, you add uncertainty due to your correction, and thus the theoretical uncertainty on the SM non-linearity bleeds into your final error budget.

In your example, there would be theory error bars on the SM non-linearity of 5. The corrected non-linearity (##NL_{BSM} = NL_{observed} - NL_{SM}##) would have an error bar ##\sqrt{(20^2 + \sigma_{SM}^2)}## where ##\sigma_{SM}## is the uncertainty on the SM non-linearity of 5. Experiments often quickly outpace theoretical calculation in these projects.

That's why they try to cancel out the SM non-linearity by taking differential measurements. It's a classic rule of thumb for precision measurements to try and measure something with a baseline of 0 and to avoid non-zero correction factors like the plague.
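To put numbers on this using the example from the previous post (with a hypothetical theory uncertainty of ##\sigma_{SM} = 15## on the SM value of 5):
$$NL_{BSM} = (10 - 5) \pm \sqrt{20^2 + 15^2} = 5 \pm 25,$$
so the rough 95% upper bound becomes ##5 + 2\cdot 25 = 55##, worse than the ##<50## obtained by pretending there is no SM term: the theory uncertainty on a non-zero correction is what degrades the sensitivity.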
 
  • #48
Twigg said:
Yes, but it's not specific to this problem. Here's a better explanation than what I gave before: say you measured a value of ##A## that's ##5\sigma## away from 0. The confidence in this measurement does not depend on theory whatsoever. Even if there's a lot of uncertainty on ##B## (and therefore ##\alpha##), you still proved the existence of novel physics.

The problem with the SM non-linearity is that it introduces a systematic uncertainty. When looking for minuscule effects, you always want to measure something that would be 0 in the SM. Under this condition, you can separate experimental and theoretical uncertainties. When you have to subtract a systematic shift from your measurement, you add uncertainty due to your correction, and thus the theoretical uncertainty on the SM non-linearity bleeds into your final error budget.

In your example, there would be theory error bars on the SM non-linearity of 5. The corrected non-linearity (##NL_{BSM} = NL_{observed} - NL_{SM}##) would have an error bar ##\sqrt{(20^2 + \sigma_{SM}^2)}## where ##\sigma_{SM}## is the uncertainty on the SM non-linearity of 5. Experiments often quickly outpace theoretical calculation in these projects.

That's why they try to cancel out the SM non-linearity by taking differential measurements. It's a classic rule of thumb for precision measurements to try and measure something with a baseline of 0 and to avoid non-zero correction factors like the plague.
Oh I see, so basically we would need to measure one more transition in order to get rid of one SM non-linear effect. That extra transition would add some more error to the measurement, compared to just 2 transitions, but the extra error added is usually much smaller than the error on the theoretical prediction of the SM non-linearity, and now we also know that the obtained result reflects exclusively new physics (assuming there is just one SM non-linearity).
 
  • #49
Yep! I don't know the specifics, but it's undoubtedly something to that effect based on your description.

I wouldn't say it adds error, because taking two measurements also means you get to average down on the BSM non-linearity. For example, if you make two measurements that yield results like ##\alpha_1 = \alpha_{BSM} + \alpha_{SM}## and ##\alpha_2 = \alpha_{BSM} - \alpha_{SM}## (this is totally hypothetical), you'd get a ##\frac{1}{\sqrt{2}}## improvement in error over taking one measurement (because ##\alpha_{BSM} = \frac{1}{2} (\alpha_1 + \alpha_2)##, so ##\sigma_{BSM} = \frac{\sqrt{\sigma^2 + \sigma^2}}{2} = \frac{1}{\sqrt{2}}\sigma##). But that's the same as just taking two measurements without cancelling. If anything, the error per square root of the number of measurements is a constant.
 
  • #50
Twigg said:
Yep! I don't know the specifics, but it's undoubtedly something to that effect based on your description.

I wouldn't say it adds error, because taking two measurements also means you get to average down on the BSM non-linearity. For example, if you make two measurements that yield results like ##\alpha_1 = \alpha_{BSM} + \alpha_{SM}## and ##\alpha_2 = \alpha_{BSM} - \alpha_{SM}## (this is totally hypothetical), you'd get a ##\frac{1}{\sqrt{2}}## improvement in error over taking one measurement (because ##\alpha_{BSM} = \frac{1}{2} (\alpha_1 + \alpha_2)##, so ##\sigma_{BSM} = \frac{\sqrt{\sigma^2 + \sigma^2}}{2} = \frac{1}{\sqrt{2}}\sigma##). But that's the same as just taking two measurements without cancelling. If anything, the error per square root of the number of measurements is a constant.
One more question (sorry!). The actual expression for the new physics parameter ##\alpha## is of the form ##A/B##, as we said above, but in practice B contains both theory and experimental input. As you said, if we are to measure ##A\neq 0## at the ##5 \sigma## level, we know for sure we found new physics (assuming we have no SM effects). So when we calculate the error on ##\alpha## and do the error propagation, why do we need to propagate the error from the experimental part of B too? If all that matters is whether A is consistent with zero or not, why do we bother with B at all? Also, why don't people make exclusion plots for A directly, without involving ##\alpha## at all?
 
