Direct Echo-Based Measurement of the Speed of Sound - Comments

Dale · Mar 21, 2020

Dr. Courtney said:

Fitting this data to a traditional power law yields better agreement with the "known good" value for the exponent. Adding a third adjustable parameter (the vertical shift) gives a higher r-squared and a lower Chi-square, but it gives a less accurate value for the exponent in both the condensation and rarefaction cases.

By the way, you seem to have a misunderstanding here. You didn’t describe what you were doing in detail, but there should be no third term. In the model ##y=bx^m## the log of the parameter ##b## is the intercept. This type of model is fit by log-transforming “under the hood” so you actually fit ##\ln (y) = m \ln (x) + \ln (b)##. Thus a no-intercept fit would be one that coerces ##\ln (b)=0 \implies b=1##. That coercion would lead to bias in the estimate of ##m## if ##b\ne 1##
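
A minimal sketch of the log-transformed fit described above, using only the Python standard library. The data are synthetic and hypothetical (##y = 3x^{1.5}## with 2% multiplicative noise), and the function name `fit_loglog` is made up for illustration. It fits ##\ln(y) = m\ln(x) + \ln(b)## once with a free intercept and once with the intercept forced to zero (##b=1##), so the effect of that coercion on ##m## can be seen directly.

```python
import math
import random

def fit_loglog(xs, ys, with_intercept=True):
    """Fit ln(y) = m ln(x) + ln(b) by ordinary least squares.

    With with_intercept=False the intercept ln(b) is forced to 0, i.e. b = 1.
    Returns (m, b).
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    if with_intercept:
        u_mean = sum(u) / n
        v_mean = sum(v) / n
        sxx = sum((ui - u_mean) ** 2 for ui in u)
        sxy = sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
        m = sxy / sxx
        return m, math.exp(v_mean - m * u_mean)
    # No-intercept fit: the line is forced through ln(y) = 0 at ln(x) = 0.
    m = sum(ui * vi for ui, vi in zip(u, v)) / sum(ui * ui for ui in u)
    return m, 1.0

# Hypothetical synthetic data: y = 3 * x**1.5 with 2% multiplicative noise.
rng = random.Random(0)
xs = [1.0 + 0.5 * i for i in range(20)]
ys = [3.0 * x ** 1.5 * math.exp(rng.gauss(0.0, 0.02)) for x in xs]

m_free, b_free = fit_loglog(xs, ys, with_intercept=True)
m_forced, b_forced = fit_loglog(xs, ys, with_intercept=False)
print(f"with intercept: m = {m_free:.3f}, b = {b_free:.3f}")
print(f"b forced to 1:  m = {m_forced:.3f}")
```

Because the true ##b## here is 3 rather than 1, the forced fit has to absorb ##\ln(3)## into the slope, which is the bias mechanism described in the post. This sketch covers only the log-transform route, not a direct non-linear fit.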

Dr. Courtney · Mar 21, 2020

Dale said:

By the way, you seem to have a misunderstanding here. You didn’t describe what you were doing in detail, but there should be no third term. In the model ##y=bx^m## the log of the parameter ##b## is the intercept. This type of model is fit by log-transforming “under the hood” so you actually fit ##\ln (y) = m \ln (x) + \ln (b)##. Thus a no-intercept fit would be one that coerces ##\ln (b)=0 \implies b=1##. That coercion would lead to bias in the estimate of ##m## if ##b\ne 1##

No, I used a non-linear least squares fit directly to the model without the log transformation. In that approach, the vertical offset is a third parameter. In the log transformation approach, the intercept is necessary, because it is the second parameter needed and has a physical meaning relating to the data. In direct application of non-linear least squares, the vertical shift is not necessary or helpful, since it is the third parameter, has no physical meaning, and makes the results less accurate. I can't see how you thought I used the obsolete log transform approach, since it is not possible to do that with the three parameter model (true power law plus vertical shift.)

Dale · Mar 21, 2020

Dr. Courtney said:

I used a non-linear least squares fit directly to the model without the log transformation.

Ah. All of my comments above are specifically about linear least squares fits. Non-linear fits are more complicated and need much more care with regard to the errors. Biased estimates are pervasive for non linear fits.

Dr. Courtney said:

In direct application of non-linear least squares, the vertical shift is not necessary or helpful, since it is the third parameter, has no physical meaning, and makes the results less accurate.

No objection here.

Dr. Courtney · Mar 21, 2020

Dale said:

Ah. All of my comments above are specifically about linear least squares fits.

To clarify, do you mean linear models or linear least squares fits? One can use Levenberg-Marquardt (a non-linear least squares method) on linear models. I don't think how one obtains the best-fit parameters matters as long as Chi-square is minimized and R-squared is closest to one in the fit determined to be "best." In any case, let's return to linear models.

The graph below shows linear best fits of Mass vs. Volume for distilled water. The slope of the best fit line should give the density of water. I feel using a line with a vertical offset set to zero is warranted, since I know the mass of water with zero volume is zero. "Forcing" the line through the origin yields a density of water of 0.9967 g/mL with an uncertainty of 0.0014 g/mL estimated from the algorithm in the spreadsheet LINEST command. This agrees reasonably with the known good density of distilled water at 70 deg F, which is 0.99802 g/mL. Not bad for a student grade graduated cylinder and electronic balance with 0.1 g resolution. The experiment and analysis produced an accuracy better than 0.2%.

What happens when a vertical shift is allowed? The analysis suggests the density of water is 1.0045 g/mL with an estimated uncertainty of 0.001 g/mL. Even though the accuracy estimate is 0.1%, the actual error is over 0.6%.
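
The comparison above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the spreadsheet analysis from the post: the volume and mass values are hypothetical synthetic numbers, the function names are invented here, and the standard errors use the usual textbook formulas (##n-1## degrees of freedom without an intercept, ##n-2## with one).

```python
import math
import random

def fit_through_origin(xs, ys):
    """Least-squares fit of y = m*x. Returns (m, standard error of m)."""
    n = len(xs)
    sxx = sum(x * x for x in xs)
    m = sum(x * y for x, y in zip(xs, ys)) / sxx
    rss = sum((y - m * x) ** 2 for x, y in zip(xs, ys))
    return m, math.sqrt(rss / (n - 1) / sxx)

def fit_with_intercept(xs, ys):
    """Least-squares fit of y = m*x + c. Returns (m, c, se_m, se_c)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    c = y_mean - m * x_mean
    rss = sum((y - m * x - c) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    return m, c, math.sqrt(s2 / sxx), math.sqrt(s2 * (1 / n + x_mean ** 2 / sxx))

# Hypothetical data: volumes in mL, masses in g rounded to 0.1 g like the balance.
rng = random.Random(1)
volumes = [10.0 * i for i in range(1, 11)]
masses = [round(0.998 * v + rng.gauss(0.0, 0.15), 1) for v in volumes]

m0, se0 = fit_through_origin(volumes, masses)
m1, c1, se1, se_c1 = fit_with_intercept(volumes, masses)
print(f"through origin: density = {m0:.4f} +/- {se0:.4f} g/mL")
print(f"with intercept: density = {m1:.4f} +/- {se1:.4f} g/mL, "
      f"offset = {c1:.2f} +/- {se_c1:.2f} g")
```

Comparing the fitted offset with its own standard error is one way to judge whether the offset is statistically distinguishable from zero for a given data set.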

sophiecentaur · Mar 21, 2020

Dr. Courtney said:

since I know the mass of water with zero volume is zero.

This is where my problem lies. Of course we know that zero volume will have zero mass, but how can you be sure that your volume and mass measurements have no offset, when all measured points on your graph (except your artificial one) could be subject to the same offset? If the offset were greater than the other random factors and if you took enough readings (@Dale has already made this point) then the offset would reveal itself in the position of the intercept of a good straight line. Are you saying that you should just ignore this actual intercept value and not put the final dot on this line? It could represent an experimentally significant piece of information. Where is the 'reality' in this discussion?

I made the point, earlier, that there will always be an offset in time / distance measurements when the source and receiver are very close together because the positions are undefinable (there are no point sources or point receivers). You seem to suggest that there is a 'real' answer to the final value of velocity from your experiment but there will always be some uncertainty.

Introducing Power Laws and Log scales is merely clouding the issue. How is finding and demonstrating that there is an underlying linear relationship between two measured variables helped by forcing any single data point into the process which would, in fact suggest that the relationship is not linear?

Dr. Courtney · Mar 21, 2020

I've provided case after case of fitting where fits without an offset provide better agreement with known good parameter values.

You repeat the same tired theoretical considerations in the face of additional demonstrations that leaving the offset out produces more accurate parameter values. I have no need to answer theoretical objections when case after case of experimental data supports leaving the offset out.

As Feynman said, "The easiest person to fool is yourself." But Feynman also noted the more important principle in the scientific method is that experiment is the ultimate arbiter. I have additional experimental data showing that leaving the offset out gives better agreement with known good parameter values. Would you like to see it?

sophiecentaur · Mar 21, 2020

Dr. Courtney said:

I have additional experimental data showing that leaving the offset out gives better agreement with known good parameter values.

Could you tell me how you manage to measure the distance when it is small? How do you define the positions of source and detector (relative to the reflector) and how big are the two terminals?

sophiecentaur · Mar 21, 2020

Dr. Courtney said:

But Feynman also noted the more important principle in the scientific method is that experiment is the ultimate arbiter.

Doesn't that imply that the experimental result for small distances is the arbiter? But Ad Hominem arguments are not really relevant here. We all know Richard was a smart guy but that doesn't mean his particular statements apply to your particular example.

Vanadium 50 · Mar 21, 2020

I hope this adds more light than heat.

First, there is a theorem, I forget by whom, that states that in general the estimator with the smallest variance is not unbiased. You can sort of see why this might be the case: if you want the smallest σ/μ, if μ is biased high, σ/μ will be biased low.

Next, in this situation we are fitting models. x = vt is a model. So is x = vt + x₀. So is x = at² + vt + x₀. Statistics can't tell you what model is right or wrong. It can only tell you what fits well and what doesn't.

It has been suggested to add extra terms and if they come out zero, they don't hurt anything. One (of several) problems with this is "where do you stop?" You can always add terms and the best fit will always be when the number of parameters equals the number of data points. These "fits" tend to be unphysical and wiggly.

Resolution in x has been cited as a reason not to use the x = vt model. However, if you're going down that path, you also need to consider resolution in t. This opens up all sorts of cans of worms, previously discussed here.

Swamp Thing · Mar 21, 2020

Vanadium 50 said:

Next, in this situation we are fitting models. x = vt is a model. So is x = vt + x0. So is ##x = at^2 + vt + x0##. ... One (of several) problems with this is "where do you stop?"

Perhaps we should think of the model as representing the physical phenomenon as well as the error mechanisms. For example, in the Insight article (distance measurement), if the air temperature during the experiment can vary as we move away from the reflecting wall, then ##x = a_0vt+a_1t^2 vt + x0## might not be such a ridiculous idea. On the other hand, if we are sure that the speed of sound is independent of distance from the wall, then our model should not include a square term. Based on this philosophy, we should probably attribute the intercept ##x_0## to a measurement bias.

Once we are clear about which terms belong to the phenomenon model, and which ones belong to the error model, then we can subtract out the terms that we're sure come from the error model. Me, I would bet on subtracting out the ##x0##.

Or, if we want to add some finesse, we can increase the weight of measurements in proportion to the distance from the wall, since we can be less confident about a measurement that is too close to the wall. (if we do 20 measurements at a place 2 meters from the wall and 20 measurements somewhere 50 meters from the wall, will the first set have a larger standard deviation?)
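
A minimal sketch of that weighting idea, assuming the model ##x = vt + x_0## and weights simply proportional to distance as suggested above. The times and path lengths are hypothetical, and `weighted_line_fit` is an illustrative name. In standard weighted least squares the weights would instead be the inverse variances of the measurements, which would have to be estimated from repeated trials.

```python
def weighted_line_fit(ts, xs, weights):
    """Weighted least-squares fit of x = v*t + x0. Returns (v, x0)."""
    w_sum = sum(weights)
    t_mean = sum(w * t for w, t in zip(weights, ts)) / w_sum
    x_mean = sum(w * x for w, x in zip(weights, xs)) / w_sum
    stt = sum(w * (t - t_mean) ** 2 for w, t in zip(weights, ts))
    stx = sum(w * (t - t_mean) * (x - x_mean)
              for w, t, x in zip(weights, ts, xs))
    v = stx / stt
    return v, x_mean - v * t_mean

# Hypothetical echo data: round-trip times (s) and path lengths (m).
times = [0.012, 0.030, 0.059, 0.117, 0.292]
paths = [4.3, 10.2, 20.1, 40.0, 100.1]

v_equal, x0_equal = weighted_line_fit(times, paths, [1.0] * len(paths))
v_dist, x0_dist = weighted_line_fit(times, paths, paths)  # weight grows with distance
print(f"equal weights:    v = {v_equal:.1f} m/s, x0 = {x0_equal:.2f} m")
print(f"distance weights: v = {v_dist:.1f} m/s, x0 = {x0_dist:.2f} m")
```

With equal weights the function reduces to an ordinary straight-line fit, so the two printed lines show how much the down-weighted near-wall points move the result for a given data set.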

In some situations, the error model and the phenomenon model can both contribute to a particular term. In this case, a probabilistic approach can be used to partition the term between the error model and the phenomenon model. IIRC, something like this is done in Kalman filters, for example.

sophiecentaur · Mar 22, 2020

The two approaches both have their place, I guess but there is always a potential offset in deciding on the effective position of the origin of any wave. It may be more obvious for an omnidirectional source but, when there is a reflector involved and when the wavelength of the wave is greater than the distance, there will be dispersion and distortion of the wave / pulse shape.

When you get down to it, it depends whether you want to get 'the appropriate answer for a particular model' for the benefit of the student or whether you want to explore the experiment itself and see how it affects the model that operates in that instance.

Dr. Courtney · Mar 22, 2020

Experiments are the arbiters of theoretical assertions, and the proper experiment depends on the assertion. If the assertion is essentially "adding an offset to a model of a proportionality is always better", then it depends on what you mean by "better." For me "better" means "more accurate" since accuracy is key in testing the specific hypothesis and in determining the constant of proportionality.

Vanadium 50 said:

It has been suggested to add extra terms and if they come out zero, they don't hurt anything. One (of several) problems with this is "where do you stop?" You can always add terms and the best fit will always be when the number of parameters equals the number of data points. These "fits" tend to be unphysical and wiggly.

Extra terms do hurt. Even when they are statistically not different from zero, the cases above demonstrate that allowing them increases the errors of the parameters from the known good values.

Extra terms are also unnecessary for testing hypotheses relating to proportionality. Before least squares fitting was invented, there were hundreds of years of sound scientific practice testing proportions without a constant term. That's how Kepler tested his third law: T^2 = k a^3. That's how Galileo tested his law of falling bodies: d = k t^2. That's how Robert Boyle tested Boyle's law: PV = k. Least squares fitting makes constant terms much easier to include, but it does not make them necessary.
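
As an illustration of testing a proportion without any fit, the sketch below checks Kepler's third law by computing T^2 / a^3 for each planet and seeing whether the ratio is constant. The orbital values are rounded modern figures quoted from memory (a in AU, T in years), so they should be treated as illustrative rather than authoritative, and `kepler_ratios` is a name made up for this example.

```python
# Rounded modern values, for illustration only:
# semi-major axis a in AU, orbital period T in years.
planets = {
    "Mercury": (0.387, 0.2408),
    "Venus": (0.723, 0.6152),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def kepler_ratios(data):
    """Return T^2 / a^3 for each body; a proportionality predicts a constant."""
    return {name: T ** 2 / a ** 3 for name, (a, T) in data.items()}

ratios = kepler_ratios(planets)
for name, k in ratios.items():
    print(f"{name:8s} T^2/a^3 = {k:.4f}")
```

In these units the constant k should come out close to 1 for every planet; the scatter among the ratios plays the role that fit residuals would play in a least squares analysis.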

If there are significant resolution issues in either time or position, adding terms can remedy the experimental limitations with more advanced analysis. Using a video camera to test Galileo's law of falling bodies (and determine g), the time resolution is 1/30 or 1/60 of a second, and the position resolution might be 1720 pixels across the vertical field of view. The time resolution is the big limitation since it is much harder to drop an object at the exact moment a frame is captured. Adding constant and linear terms to the equation saves the day, both allowing testing of Galileo's law of falling bodies AND determination of g within 1-2% (the accuracy is now limited by the imperfect linear mapping of pixels to position.) But there is something of a happy accident here in that adding constant and linear terms doesn't change the value of the coefficient of the quadratic term. This is not generally the case.

As I mentioned in the original speed of sound article, there is no need for a least-squares fit. The historical procedure for testing a hypothesis of proportionality is adequate. One can compute a speed of sound simply as V = d/t for each data point. If one has a number of data points, one can then compute a mean and standard error for comparison with the speed of sound predicted for the measured temperature. Values computed this historical way tend to be within the standard error of those predicted from the temperature and obtained from a least squares fit without a vertical shift. Adding the vertical shift, even though the shift is statistically not different from zero, pulls the slope further from the proportionality constant obtained using the historical method (d/t) and further from the predicted speed of sound.
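
The historical procedure described above is short enough to sketch directly. The distances and times below are hypothetical numbers, not data from the article, and `speed_from_ratios` is an illustrative name; d is assumed to be the full path length travelled by the sound.

```python
import math
import statistics

def speed_from_ratios(distances, times):
    """Return (mean, standard error of the mean) of the ratios d/t."""
    speeds = [d / t for d, t in zip(distances, times)]
    mean = statistics.mean(speeds)
    sem = statistics.stdev(speeds) / math.sqrt(len(speeds))
    return mean, sem

# Hypothetical data: path lengths d (m) and echo times t (s).
distances = [60.0, 80.0, 100.0, 120.0, 140.0]
times = [0.1745, 0.2330, 0.2905, 0.3500, 0.4070]

v_mean, v_sem = speed_from_ratios(distances, times)
print(f"V = {v_mean:.1f} +/- {v_sem:.1f} m/s")
```

Note that this average of ratios weights every trial equally, whereas a least squares line through the origin gives more influence to the points at larger t, so the two estimates need not coincide exactly.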

Before selecting an analysis approach for students, my habit is to run a pilot experiment or two along with the analysis. Of course, I try common variations on the analysis which is how I learned that the constant and linear terms save the day for a video experiment on Galileo's law of falling bodies, but that the constant term reduces accuracy on the speed of sound experiment. For some physics classes, it might make sense to try a linear fit with the offset as an additional test on the proportionality hypothesis, d = Vt. If one has done a good experiment (sufficiently small systematic errors), an offset different from zero (in the sense of statistical significance) would cast doubt on the hypothesis. But this experiment is not hard to do well. A simple tape measure determines the distance to the wall to much better than 1%, the separation from the firecracker to the wall is over 1000 times the microphone offset, and the firecracker produces very high frequencies so that diffraction does not significantly harm the speed of sound determination. Wind has the greatest potential to introduce systematic errors, but there are a lot of still mornings to avoid that.

Dale · Mar 22, 2020

Dr. Courtney said:

I've provided case after case of fitting where fits without an offset provide better agreement with known good parameter values.

You repeat the same tired theoretical considerations in the face of additional demonstrations that leaving the offset out produces more accurate parameter values. I have no need to answer theoretical objections when case after case of experimental data supports leaving the offset out.

This is an absolutely terrible stance to take, and contrary to your belief this is unscientific.

There are an infinite number of cases where fitting without the intercept is OK. There are also an infinite number of cases where fitting without the intercept introduces bias. This is what the "tired theoretical considerations" that you want to ignore says, so your presented data is not evidence contrary to the accepted theory. Your data is a non-random sample from the set of all experiments testing a physical theory having no intercept.

You have an alternative theory that you have not clearly formulated but seems to be along the lines of "as long as my physical theory has no intercept it is preferable for my statistical model to also have no intercept". Your data also supports this theory, but my data posted earlier contradicts it. (please feel free to express your theory clearly in your own words)

So, together we have a set of data presented in this thread, yours and mine, that is consistent with the standard and well-known "tired theoretical considerations" you dismiss, but are inconsistent with the idea that it is generally safe to fit statistical models without intercepts given a physical model with no intercept. If you really wish to be scientific and if you really wish to rely on experiment, then the correct conclusion supports the standard statistical theory which urges caution in fitting no-intercept models and explains possible failure modes and their causes. Your desire to ignore such established knowledge is not scientific at all.

Dale · Mar 22, 2020

Vanadium 50 said:

It has been suggested to add extra terms and if they come out zero, they don't hurt anything.

That is a little bit overstating it. The suggestion is specifically to add an intercept to a model ##y=mx##. We are not talking about adding higher order terms that make the fit more wiggly. We are talking about adding the intercept to prevent the slope from becoming biased (among other considerations).

Vanadium 50 · Mar 22, 2020

Sure, but why pick that particular model? How do you decide your model needs exactly one improvement, and that this is it?

Dr. Courtney · Mar 22, 2020

Vanadium 50 said:

Sure, but why pick that particular model? How do you decide your model needs exactly one improvement, and that this is it?

The point I doubt is that the direct proportion is unique among power laws in needing (or benefiting from) an added constant. Every power law can be expressed as a direct proportion with a suitable change of variables. Does this mean it is suitable to take this approach? Is this how the most accurate mass of the Earth and sun really need to be determined? Even if this were so, does this make adding the constant appropriate for models in intro physics courses when testing the hypothesis and accomplishing the learning objectives can be done adequately without it?

Even if it were true that "every direct proportion needs an added constant in the analysis" (which I doubt), certainly this is a refinement that may be ignored for most intro physics labs (kinda like neglecting air resistance and other common simplifications.)

Dale · Mar 22, 2020

Swamp Thing said:

Perhaps we should think of the model as representing the physical phenomenon as well as the error mechanisms.

That is a good approach. We discussed this briefly earlier in the context of Occam’s razor. It doesn’t make a difference for Occam’s razor if you have a simple effect model and a complicated error model or a complicated effect model and a simple error model. But if you prefer the complicated error model for other reasons then that is fine.

Dale · Mar 22, 2020

Vanadium 50 said:

Sure, but why pick that particular model? How do you decide your model needs exactly one improvement, and that this is it?

Because you can demonstrate that if you don’t make that specific one improvement then your other parameter estimates can become biased (among other specific considerations mentioned earlier). Adding other higher order terms is not similarly justifiable.

Personally, I like Bayesian statistics where this type of concern is directly addressable. Bayesian model comparison allows you to decide naturally between a more or less complicated model. Of course, those techniques are sensitive to your priors, which is not necessarily damning, but does require care.

With frequentist statistics there are other model comparison techniques which can be used. The BIC, in particular, helps guard against overfitting.
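
A minimal sketch of a BIC comparison between the no-intercept and intercept models, under the assumption of independent Gaussian errors, for which the BIC can be written up to an additive constant as ##n\ln(\mathrm{RSS}/n) + k\ln(n)## with ##k## fitted parameters. The data are synthetic and hypothetical, and the function names are invented for illustration.

```python
import math
import random

def rss_origin(xs, ys):
    """Residual sum of squares for the best fit of y = m*x."""
    m = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return sum((y - m * x) ** 2 for x, y in zip(xs, ys))

def rss_intercept(xs, ys):
    """Residual sum of squares for the best fit of y = m*x + c."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    c = y_mean - m * x_mean
    return sum((y - m * x - c) ** 2 for x, y in zip(xs, ys))

def bic(rss, n, k):
    """BIC for Gaussian errors, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical synthetic data: slope 2, noise sd 0.5, with and without an offset.
rng = random.Random(2)
xs = [float(i) for i in range(1, 21)]
for offset in (0.0, 5.0):
    ys = [2.0 * x + offset + rng.gauss(0.0, 0.5) for x in xs]
    b0 = bic(rss_origin(xs, ys), len(xs), 1)
    b1 = bic(rss_intercept(xs, ys), len(xs), 2)
    print(f"true offset {offset}: BIC(no intercept) = {b0:.1f}, "
          f"BIC(intercept) = {b1:.1f}")
```

The model with the lower BIC is preferred. The extra ##\ln(n)## penalty means the intercept is only favored when it reduces the residuals by more than chance alone would be expected to.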

Dale · Mar 22, 2020

Dr. Courtney said:

this is a refinement that may be ignored for most intro physics labs

I do agree completely with this. It should be handled in a dedicated statistics class, not an intro physics class.

I just disagree that instructing students to click on the “fit with no intercept” button is ignoring the issue. Quietly leaving it at the default value without comment would be ignoring the issue.

Dr. Courtney · Mar 22, 2020

Dale said:

I do agree completely with this. It should be handled in a dedicated statistics class, not an intro physics class.

I just disagree that instructing students to click on the “fit with no intercept” button is ignoring the issue. Quietly leaving it at the default value without comment would be ignoring the issue.

But is it even needed to do a least squares fit? The traditional approach of computing a proportionality constant for each data point (and then averaging them, perhaps computing a standard error if the level of the course warrants) ALSO completely ignores a potential offset. Is it preferable to the science or the pedagogy to abandon this approach in favor of a least squares fit with offset? I actually had students do it two ways (compute speeds from each trial, average them and compute SEM) AND compute a linear fit without the offset. Adding the offset increases the estimated uncertainty, and yields a slope value further from the average of individual trials AND further from predicted speed of sound based on temperature.

I don't take scientific or teaching advice from default software settings, and to be honest I had forgotten what the default was on the LINEST spreadsheet command until I checked. Since I want students to see the error estimate for the slope, I teach them to type the whole command =LINEST(Y array,X array,0,1). One changes the 0 to a 1 to add an offset. To have the software use the default, one types =LINEST(Y array, X array, , 1). It's my preference as a teacher to instruct students what all the arguments are and to use them in spreadsheet calls. It's also my preference to have due consideration of available estimates of uncertainty (which are always LARGER adding the offset to the speed of sound analysis.) In an experiment with only five data points, adding a second adjustable parameter almost always significantly increases the uncertainty of the slope when the offset is statistically no different from zero. If there is both confidence in the experimental design that there is no significant offset, and if this confidence is supported by trying an offset with data sets from pilot work, the offset is not only unnecessary, it is a bad idea.

Vanadium 50 · Mar 22, 2020

Dale said:

Because you can demonstrate that if you don’t make that specific one improvement then your other parameter estimates can become biased (among other specific considerations mentioned earlier).

Would you make the same argument if this were an Ohm's Law measurement? If you want to make the argument on mathematical grounds, I think you have to. Then you have to decide whether you want a model with current spontaneously flowing with zero voltage or a minimum voltage below which current doesn't flow.

If you're arguing you want a different model in the two cases, then you're agreeing with me: statistics won't tell you the model to use; it will only tell you how well it fits.

Dale · Mar 22, 2020

Dr. Courtney said:

But is it even needed to do a least squares fit? The traditional approach of computing a proportionality constant for each data point (and then averaging them, perhaps computing a standard error if the level of the course warrants) ALSO completely ignores a potential offset.

That would be fine in my opinion. It would defer instruction about statistics for a dedicated statistics class where the statistical issues can be presented in appropriate depth.

Dale · Mar 22, 2020

Vanadium 50 said:

Would you make the same argument if this were an Ohm's Law measurement? If you want to make the argument on mathematical grounds, I think you have to.

Absolutely. As you say, this is a mathematical issue not a physical issue.

Vanadium 50 said:

If you're arguing you want a different model in the two cases, then you're agreeing with me: statistics won't tell you the model to use; it will only tell you how well it fits.

Well, I am not arguing that I want a different model, but I do agree that statistics won’t tell you the model to use.

It can, however, tell you the possible failure modes of different methods. I would much rather lose a little precision than risk introducing bias. Precision can be “fixed” with additional data, bias cannot.
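
The precision-versus-bias trade-off can be illustrated with a small Monte Carlo sketch. Everything here is hypothetical: five points at x = 1..5, a true slope of 1, Gaussian noise of 0.2, and a true offset that is either 0 or 0.5. It only illustrates the statistical behaviour of the two fits; it says nothing about whether any particular real experiment has an offset.

```python
import random
import statistics

def slope_origin(xs, ys):
    """Slope of the least-squares line forced through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def slope_intercept(xs, ys):
    """Slope of the least-squares line with a free intercept."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))

def simulate(offset, true_slope=1.0, noise=0.2, trials=2000, seed=0):
    """Return {method: (mean slope error, spread of slope)} over many data sets."""
    rng = random.Random(seed)
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    estimates = {"origin": [], "intercept": []}
    for _ in range(trials):
        ys = [true_slope * x + offset + rng.gauss(0.0, noise) for x in xs]
        estimates["origin"].append(slope_origin(xs, ys))
        estimates["intercept"].append(slope_intercept(xs, ys))
    return {name: (statistics.mean(vals) - true_slope, statistics.stdev(vals))
            for name, vals in estimates.items()}

for offset in (0.0, 0.5):
    for name, (bias, spread) in simulate(offset).items():
        print(f"offset {offset}: {name:9s} bias = {bias:+.3f}, spread = {spread:.3f}")
```

With no true offset the through-origin slope has the smaller spread. With a true offset its mean error settles at a nonzero value that more trials do not remove, while the intercept fit stays centered on the true slope at the cost of a larger spread.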

Dr. Courtney · Mar 23, 2020

Dale said:

That would be fine in my opinion. It would defer instruction about statistics for a dedicated statistics class where the statistical issues can be presented in appropriate depth.

Not at all. In two semesters of my lab physics courses, there still will be lots of least squares fitting to models that are functions other than direct proportions - over 10 cases. I suspect students would find it odd that we use least-squares fitting so often for other cases but avoid it for direct proportions. The smart ones would want to know why. Further, I've taught both intro and intermediate college statistics courses. Those courses tend to be packed with too much other material to spend much time on the theoretical development of least squares. The question of whether and why direct proportions are unique among functions in needing a vertical offset is far enough into the weeds that most instructors are not going to spend much time on it.

Dale said:

There are an infinite number of cases where fitting without the intercept is OK. There are also an infinite number of cases where fitting without the intercept introduces bias. This is what the "tired theoretical considerations" that you want to ignore says, so your presented data is not evidence contrary to the accepted theory. Your data is a non-random sample from the set of all experiments testing a physical theory having no intercept.

You have an alternative theory that you have not clearly formulated but seems to be along the lines of "as long as my physical theory has no intercept it is preferable for my statistical model to also have no intercept". Your data also supports this theory, but my data posted earlier contradicts it. (please feel free to express your theory clearly in your own words)

So, together we have a set of data presented in this thread, yours and mine, that is consistent with the standard and well-known "tired theoretical considerations" you dismiss, but are inconsistent with the idea that it is generally safe to fit statistical models without intercepts given a physical model with no intercept. If you really wish to be scientific and if you really wish to rely on experiment, then the correct conclusion supports the standard statistical theory which urges caution in fitting no-intercept models and explains possible failure modes and their causes. Your desire to ignore such established knowledge is not scientific at all.

It is misrepresenting my position to assert that I've dismissed the "tired theoretical considerations." But my approach to science is to hold theoretical assertions tentatively and keep on the lookout for real data sets against which to test them. I've given two cases of experiments where a vertical offset was needed and added due to known measurement challenges even though the physical theory goes through the origin.

Your assertion seems to be "one cannot know whether adding a constant will improve accuracy, so it should always be used to avoid the risk of introducing bias."

My assertion is that "the careful experimenter or data analyst can make a carefully considered choice of whether to fit to a constant with an added offset and often achieve a more accurate value for the slope for real experimental data without a significant risk of achieving a less accurate value." - The Dr. Courtney hypothesis.

Thanks to your simulation, I now see that simulated data sets won't do to arbitrate between our positions, since a choice is always made whether to add an offset in generating the data set. Real experimental data is needed that the experimenter or analyst believes does not have systematic errors large enough to require a vertical offset. Simulations might be useful in relating the magnitude of the systematic offset to the random noise to see where adding the offset begins to be required to recover the more accurate slope.

Arbitrating between our positions with experimental data requires data sets with known good slopes. Inclusion of data sets also requires the experimenter or analyst understand the physical system (including the measurement system) and determine the data set is a good choice for fitting without an offset. The crux of using real data to test my hypothesis is how to define "known good slope." So far, my approach has been to accept slopes as known good if they are known with greater accuracy than the analysis of the available data is likely to produce. The density of distilled water meets this criterion, as does pi. In V50's example of Ohm's law, using a resistance determined on a much more accurate Ohm meter would be a "known good." I've got some data sets from four or five different electronic balances. One might use the slope obtained from the most accurate electronic balance as the "known good."

You seem to be open to the idea of also using the average of slopes determined from individual measurements, since I don't recall you claiming that getting the slope this way is "biased." Error estimates of this process are usually comparable with the uncertainty of the best fit slope without the offset. So it depends on whether the selection criteria allow unbiased values of comparable accuracy or demand that the known good be significantly more accurate. There are many more available data sets including this data.

Dale · Mar 23, 2020

Dr. Courtney said:

The question of whether and why direct proportions are unique among functions in needing a vertical offset is far enough into the weeds that most instructors are not going to spend much time on it.

They are not. Any OLS linear fit needs the intercept term. I am not sure why you believe that. As far as I know it is not supported in the literature.

Dr. Courtney said:

My assertion is that "the careful experimenter or data analyst can make a carefully considered choice of whether to fit to a constant with an added offset and often achieve a more accurate value for the slope for real experimental data without a significant risk of achieving a less accurate value." - The Dr. Courtney hypothesis.

How do you know if your "carefully considered choice" is correct? Particularly in the general case without a "gold standard" reference to fall back on.

Dr. Courtney said:

Real experimental data is needed that the experimenter or analyst believes does not have systematic errors large enough to require a vertical offset.

Here is one such example: https://arc.aiaa.org/doi/10.2514/1.B36120

See in particular their figure 19. The experimenter did not believe that they required a vertical offset. They have good theoretical reasons to believe that 0 power would produce 0 thrust, every bit as valid as your 0 volume is 0 weight and 0 distance is 0 time. But the data clearly should be fit to a model with an intercept and the no-intercept slope is clearly biased positive.

Dr. Courtney · Mar 23, 2020

Dale said:

They are not. Any OLS linear fit needs the intercept term.

So casting power laws as linear fits with a transformation of variables requires an offset, but performing a NLLS on the original functional form does not? Testing Kepler's Third as T = k a^1.5 needs an offset if the exponent is fixed as 1.5 but not if it is allowed to vary? The OLS model should be T = k a^1.5 + c, but the NLLS model can be T = k a^n ?

Dale said:

How do you know if your "carefully considered choice" is correct? Particularly in the general case without a "gold standard" reference to fall back on.

If having an offset of zero works in most cases of a "carefully considered choice" when there is a known good value, then there is no data to support the hypothesis that it is suddenly going to introduce significant errors in cases without a known good value. In every case, it will be possible to compare the slope of the best fit line with the value obtained from averaging the ratios. In every case it will also be possible to include the offset term and see if a fit yields a value significantly different from zero.
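
The two checks described above can be sketched as follows. This is only an illustration under stated assumptions: the data are invented (true slope 3 with small fixed perturbations), `offset_check` is a hypothetical helper name, and "significantly different from zero" is approximated by the rough rule of thumb |t| < 2 rather than a proper t-distribution test with n - 2 degrees of freedom.

```python
import numpy as np

def offset_check(x, y):
    """Return three slope estimates plus the fitted offset and its t-ratio."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    slope_origin = np.sum(x * y) / np.sum(x * x)  # OLS for y = m*x
    mean_ratio = np.mean(y / x)                   # average of individual ratios

    # OLS for y = m*x + c using the design matrix [x, 1]
    X = np.column_stack([x, np.ones(n)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = np.sum(resid ** 2) / (n - 2)         # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)         # parameter covariance matrix
    se_c = np.sqrt(cov[1, 1])                     # standard error of the offset

    return {
        "slope_origin": float(slope_origin),
        "mean_ratio": float(mean_ratio),
        "slope_offset": float(beta[0]),
        "intercept": float(beta[1]),
        "t_intercept": float(beta[1] / se_c),
    }

# Hypothetical data: y = 3x plus small fixed perturbations (not real measurements)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 3.0 * x + np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1])

result = offset_check(x, y)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For this invented data set the fitted offset is well within one standard error of zero, and the no-offset slope and the average ratio agree closely with each other.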

Dale said:

Here is one such example: https://arc.aiaa.org/doi/10.2514/1.B36120

See in particular their figure 19. They have good theoretical reasons to believe that 0 power would produce 0 thrust, every bit as valid as your 0 volume is 0 weight and 0 distance is 0 time. But the data clearly should be fit to a model with an intercept and the no-intercept slope is clearly biased positive.

Two points: 1) Data with vertical error bars that large is not very useful for testing a hypothesis of direct proportion or obtaining accurate slopes. Intro physics labs with error bars that large are either poorly designed or poorly performed, or both. We need to teach greater experimental care. 2) Most of the time experimenters will realize the weaknesses in their physical system or data and choose the right model.

Consider the paper below that I co-authored. Someone, somewhere may expect the bullet energy with zero gunpowder to be zero and force a fit through the origin, introducing a bias in the slope. We knew the physical process required energy to be expended overcoming barrel friction, so we allowed a y-intercept to estimate the lost energy. Our measurements were sufficiently accurate for this procedure to work.
https://apps.dtic.mil/dtic/tr/fulltext/u2/a555779.pdf

In any event, the Dr. Courtney hypothesis is testable in a straightforward manner. The open question is whether the average of ratios can reasonably serve as a "known good" value. I'm doing some pilot work that suggests it can. In fact, I'm finding cases where the average of ratios is a much better estimate of the slope than the slope obtained by OLS with or without the offset.

If you are offering your assertion as a "theoretical truth" that is not experimentally testable, then I'll simply point out that this would make it unscientific. If we don't test our assertions against real-world data, we are only doing math.

Dr. Courtney · Mar 24, 2020

This is one of the cases where the average ratio (A^1.5)/T is much closer to the known good value (1.0000) than the slope obtained by OLS either with or without the offset. I'm expecting a trend where this will usually be true when the data set spans several orders of magnitude (a factor of 1000 in this case), since OLS will tend to weight the higher values more heavily by minimizing the squared error. In contrast, computing the average value of the ratio weights each data point in the set equally. For data sets that cover closer to 1 order of magnitude (a factor of 10), the trend I'm seeing is that the average ratio has comparable accuracy to the best fit slope (without an offset).
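
The weighting argument can be written out explicitly. This is the standard least-squares result for a line through the origin, with ##x_i## the independent variable, ##y_i## the dependent variable, and ##r_i = y_i/x_i## the individual ratios (the symbols are introduced here for illustration and are not from the posts above):

$$\hat m_{\text{OLS}} = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{\sum_i x_i^2 \, r_i}{\sum_i x_i^2}, \qquad \bar r = \frac{1}{N}\sum_i r_i$$

So the no-offset OLS slope is a weighted mean of the ratios with weights ##x_i^2##, while the plain average gives every ratio the same weight. If ##x## spans a factor of 1000, the largest point carries about ##10^6## times the weight of the smallest in the OLS slope.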

For both Kepler's original data and the modern data, the OLS without the offset produces slopes closer to the known good value than including the offset. And for both Kepler's original and the modern data, the offset obtained using OLS is not statistically different from zero.

This raises an interesting question. For hundreds of years before OLS, scientists used ratios and their averages to estimate the leading constant in proportions. Do we need to have concern that this carries a risk of "bias"? Is it a useful exercise to perform an OLS with an added offset as a test for bias, as I have done above for Kepler's law?

Dr. Courtney · Mar 25, 2020

I posted these some time ago, but it bears repeating that while the statistics literature recommends due care in doing OLS fits through the origin, the practice is supported in some cases:

In certain circumstances, it is clear, a priori, that the model describing the relationship between the independent variable and the dependent variable(s) should not contain a constant term and, in consequence, the least squares fit needs to be constrained to pass through the origin.
(HA Gordon, The Statistician, Vol 30 No 1, 1981)

There are many practical problems where it is reasonable to assume the relationship of a straight line passing through the origin ... (ME Turner, Biometrics, Vol 16 No 3, 1960)

This article describes situations in which regression through the origin is appropriate, derives the normal equation for such a regression and explains the controversy regarding its evaluative statistics. (JG Eisenhauer, Teaching statistics, Vol 25 No 3 2003)
