Quote: "That is fun! Not often that you get to set off fireworks for science."

Yep. I'm actually going to use "Chemistry of Pyrotechnics" to put together a few labs for next year (supposing the local school is pleased enough to let me coordinate a few labs for them again).
Quote: "It's easy to pretend one is doing science when all the students remember is the "Gee Whiz" and no one remembers the learning objectives."

I think that is a succinct summary of the problem with pop-sci presentations. It is good that you are focusing on more than just the fun, but including both fun and learning objectives.
Quote: "I think that is a succinct summary of the problem with pop-sci presentations. …"

In a paper coming out this fall in TPT, colleagues and I identified three challenges in the typical introductory physics lab design:
Quote: "I've got mixed feelings about calling an analysis activity a real "laboratory" if someone else did the experiment and collected the data."

It really worries me that students seem to confuse simulation with reality all the time. It's the Star Trek effect. They ask why their simulation is not giving the answers they expect. It's GIGO without any way of chasing the fault in the model. A simulation is so much cheaper than hardware, and you don't need lab space or need to tidy up for the next class. You can see why 'the system' likes to encourage it.
Quote: "It really worries me that students seem to confuse simulation with reality all the time. It's the Star Trek effect. …"

I consider downloading real data acquired from a third party a different (better) class of lab than computer simulations. For example, last year I had a physical science class download and analyze both Brahe's original data and modern data for testing Kepler's third law. Later (for a different lab), I had them download available orbital data for Earth satellites to test Kepler's third law in that system. I had a physics class analyze Robert Boyle's original data (from his historical publication) to test Boyle's law.
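Those Kepler's third law checks boil down to just a few lines of analysis. Here is a minimal Python sketch of the idea; since neither Brahe's tables nor the satellite data are reproduced in this thread, it uses the familiar planetary values (semi-major axis in AU, period in years) as stand-ins.

[code]
import math

planets = {            # semi-major axis (AU), orbital period (yr)
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}

# Kepler's third law predicts T**2 = a**3 in these units,
# so log(T)/log(a) should come out close to 1.5 for every body.
for name, (a, T) in planets.items():
    print(name, round(math.log(T) / math.log(a), 3))
[/code]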
Quote: "A very cheap way that has good accuracy / consistency is to stand a distance from a large wall and use a hammer to hit a metal object. That much is obvious so far. The clever bit is to strike the metal exactly when you hear the echo, and repeat. You repeat until you are accurately in sync with the echo pulses. Then you measure the time for 10, 20 or more echoes. The accuracy gets better and better with more pulses."

I encountered an equivalent phenomenon several years ago while walking on a local college campus. I passed between a blank wall of a building and a pulsating garden sprinkler. My left ear heard the sprinkler, which produced a psst sound as it spurted about four times a second. My right ear heard the echo off of the building. I was able to position myself so I heard both sounds simultaneously. I realized that I was hearing the direct sound of the nth spurt and the echo of the (n-1)th spurt. Given the period of the sprinkler spurts and the distance from the sprinkler to the wall, I could get the speed of sound.
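For concreteness, here is the arithmetic behind that timing trick as a minimal Python sketch. The distance, interval count, and stopwatch reading are illustrative numbers of my own, not measurements from either post.

[code]
# Once the strikes are in sync with the echoes, each strike-to-strike
# interval equals one round trip to the wall, 2*d/v.

def speed_of_sound(d_m, n_intervals, total_time_s):
    round_trip = total_time_s / n_intervals  # one echo round trip (s)
    return 2.0 * d_m / round_trip            # v = 2*d / t

# e.g. standing 50 m from the wall, 20 intervals timed at 5.9 s:
print(speed_of_sound(50.0, 20, 5.9))  # roughly 339 m/s
[/code]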
Quote: "When fitting to a trendline in graph.exe, we were sure to check the box to set the vertical intercept to zero, as the hypothesis predicts not only a linear relationship, but also a vertical intercept of zero (a direct proportionality)."

Inductive thinking. It seems that you have 5 DATA points. The origin is not a data point; it is part of the hypothesis you are supposed to be testing.
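For readers who have not met it, forcing the intercept to zero replaces the usual two-parameter least-squares fit with a one-parameter fit. I can't speak to graph.exe's internals, but the standard result that checkbox presumably invokes is sketched below in Python.

[code]
# Least-squares slope for a line forced through the origin:
# minimizing sum((y - k*x)**2) over k gives k = sum(x*y) / sum(x*x).

def slope_through_origin(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Five illustrative (x, y) pairs (not the lab's data):
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(slope_through_origin(xs, ys))  # close to 2
[/code]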
Quote: "Inspection of Figure 1 shows that the hypothesis was supported."

To a large degree you induced this result. It is not good teaching to suggest this "supported" the hypothesis.
Quote: "Time is the dependent variable and should be plotted on the y axis."

That is purely a convention. In relativity, time is conventionally the independent variable, and it is plotted on the vertical axis. There is nothing that requires one axis to be dependent and the other independent.
Quote: "The text underlines that care was taken to ensure the software was forced to go through the origin. This is totally wrong."

I agree with you on this, but teaching the students why belongs in a statistics class. Same with the fact that regression of x vs y is different from y vs x.
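That last point is easy to demonstrate. A minimal Python sketch with synthetic data (the values are illustrative, not the lab's):

[code]
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + rng.normal(0, 2.0, x.size)  # true slope 2, noise in y

m_yx = np.polyfit(x, y, 1)[0]        # regress y on x
m_xy = 1.0 / np.polyfit(y, x, 1)[0]  # regress x on y, then invert

print(m_yx, m_xy)  # the two estimates of the slope disagree
[/code]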
Quote: "I explain it to students this way: the only possible distance any signal can travel in zero time is zero distance."

The scientific method demands that you conduct an experiment and then compare to theory/hypothesis. You do not start inserting assumptions from your hypothesis into your data and then conclude that this "supports the hypothesis".
Quote: "Plotting it this way makes calculating the speed of sound easier, which was the main point of the lab. So setting the dependent variable on the horizontal axis is in fact a better choice for this experiment than following the arbitrary convention."

That convention is not arbitrary. There is very good reason for following it if you are going to use standard OLS tools without knowing what you are doing, because those tools follow the convention too!
There is nothing "artificial" about the second parameter, there may be some experimental or physical conditions which produce something a little different from what you expect. You should analyse the data objectively without attempting to force the result you expect. That is the "need". It does not cost anything and if things go as expected you get near zero intersect and say to your students : "this is what we would expect from theory because .... ".When physical considerations demand that a mathematical relationship goes through the origin, there is no need to add a variable vertical shift artificially.
Quote: "Again it is not a "trendline"..."

You can take up your trendline debate with those who make spreadsheets and other graphical and data analysis tools that refer to least-squares fitting results as trendlines.
Quote: "neither is that convention arbitrary."

I disagree. Like all conventions, it is completely arbitrary. There is no non-arbitrary reason to put the dependent variable on the vertical axis. I challenge you to find a non-arbitrary reason for the vertical dependent axis.
Quote: "standard OLS tools ... are "blindly following" that convention too"

I am not familiar with the specific tool used in the write-up, but I disagree completely that standard OLS tools use that convention. The standard OLS tools that I have used typically have the variables horizontal and the observations vertical. Often even that can be overridden by the user. I don't even know how the OLS tools could follow that convention in principle.
Quote: "If the aim is to examine the experimental relationship between elapsed time and distance travelled, you should be fitting a two-parameter linear model. If your experiment is well designed and there are no anomalous effects, it should have an intercept very close to zero."

I agree with this point. Fitting a model without an intercept term is rarely advisable.
Quote: "The question of whether to include a vertical intercept is more interesting."

You should pretty much always include it. The only time you can leave it out is when it is actually 0, not just not significantly different from 0, but exactly 0. And in that case leaving it in is the same as leaving it out, so you should always leave it in.
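A quick numerical aside on what is at stake: if the true intercept is not exactly zero, a through-origin fit biases the slope and leaves residuals with a nonzero mean. A minimal Python sketch with synthetic data, the true slope set near the speed of sound in air (343 m/s) only for flavor:

[code]
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1, 10, 30)
y = 343.0 * x + 20.0 + rng.normal(0, 10.0, x.size)  # true slope 343, intercept 20

m, b = np.polyfit(x, y, 1)                            # two-parameter fit
k = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]  # forced through origin

print(m, b)                # near 343 and 20
print(k)                   # the slope absorbs the intercept -> biased high
print(np.mean(y - k * x))  # residual mean of the origin fit: not zero
[/code]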
Quote: "Analysis of the residuals of the fit to a line forced through the origin suggested the small residuals were systematically due to widening of the cylinder at the top. Fitting to a quadratic with zero constant term made a lot more sense (as the two-parameter model) in that case. But this was pretty far into the weeds relative to the initial hypothesis that mass was proportional to volume. A constant term in this case is just silly."

A constant term is not "silly". If the fit evaluates it near zero, it will not cost anything, and that is valuable information in itself, not "silly". Negative results can be as important as positive ones. Blinkering the analysis by trying to coerce the result is not only silly but unscientific.
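For what it's worth, the constrained model the quote describes, a quadratic with zero constant term, is easy to fit directly: build a design matrix without the constant column. A minimal Python sketch on synthetic mass-vs-volume data (illustrative values, not the cylinder lab's):

[code]
import numpy as np

rng = np.random.default_rng(2)
V = np.linspace(5.0, 50.0, 10)                            # volume readings
m = 1.0 * V + 0.002 * V**2 + rng.normal(0, 0.2, V.size)   # slight widening

A = np.column_stack([V**2, V])     # columns for V**2 and V; no constant term
(a, b), *_ = np.linalg.lstsq(A, m, rcond=None)
print(a, b)  # recovers roughly 0.002 and 1.0
[/code]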
Quote: "You should pretty much always include it. … First, and most importantly, if you remove it then all of your other parameter estimates become biased. The EmDrive fiasco is a great example of this. This bias occurs even if the intercept is not significantly different from zero.

Second, your residuals will no longer be zero mean. This may be related to your observation.

Third, many software implementations change the meaning of the R^2 value they report when the intercept is removed. So the resulting R^2 cannot be meaningfully compared to other R^2 values nor interpreted in the usual fashion.

Fourth, even if your true intercept is zero, if the function is not exactly linear then your fit can be substantially worse than a linear fit with an intercept.

I'm sure there are other reasons, but basically: don't do it. It is never statistically beneficial (since the only time it is appropriate is when it makes no difference) and it can be quite detrimental. If it makes a difference then you need to leave it in for the reasons above, and if it doesn't make a difference then it doesn't hurt to leave it in.

Honestly, with your data the above biases and problems should be minuscule. So this data seems to be on the 'it doesn't make a difference' side of the rule. But I would recommend leaving it in for the future. I wouldn't proactively give any explanation to the students, but just use the default setting."

For now I'm not buying it, and I intend to keep teaching students to set the vertical intercept to zero when the basic science of the experiment suggests the model will go through the origin. Here's why:
Quote: "Even if there is not time to go into the details of the maths, it would seem important to at least mention that it only minimises y residuals, and that the basic criterion for this to work properly is to have very small errors on the x-axis variable. It is only under those conditions that it will produce the 'best unbiased linear estimation' of the slope."

This is a potential contradiction with your earlier assertion that the vertical axis should always have the dependent variable. Now you are saying the vertical axis should be the variable with the larger errors. Which is preferred in the case where the independent variable is expected to have the larger errors?
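The standard answer is worth seeing numerically: with ordinary least squares, noise on the horizontal-axis variable attenuates the fitted slope (regression dilution), which is exactly why the variable with the larger errors belongs on the vertical axis. A minimal Python sketch with synthetic data (illustrative values):

[code]
import numpy as np

rng = np.random.default_rng(3)
x_true = np.linspace(0, 10, 200)
y = 2.0 * x_true + rng.normal(0, 0.5, x_true.size)   # small errors in y
x_noisy = x_true + rng.normal(0, 2.0, x_true.size)   # large errors in x

print(np.polyfit(x_true, y, 1)[0])   # close to the true slope, 2
print(np.polyfit(x_noisy, y, 1)[0])  # noticeably attenuated below 2
[/code]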
Quote: "Does the vertical intercept have a physical meaning or is it more of a fudge factor to get a better fit?"

The practical question is simply which slope gives you the better approximation of the speed of sound. I guess it depends on the type of error you have and the distribution of the samples.
s   h     2h   correct       delta
1   0      0   0              0
1   0.5    1   0.414213562   -0.585786438
1   1      2   1.236067977   -0.763932023
1   2      4   3.123105626   -0.876894374
1   3      6   5.08276253    -0.91723747
1   4      8   7.062257748   -0.937742252
1   5     10   9.049875621   -0.950124379
1   6     12   11.04159458   -0.958405421
1   7     14   13.03566885   -0.964331152
1   8     16   15.03121954   -0.968780458
1   9     18   17.02775638   -0.972243623
1   10    20   19.02498439   -0.975015605
1   11    22   21.02271555   -0.977284454
1   12    24   23.0208243    -0.979175701
1   13    26   25.01922366   -0.980776337
1   14    28   27.01785145   -0.982148548
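The columns are consistent with the exact extra path length sqrt(s^2 + (2h)^2) - s compared against the approximation 2h, with delta the difference between the two; that reading is my inference from the numbers, and this short Python sketch reproduces the table:

[code]
import math

s = 1.0
for h in [0, 0.5] + list(range(1, 15)):
    correct = math.sqrt(s**2 + (2 * h)**2) - s  # exact path difference
    delta = correct - 2 * h                     # error of the 2h approximation
    print(f"{s:g}  {h:g}  {2 * h:g}  {correct:.9f}  {delta:.9f}")
[/code]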