I apologize for my wrong assumption. Based on your questions it seemed like you did not understand the statistical issues involved as you did not mention any of the relevant statistical issues but only the pedagogical/scientific issues. For me, if I had decided (due to pedagogical or scientific considerations) to use the no-intercept method then I would have gone through a few of the relevant statistical issues, identified them as being immaterial for the data sets in consideration, and only then proceeded with the pedagogical/scientific justification. I mistakenly thought that the absence of any mention of the statistical issues indicated an unfamiliarity with them.
That is not the only issue, nor even the most important. By far the most important one is the possibility of bias in the slope. It does not appear to be a substantial issue for your data, so that would be the justification I would use were I trying to justify this approach.
Or in the Bayesian framework you can directly compare the probability of different models.
This would be a good statistical justification. It is not a general justification, because the general rule remains that use of the intercept is preferred. It is a justification specific to this particular experiment that the violation of the usual process does not produce the primary effect of concern: a substantial bias in the other parameter estimates.
Then you should know that your Ockham's razor argument is not strong in this case. It is at best neutral.
In the Bayesian approach this can be decided formally, and in the frequentist framework this is a no-no which leads to p-value hacking and failure to replicate results.
All considerations from the viewpoint of doing science intended for the mainstream literature. But from the viewpoint of the high school or intro college science classroom, largely irrelevant. The papers I cited make a strong case for leaving out the constant term when physical considerations indicate a reasonable physical model will go through the origin, and I think this is sufficient peer-reviewed statistics work to justify widespread use in the classroom in applicable cases. I also pointed out the classroom case of Mass vs. Volume where leaving out the constant term consistently provides more accurate estimates of the material density than including it. Been at this a while and never seen a problem when the conditions are met that are pointed out in the statistics papers I cited. You seem to be maintaining a disagreement based on your own authority without a willingness to cite peer-reviewed support for your position that the favored (or valid) approach is to include a constant term.
I don't regard the Bayesian approach as appropriate for the abilities of high school students I've tended to encounter. In contrast, computing residuals (and their variance) can be useful and instructive and is well within their capabilities once they've grown in their skills through 10 or so quantitative laboratories.
But zooming out, the statistical details of the analysis approach are all less relevant if one has taught the students the effort, means, and care to acquire accurate data in the first place for the input and output variables. It may seem to some that I am cutting corners in teaching analysis due to time and pedagogical constraints. But start with 5-10 data points with all the x and y values measured to 1% and you can yield better results with simplified analysis than you can with the same number of data points with 5% errors and the most rigorous statistical approach available. Analysis is often the turd polishing stage of introductory labs. I don't teach turd polishing.