Can Ockham’s razor be used to make quick qualitative predictions?

yeet991only
TL;DR Summary
Suppose we have two hypotheses: one that makes 5 assumptions and another that makes 12. If both perfectly explain the situation, which is more probable? Or does it not depend on the complexity of the hypothesis?
I asked the same question in this reddit post, and the answer I got was that this is not a "law" and it is just preferable for a hypothesis to be simpler (something like the Einstein quotes you hear). Yet I am still not convinced, so I wanted to ask here. I saw this "insight" post on this website.
The most important quote from it: "Simpler models make sharper predictions because they have less 'wiggle room'".
Now why is this relevant? Because when you make more assumptions you have more "wiggle room": each assumption may be wrong, so fewer assumptions are better for the model.

I don't understand Ockham's razor; here is an example that makes it seem absurd.
Suppose I have a friend who lies frequently. This is not an assumption.
Suppose I asked him to buy me something and have it delivered to my house. He said he did.
It has not arrived.

Which hypothesis is more probable?
1. My friend lied -> no assumptions.
2. The delivery failed -> I have to assume that something happened: maybe an accident, maybe their service is busy.
Clearly the first is more probable, and when I make such thought experiments I find it is always more probable to say someone is lying, even when I don't know anything about them.
But isn't that wrong?

Which is the right reading? Does Ockham's razor point toward the most preferable hypothesis or the most probable one?
 
As you have already been told, Ockham's razor is not a law; it just says that the simplest solution (yes, the one with the fewest assumptions) is MORE LIKELY to be the correct one. It certainly does not guarantee it.
 
phinds said:
As you have already been told, Ockham's razor is not a law; it just says that the simplest solution (yes, the one with the fewest assumptions) is MORE LIKELY to be the correct one. It certainly does not guarantee it.
OK, so is it like:
p(hypothesis1) = 0.6, given it has 10 assumptions,
and p(hypothesis2) = 0.92, given it has 2 assumptions,
supposing they have equal explanatory power?
Is this what you mean by MORE LIKELY?
It doesn't guarantee it; it is a probability.
 
Good question. You should be very cautious about applying probabilities to something that is not a random variable. Ockham's Razor is a wise heuristic to use, but is not a valid part of rigorous probability theory.
Like you, I would be interested to know if there has been any rigorous mathematical treatment of this question.
 
Ockham's razor has more in common with Murphy's law than with anything about reality. Events in reality usually have many causes. Ten likely assumptions are better than a single stupid one.
 
yeet991only said:
supposing they have equal explanatory power.
Is this what you mean by MORE LIKELY?
It doesn't guarantee it; it is a probability.
It is very rare for a more complicated model to have less explanatory power. In fact, I think if the simpler model is a sub model of the more complicated one, then the more complicated model will always have more explanatory power.

However, it is fairly common for the simpler model to have more predictive power. What we want in science is models with greater predictive power, not greater explanatory power.

So this criterion allows us to decide when to use Occam’s razor and when not to use it. You need a model that is simple but not too simple. This can be determined using Bayesian statistics. So Occam’s razor can be quantified, improved, and replaced by Bayesian statistics. See:

https://www.physicsforums.com/insights/how-bayesian-inference-works-in-the-context-of-science
 
Dale said:
It is very rare for a more complicated model to have less explanatory power. In fact, I think if the simpler model is a sub model of the more complicated one, then the more complicated model will always have more explanatory power.

However, it is fairly common for the simpler model to have more predictive power. What we want in science is models with greater predictive power, not greater explanatory power.

So this criterion allows us to decide when to use Occam’s razor and when not to use it. You need a model that is simple but not too simple. This can be determined using Bayesian statistics. So Occam’s razor can be quantified, improved, and replaced by Bayesian statistics. See:

https://www.physicsforums.com/insights/how-bayesian-inference-works-in-the-context-of-science
Hey Dale, I have read your post, and that's why I posted on this forum.
Regarding your first point, I think that more explanatory power actually means the opposite: it means simpler. See Wikipedia. Is that wrong?

Let's think of Ockham's razor like Kahneman's Linda example.
As others in this thread have pointed out, you can have more assumptions and the model can still be more probable.
Here is a more formal explanation from my point of view.

Let H1 be the first hypothesis, with assumptions A1, A2, ..., A12.
Let H2 be the second one, with assumptions a1, a2.
Suppose they have equal explanatory power (they account for all our facts/observations).
Treating the assumptions as independent (my point would still stand if we didn't, but let it be like that to keep things simple),
the chance for all my assumptions for H1 to be right would be p(A1 and A2 and ... and A12) = p(A1)p(A2)...p(A12)
(let probability here be read as a degree of confidence),
and for the second, p(a1 and a2) = p(a1)p(a2).
Now, because every probability satisfies 0 < p < 1, the conjunction gets smaller the more assumptions there are (yes, H1 can still come out ahead if its individual assumptions are held with higher confidence).

Doesn't that clearly prove Occam's razor?
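A quick numerical sketch of the conjunction argument above (all confidence values are made up purely for illustration):

```python
# Conjunction argument: assuming independence, each extra assumption
# multiplies in another factor below 1, shrinking the joint probability.
from math import prod

# Hypothetical confidence values for each assumption (invented for illustration)
h1_assumptions = [0.9] * 12   # H1: twelve assumptions, each held with 90% confidence
h2_assumptions = [0.7, 0.7]   # H2: two assumptions, each held with 70% confidence

p_h1 = prod(h1_assumptions)   # 0.9**12 ≈ 0.282
p_h2 = prod(h2_assumptions)   # 0.7**2  = 0.49

print(f"P(all H1 assumptions hold) ≈ {p_h1:.3f}")
print(f"P(all H2 assumptions hold) ≈ {p_h2:.3f}")
```

Note the caveat in the post also shows up numerically: twelve 99%-confident assumptions give 0.99**12 ≈ 0.886, which beats the two 70% assumptions, so the count of assumptions alone decides nothing.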
 
yeet991only said:
Hey Dale, I have read your post and that's why I posted on this forum.
Yes, but Ockham's razor deals with subjective probabilities, not objective ones. Hence, it is a question about assessments, not about mathematics.
 
fresh_42 said:
Yes, but Ockham's razor deals with subjective probabilities, not objective ones. Hence, it is a question about assessments, not about mathematics.
My question is precisely about subjective probabilities (the title of the thread is about how to make quick qualitative predictions). Why won't math work with subjective probabilities? My question is not about assessments; I need a mathematical framework.
 
  • #10
yeet991only said:
My question is precisely about subjective probabilities (the title of the thread is about how to make quick qualitative predictions). Why won't math work with subjective probabilities? My question is not about assessments; I need a mathematical framework.
The mathematical framework is simply the concept of conditional probability and Bayes' theorem. More generally, it is Bayesian epistemology, which in my opinion is closer to philosophy than to mathematics.
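As a minimal sketch of that framework applied to the delivery example from the opening post (all priors and likelihoods here are invented illustration values, not anything principled):

```python
# Bayes' theorem applied to the delivery example.
# H1: the friend lied.  H2: the delivery genuinely failed.
prior_h1 = 0.5
prior_h2 = 0.5

# Likelihood of the observation ("package did not arrive") under each hypothesis.
p_obs_given_h1 = 0.95   # if he lied, the package almost surely never arrives
p_obs_given_h2 = 0.90   # if delivery failed, it also does not arrive

# Bayes' theorem: P(H | obs) is proportional to P(obs | H) * P(H)
unnorm = [p_obs_given_h1 * prior_h1, p_obs_given_h2 * prior_h2]
total = sum(unnorm)
post_h1, post_h2 = (u / total for u in unnorm)

print(f"P(H1 | no package) ≈ {post_h1:.3f}")
print(f"P(H2 | no package) ≈ {post_h2:.3f}")
```

With these numbers H1 comes out only slightly ahead (~0.51 vs ~0.49); the "fewer assumptions" intuition enters through the prior, since a friend known to lie frequently would get a higher prior for H1 than the 0.5 assumed here.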
 
  • #11
fresh_42 said:
The mathematical framework is simply the concept of conditional probability and Bayes' theorem. More generally, it is Bayesian epistemology, which in my opinion is closer to philosophy than to mathematics.
How is that a framework? All you are doing is multiplying and dividing ratios. Is all of probability theory just that? Is there no way to use a mix of mathematical logic and probability?
 
  • #12
yeet991only said:
How is that a framework? All you are doing is multiplying and dividing ratios. Is all of probability theory just that? Is there no way to use a mix of mathematical logic and probability?
The first two links are the mathematical background, whether you call it a framework or not, and the third one is the philosophical background.
 
  • #13
yeet991only said:
My question is precisely about subjective probabilities (the title of the thread is about how to make quick qualitative predictions). Why won't math work with subjective probabilities? My question is not about assessments; I need a mathematical framework.
Standard probability theory is based on the Kolmogorov axioms, which is the formal mathematical basis.

https://en.wikipedia.org/wiki/Probability_axioms

The key question, however, is how to apply probability theory to real-world events. The starting point is one of two things:

1. Your credence, based on knowledge and "expert" opinion.
2. Data, using probabilities as relative frequencies.

The first is more subjective and leads to a more subtle interpretation of probabilities. Very broadly, these represent the Bayesian and Frequentist schools of thought. That said, any good Bayesian and good Frequentist should agree on most things. And, they should understand the points at which the two approaches differ.

As far as I know, Occam's razor doesn't enter into this. Not in terms of using Bayesian or Frequentist statistics to decide between various hypotheses. Simplicity of the set of hypotheses is not really a factor.

Where Occam's razor comes in is where, for example, the data defies expert opinion. Whereupon, you need an explanation. Since the Australian Open tennis is on, let's take a tennis example. For years, tennis commentators believed that if a player was 2-0 down in sets and came back to 2-2, then they were a strong favourite to complete the comeback and win the fifth set. But Jim Courier is one of the few commentators who believes in looking at the data. And, when he did, he found that the player who had been 2-0 up, and had lost the 3rd and 4th sets, actually won the 5th set more often. Although it was fairly even, there was conclusive evidence that the expert opinion had been wrong.

The question is how do you explain that? That's where a different sort of hypothesis comes in. The quantitative question has been answered: the 5th set is almost an even bet. You now have a much subtler, more subjective question to answer on why the data is like that. It's hard to see how you could figure out an explanation. How would you test one theory against another? Occam's razor might help you there. But, it's not a statistical tool, IMO.
 
  • Like
Likes Dale and fresh_42
  • #14
PeroK said:
Standard probability theory is based on the Kolmogorov axioms, which is the formal mathematical basis.

https://en.wikipedia.org/wiki/Probability_axioms

The key question, however, is how to apply probability theory to real-world events. The starting point is one of two things:

1. Your credence, based on knowledge and "expert" opinion.
2. Data, using probabilities as relative frequencies.

The first is more subjective and leads to a more subtle interpretation of probabilities. Very broadly, these represent the Bayesian and Frequentist schools of thought. That said, any good Bayesian and good Frequentist should agree on most things. And, they should understand the points at which the two approaches differ.

As far as I know, Occam's razor doesn't enter into this. Not in terms of using Bayesian or Frequentist statistics to decide between various hypotheses. Simplicity of the set of hypotheses is not really a factor.

Where Occam's razor comes in is where, for example, the data defies expert opinion. Whereupon, you need an explanation. Since the Australian Open tennis is on, let's take a tennis example. For years, tennis commentators believed that if a player was 2-0 down in sets and came back to 2-2, then they were a strong favourite to complete the comeback and win the fifth set. But Jim Courier is one of the few commentators who believes in looking at the data. And, when he did, he found that the player who had been 2-0 up, and had lost the 3rd and 4th sets, actually won the 5th set more often. Although it was fairly even, there was conclusive evidence that the expert opinion had been wrong.

The question is how do you explain that? That's where a different sort of hypothesis comes in. The quantitative question has been answered: the 5th set is almost an even bet. You now have a much subtler, more subjective question to answer on why the data is like that. It's hard to see how you could figure out an explanation. How would you test one theory against another? Occam's razor might help you there. But, it's not a statistical tool, IMO.
Hey, I don't understand your example. Why wouldn't the 5th set be an even bet? I don't know tennis rules, but why wouldn't the players have an even chance to win? (They are both at 2-2 before the 5th; it's fairly even.)
 
  • #15
yeet991only said:
Hey, I don't understand your example. Why wouldn't the 5th set be an even bet? I don't know tennis rules, but why wouldn't the players have an even chance to win? (They are both at 2-2 before the 5th; it's fairly even.)
That's a good example of subjective probability. Your credence is that the 5th set is an even bet, based on your general knowledge and gut feeling.

Others, like me, would say that the probability for the 5th set should be determined by looking at data.

By the way, I must have misremembered this. The data is 55% in favour of the player who was 2-0 down. There is an advantage, but it's not as big as you might think. And so, I immediately change my mind and admit I was wrong. I.e. probabilities are objective (as far as can be determined).
 
  • #16
PeroK said:
That's a good example of subjective probability. Your credence is that the 5th set is an even bet, based on your general knowledge and gut feeling.

Others, like me, would say that the probability for the 5th set should be determined by looking at data.

By the way, I must have misremembered this. The data is 55% in favour of the player who was 2-0 down. There is an advantage, but it's not as big as you might think. And so, I immediately change my mind and admit I was wrong. I.e. probabilities are objective (as far as can be determined).
I see your point. But why can't you just assume it was noise in the data and that it will converge to 0.5 : 0.5 eventually?
Clearly, if you think the player who was down has an advantage, you would need to assume more things. For example: Why would he get an advantage now? Who helped him? What causes that to happen? And more...
The most probable hypothesis is that the advantage is just noise in the data.

Am I naive? Why wouldn't this be the case? Is there some magic cause that we need to do crazy causal inference to find?
 
  • #17
yeet991only said:
Why can't you just assume it was noise in the data and that it will converge to 0.5 : 0.5 eventually?
Depends on the sample size. With a large enough data set, noise gets averaged out.
 
  • #19
yeet991only said:
I see your point. But why can't you just assume it was noise in the data and that it will converge to 0.5 : 0.5 eventually?
That's where formal statistical testing comes in. The data can be formally tested, including your hypothesis that it's 50-50. If there is enough data, the hypothesis becomes statistically untenable.
yeet991only said:
Clearly, if you think the player who was down has an advantage, you would need to assume more things. For example: Why would he get an advantage now? Who helped him? What causes that to happen? And more...
You don't have to assume anything. You look at the data.
yeet991only said:
The most probable hypothesis is that the advantage is just noise in the data.
You can't know that. Also, you're using "probable" there subjectively, not as a formal mathematical probability.
yeet991only said:
Am I naive? Why wouldn't this be the case? Is there some magic cause that we need to do crazy causal inference to find?
There's a whole world of statistical tools out there (Bayesian and Frequentist). If you are interested, you could start studying probability and statistics.
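A minimal sketch of such a test on invented data shaped like the tennis example (the sample sizes below are made up for illustration; only the 55% figure comes from the thread):

```python
# Exact two-sided binomial test of H0: "the 5th set is a 50-50 coin flip",
# against data where 55% of matches were won by the player who was 2-0 down.
from math import comb

def binom_pvalue_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    no more likely than the observed one, under H0: success probability = p."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(pi for pi in pmf if pi <= pmf[k] * (1 + 1e-12))

n = 1000   # hypothetical number of five-set matches in the data set
k = 550    # 55% won by the player who had been 2-0 down
print(f"p-value = {binom_pvalue_two_sided(k, n):.4f}")
```

With 1000 matches the 50-50 hypothesis is rejected decisively, but the same 55% rate over only 100 matches gives a p-value well above 0.05, so "it's just noise" survives; sample size decides, exactly as noted above.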
 
  • #20
PS In the UK, the A-Level mathematics syllabus includes a statistics option, which includes hypothesis testing.

Note that such statistical hypotheses are very different from a hypothesis such as Newton's second law of motion. It's the same word, but has very different scientific meanings in those two cases.

And Occam's razor only applies to the second type (Newton's three laws of motion are better than 21 laws of motion).
 
  • #21
My point about the hypotheses - or better, the assumptions - used for Ockham's razor concerns the reliability of those assumptions, ergo an assessment of probabilities. I need only one crazy assumption to explain a UFO sighting, but I need several serious assumptions to provide alternative explanations. That is where individual judgment kicks in and makes it unmathematical.
 
  • #22
yeet991only said:
I think that more explanatory power actually means the opposite: it means simpler. See Wikipedia. Is that wrong?
That Wikipedia article is listing several different concepts of explanatory power.

The distinction I am making between explanatory and predictive power is the one used by statisticians. Specifically, explanatory power refers to the accuracy of the in-sample fit. In contrast, predictive power refers to the accuracy of the out-of-sample fit.

In-sample accuracy always increases as you add parameters, up to the number of samples. Out-of-sample accuracy does not always increase.
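This in-sample/out-of-sample distinction can be sketched with a toy fit; the data below is synthetic (a noisy straight line) and purely illustrative:

```python
# A degree-9 polynomial always fits the 10 training points at least as well
# as a straight line, but can generalize worse: more explanatory power,
# less predictive power.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the true model is the line y = 2x plus Gaussian noise.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2 * x_train + rng.normal(0.0, 0.2, size=10)
x_test = np.linspace(0.0, 1.0, 100)
y_test = 2 * x_test + rng.normal(0.0, 0.2, size=100)

def in_and_out_of_sample_mse(degree):
    """Fit a polynomial of the given degree to the training sample and
    return (in-sample MSE, out-of-sample MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    in_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    out_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return in_mse, out_mse

for degree in (1, 9):
    in_mse, out_mse = in_and_out_of_sample_mse(degree)
    print(f"degree {degree}: in-sample MSE {in_mse:.5f}, out-of-sample MSE {out_mse:.5f}")
```

The degree-9 polynomial passes through every training point (in-sample error essentially zero), yet it fits the noise and predicts fresh data worse than the straight line: the sense in which the simpler model has more predictive power.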

yeet991only said:
Why won't math work with subjective probabilities?? My question is not about assessments, I need a mathematical framework.
It will. That is what Bayesian statistics is all about.
 
  • #23
yeet991only said:
I see your point. But why can't you just assume it was noise in the data and that it will converge to 0.5 : 0.5 eventually?
Clearly, if you think the player who was down has an advantage, you would need to assume more things. For example: Why would he get an advantage now? Who helped him? What causes that to happen? And more...
The most probable hypothesis is that the advantage is just noise in the data.

Am I naive? Why wouldn't this be the case? Is there some magic cause that we need to do crazy causal inference to find?
The underdog has found some weakness in the overdog's game that he/she is exploiting.
 
  • #24
There is a related probability fallacy called the Conjunction Fallacy, where respondents give a higher probability to a conjunction of events than to one of the events alone.

Tversky and Kahneman's famous example is:
Which is more probable?

  1. Linda is a bank teller.
  2. Linda is a bank teller and is active in the feminist movement.
2) cannot be more likely than 1), given that it is a joint probability that includes 1), but many people will pick 2) because it fits their preconceived notions.
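The conjunction rule itself is a one-liner to check (the numbers are arbitrary illustration values):

```python
# Conjunction rule behind the Linda example: P(A and B) can never exceed P(A),
# whatever the two events are, because the joint is the marginal times a factor
# in [0, 1].
p_teller = 0.05                 # P(Linda is a bank teller), an arbitrary value
p_feminist_given_teller = 0.3   # P(feminist | teller), also arbitrary
p_both = p_teller * p_feminist_given_teller   # chain rule

assert p_both <= p_teller   # holds for any choice of the two numbers in [0, 1]
print(f"P(teller) = {p_teller}, P(teller and feminist) = {p_both}")
```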
 
