Proof that BB(k) grows faster than any computable function

In summary, the Busy Beaver function is not computable and eventually dominates every computable function.
  • #1
cobalt124
Hi, layman post, not sure what thread level I need. Following on from the post https://www.physicsforums.com/threads/the-busy-beaver-function.942741/, I've been working my way through https://www.scottaaronson.com/busybeaver.pdf and have come to the section "1.3 The Busy Beaver Function", which states:

"The Busy Beaver function, written BB(k), equals the number of steps it takes for a k-state
Busy Beaver to halt. The Busy Beaver function has many striking properties. To begin with,
it is not computable; in other words, there does not exist an algorithm that takes k as input and
returns BB(k), for arbitrary values of k. This follows directly from the undecidability of the
halting problem. Suppose an algorithm existed to compute the Busy Beaver function; then given
a k-state Turing machine M as input, we could compute BB(k) and run M for BB(k) steps. If,
after BB(k) steps, M had not yet halted, we could safely conclude that M would never halt. Thus,
we could solve the halting problem, which we know is impossible.".

and

"By the same argument, BB(k) must grow faster than any computable function. (To check
this, assume that some computable function f(k) grows faster than BB(k), and substitute f(k)
for BB(k) in the rest of the proof.)".

The first statement I understand; the second I don't. Taking it literally (and presumably missing the point), I get:

"...Suppose a function f(k) existed that grew faster than BB(k); then given
a k-state Turing machine M as input, we could compute f(k) and run M for f(k) steps. If,
after f(k) steps, M had not yet halted, we could safely conclude that M would never halt. Thus,
we could solve the halting problem, which we know is impossible.".

That doesn't look right to me; it just looks the same as the first proof, but substituting f(k) for BB(k) and changing the assumption that is to be proven incorrect. I suspect I may be confusing the notions of "Busy Beaver" and "fastest growing computable function". Can anyone help me untangle (in my head) the second proof?
 
  • #2
Well it is an elementary argument, but there are actually a few subtleties (though not difficult ones). I wasn't aware of some of them until quite recently.

Let's first define a couple of terms more carefully (just for the purpose of this post or thread). We can say that a function ##f:\mathbb{N} \rightarrow \mathbb{N}## "dominates" the function ##g:\mathbb{N} \rightarrow \mathbb{N}## when:
##f(x)>g(x)## for all ##x##
We can say that a function ##f:\mathbb{N} \rightarrow \mathbb{N}## "eventually dominates" the function ##g:\mathbb{N} \rightarrow \mathbb{N}## when there exists a value ##N## such that:
##f(x)>g(x)## for all ##x \ge N##
Note that if ##f## dominates ##g##, then ##f## eventually dominates ##g##.

Now the same argument which shows that the function ##bb## is not computable also shows that:
p: "There is no computable function which dominates ##bb##"

However, proposition ##p## above doesn't actually prove the following:
q: "##bb## eventually dominates every computable function"

In other words, ##p## doesn't imply ##q##. The reason is that ##p## being true leaves open the possibility of a computable function ##f## which (i) doesn't dominate ##bb##, and (ii) is greater than ##bb## for infinitely many values. If such a computable function ##f## exists, then ##q## must be false (because ##bb## doesn't eventually dominate the given function ##f##).
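To make the distinction concrete, here is a toy pair of obviously computable functions (a made-up example, nothing to do with ##bb## itself) where each exceeds the other infinitely often, so neither eventually dominates the other:

```python
# Toy illustration: f is larger on positive even inputs, g is larger on
# odd inputs, so each exceeds the other infinitely often and neither
# eventually dominates the other.
def f(x):
    return 2 * x if x % 2 == 0 else 0

def g(x):
    return 2 * x if x % 2 == 1 else 0

print([x for x in range(10) if f(x) > g(x)])  # [2, 4, 6, 8]
print([x for x in range(10) if g(x) > f(x)])  # [1, 3, 5, 7, 9]
```

This is exactly the kind of behaviour that ##p## fails to rule out for a computable function versus ##bb##.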

I don't know whether you are having trouble with the truth of ##p## or of ##q##. The truth of ##p## is fairly simple (the argument is similar to the standard argument for ##bb## being uncomputable).
Regarding ##q##, someone recently asked a question on MSE. I managed to figure out the answer after quite a few hours of trial and error. The argument is a bit lengthier but not hard or advanced.

cobalt124 said:
I suspect I may be confusing the notion of "Busy Beaver" and "fastest growing computable function"
There is of course no such thing as a "fastest growing computable function", for the simple reason that corresponding to any computable function ##x \mapsto f(x)## you can define a computable function ##x \mapsto f(x)+1## such that the latter dominates the former.
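A minimal sketch of this construction in code (the names are illustrative):

```python
def plus_one(f):
    """Given any computable f, f(x) + 1 is also computable and dominates f."""
    return lambda x: f(x) + 1

f = lambda x: x ** 2     # any computable function
g = plus_one(f)          # dominates f by construction

print(all(g(x) > f(x) for x in range(1000)))  # True
```

The check over a finite range is just a sanity test; ##g(x)>f(x)## holds for every ##x## by construction.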
 
  • #3
I am interested in the truth of p in relation to the pdf. I understand that assuming BB(k) is computable leads to the Halting Problem being computable; the steps in the argument are listed in the pdf. What I don't see are the corresponding steps in the argument that a function f growing faster than BB presumably also leads to the Halting Problem being computable. I found a proof online that ends with BB(k) being computable, so that, together with the proof that BB(k) is non-computable, is a proof; but I think the proof in the pdf is different, and that is the one I am trying to understand. It is the "By the same argument, BB(k) must grow faster than any computable function." part I am trying to understand: why is it by the same argument? Looking at your reply, the question is "why does an argument that shows BB is not computable also show that there is no computable function that dominates BB?".

And as for my stating "fastest growing computable function", my misconceptions may be contributing to my misunderstanding.
 
  • #4
cobalt124 said:
Looking at your reply, the question is "why does an argument that shows BB is not computable also show that there is no computable function that dominates BB?".
To put it in a slightly alternative way, the answer to this is that if there was a computable function ##f## which dominated ##bb##, we could easily "extract" ##bb## from the function ##f## (basically showing that ##bb## is computable ... which can be shown to be false).

Let's use some specific values. Suppose we set the size of machines/programs to 1000. And suppose (for the sake of argument) that only 200 of these programs halt when given an empty/zero input. Let's discuss a few questions first:
(Q1) How many programs (on zero input) halt till ##bb(1000)## steps?
(Q2) How many programs (on zero input) halt till ##bb(1000)-1000## steps?
(Q3) How many programs (on zero input) halt till ##bb(1000)+1000## steps?

(A1) 200
(A2) We can't say a specific value for sure without running all the programs till the given number of steps. But of course we can say that the value will be less than 200.
(A3) 200

========================================

It is the answer to the third question above that is crucial to the argument. Suppose we have a function ##f## dominating ##bb##. Then we have ##f(1000)>bb(1000)##. Simply put, here is how you obtain ##bb(1000)## from it: run all programs of length ##1000## (on zero input) for ##f(1000)## steps. Now, what is the "last time" at which some program halted (on zero input)? This is precisely the value of ##bb(1000)##.

In more pictorial terms we can write it like:
how many programs halted till ##0## steps? surely less than 200
how many programs halted till ##1## steps? surely less than 200
how many programs halted till ##2## steps? surely less than 200
how many programs halted till ##3## steps? surely less than 200
....
how many programs halted till ##bb(1000)-2## steps? surely less than 200
how many programs halted till ##bb(1000)-1## steps? surely less than 200
how many programs halted till ##bb(1000)## steps? 200
how many programs halted till ##bb(1000)+1## steps? 200
how many programs halted till ##bb(1000)+2## steps? 200
how many programs halted till ##bb(1000)+3## steps? 200
....
how many programs halted till ##f(1000)-1## steps? 200
how many programs halted till ##f(1000)## steps? 200

=================================

When you generalise the previous argument from the specific value 1000 to a general value ##a##, we get the desired conclusion: if ##f## were computable, then so would be ##bb## (which is false). Hence we conclude ##f## isn't computable.
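The extraction can be sketched in code. This is only an illustration: `FakeMachine` is a made-up stand-in for an enumeration of the ##a##-state machines, and `bound` plays the role of ##f(a)##:

```python
class FakeMachine:
    """Hypothetical stand-in for a machine: halts at a fixed step count,
    or never (halt_at=None)."""
    def __init__(self, halt_at):
        self.halt_at = halt_at
        self.t = 0

    def reset(self):
        self.t = 0

    def step(self):
        """Advance one step; return True exactly when the machine has halted."""
        self.t += 1
        return self.halt_at is not None and self.t >= self.halt_at

def last_halt_time(machines, bound):
    """Run every machine for `bound` steps; if bound > bb(a), the last
    time any machine halts is exactly bb(a)."""
    last = 0
    for m in machines:
        m.reset()
        for t in range(1, bound + 1):
            if m.step():
                last = max(last, t)
                break
        # a machine that survives `bound` steps never halts at all
    return last

# "machines" halting at steps 3 and 7, plus one that never halts
print(last_halt_time([FakeMachine(3), FakeMachine(7), FakeMachine(None)], 10))  # 7
```

With real Turing machines the inner loop is the only place uncomputability enters: without a bound like ##f(a)## you could never safely give up on a running machine.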
 
  • #5
Thanks for the detailed responses, I'm possibly beginning to get a handle on what this is about. One thing I cannot get clear in my head is does "not provable within ZFC assuming ZFC is consistent" equate to "not computable"?

Widening my reading on the subject, I'm thinking about how close this process can get to BB(5), given that the upper bound has been reduced to BB(1919) from BB(7918), and also about the computation of BB(5) itself. Running all k-state TMs would presumably not work, as it is not known whether all of them halt. BB(5) could be "possibly within reach", but is not easy to compute. Looking at http://skelet.ludost.net/bb/nreg.html, for BB(5) there are 28 TMs whose behaviour has not been proven. What methods are available to prove a TM's behaviour? The link above mentions proof by hand. Possibly there are methods to prove the behaviour of a specific TM in ZFC. Possibly there are other methods. What would they be?
 
  • #6
cobalt124 said:
Thanks for the detailed responses, I'm possibly beginning to get a handle on what this is about. One thing I cannot get clear in my head is does "not provable within ZFC assuming ZFC is consistent" equate to "not computable"?
Well it is extremely common to use the word "undecidable" in two rather different senses. Here are two uses:
(1) Take a set A⊆ℕ, where ℕ is the usual set of natural numbers (including 0). One says that A is undecidable to mean that the set A is not computable/decidable/recursive, i.e. that the "characteristic function" of A is not a computable/recursive function.
(2) When some given proposition (say, one that can be expressed in a given reasoning system F) can't be proved or disproved within F, one says that the proposition is "independent" of F. It is also common to say that the given proposition is "undecidable".

Note that the usage in (1) can be extended to any subset of some other set (say C) which has an easy 1-1 correspondence with ℕ. The most common example is the set of finite strings (over a finite alphabet).
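For a concrete instance of sense (1): the set of even numbers is computable/decidable, because its characteristic function is computable.

```python
def chi_even(n):
    """Characteristic function of the set of even naturals:
    1 if n is in the set, 0 otherwise."""
    return 1 if n % 2 == 0 else 0

print([chi_even(n) for n in range(6)])  # [1, 0, 1, 0, 1, 0]
```

An undecidable set in sense (1) is one for which no such total, always-terminating procedure exists.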

cobalt124 said:
What methods are available to prove a TMs behaviour? The link above mentions proof by hand. Possibly there are methods to prove the behaviour of a specific TM in ZFC. Possibly there are other methods. What would they be?
Well that is essentially the (sub)domain of "number theory" (and that being linked to many other domains of maths one would guess).
With a clerical method one can of course always show halting (in principle). But you can't (always) prove non-halting (even in principle) just by a clerical process.

I am not knowledgeable here, but I am not really sure there are that many methods. Other than ZFC, there is a rather advanced/specialized field of study called "large cardinals", where I think the truth of some statements might correspond specifically to the non-halting of certain programs (one may perhaps try to go one step further and search for more statements/programs whose non-halting is implied by these). And as far as I know (which isn't much, to be fair), I think that's about it.
 
  • #7
Thanks again. I think I understand this as much as I am going to at the moment; the undecidable definitions make sense. I don't think I'm going to be delving into number theory yet, if ever! Interesting reading.
 
  • #8
SSequence said:
Simply run all programs of length 1000 (on zero input) till f(1000) number of steps. Now what is the "last time" at which some program halted (on zero input)? This is simply the value of bb(1000).
Sorry, but I can't understand this implication. Since I can't compute f(1000) with a program of length 1000, f(1000) steps doesn't seem to say anything about BB(1000) to me. I mean, to compute f(1000) you would need, for the sake of argument, a program of length 1500, and to compute f(1001) you would need a program of length 3500, etc. Length = states, I suppose.

My point is, f(n) could dominate BB(n) and still be computable by this logic. I must be missing something, but I can't see what.
 
  • #9
derivador said:
Sorry, but I can't understand this implication. Since I can't compute f(1000) with a program of length 1000, f(1000) steps doesn't seem to say anything about BB(1000) to me. I mean, to compute f(1000) you would need, for the sake of argument, a program of length 1500, and to compute f(1001) you would need a program of length 3500, etc. Length = states, I suppose.

My point is, f(n) could dominate BB(n) and still be computable by this logic. I must be missing something, but I can't see what.
It's a proof by contradiction.

You assume, for the sake of contradiction, that ##f## is computable and dominates ##BB##. Then the following algorithm is shown to compute ##BB##:

Python:
# Assumed (hypothetical) helpers: all_turing_machines(n) enumerates the
# finitely many n-state machines; each machine T supports reset_tape(),
# advance_one_step() and in_halting_state().
n = 1
while True:                               # n = 1, 2, 3, ...
    max_run_time = 0
    for T in all_turing_machines(n):
        T.reset_tape()                    # start on a blank tape
        for i in range(1, f(n) + 1):      # f(n) > BB(n) by assumption
            T.advance_one_step()
            if T.in_halting_state():
                max_run_time = max(max_run_time, i)
                break
        # a machine that survives f(n) steps without halting never halts
    print(max_run_time)                   # this value is BB(n)
    n += 1

Which leads to a contradiction (##BB## is computable and ##BB## is not computable). Therefore the assumption (##f## is computable and dominates ##BB##) is false.

If you don't assume ##f## is computable and dominates ##BB##, then the bounded loop of ##f(n)## steps isn't usable. Instead, you would have to resort to something like `while T hasn't halted`, which may be an infinite loop, since the halting problem is undecidable.
 
  • #10
Jarvis323 said:
It's a proof by contradiction.

You assume, for the sake of contradiction, that ##f## is computable and dominates ##BB##. Then the following algorithm is shown to compute ##BB##:

Python:
# Assumed (hypothetical) helpers: all_turing_machines(n) enumerates the
# finitely many n-state machines; each machine T supports reset_tape(),
# advance_one_step() and in_halting_state().
n = 1
while True:                               # n = 1, 2, 3, ...
    max_run_time = 0
    for T in all_turing_machines(n):
        T.reset_tape()                    # start on a blank tape
        for i in range(1, f(n) + 1):      # f(n) > BB(n) by assumption
            T.advance_one_step()
            if T.in_halting_state():
                max_run_time = max(max_run_time, i)
                break
        # a machine that survives f(n) steps without halting never halts
    print(max_run_time)                   # this value is BB(n)
    n += 1

Which leads to a contradiction (##BB## is computable and ##BB## is not computable). Therefore the assumption (##f## is computable and dominates ##BB##) is false.

If you don't assume ##f## is computable and dominates ##BB##, then the bounded loop of ##f(n)## steps isn't usable. Instead, you would have to resort to something like `while T hasn't halted`, which may be an infinite loop, since the halting problem is undecidable.
So, it would work only for all ##m < n## with ##n## being the length of the program that computes ##f(1000)##, right? In my example, ##n = 1500##, and I could compute every ##BB(m)## with ##m < 1500##, but not ##BB(1512)## for example. Did I get it right?
 
  • #11
derivador said:
So, it would work only for all ##m < n## with ##n## being the length of the program that computes ##f(1000)##, right? In my example, ##n = 1500##, and I could compute every ##BB(m)## with ##m < 1500##, but not ##BB(1512)## for example. Did I get it right?
It doesn't matter how long the program that computes ##f## is. Note that the single algorithm for ##f## computes ##f(n)## for every ##n \in \mathbb{N}##.

You can picture ##f## as an oracle where you give it any number ##k## and it tells you a number which is greater than the run time of any halting Turing machine of length ##k##. With that information, you know that if you run a program ##p## of length ##k## longer than that number, and it didn't halt yet, then ##p## must be a program that will never halt (and thus isn't relevant to ##BB##). So then you can stop, and move on to the next ##p##.

Of course such an ##f## which we can compute can't exist, since we show it leads to a contradiction.
 
  • #12
Yes, I think I got it. My example was just to illustrate the essence of the idea. I mean, to compute ##BB(1512)##, for the sake of argument, I would need at least a 3500-length TM, which would be the smallest program that computes ##f(1001)## in this case.

I'm just talking about the program length because I'm considering a TM with no input. A busy beaver-like TM.
 
  • #13
derivador said:
Yes, I think I got it. My example was just to illustrate the essence of the idea. I mean, to compute ##BB(1512)##, for the sake of argument, I would need at least a 3500-length TM which would be the smaller program that computes ##f(1001)## in this case.
I'm not sure I understand what you mean. The program for ##f## that we assume (for contradiction) exists, is assumed to be finite and fixed in size for all of the infinite possible inputs ##n## (so you would never need more states if ##n## is larger). The program we wrote for ##BB## using ##f## as a subroutine would be longer than our program for ##f## because it is embedded in it. But that isn't really relevant as far as I can tell. Because the problem/proof isn't saying anything about the length of a program for ##f## or length of a program for ##BB##, just that they can't exist.

Maybe you're asking about something from SSequence's post?
 
  • #14
Jarvis323 said:
The program for ##f## that we assume (for contradiction) exists, is assumed to be finite and fixed in size, to be able to compute ##f## for all of the infinite possible inputs ##n## (so you would never need more states if ##n## is larger).
Yes. I was considering the program as a busy beaver-like TM, which starts with a blank tape and halts. In this case, its length would change for every ##n##. Sorry for the confusion, and thanks for your answer. I got it!
 
  • #15
derivador said:
Yes. I was considering a busy beaver-like TM, which starts with a blank tape. In this case, the program length would change for every ##n##. Sorry for the confusion and thanks for your answer. I got it!
I guess this might be why they avoid using the word length. The Turing machine's states are fixed and finite and separate from the tape. The tape has no limit on how many symbols you can write onto it. In the example code, a real implementation would have some extra code embedded into it which clears the tape and then writes the next input, ##n##, onto it before running it.

Edit: The term configuration is used to refer to the current settings of the TM, which include current state, tape pointer/head position, and tape contents. Since it has a finite state machine as one component of it, the word state is reserved already.
 
  • #16
Jarvis323 said:
I guess this might be why they avoid using the word length. The Turing machine's states are fixed and finite and separate from the tape. The tape has no limit on how many symbols you can write onto it. In the example code, a real implementation would have some extra code embedded into it which clears the tape and then writes the next input, ##n##, onto it before running it.
I think this was the origin of my confusion. I was interpreting it as a unique TM starting with a blank tape, inserting the input itself, computing, and halting, all alone. I'm very much a layman on this subject and started studying it this week out of pure curiosity. That's the price of skipping steps while learning, hehe.
 

1. What is BB(k)?

BB(k), also known as the Busy Beaver function, is a mathematical function that measures the maximum number of steps a halting Turing machine with k states can perform (starting from a blank tape) before halting. It is an uncomputable function, meaning that there is no algorithm that can compute its values for all inputs.

2. How is BB(k) related to computable functions?

BB(k) serves as a benchmark for the growth rates of computable functions. It eventually dominates every computable function: for any computable function f, BB(k) > f(k) for all sufficiently large k.

3. What is the significance of proving that BB(k) grows faster than any computable function?

This proof has significant implications in the field of computability and complexity theory. It demonstrates that there are mathematical functions that are beyond the scope of computable functions, and it helps us understand the limitations of computation.

4. How was it proven that BB(k) grows faster than any computable function?

The result goes back to Tibor Radó, the Hungarian mathematician who introduced the function in his 1962 paper "On Non-Computable Functions" and showed that it eventually exceeds every computable function. The proof has been refined and expanded upon by other mathematicians over the years.

5. Can the growth rate of BB(k) be compared to non-computable functions?

BB(k) is chiefly used as a yardstick against computable functions. Being uncomputable itself, it is not usually compared with other non-computable functions, but it is considered a useful tool for understanding the limits of computability.
