Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Collatz Problem

  1. Jul 12, 2009 #1
    im trying to do the Euler problem #14 - to determine which starting value under one million produces the longest chain

    and heres my code in Python

    Code (Text):
    Count=[]
    Count2=[]
    List=[]
    def seq(n):
        if n%2==0:
            return n/2
        else:
            return 3*n+1
    m=1
    for i in range(1,1000000):
        j=i
        List.append(j)
        while seq(j)>1:
            List.append(seq(j))
            j=seq(j)
        List.append(1)
        Count.append(len(List))
        List=[] #Reset List
        if max(Count)>m:   
            m=max(Count)
            Count2.append((m,i))
    print Count2[len(Count2)-1]
    raw_input("...")

    However, I left it on all night and it still did not finish

    but it works for lower numbers like for in in range(1,1000) only takes a few seconds

    anyone know what my problem is?

    Thanks
     
    Last edited: Jul 12, 2009
  2. jcsd
  3. Jul 12, 2009 #2

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Well, one thing is that you're doing 1000 times more iterations through the loop; that adds up fast. Suppose the time per iteration was constant, and 1000 iterations took 15 seconds. Then 15 seconds * 1000 = 4.2 hours.

    Another thing is that, on average, a bigger starting number means longer chains -- so the time it takes to do one iteration of your loop is actually increasing, not remaining constant.

    By the way, use [code] ... [/code] tags around your code to format it properly.
     
    Last edited: Jul 12, 2009
  4. Jul 12, 2009 #3
    Ok thanks Hurkly

    im still not sure how to reduce the # of iterations id have to do though

    even if I just go

    Code (Text):

    for i in range(1,1000000):
          print seq(i)
     
    it takes a long time
     
  5. Jul 12, 2009 #4

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Displaying stuff to screen is a slow operation. I bet that would only take a second or so if you did something else, like appending seq(i) to a list.
     
  6. Jul 12, 2009 #5
    Code (Text):
    def seq(n):
        if n%2==0:
            return n/2
        else:
            return 3*n+1

    def rec(n):

        while seq(n)>1:
            n=seq(n)
        return n

    for i in range(1,1000000):
        rec(i)
    raw_input('...')
     
    Even this is still running half an hour later on my computer
    and it's the most efficient way I can think of to just run through each chain

    so maybe I just have to skip this problem or something if its going to take a day to do
     
  7. Jul 12, 2009 #6

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Well, you've checked that you can do 1,000,000 iterations of something simply fairly quickly.

    And you've observed that directly computing the length of an entire chain doesn't count as quick.

    So, you cannot approach the problem by writing a solution to "find the length of a chain by enumerating it" and then executing that solution 1,000,000 times. You need to do something fundamentally different. For example, in no particular order:

    (1) Find some other method of finding the length of an individual chain.

    (2) Rather than treating each individual calculation independently, find some way to take advantage of the fact you're really trying to calculate 1,000,000 of them.

    (3) Some other clever tweak.

    (4) Switch to another programming language and optimize the heck out of the algorithm.
     
  8. Jul 12, 2009 #7
    Thanks I got it

    Code (Text):
    Count=[]
    Count2=[]
    Count3=[]
    Count4=[]
    List=[]
    Part=[1]
    for i in range(1,1001):
        Part.append(i*10000)
    def seq(n):
        if n%2==0:
            return n/2
        else:
            return 3*n+1
    m=1

    for k in range(1,101):
        for i in range(Part[k-1],Part[k]):
            j=i
            List.append(j)
            while seq(j)>1:
                List.append(seq(j))
                j=seq(j)
            List.append(1)
            Count.append(len(List))
            List=[] #Reset List
            if max(Count)>m:   
                m=max(Count)
                Count2.append([m,i])   
        Count3.append(Count2[len(Count2)-1])
        Count2=[]
        m=1
        List=[]
        Count=[]
    #Sorting
    for i in range(1,101):
        Count4.append(Count3[i-1][0])
    m=max(Count4)
    for i in range(1,101):
        if Count3[i-1][0]==m:
            answer=Count3[i-1][1]
            break
    print "The starting point that makes the biggest chain is %d"%answer

    print Count3
    raw_input("...")

     
    it worked I just had to do it in smaller partitions and then clear the lists each time
     
  9. Jul 13, 2009 #8
    You may have it working, but it is still not right. From your print statements, I can see you are not using Python version 3.x, but some flavor of version 2.x.

    So, you don'r want to use range() for loops that size as it will first create a list that size and then iterate through it. You should use xrange() instead. This will create a generator that will allow you to iterate through it without consuming all that memory.
     
  10. Jul 13, 2009 #9

    Hurkyl

    User Avatar
    Staff Emeritus
    Science Advisor
    Gold Member

    Wait a moment -- are you saying something is actually wrong, or are you just offering a suggestion for a minor optimization?

    If the latter, you're making it sound like a bigger deal than it really is. Yes, xrange is something he should be made aware of, but using range is certainly not incorrect. (And has a very minor effect in this particular case)
     
  11. Jul 13, 2009 #10

    Wrong and not right are not necessarily synonymous.

    The OP would be better off learning this lesson now while it's minor than having it bite him later. This is the Collatz problem, remember? Where things can get really big, really fast.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook