B Algorithm for checking if a set of digits are all distinct

  • B
  • Thread starter Thread starter etotheipi
  • Start date Start date
  • Tags Tags
    Algorithm Set
AI Thread Summary
To check if a set of 9 digits contains all distinct digits from 1 to 9, one effective method is to use a bit manipulation technique, where each digit sets a corresponding bit in an integer. The sum of the digits should equal 45, and the product should equal 362,880 (9!). However, while these conditions are necessary, they are not sufficient alone, as demonstrated by the example 124445799, which meets both conditions but contains duplicates. A more straightforward approach involves creating a frequency array to count occurrences of each digit and ensuring all counts equal one. This method is efficient and avoids the complexities of prime factorization or product checks.
etotheipi
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set? Just asking because my solution was a boring brute force check.

I don't think anything useful becomes of the product ##u_1 u_2 \dots u_9## since, for instance, there exist other sets of 9 digits that multiply to ##9!## (e.g. 653346274).

I wondered whether anyone knew if a nicer method is possible?
 
Last edited by a moderator:
  • Like
Likes JD_PM and member 587159
Mathematics news on Phys.org
etotheipi said:
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set? Just asking because my solution was a boring brute force check.

I don't think anything useful becomes of the product ##u_1 u_2 \dots u_9## since, for instance, there exist other sets of 9 digits that multiply to ##9!## (e.g. 653346274).

I wondered whether anyone knew if a nicer method is possible?
If the 9 digits are distinct, their sum will be 45. If their product also is 362,880 (9!), that would suggest that they are distinct. The conditions that their sum is 45 and their product is 362,880 are necessary, but I haven't convinced myself that they are sufficient conditions.
 
  • Like
Likes etotheipi and member 587159
Mark44 said:
but I haven't convinced myself that they are sufficient conditions.

This can be brute-forced by a computer (didn't check myself, but it sounds plausible).
 
  • Like
Likes etotheipi
I just wrote quick script to test this and it turns out there is one exception. The other one it threw up was 124445799.
Code:
found = ["123456789"]

for i in range(100000000,1000000000):
    num = i
    product = 1
    sum = 0

    while(i>=1):
        product *= i%10
        sum += i%10
        i//=10

    if(product == 362880 and sum == 45):
        ascending = "".join(sorted(str(num)))
        new = True
        for j in found:
            if(ascending == j):
                new = False
        if(new):
            found.append(ascending)

print(found)
OUTPUT:
['123456789', '124445799']
 
Last edited by a moderator:
  • Like
Likes member 587159
Here's another check: 9! = ##2^7 \cdot 3^4 \cdot 5 \cdot 7##
If the product is divisible by any higher powers of the four factors above, at least one digit is repeated. In other words, the product can't be evenly divisible by ##2^8## or ##3^5## or ##5^2## or ##7^2##. And I'm assuming that the sum of the 9 digits is 45.
 
etotheipi said:
I just wrote quick script to test this and it turns out there are a few exceptions. The first one it threw up was
124445799.

Code:
for i in range(100000000,1000000000):
    num = i
    product = 1
    sum = 0

    while(i>=1):
        product *= i%10
        sum += i%10
        i//=10

    if(product == 362880 and sum == 45):
        ascending = "".join(sorted(str(num)))
        if(ascending != "123456789"):
            print(num)
Why are you checking numbers in the range 100,000,000 to 999,999,999? If the product is larger than 362,880 (9!), then there have to be some repeated digits.

I think I understand your reasoning -- If all 9 digits are 9, the product is 387,420,489, which is within your range, but that's brute force with a bludgeon.
 
  • Like
Likes Janosh89 and etotheipi
Mark44 said:
I think I understand your reasoning -- If all 9 digits are 9, the product is 387,420,489, which is within your range, but that's brute force with a bludgeon.

Whoops you're quite right, I wasn't thinking too much about that part 😂, a sledgehammer approach indeed. Luckily modern computing power makes fairly short work of my sloppiness anyway!
 
  • Haha
Likes member 587159
Start off with a 16 bit integer set to 0
For each digit, set the corresponding bit
Compare the integer with 511
 
  • Like
Likes etotheipi
Merlin3189 said:
Start off with a 16 bit integer set to 0
For each digit, set the corresponding bit
Compare the integer with 511

This is fairly similar to my initial approach but your method is much snappier; I just scanned through the result and checked for no zeroes but you're right it would be quicker to just compare with ##2^9 -1##. I think this is probably the best solution, since a number theory approach is perhaps overly complicated. Thanks!
 
  • #10
etotheipi said:
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set?
{...snip...}
I wondered whether anyone knew if a nicer method is possible?
If the solution does not have to include math operations, then
if a digit repeats, then not all digits from 1 to 9 are represented.

create a temporary set ##temp[1:9]##; zero ##temp##​
for each ##u_i##​
if (##u_i ∈ temp##) then return (false)​
let ##temp_i = u_i##​
end for​
return (true)​

For brevity, I left out the details for checking if digits repeat.
{Edit: removed check for zero since already zeroed ##temp##.}
 
  • Like
Likes etotheipi
  • #11
Setting digits is the quickest option. It's also one that comes with minimal code:
Code:
sum=0
for k in u:
  sum+=1<<k
return sum==1022
This is valid python code. It assumes "u" is a set of integers, it doesn't check that. If they might not be integers this needs to be verified as well.
 
  • Love
Likes etotheipi
  • #12
Here's an implementation of what I mentioned earlier.
Python:
from time import perf_counter
t1 = perf_counter()
count = 0
list = [1, 2, 3, 5, 5, 5, 7, 8,9]
sum = 0
prod = 1
for i in range (0, 9):
    sum += list[i]
    prod *= list[i]
if (sum == 45):
    if (prod % 256 == 0):
        print ("Duplicate digit! Too many 2s")
    elif (prod % 81 == 0):
        print ("Duplicate digit! Too many 3s")
    elif (prod % 25 == 0):
        print ("Duplicate digit! Too many 5s")
    elif (prod % 49 == 0):
        print ("Duplicate digit! Too many 7s")
    else:
        print("No duplicate digits...")
else:
    print("Sum is incorrect")
t2 = perf_counter()
print("t2: ", t2, " t1: ", t1)
print("Elapsed time: ", (t2-t1), " seconds")
@etothepi, I tried out your version, and it took longer to run than I wanted to wait -- maybe about 5 minutes before I CTRL-C-ed out.
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
Edit: just under 0.3 msec.
 
Last edited:
  • Like
Likes etotheipi
  • #13
Mark44 said:
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
0.3 milliseconds?
I ran my function a million times and it finished in 2-3 seconds. The runtime doesn't depend on the input, so I didn't vary that.

As discussed before, checking the powers of the primes isn't a guarantee that every digit occurs. 124445799 has the same digit sum and digit product as 123456789. Your 25 line code takes 100 times as long as the 4 line code and has a bug.

etotheipi's script was designed to look for a bug in your script. It's not a surprise that it takes longer because it does a vastly more complicated task than checking if every digit occurs once in a set of 9 digits.
 
  • Like
Likes etotheipi
  • #14
Mark44 said:
@etothepi, I tried out your version, and it took longer to run than I wanted to wait -- maybe about 5 minutes before I CTRL-C-ed out.
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
Although I guess this is mainly because you are checking one list whilst my one was checking about 900000000 of them :wink:
mfb said:
Setting digits is the quickest option. It's also one that comes with minimal code:
This is valid python code. It assumes "u" is a set of integers, it doesn't check that. If they might not be integers this needs to be verified as well.

This looks like magic... very neat!
 
  • Like
Likes mfb
  • #15
It's possible to make that code more pythonic and less portable, of course:
Code:
return sum(1<<k for k in u)==1022
 
  • Informative
Likes etotheipi
  • #16
mfb said:
As discussed before, checking the powers of the primes isn't a guarantee that every digit occurs. 124445799 has the same digit sum and digit product as 123456789. Your 25 line code takes 100 times as long as the 4 line code and has a bug.
My code catches 124445799, saying that it has too many 3s.

I ran my code 1,000,000 times as well, but commented out the line where it prints "too many 3's," because I/O is inherently slow. I don't dispute that your algorithm is faster, but running my code a million times took only 7.6 sec, which is less than 3 times as slow, not 100 times as slow. Your claim that my code takes 100 times as long doesn't take into account that your code was running from cache most of the time, whereas my one time through wasn't.

And what's the bug in my code? It successfully noted that 124445799 has at least one duplicate digit.
 
  • #17
Mark44 said:
My code catches 124445799, saying that it has too many 3s.
Uh, it's worse than I expected. Another bug was hiding the more fundamental flaw of the program logic. It also complains about too many 3s for correct sets. You would need to check for 243, not 81. But if you do that then 124445799 will pass. Of course it does: It has the same sum and product as the correct set.
 
  • #18
I'm lazier than that:
PHP:
<?php
echo strlen($u) == strlen(count_chars($u, 3));
?>
Or:
Python:
u = str(u);
print len(u) == len(set(u));
As a bonus, it works with any other set of characters - not just digits - of any length!
 
  • Like
  • Love
Likes Merlin3189 and etotheipi
  • #19
I admit I miscounted the exponent on 3, and that even with that correction, my algorithm misclassifies 124445799. I'll take a harder look at my algorithm.
 
  • #20
Fixed the miscount on the exponent on 3. This code checks that the sum of the squares of the digits is correct. It produces the correct answer for 123456789 and 124445799. Are there any that it miscategorizes?
Python:
from time import perf_counter
t1 = perf_counter()
count = 0
list = [1, 2, 4, 4, 4, 5, 7, 9, 9]
sum = 0
prod = 1
sum_sq = 0
for i in range (0, 9):
sum += list[i]
prod *= list[i]
   sum_sq += list[i]*list[i]
if (sum != 45):
   print("Sum is incorrect")
elif (prod != 362880):
   print("Product is incorrect")
elif (sum_sq != 285):
   print("Sum of squares is incorrect")
else:
   if (prod % 256 == 0):
      print ("Duplicate digit! More than 7 factors of 2")
   elif (prod % 243 == 0):
      print ("Duplicate digit! More than 4 factors of 3")
   elif (prod % 25 == 0):
      print ("Duplicate digit! More than 2 factors of 5")
   elif (prod % 49 == 0):
      print ("Duplicate digit! More than 2 factors of 7")
   else:
      print("No duplicate digits...")
t2 = perf_counter()
print("t2: ", t2, " t1: ", t1)
print("Elapsed time: ", (t2-t1), " seconds")
 
  • #21
mfb said:
It's possible to make that code more pythonic and less portable, of course:
Code:
return sum(1<<k for k in u)==1022
When you're setting flag bits, I would call this more assembly-like. Although it's not intuitively obvious at first glance, yours is a very elegant solution.
Here's a more-or-less equivalent version in ARMv7:
[CODE title="ARMv7"]@ Constants
.eqv PR_STR, 0x69
.eqv HALT, 11
.eqv LIST_SIZE, 9
.eqv STD_OUT, 1

.data
List: .word 1, 2, 4, 4, 4, 5, 7, 9, 9
MagicNo: .word 0x3FE @ Bit pattern 11 1111 1110, or decimal 1022
NoDupe: .asciz "No duplicate digits"
Dupe: .asciz "Duplicate digits"

.text
main:
mov r0, #0 @ Accumulator
ldr r1, =List
mov r2, #LIST_SIZE @ Loop counter
mov r3, #1
ldr r4, =MagicNo
ldr r4, [r4]

Loop:
ldr r5, [r1] @ Get a number from List
lsl r5, r3, r5 @ r5 <- 1 << r5
add r0, r0, r5 @ Add to sum
add r1, r1, #4 @ Increment address in List
subs r2, r2, #1 @ Decrement loop counter
bne Loop

subs r2, r0, r4 @ r2 <- sum - MagicNo
beq NoDupeDigit @ If equal, no duplicate digits
mov r0, #STD_OUT @ Otherwise, there are duplicates
ldr r1, =Dupe
swi PR_STR
b Done

NoDupeDigit:
ldr r1, =NoDupe
swi PR_STR
Done:
swi HALT[/CODE]
 
  • #22
I would do some basic checks, sort the digits in ascending order and compare the result to "123456789".
 
  • Like
Likes etotheipi
  • #23
Mark44 said:
Are there any that it miscategorizes?
124445799 is the only case where you get the same sum and product as 123456789, so once you catch that there is nothing left that is misclassified. Note that checking for the product is sufficient, you don't need to check divisibility any more. If the product is 362880 then your divisibility checks have to pass.
I still don't see the point of the approach. It takes longer, it is harder to understand, harder to generalize, and most importantly it relies on an advance check of every single set of 9 digits to make sure there are no further counterexamples because there is no theorem guaranteeing the uniqueness of properties you check.
 
  • #24
I would accumulate a frequency array of all the digits. Freq( "digit" ) += 1.
I would verify that all digits had a frequency of one. Maybe check the product of all frequencies = 1.
 
  • #25
Baluncore said:
I would accumulate a frequency array of all the digits. Freq( "digit" ) += 1.
I would verify that all digits had a frequency of one. Maybe check the product of all frequencies = 1.

Assuming you did initialise all frequencies to 0, if you use = rather than +=, then each frequency ends up as 0 or 1, so you can check by adding rather than multiplying, which used to be faster, even for integers.

And it's probably faster than my method using bits rather than words to store the frequencies.
 
  • #26
Merlin3189 said:
Assuming you did initialise all frequencies to 0, ...
How could it be frequency, or quickly done otherwise?
Merlin3189 said:
... if you use = rather than +=, then each frequency ends up as 0 or 1, ...
But then it is presence and not frequency, and you must assume the correct number of digits were input.
Multiplication was slower than addition once, but now it is faster than indexing and fetching data.
 

Similar threads

Back
Top