New Reply

Efficiently computing the parity of the Hamming weight of a binary sequence

 
Share Thread
Aug17-12, 05:44 PM   #18
 
Recognitions:
Homework Helper Homework Help

Efficiently computing the parity of the Hamming weight of a binary sequence


Quote by burritoloco View Post
long m = (1L << 63) - 1;
for(long i=0; i < m; i++)
{
__builtin_popcountl(i);
}
Try changing this to:

long j = 0;
long m = (1L << 31) - 1;
for(long i=0; i < m; i++)
{
j += __builtin_popcountl(i);
}

cout << j << endl;
Aug17-12, 05:57 PM   #19
 
Cool, computing time was 29197 milliseconds, j = 33285996513. Still, addition time is proportional to the amount of digits no?
Aug17-12, 05:59 PM   #20
 
Recognitions:
Homework Helper Homework Help
Quote by burritoloco View Post
I would love to output the assembly code if anyone would show me how?
Use -S (upper case) to generate assembly source:

gcc -S -O2 example.cpp

or to generate assembly source and object but not link:

gcc -S -O2 -c example.cpp

This will produce an assembler file called example.s.

Quote by burritoloco View Post
From your comment I take it that perhaps "block" is a term already in use in relation to codewords?
No, there's no standard for the term "block" when working with primitive polynomials. I wasn't sure if in your case a block would be a codeword or not.

Quote by burritoloco View Post
Cool, computing time was 29197 milliseconds, j = 33285996513. Still, addition time is proportional to the amount of digits no?
You can try this and hope the compiler doesn't optimize this to just running the last loop:

register long j = 0;
long m = (1L << 31) - 1;
for(long i=0; i < m; i++)
{
j = __builtin_popcountl(i);
}

cout << j << endl;
Aug17-12, 06:13 PM   #21
 
Ooh, this is what I get after adding the -S option for g++:

./example: line 1: .file: command not found
./example: line 2: .globl: command not found
./example: line 3: .section: command not found
./example: line 4: .LC0:: command not found
./example: line 5: .string: command not found
./example: line 6: .LC1:: command not found
./example: line 7: .string: command not found
./example: line 8: .LC4:: command not found
./example: line 9: .string: command not found
./example: line 10: .section: command not found
./example: line 11: .p2align: command not found
./example: line 12: .globl: command not found
./example: line 13: .type: command not found
./example: line 14: main:: command not found
./example: line 15: .LFB1025:: command not found
./example: line 16: .cfi_startproc: command not found
./example: line 17: pushq: command not found
./example: line 18: .cfi_def_cfa_offset: command not found
./example: line 19: .cfi_offset: command not found
./example: line 20: xorl: command not found
./example: line 21: xorl: command not found
./example: line 22: pushq: command not found
./example: line 23: .cfi_def_cfa_offset: command not found
./example: line 24: .cfi_offset: command not found
./example: line 25: xorl: command not found
./example: line 26: subq: command not found
./example: line 27: .cfi_def_cfa_offset: command not found
./example: line 28: movq: command not found
./example: line 29: call: command not found
./example: line 30: xorl: command not found
./example: line 31: jmp: command not found
./example: line 32: .p2align: command not found
./example: line 33: .p2align: command not found
./example: line 34: .L6:: command not found
./example: line 35: movq: command not found
./example: line 36: call: command not found
./example: line 37: .L3:: command not found
./example: line 38: cltq: command not found
./example: line 39: addq: command not found
./example: line 40: addq: command not found
./example: line 41: cmpq: command not found
./example: line 42: jne: command not found
./example: line 43: syntax error near unexpected token `('
./example: line 43: ` leaq 16(%rsp), %rdi'
Without the -S it works fine though.
Aug17-12, 06:27 PM   #22
 
Recognitions:
Homework Helper Homework Help
Quote by burritoloco View Post
This is what I get after adding the -S option for g++ ...
Try with the -c option (means don't link):

gcc -O2 -S -c example.cpp

In spite of the errors you got, gcc may have produced an assembly source file, look for a file with a .s suffix.
Aug17-12, 06:41 PM   #23
 
Same thing happened with -c. I tried

register long j = 0;
long m = (1L << 31) - 1;
for(long i=0; i < m; i++)
{
j = __builtin_popcountl(i);
}

cout << j << endl;
Computing time was now 29107 milliseconds.
Aug17-12, 06:45 PM   #24
 
The .s file is not there...
Aug17-12, 07:21 PM   #25
 
Recognitions:
Homework Helper Homework Help
Quote by burritoloco View Post

j = __builtin_popcountl(i);

Computing time was now 29107 milliseconds.
That's just under 30 seconds, so it appears the popcount and the rest of the loop instructions are being performed (2^31) -1 times. It tried an assembly code version with a two instruction loop in 32 bit mode:

Code:
        mov     ecx,0ffffffffh
test0:  popcnt  eax,ecx
        loop    test0
and it takes about 5.7 seconds on my system (Intel 2600K). I also tried popcnt eax,ebx with various values in ebx, and the timing remained the same.

Quote by burritoloco View Post
The .s file is not there...
I'm not sure why this is happening, but I don't have a copy of gcc. Try

gcc -help

to see the options. Also make sure you're using upper case S:

gcc -S example.cpp

I still don't understand why you want to calculate the even / odd parity when you're learning about GF(2) primitive polynomials.
Aug18-12, 02:56 AM   #26
 
I still don't understand why you want to calculate the even / odd parity when you're learning about GF(2) primitive polynomials.
:) OK, to satisfy the curiosity, I'm looking at the factorization of cyclotomic polynomials of order 2^n-1; all its irreducible factors are the prim polys of degree n. There are several methods to compute prim polys, but this one seems at least unusual. In this method I'm looking at, the coefficients of these polys can be obtained by the parity of the weight of certain chunks of some types of "primitive" binary sequences of size given by the number (large) of irreducible factors of the corresponding cyclotomic polynomial. I was wondering about the time complexity to compute the coefficients with this method, but unfortunately, with what I know, the overall method seems to be exponential (seems to be true from the tests I've done as well). However, there are several other known methods which compute these prim polys in polynomial time...

Thanks rcgldr.
Aug18-12, 04:33 AM   #27
 
Recognitions:
Homework Helper Homework Help
Quote by burritoloco View Post
I'm looking at the factorization of cyclotomic polynomials of order 2^n-1; all its irreducible factors are the prim polys of degree n.
Thanks for the explantion. My experience with this stuff is for Reed Solomon error correction code, where the primitive polynomials are usually given, and relatively small (9 bits, 13 bits, 17 bits, 33 bits), where crude methods are good enough to find them if needed. Sometimes irreducible but non-primitive polynomials are used, as in the case of AES encryption (not sure why this was done).
Aug18-12, 09:33 AM   #28
 
Recognitions:
Science Advisor Science Advisor
Quote by burritoloco View Post
Still, addition time is proportional to the amount of digits no?
Yep. But the overhead of integer add is similar to the logical instructions, so it would be roughly the same run time if you just chained them together with XOR (to give the overall parity of the string) and then just a single "and 01" at the end to isolate the parity.

Cool, computing time was 29197 milliseconds, j = 33285996513.
Ok thanks for the results. So if you multiply that time by 2^30 that extrapolates to about 1000 years to do the original proposed test. Now you see why I was so skeptical of those previous results.
Aug18-12, 09:45 AM   #29
 
Recognitions:
Science Advisor Science Advisor
Quote by burritoloco View Post
My laptop is an Acer 5736Z, Pentium(R) Dual-Core CPU T4500 @ 2.30GHz × 2, x64, not a super-computer.
Looks like nobody noticed that this CPU does not support SSE4 and (I'm pretty sure) doesn't support the ABM (advanced bit manipulations) either. So no native popcnt in this case. I guess gcc must substitute an X86 equivalent for targets without support ABM extensions.
Aug18-12, 10:00 AM   #30
 
Recognitions:
Science Advisor Science Advisor
Quote by rcgldr View Post
Code:
        mov     ecx,0ffffffffh
test0:  popcnt  eax,ecx
        loop    test0
It takes about 5.7 seconds on my system (Intel 2600K). I also tried popcnt eax,ebx with various values in ebx, and the timing remained the same.
Hi rcgldr. My PC is too old to support native popcnt, so just for fun I thought I'd try an X86 parity flag implementation. Surprisingly the following calculates the overall parity of the string [1, 2, 3 ... 7FFFFFFF] in about 14.8 seconds on my aging single core Athlon64 (@2.4 GHz).

Code:
var myGlobalResult : longword;

procedure testparity;
label Strt,L1,L2,L3,L4;
begin
 asm
         mov ecx,$7FFFFFFF
         mov edx,0
   Strt:
         mov eax,ecx
         or  eax,eax
         jpe L1
         not edx
   L1:
         shr eax,08
         jpe L2
         not edx
   L2:
         shr eax,08
         jpe L3
         not edx
   L3:
         shr eax,08
         jpe L4
         not edx
   L4:
         loop Strt
         and edx,01
         mov myGlobalResult,edx
 end; {asm}
end;  {testparity}
Just in case anyone is interested, the overall parity of that string was 0 (even).
Aug18-12, 11:58 AM   #31
 
Recognitions:
Homework Helper Homework Help
Quote by uart View Post
X86 parity flag implementation.
Based on some comments in some old threads at other forums, I think this is similar to what gcc probably does. It takes 4.4 seconds on my system (note it's 1/2 the number of iterations I used for my popcnt test).

Code:
        mov     ecx,07fffffffh
        xor     edx,edx                 ;set edx = parity
par0:   mov     eax,ecx
        xor     al,ah
        jpe     par1
        not     edx
par1:   shr     eax,16
        xor     al,ah
        jpe     par2
        not     edx
par2:   loop    par0
        neg     edx
Aug18-12, 12:48 PM   #32
 
Recognitions:
Science Advisor Science Advisor
Hey good idea to add the "xor al,ah" to check 16 bits at once, that nearly doubles the speed. That reduced my times from 14.7 sec down to about 8.6 seconds.

note it's 1/2 the number of iterations I used for my popcnt test
Yeah I noticed that you had 2^32-1 for that previous test. At first I didn't see it, and I was seriously puzzled as to why my much slower computer seemed to be doing the tight loop at the same speed or even slightly faster than yours (about 5.6 sec). Then I noticed you were doing twice as many iterations.
Aug18-12, 07:21 PM   #33
 
Recognitions:
Homework Helper Homework Help
Wasn't sure what the return value is supposed to be. At the end of the code the "neg edx", returns 0 for even parity, 1 for odd parity, to reverse this use "inc edx" instead (also the return value would normally be in eax, not edx).
New Reply

Similar discussions for: Efficiently computing the parity of the Hamming weight of a binary sequence
Thread Forum Replies
[C++] Efficiently counting the number of '1' bits inside blocks of a binary number Programming & Comp Sci 4
reduce the sequence..or how to calculate it efficiently General Math 1
Algorithm for computing conjugacy classes in subgroups of the GL(n,F_p) efficiently Calculus & Beyond Homework 0
Hamming Code, Parity Matrix Linear & Abstract Algebra 0
Show the following properties of Hamming weight... Calculus & Beyond Homework 3