Recognitions:
Homework Help

## Efficiently computing the parity of the Hamming weight of a binary sequence

 Quote by burritoloco long m = (1L << 63) - 1; for(long i=0; i < m; i++) { __builtin_popcountl(i); }
Try changing this to:

long j = 0;
long m = (1L << 31) - 1;
for(long i=0; i < m; i++)
{
j += __builtin_popcountl(i);
}

cout << j << endl;
 Cool, computing time was 29197 milliseconds, j = 33285996513. Still, addition time is proportional to the amount of digits no?

Recognitions:
Homework Help
 Quote by burritoloco I would love to output the assembly code if anyone would show me how?
Use -S (upper case) to generate assembly source:

gcc -S -O2 example.cpp

or to generate assembly source and object but not link:

gcc -S -O2 -c example.cpp

This will produce an assembler file called example.s.

 Quote by burritoloco From your comment I take it that perhaps "block" is a term already in use in relation to codewords?
No, there's no standard for the term "block" when working with primitive polynomials. I wasn't sure if in your case a block would be a codeword or not.

 Quote by burritoloco Cool, computing time was 29197 milliseconds, j = 33285996513. Still, addition time is proportional to the amount of digits no?
You can try this and hope the compiler doesn't optimize this to just running the last loop:

register long j = 0;
long m = (1L << 31) - 1;
for(long i=0; i < m; i++)
{
j = __builtin_popcountl(i);
}

cout << j << endl;

Ooh, this is what I get after adding the -S option for g++:

 ./example: line 1: .file: command not found ./example: line 2: .globl: command not found ./example: line 3: .section: command not found ./example: line 4: .LC0:: command not found ./example: line 5: .string: command not found ./example: line 6: .LC1:: command not found ./example: line 7: .string: command not found ./example: line 8: .LC4:: command not found ./example: line 9: .string: command not found ./example: line 10: .section: command not found ./example: line 11: .p2align: command not found ./example: line 12: .globl: command not found ./example: line 13: .type: command not found ./example: line 14: main:: command not found ./example: line 15: .LFB1025:: command not found ./example: line 16: .cfi_startproc: command not found ./example: line 17: pushq: command not found ./example: line 18: .cfi_def_cfa_offset: command not found ./example: line 19: .cfi_offset: command not found ./example: line 20: xorl: command not found ./example: line 21: xorl: command not found ./example: line 22: pushq: command not found ./example: line 23: .cfi_def_cfa_offset: command not found ./example: line 24: .cfi_offset: command not found ./example: line 25: xorl: command not found ./example: line 26: subq: command not found ./example: line 27: .cfi_def_cfa_offset: command not found ./example: line 28: movq: command not found ./example: line 29: call: command not found ./example: line 30: xorl: command not found ./example: line 31: jmp: command not found ./example: line 32: .p2align: command not found ./example: line 33: .p2align: command not found ./example: line 34: .L6:: command not found ./example: line 35: movq: command not found ./example: line 36: call: command not found ./example: line 37: .L3:: command not found ./example: line 38: cltq: command not found ./example: line 39: addq: command not found ./example: line 40: addq: command not found ./example: line 41: cmpq: command not found ./example: line 42: jne: command not found ./example: line 43: syntax error near unexpected token (' ./example: line 43:  leaq 16(%rsp), %rdi'
Without the -S it works fine though.

Recognitions:
Homework Help
 Quote by burritoloco This is what I get after adding the -S option for g++ ...
Try with the -c option (means don't link):

gcc -O2 -S -c example.cpp

In spite of the errors you got, gcc may have produced an assembly source file, look for a file with a .s suffix.

Same thing happened with -c. I tried

 register long j = 0; long m = (1L << 31) - 1; for(long i=0; i < m; i++) { j = __builtin_popcountl(i); } cout << j << endl;
Computing time was now 29107 milliseconds.
 The .s file is not there...

Recognitions:
Homework Help
 Quote by burritoloco j = __builtin_popcountl(i); Computing time was now 29107 milliseconds.
That's just under 30 seconds, so it appears the popcount and the rest of the loop instructions are being performed (2^31) -1 times. It tried an assembly code version with a two instruction loop in 32 bit mode:

Code:
        mov     ecx,0ffffffffh
test0:  popcnt  eax,ecx
loop    test0
and it takes about 5.7 seconds on my system (Intel 2600K). I also tried popcnt eax,ebx with various values in ebx, and the timing remained the same.

 Quote by burritoloco The .s file is not there...
I'm not sure why this is happening, but I don't have a copy of gcc. Try

gcc -help

to see the options. Also make sure you're using upper case S:

gcc -S example.cpp

I still don't understand why you want to calculate the even / odd parity when you're learning about GF(2) primitive polynomials.

 I still don't understand why you want to calculate the even / odd parity when you're learning about GF(2) primitive polynomials.
:) OK, to satisfy the curiosity, I'm looking at the factorization of cyclotomic polynomials of order 2^n-1; all its irreducible factors are the prim polys of degree n. There are several methods to compute prim polys, but this one seems at least unusual. In this method I'm looking at, the coefficients of these polys can be obtained by the parity of the weight of certain chunks of some types of "primitive" binary sequences of size given by the number (large) of irreducible factors of the corresponding cyclotomic polynomial. I was wondering about the time complexity to compute the coefficients with this method, but unfortunately, with what I know, the overall method seems to be exponential (seems to be true from the tests I've done as well). However, there are several other known methods which compute these prim polys in polynomial time...

Thanks rcgldr.

Recognitions:
Homework Help
 Quote by burritoloco I'm looking at the factorization of cyclotomic polynomials of order 2^n-1; all its irreducible factors are the prim polys of degree n.
Thanks for the explantion. My experience with this stuff is for Reed Solomon error correction code, where the primitive polynomials are usually given, and relatively small (9 bits, 13 bits, 17 bits, 33 bits), where crude methods are good enough to find them if needed. Sometimes irreducible but non-primitive polynomials are used, as in the case of AES encryption (not sure why this was done).

Recognitions:
 Quote by burritoloco Still, addition time is proportional to the amount of digits no?
Yep. But the overhead of integer add is similar to the logical instructions, so it would be roughly the same run time if you just chained them together with XOR (to give the overall parity of the string) and then just a single "and 01" at the end to isolate the parity.

 Cool, computing time was 29197 milliseconds, j = 33285996513.
Ok thanks for the results. So if you multiply that time by 2^30 that extrapolates to about 1000 years to do the original proposed test. Now you see why I was so skeptical of those previous results.

Recognitions:
 Quote by burritoloco My laptop is an Acer 5736Z, Pentium(R) Dual-Core CPU T4500 @ 2.30GHz × 2, x64, not a super-computer.
Looks like nobody noticed that this CPU does not support SSE4 and (I'm pretty sure) doesn't support the ABM (advanced bit manipulations) either. So no native popcnt in this case. I guess gcc must substitute an X86 equivalent for targets without support ABM extensions.

Recognitions:
 Quote by rcgldr Code:  mov ecx,0ffffffffh test0: popcnt eax,ecx loop test0 It takes about 5.7 seconds on my system (Intel 2600K). I also tried popcnt eax,ebx with various values in ebx, and the timing remained the same.
Hi rcgldr. My PC is too old to support native popcnt, so just for fun I thought I'd try an X86 parity flag implementation. Surprisingly the following calculates the overall parity of the string [1, 2, 3 ... 7FFFFFFF] in about 14.8 seconds on my aging single core Athlon64 (@2.4 GHz).

Code:
var myGlobalResult : longword;

procedure testparity;
label Strt,L1,L2,L3,L4;
begin
asm
mov ecx,\$7FFFFFFF
mov edx,0
Strt:
mov eax,ecx
or  eax,eax
jpe L1
not edx
L1:
shr eax,08
jpe L2
not edx
L2:
shr eax,08
jpe L3
not edx
L3:
shr eax,08
jpe L4
not edx
L4:
loop Strt
and edx,01
mov myGlobalResult,edx
end; {asm}
end;  {testparity}
Just in case anyone is interested, the overall parity of that string was 0 (even).

Recognitions:
Homework Help
 Quote by uart X86 parity flag implementation.
Based on some comments in some old threads at other forums, I think this is similar to what gcc probably does. It takes 4.4 seconds on my system (note it's 1/2 the number of iterations I used for my popcnt test).

Code:
        mov     ecx,07fffffffh
xor     edx,edx                 ;set edx = parity
par0:   mov     eax,ecx
xor     al,ah
jpe     par1
not     edx
par1:   shr     eax,16
xor     al,ah
jpe     par2
not     edx
par2:   loop    par0
neg     edx

Recognitions: