iCE is a bitboard engine, which means it sees the board not as an 8x8 array but as a set of 64-bit integers (bitboards), where each bit stands for one square and a set bit marks a piece location. So far I have only compiled 32-bit code because I was missing a 64-bit OS. This forced the compiler to emulate 64-bit operations by splitting them into 32-bit ones.
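To make the bitboard idea concrete, here is a minimal sketch in C++. It is not taken from iCE: the names `Bitboard`, `squareBit`, `kWhitePawnsStart` and `singlePushTargets` are made up for illustration, and the square numbering (a1 = bit 0, h8 = bit 63) is just one common convention that I assume here.

```cpp
#include <cstdint>

// One bit per square. Assumed mapping (hypothetical, not necessarily iCE's):
// a1 = 0, b1 = 1, ..., h1 = 7, a2 = 8, ..., h8 = 63.
using Bitboard = std::uint64_t;

// Bitboard with only the given square set.
constexpr Bitboard squareBit(int square) {
    return Bitboard{1} << square;
}

// All eight white pawns on their starting rank (rank 2 = bits 8..15).
constexpr Bitboard kWhitePawnsStart = 0x000000000000FF00ULL;

// Squares that white pawns can reach with a single push:
// shift every pawn up one rank, then drop the squares that are occupied.
constexpr Bitboard singlePushTargets(Bitboard whitePawns, Bitboard occupied) {
    return (whitePawns << 8) & ~occupied;
}
```

A single shift and a single AND handle all eight pawns at once. In a 64-bit build each of these is one register operation, while a 32-bit build has to work on the two halves of the bitboard separately.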
I now own a powerful 64-bit monster, but until recently I had not compiled a 64-bit target. iCE uses some inline assembler to speed up a few time-critical bit operations, and this is not portable to 64 bit out of the box. That code had to be rewritten, and I had not gotten around to it.
But tuning my engine is so CPU intensive that I really wanted to speed it up a bit, so I can run the same number of games in less time. So I recently rewrote those parts.
To measure the outcome, I produced different 64-bit targets using the Intel and the Microsoft compilers to find the fastest combination. The speed is measured by solving a set of 25 positions, each searched 12 ply deep.
pgo - Profile Guided Optimization is used
popcnt - The popcount processor instruction available on the Nehalem CPUs is used (instead of a software algorithm)
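As a rough sketch of what the popcnt option changes, the C++ snippet below puts a software bit count next to a compiler intrinsic. This is not iCE's code: the function names are hypothetical, and the software loop shown is the well-known "clear the lowest set bit" method, chosen only as an example of a software algorithm.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>   // __popcnt64 (x64 builds with the Microsoft or Intel compiler)
#endif

// Software fallback: each iteration clears the lowest set bit,
// so the loop runs once per set bit.
inline int popcountSoftware(std::uint64_t b) {
    int count = 0;
    while (b != 0) {
        b &= b - 1;
        ++count;
    }
    return count;
}

// Intrinsic version. On MSVC-style x64 compilers __popcnt64 emits the POPCNT
// instruction directly. On GCC/Clang, __builtin_popcountll only becomes a
// single POPCNT when the target allows it (e.g. -mpopcnt); otherwise the
// compiler falls back to its own software routine.
inline int popcountHardware(std::uint64_t b) {
#if defined(_MSC_VER)
    return static_cast<int>(__popcnt64(b));
#else
    return __builtin_popcountll(b);
#endif
}
```

Both functions return the same value for any input. The difference is only in speed, and a binary built around the POPCNT instruction will not run on CPUs that lack it.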
Looks like it was really worth it. The fastest build is about 75% faster than my 32-bit build.