Sunday, December 19, 2010

Detecting Three-fold repetition

While designing the new engine I thought about a way to efficiently detect two-fold and three-fold repetitions. My PASCAL mACE engine stores the hash keys of all positions that happened on the board and when a new position is added it looks through the last ones (up to the last pawn move or capture) whether this key shows up in the history. This is very safe, not too bad, but it involves a loop every time a move is made.

Maybe there are better ways of doing that, I thought about a special hash table that only stores a counter for every position that is incremented when the move to that positions is made and decremented when the moves is unmade (in search this happens very often). This would be pretty fast, but what about hash table collisions. Two different positions on the board map to the same hash slot. Are the likely, do I have to care about them.

So question is, in a game of maybe 200 moves, how likely is it that in a table of lets say 32.768 slots 2 different positions map to the same slot. This would confuse the repetition counter and lead to incorrect results.

When you consider the numbers (200 moves, 32768 slots) you might think that is is very unlikely that two positions collide, but in fact it is rather likely. The probability is about 46%. So it that setup every other game you would end up with incorrect repetition detection, which is not acceptable.

How does the probability change when I increase the the hash size. I created a little spreadsheet, that did this for me. The probability P for a collision is P = 1 - (y/(y-x))^(y-x) * e^-x


So it shows that even with large hash sizes a significant probability remains that a collision occurs, I have to find a smart way of handling them.