Saturday, March 30, 2013

Positional evaluation in chess

Chess engine are very good a tactics because it fits their way of approaching a position. To play moves that improve the position without gaining an immediate material advantage or even sacrifice material for a better position is much harder to program. It requires a careful coding of positional chess concepts and a fair amount of chess knowledge in its programmer.

My automated tuner seems to help here a bit. The tuned version is much more willing to give up some material to improve its position then the old "materialistic" version was.

In Jeremy Silman's "The Amateurs Mind" he discussed the following position

1r4k1/5p1p/1p3np1/pR6/P1Pr3P/2R2B2/1P4P1/7K w - -

Fischer - Taimanov, Palma de Mallorca 1970, White to Play
White has the queen side pawn majority and an active bishop targeting the queen side. The black knight wants to relocate to c5. Also Black threatens to take the white pawn at h4 with check.

I gave this position to my original version and it wasn't really understanding the position. At lower depths it tried to save the pawn with g3 and at higher depths it suggested 1. h5 Ne4 2. Bxe4 Rxe4 giving up its strong bishop and its advantage.

The tuned version very quickly found a better move. 1. c5! (1. .. Rxh4+ 2. Kg1 Rb4 3. Rb3 Rxb3 4. Rxb3 Nd7 5. c6). The pawn then queens after a rook sacrifice (Rxb6) which was a tactics my engine still missed so deep down the tree.

But it seems to make a little progress in the area of positional understanding.

Sunday, March 24, 2013

Time Trouble (again)

During the tuning process of the engine I realized that some games that were played during the tuning are lost on time. The games are played very fast, e.g. 10 moves within 1 second, so an occasional loss on time is not uncommon. But after I measured it, the rate was between 15% and 20% time losses, which is much to high.

I thought my time management module was working correctly. It should adapt to fast time controls and prevent time losses, but it did not. I decided to have a look at it and spotted the problem almost instantly.
The second line was reading hardLimit = softLimit causing the trouble

The hard limit is the time that the search has to stop no matter what to prevent a time loss. The soft limit is a recommendation for the search but can be extended. There is a final check that the soft limit can never be bigger than the hard limit, but unfortunately I messed up the order. Instead of lowering the soft limit I extended the hard limit.

Stupid me. 
Fixed, tested and the time loss percentage dropped to 0%.

PS.: Another GA run is almost done, so I will soon have enough samples to check how much the whole stuff is worth in terms of ELO points. 

Saturday, March 16, 2013

Population Based Incremental Learning (PBIL)

As mentioned earlier I'm using a genetic algorithm to tune my chess evaluation module. I'm not using a classic GA where a good mother and a father is selected and their genes combined to produced the members of the next generation. I'm using a more modern approach called PBIL (Population Based Incremental Learning) and I even used it with some modifications. It is more similar now to "simulated annealing".

During my tests I made the following observations
  • The algorithm is able to converge the more important terms rather quickly
  • Terms with lesser influence converge very slowly, some none not at all given about 1000 generations
  • The choice of the fitness function is very important. Runs with some fast and simple functions failed badly.
  • The choice of the GA parameters is rather a second degree influence factor. All runs optimized the gene pool towards the fitness function very good. Some were just faster. But I haven't spent to much time trying to optimize these as a serious run with a complex fitness function takes about a week to complete. 
Here is the evolutionary progress of a run regarding the material weights. Already in generation 9 the values are very good, stable and close to their final values (except maybe for the queen weight).
Convergence data of the material values
Convergence of the Probability Vector after 9, 200, 400 and 1100 generations

As a measurement of increasing strength I checked the ability of the engines to find the best move in a set of positions. After less than 200 generations the engines reached the strength of my manual "tuned" engine. After 1000 generations the engines solved much more. Whether they also play better chess is yet to prove.


Saturday, March 9, 2013

Chess evaluation evolution

My little chess engine plays correct chess for some time now and also reached a decent strength level. So it is not yet among the worlds top 100 chess engines but the goal is in sight at least (place 117).

So far I haven't spent any time with tuning the evaluation. That does not mean its evaluation is simple. It is not, it possesses already a lot of chess knowledge but it is not able to use that knowledge to its full extend. Besides some basic material knowledge it also understands the concepts of mobility, center control, king safety, basic pawn structure properties, passed pawns or development.

So ice knows that having a doubled pawn is considered a liability and having control of the center is usually an advantage. So it will try to avoid unnecessarily doubling its pawns and will play moves that improve its influence in the center. But a game of chess consists usually of trade offs, unless your opponent makes a mistake you don't get anything for free. So it might be a good deal in a position to double the pawns if this also leads to more center control and a half open file for the rook, or it might still be bad deal.

Those things to decide are very difficult for an engine. A human will use its intuition for that which comes from experience, but an engine does not have this. It has to use a mathematical approach to decide whether some positional imbalances are good or bad for it.

And here tuning the evaluation features comes into play. With good tuned features an engine will be better in estimating whether the positional concessions in a line of play are worth it or whether this line should be avoided. Unfortunately picking the correct values is very tough and most of the values are connected somehow.

A simple example. Most chess players learn very early that a pawn is worth 1 point, knight and bishop 3, the rook 5 and the queen 9. Those are values that can be used as material values for the pieces. So now you add a bonus for the rook on an open file, this will increase the overall value of the rook, so to keep everything in balance its material value must be lowered. As my evaluation consists of more than a 100 terms that all are somehow connected the task of finding an optimal balanced setting lets the traveling sales man problem look like an easy job.

As I was always fascinated by genetic algorithms to solve optimization problems I decided to give it a try to optimize my evaluation. The general public opinion in the chess community is that it does not work. Of course this can also mean that it does work but the people that were successful with it don't talk about it. Having an effective and working automated evaluation tuner will give them a big competitive advantage against the manual weight pickers, so why give it away. I wanted to try it anyway because at least I learn something with it, even if it fails. And it is much more interesting than boring manual tuning.

I decided not to use some existing GA C libraries but to program the algorithm myself. Over the past month I dedicated a bit of my time to it and slowly an advanced tuning framework evolved. And my new i7 will provide the horse power to run it.

So far I did a few runs with different GA parameter settings and the results look promising. I might not be able to find the perfect weights but I'm surely able to find better weights than I would be able to find manually. So it is at least for me a limited form of success, and maybe more than that.

To find out I have to further test the evolved solutions with different parameters and fitness functions against the former version of ice and also against other engines. This will take some more time. Luckily not mine, everything runs automated and unattended, just my electricity bill will probably get a hit.