To read John Hartmann’s Chess Tech Tips 1-3, visit Part 1 of this article.
4. Start your Engines!
Engines are the third leg in the chess technology triad, and in some ways, they are perhaps the most important. When compared with traditional book study, working with ChessBase alone was different only in degree and not in kind. The program allowed users to study much more quickly and efficiently, but they still had to rely on their own powers of calculation to make assessments. With the rise of the engines, however, we chess mortals suddenly had unfettered access to Grandmaster-level analysis at the push of a button.
Dozens of chess engines are available for sale and for free download on the Internet. So which ones are strongest? Which are best for analysis? The answers to these two questions largely overlap, but it’s worth considering them separately.
There are multiple engine ratings lists from reputable testing groups, the most notable being those from CCRL, CEGT, IPON and SSDF. Almost all agree that the top three engines are Stockfish 8 (an open-source engine), Komodo 10.3 (commercial), and Houdini 5.01 (commercial), and usually in that order. Stockfish also won the year’s preeminent engine competition, the Top Chess Engine Championship (TCEC), soundly defeating Houdini in a 100-game match on very fast hardware, while Houdini won the TCEC Rapid Tournament.
A few other engines are worth mentioning beyond these top three. Deep Shredder 13 (commercial), a new incarnation of a venerable program, is in fourth place on most of the lists above, followed by a slew of less heralded entries like Andscacs, Critter, Equinox, Fizbo, and Gull (all free or open-source) and older favorites like Fritz and Hiarcs (both commercial).
While the ratings achieved by these engines are impressive – Stockfish approaches 3500 on some lists! – remember that they only measure results in computer-computer play. Whether an engine is analytically useful is much more subjective. Comparative strength of play is important, but it’s also important to settle upon an engine (or two or three) whose analysis you understand and trust.
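To get a feel for what those rating gaps mean in computer-computer play, you can run the numbers yourself. The sketch below uses the standard Elo expected-score formula; the specific ratings plugged in are illustrative round numbers, not figures from any particular list:

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B (0.0 to 1.0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 200-point gap already predicts about a 76% score for the stronger engine.
print(round(expected_score(3400, 3200), 2))  # 0.76

# Equal ratings predict an even 50% score.
print(expected_score(3000, 3000))  # 0.5
```

Note that the formula says nothing about *how* the stronger engine wins, which is exactly why raw ratings tell you little about analytical usefulness.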
Some, on the basis of this line of thinking, argue that it’s best to stick with one engine, to learn its qualities, its strengths and its biases. I can see the logic of this, as some engines are better at some things than others. Stockfish 8, for example, is excellent in the endgame, particularly when it has access to tablebases. But it can also ‘cop out’ in very complex positions, returning 0.00 scores that culminate in unnatural perpetuals. Once you know an engine’s weaknesses, you can, in a sense, compensate for them.
There are others who think it important to consult a few engines in tough positions. Because engines aren’t perfect, as we will soon see, the idea here would be to solicit multiple opinions and use human judgment to more closely approach the truth. I tend to fall into this camp, but this may simply be personal preference combined with a touch of hubris. My rotation includes many of the engines listed above, with Stockfish and Komodo getting the most use.
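The "several opinions" approach can even be mechanized in a small way. The sketch below assumes you have already collected centipawn evaluations from your engines (the engine names and scores here are invented for illustration); it simply flags positions where the engines disagree enough to deserve a closer human look:

```python
def consensus(evals, spread_cp=50):
    """Given a dict of engine name -> centipawn score (from White's
    perspective), return the mean score and whether the engines
    roughly agree (spread within spread_cp centipawns)."""
    scores = list(evals.values())
    mean = sum(scores) / len(scores)
    agree = max(scores) - min(scores) <= spread_cp
    return mean, agree

# Hypothetical scores for one test position:
evals = {"Stockfish": 85, "Komodo": 30, "Houdini": 78}
mean, agree = consensus(evals)
print(f"mean {mean:+.1f} cp, agreement: {agree}")  # mean +64.3 cp, agreement: False
```

A disagreement flag like this is no substitute for judgment, of course; it just tells you where judgment is most needed.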
5. Trust… but verify!
The key thing to remember about engines is that none are infallible. All of the engines discussed in this article are far, far stronger than any human alive. And it’s true that computers are excellent at finding consistently good moves and avoiding bad ones. This doesn’t mean that the computer always finds the best moves, or that best moves can always be identified.
Computer chess enthusiasts have created specialized positions that can fool even the best engines, where completely blocked positions still register as a win for one side, or where bishop underpromotions (not coded into Fritz or Rybka) are the only moves that win. Here, however, I want to take a look at three recent real-life examples to help illustrate both the power and the limits of computer analysis.
Fabiano Caruana nearly broke the Internet when he played 21.Nf5! (instead of 21.Nc6) against Hikaru Nakamura at the 2016 London Chess Classic. After the game Caruana said that “I’d analyzed this and the computer doesn’t show 21.Nf5. [ed: It was found by his second Rustam Kasimdzhanov in a training game.] The problem is that the computer doesn’t understand that after 21.Nf5 Bxf5 Black is pretty much just lost. It’s one of the saddest positions I’ve ever seen for Black.”
The truth, however, is that engines can find this move… provided you give them enough time. When I tested the four leading engines in this position on a moderately fast quad-core machine, I was surprised to learn that three of the four could find the move after all.
| Engine | Time to find 21.Nf5 (minutes) |
| --- | --- |
| Stockfish (latest developmental version) | 8:50 |
| Deep Shredder 13 | N/A |
There are echoes here of Kramnik’s infamous game 8 loss to Peter Leko in their 2004 World Championship match. Had his assistants let their engines run slightly longer during their preparation, they would have seen that their idea – which Kramnik duly played – was refutable, something that Leko was happy to discover over the board.
In both cases the lesson is clear: blindly trusting computer analysis is not without risk. A quick scan with the engine can be useful, but there’s no replacement for rigorous analysis that combines silicon strength with human guile.
This position is from the second tiebreak game in the recent Carlsen-Karjakin World Championship match, where Carlsen has just played 40.Bd3. I was lucky enough to witness part of the match in person, and as compelling as it was, I think I’ve spent more time in recent weeks thinking about the punditry and the way it framed the match than the actual match itself.
Agon’s commentary team made very heavy use of computer analysis in their work, with a ‘prediction bar’ embedded in the graphics package. Grandmaster analysts were continually interrupted by anchorman Knut Solberg’s gleeful shouting of current engine evaluations. All of this worked to reinforce the idea that computer analysis was the best predictor of the game’s correct outcome and each player’s best moves.
If you asked a strong human player to evaluate the position above, they might say that the game was somewhere between a win and a draw, but that the ultimate evaluation was not clear. The engine, however, gives White a consistent advantage, and it found more than one win for Carlsen before Karjakin finally held the draw in 84 moves.
What is ‘winning’ for an engine isn’t always winning for a human. Sometimes the win requires a move that is completely foreign to human logic. In this case, as Greg Shahade (citing Gregory Kaidanov) has said, it’s best to pretend the move doesn’t exist and instead focus on moves that mortals might actually play.
The point is that humans, struggling against a clock, an opponent, and their own psyches, can’t be expected to find moves that a computer will. Carlsen shouldn’t be blamed for missing precise moves in a pressure-filled rapid game. Some of the computer’s proposed wins were literally super-human.
There is also a lesson here for the chess analyst. Some computer moves and evaluations are irrelevant for humans. You have to understand why a move is winning, not just that it is. If you can’t win the resulting position over the board, the engine’s brilliancy is of no use, and it’s better to choose a ‘weaker,’ more ‘human’ move. Boris Gelfand has recently made much the same argument:
“I am sure that there was someone watching the game [Karjakin-Gelfand, World Cup (6.1), 2009] online who thought I had blundered because the engine’s evaluation went from -6 to -2.5, but this is a complete misunderstanding. If the position is winning and you know how to win, then you are home and dry.” (Dynamic Decision Making in Chess, 239)
Our final example comes from the realm of high endgame theory, and in this case, we encounter a position where the engine’s evaluation is fundamentally wrong. Both Stockfish (+2.6 at 57 ply) and Komodo (+1.7 at 47 ply) see the Kantorovich position (taken from Tal-Gipslis, 1983) as winning for White.
Here we run into an extreme form of the so-called horizon effect, where an engine cannot analyze deeply enough to fully understand the position. Human analysts like Jacob Aagaard and Vartan Poghosyan have – with computer assistance, to be sure – discovered that the position is ultimately drawn following 1.Kd4 g5 2.Kd5 g4 (Poghosyan) or 2…Kg6 followed by …f6, …Kf5 and kingside counterplay (Aagaard).
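The horizon effect is easy to reproduce with a toy fixed-depth search. The game tree and values below are invented for illustration, but the mechanism is the real one: at its depth limit the search must fall back on a static evaluation, so a refutation lying just past the horizon is invisible:

```python
# Each node is either a leaf value (in pawns, from White's perspective)
# or a dict holding a "static" evaluation and a "moves" dict.
TREE = {
    "static": 0.0,
    "moves": {  # White to move
        "grab pawn": {
            "static": 1.0,  # up a pawn: looks good at the horizon
            "moves": {  # Black to move
                "spite check": {
                    "static": 1.0,
                    "moves": {  # White to move
                        "forced king move": {
                            "static": 1.0,
                            "moves": {"fork wins the queen": -8.0},  # Black
                        },
                    },
                },
            },
        },
        "quiet move": {"static": 0.2, "moves": {"solid reply": 0.2}},
    },
}

def minimax(node, depth, maximizing):
    if isinstance(node, (int, float)):  # terminal leaf
        return node
    if depth == 0:  # horizon reached: trust the static evaluation
        return node["static"]
    values = [minimax(child, depth - 1, not maximizing)
              for child in node["moves"].values()]
    return max(values) if maximizing else min(values)

def best_move(tree, depth):
    """White's preferred move at a fixed search depth."""
    return max(tree["moves"],
               key=lambda m: minimax(tree["moves"][m], depth - 1, False))

print(best_move(TREE, 2))  # grab pawn  -- the refutation lies past the horizon
print(best_move(TREE, 4))  # quiet move -- two plies deeper, the fork is seen
```

Real engines push the horizon back with quiescence search and selective extensions, but in deep fortress-like endgames even fifty-plus plies may not be enough, which is exactly what the Kantorovich position exploits.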
It is rare that computers make mistakes in evaluation like this, but they do occur. Let your engine try to assess 24.Qxe5!! in Gusev-Averbakh (Moscow, 1951) if you want proof, or set it on some of the latest Mar del Plata lines in the King’s Indian and watch it struggle.
So where does this leave us? The computer is a powerful tool for improvement, but it is vital to remember that it is just a tool and not allow our thinking to be dominated by it. My rule of thumb is this: a chess engine can always tell you if a move is bad, but it can’t always tell you what move is best. There are multiple good moves in many non-tactical positions, and it’s up to the human user to pick one that suits her style and practical needs. Trust your favorite engine, but don’t forget to verify!