An AI developed by the British company DeepMind has ranked among the best players on the online version of the famous board game Stratego, after learning to bluff and to sacrifice important pieces in order to win.

DeepNash artificial intelligence

Inspired by the Napoleonic campaigns, Stratego pits two players against each other, each attempting to capture the opposing flag, which is hidden among a set of 40 game pieces. Most of these pieces represent military functions or ranks numbered 1 to 10 (with higher-ranking soldiers defeating their lower-ranking counterparts) but, unlike in most board games, only a direct confrontation reveals the identity of an opponent’s piece.

In work published in the journal Science, Julien Perolat and his colleagues at DeepMind developed a new AI named DeepNash, designed to outplay human opponents at this extremely complex strategy game (with 10^535 possible game situations, more than Go, chess or poker).

To do this, the team had its artificial intelligence play some 5.5 billion simulated games, corresponding to hundreds of years of play. Unlike the training of many comparable systems, this learning phase did not involve analyzing human Stratego strategies or playing against real players.

Rather than trying to explore every possible game situation, which would be technically infeasible, DeepNash’s algorithm constantly steers its behavior toward an “optimal” strategy, one that guarantees a win rate of at least 50% against an opponent who makes no mistakes. It does so despite incomplete information and the considerable number of actions available at each turn.
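To make the idea of a strategy that cannot lose on average more concrete, here is a minimal sketch on a toy game. It is not DeepNash's algorithm, which this article does not detail: it uses classic fictitious play on rock-paper-scissors as a stand-in, and the function names are hypothetical. In this symmetric game, an average payoff of 0 is the analogue of a 50% win rate.

```python
# Toy illustration only: fictitious play on rock-paper-scissors.
# Assumption: this stands in for the general idea of converging toward an
# unexploitable strategy; it is NOT the method used by DeepNash.

# Payoff to the player choosing the row action: +1 win, 0 draw, -1 loss.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def best_response(opponent_counts):
    """Pure action with the highest payoff against the opponent's past play."""
    scores = [
        sum(PAYOFF[a][b] * opponent_counts[b] for b in range(3))
        for a in range(3)
    ]
    return max(range(3), key=lambda a: scores[a])


def fictitious_play(rounds=30000):
    """Both players repeatedly best-respond to each other's action history.

    Returns the first player's empirical action frequencies.
    """
    counts = [[1, 0, 0], [1, 0, 0]]  # arbitrary start: each has played rock once
    for _ in range(rounds):
        a = best_response(counts[1])
        b = best_response(counts[0])
        counts[0][a] += 1
        counts[1][b] += 1
    total = sum(counts[0])
    return [c / total for c in counts[0]]


def worst_case_payoff(strategy):
    """Expected payoff of a mixed strategy against its most damaging pure reply."""
    return min(
        sum(strategy[a] * PAYOFF[a][b] for a in range(3))
        for b in range(3)
    )


if __name__ == "__main__":
    strategy = fictitious_play()
    print("empirical strategy:", [round(p, 3) for p in strategy])
    print("worst-case payoff:", round(worst_case_payoff(strategy), 4))
```

The empirical frequencies drift toward playing each action about a third of the time, and the worst-case payoff approaches 0: no reply, however well chosen, beats the strategy on average. Stratego is vastly larger and involves hidden information, so the same guarantee has to be approximated by learning rather than computed directly.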

Impressive performance

DeepNash achieved an 84% win rate over 50 ranked matches against expert human players on the online version of the game, allowing it to enter the top three without its opponents suspecting they were facing a machine. DeepMind’s AI also won 97% of its games against the top Stratego-playing bots.

“Good players tend to memorize their opponent’s pieces and predict their deployment patterns,” commented Georgios Yannakakis of the University of Malta. “While DeepNash has a clear advantage when it comes to memorization, it does both well, plays interestingly and unpredictably, and isn’t shy about bluffing.”

“The most surprising thing is probably its ability to sacrifice valuable pieces to obtain information about the opponent’s setup and strategy,” concludes Perolat.

