Bot Bowl

Recently I learned about a (video) game called Blood Bowl, which combines American football with Warhammer. I'm not a huge fan of Warhammer games because there's too much gore, but the interesting thing about Blood Bowl is that there is an associated AI competition called Bot Bowl. The competition also comes with a paper that describes the simulation environment.

The simulation environment comes with a little GUI and can be found as a Python project on GitHub.

There are many game environments with competitions, but it turns out that this game in particular is more complex than many because of the large search space. To quote the paper:

With 10 players able to take actions in a fixed order, the average turn-wise branching factor is $$10^{5 \times 10} = 10^{50}$$. As players can take actions in any order, this is a lower-bound estimate. In comparison, the turn-wise branching factor is around 30 in Chess and 300 in Go.
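To get a feel for that number, here is a minimal sketch of the arithmetic in the quote. The per-player figure of $$10^5$$ is simply what the quoted exponent implies, and the function name is my own, so treat this as an illustration rather than anything taken from the paper's code.

```python
import math

# Assumption: each of the 10 players has roughly 10^5 possible action
# sequences, which is what the exponent in the quoted estimate implies.
PER_PLAYER_BRANCHING = 10**5
PLAYERS_PER_TURN = 10


def turnwise_branching(per_player: int, players: int) -> int:
    """Branching factor of a whole turn when players act in a fixed order."""
    return per_player ** players


blood_bowl = turnwise_branching(PER_PLAYER_BRANCHING, PLAYERS_PER_TURN)

# Compare orders of magnitude with the Chess and Go figures from the quote.
for name, branching in [("Chess", 30), ("Go", 300), ("Blood Bowl", blood_bowl)]:
    print(f"{name}: roughly 10^{math.log10(branching):.1f} options per turn")
```

Even on a log scale the gap is large: Chess and Go sit between one and three orders of magnitude, while a single Blood Bowl turn sits at fifty.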

So given such a huge search space, how would you go about writing a bot?

Learning how to Win

There have been many bots and approaches for this game, but the winner of the 3rd Bot Bowl used an interesting trick with their MimicBot.

Before this bot, no machine learning approach had beaten a scripted bot that uses domain knowledge of the game. Scripted bots were so popular because the search space is so vast that many ML bots would just get stuck exploring losing states without learning anything useful. It turned out that randomly playing agents were not able to score a single touchdown, even after 300K simulated games.

So what's the solution? Instead of starting with random behavior, MimicBot first tries to learn to mimic existing players. To quote the article:

In the most generic sense, imitation learning agents aim at mimicking the human behavior on a given task. Given the ability to access a human expert acting in the same environment as the agent, a number of states $$s$$ and the corresponding actions $$\hat{a}$$ are captured. The agent is then trained to mimic the performance of the human given the captured pairs.

What's interesting here is that they didn't even need human players.

In this work, we demonstrate the potential of using Behavioural Cloning to pre-train a reinforcement learning policy. Instead of using human experts, we demonstrate that having access to scripted agents enables to achieve superior performance, without the need to put in place large scale collection of human games. We also show, that combining Reinforcement Learning and Behavioural Cloning enables to avoid the so called ”policy collapse” that can happen when training with a RL solution.
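To make the idea concrete, here is a minimal sketch of behavioural cloning from a scripted agent. Everything in it is a toy assumption of mine: the one-dimensional pitch, the `scripted_policy` expert and the tabular policy are hypothetical and have nothing to do with the actual MimicBot code or the Bot Bowl environment. It only shows the pattern of recording state-action pairs from a script and fitting a policy to them.

```python
import random
from collections import Counter, defaultdict

ACTIONS = ("left", "right", "stay")
FIELD_LENGTH = 8  # hypothetical 1-D pitch; the end zone is the last square


def scripted_policy(position: int) -> str:
    """Hypothetical hand-written expert: always run toward the end zone."""
    return "right" if position < FIELD_LENGTH - 1 else "stay"


def collect_demonstrations(n_samples: int, rng: random.Random):
    """Record (state, expert action) pairs from the scripted agent."""
    states = [rng.randrange(FIELD_LENGTH) for _ in range(n_samples)]
    return [(state, scripted_policy(state)) for state in states]


def behavioural_cloning(demos):
    """Fit a tabular policy by maximum likelihood (per-state action frequencies)."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    policy = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        policy[state] = {a: action_counts[a] / total for a in ACTIONS}
    return policy


def act(policy, state: int, rng: random.Random) -> str:
    """Sample from the cloned policy; fall back to uniform for unseen states."""
    probs = policy.get(state)
    if probs is None:
        return rng.choice(ACTIONS)
    return rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]


rng = random.Random(0)
demos = collect_demonstrations(500, rng)
cloned = behavioural_cloning(demos)

# Fraction of states where the cloned policy picks the same action as the script.
agreement = sum(
    act(cloned, state, rng) == scripted_policy(state)
    for state in range(FIELD_LENGTH)
) / FIELD_LENGTH
print(f"agreement with scripted agent: {agreement:.0%}")
```

In a real setup the table would be a neural network and the cloned policy would only be the starting point for reinforcement learning, but the data flow is the same: the script plays, the learner copies, and only then does exploration begin.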

Thanks to imitation learning and a hybrid decision-making process, MimicBot seems to consistently beat the scripted agents.

In case it's of interest, this work was also presented at PyData Eindhoven.

Applications

Another pretty interesting anecdote is that there's a market for these kinds of bots: playtesting video games! It seems like companies like modl.ai are actually exploring this market.

I had no idea this was a thing. Today I learned.