In a previous document I've explained how to use the CAPM model to create naive ranking for pokemon. Even though the approach had some logic to it, I wanted to explore if a naive probibalistic implementation of trueskill could do the job as well. This document contains a python implementation my naive approach as well as an application to a pokemon dataset. The first part will explain how it is made and tested. The second part will analyse the ranking of all 750 pokemon.

Starting Data

Before writing any application code, it seems sensible to first think about how to test if the code works. It would be a good idea to simulate some random players with predetermined skill levels such that we can test if our ranking algorithm can at least rank simulated data well enough. The code below generates this data.

import uuid
                            import numpy as np
                            import pandas as pd
                            import json 
                            import matplotlib.pyplot as plt
                            %matplotlib inline
import seaborn; seaborn.set() SIZE_ARR = 100 def sim(p1, p2, entropy = 1): """ Given two players (tuples: <name, skill>) simulate a game between them. Returns (winner-name, loser-name) """ r = np.random.randint(-entropy, entropy + 1) if p1[1] + r > p2[1]: return p1[0], p2[0] else: return p2[0], p1[0]
def gen_data(n_users = 20, n_games = 100, entropy = 1): """ Generate a lot of games and outputs a list of user tuples (<name>, <skill>)as well as a list of games played. """ data = [] users = [(str(uuid.uuid4())[:6], i) for i in range(n_users)] for in range(n_games): i, j = np.random.choice(range(len(users)), 2) u1, u2 = users[i], users[j] data.append(sim(u1, u2, entropy)) return users, pd.DataFrame(data, columns = ['winner', 'loser'])

Code that does the math

We'll now concern ourselves with writing code that updates a system of beliefs. The belief will be captured in a numpy array and can be interpreted as a histogram of likelihood values for a certain level of skill. The algorithm is explained in the previous blogpost as well but schematically, this is what the belief update looks like;

We start out with two arrays of prior skill beliefs for player 1 and player 2. We then combine them in a two dimensional density. This is our prior belief of the skill levels of the two players. Once a player wins, we can update this belief. One player has a lower skill level. To translate this to our belief system, we cut away all likelihood in the triangle of the losing player. Then we marginalise our belief back to the two players. By repeating this for all the games, we hope to achieve a belief of skill that can be updated in real time.

Towards Code

Next we'll define all the functions that do the ranking. These functions are meant to be minimal, as we only want to check if this algorithm can work. It makes sense to think about what our app code might look like and to think about how to test individual components before writing the code though. I'm not particularily religious about testing but I do see some value for writing a test before writing code here. We're going to be munging with numpy matrices and typos in indices will be unforgiving.

p1 = np.array([1.,2.,3.])
                        p2 = np.array([3.,2.,1.])
                        p3 = np.array([1.,1.,1.])
                        mat1 = gen_prior(p1, p2)
                        mat2 = gen_prior(p2, p1)
                        mat3 = gen_prior(p3, p3)
                        m1, m2 = gen_marginals(mat1)
                        m3, m4 = gen_marginals(mat2)
                        np1, np2 = gen_marginals(cut_matrix(mat1))
                        assert all(m1 == m2[::-1]) #marginals should still be same shape 
                        assert all(m3 == m4[::-1]) #marginals should still be same shape 
                        assert np.sum(mat1) == np.sum(mat2) #matrices should have equal values
                        assert np.sum(mat3) == 9 #all ones should sum to 9 
                    assert all(np1 == np2[::-1]) #marginals should still be same shape

With these tests in place, I could worry about the code.

def get_player_stats(name):
                            """
                                Check if a player exists in the dictionary. If not, add it. 
                                Then returns the player belief. 
                                """
                            if name not in d.keys(): 
                            d[name] = np.ones(SIZE_ARR)/np.sum(np.ones(SIZE_ARR))
                            return d[name]
                            def gen_prior(arr1, arr2):
                            """
                                Combines two arrays into a matrix.
                                """
                            return np.matrix(arr1).T  np.matrix(arr2)
                            def cut_matrix(mat): 
                            """
                                Cuts the matrix according to likelihood update rule. 
                                Also applies normalisation
                                """
                            posterior = np.triu(mat) + 0.000001
                            posterior = posterior/np.sum(posterior)
                            return posterior
                            def gen_marginals(posterior_mat):
                            """
                                From a cut matrix, generate the appropriate marginals back
                                """
                            winner = np.squeeze(np.asarray(np.sum(posterior_mat, 0)))
                            loser = np.squeeze(np.asarray(np.sum(posterior_mat, 1)))
                            return winner, loser
                            def update(winner, loser):
                            """
                                Given a winner and user, update our state. 
                                """
                            winner_arr = get_player_stats(winner)
                            loser_arr = get_player_stats(loser)
                            prior_mat = gen_prior(loser_arr, winner_arr)
                            posterior_mat = cut_matrix(prior_mat)
                            new_winner, new_loser = gen_marginals(posterior_mat)
                            d[loser] = new_loser
                            d[winner] = new_winner
                        

I wrote the code and the tests passed. For now this is enough. Before running this code against the pokemon dataset, let us first run it against some simulated data.

Running the code a few times.

def plot_scores(plot_idx): 
                            plt.subplot(plot_idx)
                            plt.scatter(
                            np.arange(len(users)), 
                            [np.mean(d[usr[0]]  np.arange(SIZE_ARR)) for usr in users]
                            )
                            def plot_dist(plot_idx):
                            plt.subplot(plot_idx)
                            for usr in users:
                            plt.plot(np.arange(SIZE_ARR), d[usr[0]], alpha = 0.5)
                        

Simulate n_players = 100, n_games = 400, entropy = 1

d = {} 
                            users, df = gen_data(n_users=100, n_games=400, entropy=1)
                            for i in json.loads(df.to_json(orient ='records')): 
                            update(i['winner'], i['loser'])
                            plt.figure(1, figsize=(15,10))
                            plot_scores(221)
                            plot_dist(222)
                            d = {} 
                            users, df = gen_data(n_users=100, n_games=10000, entropy=1)
                            for i in json.loads(df.to_json(orient ='records')): 
                            update(i['winner'], i['loser'])
                            plot_scores(223)
                            plot_dist(224)
                        

Simulate n_players = 100, n_games = 400/10000, entropy = 15

Increasing the entropy in the system will cause the ranking algorithm to measure the noise more than the signal, but more games allow for convergence to still occur.

d = {} 
                        users, df = gen_data(n_users=100, n_games=400, entropy=15)
                        for i in json.loads(df.to_json(orient ='records')): 
                        update(i['winner'], i['loser'])
                        plt.figure(1, figsize=(15,10))
                        plot_scores(221)
                        plot_dist(222)
                        d = {} 
                        users, df = gen_data(n_users=100, n_games=10000, entropy=15)
                        for i in json.loads(df.to_json(orient ='records')): 
                        update(i['winner'], i['loser'])
                        plot_scores(223)
                        plot_dist(224)
                    

Simulate n_players = 100, n_games = 400/10000, entropy = 50

Suppose that there is more noise than signal in this sytem? In that case it will correctly indicate this to us.

d = {} 
                    users, df = gen_data(n_users=100, n_games=400, entropy=50)
                    for i in json.loads(df.to_json(orient ='records')): 
                    update(i['winner'], i['loser'])
                    plt.figure(1, figsize=(15,10))
                    plot_scores(221)
                    plot_dist(222)
                    d = {} 
                    users, df = gen_data(n_users=100, n_games=10000, entropy=50)
                    for i in json.loads(df.to_json(orient ='records')): 
                    update(i['winner'], i['loser'])
                    plot_scores(223)
                    plot_dist(224)
                

There's still some room for improvement to this system to make the convergence better, but it seems well enough for my purposes. I want to use this metric to rank pokemon so if I need to simulate some more outcomes between two pokemon then I won't mind. For more real-time purposes, the following tweaks seem to be important;

  • New matches are allocated randomly, which is suboptimal from a learning perspective. If you want to fine tune the top 10 players it makes sense to have them play against eachother if you are unsure who is number 1. If instead you'd have one of these players compete against somebody mediocre you can predict the game outcome before it even started. This means you would not learn anything from that game.
  • I am using buckets to simulate a distribution for a player. It may be better to use Beta distributions instead and save it's parameters. This would make the blogpost harder to read for more novice readers which is the main reason of not doing it, but this should improve the stability and remove the need for the hard coded smoothing parameter in the cut_matrix function.

For the pokemon dataset I only have one concern; the effect of rock paper scissors.

Pokemon Data

I took the liberty of preparing a dataset. It's easy to do yourself with a bit of python and this project. I'll use the formula from my previous post here again to create a function that can simulate the outcome of a pokemon battle. It will be based on the number of turns that one pokemon can outlast the other.

$$ T
= \frac{HP_i}{DMG_{ji}} $$

where $T_{ij}$ is the number of turns the pokemon $i$ would be able to survive the opponent $j$ if it were to be attacked indefinately, $HP_i$ is the amount of hitpoints that pokemon $i$ has and $DMG_ji$ is the amount of damage a basic attack pokemon $j$ would deal to pokemon $i$. This damage is defined via:

$$ DMG_{ji} = \frac{2L_j+10}{250} \times \frac{A_j}{D_i} \times w_{ji}$$

The situation will very much assume some core aspects of the game away. The hope is that my making this assumption we'll still leave much of the actual pokemon game intact.

poke_df = pd.read_csv("http://koaning.io/theme/data/full_pokemon.csv")
                    weak_df = pd.read_csv("http://koaning.io/theme/data/pokemon_weakness.csv")
                
poke_df.head()
                
def weakness_weight(types_j, types_i): 
                    """
                        Returns w_{ji} 
                        """
                    res = 1.
                    possible_types = [t for t in weak_df['type']]
                    for i in filter(lambda t: t in possible_types, types_i):
                    for j in filter(lambda t: t in possible_types, types_j):
                    res = float(weak_df[weak_df['type'] == j][i])
                    return res
                    def calc_tij(poke_i, poke_j):
                    """
                        Calculate the number of turns one pokemon would outlast the other. 
                        """
                    poke_row_i, poke_row_j = poke_df[poke_df['name'] == poke_i].iloc[0], poke_df[poke_df['name'] == poke_j].iloc[0]
                    dmg_ji = (200. + 10)/250
                    dmg_ji = float(poke_row_j['attack'])/float(poke_row_i['defence'])
                    dmg_ji *= weakness_weight(eval(poke_row_j['type']), eval(poke_row_i['type']))
                    return poke_row_i['hp']/dmg_ji
                    def sim_poke_battle(poke_i, poke_j):
                    """
                        Simulate the outcome between two pokemon. Output: winner_name, loser_name.
                        """
                    if calc_tij(poke_i, poke_j) > calc_tij(poke_j, poke_i):
                    return poke_i, poke_j
                    return poke_j, poke_i
                

Let it run here

d = {} 
                    poke_names = [i for i in poke_df['name']]
                    for  in range(len(poke_names) - 1):
                    p1, p2 = poke_names[], poke_names[ + 1] 
                    win, lose = sim_poke_battle(p1, p2)
                    update(win, lose)
                    # this takes a while! 
                    for  in range(50000): 
                    i, j = np.random.choice(range(len(poke_names)), 2)
                    p1, p2 = poke_names[i], poke_names[j] 
                    win, lose = sim_poke_battle(p1, p2)
                    update(win, lose)
                

This takes a while, but when it is done we can create the same distribution chart as before but now for pokemon.

plt.figure(figsize=(15,5))
                    for poke in poke_names:
                    plt.plot(np.arange(SIZE_ARR), d[poke], alpha = 0.5)