Today I Learned

Inspired by Simon Willison, this part of my site is for short lessons worth journalling.


2022-11-20
Playtesting Candy Crush

with Deep Learning?

2022-11-18
Bot Bowl

an AI Competition in Blood Bowl

2022-11-02
Missing Pedestrians

In a self-driving car dataset. Ouch.

2022-11-01
Ascent

Library with some cool sentiment ideas

2022-10-27
Game Time Distribution

Turns out, it's Weibull?

2022-10-26
Minecraft Diffusion

Crafting starting points for diffusion.

2022-10-22
Only 7 Percent

Not everybody shares their data.

2022-10-21
Zelda Street View

This is a really cool hobby project.

2022-10-08
Data Duplications

Between Test and Training data!

2022-10-06
Annotation Datasets

Let’s study annotators

2022-10-05
Punderstanding

Computational "Pun"-derstanding that is.

2022-09-09
DALC

Dutch Abusive Language Corpus

2022-07-21
Generating Receipts

This is a really cool use-case for Blender.

2022-07-13
Annotators vs. Tasks

Are We Modeling the Task or the Annotator?

2022-05-17
Won't Predict via Disagreement

Learning from Teachers, more Literally

2022-05-13
Interactive Confusion Matrices

Ideas for UI work.

2022-05-02
Active Churning

Random Sampling is a Strong Benchmark

2022-04-23
Active Street Signs

Neat use case for Active Learning.

2022-04-22
Perfect Fit

Never ever claim a perfect fit.

2022-04-21
Active, but Visual, Learning

Colors and Convex Hulls

2022-01-16
The Story Theory

Statistics, Storks and Babies

2021-12-20
2021-12-05
VADER

Rule Based Sentiment

2021-12-03
Linkrot

It is a Huge Problem

2021-10-29
Learning to Place

Classification as a Heavy-Tail Regressor

2021-10-13
Optimal Seeds

manual_seed(3407) is All You Need

2021-10-12
1.4 Million Jupyter Notebooks

And only 24.1% of them actually ran.

2021-09-27
Sentiment and Bias

Exploring Hugging Face while I'm at it.

2021-09-26
Gorilla Hypotheses

A hypothesis *can* be a liability.

2021-09-13
Scots Wikipedia

The Ouch continues in Embeddings

2021-09-01
Analytics Providers

It's Numbers that Differ!

2021-08-27
poke2vec

As in ... text embeddings!

2021-08-10
Pandas Format

Pretty table renders.

2021-08-06
Stopwords

They're not very consistent.

2021-07-29
Dixit Data

How a Great Game became a Grand Challenge

2021-07-22
Label Errors

How to find LOTS of them.

2021-07-17
DnD Data

There's lots of it.

2021-07-17
Shaded Screenshots

A "shortcut" with 4 keys.

2021-07-16
Copilot & Pytest

Pytest vs. Parrot

2021-07-15
metatags.io

It's a great helper

2021-07-08
Copilot & Submodules

Autocomplete Might be Better

2021-06-25
GitHub Actions as a Number

Is it big or is it small?

2021-06-23
Plenty of Bad Labels

Data Quality Strikes Again

2021-06-18
Recursive HTML

I *really* like Svelte.

2021-06-13
Urban Dictionary Embeddings

It's an entertaining idea.

2021-06-05
Tesla vs. Stoplights

Data Quality Strikes Again

2021-06-03
Kolektor

My take on Git-Scraping[tm]

2021-06-01
Flight Simulatoops

Data Quality Strikes Again