Today I Learned
Inspired by Simon Willison, this part of my site is for short lessons worth journalling.
Copying and pasting text to the clipboard in Python
The internet does not disappoint.
Artificial demand, it works!
Two interesting anecdotes
Embeddings for research papers
And thanks to complexity theory, now we know!
Mini LLMs just for docstrings
... as in LiDAR
... via Emoji2Vec
Harvesting robots need it.
Another use case for Blender
Fine-tuning LLMs away from something
Detecting about 196 of them
On the tree and on the ground.
Via a travelling salesman
For Birds and Celebs!
And how to check them.
How to search in fashion
For computer vision
In CIFAR100 no less
A study in sleep deprivation
A benchmark/dataset of memes
Mapping all the moves
Collecting datasets with annotator information
Using video to map sensors to activity
Seeing things in the dark
Neural networks vs. web standards
LOL stands for League of Legends
Computer vision
There's a genuine conference for it.
Diseases Spreading in World of Warcraft
A cool public dataset
Via tactics of deception!
Via Mechanical Turk and Git Repos
with Deep Learning?
an AI Competition in Blood Bowl
In a self-driving car dataset. Ouch.
Library with some cool sentiment ideas
Turns out, it's Weibull?
Crafting starting points for diffusion.
Not everybody shares their data.
This is a really cool hobby project.
Between Test and Training data!
Let’s study annotators
Computational "Pun"-derstanding that is.
Dutch Abusive Language Corpus
This is a really cool use case for Blender.
Are We Modeling the Task or the Annotator?
Learning from Teachers, more Literally
Ideas for UI work.
Randomly Sampling is a Strong Benchmark
Neat use case for Active Learning.
Never ever claim a perfect fit.
Colors and Convex Hulls
Statistics, Storks and Babies
Via GitHub Copilot!
Rule Based Sentiment
It is a Huge Problem
Classification as a Heavy-Tail Regressor
Manual_seed(3407) is All You Need
And only 24.1% of them actually ran.
Exploring Hugging Face while I'm at it.
A hypothesis *can* be a liability.
The Ouch continues in Embeddings
It's Numbers that Differ!
As in ... text embeddings!
Pretty table renders.
They're not very consistent.
How a Great Game became a Grand Challenge
How to find LOTS of them.
There's lots of it.
A "shortcut" with 4 keys.
Pytest vs. Parrot
It's a great helper
Autocomplete Might be Better
Is it big or is it small?
Data Quality Strikes Again
I *really* like Svelte.
It's an entertaining idea.
Data Quality Strikes Again
My take on Git-Scraping[tm]
Data Quality Strikes Again