koaning.io

Blog of a data person


DALC

2022-09-09

A few Dutch students went to the effort of building an Abusive Language Corpus for the Dutch language (DALC). They describe the work in a paper and have released the dataset on GitHub.

The repository also contains the GROF lexicon, which comes with a list of lemmas that you can match text against.

Lists like these aren't perfect, but they can be a great starting point for detecting abusive speech online. Many (Dutch) platforms could really benefit from that.
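To make the "starting point" idea concrete, here is a minimal sketch of matching text against a lemma list. It assumes the lemmas are available as plain strings, one per entry; the actual file format of the GROF lexicon isn't described here, so `load_lemmas` and `flag_tokens` are hypothetical helpers and the two example entries are mild placeholders, not taken from the lexicon.

```python
import re


def load_lemmas(lines):
    """Build a lowercase lookup set from an iterable of lemma strings."""
    return {line.strip().lower() for line in lines if line.strip()}


def flag_tokens(text, lemmas):
    """Return the tokens in `text` that appear in the lemma set, in order."""
    tokens = re.findall(r"\w+", text.lower())
    return [tok for tok in tokens if tok in lemmas]


# Placeholder entries for illustration only (assumption, not real lexicon data).
lemmas = load_lemmas(["sukkel", "Idioot", ""])
print(flag_tokens("Wat een idioot, zeg.", lemmas))
```

Note that this compares surface tokens directly. A lemma list works better if you lemmatize the text first, so inflected forms also match, and even then it's a rough filter: it will miss creative spellings and flag harmless uses.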


© 2025 Vincent D. Warmerdam. All rights reserved.
