projects

Vincent D. Warmerdam

I try to be appreciative of all the open source software out there. Here’s a list of projects that I have been active in setting up to give back.

Services

drawdata.xyz

Drawdata is a simple tool that allows you to draw your data and then download it.

dearme.email

Dearme was a simple email server that sent emails back to you. Useful for reflection.

calmcode.io

This is a repository for educational content that really tries to push back against skill anxiety by offering calm videos and code snippets. It is in part inspired by teach tech together and fast.ai. The content is available for free.

It’s a second blog for me and I figured this type of content is best served using a slightly different medium. I made this choice in part because it keeps this blog focussed but also because I really like the idea of creating an alternative to datacamp in my spare time.

thismonth.rocks

This is a website with inspiring ideas of things to start doing this month. A lot of the suggestions are for hobbies. It was created during the corona pandemic. The website is also fully collaborative. Everybody can add ideas to the website.

Open Source

rasa

This one is cheating since it is my employer. That said, I am happy to contribute to it. Natural language is both interesting and unsolved. One of the associated packages I wrote is rasa nlu examples. It features word-embeddings for less common languages like Zulu. I also wrote rasalit which offers useful streamlit applications that explain some of the ML inside of Rasa.

scikit-lego

Scikit-Lego is an opinionated package that contains lego bricks that you can use in your scikit-learn projects. Together with Matthijs Brouns I started this project to make it easier for us to teach people how to contribute to open source. We ended up creating a library with utilities for the pydata stack that addresses some artificial stupidity issues. It has been confirmed that companies are using parts of it in production as well.

human-learn

This package contains scikit-learn compatible tools that should make it easier to construct and benchmark rule based systems that are designed by humans. Part of the library offers user-interfaces to make it easy to construct these systems but we also make it easy to add rules to existing machine learning models.

tokenwiser

It’s a bag, not of words, but of tricks! It’s a library that combines tools from spaCy, vowpal wabbit and scikit-learn. The tool is somewhat experimental in nature but it has proven tools that are compatible across frameworks.

whatlies

While working at Rasa I saw an opportunity to make a tool that made experimentation with word embeddings easier. We decided it was best that I open sourced it. It has a lot of cool features that make it fun to play with; it supports scikit-learn compatible embeddings for tfhub, huggingface, fasttext, bytepair, spacy and gensim as well as an interface for visualisation and arithmetic. Around launch it got 4000 downloads a month and it’s been a great educational tool.

clumper

I wanted to make content on how to write your own python package over at calmcode.io. I decided to write a cute little python library that can clump collections of data together. There were many fun lessons learned along the way but it’s also a cute lil’ library that I love toying with.

memo

Memo is a package that makes tracking statistics in your python code a whole lot simpler. Simply add a decorator!
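To give a feel for the decorator idea, here is a minimal sketch using only the standard library. It is not memo's actual API: the `track` decorator and the `rectangle` function are hypothetical names made up for illustration. The assumption is that a tracked function takes keyword arguments and returns a dictionary of statistics, which get stored together in a list.

```python
import functools


def track(log):
    """Hypothetical decorator: store keyword inputs and returned stats in `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            result = func(**kwargs)
            # Merge the inputs and the outputs into a single record.
            log.append({**kwargs, **result})
            return result
        return wrapper
    return decorator


records = []


@track(records)
def rectangle(width, height):
    # The function only has to return a dictionary of statistics.
    return {"area": width * height, "perimeter": 2 * (width + height)}


for width in (1, 2):
    rectangle(width=width, height=3)
```

After the loop, `records` holds one dictionary per call, which is easy to turn into a dataframe or dump to disk.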

mktestdocs

I wanted to be able to test my markdown documentation via pytest as well as any examples written in markdown in my docstrings. So I wrote a small package that is able to help with this.
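The core idea can be sketched with the standard library alone. This is an illustration under assumptions, not how mktestdocs is implemented: `extract_python_blocks` and `run_markdown_examples` are hypothetical helpers, and the sketch assumes code examples live in fenced blocks tagged `python`.

```python
import re

# The fence marker is built programmatically so this example is easy to embed.
FENCE = "`" * 3


def extract_python_blocks(markdown):
    """Return the source of every fenced python block in a markdown string."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    return [match.group(1) for match in pattern.finditer(markdown)]


def run_markdown_examples(markdown):
    """Execute each block in a fresh namespace; a failing block raises."""
    for block in extract_python_blocks(markdown):
        exec(block, {"__name__": "__main__"})


doc = "\n".join([
    "# Example docs",
    FENCE + "python",
    "total = sum([1, 2, 3])",
    "assert total == 6",
    FENCE,
])

# Passes silently because the documented example is correct.
run_markdown_examples(doc)
```

Calling a helper like `run_markdown_examples` from inside a pytest test function would make a broken example in the docs show up as a failing test.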

drawdata

This small python app allows you to draw a dataset in a jupyter notebook. This should be very useful when teaching machine learning algorithms. You can get the same tooling by going to drawdata.xyz but with this library you’ll also be able to use it from within jupyter.

justcharts.js

This small javascript library makes it easier to hack static dashboards together. It combines a vegalite spec with html to give you a <vegachart> component. These components can be used in static files. No npm build needed.

skedulord

Skedulord makes cron a bit more usable by logging the jobs predictably, helping you find broken jobs and adding a retry mechanic. It’s minimal and cli-focussed. The docs feature awesome animations. It has like, 5 users.
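The logging and retry idea can be sketched as follows. This is not skedulord's implementation or command line interface: `run_job` and the record fields are hypothetical, and the sketch assumes a job counts as broken when its command exits with a non-zero code.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_job(name, command, retries=2):
    """Run `command`, retrying on failure; return one log record per attempt."""
    records = []
    # One initial attempt plus `retries` extra attempts.
    for attempt in range(1, retries + 2):
        started = datetime.now(timezone.utc).isoformat()
        proc = subprocess.run(command, capture_output=True, text=True)
        status = "success" if proc.returncode == 0 else "fail"
        records.append(
            {"name": name, "attempt": attempt, "started": started, "status": status}
        )
        if status == "success":
            break
    return records


# A job that works on the first try and a job that always fails.
ok = run_job("hello", [sys.executable, "-c", "print('hi')"])
broken = run_job("broken", [sys.executable, "-c", "raise SystemExit(1)"], retries=1)
```

Because every attempt produces a record in the same shape, finding broken jobs comes down to filtering the records on `status`.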

evol

Evol is a tool that makes writing evolutionary heuristics fun again. It makes sure that you keep the code clean and that the user optimises for joy. I started this project together with Rogier van der Geer.

Community

pydata

I’ve been a co-founder and chair of PyData Amsterdam. I’ve helped set up some satellite events in other cities as well. It’s a cool meetup and a great conference. When we started there was a shortage of events that were accessible yet technical in the field of data science. I’m still around but am no longer formally on the committee.

spaCy

I’ve been wanting to learn the tool for a while so when Matt and Ines asked if I wanted to help out with educational content I couldn’t say no. I’m keeping track of lessons that I am learning on their youtube channel. It can be viewed here.

freecodecamp

I’ve collaborated with freecodecamp by sharing educational content for scikit-learn. I’m proud to mention that it is the first introduction to the tool that acknowledges the issues surrounding the load_boston dataset.