Uncommon Contributions

Making impact without touching the core of a library.

Vincent Warmerdam koaning.io
08-24-2020

There are a lot of ways that you can contribute to open source. Frequent contributions include adding features to a library, fixing bugs, or providing examples to a documentation page. All of these contributions are valid, but there are other avenues that you may want to consider too. In this blog post, I’d like to list a few non-standard examples that might inspire.

I’ll also add some images to highlight why some of these changes matter.

Info

My first proper PR for Rasa has little to do with the core library. It doesn’t help in making a chatbot at all. Instead, I upgraded this command:


rasa --version

Before, this command would list the current version of Rasa. In the new version, it lists:

  1. The version of Python.
  2. The path to your virtual environment.
  3. The versions of related packages.

This is a pretty big feature if you consider the amount of work that goes into submitting a proper bug report. By adding this feature, it is much easier to supply all the relevant information. Just copy and paste the output of this call, and you’ll be writing a much better bug report. No need to fiddle around with commands like:


pip freeze | grep package-name

This feature has little to do with the core package code but still makes a lot of impact. The debugging process is made easier for both the user and the maintainers of the project.
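To make the idea concrete, here is a minimal sketch of how such a report could be gathered in Python. It is not Rasa's actual implementation: the function name `environment_report` and the package names passed to it are assumptions made purely for illustration.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect details that are handy to paste into a bug report.

    `environment_report` is a hypothetical helper, not part of Rasa.
    """
    lines = [
        f"Python version : {platform.python_version()}",
        # sys.prefix points at the active virtual environment, if there is one.
        f"Python path    : {sys.prefix}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name:<15}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder package names, chosen only for the example.
    print(environment_report(["pip", "setuptools"]))
```

The point of the design is that a user runs one command and pastes one block of text, instead of piecing the information together by hand.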

Cron on Dependencies

A user of scikit-lego, a package that I maintain, discovered that their code had stopped working because scikit-learn introduced a minor, but breaking, change. To fix this, the user added a cronjob to the project using GitHub Actions.

Before the PR, we had to hear from users when a dependency introduced breaking changes. By adding a cronjob that would run our unit tests daily, the user removed a blind spot from the project. Every day we now run the unit tests using the latest versions of our dependencies. If the tests break, we can quickly pinpoint what package caused it and create a fix. You can imagine how this might lead to fewer issues for our users.

This feature, again, had little to do with the core package code.

Spellcheck

For the scikit-lego project, we met a user who was interested in contributing but didn’t know where to start. The user hadn’t made many contributions yet and was looking for something easy. His main goal was to get more comfortable with git and to get going with a first open-source contribution. I mentioned that addressing spelling errors is certainly valid, so the user went ahead with this.

The results were somewhat epic. He ran a spellchecker, not just against our docs, but also on our source code! It turns out we had some issues in our docstrings as well. While exploring this theme, we also discovered flake8-compatible packages that do spellchecking on variable names.

Spell-checking was something we hadn’t considered, and it was a much-appreciated code quality update.

Error Messages

One of the harder parts of writing a package for other people is getting people to understand how they should use the package. As a maintainer, you cannot imagine what it is like not to understand your own code. New users, on the other hand, can spot this instantly.

This is why a popular entry point for contribution is documentation. Documentation is often read when something goes wrong after the user sees a confusing error, so it makes sense for new users to contribute there. But you can go a step further! Instead of changing the docs, why not write a more meaningful error message?

In whatlies, we’ve recently allowed for optional dependencies. If you try to use a part of the library that requires a dependency that is not part of the base package, then you’ll get this error message.


In order to use ConveRTLanguage you'll need to install via;

> pip install whatlies[tfhub]

See installation guide here: https://rasahq.github.io/whatlies/#installation

This feature has little to do with the core functionality of the library. Yet, it will do a lot for developer experience.
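A message like this can be produced by wrapping the import of the optional dependency. The sketch below shows one way to do that; the helper name `require` and its arguments are assumptions for illustration and are not taken from the whatlies source code.

```python
import importlib


def require(module_name, feature, extra):
    """Import an optional dependency or fail with installation instructions.

    Hypothetical helper: `module_name` is the module to import, `feature` is
    the name shown to the user and `extra` is the pip extra that provides it.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as err:
        message = (
            f"In order to use {feature} you'll need to install via;\n\n"
            f"> pip install whatlies[{extra}]\n\n"
            "See installation guide here: "
            "https://rasahq.github.io/whatlies/#installation"
        )
        # Re-raise with the friendlier text while keeping the original cause.
        raise ModuleNotFoundError(message) from err
```

Calling something like `require("tensorflow_hub", "ConveRTLanguage", "tfhub")` would then either return the module or tell the user exactly which command to run. The module name in that call is again only an example.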

Failing Unit Tests

There’s a lovely plugin for mkdocs called mkdocs-jupyter. It allows you to easily add jupyter notebooks to your documentation pages. When I was playing with it, I noticed that it wasn’t compatible with a new version of mkdocs. Instead of just submitting a bug report on GitHub, I went the extra mile. I created a PR that contained a failing unit test for this issue. This was great for the maintainer because it made the issue easier to understand and to fix.

This feature, again, had little to do with the core package code.

Renaming Files

Let’s compare two pieces of code from a library that I maintain.

Exhibit A


from whatlies.transformers import Pca
pca_plot = emb.transform(Pca(2)).plot_interactive()

Exhibit B


from whatlies.transformers import pca
pca_plot = emb.transform(pca(2)).plot_interactive()

The Difference

Did you see the difference? In the first example we’re importing Pca and in the second example pca. The only difference is the case of a single letter, yet this small typo caused a very unintuitive bug.

The bug is related to the file structure of the project.


whatlies
├── __init__.py
├── embedding.py
├── embeddingset.py
├── language
│   ├── __init__.py
│   ├── ...
└── transformers
    ├── __init__.py
    ├── pca.py
    └── ...

The Pca class is defined in the pca.py file in the transformers folder. We intended to expose the Pca class via the __init__.py file in the same folder. Unfortunately, by importing pca instead of Pca you’re getting the submodule instead of the intended class.
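The behaviour is easy to reproduce outside of whatlies. The sketch below builds a throwaway package with the same layout in a temporary folder; the package name `demo_pkg` is hypothetical and only exists for this demonstration.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the layout: a package with a pca.py file that defines Pca.
    pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    (pkg / "pca.py").write_text("class Pca:\n    pass\n")
    # The __init__.py exposes the class, just like the intended design.
    (pkg / "__init__.py").write_text("from demo_pkg.pca import Pca\n")

    sys.path.insert(0, tmp)
    try:
        importlib.import_module("demo_pkg")
        from demo_pkg import Pca, pca
    finally:
        sys.path.remove(tmp)

print(isinstance(Pca, type))              # True: this is the class
print(isinstance(pca, types.ModuleType))  # True: this is the pca.py submodule

try:
    pca(2)  # the typo from the second example
except TypeError as err:
    print(err)
```

Both imports succeed, which is what makes the typo so confusing: the failure only shows up later, when the module is called as if it were a class.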

A single character could cause a really confusing bug, so we fixed it by changing the filename from pca.py to _pca.py.


whatlies
├── __init__.py
├── embedding.py
├── embeddingset.py
├── language
│   ├── __init__.py
│   ├── ...
└── transformers
    ├── __init__.py
    ├── _pca.py
    └── ...

We didn’t just do this for _pca.py but for all files where this error might occur.

Again, this change doesn’t require you to understand the core code of the library.

Conclusion

Many of the coolest contributions might have nothing to do with the core library. All of the examples that I’ve listed above have been tremendous features for some of the packages that I maintain but also for some of the larger code repositories out there. Feel free to remember this when you’re considering your first contribution.