Today I Scripted

Simon Willison did everyone a favour by advocating shorter blog posts in the form of a TIL (Today I Learned). These posts contain short lessons that people may find useful even when there is no big story around them. I have taken this style of blogging to heart myself, and I wish more people would do it.

However, ever since the new release of uv, it also feels like it might be time to encourage folks to write a dialect of these posts which I would like to call "Today I Scripted". It is not my goal to coin a term here, but I am hoping to encourage folks towards a variant of the "TIL" which might be great for productivity ... and also a bit of fun.

The enabler

The latest release of uv has a lot of features that are great, but there is one specific feature that I want to zoom in on.

# /// script
# dependencies = [
#   "requests<3",
#   "rich",
# ]
# ///

import requests
from rich.pretty import pprint

resp = requests.get("https://peps.python.org/api/peps.json")
data = resp.json()
pprint([(k, v["title"]) for k, v in data.items()][:10])

This Python file has a special comment at the top. This is the new inline script metadata format (standardized in PEP 723) that declares the dependencies of the script. The really neat thing about this is that tools like uv can automatically pick this up for you and make sure all dependencies are installed when the script needs to run.

To get this to work with uv you literally only need to do this:

uv run script.py

The dependencies are all just taken care of. This is really amazing from a technical and practical standpoint, but let's now think about blogging again.

An example

When I work on data quality projects I am usually dealing with many different steps that play nice together in a Taskfile. This is similar to a Makefile but you can more easily run things in parallel and there are some clever ways to skip steps that do not need to run.

Just to give an impression of the kinds of things that might need to happen:

I might need to deduplicate a dataset.
I might need to turn a set of paragraphs into sentences.
I have a bunch of images that need to be resized a specific way.
I need to run a super specific model that relies on old versions of PyTorch/numpy on my stuff.

All of these tasks are usually just a small script, and they usually also carry a very distinct set of dependencies that can pile up in a project. So far I have put these scripts in a repository somewhere because they are kind of too small to share. But maybe ... now that we have uv run ... it starts to make sense to blog about such scripts. That way folks can easily search/find/copy/paste these utilities into their own projects.
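
As a rough idea of how small such a post could be, here is a sketch of the first task from the list, deduplication, as a self-contained script. The function name `dedupe_lines` and the rule for what counts as a duplicate (same text after collapsing whitespace and ignoring case) are assumptions for illustration; a real dataset would likely need its own rule.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
"""Drop duplicate lines while keeping the first occurrence of each."""


def dedupe_lines(lines):
    """Return the unique, non-empty lines in first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        # Assumption: whitespace and letter case do not make a line distinct.
        key = " ".join(line.split()).casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(line.strip())
    return unique


# Tiny in-memory demo; a real script would read from a file instead.
print(dedupe_lines(["Hello world", "hello   World ", "", "Bye"]))
```

This one needs no third-party packages, so the dependency list is empty, but the same header is where a heavier script would pin its old PyTorch or numpy version.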

Productive!

I figured I should put my money where my mouth is and attempt an example right away. So I added a new kind of content on my personal blog called scripts where I might share these "Today I Scripted" posts. There are a few old utilities from spaCy that I have put there now, but I can also see myself adding some more prompt-y stuff there too. These posts will mostly just contain a fully self-contained script and a small description, in the hope that it will help future me ... but hopefully also future you.

So feel free to join me! This might just be super fun and who knows, maybe this way a lot of useful scripts do not get forgotten.

ps.

Oh, and by the way. This idea might also totally work for Jupyter notebooks down the line. But I don't know if there is a dependency/metadata standard for that on the horizon yet.

pps.

I couldn't help myself while exploring this idea a bit deeper and actually seem to have hacked together a way to run this uv-trick from within Python itself. Maybe this is useful for situations where it is impractical to run a command line tool as part of the bigger picture. I have an implementation, but it is pretty darn hacky at the moment.

If you're eager to explore (and live dangerously), here's what it looks like:

from uvtrick import load

# Load the function `add` from the file `some_script.py`
add = load("path/to/some_script.py", "add")

# This result is from the `some_script.py` file, running in another virtualenv 
# with `uv`. A pickle in a temporary file is used to communicate the result.
# Be aware of the limitations, please only consider base Python objects.
add(1, 2)  # 3

I'm not sure if it's a great idea to actually run this from Python, but it was a fun exercise in recreational programming to find out that it actually works.
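
For the curious, the pickle-in-a-temporary-file handoff described in the comments above could be sketched roughly like this. This is not the uvtrick implementation; the names `call_via_runner` and `_FOOTER` are hypothetical, and the sketch assumes the target script has no side effects when executed top to bottom. The `runner` argument defaults to `uv run`, and it can be swapped for another interpreter command when uv is not available.

```python
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path

# Appended to a copy of the target script, so its inline metadata stays on top.
# It loads pickled arguments, calls the function and pickles the return value.
_FOOTER = '''
import pickle as _pickle, sys as _sys
with open(_sys.argv[1], "rb") as _fh:
    _args, _kwargs = _pickle.load(_fh)
_result = {func}(*_args, **_kwargs)
with open(_sys.argv[2], "wb") as _fh:
    _pickle.dump(_result, _fh)
'''


def call_via_runner(script, func, *args, runner=("uv", "run"), **kwargs):
    """Call `func` from `script` in a child process and return its result."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        driver = tmp_dir / "driver.py"
        args_file = tmp_dir / "args.pkl"
        out_file = tmp_dir / "out.pkl"
        # The driver is the original script plus the call-and-pickle footer.
        driver.write_text(Path(script).read_text() + _FOOTER.format(func=func))
        args_file.write_bytes(pickle.dumps((args, kwargs)))
        subprocess.run(
            [*runner, str(driver), str(args_file), str(out_file)],
            check=True,  # raise if the child process fails
        )
        return pickle.loads(out_file.read_bytes())
```

Calling `call_via_runner("path/to/some_script.py", "add", 1, 2)` would then run the function through `uv run`, while passing `runner=(sys.executable,)` reuses the current interpreter and skips the dependency handling. Since arguments and results travel through pickle, the same limitation applies: stick to base Python objects.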

koaning.io
