Introducing bespoken

2025-07-12

When used right, Claude Code is a huge productivity boost. Take any brainfart and you get working tools just by writing English. And yet ... there are moments when it frustrates me. It's very hard to customise the workflow. I need to adapt to Claude, and sometimes I want to be able to declare how it should adapt to me. And I can't.

Let's say that I want it to work on a marimo notebook. In that case I want it to make changes to one, and only one, file. But Claude's permissions go way beyond that. It can edit any file, and it even has access to all the command line tools, including the ones that can rm -rf.

That example is all about removing capabilities, but sometimes I also want to be able to add them. Let's say that I built a custom Python class that knows where to find the right docs and can fetch them in an effective format ... can I add that to Claude's loop? No! Instead, we are forced to let Claude try and search the web. Why are we wasting our tokens on that?!

Enter bespoken

Video demo

If you're keen for a full demo, you might appreciate this YouTube video that I've recorded.

Written explainer

This is what led me to work on a new experiment. Consider this file:

app.py - a bespoken agent
from bespoken import chat
from bespoken.tools import FileTool

chat(
    model_name="anthropic/claude-3-5-sonnet-20240620",
    tools=[FileTool("notebook.py")],
    system_prompt="You are a coding assistant that can make edits to a single file.",
    debug=True,
)

Running uv run app.py will start an agent on the command line for you to interact with.

You can prompt it to make changes to the notebook.py file, and if you ask it to make changes to any other file ... well ... it can't! This tool only allows edits (or even read access) to one, and only one, file.

How?

Under the hood this uses llm from Simon Willison. This library provides access to an ecosystem of LLMs, many of which support tool calling, which lets you add tools. One way to think about these tools is that they add functionality, but they can also be configured to constrain permissions. In the above example, we define an agent that can only read/edit the notebook.py file.

There are loads of tools that you could add here. The library has some that can give the agent the ability to track todo lists, connect to a browser via Playwright, or even call specific command line tools like uv/pip/npm. These are all just examples; what's most important is that it is incredibly easy to add your own.
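To give an idea of the shape of a custom tool, here is a small sketch. The llm library accepts plain Python functions as tools, so (assuming bespoken passes tools through in the same way) a hypothetical docs-fetching function like this could go straight into the tools list. The function name and the lookup table are made up for illustration:

```python
def fetch_docs(topic: str) -> str:
    """Fetch documentation for a known topic so the agent does not search the web."""
    # Hypothetical lookup table; in practice this could read from disk
    # or an internal documentation service.
    docs = {
        "marimo": "marimo notebooks are stored as pure Python files ...",
        "bespoken": "bespoken lets you declare your own agent in a few lines ...",
    }
    return docs.get(topic.lower(), f"No local docs found for {topic!r}.")
```

The docstring doubles as the tool description that the LLM sees, so it pays to be explicit about when the tool should be used.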

Taking it a step further

Tools allow the LLM to call Python on your behalf. But sometimes you want to be able to trigger that manually instead. For that, you can add "slash commands". Here's an example:

from bespoken import ui

def set_role():
    """Set a role for the assistant"""
    roles = ["developer", "teacher", "analyst", "creative writer", "code reviewer"]
    role = ui.choice("What role should I take?", roles)
    return f"You are now acting as a {role}. Please respond in character for this role."


chat(
    ...,
    slash_commands={
        "/set_role": set_role,
    }
)

When you call /set_role in the TUI, a form effectively appears that can trigger Python on your behalf. The string that set_role() returns is the string that is sent to the LLM.

When is this useful? You could use this to trigger the right git command. It always feels like a shame to waste tokens on that, and it also feels awkward to keep terminal tabs open for it. But my favourite use-case is to inject debugging prompts with just enough context. Here's one template for a debugging technique that forces the LLM to find the bug on its own.

You've introduced a new bug, but instead of me telling you what the bug is, let's see if you can find it for yourself.

Imagine that you are a user and that you start by <entrypoint>. Go through all the steps that would happen if a user tries to <action>.

Think through all the steps and see if you can spot something that could go wrong. Don't write any code, but let's see if we both find the same issue.

Notice how we need an "entrypoint" and an "action" for this prompt? This is where custom slash commands become super useful again, but it's also where we can add some UI.

from bespoken import ui

def custom_debug():
    # Ask the user for what we need.
    entrypoint = ui.input("What is the entrypoint for the user?")
    action = ui.input("What is the action the user is trying to take?")

    # Construct the prompt.
    out = f"""You've introduced a new bug, but instead of me telling you what the bug is, let's see if you can find it for yourself. Imagine that you are a user and that you start by {entrypoint}. Go through all the steps that would happen if a user tries to {action}. Think through all the steps and see if you can spot something that could go wrong. Don't write any code, but let's see if we both find the same issue."""
    print("")

    # Show the prompt to the user.
    ui.print_neutral(out)

    # Send the prompt to the LLM.
    return out

Here's what the call looks like:

Cool?

It's still very early, but it's at a spot now where I can show it to people and hear some feedback. I'm keen to explore adding a callback so that every turn in the conversation is logged, and I can also imagine adding some commands that interact with git. Also tonnes of tools! But right now is the time to keep an open mind.

To learn more, check the docs and the repo.

departure mono

2025-07-11

I learned about a new mono font called departure mono (git) that looks really neat.

This is what it looks like:

It's got this pixel-style vibe to it, and it is perfectly suitable as a monospace font.

It could work well in video games or websites that have a retro element to them, but I think you could even set it up as the font you use inside your terminal or IDE.

It's not going to be for every project out there, but a good font can really make a demo great. Especially when you're building something for developers.

the LLM pizza pattern

2025-07-08

A few years ago I worked at Rasa, a company that builds tools for chatbots. This was well before LLMs came onto the scene, and it took me a while to realize how much simpler things have become now that we have typed objects that LLMs can fill in for us.

Old forms

One of the key features of the old Rasa stack was the ability to trigger "forms" (link to legacy docs) in a conversation. This is a branch in a conversation where we keep on querying the user for information until we have enough to trigger a follow-up action.

As a main example I always used the "pizza"-bot. A user might say "give me a pepperoni pizza", but this would not suffice because the bot would also need to know the size of the pizza. Rasa had a mechanism to deal with this, which worked kind of like a state machine where the user would keep getting questions until everything was validated.

New forms

Instead of a state machine, we can now do something like this:

Example with pydantic.ai
from typing import Literal

from pydantic import BaseModel
from pydantic_ai import Agent

class Pizza(BaseModel):
    flavour: str
    """The type of pizza that the user wants to order"""
    size: Literal["small", "medium", "large"] 
    """The size of pizza that the user wants to order"""

agent = Agent(
    "openai:gpt-4.1",
    output_type=list[Pizza] | str,
    instructions="You are here to detect pizza purchases from the user. If the user does not ask for a pizza then you have to ask.",
)

resp = await agent.run(message_history=message_history)

The key element in this example is the list[Pizza] | str part.

  • If the LLM cannot extract a Pizza type then it can proceed to send text back to the user asking for more information.
  • The user is able to ask for more than one pizza in a conversation.
  • The user can even tell the assistant to add or remove pizzas from "memory". The whole conversation can be passed to the system as a message_history and it will be able to update the belief on what list[Pizza] should represent.

From here we can also add tools to check for extra properties. But the main thing to appreciate here is just the list[Pizza] | str type. This is so incredibly flexible because it gives the LLM full control to clarify a property from the end-user.
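In application code, that union makes the control flow explicit: either you got structured pizzas to act on, or you got a clarifying question to relay back to the user. A small sketch of what a caller could do with the result (the handle function is hypothetical, not part of pydantic.ai, and the pizzas are represented as plain dicts here to keep it self-contained):

```python
def handle(output):
    """Dispatch on the list[Pizza] | str union that the agent returns."""
    if isinstance(output, str):
        # The LLM needs more information; relay its question to the user.
        return ("ask_user", output)
    # Structured result: a list of pizza orders we can act on.
    return ("place_order", output)
```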

Livestream

This realization came to me during a livestream, which you can watch here:

I was joined by Marcelo (of FastAPI, starlette, uvicorn and Pydantic fame) and he shares a lot of interesting notes as we explore building a pizza bot.