compdef is handy

2026-01-02

Most of the time when I need autocompletion on the terminal I really just need it for files and folders. With zsh this works out of the box for most well-known Linux commands, but chains of Python commands can totally flunk here.

Enter compdef. For marimo I added the following two lines to my ~/.zshrc:

# Enable file completion for uvx commands
compdef _default uvx

# Add uvx marimo alias
alias mam="uvx marimo -y edit --sandbox --watch"

This adds a nice marimo alias, but it also makes sure that the alias gets file autocompletion.
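
If you only ever want files and directories here, rather than zsh's whole default completer, the standard `_files` completion function is a drop-in alternative. A minimal sketch, assuming the zsh completion system (compinit) is already loaded:

# Complete files/directories only for uvx commands
compdef _files uvx

You can check what is currently bound to a command with `echo $_comps[uvx]`.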

When Kevin Malone meets Claude

2025-12-31

Matt Pocock had an interesting suggestion that I believe might be inspired by an episode of The Office.

The whole idea is that you end a Claude Code prompt with this:

Be extremely concise. Sacrifice grammar for the sake of concision.

So I couldn't help but try some prompts with this idea. Here are some results.

Normal prompt

explain huberregression from scikit-learn.

Huber regression in scikit-learn is a **robust linear regression** method that behaves like ordinary least squares for small residuals but becomes linear (less sensitive) for large residuals. It uses the Huber loss:

- If |r| ≤ ε: loss = 0.5 * r² (quadratic, like least squares)
- If |r| > ε: loss = ε * (|r| − 0.5 * ε) (linear, like absolute loss)

Here `r = y − ŷ` and ε is the threshold that controls when the loss switches from quadratic to linear.

Why it’s useful:
- Reduces the influence of outliers compared to plain least squares.
- Still differentiable everywhere, which makes optimization stable.

Key points in scikit-learn:
- Class: `sklearn.linear_model.HuberRegressor`
- Important params: `epsilon` (switch point), `alpha` (L2 regularization), `max_iter`, `tol`
- It **scales features internally**; coefficients are returned in original scale.
- If `epsilon` is large, it behaves more like standard regression; smaller values increase robustness but can down-weight legitimate high-residual points.

When to use:
- You suspect outliers or heavy-tailed noise.
- You want a linear model that’s more robust than `LinearRegression` but smoother than `RANSAC`/`TheilSen`.

Prompt + concise

explain huberregression from scikit-learn. Be extremely concise.

Huber regression (sklearn.linear_model.HuberRegressor) is a robust
linear model: it uses squared loss for small residuals and linear 
loss for large ones, reducing outlier influence while staying close 
to least squares.

Prompt + concise + sacrifice grammar

explain huberregression from scikit-learn. Be extremely concise. Sacrifice grammar for the sake of concision.

HuberRegression (sklearn): linear reg w/ Huber loss; quadratic for 
small residuals, linear for large → robust to outliers. Tuned by 
`epsilon`; solves via iterative reweighted least squares; supports 
`alpha` (L2) + `fit_intercept`, `max_iter`, `tol`.

Notice it?

Is it me, or does it feel like the variant with less grammar gives me many extra details that the normal "be concise" prompt doesn't give? Feels like dropping grammar might indeed work as a forcing mechanism.

TIL wget can download full sites

2025-12-28

TIL (via paul) that you can use wget to fully download a site. Back when I was backpacking, this was an awesome technique to pull in the full documentation of PyData projects like pandas, so I could learn from it even when the wifi was down.

$ wget -r -E -k -p -np -nc --random-wait http://example.com

Here's what the different CLI options do:

  • -r or --recursive: makes sure wget uses recursion to download sub-pages of the requested site.
  • -E or --adjust-extension: lets wget change the local file extensions to match the type of data it received (based on the Content-Type header sent by the server).
  • -k or --convert-links: has wget change the references in the page to local references, for offline use.
  • -p or --page-requisites: don't just follow links within the same domain, but also download everything a page needs to display properly, like images and stylesheets (even if it lives outside of the current domain).
  • -np or --no-parent: restrict wget to the provided sub-folder. Don't let it escape to parent or sibling pages.
  • -nc or --no-clobber: don't retrieve files that have already been downloaded.
  • --random-wait: have wget add random intervals between downloading different files in order to not hammer the host server.
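
As a concrete example, here is roughly how I would pull in the pandas documentation for offline reading (the URL is just illustrative; any docs site works the same way):

$ wget -r -E -k -p -np -nc --random-wait https://pandas.pydata.org/docs/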

Neat!