Ugly Code and Dumb Things

2025-02-20 08:00:00

This week I had a conversation with one of our engineers about “shitty code” which led me to share with him one of my more unusual inspirations: Flamework, a pseudo framework created at Flickr.

Two Passions, Two Approaches

There are two driving passions in my work. One is the love of creating beautiful, elegant code — making Open Source libraries and APIs that focus on clear design and reusability. The other passion is building quick, pragmatic solutions for real users (who may not even be developers). The latter usually happens in the setting of building a product, where the product is not the code. Here, speed and iteration matter more than beautiful code or reusability, because success hinges on shipping something people want.

Flamework is in service of the latter, and in crass violation of the former.

Early on, I realized that creating reusable code and directly solving problems for users are often at odds. My first clue came when I helped run the German ubuntuusers website. It was powered by a heavily modified version of phpBB which, despite how messy it was, scaled to a large user base when patched properly. It was messy, but easy to adjust. The abstractions were one layer deep.

Back then, a friend and I tried to replace it by writing our own bulletin board software, Pocoo. Working in isolation, without users, led me down a path of over-engineering. While we learned a lot and ended up creating popular Open Source libraries (like Jinja, Werkzeug and Pygments), Pocoo never became a solid product. Later, my collaborators and I rebuilt ubuntuusers, without the goal of making it into a reusable product. That rewrite shipped successfully and it lives to this very day.

But it took me years to fully realize what was happening here: reusability is not that important when you’re building an application, but it’s crucial when you’re building a library or framework.

The Flickr Philosophy

If you are unfamiliar with Flamework you should watch a talk that Cal Henderson gave in 2008 at DjangoCon (Why I hate Django). He talked about scale and how Django didn't solve for it. He enumerated all the things important to him: sharding, using custom sequences for primary keys, forgoing joins and foreign keys, supporting database replication setups, denormalizing data to the extreme. This is also where I first learned about the possibility of putting all session data into cookies via signing. It was a memorable talk for me because it showed me that Django had real shortcomings. Django (which I used for ubuntuusers) had beautiful APIs but at the time solved for little of what Cal needed. The talk really stuck with me.
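
As an aside, the signed-cookie trick itself is small enough to sketch. Something along these lines — a minimal sketch in Rust, assuming the hmac and sha2 crates, and not how Flickr actually did it:

use hmac::{Hmac, Mac};
use sha2::Sha256;

// Sign the session payload so the client can store it but not forge it.
fn sign_session(key: &[u8], payload: &[u8]) -> Vec<u8> {
    let mut mac = Hmac::<Sha256>::new_from_slice(key).expect("HMAC accepts any key length");
    mac.update(payload);
    let tag = mac.finalize().into_bytes();
    // The cookie value is payload || tag. On the way back in, recompute
    // the tag and compare it in constant time (Mac::verify_slice) before
    // trusting the payload.
    let mut cookie = payload.to_vec();
    cookie.extend_from_slice(&tag);
    cookie
}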

At the time of the talk, Flamework did not really exist. It was more an idea and a set of engineering principles at Flickr.

A few years later, Flamework appeared on GitHub, not as an open-sourced piece of Flickr code but as a reimplementation of those same ideas. You can explore its repository and see code like this:

function _db_update($tbl, $hash, $where, $cluster, $shard){
    $bits = array();
    foreach(array_keys($hash) as $k){
        $bits[] = "`$k`='$hash[$k]'";
    }
    return _db_write("UPDATE $tbl SET ".implode(', ',$bits)." WHERE $where", $cluster, $shard);
}

Instinctively it makes me cringe. Is that a SQL injection? Well, you were supposed to use the PHP addslashes function beforehand. But notice how it caters to sharding and clustering directly in the query function.

Messy but Effective

Code like this often triggers a visceral reaction, especially in engineers who prize clean design.

How does something like that get created? Cal Henderson described Flickr's principle as “doing the dumbest possible thing that will work.” Maybe “dumb” is too strong — “simple” might be more apt. Yet simplicity can look messy to someone expecting a meticulously engineered codebase. This is not at all uncommon and I have seen it over and over. The first large commercial project that got traction that I ever worked on (Plurk) was also pretty pragmatic and messy inside. My former colleague Ben Vinegar also recently shared a story of early, messy FreshBooks code and how he came to terms with it. Same story at Sentry. We moved fast, we made a mess.

None of this is surprising in retrospect. Perfect code doesn't guarantee success if you haven't solved a real problem for real people. Pursuing elegance in a vacuum leads to abandoned side projects or frameworks nobody uses. By contrast, clunky but functional code often comes with just the right compromises for quick iteration. And that in turn means a lot of messy code powers products that people love — something that's a far bigger challenge.

A Rorschach Test

I have shown Flamework's code to multiple engineers over the years and it usually creates a visceral response. It blindsides one by seemingly disregarding all rules of good software engineering.

That makes Flamework serve as a fascinating Rorschach test for engineers. Are you looking at it with admiration for the focus on some critical issues like scale, the built-in observability and debugging tools? Or are you judging it, and its creators, for manually constructing SQL queries, using global variables, not using classes and looking like messy PHP4 code? Is it a pragmatic tool, intentionally designed to iterate quickly at scale, or is it a naive mess made by unskilled developers?

Would I use Flamework? Hell no. But I appreciate the priorities behind it. If these ugly choices help you move faster, attract users and validate the product, then a rewrite or large refactorings later are a small price to pay.

A Question of Balance

At the end of the day, where you stand on “shitty code” depends on your primary goal:

  • Are you shipping a product and racing to meet user needs?
  • Or are you building a reusable library or framework meant to stand the test of time?

Both mindsets are valid, but they rarely coexist harmoniously in a single codebase. Flamework is a reminder that messy, simple solutions can be powerful if they solve real problems. Eventually, when the time is right, you can clean it up or rebuild from the ground up.

The real challenge is deciding which route to take — and when. Even with experience, it can be hard to know when to move from quick fixes to more robust foundations. The principles behind Flamework are also reflected in Sentry's development philosophy, one of the more poignant ones being “Embrace the Duct Tape”. Yet as Sentry matured, much of our duct tape didn't stand the test of time, and was re-applied at moments when the real solution would have been a solid foundation poured with concrete.

That's because successful projects eventually grow up. What let you iterate fast in the beginning might eventually turn into an unmaintainable mess that has to be rebuilt from the inside out.

I personally would never have built Flamework; it repulses me a bit. At the same time, I have enormous respect for the people who built it. Their work and thinking have shaped how I solve problems and think of product engineering.

Seeking Purity

2025-02-08 08:00:00

The concept of purity — historically a guiding principle in social and moral contexts — is also found in passionate, technical discussions. By that I mean that purity in technology translates into adherence to a set of strict principles, whether it be functional programming, test-driven development, serverless architectures, or, in the case of Rust, memory safety.

Memory Safety

Rust positions itself as a champion of memory safety, treating it as a non-negotiable foundation of good software engineering. I love Rust: it's probably my favorite language. It probably won't surprise you that I have no problem with it upholding memory safety as a defining feature.

Rust aims to achieve the goal of memory safety via safe abstractions, a compile-time borrow checker and a type system that is in service of those safe abstractions. It comes as no surprise that the Rust community is also pretty active in codifying a new way to reason about pointers. In many ways, Rust pioneered completely new technical approaches and it is widely heralded as an amazing innovation.
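
To make “safe abstraction” concrete, here is a toy illustration of the pattern (mine, not from the Rust docs): the unsafe operation is fenced in behind a function whose own logic upholds the invariant, so callers never touch the unsafety.

fn first_half(v: &[u8]) -> &[u8] {
    let mid = v.len() / 2;
    // SAFETY: mid <= v.len() by construction, so the range is in bounds.
    unsafe { v.get_unchecked(..mid) }
}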

However, as with many movements rooted in purity, what starts as a technical pursuit can evolve into something more ideological. Similar to how moral purity in political and cultural discourse can become charged, so does the discourse around Rust, which has been dominated by the pursuit of memory safety. Particularly within the core Rust community itself, discussion has moved beyond technical merits into something akin to ideological warfare. The fundamental question of “Is this code memory safe?” has shifted to “Was it made memory safe in the correct way?”. This distinction matters because it introduces a purity test that values methodology over outcomes. Safe C code, for example, is often dismissed as impossible, not necessarily because it is impossible, but because it lacks the strict guarantees that Rust's borrow checker enforces. Similarly, using Rust's unsafe blocks is increasingly frowned upon, despite their intended purpose of enabling low-level optimizations when necessary.

This ideological rigidity creates significant friction when Rust interfaces with other ecosystems (or gets introduced there), particularly those that do not share its uncompromising stance. For instance, the role of Rust in the Linux kernel has been a hot topic. The Linux kernel operates under an entirely different set of priorities. While memory safety is important, there is insufficient consensus for adopting Rust in general. The kernel is an old project and it aims to remain maintainable for a long time into the future. For it to even consider a rather young programming language should be seen as a tremendous success for Rust, and also as a sign of how open Linus is to the idea.

Yet that introduction is balanced against performance, maintainability, and decades of accumulated engineering expertise. Many of the kernel developers, who have found their own strategies to write safe C for decades, are not accepting the strongly implied premise that their work is inherently flawed simply because it does not adhere to Rust's strict purity rules.

Tensions rose when a kernel developer advocating for Rust's inclusion took to social media to push for changes in the Linux kernel development process. The public shaming tactic failed, leading the developer to conclude:

“If shaming on social media does not work, then tell me what does, because I'm out of ideas.”

It's not just the kernel where Rust's memory safety runs up against the complexities of the real world. Very similar feelings creep up in the gaming industry, where people love to do wild stuff with pointers. You do not need large disagreements to see the purist approach create some friction. A recent post of mine, for instance, triggered some discussions about the trade-offs between more dependencies and moving unsafe code to centralized crates.

I really appreciate that Rust code does not crash as much. That part of Rust, among many others, makes it very enjoyable to work with. Yet I am entirely unconvinced that memory safety should trump everything, at least at this point in time.

What people want in the Rust-in-Linux situation is for the project leader to come in and declare support for Rust's call for memory safety above all. To make the detractors go away.

Python's Migration Lesson

Hearing this call and discussion brings back memories. I have lived through a purity-driven shift in a community before. The move from Python 2 to Python 3 started out very much the same way. There was an almost religious movement in the community to move to Python 3 in a ratcheting motion. The idea that you could maintain code bases that support both 2 and 3 was initially very loudly rejected. I took a lot of flak at the time (and for years after) for advocating for a more pragmatic migration, which burned me out a lot. That feedback came both in person and online, and it largely pushed me away from Python for a while. Not getting behind the Python 3 train was seen as sabotaging the entire project. However, a decade later, I feel somewhat vindicated that it was worth being pragmatic about that migration.

At the root of that discourse was an idealistic view of how Unicode could work in the language and the idea that you could move an entire ecosystem at once. Both of those greatly clashed with the lived realities in many projects and companies.

I am a happy user of Python 3 today. This migration also taught me the important lesson not to be too stuck on a particular idea. It would have been very easy to pick one of the two sides of that debate. Be stuck on Python 2 (at the risk of forking), or go all in on Python 3 no questions asked. It was the path in between that was quite painful to advocate for, but it was ultimately the right path. I wrote about my lessons from that migration in 2016 and I think most of this still rings true. That post was motivated by people who, even years later, were still reaching out to me because they had not moved to Python 3, hoping I would embrace their path. Yet Python 3 has changed! Python 3 is a much better language than it was when it first released. It is a great language because it's used by people solving real, messy problems and because it over time found answers for what to do if you need to have both Python 2 and 3 code in the wild. While the world of Python 2 is largely gone, we are still in a world where Unicode and bytes mix in certain contexts.

The Messy Process

Fully committing to a single worldview can be easier because you stop questioning everything — you can just go with the flow. Yet truths often reside on both sides. Allowing yourself to walk the careful middle path enables you to learn from multiple perspectives. You will face doubts and open yourself up to vulnerability and uncertainty. The payoff, however, is the ability to question deeply held beliefs and push into the unknown territory where new things can be found. You can arrive at a solution that isn't a complete rejection of any side. There is genuine value in what Rust offers—just as there was real value in what Python 3 set out to accomplish. But the Python 3 of today isn't the Python 3 of those early, ideological debates; it was shaped by a messy, slow, often contentious, yet ultimately productive transition process.

I am absolutely sure that 30 years from now we are going to primarily program in memory safe languages (or the machines will do it for us), even in environments where C and C++ prevail today. That glimpse of a future I can visualize clearly. The path to there however? That's a different story altogether. It will be hard, it will be impure. Maybe the solution will not even involve Rust at all — who knows.

We also have to accept that not everyone is ready for change at the same pace. Forcing adoption when people aren't prepared only causes the pendulum to swing back hard. It's tempting to look for a single authority to declare “the one true way,” but that won't smooth out the inevitable complications. Indeed, those messy, incremental challenges are part of how real progress happens. In the long run, these hard-won refinements tend to produce solutions that benefit all sides—if we’re patient enough to let them take root. The painful and messy transition is here to stay, and that's exactly why, in the end, it works.

Fat Rand: How Many Lines Do You Need To Generate A Random Number?

2025-02-04 08:00:00

I recently wrote about dependencies in Rust. The feedback within and outside the Rust community was very different. A lot of people, particularly some of those I greatly admire, expressed support. The Rust community, on the other hand, was very dismissive on Reddit and Lobsters.

Last time, I focused on the terminal_size crate, but I also want to show you a different one that I came across once more: rand. It has a similarly out-of-whack value-to-dependency ratio, but in a slightly different way. More than terminal_size, you are quite likely to use it. If, for instance, you want to generate a random UUID, the uuid crate will depend on it. Due to its nature it also has a high security exposure.

I don't want to frame this as “rand is a bad crate”. It's not a bad crate at all! It is however a crate that does not appear very concerned about how many dependencies it has, and I want to put this in perspective: of all the dependencies and lines of code it pulls in, how many does it actually use?

As the name implies, the rand crate is capable of calculating random numbers. The crate itself has seen a fair bit of churn: for instance 0.9 broke backwards compatibility with 0.8. So, as someone who used that crate, I did what a responsible developer is supposed to do, and upgraded the dependency. After all, I don't want to be the reason there are two versions of rand in the dependency tree. After the upgrade, I was surprised how fat that dependency tree has become over the last nine months.

Today, this is what the dependency tree looks like for the default feature set on macOS and Linux:

x v0.1.0 (/private/tmp/x)
└── rand v0.9.0
    ├── rand_chacha v0.9.0
    │   ├── ppv-lite86 v0.2.20
    │   │   └── zerocopy v0.7.35
    │   │       ├── byteorder v1.5.0
    │   │       └── zerocopy-derive v0.7.35 (proc-macro)
    │   │           ├── proc-macro2 v1.0.93
    │   │           │   └── unicode-ident v1.0.16
    │   │           ├── quote v1.0.38
    │   │           │   └── proc-macro2 v1.0.93 (*)
    │   │           └── syn v2.0.98
    │   │               ├── proc-macro2 v1.0.93 (*)
    │   │               ├── quote v1.0.38 (*)
    │   │               └── unicode-ident v1.0.16
    │   └── rand_core v0.9.0
    │       ├── getrandom v0.3.1
    │       │   ├── cfg-if v1.0.0
    │       │   └── libc v0.2.169
    │       └── zerocopy v0.8.14
    ├── rand_core v0.9.0 (*)
    └── zerocopy v0.8.14

About a year ago, it looked like this:

x v0.1.0 (/private/tmp/x)
└── rand v0.8.5
    ├── libc v0.2.169
    ├── rand_chacha v0.3.1
    │   ├── ppv-lite86 v0.2.17
    │   └── rand_core v0.6.4
    │       └── getrandom v0.2.10
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.169
    └── rand_core v0.6.4 (*)

Not perfect, but better.

So, let's investigate what all these dependencies do. The current version pulls in quite a lot.

Platform Dependencies

First there is the question of getting access to the system RNG. On Linux and Mac it uses libc, for Windows it uses the pretty heavy Microsoft crates (windows-targets). The irony is that the Rust standard library already implements a way to get a good seed from the system, but it does not expose it. Well, not really at least. There is a crate called fastrand which does not have any dependencies which seeds itself by funneling out seeds from the stdlib via the hasher system. That looks a bit like this:

use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

fn random_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

Now obviously that's a hack, but it will work because the hashmap's hasher is randomly seeded from good sources. There is a single-dependency crate too which can read from the system's entropy source and that's getrandom. So there at least could be a world where rand only depends on that.
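
For comparison, seeding straight from the OS entropy source via getrandom is about as small. A sketch against the 0.2-era getrandom::getrandom function (the 0.3 line renamed that entry point):

fn seed_from_os() -> u64 {
    let mut buf = [0u8; 8];
    // Ask the OS entropy source for 8 bytes; that's the whole job.
    getrandom::getrandom(&mut buf).expect("OS RNG unavailable");
    u64::from_ne_bytes(buf)
}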

Dependency Chain

If you want to audit the entire dependency chain, you end up with maintainers that form eight distinct groups:

  1. libc: rust core + various externals
  2. cfg-if: rust core + Alex Crichton
  3. windows-*: Microsoft
  4. rand_* and getrandom: rust nursery + rust-random
  5. ppv-lite86: Kaz Wesley
  6. zerocopy and zerocopy-derive: Google (via two ICs there, Google does not publish)
  7. byteorder: Andrew Gallant
  8. syn, quote, proc-macro2, unicode-ident: David Tolnay

If I also cared about WASM targets, I'd have to consider even more dependencies.

Code Size

So let's vendor it. How much code is there? After removing all tests, we end up with 29 individual crates vendored taking up 62MB disk space. Tokei reports 209,150 lines of code.

Now this is a bit misleading because, as is often the case, most of this is within windows-*. But how much of windows-* does getrandom need? A single function:

extern "system" fn ProcessPrng(pbdata: *mut u8, cbdata: usize) -> i32

For that single function (and the information about which DLL it needs to link against), we are compiling and downloading megabytes of windows-targets. Longer term this might not be necessary, but today it is.
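
To put that in perspective: with raw-dylib linking, declaring that one import by hand is just a handful of lines. A sketch, assuming ProcessPrng lives in bcryptprimitives.dll (which is what getrandom binds against), and not a drop-in replacement:

#[cfg(windows)]
#[link(name = "bcryptprimitives", kind = "raw-dylib")]
extern "system" {
    fn ProcessPrng(pbdata: *mut u8, cbdata: usize) -> i32;
}

#[cfg(windows)]
fn fill_random(buf: &mut [u8]) {
    // ProcessPrng is documented to always succeed.
    unsafe { ProcessPrng(buf.as_mut_ptr(), buf.len()) };
}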

On Unix, it's harder to avoid libc because it tries multiple APIs. These are mostly single-function APIs, but some non-portable constants make libc difficult to avoid.

Beyond the platform dependencies, what else is there?

  • ppv-lite86 (the default random number generator rand picked) alone comes to 3,587 lines of code including 168 unsafe blocks. If the goal of using zerocopy was to avoid unsafe, there is still a ton of unsafe remaining.
  • The combination of proc-macro2, quote, syn, and unicode-ident comes to 49,114 lines of code.
  • byteorder clocks in at 3,000 lines of code.
  • The pair of zerocopy and zerocopy-derive together? 14,004 lines of code.

All of these are great crates, but do I need all of this just to generate a random number?

Compilation Times

Then there are compile times. How long does it take to compile? 4.3 seconds on my high-end M1 Max. A lot of dependencies block each other, particularly the part that waits for the derives to finish.

  • rand depends on rand_chacha,
  • which depends on ppv-lite86,
  • which depends on zerocopy (with the derive feature),
  • which depends on zerocopy-derive
  • which pulls in the compiler plugin crates.

Only after all that code generation has finished does the rest make meaningful progress. In total a release build produces 36MB of compiler artifacts. 12 months ago, it took just under 2 seconds.

Final Thoughts

The Rust developer community on Reddit doesn't seem very concerned. The main sentiment is that rand now uses less unsafe, so that's benefit enough. While the total amount of unsafe probably did not go down, the moved unsafe is now in a common crate written by people who know how to use unsafe (zerocopy). There is also the sentiment that all of this doesn't matter anyways, because we will all soon depend on zerocopy everywhere anyways, as more and more dependencies are switching over to it.

Maybe this points to Rust not having a large enough standard library. Perhaps features like terminal size detection and random number generation should be included. That at least is what people pointed out on Twitter.

We already treat crates like regex, rand, and serde as if they were part of the standard library. The difference is that I can trust the standard library as a whole — it comes from a single set of authors, making auditing easier. If these external, but almost standard crates were more cautious about dependencies and made it more of a goal to be auditable, we would all benefit.

Or maybe this is just how Rust works now. That would make me quite sad.


Update: it looks like there is some appetite in rand to improve on this.

  • zerocopy might be removed in the core library: issue #1574 and PR #1575.
  • a stripped down version of chacha20 (which does not require zerocopy or most of the rust-crypto ecosystem) might replace ppv-lite86: PR #934.
  • if you use Rust 1.71 or later, windows-targets becomes mostly a no-op if you compile with --cfg=windows_raw_dylib.

Edit: This post originally incorrectly said that getrandom depends on windows-sys. That is incorrect, it only depends on windows-targets.

How I Use AI: Meet My Promptly Hired Model Intern

2025-01-30 08:00:00

After Musk's acquisition of Twitter, many people I respect and follow moved to Bluesky. I created an account there and made an honest attempt of making it my primary platform. Sadly, I found Bluesky to be surprisingly hostile towards AI content. There is an almost religious resistance to AI on there, at least in whatever corner of the platform I ended up in.

Despite these challenges, some individuals on both Twitter and Bluesky have shown genuine curiosity about my AI usage. In this post, I want to share how I use Large Language Models and why I find them so helpful.

Before moving on, I want to include an important disclaimer: I am by no means an expert in AI; I'm mostly an enthusiastic user. Absolutely nothing here is novel! What I do here is pretty boring, which to some degree is the point. I won't be covering the underlying technology or my programmatic experience. This is strictly about how I use AI as a “techy consumer”.

In addition, as you read through this article you will probably figure out rather quickly that the way I use AI — despite being used in part for content creation — does not really impact intellectual property much. That said, I'm curious and open to discussions about how we should be dealing with this problem. Particularly on Bluesky a lot of the negativity towards AI is related to watering down of copyrights and human creation. I don't know the answers to these things, but I think we need to have a productive dialog here rather than wishing for the technology to go away.

Why AI?

In short: AI makes me significantly more productive. I recently read Aaron Francis's Tweet about how he gets a week's worth of productivity out of a day now thanks to AI. I'm not anywhere close to that, but I use AI a lot now. It has become indispensable for me for both content writing as well as programming.

Moreover, a common misconception is that AI is still at the level it was when ChatGPT first launched. Many people tried it early, saw its limitations, and never returned. However, AI is evolving rapidly. If you haven't kept up, you risk drawing inaccurate conclusions based on outdated impressions. In some sense I believe people who have never tried AI yet are in a better position to judge than the folks who used it two years ago and never returned.

My AI Toolkit

I work with a variety of AI tools, mostly because of professional curiosity and to a smaller degree because each tool excels at something slightly different. Here are the ones I use most often:

  • Open WebUI. In short this is a Python web app that offers a chat interface similar to ChatGPT. Unlike ChatGPT, however, it lets you talk to different models. First and foremost, I use this to talk to local models hosted by Ollama, but secondarily I also use it to interface with other remote services like OpenAI, Anthropic and DeepSeek.
  • Simon's llm. This is a command line tool with plenty of plugins that lets you prompt different models. Think of it as a command-line version of Open WebUI. It's particularly useful for quick scripting and basic automation.
  • Ollama. This allows me to run models locally on my MacBook Pro M1 Max. With the 64GB of RAM it has, it's a pretty potent machine for basic inference despite it being three years old. I'll explain later why I use local models.
  • Cursor. It is a fork of Visual Studio Code with AI-based auto completion and code generation built-in. It's my go-to for programming with AI assistance at the moment.
  • ChatGPT. Like probably most AI users, I use ChatGPT, particularly on my phone. I pay for the Plus subscription because I use it enough to get a lot of value out of it. One significant use of this for me is in fact the voice mode (more on that later).

It's probably worth mentioning that you can get most of the benefits of this from just paying for a single AI tool. I think as one expands their use, and especially as one gets better at writing prompts, the desire naturally grows to use more tooling. As for which models (and services) to use day to day, I don't have a particularly strong strategy and preferences change quickly. For instance after DeepSeek's R1 release, I started exploring it quite a bit for programming — and it's doing a phenomenal job at it — and as of writing that's just a few days old.

If you want to run models locally, Apple Silicon machines currently offer some of the best “bang for your buck” in terms of performance, power usage, and money. With tailscale, I can even access my MacBook's Open WebUI interface from my phone, as long as it is powered on.

Guide for installing llm and Open WebUI

This is a bit of a plug for uv quite honestly. If you have it, it's crazy quick to get Open WebUI and llm running locally:

uv tool install -p python3.11 open-webui
uv tool install --with llm-ollama --with llm-claude-3 --with llm-deepseek llm

Afterwards you can launch open-webui and use the llm tool:

open-webui serve
llm keys set deepseek
llm -m deepseek-coder '# write timsort in python'

  • By default, Open WebUI only connects to OpenAI or Ollama. If you want to add the hosted Anthropic or DeepSeek models, you need to create a custom function in the “Functions” section of Open WebUI's admin panel.
  • Import the functions and configure your API keys, and you’re ready to go.

Functions you can import to the WebUI:

AI Affordances

One frequent concern I hear is “you cannot trust LLMs” as they tend to hallucinate. I get this in particular when I explain that I frequently use them as a replacement for Google! However, I approach the risk of hallucination the same way I would when seeking advice from another human: people can be and routinely are wrong, and you learn to cross-check selectively.

I treat AI as I would a collaborator or a pretty good intern, but I remain responsible for the final outcome. In this case the intern also happens to get better month by month as models improve. And a bit like a human, that digital intern has morals and wants to be argued with. Except, of course, that some of those AI interns don't want to talk about China, while others get a stroke if you talk about certain people. But regardless of how good they get, in the end, it's my fault and my fault alone if I do the wrong thing. I won't blame the AI and I need to spot check.

However, the logical conclusion of this is not that it's wrong all the time and you need to check everything, or that you cannot trust it at all. It's similar to how you engage in a technical discussion with others about a problem. I have seen more than one situation where the conventional wisdom in the room was just wrong for a few minutes, until someone pointed out that we had it wrong.

Another major advantage is that AI tools are relatively open. You can run models locally and integrate them with scripts. Even the famous OpenAI, which is not at all open, is much more open than a Google search is.

In other words, you can integrate it locally and nobody stops you. By contrast, many, many years ago I had a tool on my computer that allowed me to issue web searches and extract text from results. It stopped working so long ago that I almost forgot about it. It stopped working because there is basically no competition in search, and Google does not want me to use it like that.

For instance, you can create a simple script for grammar checking right from your command line:

#!/bin/sh
MODEL=phi4:latest
# use a POSIX-compatible redirect (&> is a bashism and /bin/sh may not be bash)
if ping -q -c1 google.com >/dev/null 2>&1; then
  MODEL=claude-3-5-sonnet-latest
fi
OLD_TEXT="$(cat)"
llm -m $MODEL "$OLD_TEXT" -s "fix spelling and grammar in the given text,
    and reply with the improved text and no extra commentary.
    Use double spacing."

This script can automatically switch between a local model (phi4 via Ollama) and a remote one (claude-3-5-sonnet-latest) based on internet connectivity. With a command like !llm-spell in Vim, I can fix up sentences with a single step.

Or you can manipulate the contents of the clipboard like this:

pbpaste | llm-spell | pbcopy && say "AI is done"

Writing with AI

I don't let AI write my articles. As a non-native speaker, mistakes and linguistic quirks are part of me. That said, I do rely on AI tools for helping me write.

Often, after I have made my first pass through a page, I ask an LLM to read through it and give me comments. I have an Apple Note with various prompts I can use, or I just come up with what I need in the moment. The process is not particularly scripted out. I basically talk to the LLM to figure out what it thinks of the text, etc.

Here are some of the things I use AI for when writing:

  • Grammar checking: I compare the AI’s suggested revisions side by side with my original text and pick the changes I prefer.
  • Restructuring: AI often helps me see when my writing is too wordy. In the days before AI, I often ended up with super long articles that did not read well and that I did not publish. Models like o1 are very helpful in identifying things that don't need to be said.
  • Writing Notes and finding key points: Here, I ask the AI to read through a draft “like a Computer Science 101 student” and take notes. This helps me see if what it absorbed matches what I intended to convey.
  • Roast my Article: I have a few prompts that ask the AI to “roast” or criticize my article, as if commenting on Reddit, Twitter, or Hacker News. Even though these critiques seem shallow, they can sting, and they often highlight weaknesses in my argument or lack of clarity. Even if they don't necessarily impact the writing, they prime me for some of the feedback I inevitably receive.
  • Identifying jargon: If I worry there's too much jargon, I use AI to resolve acronyms and point out technical terms I've used without explanation, helping me make the text more accessible.

Programming with AI

This is probably the bulk of what I do and it falls into three major categories.

  • Debugging and summarizing: Throwing stack traces and errors at an AI is usually a good first step in finding out where I should look, what might be wrong. Sometimes it fixes a problem entirely by itself. I also sometimes just throw a long bug report or issue at the thing to try to reason through it.
  • Fuzzy parsing: Sometimes you have an output, but you don't have the input. As an example, I have a little side project to write a command line tool help generator, and AI is really good at taking the output, “parsing” it and creating a structure as input. That is a good first step to then generate the code that should produce that output again.
  • Code generation: lastly it just programs really well. It can do single code snippets, it can generate complex stuff. Particularly for prototyping it's amazing.

Now it's not great at all languages. I find it performs exceptionally well at Python, somewhat okay at JavaScript and horrible at Rust. However, particularly if I want to solve an algorithmic problem, I can often iterate in Python until I have what I want, then ask it to translate it to Rust. In my experience at least, that's a decent way to get going quickly on some harder problems.

Talking to Her

Pair programming is fantastic, but not always possible if you're programming alone at home. An LLM can serve as a stand-in discussion partner — a poor man's interlocutor. Thankfully it's actually not all that poor, because it's really good at telling you what stuff is called. This is most fun if you do it with voice mode, but it also works quite well in text.

Here is a motivating example from voice mode:

A: So, do you remember if you have, like, strcat, it always copies from the start, but if you keep concatenating to the same buffer, it's really inefficient. So, remember what the Big-O of that is if you do it in a loop, and wasn't there, like, a humorous name for this kind of algorithm?

AI: You’re thinking of the “Schlemiel the Painter’s algorithm.” It’s a humorous way of describing the inefficiency of repeatedly appending to a string using operations like strcat, which copies from the start each time. The Big O complexity for this operation is O(n^2) if done repeatedly in a loop, since each strcat operation takes time proportional to the current length of the string.

It's also fun and educational for my kids. They ask a lot of questions, and ChatGPT can tailor its explanations to their age level. If the explanation is still too advanced, I can immediately ask it to simplify further. If it was too dumbed down, I can make it go the other way. Now again, there is always a risk that it gets it wrong, but that is okay. Part of all of this is to convey to the children the idea that we need to validate it and that being critical is important.

What makes ChatGPT so enjoyable here is that it's able to keep the conversation going — it has state. Answered in large metric numbers? Can always ask it to scale to elephants or soccer fields.

ChatGPT is also incredibly helpful when having to work with multiple languages. For a recent example, my kids have Greek friends and we tried to understand the difference between some Greek words that came up. I have no idea how to write it, Google translate does not understand my attempts of pronouncing them either. However, ChatGPT does. If I ask it in voice mode what “pa-me-spee-tee” in Greek means it knows what I tried to mumble and replies in a helpful manner.

Lastly the use on the go. Sometimes I'm just not in a position where I can actually write it down on a keyboard, but if I basically talk to ChatGPT, it can transcribe it down and it will also remove some of my “um's” and it will just be there and I can copy-paste it later and edit it. (To prove a point, I transcribed the previous sentence with the mobile app). This is a very different experience than transcribing on iOS natively. It understands enough context of what I'm babbling that it will start and stop transcribing. Now this is not perfect, but pretty damn good and incredibly helpful.

The multilingual aspect is particularly helpful because our family is multilingual. Being able to fluidly switch between German, Russian and English is such a refreshing experience.

No Slop

I mentioned earlier that I don't let LLMs write my texts. I also don't use AI to make illustrations, though I have in the past. The reason is that there is a certain style that goes along with these illustrations, which is just incredibly off-putting. They are noticeably AI-generated, and typically quite low-quality. People have taken to calling these images “AI slop” and I personally respond really badly to it. When someone throws me a pull request, an email or a text message that is obviously AI-generated without disclosing this, I immediately have a very low opinion of them.

Slop, like hallucinations, is a problem, but only if you don't use your brain. Even the worst slop can be the foundation of really good content. I'm a horrible artist, but I can use Illustrator. Even an AI slop image can help me trace the person in the pose I wanted. Likewise you can throw your notes into a document and let the AI imagine a story around it. You probably can't use that story right away, but you can use it as potential inspiration.

Final Thoughts

AI tools, at their best, feel less like disruptive and dark magic and more like a natural extension of the creative process as long as you see them as curious collaborators. My approach isn't about outsourcing thinking, but augmenting it: using LLMs to accelerate grunt work, untangle mental knots, and prototype ideas faster. Skepticism is healthy, but dismissing AI outright risks missing its potential as a multiplier for those willing to engage critically.

Build It Yourself

2025-01-24 08:00:00

Another day, another rant about dependencies from me. This time I will ask that we start and support a vibe shift when it comes to dependencies.

You're probably familiar with the concept of “dependency churn.” It's that never-ending treadmill of updates, patches, audits, and transitive dependencies that we as developers love to casually install in the name of productivity. Who doesn't enjoy waiting for yet another cargo upgrade just so you can get that fix for a bug you don't even have?

It's a plague in most ecosystems with good packaging solutions. JavaScript and Rust are particularly badly affected by that. A brand new Tokio project drags in 28 crates, a new Rocket project balloons that to 172, and a little template engine like MiniJinja can exist with just a single dependency — while its CLI variant slurps up 142.

If that doesn't sound like a big deal, let's consider terminal_size. It is a crate that does exactly what its name suggests: it figures out your terminal dimensions. The underlying APIs it uses have effectively been stable since the earliest days of computing terminals—what, 50 years or so? And yet, for one function, terminal-size manages to introduce three or four additional crates, depending on your operating system. That triggers a whole chain reaction, so you end up compiling thousands of other functions just to figure out if your terminal is 80x25 or 120x40. That crate had 26 releases. My own version of that, which I have stuck away in a project from 10 years ago, still works without a single update. Because, shocker: nothing about figuring out terminal sizes has changed. [1]
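
For reference, a version like that is just a handful of lines on Unix. A sketch using libc's TIOCGWINSZ ioctl (not the exact code from my old project; some libc flavors want a cast on the request constant):

fn terminal_size() -> Option<(u16, u16)> {
    // winsize is the struct the TIOCGWINSZ ioctl fills in.
    let mut ws: libc::winsize = unsafe { std::mem::zeroed() };
    let rc = unsafe { libc::ioctl(libc::STDOUT_FILENO, libc::TIOCGWINSZ, &mut ws) };
    if rc == 0 && ws.ws_col > 0 {
        Some((ws.ws_col, ws.ws_row)) // (columns, rows)
    } else {
        None
    }
}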

So why does terminal-size have so many updates if it's so stable? Because it's built on top of platform abstraction libraries that constantly churn, so it needs to update to avoid code duplication and blowing up compile times even more.

But “big supply chain” will tell you that you must do it this way. Don't you dare copy-paste that function into your library. And don't you dare use “unsafe” yourself. You're not qualified enough to write unsafe code; let the platform abstraction architects do that. Otherwise someone will slap you. There are entire companies making a living off supplying you with the tools needed to deal with your dependency mess. In the name of security, we're pushed to having dependencies and keeping them up to date, despite most of those dependencies being the primary source of security problems.

In many ways, the goal should be to write code that does not need updates. It should eventually achieve some level of stability. In the Rust ecosystem, however, stable code is punished: if you have a perfectly working dependency but a somewhat inactive bug tracker, RUSTSEC will come by and give you a chunk rating.

But there is a simpler path. You write code yourself. Sure, it's more work up front, but once it's written, it's done. No new crates, no waiting for upstream authors to fix that edge case. If it's broken for you, you fix it yourself. Code that works doesn't necessarily need the maintenance treadmill. Your code has a corner case? Who cares. This is the vibe shift we need in the Rust world: celebrating fewer dependencies rather than more.

We're at a point in most ecosystems where pulling in libraries is not just the default action, it's seen positively: “Look how modular and composable my code is!” Actually, it might just be a symptom of never wanting to type out more than a few lines.

Now one will make the argument that it takes so much time to write all of this. It's 2025 and it's faster for me to have ChatGPT or Cursor whip up a dependency-free implementation of these common functions than it is for me to start figuring out a dependency. And it makes sense, as for many such small functions the maintenance overhead is tiny and much lower than actually dealing with the constant upgrading of dependencies. The code is just a few lines and you also get the benefit of no longer needing to compile thousands of lines of other people's code for a single function.

But let's face it: corporate code review culture has also infected Open Source software. Companies are more likely to reward engineers than scold them for pulling in that new “shiny library” that solves a problem they never actually had. That creates problems, so dependabot and friends were born. Today I just dread getting dependabot pull requests on my projects, but I have to accept them. I'm part of an ecosystem with my stuff, and that ecosystem is all about churn, churn, churn. In companies you can also keep entire internal engineering teams busy with vendoring dependencies, internal audits and upgrading things throughout the company.

Fighting this fight is incredibly hard! Every new hire has been trained on the idea that dependencies are great, that code reuse is great. That having old code sitting around is a sign of bad engineering culture.

It's also hard to fight this in Open Source. Years ago I wrote sha1-smol which originally was just called sha1. It became the standard crate to calculate SHA1 hashes. Eventually I was pressured to donate that package name to rust-crypto and to depend on the rest of the crypto ecosystem as it was so established. If you want to use the new sha1 crate, you get to enjoy 10 dependencies. But there was just no way around it, because that name in the registry is precious and people also wanted to have trait compatibility. It feels tiring to be the only person in a conversation pushing to keep the churn down and dependencies low.

It's time to have a new perspective: we should give kudos to engineers who write a small function themselves instead of hooking in a transitive web of crates. We should be suspicious of big crate graphs. Celebrated are the minimal dependencies, the humble function that just quietly does the job, the code that doesn't need to be touched for years because it was done right once.

And sure, it's not black and white. There are the important libraries that solve hard problems. Graphics libraries that abstract over complex drivers, implementations of protocols like HTTP and QUIC. I won't be able to get rid of tokio and I have no desire to. But when you end up using one function, but you compile hundreds, some alarm bell should go off.

We need that vibe shift. To celebrate building it yourself when it's appropriate to do so. To give credit to library authors who build low to no-dependency Open Source libraries.

For instance minijinja celebrates it in the readme:

$ cargo tree
minimal v0.1.0 (examples/minimal)
└── minijinja v2.6.0 (minijinja)
    └── serde v1.0.144

And it has a PR to eventually get rid of the last dependency. And sometime this year I will make it my goal to go ahead proudly and trim down all that fat in my projects.

[1] Disclaimer: you will need one dependency for UNIX: libc. That's because Rust does not expose the platform's libc constants to you, and they are not standardized. That however is such a common and lightweight dependency that you won't be able to avoid it anyways.

Automatic Server Reloading in Rust on Change: What is listenfd/systemfd?

2025-01-19 08:00:00

When I developed Werkzeug (and later Flask), the most important part of the developer experience for me was enabling fast, automatic reloading. In Werkzeug (and with it Flask), this is achieved by using two processes at all times. The parent process holds on to the file descriptor of the socket on which the server listens, and a subprocess picks up that file descriptor. That subprocess restarts when it detects changes. This ensures that no matter what happens, there is no window where the browser reports a connection error. At worst, the browser will hang until the process finishes reloading, after which the page loads successfully. In case the inner process fails to come up during restarts, you get an error message.

A few years ago, I wanted to accomplish the same experience for working with Rust code, which is why I wrote systemfd and listenfd. However, I realized that I never really wrote here about how they work, and disappointingly I think those crates, and a good auto-reloading experience in Rust, are largely unknown.

Watching for Changes

Firstly one needs to monitor the file system for changes. While in theory I could have done this myself, there was already a tool that could do that.

At the time there was cargo watch. Today one might instead use the more generic watchexec. Either one monitors your workspace for changes and then executes a command. So you can for instance tell it to restart your program. One of these will work:

watchexec -r -- cargo run
cargo watch -x run

You will need a tool like that to do the watching part. At this point I recommend the more generic watchexec which you can find on homebrew and elsewhere.

Passing Sockets

But what about the socket? The solution to this problem I picked comes from systemd. Systemd has a “protocol” that standardizes passing file descriptors from one process to another through environment variables. In systemd parlance this is called “socket activation,” as it allows systemd to only launch a program if someone started making a request to the socket. This concept was originally introduced by Apple as part of launchd.

To make this work with Rust, I created two crates:

  • systemfd is the command line tool that opens sockets and passes them on to other programs.
  • listenfd is a Rust crate that accepts file descriptors from systemd or systemfd.

It's worth noting that systemfd is not exclusively useful to Rust. The systemd protocol can be implemented in other languages as well, meaning that if you have a socket server written in Go or Python, you can also use systemfd.

So here is how you use it.

First you need to add listenfd to your project:

cargo add listenfd

Then, modify your server code to accept sockets via listenfd before falling back to listening itself on ports provided through command-line arguments or configuration files. Here is an example using listenfd in axum:

use axum::{routing::get, Router};
use tokio::net::TcpListener;

async fn index() -> &'static str {
    "Hello, World!"
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let app = Router::new().route("/", get(index));

    let mut listenfd = listenfd::ListenFd::from_env();
    let listener = match listenfd.take_tcp_listener(0)? {
        Some(listener) => TcpListener::from_std(listener),
        None => TcpListener::bind("0.0.0.0:3000").await,
    }?;

    axum::serve(listener, app).await?;
    Ok(())
}

The key point here is to accept socket 0 from the environment as a TCP listener and use it if available. If the socket is not provided (e.g. when launched without systemd/systemfd), the code falls back to opening a fixed port.

Putting it Together

Finally you can use cargo watch / watchexec together with systemfd:

systemfd --no-pid -s http::8888 -- watchexec -r -- cargo run
systemfd --no-pid -s http::8888 -- cargo watch -x run

This is what the parameters mean:

  • systemfd needs to be first: it's the program that opens the sockets.
  • --no-pid is a flag that prevents the PID from being passed. This is necessary for listenfd to accept the socket. This is a departure from systemd's socket passing protocol, which otherwise does not allow sockets to be passed through another program (like watchexec). In short: when the PID information is not passed, listenfd will accept the socket regardless. Otherwise it would only accept it from the direct parent process. (See the sketch after this list for what that check looks like.)
  • -s http::8888 tells systemfd to open one TCP socket on port 8888. Using http instead of tcp is a small improvement that will cause systemfd to print out a URL on startup.
  • -- watchexec -r makes watchexec restart the process when something changes in the current working directory.
  • -- cargo run is the program that watchexec will start and re-start on changes. In Rust this will first compile the changes and then run the application. Because we put listenfd in, it will try to first accept the socket from systemfd.
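
If you are curious what accepting that socket boils down to, the systemd protocol really is tiny: the spawning process sets LISTEN_FDS (and normally LISTEN_PID) in the environment and leaves the sockets at file descriptor 3 and up. A simplified, Unix-only sketch of the receiving side — listenfd does more than this:

use std::net::TcpListener;
use std::os::unix::io::FromRawFd;

fn take_socket_from_env() -> Option<TcpListener> {
    let fds: u32 = std::env::var("LISTEN_FDS").ok()?.parse().ok()?;
    // When LISTEN_PID is present it must match our own PID;
    // --no-pid simply omits it so this check is skipped.
    if let Ok(pid) = std::env::var("LISTEN_PID") {
        if pid.parse::<u32>().ok()? != std::process::id() {
            return None;
        }
    }
    if fds < 1 {
        return None;
    }
    // The first passed socket is always fd 3 (SD_LISTEN_FDS_START).
    Some(unsafe { TcpListener::from_raw_fd(3) })
}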

The end result is that you can edit your code, and it will recompile automatically and restart the server without dropping any requests. When you run it, and perform changes, it will look a bit like this:

$ systemfd --no-pid -s http::5555 -- watchexec -r -- cargo run
~> socket http://127.0.0.1:5555/ -> fd #3
[Running: cargo run]
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/axum-test`
[Running: cargo run]
   Compiling axum-test v0.1.0 (/private/tmp/axum-test)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.52s
     Running `target/debug/axum-test`

For easier access, I recommend putting this into a Makefile or similar so you can just run make devserver and it runs the server in watch mode.
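
Something like this will do (note that make requires the recipe line to be indented with a tab):

devserver:
	systemfd --no-pid -s http::8888 -- watchexec -r -- cargo run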

To install systemfd you can use curl to bash:

curl -sSfL https://github.com/mitsuhiko/systemfd/releases/latest/download/systemfd-installer.sh | sh

What About Windows?

Now how does this work on Windows? The answer is that systemfd and listenfd have a custom, proprietary protocol that also makes socket passing work on Windows. It's a more complex system which involves a local RPC server, but the details about how it works are largely irrelevant for you as a user — unless you want to implement that protocol for another programming language.

Potential Improvements

I really enjoy using this combination, but it can be quite frustrating to require so many commands, and the command line workflow isn't optimal. Ideally, this functionality would be better integrated into specific Rust frameworks like axum and provided through a dedicated cargo plugin. In a perfect world, one could simply run cargo devserver, and everything would work seamlessly.

However, maintaining such an integrated experience is a much more involved effort than what I have. Hopefully, someone will be inspired to further enhance the developer experience and achieve deeper integration with Rust frameworks, making it more accessible and convenient for everyone.