2024-11-26 08:00:00
There is currently an effort underway to build a new universal lockfile standard for Python, most of which is taking place on the Python discussion forum. This initiative has highlighted the difficulty of creating a standard that satisfies everyone. It has become clear that different Python packaging tools have slightly different ideas of what a lockfile is supposed to look like, or even what it is supposed to be used for.
In those discussions, however, another small aspect re-emerged: Python has a metadata problem. Python's metadata system is too complex and suffers from what I would call a “lack of constraints”.
JavaScript provides an excellent example of how constraints can simplify and improve a system. In JavaScript, metadata is straightforward. Whether you develop against a package locally or use a package from npm, metadata presents itself the same way. There is a single package.json file that contains the most important metadata of a package, such as name, version or dependencies. This simplicity imposes significant but beneficial constraints:
These constraints offer several benefits:
In contrast, Python has historically placed very few constraints on metadata. For example, the old setup.py based build system essentially allowed arbitrary code execution during the build process. At one point it was at least strongly suggested that the version produced by that build step should match what is uploaded to PyPI. In practice, however, if you lie about the version, that is okay too. You could upload a source distribution to PyPI that claims it's 2.0 but will in fact install 2.0+somethinghere, or a completely different version entirely.
What happens is that both before a package is published to PyPI and when a package is installed locally after downloading, the metadata is generated from scratch. Not only does that mean the metadata does not have to match, it also means that it's allowed to be completely different. It's absolutely okay for a package to claim it depends on cool-dependency on your machine, but on uncool-dependency on my machine. Or to depend on different packages depending on the time of day or the phase of the moon.
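To make this concrete, here is a hedged sketch (the package name and dependencies are made up) of how little the old setup.py build step constrains metadata: the version and the dependency list are computed at build time, so what PyPI displays and what your machine produces can happily diverge.

import datetime
from setuptools import setup

today = datetime.date.today()

setup(
    name="example-package",  # hypothetical package name
    # claims to be 2.0, but every build bakes in a different local version
    version="2.0+" + today.strftime("%Y%m%d"),
    # the dependency list changes depending on when the build runs
    install_requires=(
        ["cool-dependency"] if today.day % 2 else ["uncool-dependency"]
    ),
)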
Editable installs and caching are particularly problematic since metadata could become invalid almost immediately after being written. [3]
Some of this has been somewhat improved because the new pyproject.toml standard encourages static metadata. However build systems are entirely allowed to override that by falling back to what is called “dynamic metadata” and this is something that is commonly done.
In practice this system incurs a tremendous tax on everybody, one that is easily missed.
Disjointed and complex metadata access: there is no clear relationship between the PyPI package name and the installed Python modules. If you know what the PyPI package name is, you can access metadata via importlib.metadata. That metadata is not read from pyproject.toml, even if it's static; instead the package name is used to look up the metadata in the .dist-info folder (specifically the METADATA file therein) installed into site-packages.
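As a small illustration (assuming the requests distribution happens to be installed), this is the blessed way to get at that metadata at runtime; note that it never consults pyproject.toml:

import importlib.metadata

# Reads the METADATA file from the .dist-info directory in site-packages,
# keyed by the distribution name on PyPI (not the importable module name).
meta = importlib.metadata.metadata("requests")
print(meta["Name"], meta["Version"])

# Dependency information also comes from the same generated file.
print(importlib.metadata.requires("requests"))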
Mandatory metadata re-generation: As a consequence, if you edit pyproject.toml to change a piece of metadata, you need to re-install the package for that metadata to be updated in the .dist-info. People commonly forget to do that, so desynchronized metadata is very common. This is true even for static metadata today!
Unclear cache invalidation: Because metadata can be dynamic, it's not clear when you should automatically re-install a package. It's not enough to just track pyproject.toml for changes when dynamic metadata is used. uv for instance has a really complex, explicit cache management system so one can help uv detect outdated metadata. This is obviously non-standardized, requires uv to understand version control systems, and is not shared with other tools. For instance, if you know that the version information incorporates the git hash, you can tell uv to pay attention to git commits.
Fragmented metadata storage: even where generated metadata is stored is complex. Different systems have slightly different behavior for storing that metadata.
Dynamic metadata makes resolvers slow: Dynamic metadata makes the job of resolvers and installers very hard and slows them down. Today, for instance, advanced resolvers like poetry or uv are sometimes not able to install the right packages because they assume that dependency metadata is consistent across sdists and wheels. However, there are a lot of sdists available on PyPI that publish incomplete dependency metadata (whatever the build step happened to produce on the developer's machine is what ends up cached on PyPI).
Not getting this right can be the difference between hitting one static URL with all the metadata, and downloading a zip file, creating a virtualenv, installing build dependencies, generating an entire sdist and then reading the final generated metadata. That's many orders of magnitude of difference in execution time.
This also extends to caching. If the metadata can constantly change, how would a resolver cache it? Is it required to build all possible source distributions to determine the metadata as part of resolving?
Cognitive complexity: The system introduces an enormous cognitive overhead which makes it very hard to understand for users, particularly when things go wrong. Incorrectly cached metadata can be almost impossible to debug for a user because they do not understand what is going on. Their pyproject.toml shows the right information, yet for some reason it behaves incorrectly. Most people don't know what "egg info" or "dist info" is, or why an sdist has metadata in a different location than a wheel or a local checkout.
Having support for dynamic metadata also means that developers continue to maintain elaborate and confusing systems. For instance there is a plugin for hatch that dynamically creates a readme [4], which requires running arbitrary Python code just to display documentation. There are plugins to automatically change versions to incorporate git version hashes. As a result, to figure out what version you actually have installed, it's not enough to just look into a single file; you might have to rely on a tool to tell you what's going on.
The challenge with dynamic metadata in Python is vast, but unless you are writing a resolver or packaging tool, you're not going to experience the pain as much. You might in fact quite enjoy the power of dynamic metadata. Unsurprisingly bringing up the idea to remove it is very badly received. There are so many workflows seemingly relying on it.
At this point fixing this problem might be really hard because it's a social problem more than a technical one. Had the constraints been placed there in the first place, these weird use cases would never have emerged. But because the constraints were not there, people were free to go to town with that freedom, with all the consequences it causes.
I think at this point it's worth moving the cheese, but it's unclear if this can be done through a standard. Maybe the solution will be for tools like uv or poetry to warn if dynamic metadata is used and strongly discourage it. Then over time the users of packages that use dynamic metadata will start to urge the package authors to stop using it.
The cost of dynamic metadata is real, but it's felt only in small ways all the time. You notice it a bit when your resolver is slower than it has to be, you notice it if your packaging tool installs the wrong dependency, you notice it if you need to read the manual for the first time when you have to reconfigure your cache key or force a package to constantly reinstall, you notice it if you need to re-install your local dependencies over and over for them not to break. There are many ways you notice it. You don't notice it as a roadblock, just as a tiny, tiny tax. Except that it is a tax we all pay, and it makes the user experience significantly worse compared to what it could be.
The deeper lesson here is that if you give developers too much flexibility, they will inevitably push the boundaries and that can have significant downsides as we can see. Because Python's packaging ecosystem lacked constraints from the start, imposing them now has become a daunting challenge. Meanwhile, other ecosystems, like JavaScript's, took a more structured approach early on, avoiding many of these pitfalls entirely.
[1] | You can see how this works in action for sentry-cli for instance. The @sentry/cli package declares all its platform specific dependencies as optionalDependencies (relevant package.json). Each platform build has a filter in its package.json for os and cpu. For instance this is what the arm64 linux binary dependency looks like: package.json. npm will attempt to install all optional dependencies, but it will skip over the ones that are not compatible with the current platform. |
[2] | For @sentry/cli at version 2.39.0 for instance this means that this singular URL will return all the information that a resolver needs: registry.npmjs.org/@sentry/cli/2.39.0 |
[3] | A common error in the past was to receive a pkg_resources.DistributionNotFound exception when trying to run a script in local development |
[4] | I got some flak on Bluesky for throwing readme generators under the bus. While they do not present the same problem that metadata like dependencies and versions does, they still increase complexity. In an ideal world what you find in site-packages represents what you have in your version control, and there is a README.md file right there. That's what you have in JavaScript, Rust and plenty of other ecosystems. What we have instead is a build step (either dynamic or copying) taking that readme file and placing it in an RFC 5322 header encoded file in a dist-info. So instead of "command clicking" on a dependency and finding the readme, we need special tools or arcane knowledge if we want to read readme files locally. |
2024-11-18 08:00:00
It's been a few years since I wrote about my challenges with async/await-based systems and how they just seem not to support back pressure well. A few years later, I do not think that this problem has subsided much, but my thinking and understanding have perhaps evolved a bit. I'm now convinced that async/await is, in fact, a bad abstraction for most languages, and we should be aiming for something better instead. That something, I believe, is threads.
In this post, I'm also going to rehash many arguments from very clever people that came before me. Nothing here is new, I just hope to bring it to a new group of readers. In particular, you should really consider these two highly influential pieces:
As programmers, we are so used to how things work that we make some implicit assumptions that really cloud our ability to think freely. Let me present you with a piece of code that demonstrates this:
def move_mouse():
    while mouse.x < 200:
        mouse.x += 5
        sleep(10)

def move_cat():
    while cat.x < 200:
        cat.x += 10
        sleep(10)

move_mouse()
move_cat()
Read that code and then answer this question: do the mouse and cat move at the same time, or one after another? I guarantee you that 10 out of 10 programmers will correctly state that they move one after another. It makes sense because we know Python and the concept of threads, scheduling and whatnot. But if you speak to a group of children familiar with Scratch, they are likely to conclude that mouse and cat move simultaneously.
The reason is that if you are exposed to programming via Scratch you are exposed to a primitive form of actor programming. The cat and the mouse are both actors. In fact, the UI makes this pretty damn clear, just that the actors are called “sprites”. You attach logic to a sprite on the screen and all these pieces of logic run at the same time. Mind-blowing. You can even send messages from sprite to sprite.
The reason I want you to think about this for a moment is that I think this is rather profound. Scratch is a very, very simple system and it's intended to teach programming to young kids. Yet the model it promotes is an actor system! If you were to foray into programming via a traditional book on Python, C# or some other language, it's quite likely that you would only learn about threads at the very end. Not just that, it will likely make them sound really complex and scary. Worse, you will probably only learn about actor patterns in some advanced book that will bombard you with all the complexities of large scale applications.
There is something else though you should keep in mind: Scratch will not talk about threads, it will not talk about monads, it will not talk about async/await, it will not talk about schedulers. As far as you are concerned as a programmer, it's an imperative (though colorful and visual) language with some basic “syntax” support for message passing. Concurrency comes naturally. A child can program it. It's not something to be afraid of.
The second thing I want you to take away is that imperative languages are not inferior to functional ones.
While probably most of us are using imperative programming languages to solve problems, I think we all have been exposed to the notion that they are inferior and not particularly pure. There is this world of functional programming, with monads and other things. That world has these nice things involving composition, logic and maths and fancy looking theorems. If you program in that, you're almost transcending to a higher plane, looking down on the folks who are stitching together if statements and for loops, making side effects everywhere, and doing highly inappropriate things with IO.
Okay, maybe it's not quite as bad, but I don't think I'm completely wrong with those vibes. And look, I get it. I feel happy chaining together lambdas in Rust and JavaScript. But we should also be aware that these constructs are, in many languages, bolted on. Go, for instance, gets away without most of this, and that does not make it an inferior language!
So what you should keep in mind here is that there are different paradigms, and mentally you should try to stop thinking for a moment that functional programming has all its stuff figured out, and imperative programming does not.
Instead, I want to talk about how functional languages and imperative languages are dealing with “waiting”.
The first thing I want to go back to is the example from above. Both of the functions (for the cat and the mouse) can be seen as separate threads of execution. When the code calls sleep(10) there's clearly an expectation by the programmer that the computer will temporarily pause the execution and continue later. I don't want to bore you with monads, so as my “functional” programming language I will use JavaScript and promises. I think that's an abstraction that most readers will be sufficiently familiar with:
function moveMouseBlocking() {
  while (mouse.x < 200) {
    mouse.x += 5;
    sleep(10); // a blocking sleep
  }
}

function moveMouseAsync() {
  return new Promise((resolve) => {
    function iterate() {
      if (mouse.x < 200) {
        mouse.x += 5;
        sleep(10).then(iterate); // non blocking sleep
      } else {
        resolve();
      }
    }
    iterate();
  });
}
You can immediately see a challenge here: it's very hard to translate the blocking example into a non blocking one, because all of a sudden we need to find a way to express our loop (or really any control flow). We need to manually decompose it into a form of recursive function calling, and we need the help of a scheduler and executor to do the waiting.
This style obviously eventually became annoying enough to deal with that async/await was introduced to mostly restore the sanity of the old code. So it now can look more like this:
async function moveMouseAsync() {
  while (mouse.x < 200) {
    mouse.x += 5;
    await sleep(10);
  }
}
Behind the scenes though, nothing has really changed, and in particular, when you call that function, you just get an object that encompasses the “composition of the computation”. That object is a promise which will eventually hold the resulting value. In fact, in some languages like C#, the compiler will really just transpile this into chained function calls. With the promise in hand, you can await the result, or register a callback with then which gets invoked if this thing ever runs to completion.
For a programmer, I think async/await is clearly understood as some sort of neat abstraction — an abstraction over promises and callbacks. However strictly speaking, it's just worse than where we started out, because in terms of expressiveness, we have lost an important affordance: we cannot freely suspend.
In the original blocking code, when we invoked sleep we suspended for 10 milliseconds implicitly; we cannot do the same with the async call. Here we have to “await” the sleep operation. This is the crucial aspect of why we're having these “colored functions”. Only an async function can call another async function, as you cannot await in a sync function.
The above example shows another problem that async/await causes: what if we never resolve? A normal function call eventually returns, the stack unwinds, and we're ready to receive the result. In an async world, someone has to call resolve at the very end. What if that is never called? Now in theory, that does not seem all that different from someone calling sleep() with a large number to suspend for a very long time, or waiting on a pipe that never gets data sent into. But it is different! In one case, we keep the call stack and everything that relates to it alive; in another case, we just have a promise and are waiting for independent garbage collection with everything already unwound.
Contract-wise, there is absolutely nothing that says one has to call resolve. And as we know from theory, the halting problem is undecidable, so it's actually impossible to know whether someone will call resolve or not.
That sounds pedantic, but it's very important because promises/futures and async/await are making something strictly worse than not having them. Let's consider a JavaScript promise to be the most canonical example of what this looks like. A promise is created by an anonymous function, that is invoked to eventually call resolve. Take this example:
let neverSettle = new Promise((resolve) => {
// this function ends, but we never called resolve
});
Let me clarify first that this is not a JavaScript specific problem, but it's nice to show it this way. This is a completely legal thing! It's a promise, that never resolves. That is not a bug! The anonymous function in the promise itself will return, the stack will unwind, and we are left with a “pending” promise that will eventually get garbage collected. That is a bit of a problem because since it will never resolve, you can also never await it.
Think of the following example, which demonstrates this problem a bit. In practice you might want to reduce how many things can work at once, so let's imagine a system that can handle up to 10 things that run concurrently. So we might want to use a semaphore to give out 10 tokens so up to 10 things can run at once; otherwise, it applies back pressure. So the code looks like this:
// assumes a Semaphore helper with acquire() and release(token) methods
const semaphore = new Semaphore(10);

async function execute(f) {
  let token = await semaphore.acquire();
  try {
    await f();
  } finally {
    await semaphore.release(token);
  }
}
But now we have a problem. What if the function passed to the execute function returns neverSettle? Well, clearly we will never release the semaphore token. This is strictly worse compared to blocking functions! The closest equivalent would be a stupid function that calls a very long running sleep. But it's different! In one case, we keep the call stack and everything that relates to it alive; in the other case we just have a promise that will eventually get garbage collected, and we will never see it again. In the promise case, we have effectively decided that the stack is not useful.
There are ways to fix this, like making promise finalization available so we can get informed if a promise gets garbage collected etc. However I want to point out that as per contract, what this promise is doing is completely acceptable and we have just caused a new problem, one that we did not have before.
And if you think Python does not have that problem, it does too. Just await Future() and you will be waiting until the heat death of the universe (or really until you shut down your interpreter).
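A minimal sketch of that Python equivalent:

import asyncio

async def main():
    # A bare Future that nobody ever resolves: this await suspends forever,
    # or until the interpreter/event loop is torn down.
    await asyncio.Future()

asyncio.run(main())  # never returns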
The promise that sits there unresolved has no call stack. But that problem also comes back in other ways, even if you use it correctly. The decomposed functions calling functions via the scheduler flow means that now you need extra affordances to stitch these async calls together into full call stacks. This all creates extra problems that did not exist before. Call stacks are really, really important. They help with debugging and are also crucial for profiling.
Okay, so we know there is at least some challenge with the promise model. What other abstractions are there? I will make the argument that a function being able to “suspend” a thread of execution is a bloody great capability and abstraction. Think of it for a moment: no matter where I am, I can say I need to wait for something and continue later where I left off. This is particularly crucial for applying back pressure if you decide you need it later. The biggest footgun in Python asyncio remains that write is non blocking. That function will stay problematic forever, and you need to follow up with await s.drain() to avoid buffer bloat.
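A minimal asyncio sketch of what that looks like in practice, assuming an already established StreamWriter: the write itself only buffers, and back pressure is only applied if you remember the extra await on drain():

import asyncio

async def send_all(writer: asyncio.StreamWriter, chunks):
    for chunk in chunks:
        writer.write(chunk)   # never blocks, only appends to an internal buffer
        await writer.drain()  # suspends if the peer is slow; forget this and the buffer bloats
    writer.close()
    await writer.wait_closed()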
In particular it's an important abstraction because in the real world we are constantly faced with things not in fact being async all the time, and some of the things we think might not block will in fact block. Just like Python did not think that write should be able to block when it was designed. I want to give you a colorful example of this. Why is the following code blocking, and what is it blocking on?
def decode_object(idx):
    header = indexes[idx]
    object_buf = buffer[header.start:header.start + header.size]
    return brotli.decompress(object_buf)
It's a bit of a trick question, but not really. The reason it's blocking is that memory access can be blocking! You might not think of it this way, but there are many reasons why just touching a memory region can take time. The most obvious one is memory-mapped files. If you're touching a page that hasn't been loaded yet, the operating system will have to shovel it into memory before returning back to you. There is no “await touching this memory” expression, because if there were, we would have to await everywhere. That might sound petty, but blocking memory reads were at the root of a series of incidents at Sentry [1].
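A small sketch of that situation (the file name is made up): nothing in the code says “IO happens here”, yet the slice below can stall the whole thread on a page fault.

import mmap

with open("objects.bin", "rb") as f:  # hypothetical preprocessed file
    buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # If the touched pages are not resident, the OS has to read them from disk
    # before this expression returns. There is no way to "await" that.
    object_buf = buf[0:4096]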
The trade-off that async/await makes today rests on the idea that not everything needs to block or suspend. The reality, however, has shown me that many more things really want to suspend, and if a random memory access is a case for suspending, then is the abstraction worth anything?
So maybe allowing any function call to block and suspend really was the right abstraction to begin with.
But then we need to talk about spawning threads next, because a single thread is not worth much. The one affordance that the async/await system gives you, which you don't have otherwise, is actually telling two things to run concurrently. You get that by starting the async operation and deferring the awaiting to later. This is where I will have to concede that async/await has something going for it. It moves the reality of concurrent execution right into the language. The reason concurrency comes so naturally to a Scratch programmer is that it's right there, and async/await serves a very similar purpose here.
In a traditional imperative language based on threads, the act of spawning a thread is usually hidden behind an (often convoluted) standard library function. More annoyingly, threads very much feel bolted on and completely inadequate for even the most basic of operations. Because not only do we want to spawn threads, we want to join on them, we want to send values across thread boundaries (including errors!). We want to wait for either a task to be done, or a keyboard input, messages being passed, etc.
So let's focus on threads for a second. As said before, what we are looking for is the ability for any function to yield / suspend. That's what threads allow us to do!
When I am talking about “threads” here, I'm not necessarily referring to a specific kind of implementation of threads. Think of the example of promises from above for a moment: we had the concept of “sleeping”, but we did not really say how that is implemented. There is clearly some underlying scheduler that can enable that, but how that takes place is outside the scope of the language. Threads can be like that. They could be real OS threads, they could be virtual and implemented with fibers or coroutines. At the end of the day, we don't necessarily have to care about it as developers if the language gets it right.
The reason this matters is that when I talk about “suspending” or “continuing somewhere else,” the thought of coroutines and fibers immediately comes to mind. That's because many languages that support them give you those capabilities. But it's good to step back for a second and just think about the general affordances that we want, not how they are implemented.
We need a way to say: run this concurrently, but don't wait for it to return; we want to wait later (or never!). Basically, the equivalent in some languages of calling an async function but not awaiting it. In other words: to schedule a function call. And that is, in essence, just what spawning a thread is. If we think about Scratch: one of the reasons concurrency comes naturally there is because it's really well integrated, a core affordance of the language. There is a real programming language that works very much the same: Go with its goroutines. There is syntax for it!
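In Python terms, a minimal sketch of that affordance is spawning a thread and deciding separately whether to ever wait for it:

import threading

def background_work():
    ...  # any ordinary, possibly blocking code

# "Schedule a function call": start it and keep going.
t = threading.Thread(target=background_work)
t.start()

# Wait for it later -- or never, which is exactly the deferred-await pattern.
t.join()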
So now we can spawn, and that thing runs. But now we have more problems to solve: synchronization, waiting, message passing and all that jazz are not solved. Even Scratch has answers to that! So clearly there is something else missing to make this work. And what even does that spawn call return?
There is an irony in async/await, and that irony is that it exists in multiple languages, it looks completely the same on the surface, yet works completely differently under the hood. Not only that, the origin stories of async/await in different languages are not even the same.
I mentioned earlier that code that can arbitrarily block is an abstraction of sorts. For many applications that abstraction really only makes sense if the CPU time while you're blocking can be used in other useful ways. On the one hand, because the computer would be pretty bored if it was only doing things in sequence; on the other hand, because we might need things to run in parallel. At times, as programmers, we need two things to make progress simultaneously before we can continue. Enter creating more threads. But if threads are so great, why all that talk about coroutines and promises that underpins so much of async/await in different languages?
I think this is the point where the story actually becomes confusing quickly. For instance JavaScript has entirely different challenges than Python, C# or Rust. Yet somehow all those languages ended up with a form of async/await.
Let's start with JavaScript. JavaScript is a single threaded language where a function scope cannot yield. There is no affordance in the language to do that, and threads do not exist. So before async/await, the best you could do was different forms of callback hell. The first iteration of improving that experience was adding promises; async/await only became sugar for that afterward. The reason that JavaScript did not have much choice here is that promises were the only thing that could be accomplished without language changes, and async/await is something that can be implemented as a transpilation step. So really: there are no threads in JavaScript. But here is an interesting thing that happens: JavaScript on the language level has the concept of concurrency. If you call setTimeout, you tell the runtime to schedule a function to be called later. This is crucial! In particular it also means that a promise, once created, will be scheduled automatically. Even if you forget about it, it will run!
Python on the other hand had a completely different origin story. In the days before async/await, Python already had threads — real, operating system level threads. What it did not have however was the ability for multiple of those threads to run in parallel. The reason for this is obviously the GIL (Global Interpreter Lock). However, that “just” keeps things from scaling to more than one core, so let's ignore it for a second. Because it had threads, it also had people experimenting rather early with implementing virtual threads in Python. Back in the day (and to some extent today) the cost of an OS level thread was pretty high, so virtual threads were seen as a fast way to spawn more of these concurrent things. There were two ways in which Python got virtual threads. One was the Stackless Python project, which was an alternative implementation of Python (many patches for cpython rather) that implemented what's called a “stackless VM” (basically a VM that does not maintain a C stack). In short, what that enabled is implementing something that stackless called “tasklets”, which were functions that could be suspended and resumed. Stackless did not have a bright future because the stackless nature meant that you could not have interleaving Python -> C -> Python calls and suspend with them on the stack.
There was a second attempt in Python called “greenlet”. The way greenlet worked was by implementing coroutines in a custom extension module. It is pretty gnarly in its implementation, but it does allow for cooperative multitasking. However, like stackless, it did not win out. Instead, what actually happened is that the generator system that Python had for years was gradually upgraded into a coroutine system with syntax support, and the async system was built on top of that.
One of the consequences of this is that it requires syntax support to suspend from a coroutine. This means that you cannot implement a function like sleep that, when called, yields to the scheduler on its own; you need to await it (or in earlier times, use yield from). So we ended up with async/await because of how coroutines work in Python under the hood. The motivation for this was that it was seen as a positive thing that you know when something suspends.
One interesting consequence of the Python coroutine model is that, at least at the coroutine level, it can transcend OS level threads. I could make a coroutine on one thread, ship it off to another, and continue it there. In practice, that does not work because once hooked up with the IO system, it cannot travel to another event loop on another thread any more. But you can already see that fundamentally it does something quite different from JavaScript. It can travel between threads, at least in theory; there are threads; there is syntax to yield. A coroutine in Python will also start out not running, unlike in JavaScript where it's effectively always scheduled. This is in part because the scheduler in Python can be swapped out, and there are competing and incompatible implementations.
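A tiny sketch of that difference: calling an async function in Python produces an inert coroutine object and nothing runs until something drives it, whereas the comparable JavaScript promise would already be scheduled.

import asyncio

async def greet():
    print("running")

coro = greet()      # nothing has run yet; we just hold a coroutine object
asyncio.run(coro)   # only now does a scheduler drive it to completion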
Lastly let's talk about C#. Here the origin story is once again entirely different. C# has real threads. Not only does it have real threads, it also has per-object locks and absolutely no problems with dealing with multiple threads running in parallel. But that does not mean that it does not have other issues. The reality is that threads alone are just not enough. You need to synchronize and talk between threads quite often, and sometimes you just need to wait. For instance you need to wait for user input. You still want to do something while you're stuck there processing that input. So over time .NET introduced “tasks”, which are an abstraction over async operations. They are part of the .NET threading system, and the way you interact with them is that you write your code in there and you can suspend from tasks with syntax. .NET will run the task on the current thread, and if you do some blocking, you stay blocked. This is, in that sense, quite different from JavaScript where, while no new “thread” is created, you park the execution in the scheduler. The reason it works this way in .NET is that some of the motivation of this system was to allow UI triggered code to access the main UI thread without blocking it. But the consequence again is that if you block for real, you just screwed something up. That however is also why, at least at one point, what C# did was just to splice functions into chained closures whenever it hit an await. It just decomposes one logical piece of code into many separate functions.
I really don't want to go into Rust, but Rust's async system is probably the weirdest of them all because it's polling-based. In short: unless you actively “wait” for a task to complete, it will not make progress. So the purpose of a scheduler there is to make sure that a task actually can make progress. Why did Rust end up with async/await? Primarily because they wanted something that works without a runtime and a scheduler, and that fits within the limitations of the borrow checker and memory model.
Of all those languages, I think the argument for async/await is the strongest for Rust and JavaScript. Rust because it's a systems language and they wanted a design that works with a limited runtime. JavaScript to me also makes sense because the language does not have real threads, so the only alternative to async/await is callbacks. But for C# the argument seems much weaker. Even the problem of having to force code to run on the UI thread could be solved with a scheduling policy for virtual threads. The worst offender here in my mind is Python. async/await has ended up as a really complex system where the language now has coroutines and real threads, different synchronization primitives for each, and async tasks that end up being pinned to one OS thread. The language even has different futures in the standard library for threads and async tasks!
The reason I wanted you to understand all this is that all these different languages share the same syntax, yet what you can do with it is completely different. What they all have in common is that async functions can only be called by async functions (or the scheduler).
Over the years I heard a lot of arguments about why, for instance, Python ended up with async/await, and some of the arguments presented don't hold up to scrutiny from my perspective. One argument that I have heard repeatedly is that if you control when you suspend, you don't need to deal with locking or synchronization. While there is some truth to that (you don't randomly suspend), you still end up having to lock. There is still concurrency, so you still need to protect all your stuff. In Python this is particularly frustrating because not only do you have colored functions, you also have colored locks. There are locks for threads and there are locks for async code, and they are different.
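A minimal sketch of those colored locks side by side:

import asyncio
import threading

thread_lock = threading.Lock()  # for code running on threads; acquiring blocks
async_lock = asyncio.Lock()     # for coroutines; acquiring must be awaited

def sync_worker():
    with thread_lock:
        ...  # critical section on a thread

async def async_worker():
    async with async_lock:  # using threading.Lock here would block the event loop
        ...  # critical section in a coroutine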
There is a very good reason why I showed the example above of the semaphore: semaphores are real in async programming. They are very often needed to protect a system from taking on too much work. In fact, one of the core challenges that many async/await-based programs suffer from is bloating buffers because there is an inability to exert back pressure (I once again point you to my post on that). Why can they not? Because unless an API is async, it is forced to buffer or fail. What it cannot do, is block.
Async also does not magically solve the issues with the GIL in Python. It does not magically make real threads appear in JavaScript, and it does not solve issues when random code starts blocking (and remember, even memory access can block), or when you very slowly calculate a large Fibonacci number.
I already alluded to this above a few times, but when we think about being able to “suspend” from an arbitrary point in time, we as programmers often immediately think of coroutines. For good reasons: coroutines are amazing, they are fun, and every programming language should have them!
Coroutines are an important building block, and if any future language designer is looking at this post: you should put them in.
But coroutines should be very lightweight, and they can be abused in ways that make it very hard to follow what's going on. Lua, for instance, gives you coroutines, but it does not give you the necessary structure to do something with them easily. You will end up building your own scheduler, your own threading system, etc.
So what we really want is where we started out with: threads! Good old threads!
The irony in all of this is that the language that I think actually got this right is modern Java. Project Loom in Java has coroutines and all the bells and whistles under the hood, but what it exposes to the developer is good old threads. There are virtual threads, which are mounted on carrier OS threads, and these virtual threads can travel from thread to thread. If you end up issuing a blocking call on a virtual thread, it yields to the scheduler.
Now I happen to think that threads alone are not good enough! Threads require synchronization, they require communication primitives etc. Scratch has message passing! So there is more that needs to be built to make them work well.
I want to follow up in another blog post about what is needed to make threads easier to work with. Because what async/await clearly innovated on is bringing some of these core capabilities closer to the user of the language, and often modern async/await code looks easier to read than traditional code using threads.
Lastly I do want to say something nice about async/await and celebrate the innovations that it has brought up. I believe that this language feature singlehandedly drove some crucial innovation in concurrent programming by making it widely accessible. In particular it moved many developers from a basic “single thread per request” model to breaking down tasks into smaller chunks, even in languages like Python. For me, the credit for the biggest innovation here goes to Trio, which introduced the concept of structured concurrency via its nursery. That concept has eventually found a home even in asyncio with the TaskGroup API and is finding its way into Java.
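For readers who have not seen it, here is a minimal sketch of the asyncio flavor (Python 3.11+): the TaskGroup block does not exit until every task spawned inside it has finished, and if one fails the siblings are cancelled.

import asyncio

async def fetch(name: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for real work
    return name

async def main():
    async with asyncio.TaskGroup() as tg:
        t1 = tg.create_task(fetch("a"))
        t2 = tg.create_task(fetch("b"))
    # Both tasks are guaranteed to be done here.
    print(t1.result(), t2.result())

asyncio.run(main())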
I recommend reading Nathaniel J. Smith's Notes on structured concurrency, or: Go statement considered harmful for a much better introduction. However, if you are unfamiliar with it, here is my attempt at explaining it: with structured concurrency, anything you spawn stays tied to the scope that spawned it. The parent does not leave that scope until all of its children have finished or been cancelled, so concurrency stays confined to well-defined blocks instead of leaking tasks that outlive their callers.
I believe that structured concurrency needs to become a thing in a threaded world. Threads must know their parents and children. Threads also need to find convenient ways to pass their success values back. Lastly, context should flow from thread to thread implicitly through context locals.
The second part is that async/await made it much more apparent that tasks / threads need to talk with each other. In particular the concept of channels and selecting on channels became more prevalent. This is an essential building block which I think can be further improved upon. As food for thought: if you have structured concurrency, in principle each thread's return value really can be represented as a buffered channel attached to the thread, holding up to a single value (successful return value or error) that you can select on.
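As a rough sketch of that thought (the helper names are made up), a plain thread plus a one-slot queue already behaves like a thread whose result is a buffered channel you can wait on:

import queue
import threading

def spawn(fn, *args):
    # One-slot queue acting as the thread's result channel.
    result = queue.Queue(maxsize=1)

    def runner():
        try:
            result.put(("ok", fn(*args)))
        except Exception as exc:
            result.put(("err", exc))

    threading.Thread(target=runner).start()
    return result

chan = spawn(sum, [1, 2, 3])
status, value = chan.get()  # blocks until the thread has produced its outcome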
Today, although no language has perfected this model, thanks to many years of experimentation, the solution seems clearer than ever, with structured concurrency at its core.
I hope I was able to demonstrate to you that async/await has been a mixed bag. It brought some relief from callback hell, but it also saddled us with new issues like colored functions and new back-pressure challenges, and it introduced entirely new problems, such as promises that can just sit around forever without resolving. It has also taken away a lot of the utility that call stacks brought, in particular for debugging and profiling. These aren't minor hiccups; they're real obstacles that get in the way of the straightforward, intuitive concurrency we should be aiming for.
If we take a step back, it seems pretty clear to me that we have veered off course by adopting async/await in languages that have real threads. Innovations like Java's Project Loom feel like the right fit here. Virtual threads can yield when they need to, switch contexts when blocked, and even work with message-passing systems that make concurrency feel natural. If we free ourselves from the idea that the functional, promise system has figured out all the problems we can look at threads properly again.
However, at the same time async/await has moved concurrent programming to the forefront and has resulted in real innovation. Making concurrency a core feature of the language (via syntax even!) is a good thing. Maybe the increased adoption, and people struggling with it, is what made structured concurrency a real thing in the Python async/await world.
Future language design should rethink concurrency once more: Instead of adopting async/await, new languages should model themselves more like Java's Project Loom but with more user friendly primitives. But like Scratch, it should give programmers really good APIs that make concurrency natural. I don't think actor frameworks are the right fit, but a combination of structured concurrency, channels, syntax support for spawning/joining/selecting will go a long way. Watch this space for a future blog post about some things I found to work better than others.
[1] | Sentry works with large debug information files such as PDB or DWARF. These files can be gigabytes in size, and we memory map terabytes of preprocessed files into memory during processing. That memory mapped files can block is hardly a surprise, but what we learned in the process is that thanks to containerization and memory limits, you can easily navigate yourself into a situation where you spend much more time on page faults than you expected and the system crawls to a halt. |
2024-11-08 08:00:00
I wrote in the past about how I'm a pessimist that strives for positive outcomes. One of the things that I gradually learned is to wish others success. That is something that took me a long time to learn. I did not see the value in being positive about other people's success, but there is. It's one thing to be sceptical of a project or initiative, but you can still encourage the other person and wish them well.
I think not wishing others well is a coping mechanism of sorts. For sure it was for me. As you become more successful in life, it becomes easier to be supportive, because you have established yourself in one way or another and you feel more secure about yourself.
That said, there is something I continue to struggle with, and that is morals. What if the thing the other person is doing seems morally wrong to me? I believe that much of this struggle stems from the fear of feeling complicit in another's choices. Supporting someone — even passively — can feel like tacit approval, and that can be unsettling. Perhaps encouragement doesn't need to imply agreement. Another angle to consider is that my discomfort may actually stem from my own insecurities and doubts. When someone's path contradicts my values, it can make me question my own choices. This reaction often makes it hard to wish them well, even when deep down I want to.
What if my tribe is just wrong on something? I grew up with the idea of “never again”. Anything that remotely looks like fascism really triggers me. There is a well known propaganda film from the US Army called “Don't Be a Sucker” which warns Americans about the dangers of prejudice, discrimination, and fascist rhetoric. I watched this a few times over the years and it still makes me wonder how people can fall for that kind of rhetoric.
But is it really all that hard? Isn't that happening today again? I have a very hard time supporting what Trump or Musk are standing for, or people that align with them. Trump's rhetoric and plans are counter to everything I stand for, and they remind me a lot of that film. It's even harder for me with Musk. His morals are completely off, he seems to be a person I would not want to be friends with, yet he's successful and he's pushing humanity forward.
It's challenging to reconcile my strong opposition to their (and others') rhetoric and policies with the need to maintain a nuanced view of them. Neither is “literal Hitler”. Equating them with the most extreme historical figures oversimplifies the situation and shuts down productive conversation.
Particularly watching comedy shows reducing Trump to a caricature feels wrong to me. Plenty of his supporters have genuine concerns. I find it very hard to engage with these complexities and it's deeply uncomfortable and quite frankly exhausting.
Life becomes simpler when you just pick a side, but it will strip away the deeper understanding and nuance I want to hold onto. I don't want to fall into the trap of justifying or defending behaviors I fundamentally disagree with, nor do I want to completely shut out the perspectives of those who support him. This means accepting that people I engage with might see things very differently, and that maintaining those relationships and wishing them well requires a level of tolerance I'm not sure I possess yet.
The reason it's particularly hard for me is that even if I accept that my tribe may be wrong in parts, I can see the effects that Trump and others already had on individuals. Think of the Muslim travel ban which kept families apart for years, his border family separation policy, the attempted repeal of Section 230. Some of it was not him, but people he aligned with. Things like the overturning of Roe v. Wade and the effects it had on women, the book bans in Florida, etc. Yes, not quite Hitler, but still deeply problematic for personal freedoms. So I can't ignore the harm that some of these policies have caused in the past, and even if I take the most favorable view of him, I have that track record to hold against him.
In the end where does that leave me? Listening, understanding, and standing firm in my values. But not kissing the ring. And probably coping by writing more.
2024-10-30 08:00:00
Most software that exists today does not forget. Creating software that remembers is easy, but designing software that deliberately “forgets” is a bit more complex. By “forgetting,” I don't mean losing data because it wasn't saved or losing it randomly due to bugs. I'm referring to making a deliberate design decision to discard data at a later time. This ability to forget can be an incredibly beneficial property for many applications. Most importantly, software that forgets enables different user experiences.
I'm willing to bet that your cloud storage or SaaS applications likely serve as dumping grounds for outdated, forgotten files and artifacts. This doesn’t have to be the case.
Older computer software often aimed to replicate physical objects and experiences. This approach (skeuomorphism) was about making digital interfaces feel familiar by resembling older physical objects in appearance and behavior, even though they didn't need to. Ironically, despite focusing on look and feel, skeuomorphism rarely considered some of the hidden affordances of the physical world. Critically, digital software rarely features degradation. Yes, the trash bin was created as an approximation of this, but the bin seemingly did not make it farther than file or email management software. It also does not go far enough.
In the physical world, much of what we create has a natural tendency to decay and that is really useful information. A sticky note on a monitor gathers dust and fades. A notebook fills with notes and random scribbles, becomes worn, and eventually ends up in a cabinet to finally end its life discarded in a bin. We probably all clear out our desk every couple of months, tossing outdated items to keep the space manageable. When I do that, a key part of this is quickly judging how “old” some paper looks. But even without regular cleaning, things are naturally lost or discarded over time on my desk. Yet software rarely behaves this way. I think that’s a problem.
When data is kept indefinitely by default, it changes our relationship with that software. People sometimes may hesitate to create anything in shared spaces for fear of cluttering them, while others might indiscriminately litter them. In file-based systems, this may be manageable, but in shared SaaS applications, everything created (dashboards, notebooks, diagrams) lingers indefinitely and remains searchable and discoverable. This persistence seems advantageous but can quickly lead to more and more clutter.
Adding new data to software is easy. Scheduling it for automatic deletion is a bit harder. Simulating any kind of “visual decay” to hint at age or relevance is rarely seen in today's software, though it wouldn't be all that hard to add. I'm not convinced that the work required to implement any of those things is why it doesn't exist; I think it's more likely that there is a belief that keeping stuff around forever is a benefit over the limitations of the real world.
The reality is that even though the entities we create are sticking around forever, the information contained within them ages badly. Of the 30 odd "test" dashboards that are in our Datadog installation, most of them don't show data any more. The same is true for hundreds of notebooks. We have a few thousand notebooks and quite a few of them at this point are anchored to data that is past the retention period or are referencing metrics that are gone.
In shared spaces with lots of users, few things are intended to last forever. I hope that it will become more popular for software to take age into account more intentionally. For instance, one could start fading out old documents that are rarely maintained or refreshed. I want software to hide old documents, dashboards, etc., and that most critically includes not showing them in search. I don't want to accidentally navigate to old and unused dashboards in the midst of an incident.
Sorting by frequency of use is insufficient to me. Ideally software would embrace an “ephemeral by default” approach. While there's some risk of data loss, you can make the deletion purely virtual (at least for a while). Imagine dashboard software with built-in “garbage collection”: everything created starts with a short time-to-live (say, 30 days), after which it moves to a “to sort” folder. If it's not actively sorted and saved within six months, it's moved to the trash and eventually deleted.
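A rough sketch of that garbage-collection lifecycle, with the numbers from above as placeholders:

from datetime import datetime, timedelta

TTL = timedelta(days=30)          # how long something stays "fresh"
SORT_GRACE = timedelta(days=180)  # how long it sits in "to sort" before trashing

def lifecycle_state(created: datetime, explicitly_saved: bool, now: datetime) -> str:
    if explicitly_saved:
        return "kept"
    age = now - created
    if age < TTL:
        return "fresh"
    if age < TTL + SORT_GRACE:
        return "to sort"
    return "trash"  # eventually deleted for real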
This idea extends far beyond dashboards! Wiki and information management software like Notion could benefit from decaying notes, as the information they hold often becomes outdated quickly. I routinely encounter more outdated pages than current ones. While outright deletion may not be the solution, irrelevant notes and documents showing up in searches add to the clutter and make finding useful information harder. “But I need my data sometimes years later,” I hear you say. What about making it intentional? Archive them in year books. Make me intentionally “dig into the archives” if I really have to. There are many very intentional ways of dealing with this problem.
And even if software does not want to go down that path, I would at least wish for scheduled deletion. I will forget to delete, and I'm lazy, and given the tools available I rarely clean up. Yet many of the things I create I already know I really only need for a week or two. So give me a button I can press to schedule deletion. Then I don't have to remember to clean up after myself a few months later; I can make that call already today, when I create my thing.
2024-10-19 08:00:00
Dedicated to my loving wife.
2024-10-14 08:00:00
This year, one of the projects I was involved in at Sentry was the launch of The Open Source Pledge. The idea behind it is simple: companies pledge an amount proportional to the number of developers they employ to fund the Open Source projects they depend on. I have written about this before.
Since then, I've had the chance to engage in many insightful discussions about Open Source funding and licensing. In the meantime we have officially launched the pledge, and almost simultaneously WordPress entered a crisis. At the heart of that crisis is a clash between Open Source ideals and financial interests by people other than the original creators.
You might have a lot of opinions on David Heinemeier Hansson, but I encourage you to read two of his recent posts on that very topic. In Automattic is doing open source dirty, David lays out the case that Automattic has no right to impose moral obligations beyond the scope of the license. This has been followed by Open source royalty and mad kings, in which he goes deeper into the fallout that Matt Mullenweg (the creator of WordPress) is causing with his fight.
I'm largely in agreement with the posts. However, I want to talk a bit about one pretty significant aspect of David's opinions on Open Source funding (on which these posts appear to be based): the money element. In 2013 David wrote the following about money and Open Source:
[…] it's tempting to cash in on goodwill earned. […] It's a cliché, but once you've sold out, the goodwill might well be spent for good.
[…] part of the reason much of open source is so good, and often so superior to closed-source commercial projects, is the natural boundary of constraints. If you are not being paid or otherwise compensated directly for your work, you're less likely to needlessly embellish it. […]
—David Heinemeier Hansson, The perils of mixing open source and money
At face value, this suggests that Open Source and money shouldn’t mix, and that the absence of monetary rewards fosters a unique creative process. There's certainly truth to this, but in reality, Open Source and money often mix quickly.
If you look under the cover of many successful Open Source projects you will find companies with their own commercial interests supporting them (eg: Linux via contributors), companies outright leading projects they are also commercializing (eg: MariaDB, redis) or companies funding Open Source projects primarily for marketing / up-sell purposes (uv, next.js, pydantic, …). Even when money doesn't directly fund an Open Source project, others may still profit from it, yet often those are not the original creators. These dynamics create stresses and moral dilemmas.
I’ve said this before, but it’s no coincidence that Rails has a foundation, large conferences, a strong core team, and a trademark, while Flask has none of it. There are barriers and it takes a lot of energy and determination to push a project to a level where it can sustain itself.
Rails pushed through this barrier. I never did with any of my projects and I'm at peace with that. I got to learn a lot through my Open Source work, I achieved a certain level of popularity that I benefit from. I built a meaningful career by leveraging my work and I even met my wonderful wife that way. All are consequences of my Open Source contributions. There were clear and indisputable benefits to it and by all accounts I'm a happy and grateful person.
But every now and then doubts creep in, and I wonder if I should have done something more commercial with Flask, or if I should have pushed Rye further. As much as I love listening to Charlie talking about uv, there is also an unavoidable doubt lingering about what could have been if I had dared to build out Rye with funding on my own.
Over the years, I have seen too many of my colleagues and acquaintances struggle one way or another. Psychologically, mentally and professionally. Midlife crises, burnout, health, and dealing with a strong feeling of dread and disappointment. Much of this as an indirect or even direct result of their Open Source work. While projects like Rails and Laravel are great examples of successful open source stewardship, they are also outliers. Many others don't survive or grow to that level.
And yet even some of those lighthouse projects can become fallen stars and face challenges. WordPress by all accounts is a massive success. WordPress is in the top 1% of open source projects in terms of impact, success, and financial return for its creator. Yet despite that — and it finding an actual business model to commercialize it — its creator suffers from the same fate as many small Open Source libraries: a feeling of being wronged.
This is where the lines between law and morality blur. Matt feels mistreated, especially by a private equity firm, but neither trademarks nor license terms can resolve the issue for him. It's a moral question, and sadly, Matt's actions have alienated many who would otherwise support him. He's turning into a “mad king” and behaving immorally in his own ways.
The reality is that we humans are messy and unpredictable. We don't quite know how we will behave until we have been thrown into a particular situation. Open Source walks a very fine line, and anyone claiming to have all the answers probably doesn't. I certainly don't.
Is it wise to mix Open Source and money? Maybe not. Yet I also believe it's a reality we need to navigate. Today there are some projects too small to get any funding (xz), and there are projects large enough to find some way to sustain themselves by funneling money to them (Rails, WordPress).
With the Pledge we target small projects in particular. It's our suggestion for how to give to projects for which the barrier to attracting funding is too high. At the same time I recognize all the open questions it leaves: questions about tax treatment, questions about sustainability and incentives, questions about distribution and governance.
I firmly believe that the current state of Open Source and money is inadequate, and we should strive for a better one. Will the Pledge help? I hope so, for some projects, but WordPress has shown that we need to drive that conversation about money and Open Source forward regardless of the size of the project.