2025-06-21 08:00:00
I'm currently evaluating how different models perform when generating XML versus JSON. Not entirely unexpectedly, XML is doing quite well — except for one issue: the models frequently return invalid XML. That made it difficult to properly assess the quality of the content itself, independent of how well the models serialize data. So I needed a sloppy XML parser.
After a few failed attempts at getting Claude to just fix up my XML parsing in different ways (it tried html5lib, the html lxml parser, etc.), which all resulted in various kinds of amusing failures, I asked Claude to ultrathink and write me a proper XML library from scratch. I gave it some basic instructions on what this should look like and it one-shotted something. Afterwards I prompted it ~20 more times to do various smaller fixes in response to me reviewing it (briefly) and using it, and to create an extensive test suite.
While that was taking place I had 4o create a logo. After that I quickly converted it into an SVG with Illustrator and had Claude make it theme-aware for dark and light modes, which it did perfectly.
On top of that, Claude fully set up CI and even remotely controlled my browser to configure the trusted PyPI publisher for the package for me.
In summary, here is what AI did:
The initial prompt that started it all (including typos):
I want you to implement a single-file library here that parses XML sloppily. It should implement two functions:
- stream_parse which yields a stream of events (use named tuples) for the XML stream
- tree_parse which takes the output of stream_parse and assembles an element tree. It should default to xml.etree.ElementTree and optoinally allow you to provide lxml too (or anything else)
It should be fast but use pre-compiled regular expressions to parse the stream. The idea here is that the output comes from systems that just pretend to speak XML but are not sufficiently protected against bad outoput (eg: LLMs)
So for instance &amp; should turn into & but if &x is used (which is invalid) it should just retain as &x in the output. Additionally if something is an invalid CDATA section we just gracefully ignore it. If tags are incorrectly closed within a larger tag we recover the structure. For instance <foo><p>a<p>b</foo> will just close the internal structures when </foo> comes around.
Use ultrathink. Break down the implementation into
- planning
- api stubs
- implementation
Use sub tasks and sub agents to conserve context
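To make the recovery behavior from the prompt concrete, here is a minimal, self-contained sketch of the idea: a pre-compiled regex tokenizer plus a tree builder that force-closes dangling tags. To be clear, this is not the published library's code or API, just an illustration; attributes, entities and tail text are left out.

import re
import xml.etree.ElementTree as ET

# One pre-compiled pattern: open/close/self-closing tags, or runs of text.
_TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*(/?)>|([^<]+)")

def sloppy_parse(text):
    root, stack = None, []
    for close, name, selfclose, data in _TOKEN.findall(text):
        if data:
            if stack:
                stack[-1].text = (stack[-1].text or "") + data
            continue
        if close:
            # Recovery: force-close anything still open until the matching
            # tag is found, then close that one as well.
            while stack and stack[-1].tag != name:
                stack.pop()
            if stack:
                stack.pop()
            continue
        elem = ET.Element(name)
        if stack:
            stack[-1].append(elem)
        else:
            root = elem
        if not selfclose:
            stack.append(elem)
    return root

print(ET.tostring(sloppy_parse("<foo><p>a<p>b</foo>")))
# b'<foo><p>a<p>b</p></p></foo>'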
Now if you look at that library, you might not find it amazingly beautiful. It probably is a bit messy and might have a handful of bugs I haven't found yet. It however works well enough for me right now for what I'm doing and it definitely unblocked me. In total it worked for about 30-45 minutes to write the initial implementation while I was doing something else. I kept prompting it for a little while to make some progress as I ran into issues using it.
If you want to see what it looks like:
To be clear: this isn't an endorsement of using models for serious Open Source libraries. This was an experiment to see how far I could get with minimal manual effort, and to unstick myself from an annoying blocker. The result is good enough for my immediate use case and I also felt good enough to publish it to PyPI in case someone else has the same problem.
Treat it as a curious side project which says more about what's possible today than what's necessarily advisable.
Postscriptum: Yes, I did slap an Apache 2 license on it. Is that even valid when there's barely a human in the loop? A fascinating question, but not one I'm eager to figure out myself. It is however certainly something we'll all have to confront sooner or later.
2025-06-17 08:00:00
This week I spent time with friends letting agents go wild to see what we could build in 24 hours. I took some notes for myself to reflect on that experience. I won't bore you with another vibecoding post, but you can read Peter's post about how that went.
As fun as it was, it was also frustrating, and in entirely predictable ways. It became a meme how much I hated working with Xcode for this project. It got me thinking: this has been an unacceptable experience for a long time, but with programming agents, the pain becomes measurable.
When I first dove into programming I found the idea of RTFM quite hilarious. “Why are you asking dumb questions, just read it up.” The unfortunate reality is that the manual often doesn't exist — or is wrong. In fact, we as engineers are quite willing to subject each other to completely inadequate tooling, bad or missing documentation and ridiculous API footguns all the time. “User error” is what we used to call this; nowadays it's a “skill issue”. It puts the blame on the user and absolves the creator, at least momentarily. For APIs it can be random crashes if you use a function wrong; for programs it can be an impossible-to-navigate UI or a lack of error messages. There are many different ways in which we humans get stuck.
What agents change about this is that I can subject them to something I wouldn't really want to subject other developers to: measuring. I picked the language for my current project by running basic evals and it worked well. I learned from that that there are objectively better and worse languages when it comes to my particular problem. The choice however is not just about how much the AI knows about the language from the corpus of examples during training. It's also tooling, the inherent capabilities of the language, ecosystem churn and other aspects.
Using agents to measure code quality is great because agents don't judge me, but they do judge the code they are writing. Not all agents will swear, but they will express frustration with libraries when loops don't go well, or they give up. That opens up an opportunity to bring some measurement not to agent performance, but to the health of a project.
We should pay more attention to how healthy engineering teams are, and that starts with the code base. Using agents we can put numbers to it in ways we cannot with humans (or only in a very slow and expensive way). We can figure out how successful agents are at using the things we are creating in rather objective ways, which is in many ways a proxy for how humans experience working with the code. Getting together with fresh souls to walk them through a tutorial or some tasks is laborious and expensive. Getting agents that have never seen a codebase to start using a library is repeatable, rather cheap, fast and, if set up the right way, very objective. It also takes the emotion out of it, and you can run the experiment multiple times.
Now obviously we can have debates over whether the type of code we would write with an agent is objectively beautiful, or whether the way agents execute tools creates the right type of tools. This is a debate worth having. Right at this very moment though, what programming agents need to be successful is rather well aligned with what humans need.
So what works better than other things? For now these are basic indicators, for agents and humans alike:
When an agent struggles, so does a human. There is a lot of code and tooling out there which is objectively not good, but which for one reason or another became dominant. If you want to start paying attention to technology choices, or you want to start writing your own libraries, you can now use agents to evaluate the developer experience.
Because so can your users. I can confidently say it's not just me that does not like Xcode, my agent also expresses frustration — measurably so.
2025-06-12 08:00:00
There is currently an explosion of people sharing their experiences with agentic coding. After my last two posts on the topic, I received quite a few questions about my own practices. So, here goes nothing.
For all intents and purposes, here's what I do: I predominantly use Claude Code with the cheaper Max subscription for $100 a month [1]. That works well for several reasons:
My general workflow involves assigning a job to an agent (which effectively has full permissions) and then waiting for it to complete the task. I rarely interrupt it, unless it's a small task. Consequently, the role of the IDE — and the role of AI in the IDE — is greatly diminished; I mostly use it for final edits. This approach has even revived my usage of Vim, which lacks AI integration.
One caveat: I expect this blog post to age very poorly. The pace of innovation here is insane; what was true a month ago barely holds true today. That's why I'm sticking to concepts I believe have staying power.
If you want to see a small session of me working on an Open Source library with it, I have a recording you can watch.
I disable all permission checks. Which basically means I run claude --dangerously-skip-permissions. More specifically I have an alias called claude-yolo set up. Now you can call that irresponsible, and there are definitely risks with it, but you can manage those risks by moving your dev env into Docker. I will however say that if you can watch it do its thing a bit, it even works surprisingly well without dockerizing. YMMV.
MCP. This is a term you cannot avoid. It is basically a standardized protocol to give agents access to more tools. Honestly: at this point I barely use it, but I do use it. The reason I barely use it is that Claude Code is very capable of just running regular tools. So MCP for me is really only needed if I need to give Claude access to something that it finds too hard to use otherwise. A good example for this is the playwright-mcp for browser automation. I use it because I haven't found anything better yet. But for instance when I want my agent to poke around in my database, I just let it use whatever it finds to be available. In my case it loves to use psql and that's more than good enough.
In general I really only start using MCP if the alternative is too unreliable. That's because MCP servers themselves are sometimes not super reliable, and they are an extra thing that can go wrong. I try to keep things very simple. My custom tools are normal scripts that it just runs.
I've evaluated agent performance across different languages for my workload, and if you can choose your language, I strongly recommend Go for new backend projects. Several factors strongly favor Go:
For comparison, Python — my initial choice — often poses significant challenges. Agents struggle with Python's magic (eg: Pytest's fixture injection) or complex runtime challenges (eg: the wrong event loop when working with async), frequently producing incorrect code that even the agentic loop has trouble resolving. Python also has practical performance problems. I don't mean that it writes slow code, I mean that the agent loop is really slow. That's because the agent loves to spawn processes and test scripts, and it can take quite a while for the interpreter to boot up and initialize the entire application.
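To illustrate the kind of magic I mean (a generic pytest example, not code from my project): the tmp_path fixture below is injected purely by parameter name, with nothing imported or declared in the file, which is exactly the sort of implicit behavior agents tend to misread.

# Nothing in this file says where `tmp_path` comes from; pytest injects it
# by matching the parameter name to a built-in fixture at run time.
def test_writes_config(tmp_path):
    cfg = tmp_path / "config.toml"
    cfg.write_text("debug = true\n")
    assert "debug" in cfg.read_text()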
On the frontend I settled on tailwind, react with tanstack's query and router, as well as vite. I'm not amazingly happy with it, but I found it better than the alternatives. Tailwind and vite are great, no complaints there. Tanstack's file based router does not make me super happy. In part because it likes to have dollar signs in the file names, and those really like to confuse the agent. For instance it's quite common that it tries to edit $param.tsx but edits the file .tsx instead, because it gets confused by shell interpolation. It's a minor thing, but a very annoying one.
Regardless of language, effective tooling is paramount. Key rules:
Example: I place critical tools into a Makefile. A make dev command starts my services via a process manager. I made two key changes to what I normally do:
Here is an example agentic interaction that demonstrates this rather well:
⏺ Great! The formatting looks good. Now let me run the development server to test the dynamic onboarding:

⏺ Bash(make dev)
  ⎿ Error: error: services are already running
     make: *** [dev] Error 1

⏺ The development server is already running. Let me check the logs to see the frontend URL:

⏺ Bash(make tail-log)
  ⎿ 01:06:55 backend | Starting server on localhost:8080
     01:07:39 backend | [Running: sh -c 'cd backend && go run cmd/server/main.go serve']
     01:07:39 backend | Starting server on localhost:8080
     08:45:34 backend | "GET http://localhost:8080/api/users/me HTTP/1.1" from 127.0.0.1:61804 - 200 39.252666ms
     … +96 lines (ctrl+r to expand)

⏺ Perfect! I can see the frontend is running on http://localhost:3000/ and the latest changes have been hot-reloaded. Let me test the dynamic onboarding functionality using playwright:
In general logging is super important. For instance my app currently has a sign-in and register flow that sends an email to the user. In debug mode (which the agent runs in), the email is just logged to stdout. This is crucial! It allows the agent to complete a full sign-in with a remote controlled browser without extra assistance. It knows that emails are being logged thanks to a CLAUDE.md instruction, and it automatically consults the log for the necessary link to click.
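A minimal sketch of that pattern in Python (function and variable names here are made up for illustration; my actual backend is Go): in debug mode the mail only goes to the log, so the agent can pull the sign-in link out of the log output instead of needing a real inbox.

import logging
import os

log = logging.getLogger("mail")

def send_email(to, subject, body):
    # In debug mode (how the agent runs the app) the email is only logged,
    # so a remote controlled browser session can pick the sign-in link out
    # of the log.
    if os.environ.get("DEBUG"):
        log.info("email to=%s subject=%s\n%s", to, subject, body)
        return
    # Real delivery would go here in production.
    ...

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    os.environ["DEBUG"] = "1"
    send_email("user@example.com", "Sign in", "http://localhost:3000/auth?token=...")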
Does it count as a tool? In my book it does.
Agentic coding's inefficiency largely arises from inference cost and suboptimal tool usage. Let me reiterate: quick, clear tool responses are vital. What we did not talk about yet is that some tools are "emergent," temporarily written by agents themselves. Quick compilation and execution significantly boost productivity of the agent. So how can we help it?
With the right instructions, it must be possible for the AI to create a new tool very quickly by following existing conventions. This is necessary because you want the AI to write some new code and run it. There is a big difference in the quality and speed of the flow if that tool takes 3ms to run, versus compiling for 5 seconds, then needing another minute to boot, connect to the database and Kafka broker, and emit 100 lines of nonsensical log output.
If your stuff is indeed slow, then consider vibe-coding a daemon that you can dynamically load stuff into. As an example, Sentry takes too long to reload code and it takes too long to restart. To trial some agentic coding there, my workaround was a module that watches a file system location and just imports and executes all Python modules placed there, then writes the outputs into a log it can cat. That's not perfect, but it was a significant help for the agent when evaluating some basic code in the context of the application.
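Here is a rough sketch of that workaround, with hypothetical paths and no claim of matching the actual module I wrote: watch a drop directory, import and execute any new Python file the agent places there, and append whatever it prints to a log file the agent can cat.

import contextlib
import importlib.util
import time
import traceback
from pathlib import Path

DROP_DIR = Path("/tmp/agent-snippets")      # hypothetical drop location
LOG_FILE = Path("/tmp/agent-snippets.log")  # the agent can just cat this

def run_snippet(path):
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    with LOG_FILE.open("a") as log, contextlib.redirect_stdout(log):
        print(f"--- {path.name} ---")
        try:
            # Executes inside the running process, so the snippet sees the
            # already-initialized application instead of booting its own.
            spec.loader.exec_module(module)
        except Exception:
            traceback.print_exc(file=log)

def watch():
    DROP_DIR.mkdir(exist_ok=True)
    seen = set()
    while True:
        for path in sorted(DROP_DIR.glob("*.py")):
            if path not in seen:
                seen.add(path)
                run_snippet(path)
        time.sleep(1.0)

if __name__ == "__main__":
    watch()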
Balancing log verbosity is crucial: informative yet concise logs optimize token usage and inference speed, avoiding unnecessary costs and rate limits. If you cannot find the balance, provide some easy to turn knobs for the AI to control.
In an ideal setup you get useful log output as a natural byproduct of the agent writing code. Getting observability from the first shot of code generation beats writing code, failing to run it, and only then going back to a debug loop where debug information is added.
Stable ecosystems are what you really want. LLMs are great with Go and they love to use Flask, because those are quite stable ecosystems with little churn. The same thing is true for your codebase. The AI likes to leave all kinds of breadcrumbs lying around when writing code, and those can turn into confusion later. For instance I have seen agents leave useful comments about why they chose one path over another. If you willy-nilly let the AI upgrade libraries so that some of those decisions no longer make sense, you now might have the AI continue following a now outdated pattern.
In theory this should be the same for agents and humans, but the reality is that agents make upgrades so “cheap” that it's tempting to just let the AI do it and see if tests still pass. I do not find this to be a successful path at all. Be even more conservative about upgrades than before.
Likewise with AI I strongly prefer more code generation over using more dependencies. I wrote about why you should write your own code before, but the more I work with agentic coding, the more I am convinced of this.
Simple code significantly outperforms complex code in agentic contexts. I just recently wrote about ugly code and I think in the context of agents this is worth re-reading. Have the agent do “the dumbest possible thing that will work”.
Agents aren't exceptionally fast individually, but parallelization boosts overall efficiency. Find a way to manage shared states like the file system, databases, or Redis instances so that you can run more than one. Avoid them, or find a way to quickly segment stuff out.
Your initial shared state is just the file system and a second check-out will do. But really I don't have an amazing solution yet. There are some good initial attempts. For instance one of the tools to watch is container-use. It's an MCP server that instructs Claude or other agents to run their experiments entirely in Docker.
Then there are tools like Cursor's background agents and Codex which are moving this whole thing into CI, which will be interesting. So far, I don't think this is working for me yet, but let's see again in a month.
Agentic coding alters refactoring priorities. Agents handle tasks effectively until project complexity surpasses some manageable threshold. Too big here is defined by the total amount of stuff the agent has to consider. So for instance you can vibe code your frontend together for a while, but eventually you reach the point where you absolutely need to tell it to make a component library. Why? Because if the total tailwind class mess is splintered across 50 files, you will find it very hard to get the AI to make redesigns or extract components without major regressions.
An agentic workflow encourages good code maintenance and refactoring at the right moment. You don't want to do it too early and you definitely do not want to do it too late.
Agentic coding is rapidly evolving, and my workflow today may look dramatically different tomorrow. What's clear though is that integrating agents into your development process can unlock significant productivity gains. I encourage you to keep experimenting. The tools and techniques will evolve, but the core principles — simplicity, stability, observability and smart parallelization — will remain essential.
Ultimately, the goal is not just to leverage agents to write code faster, but to write better, more maintainable, and resilient code. Already today the code looks nothing like the terrible slop from a few months ago. Stay adaptable, and happy coding!
[1] This is not an advertisement for Claude Code. It's just the agent I use at the moment. What else is there? Alternatives that are similar in their user experience are OpenCode, goose, Codex and many others. There are also Devin and Cursor's background agents, but they work a bit differently in that they run in the cloud.
2025-06-10 08:00:00
I've received quite a bit of feedback on the last thing I wrote about AI, particularly around the idea that I'm too quick to brush aside criticism. Given that Glyph — who I respect a lot — wrote a lengthy piece on why he's largely opting out of AI, with some thoughtfully articulated criticism, I thought it would be a good opportunity to respond.
For this discussion, I'm focusing on AI as a tool for generating text and code — not images, video, or music. My perspective is that there’s a clear difference between utilitarian outputs (code, simple text) and creative outputs that are meant to evoke emotion (art, music, well articulated writings, etc.). For example, when I get an email from a real estate broker, I expect clear information, not art. Similarly, when I add something to a virtual shopping cart, I don’t care how artistic the code is that makes it work. In fact, even today without AI, I'd rather not know.
So, like Glyph, I want to focus on code.
If you read my earlier post, you probably picked up that I see a lot of potential in AI. That hasn't always been my stance, and I intend to remain critical, but right now I'm quite positive about its usefulness. That is in stark contrast to Glyph's experience.
He writes:
My experiences of genAI are all extremely bad, but that is barely even anecdata. Their experiences are neutral-to-positive. Little scientific data exists. How to resolve this?
I can't judge Glyph's experiences, and I don't want to speculate about why they differ from mine. I've certainly had my own frustrations with AI tools.
The difference, I think, is that I've learned over time how to use these tools more effectively, and that's led to better results. For me, it's not just “neutral-to-positive” — it's been astonishingly positive. As I write this, my agent is fixing code in another window for me. I recorded a video of it fixing issues in a library if you want to see what this looks like.
Glyph also argues that adoption is being forced by management and that people would not voluntarily use it:
Despite this plethora of negative experiences, executives are aggressively mandating the use of AI6. It looks like without such mandates, most people will not bother to use such tools, so the executives will need muscular policies to enforce its use.7
This doesn't match what I've seen. In my experience, people are adopting AI on their own, often before their companies are even aware.
Even at Sentry the adoption of AI happened through employees before the company even put money behind it. In fact my memory is that, if anything, only when a growing number of AI invoices showed up in IC expenses did we realize how widespread adoption had become. This was entirely ground-up. Some of my non-techy friends even need to hide their AI usage from their employers, because some companies try to prevent the adoption of AI, yet they pay for it themselves to help with their work. Some of them pay for the expensive ChatGPT subscription even!
Yes, there are companies like Shopify that put AI on their banners and are mandating this, but there are probably many more companies that leverage AI via a secret grassroots adoption.
Glyph makes the point that LLMs reduce the work to code review, which is not the enjoyable part. For me code review is a fact of life and part of the job. That's just what we do as programmers. I don't do it because I want the person that wrote the code to grow and become a better programmer, I do it because I want code to be merged. That does not mean I do not care about the career opportunities or skills of the other person, I do! But that's an effort all on its own. Sometimes it takes place in a code review; most of the time however it happens in a one-on-one setting. The reality is that we're often not in the mindset of wanting personal growth when receiving review comments either.
Now I admit that I do a lot more code review than programming at the moment, but I also find it quite enjoyable. On the one hand because the novelty of a machine programming hasn't worn off yet, on the other hand because it's a very patient recipient of feedback and change requests. You just tell it stuff; you don't spend time thinking about how the other person is going to respond, or whether it's a good idea to nitpick a small thing and put extra load on them. It's quite freeing really, and it does have a different feeling to me than a regular code review.
So is programming still enjoyable if I don't hit the keys? For me, yes. I still write code, just less of it, and it doesn't diminish the satisfaction at all. I'm still in control, and the quality still depends on the effort I put into guiding the tool.
Glyph doesn't talk too much about the economics and the climate impact, but he does mention them. My stance on this is rather simple: margins will erode, there will be a lot of competition, we all will pay for the inference necessary, and someone will make money. Energy usage will go up, but we need more energy even without AI as we're electrifying our cars. AI might change this trajectory slightly, but we had a climate problem before all of this and we have, give or take, the same climate problem until we shift towards more renewable energy. In fact, this new increased energy consumption might actually do us a great service here. Solar is already the cheapest energy solution [1] on the market, and if we need more, that's quite likely the source we will build more of. Particularly now that the cost of energy storage is also going down quickly.
As for copyright and “stealing”: I've always felt that copyright terms are too long, scraping is beneficial, and sharing knowledge is a net positive for humanity. That's what drew me to Open Source in the first place. Glyph argues that scrapers are more aggressive now, but I'm not sure that is actually true. I think there are just more of them. We got so used to it being mostly a handful of search engines doing the scraping, which lowered the cost of it for everyone. I tend to think that more competition is good here and we might just have to accept it for a little while.
I addressed this in my previous article, but I believe LLMs have significant potential to improve learning. Glyph disagrees, partly because of concerns about cheating and that AI will make it worse:
LLMs are making academic cheating incredibly rampant. […] For learning, genAI is a forklift at the gym. […] But it was within those inefficiencies and the inconveniences of the academic experience that real learning was, against all odds, still happening in schools.
I disagree strongly here. This is where I have the most first-hand experience, considering time spent with AI. Since the early days of ChatGPT, I've used LLMs extensively for learning. That's because I'm not great at learning from books, and I have found LLMs to make the process much more enjoyable and helpful to me.
To give you some idea of how useful this can be, here is an excellent prompt that Darin Gordon shared for getting a GPT to act as a teacher of algorithms using the Socratic method: socratic_fp_learning.md. It even works super well if you dumb it down. I had it explain to my son how hash tables work, and I modified the prompt to help him understand entropy. It's surprisingly effective.
Now, that does not do much about the cheating part. But surely in a situation where students cheat, it wasn't about learning in the first place; it was about passing a test. That has little to do with learning and everything to do with performance assessment. When you feel the need to cheat, you probably did not learn something properly in the first place. AI might just make these pre-existing problems more visible, and even Glyph acknowledged that.
AI may complicate things for educators in the near term, but it can also offer real improvements. Either way, education needs reform to adapt to present realities.
Glyph concludes by sharing that the pace of change is overwhelming him and opting out feels like the only sane response. I understand that. The pace of AI advancement can make anyone feel like they're falling behind and I too feel like that sometimes.
I offer a different view: just assume AI will win out and we will see agents! Then the path that takes us to that future is less relevant. Many of the things currently asking for people's attention are going to look different in a few years — or might not even exist any longer. I initially used GitHub Copilot just to move to Cursor, and now I've mostly moved to Claude Code; maybe I will be back with Cursor's background agents in a month. First there was v0, then there was lovable; who knows what there will be in a year. But the path for me is pretty clear: it's going towards me working together with the machine. I find that thought very calming and it takes out the stress. Taking a positive view gives you a form of excited acceptance of the future.
I really don't want to dismiss anyone's concerns. I just feel that, for me, the utility of these tools has become obvious enough that I don't feel the need to argue or justify my choices anymore.
[1] https://en.wikipedia.org/wiki/Cost_of_electricity_by_source
2025-06-04 08:00:00
At the moment I'm working on a new project. Even over the last two months, the way I do this has changed profoundly. Where I used to spend most of my time in Cursor, I now mostly use Claude Code, almost entirely hands-off.
Do I program any faster? Not really. But it feels like I've gained 30% more time in my day because the machine is doing the work. I alternate between giving it instructions, reading a book, and reviewing the changes. If you had told me even just six months ago that I'd prefer being an engineering lead to a virtual programmer intern over hitting the keys myself, I would not have believed it. I can go make a coffee, and progress still happens. I can be at the playground with my youngest while work continues in the background. Even as I'm writing this blog post, Claude is doing some refactorings.
While all this is happening, I've found myself reflecting a lot on what AI means for the world, and I am becoming increasingly optimistic about our future. It's obvious now that we're undergoing a tremendous shift. AI is out of the bottle, and there's no putting it back. Even if we halted all progress today, froze the weights, stopped the training, the systems already out there would still reshape how we live, work, learn, and communicate with one another.
What took longer to accept is just how profound that change really is. As an engineer coming from a world of deterministic things, who deeply values the craft of engineering, the messiness of what agents are doing took a while to digest. It took me a while to even warm up to tool usage by AI in the first place — just two years ago I was convinced AI might kill my wife. In those two years however we've come incredibly far. We have reached the point where, even if we stopped here (and there is no indication we will), AI is already a new substrate for a lot of new innovation, ideas and creations, and I'm here for it. It has moved beyond being a curious tool.
Never before have I seen a technology surface in everyday life so quickly, so widely. Smartphone adoption felt slow in comparison. Today I can't finish a commute or coffee without spotting someone chatting with ChatGPT. I've had conversations with baristas, hairdressers, parents at the playground — people who aren't what I would consider “tech-savvy” — telling me how AI changed the way they write letters, search for recipes, help their kids with homework, or translate documents. The ripple effect is already massive. And still, the majority of the world hasn't touched these tools yet. Entire communities, professions, and economies are yet to begin exploring their transformation.
That's what makes this moment feel so strange — half revolutionary, half prelude. And yet, oddly, there are so many technologists who are holdouts. How could techies reject this change? Thomas Ptacek's piece “My AI Skeptic Friends Are All Nuts” really resonated with me. It takes a humorous stab at the pushback against AI coming from my very own circles. Why is it that so many people I've respected in tech for years — engineers, open source contributors — are the ones most resistant to what's happening? We've built something beyond what we imagined, and instead of being curious, many are dismissive and deny its capabilities. What is that?
Of course the implications are vast and real and the rapid development forces big questions. What does this mean for the education of our children? If AI can teach, explain, and personalize lessons better than a classroom of thirty ever could, what becomes of schools as we know them? And if kids grow up expecting to interact with intelligence — not just absorb content — how do we teach them to reason, create, and collaborate in ways that leverage this power without becoming dependent on it?
On the global stage, there are also ramifications that seem more fundamental than in previous cycles. It does not feel like the rise of search engines or social media, where the rest of the world was satisfied with being a consumer of US infrastructure. This feels more like the invention of the steam engine. Once it existed, there was no future without it. No country could afford to stay on the sidelines. But steam engines also quickly became commoditized and there was plenty of competition among manufacturers. It was just too obvious a technological leap. With AI, every nation, every corporation will want its own models, its own control, its own stake in the future.
And so, as I alternate between delegating tasks to Claude and reading something thoughtful in between, I can't help but feel excited to be here at the beginning of something irreversible and expansive.
I understand why it's tempting to be cynical or fearful. For sure the job of programmers and artists will change, but they won't vanish. I feel like all my skills that I learned as a programmer are more relevant than ever, just with a new kind of tool. Likewise the abundance of AI generated art also makes me so much more determined that I want to hire an excellent designer as soon as I can. People will always value well crafted products. AI might raise the bar for everybody all at once, but it's the act of careful deliberation and very intentional creation that sets you apart from everybody else.
Sure, I may have good personal reasons to be optimistic. But the more time I spend with these tools, the more I believe that optimism is the more reasonable stance for everyone. AI can dramatically increase human agency when used well. It can help us communicate across cultures. It can democratize access to knowledge. It can accelerate innovation in medicine, science, and engineering.
Right now it's messy and raw, but the path is clear: we are no longer just using machines, we are now working with them. And while it's early, I think we'll look back at this decade the way past generations looked at electricity or the printing press — not as a curiosity, but as the moment when everything changed.
I encourage you not to meet that moment with cynicism or fear: meet it with curiosity, responsibility and the conviction that this future will be bright and worth embracing.
2025-03-31 08:00:00
Every ending marks a new beginning, and today is the beginning of a new chapter for me. Ten years ago I took a leap into the unknown; today I take another. After a decade of working on Sentry, I move on to start something new.
Sentry has been more than just a job, it has been a defining part of my life. A place where I've poured my energy, my ideas, my heart. It has shaped me, just as I've shaped it. And now, as I step away, I do so with immense gratitude, a deep sense of pride, and a heart full of memories.
I've known David, Sentry's co-founder (alongside Chris), since long before I was ever officially part of the team; our paths first crossed on IRC in the Django community. Even my first commit to Sentry predates me officially working there by a few years. Back in 2013, over conversations in the middle of Russia — at a conference that, incidentally, also led to me meeting my wife — we toyed with the idea of starting a company together. That exact plan didn't materialize, but the seeds of collaboration had been planted.
Conversations continued, and by late 2014, the opportunity to help transform Sentry (which already showed product market fit) into a much bigger company was simply too good to pass up. I never could have imagined just how much that decision would shape the next decade of my life.
For me, Sentry's growth has been nothing short of extraordinary. At first, I thought reaching 30 employees would be our ceiling. Then we surpassed that, and the milestones just kept coming — reaching a unicorn valuation was something I once thought was impossible. While we may have stumbled at times, we've also learned immensely throughout this time.
I'm grateful for all the things I got to experience and there never was a dull moment. From representing Sentry at conferences, opening an engineering office in Vienna, growing teams, helping employees, assisting our licensing efforts and leading our internal platform teams. Every step and achievement drove me.
Yet for me, the excitement and satisfaction of being so close to the founding of a company, yet not quite a founder, has only intensified my desire to see the rest of it.
Walking away from something you love is never easy and leaving Sentry is hard. Really hard. Sentry has been woven into the very fabric of my adult life. Working on it hasn't just spanned any random decade; it perfectly overlapped with marrying my wonderful wife, and growing our family from zero to three kids.
And will it go away entirely? The office is right around the corner after all. From now on, every morning when I grab my coffee, I will walk past it. The idea of no longer being part of the daily decisions, the debates, the momentum — it feels surreal. That sense of belonging to a passionate team, wrestling with tough decisions, chasing big wins, fighting fires together, sometimes venting about our missteps and discussing absurd and ridiculous trivia, became part of my identity.
There are so many bright individuals at Sentry, and I'm incredibly proud of what we have built together. Not just from an engineering point of view, but also product, marketing and upholding our core values. We developed SDKs that support a wide array of platforms from Python to JavaScript to Swift to C++, lately expanding to game consoles. We stayed true to our Open Source principles, even when other options were available. For example, when we needed an Open Source PDB implementation for analyzing Windows crashes but couldn't find a suitable solution, we contributed to a promising Rust crate instead of relying on Windows VMs and Microsoft's dbghelp. When we started, our ingestion system handled a few thousand requests per second — now it handles well over a million.
While building an SDK may seem straightforward, maintaining and updating SDKs to remain best-in-class over the years requires immense dedication. It takes determination to build something that works out of the box with little configuration. A lot of clever engineering and a lot of deliberate tradeoffs went into the product to arrive where it is. And ten years later, Sentry is a multi-product company. What started with just crashes now lets you send traces, profiles, sessions, replays and more.
We also stuck to our values. I'm pleased that we ran experiments with licensing despite all the pushback we got over the years. We might not have found the right solution yet, but we pushed the conversation. The same goes for our commitment to funding dependencies.
I feel an enormous amount of gratitude for those last ten years. There are so many people I owe thanks to. I owe eternal thanks to David Cramer and Chris Jennings for the opportunity and trust they placed in me. To Ben Vinegar for his unwavering guidance and support. To Dan Levine, for investing in us and believing in our vision. To Daniel Griesser, for being an exceptional first hire in Vienna, and shepherding our office there and growing it to 50 people. To Vlad Cretu, for bringing structure to our chaos over the years. To Milin Desai for taking the helm and growing us.
And most of all, to my wonderful wife, Maria — who has stood beside me through every challenge, who has supported me when the road was uncertain, and who has always encouraged me to forge my own path.
To everyone at Sentry, past and present — thank you. For the trust, the lessons, the late nights, the victories. For making Sentry what it is today.
I'm fully aware it's a gamble to believe my next venture will find the same success as Sentry. The reality is that startups that achieve the kind of scale and impact Sentry has are incredibly rare. There's a measure of hubris in assuming lightning strikes twice, and as humbling as that realization is, it also makes me that much more determined. The creative spark that fueled me at Sentry isn't dimming. Not at all, in fact: it burns brighter, fueled by the feeling that I can explore new things beckoning me. There's more for me to explore, and I'm ready to channel all that energy into a new venture.
Today, I stand in an open field, my backpack filled with experiences and a renewed sense of purpose. That's because the world has changed a lot in the past decade, and so have I. What drives me now is different from what drove me before, and I want my work to reflect that evolution.
At my core, I'm still inspired by the same passion — seeing others find value in what I create, but my perspective has expanded. While I still take great joy in building things that help developers, I want to broaden my reach. I may not stray far from familiar territory, but I want to build something that speaks to more people, something that, hopefully, even my children will find meaningful.
Watch this space, as they say.