2025-07-30 03:00:00
I introduced this family project in the spring of 2024. I won’t reproduce those arguments for why we’re working on this, but in the current climate I feel like I hardly need to. Since that post, our aversion to Google dependency has only grown stronger. Progress has been non-zero but not fast.
Here’s the table, with progress notes below.
Need | Supplier | Alternatives |
---|---|---|
Office | Google Workspace | Proton? |
Data sharing | Dropbox | |
Photos | Google Photos | Dropbox? |
Video meetings | Google Meet | Jitsi, Signal? |
Maps | Google Maps | Magic Earth, Here, something OSM-based? |
Browser | Safari, Firefox, Vivaldi, LibreWolf | |
Search | Bing-based options, Kagi? | |
Chat | Signal | |
Photo editing | Adobe Lightroom & Nik | Capture One, Darktable, ? |
In-car interface | Google Android Auto | Automaker software |
Play my music | Plex, USB | |
Discover music | Qobuz | |
TV | Roku, Apple, migration |
Pink indicates a strong desire to get off the incumbent service, green means we’re happy-ish with what we’re using, and blue means that, happy or not, it’s not near the top of the priority list.
I’ll reproduce the metrics we care about when looking to replace Google products, some combination of:
Not ad-supported
Not VC-funded
Not Google, Microsoft, or Amazon
The list used to include “Open Source” but I decided that while that’s good, it’s less important than the other three criteria.
Now let’s walk down the chart.
This is going to be a wrenching transition; we’ve been running the family on Google stuff forever, and I anticipate muscle-memory pain. But increasingly, using Google apps feels like being in enemy territory. And, as I said last time, I will not be sorry to shake the dust of Google Drive and Docs from my heels; I find them clumsy and am always having trouble finding something that I know is in there.
While I haven’t dug in seriously yet, I keep hearing reasonably-positive things about Proton, and nothing substantive to scare me away. Wish us luck.
Dropbox is, eh, OK. It doesn’t seem actively evil, there’s no advertising, and the price is low.
We’re a four-Android family including a couple of prolific photographers, and everything just gets pumped into Google and then it fills up and then they want more money. If we could configure the phones to skip Google and go straight to Dropbox, that would be a step forward.
Google Meet isn’t painful but I totally suspect it of data-mining what should be private conversations. I’m getting the feeling that the technical difficulty of videoconferencing is going steadily down, so I’m reasonably optimistic that something a little less evil will come along with a fair price.
The fear and loathing that I started feeling in 2017 grows only stronger. But replacements aren’t obvious. It’s a pity, maps and directions and reviews feel like a natural monopoly that should be a public utility or something, rather than a corporate moat.
Chrome has seriously started making my flesh crawl; once again, enemy territory. Fortunately, there are lots of good options; even people like us who have multiple lives we need to keep separate can find enough better browsers out there.
Maybe I’ll have a look at one of the new genAI-company browsers ha ha just kidding.
The reports on Kagi keep being positive and giving it a try is definitely on the To-Do list.
Signal is the only sane choice at this point in history for personal use.
Adobe’s products are good, and I’m proficient and happy with Lightroom, but they are definitely suffering from bad genAI craziness. Also the price is becoming unreasonable.
I’ve had a few Lightroom software failures in recent months and if that becomes a trend, looking seriously at the alternatives will move to the top of the priority list.
It’s tough, Android Auto is a truly great product. I think I’m stuck here for now, particularly given that I plan to be driving a 2019-model-year car for the foreseeable future. Also, it supports my music apps.
Progress here. I’ve almost completely stopped using YouTube Music in favor of Plex and Qobuz. Really no downside; YTM has more or less completely lost the ability to suggest good new stuff.
Video continues morphing itself into Cable TV redux. We have an old Roku box that works fine and I think I’ve managed to find its don’t-spy-on-us settings. We’ll keep subscribing to Apple+ as long as they keep shipping great shows. I have zero regrets about having left Prime behind.
As for the rest, we’ve become migrants, exclusively month-at-a-time subscriptions for the purpose of watching some serial or sports league, unsubscribe after the season finale or championship game. The good news is that I haven’t encountered much friction in unsubscribing, just a certain amount of earnest pleading.
I have yet to confront any of the really hard parts of this project, but the sense of urgency is increasing. Let’s see.
2025-07-22 03:00:00
I just posted a big Quamina PR representing months of work, brought on by the addition of a small basic regular-expression feature. This ramble doesn’t exactly have a smooth story arc but I’m posting it anyhow because I know there are people out there interested in state-machine engineering and they are my people.
As far as I can tell, a couple of the problems I’m trying to solve haven’t been addressed before, at least not by anyone who published their findings. Partly because of that, I’m starting to wonder if all these disorderly Quamina postings might be worked into a small book or monograph or something. State machines are really freaking useful software constructs! So yeah, this is a war story not an essay, but if you like finite automata you’ll likely be interested in bits of it.
Prior to beginning work on Regular Expressions, I’d already wired shell-style “*” wildcards into Quamina, which forced me to start working with NFAs and ε-transitions. The implementation wasn’t crushingly difficult, and the performance was… OK-ish.
Which leads me to The Benchmark From Hell.
I wondered how the wildcard functionality would work under heavy stress, so I pulled in a list of 12,959 five-letter strings from the Wordle source code, and inserted a “*” at a random position in each. Here are the first ten:
aalii* *aargh aar*ti abaca* a*baci a*back ab*acs ab*aft abak*a
I created an NFA for each and merged them together as described here. Building and merging the automata were plenty fast enough, and the merged NFA had 46,424 states, which felt reasonable. Matching strings against it ran at under ten thousand per second, which is pretty poor given that Quamina can do a million or two per second on patterns encoded in a DFA.
But, I thought, still reasonably usable.
Last year, my slow grind through the regexp features had led me to the zero-or-one quantifier “?”. The state machine for these things is not rocket science; there’s a discussion with pictures in my recent Epsilon Wrangling.
So I implemented that and fired off the unit tests, most of which I didn’t have to write, and they all failed. Not a surprise I guess.
It turned out that the way I’d implemented ε-transitions for the wildcards was partially wrong, as in it worked for the tight-loop state-to-itself ε-transitions, but not for more general-purpose things like what “?” requires.
In fact, it turns out that merging NFAs is hard (DFAs are easy), and I found precious little help online. Thompson’s construction does give an answer: Make an otherwise-empty state with two ε-transitions, one to each of the automata, and it’ll do the right thing. Let’s call that a “splice state”. It’s easy to implement, so I did. Splicing is hardly “merging” in the Quamina sense, but still.
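The splice idea above is tiny in code. Here’s a hedged Go sketch using my own minimal state type (not Quamina’s actual types): a fresh state with one ε-transition to each automaton’s start state.

```go
package main

import "fmt"

// faState is a hypothetical, minimal NFA state: byte transitions
// plus ε-transitions, and a goal flag.
type faState struct {
	transitions map[byte]*faState
	epsilons    []*faState
	goal        bool
}

// splice joins two NFAs per Thompson: an otherwise-empty state with
// one ε-transition to each automaton's start state.
func splice(a, b *faState) *faState {
	return &faState{epsilons: []*faState{a, b}}
}

func main() {
	// Two one-symbol automata: one matches "A", the other "B".
	matchA := &faState{transitions: map[byte]*faState{'A': {goal: true}}}
	matchB := &faState{transitions: map[byte]*faState{'B': {goal: true}}}
	s := splice(matchA, matchB)
	fmt.Println(len(s.epsilons)) // 2: one ε-path into each automaton
}
```

As the text says, this “does the right thing” semantically; the trouble described next is what happens when thousands of these splice states pile up into long ε-chains.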
Unfortunately, the performance was hideously bad, just a few matches per second while pegging the CPU. A glance at the final NFA was sobering; endless chains of splice states, some thousands long.
At this point I became very unhappy and got stalled for months dealing with real-life issues while this problem lurked at the back of my mind, growling for attention occasionally.
Eventually I let the growler out of the cave and started to think through approaches. But first…
Is it, really? What sane person is going to want to search for the union of thousands of regular expressions in general or wild-carded strings in particular?
I didn’t think about this problem at all, because of my experience with Quamina’s parent, Ruler. When it became popular among several AWS and Amazon teams, people sometimes found it useful to match the union of not just thousands but a million or more different patterns. When you write software that anyone actually uses, don’t expect the people using it to share your opinions on what is and isn’t reasonable. So I wasn’t going to get any mental peace until I cracked this nut.
I eventually decided that three approaches were worth trying:
Figure out a way really to merge, not just splice, the wildcarded patterns, to produce a simpler automaton.
Optimize the NFA-traversal code path.
Any NFA can be transformed into a DFA, says computer-science theory. So do that, because Quamina is really fast at DFA-based matching.
I ended up doing all of these things and haven’t entirely given up on any of them. The most intellectually-elegant was the transform-to-DFA approach, because if I did that, I could remove the fairly-complex NFA-traversal logic from Quamina.
It turns out that the Net is rich with textbook extracts and YouTubes and slide-shows about how to do the NFA-to-DFA conversion. It ended up being quite a pleasing little chunk of code, only a couple hundred lines.
The bad news: Converting each individual wildcard NFA to a DFA was amazingly fast, but then as I merged them in one by one, the number of automaton states started increasing explosively and the process slowed down so much that I never had the patience to let it finish. Finite-automata theory warns that this can happen, but it’s hard to characterize the cases where it does. I guess this is one of them.
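For readers who haven’t seen it, the textbook subset construction really is a pleasing little chunk of code. This is a hypothetical sketch of mine (integer states, accept-state bookkeeping omitted), not Quamina’s actual converter; it also makes the explosion risk visible, since every distinct set of NFA states can become its own DFA state.

```go
package main

import (
	"fmt"
	"sort"
)

// A minimal hypothetical NFA: integer states, symbol transitions, ε-transitions.
type nfa struct {
	trans map[int]map[byte][]int // state → symbol → next states
	eps   map[int][]int          // state → ε-targets
}

// closure returns the ε-closure of a set of states, sorted for canonical keys.
func closure(n *nfa, states []int) []int {
	seen := map[int]bool{}
	var walk func(int)
	walk = func(s int) {
		if seen[s] {
			return
		}
		seen[s] = true
		for _, e := range n.eps[s] {
			walk(e)
		}
	}
	for _, s := range states {
		walk(s)
	}
	out := []int{}
	for s := range seen {
		out = append(out, s)
	}
	sort.Ints(out)
	return out
}

func key(states []int) string { return fmt.Sprint(states) }

// nfa2Dfa does the subset construction: each DFA state is an ε-closed
// set of NFA states, keyed by its canonical string form.
func nfa2Dfa(n *nfa, start int) map[string]map[byte]string {
	dfa := map[string]map[byte]string{}
	work := [][]int{closure(n, []int{start})}
	for len(work) > 0 {
		cur := work[0]
		work = work[1:]
		k := key(cur)
		if _, done := dfa[k]; done {
			continue
		}
		dfa[k] = map[byte]string{}
		next := map[byte][]int{}
		for _, s := range cur {
			for sym, targets := range n.trans[s] {
				next[sym] = append(next[sym], targets...)
			}
		}
		for sym, targets := range next {
			t := closure(n, targets)
			dfa[k][sym] = key(t)
			work = append(work, t)
		}
	}
	return dfa
}

func main() {
	// A?X as an NFA: 0 -A→ 1, 0 -ε→ 1, 1 -X→ 2.
	n := &nfa{
		trans: map[int]map[byte][]int{0: {'A': {1}}, 1: {'X': {2}}},
		eps:   map[int][]int{0: {1}},
	}
	fmt.Println(len(nfa2Dfa(n, 0))) // 3 DFA states for this little machine
}
```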
Having said that, I haven’t discarded the nfa2Dfa code, because perhaps I ought to offer a Quamina option to apply this if you have some collection of patterns that you want to run really super fast and are willing to wait a while for the transformation process to complete. Also, I may have missed opportunities to optimize the conversion; maybe it’s making more states than it needs to?
Recently in Epsilon Wrangling I described how NFA traversal has to work, relying heavily on implementing a thing called an ε-closure.
So I profiled the traversal process and discovered, unsurprisingly, that most of the time was going into memory allocation while computing those ε-closures. So now Quamina has an ε-closure cache and will only compute each one once.
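The caching idea is simple: since a state’s ε-closure never changes once the automaton is built, memoize it per state. This is an illustrative sketch with my own minimal types, not Quamina’s actual cache:

```go
package main

import "fmt"

// faState: a minimal hypothetical NFA state, ε-transitions only.
type faState struct {
	epsilons []*faState
}

// closureCache memoizes each state's ε-closure so it's computed at most once.
type closureCache struct {
	memo map[*faState][]*faState
}

func newClosureCache() *closureCache {
	return &closureCache{memo: map[*faState][]*faState{}}
}

// closure returns the ε-closure of s, from the cache when possible.
func (c *closureCache) closure(s *faState) []*faState {
	if cl, ok := c.memo[s]; ok {
		return cl // computed once, reused forever
	}
	seen := map[*faState]bool{}
	var cl []*faState
	var walk func(*faState)
	walk = func(st *faState) {
		if seen[st] {
			return
		}
		seen[st] = true
		cl = append(cl, st)
		for _, e := range st.epsilons {
			walk(e)
		}
	}
	walk(s)
	c.memo[s] = cl
	return cl
}

func main() {
	// q0 -ε→ q1 -ε→ q2, so the closure of q0 is {q0, q1, q2}.
	q2 := &faState{}
	q1 := &faState{epsilons: []*faState{q2}}
	q0 := &faState{epsilons: []*faState{q1}}
	c := newClosureCache()
	fmt.Println(len(c.closure(q0))) // 3
}
```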
This helped a lot but not nearly enough, and the profiler was still telling me the pain was in Go’s allocation and garbage-collection machinery. Whittling away at this kind of stuff is not rocket science. The standard Go trick I’ve seen over and over is to keep all your data in slices, re-using them by chopping them back to [:0] for each request. After a while they’ll have grown to the point where all the operations are just copying bytes around, no allocation required.
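The [:0] trick in miniature: reslicing to zero length keeps the backing array’s capacity, so once the slice has grown to its working size, later requests append into the same memory. A small illustrative sketch (hypothetical names, not Quamina code):

```go
package main

import "fmt"

// worker keeps a reusable scratch slice. Chopping it back to buf[:0]
// preserves capacity, so after warm-up, appends stop allocating.
type worker struct {
	buf []int
}

func (w *worker) process(n int) []int {
	w.buf = w.buf[:0] // drop contents, keep capacity
	for i := 0; i < n; i++ {
		w.buf = append(w.buf, i)
	}
	return w.buf
}

func main() {
	var w worker
	w.process(1000) // first request grows the slice
	before := cap(w.buf)
	w.process(500) // second request reuses the same backing array
	fmt.Println(cap(w.buf) == before) // true: no new allocation
}
```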
Which also helped, but the speed wasn’t close to what I wanted.
I coded multiple ways to do this, and they kept failing. But I eventually found a way to build those automata so that any two of them, or any one of them and a DFA, can be merged to generate dramatically fewer ε-transition chains. I’m not going to write this up here for two reasons: First of all, it’s not that interesting, and second, I worry that I may have to change the approach further as I go on implementing new regexp operators.
In particular, at one point I was looking at the code while it wasn’t working, and I could see that if I added a particular conditional it would work, but I couldn’t think of a principled reason to do it. Obviously I’ll have to sort this out eventually. In the meantime, if you’re the sort of, um, special person who is now burning with curiosity, check out my branch from that PR and have a look at the spinout type.
Anyhow, I added that conditional even though it puzzled me a bit, and now you can add wildcard patterns to Quamina at 80K/second, and my 12.9K wildcards generate an NFA with almost 70K states, which can scan events at almost 400K/second. And that’s good enough to ship the “?” feature.
By the way, I tried feeding that 70K-state automaton to the DFA converter, and gave up after it’d burned an hour of CPU and grown to occupy many GB of RAM.
Add “+” and “*”, and really hope I don’t have to redesign the NFA machinery again.
Also, figure out the explanation for that puzzling if statement.
Despite the very narrow, not to say obsessive, focus of this series, I’ve gotten a few bits and pieces of positive feedback. So there are a few people out there who care about this stuff. To all of you, thanks.
2025-07-10 03:00:00
I just came back from Canada’s only rectangular province. I was there to help out my 95-year-old mother while her main caregiver took vacation. It’s an unhappiness that my family has splashed itself across Canada in such a way that we have to get on an airplane (or take drives measured in days) to see each other, but that’s where we are. I came back with pictures and stories.
Let me set the stage with a couple of photos. Everyone knows that Saskatchewan is flat and brown and empty, right?
Mom lives in Regina, the provincial capital, a city built round a huge park that contains the Legislature (the flowers are from its front lawn), a sizeable lake, and an artificial mini-mountain (the water and trees are from its tip). Have no fear, I’ll get to some no-kidding prairie landscapes.
The night I arrived, after my Mom went to bed she got up again, tripped on something and fell hard. Her right arm was swollen, bruised, and painful. The skin and adjacent blood vessels of very old people become thin and fragile; her whole forearm was a bruise. I tried to get her to go to Emergency but she wasn’t having any of it: “You wait for hours and then they give you a pain-killer, which is constipating.” Since she could twist her wrist and wiggle her fingers and give my hand a firm grasp, I didn’t push too hard.
A couple days later on Saturday she got her regular twice-a-week visit from the public HomeCare nurse, a friendly and highly competent Nigerian immigrant, to check her meds and general condition. She looked at Mom’s wrist and said “Get her an appointment with her doctor, they’ll probably want an X-Ray.”
I called up her doctor at opening time Monday. The guy who answered the phone said “Don’t have any appointments for a couple weeks but come on over, we’ll squeeze her in.” So we went in after morning coffee and waited less than an hour. The doctor looked at her arm for 45 seconds and said “I’m writing a prescription for an X-Ray” and there was a radiologist around the corner and she was in ten minutes later. The doctor called me back that afternoon and said “Your mother’s got a broken wrist, I got her an 8AM appointment tomorrow at Regina General’s Cast Clinic.”
The doctor at the clinic looked at her wrist for another 45 seconds and said “Yeah, put on a cast” so they did and we were home by ten. I’d pessimistically overpaid a couple bucks for hospital parking.
The reason I’m including this is that I notice this space has plenty of American readers. Did you notice that the story entirely omits insurance companies and money (except parking)? In Canada your health-care comes with your taxes (granted, higher than Americans’) and while the system is far from perfect, it can fix up an old lady’s broken wrist pretty damn fucking quick without any bureaucratic bullshit. Also, Canada spends a huge amount less per head on health-care than the US does.
And Mom told me not to forget that Saskatchewan is the birthplace of Canadian single-payer universal healthcare. Tommy Douglas, the Social Democrat who made that happen, has been named The Greatest Canadian.
Oh, did I say “flat and brown and empty”? Wrong, wrong, and wrong. The Prairies, in Canada and the US too, have textures and colors and hills and valleys, it’s just that the slopes are gentle. There are really flat parts and they make farmers’ lives easier, but more or less every square inch that’s not a town or a park is farmed. I took Mom for a drive out in the country southeast of Regina, from whence these views:
Note that in both shots we’re looking up a gentle slope. In the second, there’s farm infrastructure on
the distant horizon.
Also consider the color of the sky.
In Canada that yellow-flowering crop is called “Canola”, which Wikipedia claims refers to a particular cultivar of Brassica napus, commonly known as rapeseed or just rape, so you can see why when Canada’s agribiz sector wanted to position its oil as the thing to use while cooking they went for the cultivar not the species name. I’m old enough to remember when farmers still said just “rapeseed”. Hmm, Wikipedia also claims that the OED claims this: The term “rape” derives from the Latin word for turnip, rāpa or rāpum, cognate with the Greek word ῥάφη, rhaphe.
Let’s stick with canola.
After I’d taken those two canola-field shots I pulled out my Pixel and took another, but I’m not gonna share it because the Pixel decided to turn the sky from what I thought was a complex and interesting hue into its opinion of “what a blue sky looks like” only this sky didn’t.
Maybe it’s just me, but I think Google’s camera app is becoming increasingly opinionated about color, and not in a good way. There are plenty of alternative camera apps, I should check them out.
In case it’s not obvious, I love photographing Saskatchewan and think it generally looks pretty great, especially when you look up. On the province’s license plates it says “Land of living skies”, and no kidding.
The first two are from the park behind Mom’s place,
the third from that mini-mountain mentioned
above.
My Mom’s doing well for a nonagenarian. She’s smart. When I visited early last fall and we talked about the US election I was bullish on Kamala Harris’s chances. She laughed at me and said “The Americans won’t elect a woman.” Well then.
But she’s forgetful in the short term. I took her to the Legislature’s garden and to the top of the mini-mountain and for a drive out in the country and another adventure we’ll get to; she enjoyed them all. But maybe she won’t remember them.
“Make memories” they say, but what if you show someone you love a good time and maybe they won’t remember it the next day? I’m gonna say it’s still worthwhile and has a lesson to teach about what matters. There endeth the lesson.
Indigenous people make up 17% of Regina’s population, the highest share in any significant Canadian city. By “indigenous” I mean the people that my ancestors stole the land from. It’s personal with me; around 1900, my Dad’s family, Norwegian immigrants, took over some pretty great farmland southeast of Edmonton by virtue of “homesteading”, such a nice word isn’t it?
Regina tries to honor its indigenous heritage, and my favorite expression of that is its Mackenzie Art Gallery, a lovely welcoming space in the T.C. Douglas building (for “T.C.” read “Tommy”; did I mention him?). Mom and I walked around it and had lunch in its very decent café.
Every time I’ve been there the big exhibitions in the big rooms have been indigenous-centered, and generally excellent. I try to go every time I visit and I’ve never been disappointed.
In 2025, anything I have to say about this piece would be superfluous.
I love modern-art galleries, especially with big rooms full of big pieces, even if I don’t like all the art. Because it feels good to be in the presence of the work of people who are pouring out what they have to offer, especially at large scale. If the task wasn’t hard enough that failures are common then it wouldn’t be worthwhile, would it?
They’re especially great when there’s someone I love there enjoying it with me. Here’s Mom.
These days, any visit might be the last. I hope this wasn’t.
2025-07-08 03:00:00
I haven’t shipped any new features for Quamina in many months, partly due to a flow of real-life distractions, but also I’m up against tough performance problems in implementing Regular Expressions at massive scale. I’m still looking for a breakthrough, but have learned things about building and executing finite automata that I think are worth sharing. This piece has to do with epsilons; anyone who has studied finite automata will know about them already, but I’ll offer background for those people to skip.
I’ve written about this before in Epsilon Love. A commenter pointed out that the definition of “epsilon” in that piece is not quite right per standard finite-automata theory, but it’s still useful in that it describes how epsilons support constructs like the shell-style “*”.
Finite automata come in two flavors: Deterministic (DFA) and Nondeterministic (NFA). DFAs move from state to state one input symbol at a time: it’s simple and easy to understand and to implement. NFAs have two distinguishing characteristics: First, when you’re in a state and an input symbol arrives, you can transfer to more than one other state. Second, a state can have “epsilon transitions” (let’s say “ε” for epsilon), which can happen any time at all while you’re in that state, input or no input.
NFAs are more complicated to traverse (will discuss below) but you need them if you want to implement regular expressions with “.” and “?” and “*” and so on. You can turn any NFA into a DFA, and I’ll come back to that subject in a future piece.
For implementing NFAs, I’ve been using Thompson’s construction, where “Thompson” is Ken Thompson, co-parent of Unix. This technique is also nicely described by Russ Cox in Regular Expression Matching Can Be Simple And Fast. You don’t need to learn it to understand this piece, but I’ll justify design choices by saying “per Thompson”.
I’m going to discuss two specific issues today, ε-closures and a simpler NFA definition.
To set the stage, consider this regexp: A?B?C?X
It should match “X” and “BX” and “ACX” and so on, but not “CAX” or “XX”.
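You can sanity-check those semantics with Go’s standard regexp package, anchoring the pattern so the whole input has to match:

```go
package main

import (
	"fmt"
	"regexp"
)

// Anchored so the entire string must match, as in the discussion above.
var re = regexp.MustCompile(`^A?B?C?X$`)

func main() {
	for _, s := range []string{"X", "BX", "ACX", "CAX", "XX"} {
		fmt.Println(s, re.MatchString(s))
	}
	// Prints true for X, BX, and ACX; false for CAX and XX.
}
```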
Thompson says that you implement A? with a transition to the next state on “A” and another ε-transition to that next state: if you see an “A” you should transition, but you can also transition even if you don’t.
The resulting NFA looks like this:
In finite-automaton math, states are usually represented by the letter “q” followed by a number (usually italicized and subscripted, like q0, but not here, sorry). Note q4’s double circle, which means it’s a goal state, i.e. if we get here we’ve matched the regexp. I should add that this was produced with draw.io, which seems to make this sort of thing easy.
So, here’s a challenge: Sketch out the traversal code in your head. Think about the input strings “AX” and “BCX” and just “X” and how you’d get through the NFA to the q4 goal state.
The trick is what’s called the ε-closure. When you get to a state, before you look at the next input symbol, you have to set up to process it. In this case, you need to be able to transition on an A or B or C. So what you do is pull together the start state q0 and also any other states you can reach from there through ε-transitions. In this case, the ε-closure for the start state is {q0, q1, q2, q3}.
Suppose, then, that you see a “B” input symbol. You apply it to all the states in the ε-closure. Only q1 matches, transitioning you to q2. Before you look at the next input symbol, you compute the ε-closure for q2, which turns out to be {q2, q3}. With this ε-closure, you can match “C” or “X”. If you get a “C”, you’ll step to q3, whose ε-closure is just itself, because “X” is the only path forward.
So your NFA-traversal algorithm for one step becomes something like:
Start with a list of states.
Compute the ε-closure of that list.
Read an input symbol.
For each state in the ε-closure, see if you can traverse to another state.
If so, add it to your output list of states.
When you’re done, your output list of states is the input to this algorithm for the next step.
Suppose your regular expression is (A+BC?)+. I’m not going to sketch out the NFA, but just looking at it tells you that it has to have loopbacks; once you’ve matched the parenthesized chunk you need to go back to a state where you can recognize another occurrence. For this regexp’s NFA, computing the ε-closures can lead you into an infinite loop. (Should be obvious, but I didn’t realize it until after the first time it happened.)
You can have loops and you can also have dupes. In practice, it’s not that uncommon for a state to have more than one ε-transition, and for the targets of these transitions to overlap.
So you need to watch for loops and to dedupe your output. I think the only way to avoid this is with a cookie-crumbs “where I’ve been” trail, either as a list or a hash table.
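Putting the one-step algorithm and the cookie-crumbs trail together, a loop-safe, deduped sketch might look like this. These are my own minimal types for illustration, not Quamina’s; the “seen” map is the where-I’ve-been trail that terminates ε-loops and drops duplicates.

```go
package main

import "fmt"

// faState: a minimal hypothetical NFA state.
type faState struct {
	transitions map[byte]*faState
	epsilons    []*faState
}

// epsilonClosure walks ε-transitions with a "where I've been" set, so
// ε-loops terminate and each reachable state appears once in the output.
func epsilonClosure(states []*faState) []*faState {
	seen := map[*faState]bool{}
	var out []*faState
	var walk func(*faState)
	walk = func(s *faState) {
		if seen[s] {
			return // loop or dupe: already visited
		}
		seen[s] = true
		out = append(out, s)
		for _, e := range s.epsilons {
			walk(e)
		}
	}
	for _, s := range states {
		walk(s)
	}
	return out
}

// step is one round of the traversal algorithm above: close over ε,
// then follow the input symbol from every state in the closure.
func step(states []*faState, symbol byte) []*faState {
	var next []*faState
	for _, s := range epsilonClosure(states) {
		if t, ok := s.transitions[symbol]; ok {
			next = append(next, t)
		}
	}
	return next
}

func main() {
	// a and b are joined in an ε-loop (a ⇄ b); only b consumes 'X'.
	goal := &faState{}
	a := &faState{}
	b := &faState{transitions: map[byte]*faState{'X': goal}}
	a.epsilons = []*faState{b}
	b.epsilons = []*faState{a}
	fmt.Println(len(step([]*faState{a}, 'X'))) // 1, with no infinite loop
}
```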
Both of these are problematic because they require allocating memory, and that’s something you really don’t want to do when you’re trying to match patterns to events at Quamina’s historic rate of millions per second.
I’ll dig into this problem in a future Quamina-Diary outing, but obviously, caching computed epsilon closures would avoid re-doing this computation.
Anyhow, bear ε-closures in mind, because they’ll keep coming up as this series goes on.
At the top of this piece, I offered the standard definition of NFAs: First, when you’re in a state and an input symbol arrives, you can transfer to more than one other state. Second, you can have ε-transitions. Based on my recent work, I think this definition is redundant. Because if you need to transfer to two different states on some input symbol, you can do that with ε-transitions.
Here’s a mini-NFA that transfers from state q0 on “A” to both q1 and q2.
And here’s how you can achieve the same effect with ε-transitions:
In that NFA, in qS the “S” stands for “splice”, because it’s a state that exists to connect two threads of finite-automaton traversal.
I’m pretty sure that this is more than just a mathematical equivalence. In my regexp implementation, so far at least, I’ve never encountered a need to do that first kind of dual transition. Furthermore, the “splice” structure is how Thompson implements the regular-expression “|” operator.
So if you’re building an NFA, all the traversal stuff you need in a state is a simple map from input symbol to next state, and a list of ε-transitions.
How my own implementation of NFA traversal collided head-on into the Benchmark From Hell and still hasn’t recovered.
2025-07-07 03:00:00
Last week I published a featherweight narrative about applying GenAI in a real-world context, to a tiny programming problem. Now I’m regretting that piece because I totally ignored the two central issues with AI: What it’s meant to do, and how much it really costs.
The most important fact about genAI in the real world is that there’ve been literally hundreds of billions of dollars invested in it; that link is just startups, and ignores a comparable torrent of cash pouring out of Big Tech.
The business leaders pumping in all this money of course don’t understand the technology. They’re doing this for exactly one reason: They think they can discard armies of employees and replace them with LLM services, at the cost of shipping shittier products. Do you think your management would spend that kind of money to help you with a quicker first draft or a summarized inbox?
Adobe said the quiet part out loud: Skip the Photoshoot.
At this point someone will point out that previous technology waves have generated as much employment as they’ve eliminated. Maybe so, but that’s not what business leaders think they’re buying. They think they’re buying smaller payrolls.
Maybe I’m overly sensitive, but thinking about these truths leads to a mental stench that makes me want to stay away from it.
Well, I already mentioned all those hundreds of billions. But that’s pocket change. The investment community in general and Venture Capital in particular will whine and moan, but the people who are losing the money are people who can afford to.
The first real cost is hypothetical: What if those business leaders are correct and they can gleefully dispose of millions of employees? If you think we’re already suffering from egregious levels of inequality, what happens when a big chunk of the middle class suddenly becomes professionally superfluous? I’m no economist so I’ll stop there, but you don’t have to be a rocket scientist to predict severe economic pain.
Then there’s the other thing that nobody talks about, the massive greenhouse-gas load that all those data centers are going to be pumping out. This at a time when we blow past one atmospheric-carbon metric after another and David Suzuki says the fight against climate change is lost, that we need to hunker down and work on survival at the local level.
It’s the people who are pushing it. Their business goals are quite likely, as a side-effect, to make the world a worse place, and they don’t give a fuck. Their technology will inevitably worsen the onrushing climate catastrophe, and they don’t give a fuck.
It’s probably not as simple as “They’re just shitty people” — it’s not exactly easy to escape the exigencies of modern capitalism. But they are people who are doing shitty things.
Sorry, I’m having trouble even thinking about that now.
2025-07-02 03:00:00
At the moment, we have no idea what the impact of genAI on software development is going to be. The impact of anything on coding is hard to measure systematically, so we rely on anecdata and the community’s eventual consensus. So, here’s my anecdata. Tl;dr: The AI was not useless.
My current work on Quamina involves dealing with collections of finite-automata states, which, in the Go programming language, are represented as slices of pointers to state instances:
[]*faState
The problem I was facing was deduping them, so that there would be only one instance corresponding to any particular collection. This is what, in Java, the intern() call does with strings.
The algorithm isn’t rocket science:
Dedupe the states, i.e. turn the collection into a set.
For each set of states, generate a key.
Keep a hash table of sets around, and use the key to see whether you’ve already got such a set, and if so return it. Otherwise, make a new entry in the hash table and return that.
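The three steps above can be sketched in a few lines of Go. This is my own hypothetical toy version, not Quamina’s code and not Claude’s: it keys the hash table on sorted, formatted pointer addresses, roughly the lazy approach described below.

```go
package main

import (
	"fmt"
	"sort"
)

type faState struct{ id int }

// interned is the hash table of canonical sets, keyed by a string built
// from the members' addresses.
var interned = map[string][]*faState{}

// intern returns one canonical, deduped set for any slice of state
// pointers with the same membership.
func intern(states []*faState) []*faState {
	// 1. Dedupe the states into a set.
	set := map[*faState]bool{}
	for _, s := range states {
		set[s] = true
	}
	unique := make([]*faState, 0, len(set))
	for s := range set {
		unique = append(unique, s)
	}
	// 2. Generate a canonical key: sort by formatted address, then join.
	sort.Slice(unique, func(i, j int) bool {
		return fmt.Sprintf("%p", unique[i]) < fmt.Sprintf("%p", unique[j])
	})
	key := ""
	for _, s := range unique {
		key += fmt.Sprintf("%p,", s)
	}
	// 3. Return the existing canonical copy, or record this one.
	if canon, ok := interned[key]; ok {
		return canon
	}
	interned[key] = unique
	return unique
}

func main() {
	a, b := &faState{1}, &faState{2}
	x := intern([]*faState{a, b, a}) // dupes removed
	y := intern([]*faState{b, a})    // same membership, same canonical set
	fmt.Println(len(x), &x[0] == &y[0]) // 2 true
}
```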
I’m out of touch with the undergrad CS curriculum, but this feels like a second-year assignment or thereabouts? Third?
So I prompted Claude thus:
I need Go code to provide a "intern"-like function for lists of pointers. For example, if I have several different []*int arrays, which may contain duplicates, I want to call intern() on each of them and get back a single canonical pointer which is de-duplicated and thus a set.
Claude did pretty well. It got the algorithm right, the code was idiomatic and usefully commented, and it also provided a decent unit test (but in a main() stanza rather than a proper Go test file).
I didn’t try actually running it.
The interesting part was the key computation. I, being lazy, had just done a Go fmt.Sprintf("%p") incantation to get a hex string representing each state’s address, sorted them, joined them, and that was the key.
Claude worked with the pointers more directly.
// Sort by pointer address for consistent ordering
sort.Slice(unique, func(i, j int) bool {
    return uintptr(unsafe.Pointer(unique[i])) < uintptr(unsafe.Pointer(unique[j]))
})
Then it concatenated the raw bytes of the map addresses and lied to Go by claiming it was a string.
// Create key from pointer addresses
key := make([]byte, 0, len(slice)*8)
for _, ptr := range slice {
    addr := uintptr(unsafe.Pointer(ptr))
    // Convert address to bytes
    for i := 0; i < 8; i++ {
        key = append(key, byte(addr>>(i*8)))
    }
}
return string(key)
This is an improvement in that the keys will be half the size of my string version. I didn’t copy-paste Claude’s code wholesale, just replaced ten or so lines of key construction.
I dunno. I thought the quality of the code was fine, wouldn’t have decomposed the functions in the same way but wouldn’t have objected on review. I was pleased with the algorithm, but then I would be since it was the same one I’d written, and, having said that, quite possibly that’s the only algorithm that anyone has used. It will be super interesting if someone responds to this write-up saying “You and Claude are fools, here’s a much better way.”
Was it worth fifteen minutes of my time to ask Claude and get a slightly better key computation? Only if this ever turns out to be a hot code path, and I don’t think anybody’s smart enough to know that in advance.
Would I have saved time by asking Claude first? Tough to tell; Quamina’s data structures are a bit non-obvious and I would have had to go to a lot of prompting work to get it to emit code I could use directly. Also, since Quamina is low-level performance-critical infrastructure code, I’d be nervous about having any volume of code that I didn’t really really understand.
I guess my take-away was that in this case, Claude knew the Go idioms and APIs better than I did; I’d never looked at the unsafe package.
Which reinforces my suspicion that genAI is going to be especially useful at helping generate code to talk to big complicated APIs that are hard to remember all of. Here’s an example: Any moderately competent Android developer could add a feature to an app where it strobes the flash and surges the vibration in sync with how fast you’re shaking the device back and forth, probably in an afternoon. But it would require a couple of dozen calls into the dense forest of Android APIs, and I suspect a genAI might get you there a lot faster by just filling the calls in as prompted.
Reminder: This is just anecdata.