2025-12-15 09:42:10

Programmers often want to randomly shuffle arrays. Evidently, we want to do so as efficiently as possible. Maybe surprisingly, I found that the performance of random shuffling was not limited by memory bandwidth or latency, but rather by computation. Specifically, it is the computation of the random indexes itself that is slow.
Earlier in 2025, I reported how you could more than double the speed of a random shuffle in Go using a new algorithm (Brackett-Rozinsky and Lemire, 2025). However, I was using custom code that could not serve as a drop-in replacement for the standard Go Shuffle function. I decided to write a proper library called batchedrand. You can use it just like the math/rand/v2 standard library.
```go
rng := batchedrand.Rand{rand.New(rand.NewPCG(1, 2))}
data := []int{1, 2, 3, 4, 5}
rng.Shuffle(len(data), func(i, j int) {
	data[i], data[j] = data[j], data[i]
})
```
How fast is it? The standard library provides two generators, PCG and ChaCha8. ChaCha8 should be slower than PCG because it provides stronger cryptographic guarantees. However, the two have somewhat comparable speeds: ChaCha8 is heavily optimized with assembly code in the Go runtime, while the PCG implementation is conservative and not tuned for speed.
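For reference, here is a minimal sketch of how the two generators are constructed with math/rand/v2 (the seed values are arbitrary):

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

func main() {
	// PCG: a small, fast, non-cryptographic generator seeded with two 64-bit values.
	pcg := rand.New(rand.NewPCG(1, 2))

	// ChaCha8: stronger cryptographic guarantees, seeded with 32 bytes.
	chacha := rand.New(rand.NewChaCha8([32]byte{}))

	fmt.Println(pcg.Uint64(), chacha.Uint64())
}
```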
On my Apple M4 processor with Go 1.25, I get the following results. I report the time per input element, not the total time.
| Benchmark | Size | Batched (ns/item) | Standard (ns/item) | Speedup |
|---|---|---|---|---|
| ChaChaShuffle | 30 | 1.8 | 4.6 | 2.6 |
| ChaChaShuffle | 100 | 1.8 | 4.7 | 2.5 |
| ChaChaShuffle | 500000 | 2.6 | 5.1 | 1.9 |
| PCGShuffle | 30 | 1.5 | 3.9 | 2.6 |
| PCGShuffle | 100 | 1.5 | 4.2 | 2.8 |
| PCGShuffle | 500000 | 1.9 | 3.8 | 2.0 |
Thus, from tiny to large arrays, the batched approach is two to three times faster. Not bad for a drop-in replacement!
Get the Go library at https://github.com/lemire/batchedrand
2025-12-06 03:24:50

The one constant that I have observed in my professional life is that people underestimate the need to move fast.
Of course, doing good work takes time. I once spent six months writing a URL parser. But the fact that it took so long is not a feature, it is not a positive, it is a negative.
If everything is slow-moving around you, it is likely not going to be good. To fully make use of your brain, you need to move as close as possible to the speed of your thought.
If I give you two PhD students, one who completed their thesis in two years and one who took eight years… you can be almost certain that the two-year thesis will be much better.
Moving fast does not mean that you complete your projects quickly. Projects have many parts, and getting everything right may take a long time.
Nevertheless, you should move as fast as you can.
For multiple reasons:
1. A common mistake is to spend a lot of time—too much time—on a component of your project that does not matter. I once spent a lot of time building a podcast-like version of a course… only to find out later that students had no interest in the podcast format.
2. You learn by making mistakes. The faster you make mistakes, the faster you learn.
3. Your work degrades, becomes less relevant with time. And if you work slowly, you will be more likely to stick with your slightly obsolete work. You know that professor who spent seven years preparing lecture notes twenty years ago? He is not going to throw them away and start again, as that would be a new seven-year project. So he will keep teaching using aging lecture notes until he retires and someone finally updates the course.
What if you are doing open-heart surgery? Don’t you want someone who spends days preparing and who works slowly? No. You almost surely want the surgeon who does many, many open-heart surgeries. They are very likely to be the best one.
Now stop being so slow. Move!
2025-12-04 23:40:59

“We see something that works, and then we understand it.” (Thomas Dullien)
It is a deeper insight than it seems.
Young people spend years in school learning the reverse: understanding happens before progress. That is the linear theory of innovation.
So Isaac Newton comes up with his three laws of mechanics, and we get a clockmaking boom. Of course, that’s not what happened: we get the pendulum clock in 1656, then Hooke (1660) and Newton (1665–1666) get to think about forces, speed, motion, and latent energy.
The linear model of innovation makes as much sense as the waterfall model in software engineering. In the waterfall model, you are taught that you first need to design every detail of your software application (e.g., using a language like UML) before you implement it. To this day, half of the information technology staff at my school is made up of “analysts” whose main job is supposedly to create such designs from requirements and to supervise their execution.
Both the linear theory and the waterfall model are forms of thinkism, a term I learned from Kevin Kelly. Thinkism sets aside practice and experience. It is the belief that given a problem, you should just think long and hard about it, and if you spend enough time thinking, you will solve it.
Thinkism works well in school. The teacher gives you all the concepts, then gives you a problem that, by a wonderful coincidence, can be solved just by thinking with the tools the same teacher just gave you.
As a teacher, I can tell you that students get really angry if you put a question on an exam that requires a concept not explicitly covered in class. Of course, if you work as an engineer and you’re stuck on a problem and you tell your boss it cannot be solved with the ideas you learned in college… you’re going to look like a fool.
If you’re still in school, here’s a fact: you will learn as much or more every year of your professional life than you learned during an entire university degree—assuming you have a real engineering job.
Thinkism also works well in other limited domains beyond school. It works well in bureaucratic settings where all the rules are known and you’re expected to apply them without question. There are many jobs where you first learn and then apply. And if you ever encounter new conditions where your training doesn’t directly apply, you’re supposed to report back to your superiors, who will then tell you what to do.
But if you work in research and development, you always begin with incomplete understanding. And most of the time, even if you could read everything ever written about your problem, you still wouldn’t understand enough to solve it. The way you make discoveries is often to either try something that seems sensible, or to observe something that happens to work—maybe your colleague has a practical technique that just works—and then you start thinking about it, formalizing it, putting it into words… and it becomes a discovery.
And the reason it often works this way is that “nobody knows anything.” The world is so complex that even the smartest individual knows only a fraction of what there is to know, and much of what they think they know is slightly wrong—and they don’t know which part is wrong.
So why should you care about how progress happens? You should care because…
1. It gives you a recipe for breakthroughs: spend more time observing and trying new things… and less time thinking abstractly.
2. Stop expecting an AI to cure all diseases or solve all problems just because it can read all the scholarship and “think” for a very long time. No matter how much an AI “knows,” it is always too little.
Further reading: Godin, Benoît (2017). Models of innovation: The history of an idea. MIT Press.
2025-12-04 04:41:26

It is absolutely clear to me that large language models represent the most significant scientific breakthrough of the past fifty years. The nature of that breakthrough has far-reaching implications for what is happening in science today. And I believe that the entire scientific establishment is refusing to acknowledge it.
We often excuse our slow progress with tired clichés like “all the low-hanging fruit has been picked.” It is an awfully convenient excuse if you run a scientific institution that pretends to lead the world in research—but in reality is mired in bureaucracy, stagnation and tradition.
A quick look at the world around us tells a different story: progress is possible and even moderately easy, even through the lens of everyday experience. I have been programming in Python for twenty years and even wrote a book about it. Managing dependencies has always been a painful, frustrating process—seemingly unsolvable. The best anyone could manage was to set up a virtual environment. Yes, it was clumsy and awkward, as you know if you have programmed in Python, but that was the state of the art after decades of effort by millions of Python developers. Then, in 2024, a single tool called uv appeared and suddenly made the Python ecosystem feel sane, bringing it in line with the elegance of Go or JavaScript runtimes. In retrospect, the solution seems almost obvious.
NASA has twice the budget of SpaceX. Yet SpaceX has launched more missions to orbit in the past decade than NASA managed in the previous fifty years. The difference is not money; it is culture, agility, and a willingness to embrace new ideas.
Large language models have answered many profound scientific questions, yet one of the deepest concerns the very nature of language itself. For generations, the prevailing view was that human language depends on a vast set of logical rules that the brain applies unconsciously. That rule-based paradigm dominated much of twentieth-century linguistics and even shaped the early web. We spent an entire decade chasing the dream of the Semantic Web, convinced that if we all shared formal, machine-readable metadata, rule engines would deliver web-scale intelligence. Thanks to large language models, we now know that language does not need to be rule-based at all. Verbal intelligence does not require explicit rules.
It is a tremendous scientific insight that overturns decades of established thinking.
A common objection is that I am conflating engineering with science: large language models, the argument goes, are just engineering. I invite you to examine the history of science more closely. Scientific progress has always depended on the tools we build.
You need a seaworthy boat before you can sail to distant islands, observe wildlife, and formulate the theory of natural selection. Measuring the Earth’s radius with the precision achieved by the ancient Greeks required both sophisticated engineering and non-trivial mathematics. Einstein’s insights into relativity emerged in an era when people routinely experienced relative motion on trains; the phenomenon was staring everyone in the face.
The tidy, linear model of scientific progress—professors thinking deep thoughts in ivory towers, then handing blueprints to engineers—is indefensible. Fast ships and fast trains are not just consequences of scientific discovery; they are also wellsprings of it. Real progress is messy, iterative, and deeply intertwined with the tools we build. Large language models are the latest, most dramatic example of that truth.
So what does it tell us about science? I believe it is telling us that we need to rethink our entire approach to scientific research. We need to embrace agility, experimentation, and a willingness to challenge established paradigms. The bureaucratization of science was a death sentence for progress.
2025-11-29 13:00:03

Base64 is a binary-to-text encoding scheme that converts arbitrary binary data (like images, files, or any sequence of bytes) into a safe, printable ASCII string using a 64-character alphabet (A–Z, a–z, 0–9, +, /). Browsers use it in JavaScript to embed binary data directly in code or HTML, or to transmit binary data as text.
Browsers recently added convenient and safe functions to process Base64: Uint8Array.prototype.toBase64() and Uint8Array.fromBase64(). Though they accept several optional parameters, it comes down to an encoding function and a decoding function.
```js
const b64 = bytes.toBase64();                 // string
const recovered = Uint8Array.fromBase64(b64); // Uint8Array
```
When encoding, the algorithm takes 24 bits (3 bytes) from the input at a time. These 24 bits are divided into four 6-bit segments, and each 6-bit value (ranging from 0 to 63) is mapped to a specific character from the Base64 alphabet: the first 26 characters are the uppercase letters A–Z, the next 26 are the lowercase letters a–z, then the digits 0–9, followed by + and / for the values 62 and 63. The equals sign = is used as padding when the input length is not a multiple of 3 bytes.
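To make the bit layout concrete, here is a small sketch (the variable names are mine) that encodes three bytes by hand and checks the result against toBase64():

```js
const alphabet =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

const bytes = new Uint8Array([0x4d, 0x61, 0x6e]); // the ASCII string "Man"

// Pack the three bytes into a single 24-bit integer.
const bits = (bytes[0] << 16) | (bytes[1] << 8) | bytes[2];

// Slice it into four 6-bit values and map each one to a character.
const manual =
  alphabet[(bits >> 18) & 63] +
  alphabet[(bits >> 12) & 63] +
  alphabet[(bits >> 6) & 63] +
  alphabet[bits & 63];

console.log(manual);           // "TWFu"
console.log(bytes.toBase64()); // "TWFu"
```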
How fast can they be?
Suppose that you could consume 3 input bytes and produce 4 output bytes per CPU cycle. At 4.5 GHz, you would encode to Base64 at 13.5 GB/s. We expect lower performance going in the other direction: when encoding, any input is valid, since any binary data will do; when decoding, however, we must handle errors and skip spaces.
I wrote an in-browser benchmark. You can try it out in your favorite browser.
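If you want to reproduce the idea, the measurement can be as simple as the following sketch (my own code, not the benchmark itself; it assumes a browser that already supports the new Uint8Array methods):

```js
const bytes = new Uint8Array(64 * 1024); // one 64 KiB block
for (let i = 0; i < bytes.length; i++) { bytes[i] = i & 0xff; } // arbitrary content
const iterations = 10000;

// Encoding throughput, measured with respect to the binary input.
let start = performance.now();
let b64;
for (let i = 0; i < iterations; i++) {
  b64 = bytes.toBase64();
}
let seconds = (performance.now() - start) / 1000;
console.log(`encoding: ${(bytes.length * iterations / seconds / 1e9).toFixed(2)} GB/s`);

// Decoding throughput, measured with respect to the binary output.
start = performance.now();
for (let i = 0; i < iterations; i++) {
  Uint8Array.fromBase64(b64);
}
seconds = (performance.now() - start) / 1000;
console.log(`decoding: ${(bytes.length * iterations / seconds / 1e9).toFixed(2)} GB/s`);
```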
I decided to try it out on my Apple M4 processor, to see how fast the various browsers are. I use blocks of 64 KiB. The speed is measured with respect to the binary data.
| browser | encoding speed | decoding speed |
|---|---|---|
| Safari | 17 GB/s | 9.4 GB/s |
| SigmaOS | 17 GB/s | 9.4 GB/s |
| Chrome | 19 GB/s | 4.6 GB/s |
| Edge | 19 GB/s | 4.6 GB/s |
| Brave | 19 GB/s | 4.6 GB/s |
| Servo | 0.34 GB/s | 0.40 GB/s |
| Firefox | 0.34 GB/s | 0.40 GB/s |
Safari seems to have a slightly slower encoding speed than the Chromium browsers (Chrome, Edge, Brave); however, it is about twice as fast at decoding. Servo and Firefox have similarly poor performance, with the unexpected result that their decoding speed is higher than their encoding speed. I could have tried other browsers, but most seem to be derivatives of Chromium or WebKit.
For context, the disk of a good laptop can sustain over 3 GB/s of reads or writes. Some high-end laptops have disks that exceed 5 GB/s. In theory, a Wi-Fi connection might get close to 5 GB/s with Wi-Fi 7. Some Internet providers might come close to offering similar network speeds, although your Internet connection is likely several times slower.
The speeds on most browsers are faster than you might naively guess. They are faster than networks or disks.
Note. The slower decoding speed on Chromium-based browsers appears to be due to the V8 JavaScript engine, which first decodes the string into a temporary buffer before copying the result from the temporary buffer to the final destination. (See BUILTIN(Uint8ArrayFromBase64) in v8/src/builtins/builtins-typed-array.cc.)
Note. Denis Palmeiro from Mozilla let me know that upcoming changes in Firefox will speed up the Base64 functions. In my tests with Firefox Nightly, the performance is up by about 20%.
2025-11-29 10:39:56

Whenever I say I dislike debugging and organize my programming habits around avoiding it, there is always pushback: “You must not use a good debugger.”
To summarize my view: I want my software to be antifragile (credit to Nassim Taleb for the concept). The longer I work on a codebase, the easier it should become to fix bugs.
Each addition to a piece of code can be viewed as a stress. If nothing is done, the code gets slightly worse: harder to maintain, more prone to bugs. Thankfully, you can avoid such an outcome.
That’s not natural. For most developers lacking deep expertise, as the codebase grows, bugs become harder to fix: you chase symptoms through layers of code, hunt heisenbugs that vanish in the debugger, or fix one bug only to create another. The more code you have, the worse it gets. Such code is fragile. Adding a new feature risks breaking old, seemingly unrelated parts.
In my view, the inability to produce antifragile code explains the extreme power-law distribution in programming: most of the code we rely on daily was written by a tiny fraction of all programmers who have mastered antifragility.
How do you reverse this? How do you ensure that the longer you work on the code, the shallower the bugs become?
There are well-known techniques, and adding lots of tests and checks definitely helps. You can write antifragile code without tests or debug-time checks… but you’ll need something functionally equivalent.
Far-reaching prescriptions (“you must use language X, tool Y, method Z”) are usually cargo-cult nonsense. Copying Linus Torvalds’ tools or swearing style won’t guarantee success. But antifragility is not a prescription; it is a desired outcome.
Defensive programming itself is uncontroversial—yet it wasn’t common in the 1980s and still isn’t the default for many today.
Of course, a full defensive approach isn’t always applicable or worth the cost.
For example, if I’m vibe-coding a quick web app with more JavaScript than I care to read, I’ll just run it in the browser’s debugger. It works fine. I’m not using that code to control a pacemaker, and I’m not expecting to be woken up at midnight on Christmas to fix it.
If your program is 500 lines and you’ll run it 20 times a year, antifragility isn’t worth pursuing.
Large language models can generate defensive code, but if you’ve never written defensively yourself and you learn to program primarily with AI assistance, your software will probably remain fragile.
You can add code quickly, but the more you add, the bigger your problems become.
That’s the crux of the matter. Writing code was never the hard part—I could write code at 12, and countless 12-year-olds today can write simple games and apps. In the same way, a 12-year-old can build a doghouse with a hammer, nails, and wood. Getting 80% of the way has always been easy.
Scaling complexity without everything collapsing—that’s the hard part.