2026-04-09 08:18:57
I recently found out about Andrica’s conjecture: the square roots of consecutive primes are less than 1 apart.
In symbols, Andrica’s conjecture says that if p_n and p_{n+1} are consecutive prime numbers, then
√p_{n+1} − √p_n < 1.
This has been empirically verified for primes up to 2 × 10^19.
If the conjecture is true, it puts an upper bound on how long you’d have to search to find the next prime:
p_{n+1} < 1 + 2√p_n + p_n,
which would be an improvement on the Bertrand-Chebyshev theorem that says
p_{n+1} < 2p_n.
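Here is a minimal sketch of checking the conjecture numerically for small primes. The function names (`primes_up_to`, `max_root_gap`, `bounds_hold`) and the cutoff of 100,000 are my own choices for illustration, nowhere near the range cited above.

```python
from math import isqrt, sqrt

def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # Cross off multiples of i starting at i*i
            sieve[i*i::i] = bytearray(len(range(i*i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def max_root_gap(limit):
    """Return (largest sqrt gap, smaller prime of the pair) for primes <= limit."""
    ps = primes_up_to(limit)
    return max((sqrt(q) - sqrt(p), p) for p, q in zip(ps, ps[1:]))

def bounds_hold(limit):
    """Check both the Andrica bound and the Bertrand-Chebyshev bound."""
    ps = primes_up_to(limit)
    return all(q < 1 + 2*sqrt(p) + p and q < 2*p for p, q in zip(ps, ps[1:]))

gap, p = max_root_gap(100_000)
print(gap, p)
print(bounds_hold(100_000))
```

In this range the largest gap between square roots occurs between 7 and 11, and it is well under 1.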
The post Root prime gap first appeared on John D. Cook.
2026-04-09 07:30:15
Last week I wrote about the orbit of Artemis II. The orbit of Artemis I was much more interesting.
Because Artemis I was unmanned, it could spend a lot more time in orbit. The Artemis I mission took 25 days while Artemis II will take 10 days. Artemis I took an unusual path, orbiting the moon in the opposite direction of the moon’s orbit around earth. This video by Primal Space demonstrates the orbit both from the perspective of earth and from the perspective of the moon.
Another video from Primal Space describes the orbit of the third stage of Apollo 12. This stage was supposed to orbit around the sun in 1971, but an error sent it on a complicated unstable orbit of the earth, moon, and sun. It returned briefly to earth in 2002 and is expected to return sometime in the 2040s.
2026-04-07 08:33:23
Landauer’s principle gives a lower bound on the amount of energy it takes to erase one bit of information:
E ≥ log(2) k_B T
where k_B is the Boltzmann constant, T is the ambient temperature in kelvins, and log is the natural logarithm. The lower bound applies no matter how the bit is physically stored. There is no theoretical lower limit on the energy required to carry out a reversible calculation.
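To get a feel for the size of the bound, here is a short sketch that evaluates it numerically. The temperature of 300 K is my assumption for "room temperature," and `landauer_limit` is a name made up for this example.

```python
from math import log

BOLTZMANN = 1.380649e-23  # J/K, exact in the 2019 SI definition

def landauer_limit(temperature_kelvin):
    """Minimum energy in joules to erase one bit at the given temperature."""
    return log(2) * BOLTZMANN * temperature_kelvin

e_room = landauer_limit(300.0)  # assumed room temperature
print(f"{e_room:.3e} J")
```

The result is a little under 3 × 10^-21 joules per bit.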
In practice the energy required to erase a bit is around a billion times greater than Landauer’s lower bound. You might reasonably conclude that reversible computing isn’t practical since we’re nowhere near the Landauer limit. And yet in practice reversible circuits have been demonstrated to use less energy than conventional circuits. We’re far from the ultimate physical limit, but reversibility still provides practical efficiency gains today.
A Toffoli gate is a building block of reversible circuits. A Toffoli gate takes three bits as input and returns three bits as output:
T(a, b, c) = (a, b, c XOR (a AND b)).
In words, a Toffoli gate flips its third bit if and only if the first two bits are ones.
A Toffoli gate is its own inverse, and so it is reversible. This is easy to prove. If a = b = 1, then the third bit is flipped. Applying the Toffoli gate again flips the bit back to what it was. If ab = 0, i.e. at least one of the first two bits is zero, then the Toffoli gate doesn’t change anything.
There is a theorem that any Boolean function can be computed by a circuit made of only NAND gates. We’ll show that you can construct a NAND gate out of Toffoli gates, which shows any Boolean function can be computed by a circuit made of Toffoli gates, which shows any Boolean function can be computed reversibly.
To compute NAND, i.e. ¬ (a ∧ b), send (a, b, 1) to the Toffoli gate. The third bit of the output will contain the NAND of a and b.
T(a, b, 1) = (a, b, ¬ (a ∧ b))
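Both claims, that the gate is its own inverse and that it computes NAND when the third input is 1, can be checked exhaustively since there are only eight possible inputs. A minimal sketch, with bits represented as the integers 0 and 1 and with function names of my own choosing:

```python
from itertools import product

def toffoli(a, b, c):
    """Flip the third bit if and only if the first two bits are both 1."""
    return (a, b, c ^ (a & b))

def nand(a, b):
    """NAND via a Toffoli gate with the third input fixed at 1."""
    return toffoli(a, b, 1)[2]

# Applying the gate twice returns the original input.
for bits in product((0, 1), repeat=3):
    assert toffoli(*toffoli(*bits)) == bits

# The third output bit matches NAND on all four input pairs.
for a, b in product((0, 1), repeat=2):
    assert nand(a, b) == 1 - (a & b)
```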
A drawback of reversible computing is that you may have to send in more input than you’d like and get back more output than you’d like, as we can already see from the example above. NAND takes two input bits and returns one output bit. But the Toffoli gate simulating NAND takes three input bits and returns three output bits.
2026-04-06 07:04:46
The best way to run AI and remain HIPAA compliant is to run it locally on your own hardware, instead of transferring protected health information (PHI) to a remote server by using a cloud-hosted service like ChatGPT or Claude [1].
There are HIPAA-compliant cloud options, but they’re both restrictive and expensive. Even enterprise options are not “HIPAA compliant” out of the box. Instead, vendors describe them as “HIPAA eligible” or say that they “support HIPAA compliance,” because you still need the right Business Associate Agreement (BAA), configuration, logging, access controls, and internal process around it. The end product often ends up far less capable than a frontier model. The least expensive and therefore most accessible services do not even allow this as an option.
Specific examples:
Running AI locally is already practical as of early 2026. Open-weight models that approach the quality of commercial coding assistants run on consumer hardware. A single high-end GPU or a recent Mac with enough unified memory can run a 70B-parameter model at a reasonable token speed.
There’s an interesting interplay between economies of scale and diseconomies of scale. Cloud providers can run a data center at a lower cost per server than a small company can. That’s the economies of scale. But running HIPAA-compliant computing in the cloud, particularly with AI providers, incurs large direct costs and indirect bureaucratic costs. That’s the diseconomies of scale. Smaller companies may benefit more from local AI than larger companies if they need to be HIPAA-compliant.
[1] This post is not legal advice. My clients are often lawyers, but I’m not a lawyer.
The post HIPAA compliant AI first appeared on John D. Cook.
2026-04-04 23:00:14
This post will look at the problem of updating an average grade as a very simple special case of Bayesian statistics and of Kalman filtering.
Suppose you’re keeping up with your average grade in a class, and you know your average after n tests, all weighted equally.
m = (x_1 + x_2 + x_3 + … + x_n) / n.
Then you get another test grade back and your new average is
m′ = (x_1 + x_2 + x_3 + … + x_n + x_{n+1}) / (n + 1).
You don’t need the individual test grades once you’ve computed the average; you can instead remember the average m and the number of grades n [1]. Then you know the sum of the first n grades is nm and so
m′ = (nm + x_{n+1}) / (n + 1).
You could split that into
m′ = w_1 m + w_2 x_{n+1}
where w_1 = n/(n + 1) and w_2 = 1/(n + 1). In other words, the new mean is the weighted average of the previous mean and the new score.
A Bayesian perspective would say that your posterior expected grade m′ is a compromise between your prior expected grade m and the new data x_{n+1} [2].
You could also rewrite the equation above as
m′ = m + (x_{n+1} − m)/(n + 1) = m + KΔ
where K = 1/(n + 1) and Δ = x_{n+1} − m. In Kalman filter terms, K is the gain, the constant of proportionality between the change in your state and the difference between what you saw and what you expected.
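The update rule above can be sketched in a few lines of code. The function name `update_mean` and the sample grades are made up for illustration. Starting from n = 0 makes the gain equal to 1 on the first step, so the first grade simply becomes the mean.

```python
def update_mean(m, n, x_new):
    """Return the new mean and count after seeing one more grade.

    m is the mean of the first n grades; only m and n are stored.
    """
    k = 1 / (n + 1)        # gain K
    delta = x_new - m      # what you saw minus what you expected
    return m + k * delta, n + 1

grades = [90, 80, 100, 70]  # hypothetical test scores
m, n = 0.0, 0
for x in grades:
    m, n = update_mean(m, n, x)

print(m, n)
```

Up to floating point rounding, the final value of m agrees with the ordinary average sum(grades)/len(grades).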
[1] In statistical terms, the mean is a sufficient statistic.
[2] You could flesh this out by using a normal likelihood and a flat improper prior.
The post Kalman and Bayes average grades first appeared on John D. Cook.
2026-04-04 00:31:54
I used the term perilune in yesterday’s post about the flight path of Artemis II. When Artemis is closest to the moon it will be furthest from earth because its closest approach to the moon, its perilune, is on the side of the moon opposite earth.
Perilune is sometimes called periselene. The two terms come from two goddesses associated with the moon, the Roman Luna and the Greek Selene. Since the peri- prefix is Greek, perhaps periselene would be preferable. But we’re far more familiar with words associated with the moon being based on Luna than Selene.
The neutral terms for closest and furthest points in an orbit are periapsis and apoapsis, but there are more colorful terms that are specific to orbiting particular celestial objects. The terms perigee and apogee for orbiting earth (from the Greek Gaia) are most familiar, and the terms perihelion and aphelion (not apohelion) for orbiting the sun (from the Greek Helios) are the next most familiar.
The terms perijove and apojove are unfamiliar, but you can imagine what they mean. Others like periareion and apoareion, especially the latter, are truly arcane.
The post Roman moon, Greek moon first appeared on John D. Cook.