It’s been said that whatever you can validate, you can automate. An AI that produces correct work 90% of the time could be very valuable, provided you have a way to identify the 10% of cases where it is wrong. Often verifying a solution takes far less computation than finding one. Examples here.
Validating AI output can be tricky since the results are plausible by construction, though not always correct.
One way to validate output is to apply consistency checks. Such checks are necessary but not sufficient, and they are often easy to implement. A simple consistency check might be that the inputs to a transaction equal the outputs. A more sophisticated check might be conservation of energy, or something analogous to it.
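Here is a minimal sketch of such a check on a hypothetical transaction record; the field names and tolerance are made up for illustration.

def check_transaction(txn, tol=1e-9):
    # Flag a transaction whose inputs and outputs don't balance.
    total_in = sum(txn["inputs"])
    total_out = sum(txn["outputs"])
    return abs(total_in - total_out) <= tol

txn = {"inputs": [40.0, 60.0], "outputs": [99.0]}
print(check_transaction(txn))  # False: a unit of value has gone missing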
Some problems have certificates, ways of verifying that a calculation is correct that can be evaluated with far less effort than finding the solution that they verify. I’ve written about certificates in the context of optimization, solving equations, and finding prime numbers.
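As an illustration of a certificate (my example, not from the post): the factors of a number certify its factorization, since checking them takes one multiplication no matter how much effort went into finding them.

# Suppose an expensive search claims that n = 2^32 + 1 factors as p * q.
p, q = 641, 6700417
n = 2**32 + 1

# Verifying the certificate is a single multiplication.
print(p * q == n)  # True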
Correctness is more important in some contexts than others. If a recommendation engine makes a bad recommendation once in a while, the cost is a lower probability of conversion in a few instances. If an aircraft collision avoidance system makes an occasional error, the consequences could be catastrophic.
When the cost of errors is extremely high, formal verification may be worthwhile. Formal correctness proofs using something like Lean or Rocq are extremely tedious and expensive to create by hand, and hence usually not economical. But if an AI can generate a result and a formal proof of correctness, hurrah!
But if an AI result can be wrong, why couldn’t a formal proof generated to defend the result also be wrong? As the Roman poet Juvenal asked, Quis custodiet ipsos custodes? Who will watch the watchmen?
An AI could indeed generate an incorrect proof, but if it does, the proof assistant will reject it. So the answer to who will watch Claude, Gemini, and ChatGPT is Lean, Rocq, and Isabelle.
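As a toy illustration (mine, in Lean 4): the kernel only admits a theorem if it can check the proof term, regardless of who, or what, produced it.

-- The kernel checks this proof term and accepts the theorem.
theorem add_comm_example (m n : Nat) : m + n = n + m :=
  Nat.add_comm m n

-- Change the statement to something false, say m + n = n + m + 1,
-- and no proof term will be accepted, no matter how it was generated.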
Isn’t it possible that a theorem prover like Rocq could have a bug? Of course it’s possible; there is no absolute certainty under the sun. But hundreds of PhD-years of work have gone into Rocq (formerly Coq) and so bugs in the kernel of that system are very unlikely. The rest of the system is bootstrapped, verified by the kernel.
Even so, an error in the theorem prover does not mean an error in the original result. For an incorrect result to slip through, the AI-generated proof would have to be wrong in a way that happens to exploit an unknown error in the theorem prover. It is far more likely that you’re trying to prove the wrong thing than that the theorem prover let you down.
I mentioned collision avoidance software above. I looked into such software when I did some work for Amazon’s drone program. The software that had been formally verified was also unrealistic in its assumptions: it was guaranteed to work correctly provided, for example, that two objects were flying at precisely constant velocity at precisely the same altitude, and so on. If everything were operating according to such geometrically perfect assumptions, there would be no need for collision avoidance software.
The post Automation and Validation first appeared on John D. Cook.

Newton’s birthday fell on Christmas when he was born, but it no longer does.
When Newton was born, England was still using the Julian calendar, and would continue to use the Julian calendar for another 25 years.
On the day of Newton’s birth, his parents would have said the date was December 25, 1642. We would now describe the date as January 4, 1643.
You’ll sometimes see Newton’s birthday written as December 25, 1642 O.S. The “O.S.” stands for “Old Style,” i.e. Julian calendar. Of course the Newton family would not have written O.S. because there was no old style until the new style (i.e. Gregorian calendar) was adopted, just as nobody living in the years before Christ would have written a date as B.C.
In a nutshell, the Julian year was too long, which made the calendar drift out of sync with the astronomical calendar. The Julian year was 365 1/4 days, whereas the Gregorian year is 365 97/400 days, which more closely matches the time it takes Earth to orbit the sun. Dropping three leap days every 400 years (in century years not divisible by 400) keeps the calendar in sync going forward. When countries adopted the Gregorian calendar, they also had to remove the excess days that had already accumulated. That’s why Newton’s birthday moved forward 10 days.
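Here is a small Python sketch of the two leap-year rules. It uses the fact, not mentioned above, that the proleptic Gregorian calendar agrees with the Julian calendar during the third century; after that, the two drift apart by one day at every century year that is a leap year under the Julian rule but not the Gregorian rule.

def julian_leap(year):
    return year % 4 == 0

def gregorian_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Century years from 300 up to Newton's birth that the Julian calendar
# treats as leap years but the Gregorian calendar does not.
drift = sum(1 for y in range(300, 1643)
            if julian_leap(y) and not gregorian_leap(y))
print(drift)  # 10 -- the two calendars differed by 10 days in 1642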
You can read more on the Julian and Gregorian calendars here.
The winter solstice in the northern hemisphere was two days ago: December 21, 2025. And in 1642, using the Gregorian calendar, the solstice was also on December 21. But in England in 1642, people would have said the solstice occurred on December 11, because the civil (Julian) calendar had drifted 10 days out of step with the sun.
The post When was Newton born? first appeared on John D. Cook.
A few weeks ago I mentioned that I was reading Stephen Ambrose’s account of the Lewis & Clark expedition and wrote a post about their astronomical measurements. James Campbell left a comment recommending Edwin Danson’s book [1] on the history of the Mason-Dixon line. I ordered the book, and now that work has slowed down for Christmas I have had the time to open it.
In addition to determining their eponymous line, surveyors Charles Mason and Jeremiah Dixon were also the first to measure a degree of latitude in America, in 1767.

What exactly did they measure? We’ll get to that, but first we need some background.
To a first approximation a degree of latitude is simply 1/360th of the Earth’s circumference, but Mason and Dixon were more accurate than that. Isaac Newton (1643–1727) deduced that our planet is not a perfect sphere but rather an oblate spheroid. The best estimate in Mason and Dixon’s time was that the Earth’s semi-major axis was 6,397,300 meters, with inverse flattening 1/f = 216.8.
It’s a bit of an anachronism to describe the distance in meters since the meter was defined in 1791. The meter was originally defined as one ten-millionth of the distance from the equator to the North Pole along a great circle through Paris.
If the Earth were a perfect sphere, a degree of latitude would be 1/360th of its circumference. Using the original definition of the meter, this would be exactly 10,000,000/90 meters, or about 111 km. But because the Earth is not a perfect sphere, each degree of latitude has a slightly different length. To put it another way, the length of a degree of latitude varies by latitude.
Another complication due to the flattening of the Earth is that there are multiple ways to define latitude. The two most common are geocentric and geodetic. The geocentric latitude of a point P on the Earth’s surface is the angle between the equatorial plane and a line between the center of the earth and P. The geodetic latitude (a.k.a. geographic latitude) of P is the angle between the equatorial plane and a line perpendicular to the Earth’s surface at P. More on the difference between geocentric and geodetic latitude here.
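As a small numerical illustration (mine, not from the post), the two latitudes are related through the flattening by tan(geocentric) = (1 − f)² tan(geodetic).

from math import atan, degrees, radians, tan

f = 1/216.8              # the flattening quoted above
geodetic = 39.72         # roughly the latitude of the Mason-Dixon line
geocentric = degrees(atan((1 - f)**2 * tan(radians(geodetic))))
print(geodetic - geocentric)  # about a quarter of a degree with this ellipsoid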
Since the length of a degree of latitude varies, we need to say at what latitude they measured the length of a degree. In short, they measured the length of a degree near what we now know as the Mason-Dixon line, the border between Pennsylvania and Maryland.
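A back-of-the-envelope calculation (mine, using the ellipsoid parameters quoted above and the standard formula for the meridian radius of curvature) gives the length of a degree of geodetic latitude near the Mason-Dixon line, which lies at roughly 39.7° N.

from math import pi, radians, sin

a = 6_397_300            # semi-major axis in meters, as quoted above
f = 1/216.8              # flattening
e2 = 2*f - f**2          # eccentricity squared

def degree_of_latitude(lat_degrees):
    # Length in meters of one degree of geodetic latitude, using the
    # meridian radius of curvature M = a(1 - e^2)/(1 - e^2 sin^2 phi)^(3/2).
    phi = radians(lat_degrees)
    M = a*(1 - e2) / (1 - e2*sin(phi)**2)**1.5
    return M * pi/180

print(degree_of_latitude(39.7))  # about 111 km with these parameters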
To be more precise, they measured from Stargazer’s Stone, a stone placed by Mason and Dixon on John Harland’s farm near Embreeville, Pennsylvania, to a point about a degree and a half due south, near what is now Delmar, a town on the Delaware / Maryland border.
I’ve had some difficulty determining how accurate Mason and Dixon were. Some sources I’ve found are obviously wrong. I haven’t verified this, but it seems Mason and Dixon overestimated the length of a degree of latitude at their location by only 465.55 ft or about 0.13%, a remarkable feat given the technology of their day.
[1] Edwin Danson. Drawing The Line: How Mason and Dixon Surveyed the Most Famous Border in America. John Wiley & Sons. 2001.
The post Mason, Dixon, and Latitude first appeared on John D. Cook.
I recently learned of Bowie’s numerical method for solving ordinary differential equations of the form
y″ = f(y)
via Alex Scarazzini’s master’s thesis [1].
The only reference I’ve been able to find for the method, other than [1], is the NASA Orbital Flight Handbook from 1963. The handbook describes the method as “a method employed by C. Bowie and incorporated in many Martin programs” and says nothing more about its origin.
What does the handbook mean by “Martin programs”? The first line of its foreword says
This handbook has been produced by the Space Systems Division of the Martin Company under Contract NAS8-S03l with the George C. Marshall Space Flight Center of the National Aeronautics and Space Administration.
The Martin Company was the Glenn L. Martin Company, which became Martin Marietta after merging with American-Marietta Corporation in 1961. The handbook was written after the merger but used the older name. Martin Marietta would go on to become Lockheed Martin in 1995.
Bowie’s method was used “in many Martin programs” and yet is practically unknown in academic circles. Scarazzini’s thesis shows the method works well for his problem.
My first thought when I saw the form of differential equations Bowie’s method solves was the nonlinear pendulum equation
y″ = − sin(y)
where the initial displacement y(0) is too large for the approximation sin θ ≈ θ to be sufficiently accurate. I wrote some Python code to try out Bowie’s method on this equation.
import numpy as np
from scipy.special import ellipk

N = 100
y  = np.zeros(N)
yp = np.zeros(N)                 # y'
y[0]  = 1                        # initial displacement
yp[0] = 0                        # initial velocity

# Period of the nonlinear pendulum (see the discussion below)
T = 4*ellipk(np.sin(y[0]/2)**2)
h = T/N                          # step size

f   = lambda x: -np.sin(x)       # y'' = f(y)
fp  = lambda x: -np.cos(x)       # f'
fpp = lambda x:  np.sin(x)       # f''

for n in range(N-1):
    y[n+1] = y[n] + h*yp[n] + 0.5*h**2*f(y[n]) + \
             (h**3/6)*fp(y[n])*yp[n] + \
             (h**4/24)*(fpp(y[n])*yp[n]**2 + fp(y[n])*f(y[n]))
    yp[n+1] = yp[n] + h*f(y[n]) + 0.5*h**2*fp(y[n])*yp[n] + \
              (h**3/6)*(fpp(y[n])*yp[n]**2 + fp(y[n])*f(y[n]))
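To see the solution, we can plot y against time; here’s a minimal matplotlib sketch, not part of the original listing.

import matplotlib.pyplot as plt

t = h*np.arange(N)   # times corresponding to the steps above
plt.plot(t, y)
plt.xlabel("t")
plt.ylabel("y")
plt.show()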
Here’s a graph of the numerical solution.

The solution looks like a cosine, but it isn’t exactly. As I explain here,
The solution to the nonlinear pendulum equation is also periodic, though the solution is a combination of Jacobi functions rather than a combination of trig functions. The difference between the two solutions is small when θ0 is small, but becomes more significant as θ0 increases.
The difference in the periods is more evident than the difference in shape for the two waves. The period of the nonlinear solution is longer than that of the linearized solution.
That’s why the period T in the code is not
2π = 6.28
but rather
4 K(sin² θ0/2) = 6.70.
You’ll also see the period of the nonlinear pendulum given as 4 K(sin θ0/2). As pointed out in the article linked above,
There are two conventions for defining the complete elliptic integral of the first kind. SciPy uses a convention for K that requires us to square the argument.
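A quick check of the two numbers quoted above, using SciPy’s convention for K (a small verification sketch, not from the post):

from numpy import pi, sin
from scipy.special import ellipk

theta0 = 1                          # initial displacement y(0)
print(2*pi)                         # 6.283..., period of the linearized pendulum
print(4*ellipk(sin(theta0/2)**2))   # 6.700..., period of the nonlinear pendulum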
[1] Alex Scarazzini. 3D Visualization of a Schwarzschild Black Hole Environment. University of Bern. August 2025.
The post Bowie integrator and the nonlinear pendulum first appeared on John D. Cook.
The first difficulty in trying to fit an exponential curve to data is that the data may not follow an exponential curve. Nothing grows exponentially forever. Eventually growth slows down. The simplest way growth can slow down is to follow a logistic curve, but fitting a logistic curve has its own problems, as detailed in the previous post.
Suppose you are convinced that whatever you’re trying to model follows an exponential curve, at least over the time scale you’re interested in. This is easier to fit than a logistic curve. If you take the logarithm of the data, you now have a linear regression problem. Linear regression is numerically well-behaved and has been thoroughly explored.
There is a catch, however. When you extrapolate a linear regression, your uncertainty region flares out as you go from observed data to linear predictions based on the observed data. Your uncertainty grows linearly. But remember that we’re not working with the data per se; we’re working with the logarithm of the data. So on the original scale, the uncertainty flares out exponentially.
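Here’s a minimal sketch of the idea; the simulated data and parameter values are my own, and the ±2 standard error band is a crude stand-in for a proper prediction interval.

import numpy as np

rng = np.random.default_rng(0)

# Simulated data that grow exponentially, with multiplicative noise
t = np.linspace(0, 10, 30)
y = 5*np.exp(0.3*t) * np.exp(rng.normal(0, 0.1, t.size))

# Linear regression on the log of the data
logy = np.log(y)
slope, intercept = np.polyfit(t, logy, 1)
resid = logy - (intercept + slope*t)
s = np.sqrt(np.sum(resid**2) / (len(t) - 2))   # residual standard error

# On the log scale the prediction band widens slowly as we extrapolate ...
t_new = np.linspace(0, 20, 5)
se = s*np.sqrt(1 + 1/len(t) + (t_new - t.mean())**2 / np.sum((t - t.mean())**2))
log_pred = intercept + slope*t_new

# ... but exponentiating back to the original scale makes it flare out
for tn, lp, e in zip(t_new, log_pred, se):
    print(f"t = {tn:5.1f}   band width = {np.exp(lp + 2*e) - np.exp(lp - 2*e):.1f}")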
The post Trying to fit exponential data first appeared on John D. Cook.
A logistic curve, sometimes called an S curve, looks different in different regions. Like the proverbial blind men feeling different parts of an elephant, people looking at different segments of the curve could come to very different impressions of the full picture.
It’s naive to look at the left end and assume the curve will grow exponentially forever, even if the data are statistically indistinguishable from exponential growth.
A slightly less naive approach is to look at the left end, assume logistic growth, and try to infer the parameters of the logistic curve. In the image above, you may be able to forecast the asymptotic value if you have data up to time t = 2, but it would be hopeless to do so with only data up to time t = −2. (This post was motivated by seeing someone trying to extrapolate a logistic curve from just its left tail.)
Suppose you know with absolute certainty that your data have the form

y(t) = a / (1 + exp(−b(t − c))) + ε

where ε is some small amount of measurement error. The world is not obligated to follow a simple mathematical model, or any mathematical model for that matter, but for this post we will assume that for some inexplicable reason you know the future follows a logistic curve; the only question is what the parameters are.
Furthermore, we only care about fitting the a parameter. That is, we only want to predict the asymptotic value of the curve. This is easier than trying to fit the b or c parameters.
I generated 16 random t values between −5 and −2, plugged them into the logistic function with parameters a = 1, b = 1, and c = 0, then added Gaussian noise with standard deviation 0.05.
My intention was to do this 1000 times and report the range of fitted values for a. However, the software I was using (scipy.optimize.curve_fit) failed to converge. Instead it returned the following error message.
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
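Here’s a sketch of the kind of experiment described above. The logistic parametrization matches the formula given earlier; the random seed and the use of curve_fit’s default starting values are guesses on my part.

import numpy as np
from scipy.optimize import curve_fit

def logistic(t, a, b, c):
    return a / (1 + np.exp(-b*(t - c)))

rng = np.random.default_rng(0)
t = rng.uniform(-5, -2, 16)                  # sample points on the left tail only
y = logistic(t, 1, 1, 0) + rng.normal(0, 0.05, 16)

try:
    params, cov = curve_fit(logistic, t, y)  # default starting values
    print("fitted a =", params[0])
except RuntimeError as err:                  # raised when maxfev is exceeded
    print(err)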
When you see a message like that, your first response is probably to tweak the code so that it converges. Sometimes that’s the right thing to do, but often such numerical difficulties are trying to tell you that you’re solving the wrong problem.
When I generated points between −5 and 0, the curve_fit algorithm still failed to converge.
When I generated points between −5 and 2, the fitting algorithm converged. The range of a values was from 0.8254 to 1.6965.
When I generated points between −5 and 3, the range of a values was from 0.9039 to 1.1815.
Increasing the number of generated points did not change whether the curve fitting method converged, though it did result in a smaller range of fitted parameter values when it did converge.
I said we’re only interested in fitting the a parameter. I looked at the ranges of the other parameters as well, and as expected, they had a wider range of values.
So in summary, fitting a logistic curve with data only on the left side of the curve, to the left of the inflection point in the middle, may completely fail or give you results with wide error estimates. And it’s better to have a few points spread out through the domain of the function than to have a large number of points only on one end.