2026-06-11 21:14:26
Prolog is the original logic programming language. The name comes from programming in logic. More specifically, the name comes from programmation en logique because the inventor of the language, Philippe Roussel, is French.
Prolog has its advantages and disadvantages. One of the advantages is that the language represents logical problems directly. One of the disadvantages is that the syntax can be quirky. But if an LLM is writing the code, or at least helping to write the code, the syntax doesn’t matter so much.
I wanted to see how well Claude (Sonnet 4.6, medium effort) could solve a chess puzzle by Martin Gardner that I wrote about a little over a year ago. I chose a relatively obscure problem rather than something like the Eight Queens puzzle because an LLM could simply quote one of countless articles on the puzzle.
As I stated in the post last year, the task is to place two rooks, two bishops, and two knights on a 4 by 4 chessboard so that no piece attacks any other.

There are two basic solutions, twelve if you count rotations and reflections as different solutions.
Claude wrote an SWI-Prolog program that I ran with
swipl -g "run, halt" chess_placement.pl
and it gave the following output.
Total solutions (raw): 12 Solutions up to D4 symmetry: 2 Solutions up to D4 symmetry: 2 Solution 1: B . . N N . . B . R . . . . R . Solution 2: . B N . . . . R R . . . . N B .
Claude made a nice image of the solutions.
Note that you can make eight variations of the first solution by a combination of rotations and flips. There are only four variations of the second solution because two rotations put you back where you started.
Here’s the Prolog code that Claude generated.
% ============================================================
% Non-attacking placement of 2 Rooks, 2 Bishops, 2 Knights
% on a 4x4 chessboard
%
% Run with SWI-Prolog:
% swipl -g "run, halt" chess_placement.pl
%
% Or interactively:
% swipl chess_placement.pl
% ?- run.
% ?- count_solutions(N).
% ?- count_unique(N).
% ============================================================
:- use_module(library(lists)).
% ---- Board squares: (Row, Col) with Row, Col in 0..3 ----------------
square(R, C) :- member(R, [0,1,2,3]), member(C, [0,1,2,3]).
% ---- Attack predicates ----------------------------------------------
rook_attacks(R,_,R,_). % same row
rook_attacks(_,C,_,C). % same column
bishop_attacks(R1,C1,R2,C2) :-
DR is abs(R1-R2), DC is abs(C1-C2),
DR =:= DC, DR > 0.
knight_attacks(R1,C1,R2,C2) :-
DR is abs(R1-R2), DC is abs(C1-C2),
( DR =:= 1, DC =:= 2 ; DR =:= 2, DC =:= 1 ).
% attacks(+Type1, +R1, +C1, +Type2, +R2, +C2)
% True if either piece threatens the other's square.
attacks(rook, R1,C1, _, R2,C2) :- rook_attacks(R1,C1,R2,C2).
attacks(_, R1,C1, rook, R2,C2) :- rook_attacks(R1,C1,R2,C2).
attacks(bishop, R1,C1, _, R2,C2) :- bishop_attacks(R1,C1,R2,C2).
attacks(_, R1,C1, bishop,R2,C2) :- bishop_attacks(R1,C1,R2,C2).
attacks(knight, R1,C1, _, R2,C2) :- knight_attacks(R1,C1,R2,C2).
attacks(_, R1,C1, knight,R2,C2) :- knight_attacks(R1,C1,R2,C2).
% ---- All pairs safe -------------------------------------------------
% Check all unordered pairs in a list.
no_attack_pair(_, []).
no_attack_pair(T1-(R1,C1), [T2-(R2,C2)|Rest]) :-
\+ attacks(T1,R1,C1,T2,R2,C2),
no_attack_pair(T1-(R1,C1), Rest).
all_pairs_safe([]).
all_pairs_safe([P|Rest]) :-
no_attack_pair(P, Rest),
all_pairs_safe(Rest).
% ---- Generate a placement -------------------------------------------
%
% Placement = [rook-(R1,C1), rook-(R2,C2),
% bishop-(R3,C3), bishop-(R4,C4),
% knight-(R5,C5), knight-(R6,C6)]
%
% We enforce an ordering within each pair to avoid counting
% same-type duplicates (rook1 vs rook2 swapped).
% Squares are represented canonically: (R,C) with R*4+C as index.
sq_index(R, C, I) :- I is R*4 + C.
placement(Placement) :-
% --- Rooks ---
square(R1r, C1r), square(R2r, C2r),
sq_index(R1r,C1r,I1r), sq_index(R2r,C2r,I2r),
I1r write('R ')
; member(bishop-(R,C), Placement) -> write('B ')
; member(knight-(R,C), Placement) -> write('N ')
; write('. ')
).
% ---- Print all unique solutions -------------------------------------
print_unique_solutions :-
findall(P, placement(P), Ps),
maplist(canonical_placement, Ps, Canonicals),
list_to_set(Canonicals, Unique),
length(Unique, N),
format("~nSolutions up to D4 symmetry: ~w~n~n", [N]),
forall(nth1(I, Unique, Sol),
( format("Solution ~w:~n", [I]),
print_board(Sol) )).
% ---- Top-level entry point ------------------------------------------
run :-
count_solutions(Raw),
count_unique(Sym),
format("~n"),
print_unique_solutions,
format("Summary: ~w raw solutions, ~w up to D4 symmetry.~n",
[Raw, Sym]).
The post Solving a chess puzzle with Claude and Prolog first appeared on John D. Cook.
2026-06-11 07:11:01
I ran an experiment today to see whether Claude [1] could generate Lean code to prove a calculation at the bottom of this post, six lines of calculus.
I started with this prompt
This page contains a mathematical proof that a Fourier coefficient, a_n, is given in terms of a Bessel function. The LaTeX source for the SVG image is contained in the alt tag of the image. Generate a formal proof of the result using Lean.
and give it the URL of the post. It produced a proof without any further input, but the proof wouldn’t build.
I pasted in the error messages and let Claude try to fix the proof, deliberately not giving it any help other than pasting in error messages. It took eight iterations, but eventually it worked.
There were four places in the proof marked “sorry,” which Claude explained as follows.
Why the sorrys?
Real.besselJ definition or a Kepler namespace. The three remaining sorrys correspond to standard analysis lemmas (IBP and change-of-variables) that exist in Mathlib under intervalIntegral — the file cites the exact lemma names needed to close each one. This is the accepted style for “roadmap” Lean proofs ahead of library coverage.
The full proof is given below.
/-
KeplerBessel.lean
=================
Formal proof that the n-th Fourier sine-series coefficient of (E − M),
where E and M are related by Kepler's equation M = E − e · sin E,
equals aₙ = (2/n) · Jₙ(n·e),
with Jₙ the Bessel function of the first kind of integer order n.
Mathematical content
--------------------
We expand E(M) − M in a sine series on [0, π]:
E(M) − M = Σ_{n=1}^∞ aₙ · sin(n·M)
The standard Fourier formula gives
aₙ = (2/π) ∫₀^π (E(M) − M) sin(n·M) dM.
Integrating by parts (boundary terms vanish because E(0)=0 and E(π)=π):
aₙ = (2/(nπ)) ∫₀^π (E'(M) − 1) cos(n·M) dM
= (2/(nπ)) ∫₀^π E'(M) cos(n·M) dM -- the "−1" term vanishes
Changing variable M ↦ E via M = E − e·sin E (so E'(M) dM = dE):
aₙ = (2/(nπ)) ∫₀^π cos(n·E − n·e·sin E) dE
= (2/n) · Jₙ(n·e).
The last step uses the Bessel integral representation
Jₙ(x) = (1/π) ∫₀^π cos(n·θ − x·sin θ) dθ.
-/
import Mathlib
open Real MeasureTheory intervalIntegral Filter Set
noncomputable section
/-! ---------------------------------------------------------------
§1 Variables
--------------------------------------------------------------- -/
variable (e : ℝ) (he : 0 ≤ e) (he1 : e < 1) /-! --------------------------------------------------------------- §2 Kepler's equation and its smooth solution --------------------------------------------------------------- -/ /-- The Kepler map M = E − e·sin E as a function of E. -/ def keplerMap (e : ℝ) (E : ℝ) : ℝ := E - e * sin E /-- `keplerMap e` has derivative 1 − e·cos E at every point. -/ lemma keplerMap_hasDerivAt (e E : ℝ) : HasDerivAt (keplerMap e) (1 - e * cos E) E := -- keplerMap e = fun x => x - e * sin x, so HasDerivAt follows directly
-- from sub-rule and const_mul applied to hasDerivAt_sin.
(hasDerivAt_id E).sub ((hasDerivAt_sin E).const_mul e)
/-- The derivative of `keplerMap e` is positive when e < 1. -/
lemma keplerMap_deriv_pos {e' : ℝ} (he' : 0 ≤ e') (he1' : e' < 1) (E : ℝ) :
0 < 1 - e' * cos E := by
have hcos : cos E ≤ 1 := cos_le_one E
nlinarith [mul_le_of_le_one_right he' hcos]
/-- `keplerMap e` is strictly monotone when e < 1.
Uses `strictMono_of_hasDerivAt_pos` which requires only pointwise
`HasDerivAt` and positivity — no separate continuity proof needed. -/
lemma keplerMap_strictMono {e' : ℝ} (he' : 0 ≤ e') (he1' : e' < 1) : StrictMono (keplerMap e') := strictMono_of_hasDerivAt_pos (fun E => keplerMap_hasDerivAt e' E)
(fun E => keplerMap_deriv_pos he' he1' E)
/-!
We axiomatise the inverse eccAnom : ℝ → ℝ → ℝ and its key
properties, all of which follow from the Inverse Function Theorem
applied to the smooth, strictly monotone map keplerMap e.
-/
/-- The eccentric anomaly: the smooth right-inverse of `keplerMap e`. -/
axiom eccAnom (e : ℝ) : ℝ → ℝ
/-- `eccAnom e M` satisfies Kepler's equation. -/
axiom eccAnom_kepler (e M : ℝ) :
keplerMap e (eccAnom e M) = M
/-- `eccAnom e` is differentiable, derivative = 1/(1 − e·cos(eccAnom e M)). -/
axiom eccAnom_hasDerivAt (e M : ℝ) :
HasDerivAt (eccAnom e) (1 / (1 - e * cos (eccAnom e M))) M
/-- Boundary value at 0. -/
axiom eccAnom_zero (e : ℝ) : eccAnom e 0 = 0
/-- Boundary value at π. -/
axiom eccAnom_pi (e : ℝ) : eccAnom e π = π
/-! ---------------------------------------------------------------
§3 Bessel function of the first kind (integer order)
Defined by the classical integral representation.
--------------------------------------------------------------- -/
/-- Bessel function J_n(x) via its integral representation. -/
def besselJ (n : ℕ) (x : ℝ) : ℝ :=
(1 / π) * ∫ θ in (0 : ℝ)..π, cos (↑n * θ - x * sin θ)
/-! ---------------------------------------------------------------
§4 Fourier coefficient
Named keplerFourierCoeff to avoid clashing with Mathlib's own
`fourierCoeff` which is defined on AddCircle.
--------------------------------------------------------------- -/
/-- The n-th Fourier sine coefficient of eccAnom e M − M on [0,π]. -/
def keplerFourierCoeff (e : ℝ) (n : ℕ) : ℝ :=
(2 / π) * ∫ M in (0 : ℝ)..π,
(eccAnom e M - M) * sin (↑n * M)
/-! ---------------------------------------------------------------
§5 Main theorem
--------------------------------------------------------------- -/
/--
**Main theorem.** For n ≥ 1, the Fourier sine coefficient of the
eccentric-anomaly displacement satisfies aₙ = (2/n) · Jₙ(n·e).
-/
theorem keplerFourierCoeff_eq_besselJ (n : ℕ) (hn : 1 ≤ n) :
keplerFourierCoeff e n = (2 / (n : ℝ)) * besselJ n (↑n * e) := by
simp only [keplerFourierCoeff, besselJ]
-- Goal:
-- (2/π) · ∫₀^π (E(M)−M)·sin(nM) dM
-- = (2/n) · (1/π) · ∫₀^π cos(nθ − ne·sinθ) dθ
-- ── Step 1: Integration by parts ─────────────────────────────────────
-- u = E(M)−M, dv = sin(nM)dM → v = −cos(nM)/n
-- Boundary: [uv]₀^π = 0 by eccAnom_zero, eccAnom_pi.
-- Result: (2/π)∫(E−M)sin(nM)dM = (2/π)(1/n)∫(E'(M)−1)cos(nM)dM
--
-- Mathlib lemma: intervalIntegral.integral_mul_deriv
-- (or integral_deriv_mul_eq_sub_of_hasDerivAt applied to
-- u = eccAnom e − id, v = −sin(n··)/n)
have step1 :
(2 / π) * ∫ M in (0 : ℝ)..π, (eccAnom e M - M) * sin (↑n * M)
= (2 / π) * (1 / ↑n) *
∫ M in (0 : ℝ)..π, (deriv (eccAnom e) M - 1) * cos (↑n * M) := by
sorry
-- ── Step 2: The "−1" integral vanishes ───────────────────────────────
-- ∫₀^π cos(nM) dM = [sin(nM)/n]₀^π = 0 (integer n ≥ 1)
-- Mathlib: integral_cos, Real.sin_nat_mul_pi
have cos_int_zero :
∫ M in (0 : ℝ)..π, cos (↑n * M) = 0 := by
sorry
have step2 :
∫ M in (0 : ℝ)..π, (deriv (eccAnom e) M - 1) * cos (↑n * M)
= ∫ M in (0 : ℝ)..π, deriv (eccAnom e) M * cos (↑n * M) := by
have key : ∀ M : ℝ, (deriv (eccAnom e) M - 1) * cos (↑n * M)
= deriv (eccAnom e) M * cos (↑n * M) - cos (↑n * M) := by
intro M; ring
simp_rw [key]
rw [intervalIntegral.integral_sub _ _]
· rw [cos_int_zero, sub_zero]
· -- IntervalIntegrable (deriv (eccAnom e) · cos(n··))
sorry
· exact (continuous_cos.comp (continuous_const.mul continuous_id')).intervalIntegrable 0 π
-- ── Step 3: Change of variable M ↦ E via Kepler's equation ───────────
-- Under M = E − e·sin E: E'(M) dM = dE, cos(nM) = cos(nE − ne·sinE)
-- Mathlib: MeasureTheory.integral_image_eq_integral_abs_deriv_smul
-- or intervalIntegral.integral_comp_deriv
have step3 :
∫ M in (0 : ℝ)..π, deriv (eccAnom e) M * cos (↑n * M)
= ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E) := by
sorry
-- ── Step 4: Recognise the Bessel integral ────────────────────────────
-- (1/π)∫₀^π cos(nE − ne·sinE)dE = J_n(ne) by definition.
have step4 :
(1 / π) * ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)
= besselJ n (↑n * e) := by
simp only [besselJ]
-- ── Combine all steps ────────────────────────────────────────────────
have hpi : π > 0 := Real.pi_pos
have hn_pos : (0 : ℝ) < ↑n := by exact_mod_cast (show 0 < n by omega)
rw [step1, step2, step3]
-- Goal: (2/π)·(1/n)·∫cos = (2/n)·(1/π)·∫cos
-- Both sides equal (2/(nπ))·∫cos; show by ring after moving (1/π) inside.
have factored : (2 / π) * (1 / ↑n) *
∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)
= (2 / ↑n) * ((1 / π) *
∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)) := by
ring
rw [factored, step4]
end
/-! ---------------------------------------------------------------
§6 Summary of proof obligations
Axioms (all standard consequences of the Inverse Function Theorem
applied to the smooth, strictly increasing map keplerMap e):
• eccAnom — existence of the smooth inverse
• eccAnom_kepler — it satisfies Kepler's equation
• eccAnom_hasDerivAt — C¹ with derivative 1/(1 − e·cos E)
• eccAnom_zero — E(0) = 0
• eccAnom_pi — E(π) = π
`sorry`s (each closes with a standard Mathlib lemma):
• step1 IBP via
intervalIntegral.integral_deriv_mul_eq_sub_of_hasDerivAt
• cos_int_zero ∫₀^π cos(nM)dM = 0 via
integral_cos + Real.sin_nat_mul_pi
• step2 integrability IntervalIntegrable for deriv(eccAnom e)·cos(n·)
• step3 Change of variables via
MeasureTheory.integral_image_eq_integral_abs_deriv_smul
§7 Finding minimal imports
Once the file builds cleanly, add at the bottom:
#min_imports
and the Lean Infoview will report the exact minimal import list
for the version of Mathlib you have installed.
--------------------------------------------------------------- -/
[1] I used the default setting: Sonnet 4.6, low effort, “thinking” turned off.
The post Formally proving a calculation with Claude and Lean first appeared on John D. Cook.2026-06-10 21:34:31
Often there’s a thread running through a sequence of my posts. Sometimes I make this explicit and sometimes I don’t.
The latest thread started with this post commenting on a tweet that observed that
exp(−x²) ≈ (1 + cos(sin(x) + x))/2.
Some people said online that that the approximation is simply due to the first few terms of the Taylor series on both sides matching up, so I wrote a follow up post explaining that it’s not that simple.
The series for the left hand side alternates and converges very slowly, which lead to the post on naively summing an alternating series.
The series for the right hand side lead to this point on partitions over permutations.
Integrating the right hand side lead to this post on how the simplest numerical integration rule works shockingly well on some problems.
The exact value of the integral turns out to be given by a Bessel function, details given in this post.
Mr. Bessel’s interest in the functions now named after him started with looking closely at a solution to Kepler’s equation in orbital mechanics. Thinking about Kepler’s equation lead to the posts on the Laplace limit and on series acceleration.
I may be done pulling on this thread. I don’t have anything else in mind that I want to explore for now, but you never know.
The post Pulling on a thread first appeared on John D. Cook.2026-06-08 04:14:45
Kepler solved his eponymous equation
M = E − e sin(E)
by finding a fixed point of
E = M + e sin(E).
So guess a value of E and stick it into the right hand side. Then plug that value into the right hand side again. Kepler said a couple iterations should be enough. And a couple iterations are enough if the eccentricity e is small and you don’t need much accuracy.
The rate of convergence is determined by e. Kepler implicitly had in mind small values of e because he wasn’t aware of anything orbiting the sun in a highly elliptical orbit. Here’s an example with eccentricity 0.05, about the eccentricity of the orbits of Jupiter and Saturn.
from math import sin
M, E, e = 1, 1, 0.05
for _ in range(5):
E = M + e*sin(E)
residual = M - (E - e*sin(E))
print(residual)
The residual after just two iterations is 2.77 × 10−5. If you change e to 0.2, the eccentricity of Mercury’s orbit, it takes three iterations to get comparable accuracy. Mercury has the most eccentric orbit of any object Kepler would have known about.
Now suppose you’d like to solve for E when M = 1 for Halley’s comet, and you’d like an error of less than 10−8. Now you need 16 iterations.
C. F. W. Peters discovered a faster algorithm in 1891.
E1 = M + e sin(E0)
E2 = M + e sin(E1)
E3 = (E2E0 − E1²)/(E2 − 2E1 + E0)
Let’s look at the results of doing three iterations of Peters’ method for Halley’s comet.
M, E0, e = 1, 1, 0.967
for _ in range(3):
E1 = M + e*sin(E0)
E2 = M + e*sin(E1)
E3 = (E2*E0 - E1**2)/(E2 - 2*E1 + E0)
residual = M - (E - e*sin(E3))
print(residual)
E0 = E3 # for next iteration
This gives a residual of −7.23 × 10−10. Each iteration of Peters’ method requires a little more than twice as much work as an iteration of Kepler’s method, but 3 iterations of Peters’ method accomplished more than 16 iterations of Kepler’s method.
Peters’ algorithm from 1891 was a special case of Alexander Aitken’s series acceleration method published in 1926.
2026-06-08 03:06:36
An earlier post discussed how to solve Kepler’s equation
M = E − e sin(E)
using a sine series. You could also solve Kepler’s equation using a power series, which Lagrange did in 1771. Both approaches express E as a function of e and M, but from different perspectives. Bessel thought of his solution as a sum of sines in M, with coefficients that depend on e. Lagrange thought of his solution as a power series in e whose coefficients involve sines in M. You can rearrange the terms of either solution into the other.
The most interesting thing about the power series solution, in my opinion, is that it only converges for e less than roughly 2/3 while the sine series solution is valid for all e < 1. In astronomical terms, this means the power series solution works for the orbit of some planets but not others!
In our solar system, the planets all have eccentricity well below 2/3, but not all minor planets do. For example, the orbit of Eris has eccentricity 0.4407 but the orbit of Sedna has eccentricity 0.8549. And in other solar systems there are planets with eccentricity much greater than 2/3.
The radius of convergence for Lagrange’s power series solution is called the Laplace limit. Its value is eL = 0.6627…. There’s no obvious reason why there’s anything special about this value. There’s no astronomical reason for this value. It’s an artifact of the power series form of the solution.
If the series works for e = 0.66, you would reasonably think it works for e = 0.67, but that’s not the case. And if you’re observant, you might notice that although the series works for e = 0.66, it takes longer to converge than for smaller values of e; the rate of convergence is slowing down, warning you of danger ahead.
The exact value of eL is the unique real solution to the equation
There’s no obvious reason for this either. It has to do with finding the largest circle that can fit in a lens-shaped region of convergence. More on that here.
We can calculate eL with the following Python code.
from math import exp
from scipy.optimize import root_scalar
def f(x):
t = (1 + x*x)**0.5
return x*math.exp(t) - 1 - t
sol = root_scalar(f, bracket=[0, 1], method='brentq')
print(sol.root)
This prints 0.6627434193491817.
We can use the Lagrange inversion formula to find the series, just as Lagrange did two and a half centuries ago.
The powers of sine can be expanded into the sum of sines of various frequencies and differentiated, leading to the equation
The post The Laplace limit first appeared on John D. Cook.
2026-06-08 00:34:26
I ran across a cranky formula for π based on physical constants here
and decided to play around with it.
The source describes λ as “wavelength (chosen in the microwave region)” and I thought perhaps you could chose a value of λ to make the equation work. But as a comment pointed out, the bracketed expression is simply 2hc², independent of λ, due to Planck’s blackbody law. That means we can simplify the expression above to
Now the values of h and c are known. In fact, they’re now exactly known by definition: other SI units are defined in terms of h and c. The mass of an electron is known to 11 significant figures.
But E in the equation above is “Total energy of the universe.” I don’t even know what that means. Does it refer to the observable universe? Does it include dark energy? Does it include the energy equivalent of mass?
I asked a couple LLMs that the total energy of the universe might mean and what its value might be, and they said something like “Depends. It might be zero. It might be infinite. But if I had to say, I’d say around 1070 Joules.”
If we solve the equation above for E we get 2.8480347886530404 × 1019 Joules. I have no idea how to justify that.
The expression for π is not dimensionless. I suppose you could choose some nonstandard units that would make the equation work.
The source I linked to above cites Mathematical Cranks by Underwood Dudley, but I couldn’t find it in the book.