MoreRSS

site iconJohn D. CookModify

I have decades of consulting experience helping companies solve complex problems involving applied math, statistics, and data privacy.
Please copy the RSS to your reader, or quickly subscribe to:

Inoreader Feedly Follow Feedbin Local Reader

Rss preview of Blog of John D. Cook

From Kepler to Bessel

2026-06-07 02:46:02

The previous post very briefly said that the integral representation for Bessel functions was motived by solving Kepler’s equation. This post will go into more detail.

Kepler’s equation

There are multiple ways to describe the position of a planet in an elliptical orbit around a star. For historical reasons, these descriptions have arcane names such as mean anomaly, true anomaly, and eccentric anomaly. This post explains how these three are related.

For this post, it is enough to say that often you know mean anomaly M and want to know eccentric anomaly E. These are related via Kepler’s equation

M = E - e \sin E
where e is the eccentricity of the orbit. You’d like to solve for E as a function of M and e, but there’s no elementary way to do that.

One way to solve Kepler’s equation is to take a guess at E and plug it into the right hand side of

E = M + e \sin E
to get a new E, and keep iterating until the two sides are closer together. I write more about this here.

Another approach to solving Kepler’s equation is to use Newton’s method. I write more about that here.

Still another approach is to expand E in a sine series and find the series coefficients. An advantage to this approach is that once you have the coefficients, you have an expression for E as a function of M, and you can plug in more values of M without having to solve Kepler’s equation for each value of M separately.

Sine series coefficients

Kepler’s equation is easy to solve at E = 0 and at E = π. In both cases, EM. So the function E − M is zero at both ends of [0, π], which suggests we try to expand E − M in a sine series

E - M = \sum_{n=1}^\infty a_n \sin nM

We then calculate the Fourier coefficients an as usual.

\begin{align*} a_n &= \frac{2}{\pi} \int_0^\pi (E-M) \sin(nM) \, dM \\ &= \frac{2}{n \pi} \int_0^\pi (E^\prime - 1) \cos(nM)\, dM \\ &= \frac{2}{n \pi} \int_0^\pi \cos(nM) E^\prime(M) \, dM \\ &= \frac{2}{n \pi} \int_0^\pi \cos\Big(nE - n \sin(E)\Big) E^\prime(M) \, dM \\ &= \frac{2}{n} \left\{ \frac{1}{\pi} \int_0^\pi \cos\Big(nE - ne \sin(E)\Big)\, dE\right\} \\ &= \frac{2}{n} J_n(ne) \end{align*}

The second line uses integration by parts. The third line uses Kepler’s equation. The last line uses the definition of the Bessel functions Jn given in the previous post.

The post From Kepler to Bessel first appeared on John D. Cook.

Mr. Bessel’s eponymous functions

2026-06-05 20:04:28

Yesterday I wrote a post showing that the trapezoid rule evaluates the integral

\int_{-\pi}^\pi \cos(\sin(x) + x)\, dx

very efficiently. But how do we know what the exact integral is for comparison? If you ask Mathematica, it will tell you the integral equals −2π J1(1) where J1 is a Bessel function. This may seem like rabbit out of a hat, but it’s actually a simple calculation given the integral definition of Bessel functions:

J_n(z) = \frac{1}{\pi}\int_0^\pi \cos(n\theta - z\sin(\theta))\, d\theta

Since cosine is even, we can write our integral over [−π, π] as twice the integral over [0, π]. Then a change of variables turns this into the definition of Jn(z) with n = 1 and z = 1.

A deeper question is what have we accomplished by just giving a new name to essentially the same problem we started with. Another question is why in the world are Bessel functions defined as above.

As for what we’ve accomplished, we’ve related out integration problem to a very well-studied function. Bessel functions have been studied for two centuries and it’s easy to find software to evaluate them. Even the usually minimalist command line calculator bc has a function j(x, n) for evaluating Jn(x) for integer values of n. We could calculate our integral to 50 decimal places as follows.

~$ bc -l
>>> scale = 50
>>>  -8*a(1)*j(1,1)
-2.76491937476833705153256665538788207487495025542883

Note that bc doesn’t have a value of π built in, but a(x) evaluates the arctangent function, and π = 4 arctan(1).

There are multiple ways of defining Bessel functions. The three main ways would be in terms of their power series, in terms of the differential equations they solve, and in terms of their integral representation. Friedrich Bessel defined what we now call Bessel functions of the first kind, the Jn functions, in terms of their integral representations.

Why did Bessel care about these integrals? They came out of his calculations in celestial mechanics. One example of this is solving Kepler’s equation with Fourier series; the Fourier coefficients are given by Bessel functions. This is worked out in detail in the next post.

Bessel functions had occurred in applications before Mr. Bessel drew attention to them. The functions are named after him because he was the first to systematically study them.

Mathematics is developed inductively but taught deductively. It’s common for things to be kicked around for years before someone decides they deserve a name and systematic study. See this post on the central limit theorem for another example. The CLT is older than the Gaussian distribution, even older than Gauss.

Related posts

The post Mr. Bessel’s eponymous functions first appeared on John D. Cook.

The Latin of Linux

2026-06-05 04:12:11

One reason people study Latin is that it is the ancestor of many modern languages. English derives from West Germanic languages, not from Latin, but much of English vocabulary, perhaps as much as 60%, derives from Latin, either directly or indirectly through French.

Knowing a bit of Latin makes sense of many things that would otherwise seem completely arbitrary, such as why the symbols for gold, silver, and lead are Au, Ag, and Pb respectively.

Similarly, ed(1) is the Latin of Linux [1]. Many conventions in command line utilities follow conventions that go back to the ed(1) line editor. They may go back even further. Just as Latin didn’t come out of nowhere, neither did ed(1), but you can’t go back indefinitely. It’s convenient to start history somewhere, and this post will start with ed(1) just as much discussion of Western linguistics starts with Latin.

The following are features of ed(1) that live on in sed, awk, grep, vi, perl, bash, etc.

  1. Using slashes to delimit regular expressions
  2. Using $ to indicate the end of a line or the end of a file
  3. The pattern of specifying address + action or address range + action
  4. Using regular expressions as address ranges
  5. Using \1, \2, etc to refer to regex captures
  6. Using & to refer to the entire matched text
  7. The g/regexp/command pattern
  8. Using p for printing lines, as in g/re/p
  9. The commands a, c, d, i, j, l, p, q, r, and w in vi
  10. ! for shell escape

 

[1] Because the name “ed” is so short, and looks so much like the name Ed, it’s convenient to use its full Unix name ed(1). The parenthesized number is used to disambiguate different things that have the same name, such as the user command kill(1) and the system call kill(2). There is no ed(2) or any other higher-numbered ed. The number is there to make the name stand out, not to disambiguate anything.

The post The Latin of Linux first appeared on John D. Cook.

Integrating smooth periodic functions

2026-06-05 01:18:40

Several posts lately have looked at the function

f(x) = cos(sin(x) + x).

This post will look at the function from a different angle. It’s a smooth function with period 2π. For reasons I wrote about here, this means that the trapezoid rule should find its integral very efficiently.

In general, the error in the trapezoid rule is on the order of 1/N² where N is the number of integration points. To be more specific, the error in integrating a function f over [a, b] with N points is bounded by

(baM / 12N²

where M is the maximum absolute value of the second derivative of f. So in our case we should expect the error to be less than 82.67/N². In fact we do much better than that. The error does not decrease quadratically, as it does in general, but exponentially.

With just 16 integration points, we’ve reached the limit of floating point representation.

The post Integrating smooth periodic functions first appeared on John D. Cook.

Partitions over permutations

2026-06-04 21:48:56

I was thinking more about the cosine approximation to the Gaussian

exp(−z²) ≈ (1 + cos(sin(z) + z))/2

that I wrote about last week. The two expressions above are close along the real axis but not along the imaginary axis.

If ziy, the right side grows much faster than the left, behaving like exp(exp(y)).

This led to me looking up the power series for the double exponential function exp(exp(y)). This is an interesting series because the coefficient of xn is

e Bn / n!

where Bn is the nth Bell number, which equals the number of ways to partition a set of n labeled items [1]. And of course n! is the number of ways to permute a set of n labeled items. So the nth coefficient in the power series for exp(exp(y)) is the ratio of the number of partitions to permutations for a set of n labeled things, multiplied by e.

The number of ways to partition a set of n things grows quickly as n increases, almost as quickly as the number of permutations, and so the series for the double exponential function converges very slowly.

Computing

SymPy has a function bell for computing Bell numbers, so you could compute the ratio of partitions to permutations as follows.

from sympy import bell, factorial
f = lambda n: bell(n)/factorial(n)

This returns a number of type sympy.core.numbers.Rational and so the result is exact. You can cast it to float for convenience.

Asymptotics

If we look at only the terms in the asymptotic series for log Bn and log n! that grow with n we have

log Bn ~ n log n − n log log n
log n! ~ n log n − ½ log n

and so

log( Bnn! ) ~ ½ log n − n log log n

There’s also an asymptotic series for log( Bnn! ) involving the Lambert W function:

log( Bnn! ) ~ n/r − 1 − n log r

where r = W(n).

Related posts

[1] It’s important that the items are labeled. Partition numbers are the number of partitions of an unlabeled set. Partition numbers are much smaller than Bell numbers.

The post Partitions over permutations first appeared on John D. Cook.

Naively summing an alternating series

2026-06-03 23:13:24

Suppose you run across the power series for the exponential function and decide to code it up. Good idea: you’ll probably learn something, though maybe not what you expect.

Maybe you decide a tolerance of 10−12 is good enough, and so you sum the terms until the next term to add is below the tolerance.

from math import factorial, exp

def naive_exp(x):
    tolerance = 1e-12
    s = 0
    n = 0
    while True:
        delta = x**n / factorial(n)
        s += delta
        if abs(delta) < tolerance:
            return s
        n += 1

You want to try your program out, so you compute e by calling the function at 1. If you compare this to calling exp(1) you find that you got all the digits correct.

Now you try computing exp(-20). Calling naive_exp(-20) gives

    5.47893091802112e-10

but calling exp(-20) gives

    2.061153622438558e-09

Don’t brush things like this as flukes or compiler bugs [1]. This is your golden opportunity to learn something.

Maybe you add a print statement to see the intermediate values of the sum stored in the variable s. If you do, you’ll see that the partial sums oscillate wildly before settling down.

Maybe that seems wrong, but then you look more carefully at the series. The nth term is xn/n!. Since x is negative, the terms alternate in sign. And the absolute values of the term get bigger before they get smaller. When x = −20, each numerator is 20 times larger than the previous, and each denominator is n times larger than the previous. So the terms will get bigger until n > 20. So the wild oscillations are real, not a bug.

The largest partial sum is 21822593.77927747 in absolute value. You know that exp(−20) is a very small number, so there’s going to have to be a lot of cancellation before the partial sums settle down to a small number. Maybe you’ve heard that cancellation is where numerical calculations lose precision. If not, now you know!

Look again at the largest partial sum. There are eight figures to the right of the decimal point. The code is printing out results to as much precision as it has, so the error at this point is on the order of 10−8. We’re trying to compute a number on the order of 10−9, and if any digits in our result are correct, it would be a coincidence.

If you go back and try your code on x = −22, the result is even worse, giving a negative result for a quantity that for theoretical reasons cannot be negative. But you can see why: you’re asking the code to compute a number that is closer to zero than the accuracy of the code.

Computers don’t represent numbers in base 10 internally, but the argument above is sufficient in this case. If you want to dig deeper, look into the anatomy of a floating point number.

There is a simple way around the problem above, but discovering it sooner would short-circuit the learning process. You could calculate exp(−20) as 1/exp(20) and avoid all the cancellation because the series for exp(20) does not alternate.

 

[1] Compilers do have bugs occasionally, but it’s orders of magnitude more likely that something is wrong with your code.

The post Naively summing an alternating series first appeared on John D. Cook.