MoreRSS

site iconJohn D. CookModify

I have decades of consulting experience helping companies solve complex problems involving applied math, statistics, and data privacy.
Please copy the RSS to your reader, or quickly subscribe to:

Inoreader Feedly Follow Feedbin Local Reader

Rss preview of Blog of John D. Cook

Largest known compositorial prime

2026-01-07 01:11:41

I ran across a blog post here that said a new record has been set for the largest compositorial prime. [1]

OK, so what is a compositorial prime? It is a prime number of the form

n! / n# + 1

where n# denotes n primorial, the product of the prime numbers no greater than n.

The newly discovered prime is

N = 751882!/751882# + 1

It was described in the article cited above as

751882!/751879# + 1,

but the two numbers are the same because there are no primes greater than 751879 and less than 751882, i.e.

751879# = 751882#.

About how large is N? We can calculate the log of the numerator easily enough:

>>> import scipy.special
>>> scipy.special.loggamma(751883)
9421340.780760147

However, the denominator is harder to compute. According to OIES we have

n# = exp((1 + o(1)) n)

which would give us the estimate

log(751882#) ≈ 751882.

So

log10(N) = log(N) / log(10) ≈ (9421340 − 751882) / log(10) ≈ 3765097.

According to this page,

log10(N) = 3765620.3395779

and so our approximation above is was good to four figures.

So N has between 3 and 4 million digits, making it much smaller than the largest known prime, which has roughly 41 million digits. Overall, N is the 110th largest known prime.

 

[1] I misread the post at first and thought it said there was a new prime record (skipping over the “compositorial” part) and was surprised because the number is not a Mersenne number. For a long time now the largest known prime has been a Mersenne prime because there is a special algorithm for testing whether Mersenne numbers are prime, one that is much more efficient than testing numbers in general.

 

The post Largest known compositorial prime first appeared on John D. Cook.

log2(3) and log2(5)

2026-01-06 09:02:08

AlmostSure on X pointed out that

log2 3 ≈ 19/12,

an approximation that’s pretty good relative to the size of the denominator. To get an approximation that’s as accurate or better requires a larger denominator for log2 5.

log2 5 ≈ 65/28

This above observations are correct, but are they indicative of a more general pattern? Is log2 3 easier to approximate than log2 5 using rational numbers? There are theoretical ways to quantify this—irrationality measures—but they’re hard to compute.

If you look at the series of approximations for both numbers, based on continued fraction convergents, the nth convergent for log2 5 is more accurate than the nth convergent for log2 3, at least for the first 16 terms. After that I ran out of floating point precision and wasn’t sufficiently interested to resort to extended precision.

Admittedly this is a non-standard way to evaluate approximation error. Typically you look at the approximation error relative to the size of the denominator, not relative to the index of the convergents.

Here’s a more conventional comparison, plotting the log of approximation error against the log of the denominators.

Continued fraction posts

The post log2(3) and log2(5) first appeared on John D. Cook.

In-shuffles and out-shuffles

2026-01-02 04:48:02

The previous post talked about doing perfect shuffles: divide a deck in half, and alternately let one card from each half fall.

It matters which half lets a card fall first. If the top half’s bottom card falls first, this is called an in-shuffle. If the bottom half’s bottom card falls first, it’s called an out-shuffle.

With an out-shuffle, the top and bottom cards don’t move. Presumably it’s called an out-shuffle because the outside cards remain in place.

An out-shuffle amounts to an in-shuffle of the inner cards, i.e. the rest of the deck not including the top and bottom card.

The previous post had a Python function for doing an in-shuffle. Here we generalize the function to do either an in-shuffle or an out-shuffle. We also get rid of the list comprehension, making the code longer but easier to understand.

def shuffle2(deck, inside = True):
    n = len(deck)
    top = deck[: n//2]
    bottom = deck[n//2 :]
    if inside:
        first, second = bottom, top
    else:
        first, second = top, bottom
    newdeck = []
    for p in zip(first, second):
        newdeck.extend(p)
    return newdeck

Let’s use this code to demonstrate that an out-shuffle amounts to an in-shuffle of the inner cards.

deck = list(range(10))
d1 = shuffle2(deck, False) 
d2 = [deck[0]] + shuffle2(deck[1:9], True) + [deck[9]]
print(d1)
print(d2)

Both print statements produce [0, 5, 1, 6, 2, 7, 3, 8, 4, 9].

I said in the previous post that k perfect in-shuffles will restore the order of a deck of n cards if

2k = 1 (mod n + 1).

It follows that k perfect out-shuffles will restore the order of a deck of n cards if

2k = 1 (mod n − 1)

since an out-shuffle of n cards is essentially an in-shuffle of the n − 2 cards in the middle.

So, for example, it only takes 8 out-shuffles to return a deck of 52 cards to its original order. In the previous post we said it takes 52 in-shuffles, so it takes a lot fewer out-shuffles than in-shuffles.

It’s plausible to conjecture that it takes fewer out-shuffles than in-shuffles to return a deck to its initial order, since the former leaves the two outside cards in place. But that’s not always true. It’s true for a deck of 52 cards, but not for a deck of 14, for example. For a deck of 14 cards, it takes 4 in-shuffles or 12 out-shuffles to restore the deck.

The post In-shuffles and out-shuffles first appeared on John D. Cook.

Perfect and imperfect shuffles

2026-01-01 21:31:18

Take a deck of cards and cut it in half, placing the top half of the deck in one hand and the bottom half in the other. Now bend the stack of cards in each hand and let cards alternately fall from each hand. This is called a rifle shuffle.

Random shuffles

Persi Diaconis proved that it takes seven shuffles to fully randomize a desk of 52 cards. He studied videos of people shuffling cards in order to construct a realistic model of the shuffling process.

Shuffling randomizes a deck of cards due to imperfections in the process. You may not cut the deck exactly in half, and you don’t exactly interleave the two halves of the deck. Maybe one card falls from your left hand, then two from your right, etc.

Diaconis modeled the process with a probability distribution on how many cards are likely to fall each time. And because his model was realistic, after seven shuffles a deck really is well randomized.

Perfect shuffles

Now suppose we take the imperfection out of shuffling. We do cut the deck of cards exactly in half each time, and we let exactly one card fall from each half each time. And to be specific, let’s say the first card will always fall from the top half of the deck. That is, we do an in-shuffle. (See the next post for a discussion of in-shuffles and out-shuffles.) A perfect shuffle does not randomize a deck because it’s a deterministic permutation.

To illustrate a perfect in-shuffle, suppose you start with a deck of these six cards.

A23456

Then you divide the deck into two halves.

A23 456

Then after the shuffle you have the following.

4A5263

Incidentally, I created the images above using a font that included glyphs for the Unicode characters for playing cards. More on that here. The font produced black-and-white images, so I edited the output in GIMP to turn things red that should be red.

Coming full circle

If you do enough perfect shuffles, the deck returns to its original order. This could be the basis for a magic trick, if the magician has the skill to repeatedly perform a perfect shuffle.

Performing k perfect in-shuffles will restore the order of a deck of n cards if

2k = 1 (mod n + 1).

So, for example, after 52 in-shuffles, a deck of 52 cards returns to its original order. We can see this from a quick calculation at the Python REPL:

>>> 2**52 % 53
1

With slightly more work we can show that less than 52 shuffles won’t do.

>>> for k in range(1, 53):
    ... if 2**k % 53 == 1: print(k)
52

The minimum number of shuffles is not always the same as the size of the deck. For example, it takes 4 shuffles to restore the order of a desk of 14 cards.

>>> 2**4 % 15
1

Shuffle code

Here’s a function to perform a perfect in-shuffle.

def shuffle(deck):
    n = len(deck)
    return [item for pair in zip(deck[n//2 :], deck[:n//2]) for item in pair]

With this you can confirm the results above. For example,

n = 14
k = 4
deck = list(range(n))
for _ in range(k):
    deck = shuffle(deck)
print(deck)

This prints 0, 1, 2, …, 13 as expected.

Related posts

The post Perfect and imperfect shuffles first appeared on John D. Cook.

Knight’s tour with fewest obtuse angles

2025-12-31 21:48:24

Donald Knuth gives a public lecture each year around Christmas. This year was his 29th Christmas lecture, Adventures with Knight’s Tours.

I reproduced one of the images from his lecture.

Knight's tour with the minimum number of obtuse angles

This is the knight’s tour with the minimum number of obtuse angles, marked with red dots. The solution is unique, up to rotations and reflections.

Knuth said he thought this was one of the most beautiful knight’s tours. He discusses this tour about 44 minutes into the video here.

More knight’s tour posts

The post Knight’s tour with fewest obtuse angles first appeared on John D. Cook.

The center of the earth is not straight down

2025-12-29 19:00:48

If the earth were a perfect sphere, “down” would be the direction to the center of the earth, wherever you stand. But because our planet is a bit flattened at the poles, a line perpendicular to the surface and a line to the center of the earth are not the same. They’re nearly the same because the earth is nearly a sphere, but not exactly, unless you’re at the equator or at one of the poles. Sometimes the difference matters and sometimes it does not.

From a given point on the earth’s surface, draw two lines: one straight down (i.e. perpendicular to the surface) and one straight to the center of the earth. The angle φ that the former makes with the equatorial plane is geographic latitude. The angle θ that the latter makes with the equatorial plane is geocentric latitude.

For illustration we will draw an ellipse that is far more eccentric than a polar cross-section of the earth.

At first it may not be clear why geographic latitude is defined the way it is; geocentric latitude is conceptually simpler. But geographic latitude is easier to measure: a plumb bob will show you which direction is straight down.

There may be some slight variation between the direction of a plumb bob and a perpendicular to the earth’s surface due to variations in surface gravity. However, the deviations due to gravity are a couple orders of magnitude smaller than the differences between geographic and geocentric latitude.

Conversion formulas

The conversion between the two latitudes is as follows.

\begin{align*} \theta &= \text{atan2}((1 - e^2)\sin\varphi, \cos\varphi) \\ \varphi &= \text{atan2}(\sin\theta, (1 - e^2)\cos\theta) \end{align*}

Here e is eccentricity. The equations above work for any ellipsoid, but for earth in particular e² = 0.00669438.

The function atan2(y, x) returns an angle in the same quadrant as the point (x, y) whose tangent is y/x. [1]

As a quick sanity check on the equations, note that when eccentricity e is zero, i.e. in the case of a circle, φ = θ. Also, if φ = 0 then θ = φ for all eccentricity values.

Next we give a proof of the equations above.

Proof

We can parameterize an ellipse with semi-major axis a and semi-minor axis b by

(x(t), y(t)) = (a \cos t, b \sin t)

The slope at a point (x(t), y(t)) is the ratio

\frac{y^\prime(t)}{x^\prime(t)} = \frac{b \cos t}{-a \sin t}

and so the slope of a line perpendicular to the tangent, i.e tan φ, is

\tan \varphi = \frac{a \sin t}{b \cos t} = \frac{a}{b} \tan t

Now

\tan \theta = \frac{b \sin t}{a \cos t} = \frac{b}{a} \tan t

and so

\begin{align*} \tan \varphi &= \frac{a}{b} \tan t \\ &= \frac{a}{b} \left( \frac{a}{b} \tan \theta \right) \\ &= \frac{a^2}{b^2} \tan \theta \\ &= \frac{1}{1 - e^2} \tan \theta \end{align*}

where e² = 1 − b²/a² is the eccentricity of the ellipse. Therefore

(1 - e^2) \tan \varphi = \tan \theta

and the equations at the top of the post follow.

Difference

For the earth’s shape, e² = 0.006694 per WGS84. For small eccentricities, the difference between geographic and geocentric latitude is approximately symmetric around 45°.

But for larger values of eccentricity the asymmetry becomes more pronounced.

Related posts

[1] There are a couple complications with programming language implementations of atan2. Some call the function arctan2 and some reverse the order of the arguments. More on that here.

The post The center of the earth is not straight down first appeared on John D. Cook.