2026-03-23 19:39:47
A few years ago I wrote about comm, a utility that lets you do set theory at the command line. It’s a really useful little program, but it has two drawbacks: the syntax is hard to remember, and the input files must be sorted.
If A and B are two sorted lists,
comm A B
prints A − B, B − A, and A ∩ B. You usually don’t want all three, and so comm lets you filter the output. It’s a little quirky in that you specify what you don’t want instead of what you do. And you have to remember that 1, 2, and 3 correspond to A − B, B − A, and A ∩ B respectively.

A couple little scripts can hide the quirks. I have a script intersect
comm -12 <(sort "$1") <(sort "$2")
and another script setminus
comm -23 <(sort "$1") <(sort "$2")
that sort the input files on the fly and eliminate the need to remember comm‘s filtering syntax.
The setminus script computes A − B. To find B − A call the script with the arguments reversed.
2026-03-21 00:58:42
The hardest part of using regular expressions is not crafting regular expressions per se. In my opinion, the two hardest parts are minor syntax variations between implementations, and all the environmental stuff outside of regular expressions per se.
Embedded regular expression modifiers address one of the environmental complications by putting the modifier in the regular expression itself.
For example, if you want to make a grep search case-insensitive, you pass it the -i flag. But if you want to make a regex case-insensitive inside a Python program, you pass a function the argument re.IGNORECASE. But if you put (?i) at the beginning of your regular expression, then the intention to make the match case-insensitive is embedded directly into the regex. You could use the regex in any environment that supports (?i) without having to know how to specify modifiers in that environment.
I was debugging a Python script this morning that worked under one version of Python and not under another. The root of the problem was that it was using re.findall() with several huge regular expression that had embedded modifiers. That was OK up to Python 3.5, then it was a warning between versions 3.6 and 3.10, and it’s an error in versions 3.11 and later.
The problem isn’t with all embedded modifiers, only global modifiers that don’t appear at the beginning of the regex. Older versions of Python, following Perl’s lead, would let you put a modifier like (?i) in the middle of a regex, and apply the modifier from that point to the end of the expression. In the latest versions of Python, you can either place the modifier at the beginning of the regex, or use a scoped modifier like (?:…) in the middle of the expression.
I didn’t want to edit the regular expressions in my code—some had over a thousand characters—so I changed re.findall() to regex.findall(). The third-party regex module is generally more Perl-compatible than Python’s standard re module.
2026-03-19 09:21:53
The gamma function Γ(z) extends the factorial function from integers to complex numbers. (Technically, Γ(z + 1) extends factorial.) There are other ways to extend the factorial function, so what makes the gamma function the right choice?
The most common answer is the Bohr-Mollerup theorem. This theorem says that if f: (0, ∞) → (0, ∞) satisfies
then f(x) = Γ(x). The theorem applies on the positive real axis, and there is a unique holomorphic continuation of this function to the complex plane.
But the Bohr-Mollerup theorem is not the only theorem characterizing the gamma function. Another theorem was by Helmut Wielandt. His theorem says that if f is holomorphic in the right half-plane and
then f(x) = Γ(x). In short, Weilandt replaces the log-convexity for positive reals with the requirement that f is bounded in a strip of the complex plane.
You might wonder what is the bound alluded to in Wielandt’s theorem. You can show from the integral definition of Γ(z) that
|Γ(z)| ≤ |Γ(Re z)|
for z in the right half-plane. So the bound on the complex strip {z: 1 ≤ Re z ≤ 2} equals the bound on the real interval [1, 2], which is 1.
The post A lesser-known characterization of the gamma function first appeared on John D. Cook.2026-03-18 10:50:12
The alternating series test is part of the standard calculus curriculum. It says that if you truncate an alternating series, the remainder is bounded by the first term that was left out. This fact goes by in a blur for most students, but it becomes useful later if you need to do numerical computing.
To be more precise, assume we have a series of the form
where the ai are positive and monotonically converge to zero. Then the tail of the series is bounded by its first term:
The more we can say about the behavior of the ai the more we can say about the remainder. So far we’ve assumed that these terms go monotonically to zero. If their differences
also go monotonically to zero, then we have an upper and lower bound on the truncation error:
If the differences of the differences,
also converge monotonically to zero, we can get a larger lower bound and a smaller upper bound on the remainder. In general, if the differences up to order k of the ai go to zero monotonically, then the remainder term can be bounded as follows.
Source: Mark B. Villarino. The Error in an Alternating Series. American Mathematical Monthly, April 2018, pp. 360–364.
2026-03-17 21:26:13
If a number has a finite but nonzero fractional part, so do all its powers. I recently ran across a proof in [1] that is shorter than I expected.
Theorem: Suppose r is a real number that is not an integer, and the decimal part of r terminates. Then rk is not an integer for any positive integer k.
Proof: The number r can be written as a reduced fraction a / 10m for some positive m. If s = rk were an integer, then
10mks = ak.
Now the left side of this equation is divisible by 10 but the right side is not, and so s = rk must not be an integer. QED.
The only thing special about base 10 is that we most easily think in terms of base 10, but you could replace 10 with any other base.
What about repeating decimals, like 1/7 = 0.142857142857…? They’re only repeating decimals in our chosen base. Pick the right base, i.e. 7 in this case, and they terminate. So the theorem above extends to repeating decimals.
[1] Eli Leher. √2 is Not 1.41421356237 or Anything of the Sort. The American Mathematical Monthly, Vol. 125, No. 4 (APRIL 2018), page 346.
The post Powers don’t clear fractions first appeared on John D. Cook.2026-03-17 21:22:39
The previous post introduced the idea of a twelve-tone row, a permutation of the twelve pitch classes of a chromatic scale.
The earlier post also introduced a group of operations on a tone row with elements P, R, I, and RI. Here P, which stands for “prime”, is the identity operator: it leaves the tone row alone. R stands for retrograde, which means to reverse the sequence. I stands for inversion, inverting each of the intervals in the row. If you apply R then I, you get the same result as if you first applied I then R. See the previous post for an example of each.
Do these operations always return a new tone row? In other words, are P(t), R(t), I(t), and RI(t) always distinct for all tone rows t?
It’s easy to see that P(t) and R(t) are always different. The first and last elements of t are different, and so the first element of P(t) does not equal the first element of R(t).
There is one interval that remains the same when inverted, namely the tritone. It’s possible to create a two-tone row that does not change under inversion, such as the sequence [B, F]. A tone row with more than two notes must have an interval that is not a tritone, and so inverting the tone row changes it.
If a tone row t has at least three notes, then P(t), R(t), I(t), and RI(t) are all distinct.
The pitch classes in a tone row are repeated in a cycle, so we might consider all rotations of a tone row to be in a sense the same. In mathematical terms, we can mod out by cyclic permutations.
If we apply R, I, and RI to equivalence classes of tone rows, the results are not always unique. For example, let t be the chromatic scale.
C C♯ D D♯ E F F♯ G G♯ A A♯ B
Then R(t) is
B A♯ A G♯ G F♯ F E D♯ D C♯ C
and I(t) is
C B A♯ A G♯ G F♯ F E D♯ D C♯
which is a rotation of R(t).
The chromatic scale is a boring tone row. For a random tone row t, all the variations P(t), R(t), I(t), and RI(t) will very likely be distinct, including rotations. The following Python code will illustrate the points above and estimate the probability that manipulations of a tone row are distinct even when equating rotations.
import random
import itertools
def inversion(x, m = 12):
return [(2*x[0] - x[i]) % m for i in range(m)]
def rotate_left(x, n):
return x[n:] + x[:n]
def rotdiff(x, y):
for i in range(len(x)):
if rotate_left(x, i) == y:
return False
return True
c = 0
for _ in range(1_000_000):
P = list(range(12))
random.shuffle(P) # shuffle in place
R = list(reversed(P))
I = list(inversion(P))
RI = list(reversed(I))
for pair in combinations([P, R, I, RI], 2):
assert(pair[0] != pair[1])
if not rotdiff(pair[0], pair[1]):
c += 1
print(c)
This generates and tests a million random tone rows. When I ran it, it found 226 tone rows were one manipulation of the tone row was equal to a rotation of another. So when I said the variations are “very likely” to be distinct, you could quantify that as having over a 99.9% chance.
The post Tone row operations first appeared on John D. Cook.