2026-05-22 01:31:13
“You can’t always get what you want. But sometimes you get what you need.” — The Rolling Stones
Circular functions and hyperbolic functions aren’t invertible, but we invert them anyway. These functions map many points in the domain to each point in the range, and we invert them by mapping a point in the range back to some point in the domain. Often this works as expected, but sometimes it doesn’t.
In the previous post I said
You can relate each trig function “foo” with its hyperbolic counterpart “fooh” by applying one of the functions to iz then multiplying by a constant c that depends on foo: c = i for sin and tan, c = 1 for cos and sec, and c = −i for csc and cot.
In symbols,
c foo(z) = fooh(iz).
Let’s suppose foo and fooh are invertible, ignoring any complications, and solve foo(z) = w for z. We get
i foo−1(w) = fooh−1(cw)
or using “arc” nomenclature for inverse functions
i arcfoo(w) = arcfooh(cw).
When the naive calculation above holds, except possibly at a finite number of points, we say the pair (foo, fooh) is couth. Otherwise we say the pair is uncouth. These term were coined by Robert Corless and his coauthors in their paper [1].
Whether the pair (foo, fooh) is couth depends not only on foo and fooh, but also on the details of how arcfoo and arcfooh are defined.
In Python’s NumPy library, the pairs (sin, sinh) and (tan, tanh) are couth, but the pair (cos, cosh) is uncouth.
Numpy doesn’t define the reciprocal functions sec, sech, csc, csch, cot, and coth. I used to find that annoying, but I’m beginning to think that was wise. These functions cause problems. For example, there may be two reasonable ways to define these functions, one of which forms a couth pair and one of which forms an uncouth pair.
For example, how should you define cot and coth? There would be no disagreement over the definition
cot = lambda x: 1/tan(x)
but there are at least two definitions of inverse coth that you’ll find in practice:
arccot = lambda z: 0.5*pi - arctan(z) arccot = lambda z: arctan(1/z).
Both definitions have their advantages, but the former is uncouth while the latter is couth. You can verify that both definitions are the same at z = 1 but not at z = −1.
With the following definitions, the pairs (cos, cosh) and (sec, sech) are uncouth and the rest are couth.
from numpy import * csc = lambda x: 1/sin(x) sec = lambda x: 1/cos(x) cot = lambda x: 1/tan(x) csch = lambda x: 1/sinh(x) sech = lambda x: 1/cosh(x) coth = lambda x: 1/tanh(x) arccot = lambda z: arctan(1/z) arcsec = lambda z: arccos(1/z) arccsc = lambda z: arcsin(1/z) arccoth = lambda z: arctanh(1/z) arcsech = lambda z: arccosh(1/z) arccsch = lambda z: arcsinh(1/z)
[1] “According to Abramowitz and Stegun” or arccoth needn’t be uncouth. Robert M. Corless et al. ACM SIGSAM Bulletin, Volume 34, Issue 2, pp 58 – 65 https://doi.org/10.1145/362001.362023
The post Couth and uncouth function pairs first appeared on John D. Cook.2026-05-21 23:53:56
The difference between a circular function and a hyperbolic function is a rotation or two.
For example, cosh(z) = cos(iz). You can read that as saying that to find the hyperbolic cosine of z, first you rotate z a quarter turn to the left (i.e. multiply by i) and then take the cosine.
For another example, sinh(z) = −i sin(iz). This says that you can calculate the hyperbolic sine of z by rotating z to the left, taking the sine, and then rotating to the right.
You can relate each trig function “foo” with its hyperbolic counterpart “fooh” by applying one of the functions to iz then multiplying by a constant c that depends on foo: c = i for sin and tan, c = 1 for cos and sec, and c = −i for csc and cot.
Note that if the constant for foo is c, the constant for 1/foo is 1/c. So, for example, the constant for tan is i and the constant for cot is 1/i = −i.
We have four groups of equations, depending on whether the left side of the equation is foo(iz), fooh(iz), foo(z), or fooh(z).
This post was written as a warm-up for the next post on couth and uncouth function pairs.




2026-05-20 08:49:50
How should we define √(z² − 1)? Well, you could square z, subtract 1, and take the square root. What else would you do?!
The question turns out to be more subtle than it looks.
When x is a non-negative real number, √x is defined to be the non-negative real number whose square is x. When x is a complex number √x is defined to be a function that extends √x from the real line to the complex plane by analytic continuation. But we can’t extend √x as an analytic function to the entire complex plane ℂ. We have to choose to make a “cut” somewhere, and the conventional choice is to make a cut along the negative real axis.
The “principal branch” of the square root function is defined to be the unique function that analytically extends √x from the positive reals to ℂ \ (−∞, 0].
Assume for now that by √x we mean the principal branch of the square root function. Now what does √(z² − 1) mean? It could mean just what we said at the top of the post: we square z, subtract 1, and apply the (principal branch of the) square root function. If we do that, we must exclude those values of z such that (z² − 1) is negative. This means we have to cut out the imaginary axis and the interval [−1, 1].
This is what Mathematica will do when asked to evaluate Sqrt[z^2 - 1]. The command
ComplexPlot[Sqrt[z^2 - 1], {z, -2 - 2 I, 2 + 2 I}]
makes the branch cuts clear by abrupt changes in color.

Now let’s take a different approach. Consider the function √(z² − 1) as a whole. Do not think of it procedurally as above, first squaring z etc. Instead, think of a it as a black box that takes in z and returns a complex number whose square is z² − 1.
This function has an obvious definition for z > 1. And we can extend this function, via analytic continuation, to more of the complex plane. We can do this directly, not by extending the square root function. But as before, we cannot extend the function analytically to all of ℂ. We have to cut something out. A common choice is to cut out [−1, 1]. This eliminates the need for a branch cut along the imaginary axis.
The function
extends √(z² − 1) the way we want [1].
The Mathematica code
ComplexPlot[Exp[(1/2) (Log[z - 1 ] + Log[z + 1])], {z, -2 - 2 I, 2 + 2 I}]
shows that our function is now continuous across the imaginary axis, though there’s still a discontinuity as you cross [−1, 1].

We used this analytic extension of √(z² − 1) in the previous post to eliminate branch cuts in an identity.
[1] The principal branch of the logarithm has a cut along the negative real axis. Why does our square root function, defined using log, not have a branch cut along the negative axis?
First of all, the log function, and Mathematica’s implementation of it Log[], isn’t undefined on (−∞, 1), it just isn’t continuous there. The function still has a value. By convention, the value is taken to be the limit of log(z) approaching z from above, i.e. from the 2nd quadrant.
Second, the value of (log(z – 1) + log(z + 1))/2 differs by a factor of 2πi when approaching a value z < −1 from above versus from below. This factor goes away when taking the exponential. So our function f(z) is analytic across (−∞, 1) even though the log functions in its definition are not.
The post Square root of x² − 1 first appeared on John D. Cook.2026-05-20 07:37:31
The previous post derived the identity
and said in a footnote that the identity holds at least for x > 1 and y > 1. That’s true, but let’s see why the footnote is necessary.
Let’s have Mathematica plot
The plot will be 0 where the identity above holds.

The plot is indeed flat for x > 1 and y > 1, and more, but not everywhere.
If we combine the two square roots
and plot again we still get a valid identity for x > 1 and y > 1, but the plot changes.

This is because √a √b does not necessarily equal √(ab) when the arguments may be negative.
The square root function and the arccosh function are not naturally single-valued functions. They require branch cuts to force them to be single-valued, and the two functions require different branch cuts. I go into this in some detail here.
There is a way to reformulate our identity so that it holds everywhere. If we replace
with
which is equivalent for z > 1, the corresponding identity holds everywhere.
We can verify this with the following Mathematica code.
f[z_] := Exp[(1/2) (Log[z - 1 ] + Log[z + 1])] FullSimplify[Cosh[ArcCosh[x] + ArcCosh[y]] - x y - f[x] f[y]]
This returns 0.
By contrast, the code
FullSimplify[ Cosh[ArcCosh[x] + ArcCosh[y]] - x y - Sqrt[x^2 - 1] Sqrt[y^2 - 1]]
simply returns its input with no simplification, unless we add restrictions on x and y. The code
FullSimplify[
Cosh[ArcCosh[x] + ArcCosh[y]] - x y - Sqrt[x^2 - 1] Sqrt[y^2 - 1],
Assumptions -> {x > -1 && y > -1}]
does return 0.
2026-05-19 20:09:54
Markov numbers are integer solutions to
x² + y² + z² = 3xyz.
The Wikipedia article on Markov numbers mentions that Don Zagier studied Markov numbers by looking the approximating equation
x² + y² + z² = 3xyz + 4/9
which is equivalent to
f(x) + f(y) = f(z)
where f(t) is defined as arccosh(3t/2). It wasn’t clear to me why the two previous equations are equivalent, so I’m writing this post to show that they are equivalent.
Before showing the equivalence of Zagier’s two equations, let’s look at an example that shows solutions to his second equation approximate solutions to Markov’s equation.
The following code verifies that (5, 13, 194) is a solution to Markov’s equation.
x, y, z = 5, 13, 194 assert(x**2 + y**2 + z**2 == 3*x*y*z)
With the same x and y above, let’s show that the z in Zagier’s second equation is close to the z above.
from math import cosh, acosh f = lambda t: acosh(3*t/2) g = lambda t: cosh(t)*2/3 z = g(f(x) + f(y)) print(z)
This gives z = 194.0023, which is close to the value of z in the Markov triple above.
Now suppose
f(x) + f(y) = f(z)
which expands to
arccosh(3x/2) + arccosh(3y/2) = arccosh(3z/2).
It seems sensible to apply cosh to both sides. Is there some identity for cosh of a sum? Maybe you recall the equation for cosine of a sum:
cos(a + b) = cos(a) cos(b) − sin(a) sin(b).
Then Osborn’s rule says the corresponding hyperbolic identity is
cosh(a + b) = cosh(a) cosh(b) − sinh(a) sinh(b).
Osborn’s rule also says that the analog of the familiar identity
sin²(a) + cos²(b) = 1
is
sinh²(a) = cosh²(b) − 1.
From these two hyperbolic identities we can show that [1]
cosh( arccosh(a) + arccosh(b) ) = ab + √(a² − 1) √(b² − 1).
The identity derived above is the tool we need to reduce our task to routine algebra.
If
arccosh(3x/2) + arccosh(3y/2) = arccosh(3z/2)
then
(3x/2) (3y/2) + √((3x/2)² − 1) √((3y/2)² − 1) = 3z / 2
which simplifies to Zagier’s equation
x² + y² + z² = 3xyz + 4/9.
[1] The equation holds at least for x > 1 and y > 1, which is enough for this post. More general arguments run into complications due to branch cuts.
The post Approximating Markov’s equation first appeared on John D. Cook.2026-05-15 20:34:29
I’ve written a couple posts lately about reverse engineering the internal state of a random number generator, first Mersenne Twister then lehmer64. This post will look at xorshift128, implemented below.
import random
# Seed the generator state
a: int = random.getrandbits(32)
b: int = random.getrandbits(32)
c: int = random.getrandbits(32)
d: int = random.getrandbits(32)
MASK = 0xFFFFFFFF
def xorshift128() -> int:
global a, b, c, d
t = d
s = a
t ^= (t << 11) & MASK t ^= (t >> 8) & MASK
s ^= (s >> 19) & MASK
a, b, c, d = (t ^ s) & MASK, a, b, c
return a
Recovering the internal state of the generator is simple: it’s the four latest outputs in reverse order. This is illustrated by the following chart.

This means that once we’ve seen four outputs, we can predict the rest of the outputs. The following code demonstrates this.
Let’s generate five random values.
out = [xorshift128() for _ in range(5)]
Running
print(hex(out[4]))
shows the output 0xc3f4795d.
If we reset the state of the generator using the first four outputs
d, c, b, a, _ = out print(hex(xorshift128()))
we get the same result.
Mersenne Twister and lehmer64 have good statistical properties, despite being predictable. The xorshift128 generator is even easier to predict, but it also has good statistical properties. These generators would be fine for many applications, such as Monte Carlo simulation, but disastrous for use in cryptography.
The post on lehmer64 mentioned at the end that the internal state of PCG64 can also be recovered from its output. However, doing so requires far more sophisticated math and thousands of hours of compute time. Still, it’s not adequate for cryptography. For that you’d need a random number generator designed to be secure, such as ChaCha.
So why not just use a cryptographically secure random number generator (CSPRNG) for everything? You could, but the other generators mentioned in this post use less memory and are much faster. PCG64 occupies an interesting middle ground: simple and fast, but not easily reversible.
The post Recovering the state of xorshift128 first appeared on John D. Cook.