MoreRSS

site iconJohn D. CookModify

I have decades of consulting experience helping companies solve complex problems involving applied math, statistics, and data privacy.
Please copy the RSS to your reader, or quickly subscribe to:

Inoreader Feedly Follow Feedbin Local Reader

Rss preview of Blog of John D. Cook

Square root of x² − 1

2026-05-20 08:49:50

How should we define √(z² − 1)? Well, you could square z, subtract 1, and take the square root. What else would you do?!

The question turns out to be more subtle than it looks.

When x is a non-negative real number, √x is defined to be the non-negative real number whose square is x. When x is a complex number √x is defined to be a function that extends √x from the real line to the complex plane by analytic continuation. But we can’t extend √x as an analytic function to the entire complex plane ℂ. We have to choose to make a “cut” somewhere, and the conventional choice is to make a cut along the negative real axis.

Using the principle branch

The “principle branch” of the square root function is defined to be the unique function that analytically extends √x from the positive reals to ℂ \ (−∞, 0].

Assume for now that by √x we mean the principle branch of the square root function. Now what does √(z² − 1) mean? It could mean just what we said at the top of the post: we square z, subtract 1, and apply the (principle branch of the) square root function. If we do that, we must exclude those values of z such that (z² − 1) is negative. This means we have to cut out the imaginary axis and the interval [−1, 1].

This is what Mathematica will do when asked to evaluate Sqrt[z^2 - 1]. The command

ComplexPlot[Sqrt[z^2 - 1], {z, -2 - 2 I, 2 + 2 I}]

makes the branch cuts clear by abrupt changes in color.

A different approach

Now let’s take a different approach. Consider the function √(z² − 1) as a whole. Do not think of it procedurally as above, first squaring z etc. Instead, think of a it as a black box that takes in z and returns a complex number whose square is z² − 1.

This function has an obvious definition for z > 1. And we can extend this function, via analytic continuation, to more of the complex plane. We can do this directly, not by extending the square root function. But as before, we cannot extend the function analytically to all of ℂ. We have to cut something out. A common choice is to cut out [−1, 1]. This eliminates the need for a branch cut along the imaginary axis.

The function

f(z) = \exp\left( \tfrac{1}{2}\left( \log(z - 1) + \log(z + 1) \right) \right)

extends √(z² − 1) the way we want [1].

The Mathematica code

ComplexPlot[Exp[(1/2) (Log[z - 1 ] + Log[z + 1])], {z, -2 - 2 I, 2 + 2 I}]

shows that our function is now continuous across the imaginary axis, though there’s still a discontinuity as you cross [−1, 1].

We used this analytic extension of √(z² − 1) in the previous post to eliminate branch cuts in an identity.

Related posts

[1] The principle branch of the logarithm has a cut along the negative real axis. Why does our square root function, defined using log, not have a branch cut along the negative axis?

First of all, the log function, and Mathematica’s implementation of it Log[], isn’t undefined on (−∞, 1), it just isn’t continuous there. The function still has a value. By convention, the value is taken to be the limit of log(z) approaching z from above, i.e. from the 2nd quadrant.

Second, the value of (log(z – 1) + log(z + 1))/2 differs by a factor of 2πi when approaching a value z < −1 from above versus from below. This factor goes away when taking the exponential. So our function f(z) is analytic across (−∞, 1) even though the log functions in its definition are not.

The post Square root of x² − 1 first appeared on John D. Cook.

Closer look at an identity

2026-05-20 07:37:31

The previous post derived the identity

\cosh\Big( \operatorname{arccosh}(x) + \operatorname{arccosh}(y)\Big) = xy + \sqrt{x^2 -1} \sqrt{y^2 -1}

and said in a footnote that the identity holds at least for x > 1 and y > 1. That’s true, but let’s see why the footnote is necessary.

Let’s have Mathematica plot

\left| \cosh\Big( \operatorname{arccosh}(x) + \operatorname{arccosh}(y)\Big) - xy - \sqrt{x^2 -1} \sqrt{y^2 -1} \right|

The plot will be 0 where the identity above holds.

The plot is indeed flat for x > 1 and y > 1, and more, but not everywhere.

If we combine the two square roots

\left| \cosh\Big( \operatorname{arccosh}(x) + \operatorname{arccosh}(y)\Big) - xy - \sqrt{(x^2 -1) (y^2 -1)} \right|

and plot again we still get a valid identity for x > 1 and y > 1, but the plot changes.

This is because √ab does not necessarily equal √(ab) when the arguments may be negative.

The square root function and the arccosh function are not naturally single-valued functions. They require branch cuts to force them to be single-valued, and the two functions require different branch cuts. I go into this in some detail here.

There is a way to reformulate our identity so that it holds everywhere. If we replace

\sqrt{z^2 - 1}

with

f(z) = \exp\left( \tfrac{1}{2}\left( \log(z - 1) + \log(z + 1) \right) \right)

which is equivalent for z > 1, the corresponding identity holds everywhere.

We can verify this with the following Mathematica code.

f[z_] := Exp[(1/2) (Log[z - 1 ] + Log[z + 1])]
FullSimplify[Cosh[ArcCosh[x] + ArcCosh[y]] - x y - f[x] f[y]]

This returns 0.

By contrast, the code

FullSimplify[
 Cosh[ArcCosh[x] + ArcCosh[y]] - x y - Sqrt[x^2 - 1] Sqrt[y^2 - 1]]

simply returns its input with no simplification, unless we add restrictions on x and y. The code

FullSimplify[
 Cosh[ArcCosh[x] + ArcCosh[y]] - x y - Sqrt[x^2 - 1] Sqrt[y^2 - 1], 
 Assumptions -> {x > -1 && y > -1}]

does return 0.

Related posts

The post Closer look at an identity first appeared on John D. Cook.

Approximating Markov’s equation

2026-05-19 20:09:54

Markov numbers are integer solutions to

x² + y² + z² = 3xyz.

The Wikipedia article on Markov numbers mentions that Don Zagier studied Markov numbers by looking the approximating equation

x² + y² + z² = 3xyz + 4/9

which is equivalent to

f(x) + f(y) = f(z)

where f(t) is defined as arccosh(3t/2). It wasn’t clear to me why the two previous equations are equivalent, so I’m writing this post to show that they are equivalent.

Examples

Before showing the equivalence of Zagier’s two equations, let’s look at an example that shows solutions to his second equation approximate solutions to Markov’s equation.

The following code verifies that (5, 13, 194) is a solution to Markov’s equation.

x, y, z = 5, 13, 194
assert(x**2 + y**2 + z**2 == 3*x*y*z)

With the same x and y above, let’s show that the z in Zagier’s second equation is close to the z above.

from math import cosh, acosh

f = lambda t: acosh(3*t/2)
g = lambda t: cosh(t)*2/3
z = g(f(x) + f(y))
print(z)

This gives z = 194.0023, which is close to the value of z in the Markov triple above.

Applying Osborn’s rule

Now suppose

f(x) + f(y) = f(z)

which expands to

arccosh(3x/2) + arccosh(3y/2)  = arccosh(3z/2).

It seems sensible to apply cosh to both sides. Is there some identity for cosh of a sum? Maybe you recall the equation for cosine of a sum:

cos(ab) = cos(a) cos(b) − sin(a) sin(b).

Then Osborn’s rule says the corresponding hyperbolic identity is

cosh(ab) = cosh(a) cosh(b) − sinh(a) sinh(b).

Osborn’s rule also says that the analog of the familiar identity

sin²(a) + cos²(b) = 1

is

sinh²(a) = cosh²(b) − 1.

From these two hyperbolic identities we can show that [1]

cosh( arccosh(a) + arccosh(b) ) = ab + √(a² − 1) √(b² − 1).

Slug it out

The identity derived above is the tool we need to reduce our task to routine algebra.

If

arccosh(3x/2) + arccosh(3y/2)  = arccosh(3z/2)

then

(3x/2)  (3y/2)  + √((3x/2)² − 1) √((3y/2)² − 1) = 3z / 2

which simplifies to Zagier’s equation

x² + y² + z² = 3xyz + 4/9.

Related posts

[1] The equation holds at least for x > 1 and y > 1, which is enough for this post. More general arguments run into complications due to branch cuts.

The post Approximating Markov’s equation first appeared on John D. Cook.

Recovering the state of xorshift128

2026-05-15 20:34:29

I’ve written a couple posts lately about reverse engineering the internal state of a random number generator, first Mersenne Twister then lehmer64. This post will look at xorshift128, implemented below.

import random

# Seed the generator state
a: int = random.getrandbits(32)
b: int = random.getrandbits(32)
c: int = random.getrandbits(32)
d: int = random.getrandbits(32)

MASK = 0xFFFFFFFF

def xorshift128() -> int:
    global a, b, c, d

    t = d
    s = a

    t ^= (t << 11) & MASK t ^= (t >>  8) & MASK
    s ^= (s >> 19) & MASK

    a, b, c, d = (t ^ s) & MASK, a, b, c

    return a

Recovering state

Recovering the internal state of the generator is simple: it’s the four latest outputs in reverse order. This is illustrated by the following chart.

 a b c d output 5081e5ca 79259a41 63e12955 651e537d c793d11c c793d11c 5081e5ca 79259a41 63e12955 ad52e33a ad52e33a c793d11c 5081e5ca 79259a41 f8f09343 f8f09343 ad52e33a c793d11c 5081e5ca a7009622 a7009622 f8f09343 ad52e33a c793d11c fe42a8ef

This means that once we’ve seen four outputs, we can predict the rest of the outputs. The following code demonstrates this.

Let’s generate five random values.

out = [xorshift128() for _ in range(5)]

Running

print(hex(out[4]))

shows the output 0xc3f4795d.

If we reset the state of the generator using the first four outputs

d, c, b, a, _ = out
print(hex(xorshift128()))

we get the same result.

Good stats, bad security

Mersenne Twister and lehmer64 have good statistical properties, despite being predictable. The xorshift128 generator is even easier to predict, but it also has good statistical properties. These generators would be fine for many applications, such as Monte Carlo simulation, but disastrous for use in cryptography.

The post on lehmer64 mentioned at the end that the internal state of PCG64 can also be recovered from its output. However, doing so requires far more sophisticated math and thousands of hours of compute time. Still, it’s not adequate for cryptography. For that you’d need a random number generator designed to be secure, such as ChaCha.

So why not just use a cryptographically secure random number generator (CSPRNG) for everything? You could, but the other generators mentioned in this post use less memory and are much faster. PCG64 occupies an interesting middle ground: simple and fast, but not easily reversible.

The post Recovering the state of xorshift128 first appeared on John D. Cook.

Initialize and print 128-bit integers in C

2026-05-12 20:20:20

If you look very closely at my previous post, you’ll notice that I initialize a 128-bit integer with a 64-bit value. The 128-bit unsigned integer represents the internal state of a random number generator. Why not initialize it to a 128-bit value? I was trying to keep the code simple.

A surprising feature of C compilers, at least of GCC and Clang, is that you cannot initialize a 128-bit integer to a 128-bit integer literal. You can’t directly print a 128-bit integer either, which is why the previous post introduces a function print_u128.

The code

__uint128_t x = 0x00112233445566778899aabbccddeeff;

Produces the following error message.

error: integer literal is too large to be represented in any integer type

The problem isn’t initializing a 128-bit number to a 128-bit value; the problem is that the compiler cannot parse the literal expression

0x00112233445566778899aabbccddeeff

One solution to the problem is to introduce the macro

#define U128(hi, lo) (((__uint128_t)(hi) << 64) | (lo))

and use it to initialize the variable.

__uint128_t x = U128(0x0011223344556677, 0x8899aabbccddeeff);

You can verify that x has the intended state by calling print_u128 from the previous post.

void print_u128(__uint128_t n)
{
    printf("0x%016lx%016lx\n",
           (uint64_t)(n >> 64),      // upper 64 bits
           (uint64_t)n);             // lower 64 bits
}

Then

print_u128(x);

prints

0x00112233445566778899aabbccddeeff

Update. The code for print_u128 above compiles cleanly with gcc but clang gives the following warning.

warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]

You can suppress the warning by including the inttypes header and modifying the print_u128 function.

Here’s the final code. It compiles cleanly under gcc and clang.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#define U128(hi, lo) (((__uint128_t)(hi) << 64) | (lo))

void print_u128(__uint128_t n)
{
    printf("0x%016" PRIx64 "%016" PRIx64 "\n",
           (uint64_t)(n >> 64),
           (uint64_t)n);
}

int main(void)
{
    __uint128_t x = U128(0x0011223344556677, 0x8899aabbccddeeff);
    print_u128(x);
    return 0;
}
The post Initialize and print 128-bit integers in C first appeared on John D. Cook.

Hacking the lehmer64 RNG

2026-05-12 19:07:31

A couple days ago I wrote about hacking the Mersenne Twister. I explained how to recover the random number generator’s internal state from a stream of 640 outputs.

This post will do something similar with the lehmer64 random number generator. This generator is very simple to implement. Daniel Lemire found it to be “the fastest conventional random number generator that can pass Big Crush,” a well-respected test for pseudorandom number generators.

Implementing lehmer64

The lehmer64 generator can be implemented in C by

__uint128_t g_lehmer64_state;

uint64_t lehmer64() {
    g_lehmer64_state *= 0xda942042e4dd58b5ULL;  
  return g_lehmer64_state >> 64;
}

The analogous code in Python would have to simulate the overflow behavior of a 128-bit integer by reducing the state mod 2128 after the multiplication.

Reverse engineering lehmer64 is easier than reverse engineering the Mersenne Twister because only three outputs are needed. However, the theory behind the exploit is more sophisticated. See [1].

The following code sets the state to an arbitrary initial seed value and generates three values.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    g_lehmer64_state = 0x4789499d78770934; // seed
    for (int i = 0; i < 3; i++) {
        printf("0x%016lx\n", lehmer64());
    }

    return 0;
}

The code prints the following.

0x3d144d12822bcc2e
0x85a67226191a568d
0x53e803dffc88e8f8

Exploiting lehmer64

Here is Python code for recovering the state of the lehmer64 generator given in [1].

def reconstruct(X):
    a = 0xda942042e4dd58b5
    r = round(2.64929081169728e-7 * X[0] + 3.51729342107376e-7 * X[1] + 3.89110109147656e-8 * X[2])
    s = round(3.12752538137199e-7 * X[0] - 1.00664345453760e-7 * X[1] - 2.16685184476959e-7 * X[2])
    t = round(3.54263598631140e-8 * X[0] - 2.05535734808162e-7 * X[1] + 2.73269247090513e-7 * X[2])
    u = r * 1556524 + s * 2249380 + t * 1561981
    v = r * 8429177212358078682 + s * 4111469003616164778 + t * 3562247178301810180
    state = (a*u + v) % (2**128)
    return state

Let’s call reconstruct with the output of our C code.

X = [0x3d144d12822bcc2e, 0x85a67226191a568d, 0x53e803dffc88e8f8]
print( hex( reconstruct(X) ) )

This prints

0x3d144d12822bcc2e1b81101c593761c4

Now for the confusing part: at what point is the number above the state of the generator? Is it the state before or after generating the three values? Neither! It is the state after generating the first value.

We can verify this by modifying the C code as follows and rerunning it.

void print_u128(__uint128_t n)
{
    printf("0x%016lx%016lx\n",
           (uint64_t)(n >> 64),      // upper 64 bits
           (uint64_t)n);             // lower 64 bits
}

int main(void)
{
    g_lehmer64_state = 0x4789499d78770934; // seed
    for (int i = 0; i 

The main goal of [1] is to recover the state of the PCG generator, not the lehmer64 generator. The latter was a side quest. Recovering the state of PCG64 is much harder; the authors estimate it takes about 20,000 CPU-hours. The paper shows that a technique used as part of pursuing their main goal can quickly recover the lehmer64 state.

Related posts

[1] Charles Bouillaguet, Florette Martinez, and Julia Sauvage. Practical seed-recovery for the PCG Pseudo-Random Number Generator. IACR Transactions on Symmetric Cryptology. ISSN 2519-173X, Vol. 2020, No. 3, pp. 175–196.

The post Hacking the lehmer64 RNG first appeared on John D. Cook.