2025-10-13 20:53:16
I’ve written before about Bitcoin’s elliptic curve and Monero’s elliptic curve. In the Monero post I wrote “Bitcoin and Ethereum use the elliptic curve secp256k1.” That’s true, but it’s incomplete. Ethereum does use the elliptic curve secp256k1 for digital signatures, as does Bitcoin, but Ethereum also uses a different elliptic curve for its consensus layer.
Ethereum’s consensus layer uses the elliptic curve BLS12-381. This post will say a little bit about this curve, starting with unpacking the cryptic name. I won’t go into the advanced math behind the curve’s design but instead will focus on concrete calculations in Python to try to make the curve more tangible. The same curve is used in other cryptocurrencies, including Zcash.
First of all, BLS stands for Paulo Barreto, Ben Lynn, and Michael Scott, the developers of a family of elliptic curves known as the BLS curves, including BLS12-381. Incidentally, in the context of cryptography, BLS can refer to the BLS curves or to BLS signatures. The “L” is the same person in both instances, but the “B” and “S” in BLS signatures refer to Dan Boneh and Hovav Shacham.
The 381 in BLS12-381 refers to the fact that it is defined over a finite field whose order is a 381-bit number, namely
p = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
We can verify that p is prime and that it has 381 bits.
>>> from sympy import isprime
>>> isprime(p)
True
>>> 2**380 < p < 2**381
True
The value of p comes from applying a certain polynomial to a parameter z with low Hamming weight, i.e. with lots of zeros in its binary representation. Why this matters is beyond the scope of this post, but we can show that it’s true.
>>> z = -0xd201000000010000
>>> p == (z-1)**2*(z**4 - z**2 + 1)//3 + z
True
The elliptic curve BLS12-381 is the set of points satisfying
y² = x³ + 4 mod p.
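As a small aside, here’s a quick way to find a point on the curve using sympy’s sqrt_mod: try x = 0, 1, 2, … until x³ + 4 is a square mod p. The search stops immediately, since 0³ + 4 = 4 is already a perfect square.

from sympy.ntheory import sqrt_mod

p = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab

# Brute-force search for a point on y^2 = x^3 + 4 mod p:
# try x = 0, 1, 2, ... until x^3 + 4 is a square mod p.
x = 0
while True:
    y = sqrt_mod(x**3 + 4, p)   # None if x^3 + 4 is not a square mod p
    if y is not None:
        break
    x += 1

print((x, y))                       # x = 0, y = 2 or p - 2
print((y*y - x**3 - 4) % p == 0)    # True: the point lies on the curve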
The 12 in BLS12-381 refers to an embedding degree k that we’ll get to shortly.
The elliptic curve BLS12-381 is pairing friendly, which is the reason for its use in the Ethereum consensus layer. This layer uses pairing-based cryptography to aggregate signatures. I may write more about that someday, but not today.
As I wrote a couple months ago,
An elliptic curve E/Fq is said to be pairing-friendly if r divides qᵏ − 1 for some small k. Here r is the size of the largest prime-order subgroup of the curve.
In the case of BLS12-381, r = z⁴ − z² + 1 and r is a 255-bit prime number.
>>> r = z**4 - z**2 + 1
>>> isprime(r)
True
>>> 2**254 < r < 2**255
True
And now we can verify that the embedding degree is 12, showing that BLS12-381 is a pairing-friendly curve.
>>> (p**12 - 1) % r
0
So what is being paired with what? And what is being embedded into what? The group G₁ of order r is a subgroup of BLS12-381. It is paired with another group G₂, also of order r, and there is a bilinear mapping from (G₁, G₂) to the multiplicative group of the finite field with p¹² elements. For more details, see the section on BLS12-381 in Ben Edgington’s book Upgrading Ethereum: A technical handbook on Ethereum’s move to proof of stake and beyond.
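If you want to experiment with the pairing itself, the py_ecc library used by Ethereum reference code exposes BLS12-381 primitives. Here’s a rough sketch, assuming py_ecc’s bls12_381 module with generators G1 and G2, scalar multiplication multiply, and pairing (which maps a G2 point and a G1 point into the field with p¹² elements); it checks bilinearity on a pair of small scalars.

# A sketch assuming py_ecc's BLS12-381 interface (G1, G2, multiply, pairing, curve_order).
from py_ecc.bls12_381 import G1, G2, multiply, pairing, curve_order

a, b = 5, 7

# Bilinearity: e(bQ, aP) should equal e(Q, P)^(ab).
lhs = pairing(multiply(G2, b), multiply(G1, a))
rhs = pairing(G2, G1) ** (a * b)
print(lhs == rhs)   # expect True

# Both G1 and G2 have order r, called curve_order here.
print(multiply(G1, curve_order))   # expect the group identity (None in this representation)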
2025-10-13 01:40:36
The inverse of the matrix

    [ a  b ]
    [ c  d ]

is the matrix

    1/(ad − bc) [  d  −b ]
                [ −c   a ]

assuming ad − bc ≠ 0.
Also, the inverse of the bilinear function (a.k.a. Möbius transformation)

    f(z) = (az + b)/(cz + d)

is the function

    f⁻¹(w) = (dw − b)/(−cw + a)

again assuming ad − bc ≠ 0.
The elementary takeaway is that these are two useful formulas that are similar in appearance, so memorizing one makes it easy to memorize the other. We could stop there, but let’s dig a little deeper.
There is an association between 2 × 2 matrices and Möbius transformations: the matrix with rows (a, b) and (c, d) corresponds to the transformation z ↦ (az + b)/(cz + d).
This association is so strong that we can use it to compute the inverse of a Möbius transformation by going to the associated matrix, inverting it, and going back to a Möbius transformation. In diagram form, we have the following:

    Möbius transformation ──→ matrix in ℂ²ˣ²
            │                        │
         invert                   invert
            ↓                        ↓
    Möbius transformation ←── matrix in ℂ²ˣ²
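Here’s a quick symbolic check of that round trip in Python, representing f(z) = (az + b)/(cz + d) by the matrix with rows (a, b) and (c, d), inverting the matrix with sympy, and composing the result with f.

from sympy import symbols, Matrix, cancel

a, b, c, d, z = symbols('a b c d z')

def mobius(M, z):
    # Möbius transformation represented by the 2x2 matrix M
    return (M[0, 0]*z + M[0, 1]) / (M[1, 0]*z + M[1, 1])

M = Matrix([[a, b], [c, d]])
f = mobius(M, z)          # original transformation
g = mobius(M.inv(), z)    # transformation built from the inverted matrix

print(cancel(g.subs(z, f)))   # prints z, so g really is the inverse of f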
Now there are a few loose ends. First of all, we don’t really have a map between Möbius transformations and matrices per se; we have a map between a particular representation of a Möbius transformation and a 2 × 2 matrix. If we multiplied a, b, c, and d in a Möbius transformation by 10, for example, we’d still have the same transformation, just a different representation, but it would go to a different matrix.
What we really have is a map between Möbius transformations and equivalence classes of invertible matrices, where two matrices are equivalent if one is a non-zero multiple of the other. If we wanted to make the diagram above more rigorous, we’d replace ℂ²ˣ² with PL(2, ℂ), linear transformations on the complex projective line. In sophisticated terms, our map between Möbius transformations and matrices is an isomorphism between automorphisms of the Riemann sphere and PL(2, ℂ).
Möbius transformations act a lot like linear transformations because they are linear transformations, but on the complex projective line, not on the complex numbers. More on that here.
2025-10-12 06:10:32
For reasons I don’t understand, it’s more common, at least in my experience, to see references on how to generate points on a sphere than in a sphere. But the latter is easily obtained from the former.
The customary way to generate points on a sphere in n dimensions is to generate points from an n-dimensional normal (Gaussian) distribution and normalize by dividing by the length. And the way to generate samples from an n-dimensional normal is simply to generate a list of n one-dimensional normals [1].
The way to generate points in a sphere is to generate a point S on the sphere, then randomly adjust its radius r. If you generate a uniform random sample u from [0, 1] and set r = u, then you’ll over-sample points near the origin. Instead, the thing to do is to set
r = u^(1/n)
That’s because volume is proportional to rⁿ. The fraction of the ball that lies within radius r is proportional to rⁿ, so taking the nth root of u makes the enclosed volume uniformly distributed.
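As a quick sanity check, in n = 4 dimensions the fraction of points whose radius is less than 1/2 should equal the volume fraction (1/2)⁴ = 0.0625. A few lines of Python (a rough sketch) confirm this for radii generated as u^(1/n).

import numpy as np

n = 4
u = np.random.random(1_000_000)
r = u**(1/n)   # radii for points uniform in the unit ball

# The fraction with r < 1/2 should be the enclosed volume fraction (1/2)**n.
print(np.mean(r < 0.5), 0.5**n)   # both about 0.0625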
To illustrate this, I’ll use the same technique as the post on sampling from a tetrahedron: generate random points in a sphere and see whether the right proportion fall inside a (hyper)cube inside the sphere.
We’ll work in n = 4 dimensions and use a sphere of radius R = 5. Our test cube will be the cube with all vertex coordinates equal to 1 or 2. We’ll have to use a lot of sample points. (This is why naive Monte Carlo breaks down in high dimensions: it may be that few if any points land in the region of interest.)
Here’s the code to do the sampling.
import numpy as np
from scipy.stats import norm
from scipy.special import gamma

def incube(v):
    # is every coordinate of v between 1 and 2?
    retval = True
    for x in v:
        retval = retval and 1 <= x <= 2
    return retval

def sample(R, n):
    # uniform random point inside an n-dimensional ball of radius R
    x = norm.rvs(size=n)
    x /= np.linalg.norm(x)    # uniform point on the unit sphere
    u = np.random.random()
    x *= R*u**(1/n)           # scale the radius so volume is uniform
    return x

R = 5
n = 4
N = 100_000_000

V = R**n * np.pi**(n/2) / gamma(n/2 + 1)   # volume of the ball
print(V)

count = 0
for _ in range(N):
    v = sample(R, n)
    if incube(v):
        count += 1
print(count/N, 1/V)
This prints
0.0003252 0.000324
The two results agree to four decimal places, which is about what we’d expect given 1/√N = 0.0001.
[1] If you’re porting the sampler above to an environment where you don’t have a function like norm.rvs for generating normal random samples, you can roll your own using the Box-Muller transformation. For example, here is a C++ implementation from scratch.
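Here’s a minimal Python sketch of the Box-Muller idea, producing standard normal samples from nothing but uniform samples.

import numpy as np

def box_muller(size):
    # Turn uniform samples into standard normal samples
    u1 = 1 - np.random.random(size)   # in (0, 1], avoids log(0)
    u2 = np.random.random(size)
    return np.sqrt(-2*np.log(u1)) * np.cos(2*np.pi*u2)

x = box_muller(1_000_000)
print(x.mean(), x.std())   # should be near 0 and 1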
2025-10-11 20:20:59
Exactly one month ago I wrote about sampling points from a triangle. This post will look at the three dimensional analog, sampling from a tetrahedron. The generalization to higher dimensions works as well.
In the triangle post, I showed that a naive approach doesn’t work but a variation does. If the vertices of the triangle are A, B, and C, then the naive approach is to generate three uniform random numbers, normalize them, and use them to form a linear combination of the vertices. That is, generate
r₁, r₂, r₃
uniformly from [0, 1], then define
r₁′ = r₁/(r₁ + r₂ + r₃)
r₂′ = r₂/(r₁ + r₂ + r₃)
r₃′ = r₃/(r₁ + r₂ + r₃)
and return as a random sample the point
r₁′ A + r₂′ B + r₃′ C.
This oversamples points near the middle of the triangle and undersamples points near the vertices. But everything above works if you replace uniform samples with exponential samples.
The analogous method works for a tetrahedron with vertices A, B, C, and D. Generate exponential random variables rᵢ for i = 1, 2, 3, 4. Then normalize, defining each rᵢ′ to be rᵢ divided by the sum of all the rᵢ. Then the random sample is
r₁′ A + r₂′ B + r₃′ C + r₄′ D.
In the case of the triangle, it was easy to visualize that uniformly generated rᵢ did not lead to uniform samples of the triangle, but that exponentially generated rᵢ did. This is harder to do in three dimensions. Even if the points were uniformly sampled from a tetrahedron, they might not look uniformly distributed from a given perspective.
So how might we demonstrate that our method works? One approach would be to take a cube inside a tetrahedron and show that the proportion of samples that land inside the cube is what we’d expect, namely the ratio of the volume of the cube to the volume of the tetrahedron.
This brings up a couple interesting subproblems.
To find the volume of a tetrahedron, form a matrix M whose columns are the vectors from three of the vertices to the remaining vertex. So we could take A − D, B − D, and C − D as the columns. The volume of the tetrahedron is the absolute value of the determinant of M divided by 6.
A tetrahedron is convex, and so the line segment joining any two points inside the tetrahedron lies entirely inside the tetrahedron. It follows that if all the vertices of a cube are inside a tetrahedron, the entire cube is inside.
So this brings us to the problem of testing whether a point is inside a tetrahedron. We can do this by converting the coordinates of the point to barycentric coordinates, then testing whether all the coordinates are non-negative and add to 1.
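Here’s a sketch of that test in NumPy (the function names are just for illustration): solve a 3 × 3 linear system for three of the barycentric coordinates and recover the fourth from the requirement that they sum to 1.

import numpy as np

def barycentric(P, A, B, C, D):
    # Solve M @ (l1, l2, l3) = P - D, where the columns of M are A-D, B-D, C-D,
    # then l4 = 1 - l1 - l2 - l3 so the four coordinates sum to 1.
    M = np.column_stack([A - D, B - D, C - D])
    lam = np.linalg.solve(M, P - D)
    return np.append(lam, 1 - lam.sum())

def in_tetrahedron(P, A, B, C, D):
    # inside (or on the boundary) iff all barycentric coordinates are non-negative
    return np.all(barycentric(P, A, B, C, D) >= 0)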
I made up four points as vertices of a tetrahedron.
import numpy as np

A = [0, 0, 0]
B = [5, 0, 0]
C = [0, 7, 0]
D = [0, 1, 6]
A, B, C, D = map(np.array, [A, B, C, D])
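Continuing with these vertices, we can check the volume formula from above: the absolute value of the determinant divided by 6 comes out to 35.

# columns of M are A - D, B - D, C - D as described above
M = np.column_stack([A - D, B - D, C - D])
print(abs(np.linalg.det(M)) / 6)   # 35.0 (up to floating point roundoff)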
For my test cube I chose the cube whose vertex coordinates are all combinations of 1 and 2. So the vertex nearest the origin is (1, 1, 1) and the vertex furthest from the origin is (2, 2, 2). You can verify that all the vertices are inside the tetrahedron, and so the cube is inside the tetrahedron.
The tetrahedron has volume 35, and the cube has volume 1, so we expect 1/35th of the random points to land inside the cube. Here’s the code to sample the tetrahedron and calculate the proportion of points that land in the cube.
from scipy.stats import expon

def incube(v):
    return 1 <= v[0] <= 2 and 1 <= v[1] <= 2 and 1 <= v[2] <= 2

def good_sample(A, B, C, D):
    # normalized exponential weights give uniform samples of the tetrahedron
    r = expon.rvs(size=4)
    r /= sum(r)
    return r[0]*A + r[1]*B + r[2]*C + r[3]*D

def bad_sample(A, B, C, D):
    # normalized uniform weights oversample the middle
    r = np.random.random(size=4)
    r /= sum(r)
    return r[0]*A + r[1]*B + r[2]*C + r[3]*D

N = 1_000_000
badcount = 0
goodcount = 0
for _ in range(N):
    v = bad_sample(A, B, C, D)
    if incube(v):
        badcount += 1
    v = good_sample(A, B, C, D)
    if incube(v):
        goodcount += 1
print(badcount/N, goodcount/N, 1/35)
This produced
0.094388 0.028372 0.028571…
The good (exponential) method matched the expected proportion to three decimal places, but the bad (uniform) method put about 3.3 times as many points in the cube as expected.
Note that three decimal agreement is about what we’d expect via the central limit theorem since 1/√N = 0.001.
If you’re porting the sampler above to an environment where you don’t have a function for generating exponential random samples, you can roll your own by returning −log(u) where u is a uniform sample from [0, 1].
After writing the post about sampling from triangles, I wrote a followup post about sampling from polygons. In a nutshell, divide your polygon into triangles, then randomly select a triangle in proportion to its area, then sample from that triangle. You can do the analogous process in three dimensions by dividing a general volume into tetrahedra.
2025-10-10 22:26:29
Many hope that AI will be “smart enough” to make breakthrough scientific discoveries in the near future, such as finding a cure for cancer. Some research efforts have sought to create an “AI Scientist” that can make discoveries in an automated or semi-automated way; for a recent example, see [1]. Others [2] have called out the methodological pitfalls of some of these attempts. Still others question whether a truly original discovery is even possible at all for an AI.
OpenAI released GPT-5 on August 7. Some found it lackluster and falling short of expectations. Others, however, found its performance in some areas to be a significant advance over its predecessor.
Two recent reports show the new model’s utility. Scott Aaronson published a paper last month [3], [4] in which “a key technical step in the proof of the main result came from AI.” Also, Terence Tao reported earlier this month [5] his use of ChatGPT to find a first counterexample to an unsolved mathematics problem.
I’m sure this resonates with the experience of other researchers using the tool. Recently, in the course of a discussion I had with ChatGPT, it came up with a new algorithmic recipe for something I was working on, based on ideas combined from several papers but in itself apparently original. That was a very simple case—but on a grander scale, connecting two ideas together in a novel way can lead to a real breakthrough. For example, Faltings’ proof of the Mordell Conjecture in 1983 was based on recognizing a subtle internal connection among some already existing theorems.
There is always the specter of concern that an idea “was perhaps already in the training data.” It can be difficult to prove otherwise. But deep domain experts like Scott Aaronson and Terence Tao are well positioned to judge whether an idea is truly an original, never-before-published result or not.
If past is prologue, we can hope for more powerful models in the future that can solve increasingly hard problems.
[1] Yixuan Weng, Minjun Zhu, Qiujie Xie, Qiyao Sun, Zhen Lin, Sifan Liu, Yue Zhang, “DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively.” https://arxiv.org/abs/2509.26603.
[2] Ziming Luo, Atoosa Kasirzadeh, Nihar B. Shah, “The More You Automate, the Less You See: Hidden Pitfalls of AI Scientist Systems.” https://arxiv.org/abs/2509.08713.
[3] Scott Aaronson, “The QMA Singularity.” https://scottaaronson.blog/?p=9183
[4] Scott Aaronson, Freek Witteveen, “Limits to black-box amplification in QMA.” https://arxiv.org/abs/2509.21131.
[5] Terence Tao, Mathstodon post. https://mathstodon.xyz/@tao/115306424727150237
2025-10-09 21:57:56
This morning I took an old blog post and turned it into an X thread. I think the thread is easier to read. More expository and less rigorous.
The post and thread look at generalizations of the fact that every integer and its fifth power end in the same digit. The conclusion is that n and nᵏ end in the same digit in base b if b is square-free and k = φ(b) + 1. So, for example, 10 is square-free (i.e. not divisible by a non-trivial square) and φ(10) + 1 = 5.
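A couple of lines of Python confirm this for base 10 and, as another example, for the square-free base 15, where φ(15) + 1 = 9.

from sympy import totient

# Base 10: n and n**5 end in the same digit.
print(all(n % 10 == n**5 % 10 for n in range(1000)))   # True

# Another square-free base: b = 15, k = phi(15) + 1 = 9.
b = 15
k = totient(b) + 1
print(all(n % b == n**k % b for n in range(1000)))     # True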
Benjamin Clark replied suggesting looking at λ(b) + 1 in addition to φ(b) + 1.
To back up a bit, φ(n) is Euler’s totient function, the number of positive integers less than and relatively prime to n. Robert Carmichael’s totient function λ(n) is a closely related function, one that replaced Euler’s function in implementations of RSA encryption—more specifically in RSA decryption—something I wrote about earlier this week.
Euler’s generalization of Fermat’s little theorem says if a is relatively prime to n, then
a^φ(n) = 1 (mod n).
Carmichael’s totient λ(n) is the smallest number such that
a^λ(n) = 1 (mod n)
when a is relatively prime to n.
Sometimes φ(n) = λ(n), namely for n = 1, 2, 4, every odd prime power, and twice every odd prime power. Otherwise λ(n) is a proper divisor of φ(n).
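We can see this numerically with SymPy, assuming its reduced_totient function, which implements the Carmichael function λ.

from sympy import totient, reduced_totient

# n up to 50 where phi(n) = lambda(n): 1, 2, 4, odd prime powers, twice odd prime powers
print([n for n in range(1, 51) if totient(n) == reduced_totient(n)])

# In every case lambda(n) divides phi(n).
print(all(totient(n) % reduced_totient(n) == 0 for n in range(1, 51)))   # True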
Carmichael conjectured that for every n there is at least one other integer m ≠ n such that φ(m) = φ(n). He proved that this is true at least for n less than 10³⁷. The conjecture is now known to be true for n less than 10^(10^10).