Blog of John D. Cook

Lebesgue constants

2026-03-27 04:05:06

I alluded to Lebesgue constants in the previous post without giving them a name. There I said that the bound on order n interpolation error has the form

ch^{n+1} + \lambda \delta

where h is the spacing between interpolation points and δ is the error in the tabulated values. The constant c depends on the function f being interpolated, and to a lesser extent on n. The constant λ is independent of f but depends on n and on the relative spacing between the interpolation nodes. This post will look closer at λ.

Given a set of n + 1 nodes T

a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b

define

\ell_j(x) := \prod_{\begin{smallmatrix}i=0\\ j\neq i\end{smallmatrix}}^{n} \frac{x-x_i}{x_j-x_i}

Then the Lebesgue function is defined by

\lambda_n(x) = \sum_{j=0}^n |\ell_j(x)|

and the Lebesgue constant for the grid is the maximum value of the Lebesgue function

\Lambda_n(T)=\max_{x\in[a,b]} \lambda_n(x)

The values of Λ are difficult to compute exactly, but there is a nice asymptotic expression for Λ when the grid is evenly spaced:

\Lambda_n \sim \frac{2^{n+1}}{n \log n}

When the grid points are at the roots of a Chebyshev polynomial, then

\Lambda_n \approx \frac{2}{\pi} \log(n + 1) + 1

The previous post mentioned the cases n = 11 and n = 29 for evenly spaced grids. The corresponding values of Λ are approximately 155 and 10995642. So 11th order interpolation is amplifying the rounding error in the tabulated points by a factor of 155, which might be acceptable. But 29th order interpolation is amplifying the rounding error by a factor of over 10 million.

The corresponding values of Λ for Chebyshev-spaced nodes are 2.58 and 3.17. Chebyshev spacing is clearly better for high-order interpolation, if you have that option.
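A direct numerical check is easy: evaluate the Lebesgue function on a dense grid and take the maximum. The sketch below is my own construction (the node count, grid resolution, and variable names are arbitrary). Note that the asymptotic formula above overestimates the exact constant for moderate n, so the computed uniform-grid value comes out well below 155, while the Chebyshev value is close to the logarithmic estimate.

```python
import numpy as np

def lebesgue_constant(nodes, a=-1.0, b=1.0, num_samples=20_001):
    """Approximate the Lebesgue constant: the max over [a, b] of
    lambda_n(x) = sum_j |l_j(x)|, sampled on a dense grid."""
    nodes = np.asarray(nodes, dtype=float)
    x = np.linspace(a, b, num_samples)
    lam = np.zeros_like(x)
    for j, xj in enumerate(nodes):
        others = np.delete(nodes, j)
        # Cardinal basis polynomial l_j evaluated on the whole grid at once
        lam += np.abs(np.prod((x[:, None] - others) / (xj - others), axis=1))
    return lam.max()

n = 11
uniform = np.linspace(-1, 1, n + 1)
# Roots of the Chebyshev polynomial T_12
chebyshev = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * (n + 1)))

lam_uniform = lebesgue_constant(uniform)
lam_cheb = lebesgue_constant(chebyshev)
print(lam_uniform)  # dozens: tabulation error is amplified substantially
print(lam_cheb)     # under 3, close to (2/pi) log(12) + 1
```

The qualitative gap is the point: for the same number of nodes, uniform spacing amplifies errors in the tabulated values by more than an order of magnitude compared to Chebyshev spacing.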

The post Lebesgue constants first appeared on John D. Cook.

How much precision can you squeeze out of a table?

2026-03-26 22:33:37

Richard Feynman said that almost everything becomes interesting if you look into it deeply enough. Looking up numbers in a table is certainly not interesting, but it becomes more interesting when you dig into how well you can fill in the gaps.

If you want to know the value of a tabulated function between values of x given in the table, you have to use interpolation. Linear interpolation is often adequate, but you could get more accurate results using higher-order interpolation.

Suppose you have a function f(x) tabulated at x = 3.00, 3.01, 3.02, …, 3.99, 4.00 and you want to approximate the value of the function at π. You could approximate f(π) using the values of f(3.14) and f(3.15) with linear interpolation, but you could also take advantage of more points in the table. For example, you could use cubic interpolation to calculate f(π) using f(3.13), f(3.14), f(3.15), and f(3.16). Or you could use 29th degree interpolation with the values of f at 3.00, 3.01, 3.02, …, 3.29.

The Lagrange interpolation theorem lets you compute an upper bound on your interpolation error. However, the theorem assumes the values at each of the tabulated points are exact. And for ordinary use, you can assume the tabulated values are exact. The biggest source of error is typically the size of the gap between tabulated x values, not the precision of the tabulated values. Tables were designed so this is true [1].

The bound on order n interpolation error has the form

ch^{n+1} + \lambda \delta

where h is the spacing between interpolation points and δ is the error in the tabulated values. The value of c depends on the derivatives of the function you’re interpolating [2]. The value of λ is at least 1 since λδ is the “interpolation” error at the tabulated points.

The accuracy of an interpolated value cannot be better than δ in general, and so you pick the value of n that makes ch^{n+1} less than δ. Any higher value of n is not helpful. And in fact higher values of n are harmful since λ grows exponentially as a function of n [3].

See the next post for mathematical details regarding the λs.

Examples

Let’s look at a specific example. Here’s a piece of a table for natural logarithms from A&S.

Here h = 10^−3, and so linear interpolation would give you an error on the order of h² = 10^−6. You’re never going to get error less than 10^−15 since that’s the error in the tabulated values, so 4th order interpolation gives you about as much precision as you’re going to get. Carefully bounding the error would require using the values of c and λ above that are specific to this context. In fact, the interpolation error is on the order of 10^−8 using 5th order interpolation, and that’s the best you can do.
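To see the plateau concretely, one can simulate such a table. The sketch below is mine, not from A&S: it tabulates ln x on [3, 4] at spacing 10^−3, rounds the entries to 15 decimal places, and interpolates at π through the nearest nodes at increasing order.

```python
import math
import numpy as np

def lagrange_interp(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        others = np.delete(xs, j)
        total += yj * np.prod((x - others) / (xj - others))
    return total

# Simulated table: ln x at x = 3.000, 3.001, ..., 4.000, rounded to 15 decimals
xs = np.round(np.arange(3.0, 4.0005, 0.001), 3)
ys = np.round(np.log(xs), 15)

target = math.pi
errs = {}
for order in (1, 3, 5):
    # Use the order+1 table nodes nearest the target
    idx = np.sort(np.argsort(np.abs(xs - target))[: order + 1])
    approx = lagrange_interp(xs[idx], ys[idx], target)
    errs[order] = abs(approx - math.log(target))
    print(order, errs[order])
```

Running this, the linear-interpolation error is several orders of magnitude larger than the cubic error, and going from cubic to 5th order gains little: the error has already hit the floor set by the precision of the tabulated values.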

I’ll briefly mention a couple more examples from A&S. The book includes a table of sine values, tabulated to 23 decimal places, in increments of h = 0.001 radians. A rough estimate would suggest 7th order interpolation is as high as you should go, and in fact the book indicates that 7th order interpolation will give you 9 figures of accuracy.

Another table from A&S gives 15-digit values of the Bessel function J0 in increments of h = 0.1. It says that 11th order interpolation will give you four decimal places of precision. In this case, fairly high-order interpolation is useful and even necessary. A large number of decimal places is needed in the tabulated values relative to the output precision because the spacing between points is so wide.

Related posts

[1] I say were because of course people rarely look up function values in tables anymore. Tables and interpolation are still widely used, just not directly by people; computers do the lookup and interpolation on their behalf.

[2] For functions like sine, the value of c doesn’t grow with n, and in fact decreases slowly as n increases. But for other functions, c can grow with n, which can cause problems like the Runge phenomenon.

[3] The constant λ grows exponentially with n for evenly spaced interpolation points, and values in a table are evenly spaced. The constant λ grows only logarithmically for Chebyshev spacing, but this isn’t practical for a general purpose table.

The post How much precision can you squeeze out of a table? first appeared on John D. Cook.

From Mendeleev to Fourier

2026-03-24 23:01:35

The previous post looked at an inequality discovered by Dmitri Mendeleev and generalized by Andrey Markov:

Theorem (Markov): If P(x) is a real polynomial of degree n, and |P(x)| ≤ 1 on [−1, 1] then |P′(x)| ≤ n² on [−1, 1].

If P(x) is a trigonometric polynomial then Bernstein proved that the bound decreases from n² to n.

Theorem (Bernstein): If P(x) is a trigonometric polynomial of degree n, and |P(x)| ≤ 1 on [−π, π] then |P′(x)| ≤ n on [−π, π].

Now a trigonometric polynomial is a truncated Fourier series

T(x) = a_0 + \sum_{n=1}^N a_n \cos nx + \sum_{n=1}^N b_n \sin nx

and so the max norm of T′ is no more than N times the max norm of T.
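Bernstein’s bound is easy to sanity-check numerically: pick a random trigonometric polynomial, differentiate it term by term, and compare max norms on a dense grid. The degree, coefficients, and grid size below are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 8  # degree of the trigonometric polynomial
a0 = rng.normal()
a = rng.normal(size=N)  # cosine coefficients a_1, ..., a_N
b = rng.normal(size=N)  # sine coefficients b_1, ..., b_N

x = np.linspace(-np.pi, np.pi, 100_001)
k = np.arange(1, N + 1)[:, None]

T = a0 + a @ np.cos(k * x) + b @ np.sin(k * x)
# Differentiate term by term: d/dx cos(kx) = -k sin(kx), d/dx sin(kx) = k cos(kx)
dT = -(a * k[:, 0]) @ np.sin(k * x) + (b * k[:, 0]) @ np.cos(k * x)

ratio = np.max(np.abs(dT)) / np.max(np.abs(T))
print(ratio, "<=", N)  # Bernstein: the ratio is at most N
```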

This post and the previous one were motivated by Terence Tao’s latest post on Bernstein theory.

Related posts

The post From Mendeleev to Fourier first appeared on John D. Cook.

Mendeleev’s inequality

2026-03-24 20:47:32

Dmitri Mendeleev is best known for creating the first periodic table of chemical elements. He also discovered an interesting mathematical theorem. Empirical research led him to a question about interpolation, which in turn led him to a theorem about polynomials and their derivatives.

I ran across Mendeleev’s theorem via a paper by Boas [1]. The opening paragraph describes what Mendeleev was working on.

Some years after the chemist Mendeleev invented the periodic table of the elements he made a study of the specific gravity of a solution as a function of the percentage of the dissolved substance. This function is of some practical importance: for example, it is used in testing beer and wine for alcoholic content, and in testing the cooling system of an automobile for concentration of anti-freeze; but present-day physical chemists do not seem to find it as interesting as Mendeleev did.

Mendeleev fit his data by patching together quadratic polynomials, i.e. he used quadratic splines. A question about the slopes of these splines led to the following.

Theorem (Mendeleev): Let P(x) be a quadratic polynomial on [−1, 1] such that |P(x)| ≤ 1. Then |P′(x)| ≤ 4.

Mendeleev showed his result to mathematician Andrey Markov who generalized it to the following.

Theorem (Markov): If P(x) is a real polynomial of degree n, and |P(x)| ≤ 1 on [−1, 1] then |P′(x)| ≤ n² on [−1, 1].

Both inequalities are sharp with equality if and only if P(x) = ±Tn(x), the nth Chebyshev polynomial. In the special case of Mendeleev’s inequality, equality holds for

T2(x) = 2x² − 1.

Andrey Markov’s brother Vladimir proved an extension of Andrey’s theorem to higher derivatives.
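Both the bound and its sharpness are easy to check numerically with NumPy’s Chebyshev module; the grid size and the range of n below are arbitrary choices of mine. On a dense grid over [−1, 1], |Tn| stays within 1 while |Tn′| peaks at exactly n², at the endpoints.

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

x = np.linspace(-1, 1, 10_001)  # grid includes the endpoints +-1
for n in range(1, 6):
    T = Chebyshev.basis(n)      # the Chebyshev polynomial T_n
    dT = T.deriv()
    assert np.max(np.abs(T(x))) <= 1 + 1e-12
    # Markov's bound n^2 is attained by T_n at x = +-1
    assert abs(np.max(np.abs(dT(x))) - n * n) < 1e-9
print("Markov's bound is attained by T_n for n = 1..5")
```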

Related posts

[1] R. P. Boas, Jr. Inequalities for the Derivatives of Polynomials. Mathematics Magazine, Vol. 42, No. 4 (Sep., 1969), pp. 165–174

The post Mendeleev’s inequality first appeared on John D. Cook.

Set intersection and difference at the command line

2026-03-23 19:39:47

A few years ago I wrote about comm, a utility that lets you do set theory at the command line. It’s a really useful little program, but it has two drawbacks: the syntax is hard to remember, and the input files must be sorted.

If A and B are two sorted lists,

    comm A B

prints A − B, B − A, and A ∩ B. You usually don’t want all three, and so comm lets you filter the output. It’s a little quirky in that you specify what you don’t want instead of what you do. And you have to remember that 1, 2, and 3 correspond to A − B, B − A, and A ∩ B respectively.

Venn diagram of comm parameters

A couple little scripts can hide the quirks. I have a script intersect

    comm -12 <(sort "$1") <(sort "$2")

and another script setminus

    comm -23 <(sort "$1") <(sort "$2")

that sort the input files on the fly and eliminate the need to remember comm’s filtering syntax.

The setminus script computes A − B. To find B − A call the script with the arguments reversed.

The post Set intersection and difference at the command line first appeared on John D. Cook.

Embedded regex flags

2026-03-21 00:58:42

The hardest part of using regular expressions is not crafting regular expressions per se. In my opinion, the two hardest parts are minor syntax variations between implementations, and all the environmental stuff outside of regular expressions per se.

Embedded regular expression modifiers address one of the environmental complications by putting the modifier in the regular expression itself.

For example, if you want to make a grep search case-insensitive, you pass it the -i flag. To do the same inside a Python program, you pass re.IGNORECASE as an argument. But if you put (?i) at the beginning of your regular expression, then the intention to make the match case-insensitive is embedded directly in the regex. You could use the regex in any environment that supports (?i) without having to know how to specify modifiers in that environment.

I was debugging a Python script this morning that worked under one version of Python and not under another. The root of the problem was that it was using re.findall() with several huge regular expressions that had embedded modifiers. That was OK up to Python 3.5, then it was a warning between versions 3.6 and 3.10, and it’s an error in versions 3.11 and later.

The problem isn’t with all embedded modifiers, only global modifiers that don’t appear at the beginning of the regex. Older versions of Python, following Perl’s lead, would let you put a modifier like (?i) in the middle of a regex, and apply the modifier from that point to the end of the expression. In the latest versions of Python, you can either place the modifier at the beginning of the regex, or use a scoped modifier like (?i:…) in the middle of the expression, which applies the flag only within its group.
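Here is a short demonstration of the two placements; the patterns are contrived examples of mine. A leading (?i) applies to the whole regex, while (?i:…) confines the flag to one group and so remains legal mid-pattern even in Python 3.11 and later.

```python
import re

# A global flag at the start of the pattern applies to the whole regex.
print(re.findall(r"(?i)cat", "Cat cat CAT"))   # ['Cat', 'cat', 'CAT']

# A scoped flag (?i:...) applies only inside its group, so it is
# legal anywhere in the pattern, even in Python 3.11+.
pattern = re.compile(r"abc(?i:def)")
print(bool(pattern.search("abcDEF")))  # True  (def matched case-insensitively)
print(bool(pattern.search("ABCdef")))  # False (abc is still case-sensitive)
```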

I didn’t want to edit the regular expressions in my code—some had over a thousand characters—so I changed re.findall() to regex.findall(). The third-party regex module is generally more Perl-compatible than Python’s standard re module.

The post Embedded regex flags first appeared on John D. Cook.