2024-11-21 02:00:47
A client emailed me a screenshot of a table rather than pasting the table as text into an email.
I thought about using an LLM to convert it to text, but the table is confidential client information and so I shouldn’t upload it anywhere.
I searched for a command line utility to do OCR and found tesseract. I installed it with
sudo apt install tesseract-ocr libtesseract-dev tesseract-ocr-eng
and ran it with the default settings
tesseract screenshot.png textfile
It worked remarkably well. I had to change a C to a U and delete a few extraneous parentheses generated by the software, but otherwise I didn’t have to add or change any text.
I work locally in part out of habit; it was the only way to work when I started using a computer. It has numerous advantages, such as being able to keep working when a hurricane knocks out my internet connection, but above all it is private.
I pay more attention to privacy than is convenient because I work in data privacy. And aside from my privacy, I have to protect our clients’ privacy.
Update: According to the comments, ChatGPT uses tesseract. Assuming that’s true, using tesseract directly is better than ChatGPT because it does exactly what you want. There is no ambiguity about what to expect, and no potential for tinkering with your results before you see them.
2024-11-20 20:40:50
A perfect number is a positive integer equal to the sum of its proper divisors, that is, all of its divisors less than itself. The first three examples are as follows.
6 = 1 + 2 + 3
28 = 1 + 2 + 4 + 7 + 14
496 = 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248
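The definition can be checked directly by brute force. Here is a minimal sketch; the function names are made up for illustration, and trial division is only practical for small numbers.

```python
def proper_divisor_sum(n: int) -> int:
    """Return the sum of the divisors of n that are less than n."""
    if n < 2:
        return 0
    total = 1  # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def is_perfect(n: int) -> bool:
    """A perfect number equals the sum of its proper divisors."""
    return n > 0 and proper_divisor_sum(n) == n


perfect_below_1000 = [n for n in range(1, 1000) if is_perfect(n)]
print(perfect_below_1000)  # prints [6, 28, 496]
```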
Every known perfect number is even. No one has proved that odd perfect numbers don’t exist, but people keep proving properties that an odd perfect number must have, and maybe some day this list of properties will contain a contradiction, proving that such numbers don’t exist. For example, an odd perfect number, if it exists, must have over 1,500 digits.
A Mersenne prime is a prime number of the form 2^p − 1. Euclid proved that if M is a Mersenne prime, then M(M + 1)/2 is an even perfect number [1]. Leonhard Euler proved the converse of Euclid’s theorem two millennia later: all even perfect numbers have the form M(M + 1)/2 where M is a Mersenne prime.
There are currently 52 known Mersenne primes. The number of Mersenne primes is conjectured to be infinite, and the Mersenne primes discovered so far have roughly fit the projected distribution of such primes, so there is reason to suspect that there are infinitely many perfect numbers. There are at least 52.
By Euler’s theorem, all even perfect numbers have the form M(M + 1)/2, and so every even perfect number P is a triangular number:
P = 1 + 2 + 3 + … + M
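As a rough illustration of the Euclid-Euler correspondence, the sketch below finds the small Mersenne primes M = 2^p − 1 by trial division, forms M(M + 1)/2, and confirms that each result equals 1 + 2 + … + M. The helper names and the cutoff p ≤ 13 are arbitrary choices for this example.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def even_perfect_numbers(max_p: int):
    """Return (p, M, M(M+1)/2) for each prime M = 2^p - 1 with 2 <= p <= max_p."""
    found = []
    for p in range(2, max_p + 1):
        m = 2**p - 1
        if is_prime(m):
            found.append((p, m, m * (m + 1) // 2))
    return found


results = even_perfect_numbers(13)
for p, m, perfect in results:
    # Each perfect number is the M-th triangular number.
    assert perfect == sum(range(1, m + 1))
    print(p, m, perfect)
```

For p up to 13 this yields the perfect numbers 6, 28, 496, 8128, and 33550336.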
Even perfect numbers have the form 2^{p−1}(2^p − 1), and so this means all even perfect numbers when written in binary consist of p ones followed by p − 1 zeros.
For example,
496 = 31 × 32 / 2 = 2^4(2^5 − 1)
and 496 = 111110000 in base two, five ones followed by four zeros.
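The binary pattern is easy to confirm with Python's built-in `bin`. This is only a quick check; the function name is invented for the example. Note that the bit pattern holds for any p, but the number is perfect only when 2^p − 1 is prime.

```python
def binary_of_euclid_form(p: int) -> str:
    """Binary digits of 2^(p-1) * (2^p - 1), without the '0b' prefix."""
    n = 2 ** (p - 1) * (2**p - 1)
    return bin(n)[2:]


# Exponents of the first four Mersenne primes.
for p in (2, 3, 5, 7):
    bits = binary_of_euclid_form(p)
    assert bits == "1" * p + "0" * (p - 1)
    print(p, bits)

print(binary_of_euclid_form(5))  # prints 111110000, the binary form of 496
```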
[1] Euclid didn’t use the term “Mersenne prime” because he lived 17 centuries before Marin Mersenne, but he did prove that if 2^p − 1 is prime, then 2^{p−1}(2^p − 1) is perfect.
The post Perfect numbers first appeared on John D. Cook.
2024-11-19 19:57:51
I stumbled on a post on X this morning, a commentary on the photo of RFK eating food from McDonald’s that has been making the rounds.
This photo divides Puritans from Southerners.
Puritans think because RFK Jr is on the side of health food he can never commit such a “sin.”
Southerners think a rare treat is fine & it’s more important to be gracious in a gathering than to be a stickler.
I can’t read minds, and so I don’t know what RFK’s motives were, but I do believe the post is right about Southern graciousness. Of course this disposition is not limited to the South, nor does everyone in the South live this way.
A puritanical mindset, more typical of metaphorical Puritans than literal Puritans, is flat: all virtues are equally important. A gracious mindset is hierarchical: some virtues take precedence over others. And in a traditional Southern mindset, not offending hosts or guests has high priority.
The post Food and Grace first appeared on John D. Cook.
2024-11-18 21:14:31
I’ve had a Bluesky account for over a year, but never posted much on it. Recently I noticed I’d gotten more followers on Bluesky and thought I might try posting there more often.
I am not moving to Bluesky. I have orders of magnitude more followers on X than on Bluesky and so I will focus my effort on X. But I expect to post on Bluesky occasionally.
I learned this morning that there is a bridge that will automatically post your Mastodon content to Bluesky. You should be able to follow
johndcook.mathstodon.xyz.ap.brid.gy
on Bluesky to see the content that I post on my Mastodon account at
johndcook.mathstodon.xyz
Note the server is mathstodon, not mastodon.
It may take some time before the bridge works. I suppose information needs to be propagated analogous to how DNS works. At the time of writing, I can see the bridge account by going to the Bluesky page for the bridge, but I’ve not yet been able to pull up the account in the Bluesky app.
Update: The bridge account is working from the Bluesky app. I’ve posted two articles to Mastodon and verified that they appear in the bridge account.
Just to be clear, we’re talking about two separate Bluesky accounts. My Bluesky account is
johndcook.bsky.social
but my Mastodon posts will be automatically posted to a separate account,
johndcook.mathstodon.xyz.ap.brid.gy.
The post Bluesky account first appeared on John D. Cook.
2024-11-17 04:48:27
The basic idea of GPS is that if you know the distance to several satellites, you can figure out your position. But you don’t actually know, or need to know, the distance to the satellites: you know the time (according to each satellite’s clock) when the signals were sent, and you know the time (according to your clock) when the signals arrived.
The atomic clocks on satellites are synchronized with each other to within a nanosecond, but they’re not synchronized with your clock. There is some offset t between your clock and the satellites’ clocks. Presumably t is small, but it matters.
If you observe m satellites, you have a system of m equations in 4 unknowns:
||a_i − x|| = t_i − t
where a_i is the known position of the i-th satellite in three dimensions, x is the observer’s position in three dimensions, and t_i is the difference between the time when the signal left the i-th satellite (according to its clock) and the time when the signal arrived (according to the observer’s clock). This assumes we choose units so that the speed of light is c = 1.
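To make the system concrete, here is a small sketch that evaluates how far a candidate (x, t) is from satisfying the equations. The satellite positions, the observer position, and the offset are invented numbers in units where c = 1. They are not real GPS data, and `gps_residuals` is a made-up helper name.

```python
import math


def gps_residuals(satellites, times, x, t):
    """Return ||a_i - x|| - (t_i - t) for each satellite; all zeros at a solution."""
    return [math.dist(a, x) - (t_i - t) for a, t_i in zip(satellites, times)]


# Hypothetical setup: four satellites, a known observer position and clock offset.
satellites = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0), (2.0, 2.0, 2.0)]
x_true = (0.1, 0.2, 0.3)
t_true = 0.05

# Synthesize the observed times from the model: t_i = ||a_i - x|| + t.
times = [math.dist(a, x_true) + t_true for a in satellites]

print(gps_residuals(satellites, times, x_true, t_true))       # all essentially zero
print(gps_residuals(satellites, times, (0.0, 0.0, 0.0), 0.0))  # nonzero for a wrong guess
```

A solver would search for the (x, t) that drives all m residuals to zero; this sketch only evaluates them.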
So we have a system of m equations in 4 unknowns. It’s plausible there could be a unique solution provided m = 4. However, this is not guaranteed.
Here’s an example to suggest why there may not be a unique solution. Suppose t is known to be 0. Then observing 3 satellites will give us 3 equations in 3 unknowns. Each t_i determines a sphere of radius t_i. Suppose two spheres intersect in a circle, and the third sphere intersects this circle in two points. This means we have two solutions to our system of equations.
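One simple way to produce such a pair of solutions is to put all three sphere centers in a common plane, so that reflecting a solution across that plane gives another solution. The coordinates below are invented for illustration, and placing the centers in the plane z = 0 is an assumption of this sketch, not something the GPS problem requires.

```python
import math

# Three hypothetical satellite positions, all in the plane z = 0.
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# A point off the plane and its mirror image across the plane.
point = (0.3, 0.4, 0.5)
mirror = (0.3, 0.4, -0.5)

# With t = 0, each t_i is just the distance from the observer to satellite i.
radii_point = [math.dist(c, point) for c in centers]
radii_mirror = [math.dist(c, mirror) for c in centers]

# Both points lie on all three spheres, so the system has two solutions.
same = all(math.isclose(r1, r2) for r1, r2 in zip(radii_point, radii_mirror))
print(same)  # prints True
```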
In [1] the authors thoroughly study the solution to the GPS system of equations. They allow the satellites and the observer to be anywhere in space and look for conditions under which the system has a unique solution. In practice, GPS satellites are approximately confined to a sphere and the observer is also approximately confined to a sphere, namely the earth’s surface, but the authors do not take advantage of these assumptions.
The authors also assume the problem is formulated in n-dimensional space, where n does not necessarily equal 3. It’s complicated to state when the system of equations has a unique solution, but allowing n to vary does not add to the complexity.
I’m curious whether there are practical uses for the GPS problem when n > 3. There are numerous practical problems involving the intersections of spheres in higher dimensions, where the dimensions are not Euclidean spatial dimensions but rather abstract degrees of freedom. But offhand I cannot think of a problem that would involve the time offset that GPS location finding has.
[1] Mireille Boutin, Gregor Kemper. Global positioning: The uniqueness question and a new solution method. Advances in Applied Mathematics 160 (2024).
The post The mathematics of GPS first appeared on John D. Cook.