2025-01-10 06:05:58
Every six months or so, this neat group called the International Earth Rotation and Reference Systems Service (IERS) issues a bulletin saying whether there will be a leap second inserted at the end of that six-month period. You usually find out at the beginning of January or the beginning of July, and thus would have a leap second event at the end of June or December, respectively.
Ten years ago, in January 2015, they announced a leap second would be added at the end of June 2015. The last one had been three years prior, and when it happened, it screwed things up pretty bad for the cat picture factory. They hit kernel problems, userspace problems, and worse.
This time, I was working there, and decided there would not be a repeat. The entire company's time infrastructure would be adjusted so it would simply slow down for about 20 hours before the event, and so it would become a whole second "slow" relative to the rest of the world. Then at midnight UTC, the rest of the world would go 58, 59, 60, 0, and we'd go 57, 58, 59, 0, and then we'd be in lock-step again.
So how do you do something like this? Well, you have to get yourself into a position where you can add a "lie" to the time standard. This company had a handful of these devices which had a satellite receiver for GPS on one side and an Ethernet port for NTP on the other with a decent little clock on the inside. I just had to get between those and everyone else so they would receive my adjusted time scale for the duration, then we could switch back when things were ready.
This is the whole "leap smearing" thing that you might have heard of if you run in "time nut" circles. Someone else came up with it and they had only published their formula for computing the lie over a spread of time. The rest of it was "left as an exercise for the reader", so to speak.
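For reference, the formula they published (as I remember it) was a cosine ramp: at time t into a smear window of width w, you lie about (1 - cos(pi * t / w)) / 2 of a second. A quick sketch of that in C, with names of my own invention:

    #include <math.h>

    /* Sketch of the published "lie" formula as I understand it: t is
     * seconds elapsed into the smear window, w is the window width in
     * seconds, and the return value is the fraction of the leap second
     * to subtract from the time we hand out. */
    double smear_lie(double t, double w)
    {
        if (t <= 0.0) return 0.0;
        if (t >= w) return 1.0;
        return (1.0 - cos(M_PI * t / w)) / 2.0;
    }

The nice property of the cosine shape is that the rate of change starts at zero, peaks in the middle, and tapers back to zero, so nothing downstream sees a sudden frequency step at either end of the window.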
Work like this benefits from being highly visible, so I bought a pair of broadcast-studio style clocks which spoke NTP over Ethernet and installed them on my desk. One of them was pointed at the usual GPS->NTP infrastructure, and the other was pointed at the ntp servers running my hacked-up code which could have "lies" injected.
I'd start up a test and watch them drift apart. At first, you can't even tell, but after a couple of hours, you get to where one subtly updates just a bit before the other one. You can even see it in pictures: parts of one light up before the other.
Then at the end of the ramp, they're a full second apart, but they're still updating at the same time. It's just that one goes from 39 to 40 when the other goes from 40 to 41.
Back and forth I went with my test clocks, test systems, and a handful of guinea pig boxes that volunteered to subscribe to the hacked-up time standard during these tests. We had to find a rate-of-change that would be accepted by the ntp daemons all over the fleet. There's only so much of a change you can introduce to the rate of change itself, and that meant a lot of careful experimentation to find out just what would work.
We ended up with something like 20 hours to smear off a single second. For what it's worth, that works out to an average drift of about 14 parts per million (one second over 72000 seconds), comfortably inside the 500 ppm that a stock ntpd will slew before it gives up and steps the clock instead.
The end of June approached, and it was time to do a full-scale test. I wanted to be sure that we could survive being a second out of whack without having the confounding factor of the whole rest of the world simultaneously dealing with their own leap second stuff. We needed to know if we'd be okay, and the only way to know was to smear it off, hold a bit to see if anything happened, then *smear it back on*.
This is probably the first anyone outside the company's heard of it, but about a week before, I smeared off a whole second and left the ENTIRE company's infra (laptops and all) running a second slow relative to the rest of the world. Then we stayed there for a couple of hours if I remember correctly, and then went forward again and caught back up.
A week later, we did it for real and it just worked.
So, yes, in June 2015, I slowed down the whole company by a second.
Of course, here it is ten years later, and the guy in charge just sent it back fifty years. Way to upstage me, dude.
2025-01-07 04:59:42
Many years ago, I was given one of those massive radio clocks that also reported indoor/outdoor temperature and humidity. They were apparently sold at Costco under the "SkyScan" brand, and had two parts: the (very large) indoor display which stood upright with a little plastic foot in the back, and a small sensor which was intended to be placed outside.
It was never quite 100%. It had issues receiving the 60 kHz WWVB clock signal and so it would get out of sync. It also apparently had its own rules for DST baked in, so when the US changed them in 2007, it would go stupid twice a year for a couple of weeks.
It had other problems. I once caught it saying that it was the 9th day of the 14th month, and oh hey, it's a Tuesday. Lousy Smarch weather or no, that's so unbelievably wrong I just had to stop using it.
However, I kept the sensor as more of a curiosity. In recent years I got into the whole business of tracking temperatures in certain interesting places, and this old sensor kept bugging me. The usual tools can see it doing its 433 MHz announcements but nobody seems to have decoded it, or at least, documented it. I've been taking a whack at it every now and then, and finally got it to a usable spot.
So, for the sake of the dozen or so people out there who might have one of these things nearly 20 years later and who want to use just the remote sensor, here's what I've learned about it.
The encoding is no frills: on is 1, off is 0. The on pulse or the off gap lasts about 3900 us (curiously close to 1 sec / 256, for what it's worth).
Transmissions are sent twice, twelve seconds apart. The first one has a particular bit set to 0, and the second one has it set to 1, so I've called it the transmission number. The whole process restarts every 60 seconds.
There appears to be some kind of preamble or sync which is just 10.
Bits 2-33 are followed by inverted versions of themselves in bits 34-65.
Humidity is BCD: 4 bits and another 4 bits, so [4] [7] for 47%.
The channel seems to be 3 bits but is only ever 1, 2, or 3.
Temperature is also BCD with 3 groups of 4 bits: [3] [2] [7] == 32.7C.
All of this has been determined empirically. There are timing diagrams for this and other similar devices on the usual "fccid" sites, but none of them match this scheme exactly.
So, from the top, 2 preamble bits (10), then...
4 + 4 humidity bits (BCD)
1 transmission bit
3 channel bits
1 sign bit (1 = negative temperature)
1 battery bit (1 = battery is low)
6 id bits
4 + 4 + 4 temperature bits (BCD)
... and then the part past the preamble repeats, inverted. This can be used to detect errors.
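If you want to turn that description into code, here's a sketch of a decoder, assuming you've already demodulated a frame into 66 bits. The struct and the names are mine; the offsets are straight from the list above.

    /* A sketch decoder for the layout above. Input is all 66 bits as
     * 0/1 bytes, in transmission order: 2 preamble bits, 32 payload
     * bits (2-33), then the inverted copy (34-65). */
    struct skyscan {
        int humidity;     /* percent, from 2 BCD digits */
        int txnum;        /* 0 = first send, 1 = the repeat 12s later */
        int channel;      /* only ever 1, 2, or 3 */
        int battery_low;  /* 1 = battery is low */
        int id;
        double temp_c;
    };

    /* Read "len" bits starting at "off" as a plain binary number. */
    static int field(const unsigned char *b, int off, int len)
    {
        int v = 0;
        for (int i = 0; i < len; i++)
            v = (v << 1) | b[off + i];
        return v;
    }

    /* Read "groups" 4-bit BCD digits: [3][2][7] becomes 327. */
    static int bcd(const unsigned char *b, int off, int groups)
    {
        int v = 0;
        for (int i = 0; i < groups; i++)
            v = v * 10 + field(b, off + 4 * i, 4);
        return v;
    }

    int skyscan_decode(const unsigned char *b, struct skyscan *out)
    {
        if (b[0] != 1 || b[1] != 0)
            return -1;                   /* preamble must be "10" */

        for (int i = 2; i < 34; i++)     /* 34-65 = 2-33, inverted */
            if (b[i] == b[i + 32])
                return -1;               /* corrupted frame */

        out->humidity    = bcd(b, 2, 2);            /* bits 2-9 */
        out->txnum       = b[10];
        out->channel     = field(b, 11, 3);
        out->battery_low = b[15];
        out->id          = field(b, 16, 6);
        out->temp_c      = bcd(b, 22, 3) / 10.0;    /* bits 22-33 */
        if (b[14])                                  /* sign bit */
            out->temp_c = -out->temp_c;
        return 0;
    }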
For the sake of grep bait: the SkyScan 81690 with a sensor bearing the FCC ID of OQH-000000-05-002, or perhaps OQH-CHUNGS-05-002. It might be a "C-8102" model, but again, all of the FCC stuff I've been able to find is not quite right.
2025-01-05 08:00:01
I've been thinking about things that annoy me about other web pages. Safari recently gained the ability to "hide distracting items" and I've been having great fun telling various idiot web "designers" to stuff it. Reclaiming a simple experience free of wibbly wobbly stuff has been great.
In doing this, I figured maybe I should tell people about the things I don't do here, so they realize how much they are "missing out" on.
I don't force people to have Javascript to read my stuff. The simplest text-based web browser going back about as far as you can imagine should be able to render the content of the pages without any trouble. This is because there's no JS at all in these posts.
I don't force you to use SSL/TLS to connect here. Use it if you want, but if you can't, hey, that's fine, too.
The last two items mean you could probably read posts via telnet as long as you were okay with skipping over all of the HTML <tag> <crap>. You might notice that the text usually word-wraps around 72, so it's not that much of a stretch.
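If you've never done it, it looks something like this (the blank line after the last header matters, since that's what ends the request):

    $ telnet rachelbythebay.com 80
    GET /w/2024/12/17/packets/ HTTP/1.1
    Host: rachelbythebay.com
    Connection: close

... and the HTML comes right back at you, word-wrapped and everything.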
I don't track "engagement" by running scripts in the post's pages that report back on how long someone's looked at it... because, again, no JS.
I don't set cookies. I also don't send unique values for things like Last-Modified or ETag which also could be used to identify individuals. You can compare the values you get with others and confirm they are the same.
I don't use visitor IP addresses outside of a context of filtering abuse.
I don't do popups anywhere. You won't see something that interrupts your reading to ask you to "subscribe" and to give up your e-mail address.
I don't do animations outside of one place. Exactly one post has something in it which does some graphical crap that changes by itself. It's way back in July 2011, and it's in a story ABOUT animating a display to show the absence of a value. It doesn't try to grab your attention or mislead you, and it's not selling anything.
I don't use autoplaying video or audio. There are a couple of posts where you can click on your browser's standard controls to start playback of a bit of audio that's related to the post. Those are also not used to grab your attention, mislead you, or sell something.
I don't try to "grab you" when you back out of a page to say "before you go, check out this other thing". The same applies to closing the window or tab: you won't get this "are you sure?" crap. If you want out, you get out *the first time*.
I don't pretend that posts are evergreen by hiding their dates. Everything has a clear date both in the header of the page and built into the URL. If it's out of date, it'll be pretty obvious.
I don't put crap in the pages which "follows you" down the page as you scroll. You want to see my header again? Cool, you can scroll back up to it if it's a particularly long post. I don't keep a "dick bar" that sticks to the top of the page to remind you which site you're on. Your browser is already doing that for you.
There are no floating buttons saying things like "contact me" or "pay me" or "check out this service I totally didn't just write this post to hawk on the red or orange sites". I don't put diagonal banner things across the corners. I don't blur it out and force you to click on something to keep going. TL;DR I don't cover up the content, period.
I don't mess with the scrolling of the page in your browser. You won't get some half-assed attempt at "smoothing" from anything I've done. You won't get yanked back up to the top just because you switched tabs and came back later.
I don't do some half-assed horizontal "progress bar" as you scroll down the page. Your browser probably /already/ has one of those if it's graphical. It's called the scroll bar. (See also: no animations.)
I don't litter the page with icons that claim to be for "sharing" or "liking" a post but which frequently are used to phone home to the mothership for a given service to report that someone (i.e., YOU) has looked at a particular page somewhere. The one icon you will find on all posts links to the "how-to" page for subscribing to my Atom feed, and that comes from here and phones home to nobody.
I don't use "invisible icons" or other tracker crap. You won't find evil 1x1s or things of that nature. Nobody's being pinged when you load one of these posts.
I don't load the page in parts as you scroll it. It loads once and then you have it. If you get disconnected after that point, you can still read the whole thing. There's nothing more to be done.
I don't add images without ALTs and/or accompanying text in the post which aims to describe what's going on for the sake of those who can't get at the image for whatever reason (and there are a great many). (Full disclosure: I wasn't always so good at writing the descriptions, and old posts that haven't been fixed yet are hit or miss.)
I don't do nefarious things to "outgoing links" to report back on which ones have been clicked on by visitors. A link to example.com is just <a href="http://example.com/">blah blah blah</a> with no funny stuff added. There are no ?tracking_args added or other such nonsense, and I strip them off if I find them on something I want to use here. If you click on a link, that's between you and your browser, and I'm none the wiser. I really don't want to know, anyway. I also don't mess with whether it opens in a tab or new window or whatever else.
I don't redirect you through other sites and/or domains in order to build some kind of "tracking" "dossier" on you. If you ask for /w/2024/12/17/packets/, you get that handed to you directly. (And if you leave off the trailing slash, you get a 301 back to that, because, well, it's a directory, and you really want the index page for it.)
I don't put godawful vacuous and misleading clickbait "you may be interested in..." boxes of the worst kind of crap on the Internet at the bottom of my posts, or anywhere else for that matter.
My pages actually have a bottom, and it stays put. If you hit [END] or scroll to the bottom, you see my footer and that's it. It won't try to jam more crap in there to "keep you engaged". That's it. If you want more stuff to read, that's entirely up to you, and you can click around to do exactly that.
I don't make any money just because someone lands on one of my posts. You won't find ads being injected by random terrible companies. In fact, keeping this stuff up and available costs me a chunk every month (and always has). I sell the occasional book and get the occasional "buy me a cup of tea or lunch" type of thing, and I do appreciate those. (I tried doing paid watch-me-code "lessons" years ago, but it really didn't go anywhere, and it's long gone now.)
I'm pretty sure everything that loads as part of one of my posts is entirely sourced from the same origin - i.e., http[s]://rachelbythebay.com/ something or other. The handful of images (like the feed icon or the bridge pic), sounds, the CSS, and other things "inlined" in a post are not coming from anywhere else. You aren't "leaving tracks" with some kind of "trust me I'm a dolphin" style third-party "CDN" service. You connect to me, ask for stuff, and I provide it. Easy.
I say "pretty sure" on the last one because there are almost 1500 posts now, and while my page generation stuff doesn't even allow for an IMG SRC that comes from another origin, there are some "raw" bits of HTML in a few old weird posts that break the usual pattern. I don't think I've ever done an IMG or SOURCE or LINK from off-site in a raw block, though.
I don't even WANT stuff coming from off-site, since it tends to break. I find that I can really only rely on myself to keep URLs working over time.
Phew! That's all I can think of for the moment.
2024-12-28 06:33:10
Questions, questions, questions. Sometimes I have answers. Here we go with more reader feedback.
...
M.D. asks if I could share a story about when a company "did something really wrong".
OK, how about yet another case of "real names" gone wrong? I'm talking about the thing where some derpy programmer writes a filter to "exclude naughty words" and ends up rejecting people with names that HAPPEN to match a four-letter English word. My canonical example is "Nishit" because that's what actually happened at another job back around 2009.
But that's old news. I'm talking about "yet another case". This one is from right at the end of 2019. It seems like someone decided they were going to "move a metric" from the support queue. There were a TINY NUMBER of customers who had deliberately signed up with obviously offensive names. They were being handled by reasonable bags of mostly water (i.e., people) who could look at it and figure out what to keep and what to purge.
Well, the derpy programmer with a goal to hit by the end of the quarter apparently struck again, and they wrote a thing to ban anyone who matched a short list. Then they ran it, and - surprise - it banned a bunch of real people who weren't doing anything wrong. Of course, those people have probably had many problems on other services, and now THIS company was the latest one to show how damn stupid and unfeeling it could be.
"My last name really is 'Cocks'. How would you like me to proceed?"
Unsurprisingly, this pissed off a bunch of people and generated a blip in the news cycle. Internally, it was brought to the weekly review meeting that I was somehow still running at that point. Someone was there and presented the case, and it was pretty clear they were only going through the motions because they'd been called on the carpet.
For some really stupid reason (literally every other senior engineer and manager was at some offsite planning thing that morning, and *someone* had to run this meeting), I was the most senior person in the room, and so I felt I had to ask them the question as "the voice of the company" (whatever that even means):
"Can I get you to promise to never do this again?"
They wouldn't commit to it. I got no reply. They just looked at me. Conclusion: this will definitely happen again. Nobody gave a damn about what happened to the customers, and how bad the whole thing looked.
Afterwards, I talked to some friends who had worked in the trenches in customer support. They knew what was happening in terms of the "trouble reports" that would come in from people using the app. They had a good feel for what was actually a problem and what was clearly "OKR scamming".
Near as we can figure, they decided to code this up because it would let them claim to have automated some class of tickets that were being filed. It's like, sure, it would in fact remove the handful of tickets that get filed about this. It would also generate a godawful amount of hurt (and bad PR and so on) a few hours or days later, and would have to be turned off. But, the person managed to ship the feature, and so they can get their bonus, or promotion, or whatever.
Of course, karma is a bitch. A few months later, COVID hit and the company started laying people off in droves. I bet all of those people are gone now. Unfortunately, this also means anyone who learned a lesson from this event is probably gone, too. Hmph.
For anyone who's in today's "lucky 10,000" set, have some additional reading on this topic.
...
A reader responded to my "mainless" program post and asked if you could avoid the segfault by putting "_exit()" into the destructor or similar. They're totally right. If you make the program bail out instead of trying to carry along into uncharted territory, it will in fact exit cleanly. You already have unistd.h in my example, so go for it.
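For anyone who wants to play along at home, here's a minimal sketch. I'm assuming a mainless setup built by supplying your own _start and linking with -nostartfiles, which is not necessarily how the original post did it, but the failure mode (and the fix) is the same:

    /* Build: cc -nostartfiles mainless.c
     * _start has no caller, so "returning" from it wanders off into
     * uncharted territory and usually segfaults. Bailing out with
     * _exit() sidesteps the whole problem. */
    #include <unistd.h>

    void _start(void)
    {
        static const char msg[] = "hello from a mainless program\n";
        write(STDOUT_FILENO, msg, sizeof(msg) - 1);
        _exit(0);    /* comment this out to see the crash */
    }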
...
Another reader commented on the same post, asking just "The file name is a bit worrying. Is everything alright?". I guess they missed the link to knowyourmeme.com, or are generally unfamiliar with the notion of "things that aren't really trolling but are sort of funny kind of".
When it comes to that kind of stuff, that lizard gets me.
Hhhhhehe.
...
Jeff asks if I'm on Mastodon. I am not using that (or any other aspect of the "Fediverse"). I am also not on Twitter. I used to have a "business" account out there for a while, but never really used it, and deleted it last year when it became apparent where Twitter was headed. I stand by that decision.
As someone who runs a site that occasionally gets links shared around, I should mention that there are some dubious parts of the whole Fediverse thing. Posting a link to one of these things basically summons a giant swarm of (bot) locusts who all want to generate a link preview at the same time. I was going to write a post about this a while back, but it would be short and kind of threadbare, so I'll just mention it here instead.
Now, since all of this /w/ stuff is just a bunch of flat files on disk, all of it gets served out of memory and it's like ho hum. I can't run out of database connections or anything like that. I'm basically limited by the level of bandwidth I pay for on my switch ports. Eventually, they all have what they came for and it stops. But, for a minute or two, it can be interesting to watch.
There's a certain "cadence" to a web server's activity logs as seen in "tail -f". When this happens, it scrolls so fast you can barely read it. You definitely notice.
This is far from a new issue. A report that it can collectively be used "as a DDOS tool" was filed back in August 2017, when it was a far smaller problem.
It should not surprise anyone who's been doing this kind of thing for a while that the bug report in question (which I won't link here, but which people will still look up and try to brigade) has been closed since 2018.
Did anyone ever see the classic Simpsons episode where someone falls down a well, and at the end of the episode they "solve the problem" by sticking a "CAUTION WELL" sign in the ground?
"That should do it." - Groundskeeper Willie
I can hear his voice in my head any time I see a bug like that.
2024-12-18 11:13:27
I don't think people really appreciate what kind of mayhem some of their software gets up to. I got a bit of feedback the other night from someone who's been confounded by the site becoming unreachable. Based on running traceroutes, this person thinks that maybe it's carrier A or carrier B, or maybe even my own colocation host.
I would have responded to this person directly, but they didn't leave any contact info, so all I can do is write a post and hope it reaches them and others in the same situation.
It's not any of the carriers and it's not Hurricane Electric. It's my end, and it's not an accident. Hosts that get auto-filtered are usually running some kind of feed reader that flies in the face of best practices, and then annoys the web server, receives 429s, and then ignores those and keeps on going.
The web server does its own thing. I'm not even in the loop. I can be asleep and otherwise entirely offline and it'll just chug along without me.
A typical timeline goes like this: the reader shows up and fetches the feed, gets a 200, and comes back 20 minutes later with another unconditional request. And another. And another, ignoring the 429s the whole way. Somewhere around here, the web server decided that it wasn't being listened to, and so it decided it was going to stop listening, too.
Some time after this, it will "forgive" and then things will work again, but of course, if there's still a bad feed reader running out there, it will eventually start this process all over again.
A 20 minute retry rate with unconditional requests is wasteful. That's three requests per hour, so 72 requests per day, and with the feed itself weighing in at about half a megabyte, that'd be about 36 MB of traffic that's completely useless because it would be the same feed contents over and over and over.
Multiply that by a bunch of people because it's a popular feed, and that should explain why I've been tilting at this windmill for a while now.
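For the feed reader authors in the audience, the fix has been in HTTP for decades: conditional requests. Here's a sketch with libcurl, where the URL and the saved validator values are placeholders:

    /* Sketch of a polite conditional fetch using libcurl. A real
     * reader would save the ETag and Last-Modified from its previous
     * 200 and send them back, so an unchanged feed costs a 304 and
     * almost no bytes instead of the whole thing again. */
    #include <curl/curl.h>
    #include <stdio.h>

    int main(void)
    {
        long status = 0;
        struct curl_slist *hdrs = NULL;
        CURL *curl;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        curl = curl_easy_init();
        if (!curl) return 1;

        /* Saved from the previous successful fetch (placeholders). */
        hdrs = curl_slist_append(hdrs, "If-None-Match: \"some-etag\"");
        hdrs = curl_slist_append(hdrs,
            "If-Modified-Since: Tue, 30 Apr 2024 04:03:30 GMT");

        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/feed.atom");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);

        if (curl_easy_perform(curl) == CURLE_OK) {
            curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
            if (status == 304)
                puts("not modified: nothing to download or parse");
            else if (status == 429)
                puts("too many requests: back off, honor Retry-After");
        }

        curl_slist_free_all(hdrs);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }

Keep the validators from each 200, hand them back next time, and the server gets to answer 304 with no body at all. Everybody wins.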
If you're running a feed reader and want to know what its behavior looks like, the "feed reader score" project thing I set up earlier this year is still running, and is just humming along, logging data as always.
You just point your reader at a special personalized URL, and you will receive a feed with zero nutritional content but many of your reader's behaviors (*) will be analyzed and made available in a report page.
It's easy... and I'm not even charging for it. (Maybe I should?)
...
(*) I say _many_ of the behaviors since a bunch of these things have proven that my approach of just handing people a bunch of uniquely-keyed paths on the same host is not nearly enough. Some of these feed readers just go and make up their own paths and that's garbage, but it also means my dumb little CGI program at /one/particular/path doesn't see it. It also means that when they drill / or /favicon.ico or whatever, it doesn't see it. I can't possibly predict all of their clownery, and need a much bigger hammer.
There's clearly a Second System waiting to be written here.
As usual, the requirements become known after you start doing the thing.
2024-12-13 02:52:01
Hey there. Are you one of these "Fediverse" enthusiasts? Are you hard core enough to run an instance of some of this stuff? Do you run Pleroma? Is it version 2.7.0? If so, you probably should do something about that, like upgrading to 2.7.1 or something.
Based on my own investigations into really bad behavior in my web server logs, there's something that got into 2.7.0 that causes dumb things to happen. It goes like this: first, it shows up and does a HEAD. Then it comes back and does a GET, but it sends complete nonsense in the headers. Apache hates it, and it gets a 400.
What do I mean by nonsense? I mean sending things like "etag" *in the request*. Guess what, that's a server-side header. Or, sending "content-type" and "content-length" *in the request*. Again, those are server-side headers unless you're sending a body, and why the hell would you do that on a GET?
I mean, seriously, I had real problems trying to understand this behavior. Who sends that kind of stuff in a request, right? And why?
This is the kind of stuff I was seeing on the inbound side:
    raw_header { name: "user-agent" value: "Pleroma 2.7.0-1-g7a73c34d; < guilty party removed >" }
    raw_header { name: "date" value: "Thu, 05 Dec 2024 23:52:38 GMT" }
    raw_header { name: "server" value: "Apache" }
    raw_header { name: "last-modified" value: "Tue, 30 Apr 2024 04:03:30 GMT" }
    raw_header { name: "etag" value: "\"26f7-6174873ecba70\"" }
    raw_header { name: "accept-ranges" value: "bytes" }
    raw_header { name: "content-length" value: "9975" }
    raw_header { name: "content-type" value: "text/html" }
    raw_header { name: "Host" value: "rachelbythebay.com" }
Sending date and server? What what what?
Last night, I finally got irked enough to go digging around in their git repo, and I think I found a smoking gun. I don't know Elixir *at all*, so this is probably wrong on multiple levels, but something goofy seems to have changed with a commit in July, resulting in this:
    def rich_media_get(url) do
      headers = [{"user-agent", Pleroma.Application.user_agent() <> "; Bot"}]

      with {_, {:ok, %Tesla.Env{status: 200, headers: headers}}} <-
             {:head, Pleroma.HTTP.head(url, headers, http_options())},
           {_, :ok} <- {:content_type, check_content_type(headers)},
           {_, :ok} <- {:content_length, check_content_length(headers)},
           {_, {:ok, %Tesla.Env{status: 200, body: body}}} <-
             {:get, Pleroma.HTTP.get(url, headers, http_options())} do
        {:ok, body}
Now, based on my addled sense of comprehension for this stuff, this is just a guess, but it sure looks like it populates "headers" with a user-agent and fires that off as a HEAD. Then the pattern match on the response rebinds "headers" to the *incoming* headers, and it turns the whole mess around and sends them back out as a GET.
Assuming I'm right, that would explain the really bizarre behavior.
There was another commit about a month later and the code changed quite a bit, including a telling change to NOT send "headers" back out the door on the second request:
    defp head_first(url) do
      with {_, {:ok, %Tesla.Env{status: 200, headers: headers}}} <-
             {:head, Pleroma.HTTP.head(url, req_headers(), http_options())},
           {_, :ok} <- {:content_type, check_content_type(headers)},
           {_, :ok} <- {:content_length, check_content_length(headers)},
           {_, {:ok, %Tesla.Env{status: 200, body: body}}} <-
             {:get, Pleroma.HTTP.get(url, req_headers(), http_options())} do
        {:ok, body}
      end
    end
Now both requests call a function (req_headers) which itself just supplies the user-agent as seen before.
What's frustrating is that the commit for this doesn't explain that it's fixing an inability to fetch previews of links or anything of the sort, and so the changelog for 2.7.1 doesn't say it either. This means users of the thing would have no idea if they should upgrade past 2.7.0.
Well, I'm changing that. This is your notification to upgrade past that. Please stop regurgitating headers at me. I know my servers are named after birds, but they really don't want to be fed that way.
...
One small side note for the devs: having version numbers and even git commit hashes made it possible to bracket this thing. Without those in the user-agent, I would have been stuck trying to figure it out based on the dates the behavior began, and that's never fun. The pipeline from "git commit" to actual users causing mayhem can be rather long.
So, whoever did that, thanks for that.