Blog of Xe Iaso

A year of the Linux Desktop

2025-05-07 08:00:00

Co-Authored-By: @scootaloose.com

Windows has been a pain in the ass as of late. Sure, it works, but there's starting to be so much overhead between me and the only reason I bother booting into it these days: games. Every so often I'll wake up to find out that my system rebooted and when I sign in I'm greeted with yet another "pweez try copilot >w< we pwomise you will like it ewe" full-screen dialogue box with "yes" or "nah, maybe later" as my only options. That or we find out that they somehow found a reason to put AI into another core windows tool, probably from a project manager’s desperate attempt to get promoted.

Scoots is neutral
Scoots

The idea of consent in the tech industry is disturbingly absent, I hate it here.

The silicon valley model of consent

— Xe (@xeiaso.net) March 31, 2025 at 11:59 PM

As much as I'd like to like Copilot, Recall, or Copilot (yes, those are separate products), I'll use a feature if it's genuinely transformative enough to either justify the security risk of literally recording everything I do or enhance the experience of using my computer enough to be worth handing control over to an unfeeling automaton. It probably won't be any better than Apple Intelligence though.

When we built our gaming towers, we decided to build systems around the AMD Ryzen 7950X3D and the NVidia RTX 4080. They're a fine combination in practice: you get AMD's philosophy of giving you enough cores to do parallel computing jobs without breaking a sweat, and the RTX 4080 is one of the best cards on the market for rasterization and whatever ray tracing you feel like doing. I don't personally do ray tracing in games, but I like that it's an option for people who want it.

The main problem with NVidia GPUs is that NVidia's consumer graphics department seems to be under the assumption that games don't need as much video memory as they actually do. You get absolutely bodied on the amount of video memory. Big games can use upwards of 15 GB of it, and the OS plus Firefox needs another 2 GB. In total, that's one more gigabyte than the 16 I have. You can't just plug in more VRAM either; you need to either find an unobtanium-in-Canada RTX 4090 or pay several body organs for an enterprise-grade GPU.
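If you want to see how close you are to that ceiling, the NVidia driver's own tool can report it; a quick check that works on both Windows and Linux:

    # Ask the driver how much video memory is in use versus available.
    nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv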

AMD is realistically the only other option on the market. AMD sucks for different reasons, but at least they give you enough video memory that you can survive.

Scoots is thonk
Scoots

Intel?

Cadey is coffee
Cadey

As someone that uses an IRC nick that's the same as the Intel Linux GPU driver, I'm never using an Intel GPU on Linux.

One of the most frustrating issues we've run into as of late is macrostutters when gaming. Macrostutters are when the game hitches and the entire rendering pipeline gets stuck for at least two frames, then everything goes back to normal. This is most notable in iRacing and Final Fantasy XIV. In iRacing's case, it can cause you to get into an accident because you get a pause anywhere from 100 milliseconds to 5 seconds long. Mind you, the game is playable, but the macrostutters can make the experience insufferable.

Scoots is explain
Scoots

It’s all the rage in the iRacing forums when they’re not slinging potatoes at each other over other issues, it’s great!

In the case of Final Fantasy XIV (amazing game by the way, don't play it), this can cause you to get killed because you missed an attack telegraph that happened while your rendering pipeline was stopped. I have been killed by macrostutters as white mage (the pure healer class, for fellow RPG aficionados) in Windows at least 3 times in the last week and I hate it.

So, the thought came to our minds: why are we bothering with Windows? We've had a good experience with SteamOS on our Steam Decks.

Numa is concern
Numa

It probably helps that you don't mess with the Steam Deck at all and leave it holy.

We have a home theatre PC that runs Bazzite: a little box made up of older hardware we upgraded from. It runs tried-and-true hardware that has matured well, with not a single unknown variable in it (an AMD Ryzen 5 3600 and an RX 5700 XT on a B450 motherboard, the works). Besides the normal HDR issues on Linux, it's been pretty great for couch gaming!

I've also been using Linux on the desktop off and on for years. My career got started because Windows Vista was so unbearably bad that I had to learn how to use Linux on the desktop in order to get a usable experience out of my dual-core PC with 512 MB of RAM.

Scoots is explain
Scoots

I’m honestly amazed Vista even attempted to run – much less install – on that low of a memory configuration… I’ve been a Windows user for as long as I can remember, my oldest memory of using it dating back to Windows 98, but there’s probably some home video of me messing around in MS Paint on Windows 95 somewhere in a box. Around 2009 I used a shared family laptop that had shipped with Vista and suddenly decommissioned itself out of existence. I installed Kubuntu (the KDE flavour of Ubuntu; I couldn’t and still can’t stand GNOME lol) on it for a time until Windows 7 came around to save it. My mother and sister did not really adapt to using Linux and I was the only one trying to use it around that time. It was functional enough back then I suppose – the hardest I drove that laptop was playing Adobe Flash games – but we could not do my schoolwork on it properly, namely because OpenOffice and Word hated each other.

Surely 2025 will be the year of the Linux Desktop.

Numa is smug
Numa

Foreshadowing is a narrative device in which a narrator gives an advance hint as to what comes up later in the story.

The computing dream

My husband has very simple computing needs compared to mine. He doesn't do software development in his free time (save simple automation with PowerShell, bash, or Python). He doesn't do advanced things like elaborate video editing, 3D animation, or content creation. Sure, sometimes he'll need to clip a segment out of a longer video file, but really that's not the same thing as making an hbomberguy video or streaming to Twitch. The most complicated thing he wants to do at the moment is play Final Fantasy XIV, which as far as games go isn't really that intensive.

Scoots is explain
Scoots

I still have my library of simulators, and most of them would technically work fine under Proton, as most have been tested to work there. However, given the mishmash of hardware and the fact that iRacing has anticheat and its launcher barely functions in Linux under Proton, I decided to relegate my expensive hobby machine to secondary duty and left it running Windows as is. Its sole purpose now is my racing sims and any strange game that does not play nice with Proton, namely any multiplayer games that have kernel-level anticheat. I scrapped the idea of dual booting before anything else because I have had enough bad experiences with Windows’ Main Character Syndrome that I was going to airgap my install of Linux to a whole other computer.

I have some more complicated needs, seeing as software I make runs on UNESCO servers, but really, as long as I have a basic Linux environment, Homebrew, and Steam, I'll be fine. I am also afflicted with catgirl simulator, but I do my streaming from Windows because vtubing software barely works on Linux and I'm enough of a coward to not want to try running it there again.

When he said he wanted to go for Linux on the desktop, I wanted to make sure that we were using the same distro so that I had enough of the same setup to be able to help when things inevitably go wrong. I wanted something boring, well-understood, and strongly supported by upstream. I ended up choosing the most boring distribution I could think of: Fedora.

Fedora is many things, but it's what systemd, mesa, the Linux kernel, and GNOME are developed against. This means that it's one of the most boring distributions on the planet. It has most of the same package management UX ergonomics as Red Hat Enterprise Linux, it's well documented, most of its quirks are well known or solved, and overall it's the least objectionable choice available.

In retrospect, I'm not sure if this was a mistake or not.

He wanted to build a pure AMD system to stave off any potential NVidia related problems. We found some deals and got him the following:

  • CPU: AMD Ryzen 9 9800X3D (release date: November 2024)
  • GPU: AMD RX9070XT (16GB) (release date: March 2025)
  • A B850M based motherboard (release date: January 2025)
  • 32GB of DDR5-6000 RAM
  • A working SSD
  • A decent enough CPU cooler
  • A case that functions as a case
Scoots is explain
Scoots

I will spend more money on a case if it means it won’t draw blood while trying to work on it, so far my last two cases spared my fingers. This build was also woefully overpriced because I was paying for the colour tax on all my components, going with a full white build this time.

Fedora 41

I had just installed Fedora 41 on my tower and had no issues. My tower has an older CPU and motherboard, so I didn't expect any problems. Most of the hardware I listed above was released after Fedora 41 shipped in late October 2024. I expected some hardware compatibility issues on first boot, but figured that an update and a reboot would fix them. From experience I know that Fedora doesn't roll new install images after releasing a major version, which makes sense from their perspective for mirror bandwidth reasons.

When we booted into the installer on his tower, the screen was stuck at 1024x768 on a 21:9 ultrawide. Fine enough, we can deal with that. The bigger problem was the fact that the ethernet card wasn't working. It wasn't detected in the PCI device tree. Luckily the board shipped with an embedded Wi-Fi card, so we used that to limp our way into Fedora. I figured it'd be fine after some updates.

It was not fine. The machine failed to boot after that round of updates. It felt like the boot splash screen was somehow getting the GPU driver into a weird state and hanging the whole system. Verbose boot didn't work either. I was almost worried that we had dead hardware or something.

Fedora 42

Okay, fine, the hardware is new. I get it. Let's try Fedora 42 beta. Surely that has a newer kernel, userland, and everything that we'd need to get things working out of the box.

Yep, it did. Everything worked out of the box. The ethernet card was detected and got an IP instantly. The install was near-instant. We had the full screen resolution at 100 Hz like we expected, and after the install, 1Password and other goodies were set up. Steam was installed, Final Fantasy XIV was set up, the controller was configured, and a good time was had by all. The microphone and DAC even worked!

Once everything was working, I set up an automount for the NAS so that he could access our bank of wallpapers and the like. Everything was working and we were happy.

Numa is smug
Numa

Again, foreshadowing is a narrative device in which a narrator gives an advance hint as to what comes up later in the story.

Coincidentally, we built the system the day before Fedora 42 was released. I had him run an update and he chose to do it from the package management GUI, “Discover”. I have a terminal case of Linux brain and don't feel comfortable running updates in a way that I can't see the logs. This is what happens when you do SRE work for long enough. You don't trust anything you can't directly look at or touch.
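For what it's worth, the terminal equivalent I reach for on Fedora is just:

    # Refresh repository metadata and apply pending updates,
    # with the logs scrolling right in front of you.
    sudo dnf upgrade --refresh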

Scoots is blushies
Scoots

I am Windows Update brained, it’s ingrained into my soul after 27 years @_@

We rebooted for the update and then things started to get weird. The biggest problem was X11 apps not working. We got obscure XWayland errors that a mesa dev friend never thought were possible. I seriously began to get worried that we had some kind of half-hardware failure or something inexplicable like that.

I thought that there was some kind of strange issue upgrading from Fedora 42 Beta to Fedora 42 full. I can't say why this would happen, but it's completely understandable to go there after a few hours of fruitless debugging. We reinstalled because we ran out of ideas.

Why the iGPU, Steam?

Once everything was back up and running, we ran into a strange issue: Steam kept starting on the integrated GPU instead of the dedicated GPU. This would normally be a problem, but luckily games somehow preferred the dedicated GPU, so it all worked out. Then an update got pushed that caused Steam to die, or sometimes throw messages about Chromium not working on the GPU "llvmpipe".

Numa is neutral
Numa

Life pro tip: if you ever see the GPU name "llvmpipe" in Linux, that means you're using software rendering!

Debugging this was really weird. Based on what we could figure out with a combination of nvtop, hex-diving into /sys, and other demonic incantations that no mortal should understand, the system somehow flagged the dedicated GPU as the integrated GPU and vice versa. This was causing the system to tell Steam and only Steam that it needed to start on the integrated GPU.
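If you want to poke at this yourself, a few stock commands go a long way (glxinfo ships in mesa-utils on Fedora; outputs will obviously differ per machine):

    # List the GPUs the kernel sees and the DRM nodes they map to.
    lspci -nn | grep -Ei 'vga|3d'
    ls /sys/class/drm/

    # Ask Mesa which device it actually renders on.
    # Seeing "llvmpipe" here means software rendering.
    glxinfo -B | grep -i renderer

    # Force the other GPU via Mesa's PRIME offload and compare.
    DRI_PRIME=1 glxinfo -B | grep -i renderer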

After increasingly desperate attempts to disable the integrated GPU or de-prioritize it, we ended up disabling the integrated GPU in the BIOS. I was worried this would make debugging a dead dedicated GPU harder, but my husband correctly pointed out that we have at least 5 known-working GPUs of different generations laying around with the right power connectors.

Shader pipeline explosions and GPU driver crashes

Anyways, we got everything working, but sometimes when resuming from sleep, Final Fantasy XIV causes a spectacular shader pipeline explosion. I'm not sure how to describe it further, but in case you have any idea how to debug this, we've attached a video:

Seizure warning
No, really don't say I didn't warn you
Want to watch this in your video player of choice? Take this:
https://files.xeiaso.net/blog/2025/yotld/shadowbringers-seizure-warning/index.m3u8
Scoots is explain
Scoots

HOOOOOME RIDING HOOOOOOME DYING HOOOOOPE HOOOOOOOLD ONTO HOOOOPE, OOOOOOOHGQEROKHQekrg’qneqo;nfhouehqa

I'm pretty sure this is a proton issue, or a mesa issue, or an amdgpu issue, or a computer issue. If I had any idea where to file this it'd be filed, but when we tried to debug it and get a GPU pipeline trace the problem instantly vanished. Aren't computers the best?

Going back to the NAS

S3 suspend is not a solved problem in the YOTLD 2025. Sometimes on resume the display driver crashes and my husband needs to force a power cycle. When he rebooted, XWayland apps wouldn't start. Discord, Steam, and Proton depend on XWayland. This is a very bad situation.
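When the display driver does eat it on resume, the kernel log from the previous boot is usually where the evidence ends up; this is roughly how we'd go digging (a sketch, not the exact incantation we used):

    # Kernel messages from the previous boot, filtered to the GPU driver.
    journalctl -k -b -1 | grep -iE 'amdgpu|gpu reset'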

Originally we thought the display driver crashing was causing this, but when manual restarts under normal circumstances also started causing it, it got our attention. The worst part was that this was inconsistent, almost as if something in the critical dependency chain was working right sometimes and not working at all at other times. We started to wonder if Fedora actually tested anything before shipping it, because updates kept changing the pattern of what worked and what didn't.

One of the simplest apps in the X11 suite is xeyes. It's a simple little program with a pair of cartoon eyes that look at your mouse cursor. It's the display-pipeline equivalent of pinging google.com to make sure your internet connection works. If you've never seen it before, here's what it looks like:

Want to watch this in your video player of choice? Take this:
https://files.xeiaso.net/blog/2025/yotld/xeyes/index.m3u8

Alas, it was not working.

After some investigation, the only commonality I could find was the X11 socket folder in /tmp not existing. X11 uses Unix sockets (sockets but via the filesystem) for clients (programs) to communicate with the server (display compositor). If that folder isn't created with the right flags, XWayland can't create the right socket for X clients and will rightly refuse to work.

On a hunch, I made xxx-hack-make-x11-dir.service:

    [Unit]
    Description=Simple service test
    After=tmp.mount
    Before=display-manager.service

    [Service]
    Type=simple
    ExecStart=/bin/bash -c "mkdir -p /tmp/.X11-unix; chmod -R 1777 /tmp/.X11-unix"

    [Install]
    WantedBy=local-fs.target

This seemed to get it working. It worked a lot more reliably when I properly set the sticky bit on the .X11-unix folder so that his user account could create the XWayland socket.
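For completeness, wiring a one-off unit like that in is the usual systemd dance (assuming the file lives in /etc/systemd/system):

    # Tell systemd about the new unit and start it on every boot from now on.
    sudo systemctl daemon-reload
    sudo systemctl enable --now xxx-hack-make-x11-dir.service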

In case you've never seen the "sticky bit" in practice before, Unix permissions have three main fields per file:

  1. User permissions (read, write, execute)
  2. Group permissions (read, write, execute)
  3. Other user permissions (read, write, execute)

This applies to both files and folders (the read bit on a folder is what lets you list its contents, and the execute bit is what lets you actually enter it and touch things inside; I don't fully get it either). However, in practice there's a secret fourth field which includes magic flags like the sticky bit.

The sticky bit is what makes temporary files work for multi-user systems. At any point, any program on your system may need to create a temporary file. Many programs will assume that they can always create temporary files. These programs may be running as any user on the system, not just the main user account for the person that uses the computer. However, you don't want users to be able to clobber each other's temporary files because the write bit on folders also allows you to delete files. That would be bad. This is what the sticky bit is there to solve: making a folder that everyone can write to, but only the user that created a temporary file can delete it.
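As a quick illustration on a throwaway directory (the leading 1 in mode 1777 is the sticky bit, and it shows up as a trailing t in the permission string):

    mkdir /tmp/scratch
    chmod 1777 /tmp/scratch
    stat -c '%A %a %n' /tmp/scratch
    # drwxrwxrwt 1777 /tmp/scratch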

Notably, the X11 socket directory needs to have the sticky bit set because of facts and circumstances involving years of legacy cruft that nobody wants to fix.

    $ stat /tmp/.X11-unix
      File: /tmp/.X11-unix
      Size: 120             Blocks: 0          IO Block: 4096   directory
    Device: 0,41            Inode: 2           Links: 2
    Access: (1777/drwxrwxrwt)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2025-05-05 21:33:39.601616923 -0400
    Modify: 2025-05-05 21:34:09.234769003 -0400
    Change: 2025-05-05 21:34:09.234769003 -0400
     Birth: 2025-05-05 21:33:39.601616923 -0400

Once xxx-hack-make-x11-dir.service was deployed, everything worked according to keikaku.

Numa is neutral
Numa

Life pro tip: keikaku means plan!

A gnawing feeling at the fabric of reality

The system was stable. Everything was working. But when multiple people that work at Red Hat are telling you that the problems you are running into are so strange that you need to start filing bug reports in the dark sections of the bug tracker, you start to wonder if you're doing something wrong. The system was having configuration error-like issues on components that do not have configuration files.

While we were drafting this article, we decided to take a look at the problem a bit further. There was simply no way that we needed xxx-hack-make-x11-dir.service as a load-bearing dependency on our near plain install of Fedora, right? This should just work out of the box, right???

We went back to the drawing board. His system was basically stock Fedora, and we only really did three things to it outside of the package management universe:

  1. Create a mount unit to mount the NAS' SMB share at /mnt/itsuki
  2. Create an automount unit to automatically mount the SMB share at boot time
  3. xxx-hack-make-x11-dir.service to frantically hack around issues

Notably, I had the NAS automount set up too and was also having strange issues with the display stack, including but not limited to the GNOME display manager forgetting that Wayland existed and instantly killing itself on launch.

On a hunch, we disabled the units in the reverse order that we created them to undo the stack and get closer to stock Fedora. First, we disabled the xxx-hack-make-x11-dir.service unit. When he rebooted, this broke XWayland as we expected. Then we disabled the NAS automount and rebooted the system.

XWayland started working.

My guess is that this unit somehow created a cyclical dependency:

    # mnt-itsuki.automount
    [Unit]
    Requires=remote-fs-pre.target
    After=remote-fs-pre.target

    [Automount]
    Where=/mnt/itsuki
    TimeoutIdleSec=0

    [Install]
    WantedBy=remote-fs.target
Cadey is facepalm
Cadey

Oh...this was me, wasn't it...

Scoots is explain
Scoots

Your sysadmin privileges are revoked for 24 hours.

Aoi is sus
Aoi

Gasp! Not the sysadmin privileges!

Turns out it was me. The actual unit I wanted was this:

    # mnt-itsuki.automount
    [Unit]

    [Automount]
    Where=/mnt/itsuki
    TimeoutIdleSec=0

    [Install]
    WantedBy=multi-user.target

Thanks, Arch Linux Wiki page on Samba!
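If you ever suspect you've done this to yourself, systemd does leave breadcrumbs; something like this is how I'd go looking for them now (after the fact, and using the unit name from above):

    # systemd logs "Found ordering cycle" and names the job it deleted to break
    # the loop, which is exactly how a display manager quietly fails to start.
    journalctl -b | grep -i 'ordering cycle'

    # Static sanity check of a unit and everything it pulls in.
    systemd-analyze verify mnt-itsuki.automount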

Other than that, everything's been fine! The two constants that have kept working throughout all of this are 1Password and Firefox, modulo that one time I updated Firefox in dnf and then got a half-broken browser until I restarted it. I did have to disable the nftables backend in libvirt in order to get outbound TCP connections working, though.
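For reference, on recent libvirt versions that knob lives in /etc/libvirt/network.conf; the change is roughly the following, though double-check your libvirt version's docs because the option is fairly new and the daemon layout varies by distro:

    # Tell libvirt's virtual network driver to use iptables rules
    # instead of the nftables backend.
    echo 'firewall_backend = "iptables"' | sudo tee -a /etc/libvirt/network.conf
    sudo systemctl restart virtnetworkd   # or libvirtd, depending on your setup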

Fedora tips m'lady

Fedora is pretty set-and-forget, but it's not without its annoyances. The biggest one is how Fedora handles patented video codecs and how this intersects with FFmpeg, the Swiss Army chainsaw of video conversion.

Cadey is enby
Cadey

Seriously, FFmpeg is one of the best programs ever made. If you have any image or video file format, you can use FFmpeg to make it any other video or image file format. Seriously one of the best programs ever made and it's absolutely surreal that they use Anubis to protect their bug tracker.

Fedora ships a variant of FFmpeg they call ffmpeg-free. Notably, this version has the "non-free" codecs compiled out, so you can deal with WebM, AV1, and other codecs without issue. However, h.264 (the "4" in .mp4) is not in that codec list. Basically everything on the planet supports h.264, so it's the "default format" that many systems use. Heck, all the videos I've embedded into this post are encoded with h.264.

You can pretty easily swap out ffmpeg-free with normal un-addled ffmpeg if you install the RPM Fusion repository, but that has its own fun.
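If you go that route, the flow RPM Fusion documents is roughly this (check their install guide for the current incantation):

    # Enable the free and nonfree RPM Fusion repositories for your Fedora release.
    sudo dnf install \
      https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm \
      https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

    # Swap the codec-stripped build for the full one.
    sudo dnf swap ffmpeg-free ffmpeg --allowerasing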

Forward and back and then forward and back and then go forward and back, then get no downgrading

RPM Fusion is the not-quite-official-but-realistically-most-users-use-it-so-it's-pretty-much-official side repo that lets you install "non-free" software. This is how you get FFmpeg, Steam, and the NVidia binary drivers that make your GPU work.

One of the most annoying parts about RPM Fusion is that whenever they push new versions of anything, every old package is deleted off of their servers. This means that if you need to do a downgrade to debug issues (like strange XWayland not starting issues), you CANNOT restore your system to an older state because the package manager will see that the packages it needs aren't available from upstream and rightly refuse to put your system in an inconsistent state.
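This bites exactly when you want dnf's transaction history to bail you out; the rollback attempt looks something like this and then falls over because the older RPMs are simply gone from the mirrors (the transaction ID here is illustrative):

    # List recent transactions, then try to walk one back.
    dnf history list
    sudo dnf history undo 42   # fails when the previous package versions no longer exist upstream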

I have tried to get in contact with the RPM Fusion team to help them afford more storage should they need it, but they have not responded to my contact attempts. If you are someone there, or know someone there, who will take money or storage donated on the sole condition that they maintain a few months of update backlog, please let me know.

Conclusion

I'm not really sure how to end something like this. Sure, things mostly work now, but I guess the big lesson is that if you are a seasoned enough computer toucher, eventually you will stumble your way into a murder mystery and find out that you are both the killer and the victim at the same time.

Scoots is explain
Scoots

And you’re also the detective!

But, things work* and I'm relatively happy with the results.

I'm on GitHub Sponsors

2025-04-21 08:00:00

If you wanted to give me money but Patreon was causing grief, I'm on GitHub Sponsors now! Help me reach my goal of saving the world from AI scrapers with the power of anime.

Anubis works

2025-04-12 08:00:00

That meme is not an exaggeration: Anubis has been deployed by the United Nations.

For your amusement, here is how the inner monologue of me finding out about this went:

Aoi is wut
Aoi

What. You can't be serious, can you?

Aoi is wut
Aoi

No, that can't be a real domain of the United Nations, can it?

Cadey is coffee
Cadey

Wikipedia lists unesco.org as the official domain of the United Nations Educational, Scientific and Cultural Organization. I'm pretty sure it's real.

Aoi is sus
Aoi

No way. No fucking way. What the heck, how is this real. What is YOUR LIFE??? God I got the worst 2025 bingo card this year.

I hate to shake my can and ask for donations, but if you are using Anubis and it helps, please donate on Patreon. I would really love to not have to work in generative AI anymore because the doublethink is starting to wear at my soul.

Also, do I happen to know anyone at UNESCO? I would love to get in touch with their systems administrator team and see if they had any trouble with setting it up. I'm very interested in making it easier to install.

This brings the list of big deployments that I know about to:

  • The Linux Kernel Mailing List archives
  • FreeBSD's SVN (and soon git)
  • SourceHut
  • FFmpeg
  • Wine
  • UNESCO
  • The Science Olympiad Student Center
  • Enlightenment (the desktop environment)
  • GNOME's GitLab

The conversation I'm about to have with my accountant is going to be one of the most surreal conversations of all time.

The part that's the most wild to me is when I stop and consider the scale of these organizations. I think that this means that the problem is much worse than I had previously anticipated. I know that at some point YouTube was about to hit "the inversion", where they get more bot traffic than human traffic. I wonder how true this is across most of, if not all of, the Internet right now.

I guess this means that I really need to start putting serious amounts of effort into Anubis and the stack around it. The best way to make that happen is for me to get enough money to survive so that I can put my full-time effort into it. I may end up hiring people.

This is my life now. Follow me on Bluesky if you want to know when the domino meme gets more ridiculous!

Life pro tip: put your active kubernetes context in your prompt

2025-04-05 08:00:00

Today I did an oopsie. I tried to upgrade a service in my homelab cluster (alrest) but accidentally upgraded it in the production cluster (aeacus). I was upgrading ingress-nginx to patch the security vulnerabilities that were disclosed a while ago. I should have done it sooner, but things have been rather wild lately and now kernel.org runs some software I made.

Cadey is coffee
Cadey
A domino effect starting at 'Amazon takes out my git server' ending in 'software running on kernel.org'.

Either way, I found out that Oh My Zsh (the zsh configuration framework I use) has a plugin for kube_ps1. This lets you put your active Kubernetes context in your prompt so that you're less likely to apply the wrong manifest to the wrong cluster.

To install it, I changed the plugins list in my ~/.zshrc:

    -plugins=(git)
    +plugins=(git kube-ps1)

And then added configuration at the end for kube_ps1:

    export KUBE_PS1_NS_ENABLE=false
    export KUBE_PS1_SUFFIX=") "

    PROMPT='$(kube_ps1)'$PROMPT

This makes my prompt look like this:

    (⎈|alrest) ➜  site git:(main) ✗

Showing that I'm using the Kubernetes cluster Alrest.

Aoi is wut
Aoi

Wouldn't it be better to modify your configuration such that you always have to pass a --context flag or something?

Cadey is coffee
Cadey

Yes, but some of the tools I use don't have that support universally. Until I can ensure they all do, I'm willing to settle for tamper-evident instead of tamper-resistant.
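For the tools that do support it, the stricter version looks something like this (the manifest name is illustrative; kubectl config unset really does force you to name a context afterwards):

    # Remove the default context so every kubectl invocation has to name one.
    kubectl config unset current-context

    # Now the cluster has to be spelled out every single time.
    kubectl --context alrest apply -f ingress-nginx.yaml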

Why upgrading ingress-nginx broke my HTTP ingress setup

Apparently, when I set up the Kubernetes cluster that runs my website, the Anubis docs, and other things like my Headscale server, I made a very creative life decision. I started out with the "bare-metal" self-hosted ingress-nginx install flow and then manually edited the Service to be a LoadBalancer service instead of a NodePort service.

I had forgotten about this. So when the upgrade hit the wrong cluster, Kubernetes happily made that Service into a NodePort service, destroying the cloud's load balancer that had been doing all of my HTTP ingress.

Thankfully, Kubernetes dutifully recorded logs of that entire process, which I have reproduced here for your amusement.

    Event type  Reason                Age  From                Message
    Normal      Type changed          13m  service-controller  LoadBalancer -> NodePort
    Normal      DeletingLoadBalancer  13m  service-controller  Deleting load balancer
    Normal      DeletedLoadBalancer   13m  service-controller  Deleted load balancer
Cadey is facepalm
Cadey

OOPS!

Numa is smug
Numa

Pro tip if you're ever having trouble waking up, take down production. That'll wake you up in a jiffy!

Thankfully, getting this all back up was easy. All I needed to do was change the Service type back to LoadBalancer, wait a second for the cloud to converge, and then change the default DNS target from the old IP address to the new one. external-dns updated everything once I changed the IP it was told to use, and now everything should be back to normal.
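For the curious, flipping the Service type back is a one-liner; the namespace and Service name here are the usual ingress-nginx defaults and may differ in your install:

    # Turn the NodePort Service back into a cloud LoadBalancer.
    kubectl -n ingress-nginx patch svc ingress-nginx-controller \
      -p '{"spec":{"type":"LoadBalancer"}}'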

Well, at least I know how to do that now!

Building native packages is complicated

2025-03-31 08:00:00

Cadey is enby
Cadey

Anubis is an AI scraper bot filter that puts a wall between your website and the lowest-hanging fruit of bot filtering. I developed it to protect my git server, but it's also used to protect bug trackers, Mastodon instances, and more. The goal is to help protect the small Internet so small communities can continue to exist at the scale they're currently operating at without having to resort to overly expensive servers or terrifyingly complicated setups.

Anubis has kind of exploded in popularity in the last week. GitHub stars are usually a very poor metric because they're so easy to game, but here's the star graph for Anubis over the last week:

A graph showing the GitHub star history for Anubis, it hockey sticked upwards and has a sine wave at about a 45 degree angle.

Normally when I make projects, I don't expect them to take off. I especially don't expect to be front-page news on Ars Technica and TechCrunch within the span of a few days. I very much also do not expect to say sentences like "FFmpeg uses a program I made to help them stop scraper bots from taking down their issue tracker". The last week has been fairly ridiculous in that regard.

There has been a lot of interest in me distributing native packages for Anubis. These packages would allow administrators that don't use Docker/OCI Containers/Podman to use Anubis. I want to build native packages, but building native packages is actually a fair bit more complicated than you may realize out of the gate. I mean it sounds simple, right?

Numa is smug
Numa

But you can JUST build a tarball, right? Anubis is written in Go, you can just do that easily with GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o var/anubis-linux-amd64 ./cmd/anubis and not faff around with the Docker webshit bloat right?

Cadey is facepalm
Cadey

It would be "just" that simple for getting something out of the door, but it's a poor UX compared to what Docker gives--

Numa is delet
Numa

I have detected bloat, you YAML merchant you! Reject complexity! Return to native packages!

Okay, okay, the conversations don't go exactly like that, but that's what it can feel like sometimes.

Here's a general rule of thumb: "just" is usually a load-bearing word that hides a lot of complexity. If it was "just" that simple, it would have already been done.

A note to downstream packagers

If you want to package Anubis for your distribution of choice, PLEASE DO IT! Please make sure to let me know so I can add it to the docs along with instructions about how to use it.

Seriously, nothing in this post should be construed as saying "do not package this in distros". A lot of "stable" distros may have difficulty with this because I need Go 1.24 features for an upcoming part of Anubis. I just want to cover some difficulties in making binary packages that y'all have already had to reckon with, but that other people have not yet had to think about.

With all that said, buckle up.

Anubis' threat model

Before I go into the hard details of building native packages and outlining what I think is the least burdensome solution, it may be helpful to keep Anubis' (and Techaro's) threat model in mind. The solution I am proposing will look "overkill", but given these constraints I'm sure you'll think it's "just right".

Things in Anubis' favor

Anubis is open source software under the MIT license. The code is posted on GitHub, and anyone is free to view it, download it, learn from it, produce derivative works from it, and otherwise use the software for any purpose.

Cadey is enby
Cadey

Personally, I'd hope that bigger users of this will contribute back to the project somehow so that I can afford to make working on this my full-time job. However, I understand that not every project or community can afford to do that. Open source projects are usually nights-and-weekends affairs. The goal of Anubis is to protect the small internet, and the small internet usually comes with small budgets.

Anubis is trusted by some big organizations like GNOME, Sourcehut, and ffmpeg. This social proof is kind of both a blessing and a curse, because it means if anything goes wrong, it could go very wrong all at once.

The project is exploding in popularity. Here's that star count graph again:

A graph showing the GitHub star history for Anubis, it hockey sticked upwards and has a sine wave at about a 45 degree angle.

The really wild part about that star count graph is that you can see a sine wave if you rotate it by 45 degrees. A sine wave in metrics like that lets you know that growth is healthy and mostly human-sourced. This is wild to see.

Things not in Anubis' favor

Right now the team is one person that works on this during nights and weekends. As much as I'd like this to not be the case, my time is limited and my dayjob must take precedence so that I can afford to eat and pay my bills. Don't get me wrong, I'd love to work on this full time, but my financial situation currently requires me to keep my dayjob.

I also have a limited capacity for debugging "advanced" issues (such as those that only show up when you are running a program as a native package instead of in an environment like Docker/OCI/Podman), and I am as vulnerable to burnout as you are.

Speaking of burnout, this project has exploded in popularity. I've never had a project go hockey stick like this before. It's happened at companies I've worked at, sure, but never something that's been "my fault". This is undefined territory for me. Waking up and finding out you're on the front page of Ars Technica and getting emails requesting comment from TechCrunch reporters is kinda stressful.

Cadey is coffee
Cadey

Maybe it's a "good" kind of stress though? I don't know. What I do know is that I broke my media training that I've gotten from past employers over the years replying to the journalist from TechCrunch.

Some personal facts and circumstances which I am not going to go into detail about have made my sleep particularly bad the last week. As I'm writing this, I had a night with a solid 8 hours of sleep, so maybe that's on the mend. However when you get bad sleep for a bit, it tends to not make you have a good time.

Anubis is security software. Security software usually needs to be held to a higher standard than most other types of software. This means that "small typos" or forgotten bits of configuration from the initial rageware spike can actually become glaring security issues. There's been a lot of "founder code" cleanup so far and I can only see more coming in the future.

Numa is neutral
Numa

Of note: the Techaro standard about security is relevant here. At Techaro, we realize that computers are fractals of complexity and that any program is essentially built and reliant upon the behavior of a massive number of unknown unknowns. When at all possible we'll try to minimize the amount of security related bugs that may show up, but those unknown unknowns can and will bite at any time. Over time, that list of security advisories is undoubtedly going to grow because that's just how networked software works.

We try to not measure the number of vulnerabilities that made it in, we measure how we react when one does come in. Techaro treats people that report security issues to [email protected] seriously.

There's a few other standards like this (Be not a cancer upon the earth, The purpose of a system is what it does, etc.), and we could blow hot air about them, but I think it's better for them to be demonstrated by example instead of claimed on a webbed site with catchy slogans and lofty words puffed up by hot air.

Also, if this goes wrong, I'm going to get personally mega-cancelled. I would really like that to not happen, but this is the biggest existential risk and why I want to take making binary packages this seriously.

So with all of those constraints in mind, here's why it's not easy to "just" make binary packages.

Packaging software is hard

As was said earlier:

Numa is smug
Numa

But you can JUST build a tarball, right? Anubis is written in Go, you can just do that easily with GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o var/anubis-linux-amd64 ./cmd/anubis and not faff around with the Docker webshit bloat right?

Sure, it is possible to JUST build a tarball with a single shell script like this:

    cd var
    DIR="./anubis-$(cat VERSION)-linux-amd64"
    mkdir -p $DIR/{bin,docs,run}
    GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o $DIR/bin/anubis ./cmd/anubis
    cp ../README.md $DIR
    cp ../LICENSE $DIR
    cp ../docs/docs/admin/installation.mdx $DIR/docs/installation.md
    cp ../web/static/botPolicy.json $DIR/docs/botPolicy.json
    cp ../run/* $DIR/run/
    tar cJf ${DIR}.txz ${DIR}

And just repeat it for every GOOS/GOARCH pair that we want to support (probably gonna start out with linux/amd64, linux/arm64, freebsd/amd64, freebsd/arm64). Let's be real, this would work, but the main problem I have with this is that this is a poor developer experience, and it also is a poor administrator experience (mostly because those binary tarball packages leave installation as an exercise for the reader).
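Mechanically, "repeat it for every pair" is just a loop around that script; a sketch of the shape of it, using the target list from the paragraph above:

    # Build one binary per supported GOOS/GOARCH pair.
    for target in linux/amd64 linux/arm64 freebsd/amd64 freebsd/arm64; do
      GOOS="${target%/*}" GOARCH="${target#*/}" CGO_ENABLED=0 \
        go build -o "var/anubis-${target%/*}-${target#*/}/bin/anubis" ./cmd/anubis
    done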

Don't repeat yourself

When I make binary packages for Anubis, I want to specify the package once and then have the tooling figure out how to make it happen. Ideally, the same build instructions should be used for both distribution package builds and tarballs. If the experience for developers is bad or requires you to minimax into ecosystem-specific tooling, this means that the experience is fundamentally going to be better on one platform over another. I want this software to be as universally useful as possible, so I need to design the packaging process to:

  1. Have the same developer experience as Docker images do ("push button, receive bacon").
  2. Have the most seamless administrator experience possible (ideally through distribution-native flows).
  3. Have as little cognitive overhead as possible for all parties involved (take the most obvious approach possible, when things need to diverge then be very loud about it).

The administrator experience bit is critical. As much as we'd all like them to, administrators simply do not have the time or energy to do detailed research into how a tool works. Usually they have a problem, they find the easiest to use tool, square peg into round hole it to solve the immediate problem, and then go back to the twelve other things that are on fire.

One of the really annoying downsides to wanting to do native packages as a downstream project is that the interfaces for making them suck. They suck so much for the role of a project like Anubis. They're optimized for consuming software from remote tarballs, doing the bare minimum to fit within the distribution's ecosystem.

For the context they operate in, this makes a lot of sense. Their entire shtick is pulling in software from third parties and distributing it to their users. There's a reason we call them Linux distributions.

However with Anubis, I want to have the packaging instructions live alongside the application source code. I already trust the language-level package managers that Anubis uses (and have been careful to minimize dependencies to those that are more trustable in general), and they already have hash validation. Any solution RFC-2119-MUST build on top of this trust and use it as a source of strength.

There frankly is a middle path to be found between the "simple binary tarball" and mini-maxxing into the exact peculiarities of how rpmbuild intersects with Go modules and NPM.

Honestly, this is why I was planning on paywalling binary packages. Binary packages also have the added complication that you have to play nice with the ways that administrators configure their servers. Debugging this is costly in terms of time and required expertise. I am one person doing this on nights and weekends. I only have so much free time. I wish I had more, but right now I simply do not.

Cutting scope

When creating a plan like this, it's best to start with the list of problems you want to solve so that you can aggressively cut scope to remove the list of problems you don't want to solve yet. Luckily, the majority of the Linux ecosystem seems to have standardized around systemd. This combined with the fact that Go can just build static binaries means that I can treat the following OSes as fungible:

  • Ubuntu Server
  • Debian
  • Red Hat Enterprise Linux
  • OpenSUSE
  • SUSE Enterprise Linux
  • Rocky Linux
  • Alma Linux
  • CentOS Stream

Building Debian and Red Hat packages will cover all of them.

Additionally, this covers anything else that is RPM or Debian based, except maybe Devuan. Across those, there are three main CPU architectures that are the most common, with a long tail of other less common ones:

  1. x86_64 (64-bit x86, goarch amd64)
  2. aarch64 (64-bit ARM, goarch arm64)
  3. riscv64 (64-bit RISC-V, goarch riscv64)

At some level, this only means that we need to build 6 variants (one per CPU architecture for each of the Debian and Red Hat package formats) to cover 99.9% of the mutable distributions in common use. This is a much more manageable level of complexity; I could live with this.

Anything else in the "long tail" (FreeBSD, OpenBSD, Alpine Linux, etc.) is probably better handled by their native packages or ports system anyways, but this is a perfect place for the binary tarballs to act as a stopgap.

Yeeting packages out

When I was working on the big Kubernetesification of my homelab last year, I was evaluating Rocky Linux and I was also building a tool called yeet as a middle ground between complicated shell scripts and bespoke one-off deployment tools written in Go. A lot of my build and deploy scripts boiled down to permutations of the following steps:

  1. Build docker image somehow (Nix, Docker, Earthly, ko, etc.)
  2. Push to remote host
  3. Trigger rolling release to the new image

I started building yeet because I had previously done this with shell scripts that I copied to every project's folder. This worked great until I had a semantics bug in one of them that turned into a semantics bug in all of them. I figured I needed a more general metapattern and thus yeet was born.

I had known about a tool called nfpm. nfpm lets you specify the filesystem layout of distribution packages in a big ol' YAML file that gets processed and spat out as something you can pass to dpkg -i or dnf -y install. The only problem is that it just puts files into the package. That was fine for something like the wallpapers I submitted to the Bluefin project, but the build step was left as an exercise for the reader.

I was also just coming down from a pure NixOS UX and wanted something that could let me get 80% of the nice parts of Nix with only requiring me to put in 20% of the effort that it took to make Nix.

So I extended yeet to build RPM packages. Here is the yeetfile.js that yeet uses to build its own packages:

    ["amd64", "arm64", "riscv64"].forEach((goarch) =>
      rpm.build({
        name: "yeet",
        descriptions: "Yeet out actions with maximum haste!",
        homepage: "https://within.website/",
        license: "CC0",
        goarch,

        build: (out) => {
          go.build("-o", `${out}/usr/bin/yeet`, ".");
        },
      })
    );

That's it. When you run this (by just running yeet), it will create a gitignored folder named var and build 6 packages there. You can copy these packages anywhere you can store files (such as object storage buckets). yeet's runtime will natively set the GOARCH, GOOS, and CGO_ENABLED variables for you under the hood so that you can just focus on the build directions without worrying about the details.
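Once the packages land in var, the stock packaging tools are enough to sanity-check what actually went into them (the file names below are just the shape of the output, not exact):

    # Peek inside the built packages without installing them.
    rpm -qpl var/yeet-*.rpm
    dpkg-deb --contents var/yeet_*.deb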

The core rules of yeet are:

  1. Thou shalt never import code from another file nor require npm for any reason.
  2. If thy task requires common functionality, thou shalt use native interfaces when at all possible.
  3. If thy task hath been copied and pasted multiple times, yon task belongeth in a native interface.

These native interfaces are things like go.build(args...) to trigger go build, docker.push(tag) to push docker images to a remote repository, git.tag() to get a reasonable version string, etc.

All of these are injected as implicit global objects. This is intended to let me swap out the "runtime backend" of these commands so that I can transparently run them on anonymous Kubernetes pods, other servers over SSH, or other runtime backends that one could imagine would make sense for a tool like yeet.

Cadey is coffee
Cadey

Deep lore: yeet was originally going to be Techaro's first product, but Anubis beat it to the punch!

The game plan

I plan to split building binary packages into at least two release cycles. The first release will be all about making it work, and the second release will be about making it elegant.

When I'm designing Anubis I have three sets of experiences in mind:

  1. User experience: what the experience is like for the end users that don't know (or care) what Anubis is
  2. Developer experience: what the experience is like for me and the open source contributors developing Anubis
  3. Administrator experience: what the experience is like for people setting it up on their boxes

The balance here is critical. These forces are fundamentally all in conflict, but currently the packaging situation is way too far balanced towards developer experience and not towards administrator experience. I hope that this strategy makes it easier for websites like The Cutting Room Floor to get relief from the attacks they're facing.

Making it work at all

The first phase is focused on making this work at all. A lot of the hard parts involved in making yeet able to build Debian and Red Hat packages are already done. A lot of the rest of this is software adulting, including:

  • Setting up a dedicated Mac mini to act as the build and signing host. This machine MUST NOT be used for anything else.
  • Writing documentation on how to build your own Debian/RPM packages with yeet.
  • Writing documentation specific to how to use the packages to run Anubis.
  • Ensuring that critical documentation is copied into the packages so that users can self-serve without access to the Anubis website.
  • Uploading built packages to GitHub releases.

A lot of this is going to be tested with the (currently private) TecharoHQ/yeet repository.

Integration with distribution-specific package managers

The next stage will involve making the administrator experience a lot nicer. When administrators install packages, they expect them to be in repositories. This enables all software on the system to be updated at once, which is critical to the anticipated user experience. In order to reach this milestone, here's what I need to do:

  • Fork yeet out of the /x/ monorepo and into a TecharoHQ repo (this has already been started, but will be slow going at first, I need to de-Xe the implementation of yeet).
  • Set up repositories backed by object storage. That Mac mini from before will be the only machine that is allowed to write to that bucket. I will have to coordinate with the object storage provider I will be using in order to make sure this is the case.
  • Publish a noarch Debian/RPM package to bootstrap the Techaro root of trust and repo files. There will be at least two repositories: Debian and RPM. I may have to do the thing that Tailscale does where they lie about what distros are supported so that the repo URLs don't stand out too much compared to the experience that administrators expect from other publishers.
  • Add support for basically every platform that Go can compile Linux binaries for. A lot of the small internet runs on smaller hardware. By making things more compatible, we can protect more small communities.

The first pass of a repository backend will be done with off the shelf tooling like aptly and rpm-s3. It's gonna suck at first, but this pass is all about making it work. Making it elegant can come next.
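For the Debian side, the aptly flow is roughly "create a local repo, add packages, publish a static tree, sync it to a bucket"; a sketch with an illustrative repo name:

    # Create a local repo, add built .deb files, and publish a static tree.
    aptly repo create techaro
    aptly repo add techaro var/*.deb
    aptly publish repo -distribution=stable techaro
    # The tree under ~/.aptly/public/ can then be synced to object storage.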

Leaving the ecosystem in a better state than we found it

Finally, I will improve the ecosystem such that other people can build on top of the Anubis tooling. Among the other tasks involved with this:

  • Publish tooling for automating the management of repositories stored in object storage buckets based on what I like about aptly and rpm-s3. This will probably be included into yeet depending on facts and circumstances.
  • Publish Software Bill of Materials (SBOM) reports with all packages. Ideally this will be done in the packages themselves, but the important part is to make them available at all.
  • Add Alpine and Arch Linux packages/repositories.
Cadey is coffee
Cadey

By the way, regarding Software Bills of Materials, I have not found clear guidance in Ubuntu, Debian, OpenSUSE, Red Hat, or Fedora's documentation on what I'm supposed to do with it or where to put it so that the system can collect and merge it. This is extra important from the perspective of Anubis because this packaging strategy is bypassing the normal flow for building distribution packages. Am I missing something? I found a dead thread about adding the SBOM to an RPM header, but that's it.

This will take time

As much as I'd love to end this with a "and here's how you can try this now" invitation, I'm simply not ready for that yet. This is gonna take time, and the best way to make this happen faster is to donate to the project so I can focus more of my free time on Anubis.

Cadey is coffee
Cadey

Also, don't make the mistake I did. I started playing Final Fantasy 14. It is so good. It sucks you in hard. 10/10. Don't play it unless you're willing to get enthralled by it.

Hopefully the expanded forms of yeet and whatever repository management tooling end up being useful to other projects. But for now, I'm starting with something small, slim, and opinionated. It's much easier to remove opinions from a project than it is to add them.

In the meantime, I hope you can understand why I've only shipped a Docker/OCI/Podman image so far. The amount of scope reduction is immense. However, this project is getting popular, and in order to show that I'm taking it seriously, I need to expand the packaging scope to include machines that don't run Docker.

The surreal joy of having an overprovisioned homelab

2025-03-25 08:00:00

I like making things with computers. There’s just one problem with computer programs: they have to run somewhere. Sure, you can just spin up a new VPS per project, but that gets expensive, and most of my projects are very lightweight. I run most of them at home with the power of floor desktops.

Tonight I’ll tell you what gets me excited about my homelab and maybe inspire you to make your own. I'll get into what I like about it and clue you into some of the fun you get to have if one of your projects meant to protect your homelab goes hockey-stick.

Want to watch this in your video player of choice? Take this:
https://files.xeiaso.net/talks/2025/surreal-joy-homelab/index.m3u8
Cadey is enby
Cadey

For this one, a lot of the humor works better in the video.

The title slide with the name of the speaker, their sigil, and contact info for the speaker.

Hi everyone! I’m Xe, and I’m the CEO of Techaro, the anti-AI AI company. Today I’m gonna talk with you about the surreal joy of having an over-provisioned homelab and what you can do with one of your own. Buckle up, it’s gonna be a ride.

So to start, what’s a homelab? You may have heard of the word before, but what is it really?

A homelab is a playground for devops.

It’s a playground for devops. It’s where you can mess around to try and see what you can do with computers. It’s where you can research new ways of doing things, play with software, and more. Importantly though, it’s where you can self-host things that are the most precious to you. Online platforms are vanishing left and right these days. It’s a lot harder for platforms that run on hardware that you look at to go away without notice.

An about the speaker slide explaining Xe's background.

Before we continue though, let’s cover who I am. I’m Xe. I live over in Orleans with my husband and our 6 homelab servers. I’m the CEO of the totally real company Techaro. I’m an avid blogger that’s written Architect knows how many articles. I stream programming crimes on Fridays.

The agenda slide covering all the topics that are about to be listed below

Today we’re gonna cover:

  • What a homelab is
  • What you can run on one
  • A brief history of my homelab
  • Tradeoffs I made to get it to its current form
  • What I like about it

Finally I’ll give you a stealth mountain into the fun you can have when you self host things.

A disclaimer that this talk is going to be funny

Before we get started though, my friend Leg Al told me that I should say this.

This talk may contain humor. Upon hearing something that sounds like it may be funny, please laugh. Some of the humor goes over people’s heads and laughing makes everyone have a good time.

Oh, also, any opinions are my own and not the opinions of Techaro.

Unless it would be funny for those opinions to be the opinions of Techaro, then it would be totally on-brand.

A pink haired anthropomorphic orca character whacking the hell out of a server rack with the text 'Servers at home' next to it in rather large text

But yes, tl;dr: when you have servers at home, it’s a homelab. They come in all shapes and sizes, from single mini PCs from Kijiji to actual rack-mount infrastructure in a basement. The common theme though is experimentation and exploration. We do these things not because they are easy, but because they look like they might be easy. Let’s be real, they usually are easy, but you can’t know for sure until you’ve done it, right?

What I run

In order to give you ideas on what you can do with one, here’s what I run in my homelab. I use a lot of this all the time. It’s just become a generic place to put things with relative certainty that they’ll stay up. I also use it to flex my SRE muscle, because working in marketing has started to atrophy that skillset and I do not want to lose it.

The Plex logo with a green haired gremlin wearing a pirate hat sticking out behind it

One of the services I run is Plex which lets me—Wait, what, how did you get there?...One second.

The Plex logo

Like I was saying, one of the services I run is Plex which lets me watch TV shows and movies without having to go through flowcharts of doom to figure out where to watch them.

Numa is smug
Numa

Remember: it’s a service problem.

The Pocket-ID homepage

One of the best things I set up was pocket-id, an OIDC provider. Before your eyes glaze over, here’s what you should think.

'One ring to rule them all' with 'ring' hastily replaced with 'account'

A lot of the time with homelabs and self-hosted services, you end up creating a new account, admin permission flags, group memberships, and a profile picture for every single service. This sucks and does not scale. Something like Pocket-ID lets you have one account to rule them all. It’s such a time-saver.

Cadey is coffee
Cadey

I wish I set one up a long time ago.
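
If you want a concrete taste of what “one account to rule them all” looks like in practice, one common pattern (not Pocket-ID specific, just the generic OIDC dance) is to park oauth2-proxy in front of an app that doesn’t speak OIDC natively and point it at your issuer. This is a minimal sketch; the issuer URL, client ID, upstream service, and secret names are placeholders for whatever your setup actually uses:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: someapp-oauth2-proxy   # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: someapp-oauth2-proxy
  template:
    metadata:
      labels:
        app: someapp-oauth2-proxy
    spec:
      containers:
        - name: oauth2-proxy
          image: quay.io/oauth2-proxy/oauth2-proxy:latest  # pin a real version in practice
          args:
            # Generic OIDC provider mode; works with any spec-compliant issuer.
            - --provider=oidc
            - --oidc-issuer-url=https://id.example.internal   # your Pocket-ID URL goes here
            - --client-id=someapp
            - --email-domain=*
            # The app being protected, reachable inside the cluster.
            - --upstream=http://someapp.default.svc.cluster.local:8080
            - --http-address=0.0.0.0:4180
          env:
            - name: OAUTH2_PROXY_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  name: someapp-oidc
                  key: client-secret
            - name: OAUTH2_PROXY_COOKIE_SECRET
              valueFrom:
                secretKeyRef:
                  name: someapp-oidc
                  key: cookie-secret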

A screenshot of my homelab's Gitea server

I also run a git server! It’s where Techaro’s super secret projects like the Anubis integration jungle live.

A screenshot of the github actions self hosted runner docs

I run my own GitHub Actions runners because let’s face it, who would win: free cloud instances that are probably oversubscribed or my mostly idle homelab 5950x’s?
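
If you haven’t wired that up before, the workflow side of it is basically a one-line change. Here’s a hedged sketch; “self-hosted”, “linux”, and “x64” are the default labels GitHub puts on self-hosted runners, and the build step is just an example:

# .github/workflows/build.yaml (example name)
name: build
on:
  push:
    branches: [main]

jobs:
  build:
    # Any runner registered with these labels picks up the job.
    runs-on: [self-hosted, linux, x64]
    steps:
      - uses: actions/checkout@v4
      - name: Build everything
        run: go build ./...

The runner registration itself is a one-time dance in the repo or org settings; after that, the 5950x just sits there eating jobs.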

A screenshot of the Longhorn UI showing 42.6 terabytes of storage available to use

One of the big things I run is Longhorn, which spreads out the storage across my house. This is just for the Kubernetes cluster, the NAS has an additional 64-ish terabytes of space where I store my tax documents, stream VODs, and…Linux ISOs.
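
From a workload’s point of view, using it is delightfully boring: you ask for a PersistentVolumeClaim with the Longhorn storage class and Longhorn figures out where the replicas live. A rough sketch, with the name and size made up:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: plex-config           # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn  # Longhorn's default StorageClass
  resources:
    requests:
      storage: 50Gi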

Logos for tools like ingress-nginx, external-dns, cert manager, let's encrypt, and Docker

Like any good cluster I also have a smattering of support services like cert-manager, ingress-nginx, a private docker registry, external-dns, a pull-through cache of the docker hub for when they find out that their business model is unsustainable because nobody wants to pay for the docker hub, etc. Just your standard Kubernetes setup sans the standard “sludge pipe” architecture.
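
To give you a flavor of what that support layer looks like, the cert-manager part is roughly a ClusterIssuer pointed at Let’s Encrypt with ingress-nginx answering the HTTP-01 challenges. A sketch with a placeholder contact email:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt's production ACME endpoint
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com            # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx     # handled by ingress-nginx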

A screenshot showing proof of Eric Chlebek coining the term sludge pipe architecture

By the way, I have to thank my friend Eric Chlebek for coming up with the term “sludge pipe” architecture to describe modern CI/CD flows. I mean look at this:

A screenshot of ArgoCD showing off the standard sludge pipe architecture

You just pipe the sludge into git repos and it flows into prod! Hope it doesn’t take anything out!

A smattering of the webapps I host in my homelab

I’ve also got a smattering of apps that I’ve written for myself over the years, including but not limited to the hlang website, Techaro’s website, the Stealth Mountain feed on Bluesky, a personal API that’s technically part of my blog’s infrastructure, the most adorable chatbot you’ve ever seen, a bot to post things on a subreddit to Discord for a friend, and Architect knows how many other small experiments.

The history of my homelab

Like I said though, you don’t always need to start out with a complicated multi-node system with distributed storage. Most of the time you’ll start out with a single computer that can turn on. I did.

A picture of my 2012 trash can mac pro on my desk

I started out with this: a trash can Mac Pro that was running Ubuntu. I pushed a bunch of strange experiments to it over the years and it’s where I learned how to use Docker in anger. It’s been a while and I lost the config management for it, but I’m pretty sure it ran bog-standard Docker Compose with a really old version of Caddy. I’m fairly sure this was also the machine I used as my test network when I was maintaining an IRC network. Either way, 12 cores and 16 GB of RAM went a long way in giving me stuff to play with. This lasted me until I moved to Montreal in mid-2019. It’s now my Prometheus server.

Then in 2020 I got the last tax refund I’m probably ever going to get. It was about 2.2 thousand snow pesos and I wanted to use it to build a multi-node homelab cluster. I wanted to experiment with multi-node replicated services without Kubernetes.

A triangular diagram balancing wattage, cost, and muscle

When I designed the nodes, I wanted to pick something that had a balance of cost, muscle, and wattage. I also wanted to get CPUs that had built-in PCI to HDMI converters in them so I can attach a “crash cart” to debug them. This was also before the AI bubble, so I didn’t have langle mangles in mind. I also made sure to provision the nodes with just enough power supply overhead that I could add more hard drives, GPUs, or whatever else I wanted to play with as new shiny things came out.

A picture of three of my homelab nodes, kos-mos, ontos, and pneuma

Here are a few of them on screen; from left to right, that’s kos-mos, ontos, and pneuma. Most of the nodes have 32 GB of RAM and a Core i5-10600 with 12 threads. Pneuma has a Ryzen 5950x (retired from my husband’s gaming rig when he upgraded to a 7950x3D) and 64 GB of RAM. Pneuma used to be my main shellbox until I did the big Kubernetes changeover.

Not shown are Logos and Shachi. Shachi is my old gaming tower and has another 5950x in it. In total this gives me something like 100 cores and 160 GB of RAM. This is way overkill for my needs, but allows me to basically do whatever I want. Don’t diss the power of floor desktops!

Eventually, Stable Diffusion version 1 came out and I wanted to play with it. The only problem was that it needed a GPU. Luckily we had an RTX 2060 lying around and I was able to get it up and running on Ontos. Early Stable Diffusion was so much fun. Like, look at this.

An AI generated illustration of a figure that vaguely looks like Richard Stallman having a great time with an acid trip in the forest

The prompt for this was “Richard Stallman acid trip in a forest, Lisa frank 420 modern computing, vaporwave, best quality”. This was hallucinated, pun intended, on Ontos’ 2060. I used that 2060 for a while but then bigger models came out. Thankfully I got a job at a cloud provider so I could just leech off of their slack time. But I wanted to get langle mangles running at home so Logos got an RTX 3060 to run Ollama.

A badly photoshopped screenshot of The End of Evangelion with a certain Linux distribution's logo over Rei's face while Shinji and Asuka look on in horror at the last sunset humanity will ever see

At a certain point though, a few things happened that made me realize I was going off course from what I wanted. My homelab nodes weren’t actually redundant like I wanted them to be. The setup I used had me allocate tasks to specific nodes, and if one of them fell over I had to do configuration pushes to move services around. This was not according to keikaku.

Numa is smug
Numa

By the way, translator’s note: keikaku means plan.

Then the distribution I was using made…creative decisions in community management, and I realized that my reach as a large-scale content creator (I hate that term) and blogger meant that by continuing to advocate for that distro in its current state, I was de facto harming people. So I decided to look for something else.

The Kubernetes logo

Let’s be real, the kind of things I wanted out of my homelab were literally Kubernetes-shaped. I wanted a bunch of nodes that I could just push jobs to and let the machine figure out where everything lives. I couldn’t have that with my previous setup no matter how much I wanted it, because the tools just weren’t there to do it in real life.

A screenshot of my 'Do I need Kubernetes?' post

This was kind of a shock, as previously I had been on record saying that you don’t, in fact, need Kubernetes. At the time I gave this take though, there were other options. Docker Swarm was still actively in development. Nomad was a thing that didn’t have any known glaring flaws other than being, well, Nomad. Kubernetes really looked like an over-engineered pile of jank.

It really didn’t help that one of my past jobs was to create a bog-standard sludge pipe architecture on AWS and Google Cloud, way before cert-manager was stable. Ingress-nginx was still in beta. Everything was in flux.

Instructions on how to use hand dryers, but with the text 'Push button, receive bacon' under each step

Kubernetes itself was fine, but it was not enough to push button and receive bacon and get your web apps running somewhere. I get that’s not the point of Kubernetes per se, it scales from web apps to fighter jets, but at the end of the day you gotta ship something, right?

It really just burnt me out and I nearly left the industry at large as a result of the endless churn of bullshit. The admission that Kubernetes was what I needed really didn’t come easy. It was one of the last things I wanted to use; but with everything else either dying out from lack of interest or having known gaping flaws show up, it’s what I was left with.

Then at some point I thought, “eh, fuck it, what do I have to lose” and set it up. It worked out pretty great actually.

A screenshot of a Discord conversation where someone asks me what I think about Kubernetes after using it for a while, I reply 'I don't hate it'

After a few months someone in the patron discord asked me what I thought about Kubernetes in my homelab after using it for a while and my reply was “It’s nice to not have to think about it”. To be totally honest, as someone with sludge pipe operator experience, “it’s nice to not have to think about it” is actually high praise. It just kinda worked out and I didn’t have to spend too much time or energy on it modulo occasional upgrades.

What I like about it

And with that in mind, here’s what I really like about my homelab setup as it is right now.

I can just push button and receive bacon. If I want to run more stuff, I push it to the cluster. If I want to run less stuff, I delete it from the cluster. Backups happen automatically every night. The backup restore procedure works. Pushing apps is trivial. Secrets are integrated with 1Password. Honestly, pushing stuff to my homelab cluster is significantly easier than it’s ever been at any company I’ve ever worked at. Even when I was a sludge pipe operator.

One of the best parts is that I haven’t really had to fight it. Stuff just kinda works and it’s glorious. My apps are available internally and externally and I don’t really have to think too much about the details.

Of course, I didn’t just stop there. I took things one step further and realized that, across my /x/ repo, a bunch of my services fell into a few basic patterns:

  • The first generic shape of service is the headless bot that just does a thing like monitor an RSS feed and poke a web hook somewhere. This only really needs a Deployment to manage the versions of the container images and maybe some secrets for API keys or the like.
  • Second, I need to run programs that listen internally and serve API calls. Maybe they have some persistent storage. Either way, they definitely need a DNS name within the cluster so other services can use that API to do things like post messages on IRC.
  • Third, some of the things I run are web apps. Webapps are pretty much the same, but they need a DNS name outside the cluster and a way to get HTTP ingress routed to the pod. I use nginx for that, but the configuration can be a bit fiddly and manual. It’d be nice to hyper automate it so that I don’t have to think about the details, I just think about the App.

I was really inspired by Heroku’s setup back when I worked there. With Heroku you just pushed your code and let the platform figure it out. Given that I had a few known “shapes” of apps, what if I just made my own resources in Kubernetes to do that?

apiVersion: x.within.website/v1
kind: App
metadata:
  name: httpdebug

spec:
  image: ghcr.io/xe/x/httpdebug:latest
  autoUpdate: true

  ingress:
    enabled: true
    host: httpdebug.xeiaso.net

So I did that, thanks to Yoke. I just define an App, and it creates everything downstream for me. 1Password Secrets can be put in the filesystem or the environment. Persistent storage is a matter of saying where to mount it and how much I want. HTTP ingresses are a simple boolean flag with the DNS name. External DNS records, TLS certificates, and the whole nine yards are naught but implementation details. A single flag lets me create a Tor hidden service out of the App so that people can view it wherever they want in the world without government interference. I can add Kubernetes roles by just describing the permissions I want. It’s honestly kind of amazing.

A screenshot of the Techaro Bluesky account ominously posting about HyperCloud

This is something I want to make more generic so that you can use it too; I’ll get to it eventually. It’s in the cards.

Learning to play defense

In the process of messing with my homelab, I’ve had to learn to play defense.

Numa is smug
Numa

This isn’t something that the Jedi will teach you, learning how to do this is much more of a Sith legend.

Something to keep in mind though: I have problems you don’t. My blog gets a lot of traffic in weird patterns. If it didn’t, I’d run it at home, but it does so I have to host it in the cloud. However, remember that git server? Yeah, that runs at home.

A brown haired anime catgirl running away from a swarm of bots, generated with Flux [schnell]

When you host things on the modern internet, bots will pile in as soon as the cert is minted and start pummeling the hell out of it. I like to think that the stuff I make can withstand this, but some things just aren’t up to snuff. It’s not their fault, mind you; modern scraper bots are unusually aggressive.

Honestly it feels like when modern scrapers are designed, they have these goals in mind:

Numa is smug
Numa
  • Speed up requests when the server is overloaded, because if it’s returning responses faster it must be able to handle more traffic, right?
  • Oh and if the server is responding with anything but 200, just retry that page later. It’ll be fine, right?
  • Not to mention, those Linux kernel commits from 15 years ago may have changed since you last looked, so why not just scrape everything all over again a few days later?
  • Caches? That requires more code. We gotta ship fast and iterate. We can’t spend time downloading git repositories or caching the etags. That’ll slow us down!
  • Oh, they’re blocking our datacenter IP addresses? No problem! We’ll just cycle through sketchy residential proxy services so that they just think it’s a bunch of people using normal chrome to fetch unusual amounts of webpages.

What could go wrong? Pass me the booch yo.

A smug green haired anime woman telling you to not use VPNs

By the way, public service announcement. Don’t use VPNs unless you have a really good reason. Especially don’t use free VPNs. Those sketchy residential proxy services are all powered by people using free VPNs. If you aren’t a customer, you are the product.

What makes this worse is that git servers are the most pathologically vulnerable to the onslaught of doom from modern internet scrapers because remember, they click on every link on every page.

A screenshot of a webpage with about 50 billion yellow tags highlighted, each is a clickable link

See those little yellow tags? Those are all links. Do the math. There’s a lot of them. Not to mention that git packfiles are compressed in a way that doesn’t support seeking. Every time the bots open every link on every page, they go deeper and deeper into uncached git packfile resolution, because let’s face it, who on this planet is going out of their way to look at every file in every commit of GTK from 2004 and older? Not many people, it turns out!

And that’s how Amazon’s scraper took out my Git server. I tried some things and they didn’t work, including but not limited to things I can’t say in a recording. I debated taking it offline completely and just having the stuff I wanted to expose publicly be mirrored on GitHub. That would have worked, but I didn’t want to give up. I wanted to get even.

Then I had an idea. Raise your hand if you know enough about what I do to understand how terrifying that statement is.

More of you than I thought.

Somehow I ended up on the Wikipedia page for the weighing of souls. Anubis, the god of the underworld, weighed your soul, and if it was lighter than a feather you got to go into the afterlife. This felt like a good metaphor.

A screenshot of Anubis' readme, showing a brown haired jackal waifu looking happy and successful

And thus I had a folder name to pass to mkdir. Anubis weighs the soul of your connection using a SHA256 proof-of-work challenge in order to protect upstream resources from scraper bots. This was a super nuclear response, but remember, this was the state of my git server:

A server that was immolated by fire

I just wanted uptime, man.
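
For the curious, the shape of the fix is “run Anubis as a reverse proxy in front of the thing getting hammered, then point your ingress at Anubis instead of the app.” Here’s a rough sketch of that in Kubernetes; the image path and environment variable names are from memory, so check the Anubis docs before copying this anywhere:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: git-anubis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: git-anubis
  template:
    metadata:
      labels:
        app: git-anubis
    spec:
      containers:
        - name: anubis
          image: ghcr.io/techarohq/anubis:latest   # check the docs for the current image
          env:
            - name: BIND
              value: ":8080"    # where Anubis listens
            - name: TARGET
              # the service being protected, e.g. a Gitea install
              value: "http://gitea.git.svc.cluster.local:3000"
            - name: DIFFICULTY
              value: "4"        # proof-of-work difficulty

Then the Ingress for the git server points at this instead of at the git server directly, and the scrapers have to do a bit of math before they get to do any damage.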

Either way, the absolute hack I had worked, so I put it on GitHub. Honestly, when I’ve done this kind of thing before, it got ignored. So I just had my 4080 dream up some placeholder assets, posted a blog post about it, and went back to playing video games.

Then people started using it. I put it in its own repo and posted about it on Bluesky.

Screenshots of people raving about Anubis

I wasn’t the only one having this problem, it seems! It’s kinda taking off! This is so wild and not the kind of problem I usually have.

The GitHub star count graph going hockey-stick

Like the graphs went hockey stick.

The GitHub star count graph going even more hockey-stick

Like really hockey-stick.

The GitHub star count graph continuing to be a hockey-stick

It just keeps going up and it’s not showing any signs of stopping any time soon.

Anubis' GitHub star count compared to my other big projects

For context, here it is compared to my two biggest other projects. It's the mythical second Y axis graph shape. So yeah, you can understand that it’s gonna take a while to circle back to the Techaro HyperCloud.

The cool part about this, in my book, is that because I had a problem that was only exposed by the hardware my homelab uses (specifically because my git server was apparently running on rotational storage, oops), I got creative, made a solution, and pushed it to GitHub. Now it’s in use to protect GNOME’s GitLab, SourceHut, small community projects, and god knows how many git forges, and I’ve heard that basically every major open source project that self-hosts infrastructure is evaluating it to protect their websites too. I really must have touched a nerve or something.

Conclusion

In conclusion:

If you like it, you should self-host it. Online services are vanishing at an alarming rate. Everything is centralizing around the big web, and it makes me afraid of what the future of the small Internet will look like if this continues.

Anubis looking pensive next to 'Think small'

Think small. A single node with a 2012 grade CPU and 16 gigabytes of dedotated wam lasted me until 2019. When I get a computer, I use the whole computer. If it’s fine for me, it’s more than enough for you.

A smug green haired anime woman telling you to fuck around and find out, but not as a threat

Fuck around and find out. That’s not just a threat. That’s a mission statement.

Remember that if you get an idea, fuck around, find out, and write down what you’ve learned: you’ve literally just done science. Well, with computers, so it’d be computer science, but you get my point.

And if bots should come in and start a-pummeling away, remember: you’re not in the room with them. They’re in the room with you. Remember Slowloris? A little birdie told me that it works server to client too. Consider that.

The GReeTZ / special thanks slide

My time with you is about to come to an end, but before we go, I just want to thank everyone on this list. You know what you did. If you’re not on this list, you know what you didn’t do.

The conclusion slide with more contact info

And with that, I've been Xe! I'll be around if you have questions or want stickers. Stay warm!

If I don’t get to you, please email your questions to [email protected]. With all that out of the way, does anyone have any questions?