2024-12-11 08:00:00
Web components might be great, if only you could render them on the server.
Or can you? The lack of server-side rendering has become a sort of folk belief that oft goes unquestioned, and many people form opinions based on this (alleged) missing feature.
Well, I am happy to report that the fears are unfounded: you can absolutely server-side render a web component. But there are a few different ways it can go down.
Let's start from the top.
The building blocks of web components — template elements, custom elements and (declarative) shadow DOM — are all just HTML tags.
So from a pedantic point of view, server-side rendering a web component is trivial: just put a <template>
or a <custom-element>
tag in your markup.[1]
I'm being glib, but this is already genuinely powerful! Custom elements let you attach logic to specific points in the light DOM. Rather than declaring that attachment point in a separate JavaScript file, though, you can do it directly in your HTML markup. This strategy — using custom elements without templates or shadow DOM, to enhance light DOM elements that already exist — has come to be called HTML web components.
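As a rough sketch of what that can look like (the element name, attribute and markup here are made up for illustration), a custom element can layer behavior onto light DOM the server already sent:
customElements.define("copy-button", class extends HTMLElement {
  connectedCallback() {
    // The <button> already exists in the light DOM; the custom element
    // only wires up behavior around markup that works without JavaScript.
    const button = this.querySelector("button");
    button?.addEventListener("click", () => {
      navigator.clipboard.writeText(this.getAttribute("text") ?? "");
    });
  }
});
Dropped into markup like <copy-button text="hello"><button>Copy</button></copy-button>, the button still renders even if the script never loads; the element just adds behavior on top.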
You don't even need to bring in JavaScript for web components to be useful. Hawk Ticehurst shared a pattern he calls CSS web components, in which custom element attributes are used as hooks for CSS selectors. This gives us a props-like API to modify a component's appearance without writing a byte of JavaScript.
None of this is what people usually mean, though.
"Web component" is really just an umbrella term for those three APIs, but in practice people use it to mean adding client-side behavior to custom elements by subclassing HTMLElement.
And when they talk about server-side rendering web components, they mean running that subclass on the server and having it spit out the markup it would generate on the client.
This approach — running the same code in two different environments — is popularly called isomorphism. It's how most major JavaScript frameworks approach server-side rendering. But for whatever reason, resources for doing it with web components are few and far between.
Is isomorphism purely a JavaScript framework thing, or is there a more standard way to do it?
If you open the HTML custom elements spec and search for the word "server", you'll get two results and both of them are about form processing. You won't find it at all in the DOM spec.
You might be thinking, "wait a minute — isn't declarative shadow DOM a spec for server-side rendering web components?" The answer to that is: not really. Declarative shadow DOM defines a way to set up shadow roots within HTML (i.e. without JavaScript).
But many web components don't use shadow DOM at all — they render plain old light DOM. And while the spec details how HTML is parsed into a shadow root, it's agnostic as to how that HTML gets generated.
That makes sense, though! At the risk of stating the obvious: browser specs are written for the browser. Their concern is how the browser interprets the HTML it receives; how that HTML is created is outside of their purview.
Okay, no official guidance from the W3C. Now what?
At this point, libraries like Lit, WebC and Enhance enter the discussion. These are tools that target web components as an output format: you use them to build a component, and at the end you get a custom element that you can use in any website or web app. Many of them also let you render that component to static HTML on the server. Case closed, right?
Not quite. The code you write in these libraries might not look much like "vanilla" web components at all. Here's an example of an Enhance element:
export default function MyElement({ html, state }) {
  const { attrs } = state;
  const { name } = attrs;
  return html`
    <p>Hello ${name}!</p>
    <style>
      p {
        color: rebeccapurple;
      }
    </style>
  `;
}
You'd be forgiven for thinking you were looking at a React component! Frankly, I don't see a fundamental difference between these and frameworks like Svelte or Vue, which can also use web components as a compile target.[2]
To be clear, I don't mean any of this as a slight. These are all good tools — web component libraries and JavaScript frameworks both. Ultimately, they all ask you to write components in their own format, run them through their own tooling, and treat the standard web component as an output artifact rather than the thing you actually author.
That makes me uneasy. When I choose to write a web component rather than, say, a Svelte component, one of my main reasons is to work directly with the web platform. I don’t want to add a build step or change how I write my code.
One of the coolest things about web components is that they act as a decoupling layer. It doesn't matter whether a component is built with Enhance, Lit or anything else; I can drop it into a Svelte app or an Astro site or a Markdown file or a page of handwritten HTML and it will Just Work and I will be none the wiser.
Which is why I'm not super thrilled about server-side rendering solutions that are tied to particular libraries. The interoperability promise is broken — or, at the very least, weakened. How a component is built is no longer simply an implementation detail. What I've chosen for the frontend now exerts influence on the backend, and vice versa.[3]
So the goal is to take existing web components, written against the standard APIs with no particular library in mind, and render them server-side.
This is kind of like a bizarro progressive enhancement. Rather than starting with HTML and enhancing it with JavaScript, we're starting with JavaScript and enhancing it with HTML.
Note that this is not mutually exclusive with traditional progressive enhancement! With this strategy, the component's server-side rendered HTML can still deliver baseline functionality even if the JavaScript fails to load. The fact that the same code that generated the HTML later ends up running in the browser is an implementation detail.
Curiously, there doesn't seem to be much written online about this approach. If people are doing it, they're not really talking about it. That's why I decided to build a proof of concept.
We'll start by trying to server-side render this web component:[4]
customElements.define("greet-person", class extends HTMLElement {
  connectedCallback() {
    const name = this.getAttribute("name");
    this.innerHTML = `<p>Hello, ${name}!</p>`;
  }
});
We want to be able to write this in our HTML:
<greet-person name="Jake"></greet-person>
…and have it expand into this:
<greet-person name="Jake">
  <p>Hello, Jake!</p>
</greet-person>
If we're going to take any web component that works on the client, that means we'll need a way to emulate the DOM. There are a bunch of libraries that do this, but we'll use one called Happy DOM.
With Happy DOM in our toolbelt, the code to actually do the rendering is pretty short:
import { Window } from "happy-dom";

const globals = new Window();
global.document = globals.document;
global.customElements = globals.customElements;
global.HTMLElement = globals.HTMLElement;

export async function render(html: string, imports: Array<() => Promise<void>> = []) {
  await Promise.all(imports.map((init) => init()));
  document.documentElement.innerHTML = html;
  return document.documentElement.getHTML({ serializableShadowRoots: true });
}
At the module's top level, we create a Happy DOM Window. Just like in a browser, an instance of Window contains all the global variables available — HTMLElement, customElements, you name it.
We'll take these global variables and set them on Node's global
object, which makes them available to all modules we might import.[5]
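For components that reach for more of the DOM API, the same trick extends to whatever globals they need. A small sketch, assuming the component in question touches these particular APIs and that Happy DOM's Window exposes them the way a browser window does:
// Forward whichever additional globals your components use
// (an assumption: adjust the list to match what your components actually touch).
global.Node = globals.Node;
global.Event = globals.Event;
global.CustomEvent = globals.CustomEvent;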
We only need to define one function to make this work. We'll call it render, and it'll take two parameters: the HTML to render as a string, and an array of functions that import the web component classes. After awaiting the return value of each of those functions, we set the window's documentElement.innerHTML to the string we passed into the render function, then serialize the emulated DOM to an HTML string and return it.
We call the render
function like this:
import { render } from "./render.js";

const html = `<!doctype html>
<html lang="en">
  <body>
    <greet-person name="Jake"></greet-person>
  </body>
</html>
`;

const result = await render(html, [
  () => import("./greet-person.js")
]);
The important part here is that we need to import the render
function before we import the web components.
That way, by the time the web components are declared, all the browser APIs they rely on are already available on Node's global scope.[6]
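If you'd rather not pass the imports array at all (footnote 6 makes the same point), plain static imports work too, as long as the order is right. A sketch, reusing the same hypothetical files:
// render.js runs first, so the DOM globals exist by the time
// greet-person.js calls customElements.define(...)
import { render } from "./render.js";
import "./greet-person.js";

const result = await render(`<body><greet-person name="Jake"></greet-person></body>`);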
That works for web components that stay within the light DOM. What about the shadow DOM?
Let's update our component:
customElements.define("greet-person", class extends HTMLElement {
  constructor() {
    super();
    this.attachShadow({ mode: "open", serializable: true });
    this.shadowRoot.innerHTML = "<p>Hello, <slot></slot>!</p>";
  }
});
If you've used shadow DOM before, this probably looks familiar.
One thing that may be new to you — or at least, it was to me — is the serializable
property, which instructs the element to render the shadow root into HTML.
Now, if we put this in our markup:
<greet-person><span>Jake</span></greet-person>
…it'll expand into this:
<greet-person>
  <template shadowrootmode="open" shadowrootserializable="">
    <p>Hello, <slot></slot>!</p>
  </template>
  <span>Jake</span>
</greet-person>
There you go: isomorphic web components.
Any way you cut it, you can server-side render web components today:
- Write HTML (and CSS) web components that don't need any JavaScript-generated markup in the first place.
- Use a library like Lit, WebC or Enhance and let it render its components to static HTML.
- Render existing components isomorphically, emulating the DOM on the server with a library like Happy DOM.
Smart people can disagree about the best approach, but I'm partial to isomorphic rendering. It works with all web components, no matter how they're written. It fully embraces web platform APIs, rather than treating them as a compile target. And it makes our components resilient to toolchain entropy by gracefully degrading to client-side rendering.
Even if you don't agree with me, though, there's a server-side rendering solution out there for you. That's the nice thing about the web: it's flexible like that.
Imagine if I just ended the article there?
"Yes you can, just use a <template>, next question." ↩︎
Granted, the web component libraries approach web components from a place of respect, whereas the feelings of JavaScript framework authors seem to range from "mild annoyance" to "seething hatred". ↩︎
Enhance takes an interesting stance here and uses WASM to decouple rendering from the programming language. You still need to write your components in Enhance's format, but then you can use them in any stack you want. ↩︎
There are better ways to define a web component than just throwing it in customElements.define, but we'll go with this for brevity. ↩︎
If your component depends on more globals than customElements and HTMLElement, you'll also need to set those on the global object. ↩︎
For convenience, render accepts an array of functions that dynamically import the web component files, but you could omit that parameter and just ensure that you import the web components after the render function. ↩︎
2024-11-22 08:00:00
I recently built a Bluesky bot called Link Notifier that sends you a DM whenever someone posts a link to your website. To build it, I had to dig into the Bluesky firehose. That seems like a pretty common entry point for people looking to build on top of Bluesky, so I figured I'd share what I learned.
There are a couple ways to get at the Bluesky firehose:
- Consume the raw AT Protocol firehose directly.
- Use Jetstream, a service that re-serves firehose events as friendlier JSON over a WebSocket.[1]
You might be tempted — as I was at first — to avoid all the gory details and just reach for a library like @skyware/jetstream.
Don't be intimidated! The Jetstream API is actually remarkably simple, and you can easily consume it without adding a dependency to your project.
Here's a small example running in the browser that consumes the Bluesky Jetstream: a web component that shows the latest post every second. (This is totally unfiltered; I'm sorry if anything unsavory shows up here.)
The full code of this component is less than 40 lines — including the templating and all the web component boilerplate! The code that reads from the Jetstream takes up about six. There are no dependencies outside of the browser's standard library.
Before we look at any code, though, let's take a quick detour through the AT Protocol and Jetstream API.
Jetstream is a WebSocket server: we connect via a WebSocket connection, and it sends events as WebSocket messages encoded in JSON. You can host a Jetstream instance yourself, but as of today Bluesky hosts official instances that you can use without authentication.
The connection string for a Jetstream instance looks like this:
wss://jetstream2.us-west.bsky.network/subscribe
Once you're connected, Jetstream will start sending events. They look like this:
{
  "did": "did:plc:eygmaihciaxprqvxpfvl6flk",
  "time_us": 1725911162329308,
  "kind": "commit",
  "commit": {
    "rev": "3l3qo2vutsw2b",
    "operation": "create",
    "collection": "app.bsky.feed.like",
    "rkey": "3l3qo2vuowo2b",
    "record": {
      "$type": "app.bsky.feed.like",
      "createdAt": "2024-09-09T19:46:02.102Z",
      "subject": {
        "cid": "bafyreidc6sydkkbchcyg62v77wbhzvb2mvytlmsychqgwf2xojjtirmzj4",
        "uri": "at://did:plc:wa7b35aakoll7hugkrjtf3xf/app.bsky.feed.post/3l3pte3p2e325"
      }
    },
    "cid": "bafyreidwaivazkwu67xztlmuobx35hs2lnfh3kolmgfmucldvhd3sgzcqi"
  }
}
That's the full event of a post being liked.
It's pretty dense! There are a bunch of terms like "collection" and "did" that are idiosyncratic to AT Protocol. Most of them can be found in the glossary, but I'll try to define them in my own words as they come up as well.
In AT Protocol, everything a user does is found in a repo.
Each repo has a DID: a Decentralized ID that uniquely identifies it.
The did
property at the root of the event object is a reference to the repo of the user who took the action (in this case, the user who liked the post).
The kind property disambiguates between three types of events:
- commit, for events that create, update or delete something in a repo.
- identity, for events that describe some change to the repo itself (not quite sure which — I assume changing a handle would be one example).
- account, for events that describe a change in account status (e.g. from "active" to "deactivated").
For our purposes, we're only worried about commit events.
Those events all have a nested commit object with an operation property: create, update or delete.
I'll let you guess what those mean.
Each commit object also has a collection property. This is a way to "group" events across repos. For example, to listen to all new posts, we'd ignore all events in collections other than app.bsky.feed.post.
If we did all the filtering in the client, we'd be receiving a ton of data we don't need. Jetstream provides a way to avoid this: append a wantedCollections query string parameter to the connection string.[2]
Say we're only interested in new posts and likes. We'd connect to this long URL:
wss://jetstream2.us-west.bsky.network/subscribe?wantedCollections=app.bsky.feed.post&wantedCollections=app.bsky.feed.like
That wouldn't absolve us of the need to filter on the client — we'd still need to branch between new posts and likes within our app — but it would prevent us from sifting through a ton of other events we don't care about.
We can also use asterisks as "wildcards" to filter through multiple collections at once.
For example, to get events in all feed collections, we'd set wantedCollections to app.bsky.feed.*.
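For instance, the wildcard version of the connection string could be put together with the URL API (same Bluesky-hosted instance as above):
const url = new URL("wss://jetstream2.us-west.bsky.network/subscribe");
url.searchParams.set("wantedCollections", "app.bsky.feed.*");
const jetstream = new WebSocket(url.toString());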
On create and update commits, the record is the "contents" of the commit — either the thing that was just created, or the thing with which to replace the previous record.
As a reminder, here's what the record looked like in the example event:
{
  "$type": "app.bsky.feed.like",
  "createdAt": "2024-09-09T19:46:02.102Z",
  "subject": {
    "cid": "bafyreidc6sydkkbchcyg62v77wbhzvb2mvytlmsychqgwf2xojjtirmzj4",
    "uri": "at://did:plc:wa7b35aakoll7hugkrjtf3xf/app.bsky.feed.post/3l3pte3p2e325"
  }
}
And here's what a record might look like for posts:
{
  "$type": "app.bsky.feed.post",
  "text": "Hello World!",
  "createdAt": "2023-08-07T05:31:12.156888Z"
}
This is a minimal example; Bluesky's documentation details how to handle links, quotes and so forth.
Notice that in some cases (such as the text property of a post record) the record itself contains information, while in others (such as the subject property of a like record) it contains references to other records.
The simplest possible Jetstream client looks something like this:
const jetstream = new WebSocket("wss://jetstream1.us-east.bsky.network/subscribe");
jetstream.onmessage = e => console.log(e.data);
Voilà: two lines of code and every event from the Bluesky firehose gets logged to the console!
With a little elbow grease, we can come up with something a little more ergonomic.
Let's write a client that mimics the @skyware/jetstream
API:
const jetstream = new Jetstream();
jetstream.onCreate("app.bsky.graph.follow", event => {
// ...
});
jetstream.onDelete("app.bsky.feed.post", event => {
// ...
});
jetstream.start();
This is still a pretty simple client that doesn't cover everything we might ever want to do with Jetstream, but it's more than enough to get us started.
We'll start by writing a Jetstream
class:
class Jetstream {
  endpoint = "jetstream1.us-east.bsky.network";
  emitters = new Map<string, EventTarget>();
  ws?: WebSocket;

  constructor(options: { endpoint?: string } = {}) {
    this.endpoint = options.endpoint ?? this.endpoint;
  }
}
By default, our client connects to the Bluesky-hosted Jetstream instance at jetstream1.us-east.bsky.network. The user can override that by passing an endpoint into the constructor.
We also see two additional members:
- emitters, which holds a map of EventTargets keyed by the collection names.
- ws, which will hold the WebSocket client when we connect to the Jetstream instance.
First, we'll write a private #listen method that calls an event listener when the client receives an event in a given collection with a specific operation:
class Jetstream {
  // ...
  #listen(collection: string, operation: string, listener: (event: unknown) => void) {
    const emitter = this.emitters.get(collection) || new EventTarget();
    this.emitters.set(collection, emitter);
    emitter.addEventListener(operation, event => listener((event as CustomEvent).detail));
  }
}
It gets an EventTarget
from the map at the given collection key — creating one if it doesn't exist — and attaches an event listener for events matching the given commit operation.[3]
When we're dispatching the events later, we'll use CustomEvents, which allow you to include arbitrary data in their detail property. Since the use of CustomEvents is an implementation detail, we'll just pass that property to the listener, rather than the whole event.
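In isolation, that mechanism looks something like this (a standalone illustration with a made-up payload, not part of the class):
const target = new EventTarget();
target.addEventListener("create", event => {
  // The payload rides along on the CustomEvent's detail property.
  console.log(event.detail); // { collection: "app.bsky.feed.post" }
});
target.dispatchEvent(new CustomEvent("create", { detail: { collection: "app.bsky.feed.post" } }));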
From here, we can make public wrapper methods for each of those commit operations:
class Jetstream {
  // ...
  onCreate(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "create", listener);
  }
  onUpdate(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "update", listener);
  }
  onDelete(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "delete", listener);
  }
}
These don't really do much other than make that #listen
method slightly more convenient to use.
Next, let's take a look at the start
method:
class Jetstream {
  // ...
  start() {
    if (this.ws) this.ws.close();
    this.ws = new WebSocket(this.url);
    this.ws.onmessage = ev => {
      const data = JSON.parse(ev.data);
      if (data.kind !== "commit") return;
      const emitter = this.emitters.get(data.commit.collection);
      if (!emitter) return;
      emitter.dispatchEvent(new CustomEvent(data.commit.operation, { detail: data }));
    };
  }
}
This looks pretty familiar: it's a thin abstraction over the barebones client we saw earlier. It closes any existing connection, opens a WebSocket to the instance, parses each incoming message and ignores everything except commit events, then dispatches each commit to the emitter for its collection.
Sharp-eyed readers might notice that we haven't defined the class's url member yet:
class Jetstream {
  // ...
  get url() {
    const url = new URL(`wss://${this.endpoint}/subscribe`);
    for (const collection of this.emitters.keys()) {
      url.searchParams.append("wantedCollections", collection);
    }
    return url.toString();
  }
}
It's a getter that constructs the WebSocket URL, adding wantedCollections
query string parameters for any collections in which we're listening for events.
That way, we'll only receive the slice of the Jetstream containing the collections we care about.
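So, for instance, if we'd registered listeners for new posts and new likes, the getter would produce a URL along these lines:
jetstream.onCreate("app.bsky.feed.post", () => {});
jetstream.onCreate("app.bsky.feed.like", () => {});
console.log(jetstream.url);
// "wss://jetstream1.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post&wantedCollections=app.bsky.feed.like"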
For posterity, here's the full code:
class Jetstream {
  endpoint = "jetstream1.us-east.bsky.network";
  emitters = new Map<string, EventTarget>();
  ws?: WebSocket;

  get url() {
    const url = new URL(`wss://${this.endpoint}/subscribe`);
    for (const collection of this.emitters.keys()) {
      url.searchParams.append("wantedCollections", collection);
    }
    return url.toString();
  }

  constructor(options: { endpoint?: string } = {}) {
    this.endpoint = options.endpoint ?? this.endpoint;
  }

  #listen(collection: string, operation: string, listener: (event: unknown) => void) {
    const emitter = this.emitters.get(collection) || new EventTarget();
    this.emitters.set(collection, emitter);
    emitter.addEventListener(operation, event => listener((event as CustomEvent).detail));
  }

  onCreate(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "create", listener);
  }

  onUpdate(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "update", listener);
  }

  onDelete(collection: string, listener: (event: unknown) => void) {
    this.#listen(collection, "delete", listener);
  }

  start() {
    if (this.ws) this.ws.close();
    this.ws = new WebSocket(this.url);
    this.ws.onmessage = ev => {
      const data = JSON.parse(ev.data);
      if (data.kind !== "commit") return;
      const emitter = this.emitters.get(data.commit.collection);
      if (!emitter) return;
      emitter.dispatchEvent(new CustomEvent(data.commit.operation, { detail: data }));
    };
  }
}
40 lines of code and we've replicated a significant portion of @skyware/jetstream!
Use it, modify it and make something cool.
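As a quick usage sketch (the handler body is my own, but the event shape matches the commit events we saw earlier):
const jetstream = new Jetstream();

jetstream.onCreate("app.bsky.feed.post", event => {
  // Each event is the full Jetstream commit event, so the post text
  // lives at commit.record.text (cast or validate first in TypeScript).
  console.log(`${event.did} posted: ${event.commit.record.text}`);
});

jetstream.start();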
There's an official blog post announcing it, as well as a more in-depth explanation on the original author Jaz's blog. Both are worth reading, but not required to understand the rest of this article. ↩︎
There are a bunch of other options as well.
For instance, you can have Jetstream only send events for specific repos by including a wantedDids
query string parameter.
All the options are listed in the GitHub readme. ↩︎
Why does the listener take an event of type unknown? Technically, Jetstream could send anything over the wire! @skyware/jetstream provides a typed event definition, but they just cast the type rather than actually checking that it's correct.
If you want type safety, you should parse the incoming events using a library like Valibot. ↩︎
2024-11-05 08:00:00
The web development community talks a lot about single-page apps, but are we all on a single page?
Heydon Pickering tackled this question in his similarly-named article What Is A Single-Page Application? The TL;DR — spoiler alert! — is that it's a website that uses a ton of JavaScript to improve user experience by showing you a loading spinner.
That's obviously tongue-in-cheek, but it's a reaction to the working definition that most people use. For better or worse, "single-page app" is usually a euphemism for "JavaScript framework app".
I recently wrote about building a single-page app with htmx using service workers to render everything client-side — no loading spinners in sight! In response, Thomas Broyer objected to the premise that htmx and single-page apps were opposites. He showed me an article that he wrote called Naming things is hard, SPA edition (which you should also go read!) that breaks down rendering into a spectrum.
In a bid to cement my burgeoning reputation as a Quadrant Chart Guy, I feel compelled to add even more nuance to the situation:
I'm sorry. Kinda.
Okay, let's define the extrema of each axis:
- Rendering: at one extreme, server-side rendering, where the HTML for a page is generated on the server; at the other, client-side rendering, where the HTML is generated in the browser by JavaScript.
- Navigation: at one extreme, multi-page apps, where every navigation loads a whole new document from the server; at the other, single-page apps, where navigations happen by swapping content into the current document.
If you just came here for an answer to the title, that's it; I guess you can go home now. But I think it's interesting to look at the various tools people use and how they fit in.
Most tools for building websites don't lock you into just one quadrant.
After all, any tool lets you drop in a plain un-enhanced <a>
tag and at the very least get MPA behavior, and most JavaScript usage outside of Google Tag Manager relies on client-side rendering (even if done manually).
So: without casting any aspersions, here's my ontology of web app architectures organized by rendering and navigation.
This is a pretty large tent, encompassing WordPress, Django, Rails (pre-Turbolinks), Jekyll, Hugo, Eleventy and myriad others. It also includes hand-authored HTML, though I wouldn't describe that as a "tool" so much as a "way of life".
Tools in this category are on the bottom left of the chart: server-side rendered multi-page apps.
The tradeoffs of this quadrant are well known.
This experience has remained mostly unchanged for 30 years. And it's great! With only a little bit of HTML and CSS, you can make a pretty good website; the many Motherfucking Website variations show just how far a few tags and properties get you. The low barrier to entry is one of the main reasons the web flourished.
Three decades on, improvements in HTML and CSS are starting to mitigate some of the downsides. Preloading resources, for example, allows the browser to preemptively download associated files, which can make navigation almost instantaneous. And cross-document view transitions — not yet well supported, but hopefully soon! — promise to allow multi-page apps to navigate with fancy animations.
That said: requiring a network request and a whole new page for every interaction is a pretty strong constraint! As developers' ambitions grew, they leaned more and more heavily on JavaScript, which led to…
Although JavaScript was invented way back in 1995, I don't think a schism truly happened until 2010 or so. That's when the stereotypical single-page apps began to emerge: rather than using small snippets of JavaScript to add client-side functionality to server-side rendered HTML, people started building apps with a JavaScript framework and rendering them on the client.
Note that I'm not talking about Next.js or similar tools (I'll get to them in the next section).
I'm talking about Backbone, Angular 1, React with a custom Webpack setup… basically, JavaScript apps before circa 2018, when people would ship an HTML file with an empty <body>
except for one lonely <script>
tag.
Used thusly, JavaScript frameworks are the diametric opposite of traditional web frameworks: both navigation and rendering happens on the client. As such, they fit neatly into the top right quadrant: client-side rendered single-page apps.
What are the benefits of this quadrant?
In practice, I think many of the purported benefits of client-side rendered SPAs turned out to be wishful thinking.
There are also more general drawbacks.
If I sound critical of this category, it's only because the industry has largely recognized these drawbacks and moved on to other architectures. While JavaScript frameworks are more popular than ever, they tend to exist as components of larger systems rather than as app frameworks in and of themselves.
Client-side rendered SPAs still have their uses, though. When I made my local-first trip planning app, I built it as a client-side rendered SPA. There was really no other way to build it — since the client has the canonical copy of the data, there's not even a server to do any rendering! As local-first picks up steam, I hope and expect to see this architecture make a resurgence in a way that does capture the upside of the quadrant's tradeoffs.
JavaScript frameworks had about half a decade of client-side rendering glory before people realized that delivering entire applications that way was bad for performance. To address that, developers started building metaframeworks[1] — Next.js, Remix, SvelteKit, Nuxt and Solid Start, among others — that rendered on the server as well.
In metaframeworks, rendering happens in two different ways: the initial page load is rendered to HTML on the server, and subsequent navigations and updates are rendered on the client once the page has hydrated.
These steps slot neatly into the top left and top right quadrants, respectively.
JavaScript metaframeworks are an attempt to get the "best of both worlds" between server-side rendered multi-page apps and client-side rendered single-page apps. In particular, they fix the cold cache initial page load and SEO drawbacks of the latter. With React Server Components, React-based metaframeworks can omit UI code from the JavaScript bundle as well.[2]
Depending on whom you ask, this is either good because it really is a "best of both worlds" situation, or bad because your UI is probably useless before it hydrates with the JavaScript (that your users still need to download). But "probably" in that sentence is doing at least some amount of lifting; many metaframeworks like SvelteKit and Remix embrace progressive enhancement and work without JavaScript by default.
A couple years ago, Nolan Lawson attempted to bridge the two camps:
At the risk of grossly oversimplifying things, I propose that the core of the debate can be summed up by these truisms:
- The best SPA is better than the best MPA.
- The average SPA is worse than the average MPA.
I think that's a fair take, but there are a couple other architectures still remaining that make things a little blurrier.
Recently we've seen the emergence of a new category: server-side rendered multi-page frameworks that embrace islands of interactivity for rich client-side behavior. While the idea itself isn't new, the current crop of frameworks built around it are — Astro, Deno Fresh and Enhance, among others.
In case you're unfamiliar: an island of interactivity is a region of an otherwise static HTML page that is controlled by JavaScript. It's an acknowledgment that while richly interactive applications do exist, the richly interactive part is often surrounded by a more traditional website. The classic example is a carousel, but the pattern is broadly useful; the interactive demos on this very blog are built as islands within static HTML.
What that means in practice is that these websites will fit mostly into the bottom left quadrant — except for the namesake islands of interactivity, which fit into the bottom right.
Similar to JavaScript metaframeworks, islands frameworks also try to get the "best of both worlds" between client-side and server-side rendering — albeit as MPAs rather than SPAs. The bet is that reducing complexity around the static parts of a page is a better tradeoff than giving developers more control. As with traditional web frameworks, the gap between them should narrow as support for view transitions gets better.
This pattern is less all-encompassing than some of the others, but it's worth mentioning because the past few years have seen it explode in popularity. By "partial swapping", I mean making an HTTP request for the server to render an HTML fragment that gets inserted directly into the page.
To wit, websites using partial swapping generally fall on the server-side rendered side of the chart, spanning both the single-page and multi-page quadrants.
The most famous partial swapping tool is htmx, which people tend to use in conjunction with "traditional" server-side rendered frameworks. Other libraries like Unpoly and Turbo work similarly. Some frameworks in other categories, such as Rails (with Turbo) and Deno Fresh, have adopted partial swapping as well.
As I've written before, people act as though this pattern is saving the web from SPAs. Once we widen our view like this, though, we can see that's a false dichotomy. In fact, by making it easier for developers to replace finer-grained regions of the page, partial swapping is actually a tool for creating SPAs[3] — albeit server-side rendered ones.
It's not all or nothing! The htmx documentation outlines how this pattern can work in conjunction with client-side scripting approaches such as islands. I won't make a chart with three of the four quadrants filled in, but you get the idea: these boundaries are fluid, and good tools don't lock developers into a specific region.
Partial swapping can also be used as a polyfill for cross-document view transitions. Frameworks like Astro allow authors to load full pages asynchronously, progressively enhancing MPAs into server-side rendered SPAs.
None of this is particularly groundbreaking. But I agree with Thomas that imprecise terminology doesn't help whatever discourse plays out on the hot-take-fueled Internet argument fora. Hopefully, this can serve as a reference point when we talk about when and where these architectures are appropriate.
Not to be confused with Meta Frameworks, which just means React. ↩︎
Dan Abramov gives an example of this in The Two Reacts. Imagine a blog with posts written in Markdown. An app fetching those posts from across the network and rendering them on the client would need to include a full Markdown parser in the JavaScript bundle. React Server Components allow the server to parse the Markdown, and send only the result to the client to be rendered. ↩︎
Of course, you don't have to use partial swapping to create a full-on SPA. In Less htmx is More, htmx maintainer Alexander Petros advocates using it judiciously and relying on regular links and form submissions that cause the browser to do a full-page navigation (in other words, progressive enhancement). ↩︎
2024-10-29 08:00:00
How many of us start every web project by copy-and-pasting Eric Meyer's famous CSS reset?
I did that for a long time without even really reading it. I knew that it smoothed over some of the inconsistencies between various browsers, and that was enough for me.
Then, a few years ago, I read a blog post by Josh W Comeau called My Custom CSS Reset. I realized that there's no deep magic; everything a reset does is just normal CSS that you can read and understand.[1]
My CSS reset is loosely based on Josh's, but it takes a slightly more opinionated stance and applies some light default styling that I end up adding to almost every project anyway. CSS has also rapidly improved over the past few years, and I've tried to take advantage of that by including a few modern features like cascade layers, logical properties, nicer text wrapping and nesting.[2]
I suppose it kind of blurs the line between "CSS reset" and "classless CSS framework". My goal isn't really to adhere to a strict definition of either one; I've just found this set of styles to be a good starting point for most websites I build.
If you haven't read Josh's post, go do that now. He explains his reset in detail, with a bunch of great interactive examples. I'm going to talk mostly about the points at which my reset differs from his.
Here it is:
@layer reset {
  *, *::before, *::after {
    box-sizing: border-box;
  }

  * {
    margin: 0;
    padding: 0;
  }

  body {
    line-height: 1.5;
  }

  img, picture, video, canvas, svg {
    display: block;
    max-inline-size: 100%;
  }

  input, button, textarea, select {
    font: inherit;
    letter-spacing: inherit;
    word-spacing: inherit;
    color: currentColor;
  }

  p, h1, h2, h3, h4, h5, h6 {
    overflow-wrap: break-word;
  }

  ol, ul {
    list-style: none;
  }

  :not([class]) {
    h1&, h2&, h3&, h4&, h5&, h6& {
      margin-block: 0.75em;
      line-height: 1.25;
      text-wrap: balance;
      letter-spacing: -0.05ch;
    }

    p&, ol&, ul& {
      margin-block: 1em;
    }

    ol&, ul& {
      padding-inline-start: 1.5em;
      list-style: revert;
    }

    li& {
      margin-block: 0.5em;
    }
  }
}
First, everything is in a cascade layer called reset:
@layer reset {
/* ... */
}
This sets the precedence of the reset.
Briefly: normal styles (i.e. not !important
) outside of layers have precedence over normal styles in layers.
The order of precedence for layers is determined by the order in which they're declared, with later layers overriding earlier ones.
Generally, resets are placed first in a stylesheet, which means that any other styles will be able to easily override these.
Note that the layer can also be declared at import time rather than as a block:
@import url("reset.css") layer(reset);
Next, removing the margins and padding:
* {
margin: 0;
padding: 0;
}
Josh only removes the margins; I remove the padding as well. There aren't many elements that I use without overriding the built-in padding — and as we'll see later, I end up adding it back in some cases.
Next, media elements:
img, picture, video, canvas, svg {
display: block;
max-inline-size: 100%;
}
Like Josh, I set them to display: block to avoid that weird extra vertical space. Instead of max-width, though, I use max-inline-size: a logical property which controls layout based on the direction of the text. It sets the maximum size along the inline dimension. For horizontal languages like English, that's width, meaning it functions exactly the same as max-width — but for vertical writing modes (such as in some Asian languages) it instead functions as max-height.
Next, form controls:
input, button, textarea, select {
font: inherit;
letter-spacing: inherit;
word-spacing: inherit;
color: currentColor;
}
This is another tweak to Josh's styles. While font: inherit makes these elements inherit the font-family, font-size, etc. from their parent, it doesn't touch the color (which defaults to black) or the letter or word spacing. Setting letter-spacing and word-spacing to inherit makes them inherit the corresponding properties from their parent, while setting color to currentColor makes them also take on their parent's text color.
One notable place where Josh's reset diverges from Eric Meyer's is in resetting list styles:
ol, ul {
list-style: none;
}
I find that I remove the list styles more often than not, so it's included in the reset. Again, we'll later see the list styles reapplied in some cases.
Here's where this reset crosses over into "classless CSS framework". I'm going to start adding back styles to the page, using CSS nesting to wrap everything in this selector:
:not([class]) {
/* ... */
}
All following selectors are nested inside this block, which selects every element without a class.
The idea is that if I add a class to an element, it's probably because I want to customize that element specifically, rather than using the base styles.
The :not([class]) selector lets me add that base set of styles to elements I often use without classes (such as <p> or <li>) without forcing myself to "zero out" those styles if those same elements make sense semantically in a different context.
First up, the headings:
h1&, h2&, h3&, h4&, h5&, h6& {
margin-block: 0.75em;
line-height: 1.25;
text-wrap: balance;
letter-spacing: -0.05ch;
}
The & sigil combines each nested selector with its ancestors. This is equivalent to selecting h1:not([class]), h2:not([class]) and so forth.[3]
Here are the properties, one by one:
- margin-block: 0.75em is another logical property. It sets the margins along the block dimension, which for horizontal languages means top and bottom. It's defined in em, so it scales based on whatever font size the element ends up using.
- line-height: 1.25 sets the leading a little tighter than the rest of the text.
- text-wrap: balance is a new value for the text-wrap property that tries to balance the number of characters on each line.
- letter-spacing: -0.05ch makes the tracking slightly tighter. It's defined in ch, which scales based on the width of the glyph "0" in the element's font family and size.
Next, paragraphs and lists:
p&, ol&, ul& {
margin-block: 1em;
}
Similar to headings, I set the margin along the block dimension in ems to bring back some vertical spacing proportional to the font size. Since most of these elements will be adjacent to others of the same type, 1em might seem like a lot — but margin collapse will prevent the margins from combining.
Next, I bring back the list styles I reset earlier:
ol&, ul& {
padding-inline-start: 1.5em;
list-style: revert;
}
li& {
margin-block: 0.5em;
}
Let's tackle <ol> and <ul> first:
- padding-inline-start: 1.5em adds padding at the beginning of the inline dimension, which for left-to-right languages means left. This indents each list item.
- list-style: revert adds back the default list marker.
Finally, for <li> I add back some margin along the block axis for vertical rhythm.
That's it! As with Josh's reset (and Eric's before his), feel free to use or modify this for your own projects; consider it public domain. If there's anything you think should be added, removed or changed, or any tips I've missed or new features that might improve this, I'd love to hear about it.
- Josh added letter-spacing: inherit to his form elements, to which Dan credited an article by Adrian Roselli. Based on that article, I added both letter-spacing: inherit and word-spacing: inherit.
- Replaced max-width on media elements with max-inline-size.
To be fair to myself, when I started web development in the age of Internet Explorer 6, I was in high school and browser inconsistencies were really bad. At the time, getting websites to work across browsers really did involve taming a bunch of dragons. ↩︎
Keep in mind that for any newer CSS feature, you should check support statistics somewhere like Can I Use before deciding to add it to your own website. ↩︎
Technically, & gets replaced with its ancestor selectors inside of :is(), so :not([class]) { h1& } translates to h1:is(:not([class])). In this case, that's the same thing as h1:not([class]), but that's not the case for every nested selector. ↩︎
2024-10-07 08:00:00
People talk about htmx as though it's saving the web from single-page apps. React has mired developers in complexity (so the story goes) and htmx is offering a desperately-needed lifeline.
htmx creator Carson Gross wryly explains the dynamic like this:
no, this is a Hegelian dialectic:
- thesis: traditional MPAs
- antithesis: SPAs
- synthesis (higher form): hypermedia-driven applications w/ islands of interactivity
Well, I guess I missed the memo, because I used htmx to build a single-page app.
It's a simple proof of concept todo list. Once the page is loaded, there is no additional communication with a server. Everything happens locally on the client.
How does that work, given that htmx is focused on managing hypermedia exchanges over the network?
With one simple trick:[1] the "server-side" code runs in a service worker.
Briefly, a service worker acts as a proxy between a webpage and the wider Internet. It intercepts network requests and allows you to manipulate them. You can alter requests, cache responses to be served offline or even create new responses out of whole cloth without ever sending the request beyond the browser.
That last capability is what powers this single-page app. When htmx makes a network request, the service worker intercepts it. The service worker then runs the business logic and generates new HTML, which htmx then swaps into the DOM.
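Here's roughly what that interception looks like inside the service worker: a minimal sketch, with a placeholder path and placeholder HTML rather than the app's real routes:
self.addEventListener("fetch", event => {
  const url = new URL(event.request.url);
  if (url.pathname.endsWith("/ui")) {
    // The request never reaches the network; the Response is
    // constructed right here in the worker.
    event.respondWith(
      new Response("<p>Rendered in the service worker</p>", {
        headers: { "Content-Type": "text/html" }
      })
    );
  }
});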
There are a couple of advantages over a traditional single-page app built with something like React, too. Service workers must use IndexedDB for storage, which is stateful between page loads. If you close the page and then come back, the app retains your data — this happens "for free", a pit of success consequence of choosing this architecture. The app also works offline, which doesn't come for free but is pretty easy to add once the service worker is set up already.
Of course, service workers have a bunch of pitfalls as well.
One is the absolutely abysmal support in developer tools, which seem to intermittently swallow console.log
and unreliably report when a service worker is installed.
Another is the lack of support for ES modules in Firefox, which forced me to put all my code (including a vendored version of IDB Keyval, which I included because IndexedDB is similarly annoying) in a single file.
This is not an exhaustive list! I would describe the general experience of working with service workers as "not fun".
But! In spite of all that, the htmx single-page app works. Let's dive in!
Let's start with the HTML:
<!DOCTYPE html>
<html>
<head>
<title>htmx spa</title>
<meta charset="utf-8" />
<link rel="stylesheet" href="./style.css" />
<script src="./htmx.js"></script>
<script type="module">
async function load() {
try {
const registration = await navigator.serviceWorker.register("./sw.js");
if (registration.active) return;
const worker = registration.installing || registration.waiting;
if (!worker) throw new Error("No worker found");
worker.addEventListener("statechange", () => {
if (registration.active) location.reload();
});
} catch (err) {
console.error(`Registration failed with ${err}`);
}
}
if ("serviceWorker" in navigator) load();
</script>
<meta name="htmx-config" content='{"scrollIntoViewOnBoost": false}' />
</head>
<body hx-boost="true" hx-push-url="false" hx-get="./ui" hx-target="body" hx-trigger="load"></body>
</html>
This should look familiar if you've ever built a single-page app: the empty husk of an HTML document, waiting to be filled in by JavaScript.
That long inline <script>
tag just sets up the service worker and is mostly stolen from MDN.
The interesting bit here is the <body>
tag, which uses htmx to set up the meat of the app:
- hx-boost="true" tells htmx to use Ajax to swap in the responses of link clicks and form submissions without a full page navigation
- hx-push-url="false" prevents htmx from updating the URL in response to said link clicks and form submissions
- hx-get="./ui" tells htmx to load the page at /ui and swap it in
- hx-target="body" tells htmx to swap the results into the <body> element
- hx-trigger="load" tells htmx that it should do all this when the page loads
So basically: /ui returns the actual markup for the app, at which point htmx takes over any links and forms to make it interactive.
What's at /ui?
Enter the service worker!
It uses a small home-brewed Express-like "library" to handle boilerplate around routing requests and returning responses.
How that library actually works is beyond the scope of this post, but it's used like this:
spa.get("/ui", async (_request, { query }) => {
const { filter = "all" } = query;
await setFilter(filter);
const headers = {};
if (filter === "all") headers["hx-replace-url"] = "./";
else headers["hx-replace-url"] = "./?filter=" + filter;
const html = App({ filter, todos: await listTodos() });
return new Response(html, { headers });
});
When a GET request is made to /ui, this code…
- reads the filter from the query string (defaulting to "all") and saves it with setFilter
- sets an hx-replace-url header so htmx updates the browser URL to match the active filter
- renders the App "component" to HTML with the active filter and list of todos, and returns it as the response
setFilter and listTodos are pretty simple functions that wrap IDB Keyval:
async function setFilter(filter) {
await set("filter", filter);
}
async function getFilter() {
return get("filter");
}
async function listTodos() {
const todos = (await get("todos")) || [];
const filter = await getFilter();
switch (filter) {
case "done":
return todos.filter(todo => todo.done);
case "left":
return todos.filter(todo => !todo.done);
default:
return todos;
}
}
The App
component looks like this:
function App({ filter = "all", todos = [] } = {}) {
return html`
<div class="app">
<header class="header">
<h1>Todos</h1>
<form class="filters" action="./ui">
<label class="filter">
All
<input
type="radio"
name="filter"
value="all"
oninput="this.form.requestSubmit()"
${filter === "all" && "checked"}
/>
</label>
<label class="filter">
Active
<input
type="radio"
name="filter"
value="left"
oninput="this.form.requestSubmit()"
${filter === "left" && "checked"}
/>
</label>
<label class="filter">
Completed
<input
type="radio"
name="filter"
value="done"
oninput="this.form.requestSubmit()"
${filter === "done" && "checked"}
/>
</label>
</form>
</header>
<ul class="todos">
${todos.map(todo => Todo(todo))}
</ul>
<form
class="submit"
action="./todos/add"
method="get"
hx-select=".todos"
hx-target=".todos"
hx-swap="outerHTML"
hx-on::before-request="this.reset()"
>
<input
type="text"
name="text"
placeholder="What needs to be done?"
hx-on::after-request="this.focus()"
/>
</form>
</div>
`.trim();
}
(As before, we'll skip some of the utility functions like html, which just provides some small conveniences when interpolating values.)
App can be broken down into roughly three sections:
- The header, with the filter form that submits to /ui, which re-renders the app using the steps described above. The hx-boost attribute from before intercepts the form submission and swaps the response back into the <body> without refreshing the page.
- The list of todos, each rendered with the Todo component.
- The form for adding a new todo, which submits to /todos/add.[2] hx-target=".todos" tells htmx to replace an element on the page with class todos; hx-select=".todos" tells htmx that rather than using the entire response, it should just use an element with class todos.
route:
async function addTodo(text) {
const id = crypto.randomUUID();
await update("todos", (todos = []) => [...todos, { id, text, done: false }]);
}
spa.get("/todos/add", async (_request, { query }) => {
if (query.text) await addTodo(query.text);
const html = App({ filter: await getFilter(), todos: await listTodos() });
return new Response(html, {});
});
Pretty simple! It just saves the todo and returns a response with the re-rendered UI, which htmx then swaps into the DOM.
Now, let's look at that Todo
component from before:
function Icon({ name }) {
return html`
<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" viewBox="0 0 12 12">
<use href="./icons.svg#${name}" />
</svg>
`;
}
function Todo({ id, text, done, editable }) {
return html`
<li class="todo">
<input
type="checkbox"
name="done"
value="true"
hx-get="./todos/${id}/update"
hx-vals="js:{done: event.target.checked}"
${done && "checked"}
/>
${editable
? html`<input
type="text"
name="text"
value="${text}"
hx-get="./todos/${id}/update"
hx-trigger="change,blur"
autofocus
/>`
: html`<span
class="preview"
hx-get="./ui/todos/${id}?editable=true"
hx-trigger="dblclick"
hx-target="closest .todo"
hx-swap="outerHTML"
>
${text}
</span>`}
<button class="delete" hx-delete="./todos/${id}">${Icon({ name: "ex" })}</button>
</li>
`;
}
There are three main parts here: the checkbox, the delete button and the todo text.
First, the checkbox. It triggers a GET request to /todos/${id}/update every time it's checked or unchecked, with a done query string parameter matching its current state; htmx swaps the full response into the <body>.
Here's the code for that route:
async function updateTodo(id, { text, done }) {
await update("todos", (todos = []) =>
todos.map(todo => {
if (todo.id !== id) return todo;
return { ...todo, text: text || todo.text, done: done ?? todo.done };
})
);
}
spa.get("/todos/:id/update", async (_request, { params, query }) => {
const updates = {};
if (query.text) updates.text = query.text;
if (query.done) updates.done = query.done === "true";
await updateTodo(params.id, updates);
const html = App({ filter: await getFilter(), todos: await listTodos() });
return new Response(html);
});
(Notice that the route also supports changing the todo text. We'll get to that in a minute.)
The delete button is even simpler: it makes a DELETE request to /todos/${id}. As with the checkbox, htmx swaps the full response into the <body>.
Here's that route:
async function deleteTodo(id) {
await update("todos", (todos = []) => todos.filter(todo => todo.id !== id));
}
spa.delete("/todos/:id", async (_request, { params }) => {
await deleteTodo(params.id);
const html = App({ filter: await getFilter(), todos: await listTodos() });
return new Response(html);
});
The final part is the todo text, which is made more complicated by the support for editing the text.
There are two possible states: "normal", which just displays a simple <span> with the todo text (I'm sorry that this isn't accessible!) and "editing", which displays an <input> that allows the user to edit it. The Todo component uses the editable "prop" to determine which state to render.
Unlike in a client-side framework like React, though, we can't just toggle state somewhere and have it make the necessary DOM changes. htmx makes a network request for the new UI, and we need to return a hypermedia response that it can then swap into the DOM.
Here's the route:
async function getTodo(id) {
const todos = await listTodos();
return todos.find(todo => todo.id === id);
}
spa.get("/ui/todos/:id", async (_request, { params, query }) => {
const todo = await getTodo(params.id);
if (!todo) return new Response("", { status: 404 });
const editable = query.editable === "true";
const html = Todo({ ...todo, editable });
return new Response(html);
});
At a high level, the coordination between webpage and service worker looks something like this:
- The user double-clicks one of the <span>s.
- htmx makes a GET request to /ui/todos/${id}?editable=true.
- The service worker responds with a rendered Todo component that includes the <input> rather than the <span>.
- htmx swaps that markup in place of the old todo.
When the user changes the input, a similar process happens, calling the /todos/${id}/update endpoint instead and swapping the whole <body>.
If you've used htmx, this should be a pretty familiar pattern.
That's it! We now have a single-page app built with htmx (and service workers) that doesn't rely on a remote web server. The code I omitted for brevity is available on GitHub.
So, this technically works. Is it a good idea? Is it the apotheosis of hypermedia-based applications? Should we abandon React and build apps like this?
htmx works by adding indirection to the UI, loading new HTML from across a network boundary. That can make sense in a client-server app, because it reduces indirection with regard to the database by colocating it with rendering. On the other hand, the client-server story in a framework like React can be painful, requiring careful coordination between clients and servers via an awkward data exchange channel.
When all interactions are local, though, the rendering and data are already colocated (in memory) and updating them in tandem with a framework like React is easy and synchronous. In this case, the indirection that htmx requires starts to feel more burdensome than liberatory.[3] For fully local apps, I don't think the juice is worth the squeeze.
Of course, most apps aren't fully local — usually, there's a mix of local interactions and network requests. My sense is that even in that case, islands of interactivity is a better pattern than splitting your "server-side" code between the service worker and the actual server.
In any event, this was mostly an exercise to see what it might look like to build a fully local single-page app using hypermedia, rather than imperative or functional programming.
Note that hypermedia is a technique rather than a specific tool.
I chose htmx because it's the hypermedia library/framework du jour, and I wanted to stretch it as far as I could.
There are other tools like Mavo that explicitly focus on this use case, and indeed you can see that the Mavo implementation of TodoMVC is far simpler than what I've built here.
Better still would be some sort of HyperCard-esque app in which you could build the whole thing visually.
All in all, my little single-page htmx todo app was fun to build. If nothing else, take this as a reminder that you can and should occasionally try using your tools in weird and unexpected ways!
React developers hate him! ↩︎
You might notice that the form method is GET rather than POST. That's because service workers in Firefox don't seem to support request bodies, which means we need to include any relevant data in the URL. ↩︎
htmx isn't actually a required component of this architecture.
You could, in theory, build a fully client-side single-page app with no JavaScript at all (outside of the service worker) by simply wrapping every button in a <form>
tag and replacing the full page on every action.
Since the responses all come from the service worker, it would still be lightning fast; you could probably even add in some slick animations using cross-document view transitions. ↩︎
2024-10-01 08:00:00
I just got back from a travel sabbatical. While the trip turned out great, the planning process was decidedly… less so. Figuring out six months of travel is a daunting task, and I quickly became dissatisfied with existing tools.
True to myself, I yak shaved the problem. Introducing Waypoint: a local-first web app for planning trips!
You might be thinking "hey, that looks a lot like that trip planning app Ink & Switch built", and you'd be right: Embark was the single biggest influence on Waypoint. In fact, Embark is even more ambitious — pulling in data like weather forecasts, embedding arbitrary views like calendars and introducing a new formula language for live calculations. I highly recommend reading their writeup! But Ink & Switch didn't make Embark public, and I needed to plan a long trip, so here we are.
I want to talk about three things: the big ideas behind Waypoint, how I actually built it and what I learned.
(Quick disclaimer: Waypoint is not — and probably will never be — production-ready software. I built it to fit my exact needs while planning this trip. There are rough edges, missing features and bugs. There's no authentication. I'm sharing it because I think it's a useful case study in building an actual local-first app, not because I'm trying to dethrone Google Maps.)
I tried a few existing tools before deciding to build my own. Apple Notes was too spartan, Notion and Google Maps were too clunky and Wanderlog was much too structured to use for research and exploration.[1] In every tool, it was either difficult to enter rough, unstructured ideas, or difficult to take those ideas and create a more formal plan.
Waypoint addresses three important shortcomings of other tools, which I'll walk through below.
In short, I wanted an app where I could jot down loose notes about places I was interested in visiting, visualize different routes and gradually narrow it all down into an actual itinerary.
The interface I landed on has two panels: a text editor on the left and a map on the right.
One common task when planning a trip is gathering a list of locations you're interested in visiting.
The simplest solution is using a normal text editor. Data entry is quick; the only real limiting factor is how fast you can type. The obvious drawback is that locations are displayed textually rather than plotted on a map, obscuring any spatial relationship between them.
The only dedicated tool for this that I really know of is Google's My Maps (the neglected stepchild of Google Maps). It nails the spatial visualization criterion. But data entry is awkward and slow; tasks like organizing places into groups require a lot of clicking.
In Waypoint, the main interactive component is a rich text editor.
You use it just as you would Google Docs or Microsoft Word — type notes, add some formatting, cut and paste lines to rearrange your thoughts.
Adding a location is as easy as typing its name, using an @mention-style autocomplete inspired by Embark. Characters show up as quickly as you can type them, and any changes are reflected instantly on the map view beside the document.
Even when apps make data entry easy, that data is often transient, making it difficult to see comparisons.
For example, if you want to see where two locations are relative to each other in Apple or Google Maps, you're forced to use the navigation feature to create a route between them. And only one route is visible at a time — to see a different set of locations, you need to clear the route you're currently looking at. This makes it very difficult to, say, determine which of a group of locations are near each other in order to cluster them on different days of an itinerary.
In Waypoint, every location is plotted on a map, so you always have a bird's eye view of your trip.
To show routes, you can create a "route list" by beginning a line with ~ (just as you would with - for a bulleted list, or 1. for a numbered list). Every location in the list has a route drawn between its marker on the map and the next one.
By default, the routes are the driving directions between the two locations, but you can toggle between that and a straight line by clicking on the location name and unchecking “Navigate”.
It's easy to add, remove and rearrange locations in the route: just use the text editing commands you already know to edit the list, and the map automatically updates! To compare two routes, you can just copy and paste the whole list and rearrange as you see fit.
A bird's eye view is nice, but sometimes you want to "zoom in" on a subset of your work. To accommodate this, Waypoint also includes a focus mode — inspired by iA Writer — which dims all paragraphs other than the one under your text cursor. On the map, Waypoint only shows the locations and routes in that paragraph.
Together, these features enable a powerful workflow: make a route list, copy and paste it below, alter the second list, enable focus mode and move your cursor between the two to quickly see the difference between them. No other tool I tried made this nearly as quick or as easy.
At a glance, Waypoint isn't too different from your average single-page app: a SvelteKit frontend, a rich text editor built on ProseMirror, a map panel, and all of the app's data stored in a CRDT on the client rather than in a database on a server.
Hold up — that last one seems kinda weird?
It's actually the key difference between Waypoint and a traditional single-page app. Rather than storing data on a server using a database like MySQL or Postgres, Waypoint is a local-first app that stores its data on the client using a CRDT.
(Some brief exposition: CRDTs are data structures that can be stored on different computers and are guaranteed to eventually converge upon the same state. For a fuller explanation, check out my article An Interactive Intro to CRDTs, which breaks down the fundamental ideas behind CRDTs and how they work.)
CRDTs are often used to build collaborative experiences like you might see in Google Docs or Figma — except rather than requiring a centralized server to resolve conflicts, the clients can do it themselves. That decentralized sync allows the clients to store the canonical state of the data, rather than a copy fetched from a web server.
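To make that concrete, here's a minimal Yjs sketch (not Waypoint's code; the document and text names are made up) in which two replicas exchange updates directly and converge on the same contents:

import * as Y from "yjs";

// Two independent replicas of the "same" document, e.g. on two laptops
const docA = new Y.Doc();
const docB = new Y.Doc();

// Concurrent edits made while the replicas are disconnected
docA.getText("notes").insert(0, "Visit Lisbon. ");
docB.getText("notes").insert(0, "Visit Porto. ");

// Exchange updates in both directions; no central server is required
Y.applyUpdate(docB, Y.encodeStateAsUpdate(docA));
Y.applyUpdate(docA, Y.encodeStateAsUpdate(docB));

// Both replicas have converged on the same merged text
console.log(docA.getText("notes").toString() === docB.getText("notes").toString()); // true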
This approach confers some important benefits.
That's why this kind of app is called local-first. If you have the app and you have your data, you can still work on it — even if you're not connected to the Internet or the developer has gone out of business.
All of this might seem like overkill for a personal app with a single user. But I was planning this trip with my wife, Sarah, so Waypoint quickly needed realtime collaboration. To address that, Waypoint uses a library called Y-Sweet by Jamsocket.
There are two parts to Y-Sweet:

- @y-sweet/client: an npm package that gets included in the client-side bundle. This package is a Yjs “provider” — a plugin that syncs a Yjs document somewhere.
- The Y-Sweet server: a standalone service that clients connect to, which merges their documents and persists them to S3.[2]

Architecturally, Y-Sweet acts as a bus: clients connect to the Y-Sweet server rather than directly to each other. Whenever a client connects or makes changes, it syncs its local document with the Y-Sweet server. Y-Sweet merges the client's document into its own copy, saves it to S3 and broadcasts updates to other clients. Since CRDTs are guaranteed to eventually converge on the same state, at the end of this process all clients have the same document.
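To give a feel for what a Yjs provider looks like in code, here's a sketch using the generic y-websocket provider as a stand-in (the server URL and room name are made up); @y-sweet/client fills the same role in Waypoint, but I'm not reproducing its exact API here.

import * as Y from "yjs";
import { WebsocketProvider } from "y-websocket";

const ydoc = new Y.Doc();

// Connect the document to a sync server. The provider pushes local updates
// to the server and applies remote updates as they arrive.
const provider = new WebsocketProvider("wss://example.com/sync", "trip-to-portugal", ydoc);

provider.on("status", (event: { status: string }) => {
  console.log("connection status:", event.status); // "connected" or "disconnected"
});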
This also makes it easy to share documents. Each Waypoint document is identified by a UUID. When a user opens a link with a given document's UUID, their Waypoint client connects to Y-Sweet and tries to sync their local copy with Y-Sweet's copy. If that user has never opened that document, they have no local copy, and the sync operation results in them just getting Y-Sweet's copy in its entirety.
Here's a diagram of Waypoint's architecture after introducing Y-Sweet:
One reasonable objection here is that it looks an awful lot like a traditional client-server app — just replace Y-Sweet with a normal application server and S3 with a database. Doesn't that defeat the whole purpose of local-first?
Ink & Switch addresses this in the case study of their Pushpin software:
Thus, in addition to local data storage on each device, the cross-device data synchronisation mechanism should also depend on servers to the least degree possible, and servers should avoid taking unnecessary responsibilities. Where servers are used, we want them to be as simple, generic, and fungible as possible, so that one unavailable server can easily be replaced by another. Further, these servers should ideally not be centralised: any user or organisation should be able to provide servers to serve their needs.
You can think of Y-Sweet as a "cloud peer". Under the hood, it runs plain old stock Yjs — the exact same code that runs on the client. If you connected Waypoint to your own Y-Sweet server, there would be no discernible difference. To borrow Ink & Switch's parlance: it's "simple, generic, and fungible".
Y-Sweet is one of two Yjs providers that Waypoint uses.
The other, called y-indexeddb, takes care of offline editing: it persists the Yjs document to the browser's local IndexedDB storage.
Even if a user gets disconnected from the Internet, edits a document and then closes their browser, none of their work will be lost.
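Hooking that up is pleasantly small. Here's a sketch of wiring y-indexeddb to a Yjs document, with a made-up document name:

import * as Y from "yjs";
import { IndexeddbPersistence } from "y-indexeddb";

const ydoc = new Y.Doc();

// Persist the document to the browser's IndexedDB under a per-document key
const persistence = new IndexeddbPersistence("waypoint-document-demo", ydoc);

// Fires once any previously stored state has been loaded back into the document
persistence.on("synced", () => {
  console.log("document restored from IndexedDB");
});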
A popular question lately: what actually counts as local-first?
My mantra is "if the client has the canonical copy of the data, it's local-first".[3] But Ink & Switch formalizes this with seven proposed ideals, so let's see how Waypoint stacks up.
Five "yes", one "sorta" and one "no". Keep in mind that all the relevant technologies are off-the-shelf; most of these capabilities came for free by choosing Yjs (although any given CRDT library would have worked similarly) and Y-Sweet. Not bad!
Okay, so what did I learn?
Most importantly, local-first is not some pie-in-the-sky dream architecture. Although there are still problems to be worked out,[4] it's very possible to build a useful local-first app, today, with existing tools.
It helps a lot that various libraries in the ecosystem compose well.
Just snapping together ProseMirror, Yjs and Y-Sweet gave me a collaborative rich text editor with shared cursors.
Adding in y-indexeddb made it work offline.
This was all mostly out of the box, with very little setup; the degree to which everything Just Works is impressive.
That said, I think this is a best-case scenario — text editors seem to be the most "plug and play" genre of local-first app. But in general, the building blocks all fit together nicely.
The same can't be said of Svelte — or, presumably, frontend JavaScript frameworks in general — which needed some massaging to work with Yjs.
To determine when to re-render, "reactive" frameworks like Svelte and Solid track property access using Proxies, whereas "immutable" frameworks like React rely on object identity. A Yjs document is a class instance that mutates its internal state, which doesn't play well with either paradigm. To have Svelte re-render when the document changed, I had to trick it into invalidating its state:
import { Doc as YDoc } from "yjs";

let ydoc = $state(new YDoc());

// HACK: the Yjs doc is mutated internally, so we need to manually invalidate the reactive variable
let outline = $state(ydoc.getXmlFragment("outline"));
ydoc.on("update", () => {
  outline = undefined;
  outline = ydoc.getXmlFragment("outline");
});
Even so, in a lot of ways the developer experience was still much better than in a traditional single-page app. Here's (roughly) the code to update the document title:
let title = $state("" + ydoc.getText("title"));

function setTitle(next: string) {
  // Replace the contents of the shared Y.Text with the new title
  const text = ydoc.getText("title");
  text.delete(0, text.length);
  text.insert(0, next);
  title = next;
}
Sure, there's some weird CRDT-related boilerplate, but still: no async function, no try…catch, no worrying about the server. I just set the title and move on with my life; Yjs will worry about syncing it in the background.
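For contrast, here's roughly what the same update might look like in a hypothetical client-server version of the app; the /api endpoint is invented for illustration:

async function setTitle(next: string) {
  try {
    // Persist the new title to an imaginary application server
    const res = await fetch("/api/trips/123/title", {
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ title: next }),
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    title = next;
  } catch (err) {
    // show an error toast, queue a retry, etc.
  }
}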
That might sound like magic, but I think it's just a natural consequence of a fundamentally better abstraction. Using a local-first architecture rather than client-server promises to dramatically simplify single-page apps.
I was dreading adding offline support, but it turned out to be surprisingly easy.
SvelteKit supports service workers out of the box, and the documentation even provides some example code as a starting point.
It wasn't perfect, but it got me probably 95% of the way there — I could load any document I'd already opened, even without an Internet connection.
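For reference, a minimal SvelteKit service worker along the lines of the documentation's example looks something like this; it's a sketch, not Waypoint's exact worker:

// src/service-worker.js
import { build, files, version } from "$service-worker";

const CACHE = `cache-${version}`;
const ASSETS = [...build, ...files];

self.addEventListener("install", (event) => {
  // Cache the built app shell and static files so the app loads offline
  event.waitUntil(caches.open(CACHE).then((cache) => cache.addAll(ASSETS)));
});

self.addEventListener("activate", (event) => {
  // Drop caches left over from previous deployments
  event.waitUntil(
    caches.keys().then((keys) =>
      Promise.all(keys.filter((key) => key !== CACHE).map((key) => caches.delete(key)))
    )
  );
});

self.addEventListener("fetch", (event) => {
  // Serve cached responses when available, falling back to the network
  event.respondWith(
    caches.match(event.request).then((cached) => cached ?? fetch(event.request))
  );
});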
And as far as saving edits made offline, integrating y-indexeddb took one single line of code.
I hope you enjoyed this! I had a lot of fun building Waypoint. This was my first hands-on foray into the local-first ecosystem, and it turned out to be a lot smoother than I anticipated.
If you want to see the code behind this explanation, you can find it on GitHub.
We did end up using Wanderlog once we had our itinerary. The mobile app is buggy, but its ability to automatically import tickets and confirmations from our email and save them for offline access was incredibly useful. ↩︎
This is a slight oversimplification describing the managed version of Y-Sweet. For Waypoint, I self-hosted Y-Sweet, which involves running the Y-Sweet server on a Cloudflare Worker and using a small server-side library within SvelteKit to negotiate the connection. ↩︎
Technically, if the client has the canonical copy of the data but never sends it over the network, it's not really local-first — just local. I explore this dynamic more in The Website vs. Web App Dichotomy Doesn't Exist. ↩︎
One such problem is access control, which I did not attempt to address with Waypoint. Ink & Switch has an ongoing project called Beehive exploring approaches to solving this. ↩︎