2025-08-21 08:00:00
Jujutsu ("jj") sits atop a Git repository and its commands mostly map onto Git operations; for example, a jj commit is a Git commit.
When collaborating with others via Git you push and pull branches. jj instead has a feature called "bookmarks", which are the mechanism for working with Git branches but which behave fairly differently from Git branches.
This post goes into the why and how of using bookmarks for Git collaboration.
Part of jj's whole deal is that it collapses many Git concepts (stashes, staging, fixups, in-progress rebases, conflicts) into a single unified model of working with history, which then lets you use the same tools to do all of those things. For example, to fix up an old commit you jump to it, edit it, and jump back to where you were; to fix a rebase conflict you jump to the conflicting commit, edit it, and jump back to where you were, using the same commands.
All this jumping around means that the Git idea of being "on" a particular branch does not make sense in jj. When working on a change I might stop part way through doing one thing, start a different thing based on a commit a few steps back, possibly reshuffle commits around, and have a few extra commits on the side with experiments lingering around as well. Based on my former Git expertise I might have done this kind of thing by making a bunch of Git stashes and branches.
Instead, in jj when you work you are "on" a commit, and when you switch you switch between commits, not branches. After a year of using jj I can assure you that not having branch names for these has worked out just fine.
Like a Git branch, a jj bookmark is a name that points to a commit, and there are the commands you'd expect to create/delete/rename and move bookmarks around. Unlike Git branches, bookmarks are fixed to a commit unless you manually move them; when you create new commits jj does not automatically move bookmarks around.
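For concreteness, here's a sketch of the basic bookmark management commands (these are real jj subcommands; the bookmark name and revisions are made up):

```shell
$ jj bookmark create my-feature -r @   # name the current commit
$ jj bookmark set my-feature -r @-     # move it to point at the parent
$ jj bookmark rename my-feature wip    # rename it
$ jj bookmark delete wip               # delete it
$ jj bookmark list                     # show bookmarks, local and remote
```

Note that after any of these, new commits you make will not drag the bookmark along; it stays put until you move it again.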
In my experience with jj, I have had no use for bookmarks other than for interacting with Git. In principle you could use them to make note of important commits, which I suppose is where the name comes from. Maybe other people have different workflows.
In a colocated jj/Git repository (which is the normal way to use jj), bookmarks are 1:1 with Git branches: creations/modifications/etc via either system are reflected in the other.
After cloning a Git repository, jj creates "remote" bookmarks with names like main@origin. These are immutable and represent the state of the remote repository. You could also make a local bookmark named main that is wholly independent. But on a fresh clone, the local bookmark main is marked as tracking main@origin. Conceptually this is similar to Git's notion of an "upstream" branch, but with different behavior.
Suppose main is a tracking bookmark. jj attempts to keep it in sync with main@origin:

- On jj git push, if main is ahead of main@origin, jj pushes the changes. (When you're in a state where a push would make a change, jj status shows the bookmark name as main*.)
- On jj git fetch, jj updates main@origin as well as your local main if it's behind.

If after a fetch the two sides diverge (both contain commits), then the local main will be marked as conflicting and point to both commits. This displays in status as main??. You will need to manually choose where it points with jj bookmark set main -r ... to fix it before using it again.
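Concretely, resolving such a conflict looks something like this (the change ID here is made up):

```shell
# After a divergent fetch, jj status shows the bookmark as main??.
# Point it at whichever commit you want to keep:
$ jj bookmark set main -r qpvuntsm
# Or keep the remote's version instead:
$ jj bookmark set main -r main@origin
```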
At least for me this was super weird at first, but now makes so much sense that I cannot remember why I was confused. I think the right way to think about it is that a tracking bookmark is modeling "what I intend this bookmark to be, both locally and remotely" and the jj push/fetch commands keep that in sync.
As distinct from Git, note there are no separate "fetch" and "pull" commands. (A historical note: both a "fetch" and a "pull" command existed in Git and Mercurial. They agreed that one meant "download the changes" and the other meant "do that and also merge them", but they flipped which was which!)
If you are making changes locally and just want to push them to main, you must update the bookmark before pushing, with a command like jj bookmark set main -r @. This is currently the clunkiest part of jj, and there have been conversations in the project about how to improve it. If you search for jj tug online you will see a common alias people set up to automate this.
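For reference, here is one common variant of that alias (jj aliases live in your jj config; the exact revset people use varies, so treat this as a sketch):

```toml
# In jj's config (e.g. ~/.config/jj/config.toml): "jj tug" drags the
# nearest ancestor bookmark up to the parent of the working copy.
[revset-aliases]
'closest_bookmark(to)' = 'heads(::to & bookmarks())'

[aliases]
tug = ["bookmark", "move", "--from", "closest_bookmark(@-)", "--to", "@-"]
```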
If you are comfortable with Git push syntax, an alternative I use for when I just want to push my code is to tell Git exactly what I want to push and where to put it:
$ git push origin SOMEHASH:main
Note this is plain git push, no jj or any bookmarks involved.
If you want to push a bookmark/branch for someone else to review or pull, the commands are:
$ jj bookmark create some-name
$ jj git push
(The second command will complain that some-name does not exist remotely, and then tell you how to fix it. There are flags for specifying which remote to push to, etc.)
Typically in jj you won't have bookmark names ready when you're sending off code reviews. To simplify things jj can generate a bookmark name for you as it pushes.
$ jj git push -c @
Creating bookmark push-sytrsqlnznzr for revision sytrsqlnznzr
Changes to push to origin:
Add bookmark push-sytrsqlnznzr to 5865f9673d0f
This is my primary workflow when working on GitHub, even solo. Pushing changes in a pull request lets the CI run over it.
jj treats history as mutable, making it natural to edit and reorder commits as you work. When collaborating with others, modifying history can be confusing or dangerous.
jj has a notion of "immutable" commits, which is the part of history that should not be modified. In the default configuration this effectively means code that has been pushed to Git cannot be modified, with the exception of code on tracked bookmarks. This means you can continue to modify a branch after pushing it, for example in response to code reviews; the next push will update it.
There are further safety checks around things like not letting you move a branch backwards (because that would trim off the later commits). In practice I don't understand all the rules, and sometimes it will prompt me to pass a flag to say "I really do mean to do this". It has been fine so far.
(This final section is trivia and only interesting because I came to understand it when writing this post.)
jj commands that accept commits take a "revset" argument, which is the little language for specifying commits. For example you can say jj diff -r @- to see the diff of the previous commit; the @- expression means "parent of the current commit". As the name suggests, revsets can refer to sets of commits. (jj really ought to pick either "commit" or "revision" for talking about things; it's confusing to have these as synonyms!)
When a bookmark is conflicting, the revset it names refers to multiple commits: the commit you had locally and the commit seen remotely. Meanwhile, note that the way to create a merge in jj is to create a commit with multiple parents, jj new parent1 parent2 ....
Putting these together, with a conflicting main?? bookmark, you can do:

- jj diff -r main to show a diff of what the merge of the two commits would look like
- jj new all:main to create a merge commit of the two commits (where the "all:" prefix means something like "I really do mean for this to refer to multiple commits"; it looks like they're still figuring out how this should work)

I have never had a reason to need this trivia but it is kind of neat to see how these pieces fit together.
2025-08-09 08:00:00
I contributed a minor feature to the Jujutsu version control system, which I wrote about previously.
When you run diff --stat in Git, it shows you a summary of your change as a list of modified files and counts of added and removed lines for each modified file. For binary files, Git displays the difference in byte size. Here's an example commit where I grew a .dll file:
commit 9649ab9bf70c92a1ebe2ac39b4d2ef86b1de37b9
Author: Evan Martin <[email protected]>
Date: Thu Oct 17 11:56:28 2024 -0700
dinput: more stubs
win32/dll/dinput.dll | Bin 2560 -> 3584 bytes
win32/src/winapi/dinput/builtin.rs | 48 ++++++++++++++++++++++++++++++++++++++++++++----
win32/src/winapi/dinput/dinput.rs | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 95 insertions(+), 6 deletions(-)
Jujutsu has the same feature except it didn't handle binary files: it would just count the number of 0x0a bytes in the file, which is not very useful. So I fixed that.
This is a very minor feature but it turned out to be more subtle than I expected, for one main reason: the above output is sized to make each line fit the terminal width, which means truncating the file names if they are too long and scaling the graph on the right to fit. You end up needing to measure all the relevant text carefully, and to watch out for rounding as well as underflowing zero (e.g. if the terminal is too narrow to fit the filename at all).
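As a sketch of the kind of arithmetic involved (my illustration, not jj's actual code): scaling a file's change count down to a fixed graph width, rounding to nearest, and never letting a nonzero change round down to an empty bar.

```rust
// Illustration only, not jj's code: scale `changes` so that the file
// with `max_changes` fills `graph_width` columns of the stat graph.
fn scale_graph(changes: usize, max_changes: usize, graph_width: usize) -> usize {
    if changes == 0 || max_changes == 0 {
        return 0;
    }
    // Add half the divisor before dividing so we round to nearest
    // instead of truncating toward zero.
    let scaled = (changes * graph_width + max_changes / 2) / max_changes;
    // A nonzero change should still show at least one column.
    scaled.max(1)
}
```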
Here are some minor notes.
Slightly different output: Git shows 2560 -> 3584 bytes, but after discussion in the PR about whether to show plain byte counts or pretty-print the numbers, I convinced myself that the other lines in diff --stat output only show the magnitude of the change and not the before/after. So my output looks like (binary) +1024 bytes. This means you can't tell a grown file from a fully added or removed file, but that was already true for text files, and that's never bothered me in my years of using Git.
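In code, the chosen format is roughly this (my sketch, not the actual jj function; an unchanged size renders as plain (binary), matching the test snapshot later in the post):

```rust
// Sketch of the output format: show only the magnitude of the size
// change, not the before/after sizes.
fn format_binary_stat(before: u64, after: u64) -> String {
    if after == before {
        "(binary)".to_string()
    } else if after > before {
        format!("(binary) +{} bytes", after - before)
    } else {
        format!("(binary) -{} bytes", before - after)
    }
}
```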
expect tests: I had learned about "expect tests" from this Jane Street blog post. From the post it sounded like a great feature but it was for OCaml only, so I never tried it. I was delighted to discover that Jujutsu uses them via Insta, a Rust library that provides a similar thing.
In the test for my change it runs jj diff --stat and asserts what the output looks like, as follows. The cool thing about Insta is that I didn't need to hand-update this text; instead it can run the test and interactively step through which outputs differ, and for the changes I accept it automatically inserts them back into the code.
let output = work_dir.run_jj(["diff", "--stat"]);
// Rightmost display column ->|
insta::assert_snapshot!(output, @r"
binary_added.png | (binary) +12 bytes
binary_modified.png | (binary)
...fied_to_text.png | (binary) -8 bytes
binary_removed.png | (binary) -16 bytes
...y_valid_utf8.png | (binary) +3 bytes
5 files changed, 0 insertions(+), 0 deletions(-)
[EOF]
");
(The idea of expect tests is deeper than just textual command output! Read the original blog post for more.)
Colored output: When generating textual output, Jujutsu tags substrings with keywords like added or binary, which then feed into an outer system that assigns colors to these semantic categories. This is a neat mechanism to keep colors consistent across different commands while allowing for customization. In particular, if you customize the output of other commands like log, you'll interact with these.
Rust build output is massive: This is my first tinkering with Jujutsu, but over the ~two months that I worked on this, my target/ dir (containing Rust build output) grew to over 25 GB. Jeepers. I think it was maybe intermediate outputs of various libraries whose versions themselves varied over that time period?
PS: I didn't actually work on it for two months! I worked on it for a couple of hours, forgot about it, picked it up again some weeks later, and then repeated that a few times.
Double width characters: File names can be Unicode, and even in a terminal some Unicode characters (particularly Chinese) are supposed to occupy two columns. This means that to measure the width of a filename and properly elide it with ... you need not only Unicode character handling, but also data tables about which codepoints are double-width.
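To make the problem concrete, here is a toy version of width-aware elision. The double-width check is deliberately simplified to a few hard-coded ranges; real code (the code in jj, which I did not touch) consults full Unicode East Asian Width tables, e.g. via the unicode-width crate.

```rust
// Simplified: treat a handful of CJK-ish ranges as double-width.
// Real implementations use complete Unicode width tables.
fn char_cols(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F | 0x2E80..=0xA4CF | 0xAC00..=0xD7A3
        | 0xF900..=0xFAFF | 0xFF00..=0xFF60 => 2,
        _ => 1,
    }
}

fn display_width(s: &str) -> usize {
    s.chars().map(char_cols).sum()
}

// Elide from the front ("...name") so the end of the path survives.
fn elide(name: &str, max_cols: usize) -> String {
    if display_width(name) <= max_cols {
        return name.to_string();
    }
    let mut chars: Vec<char> = name.chars().collect();
    while !chars.is_empty()
        && 3 + chars.iter().copied().map(char_cols).sum::<usize>() > max_cols
    {
        chars.remove(0);
    }
    format!("...{}", chars.into_iter().collect::<String>())
}
```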
This code was already all implemented and I did not touch it, but I mostly note that even a pretty basic thing like "shorten a filename to make the text align on the terminal" quickly becomes a whole project if you try to do it thoroughly.
Commit access: After writing a few PRs, they granted me access to merge my own changes. Pretty cool thing to do for a first-time contributor! I expect the repo is set up to refuse force-pushes so I suppose if I mess things up they can always fix it.
Future work: When writing this blog post I looked at the output a bit more carefully and noticed there is yet more aligning to be done.
2025-05-25 08:00:00
This post is part of a series on retrowin32.
The Rust compiler compiles code in parallel. But the unit of caching is the crate — a concept larger than a module, which corresponds maybe to a library in the C world or a package in the JS world. A typical program is a single crate. This means every time you run the compiler, it compiles all the code from scratch. To improve build performance, you can split a program into multiple crates, in the hope that each compile can reuse the crates you didn't modify.
retrowin32 was already arranged as a few crates along some obvious boundaries. The x86 emulator, the win32 implementation, the native and web targets were each separate. But the win32 implementation and the underlying system were necessarily pretty tangled, because (among other things) x86 code calls win32 functions which might need to call back into x86 code.
This meant any change to the win32 implementation recompiled a significant quantity of code. This post is about how I managed to split things up further, with one crate per Windows library. retrowin32 now has crates like builtin-gdi32 and builtin-ddraw that implement those pieces of Windows, and they can now compile and cache in parallel (mostly).
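The resulting layout can be sketched as a Cargo workspace (a hypothetical fragment; the crate names come from the post, but the real manifest surely differs):

```toml
[workspace]
members = [
    "win32",          # ties everything together; depends on all the builtins
    "builtin-gdi32",  # each builtin depends only on the shared System trait
    "builtin-ddraw",
]
```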
Going in, there was a god object Machine that held both the CPU emulator (e.g. the state of the registers) and the rest of the system (e.g. memory and kernel state). When the Machine emulated its way to a win32 function call (as described in the syscalls post), it passed itself to the target, which would allow it to poke at system state and potentially call back into further emulation.
For example, the Windows CreateWindow API creates a window, and as part of that process it synchronously "sends" the WM_CREATE message, which concretely means that within CreateWindow we invoke the window procedure and hand control back to the emulated code.
You cannot have cycles between crates, so this cycle meant we had to put Machine and all the win32 implementation in one single crate. The fix, as with most computer science problems, is adding a layer of abstraction.
A new shared crate defines a System trait, which is the interface expressing "things from the underlying system that a win32 function implementation might need to call". This is then passed to win32 APIs and implemented by Machine, allowing us to compile win32 functions as separate crates, each depending only on the definition of System.
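A minimal sketch of that shape (the method names here are invented for illustration; retrowin32's real System trait differs):

```rust
// Shared crate: the only thing builtin DLL crates depend on.
trait System {
    /// Call back into emulated user code (e.g. a window procedure).
    fn call_x86(&mut self, addr: u32, args: &[u32]) -> u32;
}

// Top-level crate: the concrete emulator implements the trait.
struct Machine {
    // registers, memory, kernel state...
}

impl System for Machine {
    fn call_x86(&mut self, _addr: u32, _args: &[u32]) -> u32 {
        // run the emulator until the call returns (stubbed here)
        0
    }
}

// A builtin crate (e.g. builtin-user32) sees only `&mut dyn System`,
// so it can re-enter user code without ever naming Machine.
fn create_window(sys: &mut dyn System, wndproc: u32) -> u32 {
    // ...create window state, then synchronously deliver WM_CREATE:
    sys.call_x86(wndproc, &[]);
    1 // a made-up window handle
}
```

Because the trait cuts the dependency edge from the builtins back to Machine, each builtin crate compiles against only the shared crate.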
One interesting consequence of this layout is that the win32 implementation no longer directly depends on any emulator at all, as long as the System interface exposes some way to invoke user code. You could hypothetically imagine a retrowin32 that runs on native 32-bit x86, or alternatively one that lets you port a Windows program that you have source for to a non-x86 platform, like winelib.
I mentioned above that Machine also holds system state. For example, gdi32 implements the drawing API, which provides functions that vend handles to device contexts. The new gdi32 library enabled by the System interface can declare what state it needs, but we must store that state somewhere.
Further, there are interdependencies between these various Windows libraries. user32, which handles windows and messaging, needs to use code from gdi32 to implement drawing upon windows. But the winmm crate, which implements audio, is independent of those.
One obvious way — the way I imagine it might work in real Windows — is for this state to be held in per-library static globals. I came up with a different solution that is a little strange so I thought I would write it down and see if any reader has a name for it or a better way.
To restate the problem: there's a core Machine type that depends on all the libraries and holds all the program state, but we want to be able to build each library independently, possibly with interdependencies between them, without them holding a dependency on Machine itself.
The answer is for the ouroboros-breaking System trait to expose a dynamically-typed "get my state by its type" function:
fn state(&self, id: &std::any::TypeId) -> &dyn std::any::Any;
Each library, e.g. gdi32, can register its state (a gdi32::State, perhaps) and fetch it when needed from the system. This way a library like user32 can call gdi32, and both of them can access their own internal state off of the shared state object.
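A toy version of the mechanism (my reconstruction, not retrowin32's real code, but the TypeId trick is the same idea in miniature):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// A type-keyed bag of per-library state: each library's State struct
// is stored and retrieved by its own TypeId.
#[derive(Default)]
struct StateMap {
    states: HashMap<TypeId, Box<dyn Any>>,
}

impl StateMap {
    fn register<T: Any>(&mut self, state: T) {
        self.states.insert(TypeId::of::<T>(), Box::new(state));
    }
    fn get<T: Any>(&mut self) -> &mut T {
        self.states
            .get_mut(&TypeId::of::<T>())
            .and_then(|b| b.downcast_mut::<T>())
            .expect("state not registered")
    }
}

// A library defines its own state type, opaque to everyone else.
struct Gdi32State {
    next_handle: u32,
}

// A gdi32 function fetches its state by type and vends a handle.
fn create_dc(states: &mut StateMap) -> u32 {
    let gdi = states.get::<Gdi32State>();
    let h = gdi.next_handle;
    gdi.next_handle += 1;
    h
}
```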
It's maybe just a static with extra steps. I'm not sure yet if I like it.
Most of the win32 API is now in separate crates. (The remaining piece is kernel32, which is the lowest-level piece and will need some more work to pull apart.)
Here's a waterfall of the part of the build that involves these separate crates:
Per xkcd this probably won't save me time overall, but at least I don't have to wait as long when I'm repeatedly cycling.
At the bottom you see the final win32 crate that ties everything together. This one is still too slow (possibly due to kernel32), but it's better than before!