2025-03-07 08:00:00
Hello! Today I want to talk about ANSI escape codes.
For a long time I was vaguely aware of ANSI escape codes (“that’s how you make text red in the terminal and stuff”) but I had no real understanding of where they were supposed to be defined or whether or not there were standards for them. I just had a kind of vague “there be dragons” feeling around them. While learning about the terminal this year, I’ve learned that:
So I wanted to put together a list for myself of some standards that exist around escape codes, because I want to know if they have to feel unreliable and frustrating, or if there’s a future where we could all rely on them with more confidence.
Have you ever pressed the left arrow key in your terminal and seen ^[[D?
That’s an escape code! It’s called an “escape code” because the first character is the “escape” character, which is usually written as ESC, \x1b, \E, \033, or ^[.
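To see that these are all spellings of the same byte, here’s a small sketch. It assumes your cat supports the -v flag (GNU and BSD cat both do), and show_left_arrow is just a name I made up for the example:

```shell
# \033 is the octal spelling of the escape character (0x1b).
# cat -v shows control characters in caret notation, so ESC appears as ^[
show_left_arrow() {
    printf '\033[D' | cat -v
    echo
}

show_left_arrow   # prints ^[[D
```

The output is the same ^[[D you see when a program isn’t expecting the left arrow key.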
Escape codes are how your terminal emulator communicates various kinds of information (colours, mouse movement, etc) with programs running in the terminal. There are two kinds of escape codes:
ESC[D, “Ctrl+left arrow” might be ESC[1;5D, and clicking the mouse might be something like ESC[M :3.
Now let’s talk about standards!
The first standard I found relating to escape codes was ECMA-48, which was originally published in 1976.
ECMA-48 does two things:
ESC[ + something and “OSC” codes, which are ESC] + something)
ESC[D, or “turn text red” is ESC[31m. In the spec, the “cursor left” one is called CURSOR LEFT and the one for changing colours is called SELECT GRAPHIC RENDITION.
The formats are extensible, so there’s room for others to define more escape codes in the future. Lots of escape codes that are popular today aren’t defined in ECMA-48: for example it’s pretty common for terminal applications (like vim, htop, or tmux) to support using the mouse, but ECMA-48 doesn’t define escape codes for the mouse.
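Here’s a minimal sketch of the two ECMA-48 codes mentioned above, CURSOR LEFT and SELECT GRAPHIC RENDITION. The function names red and cursor_left_demo are made up for this example, and it assumes a terminal that implements these codes (most do):

```shell
# ESC[31m is SELECT GRAPHIC RENDITION with parameter 31 (red foreground),
# and ESC[0m is SELECT GRAPHIC RENDITION with parameter 0 (back to default).
red() {
    printf '\033[31m%s\033[0m\n' "$1"
}

# ESC[2D is CURSOR LEFT with parameter 2: after printing "abc" the cursor
# moves back two columns, so the "X" overwrites the "b".
cursor_left_demo() {
    printf 'abc\033[2DX\n'
}

red "this should be red"
cursor_left_demo   # displays aXc in a terminal
```

If you pipe either function through cat -v you can see the raw codes instead of their effect.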
There are a bunch of escape codes that aren’t defined in ECMA-48, for example:
I believe (correct me if I’m wrong!) that these and some others came from xterm, are documented in XTerm Control Sequences, and have been widely implemented by other terminal emulators.
This list of “what xterm supports” is not a standard exactly, but xterm is extremely influential and so it seems like an important document.
In the 80s (and to some extent today, but my understanding is that it was MUCH more dramatic in the 80s) there was a huge amount of variation in what escape codes terminals actually supported.
To deal with this, there’s a database of escape codes for various terminals called “terminfo”.
It looks like the standard for terminfo is called X/Open Curses, though you need to create an account to view that standard for some reason. It defines the database format as well as a C library interface (“curses”) for accessing the database.
For example you can run this bash snippet to see every possible escape code for “clear screen” for all of the different terminals your system knows about:
# Print the "clear screen" escape code for every terminal in the terminfo database
for term in $(toe -a | awk '{print $1}')
do
  echo "$term"
  infocmp -1 -T "$term" 2>/dev/null | grep 'clear=' | sed 's/clear=//g;s/,//g'
done
On my system (and probably every system I’ve ever used?), the terminfo database is managed by ncurses.
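For a single terminal, the usual way to query the terminfo database from a shell script is tput, which ships with ncurses. Here’s a rough sketch that asks terminfo for the “red text” code and falls back to a hardcoded sequence if that doesn’t work. The helper names red_start and style_reset are made up, and treating TERM=dumb as “print no escape codes” is my assumption about reasonable behaviour, not a standard:

```shell
# Print the "start red text" code for the current terminal.
red_start() {
    # Respect TERM=dumb (or an unset TERM): emit no escape codes at all.
    if [ "${TERM:-dumb}" = "dumb" ]; then
        return 0
    fi
    # Ask terminfo first: "setaf 1" means "set ANSI foreground colour 1".
    if command -v tput >/dev/null 2>&1 && tput setaf 1 2>/dev/null; then
        return 0
    fi
    # Otherwise fall back to the hardcoded sequence.
    printf '\033[31m'
}

# Print the "reset all styling" code, with the same TERM=dumb check.
style_reset() {
    [ "${TERM:-dumb}" = "dumb" ] || printf '\033[0m'
}

printf '%s\n' "$(red_start)red text$(style_reset)"
```

On most modern terminals both branches end up printing the same bytes, which is part of why hardcoding is tempting.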
I think it’s interesting that there are two main approaches that applications take to handling ANSI escape codes:
TERM environment variable. Fish does this, for example.
Some examples of programs/libraries that take approach #2 (“don’t use terminfo”) include:
I got curious about why folks might be moving away from terminfo and I found this very interesting and extremely detailed rant about terminfo from one of the fish maintainers, which argues that:
[the terminfo authors] have done a lot of work that, at the time, was extremely important and helpful. My point is that it no longer is.
I’m not going to do it justice so I’m not going to summarize it, I think it’s worth reading.
I was just talking about the idea that you can use a “common set” of escape codes that will work for most people. But what is that set? Is there any agreement?
I really do not know the answer to this at all, but from doing some reading it seems like it’s some combination of:
and maybe ultimately “identify the terminal emulators you think your users are going to use most frequently and test in those”, the same way web developers do when deciding which CSS features are okay to use
I don’t think there are any resources like Can I use…? or Baseline for the terminal though. (in theory terminfo is supposed to be the “caniuse” for the terminal but it seems like it often takes 10+ years to add new terminal features when people invent them which makes it very limited)
I also asked on Mastodon why people found terminfo valuable in 2025 and got a few reasons that made sense to me:
TERM environment variable to control how programs behave (for example with TERM=dumb), and there’s no standard for how that should work in a post-terminfo world
The way that ncurses uses the TERM environment variable to decide which escape codes to use reminds me of how webservers used to sometimes use the browser user agent to decide which version of a website to serve.
It also seems like it’s had some of the same results – the way iTerm2 reports itself as being “xterm-256color” feels similar to how Safari’s user agent is “Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15”. In both cases the terminal emulator / browser ends up changing its user agent to get around user agent detection that isn’t working well.
On the web we ended up deciding that user agent detection was not a good practice and to instead focus on standardization so we can serve the same HTML/CSS to all browsers. I don’t know if the same approach is the future in the terminal though – I think the terminal landscape today is much more fragmented than the web ever was as well as being much less well funded.
A few more documents and standards related to escape codes, in no particular order:
I sometimes see people saying that the unix terminal is “outdated”, and since I love the terminal so much I’m always curious about what incremental changes might make it feel less “outdated”.
Maybe if we had a clearer standards landscape (like we do on the web!) it would be easier for terminal emulator developers to build new features and for authors of terminal applications to more confidently adopt those features so that we can all benefit from them and have a richer experience in the terminal.
Obviously standardizing ANSI escape codes is not easy (ECMA-48 was first published almost 50 years ago and we’re still not there!). I don’t even know what all of the challenges are. But the situation with HTML/CSS/JS used to be extremely bad too and now it’s MUCH better, so maybe there’s hope.
2025-02-13 20:27:56
I was talking to a friend about how to add a directory to your PATH today. It’s
something that feels “obvious” to me since I’ve been using the terminal for a
long time, but when I searched for instructions for how to do it, I actually
couldn’t find something that explained all of the steps – a lot of them just
said “add this to ~/.bashrc”, but what if you’re not using bash? What if your
bash config is actually in a different file? And how are you supposed to figure
out which directory to add anyway?
So I wanted to try to write down some more complete directions and mention some of the gotchas I’ve run into over the years.
Here’s a table of contents:
If you’re not sure what shell you’re using, here’s a way to find out. Run this:
ps -p $$ -o pid,comm=
97295 bash
97295 zsh
($$ isn’t valid syntax in fish, but in any case the error message tells you that you’re using fish, which you probably already knew)
Also bash is the default on Linux and zsh is the default on Mac OS (as of 2024). I’ll only cover bash, zsh, and fish in these directions.
~/.zshrc
~/.bashrc, but it’s complicated, see the note in the next section
~/.config/fish/config.fish (you can run echo $__fish_config_dir if you want to be 100% sure)
Bash has three possible config files: ~/.bashrc, ~/.bash_profile, and ~/.profile.
If you’re not sure which one your system is set up to use, I’d recommend testing this way:
echo hi there to your ~/.bashrc
~/.bashrc is being used! Hooray!
~/.bash_profile
~/.profile if the first two options don’t work.
(there are a lot of elaborate flow charts out there that explain how bash decides which config file to use but IMO it’s not worth it to internalize them and just testing is the fastest way to be sure)
Let’s say that you’re trying to install and run a program called http-server and it doesn’t work, like this:
$ npm install -g http-server
$ http-server
bash: http-server: command not found
How do you find what directory http-server is in? Honestly in general this is
not that easy – often the answer is something like “it depends on how npm is
configured”. A few ideas:
cargo, npm, homebrew, etc), when you first set it up it’ll print out some directions about how to update your PATH. So if you’re paying attention you can get the directions then.
PATH for you
npm config get prefix (then append /bin/)
go env GOPATH (then append /bin/)
asdf info | grep ASDF_DIR (then append /bin/ and /shims/)
Once you’ve found a directory you think might be the right one, make sure it’s actually correct! For example, I found out that on my machine, http-server is in ~/.npm-global/bin. I can make sure that it’s the right directory by trying to run the program http-server in that directory like this:
$ ~/.npm-global/bin/http-server
Starting up http-server, serving ./public
It worked! Now that you know what directory you need to add to your PATH, let’s move to the next step!
Now we have the 2 critical pieces of information we need:
~/.npm-global/bin/)
~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish)
Now what you need to add depends on your shell:
bash instructions:
Open your shell’s config file, and add a line like this:
export PATH=$PATH:~/.npm-global/bin/
(obviously replace ~/.npm-global/bin with the actual directory you’re trying to add)
zsh instructions:
You can do the same thing as in bash, but zsh also has some slightly fancier syntax you can use if you prefer:
path=(
  $path
  ~/.npm-global/bin
)
fish instructions:
In fish, the syntax is different:
set PATH $PATH ~/.npm-global/bin
(in fish you can also use fish_add_path, some notes on that further down)
Now, an extremely important step: updating your shell’s config won’t take effect if you don’t restart it!
Two ways to do this:
bash to start a new shell (or zsh if you’re using zsh, or fish if you’re using fish)
I’ve found that both of these usually work fine.
And you should be done! Try running the program you were trying to run and hopefully it works now.
If not, here are a couple of problems that you might run into:
If the wrong version of a program is running, you might need to add the directory to the beginning of your PATH instead of the end.
For example, on my system I have two versions of python3 installed, which I can see by running which -a:
$ which -a python3
/usr/bin/python3
/opt/homebrew/bin/python3
The one your shell will use is the first one listed.
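If you’re curious how that ordering works, here’s a rough sketch of the lookup in bash/POSIX sh (it won’t work as written in zsh or fish, which split variables differently). find_in_path is a made-up helper name, and this is a simplified imitation of what which -a does, not its actual implementation:

```shell
# List every executable called $1 on the PATH, in the order the shell
# searches the directories. The first line printed is the one that wins.
find_in_path() {
    cmd=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means "the current directory".
        candidate="${dir:-.}/$cmd"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
        fi
    done
    IFS=$old_ifs
}

find_in_path python3
```

Directories earlier in the PATH are checked first, which is why putting a directory at the beginning changes which version runs.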
If you want to use the Homebrew version, you need to add that directory (/opt/homebrew/bin) to the beginning of your PATH instead, by putting this in your shell’s config file (it’s /opt/homebrew/bin/:$PATH instead of the usual $PATH:/opt/homebrew/bin/)
export PATH=/opt/homebrew/bin/:$PATH
or in fish:
set PATH /opt/homebrew/bin $PATH
All of these directions only work if you’re running the program from your shell. If you’re running the program from an IDE, from a GUI, in a cron job, or some other way, you’ll need to add the directory to your PATH in a different way, and the exact details might depend on the situation.
in a cron job
Some options:
/home/bork/bin/my-program
echo "PATH=$PATH".
I’m honestly not sure how to handle it in an IDE/GUI because I haven’t run into that in a long time, will add directions here if someone points me in the right direction.
PATH entries making it harder to debug
If you edit your path and start a new shell by running bash (or zsh, or fish), you’ll often end up with duplicate PATH entries, because the shell keeps adding new things to your PATH every time you start your shell.
Personally I don’t think I’ve run into a situation where this kind of duplication breaks anything, but the duplicates can make it harder to debug what’s going on with your PATH if you’re trying to understand its contents.
Some ways you could deal with this:
PATH, open a new terminal to do it in so you get a “fresh” state. This should avoid the duplication.
PATH at the end of your shell’s config (for example in zsh apparently you can do this with typeset -U path)
PATH when adding it (for example in fish I believe you can do this with fish_add_path --path /some/directory)
How to deduplicate your PATH is shell-specific and there isn’t always a built in way to do it so you’ll need to look up how to accomplish it in your shell.
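In bash there’s no built in way that I know of, but one common idiom is to check whether the directory is already there before adding it. Here’s a sketch of that idea. path_append is a hypothetical helper name (not a builtin), and note that it compares strings exactly, so /some/dir and /some/dir/ count as different entries:

```shell
# Append a directory to PATH only if it is not already there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: do nothing
        *) PATH="$PATH:$1" ;;     # not present: add it to the end
    esac
}

path_append "$HOME/.npm-global/bin"
export PATH
```

With something like this in your config, running bash again to reload it leaves your PATH unchanged instead of adding a duplicate.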
PATH
Here’s a situation that’s easy to get into in bash or zsh:
PATH
bash to reload your config
This happens because in bash, by default, history is not saved until you exit the shell.
Some options for fixing this:
bash to reload your config, run source ~/.bashrc (or source ~/.zshrc in zsh). This will reload the config inside your current session.
source
When you install cargo (Rust’s package manager) for the first time, it gives you
these instructions for how to set up your PATH, which don’t mention a specific
directory at all.
This is usually done by running one of the following (note the leading DOT):
. "$HOME/.cargo/env" # For sh/bash/zsh/ash/dash/pdksh
source "$HOME/.cargo/env.fish" # For fish
The idea is that you add that line to your shell’s config, and their script automatically sets up your PATH (and potentially other things) for you.
This is pretty common (for example Homebrew suggests you run eval "$(brew shellenv)"), and there are two ways to approach this:
. "$HOME/.cargo/env" to your shell’s config)
. "$HOME/.cargo/env" in my shell (or the fish version if using fish)
echo "$PATH" | tr ':' '\n' | grep cargo to figure out which directories it added
/Users/bork/.cargo/bin and shorten that to ~/.cargo/bin
~/.cargo/bin to PATH (with the directions in this post)
I don’t think there’s anything wrong with doing what the tool suggests (it might be the “best way”!), but personally I usually use the second approach because I prefer knowing exactly what configuration I’m changing.
fish_add_path
fish has a handy function called fish_add_path that you can run to add a directory to your PATH like this:
fish_add_path /some/directory
This is cool (it’s such a simple command!) but I’ve stopped using it for a couple of reasons:
fish_add_path will update the PATH for every session in the future (with a “universal variable”) and sometimes it will update the PATH just for the current session and it’s hard for me to tell which one it will do. In theory the docs explain this but I could not understand them.
PATH a few weeks or months later because maybe you made a mistake, it’s kind of hard to do (there are instructions in the comments of this GitHub issue though).
Hopefully this will help some people. Let me know (on Mastodon or Bluesky) if there are other major gotchas that have tripped you up when adding a directory to your PATH, or if you have questions about this post!
2025-02-06 00:57:00
A few weeks ago I ran a terminal survey (you can read the results here) and at the end I asked:
What’s the most frustrating thing about using the terminal for you?
1600 people answered, and I decided to spend a few days categorizing all the responses. Along the way I learned that classifying qualitative data is not easy but I gave it my best shot. I ended up building a custom tool to make it faster to categorize everything.
As with all of my surveys the methodology isn’t particularly scientific. I just posted the survey to Mastodon and Twitter, ran it for a couple of days, and got answers from whoever happened to see it and felt like responding.
Here are the top categories of frustrations!
I think it’s worth keeping in mind while reading these comments that
These comments aren’t coming from total beginners.
Here are the categories of frustrations! The number in brackets is the number of people with that frustration. I’m mostly writing this up for myself because I’m trying to write a zine about the terminal and I wanted to get a sense for what people are having trouble with.
People talked about struggles remembering:
One example comment:
There are just so many little “trivia” details to remember for full functionality. Even after all these years I’ll sometimes forget where it’s 2 or 1 for stderr, or forget which is which for > and >>.
People talked about struggling with switching systems (for example home/work computer or when SSHing) and running into:
as well as differences inside the same system like pagers being not consistent with each other (git diff pagers, other pagers).
One example comment:
I got used to fish and vi mode which are not available when I ssh into servers, containers.
Lots of problems with color, like:
This comment felt relatable to me:
Getting my terminal theme configured in a reasonable way between the terminal emulator and fish (I did this years ago and remember it being tedious and fiddly and now feel like I’m locked into my current theme because it works and I dread touching any of that configuration ever again).
Half of the comments on keyboard shortcuts were about how on Linux/Windows, the keyboard shortcut to copy/paste in the terminal is different from in the rest of the OS.
Some other issues with keyboard shortcuts other than copy/paste:
Ctrl-W in a browser-based terminal and closing the window
Ctrl-Shift-, no Super, no Hyper, lots of ctrl- shortcuts aren’t possible like Ctrl-,)
Ctrl+left arrow for something else)
Aside from “the keyboard shortcut for copy and paste is different”, there were a lot of OTHER issues with copy and paste, like:
There were lots of comments about this, which all came down to the same basic complaint – it’s hard to discover useful tools or features! This comment kind of summed it all up:
How difficult it is to learn independently. Most of what I know is an assorted collection of stuff I’ve been told by random people over the years.
A lot of comments about it generally having a steep learning curve. A couple of example comments:
After 15 years of using it, I’m not much faster than using it than I was 5 or maybe even 10 years ago.
and
That I know I could make my life easier by learning more about the shortcuts and commands and configuring the terminal but I don’t spend the time because it feels overwhelming.
Some issues with shell history:
One example comment:
It wasted a lot of time until I figured it out and still annoys me that “history” on zsh has such a small buffer; I have to type “history 0” to get any useful length of history.
People talked about:
Here’s a representative comment:
Finding good examples and docs. Man pages often not enough, have to wade through stack overflow
A few issues with scrollback:
One example comment:
When resizing the terminal (in particular: making it narrower) leads to broken rewrapping of the scrollback content because the commands formatted their output based on the terminal window width.
Lots of comments about how the terminal feels hampered by legacy decisions and how users often end up needing to learn implementation details that feel very esoteric. One example comment:
Most of the legacy cruft, it would be great to have a green field implementation of the CLI interface.
Lots of complaints about POSIX shell scripting. There’s a general feeling that shell scripting is difficult but also that switching to a different less standard scripting language (fish, nushell, etc) brings its own problems.
Shell scripting. My tolerance to ditch a shell script and go to a scripting language is pretty low. It’s just too messy and powerful. Screwing up can be costly so I don’t even bother.
Some more issues that were mentioned at least 10 times:
Ctrl-S, cat-ing a binary, etc)
There were also 122 answers to the effect of “nothing really” or “only that I can’t do EVERYTHING in the terminal”
One example comment:
Think I’ve found work arounds for most/all frustrations
I’m not going to make a lot of commentary on these results, but here are a couple of categories that feel related to me:
Trying to categorize all these results in a reasonable way really gave me an appreciation for social science researchers’ skills.
2025-01-11 17:46:01
Hello! Recently I ran a terminal survey and I asked people what frustrated them. One person commented:
There are so many pieces to having a modern terminal experience. I wish it all came out of the box.
My immediate reaction was “oh, getting a modern terminal experience isn’t that hard, you just need to….”, but the more I thought about it, the longer the “you just need to…” list got, and I kept thinking about more and more caveats.
So I thought I would write down some notes about what it means to me personally to have a “modern” terminal experience and what I think can make it hard for people to get there.
Here are a few things that are important to me, with which part of the system is responsible for them:
p in vim to paste (text editor, maybe the OS/terminal emulator too)
ls (shell config)
Ctrl+left arrow to work (shell or application)
less: (terminal emulator and applications)
There are a million other terminal conveniences out there and different people value different things, but those are the ones that I would be really unhappy without.
My basic approach is:
fish shell. Mostly don’t configure it, except to:
EDITOR environment variable to my favourite terminal editor
ls to ls --color=auto
neovim, with a configuration that I’ve been very slowly building over the last 9 years or so (the last time I deleted my vim config and started from scratch was 9 years ago)
A few things that affect my approach:
What if you want a nice experience, but don’t want to spend a lot of time on configuration? Figuring out how to configure vim in a way that I was satisfied with really did take me like ten years, which is a long time!
My best ideas for how to get a reasonable terminal experience with minimal config are:
fish or zsh with oh-my-zsh
EDITOR environment variable to your favourite terminal text editor
ls to ls --color=auto
Ctrl-C to copy, Ctrl-V to paste, Ctrl-A to select all) in micro and they do what you’d expect. I would probably try switching to helix except that retraining my vim muscle memory seems way too hard. Also helix doesn’t have a GUI or plugin system yet.
Personally I wouldn’t use xterm, rxvt, or Terminal.app as a terminal emulator, because I’ve found in the past that they’re missing core features (like 24-bit colour in Terminal.app’s case) that make the terminal harder to use for me.
I don’t want to pretend that getting a “modern” terminal experience is easier than it is though – I think there are two issues that make it hard. Let’s talk about them!
bash and zsh are by far the two most popular shells, and neither of them provides a default experience that I would be happy using out of the box, for example:
And even though I love fish, the fact that it isn’t POSIX does make it hard for a lot of folks to make the switch.
Of course it’s totally possible to learn how to customize your prompt in bash
or whatever, and it doesn’t even need to be that complicated (in bash I’d
probably start with something like export PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ ', or maybe use starship).
But each of these “not complicated” things really does add up and it’s
especially tough if you need to keep your config in sync across several
systems.
An extremely popular solution to getting a “modern” shell experience is oh-my-zsh. It seems like a great project and I know a lot of people use it very happily, but I’ve struggled with configuration systems like that in the past – it looks like right now the base oh-my-zsh adds about 3000 lines of config, and often I find that having an extra configuration system makes it harder to debug what’s happening when things go wrong. I personally have a tendency to use the system to add a lot of extra plugins, make my system slow, get frustrated that it’s slow, and then delete it completely and write a new config from scratch.
In the terminal survey I ran recently, the most popular terminal text editors by far were vim, emacs, and nano.
I think the main options for terminal text editors are:
micro or helix which seem to offer a pretty good out-of-the-box experience, potentially occasionally run into issues with using a less mainstream text editor
code as their EDITOR in the terminal.
The last issue is that sometimes individual programs that I use are kind of annoying. For example on my Mac OS machine, /usr/bin/sqlite3 doesn’t support the Ctrl+Left Arrow keyboard shortcut. Fixing this to get a reasonable terminal experience in SQLite was a little complicated, I had to:
I find that debugging application-specific issues like this is really not easy and often it doesn’t feel “worth it” – often I’ll end up just dealing with various minor inconveniences because I don’t want to spend hours investigating them. The only reason I was even able to figure this one out at all is that I’ve been spending a huge amount of time thinking about the terminal recently.
A big part of having a “modern” experience using terminal programs is just using newer terminal programs, for example I can’t be bothered to learn a keyboard shortcut to sort the columns in top, but in htop I can just click on a column heading with my mouse to sort it. So I use htop instead!
But discovering new more “modern” command line tools isn’t easy (though I made a list here), finding ones that I actually like using in practice takes time, and if you’re SSHed into another machine, they won’t always be there.
Something I find tricky about configuring my terminal to make everything “nice” is that changing one seemingly small thing about my workflow can really affect everything else. For example right now I don’t use tmux. But if I needed to use tmux again (for example because I was doing a lot of work SSHed into another machine), I’d need to think about a few things, like:
and probably more things I haven’t thought of. “Using tmux means that I have to change how I manage my colours” sounds unlikely, but that really did happen to me and I decided “well, I don’t want to change how I manage colours right now, so I guess I’m not using that feature!”.
It’s also hard to remember which features I’m relying on – for example maybe my current terminal does have OSC 52 support and because copying from tmux over SSH has always Just Worked I don’t even realize that that’s something I need, and then it mysteriously stops working when I switch terminals.
Personally even though I think my setup is not that complicated, it’s taken me 20 years to get to this point! Because terminal config changes are so likely to have unexpected and hard-to-understand consequences, I’ve found that if I change a lot of terminal configuration all at once it makes it much harder to understand what went wrong if there’s a problem, which can be really disorienting.
So I usually prefer to make pretty small changes, and accept that changes might take me a REALLY long time to get used to. For example I switched from using ls to eza a year or two ago and while I like it (because eza -l prints human-readable file sizes by default) I’m still not quite sure about it. But also sometimes it’s worth it to make a big change, like I made the switch to fish (from bash) 10 years ago and I’m very happy I did.
Trying to explain how “easy” it is to configure your terminal really just made me think that it’s kind of hard and that I still sometimes get confused.
I’ve found that there’s never one perfect way to configure things in the terminal that will be compatible with every single other thing. I just need to try stuff, figure out some kind of locally stable state that works for me, and accept that if I start using a new tool it might disrupt the system and I might need to rethink things.
2024-12-12 17:28:22
Recently I’ve been thinking about how everything that happens in the terminal is some combination of:
top or vim or cat)
The first three (your operating system, shell, and terminal emulator) are all kind of known quantities – if you’re using bash in GNOME Terminal on Linux, you can more or less reason about how all of those things interact, and some of their behaviour is standardized by POSIX.
But the fourth one (“whatever program you happen to be running”) feels like it could do ANYTHING. How are you supposed to know how a program is going to behave?
This post is kind of long so here’s a quick table of contents:
Ctrl-C
q
Ctrl-D on an empty line
Ctrl-W should delete the last word
- means stdin/stdout
As far as I know, there are no real standards for how programs in the terminal should behave – the closest things I know of are:
cp should work but AFAIK it doesn’t have anything to say about how for example htop should behave.
But even though there are no standards, in my experience programs in the terminal behave in a pretty consistent way. So I wanted to write down a list of “rules” that in my experience programs mostly follow.
My goal here isn’t to convince authors of terminal programs that they should follow any of these rules. There are lots of exceptions to these and often there’s a good reason for those exceptions.
But it’s very useful for me to know what behaviour to expect from a random new terminal program that I’m using. Instead of “uh, programs could do literally anything”, it’s “ok, here are the basic rules I expect, and then I can keep a short mental list of exceptions”.
So I’m just writing down what I’ve observed about how programs behave in my 20 years of using the terminal, why I think they behave that way, and some examples of cases where that rule is “broken”.
There are a bunch of common conventions that I think are pretty clearly the program’s responsibility to implement, like:
~/.BLAHrc or ~/.config/BLAH/FILE or /etc/BLAH/ or something
--help should print help text
But in this post I’m going to focus on things that it’s not 100% obvious are the program’s responsibility. For example it feels to me like a “law of nature” that pressing Ctrl-D should quit a REPL, but programs often need to explicitly implement support for it – even though cat doesn’t need to implement Ctrl-D support, ipython does. (more about that in “rule 3” below)
Understanding which things are the program’s responsibility makes it much less surprising when different programs’ implementations are slightly different.
Ctrl-C
The main reason for this rule is that noninteractive programs will quit by default on Ctrl-C if they don’t set up a SIGINT signal handler, so this is kind of a “you should act like the default” rule.
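Here’s a small sketch of what “set up a SIGINT handler but still act like the default” can look like in a shell script. The demo function runs a child shell that sends SIGINT to itself as a stand-in for pressing Ctrl-C, and exiting with status 130 (128 + the signal number 2) is a common convention rather than a requirement:

```shell
# demo runs a small script in a child shell. The child installs a SIGINT
# handler that does some cleanup and then quits, like the default would.
demo() {
    sh -c '
        trap "echo cleaning up; exit 130" INT
        kill -INT $$          # stand-in for the user pressing Ctrl-C
        echo "not reached"
    '
}

demo || echo "exited with status $?"
```

Without the trap line, the child shell would simply be killed by the signal, which is the default behaviour that the rule is imitating.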
Something that trips a lot of people up is that this doesn’t apply to interactive programs like python3 or bc or less. This is because in an interactive program, Ctrl-C has a different job – if the program is running an operation (like for example a search in less or some Python code in python3), then Ctrl-C will interrupt that operation but not stop the program.
As an example of how this works in an interactive program: here’s the code in prompt-toolkit (the library that IPython uses for handling input) that aborts a search when you press Ctrl-C.
q
TUI programs (like less or htop) will usually quit when you press q.
This rule doesn’t apply to any program where pressing q to quit wouldn’t make sense, like tmux or text editors.
Ctrl-D on an empty line
REPLs (like python3 or ed) will usually quit when you press Ctrl-D on an empty line. This rule is similar to the Ctrl-C rule – the reason for this is that by default if you’re running a program (like cat) in “cooked mode”, then the operating system will return an EOF when you press Ctrl-D on an empty line.
Most of the REPLs I use (sqlite3, python3, fish, bash, etc) don’t actually use cooked mode, but they all implement this keyboard shortcut anyway to mimic the default behaviour.
For example, here’s the code in prompt-toolkit that quits when you press Ctrl-D, and here’s the same code in readline.
I actually thought that this one was a “Law of Terminal Physics” until very recently because I’ve basically never seen it broken, but you can see that it’s just something that each individual input library has to implement in the links above.
Someone pointed out that the Erlang REPL does not quit when you press Ctrl-D, so I guess not every REPL follows this “rule”.
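To make the “cooked mode” default concrete, here’s a small Python sketch of a program that quits on EOF. From the program’s side, Ctrl-D on an empty line just looks like a read that returns nothing, which Python’s readline() reports as an empty string. The repl function and its “got:” output are hypothetical, and an io.StringIO stands in for the terminal.

```python
import io


def repl(stream, out):
    """Echo lines until the input reports EOF, then quit."""
    while True:
        line = stream.readline()
        if line == "":
            # readline() returns "" only at EOF. A blank line typed by the
            # user arrives as "\n", so it does not end the loop.
            out.append("bye")
            return out
        out.append("got: " + line.rstrip("\n"))


print(repl(io.StringIO("1 + 1\n\nprint(2)\n"), []))
```

A REPL that puts the terminal into raw mode doesn’t get this for free: it sees the Ctrl-D keypress itself and has to decide to treat it as “quit”.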
Terminal programs rarely use colours other than the base 16 ANSI colours. This is because if you specify colours with a hex code, it’s very likely to clash with some users’ background colour. For example if I print out some text as #EEEEEE, it would be almost invisible on a white background, though it would look fine on a dark background.
But if you stick to the default 16 base colours, you have a much better chance that the user has configured those colours in their terminal emulator so that they work reasonably well with their background colour. Another reason to stick to the default base 16 colours is that it makes fewer assumptions about what colours the terminal emulator supports.
The only programs I usually see breaking this “rule” are text editors, for example Helix by default will use a purple background which is not a default ANSI colour. It seems fine for Helix to break this rule since Helix isn’t a “core” program and I assume any Helix user who doesn’t like that colorscheme will just change the theme.
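Here’s a rough Python sketch of the difference between the two ways of specifying a colour. The helper names base16 and truecolor are made up for illustration. It assumes the widely used conventions that SGR codes 30-37 and 90-97 select the 16 base colours and that 38;2;R;G;B selects an exact 24-bit colour in terminals that support it.

```python
ESC = "\x1b"


def base16(text, index):
    """Colour text with one of the 16 base colours (index 0-15)."""
    # 30-37 are the normal colours, 90-97 the bright ones. The terminal's
    # theme decides what each of them actually looks like.
    code = 30 + index if index < 8 else 90 + (index - 8)
    return f"{ESC}[{code}m{text}{ESC}[0m"


def truecolor(text, hex_code):
    """Colour text with an exact 24-bit colour such as "#EEEEEE"."""
    r, g, b = (int(hex_code[i:i + 2], 16) for i in (1, 3, 5))
    return f"{ESC}[38;2;{r};{g};{b}m{text}{ESC}[0m"


print(base16("red, as themed by the user", 1))
print(truecolor("always exactly #EEEEEE", "#EEEEEE"))
```

The first line adapts to whatever the user configured as “red”; the second will be close to invisible on a white background no matter what theme is in use.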
Almost every program I use supports readline keybindings if it would make sense to do so. For example, here are a bunch of different programs and a link to where they define Ctrl-E to go to the end of the line:
None of those programs actually uses readline directly, they just sort of mimic emacs/readline keybindings. They don’t always mimic them exactly: for example atuin seems to use Ctrl-A as a prefix, so Ctrl-A doesn’t go to the beginning of the line.
Also all of these programs seem to implement their own internal cut and paste buffers so you can delete a line with Ctrl-U and then paste it with Ctrl-Y.
The exceptions to this are:
some programs (like git, cat, and nc) don’t have any line editing support at all (except for backspace, Ctrl-W, and Ctrl-U)
I wrote more about this “what keybindings does a program support?” question in entering text in the terminal is complicated.
I’ve never seen a program (other than a text editor) where Ctrl-W doesn’t delete the last word. This is similar to the Ctrl-C rule – by default if a program is in “cooked mode”, the OS will delete the last word if you press Ctrl-W, and delete the whole line if you press Ctrl-U. So usually programs will imitate that behaviour.
I can’t think of any exceptions to this other than text editors but if there are I’d love to hear about them!
Most programs will disable colours when writing to a pipe. For example:
rg blah will highlight all occurrences of blah in the output, but if the output is to a pipe or a file, it’ll turn off the highlighting.
ls --color=auto will use colour when writing to a terminal, but not when writing to a pipe
Both of those programs will also format their output differently when writing to the terminal: ls will organize files into columns, and ripgrep will group matches with headings.
If you want to force the program to use colour (for example because you want to look at the colour), you can use unbuffer to force the program’s output to be a tty like this:
unbuffer rg blah | less -R
I’m sure that there are some programs that “break” this rule but I can’t think of any examples right now. Some programs have a --color flag that you can use to force colour to be on: in the example above you could also do rg --color=always blah | less -R.
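Here’s a minimal Python sketch of how a program might make this decision. It’s an illustration of the general pattern, not how rg or ls actually implement it; use_colour and highlight are hypothetical names and the three modes just mirror the usual auto/always/never values of a --color flag.

```python
import io
import sys


def use_colour(stream, mode="auto"):
    """Decide whether to emit colour, in the style of --color=auto/always/never."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    # "auto": only use colour when the stream is a terminal.
    return stream.isatty()


def highlight(text, stream, mode="auto"):
    """Wrap text in red if colour is enabled for this stream."""
    return f"\x1b[31m{text}\x1b[0m" if use_colour(stream, mode) else text


# An io.StringIO is not a terminal, so it behaves like a pipe or file here.
print(repr(highlight("blah", io.StringIO())))
print(repr(highlight("blah", io.StringIO(), mode="always")))
print(highlight("blah", sys.stdout))
```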
- means stdin/stdout
Usually if you pass - to a program instead of a filename, it’ll read from stdin or write to stdout (whichever is appropriate). For example, if you want to format the Python code that’s on your clipboard with black and then copy it, you could run:
pbpaste | black - | pbcopy
(pbpaste is a Mac program, you can do something similar on Linux with xclip)
My impression is that most programs implement this if it would make sense and I can’t think of any exceptions right now, but I’m sure there are many exceptions.
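Since this is something each program has to implement itself, here’s a small Python sketch of what that usually amounts to. open_input is a hypothetical helper, and the stdin parameter only exists so the example can be run without a real terminal.

```python
import io
import sys


def open_input(path, stdin=None):
    """Return a readable text stream; "-" means standard input."""
    if path == "-":
        # Nothing is opened here: the program just reads what it was given.
        return stdin if stdin is not None else sys.stdin
    return open(path, encoding="utf-8")


fake_stdin = io.StringIO("x = 1\n")
print(open_input("-", stdin=fake_stdin).read(), end="")
```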
These rules took a long time for me to learn because I had to:
learn a rule (“Ctrl-C will exit programs”)
notice an exception (“Ctrl-C will exit find but not less”)
figure out the pattern (“Ctrl-C will generally quit noninteractive programs, but in interactive programs it might interrupt the current operation instead of quitting the program”)
A lot of my understanding of the terminal is honestly still in the “subconscious pattern recognition” stage. The only reason I’ve been taking the time to make things explicit at all is because I’ve been trying to explain how it works to others. Hopefully writing down these “rules” explicitly will make learning some of this stuff a little bit faster for others.
2024-11-29 16:23:31
Here’s a niche terminal problem that has bothered me for years but that I never really understood until a few weeks ago. Let’s say you’re running this command to watch for some specific output in a log file:
tail -f /some/log/file | grep thing1 | grep thing2
If log lines are being added to the file relatively slowly, the result I’d see is… nothing! It doesn’t matter if there were matches in the log file or not, there just wouldn’t be any output.
I internalized this as “uh, I guess pipes just get stuck sometimes and don’t show me the output, that’s weird”, and I’d handle it by just running grep thing1 /some/log/file | grep thing2 instead, which would work.
So as I’ve been doing a terminal deep dive over the last few months I was really excited to finally learn exactly why this happens.
The reason why “pipes get stuck” sometimes is that it’s VERY common for programs to buffer their output before writing it to a pipe or file. So the pipe is working fine, the problem is that the program never even wrote the data to the pipe!
This is for performance reasons: writing all output immediately as soon as you can uses more system calls, so it’s more efficient to save up data until you have 8KB or so of data to write (or until the program exits) and THEN write it to the pipe.
In this example:
tail -f /some/log/file | grep thing1 | grep thing2
the problem is that grep thing1 is saving up all of its matches until it has 8KB of data to write, which might literally never happen.
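You can watch the same thing happen in Python with a small sketch. This is not grep’s code: it uses Python’s io.BufferedWriter with an 8KB buffer in place of libc, and RecordingSink is a made-up stand-in for the pipe that remembers every write that actually reaches it.

```python
import io


class RecordingSink(io.RawIOBase):
    """Stand-in for a pipe: remembers each chunk actually written to it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


sink = RecordingSink()
out = io.BufferedWriter(sink, buffer_size=8192)  # 8KB, like the example above

out.write(b"thing1 match\n")
out.write(b"another thing1 match\n")
writes_before_flush = len(sink.chunks)  # nothing has reached the "pipe" yet

out.flush()
writes_after_flush = len(sink.chunks)   # both lines arrive in a single write

print(writes_before_flush, writes_after_flush)
```

Until something flushes the buffer (it fills up, the program exits, or the program flushes explicitly), the next command in the pipeline has nothing to read.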
Part of why I found this so disorienting is that tail -f file | grep thing will work totally fine, but then when you add the second grep, it stops working!! The reason for this is that the way grep handles buffering depends on whether it’s writing to a terminal or not.
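Here’s how grep (and many other programs) decides to buffer its output:
check whether stdout is a terminal or not using the isatty function
if it’s a terminal, use line buffering (print every line as soon as you have it)
otherwise, use “block buffering” (only write data once there’s 8KB or so of it)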
So if grep is writing directly to your terminal then you’ll see the line as soon as it’s printed, but if it’s writing to a pipe, you won’t.
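That isatty decision can be sketched in a few lines of Python. This is an illustration of the rule rather than libc’s actual code; buffering_for is a hypothetical name, and os.pipe() is used to get a file descriptor that is definitely not a terminal.

```python
import os


def buffering_for(fd):
    """Guess the buffering a typical libc-based program would pick for fd."""
    # The whole decision hinges on one question: is this fd a terminal?
    return "line" if os.isatty(fd) else "block"


read_end, write_end = os.pipe()
print(buffering_for(write_end))  # a pipe is not a terminal
os.close(read_end)
os.close(write_end)
```

This is why adding a second grep changes the first grep’s behaviour: its stdout stops being a terminal and becomes a pipe.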
Of course the buffer size isn’t always 8KB for every program, it depends on the implementation. For grep the buffering is handled by libc, and libc’s buffer size is defined by the BUFSIZ macro. Here’s where that’s defined in glibc.
(as an aside: “programs do not use 8KB output buffers when writing to a terminal” isn’t, like, a law of terminal physics, a program COULD use an 8KB buffer when writing output to a terminal if it wanted, it would just be extremely weird if it did that, I can’t think of any program that behaves that way)
One annoying thing about this buffering behaviour is that you kind of need to remember which commands buffer their output when writing to a pipe.
Some commands that don’t buffer their output:
I think almost everything else will buffer output, especially if it’s a command where you’re likely to be using it for batch processing. Here’s a list of some common commands that buffer their output when writing to a pipe, along with the flag that disables block buffering.
grep (--line-buffered)
(-u)
awk (fflush() function)
tcpdump (-l)
(-u)
(-u)
Those are all the ones I can think of. Lots of unix commands (like sort) may or may not buffer their output but it doesn’t matter because sort can’t do anything until it finishes receiving input anyway.
Also I did my best to test both the Mac OS and GNU versions of these but there are a lot of variations and I might have made some mistakes.
Also, here are a few programming languages where the default print statement will buffer output when writing to a pipe, and some ways to disable buffering if you want:
C (setvbuf)
Python (python -u, or PYTHONUNBUFFERED=1, or sys.stdout.reconfigure(line_buffering=True), or print(x, flush=True))
Ruby (STDOUT.sync = true)
Perl ($| = 1)
I assume that these languages are designed this way so that the default print function will be fast when you’re doing batch processing.
Also whether output is buffered or not might depend on how you print, for example in C++ cout << "hello\n" buffers when writing to a pipe but cout << "hello" << endl will flush its output.
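Here’s a Python sketch of how the line_buffering setting and print’s flush argument change what actually gets written. It builds its own stream instead of touching the real sys.stdout; RecordingSink and chunks_after_print are made-up names, and the 8KB buffer size is just chosen to match the examples in this post.

```python
import io


class RecordingSink(io.RawIOBase):
    """Stand-in for a pipe: remembers each chunk actually written to it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


def chunks_after_print(line_buffering, flush=False):
    """Print one line and report what reached the sink before it is closed."""
    sink = RecordingSink()
    text = io.TextIOWrapper(
        io.BufferedWriter(sink, buffer_size=8192),
        encoding="utf-8",
        newline="\n",
        line_buffering=line_buffering,
    )
    print("hello", file=text, flush=flush)
    return list(sink.chunks)  # copy now, before `text` is closed and flushed


print(chunks_after_print(line_buffering=False))              # block buffered
print(chunks_after_print(line_buffering=True))               # flushed at "\n"
print(chunks_after_print(line_buffering=False, flush=True))  # explicit flush
```

With line_buffering=False and no flush, the line stays in the buffer; either of the other two options pushes it out straight away.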
Ctrl-C on a pipe, the contents of the buffer are lost
Let’s say you’re running this command as a hacky way to watch for DNS requests to example.com, and you forgot to pass -l to tcpdump:
sudo tcpdump -ni any port 53 | grep example.com
When you press Ctrl-C, what happens? In a magical perfect world, what I would want to happen is for tcpdump to flush its buffer, grep would search for example.com, and I would see all the output I missed.
But in the real world, what happens is that all the programs get killed and the output in tcpdump’s buffer is lost.
I think this problem is probably unavoidable – I spent a little time with strace to see how this works and grep receives the SIGINT before tcpdump anyway so even if tcpdump tried to flush its buffer grep would already be dead.
After a little more investigation, there is a workaround: if you find tcpdump’s PID and kill -TERM $PID, then tcpdump will flush the buffer so you can see the output. That’s kind of a pain but I tested it and it seems to work.
It’s not just pipes, this will also buffer:
sudo tcpdump -ni any port 53 > output.txt
Redirecting to a file doesn’t have the same “Ctrl-C will totally destroy the contents of the buffer” problem though – in my experience it usually behaves more like you’d want, where the contents of the buffer get written to the file before the program exits. I’m not 100% sure whether this is something you can always rely on or not.
Okay, let’s talk solutions. Let’s say you’ve run this command:
tail -f /some/log/file | grep thing1 | grep thing2
I asked people on Mastodon how they would solve this in practice and there were 5 basic approaches. Here they are:
Historically my solution to this has been to just avoid the “command writing to pipe slowly” situation completely and instead run a program that will finish quickly like this:
cat /some/log/file | grep thing1 | grep thing2 | tail
This doesn’t do the same thing as the original command but it does mean that you get to avoid thinking about these weird buffering issues.
(you could also do grep thing1 /some/log/file but I often prefer to use an “unnecessary” cat)
You could remember that grep has a flag to avoid buffering and pass it like this:
tail -f /some/log/file | grep --line-buffered thing1 | grep thing2
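Roughly speaking, a line-buffered filter flushes after every matching line instead of waiting for its buffer to fill up. Here’s a minimal Python sketch of that idea (not grep’s implementation). grep_line_buffered and CountingStream are made-up names, and the match is a plain substring test rather than a regular expression.

```python
import io


def grep_line_buffered(lines, needle, out):
    """Write each matching line to `out` and flush straight away."""
    for line in lines:
        if needle in line:
            out.write(line)
            out.flush()  # don't wait for the buffer to fill up


class CountingStream(io.StringIO):
    """Text stream that counts how many times it was flushed."""

    flushes = 0

    def flush(self):
        self.flushes += 1
        super().flush()


out = CountingStream()
grep_line_buffered(["thing1 a\n", "other\n", "thing1 b\n"], "thing1", out)
print(out.getvalue(), end="")
print(out.flushes)
```

The cost is one write per matching line instead of one per 8KB or so, which is the performance trade-off that buffering exists to avoid.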
Some people said that if they’re specifically dealing with a multiple-greps situation, they’ll rewrite it to use a single awk instead, like this:
tail -f /some/log/file | awk '/thing1/ && /thing2/'
Or you would write a more complicated grep, like this:
tail -f /some/log/file | grep -E 'thing1.*thing2'
(awk also buffers, so for this to work you’ll want awk to be the last command in the pipeline)
stdbuf
stdbuf uses LD_PRELOAD to turn off libc’s buffering, and you can use it to turn off output buffering like this:
tail -f /some/log/file | stdbuf -o0 grep thing1 | grep thing2
Like any LD_PRELOAD solution it’s a bit unreliable – it doesn’t work on static binaries, I think it won’t work if the program isn’t using libc’s buffering, and it doesn’t always work on Mac OS. Harry Marr has a really nice How stdbuf works post.
unbuffer
unbuffer program will force the program’s output to be a TTY, which means that it’ll behave the way it normally would on a TTY (less buffering, colour output, etc). You could use it in this example like this:
tail -f /some/log/file | unbuffer grep thing1 | grep thing2
Unlike stdbuf it will always work, though it might have unwanted side effects, for example grep thing1 will also colour its matches.
If you want to install unbuffer, it’s in the expect package.
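To see why giving a program a TTY changes its behaviour, here’s a tiny Python sketch that compares a pipe with a pseudo-terminal. It only shows what isatty reports for each; it isn’t how unbuffer itself is written, and it assumes a Unix-like system where the pty module is available.

```python
import os
import pty

# What a plain `|` hands the program as stdout: one end of a pipe.
pipe_read, pipe_write = os.pipe()

# What a pseudo-terminal hands the program instead: the "slave" side of a pty.
master_fd, slave_fd = pty.openpty()

pipe_is_tty = os.isatty(pipe_write)
pty_is_tty = os.isatty(slave_fd)
print(pipe_is_tty, pty_is_tty)

for fd in (pipe_read, pipe_write, master_fd, slave_fd):
    os.close(fd)
```

A program that checks isatty on the pty side takes its “I’m writing to a terminal” path, which is where the line buffering and the colours come from.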
It’s a bit hard for me to say which one is “best”, I think personally I’m most likely to use unbuffer because I know it’s always going to work.
If I learn about more solutions I’ll try to add them to this post.
I think it’s not very common for me to have a program that slowly trickles data into a pipe like this, normally if I’m using a pipe a bunch of data gets written very quickly, processed by everything in the pipeline, and then everything exits. The only examples I can come up with right now are:
tail -f
kubectl logs
I think it would be cool if there were a standard environment variable to turn off buffering, like PYTHONUNBUFFERED in Python. I got this idea from a couple of blog posts by Mark Dominus in 2018. Maybe NO_BUFFER like NO_COLOR?
The design seems tricky to get right; Mark points out that NetBSD has environment variables called STDBUF, STDBUF1, etc which give you a ton of control over buffering but I imagine most developers don’t want to implement many different environment variables to handle a relatively minor edge case.
I’m also curious about whether there are any programs that just automatically flush their output buffers after some period of time (like 1 second). It feels like it would be nice in theory but I can’t think of any program that does that so I imagine there are some downsides.
Some things I didn’t talk about in this post since these posts have been getting pretty long recently and seriously does anyone REALLY want to read 3000 words about buffering?