2025-10-13 08:00:00
In this post, I’m going to explain how I created a book chapter from Andrej Karpathy’s tokenizers video tutorial using SolveIt. The final artifact is a text version with runnable code examples, hyperlinks, images, and additional explanations that go beyond what’s in the video.
Before we continue, a quick word about SolveIt. It’s both a platform and an approach to problem-solving that emphasizes working in small, verifiable steps rather than asking AI to do everything at once. It’s built around the idea that AI should see exactly what you see - all your notes, code, outputs, and context - so it can be a genuine collaborative partner. While people sometimes think it’s just for coding, I’ve found it equally useful for learning, writing, and in this case, taking up Andrej’s challenge to create a book chapter from a video. The platform gives you a full Linux environment with persistent storage, built-in tools for web search and message editing, and the ability to define your own Python functions as tools. Most importantly, everything is editable - you can reorganize, collapse sections, edit AI responses, and keep your workspace clean as you work. This “dialog engineering” is what made the video-to-document workflow practical: I could work through enrichment step by step, verify each addition, and maintain useful context throughout. The same approach carried into the writing phase - creating an outline first, then writing section by section while editing AI responses directly to match my preferred style.
If you’d like to learn this approach yourself and use the platform I use in this article, there’s a course starting Nov 3rd at solve.it.com.
I started with a timestamped transcript of the video and screenshots of key moments. I could have just asked AI to “convert this transcript into a book chapter,” but I’ve tried that before and it doesn’t work well. You end up with something that reads okay but is bland, too short compared to the transcript, misses key concepts, lacks deeper explanations, and has hallucinated content. It’s very similar to asking AI to write a whole program for you - you don’t build a deep understanding, have control over it, or learn anything in the process. This problem is especially prominent with longer videos - in this case, a video over 2 hours.
Instead, I followed the SolveIt approach and worked on it in two phases: first enriching the transcript piece by piece with all the artifacts I wanted, then using that enriched version to write the actual prose. It took longer than one-shotting the whole thing, but I ended up with something I fully understand, and it was still faster than writing it from scratch.

Dialog 1 - Enriching the Transcript – This first dialog focused on enriching the transcript piece by piece.
Dialog 2 - Writing the Book Chapter – The second dialog used the enriched transcript to write the final book chapter.
The transcript was long - over 2 hours of content. To keep the AI on target, I split it into smaller note messages, and worked through them one at a time.
def split_tscript_as_msgs(dst, yt_video_id=None):
    tscript_md = tscript_with_imgs(dst, False)
    if yt_video_id: tscript_md = tscript_add_yt_links(tscript_md, yt_video_id)
    sidx, chunks = 0, []
    lines = tscript_md.splitlines()
    for idx, l in enumerate(lines):
        if l.startswith('!['):
            chunks.append('\n\n'.join(lines[sidx:idx+2])) # include alt text
            sidx = idx+2
    for c in chunks[::-1]: add_msg(c)

A function to split a single transcript note message into multiple messages. You can implement your own split logic.
I did this because, as I explained earlier, working with large blocks of text is not very manageable. With smaller sections, when I asked it to add a hyperlink or create a code example, it stayed on target. Plus I could run code immediately to verify it worked before moving on.
When Andrej mentioned his previous video “Let’s build GPT from scratch,” I didn’t want to just leave that as plain text. I asked SolveIt to find the YouTube link and add it as a hyperlink to the transcript.
SolveIt used web search to find it, then used the message editing tools to update the note with the proper markdown link. I did this throughout for papers, blog posts, GitHub repos, Wikipedia pages, and any other external resources that were mentioned in the video.

In this screenshot, we can see at the top a note message containing part of the transcript. Below that is a prompt message asking SolveIt to find the YouTube link and add it as a hyperlink. The AI’s response shows it used web search to find the video (visible in the hovered citations), then called the update_msg function (a dialoghelper tool) with the message ID and new content that includes the proper markdown hyperlink. The message updates in real time within the dialog. The details of tool calls can be expanded, as shown in the image. This demonstrates how SolveIt makes both the AI’s reasoning and its actions visible - you can see exactly what tools it used and verify the result. If you want to learn more about SolveIt’s features like message editing tools, dialog engineering, and the full platform capabilities, check out this features overview video.
Some of the screenshots had information I wanted to pull into the text - code snippets, diagrams, or other content. Rather than doing it myself (which would be very time consuming), I used AI. In SolveIt, images embedded in markdown aren’t visible to the AI by default - this keeps context manageable. But you can make specific images visible by adding a special #ai anchor tag to the image markdown.
Once I made an image visible, I could ask SolveIt to work with it. In this example, I asked it to extract code from a screenshot. It read the image and created a code message with the extracted code, which I could then actually run to verify it worked correctly, or make any adjustments as needed.

Early on, before the enrichment, I asked SolveIt to identify which GitHub repositories were mentioned or relevant to the tokenizer tutorial by giving it the full transcript. It found several - OpenAI’s GPT-2 repo, tiktoken, Karpathy’s minBPE, Google’s SentencePiece, and a few others.
Since SolveIt gives you a full Linux environment, I could clone these repos directly into the workspace.
!git clone https://github.com/karpathy/minbpe

The idea was that as I worked through the transcript, I’d have access to the actual source code that Andrej was discussing.
This turned out to be really useful. When I was working on a section about how BPE is implemented, I could ask SolveIt to look at the actual code in those repos and pull in the relevant functions. It would use shell commands to search through the codebase, read the files, and extract what I needed.
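For example, the core pair-counting step of BPE - in the spirit of the get_stats helper in minbpe - can be sketched in a few lines (my own reconstruction for illustration, not the repo’s exact code):

```python
from collections import Counter

def get_stats(ids):
    # Count how often each adjacent pair of tokens appears in the sequence
    return Counter(zip(ids, ids[1:]))

ids = list("aaabdaaabac".encode("utf-8"))  # start from raw UTF-8 bytes
pair, count = get_stats(ids).most_common(1)[0]
print(pair, count)  # (97, 97) appears 4 times - the pair BPE would merge next
```

Being able to cross-check sketches like this against the real repo code kept the chapter accurate.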
Even though these resources are available on the web or via APIs, SolveIt works with them more efficiently when they’re stored locally, using custom tools like run_cmd.
import subprocess, shlex
def run_cmd(cmd: str, timeout=30):
    "Run a bash command and return stdout, stderr, and return code"
    try:
        add_msg(f"!{cmd}", msg_type='code')
        result = subprocess.run(shlex.split(cmd), capture_output=True, text=True, timeout=timeout)
        return dict(stdout=result.stdout, stderr=result.stderr, returncode=result.returncode)
    except subprocess.TimeoutExpired: return dict(error=f'Command timed out after {timeout}s')
    except Exception as e: return dict(error=str(e))
I noticed some situations where Andrej’s explanation could use code examples to clarify the concept. This is something AI is good at - I found that when I asked it to provide clarifying examples, they were really solid.
For instance, in one section Andrej was explaining the differences between UTF-8, UTF-16, and UTF-32 encoding. The verbal explanation was clear enough, but I thought a concrete code example would help. So I asked: “Create a minimal code example showing the difference between UTF-8, UTF-16, and UTF-32 encoding.”
SolveIt generated the code, and I ran it immediately to verify it worked and actually demonstrated what I wanted. If it wasn’t quite right, I could adjust it or ask for modifications. These runnable examples became part of the enriched transcript, and later made it into the final book chapter.
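A minimal sketch of the kind of example this produced (my own illustration here; the chapter’s actual code may differ):

```python
s = "héllo"
for enc in ("utf-8", "utf-16", "utf-32"):
    b = s.encode(enc)
    print(f"{enc}: {len(b):2d} bytes  {b.hex()}")
# utf-8 is variable-width (1-4 bytes per character), so mostly-ASCII text stays compact;
# utf-16 uses 2 (or 4) bytes per character, plus a 2-byte byte-order mark in Python;
# utf-32 uses a fixed 4 bytes per character, plus a 4-byte byte-order mark.
```

Running it shows the same 5-character string taking 6, 12, and 24 bytes respectively, which makes the tradeoff Andrej describes tangible.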

As I worked through the transcript, there were things I didn’t fully understand or that seemed like they could use more explanation. Instead of just accepting gaps in my understanding, I asked questions.
For example, at one point Andrej mentioned that tokens go from 0 to 255 initially in the BPE algorithm. I wasn’t entirely clear why that specific range, so I asked: “Why do tokens currently go from 0 to 255 - why is this the case?”
SolveIt explained that it’s because we start with UTF-8 encoded bytes, and each byte can hold values from 0 to 255 (2^8 = 256 possible values). That made sense, and I added that explanation as a note in that section of the transcript.
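A quick check (my own addition, not from the video) makes this concrete:

```python
text = "hello 👋"
tokens = list(text.encode("utf-8"))  # the initial BPE vocabulary is just raw bytes
print(tokens)
assert all(0 <= t <= 255 for t in tokens)  # every byte fits in 0-255 (2**8 = 256 values)
```

Any text, including emoji, becomes a sequence of values in that 0-255 range before any merges happen.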
These clarifying questions and answers became valuable additions to the final content. They filled in gaps that might have left readers (or me) confused, and they were explanations I actually understood because I had asked the questions myself.

The actual workflow rhythm looked like this: I’d open a section of the transcript, read through it, and decide what it needed. Maybe it mentioned a paper that should be linked. Maybe there was a concept that needed a code example. Maybe I had a question about something.
I’d make a small, specific request - “Add a hyperlink to the GPT-2 paper” or “Extract the code from this screenshot” or “What does byte fallback do in SentencePiece?” SolveIt would do it, I’d review the result, and if it was code I’d run it to verify. Then I’d move to the next section.
Two things made this work smoothly. First, I defined some simple Python functions as tools. Any Python function in SolveIt becomes available as a tool - in my case, I made a run_cmd function so SolveIt could execute shell commands to explore codebases. SolveIt also has built-in tools via dialoghelper for editing messages, which I used constantly to update the transcript sections.
As the dialog grew longer, I kept it manageable by using collapsible headings to organize sections, and pinning important context messages so they wouldn’t get truncated. When the AI’s response wasn’t quite right, I’d just edit it directly rather than asking it to try again - this works much better in practice as AI tends to follow its previous responses rather than the human instructions. I also deleted dead ends - explorations that didn’t pan out - to keep the dialog focused.
This wasn’t fast, but it was thorough. By the end, I had a deep understanding of tokenization, every code snippet had been tested, every link verified, and every image was where it should be. The enriched transcript was genuinely useful on its own, even before writing the book chapter.
Once I had the enriched transcript, I created a new dialog to write the actual book chapter. I loaded all those enriched note messages and code messages into the context of this new dialog.
I didn’t jump straight into writing. Instead, I asked SolveIt to create an outline first. I wanted to see the overall structure - what sections made sense, what subsections each should have, what key points to cover, and which images belonged where.
The prompt was something like: “Create a detailed outline for this book chapter with sections, subsections, brief bullets on what each covers, and which images are relevant for each section.”
SolveIt gave me a structured skeleton that I could review. This outline became my roadmap for writing. Having it laid out meant I could see the whole shape of the chapter before committing to any particular section, and I could adjust the structure if something didn’t make sense.

With the outline in place, I started writing. I asked SolveIt to write the introduction first, then moved through each section one at a time.
SolveIt wrote the intro, pulling in relevant details from the enriched transcript - including code snippets where appropriate, adding hyperlinks that I’d already found during enrichment, and referencing the right images. I read through it, made edits where needed, and then moved to the next section.
The key was doing this incrementally. I didn’t ask it to write the whole thing at once. Each section was its own request, its own review, its own iteration. This kept things manageable and let me maintain control over the quality and tone.

Sometimes SolveIt’s first attempt at a section wasn’t quite right - maybe the tone was off, or it was too verbose, or it didn’t emphasize the right things. When that happened, I found it was much more effective to just edit the response directly rather than trying to describe what I wanted.
I’d go into the AI’s response, rewrite parts of it to match my preferred style, and then tell SolveIt: “I’ve updated your previous response to better match the tone I want. Please continue in this style for the remaining sections.”
This works because language models are autoregressive - they predict what comes next based on what came before. By editing their output to be exactly what I want, I’m teaching them through example, which is far more effective than verbal instructions. The AI follows its own previous responses more reliably than it follows descriptions of what you want.

After writing each section, I’d review it myself first. Does it make sense? Is it accurate? Does it match the enriched transcript? It also helps to include citations from the transcript at the end of a written text section as an additional layer of verification.
Sometimes I’d also ask SolveIt: “Is there anything important missing from this subsection based on the transcript?” This caught things I’d overlooked. Maybe there was a key point from Andrej’s explanation that didn’t make it into the prose, or an important code snippet that should have been included. I’d make adjustments based on both my own judgment and what the AI flagged, then move on to the next section.
This back-and-forth reviewing wasn’t wasted time. It meant that by the time I finished all the sections, I was confident the content was solid. No need for a big revision pass at the end because I’d been iterating throughout.

Once all the sections were written and reviewed, I needed to merge them into a single cohesive document. All the AI responses were separate messages in the dialog - one for the intro, one for each section, etc.
I used tools from dialoghelper to combine all written sections into a single note message. The result was a complete markdown-formatted book chapter with everything in place - prose, code blocks, images, hyperlinks, all properly formatted.
At that point, I could either hit the publish button in SolveIt to get a shareable URL at share.solveit.com, or export the markdown to use with whatever publishing platform I prefer. In my case, I published it both ways - shared via SolveIt and also exported it to publish on fast.ai’s blog using Quarto.
This two-phase process took longer than just asking AI to “convert this transcript to a book chapter.” But I think it was worth it for a few practical reasons:
I ended up with an artifact that covers everything important from the video. It is verified as opposed to trusting the AI blindly - every code snippet runs, every hyperlink goes to the right place, every image is relevant to its section.
I maintained control throughout. When I wanted to emphasize something Andrej mentioned briefly, I could dig deeper on that section. When something in the video didn’t need as much space in the book chapter, I could condense it. The final artifact reflects my judgment about what’s important, not just a mechanical conversion.
I actually learned the material. Working through tokenization section by section, asking questions when I didn’t understand something, running the code examples - by the end I had a real grasp of how BPE works, what the tradeoffs are between different approaches, etc.
None of this is to say you shouldn’t use AI. I used it constantly throughout this process. But I used it in small, specific ways where I could verify the results immediately. That made all the difference.
You can use this approach with any video transcript you can get your hands on. Some practical sources:
- yt-dlp --write-auto-sub to download auto-generated captions

Once you have a transcript, the workflow is the same. Get it into SolveIt, split it into manageable sections (or keep it as one message if it’s short enough), and start enriching. The tools are there - web search for finding links, image analysis for extracting information from screenshots, code execution for verifying examples, file system access for cloning repos or downloading resources, and dialoghelper tools for manipulating messages.
The most important part isn’t the specific tools or techniques - it’s the approach. Work in small pieces. Verify as you go. Ask questions when you don’t understand something. Run code to make sure it works. Build genuine understanding rather than just reformatting content.
If you want to see the full example, the published book chapter shows what this workflow produces, and you can look at the two dialogs I linked earlier to see exactly how I worked through each phase.
2025-10-02 08:00:00
tldr from Jeremy: “How to Solve it With Code” is a course from fast.ai on iterative problem solving, and a platform (‘Solveit’) to make that easier. The course shows how to use AI in small doses to help learn as you build, but doesn’t rely on AI. The approach is based on decades of research and practice from Eric Ries and me. It’s basically the opposite of “vibe coding”; it’s all about small steps, deep understanding, and deep reflection. We wrote the platform because we didn’t find anything else sufficient for doing work the “solveit way”, so we made something for ourselves, and then decided to make it available more widely. You can follow the approach without using our platform, although it won’t be as smooth an experience.
It’s a strange time to be a programmer. It’s easier than ever to get started, but also easier than ever to let AI steer you into a situation where you’re overwhelmed by code you don’t understand. We’ve got an antidote that we’ve been using ourselves with 1000 preview users for the last year. It’s changed our lives at Answer.AI, and hundreds of our users say the same thing. Now we’re ready to share it with you. Signups are open, and will remain so until October 20th. Over five weeks, we’ll give you a taste of how our new approach and platform, “Solveit”, can be applied to everything from programming challenges, web development, and system administration to learning, writing, business, and more.
OK, let’s explain what on earth we’re talking about!…
At the end of last year, Jeremy Howard (co-founder of fast.ai, Answer.AI, Kaggle, Fastmail, creator of the first LLM…) and I ran a small trial course titled “How To Solve It With Code”. The response was so overwhelming that we had to close signups after just one day. 1000 keen beans joined us for a deep dive into our general approach to solving problems. The first few lessons were taught via the vehicle of the ‘Advent of Code’ programming challenges and run in a new, purpose-built tool called solveit. As the course progressed, we had lots of fun exploring web development, AI, business, writing and more. And the solveit tool became an extremely useful test-bed for ideas around AI-assisted coding, learning and exploration.
In the year since, we’ve continued to refine and expand both the process and the platform. We now basically live in the solveit platform. We do all our sysadmin work in it (Solveit itself is hosted on a new horizontally scalable multi-server platform we built and run entirely using Solveit!), host production apps in it (e.g. all students in the course can use a Discord AI bot “Discord Buddy” that’s running inside a Solveit dialog!), develop most of our software in it, our legal team does contract drafting in it, we iterate on GUIs in it, and in fact we do the vast majority of our day to day work of all kinds in it.

From October 20th for five weeks, Jeremy and I will show you how to use the solveit approach, and give you full access to the platform that powers it (and you’ll have the option to continue to access the lessons and platform afterwards too). Also, Eric Ries will join us for lessons about building startups that don’t just make money, but that stick to your vision for how you want to impact the world. You’ll be amongst the first people in the world to have the opportunity to read his new unreleased book.
But what IS “the solveit approach”? It isn’t some new AI thing, but actually is based on ideas that are at least 80 years old… To learn more, read on, or watch this video Jeremy and I recorded a few weeks ago.
George Polya was a Hungarian mathematician who wrote the influential book “How to Solve It” in 1945. In it, he shares his philosophies on education (focus on active learning, heuristic thinking, and careful questioning to guide students towards discovering answers for themselves) and outlines a four-step problem-solving framework:

1. Understand the problem
2. Devise a plan
3. Carry out the plan
4. Look back and reflect
He was focused on mathematics, but as Jeremy and I realized, these ideas translate far beyond maths! It turns out that it actually works great for coding, writing, reading, learning…
Of course, you can often just have AI code and write for you. But should you?
In most cases, we argue the answer is “no”.
There’s a myriad of problems waiting for you if you go down that path:

- If you didn’t know the foundations of how to do it before, you don’t now either. You’ve learned nothing
- If you keep working this way, you build up more and more code you don’t understand, creating technical and understanding debt that will eventually become crippling
- You won’t be building up a foundation to solve harder tasks that neither humans nor AI can one-shot. So you’re limiting yourself to only solving problems that everyone else can trivially solve too. This is not a recipe for personal or organizational success!
On the other hand, if you build a discipline of always working to improve your understanding and expertise, you’ll discover that something delightful and amazing happens. Each time you tackle a task, you’ll find it’s a little easier than the last one. These improvements in understanding and capability will multiply, and you’ll find that your own skills develop even faster than AI improves. You’ll focus on using AI to help you dramatically increase your own productivity and abilities, instead of focusing on helping the AI improve its productivity and abilities!
Let’s consider a quick example of coding the solveit way (without even any AI yet). For 2024’s Advent of Code, Day 1’s solution involves comparing two lists, sorted by value (there’s a whole backstory involving elves, which you can read if you like). Let’s imagine we’ve considered the problem, and are now focused on a small sub-task: extracting the first (sorted) list. We start with the sample data provided:
x = '3 4\n4 3\n2 5\n1 3\n3 9\n3 3'

Our plan might be:

1. Split the input into lines
2. Extract the first number from each line
3. Sort the resulting list
After thinking through the plan, we begin working on individual steps. We aim to write no more than a few lines of code at a time, with each piece giving some useful output that you can use to verify that you’re on the right track:
lines = x.splitlines()
lines
>>> ['3 4', '4 3', '2 5', '1 3', '3 9', '3 3']

Now we build up a list comprehension to get the first elements. We might start with [o for o in lines] and then add bits one at a time, inspecting the output, building up to:
l1 = [int(o.split()[0]) for o in lines]
l1
>>> [3, 4, 2, 1, 3, 3]

Now sorting:
sorted(l1)
>>> [1, 2, 3, 3, 3, 4]

Now that we’ve run all the pieces individually, and checked that the outputs are what we’d expect, we can stack them together into a function:
def get_list(x):
    lines = x.splitlines()
    l1 = [int(o.split()[0]) for o in lines]
    return sorted(l1)
get_list(x)
>>> [1, 2, 3, 3, 3, 4]

At this point, you’d reflect on the solution, think back to the larger plan, perhaps ask yourself if there are better ways you could do it. You may be thinking that this is far too much work for sorted(int(line.split()[0]) for line in x.splitlines()) – as your skill increases you can tailor the level of granularity, but the idea remains the same: working on small pieces of code, checking the outputs, only combining them into larger functions once you’ve tried them individually, and constantly reflecting back on the larger goal.
(We’ll come back to this shortly – but also consider for a moment how integrated AI can fit into the above process. Any time you don’t know how to do something, you can ask for help with just that one little step. Any time you don’t understand how something works, or why it doesn’t, you can have AI help you with that exact piece.)
The superpower that this kind of live, iterative coding gives you is near-instant feedback loops. Instead of building your giant app, waiting for the code to upload, clicking through to a website and then checking a debug console for errors – you’re inspecting the output of a chunk of code and seeing if it matches what you expected. It’s still possible to make mistakes and miss edge cases, but it is a LOT easier to catch most mistakes early when you code in this way.
This idea of setting things up so that you get feedback as soon as possible pops up again and again. Our cofounder Eric Ries talks about this in his book ‘The Lean Startup’, where getting feedback from customers is valuable for quick iteration on product or business ideas. Kaggle pros talk about the importance of fast evals – if you can test an idea in 5 minutes, you can try a lot more ideas than you could if each experiment requires 12 hours of model training.
So far so good – sounds like we’re describing the style of exploratory/literate programming taught in the fast.ai course, and used with tools like NBDev. Aren’t we in a new era though? Where is the AI?!
Well, it turns out that by building code in this way, with planning, notes and tests mixed in with the source code, you’re also building the perfect context for an AI to help with the code too. Solveit can see everything you can see. We’ve discovered that this actually transforms “AI+Human” capabilities in ways that surprised even us.
It’s become a key foundation of all our work at Answer.AI now: the AI should be able to see everything exactly as the human does, and vice versa, and both human and AI must be able to use the same tools. This makes the AI a true iterative partner to bounce ideas off, try experiments, and learn together with.
You can also feed additional context to Solveit by referencing specific variables, or having it use its built-in search and URL-reading tools. And any python function becomes a tool that you can ask solveit to use, making it easy to give it everything it needs to fetch more context or take “agentic” actions to give better responses.
This idea of having an AI that can see everything that you can see, in a shared environment, is put to good use in our beloved shell sage tool too!
One issue with current chat-based models is that once they go off the rails, it’s hard to get back on track. The model is now modelling a language sequence that involves the AI making mistakes – and more mistakes are likely to follow! If you’ve used language models much, then you’ve no doubt experienced this problem many times.
There is an interesting mathematical reason that this occurs. The vast majority of language model training is entirely about getting a neural network to predict the next word in a sentence - they are auto-regressive. Although they are later fine-tuned to do more than this, at their heart they are still trying to predict the next word of a sentence. In the documents used for training, there are plenty of examples of poor-quality reasoning and mistakes. Therefore, once an AI sees some mistakes in a chat, the most likely next tokens are going to be mistakes as well. That means that every time you are correcting the AI, you are making it more likely for the AI to give bad responses in the future!
Because solveit dialogs are fluid and editable, it’s much easier to go back and edit/remove mistakes, dead ends, and unrelated explorations. You can even edit past AI responses, to steer it into the kinds of behaviour you’d prefer. Combine this with the ability to easily hide messages from the AI or to pin messages to keep them in context even as the dialog grows beyond the context window and starts to be truncated, and you have a recipe for continued AI helpfulness as time goes on. We’ve been talking about this as “dialog engineering” for a long time – and it really is key to having AI work sessions that improve as time goes on, rather than degrading.
Of course, this is all useful for humans too! The discipline of keeping things tidy, using (collapsible) headings to organise sections, writing notes on what you’re doing or aiming for, and even past questions+answers with the AI all make it a pleasure to pick back up old work.
One thing is still (intentionally) hard in solveit though, and that is getting the AI to actually write all of your code in a hands-off way. We’ve made various choices to gently push towards the human remaining in control. Things like:
Even the choice to have the editor be fairly small and down at the bottom emphasizes that this is a REPL/dialog, optimised for building small, understandable pieces. It’s entirely possible to practice the solveit approach in other tools, but we’ve also found that a combination of these intentional choices and the extra affordances for dialog engineering rapidly feel indispensable.
This brings us back to a foundational piece of the solveit approach: a learning mindset. It’s great that we can ask AI to fill in the gaps of our knowledge, or to save some time with fiddly pieces like matplotlib plots or library-specific boilerplate. But when the AI suggests something you don’t know, it is important not to skip it and move on – otherwise that new piece will never be something you learn!
We try to build the discipline to stop and explore anytime something like this comes up. Fortunately, it’s really easy to do this – you can add new messages trying out whatever new thing the AI has shown you, asking how it works, getting demo code, and poking it until you’re satisfied. And then the evidence of that side-quest can be collapsed below a heading (for later ref) or deleted, leaving you back in the main flow but with a new piece of knowledge in your brain.
Like many programmers, I’ve had my share of existential worries given the rapid rise in AI’s coding ability. What if AI keeps getting better and better, to the point where there’s little point for the average person actually learning to master any of these skills? If you assume your coding skills stay static, and imagine the AI continuing to get better, you may feel kinda bleak. The thing is, skill doesn’t have to be static! And as both you and the AI you’re carefully using get better, you will learn faster and be able to accomplish more and more.
This is all hard work. It’s like exercise, or practicing a musical instrument. And like any pursuit of mastery, I don’t know that it’s for everyone. But as we’ve seen from all of the students who invested their time into the first cohort, the effort is well worth it in the end. Just take a look at the project showcase featuring a few hundred (!) things our community has made.
If you’re interested in joining us to learn how to use the Solveit approach yourself, head over to our site and sign up at solve.it.com. Signups are open until October 20th, but may close earlier if we fill up, so don’t wait too long!
2025-10-01 08:00:00

At AnswerAI we build software that makes working with AI that little bit easier. For example, in the past year we built a series of open source Python packages (Claudette, Cosette) that make it much simpler to work with LLM providers like Anthropic and OpenAI.
These packages make many LLM calls, which poses a bunch of challenges that can really slow down development.
As we build most of our software in notebooks, non-deterministic responses create an additional problem: they add significant bloat to notebook diffs, which makes code review more difficult 😢.
Although LLMs are relatively new, these challenges are not, and an established solution already exists: you simply mock each LLM call so that it returns a specific response instead of calling the LLM provider. This approach works pretty well, but it is a little cumbersome. In our case, we would need to call the LLM manually, capture the response, save it to our project, and write a mock that uses it. We would need to repeat this process for hundreds of LLM calls across our projects 😢.
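That manual workflow can be sketched like this (a simplified illustration using `unittest.mock`; the client shape and response contents are invented for the example):

```python
from unittest.mock import MagicMock

# A response captured earlier by calling the LLM once and saving the result
# (this capture-and-save step is the cumbersome part).
saved_response = {"output_text": "Hey! How can I help you today?"}

# Build a stand-in client whose responses.create returns the saved data.
client = MagicMock()
client.responses.create.return_value = saved_response

# Code under test now gets a deterministic response without a network call.
r = client.responses.create(model="gpt-4.1", input="Hey!")
```

Multiply this by every LLM call in every project, and the appeal of something automatic becomes obvious.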
We asked ourselves if we could do better and create something that just worked automatically in the background with zero manual intervention. That something better turned out to be very simple. We looked at the source code of the most popular LLM SDKs and found that they all use the httpx library to call their respective APIs. All we needed to do was modify httpx’s send method to save the response of every call to a local file (a.k.a a cache) and re-use it on future requests. Here’s some pseudo-code that implements just that.
@patch
def send(self:httpx._client.Client, r, **kwargs):
    id_ = req2id(r)  # convert request to a unique identifier
    if id_ in cache: return httpx.Response(content=cache[id_])
    res = self._orig_send(r, **kwargs)
    update_cache(id_, res)
    return res
We added this simple patch to one of our projects and the payoff was immediate.
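The `req2id` helper above is pseudo-code. One plausible way to build such an identifier (an illustration, not cachy's actual implementation) is to hash the request's method, URL, and body:

```python
import hashlib

def req2id(method: str, url: str, body: bytes = b"") -> str:
    # Identical requests hash to the same id, so a cached response can be
    # looked up again; any change to the body produces a new id.
    key = method.encode() + b" " + url.encode() + b" " + body
    return hashlib.sha256(key).hexdigest()

a = req2id("POST", "https://api.openai.com/v1/responses", b'{"input": "Hey!"}')
b = req2id("POST", "https://api.openai.com/v1/responses", b'{"input": "Hey!"}')
c = req2id("POST", "https://api.openai.com/v1/responses", b'{"input": "Hi!"}')
print(a == b, a == c)  # True False
```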
The best part is that we got all of these benefits without writing a single line of code or bloating our project with mocks and fixtures.
Since then we’ve added support for async and streaming, and turned it into a separate package called cachy which we’re open sourcing today 🎉.
Setting up cachy is pretty straightforward.
pip install pycachy
Then add the following two lines to the top of your notebook or script:
from cachy import enable_cachy
enable_cachy()
Now when you use Anthropic or OpenAI’s Python SDK, the response will be cached and re-used whenever you make the same LLM call again. You don’t need to write any additional code. cachy just works automatically in the background.
Here’s an example.
from cachy import enable_cachy
enable_cachy()
Now, let’s request a completion from OpenAI.
from openai import OpenAI
cli = OpenAI()
r = cli.responses.create(model="gpt-4.1", input="Hey!")
r
Hey! How can I help you today? 😊
If we run the same request again, the response is now read from the cache.
r = cli.responses.create(model="gpt-4.1", input="Hey!")
r
Hey! How can I help you today? 😊
Although this post focuses on caching LLM responses, cachy can be used to cache any calls made with httpx. All you need to do is tell cachy what urls you want to cache.
enable_cachy(doms=["api.example.com", "api.demo.com"])
cachy is one of those little quality-of-life improvements that keeps us in a flow state for longer and helps us move that little bit faster. We hope you’ll find it useful.
2025-07-23 08:00:00
tldr: I got frustrated with the developer experience I was getting with the Stripe SDK and decided to create, in my opinion, a better one called FastStripe. FastStripe supports the full Stripe API thanks to the awesome OpenAPI spec that Stripe released, but it makes it cleaner, organizes it better, and integrates well with your IDE so that you get nice tab completion on your parameters. You get clean docstrings that explain what the function does and what each parameter does. We also add helper functions that let things like creating a one-time payment be done in 6 lines of code, whereas with the official SDK it’s roughly 25. Setting up a recurring subscription is similar: FastStripe gets it done in 9 lines of code where the equivalent in the official SDK is roughly 25 lines. It is out and about: it has been powering our own internal apps for almost a month now without any issues, all the while reducing their complexity. You can start using it by running pip install faststripe. We also continually update FastStripe for every new version of the Stripe API.

Stop me if this sounds familiar: you want to take people’s money, and you want to make sure you can take it super easily. Like candy from a baby easy. And, yeah, yeah, yeah, of course, in exchange for that money you want to provide some service or product that the person is willing to exchange their money for. This used to be a nightmare to do, and for some companies it can still kind of feel like a nightmare (cough, cough, Google). Stripe makes much of this process pretty easy, but by golly, trying to use their SDK over the last eight months has been a journey, and it’s been a long one. So long and bumpy that I realized early on that this just wasn’t going to cut it. Let me show you what I mean. Here’s what accepting payments typically looks like.

Looks simple enough, right? Well, when you know what the parameters are, it is.
But if I’m some weirdo who doesn’t actually have any of these memorized, I need to go look at the source code to read the docstring and implementation details. Great, let’s do that! Here’s the actual source code for creating a checkout session. Well shit… I experienced this moment again and again and again when it came time to integrate payment processing into my apps. The only solution that I could find was to go to their website and look at the actual API reference docs. Here’s what those docs look like in case you’re interested: docs like that bring a tear to my eye. It’s just so beautiful. Here’s a link to it as well if you want to see it for yourself, along with the rest of the docs, which I highly recommend, as they’re really well written.

However, these trips to the docs caused a lot of context switching, which is a developer’s worst enemy, and it’s also not a great experience for exploring the different features that Stripe offers to developers. I don’t want my teammates to have to experience this every time they want to launch an app that takes payments. I don’t want you, the reader, to have to do this either. It’s not fun. It kills an afternoon when it should take a few minutes. And so, I decided to implement what one of my previous colleagues here at AnswerAI, Isaac, likes to call rage-driven development (RDD) and build FastStripe: the Stripe experience you deserve.

Let’s see what it looks like to implement the above in FastStripe: a single method call, which under the hood handles creating the product or finding an existing product, sets up the price, and creates your checkout session with sensible defaults. And if you want more control, FastStripe gives you access to the full Stripe API, even those esoteric corners that you’ll probably never use (By the way, did you know that Stripe has an API specifically for Climate products?! I didn’t until working on this project and really wish I could fill that part of my brain with something useful.
Alas…). It also adds proper IDE support, so you have nice tab completion, and clean docstrings that explain each parameter, letting you stay in your happy place for longer.

Well, if you are still here, I’ll assume you took the red pill and are following me down the rabbit hole of how I built FastStripe. Let’s begin! Let’s talk about what made this all possible. Stripe, bless their souls, went ahead and published a truly beautiful OpenAPI spec for their entire API. Now, if you’re not familiar, OpenAPI specs are like the blueprint for how to talk to an API. They describe every endpoint, every parameter, and even give decent human-friendly descriptions of what things do and what you need to provide. And Stripe’s is exceptionally thorough. What’s even cooler is that these specs are really easy to parse, since they are written in either JSON or YAML. Years back, my CEO Jeremy Howard and Hamel Husain did this to dynamically generate a Python SDK for the GitHub API called ghapi. ghapi provides 100% always-updated coverage of the entire GitHub API: because the OpenAPI spec is automatically converted to a Pythonic API, ghapi is always up to date with the latest changes to GitHub’s APIs. Furthermore, because this is all done dynamically, the entire package is only 35kB in size! And I thought to myself, what a wonderful world it would be if I could do the same for Stripe.

We take these endpoint descriptions and use them to automatically generate Python classes where we override the signature and docstring of each class’s __call__ method. This makes exploring the Stripe API so much easier than reading through countless API doc pages.

FastStripe follows Stripe’s monthly API versioning to ensure stability and compatibility. Rather than automatically using the latest version (which could break existing code when endpoints change), we pin FastStripe releases to specific Stripe API versions.
For example, FastStripe version 2025.06.30.0 corresponds to Stripe’s API version from June 30th, 2025. The final number increments when we add new high-level convenience methods, but the first three numbers always match Stripe’s API version.

But wait, there’s more! FastStripe supports, thanks to the awesomeness of the OpenAPI spec, the entire Stripe API, but we also add some helper functions to streamline some of the more common happy paths. The FastStripe version accomplishes a one-time payment in 6 lines of code compared to the roughly 25 lines (if you omit the comments) that vanilla Stripe takes. Under the hood, the one-time payment function in FastStripe will either find or create the product with an associated price automatically, using the other helper functions that FastStripe provides. The subscription helper is similar, and again takes 9 lines where vanilla Stripe would need roughly 25.

Like many REST APIs, getting a resource, such as the products that you’ve created under your Stripe account, requires you to deal with pagination. Stripe’s API will only return a limited number of results per request (e.g., 10, 25, 100), controlled by a limit parameter. The vanilla Stripe SDK exposes this as a cursor-based pagination system. In practice, this means if you want to get all products, customers, or invoices, you have to loop through the results manually, making repeated requests. FastStripe offers an easy way to automatically fetch all results.

So, if all of this sounded interesting and you’d like to try it for yourself, the sections below show how. FastStripe is open source and we’d love your feedback. Whether you’re building one app or a thousand, we want to make Stripe integrations as frictionless as possible.

TL;DR
Getting started is as simple as pip install faststripe and creating your very first one-time payment link:
from faststripe.core import StripeApi
sapi = StripeApi('your-key-here')
checkout = sapi.one_time_payment(product_name='Digital Course', amount_cents=49_99,
success_url='http://localhost:5001/success',
cancel_url='http://localhost:5001/cancel')
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1gQnoO5ezm5yFB47GZNWO6I...
The Stripe Experience You Deserve

import stripe
# Step 0: Set up Stripe API key
stripe.api_key = 'your-api-key'
# Step 1: Create a product (hope you remember the parameters)
product = stripe.Product.create(name='Digital Course')
# Step 2: Create a price (what parameters does this take again?)
price = stripe.Price.create(product=product.id, unit_amount=4999, # Wait, is this in cents or dollars?
currency='usd')
# Step 3: Create checkout session (time to hunt through docs)
checkout = stripe.checkout.Session.create(mode='payment', # What other modes are there?
line_items=[{'price': price.id, 'quantity': 1}],
success_url='http://localhost:5001/success',
cancel_url='http://localhost:5001/cancel')
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a100gzzSnVxiOBse34iThOdq...
@classmethod
def create(cls, **params: Unpack["Session.CreateParams"]) -> "Session":
    """
    Creates a Checkout Session object.
    """
    return cast(
        "Session",
        cls._static_request(
            "post",
            cls.class_url(),
            params=params,
        ),
    )
FastStripe
from faststripe.core import StripeApi
sapi = StripeApi('your-key-here')
checkout = sapi.one_time_payment(product_name='Digital Course', amount_cents=49_99,
success_url='http://localhost:5001/success',
cancel_url='http://localhost:5001/cancel')
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1u6skiy313rnW2pWwcPhqK5...
def one_time_payment(
    self:StripeApi, product_name, amount_cents,
    success_url, cancel_url, currency='usd', quantity=1, **kw):
    'Create a simple one-time payment checkout'
    _, price = self.priced_product(product_name, amount_cents, currency)
    return self.checkout.sessions_post(
        mode='payment', line_items=[dict(price=price.id, quantity=quantity)],
        automatic_tax={'enabled': True}, success_url=success_url, cancel_url=cancel_url, **kw)
Down the Rabbit Hole We Go
Let’s pay a little bit of attention to the man code behind the curtain. FastStripe first works by taking a snapshot of Stripe’s OpenAPI specification and creating an endpoints Python file, which converts that spec into a cleaner form. This form represents the path to the API, what HTTP verb to use, its summary (which will be used for creating the docstring), and the parameters associated with this path:
# Generated from Stripe's OpenAPI spec for version 2025.05.28
eps = [
    {
        'path': '/v1/customers',
        'verb': 'post',
        'summary': 'Create a customer',
        'params': [
            {'name': 'email', 'description': "Customer's email address"},
            {'name': 'name', 'description': "Customer's full name"},
            # ... 20+ more parameters with descriptions
        ]
    },
    # ... hundreds more endpoints
]
Each generated class overrides the signature and docstring of its __call__ method. This means in your IDE you have nice tab completions and can easily view what each endpoint does and what each parameter is. And similar to ghapi, you can run things like sapi.checkout in a Jupyter environment and it will show all the available operations you can do under the checkout resource:
sapi.checkout
- checkout.sessions_get(created: 'str', customer: 'str', customer_details: 'str', ending_before: 'str', expand: 'str', limit: 'str', payment_intent: 'str', payment_link: 'str', starting_after: 'str', status: 'str', subscription: 'str'): List all Checkout Sessions
- checkout.sessions_session_get(session, expand: 'str'): Retrieve a Checkout Session
- checkout.sessions_session_post(session, collected_information: dict = None, expand: list = None, metadata: object = None, shipping_options: object = None): Update a Checkout Session
...
Or explore all the resources by doing the same for the root sapi class:
sapi
- account
- accounts
- apple
- application
- apps
...
Versioning
The first three numbers always match Stripe’s API version; the final number increments when we add new high-level convenience methods like sapi.one_time_payment().
Helper Functions
sapi.one_time_payment() is one of these helper functions. In fact, I lied a bit in the intro when I showed the difference in code between doing it in vanilla Stripe and FastStripe. The more accurate Stripe version would be this:
# Step 1: Create or find a product
products = stripe.Product.list(limit=100)
product = next((p for p in products if p.name == 'Digital Course'), None)
if not product:
    product = stripe.Product.create(name='Digital Course')
    # Handle pagination if you have >100 products
    pass
# Step 2: Create or find a price
prices = stripe.Price.list(product=product.id, limit=100)
price = next((p for p in prices if p.unit_amount == 4999), None)
if not price:
    price = stripe.Price.create(
        product=product.id,
        unit_amount=4999,
        currency='usd'
    )
    # More pagination handling
# Step 3: Create checkout session
checkout = stripe.checkout.Session.create(
    mode='payment',
    line_items=[{'price': price.id, 'quantity': 1}],
    success_url='http://localhost:5001/success',
    cancel_url='http://localhost:5001/cancel'
)
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1y7FuflPm1o3jzOojiGpMHy...
Under the hood, this uses the other helper functions that FastStripe provides, like priced_product and find_product. We also got a similar helper function for subscriptions:
checkout = sapi.subscription(
product_name='Pro Plan', amount_cents=19_99,
success_url='http://localhost:5001/welcome',
cancel_url='http://localhost:5001/pricing',
customer_email='[email protected]'
)
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1r4kjOWpmM2OKicG7dF5t1e...
Pagination
Frequently, you have more results than the limit allows, so you need to make multiple requests using pagination parameters such as starting_after or ending_before to fetch the next chunk of data:
products = []
starting_after = None
while True:
    resp = stripe.Product.list(limit=100, starting_after=starting_after)
    products.extend(resp.data)
    if not resp.has_more:
        break
    starting_after = resp.data[-1].id
    break  # stop after the first page for this demo
len(products), products[0].keys()
(100,
 dict_keys(['id', 'object', 'active', 'attributes', 'created', 'default_price', 'description', 'images', 'livemode', 'marketing_features', 'metadata', 'name', 'package_dimensions', 'shippable', 'statement_descriptor', 'tax_code', 'type', 'unit_label', 'updated', 'url']))
Similar to ghapi, FastStripe has a paged function which turns any Stripe pagination endpoint into a Python generator that you can iterate through:
from faststripe.page import *
for p in paged(sapi.customers.get, limit=2):
print(len(p.data), p.data[0].keys())
break
2 dict_keys(['id', 'object', 'address', 'balance', 'created', 'currency', 'default_source', 'delinquent', 'description', 'discount', 'email', 'invoice_prefix', 'invoice_settings', 'livemode', 'metadata', 'name', 'next_invoice_sequence', 'phone', 'preferred_locales', 'shipping', 'tax_exempt', 'test_clock'])
We also have pages, which will return all items from all the pages as a list:
prods = pages(sapi.products.get, limit=100)
len(prods), prods[0].keys()
(658,
 dict_keys(['id', 'object', 'active', 'attributes', 'created', 'default_price', 'description', 'images', 'livemode', 'marketing_features', 'metadata', 'name', 'package_dimensions', 'shippable', 'statement_descriptor', 'tax_code', 'type', 'unit_label', 'updated', 'url']))
Getting Started with FastStripe
1. Stripe Setup
2. FastStripe Setup
pip install faststripe
from faststripe.core import StripeApi
sapi = StripeApi('your-key-here')
checkout = sapi.one_time_payment(product_name='Digital Course', amount_cents=49_99,
success_url='http://localhost:5001/success',
cancel_url='http://localhost:5001/cancel')
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1PxMDnqAbBYoqeNgdYyVxST...
Or for a subscription:
checkout = sapi.subscription(
product_name='Pro Plan', amount_cents=19_99,
success_url='http://localhost:5001/welcome',
cancel_url='http://localhost:5001/pricing',
customer_email='[email protected]'
)
print(checkout.url[:64] + "...")
https://billing.answer.ai/c/pay/cs_test_a1oTHsHFpEdQWIwHVb5Ghav0...
Next Steps
2025-06-13 08:00:00
TLDR: This post introduces fastmigrate, a Python database migration tool. It focuses on sqlite, and it does not require any particular ORM library. It’s suitable if you want to work directly with sqlite and keep things simple. For instructions, check out the fastmigrate repo.
Let’s talk migrations!

Uh, no. Let’s talk about the database migration pattern.
Migrations represent a powerful architectural pattern for managing change in your database. They let you write your application code so that it only needs to know about the latest version of your database, and they simplify the code you use to update the database itself.
But it is easy to overlook this pattern because many database helper libraries do so many other things at the same time, in such a complex fashion, that they obscure the simplicity of this basic pattern.
So today, we’re releasing fastmigrate, a library and command line tool for database migrations. It embraces the simplicity of the underlying pattern by being a simple tool itself. It provides a small set of commands. It treats migrations as just a directory of your own scripts. It only requires understanding the essential idea, not a lot of extra jargon. We like it!
This article will explain what database migrations are in general and what problem they solve, and then illustrate how to do migrations in sqlite with fastmigrate.
The core problem which migrations solve is to make it easier to change your database schema (and other basic structures) without breaking your application. They do this by making database versions explicit and managed, just like the changes in your application code.
To see how complexity creeps in otherwise, consider a typical sequence of events in developing an app. The first time the app runs, it only needs to handle one situation, the case where there is no database yet and it needs to create one. At this point, your app’s startup code might look like this:
# App v1
db.execute("CREATE TABLE documents (id INT, content TEXT);")
But wait… The second time a user runs that same app, the table will already exist. So in fact your code should handle two possible cases – the case where the table does not exist, and the case where it already exists.
So in the next version of your app, you update your initialization code to the following:
# App v2
db.execute("CREATE TABLE IF NOT EXISTS documents (id INT, content TEXT);")
Later, you might decide to add a new column to the database. So in your app’s third version, you add a second line:
# App v3
db.execute("CREATE TABLE IF NOT EXISTS documents (id INT, content TEXT);")
db.execute("ALTER TABLE documents ADD COLUMN title TEXT;")
But wait again… You don’t want to alter the table like this if the column already exists. So App v4 will need more complex logic to handle that case. And so on.
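To make the problem concrete, here is a sketch of what App v4’s startup logic might look like (illustrative code using sqlite3; the exact checks depend on your schema):

```python
import sqlite3

# App v4 (sketch): startup now has to inspect the schema before changing it,
# and another check like this accumulates with every schema change.
db = sqlite3.connect(":memory:")  # in-memory here just for illustration
db.execute("CREATE TABLE IF NOT EXISTS documents (id INT, content TEXT);")
cols = [row[1] for row in db.execute("PRAGMA table_info(documents)")]
if "title" not in cols:  # only alter the table if the column is missing
    db.execute("ALTER TABLE documents ADD COLUMN title TEXT;")
```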
Even this trivial example would create bugs if not handled properly. In a real app, as you introduce and then modify table relationships, such issues become more subtle, numerous, and stressful since one wrong step can lose user data.
What happens is that, with every new version, your application’s code grows more complicated because it is required to handle not just one state of the database but every possible previous state.
To avoid this, you would need to force separate database updates so that your application code knew exactly what to expect from the database. This is often not feasible when the app manages the database and every user gets to decide when to run their own installation of the app, as is the case in a mobile app, a desktop app, or a webapp with one database per user. Even in systems with a single database, forcing separate database updates would introduce an important new kind of change to manage – that is, database changes, which would need to be delicately coupled with changes in your application code.
This gets to the heart of the problem, which is that by default these various database states are implicit and unmanaged.
With your application code, a git commit unambiguously specifies both a version of your code and the change which produced it. Then, your deployment system lets you control exactly which version of your application your users will see next. But with your database, without some system, all you know is that the database is in some unnamed state produced by previous code. The version control and deployment tools which so nicely manage your application code will not automatically control which version of the database your application sees next.
The database migration pattern solves this problem with two key measures:
First, defining database versions, based on migrations. Instead of reasoning about unnamed database state, we introduce explicit version management of your database.
How do we do this? With migration scripts. A migration script is an isolated, single-purpose script whose only job is to take the database from one version (e.g., 5) to the next version (e.g., 6).
Fastmigrate keeps this simple and names the scripts based on the database version they produce so that, for instance, the script named 0006-add_user.sql must be the one and only script which produces database version 6. In a fundamental sense, the version numbers in the migration scripts define the set of recognized database versions. Thus, you can see the past version of your database by listing the scripts which produced those versions, just like looking at a log of git commits:
$ ls -1 migrations/
0001-initialize.sql
0002-add-title-to-documents.sql
0003-add-users-table.sql
This structured approach enables the next key measure.
Second, writing the app to target one database version. Moving the database evolution code into these migration scripts means that the application code can forget about database changes and target only one version of the database, the latest version.
The application can rely on a migration library, like fastmigrate, to run whatever migrations are needed. That might mean recapitulating all the migrations to create the latest version of the database from nothing when running a fresh instance in development. Or it might mean applying only the latest migration, to bring a recent database version up to date. Or it might mean something in between. The point is, the application does not need to care.
One way to measure the simplification is to count how many fewer cases different parts of your system need to handle.
Before migrations, your application code was in effect responsible for handling all possible previous database states, even when it would have required increasingly careful attention to remember and understand just what all those states were. After migrations, everything is explicit, legible, and factored. The application is responsible for working with just one database version. And every database version has exactly one script which produces it from one previous version. (So clean! Doesn’t it make you want to sigh? Ahhhh…)
| Feature | Without migrations | With migrations |
|---|---|---|
| DB States | Uncounted, unnamed | Explicit, named versions |
| DB Management | None | One migration script per version |
| App Requirements | App must support all DB states, and manage DB changes | App must support only one DB version, the latest |
Let us follow the previous example again, and see how this works in fastmigrate.
Instead of embedding the evolving database schema logic into your app’s startup, you will define a series of migration scripts. These scripts are SQL, but you could also use Python or shell scripts. Your application will then use fastmigrate’s API to run those scripts as needed, bringing the database to the latest expected version automatically.
Your first migration script creates the table. Create a directory migrations/ and in that directory put the file 0001-initialize.sql.
-- migrations/0001-initialize.sql
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
content TEXT
);
The 0001 prefix is key: it indicates this is the first script to run, and also that it produces version 1 of your database.
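This naming convention is easy to parse mechanically. Here is a sketch of how a filename maps to a version (illustrative code, not fastmigrate’s actual source):

```python
import re

def script_version(filename: str) -> int:
    # '0006-add_user.sql' -> 6: the numeric prefix names the database
    # version this script produces.
    m = re.match(r"(\d+)-", filename)
    if m is None:
        raise ValueError(f"not a migration script: {filename}")
    return int(m.group(1))

scripts = ["0001-initialize.sql", "0002-add-title-to-documents.sql",
           "0003-add-users-table.sql"]
print([script_version(s) for s in sorted(scripts)])  # [1, 2, 3]
```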
Run pip install fastmigrate to install it from PyPI, so your app can use it.
Now your application startup code can rely on fastmigrate to create and/or update the database. Create your app, in a file called app.py:
from fastmigrate.core import create_db, run_migrations, get_db_version
db_path = "./app.db"
migrations_dir = "./migrations/"
# Ensures a versioned database exists.
# If no db exists, it's created and set to version 0.
# If a db exists, nothing happens
create_db(db_path)
# Apply any pending migrations from migrations_dir.
success = run_migrations(db_path, migrations_dir)
if not success:
    print("Database migration failed! Application cannot continue.")
    exit(1)  # Or your app's specific error handling
# After this point, your application code can safely assume
# the 'documents' table exists exactly as defined in 0001-initialize.sql.
# The database is now at version 1.
version = get_db_version(db_path)
print(f"Database is at version {version}")
The first time this Python code runs, create_db() initializes your database and inserts metadata to mark it as a managed database with version 0. This is done by adding a small _meta table, which stores the current version and indicates it is a managed database.
Then, the function run_migrations() sees 0001-initialize.sql. Since version 1 is greater than the database’s current version 0, the function executes it, and marks the database’s version to 1. On subsequent runs, if no new migration scripts have been added, run_migrations() sees the database is already at version 1 and does nothing further.
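The behavior described above can be sketched in a few lines (a simplified illustration of the pattern, not fastmigrate’s actual implementation, which adds error handling, reporting, and more):

```python
import sqlite3

def apply_pending(db_path, scripts):
    # scripts maps version number -> SQL text, e.g. {1: "...", 2: "..."}.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS _meta (id INTEGER PRIMARY KEY, version INTEGER)")
    row = con.execute("SELECT version FROM _meta WHERE id = 1").fetchone()
    current = row[0] if row else 0
    for version in sorted(scripts):
        if version <= current:
            continue  # already applied, nothing to do
        con.executescript(scripts[version])
        con.execute("INSERT OR REPLACE INTO _meta (id, version) VALUES (1, ?)", (version,))
        con.commit()
        current = version
    con.close()
    return current

migrations = {
    1: "CREATE TABLE documents (id INTEGER PRIMARY KEY, content TEXT);",
    2: "ALTER TABLE documents ADD COLUMN title TEXT;",
}
```

Running apply_pending twice is harmless: the second run finds the database already at the highest version and does nothing, which is exactly the property the application relies on.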
You can run your app now with python3 app.py, and the app will report that the db is at version 1, no matter how many times you run it. You will also see app.db, the database file it created, in your directory.
But what about schema evolution?
When you decide your documents table needs a title column, you only need to add a migration script which adds the column.
This change defines version 2 of your database. In the migrations directory, add a file named 0002-add-title-to-documents.sql.
-- migrations/0002-add-title-to-documents.sql
ALTER TABLE documents ADD COLUMN title TEXT;
The key point is, your application startup code does not change: it remains the same Python snippet shown above.
When that code runs on a database which was previously at version 1 (i.e., where only 0001-initialize.sql had been applied), the following happens:
create_db(db_path) confirms the database exists and is at version 1.
run_migrations() scans the migrations/ directory. It finds 0002-add-title-to-documents.sql. Since the script’s version (2) is greater than the database’s current version (1), it executes this new script.
After successful execution, fastmigrate marks the database’s version to 2.
Your application code, which runs after these fastmigrate calls, can now assume the documents table has id, content, and the new title column.
Run your app again, with python3 app.py, and now it will report the database is at version 2.
If you are curious how this works under the hood, it is nothing occult. Fastmigrate marks a database by adding the _meta table, which you can see directly by using the sqlite3 executable:
$ sqlite3 app.db .tables
_meta documents
You can look in it to see the version is now 2:
$ sqlite3 app.db "select * from _meta;"
1|2
But this is an implementation detail. The crucial point is the shift in approach:
The complex conditional logic is entirely removed from your application’s main startup sequence.
Schema changes are isolated into small, clearly named, versioned SQL scripts.
Your application’s core startup routine (create_db(), run_migrations()) is stable, even as the database schema evolves.
The rest of your application code, the part that actually uses the database, can always be written to expect the single, latest schema version defined by the highest-numbered migration script. It doesn’t need conditional paths for older database structures.
This "append-only" approach to migrations, where you always add new, higher-numbered scripts for subsequent changes, makes your database evolution explicit, managed, and easy to integrate. The responsibility for reaching the target schema version is delegated to fastmigrate.
When you check your code into version control, you should take care to include the migration script which defines the new database version along with the application code which requires that new database version. Then, your application code will always see exactly the database version which it requires.
Before integrating a new migration script into your app, you will of course want to test it. This is straightforward since migration scripts are designed to run in isolation. To help run them interactively, fastmigrate also provides a command line interface (CLI).
If you want to inspect the database your app just created, you can run the check version command:
$ fastmigrate_check_version --db app.db
FastMigrate version: 0.3.0
Database version: 2
When the names of CLI commands match the API, they do exactly the same thing. fastmigrate_create_db behaves just like fastmigrate.create_db, fastmigrate_run_migrations like fastmigrate.run_migrations, and so on.
For instance, you can run these commands to create an empty managed db and run migrations on it:
$ fastmigrate_create_db --db data.db
Creating database at data.db
Created new versioned SQLite database with version=0 at: data.db
$ fastmigrate_run_migrations --db data.db --migrations migrations/
Applying migration 1: 0001-initialize.sql
✓ Database updated to version 1 (0.00s)
Applying migration 2: 0002-add-title-to-documents.sql
✓ Database updated to version 2 (0.00s)
Migration Complete
• 2 migrations applied
• Database now at version 2
• Total time: 0.00 seconds
Nothing new to learn!
For a more detailed walkthrough of the recommended workflow when introducing a new migration, please see our guide on safely adding migrations.
There is also guidance on taking a database which started outside of fastmigrate, and enrolling it as a managed database. Technically, this is nothing more than adding the private metadata which marks the database’s version. But the tool gives you some help in getting started by generating a draft 0001-initialize.sql migration script, since you will need one which initializes a database equivalent to the one you are enrolling. This generated script is only a draft: you should definitely verify manually that it is correct for your database.
Check out that map again and consider that our ancestors traveled thousands of miles, without even having air conditioning, podcasts, and AI chatbots to flatter them. It was rough and, yes, we don’t have it so bad.
But nevertheless, managing the evolution of a production database is stressful.
This is natural enough, since it’s the user’s data. The whole purpose of most software is to transform and store that data. So if you mess up your database, your software has failed at its main reason for existing.
The antidote to that stress is clarity. You want to know what you are doing.
Consider that warm feeling of comfort you get when someone refers to a git commit by its hash. (Mmmm.) That feeling is because a hash is unambiguous. If you ask git to compute which files changed between two commit hashes, you know exactly what the answer means. You want to have the same clarity regarding your database.
The migrations pattern brings that clarity by ensuring your database has a simple version number which tells you what state it is in and, therefore, exactly what your application can expect.
And since it’s a simple idea, it needs only a simple tool.
That is why fastmigrate introduces only a few main commands – create_db, get_db_version, and run_migrations – and relies on things you already know, like how to list files and interpret an integer.
In contrast, many existing database tools are complex because they provide a lot of other things as well – object-relational mappers, templating systems, support for various backends, requirements for multiple config files with different syntaxes. If your system has grown in complexity to the point where it needs all that, then that is what you need.
But if you are able to keep your system simple, then a simple solution will serve you better. It will be easier to understand, easier to use, easier to hold in your head and in your hand. If you were chopping a carrot, would you want a good sharp knife? Or a food processor, with a special carrot-chopping attachment, which you need to read the manual of just to figure out how to attach it?
fastmigrate aims to be a good sharp knife. May you wield it with clarity and confidence!
2025-06-07 08:00:00
Note from Jeremy: I’m thrilled that the legendary Daniel Roy Greenfeld took the time to dig into a very recent addition I made to fastcore:
flexicache. It’s a super useful little tool which nowadays I use all the time. I hope you like it as much as Danny and I do!
When coding in Python, I really like to use decorators to cache results from functions and methods, often to memory and sometimes to ephemeral stores like memcached. In fact, I’ve worked on and created several cache decorators, including one that influenced the implementation of the @cached_property decorator in Python 3.8.
A cache decorator called flexicache is part of the fastcore library. flexicache allows you to cache in memory the results of functions and methods in a flexible way. Besides providing LRU caching, it lets each use of the decorator be configured with one or more cache invalidation policies.
Two policies, time_policy and mtime_policy, invalidate the cache based on elapsed time and file modification time respectively. The time_policy invalidates the cache after a specified number of seconds, while the mtime_policy invalidates the cache if the file has been modified since the result was cached.
Let’s try it out!
# Import necessary libraries
from fastcore.xtras import flexicache, time_policy, mtime_policy
# Libraries used in testing cache validity and cache invalidation
from random import randint
from pathlib import Path
from time import sleep
Here’s a simple function returning a number between 1 and 1000 that we can show being cached. We’ll use this in all our examples.
def random_func(v):
return randint(1, 1000)
# Assert False as the function is not cached
assert random_func(1) != random_func(1)
This is how we use the time_policy to cache the function.
@flexicache(time_policy(.1))
def random_func():
return randint(1, 1000)
# assert True as the function is cached
assert random_func() == random_func()
Let’s use the sleep function to simulate time passing between calls to random_func.
result = random_func()
# True as the function is cached
assert result == random_func()
# Sleep for .2 seconds to allow cache to expire
sleep(0.2)
# Assert False as the cache has expired and the function is called again
assert result != random_func()
We’ll try with mtime_policy, checking to see if touching a file invalidates the cache. We’ll use this site’s main.py file as the file to touch.
@flexicache(mtime_policy('../../main.py'))
def random_func():
return randint(1, 1000)
# Assert True as the function is cached
assert random_func() == random_func()
Now let’s use the Path.touch() method to touch the file. This will update the file’s modification time to the current time, which should invalidate the cache.
# Call the function to cache the result
result = random_func()
assert result == random_func() # True as the function is cached
# Update the file's modification time, which invalidates the cache
Path('../../main.py').touch()
# Assert False as the cache is invalidated
assert result != random_func()
A unique feature of flexicache is that you can use multiple policies at the same time. This allows you to combine the benefits of different caching strategies. In this example, we’ll use both time_policy and mtime_policy together. This means that the cache will be invalidated if either the time limit is reached or the file has been modified.
Testing the cache with both policies is identical to the previous examples: we’ll test time invalidation first, then file-modification invalidation, against the same function.
@flexicache(time_policy(.1), mtime_policy('../../main.py'))
def random_func():
return randint(1, 1000)
# True as the function is cached
assert random_func() == random_func()
Testing time invalidation is the same as before. We’ll call the function, wait for the time limit to be reached, and then call it again to see if the cache is invalidated.
result = random_func()
# True as the function is cached
assert result == random_func()
# Sleep for .2 seconds to allow cache to expire
sleep(0.2)
# False as the cache has expired and the function is called again
assert result != random_func()
Testing file timestamp invalidation is the same as before. We’ll call the function, touch the file, and then call it again to see if the cache is invalidated.
# Call the function to cache the result
result = random_func()
# True as the function is cached
assert result == random_func()
# Update the file's modification time, which invalidates the cache
Path('../../main.py').touch()
# Assert False as the cache is invalidated
assert result != random_func()
Now let’s test out the flexicache decorator to see how it behaves as an lru_cache replacement. For reference, LRU caching is a caching strategy that keeps track of the most recently used items and removes the least recently used items when the cache reaches its maximum size. In other words, it evicts the oldest, least recently touched entries first when it runs out of space.
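To make that eviction order concrete, here is a toy LRU cache built on collections.OrderedDict. This is purely illustrative and is not how flexicache is implemented; the TinyLRU name is made up for the example.

```python
from collections import OrderedDict

class TinyLRU:
    """Toy LRU cache illustrating eviction order (not flexicache's implementation)."""
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.d = OrderedDict()  # insertion order doubles as recency order
    def get(self, key):
        if key in self.d:
            self.d.move_to_end(key)  # a hit refreshes this entry's recency
        return self.d.get(key)
    def put(self, key, value):
        self.d[key] = value
        self.d.move_to_end(key)
        if len(self.d) > self.maxsize:
            self.d.popitem(last=False)  # evict the least recently used entry
```

Note that a get refreshes recency, which is what distinguishes LRU from plain first-in-first-out eviction: a frequently read old entry survives, while an untouched one is evicted first.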
We’ll use flexicache with a cache maxsize of 2, meaning after 2 saves it starts discarding the oldest cache entries. Entries are identified in the cache by their arguments (v), so we add an argument to the function.
@flexicache(maxsize=2)
def random_func(v):
    return randint(1, 1000)
Let’s see how it works.
result1 = random_func(1)
# True as the function is cached
assert result1 == random_func(1)
# True as the function is cached
assert random_func(2) == random_func(2)
So far so good. The cache is working as expected. Now let’s start evicting the first items added to the cache. We’ll add a third item to the cache and see if the first one is evicted.
# True as the function for 3 is cached,
# but it will evict the result of random_func(1)
assert random_func(3) == random_func(3)
# False as the first result is no longer cached
assert result1 != random_func(1)
lru_cache is a built-in Python decorator that provides a simple way to cache the results of a function. It uses a Least Recently Used (LRU) caching strategy: it keeps track of the most recently used items, as identified by their arguments, and removes the least recently used items when the cache reaches its maximum size.
The downside is that it doesn’t have a timeout feature, so if you want to cache results for a specific amount of time, you need to implement that yourself.
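As a hedged sketch of what implementing that yourself could look like, you can fold a coarse time bucket into lru_cache’s key, so lookups stop hitting once the bucket rolls over. The diy_timed_cache name here is made up for illustration, and the technique is cruder than a real per-entry timeout: an entry cached just before a bucket boundary expires early.

```python
import time
from functools import lru_cache, wraps

def diy_timed_cache(seconds, maxsize=128):
    """DIY timeout on top of functools.lru_cache (illustrative, not fastcore's code)."""
    def deco(fn):
        @lru_cache(maxsize=maxsize)
        def cached(bucket, *args, **kwargs):
            return fn(*args, **kwargs)
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # Same bucket -> same key -> cache hit; a new bucket forces a recompute.
            bucket = int(time.monotonic() // seconds)
            return cached(bucket, *args, **kwargs)
        return wrapper
    return deco
```

fastcore’s timed_cache, introduced next, gives you the same convenience without having to roll this yourself.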
fastcore.xtras.timed_cache is a convenience wrapper built on flexicache that adds a timeout feature to functools.lru_cache-style caching.
from fastcore.xtras import timed_cache
# shortcut for @flexicache(time_policy(.1), maxsize=2)
@timed_cache(.1, maxsize=2)
def random_func(v):
return randint(1, 1000)
# True as the function is cached
assert random_func(1) == random_func(1)
Testing the timeout is the same as before with flexicache(time_policy(.1), maxsize=2). We’ll call the function, wait for the timeout to be reached, and then call it again to see if the cache is invalidated.
# Cache a result, then wait long enough for the cache to expire
result1 = random_func(1)
sleep(0.2)
# Assert False as the cache is time invalidated
assert result1 != random_func(1)
Finally, confirm that the LRU cache is evicting the first cached item. This is the same set of LRU cache tests we used in the section above about LRU caching. Again, we’ll add a third item to the cache and see if the first one is evicted.
result1 = random_func(1)
# True as the function is cached
assert result1 == random_func(1)
# True as the function is cached
assert random_func(2) == random_func(2)
# True as the function for 3 is cached,
# but it will evict the result of random_func2(1)
assert random_func(3) == random_func(3)
# False as the first result is no longer cached
assert result1 != random_func(1) 