Blog of Geoffrey Huntley

don’t waste your back pressure

2026-01-17 18:46:56


I am fortunate to be surrounded by folks who listen, and the post linked below will go down as seminal reading for people interested in AI context engineering.

A simple convo between mates - well, Moss translated it into words, and I've been waiting for it to come out so I didn't front-run him.

Don’t waste your back pressure ·
Back pressure for agents You might notice a pattern in the most successful applications of agents over the last year. Projects that are able to set up structure around the agent itself, to provide it with automated feedback on quality and correctness, have been able to push them to work on longer-horizon tasks. This back pressure helps the agent identify mistakes as it progresses, and models are now good enough that this feedback can keep them aligned to a task for much longer. As an engineer, this means you can increase your leverage by delegating progressively more complex tasks to agents, while increasing trust that when completed they are at a satisfactory standard.

read this and internalise this

Enjoy. This is what engineering now looks like in the post-loom/gastown era, or even when doing ralph loops.

software engineering is now about preventing failure scenarios and preventing the wheel from turning over through back pressure to the generative function

If you aren’t capturing your back-pressure then you are failing as a software engineer.

Back-pressure is part art, part engineering, and a whole bunch of performance engineering: you need "just enough" to reject invalid generations (aka "hallucinations"), but if the wheel spins too slowly ("tests take a long time to run, or the application is slow to compile") then it's too much resistance.

There are many ways to tune back-pressure. As Moss states, it starts with the choice of programming language and applying engineering knowledge to design a fast test suite that provides signal, but perhaps my favourite is pre-commit hooks (aka prek).

GitHub - j178/prek: ⚡ Better `pre-commit`, re-engineered in Rust
⚡ Better `pre-commit`, re-engineered in Rust. Contribute to j178/prek development by creating an account on GitHub.

Under normal circumstances pre-commit hooks are annoying because they slow down humans, but now that humans aren't the ones doing the software development, it really doesn't matter anymore.
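As a sketch of what that gate can look like: a tiny harness that runs the hooks and the test suite after each generation and hands any failure output straight back into the loop. The commands here are illustrative stand-ins (swap in your own hooks and tests); the art is keeping the list fast enough that the wheel keeps spinning.

```python
import subprocess

# Illustrative back-pressure gate for a ralph loop. Swap in your own
# hooks and test suite; each check should be fast but strict.
CHECKS = [
    ["prek", "run", "--all-files"],   # pre-commit hooks, re-engineered in Rust
    ["cargo", "test", "--quiet"],     # a test suite tuned for signal, not coverage
]

def back_pressure() -> list[str]:
    """Run every check; return failure output to feed back into the agent's context."""
    failures = []
    for cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}")
    return failures
```

Anything `back_pressure()` returns goes straight back into the next prompt; an empty list means the generation passed the gate.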

everything is a ralph loop

2026-01-17 14:43:54


I’ve been thinking about how the way I build software is so very, very different from how I used to do it three years ago.

No, I’m not talking about acceleration through the use of AI, but about something more fundamental: approach, techniques, and best practices.

Standard software practice is to build vertically, brick by brick - like Jenga - but these days I approach everything as a loop. You see, ralph isn’t just about forwards (building autonomously) or reverse mode (clean-rooming); it’s also a mindset that these computers can indeed be programmed.

watch this video to learn the mindset

I’m there as an engineer just as I was in the brick-by-brick era, but instead I am programming the loop, automating my job function and removing the need to hire humans.

Everyone right now is going through their zany period - just like I did with forward mode and building software AFK on full auto - however I hope that folks will come back down from orbit and remember this from the original ralph post.

While I was in SFO, everyone seemed to be trying to crack on multi-agent, agent-to-agent communication and multiplexing. At this stage, it's not needed. Consider microservices and all the complexities that come with them. Now, consider what microservices would look like if the microservices (agents) themselves are non-deterministic—a red hot mess.

What's the opposite of microservices? A monolithic application. A single operating system process that scales vertically. Ralph is monolithic. Ralph works autonomously in a single repository as a single process that performs one task per loop.

Software is now clay on the pottery wheel, and if something isn’t right then I just throw it back on the wheel to address the items that need resolving.

Ralph is an orchestrator pattern where you allocate the array with the required backing specifications, give it a goal, and then loop the goal.

It's important to watch the loop as that is where your personal development and learning will come from. When you see a failure domain – put on your engineering hat and resolve the problem so it never happens again.

In practice this means doing the loop manually via prompting, or via automation with a pause that involves having to press CTRL+C to progress onto the next task. This is still ralphing, as ralph is about getting the most out of how the underlying models work through context engineering, and that pattern is GENERIC and can be used for ALL TASKS.
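A minimal sketch of that pause-and-continue loop, assuming a hypothetical `agent -p` CLI (substitute claude, amp, or your own 300-line agent):

```python
import subprocess

PROMPT = "study the specs, pick ONE incomplete task, implement it, run the tests"

def run_iteration(agent_cmd: tuple = ("agent", "-p")) -> int:
    """One turn of the loop: hand the standing goal to the agent CLI.
    `agent -p` is a hypothetical stand-in for whichever agent you drive."""
    return subprocess.run([*agent_cmd, PROMPT]).returncode

if __name__ == "__main__":
    # ralphing with a pause: the same goal every turn, one task per loop.
    # The pause between iterations is where you watch for failure domains.
    while True:
        run_iteration()
        input("iteration done. ENTER for the next task, CTRL+C to stop: ")
```

The prompt never changes; the repository state does, which is what makes each iteration pick up where the last left off.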

In other news I've been cooking on something called "The Weaving Loom". The source code of loom can now be found on my GitHub; do not use it if your name is not Geoffrey Huntley. Loom is something that has been in my head for the last three years (and various prototypes were developed last year!) and it is essentially infrastructure for evolutionary software. Gas town focuses on spinning plates and orchestration - a full level 8.

see https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04

I’m going for a level 9 where autonomous loops evolve products and optimise automatically for revenue generation. Evolutionary software - also known as a software factory.


This is a divide now - we have software engineers outwardly rejecting AI, or merely consuming it via Claude Code/Cursor to accelerate the lego-brick-building process....

but software development is dead - I killed it. Software can now be developed cheaper than the wage of a burger flipper at maccas and it can be built autonomously whilst you are AFK.

hi, it me. i’m the guy

I’m deeply concerned for the future of these people and have started publishing videos on YouTube to send down ladders before the big bang happens.

i now won’t hire you unless you have this fundamental knowledge and can show what you have built with it

Whilst software development/programming is now dead, we deeply need software engineers with these skills who understand that LLMs are a new form of programmable computer. If you haven't built your own coding agent yet - please do.

how to build a coding agent: free workshop
It’s not that hard to build a coding agent. 300 lines of code running in a loop with LLM tokens. You just keep throwing tokens at the loop, and then you’ve got yourself an agent.

ps. think this is out there?

It is, but watch it happen live. We are here right now, it's possible, and I'm systemising it.

In the tweet below I am putting loom under the mother of all ralph loops to automatically perform system verification. Instead of days of planning and discussions and weeks of verification, I'm programming this new computer and doing it AFK whilst I DJ, so that I don't have to hire humans.


Any faults identified can be resolved through forward ralph loops. Over the last year the models have become quite good, and it's only now that I'm able to realise this full vision, but I'll leave you with this, dear reader....

What if the models don't stop getting good?

How well will you fare if you are still building jenga stacks, when there are classes of principal software engineers out there proving the point that we are here, right now? Please pay attention.


Go build your agent, go learn how to program the new computer (guidance forthcoming in future posts), fall in love with all the possibilities and then join me in this space race of building automated software factories.

ps. socials

llm weights vs the papercuts of corporate

2025-12-08 23:55:28


In woodworking, there's a saying that you should work with the grain, not against the grain and I've been thinking about how this concept may apply to large language models.

These large language models are built by training on existing data. This data forms the backbone which creates output based upon the preferences of the underlying model weights.

We are now one year into the founding of a new category of company: companies where the majority of the software behind the business was code-generated.

From here on out I’m going to refer to these companies as model-weight-first. This category can be defined as any company that is building with the data (“grain”) that has been baked into the large language models.

Model-weight-first companies do not require as much context engineering. They’re not stuffing the context window with rules in an attempt to override the base models and bend them to a pre-existing corporate standard and conceptualisation of how software should be.

The large language model has decided what to call a method or class because that name is what it prefers; thus, when code is adapted, modified, and re-read into the context window, the model is consuming its preferred choice of tokens.

Model-weight-first companies do not have the dogma of snake_case vs PascalCase vs kebab-case policies that many corporate companies have. Such policies were created for humans to create consistency so humans can comprehend the codebase. Something that is of a lesser concern now that AI is here.

Now, variable naming is a contrived example, but if a study were done in the years to come comparing the velocity/productivity/success rates with AI of a model-weight-first company vs. a corporate company, I suspect the model-weight-first company would have vastly better outcomes, because it isn't doing context engineering to force the LLM to follow some pre-existing dogma. There is one universal truth with LLMs as they are now: the less context you use, the better the outcomes you get.

The less that you allocate (i.e., cursor rules or what have you), the more context window you'll have available for actually implementing the requirements of the software that needs to be built.

So if we take this thought experiment about the models having preferences for tokens and expand it out to another use case, let's say that you needed to build a Docker container at a model weight first company.

You could just ask an LLM to build a Docker container, and it knows how to build a Docker container for, say, Postgres, and it just works. But ask for that same Docker container in a corporate setting where you have to configure HTTPS, a squid proxy, or some sort of artifactory, and where outbound internet access is restricted, and that same simple thing becomes very comical.

You'll see an agent fill up with lots of failed tool calls unless you do context engineering to say "no, if you want to build a docker container, you've got to follow these particular company conventions" in a crude attempt to override the preferences of the inbuilt model weights.

At a model-weight-first company, building a docker image is easy, but at a corporate the agent will have one hell of a time and end up with a suboptimal/disappointing outcome.

So, perhaps this is a factor that needs to be considered when comparing the success rates of AI at one company versus another, or across industries.

If a company is having problems with AI and getting outcomes from AI, are they a model weight first company or are they trying to bend AI to their whims?

Perhaps the corporates who succeed the most with the adoption of AI will be those who shed their dogma that no longer applies and start leaning into transforming to become model-weight-first companies.

ps. socials.

i ran Claude in a loop for three months, and it created a genz programming language called cursed

2025-09-09 11:36:48


It's a strange feeling knowing that you can create anything, and I'm starting to wonder if there's a seventh stage to the "people stages of AI adoption by software developers"


whereby that seventh stage is essentially this scene in the matrix...

It's where you deeply understand that 'you can now do anything' and just start doing it because it's possible and fun, and doing so is faster than explaining yourself. Outcomes speak louder than words.

There's a falsehood that AI results in SWEs' skill atrophy and that there's no learning potential.

If you’re using AI only to “do” and not “learn”, you are missing out
- David Fowler

I've never written a compiler, yet I've always wanted to do one, so I've been working on one for the last three months by running Claude in a while true loop (aka "Ralph Wiggum") with a simple prompt:

Hey, can you make me a programming language like Golang but all the lexical keywords are swapped so they're Gen Z slang?

Why? I really don't know. But it exists. And it produces compiled programs. During this period, Claude was able to implement anything that Claude desired.

The programming language is called "cursed". It's cursed in its lexical structure, it's cursed in how it was built, it's cursed that this is possible, it's cursed in how cheap this was, and it's cursed through how many times I've sworn at Claude.

https://cursed-lang.org/

For the last three months, Claude has been running in this loop with a single goal:

"Produce me a Gen-Z compiler, and you can implement anything you like."

It's now available at:

the 💀 cursed programming language: programming, but make it gen z

the website

GitHub - ghuntley/cursed: the 💀 cursed programming language: programming, but make it gen z
the 💀 cursed programming language: programming, but make it gen z - ghuntley/cursed

the source code

whats included?

Anything that Claude thought was appropriate to add. Currently...

  • The compiler has two modes: interpreted mode and compiled mode. It's able to produce binaries on macOS, Linux, and Windows via LLVM.
  • There are some half-completed VSCode, Emacs, and Vim editor extensions, and a Treesitter grammar.
  • A whole bunch of really wild and incomplete standard library packages.

lexical structure

Control Flow:
ready → if
otherwise → else
bestie → for
periodt → while
vibe_check → switch
mood → case
basic → default

Declaration:
vibe → package
yeet → import
slay → func
sus → var
facts → const
be_like → type
squad → struct

Flow Control:
damn → return
ghosted → break
simp → continue
later → defer
stan → go
flex → range

Values & Types:
based → true
cringe → false
nah → nil
normie → int
tea → string
drip → float
lit → bool
ඞT (Amogus) → pointer to type T

Comments:
fr fr → line comment
no cap...on god → block comment

example program

Here is leetcode 104 - maximum depth for a binary tree:

vibe main
yeet "vibez"
yeet "mathz"

// LeetCode #104: Maximum Depth of Binary Tree 🌲
// Find the maximum depth (height) of a binary tree using ඞ pointers
// Time: O(n), Space: O(h) where h is height

squad TreeNode {
    sus val normie
    sus left ඞTreeNode   
    sus right ඞTreeNode  
}

slay max_depth(root ඞTreeNode) normie {
    ready (root == null) {
        damn 0  // Base case: empty tree has depth 0
    }
    
    sus left_depth normie = max_depth(root.left)
    sus right_depth normie = max_depth(root.right)
    
    // Return 1 + max of left and right subtree depths
    damn 1 + mathz.max(left_depth, right_depth)
}

slay max_depth_iterative(root ඞTreeNode) normie {
    // BFS approach using queue - this hits different! 🚀
    ready (root == null) {
        damn 0
    }
    
    sus queue ඞTreeNode[] = []ඞTreeNode{}
    sus levels normie[] = []normie{}
    
    append(queue, root)
    append(levels, 1)
    
    sus max_level normie = 0
    
    bestie (len(queue) > 0) {
        sus node ඞTreeNode = queue[0]
        sus level normie = levels[0]
        
        // Remove from front of queue
        collections.remove_first(queue)
        collections.remove_first(levels)
        
        max_level = mathz.max(max_level, level)
        
        ready (node.left != null) {
            append(queue, node.left)
            append(levels, level + 1)
        }
        
        ready (node.right != null) {
            append(queue, node.right)
            append(levels, level + 1)
        }
    }
    
    damn max_level
}

slay create_test_tree() ඞTreeNode {
    // Create tree: [3,9,20,null,null,15,7]
    //       3
    //      / \
    //     9   20
    //        /  \
    //       15   7
    
    sus root ඞTreeNode = &TreeNode{val: 3, left: null, right: null}
    root.left = &TreeNode{val: 9, left: null, right: null}
    root.right = &TreeNode{val: 20, left: null, right: null}
    root.right.left = &TreeNode{val: 15, left: null, right: null}
    root.right.right = &TreeNode{val: 7, left: null, right: null}
    
    damn root
}

slay create_skewed_tree() ඞTreeNode {
    // Create skewed tree for testing edge cases
    //   1
    //    \
    //     2
    //      \
    //       3
    
    sus root ඞTreeNode = &TreeNode{val: 1, left: null, right: null}
    root.right = &TreeNode{val: 2, left: null, right: null}
    root.right.right = &TreeNode{val: 3, left: null, right: null}
    
    damn root
}

slay test_maximum_depth() {
    vibez.spill("=== 🌲 LeetCode #104: Maximum Depth of Binary Tree ===")
    
    // Test case 1: Balanced tree [3,9,20,null,null,15,7]
    sus root1 ඞTreeNode = create_test_tree()
    sus depth1_rec normie = max_depth(root1)
    sus depth1_iter normie = max_depth_iterative(root1)
    vibez.spill("Test 1 - Balanced tree:")
    vibez.spill("Expected depth: 3")
    vibez.spill("Recursive result:", depth1_rec)
    vibez.spill("Iterative result:", depth1_iter)
    
    // Test case 2: Empty tree
    sus root2 ඞTreeNode = null
    sus depth2 normie = max_depth(root2)
    vibez.spill("Test 2 - Empty tree:")
    vibez.spill("Expected depth: 0, Got:", depth2)
    
    // Test case 3: Single node [1]
    sus root3 ඞTreeNode = &TreeNode{val: 1, left: null, right: null}
    sus depth3 normie = max_depth(root3)
    vibez.spill("Test 3 - Single node:")
    vibez.spill("Expected depth: 1, Got:", depth3)
    
    // Test case 4: Skewed tree
    sus root4 ඞTreeNode = create_skewed_tree()
    sus depth4 normie = max_depth(root4)
    vibez.spill("Test 4 - Skewed tree:")
    vibez.spill("Expected depth: 3, Got:", depth4)
    
    vibez.spill("=== Maximum Depth Complete! Tree depth detection is sus-perfect ඞ🌲 ===")
}

slay main_character() {
    test_maximum_depth()
}

If this is your sort of chaotic vibe, and you'd like to turn this into the dogecoin of programming languages, head on over to GitHub and run a few more Claude code loops with the following prompt.

study specs/* to learn about the programming language. When authoring the cursed standard library think extra extra hard as the CURSED programming language is not in your training data set and may be invalid. Come up with a plan to implement XYZ as markdown then do it

There is no roadmap; the roadmap is whatever the community decides to ship from this point forward.

At this point, I'm pretty much convinced that any problems found in cursed can be solved by just running more Ralph loops with skilled operators (i.e. people with compiler experience who shape it through prompts from their expertise, vs. letting Claude just rip unattended). There's still a lot to be fixed; happy to take pull requests.

Ralph Wiggum as a “software engineer”
😎Here’s a cool little field report from a Y Combinator hackathon event where they put Ralph Wiggum to the test. “We Put a Coding Agent in a While Loop and It Shipped 6 Repos Overnight” https://github.com/repomirrorhq/repomirror/blob/main/repomirror.md If you’ve seen my socials lately,

The most high-IQ thing is perhaps the most low-IQ thing: run an agent in a loop.

LLMs are mirrors of operator skill
This is a follow-up from my previous blog post: “deliberate intentional practice”. I didn’t want to get into the distinction between skilled and unskilled because people take offence to it, but AI is a matter of skill. Someone can be highly experienced as a software engineer in 2024, but that

LLMs amplify the skills that developers already have and enable people to do things where they don't have that expertise yet.

Success is defined as cursed ending up in the Stack Overflow developer survey as either the "most loved" or "most hated" programming language, and continuing the work to bootstrap the compiler to be written in cursed itself.

Cya soon in Discord? - https://discord.gg/CRbJcKaGNT

the 💀 cursed programming language: programming, but make it gen z

website

GitHub - ghuntley/cursed: the 💀 cursed programming language: programming, but make it gen z
the 💀 cursed programming language: programming, but make it gen z - ghuntley/cursed

source code

ps. socials

anti-patterns and patterns for achieving secure generation of code via AI

2025-09-02 23:53:29


I just finished up a phone call with a "stealth startup" that was pitching an idea that agents could generate code securely via an MCP server. Needless to say, the phone call did not go well. What follows is a recap of the conversation where I just shot down the idea and wrapped up the call early because it's a bad idea.

If anyone pitches you on the idea that you can achieve secure code generation via an MCP tool or Cursor rules, run, don't walk.

Over the last nine months, I've written about the changes that are coming to our industry, where we're entering an arena where most of the code going forward is not going to be written by hand, but instead by agents.

the six-month recap: closing talk on AI at Web Directions, Melbourne, June 2025
Welcome back to our final session at WebDirections. We’re definitely on the glide path—though I’m not sure if we’re smoothly landing, about to hit turbulence, or perhaps facing a go-around. We’ll see how it unfolds. Today, I’m excited to introduce Geoffrey Huntley. I discovered Geoff earlier this year through

where I think the puck is going.

I haven't written code by hand for nine months. I've generated, read, and reviewed a lot of code, and I think perhaps within the next year, the large swaths of code in business will no longer be artisanal hand-crafted. Those days are fast coming to a close.

Thus, naturally, there is a question that's on everyone's mind:

How do I make the agent generate secure code?

Let's start with what you should not do and build up from first principles.

how to build a coding agent: free workshop

2025-08-24 08:23:00

😎
The following was developed last month and has already been delivered at two conferences. If you would like for me to run a workshop similar to this at your employer, please get in contact.

Hey everyone, I'm here today to teach you how to build a coding agent. By this stage of the conference, you may be tired of hearing the word "agent".

You hear the word frequently. However, it appears that everyone is using this term loosely without a clear understanding of what it means or how these coding agents operate internally. It's time to pull back the hood and show that there is no moat.

Learning how to build a coding agent is one of the best things you can do for your personal development in 2025, as it teaches you the fundamentals. Once you understand these fundamentals, you'll move from being a consumer of AI to a producer of AI who can automate things with AI.

Let me open with the following facts:

it's not that hard
to build a coding agent
it's 300 lines of code
running in a loop

With LLM tokens, that's all it is.

300 lines of code running in a loop with LLM tokens. You just keep throwing tokens at the loop, and then you've got yourself an agent.
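That loop, sketched in Python. The `chat` callable and the tool-call wire format are hypothetical stand-ins for whichever LLM API you use; the point is how little plumbing sits around the model.

```python
import subprocess

# Two example tools. Real agents add edit_file, search, etc. - same shape.
TOOLS = {
    "read_file": lambda path: open(path).read(),
    "bash": lambda cmd: subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout,
}

def agent(chat, goal: str, max_turns: int = 50) -> str:
    """The whole trick: send the history to the model, execute any tool
    calls it asks for, append the results, repeat until it stops asking."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        reply = chat(messages)                 # one model turn
        messages.append(reply)
        if not reply.get("tool_calls"):        # no tools requested: done
            return reply["content"]
        for call in reply["tool_calls"]:       # execute tools, feed results back
            output = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": str(output)})
    return "ran out of turns"
```

Everything else a commercial agent ships - context compaction, permissions, UI - is wrapped around this one loop.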


Today, we're going to build one. We're going to do it live, and I'll explain the fundamentals of how it all works. As we are now in 2025, it has become the norm to work concurrently with AI assistance. So, what better way to demonstrate the point of this talk than to have an agent build me an agent whilst I deliver this talk?


Cool. We're now building an agent. This is one of the things that's changing in our industry, because work can be done concurrently and whilst you are away from your computer.

The days of spending a week or a couple of days on a research spike are now over because you can turn an idea into execution just by speaking to your computer.

The next time you're on a Zoom call, consider that you could've had an agent building the work that you're planning to do during that Zoom call. If that's not the norm for you, and it is for your coworkers, then you're naturally not going to get ahead.

please build your own
as the knowledge
will transform you
from being a consumer
to a producer that can
automate things

The tech industry is almost like a conveyor belt - we always need to be learning new things.

If I were to ask you what a primary key is, you should know what a primary key is. That's been the norm for a long time.

In 2024, it is essential to understand what a primary key is.

In 2025, you should be familiar with what a primary key is and how to create an agent, as knowing what this loop is and how to build an agent is now fundamental knowledge that employers are looking for in candidates before they'll let you in the door.

Yes, You Can Use AI in Our Interviews. In fact, we insist - Canva Engineering Blog
How We Redesigned Technical Interviews for the AI Era

As this knowledge will transform you from being a consumer of AI to being a producer of AI that can orchestrate your job function. Employers are now seeking individuals who can automate tasks within their organisation.

If you're joining me later this afternoon for the conference closing (see below), I'll delve a bit deeper into the above.

the six-month recap: closing talk on AI at Web Directions, Melbourne, June 2025
Welcome back to our final session at WebDirections. We’re definitely on the glide path—though I’m not sure if we’re smoothly landing, about to hit turbulence, or perhaps facing a go-around. We’ll see how it unfolds. Today, I’m excited to introduce Geoffrey Huntley. I discovered Geoff earlier this year through

the conference closing talk


Right now, you'll be somewhere on the journey above.

On the top left, we've got 'prove it to me, it's not real,' 'prove it to me, show me outcomes', 'prove it to me that it's not hype', and a bunch of 'it's not good enough' folks who get stuck up there on that left side of the cliff, completely ignoring that there are people on the other side of the cliff, completely automating their job function.

In my opinion, any disruption or job loss related to AI is not a result of AI itself, but rather a consequence of a lack of personal development and self-investment. If your coworkers are hopping between multiple agents, chewing on ideas, and running in the background during meetings, and you're not in on that action, then naturally you're just going to fall behind.
What do I mean by some software devs are “ngmi”?
At “an oh fuck moment in time”, I closed off the post with the following quote. N period on from now, software engineers who haven’t adopted or started exploring software assistants, are frankly not gonna make it. Engineering organizations right now are split between employees who have had that “oh

don't be the person on the left side of the cliff.

The tech industry's conveyor belt continues to move forward. If you're a DevOps engineer in 2025 and you don't have any experience with AWS or GCP, then you're going to find it pretty tough in the employment market.

What's surprising to software and data engineers is just how fast this is elapsing. It has been eight months since the release of the first coding agent, and most people are still unaware of how straightforward it is to build one, how powerful this loop is, and its disruptive implications for our profession.

So, my name's Geoffrey Huntley. I was the tech lead for developer productivity at Canva, but as of a couple of months ago, I'm one of the engineers at Sourcegraph building Amp. It's a small core team of about six people. We build AI with AI.

ampcode.com
cursor
windsurf
claude code
github co-pilot
are lines of code running in a loop with LLM tokens

Cursor, Windsurf, Claude Code, GitHub Copilot, and Amp are just a small number of lines of code running in a loop of LLM tokens. I can't stress that enough. The model does all the heavy lifting here, folks. It's the model that does it all.

You are probably five vendors deep in product evaluation right now, trying to compare all these agents to one another. But really, you're just chasing your tail.

It's so easy to build your own...


There are just a few key concepts you need to be aware of.


Not all LLMs are agentic.

The same way that you have different types of cars, like you've got a 40 series if you want to go off-road, and then you've also got people movers, which exist for transporting people.

The same principle applies to LLMs, and I've been able to map their behaviours into a quadrant.

A model is either high safety, low safety, an oracle, or agentic. It's never both or all.

If I were to ask you to do some security research, which model would you use?

That'd be Grok. That's a low safety model.


If you want something that's "ethics-aligned", it's Anthropic or OpenAI. So that's high safety. Similarly, you have oracles. Oracles are the polar opposite of agentic models; they are suited to summarisation, or to tasks that require a high level of thinking.

Meanwhile, you have providers like Anthropic, and their Claude Sonnet is a digital squirrel (see below).

Claude Sonnet is a small-brained mechanical squirrel of <T>

The first robotic squirrel chased tennis balls. The first digital squirrel chases tool calls.

Sonnet is a robotic squirrel that just wants to do tool calls. It doesn't spend too much time thinking; it biases towards action, which is what makes it agentic. Sonnet focuses on incrementally obtaining success instead of pondering for minutes per turn before taking action.

It seems like every day, a new model is introduced to the market, and they're all competing with one another. But truth be told, they have their specialisations and have carved out their niches.

The problem is that, unless you're working with these models at an intimate level, you may not be aware of these specialisations, which results in consumers comparing the models on two basic primitives:

  1. The size of the context window
  2. The cost

It's kind of like looking at a car, whether it has two doors or three doors, whilst ignoring the fact that some vehicles are designed for off-roading, while others are designed for passenger transport.

To build an agent, the first step is to choose a highly agentic model. That is currently Claude Sonnet, or Kimi K2.

Now, you might be wondering: what if you want a higher level of reasoning, and something to check the work that the incremental squirrel does? Ah, that's simple. You can wire other LLMs in as tools into an existing agentic LLM. This is what we do at Amp.

We call it the Oracle. The Oracle is just GPT wired in as a tool that Claude Sonnet can function call for guidance, to check work progress, and to conduct research/planning.

Amp's oracle is just another LLM registered in as a tool to an agentic LLM that it can function call

The next important thing to learn is that you should only use the context window for one activity. When you're using Cursor or any one of these tools, it's essential to clear the context window after each activity (see below).

autoregressive queens of failure
Have you ever had your AI coding assistant suggest something so off-base that you wonder if it’s trolling you? Welcome to the world of autoregressive failure. LLMs, the brains behind these assistants, are great at predicting the next word—or line of code—based on what’s been fed into

LLM outcomes are a needle-in-a-haystack search over whatever you've allocated into the haystack.

If you start an AI-assisted session to build a backend API controller, then reuse that session to research facts about meerkats, it should be no surprise when you then tell it to redesign the website that the website ends up with facts about your API, or meerkats, or both.

nb. the context window for Sonnet since delivering this workshop has increased to 1m

Context windows are very, very small. It's best to think of them as a Commodore 64, and as such, you should be treating it as a computer with a limited amount of memory. The more you allocate, the worse your outcome and performance will be.

The advertised context window for Sonnet is 200k. However, you don't get to use all of that because the model needs to allocate memory for the system-level prompt. Then the harness (Cursor, Windsurf, Claude Code, Amp) also needs to allocate some additional memory, which means you end up with approximately 176k tokens usable.
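To make that budget arithmetic concrete, here's a sketch in Go. The allocation numbers are illustrative assumptions; vendors don't publish the exact overhead of their system prompts and harnesses.

```go
package main

import "fmt"

// usableTokens returns what's left of the advertised context window after
// fixed allocations (system prompt, harness overhead, MCP tool schemas).
// The figures passed in below are illustrative, not vendor-published.
func usableTokens(advertised int, allocations ...int) int {
	remaining := advertised
	for _, a := range allocations {
		remaining -= a
	}
	return remaining
}

func main() {
	// ~200k advertised, minus system prompt and harness overhead: ~176k usable.
	fmt.Println(usableTokens(200_000, 24_000))
	// Pile on 76k of MCP tool descriptions and you're down to ~100k.
	fmt.Println(usableTokens(200_000, 24_000, 76_000))
}
```

The point of the sketch is that every tool description is a permanent, up-front allocation against the window, before you've typed a single prompt.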

You've probably heard a lot about the Model Context Protocol (MCP). It's the current hot thing, and the easiest way to think about an MCP tool is as a function with a description allocated to the context window that tells the model how to invoke that function.

A common failure scenario I observe is people installing an excessive number of MCP servers, failing to consider the number of tools exposed by a single MCP server, or ignoring the aggregate context window allocation of all those tools.

There is a cardinal rule that is not as well understood as it should be. The more you allocate to a context window, the worse the performance of the context window will be, and your outcomes will deteriorate.

Avoid excessively allocating to the context window with your agent or through MCP tool consumption. It's very easy to fall into a trap of allocating an additional 76K of tokens just for MCP tools, which means you only have 100K usable.

Less is more, folks. Less is more.

I recommend dropping by and reading the blog post below if you want to understand when to use MCP and when not to.

too many model context protocol servers and LLM allocations on the dance floor
This blog post intends to be a definitive guide to context engineering fundamentals from the perspective of an engineer who builds commercial coding assistants and harnesses for a living. Just two weeks ago, I was back over in San Francisco, and there was a big event on Model Context Protocol

When you should use MCP, when you should not use MCP, and how allocations work in the context window.

Let's head back and check on our agent that's being built in the background. If you look at it closely enough, you can see the loop and how it's invoking other tools.

Essentially, how this all works is outlined in the loop below.

Each piece of input from the user, and each tool call result, gets allocated into the conversation, and the whole conversation is then sent off for inferencing:

The inferencing loop (minus tool registrations)
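The loop can be sketched in a few dozen lines of Go. Everything here is stubbed: `inference` fakes the model API and `runTool` fakes a weather tool, both hypothetical stand-ins rather than any real SDK, but the control flow is the same shape as a real harness.

```go
package main

import (
	"fmt"
	"strings"
)

type message struct {
	role    string // "user", "assistant", or "tool"
	content string
}

// inference is a stand-in for a real LLM API call. A real harness sends the
// entire conversation to the model every turn; here we fake an agentic model
// that requests one tool call, then answers.
func inference(conversation []message) message {
	last := conversation[len(conversation)-1]
	if last.role == "tool" {
		return message{"assistant", "The weather in Melbourne is: " + last.content}
	}
	return message{"assistant", "get_weather(Melbourne, Australia)"}
}

// runTool executes a requested tool call and returns its result.
func runTool(call string) string {
	if strings.HasPrefix(call, "get_weather(") {
		return "16C and raining"
	}
	return "unknown tool"
}

// loop is the whole trick: append input, infer, execute any tool call,
// append the result, and go around again until nothing more is requested.
func loop(userInput string) string {
	conversation := []message{{"user", userInput}}
	for {
		reply := inference(conversation)
		conversation = append(conversation, reply)
		if !strings.Contains(reply.content, "(") { // no tool call requested
			return reply.content
		}
		conversation = append(conversation, message{"tool", runTool(reply.content)})
	}
}

func main() {
	fmt.Println(loop("What's the weather like today in Melbourne, Australia?"))
}
```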

Let's open up our workshop materials (below) and run the basic chat application via:

go run chat.go
GitHub - ghuntley/how-to-build-a-coding-agent
Contribute to ghuntley/how-to-build-a-coding-agent development by creating an account on GitHub.
You: You are a weather expert. When I ask you about the weather in a given location, I want you to reply with get_weather(<location_name>). I will then tell you what the weather in that location is. Understood?

Claude: I understand your instructions. When you ask me about the weather in a specific location, I will respond with get_weather(<location_name>). Then you will provide me with the weather information for that location, and I can discuss it with you from there. I'm ready whenever you'd like to ask about the weather somewhere.

So, let's ask what the weather is today here in Melbourne, Australia?

You: What's the weather like today in Melbourne, Australia?

Claude: get_weather(Melbourne, Australia)

boom! There we go!

We've registered our first tool into the model, and any time that we tickle the latent space with a query about the weather, it's going to function call our new tool.

It's a stub tool currently, but you can see it; we have a method, and the LLM is passing the parameters into the method from our query.

When we talk about agentic, we're referring to something that's tool-calling: something that has been trained, through its reward function, to call tools. Some models have been explicitly trained such that all they want to do is chase nuts, much like a squirrel (see below). Except it's not nuts, it's tools.

Claude Sonnet is a small-brained mechanical squirrel of <T>

Whilst visiting the Computer History Museum in San Francisco, I stumbled upon the first robotic squirrel, and it got me thinking: "Wait a sec, is Sonnet our first digital squirrel?"

The simplest way to explain what MCP, or any tool, is: it's just a function with a billboard on top that nudges the LLM's latent space to invoke that function. It is that simple. You register the tool, you describe how to invoke it, and then it just works.

Let's start the short journey towards making this a real coding agent. We have a mock function and a mock tool, but get_weather(Melbourne, Australia) currently does nothing, so let's fix that.

What we need to do is create a tool and then adjust our inferencing loop to check whether it wants to execute the tool (see line 5 above). If it does, execute the tool (see line 6 above) and then allocate the outcome of that tool back (see line 7 above) to the loop for more inferencing.

Here's what our updated sequence diagram looks like when tools have been registered.

this is what a coding agent is in its purest form, without any UI

As we're not making a weather application today, let's switch gears to build the first primitive of a coding agent.

The first primitive behind any coding agent is a read tool. All coding agents have a read tool that reads a file into the context window, either the whole file or, in the case of a larger file, in chunks.

var ReadFileDefinition = ToolDefinition{
	Name:        "read_file",
	Description: "Read the contents of a given relative file path. Use this when you want to see what's inside a file. Do not use this with directory names.",
	InputSchema: ReadFileInputSchema,
	Function:    ReadFile,
}
GitHub - ghuntley/how-to-build-a-coding-agent
Contribute to ghuntley/how-to-build-a-coding-agent development by creating an account on GitHub.

Let's open up our workshop materials once again (above) and create a file:

cat << EOF > riddle.txt
I have a mane but I'm not a lion,
I have four legs but I'm not a table,
I can gallop but I'm not running,
People say I'm disagreeable because I always say "neigh."
What am I?
EOF

Then let's run the read file tool:

go run read.go
You: Claude, mate. What's the answer to this riddle in riddle.txt?

Claude: I'll help you solve the riddle. Let me first read the contents of the riddle.txt file to see what the riddle is.

tool: read_file({"path":"riddle.txt"})

Reading file: riddle.txt
Successfully read file riddle.txt (175 bytes)

result: I have a mane but I'm not a lion,
I have four legs but I'm not a table,
I can gallop but I'm not running,
People say I'm disagreeable because I always say "neigh."
What am I?

Claude: After reading the riddle, I can solve it for you:

The answer to the riddle is: a horse.

Let's analyze the clues:

"I have a mane but I'm not a lion" - Horses have manes, which is hair growing from the top of the neck.

"I have four legs but I'm not a table" - Horses walk on four legs.

"I can gallop but I'm not running" - Galloping is a specific gait of horses.

"People say I'm disagreeable because I always say 'neigh'" - "Neigh" is the sound a horse makes, and it sounds like "nay" (which means "no" or disagreement).

All of these clues point to a horse as the answer to the riddle.

The second primitive behind any coding agent is a list tool. All coding agents have a list files tool, which lists all the files and directories in a given path.

var ListFilesDefinition = ToolDefinition{
	Name:        "list_files",
	Description: "List files and directories at a given path. If no path is provided, lists files in the current directory.",
	InputSchema: ListFilesInputSchema,
	Function:    ListFiles,
}
GitHub - ghuntley/how-to-build-a-coding-agent
Contribute to ghuntley/how-to-build-a-coding-agent development by creating an account on GitHub.

Let's open up our workshop materials once again (above) and run the basic chat application via:

go run list_files.go
You: Claude, mate. Is there a README.md in this folder? Also what is the contents of riddle.txt?

Claude: I'll help you check for a README.md file and show you the contents of riddle.txt.

First, let me list the files in the current directory to see if README.md exists:

tool: list_files({})
Listing files in directory: .
Successfully listed 88 items in .

result:

[".devenv.flake.nix",".direnv/",".envrc",".gitignore","AGENT.md","Makefile","README.md","bash_tool.go","chat.go","code_search_tool.go","devenv.lock","devenv.nix","devenv.yaml","edit_tool.go","go.mod","go.sum","list_files.go","prompts/","prompts/00-weather.md","prompts/01-read_file.md","prompts/02-list_files.md","prompts/03-bash_tool.md","prompts/04-edit_tool.md","read.go","renovate.json","riddle.txt"]

Claude: Yes, there is a README.md file in this folder.

Now, let me show you the contents of riddle.txt:

tool: read_file({"path":"riddle.txt"})
result: I have a mane but I'm not a lion,
I have four legs but I'm not a table,
I can gallop but I'm not running,
People say I'm disagreeable because I always say "neigh."
What am I?

Claude: The contents of riddle.txt is a riddle:

I have a mane but I'm not a lion,
I have four legs but I'm not a table,
I can gallop but I'm not running,
People say I'm disagreeable because I always say "neigh."
What am I?

The answer to this riddle appears to be a horse.

Boom! We now have a read file tool and a list files tool, and you can see that their functions are called and chained together from our original prompt.

The third primitive behind any coding harness is a bash tool, which does precisely what its name implies. It enables the LLM to execute commands on your computer.

var BashDefinition = ToolDefinition{
	Name:        "bash",
	Description: "Execute a bash command and return its output. Use this to run shell commands.",
	InputSchema: BashInputSchema,
	Function:    Bash,
}
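An implementation sketch, minus the timeouts and sandboxing any real harness would want before letting an LLM run commands on your machine. This assumes a Unix-like system with `bash` on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Bash hands the model's command string to a shell and returns combined
// stdout/stderr, which is then allocated back into the context window.
func Bash(command string) (string, error) {
	out, err := exec.Command("bash", "-c", command).CombinedOutput()
	return string(out), err
}

func main() {
	out, err := Bash("echo hello from the agent")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```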
GitHub - ghuntley/how-to-build-a-coding-agent
Contribute to ghuntley/how-to-build-a-coding-agent development by creating an account on GitHub.

Let's open up our workshop materials once again (above) and run the bash tool:

go run bash_tool.go
You: Hey Claude, provide a summary of all the processes running on this computer.

Claude: I'll provide a summary of all the processes running on this computer. Let me use the ps command to gather this information.

tool: bash({"command":"ps aux"})
Executing bash command: ps aux
Bash command succeeded: ps aux (output: 87344 bytes)

result: <snipped>

Claude: Based on the output of the ps aux command, here's a summary of the processes running on this computer:

System Summary:

This is a multi-CPU system with 96 cores/threads (visible from the many CPU-related kernel threads)

Running Linux with systemd as the init system

The system has been up since August 20th (seen in START column)

The fourth primitive behind any coding harness is the edit tool. The edit tool also does precisely as the name implies. It takes the results back from inference and applies edits to the file.

GitHub - ghuntley/how-to-build-a-coding-agent
Contribute to ghuntley/how-to-build-a-coding-agent development by creating an account on GitHub.

Let's open up our workshop materials once again (above) and run the edit tool:

go run edit_file.go
You: Claude, mate. Create fizzbuzz.js with nodejs that does fizzbuzz up to 20 and execute it to ensure it works

Claude: I'll create a Node.js implementation of FizzBuzz that counts up to 20 and then run it to make sure it works correctly.

First, let me create the fizzbuzz.js file:

tool: edit_file({"path":"fizzbuzz.js","old_str":"","new_str":"// FizzBuzz implementation in Node.js\n// Prints numbers from 1 to 20\n// If number is divisible by 3, print "Fizz"\n// If number is divisible by 5, print "Buzz"\n// If number is divisible by both 3 and 5, print "FizzBuzz"\n\nfor (let i = 1; i <= 20; i++) {\n if (i % 3 === 0 && i % 5 === 0) {\n console.log("FizzBuzz");\n } else if (i % 3 === 0) {\n console.log("Fizz");\n } else if (i % 5 === 0) {\n console.log("Buzz");\n } else {\n console.log(i);\n }\n}"})

Editing file: fizzbuzz.js (replacing 0 chars with 485 chars)

File does not exist, creating new file: fizzbuzz.js
Creating new file: fizzbuzz.js (485 bytes)
Successfully created file fizzbuzz.js

result: Successfully created file fizzbuzz.js

Claude: Now let me verify the file was created:
tool: read_file({"path":"fizzbuzz.js"})

Reading file: fizzbuzz.js
Successfully read file fizzbuzz.js (485 bytes)
Claude: Now I'll execute the script to make sure it works correctly:

tool: bash({"command":"node fizzbuzz.js"})
Executing bash command: node fizzbuzz.js

Bash command executed successfully, output length: 77 chars
result:

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz

Claude: Great! The fizzbuzz.js file has been created and executed successfully. The output shows:

Numbers not divisible by 3 or 5 are printed as is (1, 2, 4, 7, 8, 11, 13, 14, 16, 17, 19)

Numbers divisible by 3 (but not 5) print "Fizz" (3, 6, 9, 12, 18)

Numbers divisible by 5 (but not 3) print "Buzz" (5, 10, 20)

Number 15 prints "FizzBuzz" as it's divisible by both 3 and 5

And voila, we now have a coding agent. It is at this point that we could throw other prompts in and say, "Hey, let's do FizzBuzz to 100," and it will just work.

The fifth and final primitive behind any coding harness is a search tool. What if I were to tell you that there is no magic indexing of source code, nor any special intelligence, behind it?

var CodeSearchDefinition = ToolDefinition{
	Name: "code_search",
	Description: `Search for code patterns using ripgrep (rg).

Use this to find code patterns, function definitions, variable usage, or any text in the codebase.
You can search by pattern, file type, or directory.`,
	InputSchema: CodeSearchInputSchema,
	Function:    CodeSearch,
}

Nearly every coding tool currently available uses the open source ripgrep binary under the hood. See the two blog posts below if you want more proof:

source code analysis of Amazon Kiro
It’s an uncertain time for our profession, but one thing is certain—things will change. Drafting used to require a room of engineers, but then CAD came along…
from Luddites to AI: the Overton Window of disruption
I’ve been thinking about Overton Windows lately, but not of the political variety. You see, the Overton window can be adapted to model disruptive innovation by framing the acceptance of novel technologies, business models, or ideas within a market or society. So I’ve been pondering about where, when and how

Let's open up our workshop materials once again and run the code search tool:

go run code_search_tool.go
You: Claude, mate. Search the source code for a riddle

Claude: I'd be happy to search the source code for a riddle. Let me use the code search tool to look for mentions of riddles in the codebase.

tool: code_search({"pattern":"riddle"})

Searching for pattern: riddle
Found 6 matches for pattern: riddle

result:

./README.md:332:- riddle.txt: A simple riddle for content analysis

Claude: I've found references to a riddle file called "riddle.txt" in the codebase. Let me check if this file exists by listing the files in the current directory.

tool: list_files({})
Listing files in directory: .
Successfully listed 25 files in .
result:

["./go.mod","./renovate.json","./devenv.yaml","./AGENT.md","./.devenv.flake.nix","./.envrc","./bash_tool.go","./fizzbuzz.js","./code_search_tool.go","./list_files.go","./Makefile","./devenv.nix","./devenv.lock","./prompts/01-read_file.md","./prompts/02-list_files.md","./prompts/00-weather.md","./prompts/03-bash_tool.md","./prompts/04-edit_tool.md","./chat.go","./go.sum","./riddle.txt","./edit_tool.go","./README.md","./.gitignore","./read.go"]

Claude: Great! I can see that "riddle.txt" exists in the current directory. Let me read its contents to see what the riddle is.

tool: read_file({"path":"riddle.txt"})

Reading file: riddle.txt
Successfully read file riddle.txt (175 bytes)
result: I have a mane but I'm not a lion,
I have four legs but I'm not a table,
I can gallop but I'm not running,
People say I'm disagreeable because I always say "neigh."
What am I?

Claude: I found the riddle in the file riddle.txt!

Everything from this point forward is just a matter of tuning your prompts. Due to time constraints, we didn't delve into the notion of the harness prompt. The harness prompt is where your tool registrations go. It contains information such as the operating system you're running, so it knows whether to use PowerShell (Windows) or bash (Linux or macOS).

It also contains instructions on how the agent should operate. I say should because LLMs are non-deterministic: you can include the guidance, but it's just guidance. However, through prompt evaluation, tuning, and spending time playing with the models to understand how they behave, you can develop effective prompts.

Drop by the blog post below if you want to see an example of a fully fledged coding agent:

source code analysis of Amazon Kiro
It’s an uncertain time for our profession, but one thing is certain—things will change. Drafting used to require a room of engineers, but then CAD came along…

There are plenty of open-source coding agents already, such as SST's opencode:

GitHub - sst/opencode: AI coding agent, built for the terminal.
AI coding agent, built for the terminal. Contribute to sst/opencode development by creating an account on GitHub.

Or this 100-line agent, which scored really high on the SWE Bench.

GitHub - SWE-agent/mini-swe-agent: The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores 68% on SWE-bench verified!
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores 68% on SWE-bench verified! - SWE-agent/mini-swe-agent

And if you want some inspiration, there are many repositories on GitHub with leaked developer tooling harness and tool prompts.

GitHub - x1xhlol/system-prompts-and-models-of-ai-tools: FULL v0, Cursor, Manus, Augment Code, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser, Xcode, Trae AI, Cluely & Orchids.app (And other Open Sourced) System Prompts, Tools & AI Models.
FULL v0, Cursor, Manus, Augment Code, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser, Xcode, Trae AI, Cluely &amp; Orchids.app (And other Open Sourced) System Pro…

To recap: what you just built was a coding agent. But perhaps you don't want a coding agent. What if you're in the data engineering profession? What would that look like? Think about all of the activities you do day-to-day where the capability to automate with these primitives could be handy, or valuable to your employer.

Your co-workers are going to take your job, not AI.

If you're concerned about AI, the answer is straightforward: just invest in yourself. It really is that simple. This year is a particularly challenging time to be asleep at the wheel when it comes to personal development.

the six-month recap: closing talk on AI at Web Directions, Melbourne, June 2025
Welcome back to our final session at WebDirections. We’re definitely on the glide path—though I’m not sure if we’re smoothly landing, about to hit turbulence, or perhaps facing a go-around. We’ll see how it unfolds. Today, I’m excited to introduce Geoffrey Huntley. I discovered Geoff earlier this year through

conference locknote

I hope to see you later this afternoon for the conference locknote (see above).

Go forward and build.

ps. socials