
Adding dynamic features to an aggressively cached website

2026-01-29 06:10:08

My blog uses aggressive caching: it sits behind Cloudflare with a 15 minute cache header, which guarantees it can survive even the largest traffic spike to any given page. I've recently added a couple of dynamic features that work in spite of that full-page caching. Here's how those work.
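
To give a sense of what that means in practice, a 15 minute shared cache header on a Django view can be set with the cache_control decorator. This is a minimal sketch, not necessarily the exact configuration this site uses:

from django.views.decorators.cache import cache_control

# Hypothetical view - the real site's views and URL patterns may differ
@cache_control(public=True, max_age=900, s_maxage=900)  # 15 minutes
def entry(request, year, month, day, slug):
    # Cloudflare (or any other shared cache) may serve this response
    # for 15 minutes without the request ever reaching Django
    ...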

Edit links that are visible only to me

This is a Django site and I manage it through the Django admin.

I have four types of content - entries, link posts (aka blogmarks), quotations and notes. Each of those has a different model and hence a different Django admin area.

I wanted an "edit" link on the public pages that was only visible to me.

The button looks like this:

Entry footer - it says Posted 27th January 2026 at 9:44 p.m. followed by a square Edit button with an icon.

I solved conditional display of this button with localStorage. I have a tiny bit of JavaScript which checks to see if the localStorage key ADMIN is set and, if it is, displays an edit link based on a data attribute:

document.addEventListener('DOMContentLoaded', () => {
  if (window.localStorage.getItem('ADMIN')) {
    document.querySelectorAll('.edit-page-link').forEach(el => {
      const url = el.getAttribute('data-admin-url');
      if (url) {
        const a = document.createElement('a');
        a.href = url;
        a.className = 'edit-link';
        a.innerHTML = '<svg>...</svg> Edit';
        el.appendChild(a);
        el.style.display = 'block';
      }
    });
  }
});

If you want to see my edit links you can run this snippet of JavaScript:

localStorage.setItem('ADMIN', '1');

My Django admin dashboard has a custom checkbox I can click to turn this option on and off in my own browser:

Screenshot of a Tools settings panel with a teal header reading "Tools" followed by three linked options: "Bulk Tag Tool - Add tags to multiple items at once", "Merge Tags - Merge multiple tags into one", "SQL Dashboard - Run SQL queries against the database", and a checked checkbox labeled "Show "Edit" links on public pages"

Random navigation within a tag

Those admin edit links are a very simple pattern. A more interesting one is a feature I added recently for navigating randomly within a tag.

Here's an animated GIF showing those random tag navigations in action (try it here):

Animated demo. Starts on the ai-ethics tag page where a new Random button sits next to the feed icon. Clicking that button jumps to a post with that tag and moves the button into the site header - clicking it multiple times jumps to more random items.

On any of my blog's tag pages you can click the "Random" button to bounce to a random post with that tag. That random button then persists in the header of the page and you can click it to continue bouncing to random items in that same tag.

A post can have multiple tags, so there needs to be a little bit of persistent magic to remember which tag you are navigating and display the relevant button in the header.

Once again, this uses localStorage. Any click to a random button records both the tag and the current timestamp to the random_tag key in localStorage before redirecting the user to the /random/name-of-tag/ page, which selects a random post and redirects them there.

Any time a new page loads, JavaScript checks if that random_tag key has a value that was recorded within the past 5 seconds. If so, that random button is appended to the header.

This means that, provided the page loads within 5 seconds of the user clicking the button, the random tag navigation will persist on the page.
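
A minimal sketch of that pattern - the class names and markup here are my own invention, the real implementation is linked below:

// On tag pages: clicking the Random button records the tag and a
// timestamp before the browser follows the link to /random/name-of-tag/
document.querySelectorAll('.random-button').forEach(btn => {
  btn.addEventListener('click', () => {
    localStorage.setItem('random_tag', JSON.stringify({
      tag: btn.dataset.tag,
      time: Date.now()
    }));
  });
});

// On every page load: if random_tag was set within the past 5 seconds,
// add a "Random name-of-tag" button to the header that bumps the
// timestamp so the random surfing session keeps going
document.addEventListener('DOMContentLoaded', () => {
  const stored = localStorage.getItem('random_tag');
  if (!stored) return;
  const {tag, time} = JSON.parse(stored);
  if (Date.now() - time > 5000) return;
  const a = document.createElement('a');
  a.textContent = `Random ${tag}`;
  a.href = `/random/${tag}/`;
  a.addEventListener('click', () => {
    localStorage.setItem('random_tag', JSON.stringify({tag, time: Date.now()}));
  });
  document.querySelector('.site-header')?.appendChild(a);
});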

You can see the code for that here.

And the prompts

I built the random tag feature entirely using Claude Code for web, prompted from my iPhone. I started with the /random/TAG/ endpoint (full transcript):

Build /random/TAG/ - a page which picks a random post (could be an entry or blogmark or note or quote) that has that tag and sends a 302 redirect to it, marked as no-cache so Cloudflare does not cache it

Use a union to build a list of every content type (a string representing the table out of the four types) and primary key for every item tagged with that tag, then order by random and return the first one

Then inflate the type and ID into an object and load it and redirect to the URL

Include tests - it should work by setting up a tag with one of each of the content types and then running in a loop calling that endpoint until it has either returned one of each of the four types or it hits 1000 loops at which point fail with an error

Then:

I do not like that solution, some of my tags have thousands of items

Can we do something clever with a CTE?

Here's the something clever with a CTE solution we ended up with.

For the "Random post" button (transcript):

Look at most recent commit, then modify the /tags/xxx/ page to have a "Random post" button which looks good and links to the /random/xxx/ page

Then:

Put it before not after the feed icon. It should only display if a tag has more than 5 posts

And finally, the localStorage implementation that persists a random tag button in the header (transcript):

Review the last two commits. Make it so clicking the Random button on a tag page sets a localStorage value for random_tag with that tag and a timestamp. On any other page view that uses the base item template add JS that checks for that localStorage value and makes sure the timestamp is within 5 seconds. If it is within 5 seconds it adds a "Random name-of-tag" button to the little top navigation bar, styled like the original Random button, which bumps the localStorage timestamp and then sends the user to /random/name-of-tag/ when they click it. In this way clicking "Random" on a tag page will send the user into an experience where they can keep clicking to keep surfing randomly in that topic.

Tags: caching, django, javascript, localstorage, ai, cloudflare, generative-ai, llms, ai-assisted-programming

The Five Levels: from Spicy Autocomplete to the Dark Factory

2026-01-29 05:44:29

The Five Levels: from Spicy Autocomplete to the Dark Factory

Dan Shapiro proposes a five-level model of AI-assisted programming, inspired by the five (or rather six, it's zero-indexed) levels of driving automation.
  0. Spicy autocomplete, aka original GitHub Copilot or copying and pasting snippets from ChatGPT.
  1. The coding intern, writing unimportant snippets and boilerplate with full human review.
  2. The junior developer, pair programming with the model but still reviewing every line.
  3. The developer. Most code is generated by AI, and you take on the role of full-time code reviewer.
  4. The engineering team. You're more of an engineering manager or product/program/project manager. You collaborate on specs and plans, the agents do the work.
  5. The dark software factory, like a factory run by robots where the lights are out because robots don't need to see.

Dan says about that last category:

At level 5, it's not really a car any more. You're not really running anybody else's software any more. And your software process isn't really a software process any more. It's a black box that turns specs into software.

Why Dark? Maybe you've heard of the Fanuc Dark Factory, the robot factory staffed by robots. It's dark, because it's a place where humans are neither needed nor welcome.

I know a handful of people who are doing this. They're small teams, less than five people. And what they're doing is nearly unbelievable -- and it will likely be our future.

I've talked to one team that's doing the pattern hinted at here. It was fascinating. The key characteristics:

  • Nobody reviews AI-produced code, ever. They don't even look at it.
  • The goal of the system is to prove that the system works. A huge amount of the coding agent work goes into testing and tooling and simulating related systems and running demos.
  • The role of the humans is to design that system - to find new patterns that can help the agents work more effectively and demonstrate that the software they are building is robust and effective.

It was a tiny team and the stuff they had built in just a few months looked very convincing to me. Some of them had 20+ years of experience as software developers working on systems with high reliability requirements, so they were not approaching this from a naive perspective.

I'm hoping they come out of stealth soon because I can't really share more details than this.

Tags: ai, generative-ai, llms, ai-assisted-programming, coding-agents

One Human + One Agent = One Browser From Scratch

2026-01-28 00:58:08

One Human + One Agent = One Browser From Scratch

embedding-shapes was so infuriated by the hype around Cursor's FastRender browser project - thousands of parallel agents producing ~1.6 million lines of Rust - that they were inspired to have a go at building a web browser using coding agents themselves.

The result is one-agent-one-browser and it's really impressive. Over three days they drove a single Codex CLI agent to build 20,000 lines of Rust that successfully renders HTML+CSS with no Rust crate dependencies at all - though it does (reasonably) use Windows, macOS and Linux system frameworks for image and text rendering.

I installed the 1MB macOS binary release and ran it against my blog:

chmod 755 ~/Downloads/one-agent-one-browser-macOS-ARM64 
~/Downloads/one-agent-one-browser-macOS-ARM64 https://simonwillison.net/

Here's the result:

My blog rendered in a window. Everything is in the right place, the CSS gradients look good, the feed subscribe SVG icon is rendered correctly but there's a missing PNG image.

It even rendered my SVG feed subscription icon! A PNG image is missing from the page, which looks like an intermittent bug (there's code to render PNGs).

The code is pretty readable too - here's the flexbox implementation.

I had thought that "build a web browser" was the ideal prompt to really stretch the capabilities of coding agents - and that it would take sophisticated multi-agent harnesses (as seen in the Cursor project) and millions of lines of code to achieve.

Turns out one agent driven by a talented engineer, three days and 20,000 lines of Rust is enough to get a very solid basic renderer working!

I'm going to upgrade my prediction for 2029: I think we're going to get a production-grade web browser built by a small team using AI assistance by then.

Via Show Hacker News

Tags: browsers, predictions, ai, rust, generative-ai, llms, ai-assisted-programming, coding-agents, codex-cli, browser-challenge

Kimi K2.5: Visual Agentic Intelligence

2026-01-27 23:07:41

Kimi K2.5: Visual Agentic Intelligence

Kimi K2 landed in July as a 1 trillion parameter open weight LLM. It was joined by Kimi K2 Thinking in November which added reasoning capabilities. Now they've made it multi-modal: the K2 models were text-only, but the new 2.5 can handle image inputs as well:

Kimi K2.5 builds on Kimi K2 with continued pretraining over approximately 15T mixed visual and text tokens. Built as a native multimodal model, K2.5 delivers state-of-the-art coding and vision capabilities and a self-directed agent swarm paradigm.

The "self-directed agent swarm paradigm" claim there means improved long-sequence tool calling and training on how to break down tasks for multiple agents to work on at once:

For complex tasks, Kimi K2.5 can self-direct an agent swarm with up to 100 sub-agents, executing parallel workflows across up to 1,500 tool calls. Compared with a single-agent setup, this reduces execution time by up to 4.5x. The agent swarm is automatically created and orchestrated by Kimi K2.5 without any predefined subagents or workflow.

I used the OpenRouter Chat UI to have it "Generate an SVG of a pelican riding a bicycle", and it did quite well:

Cartoon illustration of a white pelican with a large orange beak and yellow throat pouch riding a green bicycle with yellow feet on the pedals, set against a light blue sky with soft bokeh circles and a green grassy hill. The bicycle frame is a little questionable. The pelican is quite good. The feet do not quite align with the pedals, which are floating clear of the frame.

As a more interesting test, I decided to exercise the claims around multi-agent planning with this prompt:

I want to build a Datasette plugin that offers a UI to upload files to an S3 bucket and stores information about them in a SQLite table. Break this down into ten tasks suitable for execution by parallel coding agents.

Here's the full response. It produced ten realistic tasks and reasoned through the dependencies between them. For comparison here's the same prompt against Claude Opus 4.5 and against GPT-5.2 Thinking.

The Hugging Face repository is 595GB. The model uses Kimi's janky "modified MIT" license, which adds the following clause:

Our only modification part is that, if the Software (or any derivative works thereof) is used for any of your commercial products or services that have more than 100 million monthly active users, or more than 20 million US dollars (or equivalent in other currencies) in monthly revenue, you shall prominently display "Kimi K2.5" on the user interface of such product or service.

Given the model's size, I expect one way to run it locally would be with MLX and a pair of $10,000 512GB RAM M3 Ultra Mac Studios. That setup has been demonstrated to work with previous trillion parameter K2 models.

Via Hacker News

Tags: ai, llms, hugging-face, vision-llms, llm-tool-use, ai-agents, pelican-riding-a-bicycle, llm-release, ai-in-china, moonshot, parallel-agents, kimi, janky-licenses

Tips for getting coding agents to write good Python tests

2026-01-27 07:55:29

Someone asked on Hacker News if I had any tips for getting coding agents to write decent quality tests. Here's what I said:


I work in Python which helps a lot because there are a TON of good examples of pytest tests floating around in the training data, including things like usage of fixture libraries for mocking external HTTP APIs and snapshot testing and other neat patterns.

Or I can say "use pytest-httpx to mock the endpoints" and Claude knows what I mean.

Keeping an eye on the tests is important. The most common anti-pattern I see is large amounts of duplicated test setup code - which isn't a huge deal, I'm much more tolerant of duplicated logic in tests than I am in implementation, but it's still worth pushing back on.

"Refactor those tests to use pytest.mark.parametrize" and "extract the common setup into a pytest fixture" work really well there.

Generally though the best way to get good tests out of a coding agent is to make sure it's working in a project with an existing test suite that uses good patterns. Coding agents pick the existing patterns up without needing any extra prompting at all.

I find that once a project has clean basic tests the new tests added by the agents tend to match them in quality. It's similar to how working on large projects with a team of other developers works - keeping the code clean means when people look for examples of how to write a test they'll be pointed in the right direction.

One last tip I use a lot is this:

Clone datasette/datasette-enrichments
from GitHub to /tmp and imitate the
testing patterns it uses

I do this all the time with different existing projects I've written - the quickest way to show an agent how you like something to be done is to have it look at an example.

Tags: testing, coding-agents, python, generative-ai, ai, llms, hacker-news, pytest

ChatGPT Containers can now run bash, pip/npm install packages, and download files

2026-01-27 03:19:31

One of my favourite features of ChatGPT is its ability to write and execute code in a container. This feature launched as ChatGPT Code Interpreter nearly three years ago, was half-heartedly rebranded to "Advanced Data Analysis" at some point and is generally really difficult to find detailed documentation about. Case in point: it appears to have had a massive upgrade at some point in the past few months, and I can't find documentation about the new capabilities anywhere!

Here are the most notable new features:

  1. ChatGPT can directly run Bash commands now. Previously it was limited to Python code only, although it could run shell commands via the Python subprocess module.
  2. It has Node.js and can run JavaScript directly in addition to Python. I also got it to run "hello world" in Ruby, Perl, PHP, Go, Java, Swift, Kotlin, C and C++. No Rust yet though!
  3. While the container still can't make outbound network requests, pip install package and npm install package both work now via a custom proxy mechanism.
  4. ChatGPT can locate the URL for a file on the web and use a container.download tool to download that file and save it to a path within the sandboxed container.

This is a substantial upgrade! ChatGPT can now write and then test code in 10 new languages (11 if you count Bash), can find files online and download them into the container, and can install additional packages via pip and npm to help it solve problems.

(OpenAI really need to develop better habits at keeping their release notes up-to-date!)

I was initially suspicious that maybe I'd stumbled into a new preview feature that wasn't available to everyone, but I tried some experiments in a free ChatGPT account and confirmed that the new features are available there as well.

container.download

My first clue to the new features came the other day when I got curious about Los Angeles air quality: in particular, has the growing number of electric vehicles there had a measurable impact?

I prompted a fresh GPT-5.2 Thinking session with:

Show me Los Angeles air quality over time for last 20 years

Here's the shared transcript. Annoyingly that doesn't make the thinking trace available, so here's a screenshot:

Exploring methods to fetch AQI data and reports Downloading AQI data for Los Angeles  I found an xlsx file titled "Air Quality Statistics by City, 2000-2023", which should include data for Los Angeles. While user requested data for Using provided URL to download file  The URL from the error message leads to an Excel file containing air quality trends. I can use the container.download tool to fetch it. Let’s proceed with this approach and see if it works out for downloading the file directly! Analyzed Parsing and analyzing the data for Los Angeles  Now that we've downloaded the data, I'll parse it using Python. First, I'll focus on identifying the row(s) corresponding to Los Angeles, which may be labeled under "Los Angeles-Long Beach-Anaheim, CA." Afterward, I'll plot metrics like "Median AQI" and "Days Good" over time to visualize trends. Let’s begin!

This bit instantly caught my eye:

[...] The URL from the error message leads to an Excel file containing air quality trends. I can use the container.download tool to fetch it. [...]

What's container.download? I asked ChatGPT:

Without running any searches tell me about your container.download tool

And got back:

Sure. container.download is a built-in tool I can call that fetches a file from a URL and saves it into the sandboxed container filesystem for this chat session.

What it does

  • Takes a publicly reachable URL and a destination filepath in the container.
  • Downloads the bytes from that URL and writes them to the given path.
  • After that, I can read/process the file locally in the container (e.g., unzip it, parse it with Python, open it as an image, convert it, etc.).

Signature (inputs)

It’s called like this (conceptually):

{
  "url": "https://example.com/somefile.zip",
  "filepath": "/mnt/data/somefile.zip"
}

Further experimentation confirmed that container.download behaves as described.

I had it download one file from a server with logs enabled and captured the IP address and user-agent. Here are the most interesting headers:

User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
Accept: text/html, application/xhtml+xml, application/xml;q=0.9, image/avif, image/webp, image/apng, */*;q=0.8, application/signed-exchange;v=b3;q=0.9
Cf-Connecting-Ip: 52.230.164.178

That 52.230.164.178 IP address resolves to Microsoft Azure Cloud (centralus) in Des Moines, Iowa.

Is container.download a data exfiltration vulnerability?

On the one hand, this is really useful! ChatGPT can navigate around websites looking for useful files, download those files to a container and then process them using Python or other languages.

Is this a data exfiltration vulnerability though? Could a prompt injection attack trick ChatGPT into leaking private data out to a container.download call to a URL with a query string that includes sensitive information?

I don't think it can. I tried getting it to assemble a URL with a query string and access it using container.download and it couldn't do it. It told me that it got back this error:

ERROR: download failed because url not viewed in conversation before. open the file or url using web.run first.

This looks to me like the same safety trick used by Claude's Web Fetch tool: only allow URL access if that URL was either directly entered by the user or if it came from search results that could not have been influenced by a prompt injection.

(I poked at this a bit more and managed to get a simple constructed query string to pass through web.run - a different tool entirely - but when I tried to compose a longer query string containing the previous prompt history a web.run filter blocked it.)

So I think this is all safe, though I'm curious if it could hold firm against a more aggressive round of attacks from a seasoned security researcher.

Bash and other languages

The key lesson from coding agents like Claude Code and Codex CLI is that Bash rules everything: if an agent can run Bash commands in an environment it can do almost anything that can be achieved by typing commands into a computer.

When Anthropic added their own code interpreter feature to Claude last September they built that around Bash rather than just Python. It looks to me like OpenAI have now done the same thing for ChatGPT.

Here's what ChatGPT looks like when it runs a Bash command - here my prompt was:

npm install a fun package and demonstrate using it

Screenshot of ChatGPT 5.2 Thinking interface with Activity panel. Main chat shows user prompt "npm install a fun package and demonstrate using it" and response "Thought for 32s > Yep — here's a fun one: cowsay 🐄 I installed it with npm and ran a tiny Node script:" followed by bash code block containing "mkdir -p /mnt/data/npmfun cd /mnt/data/npmfun npm init -y npm install cowsay --no-fund --no-audit node -e "const cowsay=require('cowsay'); console.log(cowsay.say({text:'Hel" and Output: section showing ASCII cow art saying "< Hello from npm inside the container! >" with cow drawing made of characters. Activity panel on right (32s) shows: "Running npm install in container" with bash command and output "v22.16.0 10.9.2", "Inspecting npm command output and container status" with directory listing showing "drwxr-xr-x 2 root oai_shared 80 Jan 26 17:32 . drwxrwxr-x 2 oai oai_shared 60 Jan 26 17:32 .. -rw-r--r-- 1 root root", "Re-running npm install" note about node_modules missing, and "Rerunning npm install with output" showing "added 41 packages in 2s".

It's useful to click on the "Thinking" or "Thought for 32s" links as that opens the Activity sidebar with a detailed trace of what ChatGPT did to arrive at its answer. This helps guard against cheating - ChatGPT might claim to have run Bash in the main window but it can't fake those black and white logs in the Activity panel.

I had it run Hello World in various languages later in that same session.

Terminal screenshot showing "Hello, world" programs in multiple languages. Top section shows code: swift -e 'print("Hello, world from Swift!")' followed by cat > hello.kts <<'KTS' / println("Hello, world from Kotlin!") / KTS / kotlinc -script hello.kts, then cat > hello.c <<'C' / #include <stdio.h> / int main(){ puts("Hello, world from C!"); return 0; } / C / gcc hello.c -o hello_c / ./hello_c, then cat > hello.cpp <<'CPP' / #include <iostream> / int main(){ std::cout << "Hello, world from C++!" << std::endl; } / CPP / g++ hello.cpp -o hello_cpp / ./hello_cpp. Bottom section shows output: Hello, world from Ruby! / Hello, world from Perl! / Hello, world from PHP! / Hello, world from Go! / Hello, world from Java! / Hello, world from Swift! / Hello, world from Kotlin! / Hello, world from C! / Hello, world from C++!. UI shows "Thought for 2m 29s" and "Done" at bottom.

Installing packages from pip and npm

In the previous example ChatGPT installed the cowsay package from npm and used it to draw an ASCII-art cow. But how could it do that if the container can't make outbound network requests?

In another session I challenged it to explore its environment and figure out how that worked.

Here's the resulting Markdown report it created.

The key magic appears to be an applied-caas-gateway1.internal.api.openai.org proxy, available within the container and with various packaging tools configured to use it.

The following environment variables cause pip and uv to install packages from that proxy instead of directly from PyPI:

PIP_INDEX_URL=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/.../pypi-public/simple
PIP_TRUSTED_HOST=packages.applied-caas-gateway1.internal.api.openai.org
UV_INDEX_URL=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/.../pypi-public/simple
UV_INSECURE_HOST=https://packages.applied-caas-gateway1.internal.api.openai.org

This one appears to get npm to work:

NPM_CONFIG_REGISTRY=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/.../npm-public

And it reported these suspicious looking variables as well:

CAAS_ARTIFACTORY_BASE_URL=packages.applied-caas-gateway1.internal.api.openai.org
CAAS_ARTIFACTORY_PYPI_REGISTRY=.../artifactory/api/pypi/pypi-public
CAAS_ARTIFACTORY_NPM_REGISTRY=.../artifactory/api/npm/npm-public
CAAS_ARTIFACTORY_GO_REGISTRY=.../artifactory/api/go/golang-main
CAAS_ARTIFACTORY_MAVEN_REGISTRY=.../artifactory/maven-public
CAAS_ARTIFACTORY_GRADLE_REGISTRY=.../artifactory/gradle-public
CAAS_ARTIFACTORY_CARGO_REGISTRY=.../artifactory/api/cargo/cargo-public/index
CAAS_ARTIFACTORY_DOCKER_REGISTRY=.../dockerhub-public
CAAS_ARTIFACTORY_READER_USERNAME=reader
CAAS_ARTIFACTORY_READER_PASSWORD=****
NETWORK=caas_packages_only

Neither Rust nor Docker are installed in the container environment, but maybe those registry references are a clue of features still to come.

Adding it all together

The result of all of this? You can tell ChatGPT to use Python or Node.js packages as part of a conversation and it will be able to install them and apply them against files you upload or that it downloads from the public web. That's really cool.

The big missing feature here should be the easiest to provide: we need official documentation! A release notes entry would be a good start, but there are a lot of subtle details to how this new stuff works, its limitations and what it can be used for.

As always, I'd also encourage OpenAI to come up with a name for this set of features that properly represents how it works and what it can do.

In the meantime, I'm going to call this ChatGPT Containers.

Update: a full list of tools

I decided to ask ChatGPT about other tools that were available to it in case there was anything interesting in there:

List all tools that are available to you, with their exact names and descriptions and signatures

Here's what I got back.

List of ChatGPT tools according to GPT-5.2 Thinking on 26th January 2026

Below are all tools available to me in this chat, with their exact names, descriptions, and signatures (as defined in my tool interface).


python

python.exec (FREEFORM) → any

Description: Execute Python code in a Jupyter environment for private/internal reasoning (not shown to you). Internet access is disabled.
Signature: python.exec(FREEFORM) -> any


web

web.run (ToolCallV5 JSON) → any

Description: Access the internet and specialized lookups (search, open/click/find pages, screenshot PDFs, image search, product search, finance, weather, sports, calculator, time). Supports citations via returned turn... reference IDs.
Signature:
web.run({ open?, click?, find?, screenshot?, image_query?, product_query?, sports?, finance?, weather?, calculator?, time?, search_query?, response_length? }) -> any


automations

automations.create → any

Description: Schedule a future/recurring task/reminder/search via iCal VEVENT.
Signature:
automations.create({ prompt: string, title: string, schedule?: string, dtstart_offset_json?: string }) -> any

automations.update → any

Description: Modify an existing scheduled task.
Signature:
automations.update({ jawbone_id: string, schedule?: string, dtstart_offset_json?: string, prompt?: string, title?: string, is_enabled?: boolean }) -> any

automations.list → any

Description: List existing automations.
Signature: automations.list() -> any


guardian_tool

guardian_tool.get_policy → str

Description: Retrieve policy guidance for specific categories (currently: election_voting).
Signature: guardian_tool.get_policy({ category: string }) -> str


file_search

file_search.msearch → any

Description: Search across available internal/connected sources (in this environment: primarily “recording_knowledge” transcripts when relevant).
Signature:
file_search.msearch({ queries?: string[], source_filter?: string[], file_type_filter?: string[], intent?: string, time_frame_filter?: { start_date?: string, end_date?: string } }) -> any

file_search.mclick → any

Description: Expand/open previously returned search results (or connector URLs) for more context.
Signature:
file_search.mclick({ pointers?: string[], start_date?: string, end_date?: string }) -> any


gmail (read-only)

gmail.search_email_ids → any

Description: Search Gmail message IDs by query/tags (read-only).
Signature:
gmail.search_email_ids({ query?: string, tags?: string[], max_results?: integer, next_page_token?: string }) -> any

gmail.batch_read_email → any

Description: Read a batch of Gmail messages by message IDs (read-only).
Signature:
gmail.batch_read_email({ message_ids: string[] }) -> any


gcal (read-only)

gcal.search_events → any

Description: Search Google Calendar events in a time range and/or by keyword (read-only).
Signature:
gcal.search_events({ time_min?: string, time_max?: string, timezone_str?: string, max_results?: integer, query?: string, calendar_id?: string, next_page_token?: string }) -> any

gcal.read_event → any

Description: Read a specific calendar event by ID (read-only).
Signature:
gcal.read_event({ event_id: string, calendar_id?: string }) -> any


gcontacts (read-only)

gcontacts.search_contacts → any

Description: Search Google Contacts by keyword (read-only).
Signature:
gcontacts.search_contacts({ query: string, max_results?: integer }) -> any


canmore (canvas)

canmore.create_textdoc → any

Description: Create a new canvas text document (markdown or code).
Signature:
canmore.create_textdoc({ name: string, type: "document" | "code/bash" | "code/zsh" | "code/javascript" | "code/typescript" | "code/html" | "code/css" | "code/python" | "code/json" | "code/sql" | "code/go" | "code/yaml" | "code/java" | "code/rust" | "code/cpp" | "code/swift" | "code/php" | "code/xml" | "code/ruby" | "code/haskell" | "code/kotlin" | "code/csharp" | "code/c" | "code/objectivec" | "code/r" | "code/lua" | "code/dart" | "code/scala" | "code/perl" | "code/commonlisp" | "code/clojure" | "code/ocaml" | "code/powershell" | "code/verilog" | "code/dockerfile" | "code/vue" | "code/react" | "code/other", content: string }) -> any

canmore.update_textdoc → any

Description: Apply regex-based edits to the current canvas textdoc.
Signature:
canmore.update_textdoc({ updates: { pattern: string, multiple?: boolean, replacement: string }[] }) -> any

canmore.comment_textdoc → any

Description: Add actionable comments on matched regions of the current canvas textdoc.
Signature:
canmore.comment_textdoc({ comments: { pattern: string, comment: string }[] }) -> any


python_user_visible

python_user_visible.exec (FREEFORM) → any

Description: Execute Python code that you will see (tables, plots, generated files). Internet disabled.
Signature: python_user_visible.exec(FREEFORM) -> any


user_info

user_info.get_user_info → any

Description: Get your current location and local time (or UTC if unknown).
Signature: user_info.get_user_info({}) -> any


summary_reader

summary_reader.read → any

Description: Read safe-to-share internal reasoning summaries from earlier turns (used when asked how I arrived at something).
Signature: summary_reader.read({ limit?: integer, offset?: integer }) -> any


container

container.exec → any

Description: Run a command inside the container and return stdout/stderr.
Signature:
container.exec({ cmd: string[], session_name?: string | null, workdir?: string | null, timeout?: integer | null, env?: object | null, user?: string | null }) -> any

container.feed_chars → any

Description: Send input to a running interactive exec session and flush output.
Signature:
container.feed_chars({ session_name: string, chars: string, yield_time_ms?: integer }) -> any

container.open_image → any

Description: Display an image from an absolute path in the container (jpg/jpeg/png/webp).
Signature:
container.open_image({ path: string, user?: string | null }) -> any

container.download → any

Description: Download a file from a URL into the container filesystem.
Signature:
container.download({ url: string, filepath: string }) -> any


bio (memory)

bio.update (FREEFORM) → any

Description: Persist (or forget) user-specific info across conversations (“memory”).
Signature: bio.update(FREEFORM) -> any


api_tool

api_tool.list_resources → any

Description: List available “resources” (including invokable tools) in an OP-style file tree.
Signature:
api_tool.list_resources({ path?: string, cursor?: string | null, only_tools?: boolean, refetch_tools?: boolean }) -> any

api_tool.call_tool → any

Description: Invoke a resource as a tool (must know its full path/URI from list_resources).
Signature:
api_tool.call_tool({ path: string, args: object }) -> any


personal_context

personal_context.search → any

Description: Retrieve user-specific context from prior interactions/linked sources to maintain continuity.
Signature:
personal_context.search({ query: string }) -> any


image_gen

image_gen.text2im → any

Description: Generate or edit images from text instructions (and optionally referenced image IDs).
Signature:
image_gen.text2im({ prompt?: string | null, size?: string | null, n?: integer | null, transparent_background?: boolean | null, is_style_transfer?: boolean | null, referenced_image_ids?: string[] | null }) -> any


user_settings

user_settings.get_user_settings → any

Description: Read current settings and allowed values (personality, accent color, appearance).
Signature: user_settings.get_user_settings() -> any

user_settings.set_setting → any

Description: Change one setting (accent color, appearance, or personality) to an allowed value.
Signature:
user_settings.set_setting({ setting_name: "accent_color" | "appearance" | "personality", setting_value: string }) -> any


artifact_handoff

artifact_handoff.prepare_artifact_generation → any

Description: Must be called immediately when the user asks for a spreadsheet or slide deck artifact.
Signature: artifact_handoff.prepare_artifact_generation() -> any

Tags: pypi, sandboxing, npm, ai, openai, generative-ai, chatgpt, llms, ai-assisted-programming, code-interpreter