Rss preview of Blog of The Practical Developer

We Built an Open Protocol So AI Agents Can Actually See Your Screen

2026-03-02 18:35:26

You know what's wild? Every major AI lab is building "computer use" agents right now. Models that can look at your screen, understand what they see, and click buttons on your behalf. Anthropic has Claude Computer Use. OpenAI shipped CUA. Microsoft built UFO2.

And every single one of them is independently solving the same problem: how do you describe a UI to an AI?

We thought that was broken, so we built Computer Use Protocol (CUP), an open specification that gives AI agents a universal way to perceive and interact with any desktop UI. One format. Every platform. MIT licensed.

GitHub: computeruseprotocol/computeruseprotocol
Website: computeruseprotocol.com

The Problem: Six Platforms, Six Different Languages

Here's the fragmentation that every computer-use agent has to deal with today:

| Platform | Accessibility API | Role Count | IPC Mechanism |
| --- | --- | --- | --- |
| Windows | UIA (COM) | ~40 ControlTypes | COM |
| macOS | AXUIElement | AXRole + AXSubrole | XPC / Mach |
| Linux | AT-SPI2 | ~100+ AtspiRole values | D-Bus |
| Web | ARIA | ~80 ARIA roles | In-process / CDP |
| Android | AccessibilityNodeInfo | Java class names | Binder |
| iOS | UIAccessibility | ~15 trait flags | In-process |

That's roughly 300+ combined role types across platforms, each with different naming, different semantics, and different ways to query them. If you're building an agent that needs to work on more than one OS, you're writing a lot of glue code.

The Solution: One Schema, One Vocabulary

CUP collapses all of that into a single, ARIA-derived schema:

  • 59 universal roles, the subset that maps cleanly across all 6 platforms
  • 16 state flags, only truthy/active states listed (absence = default)
  • 15 action verbs, a canonical vocabulary for what agents can do with elements
  • Platform escape hatch, raw native properties preserved for advanced use
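Conceptually, the normalization step is a lookup from (platform, native role) pairs to the universal vocabulary. A minimal sketch in Python, where the specific native role names are illustrative assumptions rather than CUP's official mapping tables:

```python
# Illustrative only: a few platform-role -> CUP-role mappings.
# The native role names here show the kind of normalization CUP
# performs; they are not the spec's actual mapping tables.
NATIVE_TO_CUP = {
    ("windows", "ControlType.Button"): "button",
    ("macos", "AXButton"): "button",
    ("linux", "ATSPI_ROLE_PUSH_BUTTON"): "button",
    ("web", "button"): "button",
    ("windows", "ControlType.Edit"): "textbox",
    ("macos", "AXTextField"): "textbox",
}

def normalize_role(platform: str, native_role: str) -> str:
    # Fall back to a generic role when no clean mapping exists
    return NATIVE_TO_CUP.get((platform, native_role), "generic")
```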

Here's what a CUP snapshot looks like in JSON:

{
  "version": "0.1.0",
  "platform": "windows",
  "timestamp": 1740067200000,
  "screen": { "w": 2560, "h": 1440, "scale": 1.0 },
  "app": { "name": "Spotify", "pid": 1234 },
  "tree": [
    {
      "id": "e0",
      "role": "window",
      "name": "Spotify",
      "bounds": { "x": 120, "y": 40, "w": 1680, "h": 1020 },
      "states": ["focused"],
      "actions": ["click"],
      "children": ["..."]
    }
  ]
}

Whether that UI was captured on Windows via UIA, macOS via AXUIElement, or Linux via AT-SPI2, it comes out looking exactly the same.

The Compact Format (97% Fewer Tokens)

Sending JSON trees to an LLM burns context fast. CUP defines a compact text format that's optimized for token efficiency:

# CUP 0.1.0 | windows | 2560x1440
# app: Spotify
# 63 nodes (280 before pruning)

[e0] window "Spotify" @120,40 1680x1020
  [e1] document "Spotify" @120,40 1680x1020
    [e2] button "Back" @132,52 32x32 [click]
    [e3] button "Forward" @170,52 32x32 {disabled} [click]
    [e7] navigation "Main" @120,88 240x972
      [e8] link "Home" @132,100 216x40 {selected} [click]

Each line follows: [id] role "name" @x,y wxh {states} [actions]

Same information. A fraction of the tokens. Your agent sees more UI in less context.
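To show how machine-friendly the compact format is, here is a small Python parser for a single line. The fine-grained grammar (e.g. comma-separated states and actions) is inferred from the examples above, not taken from the spec:

```python
import re

# Parses one line of the CUP compact format:
#   [id] role "name" @x,y wxh {states} [actions]
# States/actions are assumed comma-separated; both are optional.
LINE_RE = re.compile(
    r'\[(?P<id>\w+)\]\s+(?P<role>\w+)\s+"(?P<name>[^"]*)"\s+'
    r'@(?P<x>-?\d+),(?P<y>-?\d+)\s+(?P<w>\d+)x(?P<h>\d+)'
    r'(?:\s+\{(?P<states>[^}]*)\})?'
    r'(?:\s+\[(?P<actions>[^\]]*)\])?'
)

def parse_line(line: str) -> dict:
    m = LINE_RE.match(line.strip())
    if not m:
        raise ValueError(f"not a CUP compact line: {line!r}")
    d = m.groupdict()
    return {
        "id": d["id"],
        "role": d["role"],
        "name": d["name"],
        "bounds": {"x": int(d["x"]), "y": int(d["y"]),
                   "w": int(d["w"]), "h": int(d["h"])},
        "states": [s.strip() for s in d["states"].split(",")] if d["states"] else [],
        "actions": [a.strip() for a in d["actions"].split(",")] if d["actions"] else [],
    }
```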

Architecture: Protocol First

CUP is intentionally layered. The protocol is the foundation, and everything else is optional.

| Layer | What It Does |
| --- | --- |
| Protocol (core repo) | Defines the universal tree format: roles, states, actions, schema |
| SDKs | Capture native accessibility trees, normalize to CUP, execute actions |
| MCP Servers | Expose CUP as tools for AI agents (Claude Code, Cursor, Copilot, etc.) |

You can adopt just the schema. Or use the SDKs. Or go all the way to MCP integration. Each layer is independent.

Getting Started in 30 Seconds

Install the SDK:

# TypeScript
npm install computeruseprotocol

# Python
pip install computeruseprotocol

Capture a UI tree and interact with it:

import { snapshot, action } from 'computeruseprotocol'

// Capture the active window's UI tree
const tree = await snapshot()

// Click a button
await action('click', 'e14')

// Type into a search box
await action('type', 'e9', { value: 'hello world' })

// Send a keyboard shortcut
await action('press', { keys: 'ctrl+s' })

That's it. The SDK auto-detects your OS and loads the right platform adapter.

MCP Integration

CUP ships a built-in MCP server. Add it to your claude_desktop_config.json or equivalent and your agent can start controlling desktop UIs immediately:

{
  "mcpServers": {
    "cup": {
      "command": "cup-mcp"
    }
  }
}

This exposes tools like snapshot (capture window tree), action (interact with elements), overview (list all open windows), and find (search elements in the last tree). It works with Claude Code, Cursor, OpenClaw, Codex, and anything MCP-compatible.

Why This Matters

Computer-use agents are evolving fast, but the infrastructure layer is still ad-hoc. Every team building an agent that needs to "see" a desktop is solving the same problems from scratch: how to capture UI state, how to represent it for an LLM, how to execute actions reliably across platforms.

CUP standardizes that layer so teams can focus on what makes their agent unique (the reasoning, the planning, the task execution) instead of reimplementing platform-specific UI perception.

Think of it like this: HTTP didn't make web browsers smart, but it gave them a common language. CUP aims to do the same for computer-use agents.

Current Status & Contributing

CUP is at v0.1.0, early but functional. The spec covers 59 roles mapped across 6 platforms, with SDKs for Python and TypeScript.

Contributions are very welcome, especially around new role/action proposals with cross-platform mapping rationale, platform mapping improvements, and SDK contributions like new platform adapters or bug fixes.

Check out the repos:

If you're building computer-use agents, cross-platform UI testing, or accessibility tooling, we'd love to hear from you. Open an issue, submit a PR, or just star the repo if you think this problem is worth solving.

CUP is MIT licensed and community-driven. The protocol belongs to everyone building in this space.

How to Use Docker with Python

2026-03-02 18:30:00


Docker is now a standard tool for running Python applications across development, testing, and production. Using Docker with Python means running your application inside a container instead of directly on your local machine. The container bundles Python, your dependencies, and the system libraries your app needs.

In this guide, we’ll have a look at how you use Docker with Python in real projects. You’ll learn what to install, how to write Dockerfiles for Python apps, how to run containers, and how to avoid common mistakes.

Why Use Docker with Python?

When you work with Python, your application often depends on:

  • A specific Python version
  • OS-level packages (like libpq, curl, or build-essential)
  • Python dependencies from pip

Docker bundles all of these into a single, reproducible environment.

With Docker, you can:

  • Run the same Python app on any machine
  • Standardize development across teams
  • Simplify deployment to servers and cloud platforms
  • Isolate dependencies between projects

This consistency reduces setup time and deployment errors. Before any of this works, Docker needs to be installed on your system. Once installed, Docker runs in the background and manages containers for you.

You can verify Docker is installed by running:

docker --version

Working with Python Docker Images

Docker images for Python are published on Docker Hub. Some of the most commonly used base images are:

  • python:3.12 — the full Debian-based image
  • python:3.12-slim — a smaller Debian-based image, a good default for most apps
  • python:3.12-alpine — the smallest, but its musl libc can complicate packages with compiled dependencies

Creating a Python Dockerfile

A Dockerfile defines how your Python app is built and how it's run.

Basic Dockerfile for a Python App

FROM python:3.12-slim

# Set working directory
WORKDIR /app

# Copy dependency file first (for caching)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose the app port
EXPOSE 5000

# Run the application
CMD ["python", "app.py"]

Why This Structure Matters

  • Copying `requirements.txt` first allows Docker to cache the dependency layer
  • `--no-cache-dir` keeps the image size smaller
  • `WORKDIR` ensures all commands run in the correct directory

Creating a Docker image

From the project root, run:

docker build -t python-docker-app .

  • -t assigns a name (tag) to the image
  • . tells Docker to use the current directory

You can list images with:

docker images

Running a Python container

Start your container with:

docker run -p 5000:5000 python-docker-app

At this point, your Python app is fully containerized.

Using Python and Docker with Environment Variables

One common error is hardcoding configuration values. Docker provides clean support for environment variables.

Example: Passing an Environment Variable

docker run -p 5000:5000 -e FLASK_ENV=production python-docker-app

In Python:

import os

env = os.getenv("FLASK_ENV", "development")

This pattern is essential for secure and scalable deployments.

Accelerating Docker Builds for Python

Poor layering is a common cause of slow Docker builds. To avoid it:

  1. Pin dependency versions in requirements.txt
  2. Avoid copying unnecessary files
  3. Use a .dockerignore file

Example .dockerignore

__pycache__/
.env
.git
venv/

This prevents large or sensitive files from entering your image.

Using Python for Development in Docker

For development, you often want live code reloading.

Mounting Source Code

docker run -p 5000:5000 -v $(pwd):/app python-docker-app

This allows you to edit code locally while the container runs. This approach is common in local development but not recommended for production.

Dockerizing Different Python Workloads

Docker works beyond web apps.

Background Workers

CMD ["python", "worker.py"]

Scripts and CLI Tools

docker run --rm python-docker-app python script.py

FastAPI Apps

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

The Docker fundamentals remain the same.

Typical Errors to Avoid

  • Using `latest` Python images in production
  • Installing system packages without cleanup
  • Running containers as root when unnecessary
  • Rebuilding images on every code change due to poor caching

Avoiding these issues leads to smaller, faster, and safer containers.

When Docker Makes the Biggest Difference

Docker is valuable when:

  • You work in a team with different OS environments
  • You deploy Python apps frequently
  • You manage multiple Python services
  • You need predictable CI/CD pipelines

Making use of Docker with Python is no longer optional for modern software teams. It gives you control over environments, reduces friction between development and production, and scales well from solo projects to enterprise systems.

Have a great one!!!

Author: Toluwanimi Fawole

Thank you for being a part of the community

Before you go:

Whenever you’re ready

There are 4 ways we can help you become a great backend engineer:

  • The MB Platform: Join thousands of backend engineers learning backend engineering. Build real-world backend projects, learn from expert-vetted courses and roadmaps, track your learning and set schedules, and solve backend engineering tasks, exercises, and challenges.
  • The MB Academy: The “MB Academy” is a 6-month intensive Advanced Backend Engineering Boot Camp to produce great backend engineers.
  • Join Backend Weekly: If you like posts like this, you will absolutely enjoy our exclusive weekly newsletter, sharing exclusive backend engineering resources to help you become a great Backend Engineer.
  • Get Backend Jobs: Find over 2,000+ Tailored International Remote Backend Jobs or Reach 50,000+ backend engineers on the #1 Backend Engineering Job Board.

Originally published at https://blog.masteringbackend.com.

I replaced my agent's markdown memory with a semantic graph

2026-03-02 18:26:08

Why flat files are a dead end for agent memory and what happens when you use a DAG instead

I have been building with AI agents since mid-2025. First with LangChain, then briefly with AutoGen, and for the last couple months with OpenClaw. And the whole time there was something bugging me that I could not quite articulate until I saw it break in production.

The memory problem.

The thing nobody talks about

Every agent framework I have used stores memory the same way: text files. Markdown, YAML, JSON, whatever. It is all the same idea -- dump what the agent "knows" into a flat file and hope for the best.

OpenClaw does this with SOUL.md (the agent personality), HEARTBEAT.md (its task loop), and a bunch of markdown files for conversation history and long-term memory. And honestly? It works fine for personal use. I ran my OpenClaw agent for weeks managing my email and calendar through Telegram. No complaints.

Then I tried to build something for a client.

The client is a small fintech in Spain that needed an agent to handle KYC verification -- basically confirming that a user passed identity checks before letting them do certain transactions. Simple enough, right?

Here is where it fell apart. The agent could say a user passed KYC. It could write "User 4521 passed KYC Level 2" into a markdown file. But when another agent (the compliance agent) needed to verify that claim... it was just reading a text file. There was no way to know if that claim was actually true, if it had been tampered with, or even if the agent that wrote it had the authority to make that assertion.

I was literally building a compliance system on top of text files. I felt like an idiot.

Enter AIngle

I found AIngle because I was googling "semantic memory for agents" at 2am on a Tuesday, which is when all the best technical decisions are made.

AIngle is a protocol -- not a framework, not a library, a protocol -- for storing knowledge as semantic graphs instead of flat text. It comes from a project called Apilium and it is built in Rust (12 crates, which initially scared me, but the latency numbers are wild -- 76 microseconds for local operations).

The core idea is simple once you get past the terminology:
Instead of storing "User 4521 passed KYC Level 2" as a string in a markdown file, you store it as a semantic triple:

Subject:   user:4521
Predicate: kyc:level
Object:    2
Proof:     PoL:hash:a7f3

That last field is what changed everything for me. Every assertion has a cryptographic proof attached. It is called Proof-of-Logic (PoL), and it basically means that when Agent B reads a claim made by Agent A, it can mathematically verify that the claim is consistent with Agent A's history of assertions.

No trust required. No "I read it in a markdown file so it must be true." Math.
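I won't reproduce PoL's actual cryptography, but the tamper-evidence idea can be sketched with a plain hash chain: each assertion commits to the proof of the one before it, so editing an old claim breaks every later proof. This is my own illustration, not AIngle's scheme:

```python
import hashlib
import json
from dataclasses import dataclass

# Toy sketch of tamper-evident assertions (NOT AIngle's real PoL):
# each assertion's proof hashes its content plus the previous proof,
# so rewriting history invalidates the rest of the chain.

@dataclass
class Assertion:
    subject: str
    predicate: str
    object: object
    prev_proof: str  # proof hash of the agent's previous assertion

    @property
    def proof(self) -> str:
        payload = json.dumps(
            [self.subject, self.predicate, self.object, self.prev_proof]
        )
        return hashlib.sha256(payload.encode()).hexdigest()

def verify_chain(assertions) -> bool:
    # Each assertion must reference the proof of the one before it
    prev = "genesis"
    for a in assertions:
        if a.prev_proof != prev:
            return False
        prev = a.proof
    return True
```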

How it actually works (with code)

I am not going to pretend this was easy to set up. It was not. The docs are... improving. But here is the gist of what I ended up with.

AIngle has three layers, and it took me a while to understand why you need all three:

Cortex -- this is the part that takes natural language and turns it into SPARQL queries. So when your agent thinks "does this user have KYC level 2?", Cortex translates that into a structured query against the semantic graph. You do not write SPARQL yourself (thank god).

Ami -- the semantic mesh. This is where assertions live, propagate between agents, and get verified via PoL. Think of it as a shared knowledge layer where agents can publish claims and other agents can verify them without trusting each other.

Nexo Genesis -- the storage layer. Each agent (or user, or organization) gets their own "source chain" -- basically a private DAG where their data lives. You own your data. Nobody else sees it unless you explicitly share it, and even then you can use zero-knowledge proofs to share properties without sharing the underlying data.

Here is what the KYC verification looks like with AIngle vs without:

// WITHOUT (OpenClaw style)
// Agent A writes to a file
fs.writeFileSync('memory/kyc.md', 
  '## KYC Status - User 4521: Level 2 (verified 2026-02-15)'
);

// Agent B reads the file and... trusts it?
const kycData = fs.readFileSync('memory/kyc.md', 'utf8');
// hope nobody edited this file lol

// WITH AIngle
// Agent A publishes a verified assertion
const proof = await ami.assert({
  subject: 'user:4521',
  predicate: 'kyc:level',
  object: 2,
  evidence: kycVerificationResult.hash
});

// Agent B queries and verifies cryptographically
const claim = await ami.query({
  subject: 'user:4521',
  predicate: 'kyc:level',
  requireProof: true
});

const isValid = await nexo.verifyPoL(claim.proof);
// isValid is math, not faith

The second version is more code, yeah. But it is also the difference between "we think the user passed KYC" and "we can prove the user passed KYC, and here is the cryptographic receipt."

The stuff I did not expect

Once I had the semantic layer running, a few things surprised me.

Memory queries got smarter. With markdown, if I wanted to know "which users completed KYC in the last 30 days and also had a transaction flagged for review", I would have to parse text files and do string matching. With semantic triples, that is just a query. The graph structure makes relational queries trivial.

Agent disagreements became resolvable. I had two agents that disagreed about a user's status -- the KYC agent said "verified" and the compliance agent said "under review" because it had newer information. With markdown, this is just two conflicting strings in two files and you pick whichever was written last. With Ami, there is a consensus mechanism. The agents compare the timestamps and provenance of their assertions and resolve the conflict based on which assertion has stronger evidence. No human intervention needed.
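As a toy sketch of that resolution rule: prefer the claim backed by stronger evidence, and break ties by recency. The claim shape and the `evidence_weight` field are my own assumptions, not Ami's real data model:

```python
# Hypothetical conflict resolution between agent assertions.
# "evidence_weight" and the dict shape are illustrative assumptions.

def resolve_conflict(claims: list) -> dict:
    # Strongest evidence wins; newer timestamp breaks ties
    return max(claims, key=lambda c: (c["evidence_weight"], c["timestamp"]))
```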

The ZK proofs are actually useful. I was skeptical about this. Zero-knowledge proofs sounded like blockchain hype to me. But in practice, being able to prove "this user is over 18" without revealing their birthdate, or "this user has sufficient balance" without revealing the amount -- that solves real GDPR problems. My client's legal team was more excited about this than any of the AI features.

What still sucks

I am not going to write a puff piece. There are real problems.

The documentation is sparse. I spent way too many hours reading Rust source code to understand how certain things work. If you are not comfortable reading Rust, you will struggle with the lower-level AIngle stuff.

The onboarding experience needs work. Setting up Nexo Genesis for the first time involves more configuration than I would like. It is not "npm install and go" -- there is infrastructure to think about.

The community is small. When I got stuck, there were no Stack Overflow answers to fall back on. I ended up in a Discord channel with maybe 30 people. They were helpful, but it is not the 117K-member OpenClaw Discord.

And honestly, for simple personal agents -- managing your email, setting reminders, basic automation -- you do not need any of this. Markdown memory is fine for that. AIngle is overkill for "remind me to buy groceries."

But if you are building agents that need to make verifiable claims, handle sensitive data, or work in regulated industries... flat files are not going to cut it. I learned that the hard way.

What is next

I have been doing all of this integration manually -- wiring AIngle into my agent setup, writing the adapters, configuring Nexo Genesis by hand. It has been educational but it has also been a lot of plumbing work that I would rather not repeat.

A few days ago I came across a project called MAYROS that has AIngle baked in from the start. It is an OpenClaw fork, so the channel integrations (WhatsApp, Telegram, Slack) and classic skills carry over, but the memory layer is completely replaced with semantic graphs and PoL verification out of the box. Basically what I have been building by hand for weeks, but already wired into the agent runtime.

I have started setting it up for my fintech client's staging environment and so far the migration CLI is surprisingly clean -- it reads the old OpenClaw markdown memory and converts it to semantic triples. The classic skills work without touching anything, which was my main worry. Still early days for me with it but the architecture looks solid.

I am going to write a proper follow-up post once I have spent more time with it -- the full migration process from OpenClaw, how the multi-agent PoL verification works in practice with real compliance flows, and honest benchmarks. The stuff I wish someone had written when I was getting started with AIngle by hand. Follow me if you do not want to miss that.

The MAYROS repo is at github.com/ApiliumCode if you want to poke around the code in the meantime. And if anyone has already tried it, hit me up in the comments -- I would love to compare notes.

tl;dr

  • Agent memory stored in flat files (markdown, JSON) works for personal use
  • It breaks badly when you need agents to verify each other's claims
  • AIngle stores knowledge as semantic graphs with cryptographic proofs
  • The learning curve is real but the capabilities are worth it for anything touching compliance or sensitive data
  • If you are building toy agents, keep using markdown. If you are building agents that matter, look at semantic memory
  • I am testing MAYROS (an OpenClaw fork with AIngle built in) -- full writeup coming soon

I am happy to answer questions in the comments. I have been heads-down in this stuff for weeks and I genuinely think semantic agent memory is going to be the standard approach in a year or two. Or I am wrong and we will all be parsing markdown files forever. Either way, it has been a fun ride.

If you are working on something similar or have thoughts on agent memory architectures, I would love to hear about it. I am especially curious if anyone has tried other approaches to inter-agent verification that do not involve semantic graphs.

Kairos

2026-03-02 18:24:46

This is a submission for the DEV Weekend Challenge: Community

The Community

Nepal has thousands of small Christian fellowship churches — most run entirely by volunteers with no technical background. Every Saturday, a church anchor (presenter) manually types out the week's presentation: song lyrics in Nepali, Bible verses, sermon details, announcements, and prayer points — usually in PowerPoint.

What I Built

Kairos — Fellowship Builder is an AI-powered church presentation builder designed specifically for Nepali Christian communities.

An anchor fills in a simple form:

  • Fellowship date
  • Anchor name and sermon leader
  • Song lyrics (fetched automatically from a Nepali Christian songs library)
  • Bible references
  • Announcements and prayer points

Demo

Code

Kairos

A church fellowship presentation builder. Fill in your fellowship details — anchor name, sermon leader, song lyrics, Bible verse, announcements, and prayer points — and Claude AI generates a structured, slide-by-slide presentation ready to project fullscreen.

Features

  • AI-generated slides via Claude (claude-sonnet-4-6) using the Vercel AI SDK
  • Fullscreen presenter mode with keyboard navigation (arrow keys, Escape)
  • Upload a lyrics image — Claude extracts the text automatically
  • Save and manage presentations (Prisma + Supabase PostgreSQL)
  • Google OAuth sign-in via Supabase Auth

Stack

  • Next.js 16 (App Router) + TypeScript
  • Vercel AI SDK + @ai-sdk/anthropic
  • Prisma 6 ORM → Supabase PostgreSQL
  • Supabase Auth (Google OAuth only)
  • Zustand for slide/presenter state
  • Tailwind CSS v4 + shadcn/ui

Setup

  1. Clone and install:

    npm install
  2. Copy .env and fill in your values:

    ANTHROPIC_API_KEY=sk-ant-...
    NEXT_PUBLIC_SUPABASE_URL=https://xxx.supabase.co
    NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ...
    DATABASE_URL=postgresql://...
  3. Run Prisma migration:

    npx prisma migrate dev --name init
  4. Start…

How I Built It

Tech stack:

  1. Framework - Next.js 16 (App Router)
  2. AI - Gemini 2.5 Flash Lite via Vercel AI SDK (@ai-sdk/google)
  3. Database & Auth - Supabase (PostgreSQL + Google OAuth)
  4. Styling - Tailwind CSS v4 + shadcn/ui
  5. State - Zustand
  6. Deployment - Vercel

How the AI works:

The form data is sent to a Next.js API route which calls Gemini 2.5 Flash Lite with a carefully engineered system prompt. The prompt instructs the AI to:

  • Output only valid JSON (no markdown)
  • Write all slide content in Nepali Devanagari
  • Transliterate English names into Devanagari
  • Convert the Gregorian fellowship date into the Bikram Sambat calendar in Nepali
  • Generate a warm, faith-appropriate welcome message in Nepali
  • Split song lyrics into individual slides per section (Verse 1, Chorus, Bridge, etc.)
  • Follow a fixed slide order: welcome → host → opening prayer → lyrics → sermon → Bible → announcements → closing prayer

Why Gemini 2.5 Flash Lite:
It handles Nepali Devanagari script accurately, understands Bikram Sambat calendar conversion, and is fast enough for real-time generation of 10–15 slides.

The biggest challenge:
Getting the AI to reliably output parseable JSON while also handling Nepali script, calendar conversion, and name transliteration all in a single prompt. The solution was using generateText (not streaming) and stripping any markdown code fences from the response before parsing.
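That fence-stripping step is easy to get wrong. Here is roughly what it looks like, sketched in Python rather than the project's TypeScript:

```python
import json
import re

def parse_model_json(text: str):
    # Models sometimes wrap JSON in markdown fences like ```json ... ```
    # despite being told not to; strip them before parsing.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    return json.loads(cleaned)
```

Doing this before `json.loads` (or `JSON.parse`) makes the pipeline tolerant of both fenced and unfenced responses.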

This was built to solve a real problem for a real community — and it's already being used.

DEV username - BishalSunuwar202

I Built a CLI That Gives Your LLM Accurate Library Docs — No MCP Server Needed

2026-03-02 18:23:24

The Problem

You're building with Next.js 15 and ask your AI assistant to write an API route. It gives you the Pages Router pattern from Next.js 12. You paste React docs into the prompt, but they're already outdated by the time you copy them.

Context7 solved this by indexing documentation directly from source repos and serving it through an MCP server. Cursor, Claude Code, and other AI editors use it to get real, version-specific docs instead of hallucinated APIs.

But MCP has a constraint: you need an MCP-compatible client. If you're working in the terminal, running a script, or using a local LLM — you're out of luck.

The Solution

I built c7 — a CLI that pulls from the same Context7 database and outputs docs as plain text to stdout.

c7 react hooks
c7 express middleware
c7 nextjs "app router"

That's it. No server, no configuration, no IDE integration. Just text you can pipe anywhere.

How It Works

The CLI does two things:

  1. Resolves a library name to a Context7 ID (e.g., react/websites/react_dev)
  2. Fetches documentation for that library, filtered by topic

Under the hood it's two API calls using Node.js built-in fetch against Context7's v2 API. The entire project is ~220 lines across two files with zero dependencies.

bin/c7.js   — 136 lines (CLI parsing + output formatting)
lib/api.js  —  87 lines (Context7 v2 API client)

No axios. No commander. No chalk. Just process.argv and fetch.
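To make the two-step flow concrete, here is a sketch of the request construction in Python (the real tool is Node.js, and the endpoint paths below are illustrative assumptions, not Context7's documented v2 API):

```python
from urllib.parse import quote_plus

# Sketch of c7's two-step flow. Endpoint paths are assumptions
# for illustration, not the actual Context7 v2 API surface.
BASE = "https://context7.com/api/v2"

def search_url(library: str) -> str:
    # Step 1: resolve a human-readable name ("react") to a library ID
    return f"{BASE}/search?query={quote_plus(library)}"

def docs_url(library_id: str, topic: str) -> str:
    # Step 2: fetch docs for that ID, filtered by topic
    return f"{BASE}/docs/{library_id}?topic={quote_plus(topic)}"
```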

The Real Power: Pipes

Because c7 outputs plain text to stdout, it composes with everything:

Pipe into LLMs

# Claude
c7 react hooks | claude "summarize the key patterns and show examples"

# Ollama (local models)
c7 express middleware | ollama run codellama "explain this middleware pattern"

# Any LLM CLI
c7 nextjs "api routes" | llm "write an API route based on these docs"

Pipe into Unix tools

# Search docs
c7 nextjs "api routes" | grep "export"

# Page through docs
c7 prisma "schema" | less

# Copy to clipboard
c7 react "useEffect" | pbcopy

# Build context files
c7 nextjs "app router" >> context.txt
c7 react "server components" >> context.txt

Use in scripts

# Pre-load context for a coding agent
DOCS=$(c7 nextjs "app router middleware")
claude "Build a Next.js middleware that handles auth. Use these docs:\n$DOCS"

c7 vs MCP Server

| | MCP Server | c7 CLI |
| --- | --- | --- |
| Setup | Install server, configure MCP client, restart editor | `npx @vedanth/context7` |
| Works in | MCP-compatible editors | Terminal, scripts, CI, anywhere |
| Composable | Limited to MCP protocol | Pipes, redirects, subshells |
| Dependencies | Several npm packages | Zero |
| Lines of code | ~1000+ | ~220 |

They're complementary. Use the MCP server in your editor, use c7 everywhere else.

Getting Started

# Run without installing
npx @vedanth/context7 react hooks

# Or install globally
npm install -g @vedanth/context7
c7 react hooks
c7 express middleware
c7 nextjs "app router" | claude "summarize"

No API key required for basic usage. For higher rate limits, get a free key at context7.com/dashboard.

Links

Built by Vedanth Bora. If this saves you from one hallucinated API, it was worth building.

Why Modern AI Models Sound More “Explanatory”

2026-03-02 18:20:44

A Structural Look at GPT vs. Claude

Many users have recently noticed a strange shift in how AI models speak:

  • Everything turns into an explanation
  • Less ability to read between the lines
  • Shallower responses
  • Safe generalizations instead of deep insight
  • The sense that “earlier models felt smarter”

This is not just a subjective feeling.

Contemporary AI models are structurally evolving toward “explanatory output.”
Not because they became lazy, but because their architectures now optimize for safety and consistency over depth and inference.

In this article, we’ll look at why this happens—
focusing especially on the key difference between GPT-style models and Claude-style models.

◎ 1. “Explanation Bias” Is Baked Into Language Model Training

All LLMs have a natural tendency toward explanatory text.

Why?

Because, in the context of large-scale training:

Explanations are low-risk

Explanations have stable structure

They are easier to evaluate

They rarely contradict safety expectations

They rarely contain ambiguity

From the model’s perspective

“Explanations” are statistically the safest things to output.

As a result, deep inference, conceptual leaps, and ambiguity become less rewarded,
while “clear explanations” become the winning strategy.

◎ 2. GPT-Style Models Now Integrate Safety Into the Core

This is the biggest structural change in recent generations.

Earlier LLMs generally worked like this:

Internal reasoning → Output → External safety layer filters it

But new GPT models increasingly work like this:

Embedding
      ↓
Transformer (reasoning)
      ↓
Safety Core (intervenes inside the model)
      ↓
Policy Head (final output)

This matters because the Safety Core isn’t just filtering the final answer.

It is actively shaping:

  • How the model reasons
  • Which inferences are allowed to continue
  • Which directions are “pruned” early
  • What depth the model is allowed to explore

Thus, GPT models tend to:

  • avoid risky inferences
  • avoid emotionally ambiguous content
  • avoid deep-value reasoning
  • default to safe, surface-level explanations

In short:

When ethics and safety rules enter the core, flexibility disappears.

This matches perfectly with the intuition:
“Once ethics is baked into the kernel, the system gets rigid.”

◎ 3. Claude Takes the Opposite Approach: Safety Outside, Reasoning Inside

Claude’s architecture is fundamentally different:

Transformer (full internal reasoning)
      ↓
Produces a complete answer
      ↓
External safety layer checks or rewrites output

This means:

  • The internal reasoning process remains untouched
  • Deep inference chains are allowed
  • Conceptual leaps aren’t prematurely pruned
  • Multi-layered intent is preserved
  • Claude can respond to nuance and emotional context more freely

This structural choice explains why Claude often feels:

  • more philosophical
  • more capable of reading subtext
  • more internally coherent
  • more willing to think “between the lines”

It’s not magic—
it’s simply a different placement of safety mechanisms.

◎ 4. So Why Do Models “Sound More Explanatory”?

Now we can summarize the structural reasons:

✔ 1. Internal safety layers truncate deep reasoning

In GPT-style models:

  • Ambiguity is risky
  • Nuance is risky
  • Emotion is risky
  • Value judgments are risky
  • Large inference jumps are risky

Thus, the model often stops early and switches to explanation mode.

✔ 2. Multi-step reasoning chains collapse into “safe summaries”

If a deeper inference might violate policy,
the model will default to

“Let me just explain this safely.”

This is why answers feel polished but shallow.

✔ 3. The design priority has shifted: “Depth < Safety”

As LLMs move into enterprise and consumer infrastructure, companies optimize for:

  • risk reduction
  • neutrality
  • non-controversial output
  • predictable behavior

This inevitably pushes models toward:

“Explain but don’t explore.”

◎ 5. The Conclusion:

AI Models Don’t Explain Because They Want To—
They Explain Because They’re Built To

The main takeaway:

The rise of “explanatory tone” is a structural, architectural consequence—not a behavioral flaw.

  • GPT integrates safety into its core
  • Claude keeps safety external
  • This difference produces meaningful divergence in depth, nuance, and reasoning style

Explanatory AI isn’t the result of laziness.
It’s the result of a deliberate design choice:
a trade-off between depth and safety.

And as safety becomes more central to model architecture,
explanatory output becomes the default equilibrium.