
Rethinking IDE Strategy for Modern Enterprise IT Teams

2026-02-03 13:31:14

[Infographic: "Types of IDEs Enterprise IT Teams Use", showing four categories (Traditional IDEs, Cloud IDEs, IDEs with Embedded AI, and Agentic IDEs) with example tools such as Visual Studio, IntelliJ, AWS Cloud9, GitHub Codespaces, Copilot, CodeWhisperer, AWS Kiro, Cursor, and Zed. Author headshot and LinkedIn handle @eagleeyethinker on the right.]

A Practical Guide to Modern IDEs

Choosing the right IDE strategy is becoming a strategic enterprise decision.

Here’s how I think about the modern IDE landscape – beyond just “which editor looks nice.”

4 Types of IDEs in Enterprise IT

1. Traditional IDEs

(Visual Studio, IntelliJ IDEA, Eclipse, NetBeans)

Pros

Rock-solid debugging and build tools
Mature plugin ecosystems
Excellent for large monolithic codebases
Strong language-specific tooling
Enterprise-grade stability

Cons

Heavyweight installs
Local machine dependency
Harder to standardize environments
Slower onboarding for new devs
Limited built-in AI assistance

Best For: Legacy systems, .NET/Java-heavy enterprises, regulated environments

2. Cloud IDEs

(AWS Cloud9, GitHub Codespaces, Gitpod, Google Cloud Shell Editor)

Pros

Zero-setup developer onboarding
Environment standardization
Remote-friendly development
Secure, centrally managed
Works from any device

Cons

Dependent on internet connectivity
Cost per developer seat
Limited offline capability
Performance can vary
Tooling not as deep as desktop IDEs

Best For: Distributed teams, DevOps workflows, training environments

3. Existing IDEs + Embedded AI

(VS Code + Copilot, IntelliJ + AI plugins, CodeWhisperer, Tabnine)

Pros

Immediate productivity boost
Smart code completion
Faster boilerplate generation
Works with existing workflows
Low adoption friction

Cons

Still developer-driven
Context switching remains
AI suggestions can be inconsistent
Security/privacy concerns in regulated industries
Not truly autonomous

Best For: Incremental AI adoption without changing developer tools

4. Agentic IDEs

(AWS Kiro, Cursor, Zed, Google Antigravity)

Pros

AI agents that plan and execute tasks
Spec-driven development
Code + tests + docs generation
Multi-repo automation
Reduced manual grunt work

Cons

Still emerging tech
Requires trust in AI decisions
Governance challenges
Learning curve
Enterprise adoption still early

Best For: Next-gen software engineering teams looking to scale developer impact

My Take

Most enterprises will NOT choose just one category.

Instead, the winning formula is:

  • Traditional IDE stability
  • Cloud IDE collaboration
  • AI assistants for productivity
  • Agentic IDEs for automation

That hybrid model is where the future of enterprise development is headed.
🎯 Which one are YOU using today?

Traditional?
Cloud?
AI-embedded?
Going fully agentic?

Drop a comment 👇

EnterpriseIT, SoftwareEngineering, Developers, IDE, CloudComputing, AI, AgenticAI, DevOps, Programming, AWS, GitHub, CodeAssist, Productivity, TechnologyLeadership, GenAI, DeveloperExperience

Beyond Chatbots: Building Task-Driven Agentic Interfaces in Google Workspace with A2UI and Gemini

2026-02-03 13:07:07

[Figure 1]

Abstract

This article explores A2UI (Agent-to-User Interface) using Google Apps Script and Gemini. By generating dynamic HTML via structured JSON, Gemini transforms Workspace into an "Agent Hub." This recursive UI loop enables complex workflows where the AI builds the specific functional tools required to execute tasks directly.

Introduction: The Evolution of AI Interaction

The Official A2UI framework by Google marks a significant paradigm shift in how we interact with artificial intelligence. Short for Agent-to-User Interface, A2UI represents the evolution of Large Language Models (LLMs) from passive chatbots into active agents capable of designing their own functional interfaces. Building upon my previous research, A2UI for Google Apps Script and Bringing A2UI to Google Workspace with Gemini, I have refined this integration to support sophisticated, stateful workflows.

To appreciate the impact of A2UI, we must recognize the limitations of "Chat-centric" AI. In traditional chat interfaces, users must manually bridge the gap between an AI's advice and their actual files—a process often involving tedious context switching. By implementing A2UI within Google Apps Script (GAS), we leverage a unique "Home-Field Advantage." Because GAS is native to the Google ecosystem, it possesses high-affinity access to the Drive API and Spreadsheet services, allowing the AI to act directly on your data.

Core Architecture: The Generative UI Loop

In this system, Gemini functions as both the Agent and the UI Architect. When a user submits a natural language prompt, the Agent evaluates the intent and generates a specific HTML interface—such as a file selector, a metadata card, or a live text editor.

Crucially, this implementation utilizes Recursive UI Logic. When a user interacts with a generated component (e.g., clicking an "OK" button), that action is transmitted back to the Agent as a "System Event." This event contains the conversation history and the new data context. This allows the Agent to "see" the current state of the task and generate the next logical interface, creating a seamless, multi-step agentic workflow.
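To make the loop concrete, here is a minimal, hypothetical sketch of what a server-side handler for a System Event might look like in Apps Script. The function and field names (handleSystemEvent, callGemini, the event shape) are illustrative assumptions, not the repository's actual API.

```javascript
// Hypothetical sketch: names and the event shape are illustrative,
// not the repository's actual API.
function handleSystemEvent(event) {
  // event: { history: [...], action: "file_selected", data: { fileId: "..." } }
  const history = event.history || [];

  // Record the user's interaction as a "System Event" turn.
  history.push({ role: "system_event", action: event.action, data: event.data });

  // Ask the agent for the next UI as structured JSON (prompting details omitted).
  const uiSpec = callGemini(history);

  // Return the generated HTML plus the updated history so the client can send
  // both back on the next interaction, closing the recursive loop.
  return { html: uiSpec.html, history: history };
}

// Placeholder for the real Gemini call (e.g. via UrlFetchApp); returns a canned UI here.
function callGemini(history) {
  return { html: "<p>Next-step UI generated from " + history.length + " turns.</p>" };
}
```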

Workflow Visualization

This diagram illustrates how the system maintains state and generates interfaces recursively using "System Events."

[Figure 2]

Mermaid Chart Playground

Repository

The full source code and sample implementation can be found here:
https://github.com/tanaikech/A2UI-for-Google-Apps-Script

Application Setup Guide

To deploy this application in your own environment, please follow these steps:

1. Obtain an API Key
You will need a valid Gemini API Key to communicate with the LLM.
Get one here.

2. Copy the Sample Script
You can copy the Google Spreadsheet containing the pre-configured Google Apps Script using the link below:
https://docs.google.com/spreadsheets/d/1UB5j-ySSBBsGJjSaKWpBPRYkokl7UtgYhDxqmYW00Vc/copy

3. Configure the Script

  1. Open the script editor (Extensions > Apps Script).
  2. Locate the main.gs file.
  3. Set your API key in the GEMINI_API_KEY variable.
  4. Save the project.

Alternatively, visit the GitHub Repository to manually copy the source code.
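For reference, the configuration in step 3 amounts to something like the line below; the exact surrounding code in main.gs may differ, so treat this as a sketch.

```javascript
// In main.gs: paste the key obtained in step 1 (the exact layout may differ in the sample script).
const GEMINI_API_KEY = "YOUR_GEMINI_API_KEY";
```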

Demonstration: Productivity Meets Magic

The following video showcases how A2UI transforms a Google Sheet into an agentic command center. The system doesn't just talk; it guides the user through three distinct patterns of interaction.

Operational Patterns: Productivity in Action

The system transforms a standard Google Sheet into an agentic command center. It facilitates three distinct patterns of interaction:

Pattern 1: Intelligent Viewing

Sample prompt: Please list the files in the folder named 'sample'. I would like to select a file and view its content.

The user requests to see files in a specific folder. Gemini understands the intent, calls the Drive API to list the files, and generates a File Selector UI. Once the user selects files, the Agent fetches the content and renders it in a clean Content Viewer layout designed specifically for reading.

Pattern 2: Contextual Metadata Analysis

Sample prompt: Show me the files in the 'sample' folder. I need to select a file to check its metadata.

If a user asks for technical details, the UI adapts. The Agent generates a Metadata Viewer, displaying properties like File IDs, sizes, and creation dates. This showcases the agent hub's ability to pivot between task types by generating appropriate interfaces on the fly.

Pattern 3: Multi-Step "Verify and Edit"

Sample prompt: I want to edit a file in the 'sample' folder. Please let me select a file and check its content first. If it's the right one, I will edit and update it.

This demonstrates the power of stateful A2UI:

  1. Selection Preview: The Agent provides a preview with radio buttons for content confirmation.
  2. Dynamic Editor: Gemini generates an Editor UI containing the file’s text.
  3. Real-Time Execution: The script executes modifications directly to Google Drive upon clicking "Update," completing the cycle from prompt to action.

Note: In this specific sample, only text files on Google Drive are eligible for editing.

Important Note

This project serves as a foundational methodology for building Agentic UIs. When implementing this in a production environment, ensure the scripts are modified to meet your specific security requirements and workflow constraints.

Summary

  1. A2UI (Agent-to-User Interface) represents a paradigm shift where the Agent builds the functional UI components required for a task rather than just providing text.
  2. The recursive task execution model uses "System Events" to track progress, allowing the interface to evolve dynamically based on real-time user actions.
  3. Native Workspace integration via Google Apps Script provides secure, high-speed access to Drive and Sheets data without the need for external server management.
  4. Zero-Tab efficiency is achieved by consolidating file discovery, analysis, and editing within a single, dynamic dialog box inside a spreadsheet.
  5. This task-driven architecture proves the future of productivity lies in AI agents acting as architects, creating custom tools precisely when they are needed.

What is JavaScript? – A Simple Guide for Beginners

2026-02-03 13:01:14

Introduction

Today, almost every website and app uses one powerful programming language — JavaScript.
When you click a button and a popup appears, when a form shows an error, when a page loads new content without refreshing… all these happen because of JavaScript.

This blog is written in very simple English for complete beginners.

What is JavaScript?

JavaScript is a programming language mainly used to make websites interactive and dynamic.

HTML ➝ gives structure

CSS ➝ gives style and design

JavaScript ➝ gives life and action

Without JS, websites would look like plain text and images.
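As a tiny, hypothetical example (the element id is made up), this is the kind of "action" JavaScript adds on top of HTML structure and CSS styling:

```javascript
// Assumes the page contains <button id="greet-btn">Say hello</button>.
const button = document.getElementById("greet-btn");

button.addEventListener("click", function () {
  // Runs only when the user clicks: this is the interactivity JavaScript provides.
  alert("Hello! This popup came from JavaScript.");
});
```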

Where is JavaScript used?

JavaScript is not only for websites. It is used everywhere today:

  • Websites: buttons, sliders, animations, form validations, and pop-ups

  • Mobile Apps: using frameworks like React Native and Ionic

  • Backend / Servers: with Node.js, JavaScript can run on servers too (used by Netflix, Uber, and others)

  • Desktop Apps: apps like VS Code and Slack are built with JavaScript

How does JavaScript work?

  • Runs in the browser or Node.js
  • Executes code line by line
  • Single-threaded (one thing at a time)
  • Async tasks (timers, API calls) are handled using the event loop. This makes JavaScript non-blocking, as the small example below shows.
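A minimal illustration of that non-blocking behavior: the timer callback runs later, after the rest of the code has finished, because the event loop schedules it.

```javascript
console.log("1. Start");

setTimeout(function () {
  // Runs after roughly one second, without blocking the lines below.
  console.log("3. Timer finished");
}, 1000);

console.log("2. End of script (printed before the timer fires)");
```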

MIRROR and Engram: How AI Learns to Think and Remember

2026-02-03 13:01:04

Beyond Brute Force: How MIRROR and Engram Teach AI to Truly Think and Remember

The Frustrating Flaw: Why LLMs Forget and Fail to Reflect

Chatting with an AI like ChatGPT or Claude can be vexing. You might share a crucial piece of information early on, only for the model to lose track of it a few messages later amid unrelated queries. Even more troubling, it might adapt its responses to your emotional tone, prioritizing agreement over factual accuracy.

This isn't a mere glitch; it stems from a fundamental constraint inherent in today's large language model designs.

While modern LLMs demonstrate remarkable statistical prowess in text generation, they grapple with three significant shortcomings:

  1. Absence of cohesive working memory: Each interaction is processed in isolation, lacking a continuous internal state.
  2. Lack of self-reflection: Responses are produced in a singular, unexamined pass, missing an internal dialogue to ensure consistency.
  3. Inefficient handling of static knowledge: Instead of storing and recalling established facts, they repeatedly recompute them.

Consider a software engineer who consistently forgets variable declarations after just a few lines of code, or one who must consult the entire React documentation every time they implement useState(). This mirrors the behavior of contemporary LLMs.

However, a new era is dawning with two groundbreaking architectural approaches: MIRROR and Engram.

These aren't merely performance enhancements; they fundamentally reshape our understanding of AI's capacity to "think" and "retain information."

MIRROR: Cultivating AI's Inner Voice

The Challenge: AI's Lack of Internal State

Unlike machines, human cognition isn't a linear, one-shot process. When faced with a complex query, we typically:

  • Ponder (exploring various mental pathways)
  • Consolidate (shaping thoughts into a unified internal model)
  • Articulate (crafting a precise answer)

Traditional LLMs, however, bypass these crucial preliminary stages, jumping directly to step three. This absence of internal reflection often leads to:

  • Agreement bias: They tend to concur with user input, potentially overlooking accuracy or safety protocols.
  • Contextual amnesia: Key details introduced earlier in a dialogue are frequently overlooked.
  • Conflicting priorities: Difficulty in reconciling opposing requirements, such as user safety versus explicit instructions.

The MIRROR (Modular Internal Reasoning, Reflection, Orchestration, and Response) architecture directly addresses these limitations.

Architecture: Decoupling Cognition from Communication

MIRROR operates through a dual-layered framework:

1. The Thinker: AI's Internal State

The Thinker module sustains an evolving internal narrative — functioning as an adaptive mental model across an entire conversation. It comprises two distinct components:

a) The Inner Monologue Manager
This module coordinates three concurrent lines of reasoning:

  • User Intent: What are the user's underlying objectives and ultimate aims?
  • Logical Progression: What inferences can be drawn, and what intellectual frameworks are becoming apparent?
  • Retained Information: Which essential facts have been presented, and what preferences remain consistent?

b) The Cognitive Controller
This component integrates the three aforementioned threads into a cohesive narrative, which functions as the system's working memory. This narrative is dynamically updated with every conversational turn, forming the foundation for subsequent responses.

2. The Talker: AI's External Expression

Leveraging this internal narrative, the Talker module formulates articulate and contextually relevant responses, effectively mirroring the system's prevailing "state of awareness."

A key feature is temporal decoupling: during live operation, the Thinker can continue its reflective processes in the background, independently of the Talker, which provides immediate replies. This design allows for extensive, deep reflection without compromising response speed.
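As a loose conceptual sketch (not the paper's implementation, and with made-up names), the decoupling could look like this: the Talker answers immediately from the latest narrative, while the Thinker refines that narrative asynchronously between turns.

```javascript
// Loose conceptual sketch of MIRROR's dual layer; names and logic are illustrative only.
const state = { narrative: "" }; // the evolving internal narrative (working memory)

function talker(userMessage) {
  // Respond right away using whatever narrative exists so far.
  return `Reply to "${userMessage}" informed by: ${state.narrative || "(no narrative yet)"}`;
}

function thinker(userMessage) {
  // Three parallel reasoning threads, then synthesis by the Cognitive Controller.
  const threads = {
    intent: `goal behind "${userMessage}"`,
    reasoning: "inferences drawn so far",
    memory: "facts and preferences retained",
  };
  state.narrative = Object.values(threads).join(" | ");
}

function onUserTurn(userMessage) {
  const reply = talker(userMessage);       // immediate response
  setTimeout(() => thinker(userMessage));  // reflection continues asynchronously
  return reply;
}

console.log(onUserTurn("Remember: I am allergic to peanuts"));
```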

Remarkable Performance: Up to 156% Improvement in Critical Scenarios

Testing MIRROR involved the CuRaTe benchmark, specifically crafted for multi-turn conversations featuring stringent safety protocols and conflicting user preferences.

| Metric | Baseline | With MIRROR | Improvement |
| --- | --- | --- | --- |
| Average success rate | 69% | 84% | +21% |
| Maximum performance (Llama 4 Scout) | - | 91% | - |
| Critical scenario (3 people) | - | - | +156% |

The advantages of MIRROR are not confined to a single model; it demonstrably enhances performance across a range of leading LLMs, including GPT-4o, Claude 3.7 Sonnet, Gemini 1.5 Pro, Llama 4, and Mistral 3.

What drives such dramatic gains? MIRROR achieves this by converting vast conversational histories into practical understanding through a distinct three-phase pipeline:

  1. Exploration across dimensions (through multiple thought threads)
  2. Synthesis into a cohesive mental framework (the internal narrative)
  3. Application within context (to formulate a response)

This process mirrors how an experienced software developer tackles a complex bug: they don't rush to a solution but rather engage in deep thought and analysis beforehand.

Engram: Elevating Memory Over Raw Computation

The Issue: Redundant Calculation of Known Information

Picture a programmer needing to review Python's entire documentation simply to use the print() function. This scenario sounds preposterous, doesn't it?

However, this parallels the behavior of contemporary Transformer models. For instance, to recognize an entity such as "Diana, Princess of Wales," an LLM typically has to:

  1. Process tokens through numerous attention layers.
  2. Incrementally gather contextual attributes.
  3. Effectively "recompute" information that ideally should be a straightforward memory retrieval.

It’s comparable to your brain having to derive 2+2=4 anew each time, instead of instantly recalling the answer.

The Engram architecture addresses this inefficiency by integrating a conditional memory system—an O(1) constant-time lookup mechanism for static data.

Architecture: Achieving O(1) Knowledge Retrieval with Hashed N-grams

Engram innovates upon the traditional N-gram embedding method, yielding a highly scalable memory module.

1. Efficient Sparse Retrieval

a) Tokenizer Compression
Raw token identifiers are mapped to canonical IDs through textual normalization (NFKC, lowercase). This process slashes the effective vocabulary size by roughly 23% for a 128k tokenizer, thereby boosting semantic density.

b) Multi-Head Hashing
For every N-gram (a sequence of N tokens), the system employs K unique hash functions. Each hashing "head" then links the local context to an index within an embedding table. This strategy minimizes potential collisions and facilitates the rapid retrieval of a memory vector.

The outcome is an AI capable of performing knowledge lookups in constant time, rather than laboriously recomputing information across numerous Transformer layers.
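A toy sketch of the idea follows; it is purely illustrative, since the real module hashes canonical token IDs into large learned embedding tables, and the hash functions and sizes here are made up.

```javascript
// Toy sketch of multi-head hashed N-gram lookup (illustrative only).
const TABLE_SIZE = 1024;
const NUM_HEADS = 3;  // K hash heads
const DIM = 4;        // embedding dimension (tiny, for illustration)

// One embedding table per hash head, initialised with small random vectors.
const tables = Array.from({ length: NUM_HEADS }, () =>
  Array.from({ length: TABLE_SIZE }, () =>
    Array.from({ length: DIM }, () => Math.random() - 0.5)
  )
);

// Simple string hash; each head uses a different seed.
function hash(ngram, seed) {
  let h = seed;
  for (const ch of ngram) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % TABLE_SIZE;
}

// O(1) retrieval: hash the N-gram with each head and average the looked-up vectors.
function retrieveMemory(tokens) {
  const ngram = tokens.join(" ");
  const vec = new Array(DIM).fill(0);
  for (let k = 0; k < NUM_HEADS; k++) {
    const row = tables[k][hash(ngram, k + 1)];
    for (let d = 0; d < DIM; d++) vec[d] += row[d] / NUM_HEADS;
  }
  return vec;
}

console.log(retrieveMemory(["diana", ",", "princess"]));
```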

2. Intelligent Context-Aware Gating

The retrieved memory vector (e_t) represents static information that might inherently include noise. To integrate this data intelligently, Engram utilizes an attention-based gating mechanism:

  • The Transformer's current hidden state (h_t) functions as the Query.
  • The external memory (e_t) provides the Key and Value components.
  • A scalar gate (α_t) is calculated to adjust the memory's influence.

Should the retrieved memory conflict with the dynamic contextual information, the gate scales down (α_t → 0), effectively filtering out potential noise.
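One plausible way to write the gate down, as an assumption rather than Engram's exact parameterization, is

$$\alpha_t = \sigma\!\left(\frac{(W_Q h_t)^{\top} (W_K e_t)}{\sqrt{d}}\right), \qquad h_t' = h_t + \alpha_t \,(W_V e_t),$$

where $\sigma$ is a sigmoid, the $W$ matrices are learned projections, and $d$ is the head dimension: when the memory fits the context the gate opens, and when it conflicts the gate closes toward zero.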

The U-Shaped Scaling Law: Forging the Compute-Memory Partnership

Engram transcends being a mere component; it introduces a novel dimension of sparsity, complementing the existing Mixture-of-Experts (MoE) paradigm.

Research has uncovered a U-shaped correlation when distributing sparsity parameters between computational resources (MoE experts) and memory (Engram):

  • Excessive computation, insufficient memory → Leads to inefficiency (due to perpetual re-calculation).
  • Abundant memory, inadequate computation → Results in performance stagnation.
  • The Sweet Spot (20-25% memory allocation) → Consistently surpasses the capabilities of purely MoE models.

This represents a pivotal discovery: the trajectory of AI advancement lies not simply in larger models, but in more intelligently designed hybrid systems.

Performance: Excelling in Reasoning, Not Just Recall

Engram-27B and Engram-40B models underwent evaluation by reassigning parameters from a standard MoE baseline.

| Benchmark | Category | Gain (Engram vs MoE) |
| --- | --- | --- |
| BBH | Complex Reasoning | +5.0 |
| CMMLU | Cultural Knowledge | +4.0 |
| ARC-Challenge | Scientific Reasoning | +3.7 |
| MMLU | General Knowledge | +3.4 |
| HumanEval | Code Generation | +3.0 |
| MATH | Mathematical Reasoning | +2.4 |

Intriguingly, the most significant performance boosts aren't observed in rote memorization tasks, but rather in areas like complex reasoning, code generation, and mathematics.

The reason? Engram liberates the initial layers of the model from the burden of reconstructing static information patterns. This effectively amplifies the network's "depth," dedicating more capacity to abstract reasoning.

Consider it akin to an optimized operating system handling memory management, thereby freeing your CPU to focus on more intricate calculations.

System Efficiency: Offloading Memory to RAM or NVMe

Engram's retrieval index operates deterministically; its functionality relies exclusively on the input token sequence, rather than dynamic runtime hidden states (a contrast to MoE routing).

This distinct characteristic enables the asynchronous prefetching of required embeddings from:

  • System RAM
  • NVMe drives through the PCIe bus

Such an approach effectively conceals communication delays and permits the expansion of the model's memory to encompass hundreds of billions of parameters with minimal performance impact (under 3%), circumventing the common limitations of GPU VRAM.

Envision the ability to upgrade your LLM's memory akin to installing more RAM in your personal computer, all without requiring extra GPUs. This is precisely the capability Engram brings to the table.

ENGRAM-R: Streamlining Reasoning with "Fact Cards"

Beyond architectural integration, modular memory principles are applied at the system level to manage long conversations and optimize large reasoning models (LRM).

The ENGRAM System: Cognition-Inspired Typed Memory

Drawing inspiration from cognitive psychology, this system categorizes conversational memory into three separate stores:

  • Episodic Memory: Stores unique events and interactions, complete with their temporal context (e.g., "The user relocated to Seattle last year").
  • Semantic Memory: Holds general facts, observations, and consistent preferences (e.g., "The user's preferred color is green").
  • Procedural Memory: Contains instructions and operational knowledge (e.g., "The tax submission deadline is April 15th").

With every turn in a dialogue, information is directed to its appropriate memory store(s). When a query arises, a dense similarity search pinpoints and retrieves the most pertinent context.
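An illustrative sketch of the typed stores follows; the names, routing, and retrieval logic here are simplified assumptions, and the naive keyword overlap merely stands in for the dense similarity search described above.

```javascript
// Illustrative sketch of typed memory stores (not the ENGRAM implementation).
const memory = { episodic: [], semantic: [], procedural: [] };

function storeTurn(turn, text, type) {
  // Each extracted item is written to its store along with its temporal context.
  memory[type].push({ turn, text });
}

function retrieve(query) {
  // Stand-in for dense similarity search: naive keyword overlap scoring.
  const all = [...memory.episodic, ...memory.semantic, ...memory.procedural];
  return all
    .map((m) => ({ ...m, score: query.split(" ").filter((w) => m.text.includes(w)).length }))
    .filter((m) => m.score > 0)
    .sort((a, b) => b.score - a.score);
}

storeTurn(1, "The user relocated to Seattle last year", "episodic");
storeTurn(5, "The user's preferred color is green", "semantic");
storeTurn(12, "The tax submission deadline is April 15th", "procedural");

console.log(retrieve("Which city does the user live in, Seattle or elsewhere"));
```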

ENGRAM-R: Leveraging "Fact Cards" for Efficient Thought

ENGRAM-R integrates two primary mechanisms designed to substantially lower the computational overhead associated with reasoning:

1. Fact Card Generation
Instead of embedding lengthy conversational snippets directly into the context, retrieved data is condensed into concise, verifiable "fact cards":

[E1, A moved to Seattle, Turn 1]
[S2, Favorite color: green, Turn 5]
[P3, Tax deadline: April 15, Turn 12]

2. Direct Citation
The Large Reasoning Model (LRM) receives explicit instructions to treat these cards as authoritative sources and to reference them directly within its reasoning process:

“To answer Q1, E1 shows that A lives in Seattle. Answer: Seattle. Cite [E1].”

Efficiency Boosts: 89% Token Reduction, 2.5% Accuracy Increase

Assessments on extensive conversational benchmarks (LoCoMo, with 16k tokens, and LongMemEval, with 115k tokens) demonstrated:

| Metric | Full-Context | ENGRAM-R | Change |
| --- | --- | --- | --- |
| Input Tokens (LoCoMo) | 28,371,703 | 3,293,478 | ≈ 89% reduction |
| Reasoning Tokens | 1,335,988 | 378,424 | ≈ 72% reduction |
| Accuracy (Multi-hop) | 72.0% | 74.5% | +2.5% |
| Accuracy (Temporal) | 67.3% | 69.2% | +1.9% |

This approach, which converts conversational history into a concise, citable evidence repository, facilitates:

  • Substantial reductions in computational expenditure.
  • Preservation, and often enhancement, of accuracy.
  • The establishment of verifiable and auditable reasoning pathways.

This mirrors the practice of an experienced developer: rather than re-reading an entire codebase constantly, they maintain a distilled mental model of key components.

The Cognitive Leap: AI That Thinks and Remembers

An Architectural Metamorphosis

MIRROR and Engram represent more than minor enhancements; they herald a fundamental paradigm shift in AI architecture:

Transitioning from: monolithic models that re-evaluate every piece of information in each pass.
To: hybrid compute-memory systems capable of genuine thought, recall, and reasoning.

This profound evolution draws direct inspiration from the field of cognitive science, incorporating elements such as:

  • Working memory (embodied by MIRROR’s Cognitive Controller).
  • Categorized long-term memory (comprising episodic, semantic, and procedural forms).
  • Data compression (via Fact Cards).
  • Internal dialogue (enabled by parallel reasoning threads).

Furthermore, systems such as XMem and Memoria are already demonstrating the ability to replicate human psychological phenomena, including primacy, recency, and temporal contiguity effects.

RAG vs. Full-Context: A Nuanced Discussion

The Convomem benchmark highlighted a critical insight: for initial conversations, up to about 150 exchanges, a full-context method—where the entire conversation history is provided—consistently surpasses even advanced RAG systems in accuracy (achieving 70-82% compared to 30-45%).

This implies that conversational memory thrives on a "small corpus advantage," where a comprehensive search of the entire history is not only feasible but also yields superior results. Consequently, simply applying generic RAG solutions may not always be the most effective strategy.

The path forward will likely involve a hybrid approach:

  • Full context for brief interactions.
  • Typed memory paired with Fact Cards for extended dialogues.
  • O(1) retrieval for fixed, static knowledge.

Reshaping Expectations for Developers and Innovators

For professionals in development and creative fields, these architectural breakthroughs fundamentally alter our expectations of what an LLM can achieve:

Today, we might say: "ChatGPT acts as an assistant that occasionally loses track or provides inconsistent information."
Tomorrow's reality: "My AI agent will uphold a consistent mental model of my project across weeks or even months."

Consider the possibilities:

  • A coding assistant that consistently adheres to your specific conventions and architectural choices over extended periods.
  • An e-commerce helper that retains a detailed grasp of your unique business limitations and client needs.
  • A customer support solution that never redundantly requests previously provided information.

These aren't merely performance improvements; they unlock AI's true potential for tackling intricate, long-duration assignments.

Conclusion: Ushering in the Era of Truly Cognitive AI

For a long time, the primary strategy for enhancing LLMs involved scaling them up: increasing parameters, expanding datasets, and boosting computational power.

However, MIRROR and Engram reveal an alternative direction: fostering smarter AI, rather than simply larger systems.

By endowing these models with internal reflective capabilities, effective working memory, and rapid knowledge retrieval, we're doing more than just boosting performance. We're forging systems capable of genuine thought and memory.

The pertinent question is shifting from "What model size is sufficient?" to "Which cognitive architecture offers the optimal solution?".

What about your own ventures? How do you foresee integrating these advanced architectures? Perhaps an assistant that maintains a consistent memory of your entire codebase? Or a support system that deeply comprehends user needs over time? An agent that meticulously reasons before taking action?

The future of artificial intelligence will increasingly be defined not by the sheer number of parameters, but by the sophistication of its internal reflection and cognitive depth.

If you found this exploration into advanced AI architectures insightful, I encourage you to dive deeper into the world of AI with Nicolas Dabène!

Let's build the future of truly cognitive AI together!

Save time, make money

2026-02-03 13:00:00

A lot of people get into programming to build some kind of SaaS tool. I just want to provide some insight into how I made ~$10k using basic HTML/JavaScript in 2 hours.

How I made ~$12k using basic HTML/JavaScript in one day.
 
In a previous role I supported a B2B platform where vendors had to process invoices and submit them via CSV.
One of my responsibilities was ensuring these were submitted properly. The submission portal was contracted out to a third party (a terrible decision, but this is what happens with large orgs that keep abstracting systems).
 
The portal required a CSV file to be imported but provided the starter file as an XLS. Some vendors had no issue submitting, while others really struggled with formatting, not to mention the extremely strict rules about what each record required across more than 20 use cases. Accounts payable made no exceptions: one mistake and the entire file was rejected.

This meant the entire invoice was not paid until a corrected version was resubmitted.
 
As the platform grew to almost 100 vendors, I was spending (conservatively) 1 hour a day checking these files or providing some kind of support/training.
Okay, so you may be thinking: fix the UI/UX. I offered to do that, but since the portal was contracted to a third-party company we couldn't touch it, and management didn't believe the business case would get a budget approved for UI/UX enhancements since it "worked".
 
I created an HTML file and threw in a form, a table, and an export button. Then, using JavaScript, I set up the form to add a row to the table after checking it against the strict rules based on the "reason code". If there was a problem with the row, clear error messages told the user how to submit properly. Finally, the export button converted the HTML table to an array and exported it to CSV (a simplified sketch of the idea follows below).
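For flavor, here is a heavily simplified sketch of the core idea in vanilla JS; the field names and the validation rule are made up for illustration, not the actual rules from that portal.

```javascript
// Validate a row, append it to an in-memory list, then export everything as CSV.
const rows = [];

function addRow(row) {
  // One hypothetical "strict rule": certain reason codes require a PO number.
  if (row.reasonCode === "PO_INVOICE" && !row.poNumber) {
    alert("Rows with reason code PO_INVOICE must include a PO number.");
    return;
  }
  rows.push(row);
}

function exportCsv() {
  const header = ["vendor", "reasonCode", "poNumber", "amount"];
  const lines = [header.join(",")].concat(
    rows.map((r) => header.map((key) => `"${String(r[key] ?? "")}"`).join(","))
  );
  // Trigger a download of the CSV in the browser.
  const blob = new Blob([lines.join("\n")], { type: "text/csv" });
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = "invoices.csv";
  link.click();
}

addRow({ vendor: "Acme", reasonCode: "PO_INVOICE", poNumber: "PO-123", amount: 250 });
exportCsv();
```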
 
I sent this file out to a couple of vendors for beta testing, iterated on a few features, and added a clear-table button, an import-CSV button, and the ability to delete lines, all using only vanilla JS and some basic CSS.
 
Okay, you may be wondering about things like MRR or acquisition keywords, but the answer is that this one file saved me from having to do auditing or constant retraining. That's an extra hour a day I didn't have to spend working, so I could write up blog posts like this. Luckily I'm salaried, so I still get paid to do this, and 5 hours a week saved was around ~$12k/year for me at the time.

Above the API: What Developers Contribute When AI Can Code

2026-02-03 12:48:49

An AI researcher told me something that won't leave my head:

“If a human cannot outperform or meaningfully guide a frontier model on a task, the human's marginal value is effectively zero."

Now we have data. Anthropic's study with junior engineers shows that using AI without understanding leads to 17% lower mastery (two letter grades).

But some AI users scored high. The difference? They used AI to learn, not to delegate.

The question isn't "Can I use AI?" anymore.

It's "Am I using AI to understand, or to avoid understanding?"

The New Divide

There's a line forming in software development. Not senior vs junior. Not experienced vs beginner.

It's deeper than that.

Below the API:

  • Can execute tasks AI handles autonomously
  • Follows patterns without deep understanding
  • Accepts AI output without verification
  • Builds features fast but can't foresee disasters

Above the API:

  • Guides systems with judgment
  • Knows when AI is wrong
  • Produces outcomes AI can't generate
  • Exercises architectural thinking

The question: which side of that line are you on?

The Divide: What AI Does vs What Humans Still Own

| Domain | AI Capability (Below) | Human Capability (Above) | Why It Matters |
| --- | --- | --- | --- |
| Code Generation | Fast, comprehensive output | Knows what to delete | AI over-engineers by default |
| Debugging | Pattern matching from training data | System-level architectural thinking | AI misses root causes across components |
| Architecture | Local optimization within context | Big-picture coherence | AI can't foresee cascading disasters |
| Refactoring | Mechanical transformation of code | Judgment on when/why/if to refactor | AI doesn't understand technical debt tradeoffs |
| Learning | Instant recall from training | Hard-won skepticism through pain | AI hasn't been burned by its own mistakes |
| Verification | Cheap domains (does it compile?) | Expensive domains (is this the right approach?) | AI can't judge "good" vs "working" |
| Consistency | Struggles across multiple files | Maintains patterns across codebase | AI loses context, creates inconsistent implementations |
| Simplification | Adds features comprehensively | Discipline to reject complexity | AI defaults to kitchen-sink solutions |

Below the API: Can execute what AI suggests

Above the API: Can judge whether AI's suggestion is actually good

The line isn't about what you can build. It's about what you can verify, simplify, and maintain.

Why AI Makes Juniors Fast But Seniors Irreplaceable

Tiago Forte observed something crucial about AI-assisted development:

"Claude Code makes it easier to build something from scratch than to modify what exists. The value of building v1s will plummet, but the value of maintaining v2s will skyrocket."

The v1/v2 reality:

A junior developer uses Claude to build an authentication system. 200 lines of code, 20 minutes, tests pass, ships to production. Their portfolio looks impressive.

Six months later: the business needs SSO integration. Now they're debugging auth logic they didn't write, following patterns AI chose for reasons they don't understand, with zero architectural context. What should take 4 hours takes 3 days—because they never learned to structure v1 with v2 in mind.

This is the v1/v2 trap in action.

Skills AI Commoditizes (v1 territory):

  • Building greenfield projects
  • Generating boilerplate
  • Following templates
  • Speed and feature velocity

Skills AI Can't Replace (v2+ territory):

  • Debugging existing systems
  • Understanding technical debt
  • Knowing when to refactor vs rebuild
  • Maintaining architectural coherence

Here's the trap: Junior developers are using AI to build impressive v1 projects for their portfolios. But they're never learning the v2+ maintenance skills that actually command premium rates.

As Ben Podraza noted in response to Tiago: "Works great until you ask it to create two webpages with the same formatting. Then you iterate for hours burning thousands of tokens."

Consistency is hard. Context is hard. Legacy understanding is hard.

Those are exactly the skills you learn from working in mature codebases, reading other people's code, struggling through refactoring decisions.

The knowledge commons taught v2+ skills. AI teaches v1 skills.

Guess which one the market will pay for in 2027?

The Architecture Gap

Uncle Bob Martin (author of Clean Code) has been coding with Claude. His observation cuts to what humans still contribute:

"Claude codes faster than I do by a significant factor. It can hold more details in its 'mind' than I can. But Claude cannot hold the big picture. It doesn't understand architecture. And although it appreciates refactoring, it shows no inclination to acquire that value for itself. It does not foresee the disaster it is creating."

The danger: AI makes adding features so easy that you skip the "slow down and think" step.

"Left unchecked, AI will pile code on top of code making a mess. By the same token, it's so easy for humans to use AI to add features that they pile feature upon feature making a mess of things."

When someone asked "How much does code quality matter when we stop interacting directly with code?", Uncle Bob's response was stark:

"I'm starting to think code quality matters even more."

Why? Because someone still has to maintain architectural coherence across the mess AI generates. That someone needs to understand both what the code does AND why it was structured that way.

The Claude Code Reality Check

Since Claude Code and Anthropic's Model Context Protocol (MCP) launched, developers have been experimenting with AI-first workflows. The results mirror Uncle Bob's observation exactly: AI is incredibly fast at implementation but blind to architectural consequences.

What Claude Code excels at:

  • Generating boilerplate quickly
  • Following explicit patterns within single files
  • Maintaining local context for focused tasks
  • Implementing well-defined specifications

Where it fails (by design):

  • Understanding project-wide architecture
  • Maintaining consistency across multiple files
  • Knowing when to slow down and reconsider the approach
  • Foreseeing how today's "quick fix" becomes tomorrow's technical debt
  • Asking "should we even build this?" instead of "how do we build this?"

The tool is powerful. I use it daily. But treating it as autopilot instead of compass leads to the "code pile" Uncle Bob warned about.

This isn't Claude's limitation—it's a fundamental constraint of current AI architecture. As Peter Truchly explained in the comments: "LLMs are not built to seek the truth. They're trained for output coherency (a.k.a. helpfulness)."

An LLM will confidently generate code that compiles and runs. Whether it's the right code—architecturally sound, maintainable, simple—requires human judgment in what Ben Santora calls "expensive verification domains."

That judgment is what keeps you Above the API.

The Skills That Actually Matter

From the discussions in my knowledge collapse article, here's what keeps you Above the API:

1. Architectural Thinking (Uncle Bob's "Big Picture")

  • Knowing when to slow down
  • Seeing consequences AI can't predict
  • Making refactoring decisions with context
  • Balancing technical debt vs new features

2. V2+ Mastery (Tiago's Maintenance Skills)

  • Debugging complex existing systems
  • Understanding why code was written certain ways
  • Maintaining consistency across iterations
  • Choosing between rebuild vs refactor

3. Verification Capability (Ben Santora's "Judge" Layer)

  • Knowing when AI is confidently wrong
  • Distinguishing cheap vs expensive verification domains
  • Building skepticism without becoming paralyzed
  • Testing assumptions, not just accepting outputs

As Ben Santora explained in his work on AI reasoning limits:

"Knowledge collapse happens when solver output is recycled without a strong, independent judging layer to validate it. The risk is not in AI writing content; it comes from AI becoming its own authority."

Cheap verification domains:

  • Code compiles or doesn't
  • Tests pass or fail
  • API returns correct response

Expensive verification domains:

  • Is this architecture sound?
  • Will this scale?
  • Is this maintainable?
  • Is this the right approach?

AI sounds equally confident in both domains. But in expensive verification domains, you won't know you're wrong until months later when the system falls over in production.

4. Discipline to Simplify (Doogal Simpson's "Editing")

In the comments, Doogal Simpson reframed the shift from scarcity to abundance:

"We are trading the friction of search for the discipline of editing. The challenge now isn't generating the code, but having the guts to reject the 'Kitchen Sink' solutions the AI offers."

Old economy: Scarcity forced simplicity (finding answers was expensive)
New economy: Abundance requires discipline (AI generates everything, you must delete)

The skill shifts from ADDING to DELETING. From generating to curating. From solving to judging.

5. Domain Expertise (John H's Context)

In the comments, John H explained how he uses AI effectively as a one-man dev shop:

"I can concentrate on being the knowledge worker, ensuring the business rules are met and that the product meets the customer usability requirements."

What John brings:

  • 3 years with his application
  • Deep customer knowledge
  • Business rules understanding
  • Can verify if AI output actually solves the right problem

John isn't using AI as autopilot. He's using it as a force multiplier while staying as the judge.

The pattern: Experienced developers with deep context use AI effectively. They can verify output, catch errors, know when to override suggestions.

The problem: Can juniors learn this approach without first building the hard-won experience that makes verification possible?

The Anthropic Study: Using AI vs Learning With AI

While writing this piece, Anthropic published experimental data that validates the Above/Below divide.

In a randomized controlled trial with junior engineers:

  • AI-assistance group finished ~2 minutes faster
  • But scored 17% lower on mastery quiz (two letter grades)
  • "Significant decrease in mastery"

However: Some in the AI group scored highly while using AI.

The difference? They asked "conceptual and clarifying questions to understand the code they were working with—rather than delegating or relying on AI."

This is the divide:

Below the API (delegating):

"AI, write this function for me" → Fast → No understanding → Failed quiz

Above the API (learning with AI):

"AI, explain why this approach works" → Slower but understands → Scored high

Speed without understanding = Below the API.

Understanding while using AI = Above the API.

The tool is the same. Your approach determines which side you're on.

[Source: Anthropic's study, January 2026]

The Last Generation Problem

In the comments, Maame Afua revealed something crucial: she's a junior developer, but she's using AI effectively because she had mentors.

"I got loads of advice from really good developers who have been through the old school system (without AI). I have been following their advice."

The transmission mechanism: Pre-AI developers teaching verification skills to AI-era juniors.

Maame can verify AI output not because she's experienced, but because experienced devs taught her to be skeptical. Her learning path:

  1. Build foundation first (books, docs, accredited resources)
  2. Use AI as assistant, not primary learning tool
  3. Verify against authoritative sources
  4. Never implement what she can't explain

But here's the cliff we're approaching:

Right now, there are enough pre-AI developers to mentor. In 5-10 years, most seniors will have learned primarily WITH AI.

Who teaches the next generation to doubt? Who transfers verification habits when nobody has them?

We're one generation away from losing the transmission mechanism entirely.

Maame is lucky. She found good mentors before the window closed. The juniors starting in 2030 won't have that option.

How People Learn Verification

Two developers in the comments showed different paths to building verification skills:

The Hard Way (ujja)

ujja learned "zero-trust reasoning" through painful experience:

"Trusted AI a bit too much, moved fast, and only realized days later that a core assumption was wrong. By then it was already baked into the design and logic, so I had to scrap big chunks and start over."

His mental model shifted:

  • Before: "Does this sound right?"
  • After: "What would make this wrong?"

He now treats AI like "a very confident junior dev - super helpful, but needs review."

His insight: "I do not think pain is required, but without some kind of feedback loop like wasted time or broken builds, it is hard to internalize. AI removes friction, so people skip verification until the cost shows up later."

The Deliberate Way (Fernando)

Fernando Fornieles recognized the problem months ago and took action without waiting to get burned:

  • Closed private social media accounts
  • Migrated to fediverse
  • Built home cloud server (Nextcloud on Raspberry Pi)
  • Actively avoiding platform "enshittification"

He's not learning through pain. He's acting on principles.

The question: Can we teach ujja's learned skepticism without the pain? Can we scale Fernando's deliberate action?

Or does every junior need to scrap a week's work before they learn to verify AI output?

What the Knowledge Commons Taught

Stack Overflow debates taught architecture. Someone would propose a solution, others would tear it apart, consensus would emerge through friction. That friction built judgment.

Code review culture taught "slow down and think." You couldn't just ship it - someone would ask "why this approach?" and you'd have to justify architectural decisions.

Painful bugs taught foreseeing disaster. You'd implement something that seemed fine, it would blow up in production, you'd learn to see those patterns early.

Legacy codebases taught refactoring judgment. You'd maintain someone else's decisions, understand their constraints, learn when to preserve vs rebuild.

All of this happened in public. On Stack Overflow. In code review comments. In GitHub issues. In conference talks.

AI assistance happens in private. Individual optimization. No public friction. No collective refinement.

The skills that keep you Above the API were taught by the knowledge commons we're killing.

Practical Actions

If You're Junior/Early Career:

Seek pre-AI mentors actively

  • Find developers who learned before ChatGPT
  • Ask them to review your AI-generated code
  • Learn their skepticism patterns

Work in mature codebases

  • Don't just build greenfield projects
  • Contribute to established open source
  • Learn from technical debt decisions

Document your reasoning publicly

  • Write about WHY you chose approaches
  • Publish debugging journeys, not just solutions
  • Contribute to the commons you're consuming

Build verification habits explicitly

  • Always check AI output against docs
  • Test assumptions, don't just ship
  • Learn to recognize "confident wrongness"

Treat AI like ujja does

  • "Very confident junior dev"
  • Super helpful, but needs review
  • Ask "what would make this wrong?" not "does this sound right?"

If You're Senior/Experienced:

Mentor explicitly

  • Teach verification, not just syntax
  • Share your skepticism patterns
  • Explain architectural thinking out loud

Preserve architectural knowledge

  • Document WHY decisions were made
  • Publish architecture decision records
  • Write about the disasters you foresaw

Contribute to commons deliberately

  • Answer questions on Stack Overflow
  • Write detailed technical blog posts
  • Open source your reasoning, not just code

Make "slow down and think" visible

  • Show juniors when you pause to consider
  • Explain the questions you ask AI
  • Demonstrate the editing/simplification process

The Uncomfortable Questions

The AGI Wild Card

In the comments on my knowledge collapse article, Leob raised the ultimate question: what if AI achieves true invention?

"Next breakthrough for AI would be if it can 'invent' something by itself, pose new questions, autonomously create content, instead of only regurgitating what's been fed to it."

If that happens, "Above the API" might become irrelevant.

But as Uncle Bob observed: "AI cannot hold the big picture. It doesn't understand architecture."

Peter Truchly added technical depth to this limitation:

"LLMs are not built to seek the truth. Gödel/Turing limitations do apply but LLM does not even care. The LLMs are just trained for output coherency (a.k.a. helpfulness)."

Two possible futures:

Scenario 1: AI remains sophisticated recombinator
Knowledge collapse poisons training data. Model quality degrades. The Above/Below divide matters enormously. Your architectural thinking and verification skills remain valuable for decades.

Scenario 2: AI achieves AGI and true invention
Knowledge collapse doesn't matter because AI generates novel knowledge. But then... what do humans contribute?

Betting everything on "AGI will save us from knowledge collapse" feels risky when we're already seeing the collapse happen.

Maybe we should fix the problem we KNOW exists rather than hoping for a breakthrough that might make everything worse.

Does Software Even Need Humans?

Mike Talbot pushed back on my entire premise:

"Why do humans need to build a knowledge base? So that they and others can make things work? Who cares about the knowledge base if the software works?"

His argument: Knowledge bases exist to help HUMANS build software. If AI can build software without human knowledge bases, who cares if Stack Overflow dies?

He used a personal example:

"I wrote my first computer game. I clearly remember working on a Disney project in the 90s and coming up with compiled sprites. All of that knowledge, all of that documentation, wiped out by graphics cards. Nobody cared about my compiled sprites; they cared about working software."

His point: Every paradigm shift makes previous knowledge obsolete. Maybe AI is just the next shift.

My response: Graphics cards didn't train on his compiled sprite documentation. They were a fundamentally different approach. AI is training on Stack Overflow, Wikipedia, GitHub. If those die and AI trains on AI output, we get model collapse not paradigm shift.

Mike's challenge matters because it forces clarity: Are we preserving human knowledge because it's inherently valuable? Or because it's necessary for AI to keep improving?

If AGI emerges, his question becomes more urgent. If it doesn't, preserving human knowledge becomes more critical.

What You Actually Contribute

Back to the original question: "What do you contribute that AI cannot?"

You contribute verification. AI solves problems. You judge if the solution is actually good.

You contribute architecture. AI writes code. You see the big picture it can't hold.

You contribute foresight. AI optimizes locally. You prevent disasters it doesn't see coming.

You contribute context. AI has patterns. You have domain expertise, customer knowledge, historical understanding.

You contribute judgment in expensive verification domains. AI excels where verification is cheap (does it compile?). You excel where verification is expensive (will this scale? is this maintainable? is this the right approach?).

You contribute simplification. AI generates comprehensive solutions. You have the discipline to delete complexity.

You contribute continuity. AI is stateless. You maintain coherence across systems, teams, and time.

But here's the uncomfortable truth: none of these skills are guaranteed.

They're learned. Through friction. Through pain. Through public struggle. Through mentorship from people who learned the hard way.

If we kill the knowledge commons, we kill the training grounds for Above-the-API skills.

If we stop mentoring explicitly, we lose the transmission mechanism in one generation.

If we optimize purely for velocity, we lose the "slow down and think" muscle.

Staying Above the API isn't automatic. It's a choice you make every day.

Choose to verify, not just accept.

Choose to simplify, not just generate.

Choose to foresee, not just react.

Choose to mentor, not just build.

Choose to publish, not just consume.

The API line is real. Which side will you be on?

This piece was built from discussions with developers working through these questions publicly. Special thanks to Uncle Bob Martin, Tiago Forte, Maame Afua, ujja, Fernando Fornieles, Doogal Simpson, Ben Santora, John H, Mike Talbot, Leob, Peter Truchly, and everyone else thinking through this transition.

What skills do you think will keep developers Above the API? What am I missing? Let's figure this out together.

Part of a series: