Blog of The Practical Developer

Three Responsibilities of a Global Application (Part 2)

2026-03-17 13:00:00

In Part 1, I explained that at a global scale, trust is part of the architecture. Not trust as a feeling, but trust as something the system must enforce and prove.

In this article, I explain the three distinct responsibilities that let a system grow organically.

Why global systems feel complex

Most complexity in global systems does not come from the number of services. It comes from mixing concerns.
I have seen this pattern many times:

  • CloudWatch dashboards are used to answer audit questions
  • CloudTrail logs are pulled into debugging workflows
  • Metrics start carrying tenant identifiers just to be safe

None of the above are wrong in isolation, but when they are together, they create systems that are:

  • Hard to operate
  • Hard to explain
  • Hard to defend

The problem is that the system is trying to answer too many different questions at once. In short, it is the same pattern the Single Responsibility Principle addresses in code.
My job brought me to a point where I stopped thinking in terms of architectures and started thinking in terms of responsibilities. No matter how the application is built, it must answer the same three questions:

  1. How does work actually happen?
  2. How can we prove that work happened correctly?
  3. How do we know whether the system is healthy?

When these responsibilities are clearly separated, decisions become easier. When they are mixed, every discussion becomes a mess.

Responsibility #1 — Doing the work (execution)

This is the responsibility most devs are comfortable with.

It is where:

  • Business logic runs
  • Requests are processed
  • Events are handled
  • Workflows progress

In AWS terms, this is:

  • AWS Lambda
  • Step Functions
  • EventBridge
  • DynamoDB
  • SQS
  • SNS

This responsibility answers one question only:

What does the system do for the business?

And it should be optimised for:

  • Correctness
  • Scalability
  • Resilience
  • Isolation

Problems start when this responsibility is overloaded.
Examples:

  • embedding compliance logic directly into business code
  • adding "just in case" logging everywhere without structure
  • leaking operational concerns into domain logic

Execution code should focus on doing the work, not explaining or defending it.
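As a minimal sketch of that separation (the order-processing domain and all names here are illustrative, not from any real codebase): the function below carries only the execution responsibility, and the evidence and operations layers observe its result from the outside instead of being wired into it.

```python
from dataclasses import dataclass

@dataclass
class OrderResult:
    order_id: str
    status: str

def process_order(order_id: str, amount: float) -> OrderResult:
    """Execution responsibility only: validate input and apply business rules.

    No audit-trail writes, no metric emission, no tenant bookkeeping here;
    those belong to the evidence and operations layers, which consume the
    returned result rather than living inside the business logic.
    """
    if amount <= 0:
        return OrderResult(order_id, "rejected")
    return OrderResult(order_id, "accepted")

result = process_order("ord-42", 19.99)
print(result.status)  # the evidence/operations layers record this from outside
```

The payoff is that the business code stays testable on its own, and the other two responsibilities can evolve (new audit requirements, new dashboards) without touching it.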

Responsibility #2 — Proving the work (evidence and control)

This responsibility exists because someone outside the team will ask questions like:

  • Who had access?
  • Who changed production?
  • What data moved where?
  • Was logging enabled at the time?

This responsibility is not about debugging. It is about proof.

In AWS, this responsibility is typically expressed through services such as CloudTrail and AWS Config.

A common issue is teams trying to reuse execution or observability data as evidence. That usually fails because:

  • Logs change format
  • Metrics are aggregated
  • Dashboards get deleted
  • Devs remember things differently

Evidence systems must be:

  • Complete
  • Consistent
  • Tamper‑resistant

This is why this responsibility must be separate from execution.
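One common way to get tamper resistance, shown here as a minimal illustration only (managed services like CloudTrail provide log file integrity validation for this in practice), is to hash-chain evidence records so that any later modification is detectable:

```python
import hashlib
import json

def append_record(chain: list[dict], event: dict) -> None:
    """Append an evidence record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"event": event, "hash": digest})

def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; tampering anywhere breaks the chain from that point on."""
    prev_hash = "0" * 64
    for record in chain:
        payload = json.dumps(record["event"], sort_keys=True)
        if record["hash"] != hashlib.sha256((prev_hash + payload).encode()).hexdigest():
            return False
        prev_hash = record["hash"]
    return True

chain: list[dict] = []
append_record(chain, {"actor": "alice", "action": "deploy", "region": "eu-west-1"})
append_record(chain, {"actor": "bob", "action": "read", "region": "us-east-1"})
print(verify_chain(chain))   # True
chain[0]["event"]["actor"] = "mallory"
print(verify_chain(chain))   # False: the edit is detectable
```

Execution logs and dashboards give you none of these guarantees, which is exactly why evidence has to be its own responsibility.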

Responsibility #3 — Understanding the system (operations)

This responsibility answers a very different question:

Is the system healthy right now?

Not:

  • What happened to tenant X?
  • Who changed this?

But:

  • Are errors increasing?
  • Is latency degrading?
  • Is this regional or global?

And the answers come from:

  • Metrics
  • Alerts
  • SLOs
  • Dashboards

In AWS environments, this usually means:

  • CloudWatch metrics
  • Amazon Managed Prometheus
  • Service telemetry
  • Alerts

Those services exist to trigger investigation and action.

Why mixing responsibilities breaks systems

Once these three responsibilities are separated, many things become obvious. For example:

  • Metrics with tenantId: that is usually execution detail leaking into operations, and it is not what metrics are actually meant for.
  • CloudWatch dashboards as audit evidence: dashboards explain system behaviour, while auditors need immutable, verifiable evidence.
  • Debugging incidents by scrolling through CloudTrail: CloudTrail is excellent at answering who did what, but it is a terrible tool for answering why the system is behaving this way right now.

Each of these feels fine on its own, but at scale, together they blur what the system is actually trying to tell us.

Benefits of separation

Once responsibilities are separated, conversations change.

Instead of:

Should we centralise logs?

I ask:

Which responsibility are we trying to serve?

Instead of:

Why can't I just add tenantId to metrics?

I ask:

Is this an operational signal or an accounting question?

Instead of:

Why is governance slowing us down?

I ask:

Which responsibility are we trying to satisfy?

The trade‑offs are the same, but they are made explicit.

Conclusion

From a governance angle, global systems do not fail because they are distributed. They fail because we ask one system to do everything at once.
Separating responsibilities does not reduce complexity; it puts complexity where it belongs.

Beyond the Default: Building a Cost-Optimized Hybrid AWS Architecture

2026-03-17 12:51:52

In the world of AWS, "default" settings are often the fastest way to an expensive monthly bill. Recently, a Senior DevOps Engineer dropped a strategic hint on one of my posts that challenged the standard EC2-only approach:

"Lightsail is cheaper for the public-facing interface. You don't get much CPU, but enough SSD for cache. Then connect to your EC2."

I didn’t just read the comment. I built the system myself. I used AWS Lightsail for predictable costs and EC2 for extra computing power, creating a setup that greatly reduces Data Transfer Out (DTO) costs.

The "Aha!" Moment: Fighting the $0.09/GB Trap

Most developers realize too late that AWS charges roughly $0.09 per GB for outbound data from EC2 (after the first 100GB). If your site pushes 1TB of traffic, that’s roughly a $90 bill for bandwidth alone.
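The arithmetic behind that bill is simple enough to sketch. The $0.09/GB rate and the 100 GB free allowance are the figures quoted above; check current AWS pricing for your region before relying on them.

```python
def ec2_dto_cost(gb_out: float, rate_per_gb: float = 0.09, free_gb: float = 100) -> float:
    """Estimate monthly EC2 Data Transfer Out cost after the free allowance."""
    return round(max(gb_out - free_gb, 0) * rate_per_gb, 2)

print(ec2_dto_cost(1024))  # roughly 1 TiB of outbound traffic -> 83.16
print(ec2_dto_cost(50))    # under the free allowance -> 0.0
```

At multi-terabyte scale the curve is linear, which is exactly why routing the bulk of outbound traffic through Lightsail's bundled transfer changes the economics.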

That’s when the "Aha!" moment hit me: AWS Lightsail isn’t just for beginners. Its $5/month plan includes 1TB of outbound data transfer. By using Lightsail as an Nginx reverse proxy, you’re essentially buying a "bandwidth insurance policy."

The strategy: route public-facing traffic through the Lightsail “front door” and keep the connection to your EC2 backend over a Lightsail VPC peering connection. This traffic stays on AWS’s private network, keeping your costs predictable while shielding your backend from the public internet.

The Build & Technical Insights

  1. Compute Strategy: Optimized Frontend, Scalable Backend
    Frontends rarely need heavy CPU. Lightsail’s burstable CPU is perfect for an Nginx reverse proxy, while EC2 handles the heavy lifting. Use Auto Scaling Groups and instance types optimized for your workload to ensure backend performance without overpaying for the frontend.

  2. Networking: The VPC Peering Bridge

Setting up Lightsail → EC2 connectivity requires a VPC peering connection. Key detail: both route tables need manual updates:

  • Lightsail Side: Add a route for the EC2 VPC CIDR pointing to the peering connection.
  • EC2 Side: Add a route for the Lightsail VPC CIDR pointing back. This ensures traffic flows privately between Lightsail and EC2.
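That symmetry requirement can be sanity-checked locally. The CIDRs, peering-connection ID, and dict representation of the route tables below are hypothetical stand-ins for illustration; this does not call AWS, it only encodes the rule that each side must route the other side's CIDR to the peering connection.

```python
import ipaddress

# Hypothetical ranges; substitute your actual Lightsail and EC2 VPC CIDRs.
LIGHTSAIL_CIDR = "172.26.0.0/16"
EC2_CIDR = "172.31.0.0/16"

lightsail_routes = {"172.26.0.0/16": "local", "172.31.0.0/16": "pcx-abc123"}
ec2_routes = {"172.31.0.0/16": "local", "172.26.0.0/16": "pcx-abc123"}

def routes_cover(routes: dict, peer_cidr: str) -> bool:
    """True if some route's destination contains the peer CIDR and
    its target is a peering connection (pcx-* in this sketch)."""
    peer = ipaddress.ip_network(peer_cidr)
    return any(
        peer.subnet_of(ipaddress.ip_network(dest)) and target.startswith("pcx-")
        for dest, target in routes.items()
    )

print(routes_cover(lightsail_routes, EC2_CIDR))   # Lightsail can reach EC2
print(routes_cover(ec2_routes, LIGHTSAIL_CIDR))   # EC2 can answer back
```

If either check fails, you get exactly the symptom described next: packets leave one side and die, because return traffic has nowhere to go.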

  3. Troubleshooting Masterclass: Traceroute is King

When initial curl requests timed out, I didn't guess; I traced the packets:

traceroute 172.31.x.x

Packets left the instance but died after two hops, revealing a routing table gap rather than a misconfigured Nginx. One command saved me an hour of troubleshooting.

Production Optimizations

Web Server & Static Assets

  • OpenLiteSpeed (OLS): For WordPress or PHP apps, OLS handles concurrent users better than Apache and is more resource-efficient.
  • CloudFront: Offload static assets to a CDN. Free tier: 1TB per month for the first 12 months. Reduces load on your Lightsail proxy.

Security & Traffic Management

  • Cloudflare: DNS, WAF, and DDoS protection hide your Lightsail public IP and add resilience.
  • TLS Termination: Offload HTTPS at Cloudflare or Lightsail to reduce backend CPU usage.

Email & Notifications

  • AWS SES: Send up to 62,000 free emails/month from EC2 instances under the free tier, perfect for transactional mail and app notifications.

Hardware Efficiency

  • ARM (t4g / Graviton2): ~20% better price-to-performance than x86 Intel for many workloads.
  • AMD (t3a): ~10% cheaper than Intel for x86 workloads sensitive to micro-latency.

With this setup (Cloudflare + CloudFront + Lightsail + EC2 + SES), you can run a robust, scalable stack for ~$15/month, assuming moderate traffic and efficient use of free tiers.

Production Readiness Notes

  • Monitoring & Logging: Use CloudWatch for EC2 metrics and Lightsail logs. Monitor Nginx access/error logs.
  • High Availability: Lightsail is single-AZ. For mission-critical apps, consider multi-region failover.
  • Auto Scaling: Keep EC2 backends in Auto Scaling Groups to handle traffic spikes.
  • Backup Strategy: Snapshot EC2 and Lightsail instances regularly.

Conclusion

This project proved that in DevOps, curiosity pays off. A single comment led me to a hybrid architecture that optimizes costs, improves scalability, and demonstrates professional-grade cloud engineering. Thank you sir Harith!

I’ve documented the full lab guide, including the Nginx configs and peering steps, in my GitHub repository here: aws-lightsail-nginx-lab

Not Everything Needs MCP, Part 2: The 2026 Phase Transition — When Three Independent Roads Led to the Same Conclusion

2026-03-17 12:46:24

The Ancient Past of Eighteen Months Ago — And What It Taught Us About the Future of AI Agents

Let me tell you a story from the ancient past.

By which I mean eighteen months ago.

In the world of AI, eighteen months is geological time. Think back to mid-2024. Context windows were small. "Prompt engineering" was the skill everyone was hiring for. MCP didn't exist yet. The idea of AI agents autonomously operating external services was mostly theoretical.

I was building a medical AI product in Osaka, Japan. And I had a problem that, looking back, contained the seed of everything that happened in 2026.

This is Part 2 of my "Not Everything Needs MCP" series. Part 1 told the story of Google Workspace CLI implementing a full MCP server, then deliberately deleting all 1,151 lines of it two days after launch. That investigation revealed an architectural mismatch between MCP's protocol design and large-scale APIs.

But that was only one data point. Since publishing that article, I discovered two more — and together, they tell a much bigger story about where AI agent architecture is heading in 2026.

The Timestamp Hack: Before MCP Had a Name

In early 2024, I was working on an AI assistant for my company's medical IT platform. We serve clinics across the Kansai region of Japan (Osaka, etc.) — and I'd been using ChatGPT's Custom GPTs to prototype workflows.

I had a simple need: I wanted every AI response to include the exact timestamp of when the conversation happened. Not for fun — for traceability. In medical IT, knowing when a decision was discussed matters. It matters for audits. It matters for compliance. It turned out to matter for patent applications too.

Here's what I did. I deployed a tiny Web API on a server we host publicly. It did exactly one thing: return the current time. Then I configured the Custom GPT to call this API before every response, and output the timestamp first.

The result looked like this:

User: Hey, long time no see!
(Communicated with universalkarte.com)

🕐 Response time: 2025-04-02 09:39:00 (JST) / 2025-04-02 00:39:00 (UTC)

Oh wow, it's been a while! So great to hear from you! 😊

A web API that returns a timestamp. Called before every response. Output deterministically. Nothing more, nothing less. That's all it did.

At the time, this was called "Function Calling" or "Tool Use" — the predecessors to what Anthropic would later formalize as MCP in November 2024. I didn't know I was implementing a pattern that would become the center of a protocol war. I just needed a clock.

But here's what matters: the design decision I made instinctively was to keep the external call as small and deterministic as possible. One API. One purpose. Minimal payload. The LLM didn't need to understand time zones or server infrastructure — it just needed to paste the result.

It wasn't a "hack" because I was lazy. It was an architectural instinct: keep the LLM away from what the system already knows. Deterministic output for a deterministic need. Don't make the AI think about the time — just give it the time.

Looking back now, eighteen months later, it turns out this minimal pattern — one deterministic call, zero reasoning overhead — was already the architecture that the rest of the industry would independently converge on. I didn't see it that way at the time. I was just solving a problem.

The MCP Honeymoon — And the Hangover

November 2024. Anthropic open-sourced MCP. By February 2025, Google and others rushed to announce MCP support. The community was electric. Finally, a standard protocol for connecting LLMs to external tools!

I dove in immediately. I connected MCP servers for GitHub, for databases, for various services. Context windows were getting larger. The future felt bright.

And at first, it was genuinely impressive. GitHub operations that used to require manual terminal commands — commits with thoughtful messages, PR creation, branch management — the AI handled them smoothly through MCP. I felt the productivity gains. They were real.

But then something else started happening.

The AI started getting... dumber.

Not in the "wrong answer" sense. In fact, the AI got better at executing tasks exactly as intended — MCP meant it could commit code, create PRs, and query databases with precision. But something subtler was degrading. The quality of reasoning. The ability to take a vague idea and turn it into a structured thought. What I call "zero-to-one thinking" — the creative, synthetic part of working with an LLM.

I spent the second half of 2025 with this nagging feeling. More tools, more capabilities, but less... intelligence. More precise in execution, less insightful in thought. I kept thinking: "I wish context windows would just get bigger so this wouldn't matter." But I also suspected that bigger windows alone wouldn't fix it — the AI would probably just get confused in different ways.

I couldn't quantify this feeling at the time. But I now know that researchers were documenting exactly what I was experiencing.

The Science Behind "Getting Dumber"

It turns out my gut feeling had a name: context rot.

Here's what researchers found — and why it matters for anyone loading MCP servers into their workflow:

  • Context Rot (Chroma Research): Irrelevant context degrades reasoning first. Retrieval survives; thinking dies.
  • Reasoning Degradation with Long Context Windows (14-model benchmark): Reasoning ability decays as a function of input size, even when the model can still find the right information.
  • Maximum Effective Context Window (Paulsen, 2025): The actual usable window is up to 99% smaller than advertised, with severe degradation at just 1,000 tokens in some top models.
  • Fundamental Limits of LLMs at Scale (arXiv, 2026): Context compression, reasoning degradation, and retrieval fragility are proven architectural ceilings, not bugs to be patched.

Let me unpack why this hits MCP users so hard.

Chroma Research showed that as irrelevant context increases in an LLM's input, performance degrades — and the degradation is worse when the task requires genuine reasoning rather than simple retrieval. The less obvious the connection between question and answer, the more devastating the irrelevant context becomes.

The "Challenging LLMs Beyond Information Retrieval" study tested 14 different LLMs and demonstrated that reasoning ability degrades as a function of input size — even when the model can still find the right information. Information retrieval and reasoning are different capabilities, and reasoning breaks first.

And here's the connection to MCP that makes this personal:

A single popular MCP server like Playwright contains 21 tools. Just the definitions of those tools — names, descriptions, parameter schemas — consume over 11,700 tokens. And these definitions are included in every single message, whether you use the tools or not.

Now multiply that by 10 MCP servers. You've burned 100,000+ tokens on tool definitions alone. Your 200k context window is suddenly 70k. And it's not just smaller — it's polluted with information that actively degrades the model's ability to reason about the thing you actually asked it to do.
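The budget math is worth making explicit. The 11,700-token figure for Playwright's 21 tool definitions comes from the text above; treating it as a per-server average across ten servers is an assumption for illustration, and the gap between the ~83k result here and the "suddenly 70k" above is the rest of the conversation's overhead.

```python
context_window = 200_000
tokens_per_server = 11_700   # Playwright-sized MCP server, per the figure above
servers_enabled = 10

definition_overhead = tokens_per_server * servers_enabled
remaining = context_window - definition_overhead
print(f"{definition_overhead=} {remaining=}")  # 117,000 tokens spent before any work starts
```

And crucially, per the research above, those 117,000 tokens are not neutral dead weight: they are irrelevant context that actively degrades reasoning over the tokens that remain.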

This is what I felt. The AI wasn't broken. It was drowning. More tools meant more noise in the signal. More capability meant less room to think.

The 15,000-Character Prompt and the Limits of "Prompt Engineering"

While I was wrestling with MCP overhead, I was also building an AI-powered tool — essentially a converter that takes ambiguous, unstructured text input and generates structured, formatted output. Think of it as a bridge between how humans naturally communicate and how systems need to receive data.

The core of this tool is a system prompt. That prompt went through dozens of iterations. At its peak, it was 20,000 characters. I tested, compared outputs, and eventually settled on 15,000 characters.

15,000 characters of instructions. For a single task.

The whole time, a thought kept nagging me: "Would a human expert need 15,000 characters of instructions to do this job?" A domain specialist would need maybe a paragraph of guidance. The rest is knowledge they already have — accumulated through years of working in their field.

And that's when "prompt engineering" started feeling like what it really was: a brute-force workaround for the absence of domain expertise in the model's operating context.

But here's the twist. Despite the bloated prompt, the tool worked. Output quality stayed consistent and reliable. Why?

Because I had constrained the domain. The tool operated within a specific industry workflow — a narrow slice of reality with its own vocabulary, its own established patterns, its own expected output formats. By telling the LLM upfront "you are operating within this domain," the massive prompt became effective.

If you've ever worked with LLMs, you already know this intuitively: a purely descriptive, narrative-style prompt — no matter how long — doesn't guarantee output quality. We've all been there. But a prompt that constrains the domain changes the game.

Here's why, and you don't need a PhD to see it. Think about what's happening inside a Transformer model. The attention mechanism operates on an enormous matrix — in large models, tens of thousands of dimensions. Every token is trying to figure out which other tokens matter. When the domain is wide open, the model is searching for relevance across a vast, noisy space. The outputs fluctuate. The reasoning wanders. Anyone who's done even basic linear algebra — even 3×3 matrices in high school — can imagine what happens when you scale that uncertainty to tens of thousands of dimensions. Of course the output changes every time.

But constrain the domain, and you dramatically narrow where the model needs to look. The relevant vectors cluster. The gap between what the model retrieves and what the human intended shrinks toward zero. Domain limitation doesn't just help. It's the mechanism by which prompts actually work.

This taught me something that would later click into place: domain limitation is the real optimization. Not longer prompts. Not bigger context windows. Narrower scope.

And if that's true for prompts, shouldn't the same principle apply to how we design AI agents?

From Prompt Engineering to Architecture Engineering

As the tool matured, the architecture evolved in a direction I didn't fully appreciate at the time.

The initial version was pure prompt — a single, monolithic instruction set that did everything through LLM reasoning. Unstructured text in, structured text out.

But the real world isn't one output format. My domain required multiple types of structured documents — each with its own format, its own required fields, its own regulatory and compliance requirements. The number of output variations kept growing.

Trying to handle all of these through prompt engineering alone was... well, it was exactly the "spread the entire menu on the table" problem from Part 1.

So the architecture shifted. The LLM's output became fully structured JSON — deterministic, parseable, machine-readable. Document generation moved to Google Workspace via GCP. The LLM's job narrowed to what it's actually good at: understanding the input, extracting the meaning, structuring the reasoning. Everything else — formatting, template selection, compliance checks, document assembly — moved to deterministic systems.
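The shape of that split can be sketched in a few lines. The document types, fields, and templates below are hypothetical; the point is that the LLM's only contract is valid structured JSON, and everything after `json.loads` is deterministic and testable.

```python
import json

# Deterministic side: templates, validation, assembly. No LLM involved.
TEMPLATES = {
    "referral_letter": "Referral for {patient} regarding {condition}",
    "summary": "Summary: {condition} ({patient})",
}

def render(llm_output: str) -> str:
    """Parse the LLM's structured output, validate it, and fill a template."""
    data = json.loads(llm_output)          # fails loudly on malformed output
    doc_type = data["type"]
    if doc_type not in TEMPLATES:
        raise ValueError(f"unknown document type: {doc_type}")
    return TEMPLATES[doc_type].format(**data["fields"])

# The LLM's job ends at producing this structured blob:
blob = '{"type": "summary", "fields": {"patient": "A. Example", "condition": "follow-up"}}'
print(render(blob))  # Summary: follow-up (A. Example)
```

Compliance checks, formatting rules, and output variations all live on the deterministic side, where adding a new document type means adding a template, not another thousand characters of prompt.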

The LLM handles the ambiguous. Deterministic systems handle the deterministic.

I was doing this throughout 2025, iterating toward an architecture where AI reasoning and programmatic execution were cleanly separated. And I kept thinking about Google Workspace — if only there were a way to programmatically drive every Workspace API from the command line, it would be the perfect backend for the document generation pipeline...

And Then GWS Appeared

March 2026. Google released gws — Google Workspace CLI. A Rust-based CLI that covers nearly every Google Workspace API, with commands dynamically generated from Google's Discovery Service.

When I saw the announcement, my reaction was immediate: "This is it. This is what I've been waiting for."

A CLI that could drive Gmail, Drive, Docs, Sheets, Calendar — all from the command line, all returning structured JSON. Perfect for my document generation pipeline. Perfect for AI agent integration.

And then I noticed the articles mentioning MCP support. Perfect! I could connect it directly to—

$ gws mcp
{
  "error": {
    "code": 400,
    "message": "Unknown service 'mcp'."
  }
}

You know the rest. That investigation became Part 1. Google had implemented a full MCP server — 1,151 lines of Rust — then deliberately deleted it as a breaking change. Two days after launch.

At the time, I focused on the forensic story: what happened, why, and what it meant for tool design. But the deeper significance only hit me later.

Google didn't just remove MCP. Google arrived at the same architectural conclusion I had been groping toward with my own product — that for large-scale operations, the right pattern is CLI-first with structured output, not protocol-mediated tool discovery. "Order from the kitchen when you're hungry" beats "spread the entire menu on the table."

That was two independent arrivals at the same destination.

Then I found the third.

The Hackathon Winner's Blueprint

A few days after publishing Part 1, I came across the everything-claude-code repository by Affaan Mustafa (@affaanmustafa). Affaan won the Anthropic × Forum Ventures hackathon in NYC, building zenith.chat entirely with Claude Code in 8 hours. His repository — 77,000+ stars, 640+ commits, 76 contributors — packages 10+ months of daily Claude Code usage into a complete agent configuration system.

I started reading it out of curiosity. Within minutes, I was sitting bolt upright.

The philosophy was identical to what I'd been building independently.

Let me show you the parallels.

MCP: Deliberately Minimized

From Affaan's guide:

"Your 200k context window before compacting might only be 70k with too many tools enabled."

His rule of thumb: have 20–30 MCPs configured, but keep under 10 enabled and under 80 tools active. The repository includes mcp-configs/mcp-servers.json with explicit disabledMcpServers entries — actively turning off MCP servers to protect context space.
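That rule of thumb is easy to encode as a pre-flight check. The config structure below is a simplified stand-in for a file like `mcp-servers.json`, not its actual schema, and the server names and tool counts are illustrative.

```python
MAX_ENABLED_SERVERS = 10
MAX_ACTIVE_TOOLS = 80

servers = [
    {"name": "github", "enabled": True, "tools": 36},
    {"name": "playwright", "enabled": False, "tools": 21},  # configured but disabled
    {"name": "supabase", "enabled": True, "tools": 27},
]

enabled = [s for s in servers if s["enabled"]]
active_tools = sum(s["tools"] for s in enabled)

assert len(enabled) <= MAX_ENABLED_SERVERS, "too many MCP servers enabled"
assert active_tools <= MAX_ACTIVE_TOOLS, "tool definitions will crowd the context window"
print(f"{len(enabled)} servers enabled, {active_tools} tools active")
```

Having many servers configured is cheap; having many enabled is what burns the context window, which is why the disabled entries matter as much as the enabled ones.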

This is exactly what Google concluded with gws. And exactly what I experienced — more tools, less thinking room.

CLI Skills as MCP Replacements

From Affaan's longform guide:

"Instead of having the GitHub MCP loaded at all times, create a /gh-pr command that wraps gh pr create with your preferred options. Instead of the Supabase MCP eating context, create skills that use the Supabase CLI directly. The functionality is the same, the convenience is similar, but your context window is freed up for actual work."

Skills in Claude Code are Markdown files — tiny prompt templates that load only when invoked. A /gh-pr skill might be 200 tokens. The GitHub MCP server's tool definitions are thousands. Same functionality. Orders of magnitude less context consumption.

This is the "kitchen model" from Part 1, independently rediscovered by a power user.

Domain Expert Agents

The repository is organized into specialized subagents: planner.md, code-reviewer.md, tdd-guide.md, security-reviewer.md, build-error-resolver.md. Each agent has a narrow scope, specific tools, and defined behaviors.

This mirrors what I learned from my own product development — that established industries organize into specialties for a reason, and AI should follow the same principle. You don't ask a generalist to do a specialist's job. You don't ask a general-purpose agent to handle security review when a specialized security-reviewer agent would be more precise and use less context.

Context Hygiene as First Principle

Affaan's system includes automatic compaction hooks, session memory persistence, and strategic context management. The entire architecture is built around one principle: protect the context window for reasoning.

Not storage. Not tool definitions. Reasoning.

The Convergence

So here's what happened in 2026:

Google — a trillion-dollar company with the largest productivity API surface in the world — implemented MCP, stress-tested it against 200–400 tool definitions, and deleted it. Their conclusion: CLI-first with on-demand schema discovery. Context stays clean.

Affaan Mustafa — an individual developer who won an AI hackathon and spent 10+ months refining his workflow — independently concluded that MCP should be minimized, replaced with CLI skills where possible, and the context window should be protected for reasoning above all else.

I — a medical IT veteran building AI-powered tools in Japan — arrived at the same architecture through a completely different path. A timestamp API in 2024. The "getting dumber" experience in 2025. A product's evolution from monolithic prompt to JSON + deterministic pipeline. And then the forensic discovery of Google's MCP deletion.

Three different starting points. Three different domains. Three different scales. The same conclusion.

That's not coincidence. That's a phase transition.

What the 2026 Phase Transition Actually Means

When people talk about AI milestones, they usually mean model capabilities. GPT-4. Claude 3. Gemini Ultra. Bigger context windows. Better benchmarks.

But the real phase transition of 2026 isn't about model capabilities. It's about how we architect around the capabilities we already have.

The shift can be summarized in one sentence:

"Do it for me" is expensive. "Do this specific thing" is cheap.

Every token spent on tool definitions, prompt engineering, and ambiguous instructions is a token not spent on reasoning. And the research confirms what practitioners have been feeling: irrelevant context doesn't just waste space — it actively degrades the model's ability to think.

Here's what that means in practice:

The end of "prompt engineering" as we knew it. A 15,000-character prompt is a confession that we're compensating for missing architecture. The future is narrower prompts, domain-specific skills, and deterministic systems handling everything that doesn't require reasoning.

MCP is not dead — it's bounded. MCP remains excellent for small-to-medium tool sets (under 50 tools). But for large API surfaces, CLI-first is the proven pattern. The "everything via MCP" fantasy is over.

"Skills" are the new unit of AI agent design. Whether you call them Skills (Affaan), Agent Skills (Google), or domain-specific prompts (what I've been doing with my own tools), the pattern is the same: small, scoped, loaded on demand, discarded after use.

Context windows are not memory — they're working memory. Treating the context window as storage is like covering your entire desk with every book you own before you even pick up a pen. You haven't left any room to actually write. The desk needs to be clear for thinking — and every MCP tool definition, every bloated prompt, every retained conversation turn is another book on the pile.

The Human Parallel (Or: Why "Do It For Me" Was Always Expensive)

There's an observation I keep coming back to, and it's one that makes me laugh every time.

Consider how humans delegate work:

Boss: "Handle this, will you?"
Employee: (Internal monologue: What exactly? By when? In what format? Who approved this? What's the budget?) → 10 rounds of clarification follow.

Now consider the alternative:

Boss: "Run git commit -m 'fix: resolve auth timeout' && git push origin main."
Employee: Done. One round. Zero ambiguity.

The first conversation — the "human" one — requires the employee to infer intent, plan actions, select tools, estimate parameters, and verify assumptions. Every step of that inference costs time and mental bandwidth.

In LLM terms, every step of that inference costs tokens.

MCP tool definitions are the LLM equivalent of "let me explain everything you might possibly need to know before we start." CLI commands are the equivalent of "just do this one thing."

What the token economy has done — accidentally, beautifully — is make the cost of human communication ambiguity visible as a number. Every vague instruction, every "you know what I mean," every "figure it out" translates directly to token consumption that crowds out actual reasoning.

Someone with forty-plus years of programming experience — from assembly language to LLMs — finds this deeply ironic. We spent decades making computers understand human language. Now we're learning that the most efficient way to use language-understanding computers is... to give them precise, unambiguous commands. Like assembly language. Like CLI.

The wheel doesn't just turn. It circles back to the truth.

What Comes Next

If the pattern holds, the next phase is already emerging.

Domain-specific agent languages. Not natural language prompts. Not traditional programming languages. Something in between — structured enough for deterministic execution, flexible enough for AI reasoning. We're already seeing DSLs for agent workflows (LangGraph's graph definitions), constrained syntax languages designed for LLM generation, and YAML/JSON-based knowledge objects.

Agent architecture as a discipline. "Prompt engineer" was the job title of 2024. The 2026 equivalent is closer to "Agent Architect" or "Domain Skill Designer" — someone who understands how to decompose workflows into deterministic and non-deterministic components, and how to allocate context window real estate accordingly.

Domain specialization as a design principle. This is my domain bias speaking — I come from medical IT, where specialization has been refined over centuries. There's a reason medicine has cardiologists and dermatologists. It isn't bureaucratic — it's cognitive. A specialist holds deep domain knowledge that makes their work faster, more accurate, and more reliable. I believe AI agents should be organized the same way. Not one giant model that knows everything. A team of specialists, each with their own skills, routing tasks to the right expert. Every industry has its own version of "specialties." The principle is universal.

Closing

In Part 1, I wrote: "If you write about an OSS tool, run it first."

In Part 2, the lesson is different:

If three independent paths converge on the same conclusion, pay attention.

Google didn't read Affaan's guide before deleting MCP from gws. Affaan didn't study my architecture before recommending CLI skills over MCP. I didn't know about either of them when I built a timestamp API in 2024 and started separating deterministic from non-deterministic processing.

We all arrived at the same place: protect the context window for reasoning. Push everything deterministic to CLI, scripts, and structured pipelines. Load skills on demand. Discard them when done. Let the AI think.

That convergence — from a trillion-dollar company, a hackathon winner, and someone who's been writing code since assembly language was the only option — is what makes 2026 a phase transition.

Not because the models got better. Because we finally learned how to stop wasting them.

Try It Yourself

If you want to feel what "the 2026 phase transition" means in practice rather than just reading about it, the fastest way is to inject Affaan's system into your own Claude Code environment.

I did it myself. The difference was immediate — sessions stayed coherent longer, context stopped rotting mid-task, and the AI's reasoning felt sharper in ways that are hard to quantify but impossible to miss once you've experienced them.

The quickest path — install as a Plugin directly inside Claude Code:

# Inside Claude Code
/plugin marketplace add affaan-m/everything-claude-code
/plugin install everything-claude-code@everything-claude-code

That alone gives you the commands, skills, and hooks. You'll notice the difference.

For the full setup including rules and language-specific configurations:

git clone https://github.com/affaan-m/everything-claude-code.git
cd everything-claude-code
./install.sh typescript   # or: python / golang / rust

You don't need to install everything. Start with the plugin. Use it for a day. Pay attention to how long your sessions stay productive before context degrades. Compare it to yesterday.

I suspect you'll have your own moment of convergence — your own version of the realization that Google, Affaan, and I all had independently. That the bottleneck was never the model. It was how much of the context window we were wasting on everything except thinking.

Your setup is different from mine. Your domain is different. But the principle is the same.

Let the AI think.

And if this feels familiar —

it is.


GHSA-wwg8-6ffr-h4q2: Cross-Site Request Forgery in Admidio Role Management

2026-03-17 12:40:05


Vulnerability ID: GHSA-WWG8-6FFR-H4Q2
CVSS Score: 5.7
Published: 2026-03-16

Admidio versions 5.0.0 through 5.0.6 contain a Cross-Site Request Forgery (CSRF) vulnerability in the organizational role management module. The application fails to validate anti-CSRF tokens for state-changing operations including role deletion, activation, and deactivation. An attacker can leverage this flaw to perform unauthorized actions by tricking a privileged user into executing a malicious request.

TL;DR

A missing CSRF validation check in Admidio's role management module allows attackers to permanently delete or modify organizational roles by tricking authenticated administrators into clicking a malicious link.

⚠️ Exploit Status: POC

Technical Details

  • Vulnerability Type: Cross-Site Request Forgery (CSRF)
  • CWE ID: CWE-352
  • CVSS v3.1 Base Score: 5.7 (Medium)
  • Attack Vector: Network
  • User Interaction: Required
  • Privileges Required: Low
  • Exploit Status: Proof of Concept Available

Affected Systems

  • Admidio Core Application
  • admidio/admidio: >= 5.0.0, < 5.0.7 (Fixed in: 5.0.7)

Mitigation Strategies

  • Upgrade Admidio to version 5.0.7 or later
  • Manually patch modules/groups-roles/groups_roles.php to include SecurityUtils::validateCsrfToken()
  • Implement network-level logging to detect unusual Referer headers targeting role management endpoints
  • Educate administrative staff on the risks of clicking external links while authenticated to the application

Remediation Steps:

  1. Back up the Admidio database and application files.
  2. Download Admidio version 5.0.7 from the official repository.
  3. Deploy the updated application files to the web server.
  4. Verify that the role management features function correctly.
  5. Review access logs for potential historical exploitation of the vulnerability.

References

Read the full report for GHSA-WWG8-6FFR-H4Q2 on our website for more details including interactive diagrams and full exploit analysis.

Struggling to Learn Docker? I Built a Hands-On Learning Environment (DockersQuest)

2026-03-17 12:32:57

When I started learning Docker, one problem kept coming up:

I was always afraid of breaking things.

Running commands like docker rm, modifying containers, or experimenting with networks felt risky because once something broke, I didn’t know how to get back to a clean state.

So instead of just following tutorials, I tried building something to solve this.

The Idea

What if you could:

• Run real Docker commands
• Break containers freely
• Reset everything instantly
• Learn step-by-step through challenges

That’s how DockersQuest was born.

What DockersQuest Does

It’s a small learning environment where:

• Each challenge is defined using container setups
• You interact with real Docker commands
• The system validates your progress
• You can reset the environment anytime

The Hard Problems I Faced

  1. Resetting Environments Reliably

Users can run any command and completely change container state.

So I had to design a system that:

• destroys everything cleanly
• recreates environments from YAML
• ensures consistency every time
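As a rough sketch of that reset flow, the snippet below tears everything down and recreates containers from a challenge spec (shown as a dict, as it would look after parsing the YAML). The spec shape, container names, and the injected `run` callable are all illustrative assumptions, not DockersQuest's actual implementation; injecting `run` just makes the Docker CLI easy to stub.

```python
# Hypothetical reset flow: destroy all challenge containers, then
# recreate them from a parsed spec. `run` receives each CLI invocation,
# so a real version could pass subprocess.run and a test can record calls.
def reset_environment(spec, existing_containers, run):
    """Return the names of the freshly created containers."""
    for name in existing_containers:
        run(["docker", "rm", "-f", name])           # destroy cleanly
    for c in spec["containers"]:                    # recreate from spec
        run(["docker", "run", "-d", "--name", c["name"], c["image"]])
    return [c["name"] for c in spec["containers"]]
```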

  2. Designing the Learning Path

This was harder than coding.

Teaching Docker is not just commands — it’s sequence.

I struggled with:

• which commands should come first
• how to avoid overwhelming beginners
• how to make learning feel like progression, not documentation

  3. Validation Logic

Users can solve the same problem in multiple ways.

So instead of checking commands, I had to:

• inspect container states
• check running services
• validate outcomes instead of steps
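Outcome-based validation can be sketched like this. The state dict mirrors the rough shape of `docker inspect` output, and the `expected` goal format is an assumption for illustration; the real project would shell out to Docker rather than receive a dict.

```python
# Hypothetical outcome check: pass the challenge if the container's
# observable state matches the goal, regardless of which commands
# the learner used to get there.
def validate_challenge(container_state, expected):
    running = container_state.get("State", {}).get("Running", False)
    ports = container_state.get("NetworkSettings", {}).get("Ports", {})
    if expected.get("running") and not running:
        return False
    for port in expected.get("ports", []):
        if port not in ports:
            return False
    return True
```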

What I Learned

• Containers are easy to break but harder to reset correctly
• YAML-based environments help maintain consistency
• Teaching systems require more design thinking than coding

Try It Yourself

If you're learning Docker or teaching it, I’d really appreciate you trying it out and sharing feedback.

👉 GitHub: https://github.com/KanaparthyPraveen/DockersQuest

If you find it useful, consider giving it a ⭐ — it helps other beginners discover the project.

#docker #devops #beginners #opensource #webdev

Learning Automation the Smart Way: Scripts, Bots, and AI Workflows Every Developer Should Master

2026-03-17 12:30:00

If you’re a developer today, chances are you’ve automated something at least once - maybe a deployment script, a cron job, or a quick Python tool to clean messy data. But automation in 2026 looks very different from what it did even five years ago.

Today, developers aren’t just writing scripts. They’re building automation ecosystems made up of scripts, bots, APIs, and AI-driven workflows that operate continuously in the background.

The difference between basic automation and true productivity automation often comes down to how well developers understand workflow design.

A recent report from McKinsey estimated that about 60% of work activities could be automated using existing technologies, particularly when AI and workflow automation are combined.

Source: https://www.mckinsey.com/capabilities/operations/our-insights/the-future-of-work-after-covid-19

For developers, that means learning automation is no longer optional. It’s becoming a core engineering skill.

This article explores how developers can learn automation the smart way - using scripts, bots, and AI workflows that actually solve real problems instead of creating complicated automation systems that nobody maintains.

Why developers should prioritize automation skills

Many developers still think automation means writing small helper scripts.

In reality, automation today includes:

  • Infrastructure automation
  • AI-powered workflows
  • DevOps pipelines
  • API orchestration
  • Data pipelines
  • Business process automation

The most productive engineers spend less time doing repetitive tasks and more time designing systems that eliminate those tasks entirely.

Stack Overflow's developer surveys consistently show that developers who invest in automation and DevOps tooling report higher productivity and job satisfaction.

Source: https://survey.stackoverflow.co/

Automation doesn’t just save time. It reduces errors, improves scalability, and allows teams to move faster.

Understanding the three layers of modern automation

To automate effectively, developers need to understand the three main layers of automation systems.

1. Script-based automation

This is the simplest and most common type of automation.

Examples include:

  • Bash scripts for deployments
  • Python scripts for data processing
  • Scheduled tasks using cron
  • Database backup scripts

Scripts are ideal for automating repetitive local tasks.

Example:

A developer might write a Python script that:

  1. Pulls new data from an API
  2. Cleans the dataset
  3. Stores it in a database
  4. Sends a report to Slack
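The four steps above can be sketched as a minimal script. The API URL and Slack webhook are placeholders, and the Slack call is stubbed out; only the flow is the point.

```python
# Minimal sketch of the fetch -> clean -> store -> report script.
# API_URL and SLACK_WEBHOOK are placeholders; the Slack POST is
# commented out so the flow runs without network access.
import json
import sqlite3
import urllib.request

API_URL = "https://api.example.com/data"       # placeholder
SLACK_WEBHOOK = "https://hooks.slack.com/xxx"  # placeholder

def fetch_records(url=API_URL):
    """Step 1: pull new data from an API."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def clean_records(records):
    """Step 2: drop incomplete rows and normalize names."""
    return [
        {"name": r["name"].strip().title(), "value": r["value"]}
        for r in records
        if r.get("name") and r.get("value") is not None
    ]

def store_records(records, db_path="report.db"):
    """Step 3: persist cleaned rows in SQLite; return the row count."""
    with sqlite3.connect(db_path) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS records (name TEXT, value REAL)")
        conn.executemany("INSERT INTO records VALUES (?, ?)",
                         [(r["name"], r["value"]) for r in records])
    return len(records)

def send_report(count, webhook=SLACK_WEBHOOK):
    """Step 4: post a one-line summary to Slack (stubbed here)."""
    message = f"Pipeline finished: {count} records stored."
    # urllib.request.urlopen(webhook, data=json.dumps({"text": message}).encode())
    return message
```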

While simple, these scripts often become the foundation of larger automation systems.

A helpful reference for learning scripting automation techniques can be found here:

https://realpython.com/python-automation/

2. Bot-based automation

Bots automate tasks across platforms.

They interact with services like:

  • Slack
  • Discord
  • GitHub
  • Jira
  • Customer support tools

For example, a DevOps team might create a Slack bot that:

  • Monitors system alerts
  • Triggers infrastructure scaling
  • Notifies the engineering team
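The core of such a bot is a small decision loop, sketched below. The alert shape, thresholds, and actions are invented for illustration; a real bot would receive alerts from a monitoring webhook, call an infrastructure API to scale, and post to Slack via its SDK.

```python
# Hypothetical alert-handling loop for the Slack bot described above.
# Thresholds and alert fields are illustrative assumptions.
def classify_alert(alert):
    """Decide what the bot should do with one alert."""
    if alert["metric"] == "cpu" and alert["value"] >= 90:
        return "scale_up"
    if alert["value"] >= alert.get("threshold", 80):
        return "notify"
    return "ignore"

def handle_alerts(alerts):
    """Run each alert through the bot and collect the actions taken."""
    actions = []
    for alert in alerts:
        decision = classify_alert(alert)
        if decision == "scale_up":
            actions.append(f"scaled service {alert['service']}")      # would call infra API
        elif decision == "notify":
            actions.append(f"notified team about {alert['service']}")  # would post to Slack
    return actions
```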

Bots allow automation to operate inside collaboration tools where teams already work.

A good introduction to building developer bots is available here:

https://developer.github.com/apps/building-github-apps/

3. AI-powered automation workflows

This is where automation becomes significantly more powerful.

Instead of executing predefined steps, AI workflows can:

  • Interpret data
  • Make decisions
  • Generate responses
  • Trigger actions

For example, an AI automation workflow could:

  1. Monitor customer support tickets
  2. Classify issues using an AI model
  3. Automatically respond to simple requests
  4. Escalate complex problems to human agents
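That four-step workflow can be sketched as follows. A keyword check stands in for the AI model so the control flow stays self-contained; in a real system `classify_ticket` would call an LLM or classification API.

```python
# Sketch of the ticket workflow: classify, auto-respond to simple
# requests, escalate the rest. The keyword rule is a stand-in for a
# real model call.
def classify_ticket(text):
    """Step 2: label a ticket (placeholder for a model call)."""
    lowered = text.lower()
    if "password" in lowered or "reset" in lowered:
        return "simple"
    return "complex"

def triage(tickets):
    """Steps 1, 3 and 4: route tickets to auto-reply or a human."""
    auto_replies, escalated = [], []
    for t in tickets:
        if classify_ticket(t) == "simple":
            auto_replies.append(t)   # step 3: automatic response
        else:
            escalated.append(t)      # step 4: human agent
    return auto_replies, escalated
```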

Platforms like Zapier, Make, and n8n have begun integrating AI agents directly into workflow automation.

Overview of AI workflow automation:

https://zapier.com/blog/ai-workflows/

Common automation mistakes developers make

Learning automation the smart way means avoiding mistakes that cause automation systems to fail.

Overengineering simple tasks

Some developers build complex systems for problems that require only a simple script.

Example:

A developer might design an entire microservice architecture just to run daily reports when a scheduled script would work perfectly.

The key is choosing the simplest automation solution that solves the problem.

Ignoring observability and logging

Automation systems can fail silently if logging and monitoring are not implemented.

For example:

A workflow that processes financial transactions must include:

  • error logging
  • alert notifications
  • retry mechanisms

Without these safeguards, automation becomes risky.

Guidelines for building reliable automation pipelines:

https://martinfowler.com/articles/patterns-of-distributed-systems/

Creating automation without documentation

Another common issue is undocumented automation.

When the original developer leaves the team, nobody understands how the system works.

Automation should always include:

  • clear documentation
  • workflow diagrams
  • configuration guides

This ensures the automation remains maintainable.

Practical automation examples developers can build

Developers can start learning automation by building small but useful projects.

Example 1: Automated deployment pipeline

Tools involved:

  • GitHub Actions
  • Docker
  • CI/CD pipelines

Workflow:

  1. Developer pushes code to GitHub
  2. CI pipeline runs automated tests
  3. Docker image builds automatically
  4. Application deploys to the server

Documentation:

https://docs.github.com/en/actions

Example 2: AI-powered content classification system

Tools involved:

  • Python
  • OpenAI APIs or LLM services
  • Task queues like Celery

Workflow:

  1. New content enters the system
  2. AI analyzes the content
  3. System assigns tags automatically
  4. Results update the database
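A compact sketch of that workflow is below. Python's `queue.Queue` stands in for a Celery task queue, a keyword lookup stands in for the AI model, and a dict stands in for the database; all three are placeholders for the real services named above.

```python
# Tagging workflow sketch: drain a task queue, tag each item, update
# the "database". TAG_KEYWORDS is an invented rule table standing in
# for an LLM call.
import queue

TAG_KEYWORDS = {"python": "programming", "docker": "devops", "gpt": "ai"}

def tag_content(text):
    """Step 2: assign tags to one piece of content."""
    lowered = text.lower()
    return sorted({tag for kw, tag in TAG_KEYWORDS.items() if kw in lowered})

def run_worker(task_queue, database):
    """Steps 3-4: process queued items and record their tags."""
    while not task_queue.empty():
        item_id, text = task_queue.get()
        database[item_id] = tag_content(text)
```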

Guide for AI application workflows:

https://www.langchain.com/

Example 3: Automated data pipeline

Tools involved:

  • Apache Airflow
  • Python
  • Cloud storage

Workflow:

  1. Collect data from multiple APIs
  2. Clean and transform the data
  3. Store results in a data warehouse
  4. Trigger analytics dashboards
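The dependency chain above can be sketched in plain Python. In Airflow each function would become a task in a DAG; here the steps are stub functions wired together directly so the shape of the pipeline is runnable as-is, with the record fields and dashboard signal invented for illustration.

```python
# Plain-Python sketch of collect -> transform -> store -> trigger.
# In Airflow these would be tasks with declared dependencies.
def collect(api_payloads):
    """Step 1: merge records from multiple API responses."""
    return [record for payload in api_payloads for record in payload]

def transform(records):
    """Step 2: keep valid rows and normalize the amount field."""
    return [{"id": r["id"], "amount": round(float(r["amount"]), 2)}
            for r in records if "id" in r and "amount" in r]

def store(rows, warehouse):
    """Step 3: append rows to the (in-memory) warehouse table."""
    warehouse.extend(rows)
    return len(rows)

def run_pipeline(api_payloads, warehouse):
    """Steps 1-4: run the chain, then signal a dashboard refresh."""
    loaded = store(transform(collect(api_payloads)), warehouse)
    return {"refresh_dashboard": loaded > 0, "rows_loaded": loaded}
```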

Introduction to Airflow pipelines:

https://airflow.apache.org/docs/

The rise of AI-assisted automation

AI is rapidly changing how automation systems are built.

Instead of manually coding every workflow step, developers can now use AI to:

  • generate scripts
  • create workflow logic
  • analyze automation logs
  • detect anomalies in systems

According to Deloitte’s automation trends report, organizations adopting AI-powered automation are seeing significant productivity improvements in technical teams.

Source: https://www2.deloitte.com/insights/us/en/focus/tech-trends.html

Developers who understand both automation engineering and AI tools will likely become some of the most valuable technical professionals in the coming years.

Developers interested in structured learning paths around automation systems, scripting, and AI-driven workflows can explore programs focused on AI Automation Mastery here:

https://www.edstellar.com/course/ai-automation-mastery-training

Actionable steps to start learning automation today

If you want to build strong automation skills, start with these steps.

1. Automate one repetitive task every week

Look for small tasks in your workflow and automate them.

2. Learn one automation framework

Popular choices include:

  • Apache Airflow
  • GitHub Actions
  • Zapier or n8n
  • Prefect

3. Build a personal automation toolkit

Develop reusable tools such as:

  • notification scripts
  • monitoring scripts
  • API connectors

4. Combine AI with automation

Experiment with AI agents that:

  • analyze logs
  • categorize data
  • generate reports


Conclusion

Automation is evolving rapidly, and developers who treat it as a core skill rather than a side project will have a major advantage.

The smartest way to learn automation is not by chasing tools but by understanding workflows.

Start with simple scripts. Expand into bots. Then build intelligent AI workflows that can adapt and scale.

Over time, automation stops being something you occasionally write - and becomes the foundation of how your systems operate.

What’s the most useful automation you’ve built so far? Was it a simple script, a bot, or a full AI workflow?