2026-03-17 13:00:00
In Part 1, I explained that at a global scale, trust is part of the architecture. Not trust as a feeling, but trust as something the system must enforce and prove.
In this article, I explain the three distinct responsibilities that let the system grow organically.
Most complexity in global systems does not come from services. It comes from mixing concerns.
I have seen this pattern many times:
None of the above are wrong in isolation, but combined, they create systems that are:
The problem is that the system is trying to answer too many different questions at once. In short, it is the same pattern that applies to code: the Single Responsibility Principle.
My job brought me to a point where I stopped thinking in terms of architectures and started thinking in terms of responsibilities. No matter how the application is built, it must answer the same three questions:
When these responsibilities are clearly separated, decisions become easier. When they are mixed, every discussion becomes a mess.
This is the responsibility most devs are comfortable with.
It is where:
In AWS terms, this is:
This responsibility answers one question only:
What does the system do for the business?
And it should be optimised for:
Problems start when this responsibility is overloaded.
Examples:
Execution code should focus on doing the work, not explaining or defending it.
This responsibility exists because someone outside the team will ask questions like:
This responsibility is not about debugging. It is about proof.
In AWS, this responsibility is expressed through things like:
A common issue is teams trying to reuse execution or observability data as evidence. That usually fails because:
Evidence systems must be:
This is why this responsibility must be separate from execution.
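To make the separation concrete, here is a minimal Python sketch (names and fields are hypothetical, not from any particular system): evidence is written as its own immutable record at the moment of the business action, rather than scraped out of execution logs after the fact.

```python
import json
import time
from dataclasses import dataclass, field

@dataclass
class EvidenceLog:
    """Append-only record of business-relevant events (illustrative sketch).

    Unlike operational logs, entries are immutable and complete:
    who, what, when. Enough to answer an auditor's question later."""
    _entries: list = field(default_factory=list)

    def record(self, actor: str, action: str, subject: str) -> dict:
        entry = {
            "actor": actor,
            "action": action,
            "subject": subject,
            "recorded_at": time.time(),
        }
        # Store a serialized copy so later mutation of caller state
        # cannot change what was recorded.
        self._entries.append(json.dumps(entry, sort_keys=True))
        return entry

    def export(self) -> list:
        return [json.loads(e) for e in self._entries]

# Execution code does the work; evidence is a separate, deliberate write,
# not a side effect buried in a debug log.
log = EvidenceLog()
log.record(actor="user-42", action="approve_invoice", subject="inv-9001")
```

The point of the sketch is the boundary: execution code calls `record` once, and nothing else about the evidence store leaks into business logic.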
This responsibility answers a very different question:
Is the system healthy right now?
Not:
But:
And the answers are:
In AWS environments, this usually means:
Those services exist to trigger investigation and action.
Once these three responsibilities are separated, many things become obvious. For example:
Each of these feels fine on its own, but at scale it creates confusion about what the system is actually trying to tell us.
Once responsibilities are separated, conversations change.
Instead of:
Should we centralise logs?
I ask:
Which responsibility are we trying to serve?
Instead of:
Why can't I just add tenantId to metrics?
I ask:
Is this an operational signal or an accounting question?
Instead of:
Why is governance slowing us down?
I ask:
Which responsibility are we trying to satisfy?
The trade‑offs are the same, but they are made explicit.
From a governance angle, global systems do not fail because they are distributed. They fail because we ask one system to do everything at once.
Separating responsibilities does not reduce complexity; it puts complexity where it belongs.
2026-03-17 12:51:52
In the world of AWS, "default" settings are often the fastest way to an expensive monthly bill. Recently, a Senior DevOps Engineer dropped a strategic hint on one of my posts that challenged the standard EC2-only approach:
"Lightsail is cheaper for the public-facing interface. You don't get much CPU, but enough SSD for cache. Then connect to your EC2."
I didn’t just read the comment. I built the system myself. I used AWS Lightsail for predictable costs and EC2 for extra computing power, creating a setup that greatly reduces Data Transfer Out (DTO) costs.
Most developers realize too late that AWS charges roughly $0.09 per GB for outbound data from EC2 (after the first 100GB). If your site pushes 1TB of traffic, that’s roughly a $90 bill for bandwidth alone.
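The arithmetic is easy to sanity-check. A quick sketch using the rough rates quoted above (verify against current AWS pricing before relying on it):

```python
# Back-of-envelope EC2 Data Transfer Out cost.
# Rates are the rough figures from this article, not authoritative pricing.
def ec2_dto_cost(gb_out: float, rate_per_gb: float = 0.09, free_gb: float = 100) -> float:
    """Outbound data transfer cost after the monthly free allowance."""
    return max(0.0, gb_out - free_gb) * rate_per_gb

# 1 TB (~1000 GB) of traffic after the 100 GB free allowance:
print(round(ec2_dto_cost(1000), 2))  # 81.0
```

Run against the article's numbers, 1 TB works out to about $81 after the free allowance, in the same ballpark as the rough $90 figure above.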
That’s when the "Aha!" moment hit me: AWS Lightsail isn’t just for beginners. Its $5/month plan includes 1TB of outbound data transfer. By using Lightsail as an Nginx reverse proxy, you’re essentially buying a "bandwidth insurance policy."
The strategy: route public-facing traffic through the Lightsail “front door” and keep the connection to your EC2 backend over a Lightsail VPC peering connection. This traffic stays on AWS’s private network, keeping your costs predictable while shielding your backend from the public internet.
Compute Strategy: Optimized Frontend, Scalable Backend
Frontends rarely need heavy CPU. Lightsail’s burstable CPU is perfect for an Nginx reverse proxy, while EC2 handles the heavy lifting. Use Auto Scaling Groups and instance types optimized for your workload to ensure backend performance without overpaying for the frontend.
Networking: The VPC Peering Bridge
Setting up Lightsail → EC2 connectivity requires a VPC peering connection. Key detail: both route tables need manual updates:
Troubleshooting Masterclass: Traceroute is King
When initial curl requests timed out, I didn’t guess; I traced the packets:
traceroute 172.31.x.x
Packets left the instance but died after two hops, revealing a routing table gap rather than a misconfigured Nginx. One command saved me an hour of troubleshooting.
Web Server & Static Assets
Security & Traffic Management
Email & Notifications
Hardware Efficiency
With this setup (Cloudflare + CloudFront + Lightsail + EC2 + SES), you can run a robust, scalable stack for ~$15/month, assuming moderate traffic and efficient use of free tiers.
Production Readiness Notes
This project proved that in DevOps, curiosity pays off. A single comment led me to a hybrid architecture that optimizes costs, improves scalability, and demonstrates professional-grade cloud engineering. Thank you sir Harith!
I’ve documented the full lab guide, including the Nginx configs and peering steps, in my GitHub repository here: aws-lightsail-nginx-lab
2026-03-17 12:46:24
The Ancient Past of Eighteen Months Ago — And What It Taught Us About the Future of AI Agents
Let me tell you a story from the ancient past.
By which I mean eighteen months ago.
In the world of AI, eighteen months is geological time. Think back to mid-2024. Context windows were small. "Prompt engineering" was the skill everyone was hiring for. MCP didn't exist yet. The idea of AI agents autonomously operating external services was mostly theoretical.
I was building a medical AI product in Osaka, Japan. And I had a problem that, looking back, contained the seed of everything that happened in 2026.
This is Part 2 of my "Not Everything Needs MCP" series. Part 1 told the story of Google Workspace CLI implementing a full MCP server, then deliberately deleting all 1,151 lines of it two days after launch. That investigation revealed an architectural mismatch between MCP's protocol design and large-scale APIs.
But that was only one data point. Since publishing that article, I discovered two more — and together, they tell a much bigger story about where AI agent architecture is heading in 2026.
In early 2024, I was working on an AI assistant for my company's medical IT platform. We serve clinics across the Kansai region of Japan (Osaka, etc.) — and I'd been using ChatGPT's Custom GPTs to prototype workflows.
I had a simple need: I wanted every AI response to include the exact timestamp of when the conversation happened. Not for fun — for traceability. In medical IT, knowing when a decision was discussed matters. It matters for audits. It matters for compliance. It turned out to matter for patent applications too.
Here's what I did. I deployed a tiny Web API on a server we host publicly. It did exactly one thing: return the current time. Then I configured the Custom GPT to call this API before every response, and output the timestamp first.
The result looked like this:
User: Hey, long time no see!
(Communicated with universalkarte.com)
🕐 Response time: 2025-04-02 09:39:00 (JST) / 2025-04-02 00:39:00 (UTC)
Oh wow, it's been a while! So great to hear from you! 😊
A web API that returns a timestamp. Called before every response. Output deterministically. Nothing more, nothing less. That's all it did.
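For illustration, the whole service can be sketched in a few lines of Python. This is a reconstruction, not the original code, and the real deployment details (framework, hosting) are omitted:

```python
from datetime import datetime, timezone, timedelta

JST = timezone(timedelta(hours=9))  # Japan Standard Time, UTC+9

def timestamp_payload(now_utc=None) -> str:
    """Return the dual-timezone string the assistant pastes verbatim.

    Deterministic output for a deterministic need: the LLM never
    computes the time, it only echoes this payload."""
    now_utc = now_utc or datetime.now(timezone.utc)
    now_jst = now_utc.astimezone(JST)
    fmt = "%Y-%m-%d %H:%M:%S"
    return f"{now_jst.strftime(fmt)} (JST) / {now_utc.strftime(fmt)} (UTC)"

print(timestamp_payload(datetime(2025, 4, 2, 0, 39, tzinfo=timezone.utc)))
# 2025-04-02 09:39:00 (JST) / 2025-04-02 00:39:00 (UTC)
```

Everything interesting here is what the function does not do: no reasoning, no options, no state. One call, one string.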
At the time, this was called "Function Calling" or "Tool Use" — the predecessors to what Anthropic would later formalize as MCP in November 2024. I didn't know I was implementing a pattern that would become the center of a protocol war. I just needed a clock.
But here's what matters: the design decision I made instinctively was to keep the external call as small and deterministic as possible. One API. One purpose. Minimal payload. The LLM didn't need to understand time zones or server infrastructure — it just needed to paste the result.
It wasn't a "hack" because I was lazy. It was an architectural instinct: keep the LLM away from what the system already knows. Deterministic output for a deterministic need. Don't make the AI think about the time — just give it the time.
Looking back now, eighteen months later, it turns out this minimal pattern — one deterministic call, zero reasoning overhead — was already the architecture that the rest of the industry would independently converge on. I didn't see it that way at the time. I was just solving a problem.
November 2024. Anthropic open-sourced MCP. By February 2025, Google and others rushed to announce MCP support. The community was electric. Finally, a standard protocol for connecting LLMs to external tools!
I dove in immediately. I connected MCP servers for GitHub, for databases, for various services. Context windows were getting larger. The future felt bright.
And at first, it was genuinely impressive. GitHub operations that used to require manual terminal commands — commits with thoughtful messages, PR creation, branch management — the AI handled them smoothly through MCP. I felt the productivity gains. They were real.
But then something else started happening.
The AI started getting... dumber.
Not in the "wrong answer" sense. In fact, the AI got better at executing tasks exactly as intended — MCP meant it could commit code, create PRs, and query databases with precision. But something subtler was degrading. The quality of reasoning. The ability to take a vague idea and turn it into a structured thought. What I call "zero-to-one thinking" — the creative, synthetic part of working with an LLM.
I spent the second half of 2025 with this nagging feeling. More tools, more capabilities, but less... intelligence. More precise in execution, less insightful in thought. I kept thinking: "I wish context windows would just get bigger so this wouldn't matter." But I also suspected that bigger windows alone wouldn't fix it — the AI would probably just get confused in different ways.
I couldn't quantify this feeling at the time. But I now know that researchers were documenting exactly what I was experiencing.
It turns out my gut feeling had a name: context rot.
Here's what researchers found — and why it matters for anyone loading MCP servers into their workflow:
| Research | Key Finding |
|---|---|
| Context Rot (Chroma Research) | Irrelevant context degrades reasoning first. Retrieval survives; thinking dies. |
| Reasoning Degradation with Long Context Windows (14-model benchmark) | Reasoning ability decays as a function of input size — even when the model can still find the right information. |
| Maximum Effective Context Window (Paulsen, 2025) | The actual usable window is up to 99% smaller than advertised. Severe degradation at just 1,000 tokens in some top models. |
| Fundamental Limits of LLMs at Scale (arXiv, 2026) | Context compression, reasoning degradation, and retrieval fragility are proven architectural ceilings — not bugs to be patched. |
Let me unpack why this hits MCP users so hard.
Chroma Research showed that as irrelevant context increases in an LLM's input, performance degrades — and the degradation is worse when the task requires genuine reasoning rather than simple retrieval. The less obvious the connection between question and answer, the more devastating the irrelevant context becomes.
The "Challenging LLMs Beyond Information Retrieval" study tested 14 different LLMs and demonstrated that reasoning ability degrades as a function of input size — even when the model can still find the right information. Information retrieval and reasoning are different capabilities, and reasoning breaks first.
And here's the connection to MCP that makes this personal:
A single popular MCP server like Playwright contains 21 tools. Just the definitions of those tools — names, descriptions, parameter schemas — consume over 11,700 tokens. And these definitions are included in every single message, whether you use the tools or not.
Now multiply that by 10 MCP servers. You've burned 100,000+ tokens on tool definitions alone. Your 200k context window is suddenly 70k. And it's not just smaller — it's polluted with information that actively degrades the model's ability to reason about the thing you actually asked it to do.
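The budget math in the paragraph above can be made explicit. The per-server figure is the Playwright example; real numbers vary widely by server:

```python
def remaining_context(window: int, servers: int, tokens_per_server: int) -> int:
    """Tokens left for actual work after MCP tool definitions are loaded.

    Illustrative only: assumes every server costs the same and that
    definitions are resent with every message, as described above."""
    return window - servers * tokens_per_server

# 10 servers at ~11,700 tokens of definitions each eat more than half
# of a 200k window before the first user message arrives.
print(remaining_context(200_000, 10, 11_700))  # 83000
```

The exact remainder depends on which servers you load; the order of magnitude is the point.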
This is what I felt. The AI wasn't broken. It was drowning. More tools meant more noise in the signal. More capability meant less room to think.
While I was wrestling with MCP overhead, I was also building an AI-powered tool — essentially a converter that takes ambiguous, unstructured text input and generates structured, formatted output. Think of it as a bridge between how humans naturally communicate and how systems need to receive data.
The core of this tool is a system prompt. That prompt went through dozens of iterations. At its peak, it was 20,000 characters. I tested, compared outputs, and eventually settled on 15,000 characters.
15,000 characters of instructions. For a single task.
The whole time, a thought kept nagging me: "Would a human expert need 15,000 characters of instructions to do this job?" A domain specialist would need maybe a paragraph of guidance. The rest is knowledge they already have — accumulated through years of working in their field.
And that's when "prompt engineering" started feeling like what it really was: a brute-force workaround for the absence of domain expertise in the model's operating context.
But here's the twist. Despite the bloated prompt, the tool worked. Output quality stayed consistent and reliable. Why?
Because I had constrained the domain. The tool operated within a specific industry workflow — a narrow slice of reality with its own vocabulary, its own established patterns, its own expected output formats. By telling the LLM upfront "you are operating within this domain," the massive prompt became effective.
If you've ever worked with LLMs, you already know this intuitively: a purely descriptive, narrative-style prompt — no matter how long — doesn't guarantee output quality. We've all been there. But a prompt that constrains the domain changes the game.
Here's why, and you don't need a PhD to see it. Think about what's happening inside a Transformer model. The attention mechanism operates on an enormous matrix — in large models, tens of thousands of dimensions. Every token is trying to figure out which other tokens matter. When the domain is wide open, the model is searching for relevance across a vast, noisy space. The outputs fluctuate. The reasoning wanders. Anyone who's done even basic linear algebra — even 3×3 matrices in high school — can imagine what happens when you scale that uncertainty to tens of thousands of dimensions. Of course the output changes every time.
But constrain the domain, and you dramatically narrow where the model needs to look. The relevant vectors cluster. The gap between what the model retrieves and what the human intended shrinks toward zero. Domain limitation doesn't just help. It's the mechanism by which prompts actually work.
This taught me something that would later click into place: domain limitation is the real optimization. Not longer prompts. Not bigger context windows. Narrower scope.
And if that's true for prompts, shouldn't the same principle apply to how we design AI agents?
As the tool matured, the architecture evolved in a direction I didn't fully appreciate at the time.
The initial version was pure prompt — a single, monolithic instruction set that did everything through LLM reasoning. Unstructured text in, structured text out.
But the real world isn't one output format. My domain required multiple types of structured documents — each with its own format, its own required fields, its own regulatory and compliance requirements. The number of output variations kept growing.
Trying to handle all of these through prompt engineering alone was... well, it was exactly the "spread the entire menu on the table" problem from Part 1.
So the architecture shifted. The LLM's output became fully structured JSON — deterministic, parseable, machine-readable. Document generation moved to Google Workspace via GCP. The LLM's job narrowed to what it's actually good at: understanding the input, extracting the meaning, structuring the reasoning. Everything else — formatting, template selection, compliance checks, document assembly — moved to deterministic systems.
The LLM handles the ambiguous. Deterministic systems handle the deterministic.
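Here is a hedged sketch of that split, with hypothetical field and template names (the real schema and document types are not shown in this article): the LLM produces JSON, and everything after `json.loads` is deterministic code.

```python
import json

# Hypothetical schema: these names are illustrative, not the product's.
REQUIRED_FIELDS = {"document_type", "patient_summary", "fields"}
KNOWN_TEMPLATES = {"referral_letter", "discharge_summary"}

def route_llm_output(raw: str) -> str:
    """Deterministic half of the pipeline: validate the LLM's JSON output
    and select a template. Formatting and compliance live here, not in
    the prompt."""
    doc = json.loads(raw)  # hard failure on malformed output, by design
    missing = REQUIRED_FIELDS - doc.keys()
    if missing:
        raise ValueError(f"LLM output missing fields: {sorted(missing)}")
    if doc["document_type"] not in KNOWN_TEMPLATES:
        raise ValueError(f"Unknown document type: {doc['document_type']}")
    return doc["document_type"]

sample = json.dumps({
    "document_type": "referral_letter",
    "patient_summary": "(extracted meaning goes here)",
    "fields": {},
})
print(route_llm_output(sample))  # referral_letter
```

Note the asymmetry: the LLM side is allowed to be fuzzy, but its output contract is not. Bad JSON is a bug to surface, not something to reason around.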
I was doing this throughout 2025, iterating toward an architecture where AI reasoning and programmatic execution were cleanly separated. And I kept thinking about Google Workspace — if only there were a way to programmatically drive every Workspace API from the command line, it would be the perfect backend for the document generation pipeline...
March 2026. Google released gws — Google Workspace CLI. A Rust-based CLI that covers nearly every Google Workspace API, with commands dynamically generated from Google's Discovery Service.
When I saw the announcement, my reaction was immediate: "This is it. This is what I've been waiting for."
A CLI that could drive Gmail, Drive, Docs, Sheets, Calendar — all from the command line, all returning structured JSON. Perfect for my document generation pipeline. Perfect for AI agent integration.
And then I noticed the articles mentioning MCP support. Perfect! I could connect it directly to—
$ gws mcp
{
"error": {
"code": 400,
"message": "Unknown service 'mcp'."
}
}
You know the rest. That investigation became Part 1. Google had implemented a full MCP server — 1,151 lines of Rust — then deliberately deleted it as a breaking change. Two days after launch.
At the time, I focused on the forensic story: what happened, why, and what it meant for tool design. But the deeper significance only hit me later.
Google didn't just remove MCP. Google arrived at the same architectural conclusion I had been groping toward with my own product — that for large-scale operations, the right pattern is CLI-first with structured output, not protocol-mediated tool discovery. "Order from the kitchen when you're hungry" beats "spread the entire menu on the table."
That was two independent arrivals at the same destination.
Then I found the third.
A few days after publishing Part 1, I came across the everything-claude-code repository by Affaan Mustafa (@affaanmustafa). Affaan won the Anthropic × Forum Ventures hackathon in NYC, building zenith.chat entirely with Claude Code in 8 hours. His repository — 77,000+ stars, 640+ commits, 76 contributors — packages 10+ months of daily Claude Code usage into a complete agent configuration system.
I started reading it out of curiosity. Within minutes, I was sitting bolt upright.
The philosophy was identical to what I'd been building independently.
Let me show you the parallels.
From Affaan's guide:
"Your 200k context window before compacting might only be 70k with too many tools enabled."
His rule of thumb: have 20–30 MCPs configured, but keep under 10 enabled and under 80 tools active. The repository includes mcp-configs/mcp-servers.json with explicit disabledMcpServers entries — actively turning off MCP servers to protect context space.
This is exactly what Google concluded with gws. And exactly what I experienced — more tools, less thinking room.
From Affaan's longform guide:
"Instead of having the GitHub MCP loaded at all times, create a `/gh-pr` command that wraps `gh pr create` with your preferred options. Instead of the Supabase MCP eating context, create skills that use the Supabase CLI directly. The functionality is the same, the convenience is similar, but your context window is freed up for actual work."
Skills in Claude Code are Markdown files — tiny prompt templates that load only when invoked. A /gh-pr skill might be 200 tokens. The GitHub MCP server's tool definitions are thousands. Same functionality. Orders of magnitude less context consumption.
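The on-demand property is the whole point, and it is trivial to implement. A sketch, assuming a hypothetical `.claude/skills` layout with one Markdown file per skill:

```python
from pathlib import Path

def load_skill(name: str, skills_dir: str = ".claude/skills") -> str:
    """Read a Markdown skill file only at invocation time.

    The directory layout is a hypothetical example. The key property:
    nothing is resident in the context window until the command is
    actually invoked, so an unused skill costs zero tokens."""
    return Path(skills_dir, f"{name}.md").read_text()
```

Contrast with an MCP server, whose full tool schema is resent on every turn whether or not any tool is used.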
This is the "kitchen model" from Part 1, independently rediscovered by a power user.
The repository is organized into specialized subagents: planner.md, code-reviewer.md, tdd-guide.md, security-reviewer.md, build-error-resolver.md. Each agent has a narrow scope, specific tools, and defined behaviors.
This mirrors what I learned from my own product development — that established industries organize into specialties for a reason, and AI should follow the same principle. You don't ask a generalist to do a specialist's job. You don't ask a general-purpose agent to handle security review when a specialized security-reviewer agent would be more precise and use less context.
Affaan's system includes automatic compaction hooks, session memory persistence, and strategic context management. The entire architecture is built around one principle: protect the context window for reasoning.
Not storage. Not tool definitions. Reasoning.
So here's what happened in 2026:
Google — a trillion-dollar company with the largest productivity API surface in the world — implemented MCP, stress-tested it against 200–400 tool definitions, and deleted it. Their conclusion: CLI-first with on-demand schema discovery. Context stays clean.
Affaan Mustafa — an individual developer who won an AI hackathon and spent 10+ months refining his workflow — independently concluded that MCP should be minimized, replaced with CLI skills where possible, and the context window should be protected for reasoning above all else.
I — a medical IT veteran building AI-powered tools in Japan — arrived at the same architecture through a completely different path. A timestamp API in 2024. The "getting dumber" experience in 2025. A product's evolution from monolithic prompt to JSON + deterministic pipeline. And then the forensic discovery of Google's MCP deletion.
Three different starting points. Three different domains. Three different scales. The same conclusion.
That's not coincidence. That's a phase transition.
When people talk about AI milestones, they usually mean model capabilities. GPT-4. Claude 3. Gemini Ultra. Bigger context windows. Better benchmarks.
But the real phase transition of 2026 isn't about model capabilities. It's about how we architect around the capabilities we already have.
The shift can be summarized in one sentence:
"Do it for me" is expensive. "Do this specific thing" is cheap.
Every token spent on tool definitions, prompt engineering, and ambiguous instructions is a token not spent on reasoning. And the research confirms what practitioners have been feeling: irrelevant context doesn't just waste space — it actively degrades the model's ability to think.
Here's what that means in practice:
The end of "prompt engineering" as we knew it. A 15,000-character prompt is a confession that we're compensating for missing architecture. The future is narrower prompts, domain-specific skills, and deterministic systems handling everything that doesn't require reasoning.
MCP is not dead — it's bounded. MCP remains excellent for small-to-medium tool sets (under 50 tools). But for large API surfaces, CLI-first is the proven pattern. The "everything via MCP" fantasy is over.
"Skills" are the new unit of AI agent design. Whether you call them Skills (Affaan), Agent Skills (Google), or domain-specific prompts (what I've been doing with my own tools), the pattern is the same: small, scoped, loaded on demand, discarded after use.
Context windows are not memory — they're working memory. Treating the context window as storage is like covering your entire desk with every book you own before you even pick up a pen. You haven't left any room to actually write. The desk needs to be clear for thinking — and every MCP tool definition, every bloated prompt, every retained conversation turn is another book on the pile.
There's an observation I keep coming back to, and it's one that makes me laugh every time.
Consider how humans delegate work:
Boss: "Handle this, will you?"
Employee: (Internal monologue: What exactly? By when? In what format? Who approved this? What's the budget?) → 10 rounds of clarification follow.
Now consider the alternative:
Boss: "Run git commit -m 'fix: resolve auth timeout' && git push origin main."
Employee: Done. One round. Zero ambiguity.
The first conversation — the "human" one — requires the employee to infer intent, plan actions, select tools, estimate parameters, and verify assumptions. Every step of that inference costs time and mental bandwidth.
In LLM terms, every step of that inference costs tokens.
MCP tool definitions are the LLM equivalent of "let me explain everything you might possibly need to know before we start." CLI commands are the equivalent of "just do this one thing."
What the token economy has done — accidentally, beautifully — is make the cost of human communication ambiguity visible as a number. Every vague instruction, every "you know what I mean," every "figure it out" translates directly to token consumption that crowds out actual reasoning.
As someone with forty-plus years of programming experience — from assembly language to LLMs — I find this deeply ironic. We spent decades making computers understand human language. Now we're learning that the most efficient way to use language-understanding computers is... to give them precise, unambiguous commands. Like assembly language. Like CLI.
The wheel doesn't just turn. It circles back to the truth.
If the pattern holds, the next phase is already emerging.
Domain-specific agent languages. Not natural language prompts. Not traditional programming languages. Something in between — structured enough for deterministic execution, flexible enough for AI reasoning. We're already seeing DSLs for agent workflows (LangGraph's graph definitions), constrained syntax languages designed for LLM generation, and YAML/JSON-based knowledge objects.
Agent architecture as a discipline. "Prompt engineer" was the job title of 2024. The 2026 equivalent is closer to "Agent Architect" or "Domain Skill Designer" — someone who understands how to decompose workflows into deterministic and non-deterministic components, and how to allocate context window real estate accordingly.
Domain specialization as a design principle. This is my domain bias speaking — I come from medical IT, where specialization has been refined over centuries. There's a reason medicine has cardiologists and dermatologists. It isn't bureaucratic — it's cognitive. A specialist holds deep domain knowledge that makes their work faster, more accurate, and more reliable. I believe AI agents should be organized the same way. Not one giant model that knows everything. A team of specialists, each with their own skills, routing tasks to the right expert. Every industry has its own version of "specialties." The principle is universal.
In Part 1, I wrote: "If you write about an OSS tool, run it first."
In Part 2, the lesson is different:
If three independent paths converge on the same conclusion, pay attention.
Google didn't read Affaan's guide before deleting MCP from gws. Affaan didn't study my architecture before recommending CLI skills over MCP. I didn't know about either of them when I built a timestamp API in 2024 and started separating deterministic from non-deterministic processing.
We all arrived at the same place: protect the context window for reasoning. Push everything deterministic to CLI, scripts, and structured pipelines. Load skills on demand. Discard them when done. Let the AI think.
That convergence — from a trillion-dollar company, a hackathon winner, and someone who's been writing code since assembly language was the only option — is what makes 2026 a phase transition.
Not because the models got better. Because we finally learned how to stop wasting them.
If you want to feel what "the 2026 phase transition" means in practice rather than just reading about it, the fastest way is to inject Affaan's system into your own Claude Code environment.
I did it myself. The difference was immediate — sessions stayed coherent longer, context stopped rotting mid-task, and the AI's reasoning felt sharper in ways that are hard to quantify but impossible to miss once you've experienced them.
The quickest path — install as a Plugin directly inside Claude Code:
# Inside Claude Code
/plugin marketplace add affaan-m/everything-claude-code
/plugin install everything-claude-code@everything-claude-code
That alone gives you the commands, skills, and hooks. You'll notice the difference.
For the full setup including rules and language-specific configurations:
git clone https://github.com/affaan-m/everything-claude-code.git
cd everything-claude-code
./install.sh typescript # or: python / golang / rust
You don't need to install everything. Start with the plugin. Use it for a day. Pay attention to how long your sessions stay productive before context degrades. Compare it to yesterday.
I suspect you'll have your own moment of convergence — your own version of the realization that Google, Affaan, and I all had independently. That the bottleneck was never the model. It was how much of the context window we were wasting on everything except thinking.
Your setup is different from mine. Your domain is different. But the principle is the same.
Let the AI think.
And if this feels familiar —
it is.
2026-03-17 12:40:05
Vulnerability ID: GHSA-WWG8-6FFR-H4Q2
CVSS Score: 5.7
Published: 2026-03-16
Admidio versions 5.0.0 through 5.0.6 contain a Cross-Site Request Forgery (CSRF) vulnerability in the organizational role management module. The application fails to validate anti-CSRF tokens for state-changing operations including role deletion, activation, and deactivation. An attacker can leverage this flaw to perform unauthorized actions by tricking a privileged user into executing a malicious request.
A missing CSRF validation check in Admidio's role management module allows attackers to permanently delete or modify organizational roles by tricking authenticated administrators into clicking a malicious link.
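Admidio is a PHP application, but the missing check is language-agnostic. Here is a minimal Python sketch of the standard synchronizer-token pattern (names are illustrative, not Admidio's actual code):

```python
import hmac
import secrets

def issue_csrf_token(session: dict) -> str:
    """Generate a per-session token and store it server-side.

    The token is embedded in legitimate forms; an attacker's page
    cannot read it, so a forged cross-site request cannot supply it."""
    token = secrets.token_hex(32)
    session["csrf_token"] = token
    return token

def is_valid_csrf(session: dict, submitted: str) -> bool:
    """Constant-time comparison of the submitted token against the session's.

    This is the check the vulnerable code path omits: state-changing
    requests (role deletion, activation, deactivation) must be rejected
    when this returns False."""
    expected = session.get("csrf_token", "")
    return bool(expected) and hmac.compare_digest(expected, submitted)
```

`hmac.compare_digest` avoids timing side channels when comparing the tokens, which plain `==` does not guarantee.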
Remediation Steps: Upgrade to version 5.0.7.
Read the full report for GHSA-WWG8-6FFR-H4Q2 on our website for more details including interactive diagrams and full exploit analysis.
2026-03-17 12:32:57
When I started learning Docker, one problem kept coming up:
I was always afraid of breaking things.
Running commands like docker rm, modifying containers, or experimenting with networks felt risky because once something broke, I didn’t know how to get back to a clean state.
So instead of just following tutorials, I tried building something to solve this.
The Idea
What if you could:
• Run real Docker commands
• Break containers freely
• Reset everything instantly
• Learn step-by-step through challenges
That’s how DockersQuest was born.
What DockersQuest Does
It’s a small learning environment where:
• Each challenge is defined using container setups
• You interact with real Docker commands
• The system validates your progress
• You can reset the environment anytime
The Hard Problems I Faced
Users can run any command and completely change container state.
So I had to design a system that:
• destroys everything cleanly
• recreates environments from YAML
• ensures consistency every time
This was harder than coding.
Teaching Docker is not just commands — it’s sequence.
I struggled with:
• which commands should come first
• how to avoid overwhelming beginners
• how to make learning feel like progression, not documentation
Users can solve the same problem in multiple ways.
So instead of checking commands, I had to:
• inspect container states
• check running services
• validate outcomes instead of steps
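Outcome validation can work off `docker inspect` output rather than the commands the learner typed. A sketch of that idea — the `State.Running` and `Config.Image` fields match Docker's actual inspect JSON, while the goal structure is hypothetical:

```python
def container_running(inspect_data: dict) -> bool:
    """`docker inspect <name>` JSON exposes State.Running as a boolean."""
    return inspect_data.get("State", {}).get("Running", False)

def check_goal(inspect_data: dict, goal: dict) -> bool:
    """Validate the outcome, not the sequence of commands used to reach it."""
    if goal.get("running") is not None:
        if container_running(inspect_data) != goal["running"]:
            return False
    if "image" in goal:
        if inspect_data.get("Config", {}).get("Image") != goal["image"]:
            return False
    return True
```

Whether the learner used `docker run`, `docker start`, or Compose, the check passes as long as the end state matches.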
What I Learned
• Containers are easy to break but harder to reset correctly
• YAML-based environments help maintain consistency
• Teaching systems require more design thinking than coding
Try It Yourself
If you're learning Docker or teaching it, I’d really appreciate you trying it out and sharing feedback.
👉 GitHub: https://github.com/KanaparthyPraveen/DockersQuest
If you find it useful, consider giving it a ⭐ — it really helps beginners discover the project.
2026-03-17 12:30:00
If you’re a developer today, chances are you’ve automated something at least once - maybe a deployment script, a cron job, or a quick Python tool to clean messy data. But automation in 2026 looks very different from what it did even five years ago.
Today, developers aren’t just writing scripts. They’re building automation ecosystems made up of scripts, bots, APIs, and AI-driven workflows that operate continuously in the background.
The difference between basic automation and true productivity automation often comes down to how well developers understand workflow design.
A recent report from McKinsey estimated that about 60% of work activities could be automated using existing technologies, particularly when AI and workflow automation are combined.
Source: https://www.mckinsey.com/capabilities/operations/our-insights/the-future-of-work-after-covid-19
For developers, that means learning automation is no longer optional. It’s becoming a core engineering skill.
This article explores how developers can learn automation the smart way - using scripts, bots, and AI workflows that actually solve real problems instead of creating complicated automation systems that nobody maintains.
Many developers still think automation means writing small helper scripts.
In reality, automation today spans standalone scripts, platform bots, API integrations, and AI-driven workflows.
The most productive engineers spend less time doing repetitive tasks and more time designing systems that eliminate those tasks entirely.
A Stack Overflow developer survey consistently shows that developers who invest in automation and DevOps tools tend to report higher productivity and job satisfaction.
Source: https://survey.stackoverflow.co/
Automation doesn’t just save time. It reduces errors, improves scalability, and allows teams to move faster.
To automate effectively, developers need to understand the three main layers of automation systems.
This is the simplest and most common type of automation.
Examples include deployment scripts, cron jobs, and quick Python tools that clean messy data.
Scripts are ideal for automating repetitive local tasks.
Example:
A developer might write a Python script that cleans messy data exports and runs on a schedule.
While simple, these scripts often become the foundation of larger automation systems.
A helpful reference for learning scripting automation techniques can be found here:
https://realpython.com/python-automation/
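As a concrete example of this layer, here is a small script in the spirit of the "quick Python tool to clean messy data" mentioned earlier — it normalizes trailing whitespace and collapses runs of blank lines in text exports (the file layout is a placeholder):

```python
from pathlib import Path

def clean_text(raw: str) -> str:
    """Strip trailing whitespace and collapse consecutive blank lines."""
    lines = [line.rstrip() for line in raw.splitlines()]
    cleaned, previous_blank = [], False
    for line in lines:
        if line == "" and previous_blank:
            continue  # skip repeated blank lines
        cleaned.append(line)
        previous_blank = (line == "")
    return "\n".join(cleaned) + "\n"

def clean_directory(folder: str) -> int:
    """Clean every .txt export in a folder; returns how many files changed."""
    changed = 0
    for path in Path(folder).glob("*.txt"):
        raw = path.read_text()
        cleaned = clean_text(raw)
        if cleaned != raw:
            path.write_text(cleaned)
            changed += 1
    return changed
```

Trivial on its own, but exactly the kind of building block larger automation systems are assembled from.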
Bots automate tasks across platforms.
They interact with services like Slack, GitHub, and CI/CD pipelines.
For example, a DevOps team might create a Slack bot that posts build and deployment status updates to the team channel.
Bots allow automation to operate inside collaboration tools where teams already work.
A good introduction to building developer bots is available here:
https://developer.github.com/apps/building-github-apps/
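At its core, a bot like that reduces to: receive an event, build a message, post it to a chat API. A sketch of the message-building step for a hypothetical build-status Slack bot — the payload shape follows Slack's incoming-webhook format, while the event fields are assumptions:

```python
import json

def format_build_alert(event: dict) -> str:
    """Turn a CI event into a Slack incoming-webhook payload (JSON string)."""
    status = event["status"]
    emoji = ":white_check_mark:" if status == "success" else ":x:"
    text = f"{emoji} Build {status} on {event['repo']} ({event['branch']})"
    return json.dumps({"text": text})

# Posting the payload is a single HTTP request; with only the standard library:
# import urllib.request
# req = urllib.request.Request(
#     "https://hooks.slack.com/services/T000/B000/XXXX",  # placeholder webhook URL
#     data=format_build_alert(event).encode(),
#     headers={"Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)
```

Keeping the formatting logic separate from the HTTP call makes the bot easy to test without touching the network.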
This is where automation becomes significantly more powerful.
Instead of executing predefined steps, AI workflows can interpret unstructured input, make decisions, and adapt their steps to context.
For example, an AI automation workflow could read incoming support requests, classify them, and route each one to the right team.
Platforms like Zapier, Make, and n8n have begun integrating AI agents directly into workflow automation.
Overview of AI workflow automation:
https://zapier.com/blog/ai-workflows/
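Stripped of vendor specifics, an AI workflow is a chain of steps where at least one step makes a judgment call instead of following a fixed rule. A toy sketch of the classify-then-route pattern, with a keyword classifier standing in for the model call (in a real workflow that function would call an LLM API; the queue names are invented):

```python
def classify(ticket_text: str) -> str:
    """Stand-in for an AI model call that labels incoming text."""
    lowered = ticket_text.lower()
    if "refund" in lowered or "charge" in lowered:
        return "billing"
    if "crash" in lowered or "error" in lowered:
        return "bug"
    return "general"

ROUTES = {
    "billing": "finance-queue",
    "bug": "engineering-queue",
    "general": "support-queue",
}

def route_ticket(ticket_text: str) -> str:
    """The workflow: classify the text, then route on the classification."""
    return ROUTES[classify(ticket_text)]
```

Swapping the keyword rules for a model call changes one function, not the shape of the workflow — which is why platforms can bolt AI agents onto existing automation.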
Learning automation the smart way means avoiding mistakes that cause automation systems to fail.
Some developers build complex systems for problems that require only a simple script.
Example:
A developer might design an entire microservice architecture just to run daily reports when a scheduled script would work perfectly.
The key is choosing the simplest automation solution that solves the problem.
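For the daily-report case above, the simplest solution is often just a plain function plus one scheduler entry. A sketch (the report fields and paths are placeholders):

```python
from datetime import date

def build_daily_report(rows: list[dict]) -> str:
    """Summarize the day's records as plain text -- no services required."""
    total = sum(r["amount"] for r in rows)
    lines = [
        f"Daily report for {date.today().isoformat()}",
        f"Records: {len(rows)}",
        f"Total: {total}",
    ]
    return "\n".join(lines)

# Scheduling it is one crontab line, e.g. run every day at 07:00:
#   0 7 * * * /usr/bin/python3 /opt/reports/daily_report.py
```

Roughly twenty lines and one cron entry, versus a microservice that has to be deployed, monitored, and patched.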
Automation systems can fail silently if logging and monitoring are not implemented.
For example:
A workflow that processes financial transactions must include logging, monitoring, and alerts on failure.
Without these safeguards, automation becomes risky.
Guidelines for building reliable automation pipelines:
https://martinfowler.com/articles/patterns-of-distributed-systems/
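A minimal version of those safeguards is a wrapper that logs every run, retries transient failures, and raises loudly when retries are exhausted. A sketch, where the final error log stands in for whatever alerting hook you actually use:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("workflow")

def run_with_safeguards(step, *, retries: int = 3, delay: float = 0.0):
    """Run a workflow step with logging, retries, and a final alert."""
    for attempt in range(1, retries + 1):
        try:
            result = step()
            log.info("step succeeded on attempt %d", attempt)
            return result
        except Exception as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            time.sleep(delay)
    # In production this is where you would page someone or post to a channel.
    log.error("step failed after %d attempts -- alerting", retries)
    raise RuntimeError("workflow step exhausted retries")
```

The point is that failure is visible and bounded: the workflow either succeeds, or it fails with a log trail and an alert, never silently.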
Another common issue is undocumented automation.
When the original developer leaves the team, nobody understands how the system works.
Automation should always include documentation: what it does, how it runs, and how to recover when it fails.
This ensures the automation remains maintainable.
Developers can start learning automation by building small but useful projects.
A good first project is a CI/CD pipeline with GitHub Actions: on every push, run your test suite automatically and deploy when it passes.
Documentation:
https://docs.github.com/en/actions
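A minimal workflow file for that kind of project might look like this — a sketch that assumes a Python project with a `tests/` folder and a `requirements.txt`:

```yaml
# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: python -m pytest tests/
```

Commit that one file and GitHub runs the suite on every push — no servers to maintain.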
A second project is an AI-driven workflow built with LangChain, such as a pipeline that classifies or summarizes incoming text.
Guide for AI application workflows:
https://www.langchain.com/
A third project is a scheduled data pipeline with Apache Airflow that extracts, transforms, and loads data on a regular cadence.
Introduction to Airflow pipelines:
https://airflow.apache.org/docs/
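The pipeline itself is just functions an Airflow DAG would chain together as tasks; sketching them as plain Python keeps the idea visible without the Airflow boilerplate (the data and field names are placeholders):

```python
def extract() -> list[dict]:
    """Pretend source: a real DAG task might query an API or database."""
    return [
        {"user": "a", "spend": 10},
        {"user": "b", "spend": 0},
        {"user": "c", "spend": 7},
    ]

def transform(rows: list[dict]) -> list[dict]:
    """Drop empty records -- the cleaning step of the pipeline."""
    return [r for r in rows if r["spend"] > 0]

def load(rows: list[dict]) -> int:
    """Pretend sink: a real task would write to a warehouse table."""
    return len(rows)

def run_pipeline() -> int:
    """In Airflow, these three calls become tasks wired extract >> transform >> load."""
    return load(transform(extract()))
```

What Airflow adds on top is scheduling, retries, and visibility — the safeguards discussed earlier, provided by the platform instead of hand-rolled.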
AI is rapidly changing how automation systems are built.
Instead of manually coding every workflow step, developers can now use AI to generate scripts, suggest workflow steps, and handle unstructured inputs.
According to Deloitte’s automation trends report, organizations adopting AI-powered automation are seeing significant productivity improvements in technical teams.
Source: https://www2.deloitte.com/insights/us/en/focus/tech-trends.html
Developers who understand both automation engineering and AI tools will likely become some of the most valuable technical professionals in the coming years.
Developers interested in structured learning paths around automation systems, scripting, and AI-driven workflows can explore programs focused on AI Automation Mastery here:
https://www.edstellar.com/course/ai-automation-mastery-training
If you want to build strong automation skills, start with these steps.
Look for small tasks in your workflow and automate them. Popular starting points include file cleanup, report generation, and notification scripts.
Develop reusable tools such as command-line utilities, deployment scripts, and shared libraries for common tasks.
Experiment with AI agents that classify, summarize, or route information for you.
Automation scripting with Python
https://realpython.com/python-automation/
GitHub automation and developer bots
https://developer.github.com/apps/building-github-apps/
Workflow automation with Zapier AI
https://zapier.com/blog/ai-workflows/
CI/CD automation using GitHub Actions
https://docs.github.com/en/actions
Building data pipelines with Apache Airflow
https://airflow.apache.org/docs/
Enterprise automation trends and insights
https://www2.deloitte.com/insights/us/en/focus/tech-trends.html
Automation is evolving rapidly, and developers who treat it as a core skill rather than a side project will have a major advantage.
The smartest way to learn automation is not by chasing tools but by understanding workflows.
Start with simple scripts. Expand into bots. Then build intelligent AI workflows that can adapt and scale.
Over time, automation stops being something you occasionally write - and becomes the foundation of how your systems operate.
What’s the most useful automation you’ve built so far? Was it a simple script, a bot, or a full AI workflow?