
The New OAuth Problem Is Agent Delegation

2026-05-04 09:09:08

Enterprise identity used to have a fairly stable center of gravity.

A user authenticated. An application received a token. The token carried scopes or claims. The backend enforced what that application was allowed to do.

That model was never trivial, but it was legible.

Agents are making it less so.

An AI agent is not just another software client. It can plan, delegate, chain tools, invoke other agents, operate over time, and make decisions inside partially autonomous loops. It may act on behalf of a user in one moment, on behalf of a service in the next, and through a brokered protocol hop after that. It may hold authority briefly, derive narrower authority for a subtask, or preserve more authority than anyone intended.

That is why the emerging identity problem in AI is not simply authentication.

It is delegation.

More specifically, it is the combined problem of agent identity, delegated authority, and protocol trust.

That is where the next serious access-control failures are likely to come from.

Why classic OAuth thinking starts to strain

OAuth was built for an important but narrower question:

How can one application access a resource on behalf of a user, under bounded consent?

That question still matters. It is just no longer enough.

Agents introduce harder questions:

  • Is this agent acting as the user, for the user, or instead of the user?
  • Can the agent delegate part of its authority to another agent or tool?
  • If it does, what exactly is supposed to survive that delegation?
  • Can the delegated authority be narrowed but never expanded?
  • Can the user revoke the whole chain later?
  • Can a relying service tell whether this request came from a human, a first-party agent, or a third-party delegated sub-agent?

Traditional OAuth patterns do not disappear here, but they begin to strain because the delegated actor is no longer a passive software client. It is a reasoning system with workflow freedom.

That changes the trust problem.

The real issue is not "who are you?"

Identity conversations often begin with a familiar question:

Who is making this request?

That remains necessary, but in agent systems it is no longer sufficient.

The more important question is:

Whose authority is being exercised right now, under what limits, and through how many hops?

That is a different class of problem.

An agent may be authenticated correctly and still be dangerously over-authorized. It may be a valid agent with an invalid delegation chain. It may hold a token that proves origin without proving the action is appropriate. It may invoke a second agent that inherits too much context or too much scope. It may call a tool with what looks like user authority even though the user never meant to authorize that specific kind of step.

In other words, authentication is table stakes. Delegation semantics are where the hard failures live.

Agents turn authority into a chain, not a session

This is the architectural shift that matters most.

Classic application auth often centers on a session or token grant between a user, a client, and a resource server.

Agent systems create authority chains:

user intent
  → primary agent
    → tool call or protocol broker
      → secondary agent
        → downstream API
          → side effect in an external system

Every hop raises new questions:

  • Did the original user intend this exact step?
  • Was the authority narrowed or preserved?
  • Did the downstream system verify the chain or just trust the immediate caller?
  • Can the second agent prove what it is allowed to do versus what it merely can do?
  • If something goes wrong, who is accountable for the action?

The more agentic the workflow becomes, the less useful it is to think in terms of one flat access token floating around the system.

The dangerous default: delegation without attenuation

This is where the next wave of poor security design is likely to show up.

Many teams will correctly recognize that agents need to call tools and other services. They will wire up tokens, API keys, service accounts, or on-behalf-of flows and consider the problem solved.

But the real danger is not delegation by itself.

It is delegation without attenuation.

Meaning:

  • the child agent gets as much authority as the parent
  • the tool receives a broad token instead of a task-specific one
  • the scope does not shrink as the chain gets longer
  • argument-level limits are not preserved
  • the downstream service cannot tell whether the user approved this exact action

That is how agent delegation becomes the new OAuth problem.

OAuth taught us that overly broad scopes and long-lived tokens create trouble. Agent systems add a new twist: authority can now move across reasoning systems that generate new subtasks on the fly.

If scope does not shrink as the chain expands, the system is effectively multiplying trust rather than routing it.

In AI systems, that risk is amplified by a property ordinary clients do not have: delegated authority is being handed to components that can be manipulated by language.

If a child agent holds a broad, unattenuated token and then processes malicious input, prompt injection stops being only a reasoning failure. It becomes an authorization failure. The attacker does not need to steal the token directly. They only need to steer the agent that already has it.

That is what makes over-delegation so much more urgent in agent systems than in ordinary software clients.
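
To make attenuation concrete, here is a minimal Rust sketch of delegation that can only narrow. All names are illustrative, not borrowed from any real library; the load-bearing part is the invariant that a child grant is always a subset of its parent:

use std::collections::BTreeSet;

// Illustrative only: the invariant is that a child grant can never
// hold more authority than its parent.
#[derive(Debug, Clone)]
struct Grant {
    scopes: BTreeSet<String>,
    remaining_hops: u8,
    expires_at: u64, // unix seconds
}

impl Grant {
    // Derive a narrower grant for a child agent or tool. Widening the
    // scope, outliving the parent, or exceeding the hop budget all fail.
    fn attenuate(&self, requested: &[&str], ttl_secs: u64, now: u64) -> Result<Grant, String> {
        if self.remaining_hops == 0 {
            return Err("delegation depth exhausted".to_string());
        }
        let requested: BTreeSet<String> = requested.iter().map(|s| s.to_string()).collect();
        if !requested.is_subset(&self.scopes) {
            return Err("child grant would expand scope".to_string());
        }
        Ok(Grant {
            scopes: requested,
            remaining_hops: self.remaining_hops - 1,
            expires_at: self.expires_at.min(now + ttl_secs), // never outlives the parent
        })
    }
}

fn main() {
    let parent = Grant {
        scopes: ["calendar:read", "calendar:write"].iter().map(|s| s.to_string()).collect(),
        remaining_hops: 2,
        expires_at: 1_800_000_600,
    };
    // A scheduling sub-agent gets read access only, for five minutes,
    // with one fewer hop left in the budget.
    println!("{:?}", parent.attenuate(&["calendar:read"], 300, 1_800_000_000));
    // This one is refused: the child asks for a scope the parent lacks.
    println!("{:?}", parent.attenuate(&["payments:execute"], 300, 1_800_000_000));
}

A real system would also bind each grant into a signed artifact, but the subset check, the shrinking hop budget, and the capped expiry are the properties that stop trust from multiplying.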

Protocol trust is now part of security, not plumbing

This is another area the industry still underestimates.

When agents talk to tools or other agents through emerging protocols, the protocol itself becomes part of the trust model.

Not because protocols are inherently unsafe, but because they define:

  • how identity is represented
  • how delegation is expressed
  • what provenance survives handoff
  • how consent is bound to action
  • whether scope restrictions remain machine-verifiable
  • what the receiver is allowed to assume about the caller

Protocol design is no longer just interoperability work. It is authorization design.

If the protocol does not clearly preserve identity, delegation depth, authority narrowing, and provenance, then every implementation ends up reconstructing trust from partial hints.

That is how systems become integrated without becoming defensible.

Four failures we are likely to see more of

1. The helpful agent with a service-account skeleton key

A company wants its internal agent to work reliably, so it grants backend service credentials with broad read and write access across several business systems. The user experience feels great because the agent rarely gets blocked.

But once the agent operates through a shared backend identity, the system starts collapsing user intent and service privilege into one bundle.

Now the question is no longer "can the user do this?" It becomes "can the agent backend do this?" Those are not the same thing.

2. The child agent that inherited too much

A primary agent hands work to a specialist agent for scheduling, procurement, code changes, legal lookup, or data retrieval. The child gets the same authority envelope as the parent because building a narrower one felt inconvenient.

That is classic over-delegation.

The user authorized a task. The architecture quietly authorized a whole class of adjacent actions.

3. The downstream API that trusted the last hop only

An external service receives a request from an agent broker or sub-agent and checks that the presented token is valid. What it does not verify is whether the full chain of delegation still matches the original user intent, whether the action exceeded the approved task boundary, or whether scope was meant to attenuate at each hop.

The request is authenticated.

The chain is still wrong.

4. The revocation problem nobody modeled

A user revokes consent, an approval expires, or the primary task is canceled. But delegated authority already propagated to a child agent, a queued job, or a downstream tool execution context.

Now the system has to answer a very uncomfortable question:

Did revocation actually follow the authority chain, or did it only update the front door?

Why agent identity is not just a naming problem

A lot of teams hear "agent identity" and think mainly about registration:

  • naming the agent
  • assigning a client ID
  • issuing credentials
  • deciding whether it is first-party or third-party

That matters, but it is not enough.

The deeper problem is that agent identity has to be meaningful in context.

The receiver needs to understand things like:

  • which agent this is
  • who authorized it
  • whether it is acting directly or through delegation
  • which task or intent boundary applies
  • what autonomy level is allowed
  • whether human approval was required upstream
  • whether any of those facts were transformed across hops

That is much richer than "this token belongs to client X."

The next design pattern will be proof-carrying delegation

The answer is not "invent one magic protocol and everything is solved."

But the direction is becoming clearer.

This work is not starting from zero. The broader security world already has useful building blocks for attenuated and delegable authority, even if AI systems have not applied them seriously enough yet.

Concepts like OAuth Token Exchange, Macaroons, and Biscuit tokens all point in the right direction. They are different tools, but they share an important idea: authority can be delegated with constraints, caveats, attenuation, and verifiable structure instead of being passed around as one broad bearer credential.

None of them is the complete answer for agent systems. Multi-agent planning, protocol handoff, prompt injection, and long delegation chains introduce additional problems. But they give builders a far better place to start than pretending agent authorization has to be invented from scratch.

Agent systems are going to need delegation artifacts that carry more proof than ordinary bearer access:

  • who the agent is
  • which human, service, or policy delegated authority
  • what exact task or capability is allowed
  • how much further delegation is permitted
  • whether the authority was narrowed at each hop
  • when the delegation expires
  • how revocation propagates
  • what audit evidence links the final action back to the original grant

That is the shape of a more trustworthy agent authorization model.

Not just a token that says "allowed," but a chain that says allowed by whom, for what, how far, and under which constraints.
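
As one possible shape for such a chain, here is an illustrative Rust sketch. The field names simply mirror the list above; they are not drawn from any existing standard, and the signatures a real artifact would carry are reduced to a comment:

// One possible shape for a proof-carrying delegation link.
// Purely illustrative field names; no existing standard implied.
struct DelegationLink {
    agent_id: String,          // who the agent is
    delegated_by: String,      // which human, service, or policy granted authority
    allowed_task: String,      // the exact task or capability permitted
    scopes: Vec<String>,       // must be a subset of the parent link's scopes
    max_further_hops: u8,      // how much further delegation is permitted
    expires_at: u64,           // when the delegation lapses (unix seconds)
    parent: Option<Box<DelegationLink>>, // the chain back to the original grant
    // In a real artifact each link would be signed or MAC'd so a
    // verifier can check provenance without trusting the caller.
}

// Walk the chain: every link must be unexpired, attenuated relative
// to its parent, and within a shrinking hop budget.
fn verify_chain(link: &DelegationLink, now: u64) -> Result<(), String> {
    if now >= link.expires_at {
        return Err(format!("link for {} expired", link.agent_id));
    }
    if let Some(parent) = &link.parent {
        for s in &link.scopes {
            if !parent.scopes.contains(s) {
                return Err(format!("scope {s} expands beyond parent"));
            }
        }
        if link.max_further_hops >= parent.max_further_hops {
            return Err("hop budget must shrink at each delegation".to_string());
        }
        verify_chain(parent, now)?;
    }
    Ok(())
}

fn main() {
    let root = DelegationLink {
        agent_id: "primary-agent".to_string(),
        delegated_by: "user:alice".to_string(),
        allowed_task: "book travel".to_string(),
        scopes: vec!["calendar:read".to_string(), "payments:hold".to_string()],
        max_further_hops: 2,
        expires_at: 1_800_000_600,
        parent: None,
    };
    let child = DelegationLink {
        agent_id: "scheduling-agent".to_string(),
        delegated_by: "agent:primary-agent".to_string(),
        allowed_task: "find a meeting slot".to_string(),
        scopes: vec!["calendar:read".to_string()],
        max_further_hops: 1,
        expires_at: 1_800_000_300,
        parent: Some(Box::new(root)),
    };
    println!("verifying '{}' delegated by {}: {:?}",
        child.allowed_task, child.delegated_by, verify_chain(&child, 1_800_000_000));
}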

What good looks like

A serious agent platform should treat delegation as a first-class control surface.

That means:

Short-lived, task-bound authority.
An agent should not carry broad reusable permission when a narrow, per-task grant would do.

Attenuating delegation.
Child agents and downstream tools should inherit less authority than the parent, not the same amount.

Explicit delegation depth.
If the system allows agent-to-agent handoff, it should define how many hops are allowed and what changes at each hop.

Machine-verifiable provenance.
The receiver should not have to trust narrative claims about who authorized the action. It should be able to verify them.

Cascade revocation.
When the root authority is withdrawn, dependent delegated grants should not keep drifting alive in queues, workers, or child agents.

Separation between user intent and service convenience.
A backend service account should not become a universal substitute for delegated user authority just because it is easier to implement.
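
Cascade revocation in particular is easy to state and easy to get wrong. A minimal sketch, assuming each grant records its children (all names hypothetical); the point is that every enforcement point checks liveness at action time, not just at issuance:

use std::collections::{HashMap, HashSet};

struct RevocationIndex {
    children: HashMap<String, Vec<String>>, // grant id -> child grant ids
    revoked: HashSet<String>,
}

impl RevocationIndex {
    // Walk the delegation tree from the revoked root outward, so no
    // descendant grant keeps drifting alive in a queue or worker.
    fn revoke(&mut self, root: &str) {
        let mut stack = vec![root.to_string()];
        while let Some(id) = stack.pop() {
            if self.revoked.insert(id.clone()) {
                if let Some(kids) = self.children.get(&id) {
                    stack.extend(kids.iter().cloned());
                }
            }
        }
    }

    // Workers, queue consumers, and tool runners must check this
    // before acting, not only when the token was minted.
    fn is_live(&self, id: &str) -> bool {
        !self.revoked.contains(id)
    }
}

fn main() {
    let mut idx = RevocationIndex {
        children: HashMap::from([
            ("g-root".to_string(), vec!["g-child".to_string()]),
            ("g-child".to_string(), vec!["g-tool".to_string()]),
        ]),
        revoked: HashSet::new(),
    };
    idx.revoke("g-root"); // the user withdraws consent at the front door
    println!("tool grant live: {}", idx.is_live("g-tool")); // false
}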

The practical question for teams

If you are building an agent that calls tools, APIs, or other agents, ask this:

What is the maximum authority this agent can pass downstream, and can we prove that it shrinks rather than spreads?

That question exposes the real architecture quickly.

It forces you to inspect:

  • token exchange design
  • scope narrowing
  • tool-level constraints
  • delegation depth
  • revocation semantics
  • protocol assumptions
  • auditability of the full chain

If those answers are fuzzy, the identity layer is probably less mature than the demo suggests.

The next OAuth lesson is already here

Classic OAuth taught the industry a durable lesson: authorization is easy to get working and hard to get right.

Agents are reopening that lesson in a more complicated form.

Now the problem is not just application consent screens, bearer tokens, and API scopes.

It is delegated authority moving through reasoning systems, protocol hops, child agents, and external tools.

That is why this topic matters now.

The new identity problem in AI is not simply "how does the agent sign in?"

It is:

How do we make delegated agent authority narrow, provable, revocable, and trustworthy across the entire chain?

That is the new OAuth problem. And most teams are only beginning to discover it.

Design First, Code Later: Mastering Spec-Driven Development in Rust

2026-05-04 08:58:32

Have you ever started coding a feature, only to realize halfway through that your architecture is hopelessly tangled? We’ve all been there. You start writing a service, and before you know it, your business logic is bleeding into your database queries and third-party API calls.

To prevent this architectural chaos, modern engineering teams are increasingly turning to Spec-Driven Development (SDD).

What is Spec-Driven Development?

Spec-Driven Development is a paradigm where you define the "What" (the specifications, contracts, and behaviors) long before you write the "How" (the concrete implementation).

Instead of diving straight into writing database queries or HTTP requests, you define clear interfaces. By doing this, you naturally enforce core Software Design Principles—most notably the Dependency Inversion Principle (DIP) and the Single Responsibility Principle (SRP). Your core application logic depends on abstractions (the Spec), not on concrete details.

How SDD Guides Correct Software Design

In languages with powerful type systems like Rust, SDD feels incredibly natural. Rust’s trait system is the perfect tool for defining specifications.

When you define a trait first, you are building a contract. Your business logic only knows about this contract. It doesn't care if the underlying data comes from a PostgreSQL database, an external REST API, or a simple text file. This separation of concerns allows different developers to work on the core logic and the infrastructure simultaneously, and it makes unit testing an absolute breeze.

The Example: Applying SDD to a Payment System

Let’s demonstrate how to apply Spec-Driven Development correctly in Rust. Imagine we are building an e-commerce checkout service.

Step 1: Define the Spec (The Contract)
Before writing any complex logic, we define what a payment gateway should do.

// 1. The Specification (Contract)
pub trait PaymentGateway {
    fn process_payment(&self, user_id: &str, amount: f64) -> Result<(), String>;
}

Step 2: Write Logic Against the Spec
Now, we write our core business logic. Notice how CheckoutService doesn't know anything about Stripe, PayPal, or credit cards. It only knows about the PaymentGateway spec.

// 2. The Core Logic (Depends on the Spec, not the implementation)
pub struct CheckoutService<T: PaymentGateway> {
    gateway: T,
}

impl<T: PaymentGateway> CheckoutService<T> {
    pub fn new(gateway: T) -> Self {
        Self { gateway }
    }

    pub fn complete_checkout(&self, user_id: &str, amount: f64) {
        println!("🛒 Starting checkout process for user: {}", user_id);

        match self.gateway.process_payment(user_id, amount) {
            Ok(_) => println!("✅ Checkout successful! Items are being prepared."),
            Err(e) => eprintln!("❌ Checkout failed: {}", e),
        }
    }
}

Step 3: Create Concrete Implementations
Finally, we implement the details. We can create a mock for local testing and a real one for production.

// 3. Concrete Implementations of the Spec

// A mock implementation for rapid testing and development
pub struct MockPaymentGateway;
impl PaymentGateway for MockPaymentGateway {
    fn process_payment(&self, _user_id: &str, amount: f64) -> Result<(), String> {
        println!("🛠️  [MOCK] Authorizing ${:.2} without hitting real APIs...", amount);
        Ok(())
    }
}

// A production implementation (e.g., Stripe)
pub struct StripeGateway;
impl PaymentGateway for StripeGateway {
    fn process_payment(&self, user_id: &str, amount: f64) -> Result<(), String> {
        println!("💳 [STRIPE] Connecting to production API for user {}...", user_id);
        println!("💳 [STRIPE] Successfully charged ${:.2}", amount);
        Ok(())
    }
}

fn main() {
    // Execution 1: Using the Mock (Development/Testing environment)
    println!("--- Running with Mock Gateway ---");
    let dev_service = CheckoutService::new(MockPaymentGateway);
    dev_service.complete_checkout("user_dev_01", 45.50);

    println!("\n--- Running with Production Gateway ---");
    // Execution 2: Using the Real Gateway (Production environment)
    let prod_service = CheckoutService::new(StripeGateway);
    prod_service.complete_checkout("user_prod_99", 120.00);
}
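
Because CheckoutService depends only on the spec, unit tests can drop in the mock (or any purpose-built fake) with no network calls and no credentials. A minimal sketch, where the AlwaysDeclines fake is hypothetical and exists only to exercise the error path:

#[cfg(test)]
mod tests {
    use super::*;

    // A hypothetical failing gateway, so the error path gets coverage too.
    struct AlwaysDeclines;
    impl PaymentGateway for AlwaysDeclines {
        fn process_payment(&self, _user_id: &str, _amount: f64) -> Result<(), String> {
            Err("card declined".to_string())
        }
    }

    #[test]
    fn mock_gateway_authorizes_payment() {
        let gateway = MockPaymentGateway;
        assert!(gateway.process_payment("test_user", 10.0).is_ok());
    }

    #[test]
    fn gateway_errors_are_surfaced() {
        let result = AlwaysDeclines.process_payment("test_user", 10.0);
        assert_eq!(result, Err("card declined".to_string()));
    }
}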

Execution & Proof

Terminal Output:
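
--- Running with Mock Gateway ---
🛒 Starting checkout process for user: user_dev_01
🛠️  [MOCK] Authorizing $45.50 without hitting real APIs...
✅ Checkout successful! Items are being prepared.

--- Running with Production Gateway ---
🛒 Starting checkout process for user: user_prod_99
💳 [STRIPE] Connecting to production API for user user_prod_99...
💳 [STRIPE] Successfully charged $120.00
✅ Checkout successful! Items are being prepared.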

Conclusion

  • Spec-Driven Development is more than just a coding technique; it is an architectural mindset. By defining your interfaces (specs) first, you are forced into building decoupled, highly cohesive systems.

  • In our Rust example, if we ever need to switch our payment provider from Stripe to another service, our CheckoutService remains completely untouched. We simply write a new struct that implements the PaymentGateway trait.

  • By marrying SDD with core software design principles, you stop fighting your codebase and start building modular, future-proof software. Define the contract, respect the boundaries, and let the architecture guide your implementation.

Cx Dev Log — 2026-04-30

2026-05-04 08:51:36

Phase 11 just introduced compound assign lowering on submain, pulling +=, -=, *=, /=, and %= into the IR backend for the first time. All in all, 126 new lines in src/ir/lower.rs and three fresh tests. Main keeps its 78/78 green tests, while submain stays ahead by 22 commits with a 33-day bridge to cross.

Compound Assign Lowering

Commit 9015aff on submain is the centerpiece. Gone is the CompoundAssign stub that only returned an UnresolvedSemanticArtifact; in its place is real lowering logic. For Binding LValues, it mirrors the plain assignment's SSA journey:

  1. Fetch the SSA value of the binding.
  2. Lower the right-hand argument.
  3. Emit a Binary instruction mapping to the op (Add for +=, Sub for -=, etc.).
  4. Re-bind via SsaBind and adjust the binding map.
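
In toy form, that dance looks roughly like the sketch below. Everything in it is a self-contained illustration with made-up names; the real types live in src/ir/lower.rs:

use std::collections::HashMap;

#[derive(Clone, Copy, Debug)]
enum BinOp { Add, Sub, Mul, Div, Rem }

#[derive(Debug)]
enum Inst {
    Binary { op: BinOp, lhs: u32, rhs: u32, dst: u32 },
    SsaBind { binding: u32, value: u32 },
}

struct Lowerer {
    next_value: u32,
    bindings: HashMap<u32, u32>, // BindingId -> current SSA value
    insts: Vec<Inst>,
}

impl Lowerer {
    // `x += e` lowers to: fetch x, emit Binary, re-bind. The RHS is
    // modeled here as already lowered to an SSA value.
    fn lower_compound_assign(&mut self, binding: u32, op: BinOp, rhs_value: u32) {
        let current = self.bindings[&binding];               // 1. fetch SSA value
        let dst = self.next_value;                           //    fresh destination
        self.next_value += 1;
        self.insts.push(Inst::Binary { op, lhs: current, rhs: rhs_value, dst }); // 3.
        self.insts.push(Inst::SsaBind { binding, value: dst });                  // 4.
        self.bindings.insert(binding, dst);                  // 4. adjust binding map
    }
}

fn main() {
    let mut l = Lowerer { next_value: 10, bindings: HashMap::from([(0, 1)]), insts: vec![] };
    l.lower_compound_assign(0, BinOp::Add, 2); // models `x += e`
    println!("{:?}", l.insts);
}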

For DotAccess LValues, things are different. Instead of silently bypassing, compound assigns for struct fields raise an intentional typed error. This mirrors the handling of other unsupported constructs and lays down a marker – struct field work needs pointer arithmetic, which isn't in the mix yet.

There are three tests now in play:

  • lowers_compound_assign_add ensures += lands as Binary::Add.
  • lowers_compound_assign_sub drives -= end-to-end through lower_and_validate.
  • rejects_compound_assign_dot_access makes sure DotAccess gets shown the door with the expected error.

Two older error-message checks refined their format, adopting "compound assign binding BindingId(N)" over the bland "CompoundAssign".

In summary: 126 lines added, just 5 trimmed, all within src/ir/lower.rs.

The Pattern

This move exemplifies the piecemeal approach standing firm throughout Phase 11. Each parcel grabs one syntax category, works out the lowering logic, wraps it in tailored tests, and leaves stubs with explicit error signals for what can’t be reached yet. Think of unary expressions landing on April 26. Compound assigns line up now. What remains – ArrayLit, Index, MethodCall, and others, along with compound assign on DotAccess – is nailed down with typed errors, not panics.

The four-day pause since April 26 – noteworthy mainly because it broke a steady rhythm – doesn't signal any existential trouble; the expression-lowering march has simply resumed.

Submain Divergence

The rift between main and submain hasn’t changed. Right now, it looms large. Submain's 22 commits are sitting pretty with a 33-day lead covering Phases 10 and 11, the error model (Result), integer overflows, optional semicolons, thorough audits, diagnostic sheen, and the roadmap to v5.0 with its 9 hard blockers checked off. Main is back at v4.8 still staring at those very same 9 blockers.

Merge day has been the linchpin priority forever, but here we are. Every spin in the submain-alone orbit means the merge becomes a bit more of a puzzle. The test gap right now? 39 tests wide (main’s 78 vs submain’s 117).

What's Next

The backend's immediate path runs through the remaining expression lowerings. Compound assign on DotAccess, in particular, hinges on the struct-field-access revamp (pointer arithmetic and Load/Store), which has yet to land.

Merging remains king. The daily-log PR bloat – nearly 30 stagnant branches stretching from March 29 through April 29 – stacks as maintenance debt going forward.


Originally published at https://cx-lang.com/blog/2026-04-30

I Built My Own Entropy Coder Because Deflate Doesn't Know What GN Knows

2026-05-04 08:47:51

I shipped gni-compression to npm two days ago. One of the first questions I got (from myself, running benchmarks at midnight): does it work on anything other than chat data?

Short answer: not yet. Long answer: I found out exactly why, and it led me somewhere more interesting than I expected.

The Benchmark That Told the Truth

After the npm launch I ran GN against Silesia — the standard general text compression benchmark suite. Dickens, Webster, XML logs, binaries. Here's what came back:

GN loses. Not slightly — brotli-6 is 10–30% better on general text depending on the corpus. Gzip-6 beats it too.

The obvious question is why. GN beats brotli on chat data by ~2% consistently across 12 measurements. Same algorithm, different corpus, completely different result.

What's Actually Happening

GN's pipeline looks like this:

input → sliding window learner → tokenizer → token stream + literal stream → deflate each stream → frame

The sliding window learns repeated patterns from the data. On chat data it learns role markers, JSON field names, tool call schemas, prompt fragments. On Silesia it learns... less. The vocabulary is shallower because general text has less structural repetition.

But that's not the whole story. I ran a test that revealed something more uncomfortable:

deflate on raw data:       2.563x  — 28ms
deflate on GN-tokenized:   2.525x  — 15ms

Deflate on raw beats deflate on GN-tokenized. The tokenization step is actually hurting ratio on general text. It's faster (smaller input) but it compresses worse.

This means GN's wins on chat data come entirely from the vocabulary quality on that specific domain — and when the vocabulary is weaker, we're paying overhead with nothing to show for it.

Why Deflate Is the Wrong Coder Here

Deflate was designed for mixed byte streams. It uses LZ77 + Huffman coding. It's extremely well engineered for its purpose.

But GN's token stream is not a mixed byte stream. After tokenization it's a stream of small integers — token IDs, mostly drawn from the top 5000 entries of the learned vocabulary. The symbol distribution is highly skewed and known in advance.

Deflate doesn't know any of that. It treats the token stream like arbitrary bytes and builds a fresh Huffman tree from scratch for each chunk. It's doing redundant work and missing structure that's visible to GN's own data model.

ANS (asymmetric numeral systems) is different. It's a modern entropy coder — the same family zstd uses internally. It can be initialized with a pre-built frequency table tuned to GN's specific token distribution. On token streams with known skewed distributions, ANS should code significantly closer to theoretical entropy than deflate.
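
To make "initialized with a pre-built frequency table" concrete: a tabled ANS coder wants symbol frequencies normalized to a fixed power-of-two total. Here is a minimal sketch of that normalization step, with hypothetical names and no relation to the actual gn_ans_* interface:

// Normalize raw token counts so they sum to `target_total` (e.g. 4096),
// keeping every symbol codable with at least weight 1.
fn normalize_freqs(counts: &[u32], target_total: u32) -> Vec<u32> {
    let sum: u64 = counts.iter().map(|&c| c as u64).sum();
    assert!(sum > 0, "need at least one observed symbol");
    // Proportional scaling; unseen symbols still get minimum weight 1.
    let mut freqs: Vec<u32> = counts
        .iter()
        .map(|&c| ((c as u64 * target_total as u64 / sum).max(1)) as u32)
        .collect();
    // Absorb rounding drift into the most frequent symbol. (A real
    // normalizer handles the edge case where drift exceeds its slack.)
    let drift = target_total as i64 - freqs.iter().map(|&f| f as i64).sum::<i64>();
    let max_idx = counts
        .iter()
        .enumerate()
        .max_by_key(|&(_, &c)| c)
        .map(|(i, _)| i)
        .unwrap();
    freqs[max_idx] = (freqs[max_idx] as i64 + drift) as u32;
    freqs
}

fn main() {
    // A skewed token-ID histogram like GN's: a few hot tokens dominate.
    let counts = [9000u32, 500, 300, 150, 50];
    let freqs = normalize_freqs(&counts, 1 << 12);
    println!("{:?} (sum = {})", freqs, freqs.iter().sum::<u32>());
}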

We Already Built It

The ANS implementation is already in the codebase — gn_ans_compress, gn_ans_compress_bits, gn_ans_compress_o1 for the compress side, matching decompress variants. What's left is wiring it into the main compression path and benchmarking against deflate on the same split-stream output.

This matters for a reason beyond ratio numbers. Right now GN has one piece of its pipeline it didn't design: the entropy stage. Everything else — the rolling hash tokenizer, the codon table, the sliding window learner, the split-stream architecture, the frame format — was built for GN's specific problem. Replacing deflate with our own ANS implementation means the hot path is fully ours.

Why This Opens the Door to General Text

Here's the thing about GN's architecture: the domain-specificity lives in the vocabulary. The sliding window learns from whatever you feed it. On LLM chat data it learns chat patterns. On Silesia it could learn Silesia patterns — it's just shallower because general text has less structural repetition to exploit.

But with a coder that's tuned to GN's output distribution rather than arbitrary bytes, the floor goes up. The overhead we're currently paying on general text drops. The question becomes: how much does domain-adaptive preprocessing help when your entropy stage is no longer the bottleneck?

That's GNCompressorV2. Same architecture, own entropy coder, tested on both conversation data and general text with verified numbers.

Not there yet. But now I know exactly what the ceiling is and what's holding us below it.

Code: github.com/atomsrkuul/glasik-core | npm: gni-compression

Stop Using AI Only to Build—Start Using It to Break Your Systems

2026-05-04 08:45:05

Most of us have gotten comfortable using AI to speed things up—write code, generate tests, clean up documentation. It’s become a productivity tool. But there’s another way to use AI that feels less obvious and, in many cases, more valuable: using it to challenge your system instead of helping it.

If you’ve worked on real production systems, you already know this—things don’t usually break in obvious ways. They break in small, annoying, hard-to-reproduce ways. A value comes in with a slightly different format, a field has an extra space, casing changes, or something gets reordered. Nothing looks “wrong,” but suddenly the system behaves differently. These are the kinds of issues that slip through testing and show up later when it’s much harder to debug.

The reason this happens is simple. Most testing reflects how engineers think, not how real inputs behave. We test the expected cases, maybe a few edge cases, and call it done. Even automated tools often generate inputs that are either too clean or completely random. Neither really captures how data looks in the wild.

This is where AI starts to become useful in a different way. Instead of asking it to create solutions, you ask it to create variations. Give it one valid input, and it can produce multiple versions of that same input that still mean the same thing but look slightly different. That’s exactly the kind of variation that exposes weaknesses in systems.

Think about a basic API that takes an amount and a currency. You test it with something like “1000.00 USD,” and everything works. But what happens when the input becomes “1000”, or “1,000.00”, or has extra spaces, or uses lowercase for the currency? These aren’t unusual cases—they happen all the time. Yet many systems treat them differently, sometimes rejecting them, sometimes misinterpreting them, and sometimes behaving inconsistently.

Instead of manually trying to think of all these possibilities, you can let AI do that work. Treat it like a mutation engine. Start with one valid input and ask for realistic variations that don’t change the meaning. Then run all of them through your system and observe what happens. You’re no longer just testing whether the system works—you’re testing how stable it is when things are slightly off.
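
A minimal version of that harness fits in a page. Here parse_payment is a hypothetical, deliberately naive stand-in for your real entry point, and the variants are the kind an AI mutation pass would hand you:

// Hypothetical system under test: naive on purpose, so the harness
// has inconsistencies to surface.
fn parse_payment(input: &str) -> Result<(f64, String), String> {
    let mut parts = input.split_whitespace();
    let amount = parts.next().ok_or("missing amount")?;
    let currency = parts.next().ok_or("missing currency")?;
    // Rejects "1,000.00" and lowercase currencies, exactly the kind
    // of quietly inconsistent behavior described above.
    let value: f64 = amount.parse().map_err(|_| format!("bad amount: {amount}"))?;
    if currency.len() != 3 || !currency.chars().all(|c| c.is_ascii_uppercase()) {
        return Err(format!("bad currency: {currency}"));
    }
    Ok((value, currency.to_string()))
}

fn main() {
    // Semantically equivalent variants of one valid input.
    let variants = [
        "1000.00 USD",
        "1000 USD",
        "1,000.00 USD",
        "  1000.00   USD ",
        "1000.00 usd",
    ];
    let outcomes: Vec<_> = variants.iter().map(|v| parse_payment(v)).collect();
    for (v, o) in variants.iter().zip(&outcomes) {
        println!("{v:>18} -> {o:?}");
    }
    // The interesting signal is consistency, not pass/fail: equivalent
    // inputs that produce different outcomes are the deeper bug.
    let consistent = outcomes.windows(2).all(|w| w[0] == w[1]);
    println!("consistent across variants: {consistent}");
}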

This changes what you pay attention to. Instead of only asking, “Did this pass or fail?” you start asking, “Did the system behave the same way across all these inputs?” Because if two inputs are effectively the same but produce different outcomes, that’s a deeper issue. It’s not just a bug—it’s inconsistency in how your system interprets the world.

The nice part is that you don’t need a complicated setup to try this. You can start small. Generate a handful of variations using AI, run them through your existing flow, and compare the results. Even this simple exercise can reveal things that traditional testing misses.

This approach becomes especially useful in systems where input variability is common. Financial applications are a good example, where formatting differences can affect validations. OCR pipelines often deal with slightly inconsistent outputs for the same text. And modern AI-driven systems themselves can behave differently based on small changes in input phrasing. In all these cases, stability matters just as much as correctness.

One thing to watch out for is overusing AI without direction. If you generate too many random variations, you end up with noise instead of insight. The goal isn’t to overwhelm the system—it’s to explore meaningful differences. Another common mistake is focusing only on correctness and ignoring consistency. Both matter, but consistency is often what reveals deeper issues.

A more balanced way to think about this is to combine approaches. Let your code handle strict validation and rules. Use AI to explore the gray areas—the inputs that are technically valid but slightly different. Together, they give you a much better understanding of how your system behaves.

If you look back at most production issues, they rarely come from completely invalid data. They come from those edge cases that no one thought to test. Usually, a small percentage of inputs ends up causing a large share of problems. Adversarial testing is simply a way to find those cases earlier, when it’s easier to fix them.

In the end, AI isn’t just a tool for building faster. It’s also a way to question whether what you’ve built actually holds up under real conditions. When you start using it to push your system instead of just supporting it, you begin to uncover things you didn’t even realize were there.

And that shift—using AI not just as a helper but as something that challenges your system—is where the real learning starts.

Should Software Development Have Gatekeeping?

2026-05-04 08:43:26

It keeps getting easier to write code.

Today someone can open an editor, describe what they want in natural language, and get a working application in minutes. AI tools, friendlier frameworks, templates, endless tutorials. Getting into programming has never been this accessible.

And that sounds like good news.

But that ease brings an uncomfortable question: if getting in is this easy, does the foundation of the people getting in become weaker too?

I'm talking about fundamentals.

Many people today learn to program by getting results before building understanding. An API works, a page loads, a feature ships. But behind that, deeper things are often missing: understanding why something fails, what an abstraction really does, what a design decision implies, why certain code scales badly or breaks in production.

The ephemeral knowledge of AI assistance

There is an important difference between understanding something in the moment and actually learning it.

When you work with an AI tool, this often happens:

  • you read a solution
  • you recognize it
  • you can even explain it superficially

And that gives you the feeling of understanding.

But that feeling can be deceptive.

Because you didn't go through the process that normally consolidates knowledge: making mistakes, exploring alternatives, building the solution step by step. Instead, you received an already-solved version.

The result is a fragile kind of knowledge.

You know how to follow it, but not how to rebuild it.

And that difference shows up later.

When you face a similar problem again, the same block often reappears. Not because you haven't seen it before, but because you never internalized it.

Without real cognitive effort (having to recall, fail, adjust), the brain doesn't consolidate information well. What you get is recognition, not mastery.

You can move very fast… but on a foundation that won't necessarily hold you up when the problem changes.

Vibecoding, "slop," and the illusion of quality

With tools like vibecoding, producing functional code is extremely easy given enough tokens.

The problem is that "functional" is not a good indicator of quality.

Today an agent can generate something that:

  • runs
  • passes a basic case
  • does what you asked for

But that doesn't mean it's good code.

It can be:

  • hard to maintain
  • inconsistent
  • inefficient
  • fragile under change
  • full of decisions nobody understands

In other words, it can be slop… that works.

And if you don't have judgment (what many call taste) or clear standards, that kind of code doesn't just slip through; it accumulates.

That's where real degradation does appear.

Not because the tool is bad, but because it removes friction without replacing it with judgment.

Before, reaching a solution took enough effort that you questioned parts of the process along the way. Today you can skip that process entirely.

The result is clear:

the barrier to producing code dropped… but the barrier to producing good code did not.

And when those two separate, what grows fastest is not quality; it's volume.

So… do we need gatekeeping?

I don't think the solution is to make entry harder.

Making entry harder doesn't guarantee better programmers. It only guarantees less access. Historically, plenty of brilliant people got in precisely because the path was open, not because they passed an elitist filter.

The problem is something else.

It has never been easier to look productive without really understanding what you're doing.

And this is where something interesting comes in.

Maybe we aren't eliminating gatekeeping. Maybe we're just moving it.

Before, the filter was at the entrance. Many people couldn't even get in.

Now, almost anyone can start. Almost anyone can build something that works. But the real filter shows up later: when the project grows, when the behavior isn't obvious, when the error has no copy-pasteable answer, when you have to debug, maintain, design, decide.

That's when a weak foundation starts to show.

So maybe the new gatekeeping isn't at the point of access. It's in the depth.

The editor doesn't filter you. The language doesn't filter you. The syntax doesn't filter you.

Reality filters you.

Closing

The industry doesn't have an access problem. It has a depth problem.

Some of these ideas came from a video by gonz.

We're optimizing for more people to write code, but not necessarily for more people to understand systems.

And those two things are not the same.

The problem isn't that anyone can now write code.

It's that anyone can now create code that nobody should have written.

And maybe the most dangerous part isn't that.

Maybe the most dangerous part is that, at first… it works.