RSS preview of the blog of The Practical Developer

Agents are building their own UIs now. Here's when that's worth doing.

2026-04-30 05:02:06

This is a submission for the Google Cloud NEXT Writing Challenge

During the developer keynote at Google Cloud NEXT '26, a Google Developer Expert (GDE) demoed FinnishIt: an AI-powered Finnish language tutor built on the GenUI SDK for Flutter. You give it a topic, it asks refining questions, then generates a custom deck of interactive flashcards specific to that context. Role-play scenarios shift from text to tap-and-drag word puzzles to fill-in-the-blank modules depending on what the AI determines you need right now.

Every session produces a different interface.

That's the point where I stopped treating A2UI as conference noise and started paying attention.

A2UI is an open standard Google donated to the community at NEXT '26. It lets agents generate UI dynamically at runtime. The GenUI SDK for Flutter is the developer-facing layer that makes it practical to build with. Most coverage either skipped it or described it without asking the more useful question: when does this actually make sense to use?

Where GenUI earns it

FinnishIt works because the interface IS the learning experience.
There is no predetermined layout that would serve a user practicing spoken conversational Finnish the same as one drilling grammar for the YKI citizenship test. The right exercise type, difficulty, and interaction pattern all depend on what the AI assesses the user needs right now. Hardcoding any of that would produce a worse product.

The dynamic generation isn't a feature on top of the app. It is the app.

The same logic applies to onboarding flows, and this is where I think GenUI has untapped potential.

Most onboarding flows are static decision trees in disguise. You collect preferences on screen one, goals on screen two, then route users down one of two or three predetermined paths. The result feels personalized but is just filtered content behind a fixed interface.

Consider a personal finance app. Someone who opens it saying "I want to stop overspending" has a completely different mental model than someone who says "I want to start investing" or "I have irregular income and need to plan around it." Those aren't just different content buckets. They're different journeys, with different concepts to introduce, different decisions to make up front, and a different definition of what "getting to value" even means.

A GenUI-powered onboarding flow could read what a user brings to that first session and generate the next step as a direct response: not a static screen two, but a computed one.

A personal style app makes the case even more clearly, because here the interaction type itself changes, not just the content.

Someone who opens a style app saying "I have a job interview next week" needs an occasion-specific outfit construction flow: clear goal, tight timeline, specific constraints. Someone who says "I'm trying to figure out my personal style" needs a discovery experience: visual-first, exploratory, maybe swipe-on-images or mood board style. Someone who says "I want to build a capsule wardrobe on a budget" might need a wardrobe audit flow that starts with photographing what they already own.

These are not variations on the same form. They require different interface primitives: camera, swipe cards, visual grids, checklists. GenUI earns it here because you genuinely cannot know which one to show until the user tells you what they're trying to do.

The right interaction depends on the context. The context arrives at runtime.
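
To make that concrete, here is a minimal sketch of what "the context arrives at runtime" can look like in code. This is not the GenUI SDK or A2UI API: the UiSpec structure, the primitive names, and the classify_intent stub are all hypothetical, and in a real build the agent would propose the spec and the client would render it with its own component set.

from dataclasses import dataclass, field

@dataclass
class UiSpec:
    # Interface primitives the client should render for this session.
    primitives: list[str] = field(default_factory=list)
    prompt: str = ""

def classify_intent(message: str) -> str:
    # Stand-in for an LLM call; keys off a few phrases for illustration only.
    msg = message.lower()
    if "interview" in msg or "wedding" in msg:
        return "occasion"
    if "figure out" in msg or "personal style" in msg:
        return "discovery"
    return "wardrobe_audit"

def next_step(message: str) -> UiSpec:
    intent = classify_intent(message)
    if intent == "occasion":
        return UiSpec(["date_picker", "constraint_checklist", "outfit_builder"],
                      "Tell me about the event and your constraints.")
    if intent == "discovery":
        return UiSpec(["swipe_cards", "mood_board"],
                      "Swipe on looks you like and I'll learn your taste.")
    return UiSpec(["camera_capture", "item_checklist"],
                  "Photograph what you own and we'll audit from there.")

print(next_step("I have a job interview next week").primitives)

The branching here is deliberately dumb; the point is only that the interface primitives are chosen after the user speaks, not before.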

A decision filter

Before reaching for GenUI, three questions:

Is the interface the experience, or is it a container for a fixed one?
In FinnishIt, the dynamically generated exercise is the product. That's different from a news reader or a task manager, where content arrives through a stable interface. Not every app benefits from a layout that changes each session.

Does the user need to find the same thing in the same place next time?
Adaptive learning, personalized onboarding, style discovery: each session is meant to feel different. An e-commerce checkout, a settings screen, a navigation menu: users build trust and speed through repetition. Those interfaces earn nothing from variation.

Is this an exploratory action, or one that requires confident understanding of what's about to happen?
Payment confirmation, account deletion, anything irreversible: users need to know exactly what they're looking at. Dynamic layout introduces uncertainty at exactly the wrong moment.

Where it doesn't fit

The failure cases aren't about regulation or compliance. They're about what users need from an interface to trust it.

A checkout flow that looks different each time isn't personalization. It's friction.

High-frequency task interfaces derive part of their value from the fact that users can operate them without thinking. Email, task management, booking flows: variability works against that entirely.

There's also a quieter design system concern. Most product teams ship against a component library: specific tokens, spacing rules, interaction patterns. An agent that approximately matches those patterns is not the same as one that respects the contract. That gap shows up in production in ways that are hard to articulate and easy to notice.
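
One way to shrink that gap is to treat the design system as a hard contract that agent output is validated against before rendering, instead of hoping the agent approximates it. A minimal sketch with a hypothetical allow-list; none of these component or token names come from A2UI or any real library:

ALLOWED_COMPONENTS = {"Button", "Card", "TextField", "SwipeDeck"}
SPACING_SCALE = {4, 8, 16, 24, 32}

def validate_spec(spec: dict) -> list[str]:
    # Collect contract violations in an agent-proposed UI spec.
    errors = []
    for node in spec.get("nodes", []):
        if node.get("component") not in ALLOWED_COMPONENTS:
            errors.append(f"unknown component: {node.get('component')}")
        if node.get("padding") not in SPACING_SCALE:
            errors.append(f"off-scale padding: {node.get('padding')}")
    return errors

# Reject or repair before rendering rather than shipping an approximation.
print(validate_spec({"nodes": [{"component": "Hero", "padding": 13}]}))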

The open bet

A2UI and GenUI aren't solutions looking for a problem. There's a real category of app where static UI has always been the wrong answer: the kind where the right interaction depends on context that only arrives at runtime.

FinnishIt is an early, polished example of what that looks like when it's done well. Personalized onboarding, adaptive learning, style discovery: same category.

What I'm watching is whether developers build intuition for where this pattern belongs, or whether the next few years surface a wave of apps that introduced variability in exactly the places their users needed stability.

If you've seen agent-generated UI get it right, or quietly get in the way, I'd like to hear about it.

Manage Your Auth0 Tenants Faster with the Gemini CLI Extension

2026-04-30 05:01:35

Stop switching between your browser and terminal to manage identity. In this video, I'll show you how to integrate the Auth0 MCP Server directly into your Gemini CLI.

Learn how to leverage AI to query your applications, initialize tenants, and manage APIs using natural language commands without leaving your development environment.

What You'll Learn

  • How to find the Auth0 extension on the official Gemini page.
  • The one-command installation process for the Auth0 MCP server.
  • Authenticating and initializing your Auth0 environment via CLI.
  • Using natural language to list applications and create new API resources.


Your AI Agent Isn't Stupid, Your Vector Database Math Is

2026-04-30 05:01:04

Hey DEV community, CallmeMiho here. I’ve been auditing AI architectures all week, and I keep seeing developers blaming their LLMs for "hallucinations" when the model isn't the problem at all. Your AI agent doesn't have amnesia—your database is just failing at basic math.

Retrieval-Augmented Generation (RAG) is the gold standard for AI Agents, but it has a hidden failure mode: Semantic Drift.

Even if you upload the correct documents, your vector database might retrieve completely wrong "fragments." Why? Because the mathematical distance between the user's query and the data is being miscalculated.

This usually happens because of dimensionality mismatch (e.g., trying to stuff 3072-D OpenAI embeddings into a 1536-D index to save money). When you do this, you cause "Manifold Collapse." The semantic distance between concepts is destroyed, and the AI gets fed contextual garbage.
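
A cheap guard against this is to refuse any embedding whose dimensionality does not match the index instead of silently slicing it. A minimal sketch; the index object, its add method, and the 1536 default are hypothetical placeholders:

def upsert_embedding(index, vector, index_dim=1536):
    # Silently doing vector[:index_dim] is exactly the distortion described above;
    # fail loudly and re-embed with a model that natively outputs index_dim dimensions.
    if len(vector) != index_dim:
        raise ValueError(
            f"embedding has {len(vector)} dims but the index expects {index_dim}; "
            "re-embed instead of truncating"
        )
    index.add(vector)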

I made a 50-second breakdown of exactly how this happens.

How to actually fix it:

Stop tweaking your prompts. You cannot fix a database math error with Prompt Engineering.

You need to drop down to the math level and manually audit the Cosine Similarity of your embeddings to ensure the retrieval is mathematically sound.
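
Here is a minimal, dependency-free sketch of that audit: recompute cosine similarity between the query embedding and the chunks the database actually returned, and flag anything suspicious. The 0.3 threshold is an arbitrary illustration, not a universal constant, and retrieved is whatever (id, embedding) pairs your store hands back:

import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def audit_retrieval(query_vec, retrieved, threshold=0.3):
    # retrieved: list of (chunk_id, embedding) pairs returned by the vector DB.
    report = []
    for chunk_id, emb in retrieved:
        if len(emb) != len(query_vec):
            report.append((chunk_id, "DIMENSION MISMATCH"))
            continue
        score = cosine_similarity(query_vec, emb)
        flag = "LOW" if score < threshold else "ok"
        report.append((chunk_id, round(score, 3), flag))
    return report

If your top-ranked chunks score barely above zero against the query, the problem is the retrieval math, not the model.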

If you want to debug your embeddings locally without sending your proprietary vector data to a cloud dashboard, I built a free, 100% offline Vector Distance Calculator at FmtDev.

Stop guessing. Check the math.

Privacy-first mind mapping app. Part 6: Maintainability and Coding Rules

2026-04-30 05:00:49

This is the chapter where I admit something simple: clean code talks are nice, but products are built in history, not in slides.

When MindMapVault was small, I could hold most of it in my head. Then the project grew into frontend work, desktop work, encryption flows, uploads, backend routes, different database paths, deployment scripts, release notes, and a lot of "just fix this one thing" days.

That is when coding rules stopped being theory and became survival.

The rules I kept coming back to were not fancy:

  • keep changes small
  • do not refactor the whole house when the sink is leaking
  • be extra careful in crypto, auth, and storage code
  • keep frontend and backend contracts aligned
  • prefer readable code over clever code

That sounds obvious. It is also the difference between a project that can still move and a project that starts breaking under its own weight.

The code has a visible history, and that is normal

You can see the project history in the codebase. I actually think that is healthy to admit.

MindMapVault started with MongoDB. Later I added Stoolap. Later I added SQL-oriented paths and those _sql.rs files. If you do not stop everything and rewrite the whole project from zero every time the architecture evolves, signs of the older path stay visible.

That is not a moral failure. That is what real software looks like.

You can often read the timeline directly from the repo:

frontend_app/        hosted app UI, editor, crypto helpers, vault flows
frontend_www/        marketing site, release notes, public blog
desktop/src-tauri/   local desktop shell and native packaging
backend/src/         auth, routes, storage, DB adapters, upload flows
scripts/             regression runners, banner rendering, deployment helpers

Then inside the backend there is another layer of history:

  • older MongoDB-oriented paths
  • later SQL and Stoolap paths
  • route files that had to stay practical while the storage model evolved

You can absolutely see that evolution if you read the repo for long enough. I am fine with that. Every long-running project carries some residue of its earlier decisions.

Practical first, perfect never

I like practical things. I like jobs done.

That preference is visible in the code.

Some parts are tidy and stable. Some parts are tactical. Some parts are not how I would design them in a greenfield rewrite. But a real product is not rebuilt from first principles every Tuesday morning.

There is a version of software advice that pretends all good code emerges from calm, linear planning. That is not how most product work happens.

Real code grows under pressure from:

  • feature delivery
  • production bugs
  • changed infrastructure
  • new storage backends
  • packaging and deployment headaches
  • the need to keep existing users working while the internals evolve

So yes, theoretically clean code and real-life code are often different things.

The goal was never to make MindMapVault look like a textbook. The goal was to keep it understandable enough, safe enough, and changeable enough while the product kept moving.

Where the rules were bent

There are places where the project is cleaner than average, and places where it absolutely is not.

The most visible compromises are usually these:

  1. Boundaries are not always perfect

Some concerns leak across layers because the fastest safe fix was not always the prettiest abstraction.

  2. React hygiene is not always textbook

There are places with deliberate lint-rule exceptions or dependency-array compromises because stable behavior in a real flow mattered more than satisfying the purest interpretation of the rule.

  3. Style consistency is uneven

Some modules were written during calmer phases. Others were written during "let me finally finish this *** and go to bed" phases. That difference is visible.

I would rather say that openly than write fake architecture prose around it.

Copilot changed the texture of the code too

Another honest point: code does not look like it did five years ago.

This project carries signs of Copilot use and, more broadly, LLM-assisted development. That is real now. We should stop pretending otherwise.

Sometimes that means faster scaffolding. Sometimes it means a strange but useful first draft. Sometimes it means the code gets a little more chaotic or stylistically mixed than it would with one human colleague writing every line in one voice.

But there is another side to that trade-off.

LLMs are also good at searching through that mess, finding the right file, spotting a broken path, or repairing a repeated pattern faster than a human might. In that sense, the code is not only written differently now. It is also maintained differently now.

I think we have to accept both sides:

  • the Copilot touch is visible
  • some generated structure is less elegant than an ideal hand-crafted version
  • but the same tooling also makes large, messy codebases easier to search, patch, and recover
  • and different coding styles are visible within a single project, even one written by one man's hand (and a robot's hand)

That is part of modern software reality now.

What actually kept the project under control

The workflow was practical, not ceremonial.

  • tsc and production builds were the first fast safety net
  • backend regression runs through WSL and Python scripts were the real "did I break the product" check
  • dependency audits were hygiene, not proof of quality
  • security-sensitive changes were validated by behavior, not by vibes

On the Python side I leaned on repeatable checks like the following (a minimal sketch of one such check appears after these lists):

  • scripts/backend_regression_test.py
  • scripts/attachement_regression_test.py
  • scripts/shared_regression_test.py
  • scripts/production_functional_test.py

For burst and stress behavior I also used helpers like:

  • scripts/load_test_stoolap.py
  • scripts/production_burst_test.py
  • scripts/crud_burst_runner.py
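
I am not reproducing those scripts here, but the shape of such a check is easy to sketch. Everything below (the base URL, the endpoints, the expected keys) is a hypothetical stand-in; the point is a repeatable "did I break the product" probe, not a framework:

import json
import sys
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical local backend

def check(path, expected_keys):
    with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
        body = json.loads(resp.read())
    missing = [k for k in expected_keys if k not in body]
    return path, "OK" if not missing else f"missing {missing}"

if __name__ == "__main__":
    results = [check("/api/health", ["status"]),
               check("/api/maps", ["maps"])]
    for path, status in results:
        print(f"{path}: {status}")
    sys.exit(0 if all(s == "OK" for _, s in results) else 1)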

That does not make the project magically clean. It just means there were repeatable ways to keep reality in view.

Passing lint does not prove good architecture. A green build does not prove good UX. An audit does not prove safe design.

But together, those checks helped keep the project from drifting too far into chaos.

The maintainability standard I actually believe in

For me, maintainability is not "could this win a code-style argument on the internet?"

It is more practical:

  • can I still understand this at 2 AM during a bug
  • can I change one thing without breaking five others
  • can I trace a storage or auth path end to end
  • can I ship a fix without turning it into a rewrite

That is the bar I care about.

MindMapVault is not pristine, and I do not need to pretend it is. It is a real product with a real timeline, visible scars, and a codebase that shows both human shortcuts and AI-era development habits.

I am okay with that.

What matters is that the important parts stay understandable, the dangerous parts stay guarded, and the project keeps moving without collapsing under its own history.

Why I built a CLI tool to kill my own ideas

2026-04-30 04:57:28

We’ve all been there. You get a "brilliant" idea at 11 PM, and by midnight, you’re already git init-ing. You spend the next three weekends wrestling with state management, CSS, and deployment pipelines, only to realize a month later that:

  • The problem wasn't actually that painful for anyone else.
  • A giant incumbent already solves this perfectly.
  • The distribution hurdle is actually a brick wall.

As a product developer, I’ve realized that while AI has made the cost of building converge toward zero, the cost of building the wrong thing remains as high as ever.

To solve my own "idea-to-code" impulsivity, I built Shakedown.

The Concept

Socratic Grilling as a Reasoning Primitive

Shakedown isn’t a project management tool or a simple checklist. It’s a CLI skill designed to act as a "pressure test" for your concepts before you write a single line of code.

Instead of just listing features, it forces you into a Socratic dialogue. It uses LLM-backed reasoning to find the holes in your logic that your "creator’s bias" usually ignores.

How it works

The tool follows a specific "grilling" protocol to move an idea from a vague concept to a definitive signal:

  1. Socratic Grilling
    The system doesn't just agree with you. It asks uncomfortable questions. If you say you’re building "another Task App," it will ask why existing habits aren't enough and where exactly the friction lies in the status quo.

  2. Landscape Research
    It attempts to map out the existing territory. It identifies incumbents and potential "hidden" competitors that you might have missed in your initial excitement.

  3. Differentiation and Adoption
    This is the "so what?" phase. It challenges you on why a user would actually switch. Is the value proposition 10x better, or just 10% different?

  4. The Output: Pursue / Pivot / Kill
    At the end of the session, you get a cold, hard assessment (a minimal sketch of this verdict step follows below):

  • Pursue: The logic holds up; the moat is clear.
  • Pivot: The problem is real, but your current solution is flawed.
  • Kill: Save your weekends. This isn't the one.
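
The actual skill definition lives in the repo linked below; this is only a minimal sketch of the verdict step, to show the shape of the output contract. The ask_llm callable and the three questions are hypothetical stand-ins for the LLM-backed reasoning:

from dataclasses import dataclass

@dataclass
class Verdict:
    decision: str   # "pursue" | "pivot" | "kill"
    rationale: str

QUESTIONS = [
    "Whose concrete pain does this remove, and how do they cope today?",
    "Which incumbent already solves most of this?",
    "Why would anyone switch: 10x better, or 10% different?",
]

def shakedown(idea: str, ask_llm) -> Verdict:
    # ask_llm(prompt) -> str is a stand-in for whatever model backs the skill.
    answers = [ask_llm(f"Idea: {idea}\n{q}") for q in QUESTIONS]
    summary = ask_llm(
        "Given these answers, reply with one word (pursue, pivot, or kill) "
        "on the first line, then a one-line reason:\n" + "\n".join(answers)
    )
    decision, _, rationale = summary.partition("\n")
    return Verdict(decision.strip().lower(), rationale.strip())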

Why use a CLI for this?

Most validation frameworks live in spreadsheets or Notion docs, which feels like "work." By moving this into the CLI, it stays within the developer's natural environment. It’s a "pre-flight check" that lives right next to your compiler.

Feedback

I’ve open-sourced the logic and the skill definitions. If you’re the kind of person who starts coding too fast, I’d love for you to give it a spin and see if it can "kill" your next bad idea before it kills your free time.

Check out the repo here: https://github.com/tracekc/super-pm/blob/main/shakedown/skill.md

How do you usually "shakedown" your ideas before building? Let’s discuss in the comments.

Orchestrating the Future with the Agentic Enterprise

2026-04-30 04:55:56

This is a submission for the Google Cloud NEXT Writing Challenge

The landscape of Artificial Intelligence has shifted. We have moved past the era of simple AI chatbots that merely answer questions to the dawn of AI Agents that achieve goals. At Google Cloud NEXT, the vision of the "Agentic Enterprise" was unveiled—a future where every employee becomes a director of a boundless autonomous workforce. The core of this transformation lies in bridging the gap between the promise of AI and the daily reality of enterprise complexity.

The Three Pillars of an AI Agent

To understand this shift, we must define what makes an agent truly "agentic." Unlike traditional LLMs, an enterprise agent requires three core capabilities (a rough sketch of how they compose follows the list):

  • Context: A deep understanding of unique business information and workflows.
  • Reasoning: The ability to break down complex, multi-step goals into actionable plans.
  • Orchestration: The power to act across different tools and systems to get the job done.
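
None of the products named above expose this as literal code, but the three pillars compose into a familiar loop. A rough, vendor-neutral sketch in which every object (knowledge_base, llm, tools) is a hypothetical placeholder rather than a Google Cloud API:

def run_agent(goal, knowledge_base, tools, llm):
    # Context: ground the goal in business-specific information.
    context = knowledge_base.search(goal)
    # Reasoning: break the goal into an ordered plan of steps.
    plan = llm.plan(goal=goal, context=context)
    # Orchestration: execute each step with the tool it names.
    results = []
    for step in plan:
        tool = tools[step.tool_name]
        results.append(tool.run(step.arguments))
    return llm.summarize(goal=goal, results=results)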

Unified Intelligence: Workspace and Gemini Enterprise

The most significant takeaway from the session is the integration of Workspace Intelligence and Gemini Enterprise.

  • Workspace Intelligence turns scattered emails, chats, and files into a cohesive "Knowledge Graph," providing the real-time context of daily collaboration.
  • Gemini Enterprise acts as the organization's central nervous system, connecting this human context with structured data from CRMs, ERPs, and even third-party platforms like Microsoft 365.

Key Innovations for the Modern Workforce

Several groundbreaking tools were introduced to facilitate this new way of working:

  • Enhanced Agent Designer: A no-code interface that allows anyone to build sophisticated agents using natural language, blending generative creativity with deterministic business logic.
  • Long-Running Agents: These autonomous nodes can operate for hours or days in secure sandboxes, handling mission-critical tasks without constant human supervision.
  • Gemini Projects & Shared Chats: Transitioning AI from a private assistant to a "multiplayer" experience, where teams and agents co-create in a shared, transparent environment.

Trust and Sovereignty: Turning Shadow AI into Managed AI

For the enterprise, power without control is a liability. Google Cloud addresses this through a robust governance framework:

  • Agent Identity: Enforcing the principle of least privilege for every digital worker.
  • Agent Registry & Gateway: Providing IT teams with total visibility and centralized management of every agent in the ecosystem.
  • Data Sovereignty: Ensuring that "your data is your data." Google guarantees that enterprise information is never used to train global models or viewed by humans without explicit permission.

Conclusion: From Managing Tasks to Directing Outcomes

The "Google Cloud NEXT Writing Challenge" highlights a fundamental truth: the tools we choose today define how we innovate tomorrow. By leveraging an open ecosystem—supported by partners like ServiceNow and Oracle—Google is not just adding AI to existing apps; it is building a new foundation for work. We are no longer just doing work; we are directing it.