
Information Wants to be Free

2026-04-29 02:00:00

"On the one hand, information wants to be expensive, because it's so valuable. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time."
— Stewart Brand

"All generally useful information should be free."
— Richard Stallman

Trail is open-source because we believe useful information should be free. That is the reason. Not strategy, not distribution, not growth.

The problem Trail addresses is real. More teams are building with AI every month. More decisions are getting lost in chat. More scope is drifting without anyone noticing. More systems are accumulating undocumented assumptions that will eventually cost someone a painful week to untangle. I have experienced that cost. Trail exists so I do not have to go through it again.

But the decision to make it free isn't driven by the problem. It's based on the belief that if a method makes work more reliable and helps people avoid avoidable failure, there's no reason to restrict access. Putting that behind a paywall doesn't improve the outcome; it simply limits who can benefit from it.

Trail isn't software you buy. It is a framework you implement. The documentation and methodology are licensed under Creative Commons Attribution 4.0, and the scaffold, including folder structure and template files, is MIT. This means you can use it, adapt it, share it, and build on it, personally or commercially, with proper attribution.

The split is intentional. The approach remains flexible and applicable across various industries, while the scaffold can be dropped directly into real projects without legal issues. Both are as permissive as possible while still ensuring credit for the work.

Trail requires no special tools or specific platform. It uses markdown files stored in folders. You can run it with Git, a shared drive, or any system you already use. The framework fits seamlessly into your environment; it does not replace it.

If Trail works for you, use it. If it breaks, you'll see exactly where and why—that's the point.

Web: trail.venturanomadica.com
GitHub: github.com/Ventura-Nomadica/trail-framework

Photo by Will van Wingerden on Unsplash

I wanted jq with memory, time ranges, and filters. So I built logdive

2026-04-29 02:00:00

Your app is in production. Something broke at 2am. Your options are:

  • grep through a rotated log file, squinting at terminal output.
  • Chain together half a dozen jq pipes until the command line becomes unreadable.
  • Page an SRE to query your observability stack, assuming you have one.
  • Spin up Loki or Elastic locally, spend two hours on config, and then do the actual investigation.

All four of these suck. Either you're limited to flat text tooling, or you're paying for infrastructure complexity you don't need.

logdive is what sits in the gap. It's a single Rust binary. You drop it anywhere, point it at a log file or pipe Docker output into it, and you get a fast, queryable index on your local machine. No daemon. No cloud. No YAML. Just cargo install logdive.

# Ingest logs from a file or pipe from stdin.
logdive ingest --file ./logs/app.log
docker logs my-container | logdive ingest --tag my-container

# Query the index.
logdive query 'level=error AND service=payments last 2h'
logdive query 'message contains "timeout"' --format json | jq

# Inspect what you've indexed.
logdive stats

# Optionally expose a read-only HTTP API for remote querying.
logdive-api --db ./logdive.db --port 4000
curl 'http://127.0.0.1:4000/query?q=level%3Derror&limit=100'

That's the whole product surface.

Why logdive exists

Every backend engineer has hit the wall this is built for: your application emits perfectly good structured JSON logs. Tools for querying that JSON locally are stuck in the extremes:

  • jq is for a single file, one-shot, no memory, no time ranges, no filters-across-files.
  • Loki, Datadog, Elastic, Splunk all demand infrastructure, cost, and configuration that's overkill for a side project, small team, or personal investigation.

The target user is a backend engineer who wants jq with memory, filters, and time ranges — without YAML files, without a running daemon they didn't ask for, without a monthly bill.

Rust makes this credible in a way no other language quite does: a single self-contained binary with no runtime, zero-copy parsing, SQLite bundled directly into the binary, and real concurrency for ingestion. This is the kind of tool Rust is genuinely good at.

Who this is for (and who it isn't)

Good fit:

  • Backend engineers debugging production incidents from local log copies.
  • Small teams without a dedicated observability budget.
  • Anyone who's ever built a 4-stage jq pipeline and wished it was searchable afterward.
  • Folks running Docker locally who want docker logs my-container | logdive ingest and instant querying.
  • CI pipelines that need to grep through structured output of a previous step.

Bad fit (and I'll be explicit about this):

  • Multi-machine, networked indexes. logdive is single-host by design.
  • Real-time log tailing / tail -f style follow mode. Not in v0.1.0.
  • Anything needing authentication on the HTTP endpoint. The v1 API assumes the network layer handles access control.
  • Massive enterprise-scale log volumes. SQLite handles a lot, but if you're indexing 100GB/day, you want Loki.
  • Non-JSON log formats (plaintext, logfmt, syslog). v1 is JSON-only.

The whole scope is deliberately small. v0.1.0 ships what a side project or small team needs, nothing more.

The query language

Small enough to fit in your head, expressive enough to be useful:

level=error
level=error AND service=payments
message contains "database timeout"
level=error last 2h
tag=api AND status > 499 since 2026-04-15
user_id=4812 AND duration_ms > 500

Operators: =, !=, >, <, CONTAINS. Time ranges: last Nm/Nh/Nd or since <datetime>. Clauses chain with AND. Known fields (timestamp, level, message, tag) hit SQLite indexes directly. Unknown fields go through json_extract() on a JSON blob — slower but works for arbitrary JSON shapes.
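
To make the two query paths concrete, here is a minimal Python sketch of how a single clause might compile to SQL. This is illustrative only: logdive itself is Rust, and the table and column names here are assumptions, not the actual implementation.

# Illustrative sketch, not logdive's actual (Rust) code.
KNOWN_FIELDS = {"timestamp", "level", "message", "tag"}  # real indexed columns

def clause_to_sql(field: str, op: str, value: str) -> tuple[str, list[str]]:
    """Compile one simple clause (=, !=, >, <) to a SQL fragment plus bind params."""
    if field in KNOWN_FIELDS:
        # Known fields are real columns, so SQLite can use their indexes.
        return (f"{field} {op} ?", [value])
    # Unknown fields fall back to json_extract() over the stored JSON blob.
    return (f"json_extract(raw, '$.' || ?) {op} ?", [field, value])

print(clause_to_sql("level", "=", "error"))   # ('level = ?', ['error'])
print(clause_to_sql("user_id", ">", "4000"))  # takes the json_extract path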

No OR in v0.1.0 — it's the single biggest v1 non-goal. AND covers the dominant query pattern, and adding OR requires a two-level grammar plus precedence handling that would roughly double the parser. Deferred to v2 deliberately.

Under the hood

For readers who like the implementation details:

  • Three-crate Cargo workspace. logdive-core is a pure library (parser, indexer, query engine — no I/O at the module level), logdive is the CLI binary, and logdive-api is the HTTP server binary. Each is independently publishable.
  • SQLite via rusqlite with the bundled feature. Zero infrastructure, battle-tested, ships inside the binary at ~500KB.
  • Hybrid storage. Known fields (timestamp, level, message, tag) are real indexed columns. Everything else is stored in a JSON blob queryable via SQLite's json_extract(). This is the only way to handle arbitrary JSON shapes without a schema-bound design (see the sketch after this list).
  • Hand-written recursive descent query parser. ~200 lines of Rust enums. No parser-combinator dependency. Better error messages than generated parsers, and honestly, it was one of the most satisfying parts of the project to write.
  • Blake3 row hashing for deduplication. INSERT OR IGNORE on a unique hash column means re-ingesting a file (or dealing with log rotation) is free. No duplicate rows. The hash is cheap — negligible per-line cost.
  • Batched inserts at 1000 rows per transaction. Standard SQLite throughput pattern.
  • Axum HTTP API. Read-only via SQLITE_OPEN_READ_ONLY, blocking SQLite work wrapped in tokio::task::spawn_blocking so it doesn't block Tokio's worker threads, graceful shutdown on Ctrl-C and SIGTERM.
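
The hybrid-storage and dedup points above can be sketched in a few lines. This is a Python/sqlite3 illustration under assumed column names, with the blake3 PyPI package standing in for the Rust crate; it is not logdive's actual schema.

import json
import sqlite3

import blake3  # pip install blake3; stand-in for the Rust blake3 crate

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (
    hash      TEXT UNIQUE,  -- Blake3 of the raw line, used for dedup
    timestamp TEXT,         -- known fields are real, indexable columns...
    level     TEXT,
    message   TEXT,
    tag       TEXT,
    raw       TEXT          -- ...everything else stays in a JSON blob
);
CREATE INDEX idx_level ON logs(level);
CREATE INDEX idx_timestamp ON logs(timestamp);
""")

def ingest(line, tag=None):
    row = json.loads(line)
    # INSERT OR IGNORE on the unique hash column makes re-ingestion a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO logs VALUES (?, ?, ?, ?, ?, ?)",
        (blake3.blake3(line.encode()).hexdigest(),
         row.get("timestamp"), row.get("level"), row.get("message"), tag, line),
    )

line = '{"timestamp":"2026-04-29T02:00:00Z","level":"error","message":"boom","user_id":4812}'
ingest(line)
ingest(line)  # duplicate: silently ignored
print(conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0])  # 1
print(conn.execute(
    "SELECT json_extract(raw, '$.user_id') FROM logs WHERE level = 'error'"
).fetchone()[0])  # 4812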

The full architecture is documented in the repo's README.

Performance

Representative numbers on an Acer Nitro 5 laptop, measured via criterion:

Operation | Throughput / Latency
Ingestion, batched insert (10k rows) | ~210k lines/sec
Ingestion, parse + insert end-to-end (10k rows) | ~166k lines/sec
Query on known field, empty result (100k rows) | ~17 μs
Query on known field, 25% match (100k rows, LIMIT 1000) | ~39 ms
Query on JSON field, 25% match (100k rows, LIMIT 1000) | ~3.6 ms
Query on JSON field, 0% match (full scan, 100k rows) | ~68 ms
CONTAINS full-table scan (100k rows) | ~36–40 ms
3-clause AND chain (100k rows) | ~22 ms

Release binaries at 3.7 MB (logdive) and 4.1 MB (logdive-api) — well under the 10 MB target, thanks to LTO + strip + panic=abort in the release profile.

Run cargo bench in the repo to get your own baseline.

Tradeoffs worth being honest about

A few design decisions have real downsides that users should know:

Timestamps compared as lexical TEXT. This is correct for ISO-8601-shaped timestamps (they sort chronologically when compared as strings), but any exotic timestamp format will silently misorder. Default timestamps from modern structured loggers are ISO-8601, so in practice this is rarely a problem — but it's a real sharp edge.
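
A quick self-contained illustration of where lexical comparison is safe and where it silently breaks; the non-ISO format below is a hypothetical example, not something logdive emits:

# ISO-8601 timestamps of equal precision sort chronologically as plain strings.
iso = ["2026-04-28T23:59:59Z", "2026-04-29T02:00:00Z", "2026-05-01T00:00:00Z"]
assert sorted(iso) == iso

# A US-style format misorders silently: April 2026 sorts before December 2025.
us_style = ["12/01/2025 08:00", "04/28/2026 23:59"]
print(sorted(us_style))  # ['04/28/2026 23:59', '12/01/2025 08:00'] -- wrong order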

No index on json_extract expressions. Queries on unknown JSON fields fall back to full table scans. Scanning 100k rows takes ~68 ms, which is still fast, but if you're hammering the same unknown field constantly, it's roughly 1000x slower than a known-column query. A future version could promote frequently queried JSON fields to real columns.

Single-host only. There's no clustering story. If you need distributed query across machines, you want Loki or Elastic.

No authentication on the HTTP API. Deliberate for v1. If you expose logdive-api beyond localhost, put a reverse proxy with auth in front of it. The binary defaults to binding 127.0.0.1 for a reason.

Install and try it

From crates.io:

cargo install logdive logdive-api

From prebuilt binaries: grab the tarball for your platform from the GitHub Releases page. Linux x86_64 and macOS arm64 are built on every tag push.

From source:

git clone https://github.com/Aryagorjipour/logdive
cd logdive
cargo build --release

MSRV: Rust 1.85 (edition 2024). Dual-licensed MIT OR Apache-2.0.

Try the included examples:

logdive --db /tmp/demo.db ingest --file examples/app.log
logdive --db /tmp/demo.db ingest --file examples/nginx.log
logdive --db /tmp/demo.db stats
logdive --db /tmp/demo.db query 'level=error AND service=payments'

Call for contributions

v0.1.0 is deliberately small, but there's a clear set of high-value v2 features that would benefit hugely from community help. If you want to contribute, these are genuine needs:

OR operator in the query language. The single most-requested feature implied by v1's scope. Extends the parser to handle a two-level grammar (clauses joined by OR, OR-groups joined by AND) and the SQL generator to emit parenthesized disjunctions. Non-trivial but well-scoped.
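
For anyone picking this up, here is one possible shape for that two-level grammar, sketched as Python dataclasses rather than the project's Rust enums; all names here are hypothetical:

from dataclasses import dataclass

# Proposed v2 grammar, per the description above:
#   query    := or_group (AND or_group)*
#   or_group := clause (OR clause)*
@dataclass
class Clause:
    field: str
    op: str
    value: str

    def to_sql(self) -> str:
        return f"{self.field} {self.op} ?"

@dataclass
class OrGroup:
    clauses: list

    def to_sql(self) -> str:
        # Parenthesized disjunction keeps AND/OR precedence explicit in the SQL.
        return "(" + " OR ".join(c.to_sql() for c in self.clauses) + ")"

@dataclass
class Query:
    groups: list

    def to_sql(self) -> str:
        return " AND ".join(g.to_sql() for g in self.groups)

q = Query([
    OrGroup([Clause("level", "=", "error"), Clause("level", "=", "warn")]),
    OrGroup([Clause("service", "=", "payments")]),
])
print(q.to_sql())  # (level = ? OR level = ?) AND (service = ?)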

Non-JSON log format support. Plaintext and logfmt are the obvious next formats. Would plug in as additional parser implementations alongside parse_line in logdive-core.

Follow mode (-f / tail-and-index). Watch a log file for new lines and ingest them as they appear. Good use of tokio::fs and notify crate patterns.

A browser UI. The HTTP API is ready for one — someone with frontend chops could build a single-page React/Svelte/HTMX UI that talks to logdive-api and gives people a browser-based query interface.

Generated columns for frequently-queried JSON fields. The big performance win. Would let users mark certain JSON fields as "promote to indexed column" and get known-field query performance for those.

Benchmarks on more hardware. If you run the existing cargo bench suite on your machine, an issue/PR updating the README's Performance section with a broader sample would be genuinely useful.

Docker image for the HTTP API. Dockerfile for logdive-api with a volume mount for the index database. Natural next step for users who want to run the API as a service.

The repo has CI, benchmarks, clean test coverage, and a documented contribution workflow. Issues and pull requests at github.com/Aryagorjipour/logdive.

A note on context

logdive started as the final project in a Rust learning journey. The framing mattered — I wanted a project that was small enough to finish, demanding enough to exercise real Rust (parsers, SQLite, async, concurrency, CLI, HTTP), and useful enough that I'd actually keep using it afterward.

What I underestimated: how much of the effort lives in the parts that aren't writing code. Setting up a clean workspace. Choosing the right abstractions between core and binaries. Writing benchmarks that actually measure what you think they measure. Testing an HTTP server with tower::ServiceExt::oneshot. Packaging a three-crate workspace for crates.io when one crate depends on another and you're publishing for the first time. Each of these had at least one subtle gotcha.

The project is open source because the next person hitting the "jq vs Datadog" wall might as well benefit from it, and because Rust has given me enough that I want to give something back.

About

Arya Gorjipour — backend engineer, Rust learner, logdive maintainer.

Issues, bug reports, and pull requests welcome. If you end up using logdive to debug a real production incident, I'd love to hear about it.

From Code Generation to Message Injection: Richard Seroter's AI Evolution (and What It Means for Us)

2026-04-29 01:59:49

From Code Generation to Message Injection: Richard Seroter's AI Evolution (and What It Means for Us)

By HARDIN

Richard Seroter has a habit of building things two years before the rest of us are ready to talk about them. In 2024, he asked: can we store prompts in source control and let an LLM generate the entire app at build time? Last week, he asked something newer and, honestly, a bit more unsettling: should we call LLMs directly from our messaging middleware?

One experiment is about generating software. The other is about injecting intelligence into the veins of your data flow. Put them side by side, and you don't just see the evolution of one guy's thinking. You see where the whole industry is quietly heading—whether we're ready or not.

I've been dissecting Google Cloud Next '26 for weeks. But sometimes the most revealing stuff isn't in the keynotes. It's buried in a blog post where someone asks, "Should we?" and leaves the answer hanging.

Let's pull that thread.

1. The 2024 Experiment: Prompts as Source Code

I already covered this in depth in my previous article, but the core architecture was beautifully simple:

Source Control (prompts.json)  
    │  
    ▼  
Spring AI + Gemini 1.5 Flash  
    │  
    ▼  
Generated code (Node.js/Python + Dockerfile)  
    │  
    ▼  
Cloud Run deployment

The philosophy: treat prompts as the source of truth, and let AI do the implementation. Richard even built a working GitHub repo to prove it. The AI pumped out everything from index.js to Dockerfile to package.json. The output was non-deterministic, untested, and completely unregulated. Richard knew this. He called it "bonkers" and explicitly warned against using it for real workloads.

You can explore the entire project in the rseroter/Gemini-code-generator repo on GitHub. It's a Spring Boot app that uses Spring AI, Google Gemini 1.5 Flash, and various Google Cloud services to take prompts, generate code, and publish the generated app to Cloud Run. You can customize and use it without Google Cloud services, and presumably with a different model; see Richard's blog post for the full walkthrough.
But here's the thing he actually proved, even if he didn't shout it: if you can describe what you want precisely enough, an LLM can turn intent into executable artifact. That idea stuck. That's the seed of agentic software engineering you see everywhere now.

Honestly, re-reading his 2024 post gave me a weird nostalgia—like finding the first commit of a framework that now runs half the internet.

2. The 2026 Experiment: AI Inference Inside the Message Bus

Fast forward two years. Richard's latest post explores a new Google Cloud feature: AI Inference SMT (Single Message Transform) for Pub/Sub.

The architecture is almost embarrassingly simple:

Pub/Sub Topic  
    │  
    ▼  
AI Inference SMT (calls Gemini)  
    │  
    ▼  
Pub/Sub Subscription (enriched/altered message)

No custom subscriber. No Cloud Run service. No boilerplate code. You configure a template that tells the SMT what to ask the LLM, and every message passing through Pub/Sub gets enriched, translated, summarized, or routed. It feels like magic—and like most magic, it probably comes with a hidden price tag.

The philosophy: intelligence as infrastructure, not application code. Instead of writing a service that calls an LLM, you declare, "this message topic shall be intelligent," and the platform handles the rest.

Richard's post raises a question that's going to age like fine wine: is this a good idea? He compares it to storing business logic in database triggers—powerful, but a maintenance nightmare if abused. He's cautious. I'd say he's right, and maybe even understating the danger. More on that in a bit.

3. Architecture Comparison: Two Sides of the Same Coin

At first glance, these two experiments look like completely different animals. They're not. They're two halves of the same coin: declarative AI-driven automation.

Dimension | 2024 (Code Generation) | 2026 (Message Injection)
Trigger | CI/CD build pipeline | Every single message arrival
Scope | Static artifact generation | Dynamic content transformation
Output | Application files | Modified messages
Latency | Minutes (build time) | Milliseconds (real-time)
Governance | Manual review of generated code | Template configuration + IAM
Risk | Non-deterministic builds, no testing | Unexplained message mutations, AI hallucinations in the data plane

The 2024 experiment automates what gets deployed. The 2026 experiment automates what happens to data in flight. Put them together, and you've got a pipeline where intent flows through your message bus and comes out the other side as a running service. That's not science fiction. That's today.

4. Building the Bridge: The Agentic Message-to-Deployment Pipeline

So, naturally, I started sketching. What happens if we connect these two ideas? I call the result the Agentic Message-to-Deployment Pipeline.

The Scenario

A business unit sends a single message: "We need a microservice that translates customer feedback from any language to English and stores it in Firestore."

How It Would Work

  1. Trigger: Business user sends a natural language request to Pub/Sub.
  2. Enrichment: AI Inference SMT adds architectural constraints and routes to Generator Agent.
  3. Generation: Generator Agent (ADK + Gemini 3.1 Pro) builds full application code using Memory Bank for context.
  4. Evaluation: Evaluator Agent runs tests and security scans; Red Agent attacks, Green Agent fixes.
  5. Deployment: Deployer Agent ships to Cloud Run via MCP.
  6. Notification: A message is published back to Pub/Sub confirming the live service URL.

The whole chain starts and ends with a message. The same Pub/Sub topic where requests arrive is also where results are published. The intelligence doesn't live in a monolithic pipeline script—it lives in the SMT layer and the agent mesh.

When I sketched this out, I sat back and stared at it for a minute. It's elegant. It's terrifying. It's probably where we're all heading.

5. The Dark Side: Why This Should Honestly Keep You Up at Night

I've been writing dark jokes about AI automation for a month. But this combination genuinely made me pause.

1. Message mutations are invisible bugs.
When an SMT silently modifies a message using AI, how do you even start debugging? The original message is gone. The AI's decision is a black box. If the LLM hallucinates a translation or misinterprets a feature request, you ship the wrong service. Try explaining that one to the business.

2. The attack surface just exploded.
In 2024, a malicious prompt could generate bad code. In 2026, a malicious message can trigger an entire agentic deployment pipeline. Someone with Pub/Sub publish permissions can potentially spin up new services, modify data, or exfiltrate information—all through natural language. "Please delete all production databases." Said politely. Executed instantly.

3. The platform is the developer now.
We already saw that 75% of Google's new code is AI-generated. With AI Inference SMT plus agentic deployment, that percentage doesn't stop at 75. It inches toward 100. The developer becomes a reviewer, a policy setter, a person who says "yes" or "no" to an agent's proposal. I wrote about this role shift in my Developer Keynote analysis, and honestly, it's accelerating faster than I expected.

6. What Richard's Evolution Tells Us About Our Own Careers

Richard Seroter's journey from 2024 to 2026 isn't just his—it's a mirror for the whole industry.

  • 2024: "Let's see if AI can write code at all."
  • 2025: "Let's put AI in the CI/CD pipeline."
  • 2026: "Let's put AI in the data plane. And the deployment plane. And the governance plane."

I respect the hell out of Richard for publishing both experiments with zero pretense. The 2024 repo is humble—4 commits, 4 stars. The 2026 blog post is cautious, full of "should we?" questions. That kind of intellectual honesty is getting rare in tech evangelism. He's not selling. He's exploring. There's a difference.

But don't mistake the modesty. These two experiments together form a blueprint. One that says: AI should generate code, and AI should decide when to generate code, and AI should route the decision to generate code through your message bus. It's turtles all the way down, and the turtle is Gemini.

7. What You Should Actually Do About This

  1. Read Richard's posts. Both of them. They're short, honest, and technically precise. It'll take you maybe 20 minutes total.
  2. Experiment with AI Inference SMT, but start stupidly small. Simple enrichments. Nothing critical. Get a feel for the footguns before you aim them at production.
  3. Sketch your own "bridge." What happens when your team's Pub/Sub topic can trigger an agentic deployment? What policies need to exist before that's even remotely safe? Draw it out. The act of drawing it will reveal the gaps.
  4. Never forget the Red Agent. Whatever you build, make something try to break it. The Green Agent is useless without a Red Agent keeping it honest. I learned that from Next '26, and I'm going to keep saying it until it sticks.

What do you think? Are AI Inference SMTs a brilliant abstraction or a future maintenance nightmare? Would you let a message bus trigger your deployment pipeline? Drop your experience—or your darkest prediction—in the comments. I read them all, and honestly, some of you scare me more than the AI does.

Sources

Richard Seroter's Work

Google Cloud Documentation

My Previous Analysis






The Legacy of Bamini: The Font That Defined Tamil Digital Typography

2026-04-29 01:55:16

Before the era of universal Unicode standards, typing in Tamil on a computer was a complex challenge. In the early 1990s, one font emerged as the definitive solution for millions of users in Sri Lanka and Tamil Nadu: Bamini.

Developed by S. Kalyanasundaram, Bamini wasn't just a typeface; it was a bridge that brought the ancient Tamil script into the digital age.

Why Was Bamini So Popular?
In the 1990s, operating systems didn't naturally "speak" Tamil. Bamini solved this by using an ASCII-based mapping system.

By mimicking the layout of traditional Tamil typewriters, it allowed professional typists to transition to PCs without any extra training. If you knew how to use a physical typewriter, you already knew how to type in Bamini.

Key Features of the Bamini Font
High Readability: Designed with bold, block-like strokes, it was optimized for the low-resolution screens and dot-matrix printers of the '90s.

The "jkpo;" Logic: Because it’s a legacy font, typing "jkpo;" on your QWERTY keyboard magically appears as "தமிழ்" on your screen.

DTP Standard: It became the de facto choice for newspapers, school textbooks, and government forms across Northern Sri Lanka.

The Technical Side: How It Works
Bamini uses a proprietary encoding scheme in the extended ASCII range (128–255). Unlike modern Unicode fonts (like Latha or Nirmala), Bamini text requires the specific font file to be installed on the viewer's device. Without it, the text simply looks like a string of random Latin characters.
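
To see what "requires the specific font file" means in practice, here is a toy Python sketch of legacy-to-Unicode conversion. The single mapping entry is the article's own example; a real converter needs the full Bamini glyph table, which runs to hundreds of entries:

# Toy converter; a complete Bamini-to-Unicode table is far larger.
BAMINI_TO_UNICODE = {
    "jkpo;": "தமிழ்",  # the article's example keystroke sequence
}

def bamini_to_unicode(text: str) -> str:
    # Replace longest legacy sequences first so shorter keys can't clobber them.
    for legacy in sorted(BAMINI_TO_UNICODE, key=len, reverse=True):
        text = text.replace(legacy, BAMINI_TO_UNICODE[legacy])
    return text

print(bamini_to_unicode("jkpo;"))  # தமிழ்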

Bamini in the Modern World
While the world has moved to Unicode for web and mobile compatibility, Bamini remains a vital tool for:

Legacy Archives: Thousands of historical documents and older PDFs are still stored in Bamini.

Professional Printing: Many high-speed typists still prefer the Bamini layout for its ergonomic efficiency.

Graphic Design: It carries a "retro" aesthetic that is highly sought after for specific branding projects.

Looking to Download the Bamini Font?
If you need to view legacy documents, convert old files, or simply want that classic Tamil look for your next design project, you can download the full family of Bamini fonts (including variations from 01 to 87).

Click Here to Download Bamini Tamil Font

AnalogJS 2.5: Runtime i18n, Fast Compilation Mode, Hierarchical Content, and more!

2026-04-29 01:53:29

We are excited to announce the release of Analog 2.5! This release introduces first-class runtime i18n support, a new fast compilation mode, hierarchical content with recursive prerendering, and a new AI integrations guide. Let's dive in.

Runtime i18n 🌍

Analog 2.5 adds first-class runtime internationalization built on Angular's $localize. With a single build, your app can serve every supported locale, with the active locale detected on both the server and the client.

Setup

Install @angular/localize and import the polyfill in your application entry:

// src/main.ts
import '@angular/localize/init';

Configure the supported locales on the analog() plugin in vite.config.ts. This enables locale detection on SSR and exposes the values as build-time globals so the application config doesn't have to repeat them:

// vite.config.ts
import { defineConfig } from 'vite';
import analog from '@analogjs/platform';

export default defineConfig(() => ({
  plugins: [
    analog({
      i18n: {
        defaultLocale: 'en',
        locales: ['en', 'fr', 'de'],
      },
    }),
  ],
}));

Then provide the runtime translation loader via provideI18n():

// src/app/app.config.ts
import { ApplicationConfig } from '@angular/core';
import { provideFileRouter } from '@analogjs/router';
import { provideI18n } from '@analogjs/router/i18n';

export const appConfig: ApplicationConfig = {
  providers: [
    provideFileRouter(),
    provideI18n({
      loader: async (locale) => {
        const translations = await import(`../i18n/${locale}.json`);
        return translations.default;
      },
    }),
  ],
};

Mark templates with the standard Angular i18n attribute:

<h1 i18n="@@greeting">Hello</h1>

On SSR, the locale is detected from the URL path prefix (/fr/about) or the Accept-Language header. On the client, it's read from window.location.pathname. The loader function you provide is called once at startup with the active locale, and $localize handles the rest.

Message Extraction

The i18n plugin option also accepts an extract config that scans the compiled output for $localize tagged templates and writes a translation source file at the end of a production build:

analog({
  i18n: {
    defaultLocale: 'en',
    locales: ['en', 'fr', 'de'],
    extract: {
      format: 'json',
      outFile: 'src/i18n/messages.json',
    },
  },
})

format supports json, xliff, xliff2, and xmb.

Switching Locales

A small helper makes runtime locale switching a one-liner:

import { Component } from '@angular/core';
import { injectSwitchLocale } from '@analogjs/router/i18n';

@Component({ /* ... */ })
export class LanguagePickerComponent {
  switchLocale = injectSwitchLocale();
}

Calling switchLocale('fr') navigates to the equivalent /fr URL and re-evaluates templates with the new translations. Check out the i18n docs for the full guide.

Fast Compilation Mode ⚡️

Analog 2.5 introduces fastCompile, a new option on the analog() Vite plugin that enables a Vite-native Angular compilation pipeline combined with OXC for fast parsing. The most noticeable improvements are in cold-start and HMR times.

Enable it in your Vite config:

// vite.config.ts
import { defineConfig } from 'vite';
import analog from '@analogjs/platform';

export default defineConfig(() => ({
  plugins: [
    analog({
      fastCompile: true,
    }),
  ],
}));

A companion fastCompileMode option controls the compilation output:

  • 'full' (default): Emits final Ivy definitions for application builds.
  • 'partial': Emits partial declarations for library publishing.

analog({
  fastCompile: true,
  fastCompileMode: 'partial',
})

Fast compilation mode lets you offload type checking to a separate process, keeping the development workflow streamlined, especially when building content-focused sites and applications.

Hierarchical Content 📚

Support for hierarchical content and nested content directories has been improved to handle more complex documentation structures:

src/content/docs/
  getting-started/
    welcome.md
    first-upload.md
  assets/
    upload.md

Content resolvers in Analog previously handled only top-level folders and slugs in the src/content directory; they are now flexible enough to handle nested subdirectories with the existing APIs.

Slash-containing slugs

A catch-all route now resolves nested files correctly:

// src/app/pages/docs/[...slug].page.ts
import { Component } from '@angular/core';
import { injectContent, MarkdownComponent } from '@analogjs/content';

@Component({
  standalone: true,
  imports: [MarkdownComponent],
  template: `
    @if (post(); as post) {
      <analog-markdown [content]="post.content" />
    }
  `,
})
export default class DocsPageComponent {
  post = injectContent({ subdirectory: 'docs' });
}

Recursive prerendering

A new recursive option was added, and each content file now exposes a relativePath so transforms can identify identically-named files across subdirectories:

// vite.config.ts
analog({
  prerender: {
    routes: async () => [
      {
        contentDir: 'src/content/docs',
        recursive: true,
        transform: (file) => {
          const segment = file.relativePath
            ? `${file.relativePath}/`
            : '';
          return `/docs/${segment}${file.name}`;
        },
      },
    ],
  },
})

For backward compatibility, recursive defaults to false.

AI Integrations Guide 🤖

Analog 2.5 ships with a new AI integrations guide that walks through wiring LLM providers into Analog's API routes — streaming responses, server-side keys, and the usual use cases. If you're adding AI features to an app, that guide is the starting point.

We are also looking into adding official skills for Analog features and conventions.

Upgrading

To upgrade to Analog 2.5, run:

ng update @analogjs/platform@latest

If you're using Nx, run:

nx migrate @analogjs/platform@latest

For the full list of changes, see the changelog.

Partner with Analog 🤝

Continued development of Analog would not be possible without our partners and community. Thanks to our official deployment partner Zerops, code review partner CodeRabbit, longtime supporters Snyder Technologies, Nx, and House of Angular, and the many other backers of the project.

Find out more information on our partnership opportunities or reach out directly to partnerships[at]analogjs.org.

Join the Community 🥇

If you enjoyed this post, click the ❤️ so other people will see it. Follow AnalogJS and Brandon Roberts on Bluesky, and subscribe to my YouTube Channel for more content!

I Built Skills That Let AI Agents Query 14 European Government Registries

2026-04-29 01:50:50

TL;DR

  • I built 14 Apify actors that scrape official government registries across Poland, Spain, Austria, and France
  • To make them accessible from AI coding agents, I packaged them as 6 open-source skills in the getregdata repo
  • One install command gives you access across 45+ AI agents - Claude Code, GitHub Copilot, Cline, Codex, Amp, and more
  • The skills provide registry-specific analysis frameworks and let you query live data through the Apify API
  • Install: skills add Nolpak14/getregdata -g -y

Why Government Registry Data Is Hard

If you have ever tried to programmatically access European government registries, you know the pain. Each country has its own set of portals, each with its own quirks:

  • Poland's KRS (National Court Register) deliberately anonymizes board member names in its API - returning L****** instead of full names - while the same data is available unredacted in PDF extracts from the same portal
  • Poland's CRBR (Beneficial Owners Register) has no API at all - just a CAPTCHA-protected web form that accepts one company at a time
  • Poland's EKW (Land Registry) blocks all datacenter IPs - you need residential Polish proxies to access property ownership data
  • Spain's BORME (Corporate Gazette) publishes 500+ corporate acts daily as unstructured gazette text with no machine-readable format
  • Austria's Ediktsdatei (Insolvency Publications) requires an IWG license for the official API - but the public web portal has no such restriction
  • France's Societe.com aggregates data from INSEE, INPI, and BODACC behind aggressive anti-bot measures

Each registry is a separate engineering challenge. Different authentication, different anti-scraping measures, different data formats, different legal frameworks.

The Actor Suite: 14 Registries, 4 Countries

Over the past few months I built Apify actors for all of these. Each actor handles the specific technical challenges of its registry and returns clean, structured JSON.

Poland (9 registries)

Registry | What You Get | Records
KNF - Financial Supervision Authority | Payment institutions, e-money issuers, lending companies | 75,000+
MSiG - Court Gazette (Monitor Sadowy) | Bankruptcy declarations, restructuring, liquidation notices | 2001-present
KRS - National Court Register | Full non-anonymized board member names from PDF extracts | 800,000+ companies
KRZ - National Debtor Registry | Bankruptcy, restructuring, enforcement proceedings | 9 search modes
eKRS - Financial Statements | Balance sheets, P&L, assets, equity, revenue, net profit | Official filings
EKW - Land Registry | Property ownership, mortgages, easements, restrictions | 25 million entries
UOKiK - Consumer Protection | Court-ruled prohibited contract clauses | 7,500+ clauses
CRBR - Beneficial Owners | Ultimate beneficial owners (UBO) for AML compliance | All KRS entities
BDO - Waste Registry | Waste management entity registrations and permits | 674,000+ entities

Spain (2 registries)

Registry | What You Get
BORME - Corporate Gazette | Daily incorporations, officer appointments, capital changes, dissolutions
Registro Mercantil - Company Directory | NIF, officers, CNAE codes, legal form, IRUS, EUID

Austria (2 registries)

Registry | What You Get
Ediktsdatei - Insolvency Publications | Bankruptcy, restructuring, debt regulation proceedings - no IWG license needed
WKO - Chamber of Commerce Directory | 620,000+ businesses with phone, email, website, trade licenses

France (1 registry)

Registry | What You Get
Societe.com - Company Data | SIREN, directors with roles, financials, shareholders, subsidiaries, brands

All actors are pay-per-result with no subscriptions. Pricing ranges from $0.003 to $0.01 per result depending on the registry. A free Apify account includes $5 credits - enough for 100-1,600 lookups.

From Actors to AI Agent Skills

The actors work great when you know exactly which registry to query. But for many users, the real question is: "I need to check this Polish company - where do I even start?"

That is where skills come in. Skills are modular packages that give AI coding agents specialized knowledge. When you install a skill, your agent learns how to handle domain-specific tasks - in this case, querying European government registries.

I packaged the 14 actors into 6 skills organized by workflow:

Skill | What It Does | Registries Used
regdata | Router - identifies which registry you need | All 14
regdata-kyc-aml | Beneficial ownership, financial licenses, board members | CRBR, KNF, KRS, Societe.com, WKO, Spain Dir
regdata-credit-risk | Insolvency proceedings, court gazette, financial statements | KRZ, MSiG, eKRS, Ediktsdatei, BORME
regdata-property | Polish land registry - ownership, mortgages, encumbrances | EKW, KRS, CRBR
regdata-compliance | Prohibited contract clauses, waste management registrations | UOKiK, BDO
regdata-lead-gen | Company directors, business contacts, new incorporations | KRS, WKO, Spain Dir, Societe.com, BORME

Each skill includes:

  • Analysis frameworks specific to the registries it covers (e.g., how to interpret Polish land registry sections, what Austrian insolvency proceeding types mean)
  • Data extraction instructions for both MCP mode (direct tool calls) and API mode (curl/SDK)
  • Output interpretation guides so you know what the returned data actually means

Installation

One command installs all 6 skills globally across every supported agent:

skills add Nolpak14/getregdata -g -y

This works with 45+ AI agents including Claude Code, GitHub Copilot, Cline, Codex, Amp, Antigravity, and more. The skills are installed to ~/.agents/skills/ and symlinked to agent-specific locations.

How It Works in Practice

Example 1: Check beneficial owners of a Polish company

You open Claude Code (or any supported agent) and type:

Check who owns the Polish company with NIP 5213103635 in CRBR

The regdata-kyc-aml skill activates. It knows that CRBR is Poland's beneficial owners registry and that NIP is the correct identifier. It guides the extraction:

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run = client.actor("regdata/crbr-beneficial-owners-scraper").call(
    run_input={"nip": "5213103635"}
)

for item in client.dataset(run["defaultDatasetId"]).list_items().items:
    print(f"Company: {item['name']}")
    for owner in item.get("beneficialOwners", []):
        print(f"  UBO: {owner['firstName']} {owner['lastName']}")
        print(f"  Ownership: {owner.get('sharePercentage', 'N/A')}%")

If you have the Apify MCP server configured, the agent can run the actor directly through tool calls without writing any code.

Example 2: Monitor Spanish corporate gazette for new incorporations

Search BORME for new company incorporations in Barcelona from last week

The regdata-credit-risk skill activates (BORME corporate acts). It builds the right query:

run = client.actor("regdata/borme-corporate-acts-scraper").call(
    run_input={
        "dateFrom": "2026-04-21",
        "dateTo": "2026-04-28",
        "provinces": ["Barcelona"]
    }
)

Example 3: Extract Polish land registry data

Pull EKW data for property WR1K/00094598/3

The regdata-property skill activates. It knows the KW number format, that EKW requires residential Polish proxies, and how to interpret the four sections (Dzial I-IV):

run = client.actor("regdata/ekw-ksiegi-wieczyste-scraper").call(
    run_input={
        "kwNumbers": ["WR1K/00094598/3"],
        "viewType": "aktualna",
        "sections": ["all"]
    }
)

The skill then explains what each section means - Dzial II for ownership, Dzial III for restrictions and easements, Dzial IV for mortgages - so you can actually interpret the raw data.

Example 4: Get non-anonymized KRS board members

Extract board members for Polish KRS number 0000019193

The regdata-lead-gen skill knows that the official KRS API anonymizes names, and that the actor works around this by extracting full names from PDF court extracts:

run = client.actor("regdata/krs-fullnames-scraper").call(
    run_input={
        "krsNumbers": ["0000019193"],
        "extractType": "aktualny"
    }
)

You get full first and last names of board members, supervisory board, and shareholders - not the L****** that the API returns.

What Makes This Different from a Generic Scraping Skill

The value of domain-specific skills is that they carry registry knowledge that a generic web scraping tool does not have:

  • The skill knows that EKW requires residential Polish proxies (datacenter IPs are blocked)
  • It knows that KRS anonymizes names in the API but not in PDFs
  • It knows that BORME Section A publishes corporate acts while Section B publishes financial data
  • It knows that Austria's Ediktsdatei has Konkursverfahren (bankruptcy) and Sanierungsverfahren (restructuring) and what the difference means
  • It knows that a Polish KW number follows the format CODE/NUMBER/DIGIT (e.g., WR1K/00094598/3)
  • It knows that CRBR defines beneficial ownership as >25% shares or voting rights

This registry-specific knowledge means your AI agent can handle queries correctly without you needing to read documentation for each portal.
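
For example, a quick sanity check on KW numbers before spending credits; the exact field widths here are an assumption based on the example above, not an official specification:

import re

# Assumed shape: 4-character court code / 8-digit number / 1 check digit.
KW_PATTERN = re.compile(r"^[A-Z0-9]{4}/\d{8}/\d$")

for kw in ["WR1K/00094598/3", "WR1K/94598/3"]:
    print(kw, "valid" if KW_PATTERN.match(kw) else "invalid")
# WR1K/00094598/3 valid
# WR1K/94598/3 invalid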

Authentication Setup

The skills provide analysis frameworks for free, but extracting live data requires an Apify API token:

export APIFY_TOKEN=apify_api_xxxxx

Get your token by creating a free Apify account - includes $5 credits, enough for hundreds of registry lookups.

Pricing Quick Reference

All actors use pay-per-result pricing. No subscriptions, no minimum commitments.

Registry | Cost per Result
KNF, UOKiK, WKO, BORME | $0.003
MSiG, BDO | $0.004
Ediktsdatei, Spain Dir, Societe.com | $0.005
KRZ | $0.006
KRS Board, eKRS, CRBR | $0.008
EKW | $0.01

Typical cost for a full company check (CRBR + KNF + KRS Board): $0.008 + $0.003 + $0.008 ≈ $0.019.

Technical Details

The skills repo follows the standard skills ecosystem structure:

getregdata/
  skills/
    regdata/SKILL.md              # Router
    regdata-kyc-aml/SKILL.md      # + references/
    regdata-credit-risk/SKILL.md  # + references/
    regdata-property/SKILL.md     # + references/
    regdata-compliance/SKILL.md   # + references/
    regdata-lead-gen/SKILL.md     # + references/

Each SKILL.md uses YAML frontmatter for trigger matching:

---
name: regdata-kyc-aml
description: "Extract beneficial ownership data from Poland's CRBR registry,
  financial license status from KNF, non-anonymized board members from KRS..."
metadata:
  version: 1.0.0
  author: regdata
  triggers:
    - "CRBR beneficial owner lookup"
    - "check KNF registry"
    - "KRS board members extract"
    - "Polish beneficial owners"
---

Triggers are registry-specific, not generic domain terms. The skill activates when you mention a specific European registry, not when you ask about "KYC" or "compliance" in general.

The references/ subdirectories contain detailed interpretation guides (land registry section breakdowns, insolvency proceeding types across jurisdictions, UOKiK clause categories) that the agent loads on demand.

Source Code and Contributing

The skills are MIT licensed and open source:

GitHub: github.com/Nolpak14/getregdata

The underlying Apify actors are available on the Apify Store under the regdata publisher profile.

If you work with government registries in other EU countries and want to contribute skills for those, PRs are welcome.

What's Next

  • More registries: Germany (Handelsregister, Insolvenzbekanntmachungen), Italy (Registro Imprese), Netherlands (KVK)
  • Deeper cross-registry workflows: automated multi-country due diligence pipelines
  • Scheduled monitoring: get notified when a company's registry data changes

Install: skills add Nolpak14/getregdata -g -y

Actors: apify.com/regdata

GitHub: github.com/Nolpak14/getregdata

This article is part of the European Registry Data series covering programmatic access to government registries across the EU.