2026-04-29 02:00:00
"On the one hand, information wants to be expensive, because it's so valuable. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time."
— Stewart Brand

"All generally useful information should be free."
— Richard Stallman
Trail is open-source because we believe useful information should be free. That is the reason. Not strategy, not distribution, not growth.
The problem Trail addresses is real. More teams are building with AI every month. More decisions are getting lost in chat. More scope is drifting without anyone noticing. More systems are accumulating undocumented assumptions that will eventually cost someone a painful week to untangle. I have experienced that cost. Trail exists so I do not have to go through it again.
But the decision to make it free isn't driven by the problem. It's based on the belief that if a method makes work more reliable and helps people avoid avoidable failure, there's no reason to restrict access. Putting that behind a paywall doesn't improve the outcome; it simply limits who can benefit from it.
Trail isn't software you buy. It is a framework you implement. The documentation and methodology are licensed under Creative Commons Attribution 4.0, and the scaffold, including folder structure and template files, is MIT. This means you can use it, adapt it, share it, and build on it, personally or commercially, with proper attribution.
The split is intentional. The approach remains flexible and applicable across various industries, while the scaffold can be dropped directly into real projects without legal issues. Both are as permissive as possible while still ensuring credit for the work.
Trail requires no special tools or specific platform. It uses markdown files stored in folders. You can run it with Git, a shared drive, or any system you already use. The framework fits seamlessly into your environment; it does not replace it.
If Trail works for you, use it. If it breaks, you'll see exactly where and why—that's the point.
Web: trail.venturanomadica.com
GitHub: github.com/Ventura-Nomadica/trail-framework
2026-04-29 02:00:00
Your app is in production. Something broke at 2am. Your options are:
- grep through a rotated log file, squinting at terminal output.
- jq pipes until the command line becomes unreadable.

All of these options suck. Either you're limited to flat text tooling, or you're paying for infrastructure complexity you don't need.
logdive is what sits in the gap. It's a single Rust binary. You drop it anywhere, point it at a log file or pipe Docker output into it, and you get a fast, queryable index on your local machine. No daemon. No cloud. No YAML. Just cargo install logdive.
# Ingest logs from a file or pipe from stdin.
logdive ingest --file ./logs/app.log
docker logs my-container | logdive ingest --tag my-container
# Query the index.
logdive query 'level=error AND service=payments last 2h'
logdive query 'message contains "timeout"' --format json | jq
# Inspect what you've indexed.
logdive stats
# Optionally expose a read-only HTTP API for remote querying.
logdive-api --db ./logdive.db --port 4000
curl 'http://127.0.0.1:4000/query?q=level%3Derror&limit=100'
That's the whole product surface.
Every backend engineer has hit the wall this is built for: your application emits perfectly good structured JSON logs. Tools for querying that JSON locally are stuck in the extremes:
- jq works on a single file: one-shot, no memory, no time ranges, no filters across files.
- Full log platforms (the Loki/Elastic/Datadog end of the spectrum) can do all of that, but only after you stand up infrastructure and, often, a subscription.

The target user is a backend engineer who wants jq with memory, filters, and time ranges — without YAML files, without a running daemon they didn't ask for, without a monthly bill.
Rust makes this credible in a way no other language quite does: a single self-contained binary with no runtime, zero-copy parsing, SQLite bundled directly into the binary, and real concurrency for ingestion. This is the kind of tool Rust is genuinely good at.
Good fit:
- You've built a jq pipeline and wished it was searchable afterward.
- You want docker logs my-container | logdive ingest and instant querying.

Bad fit (and I'll be explicit about this):
- You need tail -f style follow mode. Not in v0.1.0.

The whole scope is deliberately small. v0.1.0 ships what a side project or small team needs, nothing more.
Small enough to fit in your head, expressive enough to be useful:
level=error
level=error AND service=payments
message contains "database timeout"
level=error last 2h
tag=api AND status > 499 since 2026-04-15
user_id=4812 AND duration_ms > 500
Operators: =, !=, >, <, CONTAINS. Time ranges: last Nm/Nh/Nd or since <datetime>. Clauses chain with AND. Known fields (timestamp, level, message, tag) hit SQLite indexes directly. Unknown fields go through json_extract() on a JSON blob — slower but works for arbitrary JSON shapes.
No OR in v0.1.0 — it's the single biggest v1 non-goal. AND covers the dominant query pattern, and adding OR requires a two-level grammar plus precedence handling that would roughly double the parser. Deferred to v2 deliberately.
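The known-field vs. JSON-field split is easy to picture in SQL terms. Here's a toy sketch of the translation idea in Python (my own illustration, not logdive's actual parser: it handles only the `=` operator and assumes unknown fields live in a `data` JSON column):

```python
# Toy translator: logdive-style query string -> SQL WHERE clause.
# Known fields map to real columns; everything else goes through json_extract().
KNOWN_FIELDS = {"timestamp", "level", "message", "tag"}

def to_where(query: str) -> tuple[str, list[str]]:
    clauses, params = [], []
    for part in query.split(" AND "):
        field, _, value = part.partition("=")
        field, value = field.strip(), value.strip()
        if field in KNOWN_FIELDS:
            clauses.append(f"{field} = ?")  # hits a real index
        else:
            # Unknown field: pulled out of the JSON blob at query time.
            clauses.append(f"json_extract(data, '$.{field}') = ?")
        params.append(value)
    return " AND ".join(clauses), params

where, params = to_where("level=error AND service=payments")
# where  -> "level = ? AND json_extract(data, '$.service') = ?"
# params -> ["error", "payments"]
```

Because clauses only ever chain with AND, the generator is a flat join; supporting OR is what would force the two-level grammar described above.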
For readers who like the implementation details:
- logdive-core is a pure library (parser, indexer, query engine — no I/O at module level), logdive is the CLI binary, and logdive-api is the HTTP server binary. Each is independently publishable.
- Storage is rusqlite with the bundled feature: zero infrastructure, battle-tested, ships inside the binary at ~500 KB.
- Known fields (timestamp, level, message, tag) are real indexed columns. Everything else is stored in a JSON blob queryable via SQLite's json_extract(). This is the only way to handle arbitrary JSON shapes without a schema-bound design.
- INSERT OR IGNORE on a unique hash column means re-ingesting a file (or dealing with log rotation) is free: no duplicate rows. The hash is cheap, with negligible per-line cost.
- The API opens the database with SQLITE_OPEN_READ_ONLY, wraps blocking SQLite work in tokio::task::spawn_blocking so it doesn't block Tokio's worker threads, and shuts down gracefully on Ctrl-C and SIGTERM.

The full architecture is documented in the repo's README.
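The INSERT OR IGNORE idempotency trick is easy to verify with plain SQLite; a minimal sketch (illustrative two-column schema, not logdive's real one):

```python
import hashlib
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE logs (hash TEXT UNIQUE, line TEXT)")

def ingest(lines):
    rows = [(hashlib.sha256(l.encode()).hexdigest(), l) for l in lines]
    # Rows whose hash already exists are silently skipped.
    con.executemany("INSERT OR IGNORE INTO logs VALUES (?, ?)", rows)

batch = ['{"level":"info","msg":"a"}', '{"level":"error","msg":"b"}']
ingest(batch)
ingest(batch)  # re-ingest the same lines, e.g. after log rotation
count = con.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
# count -> 2, not 4: the second pass inserted nothing
```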
Representative numbers on an Acer Nitro 5 laptop, measured via criterion:
| Operation | Throughput / Latency |
|---|---|
| Ingestion, batched insert (10k rows) | ~210k lines/sec |
| Ingestion, parse + insert end-to-end (10k rows) | ~166k lines/sec |
| Query on known field, empty result (100k rows) | ~17 μs |
| Query on known field, 25% match (100k rows, LIMIT 1000) | ~39 ms |
| Query on JSON field, 25% match (100k rows, LIMIT 1000) | ~3.6 ms |
| Query on JSON field, 0% match (full scan, 100k rows) | ~68 ms |
| CONTAINS full-table scan (100k rows) | ~36–40 ms |
| 3-clause AND chain (100k rows) | ~22 ms |
Release binaries at 3.7 MB (logdive) and 4.1 MB (logdive-api) — well under the 10 MB target, thanks to LTO + strip + panic=abort in the release profile.
Run cargo bench in the repo to get your own baseline.
A few design decisions have real downsides that users should know:
Timestamps compared as lexical TEXT. This is correct for ISO-8601-shaped timestamps (they sort chronologically when compared as strings), but any exotic timestamp format will silently misorder. Default timestamps from modern structured loggers are ISO-8601, so in practice this is rarely a problem — but it's a real sharp edge.
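The sharp edge is easy to demonstrate in any language; a standalone Python illustration (not logdive code):

```python
# ISO-8601 timestamps sort chronologically even as plain strings:
iso = ["2026-04-29T02:00:00Z", "2026-04-28T23:59:59Z", "2026-04-29T01:30:00Z"]
assert sorted(iso) == [
    "2026-04-28T23:59:59Z", "2026-04-29T01:30:00Z", "2026-04-29T02:00:00Z"
]

# A non-ISO format silently misorders under the same lexical comparison:
us_style = ["04/28/2026 23:59", "12/01/2025 08:00"]
assert sorted(us_style) == ["04/28/2026 23:59", "12/01/2025 08:00"]
# December 2025 sorts *after* April 2026, because "04" < "12" as text.
```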
No index on json_extract expressions. Queries on unknown JSON fields fall back to full table scans. A scan of 100k rows takes ~68 ms, which is still fast, but if you're hammering the same unknown field constantly, it's orders of magnitude slower than an indexed known-column query. A future version could promote frequently-queried JSON fields to real columns.
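For what "promote to real columns" could look like, SQLite's generated columns (available since 3.31) already support it. A hypothetical sketch with a made-up `logs` schema, runnable with Python's bundled sqlite3:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
# user_id is derived from the JSON blob but can be indexed like a real column.
con.execute("""
    CREATE TABLE logs (
        data    TEXT,
        user_id TEXT GENERATED ALWAYS AS (json_extract(data, '$.user_id')) VIRTUAL
    )
""")
con.execute("CREATE INDEX idx_user ON logs(user_id)")

con.executemany(
    "INSERT INTO logs (data) VALUES (?)",
    [(json.dumps({"user_id": str(i % 100), "msg": "x"}),) for i in range(1000)],
)

cnt = con.execute(
    "SELECT COUNT(*) FROM logs WHERE user_id = '42'"
).fetchone()[0]  # -> 10

# The query planner now uses idx_user instead of scanning every row:
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM logs WHERE user_id = '42'"
).fetchall()
```

The same query through raw `json_extract()` in the WHERE clause would scan all 1000 rows; the generated column keeps the flexible JSON storage while restoring known-field performance.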
Single-host only. There's no clustering story. If you need distributed query across machines, you want Loki or Elastic.
No authentication on the HTTP API. Deliberate for v1. If you expose logdive-api beyond localhost, put a reverse proxy with auth in front of it. The binary defaults to binding 127.0.0.1 for a reason.
From crates.io:
cargo install logdive logdive-api
From prebuilt binaries: grab the tarball for your platform from the GitHub Releases page. Linux x86_64 and macOS arm64 are built on every tag push.
From source:
git clone https://github.com/Aryagorjipour/logdive
cd logdive
cargo build --release
MSRV: Rust 1.85 (edition 2024). Dual-licensed MIT OR Apache-2.0.
Try the included examples:
logdive --db /tmp/demo.db ingest --file examples/app.log
logdive --db /tmp/demo.db ingest --file examples/nginx.log
logdive --db /tmp/demo.db stats
logdive --db /tmp/demo.db query 'level=error AND service=payments'
v0.1.0 is deliberately small, but there's a clear set of high-value v2 features that would benefit hugely from community help. If you want to contribute, these are genuine needs:
OR operator in the query language. The single most-requested feature implied by v1's scope. Extends the parser to handle a two-level grammar (clauses joined by OR, OR-groups joined by AND) and the SQL generator to emit parenthesized disjunctions. Non-trivial but well-scoped.
Non-JSON log format support. Plaintext and logfmt are the obvious next formats. Would plug in as additional parser implementations alongside parse_line in logdive-core.
Follow mode (-f / tail-and-index). Watch a log file for new lines and ingest them as they appear. Good use of tokio::fs and notify crate patterns.
A browser UI. The HTTP API is ready for one — someone with frontend chops could build a single-page React/Svelte/HTMX UI that talks to logdive-api and gives people a browser-based query interface.
Generated columns for frequently-queried JSON fields. The big performance win. Would let users mark certain JSON fields as "promote to indexed column" and get known-field query performance for those.
Benchmarks on more hardware. If you run the existing cargo bench suite on your machine, an issue/PR updating the README's Performance section with a broader sample would be genuinely useful.
Docker image for the HTTP API. Dockerfile for logdive-api with a volume mount for the index database. Natural next step for users who want to run the API as a service.
The repo has CI, benchmarks, clean test coverage, and a documented contribution workflow. Issues and pull requests at github.com/Aryagorjipour/logdive.
logdive started as the final project in a Rust learning journey. The framing mattered — I wanted a project that was small enough to finish, demanding enough to exercise real Rust (parsers, SQLite, async, concurrency, CLI, HTTP), and useful enough that I'd actually keep using it afterward.
What I underestimated: how much of the effort lives in the parts that aren't writing code. Setting up a clean workspace. Choosing the right abstractions between core and binaries. Writing benchmarks that actually measure what you think they measure. Testing an HTTP server with tower::ServiceExt::oneshot. Packaging a three-crate workspace for crates.io when one crate depends on another and you're publishing for the first time. Each of these had at least one subtle gotcha.
The project is open source because the next person hitting the "jq vs Datadog" wall might as well benefit from it, and because Rust has given me enough that I want to give something back.
Arya Gorjipour — backend engineer, Rust learner, logdive maintainer.
Issues, bug reports, and pull requests welcome. If you end up using logdive to debug a real production incident, I'd love to hear about it.
2026-04-29 01:59:49
By HARDIN
Richard Seroter has a habit of building things two years before the rest of us are ready to talk about them. In 2024, he asked: can we store prompts in source control and let an LLM generate the entire app at build time? Last week, he asked something newer and, honestly, a bit more unsettling: should we call LLMs directly from our messaging middleware?
One experiment is about generating software. The other is about injecting intelligence into the veins of your data flow. Put them side by side, and you don't just see the evolution of one guy's thinking. You see where the whole industry is quietly heading—whether we're ready or not.
I've been dissecting Google Cloud Next '26 for weeks. But sometimes the most revealing stuff isn't in the keynotes. It's buried in a blog post where someone asks, "Should we?" and leaves the answer hanging.
Let's pull that thread.
I already covered this in depth in my previous article, but the core architecture was beautifully simple:
Source Control (prompts.json)
│
▼
Spring AI + Gemini 1.5 Flash
│
▼
Generated code (Node.js/Python + Dockerfile)
│
▼
Cloud Run deployment
The philosophy: treat prompts as the source of truth, and let AI do the implementation. Richard even built a working GitHub repo to prove it. The AI pumped out everything from index.js to Dockerfile to package.json. The output was non-deterministic, untested, and completely unregulated. Richard knew this. He called it "bonkers" and explicitly warned against using it for real workloads.
You can explore the entire project here:
This is a Spring Boot app that uses Spring AI, Google Gemini 1.5 Flash and various Google Cloud services to take prompts, generate code, and publish the generated app to Cloud Run.
You can customize and use this without Google Cloud services, and presumably with a different model!
See this blog post for a full walkthrough.
But here's the thing he actually proved, even if he didn't shout it: if you can describe what you want precisely enough, an LLM can turn intent into executable artifact. That idea stuck. That's the seed of agentic software engineering you see everywhere now.
Honestly, re-reading his 2024 post gave me a weird nostalgia—like finding the first commit of a framework that now runs half the internet.
Fast forward two years. Richard's latest post explores a new Google Cloud feature: AI Inference SMT (Single Message Transform) for Pub/Sub.
The architecture is almost embarrassingly simple:
Pub/Sub Topic
│
▼
AI Inference SMT (calls Gemini)
│
▼
Pub/Sub Subscription (enriched/altered message)
No custom subscriber. No Cloud Run service. No boilerplate code. You configure a template that tells the SMT what to ask the LLM, and every message passing through Pub/Sub gets enriched, translated, summarized, or routed. It feels like magic—and like most magic, it probably comes with a hidden price tag.
The philosophy: intelligence as infrastructure, not application code. Instead of writing a service that calls an LLM, you declare, "this message topic shall be intelligent," and the platform handles the rest.
Richard's post raises a question that's going to age like fine wine: is this a good idea? He compares it to storing business logic in database triggers—powerful, but a maintenance nightmare if abused. He's cautious. I'd say he's right, and maybe even understating the danger. More on that in a bit.
At first glance, these two experiments look like completely different animals. They're not. They're two halves of the same coin: declarative AI-driven automation.
| Dimension | 2024 (Code Generation) | 2026 (Message Injection) |
|---|---|---|
| Trigger | CI/CD build pipeline | Every single message arrival |
| Scope | Static artifact generation | Dynamic content transformation |
| Output | Application files | Modified messages |
| Latency | Minutes (build time) | Milliseconds (real-time) |
| Governance | Manual review of generated code | Template configuration + IAM |
| Risk | Non-deterministic builds, no testing | Unexplained message mutations, AI hallucinations in the data plane |
The 2024 experiment automates what gets deployed. The 2026 experiment automates what happens to data in flight. Put them together, and you've got a pipeline where intent flows through your message bus and comes out the other side as a running service. That's not science fiction. That's today.
So, naturally, I started sketching. What happens if we connect these two ideas? I call the result the Agentic Message-to-Deployment Pipeline.
A business unit sends a single message: "We need a microservice that translates customer feedback from any language to English and stores it in Firestore."
The whole chain starts and ends with a message. The same Pub/Sub topic where requests arrive is also where results are published. The intelligence doesn't live in a monolithic pipeline script—it lives in the SMT layer and the agent mesh.
When I sketched this out, I sat back and stared at it for a minute. It's elegant. It's terrifying. It's probably where we're all heading.
I've been writing dark jokes about AI automation for a month. But this combination genuinely made me pause.
1. Message mutations are invisible bugs.
When an SMT silently modifies a message using AI, how do you even start debugging? The original message is gone. The AI's decision is a black box. If the LLM hallucinates a translation or misinterprets a feature request, you ship the wrong service. Try explaining that one to the business.
2. The attack surface just exploded.
In 2024, a malicious prompt could generate bad code. In 2026, a malicious message can trigger an entire agentic deployment pipeline. Someone with Pub/Sub publish permissions can potentially spin up new services, modify data, or exfiltrate information—all through natural language. "Please delete all production databases." Said politely. Executed instantly.
3. The platform is the developer now.
We already saw that 75% of Google's new code is AI-generated. With AI Inference SMT plus agentic deployment, that percentage doesn't stop at 75. It inches toward 100. The developer becomes a reviewer, a policy setter, a person who says "yes" or "no" to an agent's proposal. I wrote about this role shift in my Developer Keynote analysis, and honestly, it's accelerating faster than I expected.
Richard Seroter's journey from 2024 to 2026 isn't just his—it's a mirror for the whole industry.
- **2024**: "Let's see if AI can write code at all."
- **2025**: "Let's put AI in the CI/CD pipeline."
- **2026**: "Let's put AI in the data plane. And the deployment plane. And the governance plane."
I respect the hell out of Richard for publishing both experiments with zero pretense. The 2024 repo is humble—4 commits, 4 stars. The 2026 blog post is cautious, full of "should we?" questions. That kind of intellectual honesty is getting rare in tech evangelism. He's not selling. He's exploring. There's a difference.
But don't mistake the modesty. These two experiments together form a blueprint. One that says: AI should generate code, and AI should decide when to generate code, and AI should route the decision to generate code through your message bus. It's turtles all the way down, and the turtle is Gemini.
What do you think? Are AI Inference SMTs a brilliant abstraction or a future maintenance nightmare? Would you let a message bus trigger your deployment pipeline? Drop your experience—or your darkest prediction—in the comments. I read them all, and honestly, some of you scare me more than the AI does.
2026-04-29 01:55:16
Before the era of universal Unicode standards, typing in Tamil on a computer was a complex challenge. In the early 1990s, one font emerged as the definitive solution for millions of users in Sri Lanka and Tamil Nadu: Bamini.
Developed by S. Kalyanasundaram, Bamini wasn't just a typeface; it was a bridge that brought the ancient Tamil script into the digital age.
Why Was Bamini So Popular?
In the 1990s, operating systems didn't naturally "speak" Tamil. Bamini solved this by using an ASCII-based mapping system.
By mimicking the layout of traditional Tamil typewriters, it allowed professional typists to transition to PCs without any extra training. If you knew how to use a physical typewriter, you already knew how to type in Bamini.
Key Features of the Bamini Font
High Readability: Designed with bold, block-like strokes, it was optimized for the low-resolution screens and dot-matrix printers of the '90s.
The "jkpo;" Logic: Because it’s a legacy font, typing "jkpo;" on your QWERTY keyboard magically appears as "தமிழ்" on your screen.
DTP Standard: It became the de facto choice for newspapers, school textbooks, and government forms across Northern Sri Lanka.
The Technical Side: How It Works
Bamini uses a proprietary encoding scheme within the 128–255 ASCII range. Unlike modern Unicode fonts (like Latha or Nirmala), Bamini text requires the specific font file to be installed on the viewer's device. Without it, the text simply looks like a string of random Latin characters.
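A toy converter shows the mechanism. The mapping table below covers only the five keystrokes of the "jkpo;" example from above; a real Bamini converter maps the full glyph range and also handles the script's visual-reordering rules:

```python
# Tiny illustrative subset of a Bamini-to-Unicode mapping table.
BAMINI_TO_UNICODE = {
    "j": "த",  # ta
    "k": "ம",  # ma
    "p": "ி",  # vowel sign i
    "o": "ழ",  # zha
    ";": "்",  # pulli (virama)
}

def convert(legacy_text: str) -> str:
    # Characters outside the toy table pass through unchanged.
    return "".join(BAMINI_TO_UNICODE.get(ch, ch) for ch in legacy_text)

print(convert("jkpo;"))  # -> தமிழ்
```

This is exactly why Bamini text "looks like random Latin characters" without the font installed: the underlying bytes really are Latin-range codes, and only the font file (or a converter like this) gives them Tamil shapes.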
Bamini in the Modern World
While the world has moved to Unicode for web and mobile compatibility, Bamini remains a vital tool for:
Legacy Archives: Thousands of historical documents and older PDFs are still stored in Bamini.
Professional Printing: Many high-speed typists still prefer the Bamini layout for its ergonomic efficiency.
Graphic Design: It carries a "retro" aesthetic that is highly sought after for specific branding projects.
Looking to Download the Bamini Font?
If you need to view legacy documents, convert old files, or simply want that classic Tamil look for your next design project, you can download the full family of Bamini fonts (including variations from 01 to 87).
2026-04-29 01:53:29
We are excited to announce the release of Analog 2.5! This release introduces first-class runtime i18n support, a new fast compilation mode, hierarchical content with recursive prerendering, and a new AI integrations guide. Let's dive in.
Analog 2.5 adds first-class runtime internationalization built on Angular's $localize. With a single build, your app can serve every supported locale, with the active locale detected on both the server and the client.
Install @angular/localize and import the polyfill in your application entry:
// src/main.ts
import '@angular/localize/init';
Configure the supported locales on the analog() plugin in vite.config.ts. This enables locale detection on SSR and exposes the values as build-time globals so the application config doesn't have to repeat them:
// vite.config.ts
import { defineConfig } from 'vite';
import analog from '@analogjs/platform';
export default defineConfig(() => ({
plugins: [
analog({
i18n: {
defaultLocale: 'en',
locales: ['en', 'fr', 'de'],
},
}),
],
}));
Then provide the runtime translation loader via provideI18n():
// src/app/app.config.ts
import { ApplicationConfig } from '@angular/core';
import { provideFileRouter } from '@analogjs/router';
import { provideI18n } from '@analogjs/router/i18n';
export const appConfig: ApplicationConfig = {
providers: [
provideFileRouter(),
provideI18n({
loader: async (locale) => {
const translations = await import(`../i18n/${locale}.json`);
return translations.default;
},
}),
],
};
Mark templates with the standard Angular i18n attribute:
<h1 i18n="@@greeting">Hello</h1>
On SSR, the locale is detected from the URL path prefix (/fr/about) or the Accept-Language header. On the client, it's read from window.location.pathname. The loader function you provide is called once at startup with the active locale, and $localize handles the rest.
The i18n plugin option also accepts an extract config that scans the compiled output for $localize tagged templates and writes a translation source file at the end of a production build:
analog({
i18n: {
defaultLocale: 'en',
locales: ['en', 'fr', 'de'],
extract: {
format: 'json',
outFile: 'src/i18n/messages.json',
},
},
})
format supports json, xliff, xliff2, and xmb.
A small helper makes runtime locale switching a one-liner:
import { Component } from '@angular/core';
import { injectSwitchLocale } from '@analogjs/router/i18n';
@Component({ /* ... */ })
export class LanguagePickerComponent {
switchLocale = injectSwitchLocale();
}
Calling switchLocale('fr') navigates to the equivalent /fr URL and re-evaluates templates with the new translations. Check out the i18n docs for the full guide.
Analog 2.5 introduces fastCompile, a new option on the analog() Vite plugin that enables a Vite-native Angular compilation pipeline combined with OXC for fast parsing. The most noticeable differences are with cold starts and HMR.
Enable it in your Vite config:
// vite.config.ts
import { defineConfig } from 'vite';
import analog from '@analogjs/platform';
export default defineConfig(() => ({
plugins: [
analog({
fastCompile: true,
}),
],
}));
A companion fastCompileMode option controls the compilation output:
- 'full' (default): Emits final Ivy definitions for application builds.
- 'partial': Emits partial declarations for library publishing.
analog({
fastCompile: true,
fastCompileMode: 'partial',
})
Fast compilation mode allows you to offload type checking as a separate process, keeping the development workflow streamlined, especially when building content-focused sites and applications.
Support for hierarchical content and nested content directories has been improved to handle more complex documentation structures:
src/content/docs/
getting-started/
welcome.md
first-upload.md
assets/
upload.md
Content resolvers in Analog previously handled only top-level folders and slugs in the src/content directory; they now handle nested subdirectories through the existing APIs.
A catch-all route now resolves nested files correctly:
// src/app/pages/docs/[...slug].page.ts
import { Component } from '@angular/core';
import { injectContent, MarkdownComponent } from '@analogjs/content';
@Component({
standalone: true,
imports: [MarkdownComponent],
template: `
@if (post(); as post) {
<analog-markdown [content]="post.content" />
}
`,
})
export default class DocsPageComponent {
post = injectContent({ subdirectory: 'docs' });
}
A new recursive option was added, and each content file now exposes a relativePath so transforms can identify identically-named files across subdirectories:
// vite.config.ts
analog({
prerender: {
routes: async () => [
{
contentDir: 'src/content/docs',
recursive: true,
transform: (file) => {
const segment = file.relativePath
? `${file.relativePath}/`
: '';
return `/docs/${segment}${file.name}`;
},
},
],
},
})
For backward compatibility, recursive defaults to false.
Analog 2.5 ships with a new AI integrations guide that walks through wiring LLM providers into Analog's API routes — streaming responses, server-side keys, and the usual use cases. If you're adding AI features to an app, that guide is the starting point.
We are also looking into adding official skills for Analog features and conventions.
To upgrade to Analog 2.5, run:
ng update @analogjs/platform@latest
If you're using Nx, run:
nx migrate @analogjs/platform@latest
For the full list of changes, see the changelog.
Continued development of Analog would not be possible without our partners and community. Thanks to our official deployment partner Zerops, code review partner CodeRabbit, and longtime supporters Snyder Technologies, Nx, and House of Angular, and many other backers of the project.
Find out more information on our partnership opportunities or reach out directly to partnerships[at]analogjs.org.
If you enjoyed this post, click the ❤️ so other people will see it. Follow AnalogJS and Brandon Roberts on Bluesky, and subscribe to my YouTube Channel for more content!
2026-04-29 01:50:50
skills add Nolpak14/getregdata -g -y
If you have ever tried to programmatically access European government registries, you know the pain. Each country has its own set of portals, each with its own quirks:
- L****** instead of full names - while the same data is available unredacted in PDF extracts from the same portal

Each registry is a separate engineering challenge. Different authentication, different anti-scraping measures, different data formats, different legal frameworks.
Over the past months I built Apify actors for all of these. Each actor handles the specific technical challenges of its registry and returns clean, structured JSON.
| Registry | What You Get | Records |
|---|---|---|
| KNF - Financial Supervision Authority | Payment institutions, e-money issuers, lending companies | 75,000+ |
| MSiG - Court Gazette (Monitor Sadowy) | Bankruptcy declarations, restructuring, liquidation notices | 2001-present |
| KRS - National Court Register | Full non-anonymized board member names from PDF extracts | 800,000+ companies |
| KRZ - National Debtor Registry | Bankruptcy, restructuring, enforcement proceedings | 9 search modes |
| eKRS - Financial Statements | Balance sheets, P&L, assets, equity, revenue, net profit | Official filings |
| EKW - Land Registry | Property ownership, mortgages, easements, restrictions | 25 million entries |
| UOKiK - Consumer Protection | Court-ruled prohibited contract clauses | 7,500+ clauses |
| CRBR - Beneficial Owners | Ultimate beneficial owners (UBO) for AML compliance | All KRS entities |
| BDO - Waste Registry | Waste management entity registrations and permits | 674,000+ entities |
| Registry | What You Get |
|---|---|
| BORME - Corporate Gazette | Daily incorporations, officer appointments, capital changes, dissolutions |
| Registro Mercantil - Company Directory | NIF, officers, CNAE codes, legal form, IRUS, EUID |
| Registry | What You Get |
|---|---|
| Ediktsdatei - Insolvency Publications | Bankruptcy, restructuring, debt regulation proceedings - no IWG license needed |
| WKO - Chamber of Commerce Directory | 620,000+ businesses with phone, email, website, trade licenses |
| Registry | What You Get |
|---|---|
| Societe.com - Company Data | SIREN, directors with roles, financials, shareholders, subsidiaries, brands |
All actors are pay-per-result with no subscriptions. Pricing ranges from $0.003 to $0.01 per result depending on the registry. A free Apify account includes $5 credits - enough for 100-1,600 lookups.
The actors work great when you know exactly which registry to query. But for many users, the real question is: "I need to check this Polish company - where do I even start?"
That is where skills come in. Skills are modular packages that give AI coding agents specialized knowledge. When you install a skill, your agent learns how to handle domain-specific tasks - in this case, querying European government registries.
I packaged the 14 actors into 6 skills organized by workflow:
| Skill | What It Does | Registries Used |
|---|---|---|
| `regdata` | Router - identifies which registry you need | All 14 |
| `regdata-kyc-aml` | Beneficial ownership, financial licenses, board members | CRBR, KNF, KRS, Societe.com, WKO, Spain Dir |
| `regdata-credit-risk` | Insolvency proceedings, court gazette, financial statements | KRZ, MSiG, eKRS, Ediktsdatei, BORME |
| `regdata-property` | Polish land registry - ownership, mortgages, encumbrances | EKW, KRS, CRBR |
| `regdata-compliance` | Prohibited contract clauses, waste management registrations | UOKiK, BDO |
| `regdata-lead-gen` | Company directors, business contacts, new incorporations | KRS, WKO, Spain Dir, Societe.com, BORME |
Each skill includes a SKILL.md with YAML frontmatter for trigger matching, plus a references/ directory of detailed interpretation guides that the agent loads on demand.
One command installs all 6 skills globally across every supported agent:
skills add Nolpak14/getregdata -g -y
This works with 45+ AI agents including Claude Code, GitHub Copilot, Cline, Codex, Amp, Antigravity, and more. The skills are installed to ~/.agents/skills/ and symlinked to agent-specific locations.
You open Claude Code (or any supported agent) and type:
Check who owns the Polish company with NIP 5213103635 in CRBR
The regdata-kyc-aml skill activates. It knows that CRBR is Poland's beneficial owners registry and that NIP is the correct identifier. It guides the extraction:
from apify_client import ApifyClient
client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("regdata/crbr-beneficial-owners-scraper").call(
run_input={"nip": "5213103635"}
)
for item in client.dataset(run["defaultDatasetId"]).list_items().items:
print(f"Company: {item['name']}")
for owner in item.get("beneficialOwners", []):
print(f" UBO: {owner['firstName']} {owner['lastName']}")
print(f" Ownership: {owner.get('sharePercentage', 'N/A')}%")
If you have the Apify MCP server configured, the agent can run the actor directly through tool calls without writing any code.
Search BORME for new company incorporations in Barcelona from last week
The regdata-credit-risk skill activates (BORME corporate acts). It builds the right query:
run = client.actor("regdata/borme-corporate-acts-scraper").call(
run_input={
"dateFrom": "2026-04-21",
"dateTo": "2026-04-28",
"provinces": ["Barcelona"]
}
)
Pull EKW data for property WR1K/00094598/3
The regdata-property skill activates. It knows the KW number format, that EKW requires residential Polish proxies, and how to interpret the four sections (Dzial I-IV):
run = client.actor("regdata/ekw-ksiegi-wieczyste-scraper").call(
run_input={
"kwNumbers": ["WR1K/00094598/3"],
"viewType": "aktualna",
"sections": ["all"]
}
)
The skill then explains what each section means - Dzial II for ownership, Dzial III for restrictions and easements, Dzial IV for mortgages - so you can actually interpret the raw data.
Extract board members for Polish KRS number 0000019193
The regdata-lead-gen skill knows that the official KRS API anonymizes names, and that the actor works around this by extracting full names from PDF court extracts:
run = client.actor("regdata/krs-fullnames-scraper").call(
run_input={
"krsNumbers": ["0000019193"],
"extractType": "aktualny"
}
)
You get full first and last names of board members, supervisory board, and shareholders - not the L****** that the API returns.
The value of domain-specific skills is that they carry registry knowledge a generic web scraping tool does not have: which identifier each registry expects (NIP, KRS number, KW number), which portals require residential proxies, and where an official API redacts data that a workaround can recover.
This registry-specific knowledge means your AI agent can handle queries correctly without you needing to read documentation for each portal.
The skills provide analysis frameworks for free, but extracting live data requires an Apify API token:
export APIFY_TOKEN=apify_api_xxxxx
Get your token by creating a free Apify account - includes $5 credits, enough for hundreds of registry lookups.
All actors use pay-per-result pricing. No subscriptions, no minimum commitments.
| Registry | Cost per Result |
|---|---|
| KNF, UOKiK, WKO, BORME | $0.003 |
| MSiG, BDO | $0.004 |
| Ediktsdatei, Spain Dir, Societe.com | $0.005 |
| KRZ | $0.006 |
| KRS Board, eKRS, CRBR | $0.008 |
| EKW | $0.01 |
Typical cost for a full company check (CRBR + KNF + KRS Board): ~$0.019.
The skills repo follows the standard skills ecosystem structure:
getregdata/
skills/
regdata/SKILL.md # Router
regdata-kyc-aml/SKILL.md # + references/
regdata-credit-risk/SKILL.md # + references/
regdata-property/SKILL.md # + references/
regdata-compliance/SKILL.md # + references/
regdata-lead-gen/SKILL.md # + references/
Each SKILL.md uses YAML frontmatter for trigger matching:
---
name: regdata-kyc-aml
description: "Extract beneficial ownership data from Poland's CRBR registry,
financial license status from KNF, non-anonymized board members from KRS..."
metadata:
version: 1.0.0
author: regdata
triggers:
- "CRBR beneficial owner lookup"
- "check KNF registry"
- "KRS board members extract"
- "Polish beneficial owners"
---
Triggers are registry-specific, not generic domain terms. The skill activates when you mention a specific European registry, not when you ask about "KYC" or "compliance" in general.
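To make trigger matching concrete, here's a rough sketch of how an agent might match a user query against frontmatter triggers. This is my own illustration, not how any particular agent implements it; the `regdata-property` triggers are hypothetical stand-ins:

```python
# kyc-aml triggers taken from the frontmatter example above;
# the property-skill triggers are hypothetical.
TRIGGERS = {
    "regdata-kyc-aml": [
        "CRBR beneficial owner lookup", "check KNF registry",
        "KRS board members extract", "Polish beneficial owners",
    ],
    "regdata-property": ["EKW land registry lookup"],
}

def match_skills(query: str) -> list[str]:
    q = query.lower()
    matched = []
    for skill, triggers in TRIGGERS.items():
        for trigger in triggers:
            # Activate on any significant word from a trigger phrase.
            words = [w for w in trigger.lower().split() if len(w) > 3]
            if any(w in q for w in words):
                matched.append(skill)
                break
    return matched

match_skills("Check who owns this company in CRBR")  # -> ["regdata-kyc-aml"]
```

Because the trigger vocabulary is registry names rather than generic terms like "KYC", a query about weather or generic compliance matches nothing, which is the behavior described above.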
The references/ subdirectories contain detailed interpretation guides (land registry section breakdowns, insolvency proceeding types across jurisdictions, UOKiK clause categories) that the agent loads on demand.
The skills are MIT licensed and open source:
GitHub: github.com/Nolpak14/getregdata
The underlying Apify actors are available on the Apify Store under the regdata publisher profile.
If you work with government registries in other EU countries and want to contribute skills for those, PRs are welcome.
Install: skills add Nolpak14/getregdata -g -y
Actors: apify.com/regdata
GitHub: github.com/Nolpak14/getregdata
This article is part of the European Registry Data series covering programmatic access to government registries across the EU.