2026-02-17 21:30:04
The JavaScript ecosystem has a magic problem.
Not the fun kind. The kind where you stare at your code, everything looks correct, and something still breaks in a way you can't explain. The kind where you spend forty minutes debugging why your computed() stopped updating, or why an effect fired when you didn't expect it, or why destructuring a store value makes it stop being reactive.
We called it reactivity. We called it signals. We called it runes. And every new name comes with a new layer of invisible machinery running underneath your code, doing things you didn't ask for, breaking in ways you didn't anticipate. The deeper problem isn't performance or verbosity — it's locality of reasoning. You can't look at a line of code and know when or why it will execute.
I've been building complex web applications for eighteen years — interactive dashboarding systems, industrial HMI interfaces, config-driven UIs. Those projects are the reason this framework exists: not because they failed, but because I could see exactly how they would age. I got tired of the magic. So I built something without it.
This isn't the Nth framework built out of frustration. It's a deliberate synthesis of ideas that already proved themselves: Redux's three principles, the Entity Component System architecture from game engines, and lit-html's surgical DOM updates. None of these are new. What's new is putting them together and following the logic all the way through.
I only really started digging into React when Redux was introduced. It recast functional programming concepts as good practices for large-scale systems — proving they belonged in production code, not just CS theory. Three principles made any webapp predictable, debuggable, and testable like never before: a single source of truth, read-only state changed only by dispatching actions, and state changes expressed as pure functions.
But things went south. Devs complained that Redux was too verbose, immutable updates were painful, async logic was a hack. Those complaints were valid — the boilerplate was a genuine tax. Enter RTK, which solved real problems: simpler reducers, built-in Immer, sane async thunks. But then it kept going — createAppSlice, builder callback notation, circular dependency nightmares. The question isn't whether Redux needed fixing. It's whether the fixes took things in the right direction. Then the "Single Source Of Truth" dogma itself started to bend: local state here, Context there, Zustand, Jotai, signals. We write less code now, and it just magically works. Well — not for me.
Let me be specific, because "magic is bad" is an easy claim to make and a hard one to defend without evidence.
React re-renders are actually fast — React was right about that. The real problem is that re-renders trigger effects and lifecycle methods. useEffect fires after every matching render, subscriptions re-initialize, derived state recomputes. Invisible dependency arrays silently break when you forget something, and useEffect lists grow into things nobody on the team fully trusts. React's answer? A stable compiler that adds layers of cache automatically. Which means you can have a suboptimal component hierarchy and the compiler will compensate — which is convenient until you need to understand why something broke.
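Here's the shape of the problem I mean, as a minimal sketch with a made-up component and endpoint, not code from any real app:
import { useEffect, useState } from "react";

function SearchResults({ query }: { query: string }) {
  const [results, setResults] = useState<string[]>([]);

  useEffect(() => {
    // Fetch results for the current query...
    fetch(`/api/search?q=${encodeURIComponent(query)}`)
      .then((res) => res.json())
      .then(setResults);
  }, []); // ...but `query` is missing from the deps, so this never re-runs when it changes

  return results.join(", ");
}
The exhaustive-deps lint rule catches this one when it's enabled; at runtime you get nothing, just a component quietly showing stale results.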
Vue 3 introduced a subtle trap with the Composition API: destructuring a reactive object silently breaks the proxy chain that powers reactivity. Your variable stops updating and you get no warning whatsoever. Vue provides toRefs() specifically to patch this — which proves the point: you now have to manage the integrity of an invisible system on top of writing your actual application. And computed() knows when to recompute by secretly tracking which reactive properties you accessed while it ran, which can produce circular dependencies that only blow up at runtime.
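A minimal sketch of that trap, using plain Vue 3 APIs outside any component just to show the mechanics:
import { reactive, toRefs, watchEffect } from "vue";

const state = reactive({ count: 0 });

const { count } = state; // plain number copy: the proxy link is gone
watchEffect(() => console.log("plain:", count)); // logs once, never re-runs

const { count: countRef } = toRefs(state); // a Ref that stays connected
watchEffect(() => console.log("ref:", countRef.value)); // re-runs on every change

state.count++; // only the second watcher fires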
Svelte 5 introduced runes — $state(), $derived(), $effect(). The docs themselves define the word:
rune /ruːn/ noun — A letter or mark used as a mystical or magic symbol.
It's impressive engineering. But unlike JSX — which is a purely syntactic transformation — Svelte's compiler is semantically active: it changes what your code means, not just how it looks. $state() isn't JavaScript with nicer syntax; it's a different programming model that requires the compiler to be correct.
All three are racing in the same direction: more reactivity, more compilation, more invisible machinery. I went the other way.
Inglorious Web is built on one idea: state is data, behavior lives in plain functions, and rendering is a pure function of state.
No proxies. No signals. No compiler. Just plain JavaScript objects, event handlers, and lit-html's surgical DOM updates. The mental model is a one-time cost, not a continuous tax — you learn it once, and it scales without adding new concepts.
const counter = {
create(entity) {
entity.value = 0;
},
increment(entity) {
entity.value++;
},
render(entity, api) {
return html`
<div>
<span>Count: ${entity.value}</span>
<button @click=${() => api.notify(`#${entity.id}:increment`)}>
+1
</button>
</div>
`;
},
};
It looks like a hybrid between Vue's Options API and React's JSX. If you prefer either of those syntaxes, there are Vite plugins for both. But the key differences are in what's absent. There are no hooks, no lifecycle methods, no component-level state. create and increment are plain event handlers — closer to RTK reducers than to React methods. The templates are plain JavaScript tagged literals: no new syntax to learn, no compilation step required. Boring doesn't mean verbose — it means every line does exactly what it says.
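To be clear about what "tagged literal" means here: a tag is just a function that receives the static string parts and the interpolated values. This toy tag is not lit-html's implementation, only the language mechanism it builds on:
// A tag function: called with the literal's static strings and dynamic values.
function tag(strings: TemplateStringsArray, ...values: unknown[]) {
  return strings.reduce((out, s, i) => out + s + (values[i] ?? ""), "");
}

const count = 3;
console.log(tag`Count: ${count}`); // "Count: 3", no compiler involved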
One deliberate abstraction worth naming: state mutations inside handlers look impure but aren't. The framework wraps them in Mutative — the same structural sharing idea as Immer, but 2–6x faster — so you write entity.value++ and get back an immutable snapshot. That's the only reactive magic in the stack, it's a small and well-understood library, and it's what makes testing trivial.
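A rough sketch of what that buys you, using Mutative's public create API directly rather than the framework's internal wiring:
import { create } from "mutative";

const prev = { value: 0 };

// The recipe mutates a draft; `create` returns a new immutable snapshot.
const next = create(prev, (draft) => {
  draft.value++;
});

console.log(prev.value); // 0, the original is untouched
console.log(next.value); // 1, a new snapshot that structurally shares the rest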
When state changes, the whole tree re-renders. But lit-html only touches the DOM nodes that actually changed — the same way Redux reducers don't do anything when an action isn't their concern. Re-rendering is cheap. Effects and lifecycle surprises don't exist. The question "why did this effect fire?" is simply impossible to ask, because you can look at any handler and reason about exactly when it runs. And because every state transition is an explicit event, you can grep for every place it's fired — something you cannot do with a reactive dependency graph.
In React, testing a component with hooks means setting up a fake component tree and mocking the world around it. In Vue 3, testing a composable means testing impure functions swimming in proxy magic.
In Inglorious Web, testing state logic is this:
import { trigger } from "@inglorious/web/test";
const { entity, events } = trigger(
{ type: "counter", id: "counter1", value: 10 },
counter.increment,
5,
);
expect(entity.value).toBe(15);
And testing rendering is equally straightforward:
import { render } from "@inglorious/web/test";
const template = counter.render(
{ id: "counter1", type: "counter", value: 42 },
{ notify: vi.fn() },
);
const root = document.createElement("div");
render(template, root);
expect(root.textContent).toContain("Count: 42");
// snapshot testing works too:
expect(root.innerHTML).toMatchSnapshot();
No fake component tree. No lifecycle setup. No async ceremony. Because render is a pure function of an entity, and a pure function is just a function you call.
React, Vue, and Svelte are component-centric. The component is the unit. Logic lives in components, state is owned or lifted by them, everything is a tree.
Inglorious Web is entity-centric. Your application is a collection of entities — pieces of state with associated behaviors. Some entities happen to render. Most of the time you don't think about the tree at all.
If you've heard of the Entity Component System (ECS) architecture used in game engines, this will feel familiar — though it's not a strict implementation. Think of it as ECS meets Redux: entities hold data, types hold behavior, and the store is the single source of truth. The practical consequence is that you can add, remove, or compose behaviors at the type level without touching the UI, and you can test state logic in complete isolation from rendering. That's not just less magic — it's a different ontology.
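Purely as an illustration of that idea (plain object spread, not an official composition API from the framework), behaviors can be bundled and reused because they're just named handler functions:
// Reusable behaviors: nothing but named handlers on plain objects.
const selectable = {
  select(entity) { entity.selected = true; },
  deselect(entity) { entity.selected = false; },
};

const movable = {
  move(entity, position) { entity.position = position; },
};

// A type assembled from smaller behaviors plus its own handlers.
const token = {
  ...selectable,
  ...movable,
  create(entity) { entity.selected = false; },
};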
This is the first post in a series.
In the next post, I'll go deeper into the entity-centric architecture: how types compose, how the ECS lineage maps to real web UI problems, and whether the mental model holds up at scale — from a TodoMVC to a config-driven industrial HMI. I'll also be honest about the ecosystem, the tradeoffs, and where the framework fits and where it doesn't.
In the third post, I'll show the numbers: a benchmark running 1000 rows at 100 updates per second, comparing React (naive, memoized, and with RTK), and a live chart benchmark against Recharts. Performance, bundle size, and what "dramatically smaller optimization surface area" actually looks like in practice.
The ecosystem is moving toward more magic. I'm moving the other way.
2026-02-17 21:26:43
Every time I start a new microservices project, the same thing happens.
I spend the first three weeks not building features, but scaffolding. Wiring up MediatR. Configuring EF Core aggregate mappings. Setting up the event bus. Writing the Dockerfile. Then the Bicep templates. Then the Terraform. Then the CI/CD pipeline. Then realizing my "domain events" are just anemic DTOs being passed around.
By week four, I've written zero business logic and I'm already burned out.
Sound familiar?
After doing this dance across half a dozen projects, I finally built the starter kit I wish someone had handed me on day one. It's a fully wired, 108-file .NET 8 solution with three microservices, real DDD patterns, CQRS, infrastructure as code, and documentation that actually explains why things are the way they are.
This article walks through the key architecture decisions and code. If you want the full thing, the DDD Microservices Starter Kit is on Gumroad.
graph TB
Client[API Client] --> GW[API Gateway]
GW --> OrderAPI[Order Service]
GW --> InvAPI[Inventory Service]
GW --> NotifAPI[Notification Service]
OrderAPI -->|Domain Events| Bus[Event Bus<br/>Azure Service Bus / RabbitMQ]
Bus --> InvAPI
Bus --> NotifAPI
OrderAPI --> OrderDB[(Order DB)]
InvAPI --> InvDB[(Inventory DB)]
NotifAPI --> NotifDB[(Notification DB)]
style Bus fill:#0078d4,color:#fff
style OrderAPI fill:#512bd4,color:#fff
style InvAPI fill:#512bd4,color:#fff
style NotifAPI fill:#512bd4,color:#fff
Three bounded contexts — Order, Inventory, and Notification — each with its own database and its own deployable. They communicate through domain events published over Azure Service Bus (or RabbitMQ for local dev). No shared databases. No temporal coupling. Just messages.
Each microservice follows the same four-layer structure:
graph LR
A[Domain] --> B[Application]
B --> C[Infrastructure]
C --> D[API / Presentation]
A -.->|no dependency| C
A -.->|no dependency| D
style A fill:#2d6a4f,color:#fff
style B fill:#40916c,color:#fff
style C fill:#52b788,color:#fff
style D fill:#95d5b2,color:#000
src/
Services/
Order/
Order.Domain/ # Entities, Value Objects, Domain Events, Repository interfaces
Order.Application/ # Commands, Queries, Handlers, Validators, DTOs
Order.Infrastructure/ # EF Core, Service Bus, External APIs
Order.API/ # Controllers, Middleware, DI configuration
The dependency rule is enforced by project references. Domain references nothing. Application references only Domain. Infrastructure references Application and Domain. API wires it all together.
This isn't just a folder convention — it's compile-time enforcement. If someone tries to reference Infrastructure from Domain, the build fails.
Most "DDD" starter kits I've seen put an Entity base class in the domain and call it a day. That's not DDD. That's an anemic model with extra steps.
Here's what the Order aggregate actually looks like:
public class Order : AggregateRoot<OrderId>
{
private readonly List<OrderLine> _lines = new();
public CustomerId CustomerId { get; private set; }
public OrderStatus Status { get; private set; }
public Money TotalAmount { get; private set; }
public IReadOnlyCollection<OrderLine> Lines => _lines.AsReadOnly();
private Order() { } // EF Core
public static Order Create(CustomerId customerId, Address shippingAddress)
{
var order = new Order
{
Id = OrderId.Create(),
CustomerId = customerId,
Status = OrderStatus.Draft,
TotalAmount = Money.Zero("USD")
};
order.AddDomainEvent(new OrderCreatedEvent(order.Id, customerId));
return order;
}
public void AddLine(ProductId productId, int quantity, Money unitPrice)
{
if (Status != OrderStatus.Draft)
throw new OrderDomainException("Can only add lines to draft orders.");
var line = new OrderLine(productId, quantity, unitPrice);
_lines.Add(line);
RecalculateTotal();
}
public void Submit()
{
if (!_lines.Any())
throw new OrderDomainException("Cannot submit an empty order.");
Status = OrderStatus.Submitted;
AddDomainEvent(new OrderSubmittedEvent(Id, CustomerId, TotalAmount));
}
private void RecalculateTotal()
{
TotalAmount = _lines.Aggregate(
Money.Zero("USD"),
(sum, line) => sum.Add(line.TotalPrice));
}
}
Notice:
- Value objects everywhere: Money, OrderId, CustomerId, and Address — not primitive types scattered everywhere.
- A static factory method (Create) instead of a public constructor — the aggregate controls its own creation.
The Money value object prevents an entire class of bugs:
public record Money
{
public decimal Amount { get; }
public string Currency { get; }
private Money(decimal amount, string currency)
{
if (amount < 0) throw new ArgumentException("Amount cannot be negative.");
Amount = amount;
Currency = currency;
}
public static Money Zero(string currency) => new(0, currency);
public static Money Of(decimal amount, string currency) => new(amount, currency);
public Money Add(Money other)
{
if (Currency != other.Currency)
throw new InvalidOperationException("Cannot add different currencies.");
return new Money(Amount + other.Amount, Currency);
}
}
No more accidentally adding USD to EUR. No more negative totals sneaking in through a careless assignment. The type system catches it.
Every use case is a discrete command or query, dispatched through MediatR:
// Command
public record SubmitOrderCommand(Guid OrderId) : IRequest<Result<OrderDto>>;
// Handler
public class SubmitOrderCommandHandler
: IRequestHandler<SubmitOrderCommand, Result<OrderDto>>
{
private readonly IOrderRepository _orders;
private readonly IUnitOfWork _uow;
public SubmitOrderCommandHandler(IOrderRepository orders, IUnitOfWork uow)
{
_orders = orders;
_uow = uow;
}
public async Task<Result<OrderDto>> Handle(
SubmitOrderCommand request, CancellationToken ct)
{
var order = await _orders.GetByIdAsync(OrderId.From(request.OrderId), ct);
if (order is null) return Result.NotFound();
order.Submit();
await _uow.CommitAsync(ct); // Dispatches domain events after save
return Result.Success(order.ToDto());
}
}
The handler is 15 lines. No logging boilerplate, no validation checks, no try-catch. That's because pipeline behaviors handle cross-cutting concerns:
// Validation — runs before every handler
public class ValidationBehavior<TRequest, TResponse>
: IPipelineBehavior<TRequest, TResponse>
where TRequest : IRequest<TResponse>
{
private readonly IEnumerable<IValidator<TRequest>> _validators;
public ValidationBehavior(IEnumerable<IValidator<TRequest>> validators)
=> _validators = validators;
public async Task<TResponse> Handle(TRequest request,
RequestHandlerDelegate<TResponse> next, CancellationToken ct)
{
var failures = _validators
.Select(v => v.Validate(request))
.SelectMany(r => r.Errors)
.Where(f => f is not null)
.ToList();
if (failures.Any())
throw new ValidationException(failures);
return await next();
}
}
There's also a LoggingBehavior and a PerformanceBehavior (logs warnings for slow handlers) wired up the same way. Add your own — it's just another IPipelineBehavior.
Validation rules use FluentValidation and live next to their commands:
public class SubmitOrderCommandValidator : AbstractValidator<SubmitOrderCommand>
{
public SubmitOrderCommandValidator()
{
RuleFor(x => x.OrderId).NotEmpty().WithMessage("OrderId is required.");
}
}
When order.Submit() is called, it adds an OrderSubmittedEvent to the aggregate's internal event list. The magic happens in the UnitOfWork:
public async Task CommitAsync(CancellationToken ct)
{
// 1. Save to database
await _dbContext.SaveChangesAsync(ct);
// 2. Dispatch domain events (in-process via MediatR)
var domainEvents = _dbContext.GetDomainEvents();
foreach (var domainEvent in domainEvents)
await _mediator.Publish(domainEvent, ct);
// 3. Publish integration events (cross-service via event bus)
await _eventBus.PublishPendingAsync(ct);
}
Domain events stay in-process. Integration events cross service boundaries. An in-process handler maps between them:
public class OrderSubmittedDomainEventHandler
: INotificationHandler<OrderSubmittedEvent>
{
private readonly IEventBus _eventBus;
public OrderSubmittedDomainEventHandler(IEventBus eventBus)
=> _eventBus = eventBus;
public Task Handle(OrderSubmittedEvent notification, CancellationToken ct)
{
_eventBus.Enqueue(new OrderSubmittedIntegrationEvent(
notification.OrderId.Value,
notification.TotalAmount.Amount,
notification.TotalAmount.Currency));
return Task.CompletedTask;
}
}
The IEventBus abstraction swaps between Azure Service Bus in production and RabbitMQ in Docker Compose, with zero code changes in the domain or application layers.
EF Core configuration lives entirely in Infrastructure. The domain never knows about persistence:
public class OrderConfiguration : IEntityTypeConfiguration<Order>
{
public void Configure(EntityTypeBuilder<Order> builder)
{
builder.ToTable("Orders");
builder.HasKey(o => o.Id);
builder.Property(o => o.Id)
.HasConversion(id => id.Value, val => OrderId.From(val));
builder.OwnsOne(o => o.TotalAmount, money =>
{
money.Property(m => m.Amount).HasColumnName("TotalAmount");
money.Property(m => m.Currency).HasColumnName("TotalCurrency");
});
builder.OwnsMany(o => o.Lines, line =>
{
line.WithOwner().HasForeignKey("OrderId");
line.Property(l => l.ProductId)
.HasConversion(id => id.Value, val => ProductId.From(val));
line.OwnsOne(l => l.UnitPrice);
line.OwnsOne(l => l.TotalPrice);
});
builder.Metadata.FindNavigation(nameof(Order.Lines))!
.SetPropertyAccessMode(PropertyAccessMode.Field);
}
}
Value Objects are OwnsOne / OwnsMany. Private collections are accessed via backing fields. The domain model stays clean and the database schema stays sane.
The starter kit ships with three IaC options:
| Tool | Use Case | Files |
|---|---|---|
| Docker Compose | Local development | docker-compose.yml, docker-compose.override.yml |
| Bicep | Azure-native deployment | infra/bicep/ — AKS, Service Bus, SQL, Container Registry |
| Terraform | Multi-cloud / team preference | infra/terraform/ — same resources, HCL syntax |
Plus Kubernetes manifests in k8s/ for when you outgrow Docker Compose, and a GitHub Actions pipeline that builds, tests, and deploys on every push to main.
# .github/workflows/ci-cd.yml (simplified)
jobs:
build-and-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-dotnet@v4
with:
dotnet-version: '8.0.x'
- run: dotnet build --configuration Release
- run: dotnet test --configuration Release --no-build
deploy:
needs: build-and-test
if: github.ref == 'refs/heads/main'
steps:
- uses: azure/login@v2
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
# Build images, push to ACR, deploy to AKS...
The kit includes documentation that goes beyond READMEs:
Architecture Decision Records answer questions like: why dispatch through MediatR instead of resolving handlers straight from IServiceProvider? Why separate integration events from domain events? Why EF Core over Dapper? Each decision is documented with context, options considered, and the rationale.
Here's the full inventory:
✅ 3 microservices with Clean Architecture (Order, Inventory, Notification)
✅ Rich domain models — Aggregates, Value Objects, Domain Events
✅ CQRS via MediatR with validation and logging pipeline behaviors
✅ FluentValidation for command/query validation
✅ EF Core with proper aggregate mapping (owned types, backing fields)
✅ Dual event bus — Azure Service Bus + RabbitMQ
✅ Docker Compose for one-command local dev
✅ Bicep + Terraform for Azure deployment
✅ Kubernetes manifests
✅ GitHub Actions CI/CD pipeline
✅ Unit tests + integration tests
✅ Event Storming guide, Context Map, ADRs, Strangler Fig migration guide
✅ 108 files, all wired and working
This is not a tutorial. It's not a toy. It's the codebase you'd end up with after months of iteration — minus the months.
Stop scaffolding. Start building.
👉 Get the DDD Microservices Starter Kit on Gumroad
The repo is also on GitHub: tysoncung/ddd-microservices-azure-starter
Questions? Hit me up on X: @tscung
Built by Tyson Cung — software architect, DDD practitioner, and recovering scaffolding addict.
2026-02-17 21:25:52
When I needed to implement document signing in a production application, I quickly realized that most guides online fall short: the usual advice boils down to shelling out to OpenSSL via child_process.execSync (fragile, platform-dependent).
So I built it myself — a complete PKI (Public Key Infrastructure) using node-forge for certificate generation combined with Node.js built-in crypto for key management. This post walks you through every step and explains a real compatibility issue you will hit if you use node-forge alone.
By the end, you'll have a working PKI that generates a Root CA, an Intermediate CA, and per-actor signing certificates, verifies the full certificate chain, and signs and verifies arbitrary data with tamper detection.
A PKI is the infrastructure that makes digital trust possible. Before diving into code, it helps to understand how the pieces fit together.
Think of it like a notary system. When you need to prove your identity for a legal document, you don't ask a random person to vouch for you — you go to a recognized institution that everyone trusts. That institution verifies your identity and stamps the document.
A PKI works the same way, but digitally: a Certificate Authority (CA) plays the role of the trusted institution, and the certificates it issues are the stamped, verifiable proofs of identity.
Real-world uses: HTTPS (TLS), document signing (PDF/PAdES), code signing, email (S/MIME).
A PKI is structured as a hierarchy. A production-grade three-tier setup looks like this:
- Root CA: self-signed, kept offline, trusted by declaration
- Intermediate CA: signed by the Root, handles day-to-day certificate issuance
- Signing (leaf) certificates: issued to actors, used to sign documents
This layering is what gives PKIs their resilience: the higher the tier, the less it's exposed, the harder it is to compromise.
Let me save you the debugging session I went through.
node-forge encrypts private keys with forge.pki.encryptRsaPrivateKey(). This works fine. The problem appears later, when you try to decrypt the same key with forge.pki.decryptRsaPrivateKey() — it silently returns null for keys encrypted with AES-256 in certain environments.
This is a known but poorly documented issue. Depending on your Node.js version and the exact AES-256 encryption parameters forge uses, the decryption fails without throwing an error.
The fix: use Node.js built-in crypto.createPrivateKey() for decryption. It handles all standard PKCS#1/PKCS#8 encrypted key formats correctly, then you hand the decrypted key back to forge for operations that require it (like building a PKCS#12 bundle).
Here's the flow that works:
node-forge encryptRsaPrivateKey() → encrypted PEM on disk
↓
Node.js crypto.createPrivateKey() → decrypted CryptoKey object
↓
privateKeyObject.export({ type: 'pkcs1' }) → unencrypted PEM
↓
node-forge privateKeyFromPem() → forge key object (for P12, signing, etc.)
We'll see this in practice in Step 4.
pnpm add node-forge
pnpm add -D @types/node-forge
crypto is a Node.js built-in module — no extra installation needed.
The Root CA is self-signed: it signs its own certificate with its own private key. There is no higher authority vouching for it. It is trusted by declaration — you add it manually to your application's trust store.
Key decisions for the Root CA:
- The workers: -1 option tells node-forge to use all available CPU threads for key generation — without it, a 4096-bit key can block your process for several seconds.
- basicConstraints: cA: true: marks this certificate as a CA. Without this extension, other software refuses to use it to verify or sign anything.
- keyUsage: keyCertSign: explicitly declares that this certificate is authorized to sign other certificates.
- critical: true: if a parser doesn't understand this extension, it must reject the certificate. This prevents old or lenient parsers from ignoring security-relevant constraints.
import forge from 'node-forge';
import crypto from 'crypto';
export interface CACertificate {
certificate: string; // PEM format
privateKey: string; // PEM format, AES-256 encrypted
publicKey: string; // PEM format
}
export async function generateRootCA(passphrase: string): Promise<CACertificate> {
// workers: -1 uses all available CPU threads — essential for 4096-bit to avoid blocking
const keys = forge.pki.rsa.generateKeyPair({ bits: 4096, workers: -1 });
const cert = forge.pki.createCertificate();
cert.publicKey = keys.publicKey;
// Serial numbers must be unique within a CA — use random bytes, not sequential integers
cert.serialNumber = crypto.randomBytes(16).toString('hex');
const now = new Date();
cert.validity.notBefore = now;
cert.validity.notAfter = new Date();
cert.validity.notAfter.setFullYear(now.getFullYear() + 10);
const attrs = [
{ name: 'commonName', value: 'My Root CA' },
{ name: 'organizationName', value: 'My Org' },
{ name: 'countryName', value: 'FR' },
];
cert.setSubject(attrs);
cert.setIssuer(attrs); // Self-signed: issuer = subject, identical fields
cert.setExtensions([
{
name: 'basicConstraints',
cA: true, // This is a CA — can sign other certificates
critical: true,
},
{
name: 'keyUsage',
keyCertSign: true, // Authorized to sign certificates
cRLSign: true, // Authorized to sign Certificate Revocation Lists
critical: true,
},
{
name: 'subjectKeyIdentifier',
// Fingerprint of this key, referenced by child certs via authorityKeyIdentifier
},
]);
// Self-sign: the CA's own private key signs its own certificate
cert.sign(keys.privateKey, forge.md.sha256.create());
// Encrypt the private key before returning — never store it in plaintext
const privateKeyPem = forge.pki.encryptRsaPrivateKey(keys.privateKey, passphrase, {
algorithm: 'aes256',
});
return {
certificate: forge.pki.certificateToPem(cert),
privateKey: privateKeyPem,
publicKey: forge.pki.publicKeyToPem(keys.publicKey),
};
}
Notice setIssuer(attrs) receives the same array as setSubject(attrs). That is the definition of a self-signed certificate — the issuer and subject are identical.
The private key is immediately encrypted before the function returns. It should never exist in plaintext on disk.
The Intermediate CA sits between the Root CA and the signing certificates. Its role: absorb the day-to-day risk of certificate issuance so the Root CA never has to be exposed online.
Key differences from the Root CA:
- pathlenConstraint: 0: this Intermediate CA can sign leaf certificates but cannot create further intermediate CAs. Limits blast radius if compromised.
- authorityKeyIdentifier: links this certificate back to the Root CA's public key fingerprint — the mechanism chain-walkers use to find the issuer.
export interface IntermediateCACertificate {
certificate: string; // PEM format
privateKey: string; // PEM format, AES-256 encrypted
publicKey: string; // PEM format
}
export async function generateIntermediateCA(
rootCA: CACertificate,
rootPassphrase: string,
options: {
commonName: string;
organization?: string;
country?: string;
validityYears?: number;
}
): Promise<IntermediateCACertificate> {
const { commonName, organization = '', country = '', validityYears = 5 } = options;
// Decrypt the Root CA private key to sign this certificate
// — see Step 4 for why we use Node.js crypto here instead of forge
const rootPrivateKey = decryptPrivateKey(rootCA.privateKey, rootPassphrase);
const keys = forge.pki.rsa.generateKeyPair({ bits: 2048, workers: -1 });
const cert = forge.pki.createCertificate();
cert.publicKey = keys.publicKey;
cert.serialNumber = crypto.randomBytes(16).toString('hex');
const now = new Date();
cert.validity.notBefore = now;
cert.validity.notAfter = new Date();
cert.validity.notAfter.setFullYear(now.getFullYear() + validityYears);
const subjectAttrs = [{ name: 'commonName', value: commonName }];
if (organization) subjectAttrs.push({ name: 'organizationName', value: organization });
if (country) subjectAttrs.push({ name: 'countryName', value: country });
cert.setSubject(subjectAttrs);
// Issuer = Root CA
const rootCACert = forge.pki.certificateFromPem(rootCA.certificate);
cert.setIssuer(rootCACert.subject.attributes);
cert.setExtensions([
{
name: 'basicConstraints',
cA: true, // This is a CA — can sign leaf certificates
pathlenConstraint: 0, // But cannot sign further intermediate CAs
critical: true,
},
{
name: 'keyUsage',
keyCertSign: true,
cRLSign: true,
critical: true,
},
{
name: 'authorityKeyIdentifier',
keyIdentifier: true, // Points back to the Root CA's key fingerprint
authorityCertIssuer: true,
},
{ name: 'subjectKeyIdentifier' },
]);
// Signed by Root CA's private key
cert.sign(rootPrivateKey, forge.md.sha256.create());
const privateKeyPem = forge.pki.encryptRsaPrivateKey(keys.privateKey, rootPassphrase, {
algorithm: 'aes256',
});
return {
certificate: forge.pki.certificateToPem(cert),
privateKey: privateKeyPem,
publicKey: forge.pki.publicKeyToPem(keys.publicKey),
};
}
After this step, the Intermediate CA certificate carries the Root CA's signature. Anyone verifying a leaf certificate will walk the chain: leaf → Intermediate CA → Root CA. As long as they trust the Root CA, the entire chain is trusted.
The signing certificate is what you hand to an actor (a user, a service, a role). It is now signed by the Intermediate CA, not the Root directly — the Root CA never needs to be online for day-to-day issuance.
Key differences from the CA certificates:
- basicConstraints: cA: false: this is a leaf certificate — it cannot sign other certificates.
- keyUsage: digitalSignature + nonRepudiation: for signing documents. nonRepudiation means the signer cannot later deny having signed something — required for legal validity in most jurisdictions.
- authorityKeyIdentifier: points to the Intermediate CA's key fingerprint.
export interface SigningCertificate {
certificate: string; // PEM format
privateKey: string; // PEM format, AES-256 encrypted
publicKey: string; // PEM format
}
export async function generateSigningCertificate(
intermediateCA: IntermediateCACertificate,
passphrase: string,
options: {
commonName: string;
organization?: string;
country?: string;
validityDays?: number;
}
): Promise<SigningCertificate> {
const { commonName, organization = '', country = '', validityDays = 730 } = options;
// Decrypt the Intermediate CA private key — using the Node.js crypto workaround
const intermediatePrivateKey = decryptPrivateKey(intermediateCA.privateKey, passphrase);
const keys = forge.pki.rsa.generateKeyPair({ bits: 2048, workers: -1 });
const cert = forge.pki.createCertificate();
cert.publicKey = keys.publicKey;
cert.serialNumber = crypto.randomBytes(16).toString('hex');
const now = new Date();
cert.validity.notBefore = now;
cert.validity.notAfter = new Date();
cert.validity.notAfter.setDate(now.getDate() + validityDays);
const subjectAttrs = [{ name: 'commonName', value: commonName }];
if (organization) subjectAttrs.push({ name: 'organizationName', value: organization });
if (country) subjectAttrs.push({ name: 'countryName', value: country });
cert.setSubject(subjectAttrs);
// Issuer = Intermediate CA (not the Root)
const intermediateCACert = forge.pki.certificateFromPem(intermediateCA.certificate);
cert.setIssuer(intermediateCACert.subject.attributes);
cert.setExtensions([
{
name: 'basicConstraints',
cA: false, // Leaf certificate — cannot sign other certificates
critical: true,
},
{
name: 'keyUsage',
digitalSignature: true, // Can sign data (documents, PDFs)
nonRepudiation: true, // Signature is legally binding — signer cannot deny it
critical: true,
},
{
name: 'authorityKeyIdentifier',
keyIdentifier: true, // Points to Intermediate CA by key fingerprint
authorityCertIssuer: true,
},
{ name: 'subjectKeyIdentifier' },
]);
// Signed by the Intermediate CA's private key
cert.sign(intermediatePrivateKey, forge.md.sha256.create());
const privateKeyPem = forge.pki.encryptRsaPrivateKey(keys.privateKey, passphrase, {
algorithm: 'aes256',
});
return {
certificate: forge.pki.certificateToPem(cert),
privateKey: privateKeyPem,
publicKey: forge.pki.publicKeyToPem(keys.publicKey),
};
}
The Root CA private key is never touched during this step — it can remain offline. Only the Intermediate CA key is needed to issue new signing certificates.
Verification answers: "can I trust this certificate?" In a three-tier PKI, this means walking the full chain: signing certificate → Intermediate CA → Root CA.
Each step does two things: confirm the issuer/subject linkage (via authorityKeyIdentifier), and verify the cryptographic signature (did the parent's private key sign this certificate?). If every link holds and the chain ends at a trusted root, the certificate is valid.
export function verifyCertificateChain(
signingCertPem: string,
intermediateCAPem: string,
rootCAPem: string,
): { valid: boolean; error?: string } {
try {
const signingCert = forge.pki.certificateFromPem(signingCertPem);
const intermediateCert = forge.pki.certificateFromPem(intermediateCAPem);
const rootCert = forge.pki.certificateFromPem(rootCAPem);
// Check the signing cert is a leaf (not a CA)
const leafConstraints = signingCert.getExtension('basicConstraints') as { cA?: boolean } | null;
if (leafConstraints?.cA) {
return { valid: false, error: 'Signing certificate must not be a CA' };
}
// Check all certificates are within their validity windows
const now = new Date();
for (const cert of [signingCert, intermediateCert, rootCert]) {
if (now < cert.validity.notBefore || now > cert.validity.notAfter) {
return { valid: false, error: `Certificate "${cert.subject.getField('CN')?.value}" is expired or not yet valid` };
}
}
// Walk the chain: verify each signature with the parent's public key
// intermediateCert.verify(signingCert) = did Intermediate CA sign the signing cert?
if (!intermediateCert.verify(signingCert)) {
return { valid: false, error: 'Signing certificate signature is invalid (not signed by Intermediate CA)' };
}
// rootCert.verify(intermediateCert) = did Root CA sign the Intermediate CA?
if (!rootCert.verify(intermediateCert)) {
return { valid: false, error: 'Intermediate CA signature is invalid (not signed by Root CA)' };
}
// Root CA must be self-signed
if (!rootCert.verify(rootCert)) {
return { valid: false, error: 'Root CA self-signature is invalid' };
}
return { valid: true };
} catch (err) {
return { valid: false, error: err instanceof Error ? err.message : 'Verification failed' };
}
}
The cert.verify(issuerCert) call is the core of each step: it uses the issuer's public key to validate the signature embedded in the certificate. If the private key that signed it doesn't match the issuer's public key, verification fails.
In a real verification flow, you'd also check revocation status (CRL or OCSP). We keep it simple here.
Here is the part that took me the most time to debug.
node-forge encrypts private keys with forge.pki.encryptRsaPrivateKey(). The encrypted PEM header is BEGIN RSA PRIVATE KEY (PKCS#1 format with an encryption wrapper). When you later need to use that key, the natural choice is forge.pki.decryptRsaPrivateKey(). And this is where it breaks.
The symptom: forge.pki.decryptRsaPrivateKey(encryptedPem, passphrase) returns null. No exception, no error message — just null. Your code then crashes trying to call .sign() on a null value.
The root cause: node-forge's AES-256 decryption for the PBE-SHA1-AES-256-CBC scheme has compatibility issues in certain Node.js versions. It works in some environments, silently fails in others.
The fix: delegate decryption to Node.js built-in crypto, which uses OpenSSL and handles all standard encrypted key formats correctly:
import crypto from 'crypto';
import forge from 'node-forge';
/**
* Decrypts an AES-256 encrypted PEM private key.
*
* node-forge's decryptRsaPrivateKey() silently returns null for AES-256
* encrypted keys in some Node.js versions. Node.js built-in crypto handles
* the same format correctly, so we use it for decryption and re-export
* in a format forge can read.
*/
export function decryptPrivateKey(
encryptedPem: string,
passphrase: string,
): forge.pki.rsa.PrivateKey {
// Step 1: Node.js crypto decrypts the key correctly
const keyObject = crypto.createPrivateKey({
key: encryptedPem,
format: 'pem',
passphrase: passphrase,
});
// Step 2: export as unencrypted PKCS#1 PEM — a format forge reads without issues
const unencryptedPem = keyObject.export({
type: 'pkcs1',
format: 'pem',
}) as string;
// Step 3: now forge can parse it without problems
return forge.pki.privateKeyFromPem(unencryptedPem);
}
Usage is straightforward:
// Encrypted key lives on disk (or in a secrets manager)
const encryptedKeyPem = fs.readFileSync('pki/signing/accountant.key', 'utf-8');
// Decrypt with the workaround
const privateKey = decryptPrivateKey(encryptedKeyPem, process.env.PKI_PASSPHRASE);
// Now use it normally with forge
const p12 = forge.pkcs12.toPkcs12Asn1(privateKey, [cert], passphrase, { algorithm: '3des' });
The key insight: never call forge.pki.decryptRsaPrivateKey() with AES-256 encrypted keys. Always go through crypto.createPrivateKey() instead.
With a decrypted private key and a certificate, you can sign arbitrary data. The process:
- Hash the data with SHA-256.
- Sign the digest with the RSA private key (PKCS#1 v1.5).
- Base64-encode the raw signature bytes for storage and transport.
To verify, the process runs in reverse: recompute the hash of the data, use the public key to decrypt the signature and recover the original hash. If they match, the data is authentic and unmodified.
export function signData(
data: string | Buffer,
privateKey: forge.pki.rsa.PrivateKey,
): string {
const md = forge.md.sha256.create();
const content = typeof data === 'string' ? data : data.toString('binary');
md.update(content, 'utf8');
// RSA-PKCS#1-v1.5 signature: hash → RSA sign → raw bytes
const signature = privateKey.sign(md);
// Base64-encode for safe storage and transport
return forge.util.encode64(signature);
}
export function verifySignature(
data: string | Buffer,
signatureBase64: string,
certificatePem: string,
): boolean {
try {
const cert = forge.pki.certificateFromPem(certificatePem);
const publicKey = cert.publicKey as forge.pki.rsa.PublicKey;
const md = forge.md.sha256.create();
const content = typeof data === 'string' ? data : data.toString('binary');
md.update(content, 'utf8');
// Decodes the signature and verifies against the recomputed hash
return publicKey.verify(md.digest().bytes(), forge.util.decode64(signatureBase64));
} catch {
return false;
}
}
verifySignature only confirms the data was signed with this certificate's private key. It does not confirm that the certificate is trustworthy. For full trust verification, also run the certificate through verifyCertificateChain — a valid signature from an untrusted certificate is worthless.
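To make that explicit, here's a small convenience wrapper; it's just a sketch that composes the two helpers defined earlier in this post:
export function verifyTrustedSignature(
  data: string,
  signatureBase64: string,
  signingCertPem: string,
  intermediateCAPem: string,
  rootCAPem: string,
): { valid: boolean; error?: string } {
  // 1. Is the certificate anchored to our Root CA?
  const chain = verifyCertificateChain(signingCertPem, intermediateCAPem, rootCAPem);
  if (!chain.valid) return chain;

  // 2. Was the data actually signed by that certificate's private key?
  if (!verifySignature(data, signatureBase64, signingCertPem)) {
    return { valid: false, error: 'Signature does not match the data' };
  }

  return { valid: true };
}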
async function main() {
const passphrase = 'min-12-char-passphrase';
// 1. Generate the Root CA once — store offline, never on a server
const rootCA = await generateRootCA(passphrase);
console.log('Root CA created');
// 2. Generate the Intermediate CA — signed by Root, used for day-to-day issuance
const intermediateCA = await generateIntermediateCA(rootCA, passphrase, {
commonName: 'My Signing CA',
organization: 'My Org',
country: 'FR',
validityYears: 5,
});
console.log('Intermediate CA created');
// 3. Issue signing certificates per actor — signed by Intermediate CA
// Root CA does not need to be online for this step
const accountantCert = await generateSigningCertificate(intermediateCA, passphrase, {
commonName: 'Cabinet Dupont & Associés',
organization: 'Cabinet Dupont',
country: 'FR',
validityDays: 730,
});
console.log('Accountant certificate issued');
// 4. Verify the full chain: signing cert → Intermediate CA → Root CA
const chainResult = verifyCertificateChain(
accountantCert.certificate,
intermediateCA.certificate,
rootCA.certificate,
);
console.log('Chain valid:', chainResult.valid); // true
// 5. Sign some data — first decrypt the private key using the workaround
const privateKey = decryptPrivateKey(accountantCert.privateKey, passphrase);
const signature = signData('Tax declaration 2024-Q4', privateKey);
// 6. Verify the signature using the certificate's public key
const signatureValid = verifySignature(
'Tax declaration 2024-Q4',
signature,
accountantCert.certificate
);
console.log('Signature valid:', signatureValid); // true
// Tamper detection — any change to the data fails verification
const tampered = verifySignature(
'Tax declaration 2024-Q4 (modified)',
signature,
accountantCert.certificate
);
console.log('Tampered signature valid:', tampered); // false
}
main();
import fs from 'fs/promises';
import path from 'path';
// Save all three files to a directory
async function saveCertificate(cert: SigningCertificate, dir: string): Promise<void> {
await fs.mkdir(dir, { recursive: true });
await Promise.all([
fs.writeFile(path.join(dir, 'cert.pem'), cert.certificate),
fs.writeFile(path.join(dir, 'key.pem'), cert.privateKey), // AES-256 encrypted
fs.writeFile(path.join(dir, 'public.pem'), cert.publicKey),
]);
}
// Load back from disk
async function loadCertificate(dir: string): Promise<SigningCertificate> {
const [certificate, privateKey, publicKey] = await Promise.all([
fs.readFile(path.join(dir, 'cert.pem'), 'utf-8'),
fs.readFile(path.join(dir, 'key.pem'), 'utf-8'),
fs.readFile(path.join(dir, 'public.pem'), 'utf-8'),
]);
return { certificate, privateKey, publicKey };
}
A few rules to follow in production:
- Never commit pki/ to git — add it to .gitignore. Certificates and encrypted keys should go in a secrets manager, not source control.
- forge.pki.decryptRsaPrivateKey returns null: as described above, use crypto.createPrivateKey() instead. Always.
workers: -1 is essential for 4096-bit keys: without it, node-forge generates RSA 4096 keys synchronously, blocking the entire Node.js event loop for 2–10 seconds. With workers: -1, it uses all available CPU threads in parallel.
Clock skew: notBefore / notAfter validation fails if server clocks drift. Set notBefore to new Date(Date.now() - 5 * 60 * 1000) (5 minutes in the past) as a buffer against clock differences between issuer and verifier.
SHA-1 is dead: forge.md.sha256.create() is what we're using throughout. forge.md.sha1.create() exists but SHA-1 certificates are rejected by all modern runtimes and browsers.
Sequential serial numbers: serial numbers must be unique within a CA. Sequential integers can collide in distributed systems where multiple servers issue certificates concurrently. Use crypto.randomBytes(16) as shown above.
This PKI gives you the foundation for more advanced use cases:
- Combine the signing certificates with @signpdf to sign PDFs in a format that Adobe Reader can verify. I cover this in the next post.
The key takeaways:
- Use Node.js built-in crypto for key decryption — forge.pki.decryptRsaPrivateKey silently returns null for AES-256 encrypted keys; crypto.createPrivateKey() does not.
- Encrypt private keys with forge.pki.encryptRsaPrivateKey() before any key ever touches disk.
- Pass workers: -1 for RSA key generation — it avoids blocking the event loop.
The chain of signatures is what makes trust transitive: if you trust the Root CA, and the Root signed the Intermediate, and the Intermediate signed the leaf, then you trust the leaf. Break any link in the chain and verification fails.
Questions or edge cases you've hit? Drop them in the comments.
2026-02-17 21:25:36
Many developers adopt Polars or uv simply because they "heard it's fast." But understanding the engineering behind these tools is what separates a casual user from a senior data engineer capable of processing terabytes.
Let's dissect the three pillars of efficiency: compute (CPU), memory (RAM), and time (developer experience).
The fundamental problem with Pandas (and classic Python) is the GIL (Global Interpreter Lock). In simple terms, standard Python only lets a single thread execute bytecode at a time.
If you have a MacBook Pro with a 16-core M4 Max chip, Pandas will saturate one core while the other fifteen sit idle.
Polars (and the engine behind uv) are written in Rust, which has no garbage collector and allows "fearless concurrency."
The acceleration mechanism:
Instead of a single scalar addition (a + b), the CPU adds entire vectors ([a1, a2, a3, a4] + [b1, b2, b3, b4]) in a single clock cycle (SIMD), with the work spread across every core.
How to verify this in practice:
Run a heavy query in Polars and open your activity monitor (htop or Activity Monitor). You will see CPU usage jump to 100% on every core. That is compute efficiency: you paid for the whole chip, so use the whole chip.
The biggest lie we were told is that "memory is cheap." In local big data work, memory is the most frequent bottleneck.
In the old model, connecting tools meant copying data.
Every copy doubles RAM consumption and burns CPU cycles.
Polars, DuckDB, and BigQuery DataFrames speak the same language in memory: Apache Arrow, a standardized columnar format.
The zero-copy concept:
Imagine Polars loads a 5 GB Parquet file into memory. You want to query that data with SQL using DuckDB.
Technical demonstration (Python):
import polars as pl
import duckdb
# 1. Polars allocates the memory (Arrow format)
df_polars = pl.scan_parquet("dados_gigantes.parquet").collect()
# 2. DuckDB accesses the SAME memory without copying:
# 'df_polars' is exposed to SQL as a virtual table (a view over the Arrow buffers)
relacao_duck = duckdb.arrow(df_polars.to_arrow())
# 3. The SQL query runs over the memory allocated by Polars
resultado = relacao_duck.query("relacao_duck", "SELECT avg(valor) FROM relacao_duck").fetchall()
This lets you analyze datasets that occupy 80-90% of your RAM without blowing past the limit (out-of-memory errors).
uv resolves dependencies in milliseconds
uv is not just a "faster pip." It changes how files are stored on your disk (the filesystem layout).
When you create 10 projects with Pandas using pip/venv, each virtual environment ends up with its own full copy of Pandas on disk.
When you use uv:
- uv downloads Pandas once and stores it in a global central cache (~/.cache/uv).
- uv does not copy the file into each project; it creates a hardlink (or a reflink on macOS APFS).
pip uses a dependency-resolution algorithm that backtracks (trial and error) inefficiently. If there is a version conflict, it can spend minutes testing combinations.
uv implements the PubGrub algorithm (also used by Dart/Flutter) in Rust. It models the dependency tree mathematically and finds the optimal solution almost instantly.
Changing the tool is not enough; you also have to change your coding mindset.
Stop iterating (iterrows is dead)
If you see for row in df.iterrows(): in your code, stop. It forces Python to process data row by row, wiping out any gain from Rust.
Use .apply() only as a last resort. Prefer Polars' native expressions (pl.col("a").str.to_uppercase()), which run in Rust/C++.
In Pandas, we debugged by printing df.head() every three lines, which forces the code to execute.
In Polars, you build an execution plan instead. Get used to chaining 10 or 20 operations and calling .collect() only at the end.
Pandas tries to "guess" types and frequently gets it wrong (turning numbers into objects/strings). Polars is strict.
Declare an explicit schema when reading files.
Summary of the 2026 philosophy:
We stopped trying to optimize slow Python code. We now use Python only as an API (glue) layer that orchestrates high-performance engines written in Rust and C++. Python commands, Rust executes.
2026-02-17 21:23:22
A gem that can't adapt to its host app will never leave your own repo.
This is part 3 of the series where we build DataPorter, a mountable Rails engine for data import workflows. In part 1, we established the problem and architecture. Part 2 covered scaffolding the engine gem with isolate_namespace.
In this article, we'll build the configuration layer: a clean DSL that lets host apps customize DataPorter's behavior through an initializer. By the end, you'll have a DataPorter.configure block that feels like any well-designed Rails gem.
Our engine needs to run inside apps we don't control. One app uses Sidekiq with a custom queue name. Another stores files on S3. A third needs every import scoped to the current hotel's account.
Hard-coding any of these choices inside the gem would make it useless to anyone whose setup differs from ours. But making everything configurable turns the gem into a configuration puzzle where nobody remembers what goes where.
The challenge is finding the line between flexibility and convention -- and expressing it through an API that feels obvious on first read.
Here's the end result from the host app's perspective:
# config/initializers/data_porter.rb
DataPorter.configure do |config|
config.parent_controller = "Admin::BaseController"
config.queue_name = :low_priority
config.storage_service = :amazon
config.preview_limit = 200
config.context_builder = ->(controller) {
{ hotel: controller.current_hotel, user: controller.current_user }
}
end
This reads exactly like a Devise or Sidekiq initializer. A configure block yields a plain object with sensible defaults. If you don't call configure at all, everything still works.
We need an object that holds every configurable value and provides reasonable defaults out of the box. The simplest approach that works: a plain Ruby class with attr_accessor and defaults set in initialize.
# lib/data_porter/configuration.rb
module DataPorter
class Configuration
attr_accessor :parent_controller,
:queue_name,
:storage_service,
:cable_channel_prefix,
:context_builder,
:preview_limit,
:enabled_sources,
:scope
def initialize
@parent_controller = "ApplicationController"
@queue_name = :imports
@storage_service = :local
@cable_channel_prefix = "data_porter"
@context_builder = nil
@preview_limit = 500
@enabled_sources = %i[csv json api]
@scope = nil
end
end
end
Every attribute has a default that makes the gem work without any initializer. parent_controller defaults to "ApplicationController" because that exists in every Rails app. storage_service defaults to :local because that requires zero setup. preview_limit caps at 500 rows to keep the preview page responsive.
Two attributes default to nil on purpose: context_builder and scope. These are opt-in features. When context_builder is nil, imports run without host-specific context. When scope is nil, the engine shows all imports. The gem checks for nil and adapts its behavior, rather than forcing a value that might be wrong.
Let's walk through the attributes that deserve a closer look.
parent_controller is a string, not a class. That's deliberate. At configuration time (during boot), the host app's controller class might not be loaded yet. We store the string and constantize it later, when Rails actually needs to resolve the inheritance chain.
context_builder is the most interesting one. It's a lambda that receives the current controller instance and returns whatever the host app needs during import. The engine calls it internally with context_builder.call(controller) at the start of an import, passing the result to the Orchestrator. This is how a multi-tenant app passes current_hotel into the import flow without the gem knowing anything about hotels. We'll use this extensively when we build the Orchestrator in part 7.
enabled_sources lets the host app restrict which source types appear in the UI. If you only deal with CSV files, you can set enabled_sources = %i[csv] and the JSON/API options won't clutter the interface.
The Configuration class is just a data object. We need two module-level methods to turn it into a DSL: one to access the singleton instance, and one to yield it for configuration.
# lib/data_porter.rb
module DataPorter
class Error < StandardError; end
def self.configuration
@configuration ||= Configuration.new
end
def self.configure
yield(configuration)
end
end
configuration uses memoization via ||= to ensure a single instance across the application. The first call creates a Configuration.new with all defaults; subsequent calls return the same object. This is the singleton pattern without the ceremony of the Singleton module.
configure yields that singleton to a block. The block receives the Configuration instance and can call any writer method on it (config.queue_name = :low_priority). After the block runs, DataPorter.configuration.queue_name returns whatever the host app set -- or the default if they didn't touch it.
There's no reset! method in production code. We don't need one. The configuration is set once during Rails boot and stays put for the process lifetime. We do need to reset between tests, but that's handled with instance_variable_set in the spec (we'll see that in a moment).
The configuration module gets required early, before the engine loads. This ensures DataPorter.configure is available when the host app's initializer runs.
# lib/data_porter.rb (top of file)
require "rails/engine"
require_relative "data_porter/version"
require_relative "data_porter/configuration"
require_relative "data_porter/engine"
The load order matters. configuration.rb comes before engine.rb because the Engine class might reference configuration values during setup. In practice, Rails processes initializers after the engine is loaded, so the host app's configure block runs with the full gem already available.
Other parts of the gem read configuration like this:
# Inside any DataPorter class
DataPorter.configuration.queue_name
DataPorter.configuration.context_builder&.call(controller)
The safe navigation operator (&.) on context_builder handles the nil default gracefully. When no builder is configured, the call simply returns nil instead of raising a NoMethodError.
Configuration object — Plain class with attr_accessor over OpenStruct or Dry::Configurable. No dependencies, easy to read, easy to document. IDE autocompletion works with real attributes.
Singleton pattern — Memoized module instance variable over the Singleton module or Rails.application.config. Simpler API (DataPorter.configure), no coupling to Rails config namespace, works in non-Rails test contexts.
context_builder type — Lambda over a stored proc or method object. Lambdas enforce arity (catches wrong argument count), and the ->() {} syntax signals "this is a callable" to the reader.
parent_controller type — String over class constant. Avoids load-order issues: the class may not exist at configuration time, but the string can be constantized later.
Default for optional features — nil over the null object pattern. Simpler to check if context_builder than to create a no-op null object. The gem has few enough optional features to keep the nil checks manageable.
The specs verify two things: that defaults are sane, and that the configure block actually mutates the singleton.
# spec/data_porter/configuration_spec.rb
RSpec.describe DataPorter::Configuration do
subject(:config) { described_class.new }
it "has default parent_controller" do
expect(config.parent_controller).to eq("ApplicationController")
end
it "has default queue_name" do
expect(config.queue_name).to eq(:imports)
end
it "has default preview_limit" do
expect(config.preview_limit).to eq(500)
end
it "has nil context_builder by default" do
expect(config.context_builder).to be_nil
end
end
Notice that each default gets its own test. This is intentional. When someone changes a default six months from now, the failure message says exactly which default broke, not just "configuration test failed."
The module-level specs test the singleton behavior and the configure yield pattern:
# spec/data_porter/configuration_spec.rb
RSpec.describe DataPorter do
describe ".configure" do
after { DataPorter.instance_variable_set(:@configuration, nil) }
it "yields the configuration" do
DataPorter.configure do |config|
config.queue_name = :custom_queue
end
expect(DataPorter.configuration.queue_name).to eq(:custom_queue)
end
end
describe ".configuration" do
after { DataPorter.instance_variable_set(:@configuration, nil) }
it "memoizes the configuration" do
expect(DataPorter.configuration).to be(DataPorter.configuration)
end
end
end
The after block resets the singleton between tests using instance_variable_set. This is the one place where we reach into internals, and it's acceptable because test isolation trumps encapsulation here. A public reset! method would leak test concerns into production code.
- The Configuration class is a plain Ruby object with attr_accessor and defaults in initialize. No framework magic, no dependencies.
- Two module-level methods (configure and configuration) create the DSL that host apps use in their initializer.
- Optional features like context_builder and scope are opt-in via nil defaults.
- Storing parent_controller as a string avoids boot-order issues. Using a lambda for context_builder enforces arity and reads clearly.
Configuration tells the gem how to behave. In part 4, we'll tackle what it operates on: the data models. We'll use StoreModel and JSONB columns to store import records, validation errors, and summary reports as structured data inside a single table -- no migration per import type, no schema sprawl. If you've ever debated "extra table vs. JSON column," that's the one to read.
This is part 3 of the series "Building DataPorter - A Data Import Engine for Rails". Previous: Scaffolding a Rails Engine gem | Next: Modeling import data with StoreModel & JSONB
GitHub: SerylLns/data_porter | RubyGems: data_porter
2026-02-17 21:23:13
Using OpenID Connect (OIDC) as an authentication source is one of the best practices when working with infrastructure, as it significantly improves both security and maintainability. Keycloak is an excellent open-source project widely adopted for this purpose. It supports many features and storage backends (such as PostgreSQL) and has straightforward deployment instructions on their official website.
However, I recently encountered an interesting challenge when deploying Keycloak in Kubernetes that required a specific configuration to solve internal service communication issues.
When deploying Keycloak in Kubernetes, you typically specify a public hostname using the --hostname=https://auth.example.com parameter. This works perfectly for external clients accessing your authentication service.
But here's where it gets tricky: imagine you have other services running in your Kubernetes cluster—perhaps a container registry or CI server—that need to authenticate with Keycloak. These services need to access the discovery URL at https://auth.example.com/realms/{realm-name}/.well-known/openid-configuration to retrieve authentication configuration.
The issue arises because Keycloak internally always redirects to (and generates tokens/URLs based on) the hostname that was specified during deployment. But what happens when this public URL is not resolvable by pods inside the Kubernetes cluster? This creates a problem where internal services can't properly reach Keycloak for backchannel requests (token introspection, userinfo, etc.), even if they can reach the pod via internal DNS.
Fortunately, Keycloak provides a CLI option to address this exact issue (available when the hostname:v2 feature is enabled):
--features=hostname:v2
--hostname-backchannel-dynamic=true
This configuration tells Keycloak to dynamically determine the backchannel (internal) URLs based on the incoming request, allowing access via:
- The public hostname (https://auth.example.com) for external clients
- The internal Kubernetes service DNS (keycloak.keycloak-namespace.svc.cluster.local:8080/realms/{realm-name}/.well-known/openid-configuration) for services inside the cluster
With --hostname-backchannel-dynamic=true enabled:
- External clients keep using the public hostname (https://auth.example.com) for authentication flows.
- Internal services can make backchannel requests (discovery, token introspection, userinfo) against the cluster-internal service URL.
This dual-access approach ensures that external authentication flows keep working against the public hostname while in-cluster service-to-service traffic never depends on resolving the public DNS name.
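As a rough illustration (the realm name below is a placeholder, and I'm assuming a plain fetch-capable Node.js/TypeScript service), an in-cluster client can now load the discovery document through the internal service DNS:
// Runs inside the cluster; never touches the public hostname.
const internalIssuer =
  "http://keycloak.keycloak-namespace.svc.cluster.local:8080/realms/my-realm";

const response = await fetch(`${internalIssuer}/.well-known/openid-configuration`);
const discovery = await response.json();

// With --hostname-backchannel-dynamic=true, backchannel endpoints are derived
// from the incoming request, so follow-up calls can stay inside the cluster.
console.log(discovery.token_endpoint);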
Note for production: For this to work securely, make sure your ingress / reverse proxy correctly passes Forwarded or X-Forwarded-* headers, and consider enabling HTTPS on both external and internal access paths.
Here's how you might configure this in a Kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: keycloak
spec:
template:
spec:
containers:
- name: keycloak
image: quay.io/keycloak/keycloak:latest
args:
- start
- --features=hostname:v2 # required for dynamic backchannel
- --hostname=https://auth.example.com
- --hostname-backchannel-dynamic=true
- --db=postgres
- --proxy-headers=forwarded # important for correct header handling behind proxy/ingress
# ... other configuration (ports, HTTPS, DB credentials via env vars, etc.)
The --hostname-backchannel-dynamic=true flag (combined with the hostname:v2 feature) is a simple yet powerful solution for mixed internal/external access scenarios in Kubernetes. While the public URL remains ideal for external client access, internal service-to-service communication often requires this flexibility.
Keycloak's hostname configuration options make it a robust choice for authentication infrastructure in containerized environments.