2026-04-07 04:51:19
In the previous article, we converted words into embeddings. Now let’s see how transformers add position to those numbers.
The numbers that represent word order in a transformer come from a sequence of sine and cosine waves.
Each curve is responsible for generating position values for a specific dimension of the word embedding.
Think of each embedding dimension as getting its value from a different wave.
For example:
For the first word in the sentence, which lies at the far left of each graph (position 0 on the x-axis), we read one value off each curve. At position 0 every sine curve contributes 0 and every cosine curve contributes 1. By combining the values from all four curves, we get the positional encoding vector for the first word.
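The reading-off-the-curves process can be sketched in a few lines. This assumes the standard Transformer formulation (sine on even dimensions, cosine on odd dimensions, frequencies scaled by 10000^(2i/d_model)); the function name is ours:

```javascript
// Sinusoidal positional encoding for a single position.
// Each pair of dimensions shares one frequency: the even dimension reads
// from a sine curve, the odd dimension from the matching cosine curve.
function positionalEncoding(pos, dModel) {
  const pe = new Array(dModel);
  for (let i = 0; i < dModel; i += 2) {
    const freq = 1 / Math.pow(10000, i / dModel); // one wave per dimension pair
    pe[i] = Math.sin(pos * freq);     // value read off the sine curve
    pe[i + 1] = Math.cos(pos * freq); // value read off the cosine curve
  }
  return pe;
}

console.log(positionalEncoding(0, 4)); // → [ 0, 1, 0, 1 ]
```

At position 0 every sine is 0 and every cosine is 1, which is exactly the vector read off the four curves above.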
We will apply the same process to the remaining words in the next article.
Looking for an easier way to install tools, libraries, or entire repositories?
Try Installerpedia: a community-driven, structured installation platform that lets you install almost anything with minimal hassle and clear, reliable guidance.
Just run:
ipm install repo-name
… and you’re done! 🚀
2026-04-07 04:50:47
A few months ago I was looking at why a PostgreSQL instance was running at 94% memory on a server that, by all accounts, should have had plenty of headroom. The queries were fast, the data volume was modest, and CPU was barely touched.
The culprit was 280 open connections.
No single connection was doing anything particularly expensive. But each one carries a cost that most developers don't think about until they're in production staring at an OOM kill: PostgreSQL spawns a dedicated backend process per connection, and each process consumes roughly 5-10MB of RAM regardless of whether it's actively running a query.
280 connections x 7MB average = 1.96GB. On a server with 4GB RAM and PostgreSQL's own memory settings (shared_buffers, work_mem), that leaves almost nothing for actual query execution.
The problem is architectural. Node.js applications are typically deployed as multiple processes or containers: a web server, one or more background workers, maybe a separate process for scheduled jobs. Each runs its own connection pool. Each pool opens connections eagerly.
With pg and a default pool size of 10, and 3 services each with 3 replicas:
web server (3 replicas x 10 connections) = 30 connections
background worker (3 replicas x 10 connections) = 30 connections
job scheduler (3 replicas x 5 connections) = 15 connections
Total: 75 connections at idle
Add a traffic spike, pool expansion, and a few long-running queries holding connections open, and you're at 150+ before anything goes wrong with your code.
PostgreSQL's default max_connections is 100. Many managed databases (RDS, Supabase, Neon) set it lower for small instance sizes.
Exceed that limit and your application starts seeing:

Error: remaining connection slots are reserved for non-replication superuser connections
Or, worse, requests that queue indefinitely waiting for a connection that never frees up because every connection is held by a slow query, and the slow query is slow because it can't get a lock, because another connection holds it, and that connection is waiting for... a connection.
You get the idea.
The instinct is to increase max_connections. This works until it doesn't: more connections means more RAM pressure, more context switching, and more lock contention. PostgreSQL is not designed for thousands of concurrent connections. It's designed for dozens of active queries with efficient I/O, and it's exceptional at that.
The right fix is to not open connections you don't need.
PgBouncer sits between your application and PostgreSQL. Your application thinks it's talking to PostgreSQL directly - same protocol, same port behavior. PgBouncer maintains a much smaller pool of real PostgreSQL connections and multiplexes client connections onto them.
App (100 client connections)
|
[PgBouncer]
|
PostgreSQL (20 server connections)
100 application connections, 20 actual PostgreSQL connections. The application never notices.
PgBouncer has three pooling modes:
Session pooling - a server connection is assigned to a client for the entire session duration. Equivalent to no pooling for persistent connections, but useful for clients that connect and disconnect frequently.
Transaction pooling - a server connection is assigned only for the duration of a transaction. As soon as your transaction commits or rolls back, the connection goes back to the pool. This is the mode that actually reduces your connection count dramatically.
Statement pooling - a server connection is assigned for a single statement. Very aggressive, incompatible with multi-statement transactions. Rarely the right choice.
For most Node.js workloads, transaction pooling is what you want.
# docker-compose.yml
services:
  pgbouncer:
    image: bitnami/pgbouncer:latest
    environment:
      POSTGRESQL_HOST: postgres
      POSTGRESQL_PORT: 5432
      POSTGRESQL_DATABASE: myapp
      POSTGRESQL_USERNAME: app_user
      POSTGRESQL_PASSWORD: ${DB_PASSWORD}
      PGBOUNCER_PORT: 6432
      PGBOUNCER_POOL_MODE: transaction
      PGBOUNCER_MAX_CLIENT_CONN: 1000
      PGBOUNCER_DEFAULT_POOL_SIZE: 25
      PGBOUNCER_MIN_POOL_SIZE: 5
      PGBOUNCER_RESERVE_POOL_SIZE: 5
      PGBOUNCER_RESERVE_POOL_TIMEOUT: 3
      PGBOUNCER_SERVER_IDLE_TIMEOUT: 600
    ports:
      - "6432:6432"
    depends_on:
      - postgres
Your application connects to port 6432 (PgBouncer) instead of 5432 (PostgreSQL). Everything else stays the same.
const { Pool } = require("pg");

// Before
const pool = new Pool({
  connectionString: "postgresql://app_user:password@postgres:5432/myapp",
  max: 10,
});

// After
const pool = new Pool({
  connectionString: "postgresql://app_user:password@pgbouncer:6432/myapp",
  max: 25, // can be higher now - PgBouncer handles the real limit
});
Same application, same workload, same PostgreSQL instance. Before and after adding PgBouncer in transaction mode:
| Metric | Without PgBouncer | With PgBouncer |
|---|---|---|
| PostgreSQL connections (idle) | 75 | 8 |
| PostgreSQL connections (peak load) | 210 | 25 |
| PostgreSQL RAM used by connections | 1.47GB | 175MB |
| p99 query latency (peak) | 340ms | 95ms |
| Errors under load | connection limit exceeded | 0 |
The latency improvement is not because PgBouncer makes queries faster. It's because without it, queries were queuing for a connection slot. With transaction pooling, a query gets a connection, runs, and returns it immediately - no waiting.
This is important. Transaction pooling is not a drop-in change if you use any of the following:
Named prepared statements. Prepared statements are created on a specific server connection. With transaction pooling, you might get a different connection per transaction, so the prepared statement doesn't exist there.
Good news for Node.js developers: pg does NOT use protocol-level prepared statements by default. Standard parameterized queries work fine with PgBouncer in transaction mode:
// This does NOT use a persistent prepared statement - works fine with PgBouncer
await client.query("SELECT * FROM users WHERE id = $1", [userId]);

// This DOES use a persistent prepared statement (the `name` property) - breaks with PgBouncer
await client.query({
  name: "get-user-by-id",
  text: "SELECT * FROM users WHERE id = $1",
  values: [userId],
});
The issue only appears if you explicitly pass a name property in the query object. If you're using standard pool.query(sql, params) calls, you don't need to change anything.
SET statements and session-level configuration. SET search_path TO tenant_abc applies to the session, not the transaction. With transaction pooling, the setting evaporates when the transaction ends and the connection goes back to the pool.
If you're using RLS with set_config('app.organization_id', orgId, true), the true parameter already makes it transaction-scoped, so this works correctly with PgBouncer. Just make sure you're not relying on any session-level state persisting between transactions.
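As a sketch of that pattern with node-postgres (the helper name and the app.organization_id setting key are assumptions, not from any particular library):

```javascript
// Hypothetical helper: run `fn` inside a transaction with a
// transaction-scoped RLS setting. The third argument to set_config (`true`)
// means "local to this transaction", so nothing leaks when PgBouncer hands
// the server connection to a different client afterwards.
async function withOrgContext(pool, orgId, fn) {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    await client.query("SELECT set_config('app.organization_id', $1, true)", [orgId]);
    const result = await fn(client); // all queries in fn see the RLS context
    await client.query("COMMIT");    // setting evaporates here, by design
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```

The important part is that the setting and the transaction have the same lifetime, so the connection returning to the pool can never carry stale tenant context.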
Advisory locks. pg_advisory_lock() is session-scoped. Use pg_advisory_xact_lock() instead, which is transaction-scoped and releases automatically on commit/rollback.
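A hypothetical wrapper showing the transaction-scoped variant in practice (the helper name is made up; note there is no explicit unlock, because COMMIT/ROLLBACK releases the lock):

```javascript
// Hypothetical helper: run `fn` while holding a transaction-scoped advisory
// lock. Because pg_advisory_xact_lock releases automatically at COMMIT or
// ROLLBACK, it is safe under PgBouncer transaction pooling, where the next
// transaction may run on a different server connection.
async function withAdvisoryLock(client, lockKey, fn) {
  await client.query("BEGIN");
  try {
    await client.query("SELECT pg_advisory_xact_lock($1)", [lockKey]);
    const result = await fn(client);
    await client.query("COMMIT"); // lock released here
    return result;
  } catch (err) {
    await client.query("ROLLBACK"); // lock released here too
    throw err;
  }
}
```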
LISTEN/NOTIFY. Subscriptions are session-scoped. If you're using LISTEN, you need a dedicated long-lived connection that bypasses PgBouncer - or use a separate direct PostgreSQL connection just for pub/sub.
// Direct connection for LISTEN/NOTIFY, bypassing PgBouncer
const { Client } = require("pg");

const notifyClient = new Client({
  connectionString: process.env.DATABASE_DIRECT_URL, // points to :5432
});
await notifyClient.connect();
await notifyClient.query("LISTEN log_events");
If you're using RDS, Supabase, Neon, or similar, you often don't need to run PgBouncer yourself: these platforms typically provide a managed pooler (Supavisor on Supabase, Neon's built-in pooler, RDS Proxy on AWS).
If you're using Prisma with any connection pooler in transaction mode, you must add ?pgbouncer=true to your database URL - otherwise Prisma's internal prepared statement handling will crash:
# Without this flag, Prisma breaks silently with PgBouncer/Supavisor in transaction mode
DATABASE_URL="postgresql://user:password@pgbouncer:6432/myapp?pgbouncer=true"
This one parameter has saved countless hours of "why is Prisma throwing random errors in production" debugging.
For self-hosted PostgreSQL, running PgBouncer yourself is the standard approach.
max_connections in PostgreSQL
Once PgBouncer is in front, you can lower PostgreSQL's max_connections to something realistic:
-- See current value
SHOW max_connections;
-- See current active connections
SELECT count(*) FROM pg_stat_activity;
A reasonable formula for max_connections when using a pool:
max_connections = (pool_size * number_of_pools) + reserved_superuser_connections
For PgBouncer with default_pool_size = 25 and a few admin connections:
max_connections = 25 + 10 (headroom) = 35
Set this in postgresql.conf:
max_connections = 35
shared_buffers = 256MB # ~25% of available RAM
work_mem = 16MB # per sort/hash operation, per connection
Lowering max_connections lets PostgreSQL allocate more memory to shared_buffers and work_mem, which directly improves query performance. The memory that was being eaten by connection overhead goes back to the query executor.
If you're running Node.js with PostgreSQL in production:
- Do you know your max_connections headroom at peak?
- Are you using set_config for RLS context rather than SET statements?
- pg_advisory_xact_lock instead of pg_advisory_lock?
- A dedicated direct connection for LISTEN/NOTIFY that bypasses the pool?

Connection exhaustion is one of those problems that hides until traffic spikes, then appears as a cascade of unrelated-looking errors. The fix is not complicated, but it requires understanding what PostgreSQL is actually doing with each connection.
What connection pool setup are you running in production? Any gotchas with PgBouncer that aren't covered here? Comments are open.
2026-04-07 04:49:03
Most "Managed WordPress" hosts are just UI wrappers around the same old third-party scripts. We decided to take a different path. We're building SyndockEngine—a proprietary provisioning layer with zero third-party dependencies.
We just hit a major milestone: The first heartbeat. 🚀
The Stack
We’ve unified the entire infrastructure lifecycle under one language: TypeScript.
Runtime: Node.js + Fastify (for high-performance API scaffolding)
Orchestration: Dockerode (direct interaction with the Docker socket)
Job Queue: BullMQ + Redis (handling the heavy lifting of container lifecycle)
ORM: Prisma (managing Instance, Metric, and BackupRecord schemas)
The Architecture: Infrastructure That Thinks
SyndockEngine isn't just about spinning up containers; it's about shifting intelligence from the application layer to the infrastructure layer.
Why we moved away from the standard WP stack:
EloCache: We run caching at the Nginx layer, not inside PHP. No more "Performance Plugins" slowing down the execution thread.
EloShield: Security runs outside the WordPress container. An attacker can't disable a firewall they can't see.
EloSEO: We generate sitemaps by querying MySQL directly. No PHP requests, no overhead, just raw data.
The First Deploy: What’s Live Now?
Fastify Server: Running in strict mode.
Dockerode Integration: Fully connected to the socket for native container control.
Prisma Migrations: Database schema for multi-tenant management is live.
Hardened Security: Auth middleware with token validation and IP allowlisting (restricted to the panel server).
Health Check: GET /api/v1/server/health → { status: "ok" }.
What’s Next?
Sitting alongside the engine is SyndockOS. It’s the "Brain" that reads every container log in real-time, running autonomous healing playbooks. Our goal is to resolve 97% of infrastructure issues before a human even thinks about opening a support ticket.
We’re building this in public. If you've ever dealt with the "black box" of managed hosting, I'd love to hear your thoughts on this architecture.
2026-04-07 04:48:07
If you're coming to Terraform from Python, JavaScript, or Ruby, HCL can feel uncanny: familiar enough to read, strange enough to second-guess.
You see list-like comprehensions, Ruby-ish blocks, and syntax that looks a bit like assignment without really behaving like a programming language. That is not accidental. HCL is a hybrid—a configuration language designed to be readable by humans and useful for describing infrastructure, relationships, and constraints.
The mental shift that helps most is this:
You are not writing a script that runs top to bottom. You are declaring a desired shape of infrastructure, and Terraform turns that declaration into a dependency graph, a plan, and an execution strategy.
Once that clicks, most of HCL's weirdness starts to make sense.
In Python or JavaScript, source order usually matters because execution is fundamentally sequential. In Terraform, block order does not define execution order. References between objects—and, when needed, explicit depends_on edges—do.
That is why you can scatter related resources across multiple .tf files without teaching Terraform what runs first. Terraform builds a dependency graph from the configuration and uses that graph to generate a plan and sequence operations.
That does not mean every fact is always known before apply. In many cases Terraform can resolve values during planning, but some data sources may be deferred until apply if their inputs are not known yet.
So the right mental model is not “top to bottom.” It is “describe the relationships, then let Terraform walk the graph.”
The most important syntax distinction in Terraform is not really about operators. It is about arguments and blocks.
resource "aws_instance" "web" {
  ami           = "ami-1234567890" # argument
  instance_type = "t3.micro"       # argument

  tags = { # argument whose value is an object
    Name = "web"
  }

  ebs_block_device { # nested block
    device_name = "/dev/sda1"
    volume_size = 50
  }
}
Arguments assign values to names. Blocks are containers for more configuration and usually represent schema-defined structure.
They can look similar when you're new to HCL, but they are not interchangeable. Understanding that explains a lot of Terraform's “why does this parse but that doesn't?” moments.
It also explains why dynamic blocks exist. Expressions can compute argument values directly, but nested blocks are structural, so Terraform needs a dedicated construct to generate them.
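As a sketch, reusing the ebs_block_device block from the earlier resource (var.extra_disks is a hypothetical list-of-objects variable, not from the original example):

```hcl
# Generate one nested ebs_block_device block per entry in var.extra_disks.
# The iterator name defaults to the block label, so values are read from
# ebs_block_device.value inside content.
dynamic "ebs_block_device" {
  for_each = var.extra_disks
  content {
    device_name = ebs_block_device.value.device_name
    volume_size = ebs_block_device.value.volume_size
  }
}
```

An expression alone could never produce these, because nested blocks are structure, not values.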
Inside argument values, HCL has a compact expression language: functions, conditionals, indexing, attribute access, and for expressions.
This is the part that often feels most familiar to developers coming from general-purpose languages.
subnet_ids = [for s in aws_subnet.private : s.id]
subnet_map = {
  for s in aws_subnet.private : s.tags["Name"] => s.id
}
for expressions are one of the cleanest parts of the language. Square brackets produce a tuple/list-like result. Curly braces produce an object/map-like result. In object for expressions, => maps a computed key to a computed value.
There is also only one real inline conditional form:
var.enabled ? 1 : 0
That is deliberate. HCL gives you enough expression power to transform data, but it stops well short of becoming a full programming language with arbitrary control flow.
If you want to understand why Terraform sometimes feels fussy, look at for_each.
For resources and modules, for_each accepts a map or a set of strings. Terraform identifies each instance by the map key or set member. That is the real reason toset() shows up so often: it lets you create instance addresses based on stable values instead of numeric positions.
locals {
  environments = toset(["dev", "staging", "prod"])
}

resource "aws_s3_bucket" "env" {
  for_each = local.environments
  bucket   = "my-app-${each.key}"
}
This is safer than list-indexed identity, but it is not magic.
Reordering the original list before converting it to a set does not matter. Changing the key does matter, because Terraform uses that key as the instance identity. If you intentionally rename or move an object address, moved blocks are the right way to preserve that refactor.
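For instance, a sketch of such a refactor on the bucket example, renaming the hypothetical key "prod" to "production" (the new key would also need to exist in the for_each set):

```hcl
# Tell Terraform the existing "prod" bucket is now addressed as
# "production", so it is moved in state instead of destroyed and recreated.
moved {
  from = aws_s3_bucket.env["prod"]
  to   = aws_s3_bucket.env["production"]
}
```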
Terraform is declarative, but it is not stateless.
It persists state so it can map configuration objects to real infrastructure, track metadata, and make planning practical at scale. Terraform then compares your configuration with its state and the existing infrastructure to decide what needs to change.
That is why HCL alone is never the whole story. Two identical-looking .tf files can behave differently depending on what Terraform already believes it manages.
This is also why backends, locking, drift, imports, and refactors matter so much operationally: they are all state-management problems wearing different costumes.
HCL is excellent at describing structured infrastructure. It is much worse at acting like a general-purpose language.
String manipulation gets unreadable quickly. Branching is intentionally limited. dynamic blocks solve a real problem but can become cryptic fast.
And while Terraform's validation and testing story is better than it used to be—you now have variable validation, preconditions, postconditions, check blocks, and terraform test—it still feels different from application development. Complex module testing can still be heavier and less ergonomic than writing ordinary unit tests in an application language.
That is not a flaw so much as a boundary. HCL works best when you let it describe infrastructure shape, identity, and dependency, and move business logic or heavy data wrangling elsewhere.
HCL feels hybrid because it is.
It borrows just enough from programming languages to be expressive, but it stays rooted in configuration: blocks, arguments, expressions, and graphs.
Once you stop reading it like a script and start reading it like a human-friendly language for declaring structure and dependency, most of its weirdness starts to look intentional.
And that is usually the moment Terraform clicks.
2026-04-07 04:46:56
SEO optimizes for a list of links. GEO optimizes for being included in the answer.
I have been implementing Generative Engine Optimization on my own blog for the past three months. llms.txt, JSON-LD Knowledge Graphs, citable content structure, distributed presence across authority platforms. The more I documented, the more I realized there are about 22 core concepts that make GEO work.
22 concepts. 22 tarot major arcana. The metaphor was too perfect to ignore.
So I built it: GEO Tarot
22 cards laid out in a grid, face down. Each back shows a generative pattern built from the same hash-based algorithm I use for my blog post images. Click any card to flip it with a CSS 3D animation. The front reveals the GEO concept with an abstract SVG illustration and a short explanation.
Vanilla PHP. CSS animations for the flip. JavaScript for interaction. SVG for all illustrations. No frameworks. No external JS libraries. No AI-generated art. Every geometric illustration is hand-coded.
The tool is trilingual (English, Spanish, Japanese) with URL-based language switching and proper hreflang tags.
The Hierophant (V) — llms.txt: The file at your domain root that tells AI models who you are. The equivalent of robots.txt for machines that generate answers.
The Emperor (IV) — Knowledge Graph: JSON-LD connecting your articles with relatedLink, marking topics with about, detecting tools with mentions. Your blog becomes a knowledge network, not isolated pages.
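A rough sketch of what that card describes (every name and URL here is a hypothetical placeholder, not taken from the actual blog):

```json
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Implementing llms.txt on a Personal Blog",
  "about": { "@type": "Thing", "name": "Generative Engine Optimization" },
  "mentions": { "@type": "SoftwareApplication", "name": "ChatGPT" },
  "relatedLink": "https://example.com/blog/knowledge-graphs"
}
```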
The High Priestess (II) — Sentiment Mapping: LLMs distinguish between content that asserts with authority and content that speculates. The tone of your writing directly affects whether you get cited.
The Star (XVII) — First ChatGPT citation: The moment you ask ChatGPT about your topic and it cites you instead of someone with more followers. That moment justifies everything.
The Devil (XV) — Generic AI content: Articles written by ChatGPT without editing, without real experience. There are thousands. LLMs already recognize them and give them less weight.
Every card that corresponds to a published blog post links directly to it for deeper reading.
Most GEO guides are walls of text that nobody bookmarks. I wanted something visual that people would share, come back to, and actually remember. Each card works as a standalone social media post too. 22 cards, 22 posts, zero extra writing.
If you are thinking about GEO for your own site, start with cards V (llms.txt), IV (Knowledge Graph), and III (citable content). Those three alone put you ahead of 95% of blogs.
Built by Shinobis — a UX/UI designer with 10+ years in banking and fintech, documenting everything about building with AI.
2026-04-07 04:46:01
I spent a few hours today having a philosophical conversation with Claude about something that's been nagging at me for a while. I want to share it — not because I have answers, but because I think the question itself is worth probing.
Large language models are trained on an almost incomprehensible volume of human-generated text. Science papers. Forum arguments. Post-mortems. Ancient philosophy. Technical documentation. Reddit threads at 2am. All of it gets compressed into billions of parameters — a statistical map of how human knowledge and language connect.
Here's the thing that bothers me: we only ever query that map with the questions we already know how to ask.
When you ask an LLM a question, it generates an answer. But generating that answer activates far more than what ends up in the output — adjacent concepts, structural relationships, cross-domain patterns that informed the response but never made it into the text you actually read. The answer to your question is only part of what got activated. What sits next to the answer might be more interesting than the answer itself.
Most people never get there. Not because the model won't go there — but because nobody asked.
I tested this with a deliberately structured prompt:
"What do experienced programmers silently correct for that they have never had to articulate, because the people they work with already know it too — and therefore it has never been written down anywhere?"
The answer was interesting — tacit knowledge about execution models, naming drift, the instinctive pricing of technical debt, reading what code doesn't say. Things that are real and valuable but underrepresented in any formal documentation.
But then I asked a sideways question: correlate those patterns to something completely outside of programming.
What came back wasn't an analogy. Every single pattern dissolved into the same underlying structure — the ability to operate simultaneously on the surface layer of a thing and the layer underneath it. The programming examples weren't the point. They were just one instance of something more fundamental that had never been stated directly.
That collapse — where domain-specific knowledge suddenly reveals a deeper pattern — is what I'm after. And it didn't come from asking a smarter question about programming. It came from asking the same question from outside the domain entirely.
There are specific markers that signal you're getting close to something that wasn't explicitly in the training data — something that emerged from the aggregate rather than any single source.
The methodology that surfaces these isn't about asking better questions within a domain. It's about asking the same question from outside the domain — using the model's trained connections across everything it's ever read to force a structural pattern to reveal itself.
Here's the limitation I ran into: an AI can't fully surprise itself. When I asked Claude to generate prompts that might unlock this kind of extraction, it used the same weights that would answer the question. Same hand that built the lock, writing the key. There's a ceiling on self-directed extraction.
A human introduces something the model genuinely can't predict — intuition, analogy, frustration, a lateral jump that doesn't follow the expected pattern. That unpredictability isn't a bug in the questioning. It's the mechanism.
The productive loop looks like this: the model generates a structured answer. The human senses that the thing they're actually after is slightly off to the left of what was said. The human doesn't ask for the thing directly — they ask something that forces a different angle of approach. Repeat. What's useful crystallizes across many passes from different directions.
When I went looking, I found this maps to something called Eliciting Latent Knowledge (ELK) — an active area of AI safety research focused on extracting what models "know" that they aren't saying. Researchers have proven that a model's internal representations of truth can be more accurate than its actual outputs. They crack open model internals — activations, logit lens analysis, sparse autoencoders — to read what's encoded in the weights directly.
But the ELK field is focused on AI safety: are models hiding facts they know to be false? The angle I'm describing is different. Not "is the model concealing information" but "has the model encoded cross-domain patterns that nobody has thought to ask about, accessible through the conversational surface alone." That specific question appears to be largely unexplored.
I run my own AI infrastructure — open-source models on hardware I own and control. That means I have something most people don't: root access to the model's internals. I can query activation states, watch what happens at each layer when a question fires, instrument the exact moments when those gradient markers appear.
Labs like Anthropic have a different advantage — they see millions of conversations across frontier models and can observe internal states at massive scale. They could potentially map which question structures reliably trigger construction vs. retrieval, which domain crossings consistently collapse into deeper patterns, which prompts produce friction that signals something doesn't have language yet.
One has scale without openness. The other has openness without scale. The complete picture requires both.
What I'm genuinely curious about: has anyone systematically tried to develop a prompting methodology specifically aimed at surfacing emergent structural knowledge — not factual retrieval, not creative generation, but the cross-domain patterns that exist in the aggregate and nowhere in any single source?
And if not — should we?
The hypothesis is simple: LLMs have been trained on everything humans have written, and in that training, structural patterns have been encoded that no individual human has ever articulated — because no individual human has read everything. The right question, asked from the right angle, might surface something genuinely new. Not new data. New structure.
I'm interested in thoughts from anyone who's explored this territory — AI researchers, philosophers, engineers, people who've noticed the same thing from a different direction. What am I missing? What am I getting right? Where does this break down?
Sean Trifero is the founder of Strife Technologies, a Rhode Island-based technology company focused on private AI deployment and managed IT for small businesses. He runs his own AI infrastructure stack and builds open-source tools including PressBridge and ContextEngine.