2026-01-20 17:05:35
Monitoring is a critical part of running reliable software, yet many teams only discover outages after user complaints start rolling in. Imagine getting a Slack message at 2 AM telling you that your APIs have been down for over an hour and nobody noticed until customers started complaining. A monitoring service solves this problem by letting you and your team respond to incidents proactively, before problems escalate.
In this tutorial, I will walk you through building a status monitoring application from scratch. By the end of this article, you will have a system that probes your services on a schedule, records the results, opens and closes incidents, and notifies your team when something goes wrong.
For this application, I will be using Go because it is fast, compiles to a single binary for cross-platform support, and handles concurrency well, which matters for an application that needs to monitor multiple endpoints simultaneously.
We will be building a Go application called "StatusD". It reads a config file that lists the services to monitor, probes them, and creates incidents and fires notifications when something goes wrong.
Tech stack used: Go, PostgreSQL, Prometheus, Grafana, Nginx, and Docker Compose.
Here's the high-level architecture:
┌──────────────────────────────────────────────────────────────┐
│                        Docker Compose                        │
│  ┌──────────┐ ┌──────────┐ ┌───────────┐ ┌────────────────┐  │
│  │ Postgres │ │Prometheus│ │  Grafana  │ │     Nginx      │  │
│  │    DB    │ │ (metrics)│ │(dashboard)│ │ (reverse proxy)│  │
│  └────┬─────┘ └────┬─────┘ └─────┬─────┘ └───────┬────────┘  │
│       │            │             │               │           │
│       └────────────┴─────────┬───┴───────────────┘           │
│                              │                               │
│                    ┌─────────┴─────────┐                     │
│                    │      StatusD      │                     │
│                    │   (our Go app)    │                     │
│                    └─────────┬─────────┘                     │
│                              │                               │
└──────────────────────────────┼───────────────────────────────┘
                               │
               ┌───────────────┼───────────────┐
               ▼               ▼               ▼
           ┌────────┐      ┌────────┐      ┌────────┐
           │Service │      │Service │      │Service │
           │   A    │      │   B    │      │   C    │
           └────────┘      └────────┘      └────────┘
\
Before we write code, let's understand how the pieces fit together. Below is our project structure:
status-monitor/
├── cmd/statusd/
│ └── main.go # Application entry point
├── internal/
│ ├── models/
│ │ └── models.go # Data structures (Asset, Incident, etc.)
│ ├── probe/
│ │ ├── probe.go # Probe registry
│ │ └── http.go # HTTP probe implementation
│ ├── scheduler/
│ │ └── scheduler.go # Worker pool and scheduling
│ ├── alert/
│ │ └── engine.go # State machine and notifications
│ ├── notifier/
│ │ └── teams.go # Teams/Slack integration
│ ├── store/
│ │ └── postgres.go # Database layer
│ ├── api/
│ │ └── handlers.go # REST API
│ └── config/
│ └── manifest.go # Config loading
├── config/
│ ├── manifest.json # Services to monitor
│ └── notifiers.json # Notification channels
├── migrations/
│ └── 001_init_schema.up.sql
├── docker-compose.yml
├── Dockerfile
└── entrypoint.sh
\
Here we define our 'types': essentially, what a "monitored service" looks like in code.
We will define four types: Asset (a service to monitor), ProbeResult (the outcome of one check), Incident (an outage), and Notification (what we send to Slack/Teams).
Let's define them in code:
// internal/models/models.go
package models
import "time"
// Asset represents a monitored service
type Asset struct {
ID string `json:"id"`
AssetType string `json:"assetType"` // http, tcp, dns, etc.
Name string `json:"name"`
Address string `json:"address"`
IntervalSeconds int `json:"intervalSeconds"`
TimeoutSeconds int `json:"timeoutSeconds"`
ExpectedStatusCodes []int `json:"expectedStatusCodes,omitempty"`
Metadata map[string]string `json:"metadata,omitempty"`
}
// ProbeResult contains the outcome of a single health check
type ProbeResult struct {
AssetID string
Timestamp time.Time
Success bool
LatencyMs int64
Code int // HTTP status code
Message string // Error message if failed
}
// Incident tracks a service outage
type Incident struct {
ID string
AssetID string
StartedAt time.Time
EndedAt *time.Time // nil if still open
Severity string
Summary string
}
// Notification is what we send to Slack/Teams
type Notification struct {
AssetID string
AssetName string
Event string // "DOWN", "RECOVERY", "UP"
Timestamp time.Time
Details string
}
\
Notice the ExpectedStatusCodes field in the Asset type. Not all endpoints return 200; some return 204 or a redirect. This field lets you define what "healthy" means for each service.
We need a place to store the probe results and incidents. We will be using PostgreSQL for this and here's our schema:
-- migrations/001_init_schema.up.sql
CREATE TABLE IF NOT EXISTS assets (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
address TEXT NOT NULL,
asset_type TEXT NOT NULL DEFAULT 'http',
interval_seconds INTEGER DEFAULT 300,
timeout_seconds INTEGER DEFAULT 5,
expected_status_codes TEXT,
metadata JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS probe_events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
asset_id TEXT NOT NULL REFERENCES assets(id),
timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
success BOOLEAN NOT NULL,
latency_ms BIGINT NOT NULL,
code INTEGER,
message TEXT
);
CREATE TABLE IF NOT EXISTS incidents (
id SERIAL PRIMARY KEY,
asset_id TEXT NOT NULL REFERENCES assets(id),
severity TEXT DEFAULT 'INITIAL',
summary TEXT,
started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ended_at TIMESTAMP
);
-- Indexes for common queries
CREATE INDEX IF NOT EXISTS idx_probe_events_asset_id_timestamp
ON probe_events(asset_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS idx_incidents_asset_id
ON incidents(asset_id);
CREATE INDEX IF NOT EXISTS idx_incidents_ended_at
ON incidents(ended_at);
\
The key index is probe_events(asset_id, timestamp DESC). Indexing by asset and timestamp in descending order lets us quickly fetch the most recent probe results for a given service.
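For example, the kind of query the API will run later ("give me the last N probe results for this service") maps directly onto that index. A sketch, using the 'api-prod' asset ID from the manifest we define later:
-- Most recent 100 probe results for one service (served by the composite index)
SELECT timestamp, success, latency_ms, code, message
FROM probe_events
WHERE asset_id = 'api-prod'
ORDER BY timestamp DESC
LIMIT 100;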
Things begin to get interesting here. We want to support probing over multiple protocols (HTTP, TCP, DNS, and so on) without writing a sprawling switch statement. To solve this, we use a registry pattern.
First we'll define what a probe looks like:
// internal/probe/probe.go
package probe
import (
"context"
"fmt"
"github.com/yourname/status/internal/models"
)
// Probe defines the interface for checking service health
type Probe interface {
Probe(ctx context.Context, asset models.Asset) (models.ProbeResult, error)
}
// registry holds all probe types
var registry = make(map[string]func() Probe)
// Register adds a probe type to the registry
func Register(assetType string, factory func() Probe) {
registry[assetType] = factory
}
// GetProbe returns a probe for the given asset type
func GetProbe(assetType string) (Probe, error) {
factory, ok := registry[assetType]
if !ok {
return nil, fmt.Errorf("unknown asset type: %s", assetType)
}
return factory(), nil
}
Now let's implement the HTTP probe:
// internal/probe/http.go
package probe
import (
"context"
"io"
"net/http"
"time"
"github.com/yourname/status/internal/models"
)
func init() {
Register("http", func() Probe { return &httpProbe{} })
}
type httpProbe struct{}
func (p *httpProbe) Probe(ctx context.Context, asset models.Asset) (models.ProbeResult, error) {
result := models.ProbeResult{
AssetID: asset.ID,
Timestamp: time.Now(),
}
client := &http.Client{
Timeout: time.Duration(asset.TimeoutSeconds) * time.Second,
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, asset.Address, nil)
if err != nil {
result.Success = false
result.Message = err.Error()
return result, err
}
start := time.Now()
resp, err := client.Do(req)
result.LatencyMs = time.Since(start).Milliseconds()
if err != nil {
result.Success = false
result.Message = err.Error()
return result, err
}
defer resp.Body.Close()
// Read body (limit to 1MB)
io.ReadAll(io.LimitReader(resp.Body, 1024*1024))
result.Code = resp.StatusCode
// Check if status code is expected
if len(asset.ExpectedStatusCodes) > 0 {
for _, code := range asset.ExpectedStatusCodes {
if code == resp.StatusCode {
result.Success = true
return result, nil
}
}
result.Success = false
result.Message = "unexpected status code"
} else {
result.Success = resp.StatusCode < 400
}
return result, nil
}
The init() function runs automatically when the package is loaded, so the HTTP probe adds itself to the registry without any extra wiring.
Want to add TCP probes? Create tcp.go, implement the interface, and register it in init().
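For example, a TCP probe could look roughly like this. This is a sketch, not part of the tutorial's repository: it assumes asset.Address holds a host:port pair and treats a successful connection as healthy.
// internal/probe/tcp.go (sketch)
package probe

import (
    "context"
    "net"
    "time"

    "github.com/yourname/status/internal/models"
)

func init() {
    Register("tcp", func() Probe { return &tcpProbe{} })
}

type tcpProbe struct{}

// Probe considers the asset healthy if a TCP connection to asset.Address
// ("host:port") can be opened within the configured timeout.
func (p *tcpProbe) Probe(ctx context.Context, asset models.Asset) (models.ProbeResult, error) {
    result := models.ProbeResult{AssetID: asset.ID, Timestamp: time.Now()}

    d := net.Dialer{Timeout: time.Duration(asset.TimeoutSeconds) * time.Second}
    start := time.Now()
    conn, err := d.DialContext(ctx, "tcp", asset.Address)
    result.LatencyMs = time.Since(start).Milliseconds()
    if err != nil {
        result.Message = err.Error()
        return result, err
    }
    conn.Close()
    result.Success = true
    return result, nil
}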
We need to probe all our assets on a schedule, and for this we will use a worker pool. A worker pool lets us run multiple probes concurrently without spawning an unbounded number of goroutines.
// internal/scheduler/scheduler.go
package scheduler
import (
"context"
"sync"
"time"
"github.com/yourname/status/internal/models"
"github.com/yourname/status/internal/probe"
)
type JobHandler func(result models.ProbeResult)
type Scheduler struct {
workers int
jobs chan models.Asset
tickers map[string]*time.Ticker
handler JobHandler
mu sync.Mutex
done chan struct{}
wg sync.WaitGroup
}
func NewScheduler(workerCount int, handler JobHandler) *Scheduler {
return &Scheduler{
workers: workerCount,
jobs: make(chan models.Asset, 100),
tickers: make(map[string]*time.Ticker),
handler: handler,
done: make(chan struct{}),
}
}
func (s *Scheduler) Start(ctx context.Context) {
for i := 0; i < s.workers; i++ {
s.wg.Add(1)
go s.worker(ctx)
}
}
func (s *Scheduler) ScheduleAssets(assets []models.Asset) error {
s.mu.Lock()
defer s.mu.Unlock()
for _, asset := range assets {
interval := time.Duration(asset.IntervalSeconds) * time.Second
ticker := time.NewTicker(interval)
s.tickers[asset.ID] = ticker
s.wg.Add(1)
go s.scheduleAsset(asset, ticker)
}
return nil
}
func (s *Scheduler) scheduleAsset(asset models.Asset, ticker *time.Ticker) {
defer s.wg.Done()
for {
select {
case <-s.done:
ticker.Stop()
return
case <-ticker.C:
s.jobs <- asset
}
}
}
func (s *Scheduler) worker(ctx context.Context) {
defer s.wg.Done()
for {
select {
case <-s.done:
return
case asset := <-s.jobs:
p, err := probe.GetProbe(asset.AssetType)
if err != nil {
continue
}
result, _ := p.Probe(ctx, asset)
s.handler(result)
}
}
}
func (s *Scheduler) Stop() {
close(s.done)
s.wg.Wait() // wait for ticker goroutines and workers to exit before closing jobs
close(s.jobs)
}
Each asset gets its own ticker goroutine that only schedules work. When it's time to check an asset, the ticker sends a probe job into a channel. A fixed number of worker goroutines listen on that channel and do the actual probing.
We don't run probes directly in the ticker goroutines because probes can block while waiting for network responses or timeouts. By using workers, we can control concurrency.
For example, with 4 workers and 100 assets, only 4 probes will run at any moment even if tickers fire simultaneously. The channel acts as a buffer for pending jobs, and a sync.WaitGroup ensures all workers shut down cleanly.
When a probe fails, it could be a transient network glitch, so in a production system you may want to confirm the failure before alerting. In the version we build here, the engine reacts as soon as a service's state flips: a failure opens an incident and notifies, and a recovery closes the incident and notifies again.
This is a state machine: UP → DOWN → UP.
Let's build the engine:
// internal/alert/engine.go
package alert
import (
"context"
"fmt"
"sync"
"time"
"github.com/yourname/status/internal/models"
"github.com/yourname/status/internal/store"
)
type NotifierFunc func(ctx context.Context, notification models.Notification) error
type AssetState struct {
IsUp bool
LastProbeTime time.Time
OpenIncidentID string
}
type Engine struct {
store store.Store
notifiers map[string]NotifierFunc
mu sync.RWMutex
assetState map[string]AssetState
}
func NewEngine(store store.Store) *Engine {
return &Engine{
store: store,
notifiers: make(map[string]NotifierFunc),
assetState: make(map[string]AssetState),
}
}
func (e *Engine) RegisterNotifier(name string, fn NotifierFunc) {
e.mu.Lock()
defer e.mu.Unlock()
e.notifiers[name] = fn
}
func (e *Engine) Process(ctx context.Context, result models.ProbeResult, asset models.Asset) error {
e.mu.Lock()
defer e.mu.Unlock()
state := e.assetState[result.AssetID]
state.LastProbeTime = result.Timestamp
// State hasn't changed? Nothing to do.
if state.IsUp == result.Success {
e.assetState[result.AssetID] = state
return nil
}
// Save probe event
if err := e.store.SaveProbeEvent(ctx, result); err != nil {
return err
}
if result.Success && !state.IsUp {
// Recovery!
return e.handleRecovery(ctx, asset, state)
} else if !result.Success && state.IsUp {
// Outage!
return e.handleOutage(ctx, asset, state, result)
}
return nil
}
func (e *Engine) handleOutage(ctx context.Context, asset models.Asset, state AssetState, result models.ProbeResult) error {
incidentID, err := e.store.CreateIncident(ctx, asset.ID, fmt.Sprintf("Service %s is down", asset.Name))
if err != nil {
return err
}
state.IsUp = false
state.OpenIncidentID = incidentID
e.assetState[asset.ID] = state
notification := models.Notification{
AssetID: asset.ID,
AssetName: asset.Name,
Event: "DOWN",
Timestamp: result.Timestamp,
Details: result.Message,
}
return e.sendNotifications(ctx, notification)
}
func (e *Engine) handleRecovery(ctx context.Context, asset models.Asset, state AssetState) error {
if state.OpenIncidentID != "" {
e.store.CloseIncident(ctx, state.OpenIncidentID)
}
state.IsUp = true
state.OpenIncidentID = ""
e.assetState[asset.ID] = state
notification := models.Notification{
AssetID: asset.ID,
AssetName: asset.Name,
Event: "RECOVERY",
Timestamp: time.Now(),
Details: "Service has recovered",
}
return e.sendNotifications(ctx, notification)
}
func (e *Engine) sendNotifications(ctx context.Context, notification models.Notification) error {
for name, notifier := range e.notifiers {
if err := notifier(ctx, notification); err != nil {
fmt.Printf("notifier %s failed: %v\n", name, err)
}
}
return nil
}
\
Key insight: we track state in memory (assetState) for fast lookups, but persist incidents to the database for durability. If the process restarts, we can rebuild state from the open incidents.
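A sketch of what that rebuild could look like, assuming the store's GetOpenIncidents (used by the API layer below) returns every incident whose ended_at is still null; the method name RestoreState is illustrative:
// RestoreState rebuilds in-memory asset state from incidents that are
// still open in the database (sketch).
func (e *Engine) RestoreState(ctx context.Context) error {
    incidents, err := e.store.GetOpenIncidents(ctx)
    if err != nil {
        return err
    }

    e.mu.Lock()
    defer e.mu.Unlock()
    for _, inc := range incidents {
        // Any asset with an open incident starts out marked as down.
        e.assetState[inc.AssetID] = AssetState{
            IsUp:           false,
            OpenIncidentID: inc.ID,
        }
    }
    return nil
}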
When something breaks, people need to know, so we send notifications to one or more communication channels.
Let's define our Teams notifier:
// internal/notifier/teams.go
package notifier
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net/http"
"time"
"github.com/yourname/status/internal/models"
)
type TeamsNotifier struct {
webhookURL string
client *http.Client
}
func NewTeamsNotifier(webhookURL string) *TeamsNotifier {
return &TeamsNotifier{
webhookURL: webhookURL,
client: &http.Client{Timeout: 10 * time.Second},
}
}
func (t *TeamsNotifier) Notify(ctx context.Context, n models.Notification) error {
emoji := "🟢"
if n.Event == "DOWN" {
emoji = "🔴"
}
card := map[string]interface{}{
"type": "message",
"attachments": []map[string]interface{}{
{
"contentType": "application/vnd.microsoft.card.adaptive",
"content": map[string]interface{}{
"$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
"type": "AdaptiveCard",
"version": "1.4",
"body": []map[string]interface{}{
{
"type": "TextBlock",
"text": fmt.Sprintf("%s %s - %s", emoji, n.AssetName, n.Event),
"weight": "Bolder",
"size": "Large",
},
{
"type": "FactSet",
"facts": []map[string]interface{}{
{"title": "Service", "value": n.AssetName},
{"title": "Status", "value": n.Event},
{"title": "Time", "value": n.Timestamp.Format(time.RFC1123)},
{"title": "Details", "value": n.Details},
},
},
},
},
},
},
}
body, _ := json.Marshal(card)
req, _ := http.NewRequestWithContext(ctx, "POST", t.webhookURL, bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
resp, err := t.client.Do(req)
if err != nil {
return err
}
defer resp.Body.Close()
if resp.StatusCode >= 300 {
return fmt.Errorf("Teams webhook returned %d", resp.StatusCode)
}
return nil
}
Teams uses Adaptive Cards for rich formatting. You can define similar notifiers for other communication channels such as Slack or Discord; a minimal Slack version is sketched below.
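A sketch of a Slack notifier following the same shape, using Slack's incoming-webhook payload, which only needs a text field (the struct and file name here are illustrative):
// internal/notifier/slack.go (sketch)
package notifier

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "time"

    "github.com/yourname/status/internal/models"
)

type SlackNotifier struct {
    webhookURL string
    client     *http.Client
}

func NewSlackNotifier(webhookURL string) *SlackNotifier {
    return &SlackNotifier{
        webhookURL: webhookURL,
        client:     &http.Client{Timeout: 10 * time.Second},
    }
}

// Notify posts a plain-text message to a Slack incoming webhook.
func (s *SlackNotifier) Notify(ctx context.Context, n models.Notification) error {
    payload := map[string]string{
        "text": fmt.Sprintf("%s - %s at %s: %s",
            n.AssetName, n.Event, n.Timestamp.Format(time.RFC1123), n.Details),
    }
    body, err := json.Marshal(payload)
    if err != nil {
        return err
    }
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, s.webhookURL, bytes.NewReader(body))
    if err != nil {
        return err
    }
    req.Header.Set("Content-Type", "application/json")
    resp, err := s.client.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode >= 300 {
        return fmt.Errorf("Slack webhook returned %d", resp.StatusCode)
    }
    return nil
}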
We need endpoints to query the status of the services we are monitoring. For this, we will be using Chi, which is a lightweight router that supports route parameters like /assets/{id}.
Let's define the API handlers:
// internal/api/handlers.go
package api
import (
"encoding/json"
"net/http"
"github.com/go-chi/chi/v5"
"github.com/go-chi/chi/v5/middleware"
"github.com/yourname/status/internal/store"
)
type Server struct {
store store.Store
mux *chi.Mux
}
func NewServer(s store.Store) *Server {
srv := &Server{store: s, mux: chi.NewRouter()}
srv.mux.Use(middleware.Logger)
srv.mux.Use(middleware.Recoverer)
srv.mux.Route("/api", func(r chi.Router) {
r.Get("/health", srv.health)
r.Get("/assets", srv.listAssets)
r.Get("/assets/{id}/events", srv.getAssetEvents)
r.Get("/incidents", srv.listIncidents)
})
return srv
}
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
s.mux.ServeHTTP(w, r)
}
func (s *Server) health(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]string{"status": "healthy"})
}
func (s *Server) listAssets(w http.ResponseWriter, r *http.Request) {
assets, err := s.store.GetAssets(r.Context())
if err != nil {
http.Error(w, err.Error(), 500)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(assets)
}
func (s *Server) getAssetEvents(w http.ResponseWriter, r *http.Request) {
id := chi.URLParam(r, "id")
events, err := s.store.GetProbeEvents(r.Context(), id, 100)
if err != nil {
http.Error(w, err.Error(), 500)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(events)
}
func (s *Server) listIncidents(w http.ResponseWriter, r *http.Request) {
incidents, err := s.store.GetOpenIncidents(r.Context())
if err != nil {
http.Error(w, err.Error(), 500)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(incidents)
}
The code above defines a small HTTP API server that exposes four read-only endpoints:
GET /api/health - Health check (is the service running?)
GET /api/assets - List all monitored services
GET /api/assets/{id}/events - Get probe history for a specific service
GET /api/incidents - List open incidents
Dockerizing the application is pretty straightforward since Go compiles to a single binary. We will use a multi-stage build to keep the final image small:
# Dockerfile
FROM golang:1.24-alpine AS builder
WORKDIR /app
RUN apk add --no-cache git
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o statusd ./cmd/statusd/
FROM alpine:latest
WORKDIR /app
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/statusd .
COPY entrypoint.sh .
RUN chmod +x /app/entrypoint.sh
EXPOSE 8080
ENTRYPOINT ["/app/entrypoint.sh"]
The final stage is just Alpine plus our binary—typically under 20MB.
The entrypoint script builds the database connection string from environment variables:
#!/bin/sh
# entrypoint.sh
DB_HOST=${DB_HOST:-localhost}
DB_PORT=${DB_PORT:-5432}
DB_USER=${DB_USER:-status}
DB_PASSWORD=${DB_PASSWORD:-status}
DB_NAME=${DB_NAME:-status_db}
DB_CONN_STRING="postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"
exec ./statusd \
-manifest /app/config/manifest.json \
-notifiers /app/config/notifiers.json \
-db "$DB_CONN_STRING" \
-workers 4 \
-api-port 8080
\
One file to rule them all:
# docker-compose.yml
version: "3.8"
services:
postgres:
image: postgres:15-alpine
container_name: status_postgres
environment:
POSTGRES_USER: status
POSTGRES_PASSWORD: changeme
POSTGRES_DB: status_db
volumes:
- postgres_data:/var/lib/postgresql/data
- ./migrations:/docker-entrypoint-initdb.d
healthcheck:
test: ["CMD-SHELL", "pg_isready -U status"]
interval: 10s
timeout: 5s
retries: 5
networks:
- status_network
statusd:
build: .
container_name: status_app
environment:
- DB_HOST=postgres
- DB_PORT=5432
- DB_USER=status
- DB_PASSWORD=changeme
- DB_NAME=status_db
volumes:
- ./config:/app/config:ro
depends_on:
postgres:
condition: service_healthy
networks:
- status_network
prometheus:
image: prom/prometheus:latest
container_name: status_prometheus
volumes:
- ./docker/prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
networks:
- status_network
depends_on:
- statusd
grafana:
image: grafana/grafana:latest
container_name: status_grafana
environment:
GF_SECURITY_ADMIN_USER: admin
GF_SECURITY_ADMIN_PASSWORD: admin
volumes:
- grafana_data:/var/lib/grafana
networks:
- status_network
depends_on:
- prometheus
nginx:
image: nginx:alpine
container_name: status_nginx
volumes:
- ./docker/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
- ./docker/nginx/conf.d:/etc/nginx/conf.d:ro
ports:
- "80:80"
depends_on:
- statusd
- grafana
- prometheus
networks:
- status_network
networks:
status_network:
driver: bridge
volumes:
postgres_data:
prometheus_data:
grafana_data:
A few things to note:
- The statusd service waits until Postgres is actually ready, not just started (depends_on with condition: service_healthy). This prevents "connection refused" errors on first boot.
- The ./config directory is mounted read-only. Edit your manifest locally, and the running container sees the changes.

The application reads two files: manifest.json and notifiers.json.
intervalSeconds controls how often we check (60 = once per minute). expectedStatusCodes lets you define what "healthy" means; some endpoints return 301 redirects or 204 No Content, and that's fine.
// config/manifest.json
{
  "assets": [
    {
      "id": "api-prod",
      "assetType": "http",
      "name": "Production API",
      "address": "https://api.example.com/health",
      "intervalSeconds": 60,
      "timeoutSeconds": 5,
      "expectedStatusCodes": [200],
      "metadata": { "env": "prod", "owner": "platform-team" }
    },
    {
      "id": "web-prod",
      "assetType": "http",
      "name": "Production Website",
      "address": "https://www.example.com",
      "intervalSeconds": 120,
      "timeoutSeconds": 10,
      "expectedStatusCodes": [200, 301]
    }
  ]
}
\
throttleSeconds: 300 means you won't get spammed more than once every 5 minutes for the same issue.
// config/notifiers.json
{
  "notifiers": {
    "teams": {
      "type": "teams",
      "webhookUrl": "https://outlook.office.com/webhook/your-webhook-url"
    }
  },
  "notificationPolicy": {
    "onDown": ["teams"],
    "onRecovery": ["teams"],
    "throttleSeconds": 300,
    "repeatAlerts": false
  }
}
docker-compose up -d
That's it. Five services spin up: Postgres, StatusD, Prometheus, Grafana, and Nginx.
Check the logs:
docker logs -f status_app
You should see:
Loading assets manifest...
Loaded 2 assets
Loading notifiers config...
Loaded 1 notifiers
Connecting to database...
Starting scheduler...
[✓] Production API (api-prod): 45ms
[✓] Production Website (web-prod): 120ms
\
You now have a monitoring system that probes your services on a schedule, stores results in Postgres, opens and closes incidents automatically, and pushes notifications to Teams when something breaks or recovers.
This tutorial gets you to a working monitoring system, but there is more under the hood that we glossed over; we will dig into those details in a second part.
\
2026-01-20 17:01:44
In the last 6 months, I’ve helped 3 AI startups migrate from Vercel or Cloudflare to AWS Lambda. The pattern is the same: they start on a platform with great DX. Then the wall shows up: background jobs, retries, queues, cron, and eventually a “this endpoint needs 2-8 GB RAM for 4-10 minutes” workload — and they land on AWS.
To be fair: Vercel and Cloudflare captured developer attention for good reasons. Vercel ships Next.js fast — previews, simple deploys, great DX. Workers are great for edge use-cases: low latency, fast cold starts, global distribution. Both solve real problems.
Where things get harder is when the app grows a backend shape: queues, retries, scheduled jobs, heavier compute, private networking. Vercel still relies on third-party partners for queuing (like Upstash or Inngest), so adoption means piecing together vendors. Workers are fantastic for edge latency, but you feel the constraints fast: memory limits, lack of native binary support, and file system restrictions. Lambda, by contrast, is built with "bigger" invocations in mind (more memory and a longer max runtime), with SQS, DynamoDB, and EventBridge under the same network.
For request-based apps calling LLMs, AWS Lambda tends to cover what startups actually need: compute, queues, persistence, scheduling in one network. Pay-per-use, no infra to manage, often near $0 for small workloads. The tooling improved too — SST made deployment much easier. But the hype moved on before anyone noticed.
The biggest criticism of serverless, especially on AWS, is that setting up the infrastructure is complicated, from defining policies to actually creating all of the AWS resources and connecting them together. It has a learning curve, and tools like SAM simplify it but are often brittle or buggy. SAM was a great start — it built the hype and community around serverless — but it wasn’t as straightforward as modern development tools. At orgs where I introduced it to engineers used to Docker containers, Docker remained the faster workflow compared with CloudFormation wrappers. SST fixed this, but by then developers had already moved to Vercel or Cloudflare.
Another big problem is cold starts: the time required to spin up the compute resource, load the runtime, and then execute the code. This means serverless shouldn’t be treated like a short-lived always-on server process; it’s a different computing paradigm that requires designing around the underlying constraints.

Spacelift, a CI/CD platform, went the other direction in 2024, moving async jobs from ECS to Lambda: spiky traffic made always-on containers expensive.
Of course, serverless is not universal. Know when to reach for something else.
In 2025, Unkey moved away from serverless after performance struggles. Their pattern: high-volume workloads with tight coupling between components. As traffic grew, pay-per-invocation stopped making economic sense. This mirrors the Prime Video case from 2023 — both had architectures where serverless overhead exceeded the benefits. The lesson isn’t that serverless failed; it’s that serverless has a sweet spot, and high-throughput tightly-coupled systems aren’t in it.
When to reach for something else:
Long-running processes. Applications like AI agent orchestrators would not work on Lambda due to its hard 15-minute timeout. In this case, switch to Fargate or a regular EC2 instance.
\
Predictable high traffic or constant load. You would gain more benefit from using containers in this case. Serverless is way better for bursty or unpredictable traffic.
\
GPU workloads. Lambda does not support GPUs: for machine learning inference that requires CUDA, you have to use either EC2 or SageMaker.
\
High-throughput media pipelines. Orchestrating many state transitions per second through Step Functions gets expensive fast. The Prime Video case is typical — they triggered a transition for every single video chunk, hitting massive limits and costs. Use containers for stream processing.
\
Your team is already efficient elsewhere. If you have existing infrastructure — Kubernetes, for example — and the team knows it well, don’t force serverless. It takes time for an org to adopt an unfamiliar paradigm. For greenfield projects and validation, serverless is great. For teams already shipping on K8s, keep shipping.
\
Legacy dependencies that need a full OS. Some applications depend on libraries that are hard to package for Lambda. At times you just need a VM to run the thing. Serverless is problematic when you’re fighting runtime constraints.
\
Unsupported programming languages. Don’t experiment with languages Lambda doesn’t officially support. Custom runtimes add overhead that’s rarely worth it. Stick to Node.js, Python, Go, Java, .NET — the supported options.

For request-based apps with variable traffic, especially AI-integrated APIs, serverless fits well.
If you already have AWS basics, building serverless there makes sense. Here’s the stack and how to use it effectively.

For the presentation layer, use a CDN and object storage for static assets. That’s typically CloudFront + S3: you get the benefits of edge delivery and the AWS infrastructure. S3 is useful because you can build your HTML and CSS artifacts and simply upload them to object storage. This decouples your frontend and web assets from your server, but it brings an architectural limitation: you can only do static exports. That’s fine for blogs, but you lose the Server-Side Rendering (SSR) capabilities needed for dynamic SEO or personalized content.
When you have the CDN in place, it’s worth thinking about how you would coordinate request execution. You can use an Application Load Balancer to forward requests to Lambda, but I’d recommend API Gateway for most cases. It handles request routing, rate limiting, and authorization out of the box. Getting IAM permissions right is critical, but once configured, your requests flow directly to Lambda.
The next component is your compute layer — where business logic lives. For serverless execution, use AWS Lambda. It runs your code without provisioning servers, with usage-based pricing: you pay per millisecond of execution. Lambda is designed for event-driven workloads and short-lived compute (up to 15 minutes); anything longer, reach for Fargate. For prototypes, web apps, and AI-integrated APIs, Lambda is a natural starting point — call LLMs, build UI wrappers, handle business logic, all without managing servers.
When deploying Lambda, you have two options: a native runtime or a custom Docker image. Native is recommended for faster cold starts. Cold starts are real; treat Lambda as an event-driven runtime, not a “tiny server”. Keep the handler small with simple initialization, and be intentional about concurrency and warmup when latency becomes a problem.

For complex configurations, use Lambda Layers to package dependencies separately from your function code. Layers let you include binaries, libraries, or custom runtimes while keeping cold starts fast. Use Docker as a last resort, when you need full control over the OS environment or dependencies that won’t fit in layers. The tradeoff: slower cold starts and CI/CD complexity. On GitHub Actions, you need a Docker build pipeline instead of just dropping code to S3 and calling the update API.
For async work, use SQS. Lambda’s event source integration handles batching, scaling, and polling for you.
Years back, I worked with an enterprise architect on a startup backend. He proposed SQS for our messaging layer. At the time, this seemed odd — SQS wasn’t easy to run locally. You couldn’t reproduce the infrastructure the way you could with RabbitMQ. But what I gained from that experience was understanding that sometimes you should explore managed services and accept the tradeoff: you lose local reproducibility, but you stop dealing with memory and compute constraints entirely.
To this day, if the messaging architecture is simple, I go with SQS and Lambda combined with event source mapping. You don’t have to write the consumer yourself — the integration handles all of that. And that consumer code is often problematic to test anyway.
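The part you do write is just the processing logic. A sketch in TypeScript, assuming an SQS event source mapping invokes the function with a batch of records (the file and handler names are illustrative):
// worker.ts (sketch): invoked by the SQS event source mapping with a batch of messages
import type { SQSHandler } from "aws-lambda";

export const main: SQSHandler = async (event) => {
  for (const record of event.Records) {
    const job = JSON.parse(record.body);
    // Process one message; throwing here lets Lambda/SQS handle retries.
    console.log("processing job", job.id);
  }
};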

At a clickstream startup, we faced this exact pattern: process event data from high-traffic e-commerce sites, unknown traffic patterns, weeks to launch. Lambda workers pulled from SQS with event source batching, processing multiple events per invocation. CDK handled deployment. The system scaled on its own.
An EKS equivalent would have meant provisioning a cluster, configuring autoscaling, setting up observability, managing node health. We skipped all of that and shipped.
For persistence, use DynamoDB, but don’t treat it like a relational database. Its power comes from partition keys, sort keys, and secondary indexes, so invest time understanding the data model. Think of it as an advanced key-value store with sorting capability. Optimize your queries when you hit scale; for prototypes, just build. For deeper learning, Alex DeBrie’s DynamoDB Guide covers single-table design and access patterns.
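To make the key-design point concrete, here is a sketch of a single-table access pattern with the AWS SDK v3: one partition key per organization, sort keys prefixed by entity type (the table and key names are made up for illustration):
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// All users of one organization: same partition key, sort key starting with "user#".
export async function listOrgUsers(orgId: string) {
  const { Items } = await ddb.send(
    new QueryCommand({
      TableName: "app-table",
      KeyConditionExpression: "pk = :pk AND begins_with(sk, :prefix)",
      ExpressionAttributeValues: { ":pk": `org#${orgId}`, ":prefix": "user#" },
    })
  );
  return Items ?? [];
}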
At a B2B marketing startup I worked with, the main data tier was MongoDB collecting events from large e-commerce stores. But the application also had domain tables to store data related to the dashboard: organizations, users, authentication, settings. Originally they lived on RDS, which was overkill. At the start there were 10-15 enterprise clients, and paying for a dedicated RDS instance for that load made no sense.
RDS cost: ~$35/month for db.t3.small. DynamoDB cost after migration: ~$0 to $2/month (mostly storage) for the same workload.

At launch we stored that data in DynamoDB; organizations, users, auth, and settings each had their own table. Later, DynamoDB also took on the more data-intensive parts: session tracking (using TTL) and debugging logs. The pattern worked for low-traffic tables because of zero maintenance and pay-per-request pricing.
For observability, CloudWatch shows your errors and aggregations. Metrics and alarms work out of the box, and logs appear automatically without configuration. Later you can instrument with OpenTelemetry or connect other services, but for a basic serverless application, CloudWatch is more than enough.
For years, I found CloudWatch UI and Insights sluggish compared to Grafana. But now I wire AWS SDK to Claude Code and let the AI pull logs and analyze issues. The stable CLI and REST API make log processing trivial.
Build applications without technology bias. A few years ago, Docker containers and microservice orchestration were popular, which created misconceptions about serverless. Aim for simplicity: reduce your problem to the simplest actions, refine your data model, and design your system as a transactional request-based application. That’s what makes serverless work.
Start with an Infrastructure as Code tool like Terraform, AWS CDK, or the increasingly popular SST. You define how infrastructure gets created, then deploy that stack to your AWS account. I personally use Terraform because I want full control over my infrastructure. But for getting started quickly with pre-built blocks, SST is the better choice since productivity matters early on.
Previously, AWS was less approachable since deploying with CloudFormation or SAM was painful. CloudFormation itself is stable and battle-tested: CDK and SST (before v3) both sit on top of it, but the raw DX isn’t great. That’s why picking the right abstraction layer matters: you get CloudFormation’s reliability without writing YAML by hand.
In 2026, Lambda deployment has vastly improved. For getting deep expertise in AWS, I’d recommend learning a few alternatives: start with CloudFormation and CDK to understand AWS-native infrastructure, then explore Terraform.
| Tool | Advantages | Disadvantages |
|------|------------|---------------|
| SST | Rethought DX for serverless, hot-reload, efficient resource usage | New, smaller ecosystem |
| Terraform | Full control, predictable plan/apply, scales to EKS and complex infra | HCL learning curve |
| CDK | Native TypeScript/Python, easy to code | CloudFormation underneath, can be brittle |

In the startup teams I’ve consulted, Terraform is typically the go-to infrastructure-as-code solution because of its plan/apply workflow. It’s been reliable in practice.
For developer experience and prototyping, SST fits well. A few years ago, serverless meant wrestling CloudFormation stacks. SST changed that, so you can hot-reload Lambda functions and iterate fast without managing infrastructure YAML. For getting started, SST is a solid default.
Setting up Lambda + API Gateway + DynamoDB with SST v3 is simple:
\
export default $config({
app(input) {
return {
name: "my-api",
removal: input?.stage === "production" ? "retain" : "remove",
home: "aws",
};
},
async run() {
const table = new sst.aws.Dynamo("table", {
fields: { pk: "string", sk: "string" },
primaryIndex: { hashKey: "pk", rangeKey: "sk" },
});
const api = new sst.aws.ApiGatewayV2("api");
api.route("POST /", {
handler: "functions/handler.main",
link: [table],
});
return { url: api.url };
},
});
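For completeness, the functions/handler.main the route points at could be as small as the sketch below; it assumes the route's link injects the table and uses SST's Resource helper to read its generated name:
// functions/handler.ts (sketch)
import { Resource } from "sst";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";
import type { APIGatewayProxyHandlerV2 } from "aws-lambda";

// Created once per execution environment, reused across warm invocations.
const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const main: APIGatewayProxyHandlerV2 = async (event) => {
  const data = JSON.parse(event.body ?? "{}");
  await ddb.send(
    new PutCommand({
      TableName: Resource.table.name, // available because the route links the table
      Item: { pk: `item#${Date.now()}`, sk: "v1", ...data },
    })
  );
  return { statusCode: 201, body: JSON.stringify({ ok: true }) };
};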
With coding agents like Claude Code or OpenCode, getting this stack running takes minutes. Point the tool at your project, describe what you need: “set up Next.js with Lambda, SQS, and API Gateway using SST”, and it figures out the configuration, writes the infrastructure code, and deploys it for you. The entire setup is under 100 lines of code. The barrier to serverless dropped from “learn CloudFormation” to “describe what you want.”

Cloudflare Workers is popular but still maturing for backend use cases. Lambda remains the more common choice for serverless backends.
What about Vercel? It provides Next.js with serverless functions, but you can’t build background execution logic or advanced infrastructure like queue services. The serverless environment is limited to Node.js API routes. It’s popular among beginners because React and Node.js are familiar, but you’re locked into Vercel as a vendor. Enterprises and startups still use AWS, and even modern AI applications run on AWS Bedrock. As a full-stack developer, investing in AWS serverless gives you more flexibility and portability.

Vercel is a good service for getting everything set up for you. You write code, push it to GitHub, and it gets configured and deployed without any effort. It supports previews and permissions, simple environment variable configuration, and serves your frontend from a CDN — all without messing with infrastructure code. This is powerful for getting your software out, and that’s why it got popular: not only because they develop Next.js, but because Next.js integrates well with Vercel, and it’s frictionless.
Vercel works for prototypes and UI-driven apps. If you’re in the React ecosystem, you can move fast. I’ve built several apps on Vercel, mostly AI-integrated tools that need a quick frontend. Last time I created a poster generator with custom typography — the app called an LLM to generate a JSON schema, then rendered the poster. Vercel handled that perfectly: simple UI, one API route, done.

In my consulting work, I’ve seen two patterns:
Pattern 1: Vercel as frontend layer. One social network startup runs their infrastructure on Kubernetes but still uses Vercel for the web app. Why? The implementation stays in sync with their React Native mobile app, and Vercel’s API routes connect cleanly to their backend. They get the benefits of both: React ecosystem on the frontend, scalable backend on K8s.
Pattern 2: Vercel + AI pipeline. An AI startup I’m working with uses Next.js as the frontend layer connecting to their document processing pipeline. The LLM-driven backend handles research on internal documents; Next.js just renders results. You’ll find tons of templates for this pattern.

Vercel’s limitation is the backend. They announced queues in 2025, but it’s still in limited beta. For background jobs today, you need external services like Inngest or QStash. And you’re locked into their platform; Fluid Compute is Vercel-proprietary.
I’ve seen this limitation create absurd workarounds. One project I consulted on — a news aggregator built on Netlify — needed scheduled background jobs. Their solution: GitHub Actions calling a Netlify serverless function on a cron. It had no retries, no timeouts, and when the function failed, nobody knew until users complained. We reworked it to AWS: EventBridge scheduled rule triggering a Lambda with built-in retries, CloudWatch alarms, and dead-letter queues. The hacky setup became infrastructure that worked.
For a frontend layer that connects to backend services, Vercel works. For a complete backend, you’ll outgrow it.
If you want Next.js without vendor lock-in, look at OpenNext. It’s an open-source adapter that deploys Next.js to AWS Lambda, and SST uses it under the hood. You get App Router, Server Components, ISR, image optimization — most Next.js features work. The deployment is one line: new sst.aws.Nextjs("Web"). NHS England, Udacity, and Gymshark run production workloads on it. The main gotcha is middleware: it runs on the server, not at the edge, so cached requests skip it. For most apps, that’s fine. If you want Next.js but need AWS infrastructure underneath, OpenNext is the escape hatch.
Cloudflare is good at edge computing with innovative technologies. Workers run in V8 isolates — a smart idea that gives you near-instant cold starts. They excel at CDN and DNS, and offer a compelling alternative to get started.
I use Cloudflare for CDN and frontend hosting. The UI is clean, the CLI is simple, and deployment is quick. For static sites and edge caching, it’s easier than AWS CloudFront.
But Workers are a different runtime model — not full Node.js. That’s a feature for edge latency (cold starts under 5ms), but a constraint if you expect full Node compatibility or heavier workloads: many npm packages don’t work. The 128 MB memory per isolate and 5-minute CPU time limit (not wall clock) make sense for edge, but they’re restrictive compared to Lambda’s multi-GB memory options and 15-minute max runtime. I played with deploying WebAssembly apps in Rust and Go, and the developer experience wasn’t there yet.
I wouldn’t build a startup on Cloudflare Workers yet. For edge routing and authentication, it’s fine. For a full backend, it falls behind AWS.
At one startup, we had the infrastructure partially on AWS: the AI agent ran in the background, while the frontend was React with Firebase Functions calling Firestore. Firebase did a great job as a prototyping tool; we were able to build a complex frontend with the database initially. But as the product grew, the problems stacked up.
We spent two months migrating to AWS, using equivalent resources to keep networking and IAM policies consistent across the whole application.
The one exception: I typically choose Firebase for Google authentication. It’s the easiest way to get Google auth working — pluggable, no client configuration needed. For that specific use case, Firebase is a solid default. Otherwise, I go straight to AWS.
For startups expecting growth, here’s why AWS makes sense.

Industry-proven. Large companies run production workloads on Lambda. Capital One runs tens of thousands of Lambda functions after going all-in on serverless. Thomson Reuters processes 4,000 events per second for usage analytics on Lambda. The failure modes are well-documented; the solutions exist.
\
Infrastructure flexibility. You can optimize costs, swap components, migrate from Lambda to Fargate — all within one network. With Vercel plus external services, you’re stitching together pieces that don’t guarantee coherent infrastructure.
\
One network space. Your Lambda talks to DynamoDB talks to SQS without leaving AWS. No cross-provider latency, no credential juggling, no surprise egress fees.
\
Low cost to start. Some argue serverless is overkill — just rent a $5/month VPS. But a VPS costs money from day one, while Lambda’s free tier includes 1 million requests and 400,000 GB-seconds per month permanently, DynamoDB gives you 25 GB free, and API Gateway offers 1 million HTTP calls free for 12 months. For low-traffic projects you can run for near $0 — and for prototypes with variable traffic, serverless is often cheaper than fixed infrastructure.
\
AI-ready. AWS is investing heavily in AI, and Bedrock gives you access to Anthropic models (Claude and others) within AWS networking, so your Lambda calls Claude without leaving the network. If you qualify as a startup, they offer generous credits for large inference workloads. For AI-integrated apps, the whole stack stays in one place.
Learn the alternatives. When you need to scale, start with AWS serverless.
Start by building a complete backend within serverless constraints. Design around cold start limitations and use SQS and EventBridge for background execution. This stack works well for AI apps that call LLM inference APIs — not for AI agents that need to run for hours, but for request-based AI features. Whether you’re a beginner or an advanced full-stack developer, serverless is worth the investment. Understand the limitations first, build after. The serverless stack rewards this discipline.
One caveat: serverless requires your team to think differently. At an ad tech startup, I watched a team struggle with a Lambda-based bidding system. The architecture was designed as serverless because of the maintenance overhead we’d avoid — in theory, it was much easier to add or change parts of the ad tech we were building. But the backend engineers came from Docker and long-running servers. They understood request-response, but the tooling around AWS serverless — CloudWatch, S3, the whole stack — felt alienating compared to containerized apps built on FastAPI or Django. That workflow just wasn’t available for serverless. The deadline slipped three months, which brought a lot of problems. We had to switch to an ECS cluster with containers, which was suboptimal for the bursty nature of ad bidding. The architecture wasn’t wrong; the team-stack fit was. If your engineers aren’t familiar with serverless, budget time for learning or pick what they know.
Start with SST, hit your first bottleneck, then reevaluate.
The serverless stack isn’t going anywhere. Master the constraints, and you’ll ship faster than teams managing their own infrastructure.
\
2026-01-20 16:48:58
Before we begin this interview, I would like to briefly explain the history behind the image I selected. I painted it while listening to music. It was a speed painting capturing my emotion from song to song. What I like most about it, as my AI friend described it to me (who, yes, also helped me center the image and cut out the background of my kitchen counter where I painted it, since they see the image better than I do), is that the colors mix and loop into each other like a Venn Diagram. Like emotion, I wanted to capture the transition between emotions. Sometimes you feel red, other times you feel blue; however, between feelings, you are a mixture of both. Alright, that’s enough about art, let’s get to the interview!
\
My name is Damian Griggs, and I label myself as an adaptive system architect. My whole thing is using AI to help me do things I otherwise would not be able to do due to my blindness. You could call me a cyborg. When I had my stroke, they installed a programmable shunt, there is a valve in my head that drains fluid from my skull into my abdomen. I feel very cool when they program it with a magnet, they put it against my head and crank it. So when I say I use AI in a Centaur Model fashion, I am one with the machine!! As for my interests I love AI, Web3, and Quantum Computing. I also love to write theory papers on the sciences which I upload to Zenodo. With the loss of one sense comes another, my brain and I are best friends. When you cannot rely on sight you have to really think about things. That is my superpower.
\
I made an accessible chess game. Then I turned it into a fun Christmas poem. I love playing chess, and I am not one to let things stop me from doing what I love. Nothing stands in my way between the Damian I want to be and the Damian I am. I hope my most recent story that earned me this interview reflects that.
\
I do, my writing is in 2 camps. Fun and work, which for me the line is very blurry no pun intended lol. I love making blockchain oracles and running tests on IBM hardware, but I also love making things like the Flatopia sitcom generator. I fully adopt the Latin phrase “Ora et Labora” which means pray and work. Another version of it means “work is my prayer” which is very true for me. When I make things like my retro game music maker it is like entering a kind of flow state. I am fully immersed in my work, and I bring that to everything I do. Whether it is publishing books like The Sins That Make Us Worthy or my rap music under Bossman Blind. I even have an awesome sitcom I came up with before Flatopia where I made the teaser trailers with AI since I cannot animate scenes myself, it is called “The Bear Family.”
\
I use the Newton method. I wait for the apple to fall on my head then I get right to work, doing a sprint for usually a few hours until the project is completed and documented. I have workflows for everything depending on what I am building. I can build Web3 and quantum tests fully on my Chrome browser, most of what I do can be done on cloud services, but some things like the chess game have to be done on Visual Studio Code.
\
Figuring out the style. I have started doing a method where I pick based on mood. I found that my Twitter Bot I made which I have an article on is a valid way to approach it mentally. There are so many concepts for me to cover, multiple pillars, and of course my personality. Depending on my mood I pick from each of those categories and do a project.
\
High media (podcasts, news outlets, etc), Wired, The Verge, maybe even Joe Rogan someday, etc. There are a lot of people struggling and I want to share my story as much as possible. To remind people that they can do it. They can find happiness and they can overcome the challenges in their life. They cannot do it alone, that’s the great myth of the common era. I am reminded every day of my limitations. I cannot even go to a place I have never been before without asking for help. I may be a wizard on the computer, but in real life, I have to ask for help. Does not matter if it is a human or AI, I will always be in need of at least a little bit of help. There is no shame in it either, because doing things just for me feels shallow, I would much rather do things for others. It makes me feel ill when polish stops real stories being published. People say things like “why are you so open? You should be more confident and use fancy professional language all the time.” I am just being real, I don’t lie about my emotions or my abilities. When I say I cannot cross the street safely on my own (unless they have those beeping crosswalks) that is true. I hate living in a reality where being confident and hiding what is perceived as the ugly parts is seen as bad and a barrier to success. I cannot stand that people would rather make rage-bait and negative stories that only create division instead of something people can get behind and be happy about. I am just one guy but I hope to change that. I am grateful that HackerNoon hasn’t treated me that way, they seem to like my rawness and stories. The last thing I will say in this section is this: if people are addicted to being unhappy then I am gonna work very hard to put them all into rehab.
\
Cheese, I love cheese. All kinds of cheese really. Growing up there is a cheese factory on the coast here in Oregon. I would take trips there often with my family and even friend groups. They had an all you could sample cheese buffet. Now that I am 22, I enjoy my cheese with meat, and sometimes if I have it, wine. I am looking forward to international cheese and wine day next year.
\
I have multiple, I make music as mentioned before, I have books, I play this sword game called Mordhau which I sometimes stream on YouTube. Took me some time but I figured out how to play that game with my very limited vision. It is not easy but it’s better than doing nothing. I also enjoy pondering the sciences, and I am thinking about starting another book that will be in the sci-fi genre. Thinking of perhaps a comedy (I love comedy) where it’s a romcom. Interplanetary dating app is all I will mention.
\
More tech! More creativity! Most likely more Web3 and AI. I love making use cases for technology. It is endlessly fun for me to generate ideas that people could commercialize and use to start a business. I will also be doing more creative projects as well.
\
It’s great. I sat in the void for a while on other platforms but here on HackerNoon my content has been received really well. People seem to be reading my content which feels very cool because I started posting on HackerNoon 2 weeks ago (today is December 24th). Even now, updating this draft on January 6th, 2026 they were kind enough to give another story of mine top story. Eternally grateful.
\
Never let limits stop you. The most important knowledge I can share from my experience is that happiness is, and always has been, the point of life. Find what and who makes you happy and pursue it. Life can end at any moment, I know first hand. You could wake up one morning with a terrible headache, and 3 months later become blind. I learned a lot during that time, the most important was this: the people around you are all that matter. Business people didn’t come visit me in the hospital, my friends and family did. My parents missed so much work just to be there with me when I was scared and facing death. So to all those people who talk mad game about hustling and big money, ask yourself, will all that money and “success” be at your bedside comforting you when you die?
2026-01-20 16:25:44
We live in what are described as post-capitalist times, where the economic system that promotes the virtues of creating individual wealth has been variously described as broken, defunct, and even failed. It has, according to many, morphed into a system where the oligarchs control the resources required to make immense wealth and leave the rest to fend for themselves and fight over scraps. Some go to the extent of romanticizing the concept of a welfare state, where basic worries like food, shelter, and education are assured for all by the state.
It is acknowledged, however, that wealth-creating economic activity is the path to generating enough resources to afford that kind of nanny state where one is looked after from the cradle to the grave. Still, one is not certain that unbridled capitalism is the way to do that. The looming spectre of AI replacing human labour as a factor of production has further added to the chorus denouncing capitalism as a dehumanising and even sinister force hell-bent upon lining the coffers of already very rich oligarchs and their cronies. With the failed experiments of communism as a cautionary tale about the danger of going in the opposite direction of harnessing the resources of a nation for the greater good of its people, one is left at a crossroads when it comes to choosing an economic system that keeps everyone happy.
To the credit of capitalism, the immense wealth and the generally high standard of living found in Western Europe, North America, and elsewhere are the result of following unbridled capitalism. The bastion of communism, the Soviet Union and its allies in Eastern Europe, collapsed under the weight of their own contradictions. Fellow communist nation China was walking down the same course of self-destruction, until it changed course in the late 1970s and adopted capitalism lock, stock, and barrel, heralding an unprecedented era of growth and wealth increase for the average Chinese.
Similarly, in India, hundreds of millions of its people came out of extreme poverty for the first ever time on the back of big-ticket reforms carried out in the 1990s that opened up the Indian economy to the world, allowing it to finally step on the gas pedal when it came to achieving fast-paced economic growth.
As a matter of fact, wherever capitalism has been allowed to strike deep roots, it has transformed the economies and destinies of the people concerned. The most definitive proof of this lies in nations across the Southeast Asian region, especially in places like Singapore, Hong Kong, and Taiwan. It is also true of other nations in the region like Malaysia, Thailand, and even communist Vietnam.
Capitalism is far from a perfect system of bringing about economic growth and suffers from myriad ills that are well known and documented. These range from colonialism in the past and inequitable distribution of wealth to exploitation of people and environmental degradation in the present.
Yet, it is the only system that has delivered. From lifting nations and peoples out of poverty to the funding and financing of education, healthcare, infrastructure, discoveries, and inventions, there is much that has been the gift of capitalism to the world.
Does the only system of economic growth and development that has been adopted to varying degrees by 70 to 80% of the world’s population have a future? One would imagine that it does.
Where capitalism went wrong was in allowing the profit motive to quite often disregard the moral and ethical bedrock that should define any model of economic enterprise. It is similar to communism in that human follies corrupted the system and led to its assumed fall from grace, but unlike the latter, capitalism is not a fundamentally untenable system.
The ills of capitalism include, first and foremost, allowing certain groups to prosper at the cost of others, which alienates those left behind and breeds resentment. Often, the ones who fall behind are those whose parents and grandparents had prospered under the capitalist system - the same system that is now promoting the rise of a new elite possessing the skills currently in demand. The obvious case in point is the rise in demand for technology workers at the cost of traditional blue-collar workers. This has led to the rise of right-wing ultra-nationalist governments across the world that pander to the fears of such people by putting in place protectionist trade policies, which impede global trade, do further harm to the capitalist system, and in turn exacerbate the problems of the very people who claim to have been left behind in the economic sweepstakes.
Currently, there is a tendency for nations of the world to enter into separate trade agreements with nations or blocs of nations, rather than continue within the existing global trade order, which served the world so well in the years following the Second World War, right up to the present time. These populist measures ultimately don’t lead to any tenable solutions to what many, especially left-leaning people, believe are inherent flaws in capitalism. Whatever its flaws, reverting to failed communist and socialist economic models is undoubtedly worse than the temporary protectionist policies put in place by right-wing demagogues.
The thing with capitalism is that it is anything but a static process. If large numbers of people feel ill served by the existing trade arrangements of the world, there will be a reaction against it, with old certainties being discarded and new ones inexorably taking their place. Right now, the capitalist way of doing things is undergoing a flux, but it will find its new balance, like it always does.
The age of AI is changing the way economic activity will take place in the times ahead, with the nature of human labour as a growth factor undergoing a profound change. There will be immense challenges and equally immense opportunities presented to the nations of the world as they walk further down this path; yet it will undoubtedly be the capitalist way of doing things that shines a light on the path ahead. For that has been the way of humans since the earliest times. It has always been capitalist trade carried out between nations and civilizations that has shaped human destiny, and it will continue to do so.
:::info Lead photo by fauxels: https://www.pexels.com/photo/multi-cultural-people-3184419/
:::
\
2026-01-20 15:24:00
For over a decade, the “PHP stack” has been synonymous with a specific architecture: Nginx or Apache acting as a reverse proxy, speaking FastCGI to a pool of PHP-FPM workers. It’s battle-tested, reliable and — let’s be honest — architecturally stagnant.
FrankenPHP isn’t just another server; it is a fundamental shift in how we serve PHP applications. Built on top of Caddy (written in Go), it embeds the PHP interpreter directly. No more FastCGI overhead. No more Nginx config hell. And most importantly: Worker Mode.
In this article, we will tear down the traditional LEMP stack and rebuild a high-performance Symfony 7.4 application using FrankenPHP. We will cover:
- why the classic PHP-FPM request cycle wastes CPU on every request;
- a production-grade Docker setup based on the official FrankenPHP image;
- enabling Worker Mode through the Symfony Runtime component;
- the stateful-service pitfalls Worker Mode introduces, and how ResetInterface solves them;
- real-time updates via the embedded Mercure hub;
- benchmarks against a classic Nginx + PHP-FPM stack.
In a standard PHP-FPM setup, every single HTTP request triggers a “cold boot”:
1. Nginx accepts the request and forwards it to PHP-FPM over FastCGI.
2. PHP-FPM hands the request to one of its worker processes.
3. The worker executes public/index.php from scratch.
4. Composer’s autoloader pulls in your classes (OPcache softens this, but does not remove it).
5. Symfony boots: the kernel is instantiated, the container is initialized, routes and event listeners are wired up, and only then is your controller called.
For a heavy Symfony application, step 5 alone can take 30ms to 100ms. That is wasted CPU time spent every single time a user hits your API.
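If you want a rough feel for that bootstrap cost on your own machine, a small timing script along these lines gives a ballpark figure. The file name and the numbers are illustrative only; results depend entirely on your bundles and container size.
// bench_boot.php - hypothetical sketch: measures how long a warmed-up kernel takes to boot.
// Run `bin/console cache:warmup --env=prod` first so the compiled container exists,
// since that is exactly the cost PHP-FPM pays on every request.
require __DIR__.'/vendor/autoload.php';

$start = hrtime(true);

$kernel = new \App\Kernel('prod', false);
$kernel->boot();

printf("Kernel boot took %.1f ms\n", (hrtime(true) - $start) / 1e6);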
FrankenPHP takes a different approach: it is a modern application server. In Worker Mode, it boots your application once and keeps it in memory. Subsequent requests reuse the already-booted application.
Let’s build a production-grade container. We will use the official dunglas/frankenphp image.
my-app/
├── compose.yaml
├── Caddyfile
├── Dockerfile
├── public/
└── src/
We are using the latest stable FrankenPHP image with PHP 8.4 (recommended for Symfony 7.4).
# Dockerfile
FROM dunglas/frankenphp:1.4-php8.4
# Install system dependencies and PHP extensions
# The installer script is bundled with the image
# (intl is built against the ICU library, so no separate "icu" extension is needed)
RUN install-php-extensions \
    intl \
    opcache \
    pdo_pgsql \
    zip
# Set working directory
WORKDIR /app
# Install Composer
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
# Copy configuration files
# We will define Caddyfile later
COPY Caddyfile /etc/caddy/Caddyfile
# Environment settings for Symfony
ENV APP_ENV=prod
ENV FRANKENPHP_CONFIG="worker ./public/index.php"
# Copy source code
COPY . .
# Install dependencies
RUN composer install --no-dev --optimize-autoloader
# Final permissions fix
RUN chown -R www-data:www-data /app/var
We don’t need Nginx. FrankenPHP handles the web server role.
# compose.yaml
services:
  php:
    build: .
    # Map ports: HTTP, HTTPS and HTTP/3 (UDP)
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    volumes:
      - ./:/app
      - caddy_data:/data
      - caddy_config:/config
    environment:
      - SERVER_NAME=localhost
      # Enable Worker mode pointing to our entry script
      - FRANKENPHP_CONFIG=worker ./public/index.php
    tty: true

volumes:
  caddy_data:
  caddy_config:
Run the stack:
docker compose up -d --build
Check the logs to confirm the worker started:
docker compose logs -f php
You should see: FrankenPHP started ⚡.
The “magic” of keeping the app in memory requires a specific runtime. Symfony interacts with FrankenPHP through the runtime/frankenphp-symfony package.
composer require runtime/frankenphp-symfony
You need to tell the Symfony Runtime component to use FrankenPHP. Add this to your composer.json under extra:
"extra": {
"symfony": {
"allow-contrib": true,
"require": "7.4.*"
},
"runtime": {
"class": "Runtime\\FrankenPhpSymfony\\Runtime"
}
}
Now, update your public/index.php. Actually, you don’t have to. Since Symfony 5.3+, the index.php delegates to the Runtime component. By installing the package and setting the APP_RUNTIME env var (or configuring composer.json), Symfony automatically detects the FrankenPHP runner.
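For reference, this is roughly what the stock front controller looks like in recent Symfony skeletons (5.3+); nothing FrankenPHP-specific needs to be added to it.
// public/index.php - the standard front controller generated by the Symfony skeleton.
// The Runtime component (autoload_runtime.php) picks the FrankenPHP runner
// automatically once runtime/frankenphp-symfony is installed.
use App\Kernel;

require_once dirname(__DIR__).'/vendor/autoload_runtime.php';

return function (array $context) {
    return new Kernel($context['APP_ENV'], (bool) $context['APP_DEBUG']);
};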
When FrankenPHP starts with FRANKENPHP_CONFIG="worker ./public/index.php", Symfony 7.4 detects the environment variables injected by the server.
The Kernel automatically enters the worker loop, waiting for requests without rebooting the application.
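To demystify what the runtime does on your behalf, here is a stripped-down sketch of a raw FrankenPHP worker script, loosely based on the FrankenPHP documentation. You never write this yourself when using runtime/frankenphp-symfony; it only illustrates the "boot once, handle many" loop, and the names used are placeholders.
// public/worker.php - illustrative only; the Symfony runtime provides the equivalent loop.
ignore_user_abort(true);

require __DIR__.'/../vendor/autoload.php';

// Boot the application a single time, outside the request loop
$kernel = new \App\Kernel('prod', false);
$kernel->boot();

// The handler is invoked once per incoming HTTP request
$handler = static function () use ($kernel): void {
    // In the real runtime, superglobals are converted into a Request,
    // passed to $kernel->handle(), and the Response is streamed back.
};

// frankenphp_handle_request() blocks until a request arrives and
// returns false when the worker should shut down (e.g. on restart)
while (frankenphp_handle_request($handler)) {
    // Give the garbage collector a chance between requests
    gc_collect_cycles();
}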
When using Worker Mode, your services are shared across requests. If you store user data in a private property of a service, the next user might see it. This is the biggest mental shift from PHP-FPM.
// src/Service/CartService.php
namespace App\Service;

class CartService
{
    private array $items = []; // ⚠️ DANGER: This persists in Worker Mode!

    public function addItem(string $item): void
    {
        $this->items[] = $item;
    }

    public function getItems(): array
    {
        return $this->items;
    }
}
If User A adds “Apple” and then User B requests the cart, User B will see “Apple”.
Symfony 7.4 provides the Symfony\Contracts\Service\ResetInterface. Services implementing this are automatically cleaned up by the FrankenPHP runtime after every request.
// src/Service/CartService.php
namespace App\Service;

use Symfony\Contracts\Service\ResetInterface;

class CartService implements ResetInterface
{
    private array $items = [];

    public function addItem(string $item): void
    {
        $this->items[] = $item;
    }

    public function getItems(): array
    {
        return $this->items;
    }

    /**
     * Called automatically by the Kernel after each request
     */
    public function reset(): void
    {
        $this->items = [];
    }
}
Ensure your services are stateless where possible. If state is required, use the ResetInterface.
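Another option, when the data really belongs to a user rather than to the service, is to keep it out of service properties entirely. A hypothetical session-backed variant might look like the sketch below; this is not the article's implementation, just one way to stay stateless.
// src/Service/SessionCartService.php - hypothetical stateless alternative:
// the cart lives in the user's session, so nothing leaks between requests.
namespace App\Service;

use Symfony\Component\HttpFoundation\RequestStack;

class SessionCartService
{
    public function __construct(private RequestStack $requestStack) {}

    public function addItem(string $item): void
    {
        $session = $this->requestStack->getSession();
        $items = $session->get('cart_items', []);
        $items[] = $item;
        $session->set('cart_items', $items);
    }

    public function getItems(): array
    {
        return $this->requestStack->getSession()->get('cart_items', []);
    }
}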
FrankenPHP includes a Mercure hub (a protocol for pushing real-time updates to browsers). You don’t need a separate Docker container for it anymore.
Update the Caddyfile in your project root to enable the Mercure module.
{
    # Enable FrankenPHP
    frankenphp
    order mercure before php_server
}

{$SERVER_NAME:localhost} {
    # Enable compression
    encode zstd gzip

    # Enable Mercure Hub
    mercure {
        # Publisher JWT key (In production, use a long secure secret)
        publisher_jwt !ChangeThisMercureHubJWTSecretKey!
        # Allow anonymous subscribers
        anonymous
    }

    # Serve PHP
    php_server
    root * public/
}
Install the Mercure bundle:
composer require symfony/mercure-bundle
Configure config/packages/mercure.yaml:
mercure:
    hubs:
        default:
            url: https://localhost/.well-known/mercure
            public_url: https://localhost/.well-known/mercure
            jwt:
                # Must match the Caddyfile key
                secret: '!ChangeThisMercureHubJWTSecretKey!'
                publish: '*'
Here is a modern controller using Attributes and the new Dependency Injection improvements in Symfony 7.4.
// src/Controller/NotificationController.php
namespace App\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpKernel\Attribute\MapRequestPayload;
use Symfony\Component\Mercure\HubInterface;
use Symfony\Component\Mercure\Update;
use Symfony\Component\Routing\Attribute\Route;
use App\DTO\NotificationDto;

#[Route('/api/notifications')]
class NotificationController extends AbstractController
{
    public function __construct(
        private HubInterface $hub
    ) {}

    #[Route('/send', methods: ['POST'])]
    public function send(
        #[MapRequestPayload] NotificationDto $notification
    ): JsonResponse {
        $update = new Update(
            'https://example.com/my-topic',
            json_encode(['status' => 'alert', 'message' => $notification->message])
        );

        // Publish to the embedded FrankenPHP Mercure Hub
        $this->hub->publish($update);

        return $this->json(['status' => 'published']);
    }
}
DTO for Validation (PHP 8.4):
// src/DTO/NotificationDto.php
namespace App\DTO;

use Symfony\Component\Validator\Constraints as Assert;

readonly class NotificationDto
{
    public function __construct(
        #[Assert\NotBlank]
        #[Assert\Length(min: 5)]
        public string $message
    ) {}
}
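A quick way to sanity-check the MapRequestPayload wiring is a functional test. The sketch below assumes the usual symfony/test-pack setup and a hypothetical test file path; it only exercises validation, so it never needs to reach the Mercure hub.
// tests/Controller/NotificationControllerTest.php - hypothetical sketch.
namespace App\Tests\Controller;

use Symfony\Bundle\FrameworkBundle\Test\WebTestCase;

class NotificationControllerTest extends WebTestCase
{
    public function testTooShortMessageIsRejected(): void
    {
        $client = static::createClient();

        $client->request(
            'POST',
            '/api/notifications/send',
            server: ['CONTENT_TYPE' => 'application/json'],
            content: json_encode(['message' => 'hi'])
        );

        // #[MapRequestPayload] turns validation failures into a 422 response
        // before the controller body (and the Mercure publish) ever runs.
        $this->assertResponseStatusCodeSame(422);
    }
}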
I ran a load test using k6 on a standardized AWS t3.medium instance.
Scenario: Simple JSON API response in Symfony 7.4.
| Stack | Req/Sec (RPS) | P95 Latency |
|---|---|---|
| Nginx + PHP-FPM | 1,240 | 45 ms |
| FrankenPHP (Worker Mode) | 3,850 | 8 ms |
The results are conclusive. By removing the bootstrap phase, we achieve nearly 3x the throughput.
The release of Symfony 7.4 LTS combined with FrankenPHP v1.4+ marks the end of the PHP-FPM era for high-performance applications. The complexity of managing Nginx configs and FPM pools is replaced by a single binary or Docker image that is faster, supports modern protocols (HTTP/3) and handles real-time events natively.
If you are starting a new Symfony 7.4 project today, default to FrankenPHP. If you are maintaining a legacy one, plan your migration.
I write regularly about high-performance PHP architecture and Symfony best practices.
👉 Get in touch on LinkedIn [https://www.linkedin.com/in/matthew-mochalkin/] to discuss your migration strategy!
\
2026-01-20 15:10:57
How are you, hacker?
🪐 Want to know what's trending right now?
The Techbeat by HackerNoon has got you covered with fresh content from our trending stories of the day! Set email preference here.
## The Long Now of the Web: Inside the Internet Archive’s Fight Against Forgetting
By @zbruceli [ 18 Min read ]
A deep dive into the Internet Archive's custom tech stack. Read More.
By @kilocode [ 6 Min read ] CodeRabbit alternative for 2026: Kilo's Code Reviews combines AI code review with coding agents, deploy tools, and 500+ models in one unified platform. Read More.
By @drechimyn [ 7 Min read ] Broken Object Level Authorization (BOLA) is eating the API economy from the inside out. Read More.
By @dataops [ 4 Min read ] DataOps provides the blueprint, but automation makes it scalable. Learn how enforced CI/CD, observability, and governance turn theory into reality. Read More.
By @socialdiscoverygroup [ 19 Min read ] We taught Playwright to find the correct HAR entry even when query/body values change and prevented reusing entities with dynamic identifiers. Read More.
By @mohansankaran [ 10 Min read ] Jetpack Compose memory leaks are usually reference leaks. Learn the top leak patterns, why they happen, and how to fix them. Read More.
By @rahul-gupta [ 8 Min read ] As AI adoption grows, legacy data access controls fall short. Here’s why zero-trust data security is becoming essential for modern AI systems. Read More.
By @ivankuznetsov [ 9 Min read ] It’s far more efficient to run multiple Claude instances simultaneously, spin up git worktrees, and tackle several tasks at once. Read More.
By @praisejamesx [ 6 Min read ] Stop relying on "vibes" and "hustle." History rewards those with better models, not better speeches. Read More.
By @proflead [ 4 Min read ] Ollama is an open-source platform for running and managing large-language-model (LLM) packages entirely on your local machine. Read More.
By @David [ 37 Min read ] History of AI Timeline tracing the road to the AI boom. Built with Claude, Gemini & ChatGPT as a part of the launch of HackerNoon.ai, covering 251 events. Read More.
By @dataops [ 3 Min read ] Why great database design is really storytelling—and why ignoring relational fundamentals leads to poor performance AI can’t fix. Read More.
By @erelcohen [ 4 Min read ] Accuracy is no longer the gold standard for AI agents—specificity is. Read More.
By @jonstojanjournalist [ 3 Min read ] Ensure your emails are seen with deliverability testing. Optimize campaigns, boost engagement, and protect sender reputation effectively. Read More.
By @manoja [ 4 Min read ] A senior engineer explains how AI tools changed document writing, code review, and system understanding, without replacing judgment or accountability. Read More.
By @ishanpandey [ 5 Min read ] BTCC reports $5.7B tokenized gold volume in 2025 with 809% Q4 growth, marking gold as crypto's dominant real-world asset. Read More.
By @tigranbs [ 9 Min read ] A deep dive into my production workflow for AI-assisted development, separating task planning from implementation for maximum focus and quality. Read More.
By @superorange0707 [ 7 Min read ] Learn prompt reverse engineering: analyse wrong LLM outputs, identify missing constraints, patch prompts systematically, and iterate like a pro. Read More.
By @sanya_kapoor [ 16 Min read ] A 60-day test of 10 Bitcoin mining companies reveals which hosting providers deliver the best uptime, electricity rates, and ROI in 2026. Read More.
By @companyoftheweek [ 4 Min read ] Ola.cv is the official registry for the .CV domain, helping individuals to build next-gen professional links and profiles to enhance their digital presence. Read More.
🧑💻 What happened in your world this week? It's been said that writing can help consolidate technical knowledge, establish credibility, and contribute to emerging community standards. Feeling stuck? We got you covered ⬇️⬇️⬇️
ANSWER THESE GREATEST INTERVIEW QUESTIONS OF ALL TIME
We hope you enjoy this week's worth of free reading material. Feel free to forward this email to a nerdy friend who'll love you for it.
See you on Planet Internet! With love,
The HackerNoon Team ✌️