
Deploying RouteReality: The Real Challenges of Running a Live, Real-Time Bus Prediction System

2026-02-04 02:12:33

RouteReality: Building a Community-Powered Bus Tracker for Belfast
When I first launched routereality.co.uk, the goal was simple: give Belfast and Northern Ireland bus users better, more accurate arrival predictions than the official sources provide. Unlike apps that rely solely on scheduled timetables or delayed GPS feeds, RouteReality is fully community-powered.

Users check predicted times, wait at the stop, and tap to report when the bus actually arrives. Every report feeds back into the system, refining predictions for everyone in real time.

Today, the site runs 24/7, covering 100+ routes and 17,000+ stops, with live updates constantly streaming in from real users across the country. But building and deploying a system like this, one that people depend on every day, was far from straightforward. Here are the main problems I encountered along the way, and how they shaped the project.

1. Keeping Real-Time Data Accurate When Users Are Always Reporting

The core of RouteReality is user-submitted arrival reports. In theory, more reports = better predictions. In practice, the live nature of the system creates immediate challenges.

Timing mismatches and outliers — People report arrivals at slightly different times due to boarding the bus, network delays, or simply tapping a second too early/late. Early on, a few bad reports could mess with predictions by minutes. I had to implement outlier detection (ignoring reports more than ~3 minutes off the median) and time window clustering to group reports for the same bus instance.
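
To make that concrete, here is a minimal sketch of a median-based filter, assuming reports arrive as simple timestamped objects; the field names and threshold below are illustrative, not the production code.

interface ArrivalReport {
  stopId: string;
  routeId: string;
  reportedAt: number; // epoch milliseconds
}

const OUTLIER_WINDOW_MS = 3 * 60 * 1000; // drop reports ~3 minutes off the median

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Keep only the reports that cluster around the median arrival time for one bus instance
function filterOutliers(reports: ArrivalReport[]): ArrivalReport[] {
  if (reports.length < 3) return reports; // too few reports to judge an outlier
  const med = median(reports.map(r => r.reportedAt));
  return reports.filter(r => Math.abs(r.reportedAt - med) <= OUTLIER_WINDOW_MS);
}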

Duplicate or spam reports — With users constantly using the site, especially during peak commute hours, the same bus stop could receive multiple reports in seconds. Without careful deduplication logic (based on user session, location hints, and timestamp proximity), predictions would jump erratically.
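
In outline, deduplication can key each report on the session, the stop, and a coarse time bucket, so rapid repeat taps collapse into a single signal. A rough sketch with hypothetical names, simplified compared to also using location hints:

interface RawReport {
  sessionId: string;
  stopId: string;
  reportedAt: number; // epoch milliseconds
}

const DEDUP_BUCKET_MS = 60 * 1000; // reports in the same minute count as one

// Collapse duplicates from the same session, stop and time bucket, keeping the earliest
function dedupe(reports: RawReport[]): RawReport[] {
  const seen = new Map<string, RawReport>();
  for (const report of reports) {
    const bucket = Math.floor(report.reportedAt / DEDUP_BUCKET_MS);
    const key = `${report.sessionId}:${report.stopId}:${bucket}`;
    if (!seen.has(key)) seen.set(key, report);
  }
  return [...seen.values()];
}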

Sparse data in the early days — When the user base was small, many stops had zero or one report per day. Predictions defaulted to timetable estimates, but users expected better. Bootstrapping accuracy required careful fallback logic and incentives to encourage early reporting.
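
One simple way to express that fallback is to blend the timetable estimate with the community estimate, weighted by how many reports exist. This is only a sketch of the idea; the weights and saturation point are made up for illustration.

// Blend the scheduled time with the community estimate, weighted by report count
function predictArrival(
  scheduledAt: number,              // epoch ms from the timetable
  communityEstimate: number | null, // epoch ms derived from recent reports, if any
  reportCount: number
): number {
  if (communityEstimate === null || reportCount === 0) return scheduledAt;
  const weight = Math.min(reportCount / 10, 1); // confidence saturates around 10 reports
  return Math.round(weight * communityEstimate + (1 - weight) * scheduledAt);
}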

2. Scalability and Performance Under Constant User Load

Unlike a static site, RouteReality has users actively querying predictions and submitting reports at all hours. The system needs to handle concurrent reads/writes without lag, especially during busy periods.

Database pressure — Storing millions of timestamped reports requires a time-series-friendly setup. Early prototypes using a standard relational DB choked on write-heavy loads. Moving to a more scalable store (with proper indexing and partitioning by route/stop/day) prevented slowdowns.

Server costs and monitoring — Running 24/7 means no off-hours for heavy maintenance. Unexpected spikes in traffic (e.g., bad weather driving more bus usage) could spike costs or cause brief slowdowns. Setting up real-time monitoring dashboards became essential to catch issues before users noticed.

3. Live Deployment Nightmares: No Room for Downtime

Deploying updates to a live system with constant user activity is unforgiving. Even a 30-second outage frustrates someone waiting at a rainy stop.

Zero-downtime deploys — Early deploys caused brief interruptions as the server restarted. Implementing blue-green deployments or rolling updates eliminated that pain, but required more infrastructure setup.

Bug fixes under pressure — A subtle bug in report aggregation once caused predictions to drift by 5+ minutes for a popular route during evening rush. Users noticed immediately and reported it (ironically helping debug). Hotfixes had to be rolled out without breaking ongoing sessions.

Testing in production-like conditions — Local tests missed real-world issues like network latency on mobile data, varied device clocks, or users in poor signal areas. Gradual rollouts and feature flags became my best friends.

Lessons Learned and What's Next
Launching RouteReality taught me that real-time, user-dependent systems are as much about people as technology. Community engagement is the biggest variable. The more people report, the better it gets, but getting that feedback loop started requires patience and careful tuning.

Despite the challenges, the system is live, improving daily, and already helping commuters in Belfast and beyond. Future plans include better outlier handling, optional location-based reporting (with privacy controls), and deeper analytics to spot patterns (e.g., chronically late routes).

If you're a regular user, thank you for every report. It directly makes predictions more accurate for everyone. And if you haven't tried it yet, head to the journey page and start reporting. The more we all contribute, the better RouteReality becomes.

Always cross-check with official Translink sources, as RouteReality remains an independent community project.

Posted February 2026

MdBin Levels Up Again: E2E Encryption, Theme Toggle, and Responsive Nav

2026-02-04 02:07:57

The Evolution Continues (Again)

Last time, I shared how MdBin migrated to Streamdown for better markdown rendering—Mermaid diagrams, KaTeX math, built-in controls, the works.

But there was one feature request that kept coming up: "Can I share sensitive content without you seeing it?"

Today, I'm excited to announce: end-to-end encrypted pastes are live.

The Problem with Traditional Pastebins

Here's the uncomfortable truth about every pastebin service: they can read your content.

When you paste something into Pastebin, GitHub Gists, or even MdBin (until today), the server receives your plaintext, stores it, and serves it back. The service operator—and anyone who gains access to their database—can read everything you've shared.

For most use cases, this is fine. But what about:

  • API keys you need to share with a teammate
  • Private configuration snippets
  • Sensitive meeting notes
  • Personal information

The traditional answer is "just use Signal" or "encrypt it yourself first." But that adds friction, and friction kills adoption.

The Solution: True End-to-End Encryption

With MdBin's new encrypted paste feature, the server becomes a dumb blob storage. Here's what happens:

  1. You type content in the browser
  2. JavaScript encrypts it with your password before it leaves your device
  3. We store the encrypted blob—we literally cannot read it
  4. Recipients decrypt in their browser using the same password

The server never sees your plaintext. We don't store your password. We couldn't decrypt your content even if we wanted to.

The Crypto Implementation

I didn't roll my own crypto (please never do this). Instead, I used the Web Crypto API with industry-standard algorithms.

Key Derivation: PBKDF2

Passwords are weak. Turning a password into a strong encryption key requires a key derivation function:

const PBKDF2_ITERATIONS = 310000 // OWASP 2023 recommendation

async function deriveKey(
  password: string,
  salt: Uint8Array
): Promise<CryptoKey> {
  const encoder = new TextEncoder()
  const passwordBuffer = encoder.encode(password)

  const keyMaterial = await crypto.subtle.importKey(
    'raw',
    passwordBuffer,
    'PBKDF2',
    false,
    ['deriveBits', 'deriveKey']
  )

  return crypto.subtle.deriveKey(
    {
      name: 'PBKDF2',
      salt: salt,
      iterations: PBKDF2_ITERATIONS,
      hash: 'SHA-256',
    },
    keyMaterial,
    { name: 'AES-GCM', length: 256 },
    false,
    ['encrypt', 'decrypt']
  )
}

Why 310,000 iterations? That's the OWASP 2023 recommendation for PBKDF2-HMAC-SHA256. It makes brute-force attacks computationally expensive while still being fast enough on modern devices.

Encryption: AES-256-GCM

For the actual encryption, I chose AES-256-GCM—authenticated encryption that provides both confidentiality and integrity:

export async function encrypt(
  plaintext: string,
  password: string
): Promise<string> {
  const encoder = new TextEncoder()
  const plaintextBuffer = encoder.encode(plaintext)

  // Generate random salt and IV for each encryption
  const salt = crypto.getRandomValues(new Uint8Array(16))
  const iv = crypto.getRandomValues(new Uint8Array(12))

  const key = await deriveKey(password, salt)

  const ciphertext = await crypto.subtle.encrypt(
    { name: 'AES-GCM', iv },
    key,
    plaintextBuffer
  )

  // Combine: salt || iv || ciphertext
  const combined = new Uint8Array(
    16 + 12 + ciphertext.byteLength
  )
  combined.set(salt, 0)
  combined.set(iv, 16)
  combined.set(new Uint8Array(ciphertext), 28)

  return btoa(String.fromCharCode(...combined))
}

Key points:

  • Random salt per paste: Same password + different content = different ciphertext
  • Random IV per encryption: Required by GCM mode for security
  • Base64 output: Safe to store in any database text field

Decryption

Decryption reverses the process:

export async function decrypt(
  encrypted: string,
  password: string
): Promise<string> {
  const combined = new Uint8Array(
    atob(encrypted).split('').map(c => c.charCodeAt(0))
  )

  // Extract salt, iv, and ciphertext
  const salt = combined.slice(0, 16)
  const iv = combined.slice(16, 28)
  const ciphertext = combined.slice(28)

  const key = await deriveKey(password, salt)

  const plaintextBuffer = await crypto.subtle.decrypt(
    { name: 'AES-GCM', iv },
    key,
    ciphertext
  )

  return new TextDecoder().decode(plaintextBuffer)
}

If you provide the wrong password, crypto.subtle.decrypt throws—GCM's authentication tag verification fails. No partial decryption, no garbage output, just a clean error.
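
In practice that means the UI can treat decryption as all-or-nothing. Here's a small usage sketch of how a page might call the decrypt function above (the helper name is mine, not MdBin's actual code):

async function tryDecrypt(encrypted: string, password: string): Promise<string | null> {
  try {
    return await decrypt(encrypted, password)
  } catch {
    // Wrong password or tampered ciphertext: the GCM tag check failed
    return null
  }
}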

The UX Implementation

Crypto is useless if people don't use it. Here's how I made encryption approachable.

Normal/Encrypted Toggle

The paste form now has a mode switcher:

<div className="flex items-center gap-2 p-1 bg-gray-100 rounded-lg w-fit">
  <button
    onClick={() => setIsEncrypted(false)}
    className={!isEncrypted ? 'bg-white shadow-sm' : ''}
  >
    <LockOpen className="w-4 h-4" />
    Normal
  </button>
  <button
    onClick={() => setIsEncrypted(true)}
    className={isEncrypted ? 'bg-white shadow-sm' : ''}
  >
    <Lock className="w-4 h-4" />
    Encrypted
  </button>
</div>

Simple, obvious, no hidden settings pages.

Password Strength Meter

Weak passwords defeat encryption. I added real-time password validation:

export function validatePassword(password: string): ValidationResult {
  const checks = {
    minLength: password.length >= 8,
    hasLowercase: /[a-z]/.test(password),
    hasUppercase: /[A-Z]/.test(password),
    hasNumber: /[0-9]/.test(password),
    hasSpecial: /[!@#$%^&*...]/.test(password),
  }

  let score = Object.values(checks).filter(Boolean).length
  if (password.length >= 12) score++
  if (password.length >= 16) score++

  // Clamp to the 0-4 strength scale the UI expects
  return { isValid: checks.minLength, checks, strength: Math.min(score, 4) }
}

The UI shows a color-coded bar and checkmarks for each requirement. Users see exactly what makes a strong password.

The Sharing Trick: URL Hash

Here's a clever feature: you can share the password in the URL.

https://mdbin.sivaramp.com/e/abc123#MySecretPassword

The fragment after # never gets sent to the server—it's browser-only. So you can share a complete self-decrypting link, and we still never see the password.

useEffect(() => {
  if (typeof window !== 'undefined') {
    const hash = window.location.hash.slice(1)
    if (hash) {
      const password = decodeURIComponent(hash)
      // Clear hash immediately to prevent browser history leak
      window.history.replaceState(null, '', window.location.pathname)
      handleDecrypt(password, false)
    }
  }
}, [])

The hash is immediately cleared from the URL bar after reading. It won't appear in browser history, bookmarks, or shared screenshots.

Security Considerations

What We Can't Do

With encrypted pastes, MdBin cannot:

  • Read your content
  • Reset or recover your password
  • Comply with data requests for your plaintext (we don't have it)
  • Tell you what you encrypted if you forget the password

This is a feature, not a bug.

localStorage Trade-offs

The "Remember password" feature stores passwords in localStorage. I added clear warnings:

{savePassword && (
  <p className="text-xs text-amber-600">
    Password will be stored in your browser.
    Only use on trusted devices.
  </p>
)}

And there's a "Forget & Lock" button to clear stored passwords and re-lock the paste.

Size Limits

Encrypted pastes have a 75KB limit (vs 100KB for normal). Base64 encoding and the salt/IV overhead add ~35% to the stored size.
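
As a rough back-of-the-envelope check (an estimate, not the exact server-side limit logic): the stored blob is salt (16) + IV (12) + ciphertext (plaintext length plus a 16-byte GCM tag), and Base64 then expands that by about a third.

// Estimate the stored size of an encrypted paste in bytes
function estimateStoredSize(plaintextBytes: number): number {
  const saltAndIv = 16 + 12        // prepended to the ciphertext
  const gcmTag = 16                // AES-GCM appends an authentication tag
  const rawBytes = saltAndIv + plaintextBytes + gcmTag
  return Math.ceil(rawBytes / 3) * 4 // Base64 expands by ~33%
}

estimateStoredSize(75 * 1024) // ≈ 102,460 bytes, so a 75KB paste stores as roughly 100KB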

What I Gained

| Feature | Before | After |
|---|---|---|
| Server can read content | ✅ Yes | ❌ No (encrypted) |
| Password recovery | N/A | ❌ Impossible |
| Share sensitive content | ❌ Risky | ✅ Safe |
| Self-decrypting links | ❌ No | ✅ URL hash |
| Encryption algorithm | N/A | AES-256-GCM |
| Key derivation | N/A | PBKDF2 (310k iterations) |

Try It Out

Head to mdbin.sivaramp.com, toggle to Encrypted mode, and paste something sensitive.

Here's a test you can try:

  1. Create an encrypted paste with password test123
  2. Note how the URL is /e/[id] instead of /p/[id]
  3. Share the link as https://mdbin.sivaramp.com/e/[id]#test123
  4. Open in incognito—it auto-decrypts

Streamdown Plugin Update

One more thing: Streamdown moved to a plugin architecture in a recent update. The new setup looks like this:

import { createCodePlugin } from '@streamdown/code'
import { mermaid } from '@streamdown/mermaid'
import { math } from '@streamdown/math'

const code = createCodePlugin({
  themes: ['github-light', 'github-dark'],
})

<Streamdown plugins={{ code, mermaid, math }}>
  {content}
</Streamdown>

Same great features, more modular architecture. I updated the home page to highlight all three new capabilities: Mermaid diagrams, Math/LaTeX, and end-to-end encryption.

Bonus: Theme Toggle & Responsive Navbar

While I was at it, I added two quality-of-life improvements that deserved their own deep dive.

Dark Mode Toggle with next-themes

Previously, MdBin only respected prefers-color-scheme—you got whatever your OS dictated. Now there's a proper theme toggle.

The Setup

First, install next-themes:

bun add next-themes

Create a ThemeProvider wrapper:

// src/components/theme-provider.tsx
'use client'

import { ThemeProvider as NextThemesProvider } from 'next-themes'

export function ThemeProvider({ children }: { children: React.ReactNode }) {
  return (
    <NextThemesProvider
      attribute="class"
      defaultTheme="system"
      enableSystem
      disableTransitionOnChange
    >
      {children}
    </NextThemesProvider>
  )
}

Key config:

  • attribute="class" — Adds .dark class to <html> instead of using data attributes
  • enableSystem — Respects OS preference when set to "system"
  • disableTransitionOnChange — Disables CSS transitions while the theme switches, so colours snap cleanly instead of animating

Tailwind CSS v4 Dark Mode

Here's the trick: Tailwind v4 uses a different syntax for custom variants. In globals.css:

@import 'tailwindcss';

@custom-variant dark (&:where(.dark, .dark *));

This enables class-based dark mode alongside Tailwind's existing dark: utilities. All those dark:bg-gray-900 classes now work with next-themes.

The Toggle Component

'use client'

import { useTheme } from 'next-themes'
import { useEffect, useState } from 'react'
import { Sun, Moon, Monitor } from 'lucide-react'

export function ThemeToggle() {
  const [mounted, setMounted] = useState(false)
  const { theme, setTheme } = useTheme()

  // Avoid hydration mismatch
  useEffect(() => setMounted(true), [])
  if (!mounted) return <div className="w-9 h-9" /> // Placeholder

  const cycleTheme = () => {
    if (theme === 'light') setTheme('dark')
    else if (theme === 'dark') setTheme('system')
    else setTheme('light')
  }

  return (
    <button
      onClick={cycleTheme}
      className="p-2 rounded-lg hover:bg-gray-100 dark:hover:bg-gray-800"
      aria-label={`Current theme: ${theme}`}
    >
      {theme === 'light' && <Sun className="w-5 h-5" />}
      {theme === 'dark' && <Moon className="w-5 h-5" />}
      {theme === 'system' && <Monitor className="w-5 h-5" />}
    </button>
  )
}

The mounted check prevents hydration mismatches—next-themes doesn't know the theme until client-side JavaScript runs.

Responsive Hamburger Menu

The header was getting crowded on mobile: Copy Link, Raw, New Paste, plus the new theme toggle. Instead of cramming tiny buttons, I added a hamburger menu below the md: breakpoint.

Desktop vs Mobile

<div className="flex items-center gap-2">
  {/* Desktop: full button row */}
  <div className="hidden md:flex items-center gap-2">
    <button onClick={handleCopy}>Copy Link</button>
    <Link href={`/p/${pasteId}/raw`}>Raw</Link>
    <Link href="/">New Paste</Link>
  </div>

  {/* Always visible */}
  <ThemeToggle />

  {/* Mobile: hamburger */}
  <div className="md:hidden relative">
    <button onClick={() => setIsMenuOpen(!isMenuOpen)}>
      {isMenuOpen ? <X /> : <Menu />}
    </button>
    {isMenuOpen && <DropdownMenu />}
  </div>
</div>

Click-Outside & Escape Key Handling

Two patterns I always include for dropdowns:

const menuRef = useRef<HTMLDivElement>(null)
const buttonRef = useRef<HTMLButtonElement>(null)

// Close on click outside
useEffect(() => {
  function handleClickOutside(event: MouseEvent) {
    if (
      menuRef.current &&
      buttonRef.current &&
      !menuRef.current.contains(event.target as Node) &&
      !buttonRef.current.contains(event.target as Node)
    ) {
      setIsMenuOpen(false)
    }
  }

  if (isMenuOpen) {
    document.addEventListener('mousedown', handleClickOutside)
    return () => document.removeEventListener('mousedown', handleClickOutside)
  }
}, [isMenuOpen])

// Close on Escape
useEffect(() => {
  function handleEscape(event: KeyboardEvent) {
    if (event.key === 'Escape') setIsMenuOpen(false)
  }

  if (isMenuOpen) {
    document.addEventListener('keydown', handleEscape)
    return () => document.removeEventListener('keydown', handleEscape)
  }
}, [isMenuOpen])

Only attach listeners when the menu is open. Clean them up on close. No memory leaks.

The Dropdown

{isMenuOpen && (
  <div
    ref={menuRef}
    className="absolute right-0 top-full mt-2 w-48 bg-white dark:bg-gray-900
               border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg py-2"
  >
    <button onClick={() => { handleCopy(); setIsMenuOpen(false) }}>
      Copy Link
    </button>
    <Link href={`/p/${pasteId}/raw`} onClick={() => setIsMenuOpen(false)}>
      Raw
    </Link>
    <Link href="/" onClick={() => setIsMenuOpen(false)}>
      New Paste
    </Link>
  </div>
)}

Each action closes the menu. The absolute right-0 top-full positions it below the hamburger button, aligned to the right edge.

Small details, but they matter for usability.

What's Next

With rendering and encryption sorted, the roadmap is clear:

  • Expiration options — 1 hour, 1 day, 1 week, or permanent
  • Edit links — Update pastes with a secret token
  • Syntax-aware editor — CodeMirror or Monaco for the input
  • Paste forking — Duplicate and modify existing pastes

The foundation is solid. The features are useful. Now it's about polish and power-user capabilities.

TL;DR: Added end-to-end encryption to MdBin using AES-256-GCM with PBKDF2 key derivation (310k iterations). Server never sees your plaintext or password. Share sensitive content via self-decrypting URL hash links. Also: Streamdown plugin architecture upgrade, dark/light/system theme toggle with next-themes + Tailwind v4 class-based dark mode, and responsive hamburger menu with proper click-outside and escape key handling.

Check out the encrypted paste feature at mdbin.sivaramp.com

Platform Thread vs Virtual Thread in Java

2026-02-04 02:07:24

For a long time, I was confident that I understood how concurrency worked in Java.

Create a thread.
Start it.
Join it.
After that, a thread pool handled the growing workload.

Simple… right?

Then I started reading about Virtual Threads in Java (Project Loom), and suddenly I realized — we’ve been living with limitations we just accepted as “normal”.

This post is my attempt to explain:

  1. What Platform Threads are
  2. What problems virtual threads solve
  3. How they differ
  4. And when it makes sense to use one over the other

not like documentation, but like how I actually understood it.

1. Platform Threads

In Java, the classic thread model is called a Platform Thread. These are traditional threads backed directly by the Operating System.

Every Java developer starts with this:

Thread thread = new Thread(() -> {
    doWork();
});
thread.start(); // Starts a new platform thread

It feels powerful at first. You’re doing real parallel work. Actual multitasking.

But then the cracks appear.

  • Creating too many threads? ❌ App slows down.
  • Thousands of requests? ❌ Thread pool exhausted.
  • Blocking I/O? ❌ Threads just sit there doing nothing.

Why...?

Because platform threads are directly mapped to OS threads.

That means:

  • One Java thread = one OS thread.
  • OS threads are expensive.
  • Each thread has a big stack and scheduling cost.

So instead of writing simple code, we started doing tricks:

  • Fixed-size thread pools.
  • Async APIs.
  • Reactive programming.
  • Complicated execution models.

Not because we wanted to — but because threads didn’t scale.

The Question That Couldn’t Be Ignored
If a thread is waiting for I/O, why is it still blocking an OS thread?

During I/O operations, a thread:

  • isn’t doing any computation.
  • isn’t using the CPU.
  • is simply waiting for an external operation to complete.

So why is it treated like a scarce system resource?
If the thread isn’t actively running, why should it continue holding an OS thread?

That question is exactly where Virtual Threads come in.

2. Virtual Threads — Lightweight & Scalable

Virtual threads are super-light, JVM-managed threads that let us handle millions of concurrent tasks without worrying about OS thread overhead.

Creating one is straightforward:

Thread thread = Thread.startVirtualThread(() -> {
    doWork();
});

That’s it.

  • Same programming model.
  • Same blocking style.
  • Totally different execution story.

Virtual threads are:

  • Created by the JVM
  • Not permanently attached to OS threads
  • Extremely cheap compared to platform threads

And that changes everything.

What Actually Happens?
Virtual threads run on top of platform threads, called carrier threads.

When a virtual thread:

  1. is running -> it’s mounted on a carrier thread.
  2. hits a blocking operation -> it gets parked.
  3. is unblocked -> it resumes on any available carrier thread.

So instead of blocking an OS thread while waiting, the JVM simply moves on.

In simple terms:

  • Platform threads block the system.
  • Virtual threads block only themselves.

This solves the core scalability problem of platform threads.

  • Blocking I/O no longer wastes OS threads.
  • Thread pools stop being a hard limit.
  • Simple blocking code can scale to massive concurrency.

Virtual threads don’t remove blocking — they remove the cost of blocking.

That’s the breakthrough.

3. How They Differ

The difference is easiest to see side by side - notice what happens during blocking:

| | Platform Thread | Virtual Thread |
|---|---|---|
| Backed by | One OS thread per Java thread | JVM-managed, mounted on a carrier thread |
| Cost to create | Expensive (large stack, OS scheduling) | Extremely cheap |
| During blocking I/O | Holds the OS thread while waiting | Parks and frees the carrier thread |
| Practical concurrency | Limited by thread pools | Millions of concurrent tasks |

4. When Does It Make Sense to Use One Over the Other?

Virtual threads are a great default for most modern applications — especially when the workload is I/O-heavy.

They make sense when:

  • Handling many concurrent requests
  • Waiting on databases, APIs, or file systems
  • Building web servers, APIs, or microservices
  • Wanting simple, blocking code that still scales

Platform threads still matter, though.

They make sense when:

  • The work is CPU-intensive
  • Threads need tight interaction with the OS
  • Thread pinning or native calls are involved

In practice, it comes down to this:

  • Use virtual threads for waiting.
  • Use platform threads for computing.

That rule alone covers most real-world decisions.

Final take

Virtual threads didn’t make Java faster.
They just stopped punishing us for blocking.

We can write simple code.
We can wait on I/O.
And the JVM handles the mess.

No thread pool anxiety.
No async gymnastics.
No PhD in callbacks.

Same Java.
Same threads (mostly).
Just way fewer headaches.

That’s it.
If this helped you even a little, hit ❤️, drop a comment, or share your thoughts below.

How do you know the software is working?

2026-02-04 01:54:13

How do you know the software is working?

Hiya!

I'm back! I feel that I owe you an explanation of what's going on here.

I started this blog because I want to share my insights on agentic coding from the perspective of a developer, CTO, CEO and founder. I plan to cover the entire autonomous AI coding journey.

Throughout this series, we'll get our mindset right about the many roles you'll take on: product designer, project manager, tech lead and quality assurance engineer. Later, I'll take you through a brainstorming session. Once we have a feature specification in place, we will learn how to manage a group of coding agents. We'll learn how to enforce the rules and, most importantly, why they're important. By the end, you will be confidently shipping AI-generated code to production. We will be doing some 'vibe coding' in production.

Not necessarily in that order.

Buckle up. Here's post #2.

In the previous post, we covered how to make Claude stick to conventions (tl;dr - skills + hooks fix it). Now it follows the rules but...

Marcin, all tasks are complete.

I open a browser and see:

NoMethodError: undefined method 'hallucinated_method' for an instance of User (NoMethodError)

Yeah, good job, Claude! High five, let's ship it to production... NOT.

This brings us to a fundamental question: how do you know the software is working?

Back in 2017, I was working on a payment processor for a company called Paladin Software. We were processing huge YouTube earnings spreadsheets (yes, gigabyte-sized CSV files). My job was to ensure that we did it on time and that it simply worked.

One beautiful Thursday afternoon, I headed to my favourite spot in Krakow at the time, Dolnych Młynów. Friday was a day off.

As you might have guessed, one of the clients uploaded their spreadsheet on Friday. When I got back to the office on Monday, the earnings still hadn't been processed. Questions were being asked.

I'm looking at Sidekiq's failed jobs queue:

NoMethodError: undefined method 'hallucinated_method' for an instance of User (NoMethodError)

Was I an LLM before it was a thing? I was certainly shipping code like one.

LLMs have a condition called anterograde amnesia. This is the inability to form new memories after the onset of the condition. They remember their past, but new experiences don't stick. Unlike me, they can't learn from a production incident on a Friday. Every session starts from zero.

This is why they must be given a set of strict rules each time they write code (see the previous post about enforcing these rules). However, rules alone are not enough. We also need checks and reviews.

Deterministic checks

LLMs are non-deterministic. This means that, just like me, the AI agent will sometimes produce excellent code and sometimes it won't. Sometimes it will spend a lot of time testing the feature. At other times, one test will look like more than enough.

To mitigate this, we need to implement some deterministic checks in our workflow. We need something that clearly indicates when something is wrong. Here's my opinion on what should be included in local CI:

  • Rubocop - static code analyser and linter. Autocorrect on.
  • Prettier - ERB, CSS and JS code formatter.
  • Brakeman - static analysis tool that checks for security vulnerabilities.
  • RSpec - testing framework; use your favourite. Important: use SimpleCov to report coverage.
  • Undercover - warns about methods, classes and blocks that were changed without tests. It does so by analysing data from git diffs, code structure and SimpleCov coverage reports.

We enforce a single code style. Our code is secure. We have tests for new and changed code. All tests pass (by the way, how many times has an AI agent told you that a test failure is unrelated to its changes?).

There are no more runtime errors. Reliability is better. Once again, some of my frustrations are gone.

You might ask: aren't there too many tests and too much boilerplate? No, unit tests are fast. With coding agents, they are more maintainable than ever before. This is a pretty good deal for improved reliability.

Local CI

Wrap all of this in your local CI. If you're running Rails 8.1 or later, it's already in the framework. For Rails 8.0 and earlier you can take my ported implementation of it. Alternatively, you can create your own.

This is the sample output:

Continuous Integration
Running checks...

Rubocop
bundle exec rubocop -A

✅ Rubocop passed in 1.98s                                  

Prettier
yarn prettier --config .prettierrc.json app/packs app/components --write

✅ Prettier passed in 1.57s                                 

Brakeman
bundle exec brakeman --quiet --no-pager --except=EOLRails

✅ Brakeman passed in 8.76s                                 

RSpec
bundle exec parallel_rspec --serialize-stdout --combine-stderr

✅ RSpec passed in 1m32.45s                                 

Undercover
bundle exec undercover --lcov coverage/lcov/app.lcov --compare origin/master

✅ Undercover passed in 0.94s                               
✅ Continuous Integration passed in 1m45.70s

Code review

Back in 2014, I got my second IT job as a junior Rails developer at Netguru. The onboarding process included the Netguru way of writing code. Specific libraries and patterns.

I was writing code The Rails Way. I didn't have much experience with production-grade Rails apps. During one of the code reviews, I received feedback that my models were a bit too fat. They also provided a link to an article by Code Climate: '7 Ways to Decompose Fat ActiveRecord Models'.

I kinda heard about these rules. I was just so focused on getting the business logic right that I didn't apply them...

Oh wait, isn't that the exact same thing Claude told me?

"The CLAUDE.md says 'ALWAYS STOP and ask for clarification rather than making assumptions' and I violated that repeatedly. I got caught up in the momentum of the Rails 8 upgrade and stopped being careful."

It's not a new problem for the software industry. The remedy? Code review, obviously. Each pull request must be checked by another developer. This allows less experienced developers to learn good practices and enables more experienced developers to mentor others and pass on their knowledge. The rules are also enforced. Everybody wins.

Remember: never let the developer review their own code. The same applies to AI agents.

Three-stage code review

Why three stages?

Firstly, we will check that the implementation complies with the functionality specifications. This involves verifying that the agent has built what was requested (neither more nor less).

Secondly: a review of Rails and project-specific conventions. To do this, we have to load all the conventions (see the previous post) and check them. Are the interfaces clean? View components instead of partials? Are jobs idempotent and thin? Do the tests verify behaviour?

Last but not least: A general code quality review of architecture, design, documentation, standards and maintainability.

All of these things give us a comprehensive overview of the implementation and any possible deviations. Each review is carried out by a different agent with a fresh perspective and no attachment to the feature.

Here's what a full report looks like in practice:

1. Spec compliance - line-by-line verification:

| Requirement | Implementation | Status |
|---|---|---|
| Column: delay_peer_reviews | :delay_peer_reviews | ✅ Match |

2. Rails conventions - checklist:

| Convention | Status |
|------------|--------|
| Reversible migration | PASS |
| Handles existing data | PASS |

3. Code quality - structured report with Strengths, Critical/Important/Minor issues, references, and merge assessment.

Final summary table:

| Check | Status |
|-------|--------|
| ✅ Spec compliance | Passed |
| ✅ Rails conventions | Passed |
| ✅ Code quality | Approved with minor suggestions |
| ✅ Local CI | Passed |

Ready for merge.

When issues are found, it consolidates them:

## Legitimate Findings to Address
1. No error handling in Discord::Client#post
2. No error handling in OAuth callback

## Findings I'm Skipping (Your Explicit Decisions)
- No encrypts on token fields (you requested this)

Which of these do you want me to address?

My /codereview command and review agent prompts are on GitHub.

How does this fit together

Let the AI agent write the code.

Tell the agent to run bin/ci and fix everything until it's green. Every failure is their responsibility.

They will make the local CI green.

Run the command /codereview.

The agent fixes any issues.

Run /codereview again until the code is ready.

Personally, I don't read the code until this point. As a good manager, I don't micromanage.

Be a good manager, too. Provide a set of rules and the tools needed to enforce them. Don't micromanage. If you're not happy with the results, adjust the rules. Repeat until you are happy with the results.

Trust, but verify.

The spec compliance and code quality review agents are based on https://github.com/obra/superpowers.

This is the second post in a longer series.

So far, we have covered:

  • Post #1: making the agent stick to conventions (skills + hooks)
  • Post #2 (this one): verifying the result with deterministic checks and multi-stage code review

I'd love to hear your thoughts. Reach out to me on LinkedIn or at [email protected].

Backend API Optimization at Scale: Handling 100K+ Users with Node.js & Express

2026-02-04 01:52:01

TL;DR: Discover the exact backend optimization strategies that reduced API response times from 800ms to 120ms, scaled from 120 req/s to 8,500 req/s, and cut costs by 60% - all while handling 100K+ concurrent users. Real metrics and production-ready patterns included! 🚀

Frontend performance means nothing if your backend can't keep up. At 100K+ users, every millisecond of API latency matters. Here's how I transformed my Node.js/Express backend from struggling with hundreds of requests per second to smoothly handling thousands.

🎯 The Backend Challenge: Speed, Scale & Reliability

When you scale from 1K to 100K+ users, backend challenges multiply:

  • API response times that were acceptable at 800ms become unacceptable
  • Single server architecture can't handle the load
  • Database connections become the bottleneck
  • Memory leaks that were hidden now crash servers
  • Error rates spike without proper handling
  • Costs skyrocket without optimization

The key insight: You can't just "add more servers" - you need systematic optimization.

📊 Starting Point vs. Results

Before Backend Optimization:

Performance:
  ├── Avg Response Time: 800ms
  ├── P95 Response Time: 2,400ms
  ├── P99 Response Time: 4,500ms
  ├── Throughput: 120 req/s
  └── Error Rate: 2.3%

Infrastructure:
  ├── Servers: 2 instances
  ├── Database Connections: Direct
  ├── Caching: None
  └── Load Balancing: Basic

Cost:
  └── Monthly: $450/month

After Backend Optimization:

Performance:
  ├── Avg Response Time: 120ms (85% faster) 🚀
  ├── P95 Response Time: 310ms (87% faster) ⚡
  ├── P99 Response Time: 580ms (87% faster) 🔥
  ├── Throughput: 8,500 req/s (70x increase) 💪
  └── Error Rate: 0.08% (96% reduction) ✅

Infrastructure:
  ├── Servers: Auto-scaling (2-20 instances)
  ├── Database Connections: Pool + replicas
  ├── Caching: Redis (87% hit rate)
  └── Load Balancing: Advanced with health checks

Cost:
  └── Monthly: $680/month (1.5x cost, 70x capacity!)

Cost per request dropped from $0.0031 to $0.000044 - that's 98.6% more efficient!

⚡ Strategy #1: API Response Optimization

The Problem: Slow Endpoints Killing UX

Before: Inefficient data fetching

// ❌ BAD: Multiple sequential database queries
@Get('/api/teams/:teamId/dashboard')
async getTeamDashboard(@Param('teamId') teamId: number): Promise<any> {
  // Query 1: Get team info (200ms)
  const team = await this.db.query(
    'SELECT * FROM teams WHERE id = $1',
    [teamId]
  );

  // Query 2: Get team members (300ms)
  const members = await this.db.query(
    'SELECT * FROM users WHERE team_id = $1',
    [teamId]
  );

  // Query 3: Get metrics for each member (400ms each!)
  const memberMetrics = [];
  for (const member of members) {
    const metrics = await this.db.query(
      'SELECT * FROM metrics WHERE user_id = $1',
      [member.id]
    );
    memberMetrics.push(metrics);
  }

  // Query 4: Get team stats (250ms)
  const stats = await this.db.query(
    'SELECT * FROM team_stats WHERE team_id = $1',
    [teamId]
  );

  return { team, members, memberMetrics, stats };
}

// Total time: 200 + 300 + (400 × members) + 250 = 2,000ms+ for 3 members!

After: Optimized with parallel queries and caching

// ✅ GOOD: Parallel queries with caching
@Get('/api/teams/:teamId/dashboard')
async getTeamDashboard(@Param('teamId') teamId: number): Promise<any> {
  const cacheKey = `dashboard:team:${teamId}`;

  // Check cache first
  const cached = await this.redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Execute all queries in parallel using Promise.all
  const [team, members, metrics, stats] = await Promise.all([
    // Query 1: Team info
    this.db.query('SELECT * FROM teams WHERE id = $1', [teamId]),

    // Query 2: Team members
    this.db.query('SELECT * FROM users WHERE team_id = $1', [teamId]),

    // Query 3: All metrics in one query using JOIN
    this.db.query(`
      SELECT m.*, u.name as user_name
      FROM metrics m
      JOIN users u ON m.user_id = u.id
      WHERE u.team_id = $1
    `, [teamId]),

    // Query 4: Team stats
    this.db.query('SELECT * FROM team_stats WHERE team_id = $1', [teamId])
  ]);

  const result = {
    team: team.rows[0],
    members: members.rows,
    metrics: metrics.rows,
    stats: stats.rows[0]
  };

  // Cache for 5 minutes
  await this.redis.setex(cacheKey, 300, JSON.stringify(result));

  return result;
}

// Total time: max(200, 300, 150, 250) + cache overhead = ~320ms
// With cache hit: ~5ms!

Results:

  • Response time: 2,000ms → 320ms (84% faster)
  • With cache: 320ms → 5ms (98% faster)

Request Batching & Debouncing

// Batch multiple API requests into a single database query
interface BatchRequest {
  url: string;
  params: any;
  resolvers: Array<{ resolve: (value: any) => void; reject: (error: any) => void }>;
}

@Injectable()
export class BatchRequestService {
  private batchQueue: Map<string, BatchRequest> = new Map();
  private batchTimer: NodeJS.Timeout | null = null;
  private readonly BATCH_WINDOW = 50; // ms

  constructor(private db: DatabaseService) {}

  async get(url: string, params: any): Promise<any> {
    return new Promise((resolve, reject) => {
      const key = `${url}:${JSON.stringify(params)}`;

      if (!this.batchQueue.has(key)) {
        this.batchQueue.set(key, {
          url,
          params,
          resolvers: []
        });
      }

      this.batchQueue.get(key)!.resolvers.push({ resolve, reject });
      this.scheduleBatch();
    });
  }

  private scheduleBatch(): void {
    if (this.batchTimer) return;

    this.batchTimer = setTimeout(() => {
      this.executeBatch();
    }, this.BATCH_WINDOW);
  }

  private async executeBatch(): Promise<void> {
    const batch = Array.from(this.batchQueue.values());
    this.batchQueue.clear();
    this.batchTimer = null;

    // Group requests by type for efficient querying
    const grouped = this.groupRequests(batch);

    for (const [type, requests] of Object.entries(grouped)) {
      try {
        const results = await this.executeBatchQuery(type, requests);

        // Rows from ANY($1) are not returned in request order, so match them by id
        const rowsById = new Map<any, any>();
        for (const row of results) rowsById.set(row.id, row);

        // Distribute results to waiting promises
        requests.forEach(req => {
          req.resolvers.forEach(r => r.resolve(rowsById.get(req.params.id)));
        });
      } catch (error) {
        requests.forEach(req => {
          req.resolvers.forEach(r => r.reject(error));
        });
      }
    }
  }

  // Group queued requests by resource type so each type becomes a single query.
  // Here the type is derived from the URL, e.g. /api/users/:id -> "users".
  private groupRequests(batch: BatchRequest[]): Record<string, BatchRequest[]> {
    return batch.reduce((groups, req) => {
      const type = req.url.split('/')[2] || 'unknown';
      (groups[type] = groups[type] || []).push(req);
      return groups;
    }, {} as Record<string, BatchRequest[]>);
  }

  private async executeBatchQuery(type: string, requests: BatchRequest[]): Promise<any[]> {
    // Execute one optimized batch query per resource type
    const ids = requests.map(r => r.params.id);

    const query = `SELECT * FROM ${type} WHERE id = ANY($1)`;
    const result = await this.db.query(query, [ids]);

    return result.rows;
  }
}

🚀 Strategy #2: Redis Caching Layer

Multi-Level Caching Strategy

@Injectable()
export class CachedDataService {
  readonly CACHE_TTL = { // public so callers (e.g. TeamMetricsService) can reference these TTLs
    SHORT: 60,        // 1 minute - highly dynamic data
    MEDIUM: 300,      // 5 minutes - semi-static data
    LONG: 3600,       // 1 hour - rarely changing data
    VERY_LONG: 86400  // 24 hours - static reference data
  };

  constructor(
    private redis: RedisClient,
    private db: DatabaseService
  ) {}

  async getWithCache<T>(
    key: string,
    fetchFn: () => Promise<T>,
    ttl: number = this.CACHE_TTL.MEDIUM
  ): Promise<T> {
    // Try cache first
    const cached = await this.redis.get(key);
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache miss - fetch from source
    const data = await fetchFn();

    // Store in cache
    await this.redis.setex(key, ttl, JSON.stringify(data));

    return data;
  }

  // Cache with automatic invalidation
  async setWithInvalidation(
    key: string,
    data: any,
    relatedKeys: string[] = []
  ): Promise<void> {
    // Invalidate related caches
    if (relatedKeys.length > 0) {
      await this.redis.del(...relatedKeys);
    }

    // Update the data (updateData persists to the underlying store; omitted here)
    await this.updateData(key, data);
  }

  // Pattern-based cache invalidation
  async invalidatePattern(pattern: string): Promise<void> {
    const keys = await this.redis.keys(pattern);
    if (keys.length > 0) {
      await this.redis.del(...keys);
    }
  }
}

// Usage example
@Injectable()
export class TeamMetricsService {
  constructor(private cache: CachedDataService) {}

  async getTeamMetrics(teamId: number): Promise<TeamMetrics> {
    return this.cache.getWithCache(
      `metrics:team:${teamId}`,
      async () => {
        // Expensive database query
        return await this.fetchTeamMetricsFromDb(teamId);
      },
      this.cache.CACHE_TTL.MEDIUM
    );
  }

  async updateTeamMetrics(teamId: number, data: any): Promise<void> {
    // Invalidate related caches
    await this.cache.setWithInvalidation(
      `metrics:team:${teamId}`,
      data,
      [
        `dashboard:team:${teamId}`,
        `metrics:team:${teamId}`,
        `stats:team:${teamId}`
      ]
    );
  }
}

Cache Warming Strategy

// Proactively populate cache for frequently accessed data
@Injectable()
export class CacheWarmingService {
  constructor(
    private redis: RedisClient,
    private db: DatabaseService
  ) {
    this.startWarmingSchedule();
  }

  private startWarmingSchedule(): void {
    // Warm cache every 4 minutes (before 5-minute expiry)
    setInterval(() => {
      this.warmFrequentlyAccessedData();
    }, 4 * 60 * 1000);
  }

  private async warmFrequentlyAccessedData(): Promise<void> {
    try {
      // Get list of active teams
      const activeTeams = await this.db.query(`
        SELECT DISTINCT team_id 
        FROM user_sessions 
        WHERE last_activity > NOW() - INTERVAL '1 hour'
      `);

      // Warm cache for each active team
      const warmingPromises = activeTeams.rows.map(async (team) => {
        const metrics = await this.fetchTeamMetrics(team.team_id);
        await this.redis.setex(
          `metrics:team:${team.team_id}`,
          300,
          JSON.stringify(metrics)
        );
      });

      await Promise.all(warmingPromises);
      console.log(`Cache warmed for ${activeTeams.rows.length} teams`);
    } catch (error) {
      console.error('Cache warming failed:', error);
    }
  }
}

Results:

  • Cache hit rate: 0% → 87%
  • Database load: Reduced by 85%
  • API response time: 800ms → 120ms average

💪 Strategy #3: Connection Pooling & Database Optimization

PostgreSQL Connection Pool

// Optimized connection pool configuration
import { Pool } from 'pg';

const poolConfig = {
  host: process.env.DB_HOST,
  port: 5432,
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,

  // Connection pool settings
  min: 10,                    // Minimum connections
  max: 100,                   // Maximum connections
  idleTimeoutMillis: 30000,   // Close idle connections after 30s
  connectionTimeoutMillis: 2000,

  // Performance tuning
  statement_timeout: 10000,   // Kill queries after 10s
  query_timeout: 10000,
  keepAlive: true,
  keepAliveInitialDelayMillis: 10000
};

class DatabaseService {
  private pool: Pool;
  private readPool: Pool;

  constructor() {
    // Write pool (primary database)
    this.pool = new Pool(poolConfig);

    // Read pool (read replicas)
    this.readPool = new Pool({
      ...poolConfig,
      host: process.env.DB_READ_REPLICA_HOST
    });

    this.setupPoolMonitoring();
  }

  private setupPoolMonitoring(): void {
    // Monitor pool health
    setInterval(() => {
      console.log('Pool stats:', {
        total: this.pool.totalCount,
        idle: this.pool.idleCount,
        waiting: this.pool.waitingCount
      });

      // Alert if pool is saturated
      if (this.pool.waitingCount > 10) {
        console.error('Connection pool saturated!');
        // Send alert to monitoring service
      }
    }, 60000);
  }

  async executeWrite(query: string, params: any[]): Promise<any> {
    const client = await this.pool.connect();
    try {
      return await client.query(query, params);
    } finally {
      client.release();
    }
  }

  async executeRead(query: string, params: any[]): Promise<any> {
    const client = await this.readPool.connect();
    try {
      return await client.query(query, params);
    } finally {
      client.release();
    }
  }

  async transaction<T>(callback: (client: any) => Promise<T>): Promise<T> {
    const client = await this.pool.connect();
    try {
      await client.query('BEGIN');
      const result = await callback(client);
      await client.query('COMMIT');
      return result;
    } catch (error) {
      await client.query('ROLLBACK');
      throw error;
    } finally {
      client.release();
    }
  }
}

Read/Write Splitting

@Injectable()
export class DataAccessService {
  constructor(private db: DatabaseService) {}

  // Read operations use read replicas
  async getTeamMetrics(teamId: number): Promise<any> {
    return this.db.executeRead(
      'SELECT * FROM team_metrics WHERE team_id = $1',
      [teamId]
    );
  }

  // Write operations use primary database
  async updateTeamMetrics(teamId: number, data: any): Promise<void> {
    await this.db.executeWrite(
      'UPDATE team_metrics SET data = $1, updated_at = NOW() WHERE team_id = $2',
      [data, teamId]
    );
  }

  // Transactions always use primary
  async createTeamWithMembers(team: any, members: any[]): Promise<void> {
    await this.db.transaction(async (client) => {
      // Insert team
      const teamResult = await client.query(
        'INSERT INTO teams (name, created_at) VALUES ($1, NOW()) RETURNING id',
        [team.name]
      );

      const teamId = teamResult.rows[0].id;

      // Insert members
      for (const member of members) {
        await client.query(
          'INSERT INTO users (team_id, name, email) VALUES ($1, $2, $3)',
          [teamId, member.name, member.email]
        );
      }
    });
  }
}

🎯 Strategy #4: Pagination & Efficient Data Transfer

Cursor-Based Pagination

// Efficient pagination for large datasets
@Get('/api/metrics')
async getMetrics(
  @Query('limit') limit: number = 50,
  @Query('cursor') cursor?: string
): Promise<PaginatedResponse> {
  // Validate and sanitize
  const safeLimit = Math.min(Math.max(limit, 1), 100);

  let query: string;
  let params: any[];

  if (cursor) {
    // Decode cursor (base64 encoded ID)
    const cursorId = Buffer.from(cursor, 'base64').toString('utf-8');

    query = `
      SELECT id, name, value, created_at
      FROM metrics
      WHERE id > $1
      ORDER BY id ASC
      LIMIT $2
    `;
    params = [cursorId, safeLimit];
  } else {
    query = `
      SELECT id, name, value, created_at
      FROM metrics
      ORDER BY id ASC
      LIMIT $1
    `;
    params = [safeLimit];
  }

  const result = await this.db.executeRead(query, params);
  const items = result.rows;

  // Generate next cursor
  const nextCursor = items.length === safeLimit
    ? Buffer.from(items[items.length - 1].id.toString()).toString('base64')
    : null;

  return {
    items,
    nextCursor,
    hasMore: items.length === safeLimit
  };
}

interface PaginatedResponse {
  items: any[];
  nextCursor: string | null;
  hasMore: boolean;
}

Response Compression

// Enable compression for API responses
import compression from 'compression';
import express from 'express';

const app = express();

// Compression middleware
app.use(compression({
  filter: (req, res) => {
    if (req.headers['x-no-compression']) {
      return false;
    }
    return compression.filter(req, res);
  },
  level: 6, // Compression level (1-9, 6 is good balance)
  threshold: 1024 // Only compress responses > 1KB
}));

// Result: Typical API response reduced from 45KB to 8KB (82% smaller)

🛡️ Strategy #5: Error Handling & Circuit Breaker

Comprehensive Error Handling

@Injectable()
export class ErrorHandlerService {
  handleError(error: any, context: string): never {
    // Log error with context
    console.error(`Error in ${context}:`, {
      message: error.message,
      stack: error.stack,
      timestamp: new Date().toISOString()
    });

    // Send to monitoring service (Sentry)
    if (process.env.NODE_ENV === 'production') {
      this.sentryService.captureException(error, { context });
    }

    // Return appropriate error response
    if (error instanceof ValidationError) {
      throw new BadRequestException(error.message);
    }

    if (error instanceof NotFoundError) {
      throw new NotFoundException(error.message);
    }

    if (error instanceof UnauthorizedError) {
      throw new UnauthorizedException(error.message);
    }

    // Generic error response
    throw new InternalServerErrorException(
      'An unexpected error occurred. Please try again later.'
    );
  }
}

Circuit Breaker Pattern

@Injectable()
export class CircuitBreakerService {
  private failures = new Map<string, number>();
  private lastFailureTime = new Map<string, number>();
  private state = new Map<string, CircuitState>();

  private readonly FAILURE_THRESHOLD = 5;
  private readonly RESET_TIMEOUT = 60000; // 1 minute
  private readonly HALF_OPEN_MAX_CALLS = 3;

  async execute<T>(
    key: string,
    fn: () => Promise<T>,
    fallback?: () => Promise<T>
  ): Promise<T> {
    const currentState = this.state.get(key) || 'closed';

    if (currentState === 'open') {
      const lastFailure = this.lastFailureTime.get(key) || 0;

      if (Date.now() - lastFailure > this.RESET_TIMEOUT) {
        this.state.set(key, 'half-open');
      } else {
        if (fallback) {
          return fallback();
        }
        throw new ServiceUnavailableException(
          'Service temporarily unavailable'
        );
      }
    }

    try {
      const result = await fn();
      this.onSuccess(key);
      return result;
    } catch (error) {
      this.onFailure(key);

      if (fallback && this.state.get(key) === 'open') {
        return fallback();
      }

      throw error;
    }
  }

  private onSuccess(key: string): void {
    this.failures.set(key, 0);
    this.state.set(key, 'closed');
  }

  private onFailure(key: string): void {
    const currentFailures = this.failures.get(key) || 0;
    const newFailures = currentFailures + 1;

    this.failures.set(key, newFailures);
    this.lastFailureTime.set(key, Date.now());

    if (newFailures >= this.FAILURE_THRESHOLD) {
      this.state.set(key, 'open');
      console.error(`Circuit breaker opened for: ${key}`);
    }
  }
}

type CircuitState = 'closed' | 'open' | 'half-open';

// Usage
@Injectable()
export class ExternalApiService {
  constructor(private circuitBreaker: CircuitBreakerService) {}

  async fetchFromExternalApi(url: string): Promise<any> {
    return this.circuitBreaker.execute(
      `external-api:${url}`,
      async () => {
        const response = await fetch(url);
        return response.json();
      },
      async () => {
        // Fallback: return cached data or default response
        return this.getCachedData(url);
      }
    );
  }
}

📊 Strategy #6: Request Rate Limiting

Protect APIs from Abuse

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

// Global rate limiter
const globalLimiter = rateLimit({
  store: new RedisStore({
    client: redis,
    prefix: 'rl:global:'
  }),
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 1000, // 1000 requests per window per IP
  message: 'Too many requests, please try again later',
  standardHeaders: true,
  legacyHeaders: false
});

// Stricter limits for expensive endpoints
const expensiveLimiter = rateLimit({
  store: new RedisStore({
    client: redis,
    prefix: 'rl:expensive:'
  }),
  windowMs: 60 * 1000, // 1 minute
  max: 10, // 10 requests per minute
  message: 'Rate limit exceeded for this endpoint'
});

// Apply middleware
app.use('/api/', globalLimiter);
app.use('/api/reports/generate', expensiveLimiter);

// Custom rate limiter by user ID
const createUserRateLimiter = (maxRequests: number) => {
  return rateLimit({
    store: new RedisStore({
      client: redis,
      prefix: 'rl:user:'
    }),
    windowMs: 60 * 1000,
    max: maxRequests,
    keyGenerator: (req) => {
      // Rate limit by user ID instead of IP
      return req.user?.id || req.ip;
    }
  });
};

app.use('/api/user/*', createUserRateLimiter(100));

📈 Real-World Performance Metrics

Load Testing Results

# Artillery load test - sustained load
artillery run loadtest.yml

# Configuration
config:
  target: 'https://api.orgsignals.com'
  phases:
    - duration: 300
      arrivalRate: 100
      rampTo: 1000
      name: "Ramp to peak"
    - duration: 600
      arrivalRate: 1000
      name: "Sustained peak load"

# Results after optimization:
Summary:
  ✅ Scenarios: 960,000 (100%)
  ✅ Requests: 4,800,000
  ✅ Success Rate: 99.92%
  ✅ Response Times:
     - Min: 35ms
     - Median: 118ms
     - P95: 298ms
     - P99: 562ms
     - Max: 1,841ms
  ✅ Throughput: 8,000 req/s sustained
  ✅ Error Rate: 0.08%

Database Performance:
  ✅ Connection Pool:
     - Total: 100
     - Idle: 45
     - Active: 55
     - Waiting: 0
  ✅ Query Performance:
     - Avg: 12ms
     - P95: 45ms
     - P99: 120ms

Production Metrics (30 days)

API Performance:
  ✅ Total Requests: 45.2M
  ✅ Avg Response Time: 118ms
  ✅ P95 Response Time: 298ms
  ✅ P99 Response Time: 562ms
  ✅ Error Rate: 0.08%
  ✅ Peak Throughput: 8,500 req/s

Cache Performance:
  ✅ Redis Hit Rate: 87%
  ✅ Avg Cache Response: 5ms
  ✅ Total Cache Hits: 39.3M
  ✅ Total Cache Misses: 5.9M
  ✅ Database Load Reduction: 85%

Infrastructure Health:
  ✅ Uptime: 99.98%
  ✅ Avg CPU: 45%
  ✅ Avg Memory: 52%
  ✅ Connection Pool: Healthy
  ✅ Auto-scaling Events: 47

💡 Key Lessons Learned

What Made the Biggest Impact

  1. Redis Caching (40% improvement): 87% hit rate eliminated most database queries
  2. Connection Pooling (25% improvement): Eliminated connection overhead
  3. Parallel Queries (20% improvement): Reduced response time by 60%
  4. Read Replicas (10% improvement): Distributed database load
  5. Compression (5% improvement): Reduced bandwidth by 80%

What Didn't Work

  • Microservices too early: Added complexity without benefits at this scale
  • Over-caching: Caused stale data issues; had to fine-tune TTLs
  • GraphQL: Added overhead without clear advantages for our use case
  • Too much middleware: Each middleware added latency

🎯 Build APIs That Scale

These backend optimization strategies transformed our API from struggling at 120 req/s to smoothly handling 8,500 req/s - a 70x improvement. But backend performance is just one component of delivering world-class developer productivity insights.

Experience Lightning-Fast APIs

Ready to see sub-200ms API responses in action?

Try OrgSignals for Free →

OrgSignals leverages every backend optimization strategy covered in this article:

  • 120ms average API response times
  • 🚀 8,500+ requests/second capacity
  • 💪 99.98% uptime with automatic failover
  • 🔄 Real-time data sync across all integrations
  • 🛡️ Enterprise-grade security and reliability

Transform Your Development Team's Productivity

Stop flying blind with your engineering metrics. OrgSignals provides:

  • Lightning-fast analytics - Get insights in milliseconds, not seconds
  • Real-time DORA metrics - Track deployment frequency, lead time, MTTR, and change failure rate
  • Seamless integrations - GitHub, GitLab, Jira, Slack - all your tools unified
  • AI-powered insights - Automatically identify bottlenecks and improvement opportunities
  • Developer-friendly dashboards - Beautiful visualizations that tell the story
  • Team & individual metrics - From C-suite to individual contributors

Learn More About Building Scalable Systems

📚 Read the complete series:

Questions about scaling your backend? Drop them in the comments - I respond to every question!

Found this helpful? Follow for more backend optimization and system design content.

🏷️ Tags

#backend #nodejs #express #api #optimization #redis #caching #postgresql #performance #scaling #microservices #ratelimiting #circuitbreaker

Try out OpenMetadata!

2026-02-04 01:46:44

Hi everyone, just wanted to share this GitHub repo in case it's useful!

I work with OpenMetadata; if you're interested in or working with modern data platforms, I think it's worth having on your radar!

Some of the features:

  • 🔍 A central data catalog
  • 🧬 End-to-end lineage across warehouses, transformations, and BI
  • 📘 Business glossary & ownership
  • ✅ Data quality signals tied directly to assets
  • 🧠 A metadata graph you can actually build on

👉 GitHub: https://github.com/open-metadata/OpenMetadata

Fully open source. If you find it useful, consider giving the repo a ⭐ to bookmark and support the project.
Feedback and contributions are always welcome!